Taming the Mob: Creating Believable Crowds in Assassin’s Creed
James Therien & Sylvain Bernard, Ubisoft
AC is an animation-intensive game, so the crowd work deals heavily with animation.
Measuring our expectations: Looking at other games, getting reference from real life footage. GTA. Focus more on crowds. Consoles of this gen weren’t even out yet so hard to know their capacities. Interesting trailer: Dead Rising, crowds of zombies. Showed what the console can achieve in numbers of characters on screen.
To create a realistic crowd: move in tight areas, realistic contact, proper reaction to the environment. Incorporate the crowd into the gameplay.
Rich NPCs: every NPC is an individual. Emergent crowd, not explicit in code except in spawning. Make those groups of NPCs believable and the crowd emerges from that.
Level design and art direction were at real-life scale. Didn’t want Prince of Persia – realistic capabilities. Did need to cheat sometimes – survive a 66 ft drop (20m).
Animation style: keyframe editing, mocap where possible. Same group that did Sands of Time; brought over animations like walling, but with more realistic touches. Mocap edited to match the keyframe look: add overlap, create more impact, adjust posing. “Stylized realistic” animation style. Skeleton shared between all characters, which creates the opportunity to swap animations: male or female, thin or fat. Did some scaling if needed (except on legs). A headache for the modeling team to fit those meshes onto the skeleton, but a really good decision.
Layering animation – every NPC shares the same behavior system, but to customize (e.g. more feminine) just replace the needed animations within the system.
Lots of bones in our characters! 55 in main skeleton (w/hands), 35 in head, 40-80 procedural for hair, swords. 36 bones in the pigeons! Hinge bones: simple physics – wind, gravity, momentum. No collisions, just constraints. Used for hair and robes.
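A minimal sketch of how such a hinge-bone chain could work, not the shipped code: Verlet integration carries momentum, gravity and wind act as accelerations, and a length constraint to the parent bone replaces any collision handling. The function name `step_chain`, the 2D simplification, the damping factor, and all parameter values are assumptions for illustration.

```python
import math

def step_chain(points, prev_points, rest_len, dt,
               gravity=(0.0, -9.8), wind=(0.0, 0.0), damping=0.98):
    """Advance a chain of hinge bones by one time step (2D for brevity).

    points[0] is pinned to the animated skeleton; the rest hang from it.
    Returns (new_points, points) so the caller can feed them back in.
    """
    ax = gravity[0] + wind[0]
    ay = gravity[1] + wind[1]
    new_points = [points[0]]  # root follows the skeleton, never simulated
    for (x, y), (px, py) in zip(points[1:], prev_points[1:]):
        # Verlet step: momentum is implicit in (current - previous).
        nx = x + (x - px) * damping + ax * dt * dt
        ny = y + (y - py) * damping + ay * dt * dt
        new_points.append((nx, ny))
    # Constraint only, no collisions: pull each bone back to rest length
    # from its parent, walking from the root outwards.
    for i in range(1, len(new_points)):
        px, py = new_points[i - 1]
        x, y = new_points[i]
        dx, dy = x - px, y - py
        dist = math.hypot(dx, dy) or 1e-9
        scale = rest_len / dist
        new_points[i] = (px + dx * scale, py + dy * scale)
    return new_points, points

# A horizontal strand of hair starts to droop under gravity.
strand = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
strand_next, strand_prev = step_chain(strand, strand, rest_len=1.0, dt=1 / 30)
```

Because each bone is only ever corrected against its parent, the cost stays linear in the bone count, which fits a system that has to run on every NPC.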
“Look at” – 3 independent targets: torso, head, eyes. Eye blink. Eyelids follow the gaze (lifting when the character looks up). All NPCs do that. Only a small subset of the total get IK and lipsync.
Movement system – complex graph. More realistic = controls not responsive. Really complex = too much impact on other systems. Separate the walk into 4 stages, each with their own transitions. Starts from every direction. Maybe a bad decision, started to explode when linked into other systems. 3 weight poses, so all systems needed to start on those 3 weight poses; simplify where possible. When you release the controls, the animation takes a few extra steps to look realistic.
Went to a simplified version, keeping maximum fluidity. High / low profile – has a transition between the two to keep it more realistic rather than just snapping. NPC and assassin use the same animation system. The computer is like a player, playing against the human. Apply all of assassin’s animations to random shopkeepers.
122 movement and transition animations (108 transitions + 14 movement), plus 10 wait and idle, 12 cycle breakers, and 24 other: 168 animations total. Left-front vs right-front versions for movement. So every new major version required 168 of them.
Movement system – displacement node. Drives animation forward. (key point that drives movement). Crowd swimming, cycle breakers, and walk stop. Swimming – shifting shoulders sideways. Makes them look more lifelike. Torso and arms only animations that don’t need to disrupt the walk cycle. Can stop for just a moment to show the flow.
Oriented move. NPCs needed to walk backwards, which the assassin does not do. Blend from backwards to forwards to turn around while fleeing. Became problematic for navigation and because the assassin had no equivalent, so it was not actually used in the game.
Environment: not on a grid, very large. Navigation not constrained to ground plane. Totally arbitrary environment, irregular, beams everywhere, not fixed heights. Detail everywhere. Generate navigation data from collisions. 2 layers in AI: behavior and decision. Behavior shared with the assassin – managing animation and doing fine-grained interaction. Guidance system – lines with normal information to cast against and interpret those lines at run time. Extremely expensive, so used only in specific cases like assassin climbing. Guards chasing use the same code as the assassin.
Other data is more AI specific – 2.5D triangle mesh for flat spaces. Local planning and tests. On top of that generate a waypoint network. A* pathfinding on top of that. Object waypoints and metalinks – jump data encoded ahead of time. Use guidance data generated for assassin to compute jumps offline and then stored compactly. Also encode ladders, beams, etc. Steering pass to use long-term nav data. Continuous small corrections for avoiding walls, dynamic objects. Path validation to remove waypoint placement artifacts. Communicate with decision layer if there is trouble (pushing away other NPCs if stuck).
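To make the waypoint network plus metalinks idea concrete, here is a small A* sketch over a graph whose edges carry a precomputed link kind (walk, ladder, jump). The node names, costs, link kinds, and the `find_path` function are all hypothetical; the talk only states that A* runs on a waypoint network and that jumps, ladders, and beams are encoded offline.

```python
import heapq
import math

def find_path(positions, links, start, goal):
    """A* over a waypoint network.

    positions: {node: (x, y, z)}
    links:     {node: [(neighbor, cost, kind), ...]} where kind is the
               precomputed link type ("walk", "ladder", "jump", ...).
    Returns [(node, kind_used_to_reach_it), ...] or None if unreachable.
    """
    def heuristic(node):
        # Straight-line distance; admissible as long as no link costs
        # less than the distance it covers.
        return math.dist(positions[node], positions[goal])

    open_heap = [(heuristic(start), 0.0, start)]
    cost_so_far = {start: 0.0}
    came_from = {start: (None, None)}
    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            path, node = [], goal
            while node is not None:
                parent, kind = came_from[node]
                path.append((node, kind))
                node = parent
            return path[::-1]
        if g > cost_so_far[current]:
            continue  # stale heap entry, a cheaper route was found later
        for neighbor, cost, kind in links.get(current, []):
            new_g = g + cost
            if new_g < cost_so_far.get(neighbor, math.inf):
                cost_so_far[neighbor] = new_g
                came_from[neighbor] = (current, kind)
                heapq.heappush(open_heap,
                               (new_g + heuristic(neighbor), new_g, neighbor))
    return None

positions = {
    "street": (0, 0, 0), "alley": (10, 0, 0), "ladder_top": (10, 0, 6),
    "far_stairs": (40, 0, 0), "roof": (14, 0, 6),
}
links = {
    "street": [("alley", 10.0, "walk")],
    "alley": [("ladder_top", 8.0, "ladder"), ("far_stairs", 30.0, "walk")],
    "ladder_top": [("roof", 4.0, "jump")],
    "far_stairs": [("roof", 30.0, "walk")],
}
route = find_path(positions, links, "street", "roof")
```

The returned link kinds are what a behavior layer would need in order to pick the matching animation (climb, jump) for each leg of the route.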
Level design specific tools for navigation. Fine for Pt A to Pt B, but need to specify where crowds go. Crowd flow lines, navigation highways that the level designers place in the map. Makes wandering very cheap, useful for fleeing. Only thing they have to do once on the network is decide what branch to take at an intersection. Fleeing has no specific destination, just want to get away, so they can use that same data with a different decision at the branch points.
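A rough sketch of the branch decision, assuming each intersection stores the outgoing flow-line directions: wandering picks any branch, fleeing picks the branch that heads most directly away from the threat. `choose_branch`, the data layout, and the dot-product scoring are illustrative assumptions, not the game's actual rule.

```python
import random

def choose_branch(intersection, branches, mode, threat=None, rng=random):
    """Pick the next crowd-flow branch at an intersection.

    intersection: (x, y) position of the branch point
    branches:     [(name, (dx, dy)), ...] directions leaving the intersection
    mode:         "wander" or "flee"
    """
    if mode == "wander":
        return rng.choice(branches)[0]
    # Flee: score each branch by how well it aligns with the vector that
    # points from the threat to this intersection (i.e. away from danger).
    away_x = intersection[0] - threat[0]
    away_y = intersection[1] - threat[1]

    def away_score(branch):
        _, (dx, dy) = branch
        return dx * away_x + dy * away_y

    return max(branches, key=away_score)[0]

crossroads = [("north", (0, 1)), ("east", (1, 0)),
              ("south", (0, -1)), ("west", (-1, 0))]
flee_choice = choose_branch((0, 0), crossroads, "flee", threat=(5, 0))
wander_choice = choose_branch((0, 0), crossroads, "wander",
                              rng=random.Random(7))
```

Both modes read the same designer-placed data; only the scoring at the branch point differs, which is why fleeing needs no destination.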
Interaction with environment: Unbalance system. Completely animated, rag doll activated at the last moment. Ragdoll can look like fainting, but wanted something to feel more like protecting himself and having energy. Versions for each possible height and speed and outcome (falling, making over, recovering). Same thing for falling off a ledge – by speed & facing.
Crowd Density: Spawning in cities – originally just in a sphere. But as we added NPCs you wouldn’t see them; they would be in streets you couldn’t get to or on the other side of buildings. Also couldn’t see down long avenues – increasing the bubble just put more NPCs out of sight. Solution: flood fill over a 2D mesh, then pick triangles out of line of sight (LOS) to spawn in. Don’t bias the blob in any direction because the player is so mobile. Just make sure connecting streets are always full.
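The flood-fill idea can be sketched as a breadth-first walk over triangle adjacency, starting at the player's triangle, so that only walkable, connected space is considered. The names `spawn_candidates` and `is_visible`, the step limit, and the toy mesh are assumptions; in the game the visibility test would be raycasts.

```python
from collections import deque

def spawn_candidates(adjacency, player_tri, max_steps, is_visible):
    """Flood fill outward from the player's triangle and collect the
    reachable triangles that are currently out of the player's sight.

    adjacency:  {triangle_id: [neighbor_ids]} for the 2D navigation mesh
    is_visible: callable(triangle_id) -> bool, standing in for a raycast
    """
    seen = {player_tri}
    queue = deque([(player_tri, 0)])
    hidden = []
    while queue:
        tri, steps = queue.popleft()
        if not is_visible(tri):
            hidden.append(tri)
        if steps == max_steps:
            continue  # do not expand past the fill radius
        for neighbor in adjacency.get(tri, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, steps + 1))
    return hidden

# Triangle 5 sits behind a building: close in a straight line, but not
# connected to the player's street, so the fill never reaches it.
mesh = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1], 5: []}
in_sight = {0, 1, 2}
spots = spawn_candidates(mesh, 0, 3, lambda tri: tri in in_sight)
```

Unlike a sphere test, the fill follows streets, so a long avenue is covered as far as the step limit allows while unreachable pockets are skipped.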
Creating variety: Entity builder. Mixed attributes on spawning. Endless possibilities – head structure, textures, color, thin, fat, military, accessories, AI reactions…
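As an illustration of mixing attributes at spawn time, the sketch below draws one option per slot from a pool. The slot names, option values, and the functions `build_entity` and `variety_count` are invented for the example; the real entity builder's data is not described in the talk.

```python
import math
import random

# Hypothetical attribute pools; the actual categories and counts are unknown.
ATTRIBUTE_POOLS = {
    "build": ["thin", "average", "fat"],
    "head": ["head_a", "head_b", "head_c"],
    "texture": ["robe_plain", "robe_striped"],
    "color": ["brown", "grey", "white", "blue"],
    "accessory": ["none", "basket", "jar"],
    "reaction": ["timid", "curious", "aggressive"],
}

def build_entity(pools, rng):
    """Pick one option per slot to assemble a single NPC description."""
    return {slot: rng.choice(options) for slot, options in pools.items()}

def variety_count(pools):
    """Number of distinct NPCs the pools can produce."""
    return math.prod(len(options) for options in pools.values())

npc = build_entity(ATTRIBUTE_POOLS, random.Random(42))
total_variants = variety_count(ATTRIBUTE_POOLS)
```

Even these six small pools multiply out to several hundred distinct NPCs, which is the sense in which the possibilities feel endless.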
Crowd composition: Game play crowd. Base walking crowds + bench, monks, beggars. Lots of game play relevant crowd entities. All this together gives an immersive world. Had intended more duties but really just did kiosks. Wish we had more things like sweeping, drinking from a fountain. Sound faked some of this – hearing someone working metal even if you don’t see it.
Interaction: Physical interaction – soft push, grab and throw, collision, fight, assassination. Reaction system. Communicating design to programmers by generating fake footage of the different reaction scenarios. Reactions: individual reaction, sound from an NPC, body posture. Acrobatic reaction creates zones – yellow outer area to just look, grey area to stop and look. Then people slowly peel off. People following a fight sequence, attracted as long as no one dies. Chase sequence with recoil, dodging the assassin. Some just get bowled over if they don’t notice. Then generate fake conversations when pedestrians wind up nearby. Took a lot of work to keep the NPCs attracted to a fight from getting in the way. Body gestures to match sounds.
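The two-zone acrobatic reaction could be sketched as a simple distance test: inside the inner zone an NPC stops and looks, inside the outer zone it only looks. The radii, the reaction names, and `zone_reaction` are assumptions for illustration.

```python
import math

def zone_reaction(npc_pos, event_pos, inner_radius, outer_radius):
    """Classify an NPC's reaction by its distance to the event."""
    distance = math.dist(npc_pos, event_pos)
    if distance <= inner_radius:
        return "stop_and_look"  # inner zone: stop walking and watch
    if distance <= outer_radius:
        return "look"           # outer zone: keep walking, turn the head
    return "ignore"

event = (0.0, 0.0)
reactions = [zone_reaction(pos, event, 5.0, 12.0)
             for pos in [(3.0, 0.0), (10.0, 0.0), (20.0, 0.0)]]
```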
We often focus on textures and meshes, but the AI and behavior is just as important. Need to bring the world to life.
Reactions: more than just visual. “Reaction packs” specified by level designers: sets of responses to specific events. Draw the guards into the fight by reacting to the alert event. Guard awareness levels – seeing a dead body changes reaction packs.
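One way to picture reaction packs is as data tables keyed by event, with the guard's current pack swapped when awareness changes. Everything named here (`Guard`, the pack names, the events and actions) is hypothetical; the talk only says packs are sets of responses chosen by level designers and that seeing a dead body changes the pack.

```python
from dataclasses import dataclass

# Hypothetical packs: event -> (action, pack to switch to, or None).
REACTION_PACKS = {
    "calm": {
        "alert": ("join_fight", None),
        "dead_body": ("inspect_body", "suspicious"),
    },
    "suspicious": {
        "alert": ("join_fight", None),
        "acrobatics": ("challenge_player", None),
    },
}

@dataclass
class Guard:
    pack: str = "calm"

    def handle(self, event, packs=REACTION_PACKS):
        """Look up the response in the current pack; unknown events are ignored."""
        response = packs[self.pack].get(event)
        if response is None:
            return "ignore"
        action, next_pack = response
        if next_pack is not None:
            self.pack = next_pack  # awareness level changed
        return action

guard = Guard()
log = [guard.handle("acrobatics"),  # calm guards do not care
       guard.handle("dead_body"),   # raises awareness
       guard.handle("acrobatics")]  # now the same event draws a response
```

Keeping the responses in data rather than code is what lets level designers tune them per area.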
How do you get lots of NPCs at 30 fps? Not easy with no LOD on decision or behavior layers. Lots of LOD on the animations – bone counts drop fast, drop IK, simple look at, simple procedural rigs. Focus on making common NPCs as cheap as possible. Don’t do stuff when NPCs out of sight. Lots of NPCs only pathfinding to get to position or crowdflows. Load balanced fairly well – a fair number always doing common tasks. As long as just player and a few guards being expensive it balances out.
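A sketch of animation-only LOD selection under assumed thresholds: features drop off with distance, and nothing is evaluated for NPCs out of sight. The distances, bone counts, and the `AnimLod` fields are made up for the example; only the categories of savings (bone count, IK, look-at, procedural rigs) come from the talk.

```python
from typing import NamedTuple, Optional

class AnimLod(NamedTuple):
    bone_count: int
    use_ik: bool
    look_at: str          # "full", "head_only", or "none"
    procedural_rig: bool  # hair / cloth hinge bones

# Hypothetical thresholds in metres, nearest first.
LOD_TABLE = [
    (8.0, AnimLod(90, True, "full", True)),
    (25.0, AnimLod(55, False, "head_only", True)),
    (60.0, AnimLod(25, False, "none", False)),
]
FARTHEST = AnimLod(12, False, "none", False)

def select_anim_lod(distance: float, visible: bool) -> Optional[AnimLod]:
    """Return the animation detail level, or None to skip animation."""
    if not visible:
        return None  # out of sight: do no animation work at all
    for max_distance, lod in LOD_TABLE:
        if distance <= max_distance:
            return lod
    return FARTHEST

near = select_anim_lod(3.0, visible=True)
mid = select_anim_lod(20.0, visible=True)
hidden = select_anim_lod(3.0, visible=False)
```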
Concurrency – lots of multithreading. Started with just render / engine, but not balanced. Multithread the engine the Wednesday before a big demo… Animation multithreading worked out well. Successful so went to a general distribution system. Filling up the physical threads on the 360. Have to balance well, optimal result is not the goal, steady frame rate is. As much SPU use as we could on PS3 but didn’t work quite as well.
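The balancing problem can be illustrated with a greedy longest-job-first assignment: each job goes to the currently least-loaded worker. This is a generic scheduling heuristic used here as a stand-in, not a description of the actual distribution system; the job names, costs, and `distribute` are assumptions.

```python
import heapq

def distribute(jobs, worker_count):
    """Greedy longest-job-first assignment of jobs to workers.

    jobs: [(name, estimated_cost), ...]
    Returns (assigned, loads): job names per worker and total cost per worker.
    """
    heap = [(0.0, index) for index in range(worker_count)]
    assigned = [[] for _ in range(worker_count)]
    for name, cost in sorted(jobs, key=lambda job: job[1], reverse=True):
        load, index = heapq.heappop(heap)  # least-loaded worker so far
        assigned[index].append(name)
        heapq.heappush(heap, (load + cost, index))
    loads = [0.0] * worker_count
    for load, index in heap:
        loads[index] = load
    return assigned, loads

frame_jobs = [("player_anim", 5), ("guard_anim_a", 3), ("guard_anim_b", 3),
              ("crowd_batch_a", 1), ("crowd_batch_b", 1), ("crowd_batch_c", 1)]
assigned, loads = distribute(frame_jobs, 2)
```

The heuristic does not guarantee an optimal split, which matches the stated priority: an even, predictable load per frame matters more than the best possible schedule.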
Conclusion: met our goals of a believable crowd. Quality focus with good support from management. Lots of diversity, quality, and quantity. Harder than planned to incorporate gameplay – emergent crowd interactions hard to manage, difficulty for player to use the crowd as a tool. Fell short of that goal.
Q: Size of the team? Animators, programmers… 150-180 total. 1/3 programmer / art / design split.
Q: When you spawn a character, is the duty fixed for their lifespan? Sometimes yes, sometimes no. Different types of NPCs, most are fixed duty, but sometimes just released to the crowd. Bench guys fixed but if they are servants they can get up. Big system to do simulation but realized that it didn’t matter because the life of an NPC is so short.
Q: Tall AIs and short AIs – how to make climbing work? All environment detection is dynamic, all at runtime, so it all just worked. Dynamic responsive to scale.
Q: What were goals for expanding gameplay for player? Diversions… no need to add new types if old types already worked.
Q: performance vs memory? Measure a lot… did limit variety. NPCs expensive in many ways. Tried to make them cheap as possible for common NPCs, but there is a limit to the work you can do.
Q: flood fill for spawning… visual test for spawning? Yes – lots of raycasts in the game. Raycasts for everything. Multiple raycasts to make sure NPCs spawn out of sight or whether NPCs should react to something. Spawning was asynchronous so didn’t matter how long it took. Iteratively evolved flood fill triangles, since that was expensive. Took a lot of optimization on that process.
Q: how did you test the crowd system? Humans, mostly. That was a problem, systems breaking other systems without realizing it. Lots of testers, that’s pretty much it. Looking into other solutions for the future.
Q: reaction system, use of periodic events to represent continuous stimulus. Did that create problems with a moving player triggering reactions as he arrived? To avoid delay… solved with brute force. Not continuous but almost so. Starting / stopping movement sends events, which fill in the difference. All level design controlled, so they could tune and control. Primary and secondary reaction zones, mostly data driven.
Q: duties planned, how many wound up with? Not much, just orators and kiosks, used to create cluttering.
50 software engineers at the peak. AI was 15-20.
Q: What did you learn from AC to support much larger numbers of participants? Crowd behavior changes drastically as you increase the number of people. 40 NPCs worked; with 20-30 more, everything broke down. Pattern of motion is extremely different as the crowd gets denser. Not just a question of detail, especially for navigation. Each NPC has their own walking speed, which didn’t make things easy. Suggest starting with a big number even if you can’t render it and seeing how it works; don’t go incrementally.
Q: What sort of solving for complex scenarios like intersections? Too complex to discuss here.
Q: If there is a sequel, what crowd things would you do differently to fix shortcomings? Big problem at the start was that we didn’t know how it would work out. Now we have the base and can really build from that. Expecting better integration with gameplay; going to be a big focus.