Develop asks mo-cap experts for their insights on the ideal tech to use for your game

The cutting edge of motion capture

From Kevin Spacey declaring war on the world in Call of Duty to a frightened girl sneaking around dystopian halls in mobile series République, you don’t have to look far to see motion capture in action.

Countless titles use it for everything from cutscenes to in-game animations, but the technology may still seem daunting to some developers. Marker-based or markerless? Optical or inertial? Facial capture? Full performance? These are some of the many questions studios need to consider when looking into introducing motion capture to their development process.

Develop has consulted multiple mo-cap experts – from the firms that provide the hardware and capture services to triple-A studios – to compile this comprehensive guide.

On your markers

Let’s start with the most traditional form of motion capture: marker-based. Actors in figure-hugging suits covered in bobbles are tracked by multiple cameras, and the data is later retargeted onto in-game character rigs – it’s the form that initially springs to mind when mo-cap is mentioned.
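
For readers new to the concept, the sketch below shows the retargeting idea in its most stripped-down form: captured joint rotations are copied onto a game rig by matching names. The joint names, bone map and data layout are hypothetical, and production tools also handle offsets, scale and bind poses with far more care.

```python
# Illustrative sketch only: retargeting captured joint rotations onto a
# game rig by matching names. The joint names, quaternion layout and bone
# map are hypothetical.

# Map from capture-skeleton joint names to the in-game rig's bone names.
BONE_MAP = {
    "Hips": "pelvis",
    "Spine": "spine_01",
    "LeftUpLeg": "thigh_l",
    "RightUpLeg": "thigh_r",
    # ...one entry per tracked joint
}

def retarget_frame(capture_frame, bone_map=BONE_MAP):
    """Copy each captured joint's rotation onto the matching rig bone.

    capture_frame: dict of joint name -> rotation quaternion (w, x, y, z).
    Returns a dict of rig bone name -> rotation quaternion.
    """
    rig_pose = {}
    for joint, rotation in capture_frame.items():
        bone = bone_map.get(joint)
        if bone is not None:  # skip joints the rig does not have
            rig_pose[bone] = rotation
    return rig_pose

# Example: one captured frame with identity rotations on two joints.
frame = {"Hips": (1.0, 0.0, 0.0, 0.0), "Spine": (1.0, 0.0, 0.0, 0.0)}
print(retarget_frame(frame))  # {'pelvis': (...), 'spine_01': (...)}
```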

Stoo Haskayne, head of production at Centroid, which recently signed a partnership with Pinewood, says this system’s biggest advantage is decades of experienced users: “The technology has been in use for so many years now that all the teething troubles have been resolved and it is solid and reliable.”

Audiomotion founder and MD Mick Morris agrees, adding: “The advantage of a marker-based system is that it always just works. The markers are soft so there’s no danger of performers injuring themselves or having the discomfort of wearing batteries or hard sensors.

“Our system is sub-millimetre accurate so it records even the tiniest nuance, the smallest detail. An optical marker system means you can also build a huge volume – the size your mo-cap needs – and still capture everything. With markerless, you are limited to a pretty small area.”

Another advantage, observes OptiTrack’s chief strategy officer Brian Nilles, is that this large volume enables devs to capture scenes with several characters all at once. However, no solution is perfect.

“The primary disadvantage of marker-based mo-cap is that performers have to wear reflective markers and, for best results, a tight-fitting suit,” says Nilles.

Other experts point to the fact that optical marker-based systems require a clear line of sight to the markers, so any occlusion can disrupt a shoot – although some firms claim this issue can be resolved.

Free from bobbles

Markerless motion capture is the main alternative, with Nilles claiming it adds a little more freedom for performers.

“With markerless mo-cap, they can be in full costume or wearing street clothes,” he says. “However, this system is restricted to smaller volumes and lower character counts, and it produces less accurate skeleton solves that are typically marked by noisier data and inaccurate rotational measurements.

“The promise of markerless mo-cap is that eventually users will be able to capture animation from live-action sequences, although the technology still needs to progress quite a bit before that’s a reality.”
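
To make the “noisier data” point concrete, a first pass at cleaning up a jittery joint stream often amounts to something like the exponential smoothing below. This is a toy illustration on invented data, not any vendor’s actual solver.

```python
# Toy illustration of cleaning up a noisy joint-position stream with
# exponential smoothing. Real markerless solvers and clean-up tools are
# far more sophisticated; the data here is invented.

def smooth_positions(frames, alpha=0.3):
    """Smooth a sequence of (x, y, z) joint positions.

    alpha near 0 smooths heavily (more lag); near 1 trusts the raw,
    noisy measurement.
    """
    smoothed = []
    previous = None
    for x, y, z in frames:
        if previous is None:
            previous = (x, y, z)
        else:
            px, py, pz = previous
            previous = (px + alpha * (x - px),
                        py + alpha * (y - py),
                        pz + alpha * (z - pz))
        smoothed.append(previous)
    return smoothed

# Example: a wrist joint that should be still but jitters between frames.
noisy = [(0.02, 1.01, 0.00), (-0.01, 0.99, 0.01), (0.03, 1.02, -0.02)]
print(smooth_positions(noisy))
```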

A solution to these issues is the use of inertial mo-cap, something tech firm Xsens specialises in. Its capture suits have sensors woven into the material or inserted into strategically placed pockets to track an actor’s movements.

“Ease of use, unlimited capture volume and clean data are typical advantages of inertial mo-cap,” says product manager Hein Beute. “The set-up time is however long it takes to place the inertial trackers on the body using a suit or straps, plus a calibration of four seconds.

“Inertial motion capture systems have no limit in mo-cap volume as you can record anywhere, meaning your mo-cap can be done where the action takes place. Snowboard moves can be captured in the natural environment, not simulated in a studio. A performer will act more naturally in an environment that is familiar to him.”
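
The reason inertial capture needs no cameras at all is that orientation is estimated from the sensors themselves, essentially by integrating the gyroscope’s angular velocity over time. The single-axis sketch below is purely illustrative; real systems fuse gyroscope readings with accelerometer and other sensor data to keep the estimate stable.

```python
# Purely illustrative single-axis example: an inertial tracker's
# orientation can be estimated by integrating the gyroscope's angular
# velocity over time, with no cameras involved. Real systems fuse this
# with accelerometer (and other sensor) data to keep the estimate stable.
import math

def integrate_yaw(angular_velocities, dt=0.01, initial_yaw=0.0):
    """Integrate angular velocity (rad/s) about one axis into an angle.

    angular_velocities: gyroscope samples taken every dt seconds.
    Returns the estimated yaw in degrees after each sample.
    """
    yaw = initial_yaw
    angles = []
    for omega in angular_velocities:
        yaw += omega * dt  # simple Euler integration step
        angles.append(math.degrees(yaw))
    return angles

# Example: a sensor turning at a steady 1 rad/s for half a second
# ends up at roughly 28.6 degrees.
print(integrate_yaw([1.0] * 50)[-1])
```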

The obvious disadvantage, Beute observes, is that performers have to wear all of the tech during the shoot. However, given the ongoing trend for technology – particularly wearables – to get smaller as it develops, this is unlikely to remain an issue for too long.

All about that face

Markerless mo-cap also has its advantages when you’re focusing on facial movements, argues Dimensional Imaging CEO Colin Urquhart.

“It is impractical to place and track more than 100 markers on a person’s face, which fundamentally limits the fidelity of marker-based facial motion capture,” he says. “It is also very difficult to place markers on some of the most important areas of the face, such as the lips or very close to the eyes.

“There are no such limitations with a markerless solution such as DI4D, which therefore delivers much higher fidelity data and captures more of the subtlety and nuance of facial performance and expression.”

Such a system also does away with the issue of consistently placing facial markers in the same spot before each shoot, or worrying about the markers falling or getting rubbed off. The result is a quicker set-up time.

Urquhart acknowledges that there are limits to the technology’s quality due to the “current lack of real-time processing”, but suggests it is still preferable to traditional animation techniques.

“As the graphical quality of games continues its relentless progress, it is becoming harder to create the required level of realism of facial animation by hand,” he says. “Fortunately, at the same time, it is increasingly easy to capture facial performance from real-life actors with much higher fidelity.”

When it comes to facial capture systems, Urquhart advises a multi-camera setup as “a single camera is only able to capture 2D data in which depth can only be inferred”. Haskayne agrees, adding that more cameras mean more reference data, and “the more reference, the better”.
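
Urquhart’s point about depth is easiest to see with a rectified stereo pair: once a feature is matched between two cameras a known distance apart, its depth follows directly from the disparity between the two images. The focal length, baseline and disparity in the sketch below are invented for illustration.

```python
# Illustrative only: with a rectified stereo pair, depth follows directly
# from the disparity between the two images:
#     depth = focal_length * baseline / disparity
# The focal length, baseline and disparity below are invented numbers.

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return depth in metres for a feature matched between two cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px

# Example: a facial feature seen 40 px apart by two cameras 6 cm apart,
# with a 1,000 px focal length, sits roughly 1.5 m from the rig.
print(depth_from_disparity(disparity_px=40, focal_length_px=1000, baseline_m=0.06))
```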

Morris, however, says a single camera system can suit some devs’ needs: “It’s simple and makes the head-mounted camera much lighter, which is important when you have an actor wearing one all day. Too much weight causes discomfort and headaches, and you aren’t going to get a really good performance from your actor if they are suffering.”

And this is crucial, as the actors are just as important as the technology tracking them. As the quality of animation increases, poor performances become harder to hide, which is why devs need to think carefully about their casting.

Capture the moment

So which mo-cap solution is right for your studio? There are a number of factors to consider, including the environment you intend to shoot in (indoors or outdoors), the number of actors, the complexity of their actions, whether or not the scene will involve props or virtual cameras, the size of the space available and, of course, the cost.

“Historically, high costs forced devs to make concessions surrounding some of these issues,” says OptiTrack’s Nilles. “High-res cameras offered more nuanced data, but were so expensive that studios had to settle for low camera counts, resulting in excessive data clean-up. Low-res cameras enable higher camera counts for more continuous but lower fidelity data.”

When it comes to capture space, Centroid’s Haskayne advises: “Generally a large volume – with no windows, skylights or natural lights – covered by a high number of cameras will be good for most purposes.”

On facial capture, Urquhart says the choice comes down to a head-mounted camera or a fixed array of cameras.

“Using a HMC gives the actors more freedom to move around, making full performance capture more practical,” he says.

“A fixed array typically makes actor movement much more constrained, but the actor is not encumbered by a helmet system, and the data is typically of better quality.”

Just Cause 3 dev Avalanche Studios uses a mix of optical and markerless mo-cap systems with keyframe animation for in-game content, switching to full performance capture – including facial – for the cutscenes. The animation programmers even work with the animators to plan mo-cap sessions.

Lead animator Alex Crowhurst offers the following advice: “Research the latest tech to find out what meets your project’s expectations and production needs.

“Build a pipeline that is robust and as efficient as possible: the quicker you can get your content from initial capture to engine side, the quicker your team will be able to iterate on feature sets.”
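
In practice, Crowhurst’s “capture to engine” advice often boils down to automating the hop from exported takes to engine-ready assets. The batch sketch below is hypothetical throughout – the folder names, file extensions and conversion step are placeholders for a studio’s own tools.

```python
# Hypothetical "capture to engine" batch step: sweep a folder of exported
# takes and convert anything new into the engine-side format. The folder
# names, file extensions and convert_take() body are placeholders for
# whatever a studio's own tools actually do.
from pathlib import Path

CAPTURE_DIR = Path("captures/exports")   # where exported takes land
ENGINE_DIR = Path("game/animations")     # where the engine expects assets

def convert_take(source: Path, destination: Path) -> None:
    """Placeholder for the real conversion (retarget, clean up, compress)."""
    destination.write_bytes(source.read_bytes())

def process_new_takes() -> int:
    """Convert every exported take that has no engine-side counterpart yet."""
    ENGINE_DIR.mkdir(parents=True, exist_ok=True)
    converted = 0
    for take in sorted(CAPTURE_DIR.glob("*.bvh")):
        target = ENGINE_DIR / (take.stem + ".anim")
        if not target.exists():
            convert_take(take, target)
            converted += 1
    return converted

if __name__ == "__main__":
    print(f"Converted {process_new_takes()} new take(s)")
```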

The quality of motion capture systems will inevitably improve in future, but what are the current limitations that need to be overcome in the meantime?

Moving forward

Nilles points to set-up times: “Motion capture is going to become easier to set up and use. It will become more invisible on set, faster and will require less work on the back end.”

Urquhart adds: “The quality and fidelity of mo-cap will undoubtedly continue to increase with the increasing pixel count of cameras. As a result the quality of the acting performances will become increasingly apparent.”

Xsens’ Beute predicts mo-cap will soon become “more accessible and affordable”.

“Today a motion capture performer is always faced with technology, whether faced with many cameras or wearing inertial trackers,” he says. “Inertial motion capture will get more unobtrusive and smaller, and will even be integrated into clothing.”

Creative Assembly mo-cap manager Peter Clapperton concludes: “I would like to see the future of mo-cap take advantage of combining marker-based and markerless together to create a system that is as fault free as possible. Better computing technology and more intelligent software will eventually enable a system to be relatively faultless during the capture process.

“Philosophically, in a mo-cap utopia, we wouldn’t be battling to build the better system, but working together to build the best system possible.”
