Performance-Driven Facial Animation - PowerPoint PPT Presentation

SLIDE 1

Performance-Driven Facial Animation

SLIDE 2

Performance-based Facial Animation

✦ Creating an animation of a realistic and expressive human face is one of the greatest challenges in computer graphics.

✦ The human face is an extremely complex biomechanical system that is very difficult to model.

✦ While some animators may be able to produce realistic facial animation, the consistent production of large amounts of flawless animation is not practical.

✦ Simply mimicking the desired expression is far faster, easier, and more natural than adjusting dozens of sliders.

SLIDE 3

Applications in Entertainment

✦ In 2000, LifeFx released a ground-breaking short animation, “Young at Heart” – marking perhaps the first time that a CG actor actually fooled a few people.

SLIDE 4

Motion Capture + Keyframes

✦ Both Final Fantasy: The Spirits Within (2001) and an early test for Shrek chose to use motion capture for the body animation but manual animation for the face.

✦ Although Gollum in the Lord of the Rings trilogy was done with traditional keyframe animation, it was heavily guided by reference video of the actor Andy Serkis, who “played” Gollum.

SLIDE 5

Motion Capture in Special Effects

✦ The Matrix sequels used performance-driven virtual actors in a number of special effects shots.

✦ They computed optical flow in each of 5 HD cameras, and applied stereo to reconstruct the 3D motion of the face mesh.

✦ The Polar Express in 2004 was the breakthrough movie for performance-driven facial animation.

✦ Avatar in 2009 faithfully captured facial expressions using a new “head rig” technology.

SLIDE 6

The Uncanny Valley

✦ Current technology is suitable for animated films where the characters have a somewhat “cartoony” feel.

✦ Some CG humans have been described as “creepy”, “the living dead”.

SLIDE 7

Face Tracking

✦ Stereo.
✦ Feature-based tracking.
✦ Appearance-based tracking.
✦ Model-based tracking.

SLIDE 8

Face Retargeting

✦ Often the digital character that needs to be animated is not a digital replica of the performer.

✦ The process of adapting the recorded performance to the target character is called motion retargeting or cross-mapping.

SLIDE 9

Parameterization

✦ The important issue for retargeting is the choice of facial model parameterization (or “rig”).

✦ A rig provides a parameterization of the facial expressions of a digital face.

✦ It describes facial expressions with a small number of parameters and limits the range of expressions to the allowed range of these parameters.

✦ There are many different approaches to parameterization of a digital face: blendshape, PCA, or raw mesh.

SLIDE 10

Blendshape Parameterization

✦ The blendshapes provide a linear parameterization of the face deformations.

✦ The space of potential faces is the linear space spanned by the blendshapes (or a portion of this space if the weights are bounded).

✦ Retargeting is reduced to estimating a set of blending weights for the target face at each frame of the source animation.

✦ Typical types of blendshapes include: whole-face blendshapes, delta blendshapes, or local blendshapes.
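As a minimal sketch of the delta-blendshape idea (all meshes, sizes, and names here are illustrative, not from the slides), the deformed face is the neutral mesh plus a weighted sum of per-target displacement vectors:

```python
import numpy as np

def evaluate_blendshapes(neutral, deltas, weights):
    """neutral: (V, 3) rest mesh; deltas: (K, V, 3) per-target offsets;
    weights: (K,) blending weights, clamped so the expression space is bounded."""
    w = np.clip(weights, 0.0, 1.0)           # bounded weights limit the range of expressions
    return neutral + np.tensordot(w, deltas, axes=1)

# Toy example: one vertex, two blendshape targets.
neutral = np.zeros((1, 3))
deltas = np.array([[[1.0, 0.0, 0.0]],        # target 1: move vertex along x
                   [[0.0, 2.0, 0.0]]])       # target 2: move vertex along y
mesh = evaluate_blendshapes(neutral, deltas, np.array([0.5, 0.25]))
# mesh[0] is [0.5, 0.5, 0.0]
```

Retargeting then amounts to producing the weight vector per frame; the mesh evaluation itself stays this simple.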

SLIDE 11

PCA Parameterization

✦ Principal component analysis (PCA) of motion capture data automatically produces a blendshape model.

✦ The advantages of a PCA model are that it accurately represents the data and that it is obtained automatically.

✦ A disadvantage is that the model is quite poorly suited for artist manipulation, because the individual targets resulting from PCA tend to have widespread effects that are difficult to describe.

✦ Also, motion capture of the target is often not available by definition.

SLIDE 12

What is PCA?

✦ Principal Component Analysis uses an orthogonal transformation to convert a set of data points into a set of values of linearly uncorrelated variables called principal components.

✦ The first principal component has the largest possible variance, and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to the preceding components.

✦ Can be computed via Singular Value Decomposition:

M = UΣVᵀ
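A small NumPy illustration of PCA via the SVD (the data here is random and purely illustrative): center the data, take the SVD, and read the principal components off the right singular vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # rows are samples, columns are variables
Xc = X - X.mean(axis=0)                  # centering is essential before PCA
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                          # rows: principal directions
explained_variance = S**2 / (len(X) - 1)

# Singular values come out in decreasing order, so the first component
# captures the largest variance, as the slide states.
assert np.all(np.diff(explained_variance) <= 0)
# The components form an orthonormal basis (the transformation is orthogonal).
assert np.allclose(components @ components.T, np.eye(5), atol=1e-8)
```

Projecting `Xc @ components.T` gives the uncorrelated principal-component scores.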

SLIDE 13

Mapping

✦ The retargeting problem can be posed as a function estimation problem where the goal is to create a mapping that produces a target expression for each source expression.

✦ In the case where the target face is parameterized by a rig, the function maps source expressions into target parameters.

✦ An important issue for the source face is to determine which components of the source animation are affecting the target face.

✦ Linear mapping is the simplest choice, but might cause undesired artifacts.
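A hedged sketch of the linear mapping mentioned above (all sizes and names are illustrative): fit a matrix A by least squares from example source/target parameter pairs, then retarget a new source expression with a single matrix multiply.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(20, 4))             # 20 example source expressions (4 parameters each)
A_true = rng.normal(size=(4, 3))         # ground-truth map, used only to fabricate targets
T = S @ A_true                           # corresponding target rig parameters (3 each)

# Least-squares fit of the linear map: find A minimizing ||S @ A - T||.
A, *_ = np.linalg.lstsq(S, T, rcond=None)

new_source = rng.normal(size=(4,))
target_params = new_source @ A           # retarget an unseen source expression
assert np.allclose(A, A_true)            # exact here because the toy data is noise-free
```

With real data the fit is approximate, and a purely linear map is what can produce the undesired artifacts the slide mentions.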

SLIDE 14

Scattered Data Interpolation

✦ A function is estimated that maps the source parameter space onto the target parameter space.

✦ Kernel-based techniques such as Radial Basis Function (RBF) interpolation are widely used for nonlinear mapping.

✦ Partitioning of the target space via a Delaunay triangulation is one way to solve the problem of scattered data interpolation.

SLIDE 15

What is a Radial Basis Function?

✦ A radial basis function (RBF) is a real-valued function whose value depends only on the distance from the origin.

✦ For example, a spherical Gaussian function can be an RBF.

✦ Radial basis functions are typically used to build up function approximations of the form:

y(x) = Σᵢ₌₁ᴺ wᵢ φ(‖x − xᵢ‖)
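The formula above can be sketched in a few lines of NumPy (centers and values here are toy data): solve a small linear system for the weights wᵢ so the interpolant reproduces the example values exactly, then evaluate y(x) as the weighted sum of Gaussian basis functions.

```python
import numpy as np

def phi(r, eps=1.0):
    return np.exp(-(eps * r) ** 2)           # spherical Gaussian radial basis function

centers = np.array([[0.0], [1.0], [2.0]])    # x_i: the scattered sample locations
values = np.array([0.0, 1.0, 0.0])           # known function values at those locations

# Interpolation matrix Phi[i, j] = phi(||x_i - x_j||); for distinct centers the
# Gaussian kernel makes it symmetric positive definite, so the solve is safe.
dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
w = np.linalg.solve(phi(dist), values)       # weights w_i

def rbf_eval(x):
    r = np.linalg.norm(x - centers, axis=-1)
    return w @ phi(r)                        # y(x) = sum_i w_i * phi(||x - x_i||)

# The interpolant passes through every training sample.
assert np.allclose([rbf_eval(c) for c in centers], values)
```

For retargeting, the centers would be example source expressions and the values the corresponding target rig parameters.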

SLIDE 16

Art Direction

✦ The need for user input is clear when the source and the target character are very different.

✦ The scattered data interpolation framework supports user input quite naturally, since the correspondences between source and target expressions can be determined by a user.

✦ User input could be critical for animating a hero character in a feature film, but not necessary for animating chatroom avatars.

SLIDE 17

State of the Art

Digital Ira Project

SLIDE 18

Depth Sensors

✦ Microsoft Kinect provides consumer-grade depth sensors for facial animation applications.

✦ A new research topic in facial animation focuses on real-time retargeting applications.

✦ Realtime Performance-Based Facial Animation, Weise et al., Siggraph 2011.

SLIDE 19

Realtime Facial Retargeting

✦ A system that enables any user to control the facial expressions of a digital avatar in realtime using a cheap depth sensor.

✦ A face tracking algorithm that combines geometry and texture registration with pre-recorded animation priors in a single optimization.

✦ The technique emphasizes usability, performance, and robustness.

SLIDE 20

Overview

✦ Traditional facial animation solves the tracking and retargeting problems separately.

✦ The proposed technique combines these two problems in one single optimization.

SLIDE 21

Blendshape Representation

✦ Facial expressions are represented as a weighted sum of blendshape meshes.

✦ A blendshape model provides a compact representation of the facial expression space, significantly reducing the dimensionality of the optimization problem.

✦ Can reuse existing blendshape animations, which are ubiquitous in movie and game production.

✦ The output is a temporal sequence of blendshape weights, which can be directly imported into commercial animation tools.

SLIDE 22

Acquisition Hardware

✦ All input data is acquired using the Kinect system.

✦ The Kinect supports simultaneous capture of a 2D color image and a 3D depth map, based on invisible infrared projection.

✦ Data quality is much lower than state-of-the-art performance capture systems based on markers and/or active lighting.

SLIDE 23

Offline Model Building

✦ Given the user’s expressions captured offline, create a set of user-specific blendshapes by adapting the generic blendshapes.

✦ A pre-defined sequence of example expressions performed by the user is recorded by the Kinect sensor.

SLIDE 24

Online Tracking

✦ Decouple the rigid from the non-rigid motion.

✦ Directly estimate the rigid transform of the user’s face before performing the optimization of blendshape weights.

✦ For rigid tracking, align the reconstructed mesh of the previous frame with the acquired depth map of the current frame using ICP.

✦ For non-rigid tracking, estimate the blendshape weights that capture the dynamics of the facial expression of the recorded user.
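The core of one ICP iteration for the rigid-tracking step can be sketched as follows: given already-matched point pairs, the optimal rigid transform has a closed form via SVD (the Kabsch algorithm). A full ICP loop would re-establish correspondences against the depth map and repeat until convergence; this is a generic sketch, not the paper's implementation.

```python
import numpy as np

def best_rigid_transform(P, Q):
    """P, Q: (N, 3) corresponding points. Returns (R, t) minimizing ||R p_i + t - q_i||."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                # 3x3 cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t

# Toy check: rotate a point cloud 90 degrees about z and translate it.
rng = np.random.default_rng(2)
P = rng.normal(size=(50, 3))
R_true = np.array([[0.0, -1.0, 0.0],
                   [1.0,  0.0, 0.0],
                   [0.0,  0.0, 1.0]])
t_true = np.array([0.1, -0.2, 0.3])
Q = P @ R_true.T + t_true                    # q_i = R_true @ p_i + t_true
R, t = best_rigid_transform(P, Q)
assert np.allclose(R, R_true) and np.allclose(t, t_true)
```

In the tracking pipeline, P would be the previous frame's reconstructed mesh vertices and Q their closest points in the current depth map.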

SLIDE 25

Statistical Model

✦ Let D = (G, I) be the input data at the current frame, consisting of a depth map G and a color image I. Infer from D the most probable blendshape weights x for the current frame given the sequence Xn of the n previously reconstructed blendshape vectors.

✦ Formulate this inference problem as a maximum a posteriori (MAP) estimation:

x* = argmax p(x | D, Xn)

✦ Using Bayes’ rule, the posterior factors into a likelihood term and a prior term:

p(x | D, Xn) ∝ p(D | x) p(x | Xn)

SLIDE 26

Prior Distribution

✦ The prior term is modeled as a mixture of Probabilistic Principal Component Analyzers (MPPCA).

✦ It’s a mixture of Gaussian models with low-dimensional covariance matrices.
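A rough sketch of what evaluating such a prior involves: each mixture component is a Gaussian whose covariance has the PPCA form W Wᵀ + σ²I (low-rank plus isotropic noise). The parameters below are random placeholders, not a trained model.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    d = len(x)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ np.linalg.solve(cov, diff))

def mppca_logprior(x, means, Ws, sigmas, mix):
    """Log p(x) for a mixture of PPCA components; `mix` weights sum to 1."""
    logs = []
    for mean, W, sigma, pi in zip(means, Ws, sigmas, mix):
        cov = W @ W.T + sigma**2 * np.eye(len(x))   # low-rank + isotropic noise covariance
        logs.append(np.log(pi) + gaussian_logpdf(x, mean, cov))
    m = max(logs)
    return m + np.log(sum(np.exp(l - m) for l in logs))  # stable log-sum-exp

rng = np.random.default_rng(3)
x = rng.normal(size=4)                        # a candidate blendshape weight vector
means = [np.zeros(4), np.ones(4)]
Ws = [rng.normal(size=(4, 2)) for _ in range(2)]  # rank-2 factors: low-dimensional covariance
lp = mppca_logprior(x, means, Ws, sigmas=[0.5, 0.5], mix=[0.4, 0.6])
assert np.isfinite(lp)
```

In practice the W factors, noise levels, and mixture weights would be fitted to the pre-recorded animation priors, typically with an EM algorithm.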

SLIDE 27

Likelihood Distribution

✦ By assuming conditional independence, the likelihood distribution is modeled as the product of two Gaussians: p(D|x) = p(G|x)p(I|x).

✦ Let B be the blendshape matrix. Each column of B defines a blendshape base mesh such that Bx generates the blendshape representation of the current pose.

✦ Denote vi = (Bx)i as the i-th vertex of the reconstructed mesh.

SLIDE 28

Optimization

✦ The MAP problem can be solved by minimizing the negative logarithm of the MAP objective.

✦ Since the gradients can be computed, the problem can be solved efficiently by an iterative gradient solver.
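A toy version of this negative-log minimization, simplified to a single Gaussian prior so the energy is quadratic (the actual system uses the MPPCA prior and a more capable iterative gradient solver; all sizes here are illustrative):

```python
import numpy as np

# Energy: E(x) = ||B x - g||^2 / (2 s_lik^2) + ||x - mu||^2 / (2 s_pri^2),
# i.e. negative log of a Gaussian likelihood times a Gaussian prior.
rng = np.random.default_rng(4)
B = rng.normal(size=(30, 5))             # toy blendshape matrix (30 constraints, 5 weights)
g = rng.normal(size=30)                  # observed geometry constraints
mu = np.zeros(5)                         # prior mean (e.g. from previous frames)
s_lik, s_pri = 1.0, 0.5

def grad(x):
    return B.T @ (B @ x - g) / s_lik**2 + (x - mu) / s_pri**2

x = mu.copy()
for _ in range(500):                     # plain gradient descent with a small fixed step
    x -= 0.01 * grad(x)

# Closed-form minimizer of the quadratic energy, for comparison.
A = B.T @ B / s_lik**2 + np.eye(5) / s_pri**2
x_star = np.linalg.solve(A, B.T @ g / s_lik**2 + mu / s_pri**2)
assert np.allclose(x, x_star, atol=1e-6)
```

The recovered x would be the per-frame blendshape weight vector; the quadratic case has a direct solve, but with the MPPCA prior the energy is non-quadratic and an iterative gradient solver is needed.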