Advanced Animatronics Voice and Jaws v1.0
Flüüfff – 22/11/2019
Floere T. Pillowbeaver, Devourer of Nuclear Submarines floere@robocow.be
What is this Talk About?
An overview of the state of the art of moving jaws and voice projection
Lip-syncing with a puppet mask (manually actuated)
Radula Castion – Zuzu’s White Rabbit https://www.youtube.com/watch?v=b2pDuWh3ik8
– Off-the-shelf parts
– 3D printable
Gustav Hoegen
Wikipedia - Uncanny Valley Conjecture (Mori 1970)
(no lip-syncing or over-dubbing)
– Katey McGregor – Talking Mickey Mouse https://www.youtube.com/watch?v=762-tHwnAHg
– Mascot – Animatronic Mascots https://www.youtube.com/watch?v=Ve3vuxII6Dc
– Lunaspuppets - Human-Size Animatronic Robotic Talking Donkey Puppet https://www.youtube.com/watch?v=Cv5yAfHWEY4
– Bake Me Up Buttercup – How to Measure Flour Correctly https://www.youtube.com/watch?v=YBkT5woqmAY
– Beautyofthe Bass – Speaker Costume Talks Live! V3 https://www.youtube.com/watch?v=UWOWqe1kP7U
– DRAGON =^‿^= - Howwwwwwdy folks and welcome to Monday (Twitter: @GRNdragon0)
– Limited, static articulation (blinks + simple mouth)
– Good voice quality
– Most costumes are actually puppets, controlled by
– Let’s have a look at this…
– Articulated jaws can work (but often don’t)
– Voice is dull in real life
– Speaking with exaggerated jaw motion
– E.g.: Buttercup and NIIC do this well
– Big and very powerful ones for chewing and
– Little, fast ones for speech
– The big ones disengage when speaking
– Under ~0.3 cm pronouncing /ta/ and /te/
– Under ~2.5 cm pronouncing /a/
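As a rough illustration of how displacements of this size might drive a costume jaw, the sketch below maps a measured jaw opening (in cm) to a servo angle with some exaggeration. Only the ~2.5 cm full-scale figure comes from the slide; the function name, the gain of 2 and the 40° of servo travel are assumptions chosen for the example.

```python
def jaw_opening_to_servo_deg(opening_cm, full_scale_cm=2.5, gain=2.0, max_deg=40.0):
    """Map a measured human jaw opening to a costume jaw servo angle.

    full_scale_cm: largest opening expected in speech (~2.5 cm for /a/).
    gain:          assumed exaggeration factor; >1 makes small motions visible.
    max_deg:       assumed mechanical travel of the costume jaw.
    """
    normalized = max(0.0, opening_cm) / full_scale_cm
    exaggerated = min(1.0, normalized * gain)  # clip at full travel
    return exaggerated * max_deg
```

With these assumed values, a 0.3 cm opening (as for /ta/) already moves the jaw by roughly a quarter of its travel, which is the point of exaggerating.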
“Human Jaw Movement in Mastication and Speech”, D.J. Ostry and J.R. Flanagan,
Sensor attached to the chin, just posterior to the mental notch.
Marker 4 cm from lower incisors, ~on the midsagittal plane.
“An Analysis of the Dimensionality of Jaw Motion in Speech”, E. Vatikiotis-Bateson and D.J. Ostry, Journal of Phonetics, Vol. 23, pp. 101-117, 1995
Haskins Laboratories
gosh.nhs.uk
Jörgen Ahlberg – Source-Filter Model of Speech Production
– Many speech sounds
– E.g.: to a lip reader
– A very hard problem
– Key to speech recognition
– Voiced or louder
– Nasal or unvoiced
Wolf Paulus – Viseme Model with 12 Mouth Shapes
– Estimate mouth state
– No actual phoneme
– Don’t need perfection
– Chin motion (slow)
– Measured from jaw
– Includes static poses
– Lip motion (fast)
– Estimated from
– No action when silent
Block diagram labels: Jaw sensor, Lip “sensor”, Speech Analysis, Jaw Servos, Mouth Est.
– Voiced, unvoiced,
– How much energy?
– How nasal is
Donald Derrick – nasalance of na
“A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition,” Bishnu S. Atal, Lawrence R. Rabiner, 1976.
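To make the voiced/unvoiced/silence question concrete, here is a minimal sketch that labels one audio frame using just two features: short-time log energy and zero-crossing rate. It is a simplified stand-in, not the pattern-recognition classifier of the cited paper, and both thresholds are assumed values that would need tuning for a real microphone.

```python
import math

def frame_features(frame):
    """Return (log energy in dB, zero-crossing rate) for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    log_energy = 10.0 * math.log10(energy + 1e-12)  # small offset avoids log(0)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    return log_energy, zcr

def classify_frame(frame, silence_db=-40.0, zcr_unvoiced=0.25):
    """Label a frame 'silence', 'unvoiced' or 'voiced' (thresholds are assumptions)."""
    log_energy, zcr = frame_features(frame)
    if log_energy < silence_db:
        return "silence"
    # Noise-like (unvoiced) sounds change sign far more often than voiced ones.
    return "unvoiced" if zcr > zcr_unvoiced else "voiced"
```

Samples are assumed to be floats in the range -1..1, so the energy is expressed relative to full scale.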
– Lips can be separate or added to jaw motion
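For the case where lip motion is added to jaw motion, a minimal sketch of the mixing step could look like this. It assumes both signals are already normalized to 0..1, and the function name and the 0.3 lip weight are hypothetical; the slides do not give the actual mixing rule.

```python
def mouth_command(jaw_pos, lip_est, is_silent, lip_weight=0.3):
    """Combine a slow jaw-sensor position and a fast lip estimate (both 0..1)
    into one jaw-servo command, for a costume without separate lip servos."""
    lip = 0.0 if is_silent else lip_est  # no lip action when silent
    return min(1.0, max(0.0, jaw_pos + lip_weight * lip))
```

Because the jaw position passes through unchanged during silence, static poses such as a held-open mouth are preserved.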
– Jaw → 1 servo
– Lips → 1 or 2 servos (optional)
– T
– Jaw strap
Eva Taylor – Animatronic Alien
https://makezine.com/2014/10/27/the-making-of-an-animatronic-alien/
http://www.tioh.de/
https://radulacastion.wixsite.com/radulacastion
Rick Lazzarini, Stan Winston School of Character Arts
skud duncan – Animatronic Jaw Test https://www.youtube.com/watch?v=15IVl1VYdSk
Winter Snowmew - “Couple of my followers have been curious about the weird snout. Here is the snarl and mouth mechanics.”
– This is not the point of this project
– Affordability and “bang for the buck” are key
TheCharacterShop – TCSpolarbearWaldo.mov https://www.youtube.com/watch?v=bFW2azvVEdI
Shanetheactor – MetroPCS Commercial https://www.youtube.com/watch?v=udlQ7SH_RtM
Radula Castion – Zuzu’s White Rabbit https://www.youtube.com/watch?v=b2pDuWh3ik8
– Conventions are LOUD!
– Voice acting gives bizarre speech patterns
– Sensors don’t stay put
– Computer vision systems not practical (yet)
– Mouth held open for a long time
– Mouth unmoving while speaking
– Mouth held shut while mumbling
– Smile = mouth a little open for now...
– Loud, even during calm
– Noise is non-stationary
Noise reduction layers (diagram): L1 Cardioid Mic and Ambient Mic, L2 GCCPF, L3 MMSE
* This test recording was actually made with an omnidirectional microphone, so it represents the worst case
“Low Distortion Noise Cancellers – Revival of a Classical Technique,” Akihiko Sugiyama
“Development of speech technologies to support hearing through mobile terminal users,” T. Togawa, T. Otani, K. Suzuki, T. Taniguchi, 2015.
– They will work in most environments
– They will work with most speakers and languages
– They will work with squeakers
– They can get it wrong at times
– Many, many parameters to configure
– These are some of the most robust algorithms out there
– Most of the parameters are fixed for the application
– The remainder tunes easily to a specific costume
– Shifts around too much
– Interferes with speech
– Very comfy
– Quite robust
– Cheap
– Easy to manufacture
– Looks boss!
– Needs an adaptive
Sensor output while saying “mama, papa”
– Aside from the latency? (need >50 fps)
– Contrast with beards, balaclavas; lighting (IR)
– Powerful computer needed
– Readily-available algorithms for facial
– Works very well
– Good accuracy
– Need clear view of
– Complex algorithms
Cara Motion Capture (www.vicon.com)
DisneyResearchHub – Synthetic prior design for real time facial capture https://www.youtube.com/watch?v=w71vxi60SzM
– Filters out all the
– Some overshoot
RoboCow Industries
Audio chain (block diagram labels): Close-Talking Cardioid Microphone, 3L Noise Reduction, Feed-Back Canceller, Parametric Equalizer, Cross-Over, Sound Effects, Amplifier, Tweeter, Mid-Range, Woofer
– Larsen effect
– Why there are few
– Microphone design
– Speaker design
– Feed-back control
microphone and speaker
extra gain
– Cardioid mic + decent speaker design ~20 dB
– T
replicate your voice, at about the same volume. (Or “big creature” volume)
– Not “punk band in a suit”!
– If you can speak loudly, the suit can also be LOUD
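To put the ~20 dB figure in perspective, the standard decibel definitions can be evaluated directly. This is a small worked example; the function names are chosen here for illustration.

```python
def db_to_amplitude_ratio(db):
    """Convert a gain in dB to a sound-pressure (amplitude) ratio."""
    return 10.0 ** (db / 20.0)

def db_to_power_ratio(db):
    """Convert a gain in dB to a power ratio."""
    return 10.0 ** (db / 10.0)
```

So 20 dB of extra gain before feedback corresponds to 10 times the sound pressure, or 100 times the acoustic power.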
“Robust and Efficient Implementation of the PEM–AFROW Algorithm for Acoustic Feedback Cancellation,” G. Rombouts, T. Van Waterschoot,
– But it often sounds bad (kinda incomprehensible)
– This ruins the formant relationships in speech
– A time-domain pitch shifter has to lock to F0 for that
– Help the algorithm and actually voice act!
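Locking to F0 means first estimating the fundamental frequency of each voiced frame. The sketch below shows one common approach, picking the strongest autocorrelation peak within a plausible pitch range. The slides do not say which estimator is used, so this is an illustrative choice; the 70..400 Hz search range is an assumption.

```python
def estimate_f0(frame, sample_rate, f_min=70.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of a voiced frame from the lag
    of the largest autocorrelation value; returns None if no peak is found."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sample_rate / best_lag if best_lag else None
```

The estimate is quantized to whole-sample lags, and real speech usually calls for extra safeguards against octave errors, which this sketch leaves out.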
– REW to the rescue
– With help from own
https://www.roomeqwizard.com/
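One band of a parametric equalizer is commonly realized as a peaking biquad filter. The sketch below computes its coefficients with the widely used Audio EQ Cookbook formulas (R. Bristow-Johnson) and evaluates the resulting gain at a given frequency. The slides do not state which filter design is used in the costume, so treat this as an assumed, typical implementation.

```python
import cmath
import math

def peaking_eq_coeffs(f0, gain_db, q, sample_rate):
    """Return normalized biquad coefficients (b0, b1, b2, a1, a2) of a peaking EQ
    centred on f0 (Hz) with the given gain (dB) and quality factor."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha / a_lin  # used to normalize all other coefficients
    b0 = (1.0 + alpha * a_lin) / a0
    b1 = -2.0 * math.cos(w0) / a0
    b2 = (1.0 - alpha * a_lin) / a0
    a1 = -2.0 * math.cos(w0) / a0
    a2 = (1.0 - alpha / a_lin) / a0
    return b0, b1, b2, a1, a2

def gain_db_at(coeffs, freq, sample_rate):
    """Magnitude response (dB) of the biquad at one frequency."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq / sample_rate)  # z^-1 on the unit circle
    h = (b0 + b1 * z1 + b2 * z1 * z1) / (1.0 + a1 * z1 + a2 * z1 * z1)
    return 20.0 * math.log10(abs(h))
```

For example, a measured resonance at 1 kHz could be tamed with `peaking_eq_coeffs(1000, -6, 2, 48000)`, which cuts 6 dB at the centre frequency while leaving frequencies far from it essentially untouched.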
mouth for realism
speaker in the mouth
– High frequencies do most for sound localization
– Tweeter/mid in the nose
– Tweeters are small!
some place else (e.g.: cheeks, forehead, chin, chest, shoulders)
(no directionality)
3-D Audio & Applied Acoustics Lab Princeton
– Avoid comb filtering
Elliot Sound Products
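Comb filtering appears when the same signal reaches the listener from two drivers over paths of different length. Under the idealized assumption of two equal-level copies, the notch frequencies follow directly from the path difference, as this small sketch shows. The function name and the 343 m/s speed of sound (air at about 20 °C) are assumptions of the example.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at about 20 °C

def comb_notch_frequencies(path_difference_m, count=3):
    """First `count` notch frequencies (Hz) for two equal-level copies of a
    signal arriving with the given path-length difference."""
    delay_s = path_difference_m / SPEED_OF_SOUND_M_S
    # Cancellation occurs where the delay equals an odd number of half periods.
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]
```

A 10 cm path difference puts the first notch near 1.7 kHz, well inside the speech band, which is why overlapping drivers should sit close together or cover separate frequency ranges.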