SLIDE 1
Designing Interactive Systems that Embrace Uncertainty
Keith Vertanen, http://keithv.com, vertanen@mtu.edu
SLIDE 2
SLIDE 3
SLIDE 4
SLIDE 5
SLIDE 6
SLIDE 7
SLIDE 8
SLIDE 9
[Figure: Keith's Thesis-o-meter. Number of words (10000 to 80000) by year/month (07/04 to 09/04). Labels: Words, BGS limit, Submission.]
SLIDE 10
SLIDE 11
My research
- Research areas
– Speech interfaces
– Mobile interfaces
– Assistive technologies
– Text entry
– Crowdsourcing
Design, build, and evaluate intelligent interactive systems that leverage uncertain input technologies with a focus on enhancing the capabilities of users with permanent or situationally-induced disabilities.
SLIDE 12
Speech interfaces
I must go down to the see again, to the lonely sea and the sky, and all I ask is ah a tall ship and star to steer her by.
Confidence visualization for error detection
Highlight likely errors using word confidence scores.
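A minimal sketch of the highlighting idea, assuming the recognizer returns a confidence score per word. The function name, the 0.5 threshold, the bracket markup, and the scores are all made up for illustration; they are not taken from the actual system.

```python
def highlight_low_confidence(words, threshold=0.5):
    """Wrap words whose confidence is below the threshold in [brackets]."""
    marked = []
    for word, confidence in words:
        marked.append(f"[{word}]" if confidence < threshold else word)
    return " ".join(marked)

# Hypothetical recognizer output: (word, confidence) pairs.
result = [("I", 0.98), ("must", 0.95), ("go", 0.97), ("down", 0.96),
          ("to", 0.93), ("the", 0.90), ("see", 0.31), ("again", 0.88)]
print(highlight_low_confidence(result))
```

A real interface would more likely shade or underline the word than bracket it; the point is only that the display decision is driven by the per-word score.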
One-step error correction
Infer location and content of correction using utterance + optional approximate location info.
"I must go down to the seas S-E-A-S again..."
Spelling-based error avoidance
Spell difficult words during corrections or preemptively during initial dictation.
SLIDE 13
Parakeet: touchscreen error correction
Uses a correction interface built from the word confusion network derived from the speech recognition result.
Speech Dasher: eye-tracking error correction
Zoom through the speech recognition hypothesis space to confirm and correct the result.
Gesture keyboard error correction / avoidance
Speak a sentence, then provide gestures for: all words, only recognition errors, or words deemed difficult
SLIDE 14
Assistive technologies
Voice output AAC device
"Hello my name is Keith"
Augmentative and Alternative Communication (AAC)
Predictive AAC iPad app
I want ju
Speaking users: ~150 wpm (words-per-minute), AAC users: often < 10 wpm
- No corpora of actual AAC user communications
- Models trained on telephone transcripts or newswire data
➢ Shared research resources for conversational AAC
SLIDE 15
Some invented communications
Is the dog friendly?
I need to start making a shopping list soon.
What I would really like right now is a plate of fruit.
Who will drive me to the doctor's office tomorrow?
- Trained in-domain language model on invented communications
- Selected data from larger data sets: Twitter, blog, Usenet
- Results: perplexity ↓82%, keystroke savings ↑11%
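For readers unfamiliar with the two metrics, here is a sketch using their common textbook definitions. The slides do not say exactly how either number was computed, so treat the formulas and function names as assumptions.

```python
import math

def perplexity(word_probs):
    """Perplexity of a text given the model probability of each word."""
    log_sum = sum(math.log2(p) for p in word_probs)
    return 2 ** (-log_sum / len(word_probs))

def keystroke_savings(chars_in_text, keys_pressed):
    """Percentage of keystrokes saved relative to typing every character."""
    return 100.0 * (1 - keys_pressed / chars_in_text)

# A model that gives every word probability 1/4 has perplexity 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
# Entering a 20-character message with 10 key presses saves half the keystrokes.
print(keystroke_savings(chars_in_text=20, keys_pressed=10))
```

Lower perplexity means the model finds the test text less surprising; higher keystroke savings means predictions replace more typing.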
iSCAN: Predictive phoneme-based AAC
16 participants: entry rate ↑108%, error rate ↓79%
AAC user: better than device used for 4 years
SLIDE 16
Dwell-clicking on a keyboard: 6 wpm (words-per-minute)
Writing via Dasher navigation: 14 wpm
Dwell-free via eye gestures, performance potential: 46 wpm
AAC user input: often low bandwidth and noisy
➢ Maximize output for each input bit
SLIDE 17
VelociTap: Project goals
- Encourage users to go fast
– Avoid delays due to monitoring intermediate results
– Avoid overly precise tapping
➢ Sentence-at-a-time entry
- Test on many users
– No training of user-specific model
– No learning of new keyboard or entry technique
➢ Tapping on familiar QWERTY layout
- One-handed or small keyboard use
➢ Tapping with a single finger
SLIDE 18
From taps to text
Given a noisy tap sequence:
Guess the user's intended text:
have a good day    0.06
have a food day    0.01
have a fod day     0.004
have a god day     0.0006
...
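One common way to produce a ranked list like this is to score each candidate text by a language model probability times a touch likelihood. The sketch below assumes that framing; the function name and the log scores are invented for illustration and are not the values behind the numbers on the slide.

```python
import math

def rank_hypotheses(candidates):
    """Rank candidate texts by P(text) * P(taps | text).

    candidates maps text -> (log language model prob, log touch likelihood).
    Returns (text, joint probability) pairs, most likely first.
    """
    scored = [(text, math.exp(lm + touch)) for text, (lm, touch) in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical log scores, invented for illustration only.
candidates = {
    "have a food day": (-11.0, -5.0),
    "have a good day": (-8.0, -6.0),
    "have a god day": (-13.0, -6.5),
}
ranked = rank_hypotheses(candidates)
for text, prob in ranked:
    print(f"{text}  {prob:.2e}")
```

Note that "have a food day" has the better touch score here, yet "have a good day" ranks first because the language model prefers it.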
SLIDE 19
VelociTap: Touch modeling
2D Gaussians centered at each key.
Separate variances in the x- and y-dimensions.
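A sketch of this touch model, assuming an axis-aligned Gaussian per key (no correlation between x and y). The key coordinates and standard deviations are placeholders, not values from VelociTap.

```python
import math

def log_tap_likelihood(tap, key_center, sigma_x, sigma_y):
    """Log density of a tap under an axis-aligned 2D Gaussian centered on a key."""
    dx = (tap[0] - key_center[0]) / sigma_x
    dy = (tap[1] - key_center[1]) / sigma_y
    return -0.5 * (dx * dx + dy * dy) - math.log(2 * math.pi * sigma_x * sigma_y)

# Hypothetical key centers, in key-width units.
KEYS = {"d": (2.5, 1.0), "f": (3.5, 1.0), "g": (4.5, 1.0)}
tap = (4.2, 1.1)
scores = {key: log_tap_likelihood(tap, center, sigma_x=0.4, sigma_y=0.5)
          for key, center in KEYS.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Keeping `sigma_x` and `sigma_y` separate lets the model tolerate more spread in one direction than the other.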
SLIDE 20
VelociTap: Language modeling
- Language models:
– 12-gram letter model
– 4-gram word model with unknown word
– Trained on billions of words of data
    - Twitter, blog, social media, Usenet, and web data
– Optimized for short email-like messages
– Letter and word model: ~4 GB memory
SLIDE 21
VelociTap: Decoder
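[Figure: decoder search over Observation 1, Observation 2, and Observation 3. Each observation branches into candidate characters (for example f, g, c, ..., z) and ϵ; pruned branches are marked X. Example hypotheses: good, god, go.]
Tokens track: probability, LM context
Prune unlikely paths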
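The token-passing idea can be sketched as a small beam search. This is a simplified illustration, not the VelociTap decoder: it assumes exactly one character per tap (no ϵ branches), uses a single variance for both axes, and replaces the real language models with a toy function. All names, the layout, and the numbers are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Token:
    text: str        # characters decoded so far (doubles as the LM context)
    log_prob: float  # accumulated log probability of this path

def decode(taps, key_centers, char_lm, sigma=0.5, beam_width=3):
    """Beam search: extend every token with every key, then prune."""
    beam = [Token("", 0.0)]
    for tap_x, tap_y in taps:
        expanded = []
        for token in beam:
            for char, (key_x, key_y) in key_centers.items():
                # Touch score: isotropic Gaussian, constant term dropped.
                dist_sq = (tap_x - key_x) ** 2 + (tap_y - key_y) ** 2
                touch = -0.5 * dist_sq / sigma ** 2
                lm = char_lm(token.text, char)
                expanded.append(Token(token.text + char, token.log_prob + touch + lm))
        # Prune unlikely paths: keep only the best few tokens.
        expanded.sort(key=lambda t: t.log_prob, reverse=True)
        beam = expanded[:beam_width]
    return beam

# Hypothetical one-row layout, in key-width units.
KEYS = {"d": (-2.0, 0.0), "f": (-1.0, 0.0), "g": (0.0, 0.0),
        "i": (3.0, 0.0), "o": (4.0, 0.0)}

def toy_lm(context, char):
    """Toy stand-in for a letter model: favors spelling 'good'."""
    target = "good"
    i = len(context)
    if i < len(target) and context == target[:i] and char == target[i]:
        return math.log(0.6)
    return math.log(0.1)

taps = [(-0.4, 0.0), (3.6, 0.1), (3.5, 0.0), (-1.6, 0.1)]
final_beam = decode(taps, KEYS, toy_lm)
best = final_beam[0]
print(best.text, best.log_prob)
```

The third tap lands exactly between "i" and "o", so the touch model alone cannot decide; the language model context carried by each token breaks the tie.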
SLIDE 22
The Invisible Keyboard
SLIDE 23
SLIDE 24
First find a transform that produces the best probability using a greedy character-at-a-time decoding scheme.
Tap sequence scaled horizontally and slightly translated and rotated.
Full search considering many possible character sequences.
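A rough sketch of the first step, under simplifying assumptions: candidate transforms come from a small hand-picked grid, and the "greedy" score is the negative squared distance from each transformed tap to its nearest key, standing in for the decoder's probability. The function names, grid values, and layout are hypothetical.

```python
import math

def transform(taps, scale_x=1.0, dx=0.0, dy=0.0, angle=0.0):
    """Scale horizontally, rotate about the origin, then translate each tap."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = []
    for x, y in taps:
        x *= scale_x
        out.append((x * cos_a - y * sin_a + dx, x * sin_a + y * cos_a + dy))
    return out

def greedy_score(taps, key_centers):
    """Greedy character-at-a-time score: each tap takes its nearest key."""
    total = 0.0
    for x, y in taps:
        total -= min((x - kx) ** 2 + (y - ky) ** 2 for kx, ky in key_centers.values())
    return total

def best_transform(taps, key_centers, scales, shifts):
    """Grid search over candidate horizontal scales and shifts."""
    candidates = [(s, dx) for s in scales for dx in shifts]
    return max(candidates,
               key=lambda c: greedy_score(transform(taps, scale_x=c[0], dx=c[1]),
                                          key_centers))

# Hypothetical one-row layout, in key-width units.
KEYS = {"a": (0.0, 0.0), "s": (1.0, 0.0), "d": (2.0, 0.0), "f": (3.0, 0.0)}
# "asdf" tapped at half width and shifted one key to the right.
taps = [(1.0, 0.0), (1.5, 0.0), (2.0, 0.0), (2.5, 0.0)]
scale, shift = best_transform(taps, KEYS, scales=[1.0, 1.5, 2.0], shifts=[-2.0, -1.0, 0.0])
print(scale, shift)
```

Once a transform is chosen, the transformed taps would be handed to the full search over character sequences.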
SLIDE 25
Future work: text entry
- Improve models: more data
– Text Blaster: multi-player texting game
– Eyes-free text adventure game
- Correction interfaces
- Improve user signal
– Audio / tactile feedback
– Real-time uncertainty feedback
– Error avoidance for difficult words
- Other use scenarios
– Searching links on a web page
– Input via mid-air gestures
– Input on an actual smartwatch
SLIDE 26
Future work: speech interfaces
- One-step voice correction
– Detecting hyperarticulate speech
– Evaluate complete system, for:
    - Improved desktop dictation
    - Eyes-free mobile dictation
- Other domains
– Command-and-control
    - Hands-busy or no-device use (instrumented environments)
– Ambient speech recognition
    - Inform future searches, push relevant content
    - Inform predictive AAC
SLIDE 27
Future work: assistive technologies
- Improve AAC language models: more data
– Develop chat-like game, played by AAC and non-AAC users
– Validate on AAC users / transcripts
- Dwell-free eye-writing
– Evaluate with a recognition-based approach
- Context in predictive AAC
– Gleaned via sensors
– Explicit partner suggestions
SLIDE 28
Conclusions
- Want to know more?
– keithv.com -> Papers, videos
– Or stop by my office
- Opportunities for undergrads
– Good programming skills:
    - Java + Android development
    - Socket programming
    - Web development
– Good people skills:
    - Recruiting participants
    - Running studies
Haythem Memmi, Microsoft
Justin Emge, Google