Xavier Snelgrove, CTO & Co-Founder, Whirlscape @wxswxs March 2017
Me, me, me!
Minuum: http://minuum.com
Dango: http://getdango.com
With Dango
100s of Millions of Examples
Hi prince 👒 Never mind. I forgot I’m single 😓😪 that's what I like to hear 😈❤ Highway driving in the morning 🌆👍 happy bro bro it was cool chilling with you in line for Travis gotta catch another show turn up one time🙐😝
GPUs crunch away for days
Trained Model
Let’s eat lunch later
🍵😌🍞
Emoji in semantic-space
How can we run this on device?
Let’s eat lunch later
Word Embedding → Recurrent Layers → Dense Output Layers
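The three-stage pipeline can be sketched as a bare-bones forward pass. This is an illustrative reconstruction, not Dango's actual model: the sizes (100k-word vocabulary, 512-d embeddings, 768-d state, 1,000 emoji outputs) follow the slides, but a plain tanh RNN stands in for whatever gated units the real model uses, and the weights here are random.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes from the talk; real model details are assumptions.
VOCAB, EMB, HIDDEN, N_EMOJI = 100_000, 512, 768, 1_000

E = rng.standard_normal((VOCAB, EMB), dtype=np.float32)            # word embedding
W_x = rng.standard_normal((EMB, HIDDEN), dtype=np.float32) * 0.01
W_h = rng.standard_normal((HIDDEN, HIDDEN), dtype=np.float32) * 0.01
W_out = rng.standard_normal((HIDDEN, N_EMOJI), dtype=np.float32) * 0.01

def predict_emoji(token_ids):
    h = np.zeros(HIDDEN, dtype=np.float32)
    for t in token_ids:                  # recurrent layers: fold in one word at a time
        h = np.tanh(E[t] @ W_x + h @ W_h)
    logits = h @ W_out                   # dense output layer
    p = np.exp(logits - logits.max())
    return p / p.sum()                   # softmax: distribution over emoji

probs = predict_emoji([17, 4093, 88])    # made-up token ids for a short message
print(probs.shape)                       # (1000,)
```

Note that the embedding table `E` alone is 100,000 × 512 float32 values, which is exactly the memory problem the next slides attack.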
Embedding memory
the, and, cat, yesterday, eggplant, …, alchemist, missspellling
100,000 words × 512 dimensions
Embedding memory
100,000 × 512 × 4 bytes = 200 MB
Quantize to 3 bits per value: 20 MB
Store the table on disk in SQLite
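A minimal sketch of the SQLite idea: keep embedding rows on disk and pull only the handful of words in the current message into RAM. The schema and helper names (`put`, `get`) are made up for illustration, not Dango's actual format.

```python
import sqlite3
import numpy as np

dim = 512
conn = sqlite3.connect(":memory:")       # a real app would open a file on disk
conn.execute("CREATE TABLE embedding (word TEXT PRIMARY KEY, vec BLOB)")

def put(word, vec):
    # Store one embedding row as a raw float32 blob keyed by the word.
    conn.execute("INSERT INTO embedding VALUES (?, ?)",
                 (word, vec.astype(np.float32).tobytes()))

def get(word):
    # Fetch just this word's row; the rest of the table stays on disk.
    row = conn.execute("SELECT vec FROM embedding WHERE word = ?",
                       (word,)).fetchone()
    return np.frombuffer(row[0], dtype=np.float32)

put("lunch", np.ones(dim))
v = get("lunch")
print(v.shape)    # (512,)
```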
Distribution of embedding values
Huffman coding? Depends on quantization
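A sketch of the 3-bit idea with uniform quantization: snap every embedding value to one of 2³ = 8 levels spanning the value range, store only the 3-bit codes. A small demo slice stands in for the full table; the size arithmetic uses the slides' full dimensions. Non-uniform level spacing, or Huffman-coding the codes, could shrink this further depending on how skewed the code distribution is.

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(scale=0.1, size=(1_000, 512)).astype(np.float32)  # small demo slice

lo, hi = float(E.min()), float(E.max())
scale = (hi - lo) / 7                                # 8 evenly spaced levels
codes = np.round((E - lo) / scale).astype(np.uint8)  # each value -> code in 0..7
E_hat = lo + codes * scale                           # dequantized approximation
err = np.abs(E - E_hat).max()                        # at most half a level apart

full = 100_000 * 512 * 4 / 1e6       # full table at float32: ~200 MB
tiny = 100_000 * 512 * 3 / 8 / 1e6   # same table at 3 bits/value: ~20 MB
print(f"{full:.0f} MB -> {tiny:.0f} MB")
```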
Let’s eat lunch later
Word Embedding → Recurrent Layers → Dense Output Layers
Recurrent Layer Memory
[Diagram: input vector and previous state combine (+) into the next state and an output vector]
Recurrent Layer Memory
768 × 768 weight matrices
768 × 768 × 4 bytes × 3 × 2 layers = 14 MB
Quantize to float16 (2 bytes): 7 MB
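The float16 step is a straight halving of storage. A sketch with the slides' sizes (768 × 768 matrices, 3 per layer, 2 layers; the weight values themselves are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = [rng.standard_normal((768, 768), dtype=np.float32) * 0.05
           for _ in range(3 * 2)]              # 3 matrices/layer x 2 layers

total32 = sum(w.nbytes for w in weights)       # float32: ~14 MB
halved = [w.astype(np.float16) for w in weights]
total16 = sum(w.nbytes for w in halved)        # float16: ~7 MB
print(total32 / 1e6, "->", total16 / 1e6, "MB")
```

Unlike the 3-bit embeddings, float16 keeps enough precision that the recurrent weights can typically be cast after training with little accuracy loss.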
Recurrent Layer Memory
Distribution of weight values: many near-zero values
Recurrent Layer Memory
Prune the 50% of weights closest to 0
Train the rest of the network
Repeat, pruning more each iteration
90% pruned: 7 MB × 0.1 = 700 kB
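The iterative magnitude-pruning loop above can be sketched as a masking schedule. The retraining step between rounds is elided (marked by a comment); each round here drops half of the surviving weights, which is one plausible schedule rather than the talk's exact one.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((768, 768)).astype(np.float32)
mask = np.ones_like(W, dtype=bool)

target = 0.9                             # aim for ~90% of weights pruned
while mask.mean() > 1 - target:
    alive = np.abs(W[mask])
    cutoff = np.quantile(alive, 0.5)     # median magnitude of survivors
    mask &= np.abs(W) > cutoff           # drop the half closest to zero
    W *= mask
    # ... retrain the surviving weights here before the next round ...

sparsity = 1 - mask.mean()
print(f"{sparsity:.2%} pruned")
```

With ~90% of the weights zeroed, a sparse format storing only the survivors lands around the slides' 7 MB × 0.1 = 700 kB.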
Questions?
http://getdango.com
Xavier Snelgrove, CTO & Co-Founder @wxswxs