Revolutionary Voice Enhancement in Real-Time Communications with GPU
Davit Baghdasaryan, CEO, 2Hz
Arto Minasyan, CTO, 2Hz
2
Mute Background Noises
Voice Quality with Deep Learning
Mute Background Noise
Mute Everyone Except Me
5
6
Real-Time Noise Suppression with Deep Learning
7
Traditional Noise Cancellation
8
Train krispNet Deep Neural Network
Background Noises
Clean Human Speech
Deep Learning-Powered Noise Cancellation
and in the cloud
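The slide names two training inputs, background noises and clean human speech. One common way to combine such inputs into noisy/clean training pairs is to mix them at a chosen signal-to-noise ratio. The sketch below illustrates that general technique only; it is not krispNet's actual data pipeline, and `mix_at_snr`, the toy signals, and the 10 dB target are all assumptions made for illustration.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Return clean + gain * noise, with gain chosen so the mix has the given SNR in dB.

    Hypothetical helper for illustration; not taken from the talk.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(x * x for x in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise signal is silent")
    # SNR_dB = 10 * log10(clean_power / (gain^2 * noise_power)), solved for gain.
    gain = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


# Toy stand-ins for real recordings (assumed values): a 440 Hz tone as "speech"
# and a deterministic pseudo-random sequence as "noise", 0.1 s at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [((t * 7919) % 200 - 100) / 100 for t in range(1600)]

# The network input would be `noisy`; the training target would be `clean`.
noisy = mix_at_snr(clean, noise, snr_db=10)
```

Varying `snr_db` and the noise recording per example is the usual way to cover many listening conditions with a limited set of clean recordings.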
9
How to Measure Voice Quality?
10
Industry Standards
11
Audio Lab
12
13
Seamlessly Integrates in Conferencing Apps
Supports any Microphone or Headset
17
krisp.ai Best Product in Audio/Voice 2018
18
Training and Inference
19
Training Process
20
Training Data
21
Training on GPUs
22
Inference
23
Moving to the Cloud
24
Server-side Noise Cancellation
25
Latency Constraints
200 ms end-to-end latency
Codecs and other DSP (10-80 ms)
Network (varies)
DNN Compute (< 5 ms)
DNN Algorithmic (15 ms)
< 20 ms
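The numbers on this slide can be turned into a simple budget check. The sketch below assumes the components add up linearly against the 200 ms end-to-end target, which the slide does not state explicitly; the constant and function names are illustrative. It shows how much of the budget is left for the network once codecs, DSP, and the DNN are accounted for.

```python
# Figures taken from the slide; treating them as additive is an assumption.
BUDGET_MS = 200            # end-to-end latency target
CODEC_DSP_MS = (10, 80)    # codecs and other DSP, best and worst case
DNN_COMPUTE_MS = 5         # upper bound ("< 5 ms")
DNN_ALGORITHMIC_MS = 15    # algorithmic delay of the network

# Upper bound on the DNN's total contribution.
dnn_total_ms = DNN_COMPUTE_MS + DNN_ALGORITHMIC_MS


def network_headroom_ms(codec_dsp_ms):
    """Budget left for the network after codecs/DSP and the DNN."""
    return BUDGET_MS - codec_dsp_ms - dnn_total_ms


best_case_ms = network_headroom_ms(CODEC_DSP_MS[0])
worst_case_ms = network_headroom_ms(CODEC_DSP_MS[1])
print(f"DNN total: up to {dnn_total_ms} ms")
print(f"Network headroom: {worst_case_ms} to {best_case_ms} ms")
```

Under these assumptions the DNN stays within 20 ms, and the network can consume between 100 and 170 ms before the end-to-end target is exceeded.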
26
How do you scale to 100K+ concurrent streams with such latency constraints?
concurrent audio streams
27
10x-20x less costly
…
CPU Servers
GPU Servers
28
Scalability with Batching
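Batching here presumably means grouping audio frames from many concurrent streams into a single model call, so the GPU processes them together instead of one stream at a time. The sketch below shows that idea in plain Python. `run_batched`, `placeholder_model`, and the batch size are hypothetical names and values for illustration, and the placeholder stands in for a real DNN inference call.

```python
from typing import Callable, Dict, List

Frame = List[float]


def run_batched(
    frames_by_stream: Dict[str, Frame],
    model: Callable[[List[Frame]], List[Frame]],
    max_batch: int,
) -> Dict[str, Frame]:
    """Process one pending frame per stream, up to max_batch frames per model call."""
    stream_ids = list(frames_by_stream)
    outputs: Dict[str, Frame] = {}
    for start in range(0, len(stream_ids), max_batch):
        ids = stream_ids[start:start + max_batch]
        batch = [frames_by_stream[sid] for sid in ids]
        results = model(batch)  # one call covers every stream in the batch
        if len(results) != len(batch):
            raise RuntimeError("model must return one output frame per input frame")
        for sid, frame in zip(ids, results):
            outputs[sid] = frame
    return outputs


batch_sizes: List[int] = []  # records how many frames each model call received


def placeholder_model(batch: List[Frame]) -> List[Frame]:
    """Stand-in for DNN inference (assumption): halves every sample."""
    batch_sizes.append(len(batch))
    return [[0.5 * sample for sample in frame] for frame in batch]


# Five concurrent streams, each with one pending two-sample frame.
pending = {f"stream-{i}": [float(i), float(-i)] for i in range(5)}
cleaned = run_batched(pending, placeholder_model, max_batch=2)
```

With five streams and `max_batch=2`, the model is called three times rather than five. A larger batch reduces the number of calls further, at the cost of waiting for more frames to arrive, which has to fit inside the DNN compute budget.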
29
Ultimate Quality
Remove Noise
Remove Room Echo
Expand Voice HD
Audio Frame
Ultimate Quality Audio Frame
30
Maximum Quality and Scale with NVIDIA Tensor Cores
31
TensorRT is pretty awesome
[Chart: TensorFlow Batching vs. TensorRT Batching on P100, V100, K80, and T4; axis ticks 750, 1500, 2250, 3000]
32
T4 and V100 are both awesome
[Chart: FP32 vs. FP16 on P100, V100, and T4; axis ticks 1250, 2500, 3750, 5000]
33
Key Takeaways
34
Thank You!