CS525z Perceptual Quality Multimedia Networking Network Issues

CS525z Multimedia Networking
Introduction

Introduction Purpose
• Brief introduction to:
  – Digital Audio
  – Digital Video
  – Perceptual Quality
  – Network Issues
  – The “Science” (or lack of) in “Computer Science”
• Get you ready for research papers!
• Introduction to:
  – Silence detection (for project 1)

Groupwork
• Let’s get started!
• Consider audio or video on a computer
  – Examples you have seen, or
  – Guess how it might look
• What are two conditions that degrade quality?
  – Giving technical name is ok
  – Describing appearance is ok

Introduction Outline
• Background
  – Internetworking Multimedia (Ch 4)
  – Graphics and Video (Linux MM, Ch 4)
  – Multimedia Networking (Kurose, Ch 6)
• Audio Voice Detection (Rabiner)
• MPEG (Le Gall)
• Misc

Digital Audio
• Sound produced by variations in air pressure
  – Can take any continuous value
  – Analog component
• Computers work with digital
  – Must convert analog to digital
  – Use sampling to get discrete values

Digital Sampling
• Sample rate determines number of discrete values

Digital Sampling
• Half the sample rate

Digital Sampling
• Quarter the sample rate

Sample Rate
• Nyquist’s Theorem: to accurately reproduce a signal, must sample at a rate of at least twice its highest frequency
• Why not always use high sampling rate?
  – Requires more storage
  – Complexity and cost of analog to digital hardware
  – Humans can’t always perceive
    • Dog whistle
  – Typically want an adequate sampling rate

Sample Size
• Samples have discrete values
• How many possible values?
  – Common is 256 values from 8 bits

Sample Size
• Quantization error from rounding
  – Ex: 28.3 rounded to 28
• Why not always have large sample size?
  – Storage increases per sample
  – Analog to digital hardware becomes more expensive
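The sampling and quantization ideas above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the [-1.0, 1.0] amplitude range, and the uniform rounding scheme are hypothetical choices, not something the slides specify.

```python
import math


def min_sample_rate(highest_freq_hz):
    """Nyquist: sample at no less than twice the highest frequency."""
    return 2 * highest_freq_hz


def sample_sine(freq_hz, rate_hz, count):
    """Take `count` discrete samples of a sine wave (the 'analog' signal)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(count)]


def quantize(value, bits, lo=-1.0, hi=1.0):
    """Round a continuous value to the nearest of 2**bits evenly spaced levels.

    The [lo, hi] range is an assumption made for this sketch.
    """
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    clamped = min(max(value, lo), hi)
    index = round((clamped - lo) / step)
    return lo + index * step


if __name__ == "__main__":
    print("levels with 8 bits:", 2 ** 8)
    print("rate needed for 4000 Hz:", min_sample_rate(4000))
    for s in sample_sine(1000, 8000, 4):
        q = quantize(s, 8)
        print(f"sample {s:+.5f} -> quantized {q:+.5f}, error {q - s:+.5f}")
```

Halving or quartering `rate_hz` in `sample_sine` gives fewer discrete values per cycle, and lowering `bits` in `quantize` makes the rounding error larger.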

Audio
• Encode/decode devices are called codecs
  – Compression is the complicated part
• For voice compression, can take advantage of speech:
  [Figure: “Smith”]
  • Many similarities between adjacent samples
  • Send differences (µ-law)
  • Adapt to signal (ADPCM)
  • Use understanding of speech
  • Can ‘predict’ (CELP)

Groupwork
• Think of as many uses of computer audio as you can
• Which require a high sample rate and large sample size? Which do not? Why?

Typical Encoding of Voice
• Today, telephones carry digitized voice
• 4 KHz (8000 samples per second)
  – Adequate for most voice communication
• 8-bit sample size
• For 10 seconds of speech:
  – 10 sec x 8000 samp/sec x 8 bits/samp = 640,000 bits or 80 Kbytes
  – Fit 3 minutes of speech on a floppy disc
• Fine for voice, but what about music?

Audio by People
• Sound by breathing air past vocal cords
  – Use mouth and tongue to shape vocal tract
• Speech made up of phonemes
  – Smallest unit of distinguishable sound
  – Language specific
• Majority of speech sound from 60-8000 Hz
  – Music up to 20,000 Hz
• Hearing sensitive to about 20,000 Hz
  – Stereo important, especially at high frequency
  – Lose frequency sensitivity with age

Typical Encoding of Audio
• Can only represent 4 KHz frequencies (why?)
• Human ear can perceive 10-20 KHz
  – Used in music
• CD quality audio:
  – sample rate of 44,100 samples/sec
  – sample size of 16 bits
  – 60 min x 60 secs/min x 44,100 samp/sec x 2 bytes/sample x 2 channels = 635,040,000 bytes or about 600 Mbytes
• Can use compression to reduce

Sound File Formats
• Raw data has samples (interleaved w/stereo)
• Need way to ‘parse’ raw audio file
• Typically a header
  – Sample rate
  – Sample size
  – Number of channels
  – Coding format
  – …
• Examples:
  – .au for Sun µ-law, .wav for IBM/Microsoft
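The storage arithmetic and the idea of a file header can be checked with a short sketch. It uses Python's standard `wave` and `io` modules. The helper names `pcm_bytes`, `make_wav`, and `read_header` are hypothetical, and the 1.44 MB floppy capacity is an assumption the slides do not state.

```python
import io
import wave


def pcm_bytes(seconds, sample_rate, bits_per_sample, channels=1):
    """Size in bytes of uncompressed audio."""
    return seconds * sample_rate * bits_per_sample * channels // 8


def make_wav(sample_rate, sample_bytes, channels, frames):
    """Build a tiny .wav file in memory so its header can be inspected."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_bytes)
        w.setframerate(sample_rate)
        w.writeframes(frames)
    return buf.getvalue()


def read_header(data):
    """Return (sample rate, sample size in bits, channel count) from the header."""
    with wave.open(io.BytesIO(data), "rb") as r:
        return r.getframerate(), r.getsampwidth() * 8, r.getnchannels()


if __name__ == "__main__":
    print("10 s of telephone voice:", pcm_bytes(10, 8000, 8), "bytes")
    print("60 min of CD audio:", pcm_bytes(3600, 44100, 16, 2), "bytes")
    # Assumption: a 1.44 MB floppy holds about 1,440,000 bytes.
    print("3 min of voice:", pcm_bytes(180, 8000, 8), "bytes")
    cd_like = make_wav(44100, 2, 2, b"\x00" * 4)  # one silent stereo frame
    print("parsed header:", read_header(cd_like))
```

Without the header, the same bytes could be read as any rate, sample size, or channel count, which is why raw audio needs one to be parsed.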

Outline
• Introduction
  – Internetworking Multimedia (Ch 4)
  – Graphics and Video (Linux MM, Ch 4)
  – Multimedia Networking (Kurose, Ch 6)
• Audio Voice Detection (Rabiner)
• MPEG (Le Gall)
• Misc

Graphics and Video
“A Picture is Worth a Thousand Words”
• People are visual by nature
• Many concepts hard to explain or draw
• Pictures to the rescue!
• Sequences of pictures can depict motion
  – Video!

Graphics Basics
• Computer graphics (pictures) made up of pixels
  – Each pixel corresponds to region of memory
  – Called video memory or frame buffer
• Write to video memory
  – monitor displays with raster cannon

Monochrome Display
• Pixels are on (black) or off (white)
  – Dithering can appear gray

Grayscale Display
• Bit-planes
  – 4 bits per pixel, 2^4 = 16 gray levels
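To make "each pixel corresponds to a region of memory" concrete, here is a minimal sketch of a 4-bit grayscale frame buffer. The layout is an assumption chosen for simplicity: two pixels are packed per byte, which is not the bit-plane organization named on the slide, and all function names are hypothetical.

```python
def make_framebuffer(width, height):
    """Allocate video memory for 4-bit pixels, two pixels per byte."""
    return bytearray((width * height + 1) // 2)


def set_pixel(fb, width, x, y, level):
    """Write a gray level (0..15) into the pixel's half of its byte."""
    if not 0 <= level < 16:
        raise ValueError("4 bits per pixel allows only levels 0..15")
    byte_pos, is_low = divmod(y * width + x, 2)
    if is_low:
        fb[byte_pos] = (fb[byte_pos] & 0xF0) | level
    else:
        fb[byte_pos] = (fb[byte_pos] & 0x0F) | (level << 4)


def get_pixel(fb, width, x, y):
    """Read the gray level back out of video memory."""
    byte_pos, is_low = divmod(y * width + x, 2)
    return fb[byte_pos] & 0x0F if is_low else fb[byte_pos] >> 4


if __name__ == "__main__":
    fb = make_framebuffer(4, 2)
    set_pixel(fb, 4, 0, 0, 3)
    set_pixel(fb, 4, 1, 0, 15)
    print("gray levels:", 2 ** 4)
    print("buffer bytes:", len(fb), "first byte:", hex(fb[0]))
```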

Color Displays
• Humans can perceive far more colors than grayscales
  – Cones and Rods in eyes
• All colors seen as combination of red, green and blue
• 24 bits/pixel, 2^24 = 16 million colors
• But now 3 bytes required per pixel

Video Palettes
• Still have 16 million colors, only 256 at a time
• Complexity to lookup, color flashing
• Can dither for more colors, too

Video Summary

Video Images
• Television about 600 lines, 4:3 aspect ratio
  – 833x625 (PAL), 700x525 (NTSC)
• Digital video smaller
  – 352x288 (H.261), 176x144 (QCIF)
• Monitors higher resolution than T.V.
• 1200x1000 pixels not uncommon
• xdpyinfo, display → settings
• Computer video often called “Postage Stamp”

Moving Video Images
• Series of frames with changes appear as motion
  – 25-30 frames/second “full-motion” video
[Figure: frame sizes 640x480 and 320x240. Uncompressed Video is enormous!]

Video Compression
• Lossless or Lossy
• Take advantage of motion
  – Dependencies between frames
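The claim that uncompressed video is enormous follows directly from the numbers on these slides. A rough calculation, with hypothetical helper names and the 640x480 and 320x240 frame sizes paired with 30 and 25 frames/second as illustrative choices:

```python
def frame_bytes(width, height, bits_per_pixel):
    """Memory needed for one uncompressed frame."""
    return width * height * bits_per_pixel // 8


def video_bytes_per_second(width, height, bits_per_pixel, fps):
    """Data rate of uncompressed video at a given frame rate."""
    return frame_bytes(width, height, bits_per_pixel) * fps


if __name__ == "__main__":
    print("colors at 24 bits/pixel:", 2 ** 24)
    for w, h, fps in [(640, 480, 30), (320, 240, 25)]:
        rate = video_bytes_per_second(w, h, 24, fps)
        print(f"{w}x{h} at {fps} frames/s: {rate:,} bytes/s ({rate * 8:,} bits/s)")
```

At 24 bits per pixel, a single 640x480 frame already takes 921,600 bytes, so one second at 30 frames/second needs about 27.6 million bytes.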

Introduction Outline
• Background
  – Internetworking Multimedia (Ch 4)
  – Graphics and Video (Linux MM, Ch 4)
  – Multimedia Networking (Kurose, Ch 6)
    • (6.1 to 6.3)
• Audio Voice Detection (Rabiner)
• MPEG (Le Gall)
• Misc

Internet Traffic Today
• Internet dominated by text-based applications
  – Email, FTP, Web Browsing
• Very sensitive to loss
  – Example: lose a byte in your blah.exe program and it crashes!
• Not very sensitive to delay
  – 10’s of seconds ok for web page download
  – Minutes for file transfer
  – Hours for email delivery

Multimedia on the Internet
• Multimedia not as sensitive to loss
  – Words from sentence lost still ok
  – Frames in video missing still ok
• Multimedia can be very sensitive to delay
  – Interactive session needs one-way delays less than 1 second!
• New phenomenon is jitter!

Jitter
[Figure: Jitter and Jitter-Free]

Classes of Internet Multimedia Apps
• Streaming stored media
• Streaming live media
• Real-time interactive media
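Jitter can be read as variation in the delay packets experience. The sketch below measures it in two simple ways from send and arrival times. The function names and the sample timings are made up for illustration, and other definitions of jitter exist.

```python
def one_way_delays(send_ms, arrive_ms):
    """Delay experienced by each packet."""
    return [a - s for s, a in zip(send_ms, arrive_ms)]


def delay_spread(delays):
    """Difference between the slowest and fastest packet; 0 means jitter-free."""
    return max(delays) - min(delays)


def interarrival_deviation(arrive_ms, period_ms):
    """How far each gap between arrivals is from the sending period."""
    return [(b - a) - period_ms for a, b in zip(arrive_ms, arrive_ms[1:])]


if __name__ == "__main__":
    send = [0, 20, 40, 60]            # hypothetical: one packet every 20 ms
    jittery = [100, 130, 140, 190]    # hypothetical arrival times
    steady = [100, 120, 140, 160]
    print("delays:", one_way_delays(send, jittery))
    print("spread with jitter:", delay_spread(one_way_delays(send, jittery)))
    print("spread jitter-free:", delay_spread(one_way_delays(send, steady)))
    print("gap deviations:", interarrival_deviation(jittery, 20))
```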

Streaming Stored Media
• Stored on server
• Examples: pre-recorded songs, famous lectures, video-on-demand
• RealPlayer and Netshow
• Interactivity, includes pause, ff, rewind…
• Delays of 1 to 10 seconds or so
• Not so sensitive to jitter

Streaming Live Media
• “Captured” from live camera, radio, T.V.
• 1-way communication, maybe multicast
• Examples: concerts, radio broadcasts, lectures
• RealPlayer and Netshow
• Limited interactivity…
• Delays of 1 to 10 seconds or so
• Not so sensitive to jitter

Hurdles for Multimedia on the Internet
• IP is best-effort
  – No delivery guarantees
  – No bandwidth guarantees
  – No timing guarantees
• So … how do we do it?
  – Not too well for now
  – This class is largely about techniques to make it better!

Real-Time Interactive Media
• 2-way communication
• Examples: Internet phone, video conference
• Very sensitive to delay
  – < 150ms very good
  – < 400ms ok
  – > 400ms lousy

Multimedia on the Internet
• The Media Player
• Streaming through the Web
• The Internet Phone Example

The Media Player
• End-host application
  – Real Player, Windows Media Player
• Needs to be pretty smart
• Decompression (MPEG)
• Jitter-removal (Buffering)
• Error correction (Repair, as a topic)
• GUI with controls (HCI issues)
  – Volume, pause/play, sliders for jumps
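The delay thresholds for real-time interactive media can be written as a small lookup. This is a sketch only: the function name is hypothetical, and because the slide does not say how to rate a delay of exactly 400 ms, grouping it with "lousy" is an assumption.

```python
def rate_interactive_delay(delay_ms):
    """Map a one-way delay to the quality labels used for interactive media."""
    if delay_ms < 150:
        return "very good"
    if delay_ms < 400:
        return "ok"
    # Assumption: exactly 400 ms is grouped with "lousy".
    return "lousy"


if __name__ == "__main__":
    for d in (80, 250, 600):
        print(d, "ms ->", rate_interactive_delay(d))
```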

Streaming through a Web Browser
• Must download whole file first!

Streaming through a Plug-In
• Must still use TCP!

Streaming through the Media Player

An Example: Internet Phone
• Specification
• Removing Jitter
• Recovering from Loss

Internet Phone: Specification
• 8 Kbytes per second, send every 20 ms
  – 20 ms * 8 kbytes/sec = 160 bytes per packet
• Header per packet
  – Sequence number, time-stamp, playout delay
• End-to-End delay of 150 – 400 ms
  – Why isn’t TCP effective?
• UDP
  – Can be delayed different amounts (Removing Jitter)
  – Can be lost (Recovering from Loss)

Internet Phone: Removing Jitter
• Use header information to reduce jitter
  – Sequence number and Timestamp
• Strategy:
  – Playout delay (Delay Buffer)
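The packet arithmetic and per-packet header from the Internet Phone specification can be sketched with Python's standard `struct` module. The field widths (16-bit sequence number, 32-bit timestamp, 16-bit playout delay) and all names here are assumptions for illustration; the slides list the fields but not their sizes or order.

```python
import struct

# Assumed layout, network byte order: seq (16 bits), timestamp in ms (32 bits),
# playout delay in ms (16 bits). The slides do not specify field sizes.
HEADER = struct.Struct("!HIH")


def payload_bytes(rate_bytes_per_sec, interval_ms):
    """Audio bytes carried per packet when sending every interval_ms."""
    return rate_bytes_per_sec * interval_ms // 1000


def build_packet(seq, timestamp_ms, playout_delay_ms, payload):
    """Prepend the header to a chunk of audio."""
    header = HEADER.pack(seq & 0xFFFF, timestamp_ms & 0xFFFFFFFF, playout_delay_ms)
    return header + payload


def parse_packet(packet):
    """Split a packet back into its header fields and audio payload."""
    seq, timestamp_ms, playout_delay_ms = HEADER.unpack_from(packet)
    return seq, timestamp_ms, playout_delay_ms, packet[HEADER.size:]


if __name__ == "__main__":
    size = payload_bytes(8000, 20)
    packet = build_packet(1, 20, 100, bytes(size))
    print("payload bytes per packet:", size)
    print("header bytes:", HEADER.size, "total:", len(packet))
    print("parsed fields:", parse_packet(packet)[:3])
```

Such a packet would typically be handed to a UDP socket. The receiver can use the sequence number to notice loss and the timestamp to schedule playout.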

Playout Delay
[Figure: playout delay]
• Can be fixed or adaptive

Internet Phone: Loss
[Figure: Encode: 1 2 3 4. Transmit: 1 4. Decode: 1 ??? ??? 4]
• What do you do with the missing packets?

Internet Phone: Recovering from Loss
[Figure: Encode: 1 1 2 2 3 3 4. Transmit: 1 3 4. Decode: 1 1 3 4]

Projects
• Project 1:
  – Read and Playback from audio device
  – Detect Speech and Silence
  – Evaluate (1a)
• Project 2:
  – Build an Internet Phone application
  – Evaluate (2b)
• Project 3:
  – Multi-person Internet Phone via multicast
  – Evaluate (3b)
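A fixed playout delay and one simple way to cover for missing packets can be sketched as follows. This is an illustration under assumptions, not the method the slides prescribe: each packet is played at its send time plus a fixed delay, anything arriving after that deadline counts as lost, and a gap is filled by repeating the last packet that was played. The names and timings are hypothetical.

```python
def playout_schedule(packets, playout_delay_ms):
    """Return {seq: play_time_ms} for packets that arrive before their deadline.

    `packets` is a list of (seq, send_ms, arrive_ms) tuples.
    """
    on_time = {}
    for seq, send_ms, arrive_ms in packets:
        deadline = send_ms + playout_delay_ms
        if arrive_ms <= deadline:
            on_time[seq] = deadline
    return on_time


def conceal_by_repetition(received, first_seq, last_seq):
    """For each slot, play the packet itself or repeat the last one played.

    A slot with nothing to repeat (loss at the very start) is None.
    """
    played, last = [], None
    for seq in range(first_seq, last_seq + 1):
        if seq in received:
            last = seq
        played.append(last)
    return played


if __name__ == "__main__":
    # Hypothetical trace: packet 2 is delayed past its deadline.
    trace = [(1, 0, 50), (2, 20, 200), (3, 40, 100), (4, 60, 130)]
    schedule = playout_schedule(trace, playout_delay_ms=100)
    print("play times:", schedule)
    print("decoded order:", conceal_by_repetition(schedule, 1, 4))
```

A larger fixed delay turns fewer late packets into losses but hurts interactivity. An adaptive scheme would adjust `playout_delay_ms` as network conditions change.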
