
Course Projects: Class 9, 22 Sep 2009



  1. 11-755 Machine Learning for Signal Processing. Course Projects. Class 9, 22 Sep 2009

  2. Administrivia
     - Thursday's class: Wean Hall 5403
       - Thanks to Ramkumar Krishnan for arranging the room!
     - Almost all submissions of Homework 1 are in
       - Thanks to all students who have submitted
       - Three submissions are still due
     - Fernando's lecture
       - Clarifications required? :)
     - Homework 2 is up on the website
       - Face detection using a single Eigen face
       - Will expand to using multiple Eigen faces in stage 2
       - Complex homework
     - Homework 3 will be very simple: L1 estimation of L2 algebraic operations
       - If (insufficient(time)==true) givenhomework(3) = false
     11-755 MLSP: Bhiksha Raj

  3. Course Projects
     - Covers 50% of your grade
     - 9-10 weeks
     - Required: a seriously attempted project
       - Demo if possible
       - Project report
       - 20-minute project presentation
     - Project complexity
       - Depends on what you choose to do
       - Complexity of project will be considered in grading

  4. Course Projects
     - Projects will be done by teams of students
       - Ideal team size: 4
       - Find yourself a team
       - If you wish to work alone, that is OK
         - But we will not require less of you for this
       - If you cannot find a team by yourselves, you will be assigned to a team
       - Teams will be listed on the website
       - All currently registered students will be put in a team eventually
     - Will require background reading and literature survey
       - Learn about the problem
     - Grading will be done by team
       - All members of a team will receive the same grade
         - But I retain discretionary powers over this

  5. Projects
     - A list of possible projects will be presented to you in the rest of this lecture
     - This is just a sampling
     - You may work on one of the proposed projects, or one that you come up with yourselves
     - Teams must inform us of their choice of project by 29th September 2009
       - The later you start, the less time you will have to work on the project

  6. Projects
     - Projects range from simple to very difficult
       - Important to work in teams
     - Guest lecturers with project ideas
       - Anatole Gershman (LTI)
       - Alan Black (LTI)
       - Eakta Jain (RI)
       - Fernando De La Torre (not presenting)
     - Important: be realistic
       - Partially completed projects will still get grades IF the work performed is a serious attempt at completing it
       - But only completed projects are likely to result in papers/publications, if any

  7. Now, to our guests
     - Alan Black
     - Anatole Gershman
     - Eakta Jain

  8. More Project Ideas
     - Sound
       - Separation
       - Music
       - Classification
       - Synthesis
     - Images
       - Processing
       - Editing
       - Classification
     - Video
       - …
       - …

  9. A Strange Observation
     - A trend: the pitch of female Indian playback singers is on an ever-increasing trajectory
     - [Figure: peak pitch (Hz) vs. year (AD)]
       - 1949: Shamshad Begum, Patanga; peak 310 Hz
       - 1966: Lata Mangeshkar, Anupama; peak 570 Hz
       - 2003: Alka Yagnik, Dil Ka Rishta; peak 740 Hz
     - Mean pitch values: 278 Hz, 410 Hz, 580 Hz

  10. I'm not the only one to find the high-pitched stuff annoying
      - Sarah McDonald (Holy Cow): ".. shrieking…"
      - Khazana.com: ".. female Indian movie playback singers who can produce ultra high frequencies which only dogs can hear clearly.."
      - www.roadjunky.com: ".. High pitched female singers doing their best to sound like they were seven years old .."

  11. A Disturbing Observation
      - A trend: the pitch of female Indian playback singers is on an ever-increasing trajectory
      - [Figure: peak pitch (Hz) vs. year (AD), now annotated with the reference levels "Glass Shatters" and "Average Female Talking Pitch"]
        - 1949: Shamshad Begum, Patanga; peak 310 Hz
        - 1966: Lata Mangeshkar, Anupama; peak 570 Hz
        - 2003: Alka Yagnik, Dil Ka Rishta; peak 740 Hz
      - Mean pitch values: 278 Hz, 410 Hz, 580 Hz

  12. Subjectivity of Taste
      - High-pitched female voices can often sound unpleasant
      - Yet these songs are very popular in India
        - Subjectivity of taste
      - The melodies are often very good, in spite of the high singing pitch

  13. "Personalizing" the Song
      - Retain the melody, but modify the pitch
        - To something that one finds pleasant
        - The choice of "pleasant" pitch is personal, hence "personalization"
      - Must be able to separate the vocals from the background music
        - Music and vocals are mixed in most recordings
        - Must modify the pitch without messing up the music
      - Separation need not be perfect
        - Must only be sufficient to enable pitch modification of vocals
        - Pitch modification is tolerant of low-level artifacts
        - For octave-level pitch modification, artifacts can be undetectable

  14. Separation example (audio clips)
      - Dayya Dayya original (only vocalized regions)
      - Dayya Dayya separated music
      - Dayya Dayya separated vocals

  15. Some examples
      - Example 1: Vocals shifted down by 4 semitones
      - Example 2: Gender of singer partially modified

  16. Some examples
      - Example 1: Vocals shifted down by 4 semitones
      - Example 2: Gender of singer partially modified
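To make "shifted down by 4 semitones" concrete: in twelve-tone equal temperament, a shift of n semitones multiplies every frequency by 2^(n/12). The sketch below only illustrates that arithmetic. The function name `shift_frequency` is hypothetical, equal temperament is an assumption, and the 740 Hz input is the 2003 peak pitch quoted on the earlier slide. It does not perform the pitch modification of audio used for the examples.

```python
def shift_frequency(freq_hz: float, semitones: float) -> float:
    """Return freq_hz shifted by a number of semitones (negative = down).

    Assumes twelve-tone equal temperament, where one semitone is a
    frequency ratio of 2 ** (1 / 12) and twelve semitones make an octave.
    """
    return freq_hz * 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    peak_hz = 740.0  # peak pitch quoted for the 2003 recording
    print(f"4 semitones down: {shift_frequency(peak_hz, -4):.1f} Hz")
    print(f"1 octave down:    {shift_frequency(peak_hz, -12):.1f} Hz")
```

A 4-semitone drop scales pitch by roughly 0.79, while an octave drop halves it.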

  17. Projects..
      - Several component techniques
      - Illustrate various ML and signal processing concepts
      - Signal separation
        - Latent variable models
        - Non-negative factorization
      - Signal modification
        - Pitch and spectral modification
        - Phase and phase estimation
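As an illustration of the "non-negative factorization" item, here is a minimal sketch of non-negative matrix factorization applied to a magnitude spectrogram. It uses the standard multiplicative updates for the squared-error objective, which is an assumed choice; the slides do not say which factorization or objective the course uses. The names `nmf` and `component_spectrogram` are hypothetical, and deciding which components belong to vocals versus music is left out entirely.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (freq x time) as W @ H.

    W (freq x rank) holds spectral bases, H (rank x time) their activations.
    Uses multiplicative updates for the Frobenius (squared-error) objective.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def component_spectrogram(V, W, H, k, eps=1e-9):
    """Soft-mask estimate of the part of V explained by component k."""
    full = W @ H + eps
    part = np.outer(W[:, k], H[k, :])
    return V * (part / full)
```

Summing `component_spectrogram` over a subset of components gives a spectrogram estimate for one source; turning that back into audio additionally needs phase, which ties in with the "phase and phase estimation" item.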

  18. Song "Personalizer"
      - Modify vocals as desired
        - Mono or stereo
        - "Knob" control to modify pitch of vocals
      - Given a song
        - Separate music and song
        - Modify pitch as required
        - Adjust parameters for minimal artifacts
        - Add..
      - Issues:
        - Separation
        - Modification
        - Use of appropriate statistical model and signal processing

  19. Talk-Along Karaoke
      - Pick a song that features a prominent vocal lead
        - Preferably with only one lead vocal
      - Build a system such that:
        - User talks the song out with reasonable rhythm
        - The system produces a version of the song with the user singing the song instead of the lead vocalist
        - i.e. the user's singing voice now replaces the vocalist in the song
      - No. of issues:
        - Separation
        - Pitch estimation
        - Alignment
        - Pitch shifting

  20. Dereverberation
      - Develop a supervised technique that can dereverberate a noisy signal
        - Will work with artificially reverberated data
      - Issues:
        - Modeling the data
        - Learning parameters
        - Overcomplete representations

  21. Real-time music transcription
      - Proposed by Siddharth Hazra
      - Discover sheet music for a guitar on-line, as it is played

  22. Voice transformation with Canonical Correlation Analysis
      - [Diagram: $S_x$ passes through $A$ to give $A S_x$, which approximates $B S_y$; $\mathrm{pinv}(B)$ then maps this to $S_y$]
      - Canonical Correlation Analysis:
        - Given spectra $S_x$ from speaker X
        - And spectra $S_y$ from speaker Y
        - Find transform matrices $A$ and $B$ such that $A S_x$ predicts $B S_y$
      - Will transform the voice of speaker X to that of speaker Y
      - Issues:
        - CCA
        - Voice transformation
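The CCA recipe on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's implementation: the frames of the two speakers are assumed to be time-aligned already (one column per frame), a small ridge term `reg` is added to the covariances for numerical stability, and the function names `fit_cca` and `transform_voice` are hypothetical. The final step follows the slide's diagram, mapping $A S_x$ back through $\mathrm{pinv}(B)$.

```python
import numpy as np


def _inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V / np.sqrt(w)) @ V.T


def fit_cca(Sx, Sy, reg=1e-8):
    """Estimate CCA transforms from paired spectra.

    Sx: (dx, N) spectra of speaker X, Sy: (dy, N) spectra of speaker Y,
    with column n of Sx aligned in time to column n of Sy (assumption).
    Returns A, B, the canonical correlations, and the two mean vectors.
    """
    mx = Sx.mean(axis=1, keepdims=True)
    my = Sy.mean(axis=1, keepdims=True)
    X, Y = Sx - mx, Sy - my
    n = X.shape[1]
    Cxx = X @ X.T / n + reg * np.eye(X.shape[0])
    Cyy = Y @ Y.T / n + reg * np.eye(Y.shape[0])
    Cxy = X @ Y.T / n
    Wx, Wy = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the correlations.
    U, corr, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    A = U.T @ Wx  # rows: canonical directions for speaker X
    B = Vt @ Wy   # rows: canonical directions for speaker Y
    return A, B, corr, mx, my


def transform_voice(Sx, A, B, mx, my):
    """Map speaker-X spectra toward speaker Y: S_y ~ pinv(B) A S_x."""
    return np.linalg.pinv(B) @ (A @ (Sx - mx)) + my
```

With real magnitude spectra the output is not guaranteed to be non-negative, and the prediction is only as good as the canonical correlations returned in `corr`; both points would need handling in an actual project.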

  23. The Doppler Ultrasound Sensor
      - Using the Doppler Effect

  24. The Doppler Effect
      - The observed frequency of a moving sound source differs from the emitted frequency when the source and observer are moving relative to each other
        - Discovery attributed to Christian Doppler (1803-1853)
      - Person being approached by a police car hears a higher frequency than a person from whom the car is moving away

  25. Observed frequency
      - The relationship of actual to perceived frequencies is known
      - Case 1: The source is moving with velocity $v$, but the listener is static
        - Observed frequency is: $f' = \dfrac{c_{sound}\, f}{c_{sound} - v}$
      - Case 2: The observer is emitting the signal, which is reflected off the moving object
        - Observed frequency is: $f' = \dfrac{(c_{sound} + v)\, f}{c_{sound} - v}$
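The two cases can be evaluated directly. In this sketch the function names are hypothetical, positive `v` is taken to mean motion toward the listener or sensor (consistent with the police-car example), and the default speed of sound of 343 m/s is an assumption; the slides do not state the value they use.

```python
def doppler_moving_source(f_hz, v, c_sound=343.0):
    """Case 1: static listener, source approaching at v m/s.

    Negative v means the source is moving away. c_sound is the speed of
    sound in m/s (343 m/s is an assumed default).
    """
    return c_sound * f_hz / (c_sound - v)


def doppler_reflected(f_hz, v, c_sound=343.0):
    """Case 2: static sensor emits f_hz, reflector approaches at v m/s."""
    return (c_sound + v) * f_hz / (c_sound - v)


if __name__ == "__main__":
    f = 40_000.0  # 40 kHz ultrasound tone
    for v in (3.0, 5.0):
        print(f"v = {v} m/s: moving source {doppler_moving_source(f, v):.0f} Hz, "
              f"reflection {doppler_reflected(f, v):.0f} Hz")
```

The reflected case shifts the tone roughly twice as far as the moving-source case, because the motion affects the signal on both the outgoing and the returning path.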

  26. Doppler Spectra
      - 40 kHz tone reflected by an object approaching at approximately 5 m/s
        - [Figure: power vs. frequency, with peaks at 40 kHz (transmitted) and 41.22 kHz (reflected)]
      - 40 kHz tone reflected by two objects, one approaching at approximately 5 m/s and another at 3 m/s
        - [Figure: power vs. frequency, with peaks at 40 kHz (transmitted), 40.72 kHz (reflected) and 41.22 kHz (reflected)]
      - Multiple velocities result in multiple reflected frequencies
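The two-object spectrum can be imitated with synthetic data. The sketch below sums three pure tones at the frequencies read off the slide and locates the strongest FFT bins. The 200 kHz sampling rate, the 0.1 s duration, the relative amplitudes, and the function name `spectrum_peaks` are all assumptions made for illustration; a real sensor signal would also contain noise and spectral leakage.

```python
import numpy as np


def spectrum_peaks(freqs_hz, amps, fs=200_000, duration=0.1):
    """Synthesize a sum of tones and return the strongest FFT bin frequencies.

    Returns as many peak frequencies (in Hz, ascending) as tones were given.
    """
    n = int(round(fs * duration))
    t = np.arange(n) / fs
    x = sum(a * np.sin(2 * np.pi * f * t) for f, a in zip(freqs_hz, amps))
    power = np.abs(np.fft.rfft(x)) ** 2
    bin_freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    strongest = np.argsort(power)[-len(freqs_hz):]
    return np.sort(bin_freqs[strongest])


if __name__ == "__main__":
    # Transmitted tone plus the two reflections quoted on the slide.
    tones = [40_000.0, 40_720.0, 41_220.0]
    print(spectrum_peaks(tones, amps=[1.0, 0.5, 0.3]))
```

The chosen duration gives a 10 Hz bin spacing, so each tone falls exactly on a bin and the three peaks are cleanly separated.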

  27. Doppler from Walking Person
      - Human beings are articulated objects
      - When a person walks, different parts of his body move with different velocities. The combination of velocities is characteristic of the person
        - These can be measured as the spectrum of a reflected Doppler signal
      - [Figure: log power vs. frequency at two points in the stride]
        - Peak stride: frequencies are less spread out
        - Mid stride: frequencies are more spread out
        - Peaks at the incident frequency (40 kHz) come from reflections off static objects in the environment
      - [Figure: spectrogram (frequency vs. time) of the reflections of a 40 kHz tone by a person walking toward the sensor]
        - The spikes in the spectrogram are measurement artefacts
