Voice Based Information Retrieval System: How far is it from text based retrieval system?
PRAJNA BHANDARY
CMSC 676

MOTIVATION
● Ever-increasing Internet bandwidth, ever-decreasing storage costs, and the fast development of multimedia technologies have paved the way for more and more multimedia network content.
● The main motivation for many researchers in this area is to help visually challenged individuals get information through a speech recognition device.

INTRODUCTION
There are 3 different tasks of the Voice based Retrieval System:
● Using text queries to retrieve spoken documents
  ○ Referred to as Spoken Document Retrieval
  ○ Queries have been found to need to be long for retrieval to be more efficient
● Using spoken queries to retrieve text documents
  ○ Referred to as Voice Search
  ○ The information to be retrieved is usually an existing text database, such as those in directory assistance applications. It may contain lexical variations but is primarily free of recognition uncertainty.
● Using spoken queries to retrieve spoken documents
  ○ In this case speech recognition uncertainty exists on both sides, in the queries and in the documents, so this is naturally the most difficult task.

COMPARISON
● Resources
  ○ Text-Based: Rich resources, with huge quantities of text documents available over the internet. Quantity continues to increase exponentially due to convenient access.
  ○ Voice-Based: Spoken/multimedia content is the new trend. Can be realized even sooner given mature technologies.
● Accuracy
  ○ Text-Based: Retrieval accuracy is acceptable to users, and results are properly ranked and filtered.
  ○ Voice-Based: Problems with speech recognition errors, especially for spontaneous speech under adverse environments.
● User-System Interaction
  ○ Text-Based: Retrieved documents are easily summarised on-screen and thus easily scanned and selected by the user. The user may easily select query terms suggested for the next retrieval iteration in an interactive process.
  ○ Voice-Based: Spoken/multimedia documents are not easily summarised on-screen and thus difficult to scan and select. Lacks efficient user-system interaction.

RETRIEVAL ACCURACY
● Lattice-based approaches
  ○ Position Specific Posterior Lattices (PSPL)
  ○ Confusion Networks (CN)
  ○ Time-based Merging for Indexing (TMI)
  ○ Time-anchored Lattice Expansion (TALE)
● Position Specific Posterior Lattices (PSPL)
  ○ Locate a word in a segment according to the position (or sequence ordering) of the word in a path, stored as a tuple (W, d, pos, prob).
● Confusion Networks (CN)
  ○ Cluster several words in a segment according to similar time spans and word pronunciation.
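A minimal sketch of how PSPL-style entries could be kept in an inverted index, assuming the tuple (W, d, pos, prob) holds a word, a segment identifier, a position along a lattice path, and a posterior probability. The names `Posting`, `PsplIndex`, `add`, and `lookup`, and the sample probabilities, are hypothetical and not taken from the slides.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Posting(NamedTuple):
    segment: str   # d: identifier of the spoken segment
    pos: int       # position of the word along a lattice path
    prob: float    # posterior probability of the word at that position


class PsplIndex:
    """Toy inverted index keyed by the word W (illustrative only)."""

    def __init__(self) -> None:
        self._postings: Dict[str, List[Posting]] = defaultdict(list)

    def add(self, word: str, segment: str, pos: int, prob: float) -> None:
        """Store one (W, d, pos, prob) entry under its word."""
        self._postings[word].append(Posting(segment, pos, prob))

    def lookup(self, word: str) -> List[Posting]:
        """Return all postings for a word, or an empty list if unseen."""
        return list(self._postings.get(word, []))


# Made-up example: two competing hypotheses at position 0 of one segment.
index = PsplIndex()
index.add("voice", "seg1", 0, 0.9)
index.add("noise", "seg1", 0, 0.1)
index.add("search", "seg1", 1, 0.8)
```

Keeping competing hypotheses for the same position, each with its own probability, is what lets retrieval tolerate recognition errors instead of relying on a single best transcript.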

RETRIEVAL ACCURACY (Cont’d)
Relevance ranking: compute relevance scores between the segments and a query Q, which is a sequence of words $\{W_j,\ j = 1, 2, \ldots, Q\}$.
● First, calculate the expected tapered count for each N-gram $W_i \ldots W_{i+N-1}$ of the query in a spoken segment d, written $S(d, W_i \ldots W_{i+N-1})$.
● Aggregate the results to produce a score $S_{N\text{-gram}}(d, Q)$ for each order N. In these scores, L is the lattice obtained from d and k is the cluster number in the PSPL or CN structure.
● The different proximity types, one for each N-gram order allowed by the query length Q, are finally combined by a weighted sum to give the final relevance score $S(d, Q)$.
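The scoring equations this slide refers to are not reproduced in the text. The block below is a reconstruction in the slide's notation, following the form commonly used in the PSPL literature. It is an assumption, not necessarily the author's exact formulas. The weights $w_N$ and the notation $P(W_{i+l}, k+l \mid L)$, meaning the posterior probability of word $W_{i+l}$ in cluster $k+l$ of lattice L, are introduced here for illustration.

```latex
% Expected tapered count of one query N-gram in segment d
S(d, W_i \ldots W_{i+N-1}) =
  \log\Bigl[\, 1 + \sum_{k} \prod_{l=0}^{N-1} P(W_{i+l},\, k+l \mid L) \Bigr]

% Aggregate over all N-grams of order N in the query
S_{N\text{-gram}}(d, Q) = \sum_{i=1}^{Q-N+1} S(d, W_i \ldots W_{i+N-1})

% Weighted sum over all N-gram orders allowed by the query length
S(d, Q) = \sum_{N=1}^{Q} w_N \, S_{N\text{-gram}}(d, Q)
```

The logarithm tapers the expected count so that many repetitions of a term in one segment do not dominate the score.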

USER-SYSTEM INTERACTION
● Multi-modal dialogue: for a query given by the user, the retrieval system produces a topic hierarchy, constructed from the retrieved spoken documents, to be shown on the screen.
● Semantic analysis of spoken documents

USER-SYSTEM INTERACTION (Cont’d)
● Key term extraction from spoken documents, based on latent topic significance
● Automatic generation of summaries and titles for spoken documents
● Query-based local semantic structuring of spoken documents
● Semantic structuring of spoken documents
● Interactive retrieval in a dialogue loop

PROPOSED MODEL
This is a three-step process:
1. Speech to text
2. Pattern matching
3. Text to speech
[Flow diagram: Voice → Voice to text → Keyword → BoW (Bag of Words) → Pattern matching with DB → match decision (yes / no) → Voice-based reply]

VOICE TO TEXT
● Fuzzy logic can be used to match speech across different accents, e.g. the word “Vector” has different pronunciations.
● Thus a single word can be represented by a fuzzy set.
● Since a per-word fuzzy set is too specific to fit a generic model of speech recognition, a more general model based on the fuzzification of phonemes can be used.
● This model is applied to spoken sentences. One fuzzy set is based on accent, the second on the speed of pronunciation, and the third on emphasis.
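To make the idea of a fuzzy set concrete, the sketch below fuzzifies speaking speed with triangular membership functions. The slides do not say which membership functions are used, so the triangular shape, the breakpoints in words per minute, and the names `triangular`, `SPEED_SETS`, and `fuzzify_speed` are all assumptions for illustration.

```python
from typing import Dict


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Degree of membership of x in a triangular fuzzy set, in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets for speaking speed, in words per minute.
SPEED_SETS = {
    "slow": (60.0, 100.0, 140.0),
    "normal": (110.0, 150.0, 190.0),
    "fast": (160.0, 200.0, 240.0),
}


def fuzzify_speed(wpm: float) -> Dict[str, float]:
    """Map a crisp speed to its membership degree in each fuzzy set."""
    return {label: triangular(wpm, *abc) for label, abc in SPEED_SETS.items()}


memberships = fuzzify_speed(125.0)
```

A speed of 125 words per minute belongs partly to "slow" and partly to "normal", which is the kind of graded match that lets one model cover speakers with different habits.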

BAG-of-WORDS
● A bag-of-words is a representation of text that describes the occurrence of words within a document. It involves two things:
  ○ A vocabulary of known words.
  ○ A measure of the presence of known words.
● The steps followed:
  ■ Collect data
  ■ Create vocabulary
  ■ Create document vector
  ■ Manage vocabulary
  ■ Score words
  ■ Word hashing
  ■ TF-IDF
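The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming whitespace tokenisation and the unsmoothed TF-IDF weighting tf × log(N / df). The function names and the two sample documents are hypothetical, and word hashing is omitted.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocabulary(docs: List[str]) -> List[str]:
    """Create the vocabulary: the sorted set of all known words."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def count_vector(doc: str, vocab: List[str]) -> List[int]:
    """Create the document vector of raw word counts."""
    counts = Counter(tokenize(doc))
    return [counts[word] for word in vocab]


def tf_idf_vector(doc: str, docs: List[str], vocab: List[str]) -> List[float]:
    """Score words with term frequency times inverse document frequency."""
    tokens = tokenize(doc)
    counts = Counter(tokens)
    token_sets = [set(tokenize(d)) for d in docs]
    vector = []
    for word in vocab:
        tf = counts[word] / len(tokens) if tokens else 0.0
        df = sum(1 for toks in token_sets if word in toks)
        idf = math.log(len(docs) / df) if df else 0.0
        vector.append(tf * idf)
    return vector


docs = ["voice search finds text", "text search finds text"]
vocab = build_vocabulary(docs)
counts = count_vector(docs[1], vocab)
weights = tf_idf_vector(docs[0], docs, vocab)
```

Words that occur in every document, such as "text" here, get a TF-IDF weight of zero, while "voice", which appears in only one document, keeps a positive weight.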

PATTERN MATCHING
● The Boyer-Moore (BM) algorithm can be used. It positions the pattern over the leftmost characters of the text and attempts to match it from right to left. If no mismatch occurs, the pattern has been found; otherwise a shift is computed.
● The shift is the amount by which the pattern is moved to the right before a new matching attempt is undertaken.
● The shift is computed using two heuristics:
  ○ Match heuristic: when the pattern is moved to the right, it must
    i. match all characters previously matched, and
    ii. bring a different character to the position in the text that caused the mismatch.
  ○ Occurrence heuristic: $d[x] = \min\{\, s \mid s = m \text{ or } (0 \le s < m \text{ and } \mathit{pattern}[m - s] = x) \,\}$, where m is the length of the pattern and x is any character of the alphabet.
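A simplified sketch of the right-to-left scan with a shift based only on the occurrence heuristic is shown below. It uses a last-occurrence table, a common textbook way of implementing that heuristic, and leaves out the match heuristic, so it is not the full Boyer-Moore algorithm. The function names are hypothetical.

```python
from typing import Dict


def build_last_occurrence(pattern: str) -> Dict[str, int]:
    """Map each pattern character to the index of its rightmost occurrence."""
    return {ch: i for i, ch in enumerate(pattern)}


def boyer_moore_search(text: str, pattern: str) -> int:
    """Return the index of the first match of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    last = build_last_occurrence(pattern)
    start = 0  # current alignment of the pattern against the text
    while start <= n - m:
        j = m - 1
        # Compare from the right end of the pattern towards the left.
        while j >= 0 and pattern[j] == text[start + j]:
            j -= 1
        if j < 0:
            return start  # no mismatch: the pattern is found
        # Align the mismatching text character with its rightmost
        # occurrence in the pattern, always moving at least one step.
        bad_char = text[start + j]
        start += max(1, j - last.get(bad_char, -1))
    return -1


position = boyer_moore_search("voice based retrieval", "based")
```

When the mismatching text character does not occur in the pattern at all, the pattern jumps past it entirely, which is where the speed-up over a character-by-character scan comes from.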

TEXT TO VOICE
● Once the text is obtained, it must be analysed and then transformed into a phonetic description.
● NLP module:
  ○ Text analysis: first the text is segmented into tokens. The token-to-word conversion then creates the orthographic form of each token; for example, "Mr" becomes "mister" and numbers like 2 are transformed to "two".
  ○ Application of pronunciation rules: after the text analysis is completed, pronunciation rules can be applied. A letter may be silent (h in "caught") or correspond to several phonemes (m in "maximum").
    ■ Dictionary-based solution: a dictionary is used in which all possible forms of words are stored.
    ■ Rule-based solution: rules are generated from the phonological knowledge of dictionaries. Only words whose pronunciation is an exception to the rules are stored.
● Digital Signal Processing (DSP) module: transforms the symbolic information it receives into audible speech.
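The token-to-word conversion can be illustrated with a small normaliser. This is a sketch under simplifying assumptions: the abbreviation table is a made-up sample, each digit is expanded on its own (so "42" becomes "four two"), and punctuation is dropped. The names `ABBREVIATIONS`, `DIGITS`, and `normalize` are hypothetical.

```python
import re
from typing import List

# Hypothetical lookup tables; a real system would need far larger ones.
ABBREVIATIONS = {"mr": "mister", "dr": "doctor"}
DIGITS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def normalize(text: str) -> List[str]:
    """Segment text into tokens and convert each token to a spoken word."""
    words = []
    # Tokens are runs of letters or single digits; punctuation is skipped.
    for token in re.findall(r"[A-Za-z]+|\d", text):
        key = token.lower()
        if key in ABBREVIATIONS:
            words.append(ABBREVIATIONS[key])
        elif key in DIGITS:
            words.append(DIGITS[key])
        else:
            words.append(key)
    return words


spoken = normalize("Mr. Smith has 2 cats")
```

The resulting word list is what the pronunciation step would then look up in a dictionary or pass through letter-to-sound rules.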

CONCLUSION & FUTURE SCOPE
● It can be concluded that this approach is efficient in terms of reduced computational complexity and reduced time.
● Research is under way to make the whole process telephonic.
● Limitations of Bag-of-Words
  ○ Vocabulary
  ○ Sparsity
  ○ Meaning

REFERENCES
[1] R. Uma, B. Latha. “An efficient voice based information retrieval using bag of words based indexing”, International Journal of Engineering & Technology.
[2] Lin-shan Lee and Yi-cheng Pan. “Voice-based Information Retrieval - how far are we from the text-based information retrieval?”, 2009 IEEE.
[3] Kiruthika M, Priyadarsini S, Rishwana Roshan K, Shifana Parvin V.M, Dr. G. Umamaheshwari. “Voice Based Information Retrieval System”, International Journal of Innovative Research in Science, Engineering and Technology.
[4] Personal Voice Based Information Retrieval System, patent.
[5] Lakra, Sachin, et al. “Application of fuzzy mathematics to speech-to-text conversion by elimination of paralinguistic content.” arXiv preprint arXiv:1209.4535 (2012).
[6] Knuth, D., J. Morris, and V. Pratt. 1977. “Fast Pattern Matching in Strings.” SIAM J. on Computing, 6, 323-50.
[7] Boyer, R., and S. Moore. 1977. “A Fast String Searching Algorithm.” CACM, 20, 762-72.
[8] Ondrej Chum, James Philbin, Josef Sivic, Michael Isard, and Andrew Zisserman. “Total recall: Automatic query expansion with a generative feature model for object retrieval.” In ICCV, pages 1–8, 2007.
[9] Hervé Jégou, Matthijs Douze, and Cordelia Schmid. “Improving bag-of-features for large scale image search.” International Journal of Computer Vision, 87(3):316–336, 2010.
