What we find changes who we become.
- Peter Morville.
Venkatesh Vinayakarao (Vv)
Information Retrieval
Venkatesh Vinayakarao venkateshv@cmi.ac.in Chennai Mathematical Institute
https://vvtesh.sarahah.com/
[Figure: career timeline. Stages: BE (Computer Science); MS (Information Tech.); Software Engineer (three roles; Yahoo and Microsoft are named); PhD (IR + PA + SE); Teaching & Research (IR + PA + SE), till date. Year markers: 2000, 2002, 2004, 2009, 2013, 2018.]
IR = Information Retrieval, PA = Program Analysis, SE = Software Engineering
Acknowledgment: Some slides are borrowed from the companion website of Manning et al.’s IR book (https://nlp.stanford.edu/IR-book/).
A good teacher can inspire hope, ignite the imagination, and instill a love of learning.
Life without search engines is difficult to imagine!
Lots of products to sell
Reach a part of documentation faster
Search events, programs, and schedules
Search courses, articles, symptoms, books, etc.
Results of a job search for solr/information retrieval/search, conducted on 18 June 2020 at https://www.linkedin.com/jobs
Introduction
Collection: d1: “IIT Madras”, d2: “CMI”, …
Information need: Let us learn more about CMI
Query = “CMI”
Retrieval System
Results = ??
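The toy example above can be tried out with a few lines of Python. This is only a minimal sketch, not a method from the slides: it assumes the collection holds just the two documents shown (the slide elides the rest), and the function name `retrieve` and its whole-word, case-insensitive matching rule are illustrative choices.

```python
# Toy collection from the slide. The slide elides further documents,
# so only d1 and d2 are included here.
collection = {"d1": "IIT Madras", "d2": "CMI"}


def retrieve(query, docs):
    """Return ids of documents that contain every query term.

    Matching is case-insensitive and on whole whitespace-separated words.
    This is an assumed, deliberately naive matching rule.
    """
    terms = query.lower().split()
    results = []
    for doc_id, text in docs.items():
        words = text.lower().split()
        if all(term in words for term in terms):
            results.append(doc_id)
    return results


results = retrieve("CMI", collection)
print(results)
```

A linear scan like this is enough for two documents; it does not scale to large collections, which is why real systems build an index first.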
Information Retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text)
that satisfies an information need from within large collections (usually stored on computers).
– From the Manning et al. IR Book.
Shannon’s Definition, Fisher Information, von Neumann Entropy, …
Several retrieval systems: Lycos, AltaVista, MSN, Baidu, Yahoo!, Ask.com, etc.
Royal Library of Alexandria, 300 BC
Bibliothèque nationale de France, 1463
British Library, 1970s: 170+ million items in the collection
Digital libraries, 1970s: Universal Digital Library, Project Gutenberg, etc.
1998: 30 trillion documents, rising to 130 trillion in 2016
[Figure: overview of a retrieval system. Components: Documents, Processed Content, Index, Query, Retrieval System, Results, Human Judges. Topics, numbered 1 to 6 in the figure: Crawling, Content Processing, Index Compression, Relevance and Ranking, Evaluation Techniques.]
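The figure names "Processed Content" and "Index" as separate stages. As a rough sketch of how the two might connect, the Python below builds an inverted index (term to document ids) and answers a conjunctive query from it. The choice of an inverted index, the helper names `build_index` and `search`, the lowercase whitespace tokenization, and the document `d3` are all assumptions made for illustration; the slide does not specify any of them.

```python
from collections import defaultdict


def build_index(docs):
    """Content processing + indexing: map each token to the set of
    document ids that contain it (a simple inverted index)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():  # assumed tokenization
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents containing all query terms."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# d1 and d2 come from the earlier toy collection; d3 is hypothetical.
docs = {"d1": "IIT Madras", "d2": "CMI", "d3": "CMI Chennai"}
index = build_index(docs)
print(search(index, "cmi"))
```

Unlike a linear scan, a query here touches only the postings of its own terms, which is the main reason for building the index up front.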
Galago, Indri (UMass & CMU)
Apache Apache
Thanks to these, we can now focus on more complex problems.
Apache There are many more….
In spite of all these developments, searching effectively remains an art.
This is why academia, research labs, and the software industry need you. Let us strive to build better search experiences.
Research contributions from leading companies at SIGIR 2020
Course Text Reference