Information Retrieval - Venkatesh Vinayakarao (venkateshv@cmi.ac.in)



SLIDE 1

What we find changes who we become.
- Peter Morville

Venkatesh Vinayakarao (Vv)

Information Retrieval

Venkatesh Vinayakarao venkateshv@cmi.ac.in Chennai Mathematical Institute

https://vvtesh.sarahah.com/

SLIDE 2

Agenda

A gentle introduction to “Information Retrieval”

SLIDE 3

About Me

Timeline (figure): BE (Computer Science); MS (Information Tech.); Software Engineer (Yahoo, Microsoft); PhD (IR + PA + SE); Teaching & Research (IR + PA + SE), till date.

Years marked on the timeline: 2000, 2002, 2004, 2009, 2013, 2018.

IR = Information Retrieval, PA = Program Analysis, SE = Software Engineering

SLIDE 4

Acknowledgment

Some slides are borrowed from the companion website of Manning et al.’s IR book (https://nlp.stanford.edu/IR-book/).

A good teacher can inspire hope, ignite the imagination, and instill a love of learning.
- Brad Henry

SLIDE 5

Life without search engines is difficult to imagine!

SLIDE 6

Search in Banking and Finance

Lots of products to sell

Reach a part of documentation faster

SLIDE 7

Search in Sports, Travel & Entertainment

Search events, programs, and schedules

SLIDE 8

Search in Education, Ecommerce and Healthcare

Search courses, articles, symptoms, books, etc.

SLIDE 9

Results of job search conducted on 18th June 2020 on https://www.linkedin.com/jobs for solr/information retrieval/search

SLIDE 10

Introduction

What is Information Retrieval?

Collection: d1: “IIT Madras”, d2: “CMI”, …

Information Need: Let us learn more about CMI

Query = “CMI”

Retrieval System

Results = ??

Information Retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections.

– From the Manning et al. IR Book.
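
The collection, query, and results in this slide can be illustrated with a minimal sketch. It assumes a toy two-document collection and exact, case-insensitive term matching; the function names `tokenize`, `build_index`, and `retrieve` are illustrative and do not come from the slides or the IR book.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(collection):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in collection.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return sorted(matches)


# Toy collection taken from the slide.
collection = {"d1": "IIT Madras", "d2": "CMI"}
index = build_index(collection)
results = retrieve(index, "CMI")
```

Here `results` holds the ids of the documents matching the query “CMI”. This sketch only performs Boolean matching; it does not rank results or judge whether they satisfy the information need.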

SLIDE 11

Information

Information is any entity or form that provides the answer to a question of some kind or resolves uncertainty. – Wikipedia.

Shannon’s Definition, Fisher Information, von Neumann Entropy, …
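
Of the definitions named above, Shannon's ties most directly to the idea of resolving uncertainty: the entropy of a discrete distribution, H = -Σ p·log2(p), measures the average uncertainty in bits. The following is a minimal sketch, assuming the distribution is given as a list of probabilities that sum to 1; the function name `shannon_entropy` is illustrative.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# A fair coin is maximally uncertain for two outcomes.
fair_coin = shannon_entropy([0.5, 0.5])

# An outcome that is already certain carries no uncertainty to resolve.
certain = shannon_entropy([1.0, 0.0])
```

In this sketch the fair coin yields one bit of entropy and the certain outcome yields zero, matching the intuition that information is what removes uncertainty.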

SLIDE 12

Introduction

Role of Information

  • If only you knew
      • Which stock to invest in?
      • Which faculty to work with?
      • How to get into a top college?
      • Which course to register for?
      • What to study?
      • How to prepare for job interviews?
  • If only you had the information, you could rule this world!
  • What happens when you are deprived of all information?

SLIDE 13

Introduction

Solitary Confinement is Cruel

SLIDE 14

Introduction

Information

Several retrieval systems: Lycos, AltaVista, MSN, Baidu, Yahoo!, Ask.com, etc.

Timeline (figure):

  • Royal Library of Alexandria: 300 BC
  • Bibliothèque nationale de France: 1463
  • British Library: 1970s, a collection of 170+ million items
  • Digital Libraries: 1970s (Universal Digital Library, Project Gutenberg, etc.)
  • Google: 1998, 30 trillion documents, rising to 130 trillion in 2016

SLIDE 15

Information Retrieval – Road Ahead

Pipeline (figure): Documents, Processed Content, Index, Query, Retrieval System, Results, Human Judges.

Topics marked 1 to 6 on the figure include: Crawling, Relevance and Ranking, Index Compression, Evaluation Techniques, Content Processing.

SLIDE 16

Technologies & Frameworks

  • Galago, Indri (UMass & CMU)
  • Univ. of Glasgow
  • Apache (several projects)

Thanks to these, we can now focus on more complex problems. There are many more.

SLIDE 17

Entity Search

SLIDE 18

In spite of all these developments, searching effectively remains an art.

This is why academia, research labs, and the software industry need you. Let us strive to build better search experiences.

Research contributions from leading companies at SIGIR 2020

SLIDE 19

Introduction

Resources

Course Text

Reference

SLIDE 20

Thank You