Information Retrieval
Index Construction
Hamid Beigy
Sharif University of Technology
October 6, 2018
Information Retrieval | Introduction
Information Retrieval | Sort-based index construction
Information Retrieval | Single-pass in-memory indexing (SPIMI)
Information Retrieval | Distributed indexing
[Figure: cluster architecture. Each node has its own CPU, memory, and disk. Each rack contains 16-64 nodes connected by a rack switch, and a 2-10 Gbps backbone connects the racks.]
map(key, value):
    // key: document name; value: text of document
    for each word w in value:
        emit(w, 1)

reduce(key, values):
    // key: a word; values: an iterator over counts
    result = 0
    for each count v in values:
        result += v
    emit(result)
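The word-count pseudocode above can be tried out on a single machine. The following is a minimal sketch, not a real MapReduce framework: `map_fn` and `reduce_fn` mirror the two functions on the slide, while `run_word_count` is a hypothetical driver that stands in for the framework's shuffle step (grouping mapper output by key). Splitting on whitespace is an assumption about tokenization, and the two sample documents are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(doc_name: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit (word, 1) for every word in the document text."""
    # Assumption: words are separated by whitespace.
    for word in text.split():
        yield word, 1


def reduce_fn(word: str, counts: Iterable[int]) -> int:
    """Reduce phase: sum all counts emitted for one word."""
    result = 0
    for v in counts:
        result += v
    return result


def run_word_count(docs: Dict[str, str]) -> Dict[str, int]:
    """Hypothetical single-process driver standing in for the framework."""
    # Shuffle step: group the mapper output by key (the word).
    grouped: Dict[str, List[int]] = defaultdict(list)
    for name, text in docs.items():
        for word, count in map_fn(name, text):
            grouped[word].append(count)
    # One reduce call per distinct key.
    return {word: reduce_fn(word, counts) for word, counts in grouped.items()}


# Illustrative input only; not taken from the slides.
docs = {"d1": "caesar came caesar conquered", "d2": "caesar died"}
counts = run_word_count(docs)
print(counts)
```

In a real cluster the map calls run in parallel on different nodes and the grouping happens across the network; the sketch only shows the data flow between the two phases.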
Information Retrieval | Dynamic indexing