Elastic Search - Aditi Choksi (EW18455)



slide-1
SLIDE 1

Elastic Search

  • Aditi Choksi (EW18455)
slide-2
SLIDE 2

Elastic Search

  • Search engine
  • Distributed search
  • Full-text search
  • Near-real-time search
slide-3
SLIDE 3

Evolution of Data

  • The size of data being generated and stored has grown exponentially over the past few decades.


slide-4
SLIDE 4

Need for Distributed Data Systems

  • Vertical Scaling – increase machine size
  • Horizontal Scaling – add more machines
  • Elasticsearch sends a query to every node / machine that holds a shard of the index, then collects and combines the partial results to return to the user.
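
The query fan-out described on this slide can be sketched as a scatter-gather step. This is a minimal illustration, not Elasticsearch's implementation: the in-memory `SHARDS` list, the sample documents, and the occurrence-count score are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical in-memory "shards": each maps a doc id to its text.
SHARDS = [
    {1: "winter is coming"},
    {2: "ours is the fury"},
    {3: "the choice is yours"},
]

def search_shard(shard, term):
    """Return (score, doc_id) hits for one shard; score = occurrences of term."""
    hits = []
    for doc_id, text in shard.items():
        score = text.split().count(term)
        if score:
            hits.append((score, doc_id))
    return hits

def scatter_gather(term, top_k=10):
    """Send the query to every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda s: search_shard(s, term), SHARDS))
    merged = [hit for part in partials for hit in part]
    # Highest score first; ties are broken by the lower doc id.
    return heapq.nsmallest(top_k, merged, key=lambda h: (-h[0], h[1]))
```

Calling `scatter_gather("the")` queries all three shards and returns the merged hits from the two shards that contain the term.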

slide-5
SLIDE 5

Elastic Search Cluster

[Figure: cluster diagram with shard labels; Shard 1, Shard 2, and Shard 3 each appear twice.]

slide-6
SLIDE 6

Lucene Index

Segment

slide-7
SLIDE 7

Inverted Indexes

Term      Frequency  Documents
choice    1          3
coming    1          1
contours  2          2, 3
fury      1          2
is        3          1, 2, 3
ours      1          2
the       2          2, 3
winter    1          1
yours     1          3

The Term and Frequency columns form the dictionary; the Documents column holds the postings.
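
A small sketch of how a dictionary and postings like these could be built. The three documents in `DOCS` are assumed for illustration (the slide shows only document ids, not their text), so the result omits "contours" and does not reproduce the slide's table exactly.

```python
from collections import defaultdict

# Assumed documents; the slide shows only document ids, not their text.
DOCS = {
    1: "Winter is coming",
    2: "Ours is the fury",
    3: "The choice is yours",
}

def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    # Terms are kept in sorted order, each with its postings list.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}

index = build_inverted_index(DOCS)
# Frequency here means the number of documents that contain the term.
frequency = {term: len(ids) for term, ids in index.items()}
```

Looking up a term is then a single dictionary access, for example `index["is"]`, instead of a scan over every document.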

slide-8
SLIDE 8

Inverted Indexes

[Same dictionary and postings table as the previous slide, with one row highlighted: frequency 2, documents 2, 3.]

slide-9
SLIDE 9

Wild Card Queries

  • Wildcard searches are difficult
  • These are unindexed queries
  • So searching for something like *our* requires going through all the terms of the index.

Terms in the index (with frequency): choice (1), coming (1), contours (2), fury (1), is (3), ours (1), the (2), winter (1), yours (1)
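
To make the cost difference concrete, the sketch below contrasts a wildcard scan over every term with a prefix lookup that binary-searches a sorted term list. It is a simplified model, not Lucene's term dictionary; the function names are hypothetical, and the `"\uffff"` sentinel assumes no term contains that character.

```python
from bisect import bisect_left
from fnmatch import fnmatchcase

# Sorted term dictionary taken from the slide's table.
TERMS = ["choice", "coming", "contours", "fury", "is",
         "ours", "the", "winter", "yours"]

def wildcard_scan(pattern):
    """Unindexed path: test the pattern against every term in the dictionary."""
    return [t for t in TERMS if fnmatchcase(t, pattern)]

def prefix_search(prefix):
    """Indexed path: binary-search the sorted dictionary for the prefix range."""
    lo = bisect_left(TERMS, prefix)
    hi = bisect_left(TERMS, prefix + "\uffff")
    return TERMS[lo:hi]
```

`wildcard_scan("*our*")` has to visit all nine terms, while `prefix_search("co")` touches only a logarithmic number of entries before slicing out the matching range.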

slide-10
SLIDE 10

Question

  • Can you think of a way to make queries like *ours efficient? What kind of index can we create?

Terms in the index (with frequency): choice (1), coming (1), contours (2), fury (1), is (3), ours (1), the (2), winter (1), yours (1)

slide-11
SLIDE 11

Question

  • Can you think of a way to make queries like *ours efficient? What kind of index can we create?
  • Reverse Indexing: *ours → sruo* (a suffix query becomes a prefix query on the reversed terms)
  • search(our*) union search(sruo*) returns the terms that start with "our" or end with "ours"

Term      Reversed word
choice    eciohc
coming    gnimoc
contours  sruotnoc
fury      yruf
is        si
ours      sruo
the       eht
winter    retniw
yours     sruoy
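
The reverse-indexing idea can be sketched as a second sorted dictionary of reversed terms, so that a suffix query turns into a prefix range lookup. This is an illustrative sketch only; `prefix_range` and `suffix_search` are hypothetical helper names, and the `"\uffff"` sentinel assumes no term contains that character.

```python
from bisect import bisect_left

# Term dictionary taken from the slide's table.
TERMS = ["choice", "coming", "contours", "fury", "is",
         "ours", "the", "winter", "yours"]

# Second dictionary: every term reversed, kept sorted for binary search.
REVERSED = sorted(t[::-1] for t in TERMS)

def prefix_range(sorted_terms, prefix):
    """Return all entries of a sorted list that start with prefix."""
    lo = bisect_left(sorted_terms, prefix)
    hi = bisect_left(sorted_terms, prefix + "\uffff")
    return sorted_terms[lo:hi]

def suffix_search(suffix):
    """Answer *suffix by running a prefix search on the reversed dictionary."""
    matches = prefix_range(REVERSED, suffix[::-1])
    return sorted(r[::-1] for r in matches)
```

The union on the slide corresponds to `set(prefix_range(TERMS, "our")) | set(suffix_search("ours"))`. The trade-off is a second copy of the term dictionary in exchange for avoiding a full scan on leading-wildcard queries.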

slide-12
SLIDE 12

Bottom up

[Figure: cluster diagram with shard labels; Shard 1, Shard 2, and Shard 3 each appear twice.]

  • Index segments are immutable; obsolete entries are cleaned up only when segments are merged.
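
A minimal sketch of the merge step, assuming a simplified model in which a segment is a frozen mapping from terms to document ids and deletions are tracked as a separate set of ids. The `Segment`, `make_segment`, and `merge` names are hypothetical and do not mirror Lucene's classes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    """An immutable segment: a tuple of (term, (doc_id, ...)) pairs."""
    postings: tuple

def make_segment(postings):
    """Freeze a term -> doc ids mapping into a sorted, immutable segment."""
    return Segment(tuple(sorted((t, tuple(sorted(ids))) for t, ids in postings.items())))

def merge(segments, deleted):
    """Write a new segment from old ones, dropping doc ids marked as deleted."""
    combined = {}
    for seg in segments:
        for term, ids in seg.postings:
            live = [d for d in ids if d not in deleted]
            if live:
                combined.setdefault(term, set()).update(live)
    return make_segment(combined)
```

The old segments are never modified: a delete only adds an id to `deleted`, and the obsolete entries disappear when `merge` writes the replacement segment.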

slide-13
SLIDE 13

References

  • [1] Reaz Ahmed, R. Boutaba, 2011, “A Survey of Distributed Search Techniques in Large Scale Distributed Systems”, IEEE Communications Surveys and Tutorials
  • [2] Enrico Nardelli, Fabio Barillari, 2015, “Distributed Searching of Multi-dimensional Data”
  • [3] ShaoHua Liu, Xing Xue, 2016, “Distributed Database Query Based on Improved Genetic Algorithm”, 3rd International Conference on Information Science and Control Engineering
  • [4] Clinton Gormley, Zachary Tong, 2015, “Elasticsearch: The Definitive Guide”
  • https://www.youtube.com/watch?v=lWKEphKIG8U
slide-14
SLIDE 14

Thanks ☺