What you get by replicating Lucene indexes on the Infinispan Data Grid. Sanne Grinovero, Red Hat, 4 June 2012.


slide-1
SLIDE 1

What you get by replicating Lucene indexes on the Infinispan Data Grid

4 June 2012 Sanne Grinovero, Red Hat

slide-2
SLIDE 2

Who is that guy?

  • Sanne Grinovero
  • From this planet
  • Team Hibernate

  • Hibernate Search
  • Hibernate OGM
  • Team Infinispan

  • Infinispan Core
  • Infinispan Query
  • Apache Lucene, Netty, HotSpot, ANTLR, JGroups, Byteman, The Jokre

slide-3
SLIDE 3

What are we talking about?

  • Apache Lucene
  • Infinispan
  • Integrations with Lucene
  • Infinispan Lucene Directory
slide-4
SLIDE 4

Apache Lucene?

slide-5
SLIDE 5
  • An in-memory datagrid
  • Memory of multiple nodes
  • Cluster modes
  • CacheLoaders
  • Integrations with Lucene
  • Lucene Directory
slide-6
SLIDE 6

Infinispan API?

  • Map-like key/value store
  • JSR 107 javax.cache.Cache interface
  • JSR 347 ??
  • Asynchronous API
slide-7
SLIDE 7

In practice:

cache.put( "user-34", userInstance );
cache.get( "user-34" );
cache.remove( "user-34" );
cache.putIfAbsent( "user-38", other );
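The four calls above follow the semantics of java.util.concurrent.ConcurrentMap, which the Infinispan Cache interface extends. As a minimal sketch that runs without a cluster, the example below uses a plain ConcurrentHashMap as a stand-in for the cache. The class name, the newCache() helper and the string values are assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MapLikeCacheSketch {

    // Stand-in for an Infinispan cache: a plain in-JVM ConcurrentMap (assumption).
    static ConcurrentMap<String, String> newCache() {
        return new ConcurrentHashMap<>();
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = newCache();

        cache.put("user-34", "Alice");              // store a value under a key
        System.out.println(cache.get("user-34"));   // read it back
        cache.remove("user-34");                    // delete the entry
        System.out.println(cache.get("user-34"));   // key is gone, so get returns null

        // putIfAbsent writes only when the key is missing
        // and returns the previous value (null when there was none).
        System.out.println(cache.putIfAbsent("user-38", "Bob"));
        System.out.println(cache.putIfAbsent("user-38", "Carol"));
        System.out.println(cache.get("user-38"));   // the first value is kept
    }
}
```

The atomic putIfAbsent call is the interesting one: on a real data grid it gives every node the same answer about who wrote first.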

slide-8
SLIDE 8

Distributed Data

slide-9
SLIDE 9

Connected via JGroups

A Toolkit for Reliable Multicast Communication http://jgroups.org

slide-10
SLIDE 10

Or remote clients via:

  • Memcached
  • REST
  • Hot Rod (Ruby, Python, C, C#, ...)
  • Netty
slide-11
SLIDE 11

Consistent Hashing: DIST
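To make the idea concrete, here is a minimal consistent-hashing sketch: nodes are placed on a hash ring, and each key is owned by the first few distinct nodes found clockwise from the key's hash. This is a generic textbook version, not Infinispan's actual DIST implementation. The class name, the CRC32 hash, the virtual-node scheme and the node names are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.zip.CRC32;

class ConsistentHashSketch {
    // Ring position -> node name; several positions ("virtual nodes") per node.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashSketch(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // CRC32 is used only because it ships with the JDK (assumption, not Infinispan's hash).
    static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    // Walks clockwise from the key's position and returns up to numOwners distinct nodes.
    List<String> owners(String key, int numOwners) {
        List<String> result = new ArrayList<>();
        long h = hash(key);
        collect(ring.tailMap(h, true).values(), numOwners, result);   // from the key to the end
        collect(ring.headMap(h, false).values(), numOwners, result);  // wrap around to the start
        return result;
    }

    private static void collect(Iterable<String> nodes, int limit, List<String> result) {
        for (String n : nodes) {
            if (result.size() >= limit) return;
            if (!result.contains(n)) result.add(n);
        }
    }

    public static void main(String[] args) {
        ConsistentHashSketch ring = new ConsistentHashSketch(16);
        ring.addNode("node-A");
        ring.addNode("node-B");
        ring.addNode("node-C");
        System.out.println("owners of user-34: " + ring.owners("user-34", 2));
    }
}
```

The property that matters for a grid is that removing a node which does not own a key leaves that key's owners unchanged, so only a fraction of the data moves when the cluster changes.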

slide-12
SLIDE 12

Transactions!

slide-13
SLIDE 13

JBoss AS7 core component

  • Cluster nodes autodiscovery
  • Session replication / failover
  • Hibernate second level cache
  • mod_cluster integration
slide-14
SLIDE 14

In-memory volatile?

Cache Stores: durability, warm caches, more capacity...

  • Cassandra
  • HBase
  • JDBC
  • Clouds (S3, ...)
  • Plain Old Files
  • Many more + custom
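The role of a cache store can be sketched as write-through plus read-through: writes go to the store and to memory, and a read that misses memory falls back to the store. The example below is a simplified illustration under stated assumptions, not the Infinispan CacheStore SPI. A second map stands in for the durable backend, and all class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CacheStoreSketch {
    private final Map<String, String> memory = new ConcurrentHashMap<>();
    private final Map<String, String> store;   // stand-in for files, JDBC, S3, ... (assumption)

    CacheStoreSketch(Map<String, String> store) {
        this.store = store;
    }

    // Write-through: persist first, then keep a copy in memory.
    void put(String key, String value) {
        store.put(key, value);
        memory.put(key, value);
    }

    // Read-through: on a memory miss, load from the store and warm the cache.
    String get(String key) {
        String value = memory.get(key);
        if (value == null) {
            value = store.get(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    // Drops the in-memory copy only; the stored copy survives.
    void evict(String key) {
        memory.remove(key);
    }

    boolean inMemory(String key) {
        return memory.containsKey(key);
    }

    public static void main(String[] args) {
        CacheStoreSketch cache = new CacheStoreSketch(new ConcurrentHashMap<>());
        cache.put("user-34", "Alice");
        cache.evict("user-34");
        System.out.println(cache.inMemory("user-34"));  // entry was evicted from memory
        System.out.println(cache.get("user-34"));       // reloaded from the store
    }
}
```

This covers the three benefits named on the slide: entries survive a restart of the in-memory part, a cold cache warms up on demand, and the store can hold more than fits in memory.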
slide-15
SLIDE 15

Back on Lucene: Single Writer lock
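Lucene allows only one writer on an index at a time. On a shared key/value store, one way to enforce that is an atomic putIfAbsent on a well-known lock key: whoever inserts the key first holds the lock. The sketch below illustrates the idea with a ConcurrentHashMap standing in for the shared cache. It is an assumption about how such a lock could be built, not the actual Infinispan Lucene Directory code, and the key format and names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SingleWriterLockSketch {
    private final ConcurrentMap<String, String> cache;
    private final String lockKey;
    private final String owner;

    SingleWriterLockSketch(ConcurrentMap<String, String> cache, String indexName, String owner) {
        this.cache = cache;
        this.lockKey = indexName + "/write.lock";   // hypothetical key format
        this.owner = owner;
    }

    // Succeeds only for the first caller; putIfAbsent is atomic.
    boolean tryAcquire() {
        return cache.putIfAbsent(lockKey, owner) == null;
    }

    // Removes the lock only if this owner still holds it.
    void release() {
        cache.remove(lockKey, owner);
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> shared = new ConcurrentHashMap<>();
        SingleWriterLockSketch nodeA = new SingleWriterLockSketch(shared, "index-1", "node-A");
        SingleWriterLockSketch nodeB = new SingleWriterLockSketch(shared, "index-1", "node-B");

        System.out.println(nodeA.tryAcquire());  // first writer gets the lock
        System.out.println(nodeB.tryAcquire());  // second writer is refused
        nodeA.release();
        System.out.println(nodeB.tryAcquire());  // lock is free again
    }
}
```

A production lock would also need to handle a holder that crashes without releasing; that is left out here.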

slide-16
SLIDE 16

Queue-based clustering

(filesystem index)

slide-17
SLIDE 17

Lucene index storage

slide-18
SLIDE 18
slide-19
SLIDE 19

Index stored in Infinispan
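One way to picture an index living in a key/value grid is that each index file is cut into fixed-size chunks, and each chunk is stored under a key built from the file name and the chunk number. The sketch below shows that layout with in-JVM maps standing in for the grid. The key format, the class and its methods are assumptions for illustration, not the Infinispan Lucene Directory's actual data model.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ChunkedFileSketch {
    private final Map<String, byte[]> chunks = new ConcurrentHashMap<>();    // chunk key -> bytes
    private final Map<String, Integer> lengths = new ConcurrentHashMap<>();  // file name -> length
    private final int chunkSize;

    ChunkedFileSketch(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    // Hypothetical key format: file name, separator, chunk number.
    static String chunkKey(String fileName, int chunkNumber) {
        return fileName + "|" + chunkNumber;
    }

    // Splits the content into chunks of at most chunkSize bytes.
    // Files are assumed write-once, as Lucene segment files are.
    void writeFile(String fileName, byte[] content) {
        int chunkNumber = 0;
        for (int offset = 0; offset < content.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, content.length);
            chunks.put(chunkKey(fileName, chunkNumber++), Arrays.copyOfRange(content, offset, end));
        }
        lengths.put(fileName, content.length);
    }

    // Reassembles the file by reading its chunks in order; null if the file is unknown.
    byte[] readFile(String fileName) {
        Integer length = lengths.get(fileName);
        if (length == null) {
            return null;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream(length);
        for (int chunkNumber = 0; out.size() < length; chunkNumber++) {
            byte[] chunk = chunks.get(chunkKey(fileName, chunkNumber));
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    int chunkCount(String fileName) {
        Integer length = lengths.get(fileName);
        return length == null ? 0 : (length + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        ChunkedFileSketch directory = new ChunkedFileSketch(4);
        directory.writeFile("_0.cfs", new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
        System.out.println(directory.chunkCount("_0.cfs"));
        System.out.println(Arrays.toString(directory.readFile("_0.cfs")));
    }
}
```

With this layout a reader only fetches the chunks it touches, and a file that fits in a single chunk costs a single lookup.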

slide-20
SLIDE 20

Example architecture: JIRA / Scarlet

slide-21
SLIDE 21

Hints

  • Some tuning options might have different effects than what you're used to
  • Network is orders of magnitude faster than disk (YMMV)
  • But data locality helps
  • Balance resources
  • Get mergers to avoid segment chunking, or read locks will engage

slide-22
SLIDE 22

“benchmarks”, stats and more lies

[Charts: queries per second and write ops per second, compared across Infinispan Local, FSDirectory, Infinispan D40, Infinispan D4, Infinispan 0 and RAMDirectory]

slide-23
SLIDE 23

[Charts: queries per second and write ops per second, compared across Infinispan Local, FSDirectory, Infinispan D40, Infinispan D4, Infinispan 0 and RAMDirectory]

It's not about the figures

slide-24
SLIDE 24

What's next?

  • Infinispan (core) 5.2 and 6
  • Lucene 4.x
  • Dynamic chunk sizes
  • Ad-hoc “Lucene native” CacheStore
  • NIO byte buffers?
slide-25
SLIDE 25

Conclusions

  • Quick index replication
  • Transactions
  • Not a replacement for shards
  • Cloud-friendly
  • Delegates to any storage
slide-26
SLIDE 26

Q&A

@Infinispan @Hibernate @SanneGrinovero http://infinispan.org http://in.relation.to http://jboss.org