Scaling Hibernate, by Emmanuel Bernard and Max Ross (PowerPoint presentation)



SLIDE 1

Scaling Hibernate

Emmanuel Bernard - Max Ross

SLIDE 4

Emmanuel Bernard

  • Hibernate Search in Action
  • blog.emmanuelbernard.com
  • twitter.com/emmanuelbernard
SLIDE 5

Max Ross

  • Google App Engine
  • Hibernate Shards
SLIDE 6

What is scalability?

SLIDE 7

What is scalability?

  • Users
  • Resource
  • Data
  • Uptime
SLIDE 8

How does Hibernate stand?

  • Limitations?
  • SQL optimizations
  • 2nd level cache
  • Conversation

[Diagram: three nodes, each with three Sessions and its own 2nd level cache, all sharing one DB]

SLIDE 9

Changes in mass

  • Bulk insert / update / delete
  • Stateless session
SLIDE 10

To Googolzillions and beyond

SLIDE 11

Googolzillion things? Who are you?

  • Social network
  • SaaS
SLIDE 12

Problem

  • Same data model
  • Too much load
  • Too much data
  • Too many lawyers
SLIDE 13

Separating customer data

SLIDE 14

Logical separation

  • All customers share tables
  • Manual or Hibernate Filter

[Diagram: Application -> SessionFactory -> DB with a single schema]

SLIDE 15

One user per schema

  • One SessionFactory per schema
  • Rewrite SQL

[Diagram: applications holding one SessionFactory per schema, each SessionFactory pointing at its own schema in the DB]

SLIDE 16

Use database security

  • Map JAAS credentials to DB credentials
  • One connection (pool) per user
SLIDE 17

Oracle security

  • Oracle VPD
  • Application defines active user
SLIDE 18

Storing in multiple databases

SLIDE 19

SessionFactory == DB

  • Same schema across DBs
  • Expensive in RAM
  • Data isolated

Sharing state across SessionFactory instances is probably doable

SLIDE 20

How many customers per DB?

  • One
  • One per schema
  • Several per schema
  • Dispatch customer to the right SessionFactory
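
The dispatch step can be sketched as a lookup from customer to database, with one factory built lazily per database. This is only an illustration: `CustomerDispatcher`, its type parameter `F` (standing in for a SessionFactory) and the customer-to-database map are hypothetical names, not part of Hibernate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper: routes a customer to the factory of the database holding its data. */
final class CustomerDispatcher<F> {
    private final Map<String, String> customerToDatabase;
    private final Function<String, F> factoryBuilder;
    // One factory per database, built on first use because factories are costly to create.
    private final Map<String, F> factoriesByDatabase = new ConcurrentHashMap<>();

    CustomerDispatcher(Map<String, String> customerToDatabase, Function<String, F> factoryBuilder) {
        this.customerToDatabase = new HashMap<>(customerToDatabase);
        this.factoryBuilder = factoryBuilder;
    }

    /** Returns the factory for the customer's database, creating it if needed. */
    F factoryFor(String customerId) {
        String database = customerToDatabase.get(customerId);
        if (database == null) {
            throw new IllegalArgumentException("unknown customer: " + customerId);
        }
        return factoriesByDatabase.computeIfAbsent(database, factoryBuilder);
    }
}
```

In a real application `factoryBuilder` would build a SessionFactory from that database's configuration; customers stored in the same database then share one factory.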

SLIDE 21

Adjusting the application layer

SLIDE 22

Homogeneous nodes

  • Memory
  • Too many connections
  • Slow to start

[Diagram: three applications, each holding a SessionFactory and a connection pool for every DB]

SLIDE 23

Specialized nodes

  • Load balancing rules
  • Easy scalability
  • Efficient resource-wise

[Diagram: requests dispatched per user to applications, each holding SessionFactories and connection pools for only a subset of the DBs]

SLIDE 24

What if you need to query all your data?

SLIDE 25

Hibernate Shards

SLIDE 26

Simplified Horizontal Partitioning

  • Separates app logic from federation logic
  • Standard Hibernate API
  • Unified view of your data
SLIDE 27

Shard Strategy

  • Federation logic is application specific
  • Selection
  • Resolution
  • Access

[Diagram: a model object and three shards (Shard 1, Shard 2, Shard 3), with a question mark on each possible route]

SLIDE 28

Shard Selection

  • On which shard do we create the record?
  • Round robin
  • Capacity based
  • Attribute based
  • Performance based
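
As an illustration of the simplest option above, round robin, here is a minimal sketch. `RoundRobinSelector` and its method are hypothetical stand-ins written for this example; they are not the Hibernate Shards API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin selection: new records are spread evenly over the shards. */
final class RoundRobinSelector {
    private final List<Integer> shardIds;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSelector(List<Integer> shardIds) {
        if (shardIds.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardIds = new ArrayList<>(shardIds);
    }

    /** Picks the shard for a new record; the record itself is ignored by this strategy. */
    int selectShardForNewObject(Object entity) {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), shardIds.size());
        return shardIds.get(index);
    }
}
```

An attribute-based strategy would inspect `entity` instead of ignoring it, for example to keep all records of one customer on the same shard.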
SLIDE 29

Shard Resolution

  • On which shard do we find the record?
  • Exhaustive search
  • Map ID ranges to shards
  • Distributed cache
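
Mapping ID ranges to shards can be sketched with a sorted map keyed by the first ID of each range. `RangeShardResolver` is a hypothetical name for this example, and the sketch assumes numeric IDs and contiguous ranges.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical resolver: finds the shard of a record from its ID alone. */
final class RangeShardResolver {
    // Key: first ID of a range. Value: shard holding IDs from that key up to the next key.
    private final TreeMap<Long, Integer> rangeStartToShard = new TreeMap<>();

    void addRange(long firstId, int shardId) {
        rangeStartToShard.put(firstId, shardId);
    }

    /** Returns the shard whose range contains the given ID. */
    int resolveShard(long id) {
        Map.Entry<Long, Integer> entry = rangeStartToShard.floorEntry(id);
        if (entry == null) {
            throw new IllegalArgumentException("no shard covers id " + id);
        }
        return entry.getValue();
    }
}
```

Unlike an exhaustive search, this touches exactly one shard per lookup, at the price of keeping the range table in step with how IDs are generated.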
SLIDE 30

Shard Access

  • How do we apply operations across shards?
  • Serially
  • In parallel (bring your own thread pool)
  • Hybrid
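
The serial and parallel options can be contrasted in a few lines. This is a sketch only: `ShardAccess` is a hypothetical helper, a shard is any type `S`, and the caller supplies the thread pool, as the slide suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper that applies one operation to every shard and merges the results. */
final class ShardAccess {
    /** Runs the operation on one shard after another. */
    static <S, R> List<R> serial(List<S> shards, Function<S, List<R>> operation) {
        List<R> merged = new ArrayList<>();
        for (S shard : shards) {
            merged.addAll(operation.apply(shard));
        }
        return merged;
    }

    /** Runs the operation on all shards concurrently, using the caller's thread pool. */
    static <S, R> List<R> parallel(List<S> shards, Function<S, List<R>> operation, ExecutorService pool) {
        List<Callable<List<R>>> tasks = new ArrayList<>();
        for (S shard : shards) {
            tasks.add(() -> operation.apply(shard));
        }
        List<R> merged = new ArrayList<>();
        try {
            // invokeAll returns the futures in task order, so results merge in shard order.
            for (Future<List<R>> future : pool.invokeAll(tasks)) {
                merged.addAll(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while accessing shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard operation failed", e.getCause());
        }
        return merged;
    }
}
```

A hybrid strategy could, for instance, run serially and stop as soon as enough results are found, falling back to the parallel path for operations that need every shard.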
SLIDE 31

Writing the app is the easy part

  • Operational challenges/risks are amplified
  • Virtual shards can help
SLIDE 32

Virtual Shards

[Diagram: Application -> Sharded SessionFactory -> Virtual Shards 1, 2 and 3, all stored on Physical Shard 1]

SLIDE 33

Virtual Shards

[Diagram: Application -> Sharded SessionFactory -> Virtual Shards 1, 2 and 3, now spread over Physical Shard 1 and Physical Shard 2]
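
The idea behind the two diagrams can be sketched as an extra level of indirection: records belong to a virtual shard forever, and only the virtual-to-physical mapping changes when hardware is added. `VirtualShardMap` is a hypothetical class for this example, and assigning records to virtual shards by ID modulo is an assumption made for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mapping from fixed virtual shards to movable physical shards. */
final class VirtualShardMap {
    private final int virtualShardCount;
    private final Map<Integer, String> virtualToPhysical = new HashMap<>();

    VirtualShardMap(int virtualShardCount, String initialPhysicalShard) {
        this.virtualShardCount = virtualShardCount;
        for (int v = 0; v < virtualShardCount; v++) {
            virtualToPhysical.put(v, initialPhysicalShard);
        }
    }

    /** The virtual shard of a record never changes (assumed here: record ID modulo shard count). */
    int virtualShardFor(long recordId) {
        return (int) Math.floorMod(recordId, (long) virtualShardCount);
    }

    String physicalShardFor(long recordId) {
        return virtualToPhysical.get(virtualShardFor(recordId));
    }

    /** Re-points one virtual shard once its data has been moved to another physical shard. */
    void reassign(int virtualShard, String physicalShard) {
        if (!virtualToPhysical.containsKey(virtualShard)) {
            throw new IllegalArgumentException("unknown virtual shard: " + virtualShard);
        }
        virtualToPhysical.put(virtualShard, physicalShard);
    }
}
```

Starting with more virtual shards than physical ones means a later split only moves whole virtual shards and updates this map; record-to-virtual-shard assignments stay untouched.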

SLIDE 34

Coming Soon

  • Static Data
  • Full-fledged ShardedQuery
  • JPA
SLIDE 35

Hibernate Search

SLIDE 36

Full-text search your domain objects

  • Hibernate + Lucene
  • Same programmatic model
  • Index synchronized
SLIDE 37

Human queries

  • Data set
  • Word centric
  • Typos / Synonyms
  • Relevance
SLIDE 38

SQL underperforms

  • Wildcard
  • Table/Index full scan
  • Multiple joins
  • Relevance?
SLIDE 39

Customer DBA

SLIDE 40

Full-text search

  • Move load away from the DB
  • Replace or complement searches
SLIDE 41

Scalability: symmetric cluster

  • Distributed lock
  • Immediate visibility
  • Affects front end

[Diagram: two Hibernate + Hibernate Search nodes, each sending search requests and index updates to a shared Database and a shared Lucene Directory (index)]

SLIDE 42

Scalability: asymmetric cluster

  • Search local / change sent to master
  • Asynchronous indexing (delay)
  • No front end extra cost / good scalability

[Diagram: slave nodes (Hibernate + Hibernate Search) answer search requests from a local copy of the Lucene Directory (index) and send index update orders to a JMS queue; the master processes the queue and updates the master Lucene Directory, which is then copied to the slaves; all nodes share the Database]
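
The flow of the asymmetric cluster can be mimicked in memory to show where the indexing delay comes from. Everything here is a hypothetical stand-in: a `BlockingQueue` plays the JMS queue, sets of terms play the Lucene indexes, and `AsymmetricIndexCluster` is not a Hibernate Search class.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical in-memory model of the master/slave indexing flow. */
final class AsymmetricIndexCluster {
    // Stands in for the JMS queue carrying index update orders to the master.
    private final BlockingQueue<String> updateQueue = new LinkedBlockingQueue<>();
    private final Set<String> masterIndex = new HashSet<>();
    // The slave's local copy of the index; replaced wholesale on each copy.
    private volatile Set<String> slaveCopy = new HashSet<>();

    /** Slave side: a change is not indexed locally, only sent to the master. */
    void sendIndexUpdate(String term) {
        updateQueue.add(term);
    }

    /** Slave side: searches read the local copy only, so they add no load elsewhere. */
    boolean search(String term) {
        return slaveCopy.contains(term);
    }

    /** Master side: applies every queued index update order to the master index. */
    void masterProcessQueue() {
        List<String> batch = new ArrayList<>();
        updateQueue.drainTo(batch);
        masterIndex.addAll(batch);
    }

    /** Periodic step: copies the master index to the slave. */
    void copyIndexToSlave() {
        slaveCopy = new HashSet<>(masterIndex);
    }
}
```

A term becomes searchable on the slave only after both the master has processed the queue and the index has been copied, which is the asynchronous delay mentioned in the bullets.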

SLIDE 43

Scalabilities (sic)

  • Hibernate a good citizen
  • Isolating customer data
  • Deal with multiple databases
  • Hibernate Shards
  • Hibernate Search
SLIDE 44

Q&A

  • For more information
  • Hibernate Search in Action
  • Java Persistence with Hibernate
  • Max’s podcasts
  • http://google-code-updates.blogspot.com/2007/08/google-developer-podcast-episode-six.html

  • http://www.javaworld.com/podcasts/jtech/2008/072408jtech.html
  • hibernate.org