Mapping Relational Data Model Patterns To The App Engine Datastore

Max Ross
November 19, 2009

Agenda

  • App Engine Datastore Basics
  • Soft Schemas
  • Moving To App Engine
  • Leaving App Engine
  • Questions

The App Engine Datastore

The Datastore Is...

  • Transactional
  • Natively Partitioned
  • Hierarchical
  • Schema-less
  • Based on Bigtable
  • Not a relational database

Simplifying Storage

  • Simplify development of apps
  • Simplify management of apps
  • Scale always matters

    – Request volume
    – Data volume

[Chart: Records (1,000 to 10,000,000) plotted against Concurrent Users (1 to 1,000,000). Labeled regions: small dataset, light usage; medium dataset, medium usage; large dataset, heavy usage; large dataset, light usage; small dataset, heavy usage.]

What’s The Value Prop?

  • Free to get started
  • Pay only for what you need
  • Let someone else manage

    – upgrades
    – redundancy
    – connectivity

  • Let someone else scramble when things go south
  • Scale automatically to any point on the scale curve
  • Remember this when I’m telling you what you have to give up!

Datastore Storage Model

  • Basic unit of storage is an Entity consisting of

    – Kind (table)
    – Key (primary key)
    – Entity Group (partition)
    – 0..N typed Properties (columns)

Example entity:

    Kind          Person
    Entity Group  /Person:Ethel
    Key           /Person:Ethel
    Age           Int64: 30
    Best Friend   Key:/Person:Sally
                  Key:/Person:Dave
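The storage model on this slide can be pictured with a small in-memory class. This is only a sketch: `SketchEntity` and its fields are hypothetical names, not the App Engine API, and treating the two Best Friend keys as one multi-valued property is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a datastore entity; not the App Engine API. */
final class SketchEntity {
    final String kind;         // plays the role of a table name
    final String key;          // plays the role of a primary key, e.g. "/Person:Ethel"
    final String entityGroup;  // the partition the entity lives in
    // 0..N typed properties; a value may be a single object or a List of objects.
    final Map<String, Object> properties = new LinkedHashMap<String, Object>();

    SketchEntity(String kind, String key, String entityGroup) {
        this.kind = kind;
        this.key = key;
        this.entityGroup = entityGroup;
    }

    /** Returns a property as a list so single and multi-valued properties read alike. */
    List<?> values(String name) {
        Object v = properties.get(name);
        if (v == null) return Collections.emptyList();
        return (v instanceof List) ? (List<?>) v : Collections.singletonList(v);
    }

    /** Builds the example entity from the slide. */
    static SketchEntity ethel() {
        SketchEntity e = new SketchEntity("Person", "/Person:Ethel", "/Person:Ethel");
        e.properties.put("Age", Long.valueOf(30L));
        // Assumption: both keys listed on the slide belong to one multi-valued property.
        e.properties.put("Best Friend", Arrays.asList("/Person:Sally", "/Person:Dave"));
        return e;
    }
}
```

Nothing here fixes the set of properties in advance, which is the schema-less aspect: two entities of the same kind could carry different property names.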

Soft Schemas

“A soft schema is a schema whose constraints are enforced purely in the application layer.”

  • App’s expectations define the schema
  • Simpler development process

– Rapid typesafe prototyping

  • Think about data in a familiar way

[Diagram: where each responsibility lives]

    With an RDBMS
        App:        Business Logic
        RDBMS:      Schema, Type Checking, FK Constraints, CRUD, Query Engine, ID Generation

    With the GAE Datastore
        App:        Business Logic, Schema, Type Checking, FK Constraints
        Datastore:  CRUD, Query Engine, ID Generation
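To make "enforced purely in the application layer" concrete, here is a minimal sketch of type checking done by the app before anything is stored. `SoftSchema`, `require`, and `validate` are hypothetical names for illustration; in practice a framework such as JPA would play this role.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical application-layer schema check; the storage layer would accept any property map. */
final class SoftSchema {
    private final Map<String, Class<?>> requiredTypes = new LinkedHashMap<String, Class<?>>();

    /** Declares that a property must be present and of the given type. */
    SoftSchema require(String property, Class<?> type) {
        requiredTypes.put(property, type);
        return this;
    }

    /** Throws if a required property is missing or has the wrong type. */
    void validate(Map<String, Object> properties) {
        for (Map.Entry<String, Class<?>> rule : requiredTypes.entrySet()) {
            Object value = properties.get(rule.getKey());
            // isInstance(null) is false, so a missing property is rejected too.
            if (!rule.getValue().isInstance(value)) {
                throw new IllegalArgumentException(
                    rule.getKey() + " must be a " + rule.getValue().getSimpleName());
            }
        }
    }
}
```

A caller would build the schema once, for example `new SoftSchema().require("author", String.class)`, and call `validate` before every write.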

JPA

  • Use JPA to define the soft schema
  • Reuse existing tools, APIs, and knowledge
  • You’re not giving up as much as you think!

@Entity
class Book {
    @Id Long id;
    String author;
    Date publishDate;
    // ...
}

List<Book> getBooksByAuthor(EntityManager em, String author) {
    Query q = em.createQuery(
        "select from Book where author = :a order by publishDate");
    q.setParameter("a", author);
    return q.getResultList();
}

Moving To App Engine

Sub-Agenda

  • Primary Keys
  • Transactions
  • Relationships
  • Queries

Primary Keys

  • What’s different?

    – kind (table) is part of the pk
    – hierarchical
    – Person 13 is the parent of the pet named Ernie

/Person:13/Pet:Ernie
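A key such as /Person:13/Pet:Ernie can be read as a path of Kind:id elements. The sketch below parses that textual form to show how the kind and the parent fall out of the key itself. `KeyPath` is a hypothetical helper written for this illustration, not the datastore's own key class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for slash-separated key paths such as /Person:13/Pet:Ernie. */
final class KeyPath {
    // Each element is a two-item array: {kind, idOrName}.
    private final List<String[]> elements = new ArrayList<String[]>();

    private KeyPath() {}

    static KeyPath parse(String path) {
        KeyPath key = new KeyPath();
        for (String part : path.split("/")) {
            if (part.isEmpty()) continue; // skip the empty piece before the leading slash
            int colon = part.indexOf(':');
            if (colon < 1) throw new IllegalArgumentException("Expected Kind:id, got " + part);
            key.elements.add(new String[] {part.substring(0, colon), part.substring(colon + 1)});
        }
        if (key.elements.isEmpty()) throw new IllegalArgumentException("Empty key path");
        return key;
    }

    /** Kind of the entity the key identifies (the last path element). */
    String kind() { return elements.get(elements.size() - 1)[0]; }

    /** Id or name of the entity the key identifies. */
    String idOrName() { return elements.get(elements.size() - 1)[1]; }

    /** Parent key, or null for a root key. */
    KeyPath parent() {
        if (elements.size() == 1) return null;
        KeyPath p = new KeyPath();
        p.elements.addAll(elements.subList(0, elements.size() - 1));
        return p;
    }

    @Override public String toString() {
        StringBuilder sb = new StringBuilder();
        for (String[] e : elements) sb.append('/').append(e[0]).append(':').append(e[1]);
        return sb.toString();
    }
}
```

Because the parent is embedded in the key, changing a pet's owner in this model means creating a new key rather than updating a foreign key column.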

Primary Keys - Composite Example

PET table (RDBMS):

    PET_ID (pk)   PERSON_ID (pk)(fk)
    Ernie         13

Datastore entity:

    Key   /Person:13/Pet:Ernie

Primary Keys - Surrogate Example

PET table (RDBMS):

    PET_ID (pk)   PET_NAME (u)   PERSON_ID (fk) (u)
    88            Ernie          13

Datastore entities shown:

    Key       /Person:13/Pet:Ernie

    Key       /Person:13/Pet:Ernie
    PetId     88

    Key       /Pet:88
    PetName   Ernie
    PersonId  /Person:13

Transactions

  • What’s different?

– Transactions apply to a single Entity Group

[Diagram: the entities /Person:Ethel/Person:Jane, /Person:Ethel, and /Person:Max, with a box labeled Transaction]
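One way to reason about the single-entity-group rule is to compare the root element of each key. The sketch below assumes, as the examples in this deck suggest, that an entity group is identified by the root of the key path. `EntityGroups` and its methods are hypothetical names.

```java
import java.util.List;

/** Hypothetical helper that decides whether a set of keys shares one entity group. */
final class EntityGroups {
    /** Root element of a key path: "/Person:Ethel/Person:Jane" gives "/Person:Ethel". */
    static String rootOf(String keyPath) {
        int secondSlash = keyPath.indexOf('/', 1);
        return secondSlash < 0 ? keyPath : keyPath.substring(0, secondSlash);
    }

    /** True if every key has the same root, so one transaction could cover them all. */
    static boolean fitInOneTransaction(List<String> keyPaths) {
        String group = null;
        for (String key : keyPaths) {
            String root = rootOf(key);
            if (group == null) {
                group = root;
            } else if (!group.equals(root)) {
                return false;
            }
        }
        return true;
    }
}
```

Under this model, /Person:Ethel and /Person:Ethel/Person:Jane can be updated together, while adding /Person:Max to the same transaction cannot work.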

Transactions - Entity Group Selection

  • Critical design choice
  • Too coarse hurts throughput
  • Too fine limits usefulness of transactions

[Diagram: three copies of a Store > Aisle > Shelf > Item hierarchy with different entity group boundaries, labeled Coarse, Fine, and Just Right?]

Transactions - Eventual Consistency

  • Use transactional tasks to update multiple entity groups

void updateBalance(EntityManager em, Account act, int balance,
                   TaskOptions taskOpts) {
    em.getTransaction().begin();
    act.setBalance(balance);
    em.merge(act);
    if (taskOpts != null) {
        // Enqueued as part of the open transaction (a transactional task).
        QueueFactory.getDefaultQueue().add(taskOpts);
    }
    em.getTransaction().commit();
}

void transferCash(EntityManager em, Account from, Account to,
                  int amount) {
    TaskOptions taskOpts = newTask(to, to.getBalance() + amount);
    updateBalance(em, from, from.getBalance() - amount, taskOpts);
    updateBalance(em, to, to.getBalance() + amount, null);
}

TaskOptions newTask(Account act, int newBalance) {...}
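The idea behind the transfer can be modeled without App Engine at all. The sketch below is a simplified in-memory stand-in: the debit and the enqueueing of a follow-up task happen together, and the credit lands later when the task runs. `TransferSketch` and its members are hypothetical, and unlike the slide's task (which carries a new absolute balance) this version credits by the transferred amount.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical in-memory model of a single-group write that also enqueues a task. */
final class TransferSketch {
    final Map<String, Integer> balances = new HashMap<String, Integer>();
    final Queue<Runnable> taskQueue = new ArrayDeque<Runnable>();

    /** Debits one account and enqueues the credit; either both happen or neither does. */
    void transfer(String from, String to, int amount) {
        int current = balances.get(from);
        if (current < amount) {
            // Nothing has been written or enqueued yet, so failing here leaves no trace.
            throw new IllegalStateException("insufficient funds");
        }
        // Stand-in for the commit: the debit and the task become visible together.
        balances.put(from, current - amount);
        taskQueue.add(() -> balances.merge(to, amount, Integer::sum));
    }

    /** Stands in for the task queue worker that runs some time after the commit. */
    void runPendingTasks() {
        Runnable task;
        while ((task = taskQueue.poll()) != null) {
            task.run();
        }
    }
}
```

Between `transfer` and `runPendingTasks` the total across accounts is temporarily short by the transferred amount, which is the eventual consistency the slide title refers to.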

Transactions - What About 2PC?

  • Similar limitations in a typical sharded db deployment
  • Why not consider a typical sharded db deployment solution?
  • Two-phase commit

    – Dan Wilkerson (Berkeley) developed the algorithm
    – Erick Armbrust (Google) implemented it

[Diagram: Txn 1 and Txn 2 over the entities /Person:Ethel/Person:Jane, /Person:Ethel, and /Person:Max, combined under a Distributed Txn]
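For readers unfamiliar with the general technique, here is a textbook two-phase commit coordinator. It is a generic sketch and makes no claim about the algorithm or implementation credited on this slide. `Participant`, `Recording`, and `run` are hypothetical names, with one participant standing in for each entity group.

```java
import java.util.ArrayList;
import java.util.List;

/** Textbook two-phase commit over hypothetical participants, one per entity group. */
final class TwoPhaseCommitSketch {
    interface Participant {
        boolean prepare();  // phase 1: vote yes only if the local change can be committed
        void commit();      // phase 2: make the prepared change visible
        void rollback();    // phase 2 alternative: discard the prepared change
    }

    /** Hypothetical participant that records what it was asked to do. */
    static final class Recording implements Participant {
        final boolean vote;
        String state = "idle";
        Recording(boolean vote) { this.vote = vote; }
        public boolean prepare() { state = vote ? "prepared" : "refused"; return vote; }
        public void commit() { state = "committed"; }
        public void rollback() { state = "rolled back"; }
    }

    /** Returns true if all participants committed, false if the prepared ones were rolled back. */
    static boolean run(List<? extends Participant> participants) {
        List<Participant> prepared = new ArrayList<Participant>();
        for (Participant p : participants) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // One "no" vote aborts the whole distributed transaction.
                for (Participant done : prepared) done.rollback();
                return false;
            }
        }
        for (Participant p : prepared) p.commit();
        return true;
    }
}
```

The sketch leaves out what makes real implementations hard, such as persisting votes and recovering when the coordinator fails between the two phases.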

Relationships

  • Letting a framework manage relationships can simplify code

    – True for RDBMS
    – Especially true for App Engine Datastore

  • Relationships can be described as “owned” or “unowned”
  • Ownership implies co-location within an Entity Group

Owned One To Many

@Entity
class Person {
    // ...
    @OneToMany(mappedBy = "owner")
    List<Pet> petList;
}

@Entity
class Pet {
    // ...
    @ManyToOne
    Person owner;
}

void createPersonWithPet(EntityManager em) {
    em.getTransaction().begin();
    Person p = new Person("max", "ross");
    p.addPet(new Pet("dog", "ernie"));
    em.persist(p);
    em.getTransaction().commit();
}

Resulting datastore entities:

    Kind          Person
    Entity Group  /Person:13
    Key           /Person:13

    Kind          Pet
    Entity Group  /Person:13
    Key           /Person:13/Pet:18

Queries

  • Testing set membership (RDBMS)

– Give me all users who do yoga

  • Requires a join table

@Entity
class User {
    // ...
    List<UserHobby> hobbies;
}

@Entity
class UserHobby {
    // ...
    User user;
    String hobby;
}

select from User u JOIN u.hobbies h where h.hobby = 'yoga'

Queries Continued

  • Testing set membership (GAE Datastore)

– Give me all users who do yoga

  • Use a multi-value property!
  • Simpler and more efficient!

@Entity
class User {
    // ...
    List<String> hobbies;
}

select from User where hobbies = 'yoga'
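An equality filter on a multi-value property matches an entity when any one of its values equals the argument. The sketch below imitates that behavior with a linear scan over in-memory objects. The real datastore answers such queries from indexes, and `MultiValueFilter` and `namesWithHobby` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory imitation of an equality filter on a multi-value property. */
final class MultiValueFilter {
    static final class User {
        final String name;
        final List<String> hobbies; // the multi-value property
        User(String name, String... hobbies) {
            this.name = name;
            this.hobbies = Arrays.asList(hobbies);
        }
    }

    /** Mimics "where hobbies = :hobby": a user matches if any single value equals the argument. */
    static List<String> namesWithHobby(List<User> users, String hobby) {
        List<String> names = new ArrayList<String>();
        for (User u : users) {
            if (u.hobbies.contains(hobby)) {
                names.add(u.name);
            }
        }
        return names;
    }
}
```

No join table and no second kind are involved: set membership is a property of the User entity itself.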

Why We Don’t Support Joins (yet)

  • Our commitment:

– Query performance scales linearly with the size of the result set

  • Feasible for joins?

    – How can we return the first result without constructing a complete cross product?

  • Making good progress

    – Working algorithm for a subset of join queries!
    – Based on merge-join
    – Not production ready

select * from Student s JOIN s.courses c where c.department = 'Biology' and s.grade = 10 order by s.lastName
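The appeal of a merge-join is that two sorted inputs can be combined in a single forward pass, emitting each match as soon as it is found. The sketch below shows the generic technique on two sorted lists of keys. It is not the datastore team's algorithm, and `MergeJoinSketch` and `intersectSorted` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Generic merge-join step: intersect two sorted inputs without building a cross product. */
final class MergeJoinSketch {
    /** Intersects two ascending, duplicate-free lists in one pass over each. */
    static <T extends Comparable<T>> List<T> intersectSorted(List<T> left, List<T> right) {
        List<T> out = new ArrayList<T>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i).compareTo(right.get(j));
            if (cmp == 0) {
                out.add(left.get(i)); // a match is available as soon as both cursors agree
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // left is behind, advance it
            } else {
                j++; // right is behind, advance it
            }
        }
        return out;
    }
}
```

For the query above, one input might be the keys of students in grade 10 and the other the keys of students with a Biology course, each read in key order.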

In The Meantime...

    – RDBMS encourages cheap writes and expensive reads
    – Datastore encourages expensive writes and cheap reads

  • Denormalization is not a dirty word!

– What happens when a course switches departments?

@Entity
class Student {
    // ...
    int grade;
    List<Course> courses;
    List<String> courseDepartments;
}

EntityManager em = getEntityManager();
em.createQuery(
    "select from Student where grade = 10 and courseDepartments = 'biology'")
    .getResultList();
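The question of what happens when a course switches departments has a direct cost in this design: every student holding a denormalized copy must be rewritten. The following in-memory sketch illustrates that write path. `DenormalizationSketch`, `refreshCourseDepartments`, and `switchDepartment` are hypothetical names, and plain objects stand in for persisted entities.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of keeping a denormalized list in step with its source. */
final class DenormalizationSketch {
    static final class Course {
        final String name;
        String department;
        Course(String name, String department) {
            this.name = name;
            this.department = department;
        }
    }

    static final class Student {
        final List<Course> courses = new ArrayList<Course>();
        // Denormalized copy of each course's department, kept for cheap queries.
        final List<String> courseDepartments = new ArrayList<String>();

        /** Rebuilds the denormalized list from the student's current courses. */
        void refreshCourseDepartments() {
            courseDepartments.clear();
            for (Course c : courses) {
                if (!courseDepartments.contains(c.department)) {
                    courseDepartments.add(c.department);
                }
            }
        }
    }

    /** The expensive write: rewrites every enrolled student and returns how many were touched. */
    static int switchDepartment(Course course, String newDepartment, List<Student> allStudents) {
        course.department = newDepartment;
        int rewritten = 0;
        for (Student s : allStudents) {
            if (s.courses.contains(course)) {
                s.refreshCourseDepartments();
                rewritten++;
            }
        }
        return rewritten;
    }
}
```

The read side stays a single-kind query on Student, which is the trade the slide describes: more work at write time in exchange for cheap reads.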

Leaving App Engine

Taking Your Code To Someone Else’s Party

  • App Engine persistence generally more restrictive

    – Primary Keys
    – Queries
    – Transactions

  • Decide what portability means and how important it is

    – To Key or not to Key?
    – Multi-value properties

  • Congratulations, you’ve already sharded your data model!

Portable Root Object

@Entity
class Book {
    @Id String id;
    String title;
    // ...
}

Datastore entity:

    Kind          Book
    Entity Group  /Book:2
    Key           /Book:2
    Title         Vineland

BOOK table (RDBMS):

    ID (pk)   TITLE
    2         Vineland

Portable Child Object

@Entity
class Chapter {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Extension(vendorName = "datanucleus", key = "gae.encoded-pk")
    String id;

    @Extension(vendorName = "datanucleus", key = "gae.parent-pk")
    Long bookId;

    String pages;
    // ...
}

Datastore entity:

    Kind          Chapter
    Entity Group  /Book:2
    Key           /Book:2/Chapter:8
    Pages         23

CHAPTER table (RDBMS):

    ID (pk)   BOOK_ID (pk)(fk)   PAGES
    8         2                  23

Key Takeaways

  • App Engine Datastore simplifies persistence
  • JPA adds typical RDBMS features to the datastore
  • Important to understand how the datastore is different

– Even if you’re starting from scratch!

  • Easier to move apps off than on
  • If portability is important, plan for it!

Questions

More Information

  • http://code.google.com/appengine
  • http://groups.google.com/group/google-appengine-java
  • http://gae-java-persistence.blogspot.com
  • http://code.google.com/p/tapioca-orm (dt library)
  • App Engine Chat Time

    – irc.freenode.net#appengine
    – First and third Wednesday of each month

  • maxr@google.com
