AllegroCache alpha 0.7.4 yaoodb, by Jans Aasman, Franz Inc.



SLIDE 1

AllegroCache alpha 0.7.4 yaoodb

By Jans Aasman, Franz Inc.

SLIDE 2

Complexity is coming your way

SLIDE 3

Our customers have complex database problems everywhere

Sample Complex Applications

Industries: Telco, Manufacturing, Energy, Financial Services, Life Sciences, Government, Media

Applications: Seismic Analysis, Cancer Research, Drug Discovery, Protein Folding, Pathway Modeling, Derivatives Analysis, Portfolio Risk Analysis, Fraud Detection, Market Modeling, Product Design, Finite Element Analysis, Failure Analysis, Bandwidth Modeling, Multiplayer Gaming, Digital Rendering, Homeland Security, DOE and DOD Research (physics, weather), eGovernment

Computing Infrastructure

SLIDE 4

Big push for ORM as part of the solution

Hibernate for Java
Versant for .NET
Oracle’s OO extensions
CLSQL for Lisp
…

However, these are user-friendly thin layers on top of an RDBMS that don’t solve the real complexity problem.

SLIDE 5

Why a full OO database, why AllegroCache

If your data is best described as a complex graph
Graph many times larger than memory
  º > 10^8 objects in pointer space
Graph search, complex queries, inferencing and reasoning
  º Instead of set operations
Very heterogeneous data, possibly in multiple databases
Object definitions often change
Intelligent caching
  º More reads than writes, ultra-fast access to individual records

SLIDE 6

AllegroCache from a modern database perspective

Stand-alone and client/server model
  º Single user on local disk
  º Multiple clients talking to a server over sockets
Commit/rollback, ACID
  º Atomicity (all or nothing)
  º Consistency (or rollback)
  º Isolation (multiple transactions will not interfere)
  º Durability
Optimistic concurrency
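
The slide only names optimistic concurrency, so the following is a minimal sketch of the general idea in portable Common Lisp, not AllegroCache's implementation. A writer remembers the version it read and its commit succeeds only if nobody else committed in between. The names record, *store*, begin-read and try-commit are hypothetical and exist only for this illustration.

```lisp
;; Illustrative only: a tiny in-memory store with versioned records.
;; None of these names come from AllegroCache.

(defstruct record
  (value nil)
  (version 0))

(defvar *store* (make-hash-table :test #'equal))

(defun begin-read (key)
  "Return the current value and version of KEY as two values."
  (let ((rec (gethash key *store*)))
    (if rec
        (values (record-value rec) (record-version rec))
        (values nil 0))))

(defun try-commit (key new-value version-read)
  "Write NEW-VALUE only if KEY is still at VERSION-READ.
Return T on success, NIL if another writer committed first."
  (let ((rec (or (gethash key *store*)
                 (setf (gethash key *store*) (make-record)))))
    (when (= (record-version rec) version-read)
      (setf (record-value rec) new-value)
      (incf (record-version rec))
      t)))

;; Two "transactions" read the same version; only the first commit wins.
(try-commit "balance" 100 0)
(multiple-value-bind (v1 ver1) (begin-read "balance")
  (multiple-value-bind (v2 ver2) (begin-read "balance")
    (list (try-commit "balance" (+ v1 10) ver1)     ; succeeds
          (try-commit "balance" (+ v2 20) ver2))))  ; fails, caller must retry
```

The losing writer takes no locks and is never blocked; it simply re-reads and retries, which suits workloads with more reads than writes.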

SLIDE 7

AllegroCache for Lispers

Persistent CLOS on all 64- and 32-bit platforms
Lisp btrees (previously Berkeley DB)
  º Floating cursors, multiple concurrent readers
  º Keys and values are (unsigned-byte 8) arrays of unlimited size
  º Comparison functions in Lisp
  º Comprehensive marshalling package for most datatypes
  º Fine-grained dynamic control over btree cache size: resourced blocks, almost no consing
  º Comparable to BDB in speed and functionality: 130,000 key/value pairs per second for increasing keys, 66,000 for unordered keys (mostly disk bound now)
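
To make the byte-array keys and Lisp comparison functions concrete, here is a small sketch in portable Common Lisp. It assumes a big-endian encoding so that byte-wise order matches numeric order; integer-to-key and key< are hypothetical helpers, not functions from AllegroCache's marshalling package.

```lisp
;; Illustrative only: marshal integers into byte keys and compare keys
;; byte by byte, as a btree comparison function might.

(defun integer-to-key (n &optional (width 8))
  "Encode non-negative integer N as a big-endian (unsigned-byte 8) vector."
  (let ((key (make-array width :element-type '(unsigned-byte 8))))
    (dotimes (i width key)
      (setf (aref key i)
            (ldb (byte 8 (* 8 (- width i 1))) n)))))

(defun key< (a b)
  "Lexicographic byte-wise comparison; a proper prefix sorts first."
  (let ((pos (mismatch a b)))
    (cond ((null pos) nil)             ; keys are identical
          ((= pos (length a)) t)       ; A is a proper prefix of B
          ((= pos (length b)) nil)     ; B is a proper prefix of A
          (t (< (aref a pos) (aref b pos))))))

;; Sorting by KEY< puts the keys for 2, 300 and 70000 in numeric order.
(sort (list (integer-to-key 300) (integer-to-key 2) (integer-to-key 70000))
      #'key<)
```

With a little-endian encoding the same comparison would order keys by their low byte first, which is why the byte order of the marshalled form matters for range scans.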

SLIDE 8

Features from a (Lisp) programmer's perspective

Metaclass persistent-class
Changing class definitions is supported
  º Lazy update of objects
Class definitions are first-class objects in AC
Object IDs are unique for the lifetime of the database
  º And user accessible
Indexed slots
Referential integrity
  º Deleted objects are lazily and silently changed to nil in slots
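
The slides do not show how AllegroCache performs its lazy update, but standard in-memory CLOS offers a close analogy: after a class is redefined, existing instances are brought up to date no later than their next slot access. The sketch below uses ordinary (non-persistent) CLOS only, and the person class and its slots are made up for the illustration.

```lisp
;; Illustrative only: plain CLOS, no persistence involved.

(defclass person ()
  ((name :initarg :name :accessor name)))

(defparameter *p* (make-instance 'person :name "jans"))

;; Redefine the class with an extra slot. Existing instances are not
;; rewritten here; each one is updated no later than its next slot access.
(defclass person ()
  ((name :initarg :name :accessor name)
   (city :initarg :city :initform "unknown" :accessor city)))

;; The old instance keeps its NAME and gains CITY from the initform.
(list (name *p*) (city *p*))
```

Custom migration logic for in-memory CLOS goes in a method on the standard generic function update-instance-for-redefined-class.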

SLIDE 9

Features for programmers (cont)

Maps (persistent hashtables) and Sets (persistent large collections of objects)
  º Transactionally safe
  º And convenient macros to loop over them
Support for the main Lisp datatypes
  º Strings, lists, vectors, symbols, numbers
  º Persistent objects, Maps, Sets
  º (unsigned-byte 8) arrays (for your structs, non-persistent CLOS objects and all other data)
  º Tell us what you need and we will build it for you

SLIDE 10

Features for programmers (cont)

Several ways to retrieve objects and object ids (oid)
  º (retrieve-from-index 'person 'name "jans")
  º (doclass (e 'person) (when (string= (name e) "jans") (print e)))
  º (setf person (oid-to-object 'person 100))
  º (name (country (city person)))

SLIDE 11

Features for programmers (cont)

Prolog as an efficient higher-level retrieval language
Full SQL support real soon (courtesy of Intelligent Handbook)
Simple web-based database browser

SLIDE 12

Current todo list

Caching strategies and user-defined caching rules
Index range queries
Rebuilding indexes when redefining classes
Dumping the database into a readable format
Restoring the database from the dump
Internationalization (99% done)
Journaling

SLIDE 13

To do for 1.1

Support for automatic blobs
User-defined indexes for slots and maps
Query language running in the cache
Integration with other DBMSs
  º Automatically reading in tables from relational databases
  º Using an RDBMS for secondary storage
A hook for marshalling your own datatypes
Thick-client GUI for creating objects and managing users and the database

SLIDE 14

Premature Benchmarking

1,000,000,000 objects in 12 hours in stand-alone mode
  º Small objects, two slots, no overflow blocks in the btree, no indexing apart from the oid
Adding objects in constant time, nearly 18,000 obj/s
Retrieving objects in constant time, independent of size
Lisp size doesn’t grow beyond 260 MB
Database on disk is 97 GB

SLIDE 15

Premature benchmarking. AC alpha 0.7.4 vs MySQL on a 64-bit, 1.5 GHz, 4 GB Linux machine

(defclass* call-data ()
  ((call_number :index :any-unique)
   action from_user to_user time_start time_spoken
   amount balance description))

create table call (
  call_number int primary key auto_increment,
  action int,
  from_user int,
  to_user int,
  time_start int,
  time_spoken int,
  amount int,
  balance int,
  description varchar(200)
);

SLIDE 16

Premature benchmarking. AC alpha 0.7.4 vs MySQL (cont)

[Chart: AC vs MySQL, in objects per second (axis from 10,000 to 80,000), for three tests: writing 5,000,000 individual objects; reading 5,000,000 objects (no cache); 1,000,000 random accesses (no cache).]

SLIDE 17

Our expectations for raw speed

Given our past performance on raw speed for Perl regexp, validating XML parser, AllegroServe, Prolog, etc.
Writing: within range of MySQL and Oracle
Reading: looping through all objects in AC will always be slower; an RDBMS can often bypass btrees and read tables with fixed size
  º RDB: good at set operations
  º OO: good at pointer operations
Random access: 5 to 10 times faster than an RDBMS

SLIDE 18

Applications & Prototypes

Biolingua: a frame system on top of AC, the basis of KnowOs
Pepito: data mining package on AC*
TellMe: personal directory for mobile phones
Kido: fraud detection over Call Detail Records
KDDI: rule-based policy server for security using OWL and Racer
CRL: P2P document server, a secure webserver
2Is Inc. WinStoic: Real-time Data Mining and EDI Contract Analysis System supporting Department of Defense Weapons Systems
Boomtree: web-based RSS reader based on Flash that can play 'podcasts' in the browser
Franz: genealogy of the British royal family, Tivo Box, 90,000 RSS feeds, Pandorabots, internal CRM package, support database