Real-Time Databases

  1. Real-Time Databases • Meghan Russ, Miriam Speert, Pete Dempsey, Sedat Behar, Yevgeny Ioffe, Zachi Klopman

  2. Timeline • 1:40 - 1:50: Introduction • 1:50 - 3:00: Real-Time Databases/Scheduling • 3:00 - 3:10: Break • 3:10 - 4:00: Operator Scheduling in Aurora • 4:00 - 4:25: Discussion • 4:25 - 4:30: Comments

  3. Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency and validity • Conclusions

  4. References • http://www.fpa.org/newsletter_info2584/newsletter_info.htm (info on Scud missiles) • http://www.fas.org/spp/starwars/gao/im92026.htm (info on Patriot Missile System)

  5. Imagine this… • We are at war with Iraq • Our soldiers find a potential target • Military intelligence consults a database to determine course of action

  6. Imagine this… • We are at war with Iraq • Air control system constantly monitors hundreds of aircraft and records them in a database • Intelligence systems constantly query the database for potential threats

  7. Suddenly… • Hundreds of missiles are launched • We suspect some are nuclear • Need info which will allow us to determine a course of action • Need this info to make rapid decision • The costs of indecision are catastrophic

  8. What could go wrong? • Limited number of missiles we can intercept • Once they’re launched, we have limited time to react • Our traditional database is slowed by less critical queries • Finally, our queries may not be answered in time due to system load

  9. We need a system that: • Handles time-sensitive queries • Returns only temporally valid data • Supports priority scheduling • Solution: Real-Time Databases!

  10. Real-Time Databases and Streams • Scheduling – Streams: priority based on QoS optimization – Real-Time: priority based on deadlines • Load Shedding – Streams: dropping tuples from queues – Real-Time: missing deadlines • Freshness of data: – Streams: not guaranteed – Real-Time: resample

  11. Real-Time Databases and Streams • Scheduling – Streams: priority based on QoS optimization – Real-Time: priority based on deadlines and user-supplied values • Load Shedding – Streams: dropping tuples from queues – Real-Time: missing deadlines, dropping transactions • Freshness of data: – Streams: not guaranteed – Real-Time: resample

  12. Real-Time Databases • An extension to traditional databases • Motivated by class of applications that require reliable responses • Predictable (not necessarily fast)

  13. Real-Time Database Features • Priority – Classification of transactions – Assigns value to transactions • Deadlines – Transactions specify explicit time requirements – Transaction scheduling takes time requirements into account – Predictability that transactions will complete by deadline or not at all

  14. Transactions and Streams • Transactions are operations on the database that perform combinations of reads/writes in an atomic step – Queries are a subset of transactions • Streams are read-only data (may create new tuples) • Data Consistency

  15. Characteristics of Transactions • Manner in which transactions use data • Nature of time constraints • Significance of executing a transaction by its deadline – consequence of missing specified time constraints

  16. Transaction Classification • Effect of missing transaction deadlines • Value to user is dependent on timeliness: – Soft: have some value after deadline – Firm: have no value after deadline – Hard: have negative value after deadline • Special case: no deadline • Idea for Streams: Queries have periodic deadlines
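
The soft/firm/hard distinction above can be read as a value function over completion time. The following is a minimal sketch of that reading, not something taken from the slides: the names `DeadlineClass` and `transaction_value`, the linear decay for soft deadlines, and the fixed penalty for hard deadlines are all illustrative assumptions.

```python
from enum import Enum


class DeadlineClass(Enum):
    SOFT = "soft"
    FIRM = "firm"
    HARD = "hard"


def transaction_value(kind, finish_time, deadline,
                      base_value=1.0, soft_decay=0.1, hard_penalty=1.0):
    """Value delivered by a transaction that completes at finish_time.

    base_value, soft_decay and hard_penalty are hypothetical parameters
    chosen only to illustrate the three classes.
    """
    if finish_time <= deadline:
        return base_value  # on time: full value for every class
    lateness = finish_time - deadline
    if kind is DeadlineClass.SOFT:
        # Soft: some value remains after the deadline
        # (assumed here to decay linearly down to zero).
        return max(0.0, base_value - soft_decay * lateness)
    if kind is DeadlineClass.FIRM:
        # Firm: no value after the deadline.
        return 0.0
    # Hard: a late result has negative value (assumed fixed penalty).
    return -hard_penalty
```

A scheduler could use such a function to decide whether a late transaction is still worth finishing: a late firm transaction can simply be dropped, while a soft one may still be worth running.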

  17. Scheduling and Streams • Streams: schedule queries in terms of QoS • Real-Time Databases: schedule transactions in terms of scheduling policy

  18. Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency and validity • Conclusions

  20. Scheduling Policies • Earliest deadline first (PMM, PAQRS) • Highest value first • Highest value per unit computation time first • Longest executed transaction first
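
Each of the four policies above amounts to a different sort key over the ready transactions. The sketch below illustrates that idea only; the `Txn` fields, the policy names, and the use of estimated total computation time for the "value per unit computation time" policy are assumptions, not details from the slides.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    deadline: float   # absolute deadline
    value: float      # user-supplied value of the transaction
    est_cost: float   # estimated computation time (assumed known)
    executed: float   # computation time already consumed


# Each policy is a key function; smaller keys run first.
POLICIES = {
    "earliest_deadline_first": lambda t: t.deadline,
    "highest_value_first": lambda t: -t.value,
    "highest_value_density_first": lambda t: -t.value / t.est_cost,
    "longest_executed_first": lambda t: -t.executed,
}


def schedule(txns, policy):
    """Return the transactions in the order the given policy would run them."""
    return sorted(txns, key=POLICIES[policy])


ready = [
    Txn("a", deadline=5, value=1, est_cost=1, executed=0),
    Txn("b", deadline=3, value=10, est_cost=20, executed=4),
    Txn("c", deadline=9, value=4, est_cost=2, executed=2),
]
edf_order = [t.name for t in schedule(ready, "earliest_deadline_first")]
density_order = [t.name for t in schedule(ready, "highest_value_density_first")]
```

With these sample transactions, earliest deadline first runs `b` before `a` and `c`, while the value-density policy prefers `c`, because `b` is valuable but expensive.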

  21. PMM • Priority Memory Management • Admission Control – Decide whether to run a query. • Memory Allocation – How much memory does each running query get?

  22. Memory Allocation: Two Strategies • Max – Queries get their maximum required memory or no memory at all. • MinMax – High priority queries get their maximum required memory and low priority queries get their minimum.
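
The difference between the two strategies can be sketched as follows. This is an illustration under stated assumptions rather than the PMM algorithm itself: priority is taken to be earliest deadline first, MinMax is assumed to hand out minimum allocations first and then top queries up to their maximum in priority order, and the `Query` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    deadline: float  # earlier deadline = higher priority (assumption)
    min_mem: int     # minimum memory the query can run with
    max_mem: int     # memory the query needs to run at full speed


def allocate_max(queries, total_mem):
    """Max: in priority order, a query gets max_mem or nothing at all."""
    alloc, free = {}, total_mem
    for q in sorted(queries, key=lambda q: q.deadline):
        if q.max_mem <= free:
            alloc[q.name] = q.max_mem
            free -= q.max_mem
    return alloc


def allocate_minmax(queries, total_mem):
    """MinMax: admit queries at min_mem, then raise the highest-priority
    ones to max_mem while memory remains."""
    ordered = sorted(queries, key=lambda q: q.deadline)
    alloc, free = {}, total_mem
    for q in ordered:                      # pass 1: minimum allocations
        if q.min_mem <= free:
            alloc[q.name] = q.min_mem
            free -= q.min_mem
    for q in ordered:                      # pass 2: top up by priority
        extra = q.max_mem - q.min_mem
        if q.name in alloc and extra <= free:
            alloc[q.name] = q.max_mem
            free -= extra
    return alloc


workload = [
    Query("q1", deadline=1, min_mem=2, max_mem=6),
    Query("q2", deadline=2, min_mem=2, max_mem=6),
    Query("q3", deadline=3, min_mem=1, max_mem=3),
]
max_alloc = allocate_max(workload, total_mem=10)
minmax_alloc = allocate_minmax(workload, total_mem=10)
```

On this workload Max leaves `q2` without memory, while MinMax runs all three queries, with only `q1` at its maximum.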

  23. Admission Control • Goal: minimize the miss ratio (number of queries that miss their deadline/total queries). • MultiProgramming Level (MPL) = number of queries to run. • Optimize system resource use: optimal MPL.

  24. Relating MPL to Streams • Real-Time: One time queries • Stream: Continuous Queries • Possibilities for future DSMS: – Using MPL for QoS

  25. Oh no! Missiles are launched again. • We are running two types of queries: – Query1 – Where should CNN’s cameras face to see the missile? – Query2 – Should we shoot the missile down? • Queries of type 2 are obviously more important, but how does the db know? • Consider: Applications for relative query values in stream systems.

  26. PAQRS – extension of PMM • Priority Adaptation Query Resource Scheduling. • PMM only minimizes miss ratio for the entire system. • We would like to be able to specify a ratio between query classes for missed deadlines. • RelMissRatio (Relative Miss Ratio) = {99:1} Query1:Query2.

  27. Why do we care? • Think of the missile example. – Same problems still exist in stream systems. • Potential Stream Additions: – Relative Priority Scheduling. • Not all queries are equal • Another form of QoS – Periodic Query Deadlines. • Deadlines for continuous queries

  28. Bias Control • Puts queries into two groups: – Regular – Queries run with normal priority – Reserve – Queries run with priority lower than regular. • Manages groups on a per query basis – Each class gets RegQuota regular queries. – The rest have to run as reserve queries.
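
The regular/reserve split can be sketched as a per-class quota check. The slide does not say which queries of a class take the regular slots, so the sketch assumes the ones with the earliest deadlines do; `assign_groups` and its argument shapes are hypothetical.

```python
def assign_groups(queries_by_class, reg_quota):
    """Map each query name to 'regular' or 'reserve'.

    queries_by_class: {class_name: [(query_name, deadline), ...]}
    reg_quota:        {class_name: number of regular slots (RegQuota)}
    """
    groups = {}
    for cls, queries in queries_by_class.items():
        quota = reg_quota.get(cls, 0)
        # Assumption: within a class, earliest deadlines claim regular slots.
        ordered = sorted(queries, key=lambda q: q[1])
        for i, (name, _deadline) in enumerate(ordered):
            groups[name] = "regular" if i < quota else "reserve"
    return groups


groups = assign_groups(
    {"cnn": [("c1", 5), ("c2", 3)], "missile": [("m1", 4), ("m2", 2)]},
    {"cnn": 1, "missile": 2},
)
```

Here the CNN class has one regular slot, so `c2` runs as a regular query and `c1` falls back to the reserve group, while both missile queries stay regular.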

  29. Relative Weights • Weight should reflect a class's RelMissRatio. • $\mathrm{Weight}_i = \dfrac{1/\mathrm{RelMissRatio}_i}{\sum_j 1/\mathrm{RelMissRatio}_j}$ • $\mathrm{Weight}_{cnn} = \dfrac{1/99}{1/99 + 1} = .01$ • $\mathrm{Weight}_{mis} = \dfrac{1}{1/99 + 1} = .99$

  30. Bias Control using Relative Weights • $\mathrm{WeightedMissRatio} = \sum_i \mathrm{Weight}_i \cdot \mathrm{MissRatio}_i$ • All terms are equal when the ratio is correct. • $\mathrm{WeightedMissRatio}_{ex} = (.01 \cdot 99x\%) + (.99 \cdot x\%) = .99x\% + .99x\%$

  31. Back to Missiles and CNN • The actual miss ratio is not correct: the miss ratio is 50:50! • $\mathrm{RegQuota}_i^{new} = \mathrm{RegQuota}_i^{old} \cdot \dfrac{\mathrm{Weight}_i \cdot \mathrm{MissRatio}_i}{\mathrm{WeightedMissRatio}/\mathrm{NumClasses}}$

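  32. Missiles and CNN Calculations • $\mathrm{WeightedMissRatio} = (.01 \cdot .50) + (.99 \cdot .50) = .5$ • The terms are not equal: $.005 \neq .495$ • $\mathrm{RegQuota}_{cnn}^{new} = \mathrm{RegQuota}_{cnn}^{old} \cdot \dfrac{.01 \cdot .50}{.5/2} = \mathrm{RegQuota}_{cnn}^{old} \cdot .02$ (98% less) • $\mathrm{RegQuota}_{mis}^{new} = \mathrm{RegQuota}_{mis}^{old} \cdot \dfrac{.99 \cdot .50}{.5/2} = \mathrm{RegQuota}_{mis}^{old} \cdot 1.98$ (98% more)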
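
The weight and quota formulas from the last few slides can be checked numerically. The sketch below reproduces the CNN/missile arithmetic; the function names and the starting quota of 100 regular queries per class are illustrative assumptions, and the code does not guard against a weighted miss ratio of zero.

```python
def relative_weights(rel_miss_ratio):
    """Weight_i = (1/RelMissRatio_i) / sum_j (1/RelMissRatio_j)."""
    inverse = {c: 1.0 / r for c, r in rel_miss_ratio.items()}
    total = sum(inverse.values())
    return {c: v / total for c, v in inverse.items()}


def weighted_miss_ratio(weights, miss_ratio):
    """WeightedMissRatio = sum_i Weight_i * MissRatio_i."""
    return sum(weights[c] * miss_ratio[c] for c in weights)


def updated_reg_quota(old_quota, weights, miss_ratio):
    """RegQuota_new = RegQuota_old * (Weight_i * MissRatio_i)
                      / (WeightedMissRatio / NumClasses)."""
    target = weighted_miss_ratio(weights, miss_ratio) / len(weights)
    return {c: old_quota[c] * (weights[c] * miss_ratio[c]) / target
            for c in weights}


# RelMissRatio = {99:1} for Query1 (CNN) : Query2 (missile).
weights = relative_weights({"cnn": 99, "missile": 1})
observed = {"cnn": 0.50, "missile": 0.50}          # the 50:50 case
new_quota = updated_reg_quota({"cnn": 100, "missile": 100}, weights, observed)
```

Starting from 100 regular slots each, the CNN class drops to about 2 and the missile class rises to about 198, matching the .02 and 1.98 factors on the slide.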

  33. Does it really work?

  34. Real-Time Databases/Scheduling • General Introduction • Scheduling Policies • Resource Allocation • Properties of Data: consistency & validity • Conclusions

  36. Essence of Real Time • Although adaptive systems give better throughput, IT DOESN’T MATTER! • RT is about dependability, not throughput. A 1% miss rate is (usually) unacceptable. • Throughput can (usually) be handled with extra hardware (i.e. money). Dependability needs a special design.

  37. Resources in Databases • Physical – CPU(s) – Memory (Cache, Work Area) – I/O Bandwidth (Disks & Storage, Network for Distributed Processing) • Logical – Locks – Time...

  38. Cost of a Transaction • Waiting for locks to release • Work memory needed (e.g. O(n) for in-memory hash join, O(sqrt(n)) for disk-assisted) • I/O amount (e.g. worst-case join: product of the input sizes) • CPU needed to process • Cost of aborting a transaction (negligible for queries) • If success cannot be guaranteed, don't start!

  39. Physical Resources – Now and Then

     | Resource | 1995 (Paper) | 2003 (opt. RAID) |
     |---|---|---|
     | CPU Speed (MIPS) | 40 | 2 × 2500 |
     | Memory Buffers (MB) | 20 | 800 |
     | I/O Bandwidth (MB/s) | 10 | 100 |
     | Disk Size (GB) | 1 | 120 (1 TB) |
     | Disk Cache (kB) | 256 | 8192 (1 GB) |
     | # Disks | 10 | 4 (16) |
     | Disk Latency (ms) | 16.7 | 6 |

     Latency is Forever…

  40. Memory Allocation Strategies (I) • Max – all memory needed or nothing (don't admit) • MinMax – all memory needed for high-priority – min memory needed for low-priority • M&M – feedback-based allocation – adaptive – small amount of memory set aside for small transactions

  41. Memory Allocation Strategies (II) • Multiclass Dependent – Small get all the memory they need – Large get a minimum amount – Medium get an amount that depends on the load level • Classes are: – Small – need less than 10% of memory – Large – need more than all of memory – Medium – between them.
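
The three classes can be expressed as a small classification rule. This is a sketch only: `classify` is a hypothetical helper, and it assumes "memory" means the total memory available to the database and that the boundary cases fall into the medium class.

```python
def classify(required_mem, total_mem):
    """Classify a transaction by its memory requirement.

    small:  needs less than 10% of total memory
    large:  needs more than total memory
    medium: everything in between (boundaries assumed to be medium)
    """
    if required_mem < 0.10 * total_mem:
        return "small"
    if required_mem > total_mem:
        return "large"
    return "medium"


classes = [classify(m, total_mem=100) for m in (5, 50, 150)]
```

The allocation rule on the slide then follows from the class: small transactions get everything they ask for, large ones get their minimum, and medium ones get an amount that varies with system load.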

  42. Allocating Memory [Figure: sequence of small (S), medium (M), and large (L) transactions: S, M, L, M, S, L, S]
