
Powerset Viewer: A Datamining Application - Update (PDF document)



  1. Powerset Viewer: A Datamining Application
     Jordan Lee
     Update
     • Completed Tools and Features
       – And relevant GUI widgets
     • Implemented animation between zoom states and automatic zooming
     • Increased alphabet size from 14 to 30
       – Optimized calculations
     • Increased alphabet size from 30 to 45
       – Realized set cardinality is, in practice, low
       – Using max set size of 10

  2. Milestones Status Update
     • #1 Completion of the basic visualization of a randomized database of small set size (~10)
     • #2 Addition of a single level of “marking”.
     • #3 Addition of multiple levels of “marking” (6)
     • #4 Addition of background marking to demarcate areas of sets containing different amounts of items.
     • #5 Implement multiple constraints
     • #6 Increase maximum possible dataset size to at least 100.
     Difficulties
     • BigInteger solution to increase maximum alphabet caused massive slow-down
       – Recall: required BigIntegers to support > 30 alphabet size
       – Solution: redesign keys to use integers and create a bridge to map integers to BigInteger positions
     BEFORE BRIDGE
     • Incoming Set (Position = 982): Success!
     • Incoming Set (Position = 2^32 + 1): CRASH!
       – Integer too large

  3. AFTER BRIDGE
     • Incoming Set (Position = 982)
       – Encode to Key #1: Success!
     • Incoming Set (Position = 2^32 + 1)
       – Encode to Key #2: Success!
     • Incoming Set (Position = arbitrarily large)
       – Encode to Key #3: Success!
     Difficulties
     • Expensive initial costs
     • Grid size limited by integer restrictions
       – Solution: create grid on the fly
     Benchmarks
     • Low Cardinality First
       MEMORY (MB)   SET COUNT
       76            10M
       75            1M
       74            100,000
       73            10,000
       58            1,000
     Figure: Low Cardinality (10000 sets), 73 MB
     Benchmarks (cont’d)
     • Random Generated
       MEMORY (MB)   SET COUNT
       72            263
       71            168
       70            127
       72            30
       71            10
     Figure: Random (176 sets), 71 MB

  4. Questions and Comments
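
The BEFORE BRIDGE / AFTER BRIDGE slides can be read as the following minimal sketch. It is not the presenter's code: the class name `KeyBridge` and the methods `encode` and `decode` are hypothetical, and the slides do not say how keys are assigned, so the sketch simply assumes keys are handed out in arrival order, starting at 1. It shows a position of 2^32 + 1 failing to fit in an `int`, and the same position being stored under a small integer key instead.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical bridge: small int keys stand in for arbitrarily large positions. */
final class KeyBridge {
    // Position -> key, so a position seen before gets the same key again.
    private final Map<BigInteger, Integer> keyOf = new HashMap<>();
    // Key -> position; key k is stored at index k - 1.
    private final List<BigInteger> positionOf = new ArrayList<>();

    /** Returns the int key for a position, assigning the next free key on first sight. */
    int encode(BigInteger position) {
        Integer key = keyOf.get(position);
        if (key == null) {
            key = positionOf.size() + 1; // assumption: keys start at 1, as on the slide
            keyOf.put(position, key);
            positionOf.add(position);
        }
        return key;
    }

    /** Maps a key back to the BigInteger position it stands for. */
    BigInteger decode(int key) {
        return positionOf.get(key - 1);
    }

    public static void main(String[] args) {
        BigInteger small = BigInteger.valueOf(982);
        BigInteger large = BigInteger.valueOf(2).pow(32).add(BigInteger.ONE); // 2^32 + 1

        // Before the bridge: the large position cannot be used as an int directly.
        try {
            large.intValueExact();
        } catch (ArithmeticException e) {
            System.out.println("Position " + large + " does not fit in an int");
        }

        // After the bridge: every position, however large, gets a small int key.
        KeyBridge bridge = new KeyBridge();
        System.out.println("Key for " + small + ": " + bridge.encode(small));
        System.out.println("Key for " + large + ": " + bridge.encode(large));
        System.out.println("Key 2 decodes to " + bridge.decode(2));
    }
}
```

With a scheme like this, the bulk of the work can stay in cheap `int` arithmetic, and BigInteger values are touched only when a key is created or decoded. Building the two lookup tables is one plausible reading of the "expensive initial costs" bullet, though the slides do not spell that out.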
