
Distributed Streaming, Albert Bifet, May 2012, COMP423A/COMP523A Data Stream Mining



  1. Distributed Streaming. Albert Bifet, May 2012

  2. COMP423A/COMP523A Data Stream Mining: Outline
       1. Introduction
       2. Stream Algorithmics
       3. Concept drift
       4. Evaluation
       5. Classification
       6. Ensemble Methods
       7. Regression
       8. Clustering
       9. Frequent Pattern Mining
       10. Distributed Streaming

  3. Data Streams: Big Data & Real Time

  4. Distributed Systems: Hadoop, S4 and Storm

  5. Hadoop

  6. Hadoop: Hadoop architecture

  7. Apache Mahout: open source framework

  8. Pig: Similar to SQL

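  9. Pig: Similar to SQL
       -- Load three integer fields from 'data' (tab-delimited by default)
       A = LOAD 'data' USING PigStorage() AS (f1:int, f2:int, f3:int);
       -- Group the rows by f1; each group holds a bag named A
       B = GROUP A BY f1;
       -- COUNT takes a bag, so count the bag A rather than $0 (the group key)
       C = FOREACH B GENERATE COUNT(A);
       DUMP C;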

  10. Apache S4

  11. Apache S4

  12. Storm: from Twitter

  13. Storm: Stream, Spout, Bolt, Topology
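
The four Storm terms on slide 13 can be illustrated with a small sketch. It assumes the usual definitions: a stream is an unbounded sequence of tuples, a spout is a source of streams, a bolt consumes streams and may emit new ones, and a topology is the graph that wires spouts and bolts together. The code below is plain Python and does not use the Storm API; `sentence_spout`, `split_bolt`, `count_bolt`, and `run_topology` are hypothetical names chosen for illustration, and a finite list stands in for an unbounded stream.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def sentence_spout(sentences: Iterable[str]) -> Iterator[Tuple[str]]:
    """Spout (hypothetical): a source that emits a stream of 1-field tuples."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt (hypothetical): consumes sentence tuples, emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word.lower(),)


def count_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Bolt (hypothetical): keeps running counts, emits (word, count) per tuple."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(sentences: Iterable[str]) -> Dict[str, int]:
    """Topology (hypothetical): spout -> split_bolt -> count_bolt, in one process."""
    latest: Dict[str, int] = {}
    for word, count in count_bolt(split_bolt(sentence_spout(sentences))):
        latest[word] = count  # keep the most recent count seen for each word
    return latest


if __name__ == "__main__":
    print(run_topology(["storm processes streams", "streams of tuples"]))
```

In an actual Storm deployment, spout and bolt instances run in parallel across machines; the sketch chains generators in a single process only to show how tuples flow from one component to the next.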
