Social Media Analysis using Kafka



  1. Social Media Analysis using Kafka
     Surmeet Kaur Jhajj - 54245972
     Aiswarya Manikandan - 94646410
     Rohan Rajeev - 38249388

  2. Data Streaming in today’s world
     ● Large amounts of data are being generated from social media today
     ● Instantaneous input and fast analysis are required
     ● Frameworks consist of: i) distributed data ingestion; ii) distributed data processing; iii) data visualization
     ● Apache Kafka is a distributed streaming platform for messaging
     ● It follows the publish/subscribe model
     ● Data from multiple sources is sent to a Producer, which streams it to the consumer for further processing
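The publish/subscribe flow described on this slide can be illustrated with a toy in-memory model. This is only a sketch of the idea (an append-only log that independent consumer groups read at their own offsets); it is not the Kafka client API, and the names `InMemoryTopic`, `publish`, `poll`, and the topic and group names are hypothetical.

```python
from collections import defaultdict


class InMemoryTopic:
    """Toy stand-in for a topic: an append-only log with per-group read offsets."""

    def __init__(self, name):
        self.name = name
        self.log = []                    # append-only list of (source, message)
        self.offsets = defaultdict(int)  # consumer group -> next offset to read

    def publish(self, source, message):
        """Producer side: append a record and return its offset in the log."""
        self.log.append((source, message))
        return len(self.log) - 1

    def poll(self, group, max_records=10):
        """Consumer side: return unread records for this group, then advance its offset."""
        start = self.offsets[group]
        batch = self.log[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch


# Two sources publish to the same topic.
topic = InMemoryTopic("social-posts")
topic.publish("twitter", "kafka is fast")
topic.publish("reddit", "streaming question")

# Each consumer group keeps its own offset, so both see every record.
analytics_batch = topic.poll("analytics")
archive_batch = topic.poll("archive")
```

Because each group tracks its own offset, adding a second consumer does not take records away from the first, which is the property that separates publish/subscribe from a plain work queue.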

  3. Our Objective
     ● Establish a data pipeline with ingestion from multiple sources
     ● Define a scheduling middleware that selects the processing logic based on the data ingestion rate
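One way to picture "selecting the processing logic based on the data ingestion rate" is a sliding-window rate meter feeding a threshold check. The slides do not specify how the middleware measures the rate or which processing variants exist, so everything below is an assumption for illustration: the window length, the threshold of 100 messages per second, and the `full_analysis` / `light_analysis` functions are hypothetical.

```python
from collections import deque


class RateMeter:
    """Estimates the ingestion rate by counting events in a sliding time window."""

    def __init__(self, window_seconds=1.0):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def record(self, now):
        """Register one ingested message at time `now` (in seconds)."""
        self.timestamps.append(now)

    def rate(self, now):
        """Return messages per second over the most recent window."""
        # Discard timestamps that have fallen out of the window.
        while self.timestamps and now - self.timestamps[0] > self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) / self.window_seconds


def full_analysis(text):
    """Hypothetical heavier logic for low rates: per-word counts."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def light_analysis(text):
    """Hypothetical cheaper logic for high rates: only the number of words."""
    return {"words": len(text.split())}


def select_processor(rate, high_rate_threshold=100.0):
    """Pick the cheaper processor once the rate exceeds the (assumed) threshold."""
    return light_analysis if rate > high_rate_threshold else full_analysis


meter = RateMeter(window_seconds=1.0)
for i in range(5):
    meter.record(i * 0.1)  # five messages within the first half second

current_rate = meter.rate(0.5)
processor = select_processor(current_rate)
result = processor("Kafka streams kafka")
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the sketch deterministic; a real middleware would likely use a monotonic clock.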

  4. What we propose
     ● Data is streamed live from two social media platforms: Twitter and Reddit
     ● This data is fed into the respective Kafka Producers
     ● The Producers, with the help of the scheduler, stream this data to the consumer
     ● The scheduler is responsible for normalizing the input rates and acting accordingly in cases of high input
     ● The consumer further processes this data according to the processing requirements
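The slides do not say how the scheduler normalizes input rates, so the following is only one possible reading: each source gets a bounded buffer, and on every tick the scheduler releases at most a fixed quota per source, so a burst from one source cannot crowd out the other. The class name `NormalizingScheduler`, the quota, the buffer size, and the drop-oldest overflow policy are all assumptions made for this sketch.

```python
from collections import deque


class NormalizingScheduler:
    """Releases a bounded number of messages per source on each tick."""

    def __init__(self, sources, per_tick_quota=2, buffer_size=5):
        self.per_tick_quota = per_tick_quota
        # deque(maxlen=...) discards the oldest item when a full buffer is appended to.
        self.buffers = {s: deque(maxlen=buffer_size) for s in sources}
        self.dropped = {s: 0 for s in sources}

    def submit(self, source, message):
        """Buffer a message; count a drop if the buffer is already full."""
        buf = self.buffers[source]
        if len(buf) == buf.maxlen:
            self.dropped[source] += 1
        buf.append(message)

    def tick(self):
        """Release at most `per_tick_quota` messages from each source."""
        released = []
        for source, buf in self.buffers.items():
            for _ in range(min(self.per_tick_quota, len(buf))):
                released.append((source, buf.popleft()))
        return released


scheduler = NormalizingScheduler(["twitter", "reddit"])

# A burst of eight Twitter messages arrives alongside a single Reddit message.
for i in range(8):
    scheduler.submit("twitter", f"t{i}")
scheduler.submit("reddit", "r0")

first_tick = scheduler.tick()
second_tick = scheduler.tick()
```

In this example the Twitter burst overflows its buffer, so the three oldest messages are dropped, while the Reddit message still goes out on the first tick. Dropping the oldest messages is just one policy; sampling or back-pressure on the producer would be equally plausible responses to high input.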

  5. Architecture
     [Diagram not recoverable from the extracted text. Component labels: Kafka Producer (two instances), Scheduling Middleware, Kafka Topic, Pa, Pb, Kafka Consumer]

  6. References
     [1] L. Magnoni, “Modern Messaging for Distributed Systems”, Journal of Physics: Conference Series
     [2] Rajiv Ranjan, “Streaming Big Data Processing in Datacenter Clouds”, IEEE Cloud Computing (Volume: 1, Issue: 1, May 2014)
     [3] Babak Yadranjiaghdam, Seyedfaraz Yasrobi, Nasseh Tabrizi, “Developing a Real-time Data Analytics Framework for Twitter Streaming Data”, 2017 IEEE 6th International Congress on Big Data
     [4] Hassan Nazeer, Waheed Iqbal, Fawaz Bokhari, Faisal Bukhari, “Real-time Text Analytics Pipeline Using Open-source Big Data Tools”
