Big Data & BI – The Most Powerful Combo
Asha Subramanian HCL Technologies Ltd.
Using and Acknowledging Sources Presenter: David Stodder, TDWI – Big Data and BI : How they come together
AGENDA
Where BI is going
Big Data Trends and What’s different about Big Data
Keys to Complementary BI and Analytics with Big Data
Concluding Thoughts
Reduce, if not eliminate, delay in servicing customers, responding to market events and increasing efficiency
Control and coordinate access across internal and external business networks to improve versatility and reaction
Use information to be predictive and proactive, learn from multiple data sources to continuously improve decisions
Manage costs and increase productivity
dependent on IT and “Power Users”
specific use (e.g. reports)
for reports and Ad-hoc query analysis
users requiring self service functionality
access to raw sources, detailed data, more sources
PAST & PRESENT PRESENT & FUTURE
for reports and Ad-hoc query analysis
data
Richer metadata for semi structured
and analysis
IMPERATIVE # 1 IMPERATIVE # 2
IMPERATIVE # 3 IMPERATIVE # 4
204.243.130.5 - - [26/Feb/2001:15:34:52
"http://metacrawler.com/crawler?general=dimensional+modeling" "Mozilla/4.5 [en] (Win98; I)"
Highly Structured | Arbitrarily Structured
Human Generated Data | Business Application Data | Machine Generated Data
BIG NEWS!
BIG (terabytes of data, petabytes soon, what next…) -> VOLUME
FAST (continuous, regular intervals, bursts…) -> VELOCITY
ANY FORM (structured, semi-structured, unstructured) -> VARIETY
IS NOT UNIFORM (does not conform to predictable structures) -> VARIABILITY
IS EVERYWHERE … IS FUELLING THE NEW AGE OF ANALYTICS
It lies beyond the capabilities of the current BI and Data Warehousing
Flow of semi-structured or unstructured content
Data complexity defies current BI metadata and structure
Demand for Hadoop (HDFS), MapReduce, NoSQL
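To make the MapReduce idea concrete, here is a minimal single-process sketch in Python. It is not Hadoop code: the function names (`run_mapreduce`, `emit_terms`, `sum_counts`) and the sample lines are assumptions made up for illustration. It only shows the three steps of the model: map each record to key/value pairs, group the pairs by key, then reduce each group.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_mapreduce(records, mapper, reducer):
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


# Hypothetical job: count how often each term occurs in raw text lines.
def emit_terms(line):
    for term in line.lower().split():
        yield term, 1


def sum_counts(key, values):
    return sum(values)


SAMPLE_LINES = ["big data and BI", "BI and analytics", "big data analytics"]
term_counts = run_mapreduce(SAMPLE_LINES, emit_terms, sum_counts)
```

In a real cluster the map and reduce steps run in parallel on many nodes and the shuffle moves data across the network; the sketch keeps everything in one process so the data flow is easy to follow.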
Standard BI/DW
Storage – largely relational or columnar
Limitations on scalability; requires costly high-end storage to scale to larger volumes of data
Schema created, then data loaded and transformed as per internal data structure
Data extracted as per the analytics requirement

Big Data
Storage – NoSQL, hierarchical, key-value stores, document-based stores, column-oriented databases
Commodity hardware
Don’t transform it, just put it into a file – raw (e.g. indexed, key-value pairs)
Late Binding – let Big Data analytics determine the extract required at read time; find the structure in the data
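Late binding can be sketched in a few lines of Python. The example below is an illustration under assumptions, not a description of any particular product: the raw lines, the `LOG_PATTERN` regular expression and the `read_with_schema` helper are all hypothetical. Raw records are stored untouched, and each analysis decides at read time which fields it needs.

```python
import re

# Raw records are kept exactly as they arrived. These sample lines are invented.
RAW_STORE = [
    '10.0.0.1 - - [01/Jan/2012:10:00:00 +0000] "GET /products HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2012:10:00:05 +0000] "GET /cart HTTP/1.1" 404 128',
    'free-form note with no recognisable structure',
]

# One possible reading of the data, chosen by the analysis and not by the store.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)


def read_with_schema(raw_lines, pattern, fields):
    """Apply a schema at read time and yield only the requested fields."""
    for line in raw_lines:
        match = pattern.match(line)
        if match is None:
            continue  # does not fit this reading; the line stays in the raw store
        yield {name: match.group(name) for name in fields}


# Two analyses bind different structures to the same raw data.
status_view = list(read_with_schema(RAW_STORE, LOG_PATTERN, ["path", "status"]))
host_view = list(read_with_schema(RAW_STORE, LOG_PATTERN, ["host"]))
```

The contrast with the standard BI/DW approach is that nothing is discarded or reshaped on load: a later analysis can apply a different pattern to the same raw lines, including the ones this pattern skipped.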
Convergence of many innovations – Technology CoE
Stream Scope Big data programming
from Karmasphere
Management)
users, e.g. IBM BigSheets Big data application data stores
stores
through distributed caching. Oracle Coherence, VMware GemFire .. Big data OLAP Platforms
Splunk Analytics (Data science)
Pan-Enterprise Search
Data Warehouse Appliance
Data Integration Platform
Server Analytics Platform
BigInsights
Machine Data
Business Value
Structured Data
Machine data comprises data from RFID, GPS, sensors, web servers, messaging, clickstreams, mobile devices, databases, telematics, servers, etc.
Real Time Collection & Indexing Storage Analyse Visualise
In-memory and in-database analysis
Explore, analyse and visualise Big Data and data from analytic databases and cubes
Use R statistical analysis software
Unstructured Data from Customer Interaction: customer interacts with the service online or from any device
Master Data from Traditional Data Warehouse: correlated with relevant information from the DB
Location Information: based on where the customer interacted with the service
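The correlation described above can be sketched as a simple lookup join in Python. Every record, key and field name here (`customer_id`, `location_id`, `segment`, `city`) is a hypothetical placeholder; the point is only that an interaction event gains context once it is matched against master data and location information.

```python
# All records and field names below are hypothetical.
INTERACTIONS = [
    {"customer_id": "C1", "channel": "mobile", "text": "cannot activate device", "location_id": "L1"},
    {"customer_id": "C2", "channel": "web", "text": "billing question", "location_id": "L2"},
    {"customer_id": "C9", "channel": "web", "text": "general enquiry", "location_id": "L9"},
]

# Master data, as it might be served from a traditional data warehouse.
CUSTOMER_MASTER = {
    "C1": {"segment": "premium"},
    "C2": {"segment": "standard"},
}

# Location reference data keyed by where the interaction took place.
LOCATIONS = {
    "L1": {"city": "Chennai"},
    "L2": {"city": "Bangalore"},
}


def enrich(interaction, master, locations):
    """Attach master data and location to one interaction; unknown keys give None."""
    customer = master.get(interaction["customer_id"], {})
    place = locations.get(interaction["location_id"], {})
    return {
        **interaction,
        "segment": customer.get("segment"),
        "city": place.get("city"),
    }


enriched = [enrich(item, CUSTOMER_MASTER, LOCATIONS) for item in INTERACTIONS]
```

Interactions with no match in the master data are kept rather than dropped, with `None` in the missing fields, so unknown customers still show up in the combined view.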
Enriching the data: Going beyond the limits of just structured relational data to show a complete view, including data relationships across sources
Using Big Data technologies to gain a near or true real-time view of data flowing into and through the organisation
Not waiting for the data warehouse: Implementing models and sensors to spot exceptions
Integrating search for accessing semi-structured or unstructured sources: finding the structure in the source rather than accessing only data that conforms to a predetermined structure
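As one illustration of "implementing models and sensors to spot exceptions" without waiting for the data warehouse, the Python sketch below flags readings that stray far from a rolling window of recent values. The technique (a rolling mean and standard deviation check) is an assumption chosen for simplicity, not something the slides prescribe; the function name, window size, threshold and sample readings are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def spot_exceptions(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far outside the recent rolling window."""
    history = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(history) == window:
            centre = mean(history)
            spread = pstdev(history)
            # Flag the reading if it lies more than `threshold` standard
            # deviations from the rolling mean. A perfectly flat window
            # (spread of zero) is skipped in this simple version.
            if spread > 0 and abs(value - centre) > threshold * spread:
                yield index, value
        history.append(value)


# Hypothetical sensor stream with one obvious spike.
readings = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
exceptions = list(spot_exceptions(readings))
```

The check runs as each value arrives, so an exception can be raised immediately. One limitation of this simple version is that a flagged value still enters the window and inflates the spread for the next few readings.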
What machine data, geo-location data and more can tell about customer transactions and business operational performance
Increasing importance of machine data in an age of devices, cloud services and automated decision management systems
Inescapable Data: “Data will not only be everywhere, but will be fully interconnected, delivering us value that we never before anticipated”
(usage clickstream + feature descriptions + Customer Profile)
(mobile access + content downloads + search)
(device activation + billing plans + geo location)
(call detail records + tariff Database + VOIP peering)
(customer clickstream + virtual Goods pricing + billing)
(web/mobile logs + ad pricing + click through)
“Feeding transactional data into a traditional data warehouse no longer represents the extent of capabilities necessary for BI.”
“The simple idea of building a traditional data warehouse to support a BI platform is no longer sufficient.”
“… require new information management capabilities to integrate information from disparate, external and unstructured information sources.”
Source: Business Analytics Require New Information Management Capabilities, Nov 2011