Big Data & BI The Most Powerful Combo Asha Subramanian HCL - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Big Data & BI – The Most Powerful Combo

Asha Subramanian HCL Technologies Ltd.

Using and Acknowledging Sources Presenter: David Stodder, TDWI – Big Data and BI : How they come together

slide-2
SLIDE 2

AGENDA

  • Where BI is going
  • Big Data Trends and What’s different about Big Data
  • Keys to Complementary BI and Analytics with Big Data
  • Concluding Thoughts

slide-3
SLIDE 3

BI’s Mission – Responding to Business Needs

Speed

Reduce, if not eliminate, delay in servicing customers, responding to market events and increasing efficiency

Agility

Control and coordinate access across internal and external business networks to improve versatility and reaction

Intelligence

Use information to be predictive and proactive, learn from multiple data sources to continuously improve decisions

Effectiveness

Manage costs and increase productivity


slide-4
SLIDE 4

Where’s BI been and where it is going

PAST & PRESENT

  • Small, departmental user communities dependent on IT and “Power Users”
  • ETL processes prepare data for specific use (e.g. reports)
  • Focused on structured data sources for reports and ad-hoc query analysis
  • Historical “Rear View Window” on the data

PRESENT & FUTURE

  • Enterprise deployments; diverse users requiring self service functionality
  • ELT, in memory analytics, “Big Data” access to raw sources, detailed data, more sources
  • Needed – Portfolio of search, query, … Richer metadata for semi structured …
  • Past, Present and Future data views and analysis

slide-5
SLIDE 5

Changing BI Imperatives

IMPERATIVE # 1: Enable Self Service BI and Analytics

  • Business users want to be in control and not IT
  • Users need structured data and unstructured content
  • Users want to discover data in guidance dashboards

IMPERATIVE # 2: Conquer Data Latency

  • Faster data delivery from canned reports to real time dashboards
  • From batch, historical reports to alerts and exception reports
  • Operational users need automated decisions

slide-6
SLIDE 6

Changing BI Imperatives

IMPERATIVE # 3: Operational Intelligence

  • Ability to understand, analyse and act on a continuous stream of data (spatial, sensors, social media, mobile device data etc.)
  • Visibility: Integrated views of data across multiple event streams and data sources to spot correlations and patterns

IMPERATIVE # 4: Enhancing Visibility into Data

  • Dashboards: Visual, role based views of actionable data
  • Specialised visualisations for data types, models and analytics
  • Mobile BI: Synchronised visualisations for people on the go. Location intelligence

slide-7
SLIDE 7

Expanding Universe of Data Sources

Figure labels: Highly Structured; Arbitrarily Structured; Human Generated Data; Business Application Data; Machine Generated Data

Machine generated data example (web server log entry):

204.243.130.5 - - [26/Feb/2001:15:34:52 -0600] "GET / HTTP/1.0" 200 8437 "http://metacrawler.com/crawler?general=dimensional+modeling" "Mozilla/4.5 [en] (Win98; I)"
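The web server log entry on this slide is a handy illustration of machine generated data that carries structure without declaring it. The sketch below is one possible way to pull typed fields out of such a line in Python. It assumes the entry follows the common "combined" access log layout and that the timestamp offset reads -0600; the function name `parse_log_line` and the `search_terms` field are illustrative choices, not part of the original deck.

```python
import re
from datetime import datetime
from urllib.parse import urlparse, parse_qs

# Assumed layout: host ident user [time] "request" status size "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    # The sample referrer carries the search phrase in a "general" parameter;
    # other referrers would use other parameter names.
    query = parse_qs(urlparse(fields["referrer"]).query)
    fields["search_terms"] = query.get("general", [None])[0]
    return fields


SAMPLE = (
    '204.243.130.5 - - [26/Feb/2001:15:34:52 -0600] "GET / HTTP/1.0" 200 8437 '
    '"http://metacrawler.com/crawler?general=dimensional+modeling" '
    '"Mozilla/4.5 [en] (Win98; I)"'
)

record = parse_log_line(SAMPLE)
print(record["host"], record["status"], record["search_terms"])
```

Once parsed, the single line yields a visitor address, a time, a response status and the search phrase that led the visitor to the site, which is the kind of detail a structured reporting source would not normally hold.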

slide-8
SLIDE 8

BIG DATA is …

BIG NEWS !!!!!!

  • BIG (Terabytes of data, Petabytes soon, what next ...) -> VOLUME
  • FAST (continuous, regular intervals, bursts ...) -> VELOCITY
  • ANY FORM (structured, semi structured, unstructured) -> VARIETY
  • IS NOT UNIFORM (does not conform to predictable structures) -> VARIABILITY
  • IS EVERYWHERE
  • IS FUELLING THE NEW AGE OF ANALYTICS

slide-9
SLIDE 9

BIG DATA – Why is it important ?

ONE DEFINITION –

It lies beyond the capabilities of the current BI and Data Warehousing

BEYOND RELATIONAL –

Flow of semi-structured or unstructured content

BEYOND STRUCTURE –

Data complexity defies current BI metadata and structure

BEYOND THE WAREHOUSE –

Demand for Hadoop (HDFS), MapReduce, NoSQL


slide-10
SLIDE 10

Big Data Sources – Variety and Velocity


slide-11
SLIDE 11

BIG DATA – Volume : TB today, PB soon

  • Users conduct analytics with ever larger data sets
  • A third of surveyed organisations have crossed the 10 TB barrier
  • Soon, we will measure Big Data in Petabytes and not Terabytes

slide-12
SLIDE 12

Big Data and Transformation to BI

Standard BI/DW

  • Storage – Largely relational or columnar
  • Limitations on scalability; requires costly high end storage to scale to larger volumes of data
  • Schema created, then data loaded and transformed as per internal data structure
  • Data extracted as per the analytics requirement

Big Data

  • Storage – NoSQL, hierarchical, key value stores, document based stores, column oriented databases
  • Has the capability to scale on commodity hardware
  • Don’t transform it, just put it into a file – raw (e.g. indexed, key-value pairs ..)
  • Late Binding – Let Big Data analytics determine the extract required at read time, find the structure in the data
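The late binding idea on this slide (store the data raw, decide on structure only when reading) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the raw records are JSON strings, and the names `RAW_EVENTS` and `read_with_schema` as well as the sample fields are hypothetical.

```python
import json

# Raw records are stored untouched; no schema is enforced at write time.
RAW_EVENTS = [
    '{"user": "u1", "action": "view", "product": "p9"}',
    '{"user": "u2", "action": "add_to_cart", "product": "p9", "device": "mobile"}',
    'not json at all',
    '{"user": "u1", "action": "add_to_cart", "product": "p3"}',
]


def read_with_schema(raw_lines, fields):
    """Apply a schema at read time: project only the requested fields."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip records that do not parse
        if not isinstance(record, dict):
            continue
        # Missing fields become None instead of failing the load.
        yield {name: record.get(name) for name in fields}


# Two analyses bind two different structures to the same raw data.
cart_events = [
    row
    for row in read_with_schema(RAW_EVENTS, ("user", "product", "action"))
    if row["action"] == "add_to_cart"
]
device_view = list(read_with_schema(RAW_EVENTS, ("device",)))

print(cart_events)
print(device_view)
```

The contrast with the standard BI/DW column is that nothing here had to be modelled before loading; each analysis chooses its own extract, at the cost of handling malformed or incomplete records on every read.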

slide-13
SLIDE 13

BIG Data Technologies

Convergence of many innovations – Technology CoE

Stream Scope

Big data programming

  • Map-reduce implemented by Hadoop, SQL-MR alternative from AsterData, Eclipse based MR IDE from Karmasphere
  • Event Stream Processing (ESP, implemented by IBM Streams, Oracle CEP, Esper and many others)
  • Complex Event Processing (CEP, implemented by TIBCO BE, IBM WebSphere Operational Decision Management)
  • Tools that make the use of these new programming techniques more easily accessible to business users, e.g. IBM BigSheets

Big data application data stores

  • NoSQL databases for scale-out approach to managing structured and/or unstructured data
  • E.g. Cassandra, CouchDB ..
  • In-memory databases that allow eXtreme Transaction Processing (XTP) and high-performance through distributed caching. Oracle Coherence, VMware GemFire ..

Big data OLAP Platforms

  • Appliances such as Exadata and Teradata
  • Shared-nothing analytical databases like EMC Greenplum
  • Columnar databases like HP Vertica
  • In-memory appliance like SAP HANA
  • In-memory self-service analytics and data visualization platforms like Spotfire, QlikView, Tableau, Splunk

Analytics (Data science)

  • SAS e-miner, SPSS, Revolution Analytics, Spotfire analytics and the like
  • Customer experience analytics like ClickFox
  • Text mining tools like SAS Text Miner and OpenCalais
  • Social intelligence platforms like Radian6 and Visible

Pan-Enterprise Search

  • Autonomy, FAST ESP, Clarabridge etc.
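For readers new to the map-reduce model named under big data programming, the following is a single-process Python sketch of its three phases (map, shuffle, reduce). It is not the Hadoop API; the function names and the simplified log lines are assumptions made purely for illustration.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Run the mapper over every record and collect its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_phase(pairs):
    """Group values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def status_mapper(log_line):
    # Assumes the status code is the second-to-last whitespace-separated token.
    parts = log_line.split()
    if len(parts) >= 2:
        yield parts[-2], 1


def count_reducer(key, values):
    return sum(values)


# Hypothetical, simplified log lines: method, path, status, bytes.
LOG_LINES = [
    "GET /home 200 512",
    "GET /cart 200 128",
    "GET /missing 404 0",
]

counts = reduce_phase(
    shuffle_phase(map_phase(LOG_LINES, status_mapper)),
    count_reducer,
)
print(counts)
```

In a real cluster the map and reduce calls would run in parallel on many machines, with the shuffle moving data between them; the sketch only shows the shape of the computation.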

slide-14
SLIDE 14

What is a Big Data appliance?

  • Integrates key components of a big data platform into a single product
  • No risks of a custom built solution
  • Comes with inbuilt connectors
  • Storage, Processing, Analytics and Visualisation platforms all bundled into one product

slide-15
SLIDE 15

Oracle Big Data Implementation


slide-16
SLIDE 16

IBM Big Data Implementation

Data Warehouse Appliance

  • IBM Netezza
  • InfoSphere Warehouse

Data Integration Platform

  • InfoSphere Information Server

Analytics Platform

  • InfoSphere Streams
  • IBM InfoSphere BigInsights

Stages: Storage, Organise, Analyse and Decide

slide-17
SLIDE 17

Operational Business Intelligence

Structured Data

  • Business Txn data
  • Well understood
  • Handled by traditional BI
  • Slow growth

Machine Data

  • Unstructured Data
  • Tremendous source of business value
  • Cannot be handled by BI
  • Under leveraged
  • Needs new approach

Machine data comprises data from RFID, GPS, sensors, web servers, messaging, clickstreams, mobile devices, databases, telematics, servers etc.

slide-18
SLIDE 18

Operational Intelligence Tools - Examples

Stages: Real Time Collection & Indexing, Storage, Analyse, Visualise

  • In memory & in database analysis
  • Explore, analyse and visualize Big Data and data from analytic databases and cubes
  • Use R statistical analysis software

slide-19
SLIDE 19

Business Visibility from Big Data and traditional BI - example

  • Unstructured Data from Customer Interaction: customer interacts with service online or from any device
  • Master Data from Traditional Data Warehouse: correlated with relevant information from the DB
  • Location Information: location information based on where the customer interacted with the service

REAL TIME BUSINESS INSIGHTS

  • What products are popular in what regions
  • Which products are customers leaving in cart
  • What are interaction paths by devices
  • How can we improve customer experience
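The correlation described on this slide (interaction events joined with master data and location) can be sketched as follows. All data, field names and function names here are hypothetical; the sketch only shows how two of the listed questions might be answered once the three sources are brought together.

```python
from collections import Counter

# Hypothetical master data, as might come from a traditional data warehouse.
PRODUCTS = {"p1": "Phone", "p2": "Tablet"}

# Hypothetical interaction events, with region captured at interaction time.
EVENTS = [
    {"customer": "c1", "product": "p1", "action": "add_to_cart", "region": "North", "device": "mobile"},
    {"customer": "c1", "product": "p1", "action": "purchase", "region": "North", "device": "mobile"},
    {"customer": "c2", "product": "p2", "action": "add_to_cart", "region": "South", "device": "web"},
    {"customer": "c3", "product": "p1", "action": "purchase", "region": "South", "device": "web"},
]


def popular_by_region(events, products):
    """Count purchases per (region, product name)."""
    counts = Counter()
    for event in events:
        if event["action"] == "purchase":
            counts[(event["region"], products[event["product"]])] += 1
    return counts


def left_in_cart(events, products):
    """Count products that were added to a cart but never purchased."""
    added = {(e["customer"], e["product"]) for e in events if e["action"] == "add_to_cart"}
    bought = {(e["customer"], e["product"]) for e in events if e["action"] == "purchase"}
    return Counter(products[product] for _, product in added - bought)


popular = popular_by_region(EVENTS, PRODUCTS)
abandoned = left_in_cart(EVENTS, PRODUCTS)
print(popular)
print(abandoned)
```

The product names come from the master data while the regions and actions come from the event stream, so neither source alone could answer either question.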

slide-20
SLIDE 20

Big Data & BI – Most Powerful Combo

Big Data for BI : Improving Operations

  • Enriching the data: going beyond the limits of just structured relational data to show a complete view, including data relationships across sources
  • Using Big Data technologies to gain a near or true real time view of data flowing into and through the organisation
  • Not waiting for the data warehouse: implementing models and sensors to spot exceptions or patterns in large data as they are happening, not as historical data
  • Integrating search for accessing semi structured or unstructured sources: finding the structure in the source rather than accessing only data that conforms to predetermined structure
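Spotting exceptions as they happen, rather than after the data lands in a warehouse, can be approximated with a rolling-window check over a stream. The Python sketch below flags readings that sit far from the mean of the most recent values. The window size, the threshold and the sample readings are arbitrary assumptions, and this is a generic technique rather than anything the deck prescribes.

```python
from collections import deque
from statistics import mean, pstdev


def detect_exceptions(stream, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    recent = deque(maxlen=window)
    for index, value in enumerate(stream):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Flag the reading if it lies more than `threshold` standard
            # deviations from the mean of the previous `window` readings.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
        recent.append(value)


# Hypothetical sensor readings with one obvious spike.
READINGS = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]

alerts = list(detect_exceptions(READINGS))
print(alerts)
```

Because the generator consumes one reading at a time, it works equally on a live feed. A production version would likely exclude flagged values from the window so that one spike does not mask the next.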

slide-21
SLIDE 21

Big Data & BI – Most Powerful Combo

BI & Analytics : Richer Insights from Big Data

  • What machine data, geo-location data and more can tell about customer transactions and business operational performance
  • Increasing importance of machine data in an age of devices, cloud services and automated decision management systems
  • Inescapable Data: “Data will not only be everywhere, but will be fully interconnected, delivering us value that we never before anticipated”

slide-22
SLIDE 22

Big Data & BI – Most Powerful Combo

BUSINESS INSIGHTS – representative use cases across customers

Application Analytics

(usage clickstream + feature descriptions + Customer Profile)

Content & Search Analytics

(mobile access + content downloads + search)

Real Time Sales Analytics

(device activation + billing plans + geo location)


Service Analytics

(call detail records + tariff Database + VOIP peering)

Online Monetization Analytics

(customer clickstream + virtual Goods pricing + billing)

Marketing Analytics

(web/mobile logs + ad pricing + click through)

slide-23
SLIDE 23

Big Data & BI – Most Powerful Combo

The World of Business Analytics is changing

“Feeding transactional data into a traditional data warehouse no longer represents the extent of capabilities necessary for BI.”

“The simple idea of building a traditional data warehouse to support a BI platform is no longer sufficient.”

“….require new information management capabilities to integrate information from disparate, external and unstructured information sources.”

Source: Business Analytics Require New Information Management Capabilities, Nov 2011

slide-24
SLIDE 24

Big Data & BI – Most Powerful Combo

Closing Thoughts…

Evaluate specialized analytic database technologies

  • Machine Data analytics, Hadoop/MapReduce and more could increase speed and depth of BI and analytics
  • Evaluate these technologies for large scale analytic workloads and big data, especially for semi structured and structured data

Don’t look at analytics through BI/DW glasses

  • Analytics processes are typically less structured than traditional BI and data warehousing
  • Analytics need to be iterative, fluid, self-creating and driven by business users

Give non technical users self service data discovery and visual analysis

  • Non technical users need actionable information and guided analysis
  • Visual Analysis is often the best way for users to work with data and share insights with colleagues

slide-25
SLIDE 25

Q & A