Enterprise En erprise In Information formation Sys ystems ems - - PowerPoint PPT Presentation

enterprise en erprise in information formation sys ystems
SMART_READER_LITE
LIVE PREVIEW

Enterprise En erprise In Information formation Sys ystems ems - - PowerPoint PPT Presentation

D IPARTIMENTO DI I NGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE A NTONIO R UBERTI Master Degree Programme in Manage gement ment Engin inee eeri ring Enterprise En erprise In Information formation Sys ystems ems Umbe mberto to Nan


slide-1
SLIDE 1

1 Enterprise Information Systems Umberto Nanni

Master Degree Programme in

Manage gement ment Engin inee eeri ring

En Enterprise erprise In Information formation Sys ystems ems

Umbe mberto to Nan anni ni

DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI

Data Warehousing – Introduction to the Multidimensional Model

slide-2
SLIDE 2

2 Enterprise Information Systems Umberto Nanni

2

Multidimensional model

datum in the operational DB

(rappresenting the event)

vendita di uno scopettone

fact in the enterpise DW (measurement in an n-dimensional space)

sold N units of article X at point od sale Y

  • n day Z
  • ne box of “DS”

dish soap is sold event in the real world

  • U. Nanni, di anni 28,

acquista uno scopettone Pippo, di anni 32, acquista brillantante Pluto, di anni 77, acquista dentifricio Paperino, di anni 16, acquista lucido da scarpe

  • U. Nanni, aged 28,

buys a box of dish soap

DB design: selection of all and only features having operational relevance DW design: ..., definition of aggregations (with a given granularity)

  • f analytic interest to some users
slide-3
SLIDE 3

3 Enterprise Information Systems Umberto Nanni

Events and Facts

  • EVENT (in the real world) and FACT (of

interest) are terms whose meaning is determined by the granularity (level of detail)

  • f the multidimensional representation of the

Data Warehouse

Example of FACT:

  • n 02/05/2012 in “Rome 23” PoS (Point of Sale) 278

boxes of soap have been sold, cashing 745 Euro

slide-4
SLIDE 4

4 Enterprise Information Systems Umberto Nanni

Multidimensional Cube

05/05/2004 brusher Roma units: 278 sales: € 745 toothpaste brusher dish soap broom 13/11/2012 06/11/2012 07/11/2012 08/11/2012 11/11/2012 12/11/2012 05/11/2012 Foggia Firenze L’Aquila Roma Torino Venezia Palermo

slide-5
SLIDE 5

5 Enterprise Information Systems Umberto Nanni

Dimensions

Foggia Firenze L’Aquila Roma Torino Venezia Palermo 13/11/2012 06/11/2012 07/11/2012 08/11/2012 11/11/2012 12/11/2012 05/11/2012

← time → ← place→

toothpaste brusher dish soap broom

slide-6
SLIDE 6

6 Enterprise Information Systems Umberto Nanni

dimensions

(classification attributes)

measurements

(variables, metrics, indicators, …)

A relational schema for the multidimensional cube

TABLE(dimens-1, …, dimens-k, measur-1, …, measur-h) Example: SALES(product, city, date, numUnitsSold, sales)

IDENTIFICATIVE KEY

Functional dependence: dimensions → measurements In the example: product, city, date → numUnitsSold, sales

slide-7
SLIDE 7

7 Enterprise Information Systems Umberto Nanni

Dimensional hierarchy

Each dimension is associated with a hierarchy that groups the values at different levels of aggregation (also orthogonally). Each node of the hierarchy is called DIMENSIONAL ATTRIBUTE

Example (time dimension):

day week month four-month period quarter year Functional Dependencies: day→ week day → month month → quarter month → four-month-period quarter → year four-month-period → year ALL

slide-8
SLIDE 8

8 Enterprise Information Systems Umberto Nanni

Access to Data Warehouse

Reporting

  • predefined needs (poss., parametrized)
  • automated extraction of information

OLAP

  • analysis needs that are not identifiable in advance

(browsing on dimensions and indicators)

  • interactive data exploration, searching for

information of interest Data Mining

  • analysis needs that are not identifiable in advance
  • (semi)automatic data exploration
slide-9
SLIDE 9

9 Enterprise Information Systems Umberto Nanni

Reporting

Report with predefined structure and format

  • query

– typically based on restrictions and / or aggregations

  • presentation

– interleaved with text data – in tabular and / or graphic form – customized with environment variables (date, user, ...)

  • generation

– on user demand – periodic – under specified conditions

  • distribution

– preview/print, e-mail, web, …

slide-10
SLIDE 10

10 Enterprise Information Systems Umberto Nanni

OLAP

  • implementation:

– analysis session, divided into a series of steps, each dependent on the results obtained previously

  • tipical users:

– domain experts, not necessarily computer experts

slide-11
SLIDE 11

11 Enterprise Information Systems Umberto Nanni

OLAP: example

Firenze L’Aquila Roma Torino Palermo 06/11/2012 07/11/2012 08/11/2012 11/11/2012 12/11/2012 05/11/2012 toothpaste brusher dish soap

slide-12
SLIDE 12

12 Enterprise Information Systems Umberto Nanni

OLAP – Basic operations in the multidimensional model

RESTRICTION (filtering based on values)

  • selection on specific values of dimensional attributes
  • in case of equality with a single value, the operator is called

SLICING (one dimension disappears) AGGREGATION (level of abstrazion increases)

  • considered one dimension, group one dimensional attribute

to a higher level of granularity

  • joining up to the highest level (the most generic, consisting of

a single value), a dimension disappears

slide-13
SLIDE 13

13 Enterprise Information Systems Umberto Nanni

OLAP: Roll-Up Operator

Aggregation along one dimension

→ use of an aggregate operator on a measurement attribute

Examples:

  • a. sales of each product by city and month
  • b. daily sales per city (total on all products)

c. daily sales by product (North / Central / South)

← day → ← city → ← month → ← city→

(a)

slide-14
SLIDE 14

14 Enterprise Information Systems Umberto Nanni

OLAP: Aggregative Operators

DISTRIBUTIVE: computable from:

  • 1. partial aggregates

– Examples: SUM, MAX, MIN, …

ALGEBRAIC: computable from:

  • 1. partial aggregates
  • 2. finite set of support information

– Examples : weighted AVG, standard deviation, …

OLISTIC: computable from:

  • 1. original data

– Example: rank of an array

slide-15
SLIDE 15

15 Enterprise Information Systems Umberto Nanni

OLAP: Drill-Down Operator

Disaggregation over one dimension → disaggregated data are assumed to be available Examples:

  • a. from “sales of each product by city and month”

to “daily sales”

  • b. from “monthly sales by city (total on all products)”

to “daily sales by city and product”

← date → ← city → ← month → ← city →

(a)

slide-16
SLIDE 16

16 Enterprise Information Systems Umberto Nanni

OLAP: Cuboids (dimensions)

  • given a data cube, you can perform different roll-up
  • perations up to the disappearance of a dimension
  • given a k-dimensional data cube, there exist 2k

possible cuboids

product, date, city all product date city product, data product, city date, city drill-down roll-up

slide-17
SLIDE 17

17 Enterprise Information Systems Umberto Nanni

OLAP: Cuboids (aggregation levels)

  • given a data cube, you can perform different roll-up operations,

increasing the aggregation level (up to the disappearance of a dimension)

  • given a k-dimensional data cube, resp. with n1, n2, …, nk

agregation levels, there exist n1×n2 ×…×nk possible cuboids

article, date { } article brand, month date article, month brand, date brand month drill-down roll-up

slide-18
SLIDE 18

18 Enterprise Information Systems Umberto Nanni

brooms rinsing detergents

OLAP: Slice-and-Dice Operators

Selection (up to “slicing”) Examples: from “monthly sales of products by city” to:

  • a. monthly sales of products by city in central Italy
  • b. monthly sales by city of dishwasher detergents

dishwasher detergents Firenze L’Aquila Roma Torino Palermo Sept-12 Oct-12 Aug-12

(b)

dishwasher detergents Firenze L’Aquila Roma Torino Palermo Sptt-12 Oct-12 Aug-12

slide-19
SLIDE 19

19 Enterprise Information Systems Umberto Nanni

OLAP: Pivoting Operators

Rotating dimensions on a bidimensional table Example from: sales of (products by month) by (city) to: sales of (products by city) by (month)

RM FI AQ TO PA rinsing det aug-12 34 23 12 56 65 sep-12 56 45 23 44 67

  • ct-12

76 34 34 55 45

  • dishw. det.

aug-12 57 46 35 79 88 sep-12 79 68 46 67 90

  • ct-12

99 57 57 78 68 brooms aug-12 46 35 24 68 77 sep-12 68 57 35 56 79

  • ct-12

88 46 46 67 57 aug-12 sep-12 oct-12 rinsing det. RM 34 56 76 FI 23 45 34 AQ 12 23 34 TO 56 44 55 PA 65 67 45

  • dishw. det.

RM 57 79 99 FI 46 68 57 AQ 35 46 57 TO 79 67 78 PA 88 90 68 brooms RM 46 68 88 FI 35 57 46 AQ 24 35 46 TO 68 56 67 PA 77 79 57

two

  • perations
slide-20
SLIDE 20

20 Enterprise Information Systems Umberto Nanni

OLAP: Drill-Through Operator

  • access data at a level of aggregation lower

than the Data Warehouse

  • access to reconciled data (three-level

architecture) or to the operational DB

  • a sort of extreme Drill-Down
slide-21
SLIDE 21

21 Enterprise Information Systems Umberto Nanni

OLAP: Drill-Across Operators

  • Correlate data from two or more cubes, based on

the dimensional values, returning the value in a new cube Example: Starting from

sales by product, date, city employers by city

calculate:

average sales by product, employee

slide-22
SLIDE 22

22 Enterprise Information Systems Umberto Nanni

A Book on Datawarehouse

Matteo Golfarelli, Stefano Rizzi Data Warehouse Design Modern Principles and Methodologies McGraw-Hill, 2009. Matteo Golfarelli, Stefano Rizzi Data Warehouse Teoria e Pratica della Progettazione (2a ed.) McGraw-Hill, 2006.