Grouping and Aggregation in the Concept-Oriented Data Model


Alexandr Savinov, Fraunhofer Institute for Autonomous Intelligent Systems, Knowledge Discovery Team, Germany


SLIDE 1

SAC’06, Dijon, France, April 23-27

Grouping and Aggregation in the Concept-Oriented Data Model

Alexandr Savinov
Fraunhofer Institute for Autonomous Intelligent Systems, Knowledge Discovery Team, Germany
savinov@conceptoriented.com

SLIDE 2

Outline

  • Introduction
  • Physical and Logical Structures
  • Model Dimensionality
  • Projection and De-projection
  • Multidimensional Analysis
  • Conclusions

SLIDE 3

Introduction: Concept-oriented paradigm

  • Duality: any element is a collection of other elements and a combination of other elements, for example:
    – references vs. properties
    – entity modeling vs. identity modeling
  • Order: the order of elements determines most of the syntactic and semantic properties
  • Representation and access (RA) is the main concern.

[Figure labels: Concept-oriented paradigm; Concept-oriented model (COM); Concept-oriented programming (COP)]

SLIDE 4

Physical and Logical: Physical structure

  • At the physical level, an element of the model is a collection of other elements
  • Physical structure is used for representation and access
  • Physical structure is used to implement references
  • Physical structure is hierarchical: each element has only one parent

[Figure: physical structure with the levels root, concepts, and items; labels: Customers, Countries, Orders, Germany, France, #23, CompanyX]
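The physical structure can be pictured as a tree in which the path from the root serves as a reference. The following is a minimal Python sketch under stated assumptions: the `Element` class, its `reference` method, and the assignment of Germany to Countries and CompanyX to Customers are illustrative guesses based on the figure labels, not part of the presentation.

```python
class Element:
    """An element that physically contains other elements (hypothetical class)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # exactly one physical parent (None for the root)
        self.children = {}        # physical collection of child elements
        if parent is not None:
            parent.children[name] = self

    def reference(self):
        """Return the path from the root, which identifies this element."""
        path = []
        node = self
        while node is not None:
            path.append(node.name)
            node = node.parent
        return "/".join(reversed(path))


# Assumed placement of the items shown in the figure.
root = Element("root")
countries = Element("Countries", root)
customers = Element("Customers", root)
germany = Element("Germany", countries)
company_x = Element("CompanyX", customers)

print(germany.reference())    # root/Countries/Germany
```

Because each element has a single parent, the path is unique, which is what makes it usable for representation and access.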

SLIDE 5

Physical and Logical: Logical structure

  • Each element is a combination of other elements (by reference)
  • Logical structure is used to represent data semantics (properties)
  • Logical collection is a dual combination
  • Each element has many parents and many children

[Figure: physical structure and logical structure over the levels root, concepts, and items; labels: Customers, Countries, Orders, Germany, France, #23, CompanyX; an order shown with part1, part2, customer, date, AND, OR]
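The duality between combination and logical collection can be illustrated with a small Python sketch. The data layout (a dict from item name to a dict of referenced superitems) and the item `#24` are assumptions made for illustration only.

```python
from collections import defaultdict

# Each item is a combination: dimension name -> referenced superitem.
# "#24" is a hypothetical second order added so the collection has two members.
items = {
    "Germany": {},
    "CompanyX": {"country": "Germany"},
    "#23": {"customer": "CompanyX"},
    "#24": {"customer": "CompanyX"},
}

# Dual view: for each item, the logical collection of items that reference it.
subitems = defaultdict(set)
for name, combination in items.items():
    for superitem in combination.values():
        subitems[superitem].add(name)

print(sorted(subitems["CompanyX"]))   # ['#23', '#24']
```

The same references are read in two directions: upward they give an item's properties, downward they give the set of items grouped under it.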

SLIDE 6

Physical and Logical: Two level model

  • [Root] One root element is a physical collection of concepts,
  • [Syntax] Each concept is
    – (i) a combination of other concepts called superconcepts (while this concept is a subconcept),
    – (ii) a physical collection of data items (or concept instances),
  • [Semantics] Each data item is
    – (i) a combination of other data items called superitems (while this item is a subitem),
    – (ii) an empty physical collection, i = {}

[Figure: physical structure and logical structure over the levels root, concepts, and items; labels: Customers, Countries, Orders, Germany, France, #23, CompanyX]

SLIDE 7

Physical and Logical: Two level model

  • [Special elements]
    – Top and bottom concepts
    – Primitive concepts
    – Null item
  • [Cycles] Cycles in the subconcept-superconcept relation and the subitem-superitem relation are not allowed,
  • [Syntactic constraints] Each data item from a concept may combine only items from its superconcepts.

[Figure: physical structure and logical structure over the levels root, concepts, and items; labels: Customers, Countries, Orders, Germany, France, #23, CompanyX]
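The cycle rule and the syntactic constraint of the two level model can be checked mechanically. Below is a hedged Python sketch: the dict-based encoding, the function names `has_cycle` and `violations`, and the sample schema are assumptions for illustration, and special elements such as the null item are left out.

```python
# Concepts: name -> {dimension name: superconcept name} (assumed encoding).
concepts = {
    "Countries": {},
    "Customers": {"country": "Countries"},
    "Orders": {"customer": "Customers"},
}

# Items: (concept, id) -> {dimension name: (superconcept, id)}.
items = {
    ("Countries", "Germany"): {},
    ("Customers", "CompanyX"): {"country": ("Countries", "Germany")},
    ("Orders", "#23"): {"customer": ("Customers", "CompanyX")},
}


def has_cycle(concepts):
    """Depth-first search for a cycle in the subconcept -> superconcept relation."""
    done, active = set(), set()

    def visit(name):
        if name in active:
            return True
        if name in done:
            return False
        active.add(name)
        found = any(visit(sup) for sup in concepts[name].values())
        active.discard(name)
        done.add(name)
        return found

    return any(visit(name) for name in concepts)


def violations(concepts, items):
    """List (concept, id, dimension) triples that break the syntactic constraint."""
    bad = []
    for (concept, ident), combination in items.items():
        for dim, (sup_concept, sup_id) in combination.items():
            declared = concepts[concept].get(dim)
            if declared != sup_concept or (sup_concept, sup_id) not in items:
                bad.append((concept, ident, dim))
    return bad


print(has_cycle(concepts))          # False
print(violations(concepts, items))  # []
```

An order that referenced a country directly through its `customer` dimension would be reported by `violations`, since Countries is not the declared superconcept for that dimension.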

SLIDE 8

Model Dimensionality: Multidimensional space

  • A superconcept is the domain of a dimension
  • A common subconcept is a multidimensional space
  • More levels can be added to the multidimensional space

[Figure: concepts Countries, Orders, Products, Customers; legend: item, concept, arrow from subitem to superitem, superconcepts, subconcept]

SLIDE 9

Model Dimensionality: Hierarchical space

  • It is a one-dimensional space with many levels of detail
  • Subconcepts are alternative views on their common superconcept

[Figure: concepts Company, Employees, Products, Customers, Orders, Surveys; legend: item, concept, arrow from subconcept to superconcept; annotations: company as one whole, alternative views on the company, alternative views on the customers]

SLIDE 10

Model Dimensionality: Hierarchical multidimensional space

  • Both structures are combined in one concept graph
  • The concept graph possesses both multidimensional and hierarchical properties

[Figure: two concept graphs between Top and Bottom, one with SubC1, SubC2, SubC3, SupC1, SupC2, SupC3 and one with C1, C2, C3; annotations: hierarchy, multidimensional space, most general concept, most specific concept]

SLIDE 11

Model Dimensionality: Dimensions

  • A dimension is a named position of a superconcept
  • The superconcept is referred to as the domain
  • Dimensions of higher rank consist of many (local) dimensions
  • A dimension with its domain in a primitive concept is a primitive dimension
  • The number of primitive dimensions is the model's primitive dimensionality

[Figure: concept graph with concepts Top, Prices, Users, Dates, Categories, Products, Auctions, AuctionBids and dimension labels auction, price, user, date, product, category, date, user]
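Counting primitive dimensions amounts to enumerating dimension paths that end in a primitive concept. The Python sketch below assumes a schema read off the figure labels: which dimension belongs to which concept, and the choice of Prices, Users, Dates, and Categories as primitive concepts, are guesses rather than statements from the slides.

```python
# Assumed schema: concept -> {dimension name: domain (superconcept)}.
# Concepts with no dimensions are treated as primitive in this sketch.
schema = {
    "AuctionBids": {"auction": "Auctions", "price": "Prices",
                    "user": "Users", "date": "Dates"},
    "Auctions": {"product": "Products", "date": "Dates", "user": "Users"},
    "Products": {"category": "Categories"},
    "Prices": {}, "Users": {}, "Dates": {}, "Categories": {},
}


def primitive_dimensions(concept, prefix=()):
    """Return all dimension paths from `concept` that end in a primitive concept."""
    dims = schema[concept]
    if not dims:
        return [".".join(prefix)]
    paths = []
    for name, domain in dims.items():
        paths.extend(primitive_dimensions(domain, prefix + (name,)))
    return paths


paths = primitive_dimensions("AuctionBids")
print(paths)
print(len(paths))   # 6 under the assumed schema
```

Paths such as `auction.product.category` are dimensions of higher rank built from three local dimensions.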

SLIDE 12

Model Dimensionality: Inverse dimensions

  • An inverse dimension has the opposite direction
  • An inverse dimension identifies a subconcept
  • Inverse dimensions are multi-valued (while dimensions are one-valued)
  • The number of primitive dimensions is equal to the number of primitive inverse dimensions
  • {AuctionBids.auction.product.category}

[Figure: concept graph with concepts Top, Prices, Users, Dates, Categories, Products, Auctions, AuctionBids and dimension labels auction, price, user, date, product, category, date, user]
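Inverse dimensions can be obtained by reversing every dimension of the schema. The Python sketch below assumes the same schema as guessed from the figure labels; the `inverse` mapping and the `Concept.dimension` naming are illustrative only.

```python
from collections import defaultdict

# Assumed schema: concept -> {dimension name: domain (superconcept)}.
schema = {
    "AuctionBids": {"auction": "Auctions", "price": "Prices",
                    "user": "Users", "date": "Dates"},
    "Auctions": {"product": "Products", "date": "Dates", "user": "Users"},
    "Products": {"category": "Categories"},
    "Prices": {}, "Users": {}, "Dates": {}, "Categories": {},
}

# Reverse each dimension: domain -> list of "Subconcept.dimension" names.
inverse = defaultdict(list)
for concept, dims in schema.items():
    for name, domain in dims.items():
        inverse[domain].append(f"{concept}.{name}")

print(inverse["Users"])        # ['AuctionBids.user', 'Auctions.user']
print(inverse["Categories"])   # ['Products.category']
```

Each inverse dimension names the subconcept it leads to, and a single user may be referenced by many bids, which is why the inverse direction is multi-valued.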

SLIDE 13

Projection and De-projection: Two retrieval operations

  • There are two ways to retrieve related items: projection and de-projection
  • Both are supported by the model structure and correspond to moving up and down in the concept graph
  • These two retrieval operations need only dimension names, so complex joins are no longer needed
  • These operations are analogous to the corresponding geometrical operations

SLIDE 14

Projection and De-projection: Projection

  • The projection operator returns a set of superitems along some dimension
  • The projection operator -> is followed by a dimension:

OrderParts->product->category

For each subitem we get its superitem along the dimension used in the projection.

[Figure: projection direction in a concept graph with concepts Top, Countries, Dates, Months, Categories, Customers, Products, Orders, OrderParts and dimension labels customer, country, order, date, product, category, month; further labels: U, C, I]
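A projection such as `OrderParts->product->category` can be sketched in Python as repeated reference following. The `project` function, the dict-based items, and the sample data are assumptions for illustration, not the model's actual implementation.

```python
def project(items, *dimensions):
    """Follow each dimension in turn and return the distinct superitems reached."""
    current = list(items)
    for dim in dimensions:
        reached = []
        for item in current:
            superitem = item[dim]
            # Compare by identity so that each superitem appears once.
            if not any(superitem is r for r in reached):
                reached.append(superitem)
        current = reached
    return current


# Hypothetical sample data: each item maps a dimension name to its superitem.
food = {"name": "Food"}
toys = {"name": "Toys"}
bread = {"name": "Bread", "category": food}
milk = {"name": "Milk", "category": food}
robot = {"name": "Robot", "category": toys}
order_parts = [{"product": bread}, {"product": milk}, {"product": bread}]

categories = project(order_parts, "product", "category")
print([c["name"] for c in categories])   # ['Food']
```

Only dimension names are supplied; no join condition is written because each subitem already references its superitem.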

SLIDE 15

Projection and De-projection: De-projection

  • The de-projection operator returns a set of subitems
  • The de-projection operator -> is followed by an inverse dimension:

Category->{product->category}

For each superitem we find all subitems along the inverse dimension that reference it.

[Figure: de-projection direction in a concept graph with concepts Top, Countries, Dates, Months, Categories, Customers, Products, Orders, OrderParts and dimension labels customer, country, order, date, product, category, month; further labels: I, S]
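De-projection runs the same references in the opposite direction: it keeps the subitems whose projection lands on the given superitems. The Python sketch below makes the same assumptions as before (dict-based items, a hypothetical `deproject` helper, invented sample data).

```python
def deproject(superitems, subitems, *dimensions):
    """Return the subitems whose projection along `dimensions` is in `superitems`."""
    result = []
    for sub in subitems:
        reached = sub
        for dim in dimensions:
            reached = reached[dim]
        if any(reached is sup for sup in superitems):
            result.append(sub)
    return result


# Hypothetical sample data.
food = {"name": "Food"}
toys = {"name": "Toys"}
bread = {"name": "Bread", "category": food}
robot = {"name": "Robot", "category": toys}
order_parts = [
    {"id": 1, "product": bread},
    {"id": 2, "product": robot},
    {"id": 3, "product": bread},
]

parts = deproject([food], order_parts, "product", "category")
print([p["id"] for p in parts])   # [1, 3]
```

This sketch scans all candidate subitems; a real system would more likely keep an index for each inverse dimension.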

SLIDE 16

Projection and De-projection: Access path

  • An access path is a sequence of projections and de-projections where each next operator is applied to the result of the previous operator
  • Category.getOrders = this->{OrderParts->product->category}->order;
  • Category.getOrders = this->{OrderParts->product->category}->order->customer->country;
  • Zigzag paths are possible
  • Aggregation can be applied to sets of items
  • Category.meanPrice = avg( this->getOrders->price );

[Figure: concept graph with concepts Top, Countries, Dates, Months, Categories, Customers, Products, Orders, OrderParts and dimension labels customer, country, order, date, product, category, month]
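An access path chains the two operators, and an aggregate is then applied to the resulting set. The Python sketch below is illustrative only: the helpers, the sample data, the name `get_countries`, and the decision to store `price` on order parts are assumptions, since the slides do not say where the price property lives.

```python
from statistics import mean


def project(items, dim):
    """Distinct superitems of `items` along one dimension."""
    reached = []
    for item in items:
        if not any(item[dim] is r for r in reached):
            reached.append(item[dim])
    return reached


def deproject(superitems, subitems, *dims):
    """Subitems whose projection along `dims` is one of `superitems`."""
    result = []
    for sub in subitems:
        reached = sub
        for dim in dims:
            reached = reached[dim]
        if any(reached is s for s in superitems):
            result.append(sub)
    return result


# Hypothetical sample data.
germany, france = {"name": "Germany"}, {"name": "France"}
alice = {"name": "Alice", "country": germany}
bob = {"name": "Bob", "country": france}
food, toys = {"name": "Food"}, {"name": "Toys"}
bread, robot = {"category": food}, {"category": toys}
order1, order2 = {"id": 1, "customer": alice}, {"id": 2, "customer": bob}
order_parts = [
    {"order": order1, "product": bread, "price": 2.0},
    {"order": order2, "product": bread, "price": 4.0},
    {"order": order2, "product": robot, "price": 30.0},
]


def get_orders(category):
    # Down to OrderParts (de-projection), then up to Orders (projection).
    parts = deproject([category], order_parts, "product", "category")
    return project(parts, "order")


def get_countries(category):
    # Zigzag path: continue upward from the orders to customers and countries.
    return project(project(get_orders(category), "customer"), "country")


def mean_price(category):
    # Aggregate a property over the de-projected set (price assumed on parts).
    parts = deproject([category], order_parts, "product", "category")
    return mean(p["price"] for p in parts)


print([o["id"] for o in get_orders(food)])          # [1, 2]
print([c["name"] for c in get_countries(food)])     # ['Germany', 'France']
print(mean_price(food))                             # 3.0
```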

SLIDE 17

Multidimensional Analysis: Multidimensional de-projection

  • More than one bounding dimension
  • Multidimensional de-projection returns a set of subitems referencing source items along all bounding dimensions

[Figure: one-dimensional de-projection and multi-dimensional de-projection, each labeled with I and S]
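With several bounding dimensions, a subitem is kept only if it references the source item along every one of them. The Python sketch below illustrates this under assumptions: `follow`, `deproject_multi`, and the sample data are hypothetical names and values.

```python
def follow(item, path):
    """Follow a sequence of dimension names upward from `item`."""
    for dim in path:
        item = item[dim]
    return item


def deproject_multi(point, subitems, paths):
    """Subitems that reference point[i] along paths[i] for every i."""
    return [
        sub for sub in subitems
        if all(follow(sub, path) is src for src, path in zip(point, paths))
    ]


# Hypothetical sample data.
alice, bob = {"name": "Alice"}, {"name": "Bob"}
bread, robot = {"name": "Bread"}, {"name": "Robot"}
order1, order2 = {"customer": alice}, {"customer": bob}
order_parts = [
    {"id": 1, "order": order1, "product": bread},
    {"id": 2, "order": order2, "product": bread},
    {"id": 3, "order": order2, "product": robot},
    {"id": 4, "order": order1, "product": bread},
]

paths = [("order", "customer"), ("product",)]
both = deproject_multi((alice, bread), order_parts, paths)
one = deproject_multi((bread,), order_parts, [("product",)])

print([p["id"] for p in both])   # [1, 4]
print([p["id"] for p in one])    # [1, 2, 4]
```

The two-dimensional result is the intersection of the corresponding one-dimensional de-projections.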

SLIDE 18

Multidimensional Analysis: Steps of analysis

1. Choose dimension paths along which we want to view our data
2. Choose the levels along these dimensions
3. The universe of discourse (UoD) is the Cartesian product of the chosen levels
4. Each point from the UoD is de-projected onto the target subconcept S
5. The de-projection is aggregated using some property (measure)

[Figure: dimensions D1 and D2, the UoD, the target subconcept S, and the measure M]
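The five steps can be sketched as one Python function. Everything here is an assumption made for illustration: the `analyze` signature, the dict-based items, the sample data, and the choice to skip empty cells (the average of an empty set is undefined).

```python
from itertools import product
from statistics import mean


def follow(item, path):
    """Follow a sequence of dimension names upward from `item`."""
    for dim in path:
        item = item[dim]
    return item


def analyze(levels, paths, target, measure, aggregate=mean):
    """levels[i] holds the items of the level chosen along paths[i] (steps 1-2)."""
    table = {}
    for point in product(*levels):                       # step 3: UoD
        cell = [s for s in target                        # step 4: de-projection
                if all(follow(s, p) is x for x, p in zip(point, paths))]
        if cell:                                         # skip empty cells
            key = tuple(x["name"] for x in point)
            table[key] = aggregate(s[measure] for s in cell)   # step 5
    return table


# Hypothetical sample data.
alice, bob = {"name": "Alice"}, {"name": "Bob"}
bread, robot = {"name": "Bread"}, {"name": "Robot"}
order1, order2 = {"customer": alice}, {"customer": bob}
order_parts = [
    {"order": order1, "product": bread, "price": 2.0},
    {"order": order2, "product": bread, "price": 4.0},
    {"order": order2, "product": robot, "price": 30.0},
    {"order": order1, "product": bread, "price": 3.0},
]

table = analyze(
    levels=[[alice, bob], [bread, robot]],
    paths=[("order", "customer"), ("product",)],
    target=order_parts,
    measure="price",
)
print(table)
```

The result maps each non-empty point of the UoD to the aggregated measure, here the average price per customer and product.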

SLIDE 19

Multidimensional Analysis: Example

  • Choose the target concept OrderParts and two dimensions leading to concepts Countries and Categories
  • De-project each pair of customer and product to OrderParts:

<c,p>->{OrderParts->order->customer, OrderParts->product}

  • Aggregate and return the average price:

FORALL(c Customers, p Products) {
  tmp = <c,p>->{ OrderParts->order->customer, OrderParts->product }
  RETURN(c, p, avg(tmp.price));
}

[Figure: concept graph with concepts Top, Countries, Dates, Months, Categories, Customer, Products, Orders, OrderParts and dimension labels customer, country, order, date, product, category, month]

SLIDE 20

Multidimensional Analysis: Change the level of details

  • Choose other domains along the dimension paths and apply the same query:

FORALL(c Countries, p Categories) {
  tmp = <c,p>->{ OrderParts->order->customer->country, OrderParts->product->category }
  RETURN(c, p, avg(tmp.price));
}

[Figure: concept graph with concepts Top, Countries, Dates, Months, Categories, Customers, Products, Orders, OrderParts and dimension labels customer, country, order, date, product, category, month; annotations: Roll up, Drill down]
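Rolling up changes only the dimension paths and the levels; the query shape stays the same. The Python sketch below runs one hypothetical `cube` function at two levels of detail. The function, the sample data, and the use of `price` as the measure are assumptions for illustration.

```python
from itertools import product
from statistics import mean


def follow(item, path):
    """Follow a sequence of dimension names upward from `item`."""
    for dim in path:
        item = item[dim]
    return item


def cube(levels, paths, target, measure):
    """Average `measure` over the de-projection of every point of the UoD."""
    table = {}
    for point in product(*levels):
        cell = [s for s in target
                if all(follow(s, p) is x for x, p in zip(point, paths))]
        if cell:
            table[tuple(x["name"] for x in point)] = mean(s[measure] for s in cell)
    return table


# Hypothetical sample data.
germany, france = {"name": "Germany"}, {"name": "France"}
alice = {"name": "Alice", "country": germany}
bob = {"name": "Bob", "country": germany}
carol = {"name": "Carol", "country": france}
food, toys = {"name": "Food"}, {"name": "Toys"}
bread = {"name": "Bread", "category": food}
milk = {"name": "Milk", "category": food}
robot = {"name": "Robot", "category": toys}
o1, o2, o3 = {"customer": alice}, {"customer": bob}, {"customer": carol}
order_parts = [
    {"order": o1, "product": bread, "price": 2.0},
    {"order": o2, "product": milk, "price": 4.0},
    {"order": o3, "product": robot, "price": 30.0},
    {"order": o3, "product": bread, "price": 6.0},
]

# Detailed level: customers x products.
detailed = cube([[alice, bob, carol], [bread, milk, robot]],
                [("order", "customer"), ("product",)], order_parts, "price")

# Rolled-up level: countries x categories, using longer paths.
rolled_up = cube([[germany, france], [food, toys]],
                 [("order", "customer", "country"), ("product", "category")],
                 order_parts, "price")

print(len(detailed))   # 4
print(rolled_up)
```

Drilling down is the reverse step: shorten the paths and pick the more detailed levels again.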

SLIDE 21

Multidimensional Analysis: Example 2

FORALL{ d Dates, c Categories } {
  tmp = this->{ Auctions.date, Auctions.product.category }
  RETURN(d, c, avg(tmp->meanPrice))
}

[Figure: concept graph with concepts Top, Prices, Users, Dates, Categories, Products, Auctions, AuctionBids and dimension labels auction, price, user, date, product, category, date, user; annotations: Roll up, Drill down]

SLIDE 22

Conclusions

  • Features:
    – Canonical semantics
    – Logical navigation via access paths, dimensions and inverse dimensions
    – Multidimensional aggregation and analysis
    – Constraint propagation and inference (not described in this presentation)
  • Advantages:
    – Grouping and aggregation are an integral part of the model
    – Hierarchical and multidimensional properties are combined in one model
    – Formal syntax and semantics
    – Simple query language that needs no joins