analytical data bases
play

Analytical data bases Database lectures for mathematics students - PowerPoint PPT Presentation

Analytical data bases Database lectures for mathematics students Zbigniew Jurkiewicz, Institute of Informatics UW May 14, 2017 Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW Decision


  1. Analytical data bases Database lectures for mathematics students Zbigniew Jurkiewicz, Institute of Informatics UW May 14, 2017 Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  2. Decision support systems From the perspective of the time span all decisions in the organization could be divided into three categories: operational decisions within the scope of days or weeks; tactical decisions, whose effects range from a few months to one year, strategic decisions, which impact the organization development for the next few years. It has been observed that when moving from operational decisions towards the strategic ones, the procedures used become less and less algorithmic and formalized. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  3. Decision support systems Initially in business activities computer systems had been used mostly for operational data processing with such applications as sales order management, invoicing, or magazine inventory. Gradually the computers has been used for less routine activities, called Decision Support Systems (DSS). They are also known under other popular names BIS/BIT ( Business Intelligence System/Technology ) EIS ( Executive Information System ). In addition to “mechanical” data processing they also provide various mechanisms for deducing new information from the facts contained in the database. This has lead to a division of “database” applications into operational (transactional) and analytical. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  4. Requirements for decision support systems Information should usually be presented in a summarized form. No standard access path, very varied methods of selection and formatting the information to be presented, dynamics. Associating the selected information which other computational resources (spreadsheets, specialized statistical packages). Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  5. Analytical data processing Commonly known as On-Line Analytical Processing , abbreviated usually to OLAP . Typical applications: trend analysis, looking for patterns of behavior, looking for anomalies. Used interactively, so efficiency is very important, especially time efficiancy. If a user observes that some queries (e.g. based on 5 and more joins) are executed very slowly, she will try to avoid them. It is assumed that the answers for 90% of queries should be available within 10 seconds. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  6. Analytical data bases Also called On-Line Analytical Processing (OLAP) databases. Growing in importance. From personal computers to large client-server configurations. Many buzzwords roll-up and drill-down, drill-through, MOLAP , pivoting. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  7. Main issues What is an analytical database? Models and operations Implementing analytical database Development trends Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  8. Matrix reports Data analysis was first supported by matrix reports . Matrix reports look like spreadsheets. They are often concerned with finances or management. Sales system should for example contain a report about customers and their buying patterns, divided by country regions. However, instead of analysing buying patterns for each product, we divide product into categories. So the report would have product categories as columns, country regions as rows, and each report cell will show the number of items sold in this category in this region. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  9. Data Mart Small data warehouse, sometimes called thematic database Covers only some areas (themes) of the enterprise, e.g. marketing: customers, products, sales Model adapted to the needes of a department. Usually the information is initially preaggregated Elimination of unnecessary details Some critical level of details selected. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  10. Tools for querying and analysis Query builders Report generators comparisons: growth, decrease trends, graphs Spreadsheets WWW interface Data mining Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  11. Other operations Functions over time e.g. averages on different periods Computed attributes e.g.. profit = sales * rate Textual queries, e.g. find all documents containing words A and B order documents by frequency of occurence for words X , Y and Z Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  12. Data models and operators Data models relation star and snowflake cube: extension of spreadsheet idea (multidimensional tables, dimensions indexed by database values ) Operators slice & dice roll-up, drill down pivoting other Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  13. Multidimensional data model Multidimensional databases are most popular due to analytical data model of the form of multidimensional cube containing: facts (also called measures ), e.g. the number of cars sold; dimensions , e.g. months, regions of sale. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  14. Dimensions Dimensions usually form hierarchies, e.g. for time dimension the hierarchy will be year-quarter-month-day. Hierarchies enable the interactive change of detail level (granularity) of the information presented. In more complex models the hierachies can branch, e.g. division into weeks is incompatible with division into months. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  15. Time Time dimension needs a special treatment It is hidden — there is no separate table for time. As a dimension time is specific because it is sequential in nature. We might ask to see the sales for May or the sales for the first three months of 2007. But we would rarely ask to see the sales for the first five goods (ever assuming they are ordered by name). Method of aggregation for time depends on the meaning of the measure. If a company sold 10 computers in January, 15 computers in February, and 10 computers in March, then typical query would ask for total number (i.e. sum ) sold for the first quarter. On the other side, if a company had employeed 10 people in January, 7 in February, and 10 again in March, then we would usually ask about the average count for the quarter. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  16. Database The data is usually taken from data warehouse (real or virtual). Direct storage of information for all facts and all levels of detail in database could be very costly in terms of space, so Store only data for the most often used levels of hierarchies. Other data is computed from stored data on the fly if needed. When aggregating measures it is important to take into account various rules of aggregation, e.g. Sales amount is usually summed. Temperature or price will rather be averaged. The analytical database stores as a rule only aggregated data. To see the detail data ( drill-through ) it is necessary to fetch it from data warehouse or operational database. Because this takes a lot of time, such need should not occur too often. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  17. Operations on data Cutting and projecting on the cross-section surface ( slice and dice ) Change of detail level: drill-down and roll-up ) Turning ( pivot ): changes the visible dimensions on the “image”. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  18. Approaches to building the OLAP database ROLAP = “Relational OLAP”: we adapt the relational 1 DBMS to star or snowflake schema. MOLAP = “Multidimensional OLAP”: we use specialized 2 DBMS based on “datacube” model. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  19. Star schema Star schema is a typical method of data organization in relational database for OLAP . It is composed of: Fact table: large set of facts such as informations about the amount of sale. Dimension tables: smaller, statical information about the objects that the facts deal with. Generalization: snowflake model. Hierarchies of tables for particular dimensions: dimension table normalization. Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

  20. Example star schema We want to have in OLAP database informations about the selling of beers: pub, beer name, drinker who bought it, day, hour, and price. We take the following relation as our fact table: Sales(pub,beer,drinker,day,hour,price) Analytical data bases Database lectures for mathematics Zbigniew Jurkiewicz, Institute of Informatics UW

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend