Topic: SQAK: Doing More with Keywords
Speaker: YINGJING YAN

Why SQAK?
• Today’s enterprise databases are large and complex, often relating hundreds of entities.
• Enabling ordinary users to query such databases and derive value from them has been of great interest in database research.

Why SQAK?
• However, in order to compute even simple aggregates, a user is required to write a SQL statement and can no longer use simple keywords.
• As a solution to this problem, we propose a framework called SQAK (SQL Aggregates using Keywords) that enables users to pose aggregate queries using simple keywords with little or no knowledge of the schema.

INTRODUCTION
• Consider the simple schema in Figure 1 of a university database that tracks student registrations in various courses offered in different departments:
Figure 1: Sample University Schema

INTRODUCTION
• Suppose that a user wished to determine the number of students registered for the course “Introduction to Databases” in the Fall semester in 2007.
• The SQL statement would be written as follows:

INTRODUCTION
SELECT courses.name, section.term, count(students.id) AS count
FROM students, enrollment, section, courses
WHERE students.id = enrollment.id
  AND section.classid = enrollment.classid
  AND courses.courseid = section.courseid
  AND lower(courses.name) LIKE '%intro. to databases%'
  AND lower(section.term) LIKE '%fall 2007%'
GROUP BY courses.name, section.term
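To try the query, one can load a few rows into an in-memory SQLite database. The sketch below is only an illustration: the table layouts are assumptions that include just the columns named in the query (Figure 1 is not reproduced here), the sample rows are invented, and the term predicate uses LIKE so that the % wildcards take effect.

```python
import sqlite3

# Hypothetical minimal tables: only the columns the slide's query touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses    (courseid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE section    (classid INTEGER PRIMARY KEY, courseid INTEGER, term TEXT);
CREATE TABLE enrollment (id INTEGER, classid INTEGER);

-- Invented sample rows, for illustration only.
INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO courses VALUES (10, 'Intro. to Databases'), (11, 'Algorithms');
INSERT INTO section VALUES (100, 10, 'Fall 2007'),
                           (101, 10, 'Spring 2008'),
                           (102, 11, 'Fall 2007');
INSERT INTO enrollment VALUES (1, 100), (2, 100), (3, 101), (3, 102);
""")

QUERY = """
SELECT courses.name, section.term, count(students.id) AS count
FROM students, enrollment, section, courses
WHERE students.id = enrollment.id
  AND section.classid = enrollment.classid
  AND courses.courseid = section.courseid
  AND lower(courses.name) LIKE '%intro. to databases%'
  AND lower(section.term) LIKE '%fall 2007%'
GROUP BY courses.name, section.term
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('Intro. to Databases', 'Fall 2007', 2)]
```

Only the two students enrolled in the Fall 2007 section of the databases course are counted; the Spring 2008 section and the Algorithms section are filtered out by the two LIKE predicates.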

INTRODUCTION
• While this may seem easy and obvious to a database expert who has examined the schema, it is indeed a difficult task for an ordinary user.
• Ideally, the user should be able to pose this query using simple keywords such as “Introduction to Databases” “Fall 2007” number students.
• The SQAK system achieves exactly this by empowering end users to pose more complex queries.

SQAK Overview
• SQAK provides a novel and exciting way to trade off some of the expressive power of SQL in exchange for the ability to express a large class of aggregate queries using simple keywords, by taking advantage of the data in the database and the schema (tables, attributes, keys, and referential constraints).
• SQAK does not require any changes to the database engine and can be used with any existing database.

SQAK Overview
• SQAK takes advantage of the data in the database, metadata such as the names of tables and attributes, and referential constraints.
• SQAK also discovers and uses functional dependencies in each table, along with the fact that the input query is requesting an aggregate, to aggressively prune out ambiguous interpretations.
• As a result, SQAK is able to provide a powerful and easy-to-use querying interface that fulfills a need not addressed by any existing systems.

Architecture of SQAK
• A keyword query in SQAK is simply a set of words (terms) with at least one of them being an aggregate function (such as count, number, sum, min, or max). Terms in the query may correspond to words in the schema (names of tables or columns) or to data elements in the database.
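The definition of a keyword query can be made concrete with a small sketch that separates aggregate terms from the remaining terms. This is not SQAK's parser: the `AGGREGATE_TERMS` table and the `split_query` helper are hypothetical names, and the entries beyond count, number, sum, min, and max ("num", "average", "avg") are assumptions added to cover the example queries used elsewhere in the slides.

```python
# Hypothetical mapping from query words to SQL aggregate functions.
AGGREGATE_TERMS = {
    "count": "COUNT",
    "number": "COUNT",
    "num": "COUNT",      # assumed abbreviation
    "sum": "SUM",
    "min": "MIN",
    "max": "MAX",
    "average": "AVG",    # assumed, not in the slide's list
    "avg": "AVG",        # assumed abbreviation
}


def split_query(query):
    """Split a keyword query into (aggregate functions, other terms).

    Raises ValueError when no aggregate term is present, since a keyword
    query is defined as containing at least one aggregate function.
    """
    aggregates, others = [], []
    for term in query.lower().split():
        if term in AGGREGATE_TERMS:
            aggregates.append(AGGREGATE_TERMS[term])
        else:
            others.append(term)
    if not aggregates:
        raise ValueError("keyword query needs at least one aggregate term")
    return aggregates, others


print(split_query("John num courses"))  # (['COUNT'], ['john', 'courses'])
```

The non-aggregate terms ("john", "courses") would still have to be matched against schema names or data values, which this sketch leaves out.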

Architecture of SQAK
• The SQAK system consists of three major components: the Parser/Analyzer, the SQN-Builder, and the Scorer. A query that enters the system is first parsed into tokens. The analyzer then produces a set of Candidate Interpretations (CIs) based on the tokens in the query. For each CI, the SQN-Builder builds a tree (called an SQN) which uniquely corresponds to a structured query. The SQNs are scored and ranked. Finally, the highest-ranking tree is converted to SQL and executed using a standard relational engine, and the results are displayed to the user.

Architecture of SQAK
Figure 3: Architecture of SQAK

Candidate Interpretation (CI)
• A CI can be thought of as an interpretation of the keyword query posed by the user in the context of the schema and the data in the database.
• A CI is simply a set of attributes from a database with (optionally) a predicate associated with each attribute.

Candidate Interpretation (CI)
• In addition, one of the elements of the CI is labeled with an aggregate function F. This aggregate function is inferred from one of the keywords.
• For instance, the “average” function from keyword query “John average grade” would be the aggregate function F in a CI generated from it.

Candidate Interpretation (CI)
• One of the elements of the CI may be optionally labeled as a “with” node (called a w-node). A w-node is used in certain keyword queries where an element with a maximum (or minimum) value for an aggregate is the desired answer.
• For instance, in the query “student with max average grade”, the node for student is designated as the w-node.
• This is discussed in more detail later.

CI Definition

CI Definition
• An intuitive way of understanding a CI is to think of it as supplying just the SELECT clause of the SQL statement.
• In translating from the CI to the final query, SQAK “figures out” the rest of the SQL. Consider the sample schema shown in Figure 1. An edge from one table to another simply means that a column in the source table refers to a column in the destination table.

CI Definition
• Now consider the aggregate keyword query “John num courses” posed by a user trying to compute the number of courses John has taken.
• One of the possible CIs that might be generated for this is: ({Student.name[=John], Courses.courseid}, Courses.courseid, count, w = ∅).
• Depending on the query, several other CIs may also be generated.
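The four-part tuple in the example CI above can be written down as a small data structure. The sketch below is one possible encoding, not the representation used inside SQAK; the class and field names (`Element`, `CandidateInterpretation`, `aggregated`, `function`, `w_node`) are hypothetical, and the two consistency checks are assumptions drawn from the statements that the aggregate and the w-node each label one of the CI's elements.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Element:
    attribute: str                   # e.g. "Student.name"
    predicate: Optional[str] = None  # e.g. "= John"; None when absent


@dataclass(frozen=True)
class CandidateInterpretation:
    elements: Tuple[Element, ...]    # attributes with optional predicates
    aggregated: str                  # attribute labeled with the aggregate
    function: str                    # aggregate function F, e.g. "count"
    w_node: Optional[str] = None     # "with" node; None stands for w = ∅

    def __post_init__(self):
        names = {e.attribute for e in self.elements}
        # Assumed checks: both labels must refer to elements of the CI.
        if self.aggregated not in names:
            raise ValueError("aggregated attribute must be an element of the CI")
        if self.w_node is not None and self.w_node not in names:
            raise ValueError("w-node must be an element of the CI")


# The CI from the slide for the query "John num courses".
ci = CandidateInterpretation(
    elements=(Element("Student.name", "= John"), Element("Courses.courseid")),
    aggregated="Courses.courseid",
    function="count",
    w_node=None,
)
print(ci.function, ci.aggregated)  # count Courses.courseid
```

Freezing both classes keeps a CI immutable, which is convenient if several alternative CIs for the same query are collected and compared.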
