
Query-Based Data Pricing (Dan Suciu, U. of Washington)



  1. Query-Based Data Pricing
     Dan Suciu, U. of Washington
     Joint with M. Balazinska, B. Howe, P. Koutris, Daniel Li, Chao Li, G. Miklau, P. Upadhyaya
     EPFL, 2013

  2. Data Has Value
     And it is increasingly being sold/bought on the Web
     • Big data vendors
     • Data Markets
     • Private data
     Pricing digital goods is challenging [Shapiro&Varian]

  3. Pricing Data
     Pricing data lies at the intersection of several areas:
     • Data management (this talk)
     • Mechanism design
     • Economics

  4. 1. Big Data Vendors
     High value data
     • Gartner report: $5k, even if you need only one chart
     • Navteq Maps
     • Factual
     • A few others [Muschalle]:
       – Thomson Reuters, Mendeley Ltd., DataMarket Inc, Vico Research & Consulting GmbH, TEMIS S.A., Neofonie GmbH, Inovex GmbH
     Expensive datasets, available only to major customers

  5. 2. Data Markets
     • Azure DataMarkets: 100+ data sources
     • Infochimps: 15,000 data sets
     • Xignite: financial data
     • Aggdata
     • Gnip: social media data
     • PatientsLikeMe
     These datasets are available to the little guy.
     The markets themselves are struggling because they are just facilitators and offer no innovation.

  6. 3. Private Data
     • Private data has value
       – A unique user: $4 at FB, $24 at Google [JPMorgan]
     • Today's common practice:
       – Companies profit from private data without compensating users
     • New trend: allow users to profit financially
       – Industry: personal data locker https://www.personal.com/ , http://lockerproject.org/
       – Academia: mechanisms for selling private data [Ghosh11,Gkatzelis12,Aperjis11,Roth12,Riederer12]
     DIMACS - 10/2012

  7. Sample Data Markets

  8. Different price by business type

  9. $699 for 885,976 teacher names & emails!

  10. Cheaper just for Washington

  11. A Criticism of Today's Pricing Schemes
      • Small buyers want to purchase only a tiny amount of data: if they can't, they give up
      • Large buyers have specific needs: the price is often negotiated in a room full of lawyers
      • Sellers can't easily anticipate all possible queries that buyers might ask
      Needed: a more flexible pricing scheme, parameterized by queries

  12. Outline
      • Framework and examples
      • Results so far
      • Conclusions

  13. Query-based Pricing
      • Seller defines price points: (V_1, p_1), (V_2, p_2), …
        Meaning: price(V_i) = p_i.
      • Buyer may buy any query Q
      • System will determine price_D(Q) based on:
        – The price points
        – The current database instance D
        – The query Q
      What should a "good" price function look like?

  14. Arbitrage Freeness
      Arbitrage-free Axiom: For all queries Q_1, …, Q_k, Q, if Q_1, …, Q_k determine Q, then:
        price_D(Q) ≤ price_D(Q_1) + … + price_D(Q_k)
      "Q_1, …, Q_k determine Q" means that Q(D) can be answered from Q_1(D), …, Q_k(D), without accessing the database instance D

  15. Example 1: Pricing Relational Data
      Relation: S(Shape, Color, Picture)
      Price list:
        View                             Price
        V_1 = σ_{Shape='Swan'}(S)        $2
        V_2 = σ_{Shape='Dragon'}(S)      $2
        V_3 = σ_{Shape='Car'}(S)         $2
        V_4 = σ_{Shape='Fish'}(S)        $2
        W_1 = σ_{Color='White'}(S)       $3
        W_2 = σ_{Color='Yellow'}(S)      $3
        W_3 = σ_{Color='Red'}(S)         $3
      In short: Price(σ_Shape) = $2, Price(σ_Color) = $3
      Instance of S (pictures not reproduced; "…" marks rows elided on the slide):
        Shape    Color    Picture
        Swan     White    (image)
        Swan     Yellow   (image)
        …
        Dragon   Yellow   (image)
        Car      Yellow   (image)
        …
        Fish     White    (image)
        …
      Picture credits: http://www.toysperiod.com/blog/uncategorized/the-modern-art-and-science-of-origami/

  16. Example 1: Pricing Relational Data
      Same price list and instance as slide 15, with two sample purchases:
      • Get all Dragons for $2
      • Get all Red Origami for $3

  17. Example 1: Pricing Relational Data
      Same price list and instance as slide 15.
      Find the price of the entire db: $1? $4? $8? $20?

  18. Example 1: Pricing Relational Data
      Same price list and instance as slide 15.
      Find the price of the entire db (Q): $8
      • V_1, V_2, V_3, V_4 determine Q, so price(Q) ≤ $8
      • W_1, W_2, W_3 determine Q, so price(Q) ≤ $9
      To ensure arbitrage-freeness, we can charge only $8 for the entire database.
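The $8 bound on slide 18 can be reproduced with a small brute-force sketch. It assumes, as the slide does implicitly, that Shape ranges over exactly the four listed shapes and Color over exactly the three listed colors, so a bundle of selection views determines all of S once it contains every value of one column. The names `PRICE_POINTS`, `DOMAINS`, `determines_whole_table` and `cheapest_determining_bundle` are hypothetical, and exhaustive enumeration is only meant for toy price lists; it is not presented as the pricing algorithm from the talk.

```python
from itertools import combinations

# Price points from the slide: (column, value) selection views on S.
PRICE_POINTS = {
    ("Shape", "Swan"): 2, ("Shape", "Dragon"): 2,
    ("Shape", "Car"): 2, ("Shape", "Fish"): 2,
    ("Color", "White"): 3, ("Color", "Yellow"): 3, ("Color", "Red"): 3,
}

# Assumption: the column domains are exactly the values on the price list.
DOMAINS = {
    "Shape": {"Swan", "Dragon", "Car", "Fish"},
    "Color": {"White", "Yellow", "Red"},
}


def determines_whole_table(bundle):
    """True if the bundle covers every value of at least one column."""
    for column, domain in DOMAINS.items():
        bought = {value for (col, value) in bundle if col == column}
        if bought == domain:
            return True
    return False


def cheapest_determining_bundle():
    """Enumerate all bundles and keep the cheapest one that determines S."""
    views = list(PRICE_POINTS)
    best_price, best_bundle = None, None
    for size in range(1, len(views) + 1):
        for bundle in combinations(views, size):
            if not determines_whole_table(bundle):
                continue
            price = sum(PRICE_POINTS[view] for view in bundle)
            if best_price is None or price < best_price:
                best_price, best_bundle = price, bundle
    return best_price, best_bundle


if __name__ == "__main__":
    price, bundle = cheapest_determining_bundle()
    print(price, bundle)
```

Under these assumptions the cheapest determining bundle is the four Shape views, so any price above $8 for the whole database would let a buyer undercut it by purchasing V_1 through V_4 instead.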

  19. Example 1: Pricing Relational Data
      R(Shape, Instructions), with Price(σ_Shape) = $99:
        Shape    Instructions
        Swan     Fold,fold,fold…
        Dragon   Cut,cut,cut,…
      S(Shape, Color, Picture), with Price(σ_Shape) = $2 and Price(σ_Color) = $3:
        Shape    Color    Picture
        Swan     White    (image)
        Swan     Yellow   (image)
        …
        Dragon   Yellow   (image)
        Car      Yellow   (image)
        …
        Fish     White    (image)
        …
      T(Color, PaperSpecs), with Price(σ_Color) = $55:
        Color    PaperSpecs
        White    15g/100
        Black    20g/100
      Find the price of the full join: Q = R ⋈ S ⋈ T

  20. Example 1: Pricing Relational Data
      Same relations and prices as slide 19.
      Find the price of the full join: Q = R ⋈ S ⋈ T
      Join result:
        Shape    Instructions      Color    Picture   PaperSpecs
        Swan     Fold,fold,fold…   White    (image)   15g/100

  21. Example 1: Pricing Relational Data
      Same relations, prices and join result as slide 20.
      Find the price of the full join: Q = R ⋈ S ⋈ T
      Not obvious! E.g. there are no Yellow Cars in the join.
      What to pay for? σ_{Shape='Car'}(R) or σ_{Color='Yellow'}(T)
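The join on these slides can be recomputed with a minimal sketch. It uses only the rows visible on the slides (the rows elided with dots are left out, and the Picture column is dropped), and the function name `full_join` is hypothetical. The last two lines illustrate one reading of the slide's question: the absence of Yellow Cars from the join can be explained either by R having no 'Car' row or by T having no 'Yellow' row, and those two selections carry different prices ($99 and $55).

```python
# Visible rows only; elided rows and the Picture column are omitted (assumption).
R = [("Swan", "Fold,fold,fold..."), ("Dragon", "Cut,cut,cut,...")]
S = [("Swan", "White"), ("Swan", "Yellow"), ("Dragon", "Yellow"),
     ("Car", "Yellow"), ("Fish", "White")]
T = [("White", "15g/100"), ("Black", "20g/100")]


def full_join(r, s, t):
    """Natural join R ⋈ S ⋈ T on Shape (R, S) and Color (S, T)."""
    return [
        (shape, instructions, color, specs)
        for (shape, instructions) in r
        for (s_shape, color) in s if s_shape == shape
        for (t_color, specs) in t if t_color == color
    ]


join_result = full_join(R, S, T)

# Two different selections that each explain why no Yellow Car is in the join.
cars_in_r = [row for row in R if row[0] == "Car"]        # sigma_{Shape='Car'}(R)
yellow_in_t = [row for row in T if row[0] == "Yellow"]   # sigma_{Color='Yellow'}(T)

if __name__ == "__main__":
    print(join_result)
    print(cars_in_r, yellow_in_t)
```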

  22. Discussion
      Why not charge per row in the answer?
      • Q_1(x,y) = Fortune500(x,y)
        Q(x,y) = Fortune500(x,y), StrongBuyRec(x)
      • Q ⊆ Q_1, yet Price(Q) >> Price(Q_1)
      • "Containment" is unrelated to pricing
      • "Determinacy" is the right concept for studying pricing

  23. Example 2: Pricing Private Data
      1. Raw data is too expensive!
        UID    User    Rating (0..5)   Compensation
        1      Alice   3               $10
        2      Bob     0               $10
        3      Carol   1               $10
        4      Dan     0               $10
        …      …       …
        1000   Zoran   2               $10
      • Buyer: query c = x_1 + x_2 + … + x_1000
      • User compensation: $10
      • Price for the buyer: $10,000

  24. Example 2: Pricing Private Data
      Differential privacy
      • Perturbation is necessary for privacy [Dwork'2011]
      Selling private data
      • Perturbation is a cost-saving feature
      • Two extremes:
        – Raw data = no perturbation = high price
        – Differentially private = high perturbation = low price

  25. Example 2: Pricing Private Data
      2. Perturbation lowers the price
        UID    User    Rating (0..5)   Compensation
        1      Alice   3               $10
        2      Bob     0               $10
        3      Carol   1               $10
        4      Dan     0               $10
        …      …       …
        1000   Zoran   2               $10
      • Buyer: c = x_1 + x_2 + … + x_1000
        – Tolerates error ±300
        – Equivalently: variance v = 5000*
      • Answer: ĉ = c + Lap(√(v/2))
      • User compensation: $10 → $0.001 (query is 0.1-DP**)
      • Price for the buyer: $10,000 → $1
      * Probability(|ĉ − c| ≥ 3√2·σ) < 1/18 ≈ 0.056 (Chebyshev), where σ = √v = 50√2
      ** ε = √2 · sensitivity(q) / σ = 5√2 / (50√2) = 0.1
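The footnote arithmetic on this slide can be checked with a short sketch. It assumes the standard Laplace mechanism, where noise Lap(b) has variance 2b² and the privacy parameter is ε = sensitivity / b; the function names are hypothetical, and the ratings passed to `noisy_sum` would be placeholder data, not the values from the talk.

```python
import math
import random


def laplace_scale(variance):
    """Scale b of a Laplace distribution with the given variance (var = 2*b^2)."""
    return math.sqrt(variance / 2.0)


def epsilon(sensitivity, variance):
    """Privacy parameter of the Laplace mechanism: sensitivity / b."""
    return sensitivity / laplace_scale(variance)


def chebyshev_bound(error, variance):
    """Chebyshev upper bound on P(|c_hat - c| >= error)."""
    return variance / error ** 2


def noisy_sum(ratings, variance, rng):
    """Return the sum of ratings plus Laplace noise with the given variance."""
    b = laplace_scale(variance)
    # The difference of two independent exponentials with mean b is Laplace(0, b).
    noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
    return sum(ratings) + noise


if __name__ == "__main__":
    v = 5000.0         # variance tolerated by the buyer
    sensitivity = 5.0  # one rating in 0..5 changes the sum by at most 5
    print(laplace_scale(v))          # scale b of the noise
    print(epsilon(sensitivity, v))   # privacy parameter
    print(chebyshev_bound(300.0, v)) # bound on exceeding error 300
    print(noisy_sum([3, 0, 1, 0, 2], v, random.Random(0)))
```

With v = 5000 the scale is b = 50, which gives ε = 5/50 = 0.1 and a Chebyshev bound of 5000/300² = 1/18, matching the two footnotes.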
