Billion Prices Project:
Avoiding the Big Data – Small Info Syndrome
Roberto Rigobon MIT Sloan, NBER, CSAC
Big distance between data and information! The world is not lacking data; it is lacking information. Need a question.
discontinuations
Services?
Our Approach to Daily Inflation Statistics
1. Use scraping technology
2. Connect to thousands of online retailers every day
3. Find individual items
4. Store and process key item information in a database
5. Develop daily inflation statistics for ~20 countries
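The last step, turning stored item prices into a daily index, can be sketched as follows. The slides do not state the index formula, so this is an assumption: an unweighted geometric mean of day-over-day price relatives, chained from a base of 100, using only items observed on both days (so discontinued items simply drop out). The function name `daily_index` and the input layout are hypothetical.

```python
import math

def daily_index(prices_by_day, base=100.0):
    """Chain a daily price index from per-item prices.

    prices_by_day: list of (day, {item_id: price}) tuples in date order.
    Assumption: unweighted geometric mean of price relatives over items
    present on both consecutive days; this is not taken from the slides.
    """
    index = []
    level = base
    prev = None
    for day, prices in prices_by_day:
        if prev is not None:
            # Only items seen on both days contribute to the daily change.
            common = [item for item in prices if item in prev]
            if common:
                mean_log = sum(
                    math.log(prices[item] / prev[item]) for item in common
                ) / len(common)
                level *= math.exp(mean_log)
        index.append((day, level))
        prev = prices
    return index

# Hypothetical prices for three days; item "B" is discontinued on day 3.
sample = [
    ("2011-10-01", {"A": 1.00, "B": 2.00}),
    ("2011-10-02", {"A": 1.10, "B": 2.00, "C": 5.00}),
    ("2011-10-03", {"A": 1.10, "C": 5.00}),
]
index = daily_index(sample)
```

A production index would typically add category weights and rules for sales and outliers; none of that is shown here.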
Our prices are collected from public online sources, using a technique called “web scraping”
Software downloads a webpage, analyzes the HTML code, “scrapes” the price data, and stores it in a database
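A minimal sketch of the parse-and-store part of that process, using only the Python standard library. The page markup, class names, and table layout below are hypothetical; a real retailer page needs its own extraction rules, and the download step is replaced here by a hard-coded HTML string.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical product listing; stands in for a downloaded webpage.
SAMPLE_HTML = """
<ul>
  <li class="product" data-id="A1"><span class="name">Milk 1L</span><span class="price">$1.25</span></li>
  <li class="product" data-id="B2"><span class="name">Bread</span><span class="price">$2.40</span></li>
</ul>
"""

class PriceParser(HTMLParser):
    """Collects (item_id, name, price) from the assumed markup above."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._current = None  # product currently being read
        self._field = None    # "name" or "price" while inside that span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            self._current = {"id": attrs.get("data-id")}
        elif tag == "span" and self._current is not None:
            if attrs.get("class") in ("name", "price"):
                self._field = attrs["class"]

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            c = self._current
            if "name" in c and "price" in c:
                price = float(c["price"].lstrip("$"))
                self.items.append((c["id"], c["name"], price))
            self._current = None

def scrape_to_db(html, conn, day):
    """Parse one page and store its prices; returns the number of rows added."""
    parser = PriceParser()
    parser.feed(html)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices "
        "(day TEXT, item_id TEXT, name TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO prices VALUES (?, ?, ?, ?)",
        [(day, item_id, name, price) for item_id, name, price in parser.items],
    )
    conn.commit()
    return len(parser.items)

conn = sqlite3.connect(":memory:")
n_rows = scrape_to_db(SAMPLE_HTML, conn, "2011-10-01")
rows = conn.execute(
    "SELECT item_id, name, price FROM prices ORDER BY item_id"
).fetchall()
```

Keeping the item identifier with each price is what lets the same product be tracked from one day to the next.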
Stores keep markups between online and offline prices relatively constant in the medium run
Online price index trends anticipate official inflation shifts. Online prices are easier to change, retailers are more
competitive, and consumers have less memory.
Changes in inflation trends by retailers reflect changes in the
demand they are facing
Sep 16 2008
[Figure: Annual Inflation Rates. Panels: Argentina, Australia, Colombia, Germany, Ireland, UK, China (Supermarket Index), Russia, Venezuela]
[Figure: Product Availability in Online Retailers, daily from 10/1 to 10/31; vertical axis from 20 to 120. Notes: Product availability normalized to 100 on 10/1/2011]
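The normalization in the chart note (availability set to 100 on 10/1/2011) amounts to dividing each day's value by the base-day value and multiplying by 100. A small sketch, assuming availability is measured as a daily count of products listed; the counts and the function name `normalize_to_base` are hypothetical, not data from the slides.

```python
def normalize_to_base(series, base_day, base_value=100.0):
    """Rescale a {day: value} series so that series[base_day] maps to base_value."""
    base = series[base_day]
    if base == 0:
        raise ValueError("base-day value must be non-zero")
    return {day: base_value * value / base for day, value in series.items()}

# Hypothetical daily counts of products available online.
counts = {"2011-10-01": 5000, "2011-10-02": 4500, "2011-10-03": 5500}
normalized = normalize_to_base(counts, "2011-10-01")
```

Normalizing this way makes series from retailers of very different sizes comparable on one axis.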