OGCIO Seminar on PSI/Data.One
PSI and Big Data
Jiannong Cao
Department of Computing, Hong Kong Polytechnic University
Slide 2
TB (10^12 bytes) → PB (10^15 bytes) → EB (10^18 bytes) → ZB (10^21 bytes) ...
The world creates about 2.5 EB of data every day.
The total amount of global data reached 2.8 ZB (report by International Data Corporation (IDC), Dec. 2012).
Where does the data come from?
- Internet & Web
- Sensing & IoT (M2M)
- Enterprises
- Government
Big Data
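As a quick sanity check on the scale figures above, the unit ladder and the daily volume can be worked through in a few lines. This is a minimal sketch: the only inputs are the 2.5 EB per day and 2.8 ZB figures quoted on the slide, and the constant rate over a 365-day year is an assumption made purely for the arithmetic.

```python
# Decimal (SI) storage units, as listed on the slide.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def to_bytes(value, unit):
    """Convert a value expressed in the given unit to bytes."""
    return value * UNITS[unit]


daily_bytes = to_bytes(2.5, "EB")   # quoted daily creation rate
total_bytes = to_bytes(2.8, "ZB")   # quoted global total (IDC, Dec. 2012)

# Assumption: the daily rate stays constant over a 365-day year.
yearly_zb = daily_bytes * 365 / UNITS["ZB"]
days_to_total = total_bytes / daily_bytes

print(f"One year at 2.5 EB/day: {yearly_zb:.2f} ZB")
print(f"Days at 2.5 EB/day to reach 2.8 ZB: {days_to_total:.0f}")
```

Under that assumption, a year of data creation is a bit under one zettabyte, so the quoted total corresponds to roughly three years at the quoted daily rate.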
Slide 3
Answer formerly unanswerable questions
Traditional applications, but with real-time processing of large volumes of data: faster and more accurate
Formulate new questions
New applications where solutions exist only when volume of data becomes big enough
Discover invisible knowledge
New knowledge is discovered by combining separate sources of data
Make evidence-based predictions / decisions
In particular, shift from causality (why) to correlation (what happens)
Opportunities of Big Data
Slide 4
Difficulties in big data research
- Cannot get the data
- Lack of effective means for analyzing the data
- Limited use of the data
PSI provides open data sets for big data research
Big Data
Acquire → Organize → Analyze → Decide
Slide 5
- Hong Kong
- Beijing
- Shanghai
PSI Initiatives
Slide 6
Hong Kong PSI Dataset
15 categories
- Air pollution indices
- Buildings
- Food and environmental hygiene
- Law and order
- Charitable fund-raising activities
- Geo-referenced public facility data
- Image resources
- Marine
- News and information
- Population Census statistics
- Public transport
- Property market statistics
- Real-time traffic data
- Water quality
- Weather data
Slide 7
Beijing PSI Dataset
14 categories
- Tourism
- Dining
- Transportation
- Healthcare
- Entertainment
- Shopping
- Religion
- Education
- Social Welfare
- Living Services
- Housing
- Environment
- Enterprise
- Rural Residents
Slide 8
Shanghai PSI Dataset
7 categories
- Enterprise
- Healthcare
- Transportation
- Law and Order
- Statistics
- Property
- Environment
Slide 9
PSI Datasets
PSI | Data Format | Data Access | Development Support
Beijing | files (e.g. csv, XML) | Direct download | API for map, transport, search
Shanghai | files (e.g. csv, XML) | Direct download | N/A
Hong Kong | files (e.g. csv, XML, mdb) | Direct download, RSS subscription | N/A
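Since all three portals publish downloadable csv and XML files, a first step in any analysis is normalizing both formats into one record shape. The sketch below uses only the Python standard library. The field names (`station`, `api`, `name`) and the sample values are hypothetical and made up for illustration; real PSI files will have their own schemas.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample payloads; field names and values are illustrative only.
CSV_TEXT = "station,api\nCentral,85\nSha Tin,42\n"
XML_TEXT = (
    "<stations>"
    "<station><name>Central</name><api>85</api></station>"
    "<station><name>Sha Tin</name><api>42</api></station>"
    "</stations>"
)


def records_from_csv(text):
    """Parse CSV text into a list of {station, api} dicts."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"station": row["station"], "api": int(row["api"])} for row in reader]


def records_from_xml(text):
    """Parse XML text into the same record shape as the CSV loader."""
    root = ET.fromstring(text)
    return [
        {"station": node.findtext("name"), "api": int(node.findtext("api"))}
        for node in root.iter("station")
    ]


csv_records = records_from_csv(CSV_TEXT)
xml_records = records_from_xml(XML_TEXT)
print(csv_records)
print(csv_records == xml_records)
```

Keeping the loaders separate from the analysis means a change in a portal's file format only touches one function.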
Slide 10
Opportunities for Big Data Research
- Tourism
- Education
- Healthcare
- Green City
- Easy Life
- Business Strategy
Slide 11
Tourism
Geo-referenced Public Facility Data
- Public Toilet
- Free Wi-Fi
Food and Environmental Hygiene
- Quality Dining
Shopping Malls
- Shopping
Public Transport
- Cheapest Route
Slide 12
Green City
Air Pollution Indices
- Anti-Pollution Action
Real-time Traffic Data
- Traffic Scheduling
- Navigation
Slide 13
Business Strategy
PSI data:
- Buildings
- Property Market Statistics
- Food and Environmental Hygiene
- Population Census Statistics
Applications:
- Investment
- Retail Store Planning
- Market Research
Slide 14
Easy Life
PSI data:
- Air Pollution Indices
- Weather Data
- Geo-Referenced Public Facility Data
- Real-time Traffic Data
Applications:
- Route Plan
- Parking
- Leisure
Slide 15
PSI-enabled Big Data Research
PSI data:
- Air Pollution Indices Data
- Real-time Traffic Data
Research topics:
- Big data analytics for urban transportation
- Joint analysis on urban air quality & traffic
Slide 16
Big data analytics for urban transportation
Urban transportation planning & design
With PSI on real-time traffic data, we can
- mine the correlation between two roads
- detect the congestion points
- evaluate the effectiveness of urban traffic planning
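The first two items in the list above can be illustrated with a small sketch. It is not the method used in the talk: it assumes speed samples per road taken at the same timestamps, uses the Pearson coefficient as the correlation measure, and flags a road as congested when its speed is below a threshold in a given share of samples. The road names, speeds, threshold, and share are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (both must vary)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def congestion_points(speeds_by_road, threshold_kmh=20.0, min_share=0.5):
    """Return roads whose speed is below the threshold in at least
    min_share of their samples."""
    congested = []
    for road, speeds in speeds_by_road.items():
        slow = sum(1 for s in speeds if s < threshold_kmh)
        if slow / len(speeds) >= min_share:
            congested.append(road)
    return congested


# Made-up speed samples (km/h) taken at the same timestamps on each road.
speeds = {
    "road_a": [50, 40, 18, 15, 12, 30],
    "road_b": [55, 42, 20, 16, 14, 35],
    "road_c": [60, 58, 61, 59, 62, 60],
}

r_ab = pearson(speeds["road_a"], speeds["road_b"])
print(f"correlation(road_a, road_b) = {r_ab:.2f}")
print("congested:", congestion_points(speeds))
```

A high coefficient between two roads only says their speeds move together; it does not say which road causes the slowdown on the other.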
Intelligent transportation system
With PSI on real-time traffic data, we can
- analyze congestion state of traffic flows
- provide a user with the practically fastest route to a destination at a given time
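One way to read "fastest route at a given time" is a shortest-path search in which each edge's travel time depends on when the vehicle enters it. The sketch below is a time-dependent variant of Dijkstra's algorithm under that reading, not the system described in the talk. The network, base travel times, and peak window are hypothetical. The search is only guaranteed optimal when leaving later never lets you arrive earlier (the FIFO property); the step-shaped toy function here is chosen for readability and does not strictly satisfy that at the end of the peak window.

```python
import heapq


def fastest_route(graph, travel_time, source, target, depart):
    """Time-dependent Dijkstra.

    graph: dict mapping node -> list of neighbor nodes
    travel_time(u, v, t): minutes to cross edge (u, v) when entering at time t
    Returns (arrival_time, path), or (None, []) if target is unreachable.
    """
    best = {source: depart}
    prev = {}
    queue = [(depart, source)]
    while queue:
        t, u = heapq.heappop(queue)
        if u == target:
            path = [u]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return t, path[::-1]
        if t > best.get(u, float("inf")):
            continue  # stale queue entry
        for v in graph.get(u, []):
            arrival = t + travel_time(u, v, t)
            if arrival < best.get(v, float("inf")):
                best[v] = arrival
                prev[v] = u
                heapq.heappush(queue, (arrival, v))
    return None, []


# Hypothetical network: A->B->D is shorter, but B->D is congested in the peak.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
BASE = {("A", "B"): 5, ("B", "D"): 5, ("A", "C"): 8, ("C", "D"): 8}


def travel_time(u, v, t):
    """Minutes on edge (u, v); B->D triples during minutes 480-539 of the day."""
    minutes = BASE[(u, v)]
    if (u, v) == ("B", "D") and 480 <= t < 540:
        minutes *= 3
    return minutes


off_peak = fastest_route(GRAPH, travel_time, "A", "D", depart=600)
peak = fastest_route(GRAPH, travel_time, "A", "D", depart=480)
print("off-peak:", off_peak)
print("peak:", peak)
```

With these made-up numbers the route through B wins off-peak, while the longer route through C wins when departing at the start of the peak window, which is the behavior the slide's "at a given time" points to.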
PSI-enabled Big Data Research
Slide 17
Traffic State in Hong Kong
Slide 18
Urban air quality estimation
With PSI on real-time traffic and air pollution data, we can
- learn from historical and real-time air quality data
- estimate fine-grained air quality from a limited number of monitoring stations
- predict air quality throughout the city in the future
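To make "fine-grained estimation from a limited number of monitoring stations" concrete, the sketch below uses inverse distance weighting (IDW), a simple spatial interpolation baseline. It is a stand-in chosen for illustration, not the estimation method of the talk, and it ignores traffic, wind, and terrain. Station coordinates and readings are made up.

```python
from math import hypot


def idw_estimate(stations, x, y, power=2.0):
    """Estimate a value at (x, y) by inverse distance weighting.

    stations: list of (x, y, value) tuples from monitoring stations.
    Closer stations get larger weights (1 / distance ** power).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        dist = hypot(x - sx, y - sy)
        if dist == 0.0:
            return float(value)  # exactly at a station: use its reading
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Made-up station positions (km on a flat grid) and index readings.
STATIONS = [(0.0, 0.0, 80.0), (10.0, 0.0, 40.0), (0.0, 10.0, 60.0)]

at_station = idw_estimate(STATIONS, 0.0, 0.0)
midpoint = idw_estimate(STATIONS, 5.0, 0.0)
print(at_station, round(midpoint, 1))
```

An IDW estimate always lies between the lowest and highest station readings, so it cannot reveal a local hotspot that no station observes; that limitation is one reason to bring in additional data such as traffic.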
Joint analysis on urban air quality & traffic
With PSI on real-time traffic and air pollution data, we can
- find the relationship between urban air quality and traffic
- estimate fine-grained air quality using traffic information
- calculate the maximal traffic volume under air quality constraints
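The first and last items above can be sketched together under a strong simplifying assumption: that a roadside index rises linearly with traffic volume. The code fits that line by ordinary least squares and then inverts it to find the largest volume that keeps the predicted index within a limit. The linear model, the observations, and the limit are all hypothetical; a real analysis would need to account for weather, background pollution, and time lags.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def max_traffic(intercept, slope, api_limit):
    """Largest traffic volume whose predicted index stays within api_limit.

    Only meaningful when slope > 0 (more traffic, higher index).
    """
    if slope <= 0:
        raise ValueError("model does not show the index rising with traffic")
    return max(0.0, (api_limit - intercept) / slope)


# Made-up paired observations: vehicles per hour and a roadside index.
traffic = [500, 1000, 1500, 2000, 2500]
roadside_api = [40, 55, 70, 85, 100]

intercept, slope = fit_line(traffic, roadside_api)
limit = max_traffic(intercept, slope, api_limit=100)
print(f"index = {intercept:.1f} + {slope:.3f} * vehicles_per_hour")
print(f"max volume for index <= 100: {limit:.0f} vehicles per hour")
```

The fitted slope describes a correlation in the sample, in line with the earlier point about shifting from causality to correlation; it does not by itself establish that traffic is the cause.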
PSI-enabled Big Data Research
Slide 19
Air quality monitoring stations in HK
— from Environmental Protection Department of the Government of the HKSAR
Slide 20
Air quality statistics of areas in HK
[Figure: four charts of monthly API (Air Pollution Index) values, Jan to Dec, comparing 2011 and 2012]
- Roadside API of Mong Kok in HK
- General API of Yuen Long in HK
- Roadside API of Central in HK
- General API of Sha Tin in HK
Slide 21
Example of joint analysis on air quality & traffic
[Figure: four charts of monthly API (Air Pollution Index) values, Jan to Dec, comparing 2011 and 2012]
- Roadside API of Mong Kok in HK
- General API of Yuen Long in HK
- Roadside API of Central in HK
- General API of Sha Tin in HK
Slide 22
PSI sheds new light on big data research
- Various data sources
- Valuable data
PSI offers a lot of opportunities
- Operational efficiency and productivity
- Public utilities to reduce consumption
- Produce safety from farm to fork
- Value for money for citizens
- Customize actions based on population segments
- Fraud detection and prevention
- Prevent crime waves
Conclusion
Slide 23