
Public Data: Enhancing Data Discovery and Exploration, Benjamin Yolken - PowerPoint PPT Presentation



  1. Public Data: Enhancing Data Discovery and Exploration
     Benjamin Yolken (yolken@google.com)
     September 2011

  2. Overview
     Disseminating public statistics
     Google tools
       ● Public Data Explorer
       ● Fusion Tables
       ● Refine
     Conclusion

  3. Disseminating Statistics

  4. Objective: Make public statistics accessible, useful, and well-organized.

  5. Public statistics (2)

  6. Public statistics (4)

  7. Accessible...
     (1) Access: Data need to be online and findable
       ● Provider web sites
       ● Third-party aggregators
       ● Search engines
     (2) Understanding: Statisticians aren't the only users
       ● Lay users: Teachers, students, journalists, policy makers
       ● Computers: Search engines
       ● If not accessible to non-experts, data can become unused or, worse, misused

  8. Useful...
     There are a lot of distractions today: tables and simple plots are not enough.
     Need to engage not just with users' eyes, but also their brains.

  9. Well-organized...
     Go beyond flat lists of data...
       ● Topics
       ● Time periods
       ● Geographic regions
       ● Formats
       ● Languages, etc...
     Ultimately, depends on having good metadata

  10. Google Tools

  11. Public Data Explorer (PDE) [Link]
      What it is:
        ● Stand-alone product for interactively exploring and visualizing rich datasets
        ● Visualizations can be shared or embedded on third-party sites
      What it's good for:
        ● Reaching out to non-expert users
        ● Getting traffic to your site
        ● Categorical, aggregated, time-series data
      Caveats:
        ● Datasets must be in Dataset Publishing Language (DSPL) format
          ○ Have some tools to help
          ○ Working on converters from other formats like SDMX

  12. PDE: Demo Demo link

  13. PDE: Embed Demo link

  14. Fusion Tables [Link]
      What it is:
        ● Product for creating, editing, and sharing tabular data
      What it's good for:
        ● Table edits and transformations: joining, filtering, aggregating, etc.
        ● Static visualizations, particularly maps
        ● Exposing data to users via APIs
      Caveats:
        ● Not connected to PDE (yet)
        ● Not as useful for time-series exploration
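The table transformations named on the Fusion Tables slide (joining, filtering, aggregating) can be illustrated with a minimal sketch in plain Python. This is not the Fusion Tables API: the tables, column names, and helper functions (`join`, `aggregate_sum`) are hypothetical and the values are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample tables (invented values, not real statistics).
regions = [
    {"region_id": "R1", "region_name": "North"},
    {"region_id": "R2", "region_name": "South"},
]
observations = [
    {"region_id": "R1", "year": 2009, "value": 10.0},
    {"region_id": "R1", "year": 2010, "value": 12.0},
    {"region_id": "R2", "year": 2009, "value": 7.0},
    {"region_id": "R2", "year": 2010, "value": 9.0},
]


def join(left, right, key):
    """Inner-join two lists of row dicts on a shared key column."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def aggregate_sum(rows, group_by, value_column):
    """Sum value_column for each distinct value of group_by."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[value_column]
    return dict(totals)


# Join: attach region names to each observation.
joined = join(observations, regions, "region_id")

# Filter: keep only rows from 2010 onward.
recent = [row for row in joined if row["year"] >= 2010]

# Aggregate: total value per region across all years.
totals = aggregate_sum(joined, "region_name", "value")
```

Here `totals` maps each region name to the sum of its values, and `recent` holds only the 2010 rows with region names attached.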

  15. Fusion Tables: Demo Demo link

  16. Google Refine
      What it is:
        ● Desktop-based tool for cleaning up and transforming tabular data
      What it's good for:
        ● Bulk data transformations
        ● Faceted data browsing
        ● Outlier detection and cleanup
      Caveats:
        ● No collaboration features (yet)
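As a rough illustration of what faceted browsing and cleanup mean for tabular data, the sketch below counts the distinct values in a column before and after a bulk normalization step. It is not how Google Refine is implemented; the sample values and the `normalize` and `facet` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical messy column values (invented for illustration).
raw_values = ["Canada", "canada ", "CANADA", "Mexico", "Mexcio", "Mexico"]


def normalize(value):
    """Bulk transformation: trim whitespace and lowercase."""
    return value.strip().lower()


def facet(values):
    """Count how often each distinct value appears in a column."""
    return Counter(values)


# Facet on the raw column: spelling variants show up as separate entries.
raw_facet = facet(raw_values)

# Facet again after the bulk transformation: variants collapse together.
clean_facet = facet(normalize(v) for v in raw_values)

# Values that appear only once are candidates for manual review.
rare = [value for value, count in clean_facet.items() if count == 1]
```

After normalization the three spellings of "canada" collapse into one facet entry, and the single remaining oddity ("mexcio") stands out as a likely typo.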

  17. Google Refine

  18. Conclusion
      Need to make statistics accessible, useful, organized
      Google has tools that can help
      Key advice: Think about the users, their needs
      Really exciting area, only scratched the surface in terms of what's possible

  19. Thank you! Questions?

  20. Appendix

  21. PDE Intro Video

  22. PDE: Metadata
      Dataset Publishing Language (DSPL)
        ● Designed for interactive exploration and visualization
        ● Released under BSD, open source license
        ● Combines data tables (CSV) with metadata (XML)
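The general idea of pairing a CSV data table with a separate XML metadata description can be sketched as follows. The element and attribute names here (`dataset`, `table`, `column`, `id`, `file`, `type`) are illustrative placeholders and are not the actual DSPL schema; the sample rows and the `build_csv` and `build_metadata` helpers are likewise hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical data table: header row followed by invented sample values.
rows = [
    ("year", "region", "value"),
    (2009, "North", 10.0),
    (2010, "North", 12.0),
]


def build_csv(table_rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table_rows)
    return buffer.getvalue()


def build_metadata(table_id, csv_filename, columns):
    """Describe the CSV table in XML (placeholder element names, not DSPL)."""
    root = ET.Element("dataset")
    table = ET.SubElement(root, "table", id=table_id, file=csv_filename)
    for name, column_type in columns:
        ET.SubElement(table, "column", id=name, type=column_type)
    return ET.tostring(root, encoding="unicode")


csv_text = build_csv(rows)
metadata_xml = build_metadata(
    "observations",
    "observations.csv",
    [("year", "integer"), ("region", "string"), ("value", "float")],
)
```

Keeping the metadata separate from the data means the CSV stays a plain table, while the XML carries what a visualization tool needs to interpret each column.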

  23. PDE: Dataset Creation and Upload
