open data science initiative

Open Data Science Initiative, Neil D. Lawrence, data@sheffield, 16th December 2015



  1. Open Data Science Initiative, Neil D. Lawrence, data@sheffield, 16th December 2015

  2. Challenges for Companies
     ◮ Trying to dominate the modern interconnected data market (e.g. Amazon, Google, Facebook): buying up talent and competitors.
     ◮ or trying to exploit current ‘data silos’ (e.g. Tesco Clubcard, Experian): monetising our data today (limited shelf life?)
     ◮ or trying to understand their own systems (the internal Google search)
     ◮ or new companies with new ideas that will generate data.

  3. Challenges for Companies
     ◮ How do they break the natural data monopoly?
     ◮ How do they access the necessary expertise?

  4. Challenges in Science
     Data sharing is more widely accepted but:
     ◮ Most analysis is simple statistical tests or exploratory modelling with PCA or clustering.
     ◮ Few scientists understand these methodologies; most apply them as a black box.
     ◮ There is an understanding gap between the data & scientist and the data scientist.
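     As an illustration of what sits inside the "black box" mentioned on this slide, here is a minimal sketch of PCA in plain Python. It is not taken from the slides: the function name `first_principal_component` and the toy data are hypothetical, and it recovers only the leading component, using power iteration on the sample covariance matrix.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, variance) of the leading principal component.

    rows: list of equal-length lists of floats, one row per observation.
    Hypothetical helper for illustration only.
    """
    n, d = len(rows), len(rows[0])
    # Centre each column so the covariance is taken about the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: multiply by the covariance and renormalise until
    # the vector settles on the dominant eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance explained along that direction (Rayleigh quotient, |v| = 1).
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance

# Made-up toy data lying close to the line y = 2x.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
direction, variance = first_principal_component(data)
print(direction, variance)
```

     On this toy data the recovered direction should point roughly along the line y = 2x. In practice a library routine would be used instead, and the later components would be computed as well.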

  5. Challenges in Health
     ◮ Ensure the privacy of patients is respected.
     ◮ Leverage the wide range of data available for wider societal benefit.

  6. International Development
     ◮ Exploit new telecommunications infrastructure to leap-frog developed countries.
     ◮ Needs mechanisms for data sharing that retain the individual’s control.
     ◮ Widespread education of local talent in code and model development.

  7. Common Strands
     ◮ Improving access to data whilst balancing the individual’s right to privacy against society’s need to advance.
     ◮ Advancing methodologies: development of the methodologies needed to characterize large, interconnected, complex data sets.
     ◮ Analysis empowerment: giving scientists, clinicians, students, and commercial and academic partners the ability to analyze their own data with the latest methodologies.

  8. Open Data Science: A Magic Bullet?
     ◮ Make new methodologies available as widely and rapidly as possible, with as few conditions on their use as possible.
     ◮ Educate commercial, scientific and medical partners in the use of these methodologies.
     ◮ Act to achieve a balance between data sharing for societal benefit and the right of an individual to own their own data.

  9. Achieving This
     ◮ Use BSD-like licenses on software.
     ◮ Educate our partners (summer schools, courses, etc.).
     ◮ Act to achieve a balance between data sharing for societal benefit and rights of the individual.

  10. Make Analysis Available

  11. Educating
      But we need to do much more!

  12. Digital Identity and Data Ownership

  13. Data Warehousing

  14. Blog Post

  15. Blog Post

  16. Modern Tools: Github

  17. Modern Tools: Reddit

  18. Modern Tools: IPython Notebook

  19. Literate Computing

  20. Example: Prediction of Malaria Incidence in Uganda
      ◮ Work with John Quinn and Martin Mubangizi (Makerere University, Uganda)
      ◮ See http://air.ug/research.html.

  21. Malaria Prediction in Uganda
      [Figure: map with axes spanning 2°S to 4°N and 29°E to 35°E. Data: SRTM/NASA from http://dds.cr.usgs.gov/srtm/version2_1]

  22. Malaria Prediction in Uganda: Nagongera / Tororo (multiple output model)
      [Figure: five panels sharing a horizontal axis with ticks from 1500 to 3500. Panel titles: Sentinel - all patients; Sentinel - patients with malaria; HMIS - all_patients; Satellite - rain; W. station - temperature.]

  23. Malaria Prediction in Uganda: Mubende
      [Figure: two panels, "sparse regression" and "multiple output", each plotting incidence (0 to 5000) against time in days (0 to 1800).]

  24. GP School at Makerere

  25. Early Warning Systems

  26. Early Warning Systems
