iterative design for data science projects

Iterative design for data science projects, by Bo Peng (@bo_p), for QCon



  1. Iterative design for data science projects Bo Peng • @bo_p for QCon San Francisco • Nov 7, 2016

  2. approach case study: heritage health prize Goal: Create an algorithm that predicts how many days a patient will spend in a hospital in the next year. http://heritagehealthprize.com

  3. approach case study: heritage health prize. 2 years, 1,363 teams, 25,316 entries. http://heritagehealthprize.com

  4-9. approach case study: heritage health prize. [Chart built up over six slides: score vs. time (in months), with reference levels labeled "all zeros", "constant value", and "goal"] http://heritagehealthprize.com

  10. What can we learn from this? Solving business problems can rarely be reduced to minimizing a model’s RMSE.

  11-12. Contests are fun. Solving business problems can rarely be reduced to minimizing a model’s RMSE.
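
The chart in the case-study slides marks two trivial reference points, "all zeros" and "constant value". A minimal sketch of how such baselines could be scored with RMSE follows. The `days` values are hypothetical and not contest data, the helper name `rmse` is an assumption, and the sketch uses plain RMSE because that is the metric the slide names.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


# Hypothetical days-in-hospital values for ten patients (illustration only).
days = [0, 0, 0, 0, 1, 0, 2, 0, 0, 7]

# Baseline 1: predict zero days for everyone.
all_zeros = [0.0] * len(days)

# Baseline 2: predict one constant for everyone. The mean is the constant
# that minimizes RMSE, so it is used here.
mean_days = sum(days) / len(days)
constant_value = [mean_days] * len(days)

baseline_scores = {
    "all zeros": rmse(days, all_zeros),
    "constant value": rmse(days, constant_value),
}

for name, score in baseline_scores.items():
    print(f"{name}: RMSE = {score:.3f}")
```

Any model entered in a contest like this has to beat both numbers before its score says anything about the business problem, which is the point slide 10 makes.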

  13. agenda - A common approach to data science - The design approach: - a simple model goes a long way (eDiscovery) - finding & recommending experts within P&G

  14. Data-driven e-discovery for Daegis: how simple models + design go a long way

  15. data-driven e-discovery daegis

  16. data-driven e-discovery daegis. documents: about patent / not about patent

  17. data-driven e-discovery daegis. options: don’t turn over to plaintiff / turn over to plaintiff. about patent, not turned over: adverse inference

  18-19. data-driven e-discovery daegis. options: don’t turn over to plaintiff / turn over to plaintiff. about patent, not turned over: adverse inference. not about patent, turned over: give away trade secrets

  20. data-driven e-discovery daegis. options: don’t turn over to plaintiff / turn over to plaintiff

  21. data-driven e-discovery daegis

  22. data-driven e-discovery daegis. create a “document map”. labels on the map: lunch, fantasy football, algorithm design, marketing, coffee, patents, finances

  23. data-driven e-discovery daegis. create a “document map”. labels on the map: lunch, fantasy football, algorithm design, marketing, coffee, patents, finances. review away shades of grey; reduce reviews by 90-99%

  24. care about design. simple, powerful interfaces relay analytics better.
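
The "document map" in the e-discovery slides groups documents under labels such as lunch, fantasy football, and patents, so reviewers can skip whole regions. The sketch below shows one simple way such a grouping could work, using bag-of-words cosine similarity against a few seed phrases. It is an illustration under stated assumptions, not the method used for Daegis: the seed phrases, the sample documents, and the names `topic_seeds`, `nearest_topic`, and `document_map` are all hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical seed phrases for three of the labels shown on the slide.
topic_seeds = {
    "lunch": "lunch sandwich order noon",
    "fantasy football": "fantasy football league draft",
    "patents": "patent claim filing invention",
}

# Hypothetical documents (illustration only).
documents = [
    "anyone want to order lunch at noon",
    "draft for the fantasy football league is friday",
    "the patent filing covers the second claim",
    "sandwich order for lunch",
]


def nearest_topic(text):
    """Assign a document to the seed topic it is most similar to."""
    vec = vectorize(text)
    return max(topic_seeds, key=lambda t: cosine(vec, vectorize(topic_seeds[t])))


# Group documents by their nearest topic: a very rough "document map".
document_map = {}
for doc in documents:
    document_map.setdefault(nearest_topic(doc), []).append(doc)

# Only the patent region would be sent to human review in this toy setup.
to_review = document_map.get("patents", [])
share_skipped = 1 - len(to_review) / len(documents)
print(f"documents to review: {len(to_review)}, share skipped: {share_skipped:.0%}")
```

Even a model this simple is only useful if reviewers can see and act on the groups, which is why the slides pair it with interface design.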

  25. iterative problem solving plan, build, test, and iterate as quickly as possible generate ideas rapid iterations evaluate build prototype

  26. Data-driven expertise exploration: Procter & Gamble

  27-28. data-driven expertise exploration procter & gamble

  29. High-level goals: - reveal areas of expertise - evaluate connectivity within experts

  30. data-driven expertise exploration procter & gamble

  31. data-driven expertise exploration procter & gamble Lorem Ipsum: a narrative about blankets. Author: Charlie Brown Date: 31 Jan 2012 Lorem Ipsum is a dummy text used when typesetting or marking up documents. It has a long history starting from the 1500s and is still used in digital millennium for typesetting electronic documents, page designs, etc. In itself, the original text of Lorem Ipsum might have been taken from an ancient Latin book that was written about 50 BC. Nevertheless, Lorem Ipsum’s words have been changed so they don’t read as a proper text. Naturally, page designs that are made for text documents must contain some text rather than placeholder dots or something else. However, should they contain proper English words and sentences almost every reader will deliberately try to interpret it eventually, missing the design itself. However, a placeholder text must have a natural distribution of letters and punctuation or otherwise the markup will look strange and unnatural. That’s what Lorem Ipsum helps to achieve. I would like to thank Peppermint Patty for her support on studying Lorem Ipsum as well as the infinite wisdom of Linus van Pelt and his willingness to use his blanket in my experiments.

  32-33. vs.

  34. iterative problem solving plan, build, test, and iterate as quickly as possible generate ideas rapid iterations evaluate build prototype

  35-36. High-level goals: - reveal areas of expertise - evaluate connectivity within experts
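
The two goals above can be illustrated with a small sketch that reads people and topics off document records, then counts topics per person (areas of expertise) and distinct collaborators per person (connectivity). This is an assumption about how such an analysis might start, not the approach built for Procter & Gamble. The first record borrows the names from the sample document on slide 31; the other records, the `topic` field, and "Lucy van Pelt" are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical document records: people named on each document and a topic tag.
documents = [
    {"people": ["Charlie Brown", "Peppermint Patty", "Linus van Pelt"], "topic": "blankets"},
    {"people": ["Linus van Pelt", "Lucy van Pelt"], "topic": "blankets"},
    {"people": ["Lucy van Pelt"], "topic": "psychiatry"},
]

expertise = defaultdict(set)    # person -> topics they appear on
connections = defaultdict(set)  # person -> people they share a document with

for doc in documents:
    for person in doc["people"]:
        expertise[person].add(doc["topic"])
    # Every pair of people on the same document counts as connected.
    for a, b in combinations(doc["people"], 2):
        connections[a].add(b)
        connections[b].add(a)

# Connectivity as the number of distinct collaborators per person.
degree = {person: len(connections[person]) for person in expertise}

for person in sorted(degree, key=degree.get, reverse=True):
    topics = ", ".join(sorted(expertise[person]))
    print(f"{person}: {degree[person]} connections; topics: {topics}")
```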

  37. let’s compare countries.

  38. + 1

  39-42. [Figure values, repeated across four build slides] 10 5 5 20 8 25 2 5 12 3 30 10 1 20 25 50

  43. design influences data science.

  44. care about design.

  45. Iterative design for data science projects Bo Peng • @bo_p for QCon San Francisco • Thanks!
