Building an analytic department



  1. Building an analytic department: From Zero to TensorFlow

  2. The Peter principle: People in a hierarchy tend to rise to their “level of incompetence”: an employee is promoted based on their success in previous jobs until they reach a level at which they are no longer competent, as skills in one job do not necessarily translate to another.

  3. Introductions: Antoine Desmet, Analytics manager – Smart Solutions, Komatsu

  4. Hunter Valley

  5. 2000: The US Defense Department ended the purposeful degradation of GPS. 2008: Komatsu releases Level 4 autonomy, a driverless truck fleet that operates even if the wireless link is lost.

  6. Real-time terrain mapping • LIDAR on diggers • Scans stitched together into a terrain map • Terrain map compared to the plan • Operator sees, in near real-time: Red = over-dug, Blue = matches plan, Green = needs digging
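The colour coding on the slide can be illustrated with a minimal sketch. It assumes the terrain map and the plan are grids of elevations in metres, that digging lowers the ground towards the planned elevation, and that a small tolerance decides what counts as "matches plan". The grid layout, the `tolerance` value and the function names are illustrative assumptions, not details of the actual system.

```python
def classify_cell(scanned: float, planned: float, tolerance: float = 0.1) -> str:
    """Colour one terrain cell by comparing scanned and planned elevation (metres)."""
    difference = scanned - planned
    if difference < -tolerance:
        return "red"    # ground is below the plan: over-dug
    if difference > tolerance:
        return "green"  # ground is above the plan: needs digging
    return "blue"       # within tolerance: matches plan


def classify_map(scan, plan, tolerance: float = 0.1):
    """Apply classify_cell to every cell of two equally sized grids."""
    return [
        [classify_cell(s, p, tolerance) for s, p in zip(scan_row, plan_row)]
        for scan_row, plan_row in zip(scan, plan)
    ]


# Hypothetical 2x2 scan against a flat plan at 10 m.
scan = [[10.0, 9.5], [10.6, 10.05]]
plan = [[10.0, 10.0], [10.0, 10.0]]
colours = classify_map(scan, plan)
print(colours)
```

In a real system the tolerance would presumably depend on the LIDAR accuracy and the digging plan, which the slides do not specify.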

  7. Topics This is the story of a growing analytics team. It’s a business-oriented presentation: a collection of thoughts and discoveries, so apologies if I don’t have all the definitive answers • Background • The beginning: small vs. big? • Growth • R&D • Picking your projects • Stakeholder management • What’s next

  8. Background What data do we have, what do we do with it… and WHY?

  9. The cost of downtime Ore extraction chain: • Cost = $20,000/hr • Revenue = $40,000/hr • Profit when operating = +$20,000/hr • Profit on breakdown = -$15,000/hr A leaking air hose: • Time to fix = 1-2 hr • Parts + labour = $300 • Loss of production = $15-30k
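The air-hose arithmetic can be reproduced in a few lines. This sketch takes the slide's breakdown figure of -$15,000 as an hourly rate, which is what the quoted $15-30k loss over 1-2 hours implies. The constant and function names are illustrative.

```python
BREAKDOWN_PROFIT_PER_HOUR = -15_000  # $/hr while broken down (figure from the slide)
REPAIR_COST = 300                    # $ parts + labour for a leaking air hose


def production_loss(hours_down: float) -> float:
    """Money lost while the machine is down, in dollars."""
    return -BREAKDOWN_PROFIT_PER_HOUR * hours_down


for hours in (1, 2):
    loss = production_loss(hours)
    ratio = loss / REPAIR_COST
    print(f"{hours} hr down: loss ${loss:,.0f}, {ratio:.0f}x the repair cost")
```

The point of the slide is the ratio: the lost production is 50 to 100 times the cost of the repair itself.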

  10. 800 sensors, sampling rate: 100 ms max • Payload • Machine’s motions • Operator’s joysticks • Motor currents • Auto-lube system • Air pressure • Temperatures • Brakes status

  11. What we provide • The machine’s control system will “fault” if it detects a severe malfunction • Unplanned downtime is extremely costly in the mining industry • We analyse telemetry data to detect issues before they trigger a system fault • It’s not so much about saving the part: by the time we can detect a malfunction, it is often already beyond repair • It’s about giving the customer time to plan maintenance for what would otherwise be a disruptive unplanned breakdown

  12. In the beginning At the peak of the “big data” hype cycle

  13. Day 1 • 2014: one engineer (me) and one manager (sales) • At the peak of the “Big Data” craze, but… • In the midst of a mining downturn: no budget, pressure to deliver • 6 years prior, a visionary had set up dataloggers + backend to harvest data from hundreds of sensors at high resolution = lots of data • Data locked up in antiquated time-series databases • Fragile infrastructure • Zero process

  14. The Skunk works • Hired a couple of summer interns to boost output • Version control = copy/paste into separate folders. That’s OK because there were only a couple of developers • Built a rudimentary “model factory”: a data-dredging algorithm run without any hypothesis or prior assessment. Generally viewed as poor practice… That’s OK because it’s machine data: correlations usually indicate something mechanically or electrically coupled. Feature engineering made it work. 3 months = wide “coverage” of the machine • Do everything on your laptop, then straight to Production. That’s OK because there were no contracts and nothing mission-critical. What was mission-critical was demonstrating value

  15. Reflections: Small vs. Big Small / startup model: • Loose plan, objectives and strategy • Less capital investment from the business, so lower expectations • Pick problems yourself: those that seem relevant, and “safe bets” = quick wins in months • High risk of picking the wrong projects. Fast but disorganised, bound to run into scaling issues Big / corporate model: • Large investment, financial targets set from the start • Regimented methods, pressure to deliver may hinder creativity • 1 year, 10 data scientists: explore, investigate use cases for analytics • Well organised, safe-but-slow approach, prepared for the long term

  16. Growing Product: tick – customers: tick – what’s next?

  17. Another start-up that became bloated Mech/Elec engs were very productive and creative… but things started to tear at the seams: • Why document when everyone knows… bus factor! • IT upgrading databases crippled us with rework • Lack of software engineering practices = poor reliability, readability and re-usability • Things started to slow down • Routine means you become blind to your own deficiencies • Hard to see the paradigm shift: “remember how we used to be faster, what happened?” • Resignation: you accept that things are the way they are, and that getting a clean run or working faster isn’t possible

  18. Today • 2-3 years later, we welcomed 3 team members, including a senior software dev • The software dev went on a crusade (still going) for: unit tests, doc, libraries • The “old guard” had to lift their game and mature to integrate the “fresh blood”. This helped kick the old counter-productive habits and work towards increasing quality and pace Our team now has: • 2 Data scientists: the theory • 2 Engineers: make it work • 2 Software developers: make it scale • 1 Analyst / report developer: make it visible • 3 Subject matter experts: make it relevant

  19. Workflow challenges The release cliff-hanger: • Analysts are fluent at developing models on their laptop… • Releasing an analytic into production is a rare event. Lack of practice = frequent fails Trialling a solution: • Start with a test release of a “skeleton” to PROD, instead of leaving the release as the final step • DevOps 101: release early and frequently!

  20. Workflow challenges From bench to streaming: • R&D happens on a static block of time-series data (e.g. one month) • Challenge = going from static to live streaming: batch size, handover between batches, catching up (maintain full history) vs. forcing forward (satisfy real-time) Standardise: • Build high-level functions & templates to abstract the real-time execution aspects • Don’t lock down the process so much that it becomes hard to build “non-standard” analytics • Standardising helps maintainability, collaboration, etc.
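The "handover between batches" problem can be shown with a minimal sketch: a rolling mean that carries its last few samples from one batch to the next, so the streamed result matches what R&D would compute on the static block. The class and function names are hypothetical and the rolling mean stands in for any windowed analytic; this is not the team's actual template.

```python
from collections import deque


class StreamingRollingMean:
    """Rolling mean that keeps its window state between batches."""

    def __init__(self, window: int):
        self.window = window
        # Holds the most recent `window` samples, carried across batch boundaries.
        self._tail = deque(maxlen=window)

    def process_batch(self, batch):
        """Consume one batch and return the rolling means it completes."""
        results = []
        for sample in batch:
            self._tail.append(sample)
            if len(self._tail) == self.window:
                results.append(sum(self._tail) / self.window)
        return results


def static_rolling_mean(series, window):
    """Reference implementation on a static block of data (the R&D view)."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


# The same series, processed once as a static block and once as three batches.
series = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
stream = StreamingRollingMean(window=3)
streamed = []
for batch in (series[:2], series[2:5], series[5:]):
    streamed.extend(stream.process_batch(batch))

print(streamed == static_rolling_mean(series, 3))
```

Hiding the carried-over state inside one reusable class is one way to read the "high-level functions & templates" bullet: the analyst writes the windowed logic once and the batching details stay out of sight.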

  21. 3 aspects of continuous improvement • Streamline actioning the insights • Streamline tools for faster analytics development • Streamline analytics: generic and re-usable

  22. R&D

  23. Finance, industrial plants and insurance analytics Industrial analytics are a niche application, no-one can help me! What could there be to gain by looking outside of my industry? 
 Finance: f(A,B) = Ĉ, where C is a share price, A and B are the competitors’ share prices • If Ĉ >> C: sell, Ĉ << C: buy, Ĉ = C: do nothing 
 Insurance: f(A,B) = Ĉ, where C is the amount claimed, A and B are some parameters of the claim • Ĉ ≈ C: do nothing, Ĉ << C: investigate a potentially fraudulent claim 
 Plant analytics: f(A,B) = Ĉ, where C is the temperature of a motor, A and B are bearing temperatures • Ĉ ≈ C: do nothing, Ĉ << C: motor potentially overheating 
 At the right level of abstraction, it all becomes the same. Talk to people. But I’m preaching to the choir!
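The shared pattern, predict Ĉ from A and B and then compare it with the observed C, can be sketched as a single generic function. The model `predict_motor_temp`, its offset, the tolerance and the labels are all made-up placeholders for illustration; the slides do not say what f actually is.

```python
from typing import Callable


def flag(
    f: Callable[[float, float], float],
    a: float,
    b: float,
    actual: float,
    tolerance: float,
) -> str:
    """Compare the observed value C with the prediction C_hat = f(A, B)."""
    predicted = f(a, b)
    residual = actual - predicted
    if residual > tolerance:
        return "high"    # C_hat << C: e.g. motor overheating, suspicious claim
    if residual < -tolerance:
        return "low"     # C_hat >> C
    return "normal"      # C_hat ~ C: do nothing


def predict_motor_temp(bearing_a: float, bearing_b: float) -> float:
    """Hypothetical f(A, B): mean bearing temperature plus a fixed offset."""
    return (bearing_a + bearing_b) / 2 + 10.0


# Bearings at 60 and 64 degrees give a predicted motor temperature of 72.
print(flag(predict_motor_temp, 60.0, 64.0, actual=73.0, tolerance=5.0))
print(flag(predict_motor_temp, 60.0, 64.0, actual=90.0, tolerance=5.0))
```

Swapping `predict_motor_temp` for a share-price or claim-amount model leaves `flag` unchanged, which is the abstraction the slide points at.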

  24. Interns for R&D • Autonomy: R&D can be insulated from the production systems. Low risk to the business. Here’s a dataset, install [ your favourite toolset ] and go get it, tiger! • This usually produces a proof-of-concept • An intern can clear the fog on that high risk/high value project. You can make a sound decision on whether to proceed, without having used any precious permanent employee time • With the right intern: the newer the tech, the greater the challenge… the more they engage! • Co-supervision with an academic will inject a lot of their knowledge into your project. This is often a better solution than directly engaging in a research project with academics • You can hire the outstanding ones, risk free!

  25. Picking projects business value vs. geeky indulgence

  26. A tale of two companies merging P&H: • Mainly sells primary digging equipment • A mine owns 1-5 of them, no redundancy • Very expensive, “top of the pyramid” • Analytics strategy focuses on fault prediction & uptime maximisation: keep them running 24/7 Komatsu: • Mainly sells dump trucks • A mine owns 50-200 + spare units • Less expensive, a small loss is not mission-critical • Analytics strategy focuses on compliance with scheduled maintenance, part sales, operator abuse

  27. The “no free lunch” of analytics • Leaking air hose: recurrent, low impact, easy: supervised • Gearbox failure: rare, extremely high impact, hard: unsupervised

  28. TensorFlow to the rescue! • Need generic time-series pattern recognition • Wary of the deep-learning hype: the “hot topic” of 2016… at the peak of Gartner’s “hype curve” • Is it just for images? Overkill? • A summer intern ran the project with great success (accurate and generalises) • CNN + LSTM is our standard approach to detect failure patterns in automated systems Interested in the details? Data Science Sydney Meetup - Tue 28 May
