SLIDE 1

Is this normal?

Finding anomalies in real-time data.

SLIDE 2

Who am I?

I’m Theo (@postwait on Twitter). I write a lot of code: 50+ open source projects and several commercial code bases. I wrote “Scalable Internet Architectures.” I sit on the ACM Queue and Professions boards. I spend all day looking at telemetry data at Circonus.

SLIDE 3

What is real-time?

Hard real-time systems are those where the outputs of a system based on specific inputs are considered incorrect if the latency of their delivery is above a specified amount. Soft real-time systems are similar, but “less useful” instead of “incorrect.” I don’t design life support systems, avionics, or other systems where lives are at stake, so it’s a soft real-time life for me.

SLIDE 4

A survey of big data systems.

Traditional: Oracle, Postgres, MySQL, Teradata, Vertica, Netezza, Greenplum, Tableau, K

The shiny: Hadoop, Hive, HBase, Pig, Cassandra

The real-time: SQLstream, S4, Flumebase, Truviso, Esper, Storm

SLIDE 5

Big data the old way

Relational databases, both column store and not, just work. They likely store more data than your “big data.”

SLIDE 6

Big data the distributed way

Distributed systems allow much larger data sets, but markedly change the data analytics methods. They are hard for existing quants to roll up their sleeves on, but are highly scalable and accommodate growth.

SLIDE 7

Big data the real-time way

What we do needs a different approach: the old (and even the distributed) systems are not designed for soft real-time, complex observation of data.

Notable exceptions are S4 and Storm.

SLIDE 8

So, what’s your problem?

We have telemetry...

  • over 10 trillion data points on near-line storage
  • growing super-linearly

SLIDE 9

Data, what kind?

Most data is numeric: counts, averages, derivatives, stddevs, etc.

Some data is: text changes (ssh fingerprints, production launches), histograms, highly dimensional event streams.

SLIDE 10

Data rates.

Quantity of data isn’t such a big deal (okay, yes it is, but we’ll get to that later).

The rate of new data arrival makes the problem hard.

  • low end: 15k datum / second
  • high end: 300k datum / second
  • growing rapidly

SLIDE 11

What we use.

We use Esper. Esper is very powerful, elegantly coded, and performance focused. Like any good tool that allows users to write queries...

http://www.flickr.com/photos/mcertou/

SLIDE 12

What we do with Esper

Detect absence in streams:

select b from pattern [every a=Event -> (timer:interval(30 sec) and not b=Event(id=a.id, metric=a.metric))]

Detect ad-hoc threshold violation:

select * from Event(id="host1", metric="disk1") where value > 95

  • etc. etc. etc. [1]
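As a rough illustration only (Python rather than Esper EPL, with hypothetical event dicts standing in for Esper's event objects), the ad-hoc threshold query above amounts to filtering a stream:

```python
# Hypothetical event stream; field names mirror the EPL examples above.
events = [
    {"id": "host1", "metric": "disk1", "value": 97.2},
    {"id": "host1", "metric": "disk1", "value": 42.0},
    {"id": "host2", "metric": "disk1", "value": 96.1},
]

def threshold_violations(stream, id_, metric, limit):
    """Yield events for one (id, metric) series whose value exceeds limit."""
    for ev in stream:
        if ev["id"] == id_ and ev["metric"] == metric and ev["value"] > limit:
            yield ev

hits = list(threshold_violations(events, "host1", "disk1", 95))
print([e["value"] for e in hits])  # prints [97.2]
```

Esper evaluates this continuously and incrementally over the live stream, which is what makes it usable at these data rates; the sketch above only shows the predicate.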
SLIDE 13

Making the problem harder.

So, it just wasn’t enough. We want to do long-term trending and apply that information to anomaly detection. Think: Holt-Winters (or multivariate regressions). Look at historic data; use it to predict the immediate future with some quantifiable confidence.

SLIDE 14

How we do it.

We implemented the Snowth for storage of data. [2] We implemented a C/lua distributed system to analyze 4 weeks of data (~8k statistical aggregates), yielding a prediction with confidences (triple exponential smoothing). [3] To keep the system real-time, we need to ensure that queries return in less than 2ms (our goal is 100µs).
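The slides reference triple exponential smoothing [3] but not the C/lua implementation itself; as a minimal sketch of the additive Holt-Winters recurrence (with made-up smoothing parameters and toy data, and no confidence bands), the core looks like:

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, n_pred):
    """Minimal additive triple exponential smoothing (Holt-Winters).
    Returns point forecasts for the next n_pred values."""
    # Initialize level, trend, and seasonal components from the first seasons.
    level = sum(series[:season_len]) / season_len
    trend = (sum(series[season_len:2 * season_len])
             - sum(series[:season_len])) / season_len ** 2
    seasonals = [series[i] - level for i in range(season_len)]

    for i, x in enumerate(series):
        s = seasonals[i % season_len]
        last_level = level
        level = alpha * (x - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonals[i % season_len] = gamma * (x - level) + (1 - gamma) * s

    return [level + (h + 1) * trend + seasonals[(len(series) + h) % season_len]
            for h in range(n_pred)]

# A perfectly seasonal series: forecasts should continue the pattern.
data = [10, 20, 30, 40] * 6          # season length 4, no trend
fcst = holt_winters_additive(data, 4, 0.5, 0.1, 0.3, 4)
print([round(f) for f in fcst])      # prints [10, 20, 30, 40]
```

The production system additionally has to produce confidences and run this over thousands of windows per metric, which is where the caching tricks on the next slides come in.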

SLIDE 15

Cheating is winning.

Our predictions work on 5-minute windows; 4 weeks of data is 8064 windows.

Given Pred(T-8063 .. T0) -> (P1, C1)
Given Pred(T-8062 .. T0, P1) -> ~(P2, C2)
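The window count is simple arithmetic over the stated parameters:

```python
# 4 weeks of 5-minute prediction windows: the 8064 figure above.
WINDOW_SEC = 5 * 60
windows = 4 * 7 * 24 * 3600 // WINDOW_SEC
print(windows)  # prints 8064
```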

SLIDE 16

Tolerably inaccurate.

When V arrives, we determine the prediction window WN we need. If WN isn’t in cache, we assume V is within tolerances. If WN+1 isn’t in cache, we query the Snowth for WN and WN+1, placing them in cache. Cache accesses are local and always < 100µs.

SLIDE 17

I see challenges

How do I take offline data analytics techniques and apply them online to high-volume, low-latency event streams, quickly and without deep expertise?

SLIDE 18

Thank you.

Circonus is hiring: software engineers, quants, and visualization engineers.

[1] http://esper.codehaus.org/tutorials/solution_patterns/solution_patterns.html
[2] http://omniti.com/surge/2011/speakers/theo-schlossnagle
[3] http://labs.omniti.com/people/jesus/papers/holtwinters.pdf