Learn more from your logfiles using machine learning
[DEV1156] Adam.Spiers@suse.com Dirk.Mueller@suse.com
CC BY-NC 2.0 Thomas Hawk
We are SUSE OpenStack Cloud software engineers
warning /(?i)warning/
error /Traceback \(most recent call last\)/
error /(?i)error/
error /(?i)\bfail(ure|ed)?\b/
error /(?i)fatal/
error /$h1!!/
# Successful tempest run
# rpms containing "Error"
# https://bugzilla.suse.com/show_bug.cgi?id=1030822
warning /Cleaning up (vip-admin-\S+) on \S+, removing fail-count-\1/
# https://bugzilla.suse.com/show_bug.cgi?id=971832
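The hand-written rules above could be applied to log lines roughly as follows. This is only a sketch: the patterns are copied from the rule list, but the `classify` helper and the "first matching rule wins, specific exceptions first" ordering are assumptions, not the behaviour of the original tool. The `/$h1!!/` rule is left out because its intent is unclear.

```python
import re

# (severity, pattern) pairs taken from the rule list above.
# Assumption: rules are tried in order and the first match wins, so the
# specific exception is listed before the generic patterns.
RULES = [
    ("warning", r"Cleaning up (vip-admin-\S+) on \S+, removing fail-count-\1"),
    ("error", r"Traceback \(most recent call last\)"),
    ("error", r"(?i)\bfail(ure|ed)?\b"),
    ("error", r"(?i)fatal"),
    ("error", r"(?i)error"),
    ("warning", r"(?i)warning"),
]
COMPILED = [(severity, re.compile(pattern)) for severity, pattern in RULES]


def classify(line):
    """Return the severity of the first matching rule, or None if no rule matches."""
    for severity, pattern in COMPILED:
        if pattern.search(line):
            return severity
    return None
```

Every new false positive needs another exception rule like the `vip-admin` one, which is the maintenance burden that motivates a learned approach.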
Figure: word cloud of machine learning terms such as algorithm, supervised, unsupervised, reinforcement, classification, regression, clustering, neural networks, anomaly detection.
Classification: Naive Bayes, Nearest Neighbor, Support Vector Machines (SVM), Neural Networks, ...
Regression: Decision Trees, Linear Regression, Neural Networks, ...
Clustering: K-Means, Hidden Markov Model, Neural Networks, ...
Mar 11 02:43:28 localhost sudo[5195]: pam_unix(sudo:session): session opened for user root by (uid=5)
DATE localhost sudo pam_unix sudo session session opened for user root uid
hash(DATE) hash(localhost) hash(sudo) hash(pam_unix) hash(sudo) hash(session) hash(opened) ...
[0, ..., 0, 1, 0, ..., 0, 1, 0, ...]
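The steps above (normalise the line, split it into tokens, hash each token into a fixed-size vector) can be sketched in plain Python. Everything here is illustrative: the date regex, the "drop tokens shorter than three characters" filter, the vector size `N_FEATURES` and the use of `zlib.crc32` as the hash are assumptions chosen to reproduce the token list on the slide, not the actual logreduce code.

```python
import re
import zlib

N_FEATURES = 2 ** 10  # hypothetical vector size

# Syslog-style timestamp such as "Mar 11 02:43:28" at the start of a line.
DATE_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}")


def tokenize(line):
    """Replace the timestamp with DATE and keep only alphabetic words."""
    line = DATE_RE.sub("DATE", line)
    # Numbers (PIDs, uids) and punctuation are dropped; so are very short words.
    return [tok for tok in re.findall(r"[A-Za-z_]+", line) if len(tok) >= 3]


def vectorize(line):
    """Hash every token into one slot of a fixed-size binary vector."""
    vector = [0] * N_FEATURES
    for token in tokenize(line):
        vector[zlib.crc32(token.encode("utf-8")) % N_FEATURES] = 1
    return vector


line = ("Mar 11 02:43:28 localhost sudo[5195]: "
        "pam_unix(sudo:session): session opened for user root by (uid=5)")
print(" ".join(tokenize(line)))
```

Because timestamps and numbers are removed before hashing, two lines that differ only in PID or time map to the same vector, which is what lets a model treat them as the same event.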
DATE RNGU RNGI RNGN RNGD
scikit
$ zypper install python3-logreduce
$ pip3 install --user logreduce
# logreduce ...
    diff         Compare directories/files
    dir          Train and run against local files/dirs
    dir-train    Build a model for local files/dirs
    dir-run      Run a model against local files/dirs
    ...
    job          Train and run against CI logs
    ...
    journal      Train and run against local journald
# logreduce diff logs/good.txt logs/bad.txt
...
0.527 | bad.txt:34245: 2018-10-09 05:56:51.021261 | controller |\
  Details: {u'created': u'2018-10-09T05:11:20Z', u'code': 500,\
  u'message': u'Exceeded maximum number of retries. Exhausted \
  all hosts available for retrying build failures for instance d7046aa3-e885-4ed6-80e7-d7a7eff9f883.'}
...
97.98% reduction (from 35244 lines to 712)
Truncated singular value decomposition (SVD)
$ logreduce dir-train model.clf baseline/*
Training on 8 logs took 12.090s at 1.426MB/s (20.831kl/s)
$ logreduce dir-run model.clf error.txt
Testing took 6.375s at 0.454MB/s (6.569kl/s)
99.72% reduction (from 41879 lines to 118)
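The train/run split shown above can be illustrated with a toy nearest-neighbour model: "training" remembers the token sets seen in known-good logs, and "running" reports each target line whose distance to its closest baseline line exceeds a threshold. This is a simplified sketch under stated assumptions, not logreduce's implementation: the `train` and `run` helpers, the cosine distance on token sets and the `threshold` value are all hypothetical.

```python
import math
import re


def tokens(line):
    """Reduce a log line to its set of alphabetic words (numbers are dropped)."""
    return frozenset(t for t in re.findall(r"[A-Za-z_]+", line) if len(t) >= 3)


def cosine_distance(a, b):
    """Cosine distance between two token sets viewed as binary vectors."""
    if not a or not b:
        return 0.0 if a == b else 1.0
    return 1.0 - len(a & b) / math.sqrt(len(a) * len(b))


def train(baseline_lines):
    """'Training' here just stores the distinct token sets of the baseline."""
    return {tokens(line) for line in baseline_lines}


def run(model, target_lines, threshold=0.2):
    """Return (distance, line number, line) for lines far from every baseline line."""
    anomalies = []
    for number, line in enumerate(target_lines, start=1):
        toks = tokens(line)
        distance = min((cosine_distance(toks, known) for known in model), default=1.0)
        if distance > threshold:
            anomalies.append((round(distance, 3), number, line))
    return anomalies


baseline = [
    "Mar 11 02:43:28 host sudo[1]: session opened for user root",
    "Mar 11 02:44:00 host cron[2]: job started",
]
target = [
    "Mar 12 09:00:00 host sudo[77]: session opened for user root",
    "Mar 12 09:01:00 host kernel: Out of memory: killed process",
]
for distance, number, line in run(train(baseline), target):
    print(f"{distance:.3f} | {number}: {line}")
```

Only the second target line is reported, since the first differs from the baseline just in its date and PID. The leading number plays the same role as the score column in the logreduce output: the larger it is, the less the line resembles anything in the baseline.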
# logreduce journal --range day
# logreduce journal-train --range month journal.clf
# logreduce journal-run --range day journal.clf
...
99.76% reduction (from 19677 lines to 48)
0.730 | cron - postdrop: warning: uid=16311: File too large
...
0.000 | smartd Device: /dev/sdb, 1 Offline uncorrectable sectors
# killall -SEGV automount
# logreduce journal-run --range day journal.clf
99.75% reduction (from 19679 lines to 50)
...
0.317 | systemd - DAEMON - autofs.service: Main process exited, code=dumped, s
0.314 | systemd - DAEMON - autofs.service: Failed with result 'core-dump'.
# logreduce dir-train nova.clf /var/log/nova/nova-compute.log-*xz
# logreduce dir-run nova.clf /var/log/nova/nova-compute.log
...
0.684 | INFO .. No calling threads waiting for msg_id : d3afd41a53bb4d14a5e42d
0.619 | INFO .. Recovered from being unable to report status
...
93.15% reduction (from 6741 lines to 462)
# logreduce diff report-good/ report-bad/ --html report.html
Training took 51.364s at 1.543MB/s
Testing took 37.432s at 0.446MB/s
...
88.41% reduction (from 261091 lines to 30251)
name: base
post-run:
command: log-classify job-train ...
command: log-classify job-run ...
zuul_return: report.html