

slide-1
SLIDE 1

Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Our target: CFS What can we do ? Results and analysis Conclusion

Machine Learning applied to Process Scheduling

Benoit Zanotti

jicks@lse.epita.fr http://www.lse.epita.fr

July 17, 2013

slide-2
SLIDE 2

Plan

1. Introduction and definitions (Machine Learning, Process Scheduling)

slide-4
SLIDE 4

Definition of Machine Learning

Definition: Machine Learning is a field of Computer Science concerned with the construction and study of systems that can learn from data.

Usual categories of ML algorithms:
  • Supervised learning (classification, ...)
  • Unsupervised learning (clustering, ...)
  • Semi-supervised learning
  • ...

slide-5
SLIDE 5

Notes about Machine Learning

We will not really cover the theory. But:
  • Preprocessing is very important.
  • There is usually a big tradeoff between speed and quality of results.
  • In Process Scheduling, both factors will be limiting.

slide-7
SLIDE 7

What is Process Scheduling ?

Definition: Process Scheduling is the method by which processes are given access to processor time. It is used to achieve multitasking.

There are many well-known scheduling algorithms. For example:
  • First In, First Out
  • Round-Robin (fixed time unit, processes in a circle)
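To make the Round-Robin idea concrete, here is a minimal sketch in Python. The function name `round_robin`, the `bursts` dictionary and the returned timeline are illustrative assumptions, not something taken from the talk; the sketch also assumes a strictly positive `quantum`.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Return the list of (name, time_run) slices in execution order.

    bursts: dict mapping a process name to the CPU time it still needs.
    quantum: the fixed time unit granted on each turn (assumed > 0).
    """
    queue = deque(bursts.items())      # processes arranged "in a circle"
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)  # run for at most one quantum
        timeline.append((name, ran))
        if remaining > ran:            # not finished: go back to the end of the queue
            queue.append((name, remaining - ran))
    return timeline
```

For example, `round_robin({"a": 5, "b": 2}, 2)` alternates between the two processes until `b` finishes, then lets `a` use up its remaining time.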

slide-8
SLIDE 8

Main concerns

A scheduler is mainly judged on 3 metrics: throughput, latency and fairness. In practice, we can simplify them to:
  • Speed (how much time the scheduler itself uses, number of context switches, ...)
  • Fairness (giving equal CPU time to each process)
  • Reactivity (are interactive processes given any advantage?)

A scheduler is complicated. Let's optimize one using ML!

slide-9
SLIDE 9

Plan

2. Our target: CFS (Inner workings, Advantages/Drawbacks)

slide-11
SLIDE 11

Inner workings of CFS

  • Stands for Completely Fair Scheduler
  • Scheduler of Linux since 2.6.23
  • At its core, just a red-black tree (RB-tree) whose elements are indexed by the runtime of each process.
  • Straightforward algorithm: just take the minimum of the tree.
  • CFS in the Linux kernel is actually more complicated (handling real-time tasks, nice values, ...)
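The "take the minimum" rule can be sketched in a few lines of Python. This is a toy model under stated assumptions: a binary heap from `heapq` stands in for the red-black tree, every task runs for the same fixed `slice_len`, and nice values and real-time tasks are ignored. The name `cfs_order` and its parameters are hypothetical, and at least one task is assumed to be present.

```python
import heapq
import itertools

def cfs_order(initial_runtimes, slice_len, steps):
    """Return the names of the tasks picked over `steps` scheduling decisions."""
    counter = itertools.count()  # tie-breaker: equal runtimes keep insertion order
    heap = [(runtime, next(counter), name)
            for name, runtime in initial_runtimes.items()]
    heapq.heapify(heap)
    picked = []
    for _ in range(steps):
        # The task with the smallest accumulated runtime runs next.
        runtime, _seq, name = heapq.heappop(heap)
        picked.append(name)
        # Account for the time it just used and put it back in the structure.
        heapq.heappush(heap, (runtime + slice_len, next(counter), name))
    return picked
```

With `cfs_order({"a": 0, "b": 0, "c": 25}, 10, 7)`, tasks `a` and `b` alternate until their runtime catches up with `c`, which is only picked once it holds the minimum.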

slide-12
SLIDE 12

Why CFS ?

  • Quite simple and works really well
  • The scheduler I am most familiar with (I implemented one in mikro)
  • Already efficient: I wanted to see what ML could do.

slide-14
SLIDE 14

Advantages/Drawbacks

  ✓ Very simple to understand
  ✓ Works really well in general cases
  ✓ No real corner cases
  ✗ A little light on the handling of interactive processes.

slide-15
SLIDE 15

Plan

3. What can we do? (ML considerations, Applying ML to the CFS)

slide-17
SLIDE 17

ML considerations

  • Restricted to supervised learning (mainly classification and regression)
  • The scheduler must be as fast as possible, and so must its ML components.
  • Avoiding complex code in the kernel is often a good idea.

→ precomputed model/profile for each process
→ no complex methods, so results will be mixed

slide-19
SLIDE 19

Applying ML to the CFS

Objective: reducing the number of context switches:
  • Ideally, a process's time quantum should not expire; the process should give up the CPU by itself (process going to sleep)
  • An estimate of the next quantum would help
  • Based on the last N quanta
  • Be careful not to be too unfair

Note: many other objectives were possible...

slide-20
SLIDE 20

Actual implementation

  • Proof of concept
  • Two variants: one using Taylor's theorem and one using a classifier
  • Need to extract real runtime quanta and to create profiles

slide-21
SLIDE 21

Taylor’s theorem

The sequence of quanta can be seen as a function f of time.

Taylor's theorem gives an approximation of a function around a point, given its derivatives. Discrete differentiation is only subtraction.

→ an approximation of the next quantum is:

$$f(x + 1) \approx f(x) + f'(x - 1) + \frac{f''(x - 1)}{2}$$

This method is simple and fast, but not very precise.
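A small Python sketch of this kind of extrapolation is given below. How the discrete derivatives are indexed is an assumption here: both are taken as backward differences over the three most recent quanta, and the result is clamped at zero. The function name `predict_next_quantum` is hypothetical and this is not the code used in the talk.

```python
def predict_next_quantum(history):
    """Extrapolate the next quantum from the last three observed quanta."""
    if len(history) < 3:
        raise ValueError("need at least three quanta")
    q2, q1, q0 = history[-3:]          # oldest to most recent
    first = q0 - q1                    # discrete first derivative: one subtraction
    second = (q0 - q1) - (q1 - q2)     # discrete second derivative
    estimate = q0 + first + second / 2
    return max(estimate, 0)            # a quantum cannot be negative
```

On a steadily growing sequence such as `[2, 4, 6]` the second-order term vanishes and the estimate is simply the linear continuation, 8.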

slide-22
SLIDE 22

Classifier

Naive Bayes classifier using the last 4 quanta:
  • It is the best compromise found between speed and results
  • Parameters and output are ranges of time, not the actual values
  • Based on Bayes' theorem; outputs the most probable label
  • Only 4 multiplications are needed for each label (there are 10 of them)
  • Using bit manipulation, we can avoid any conditionals

→ it is fast, but clearly not the most accurate
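The following Python sketch illustrates a Naive Bayes predictor of this shape: 10 labels, the last 4 quanta as inputs, and time ranges instead of raw values. Everything else is an assumption made for illustration: the fixed bin width `BIN_WIDTH_US`, the add-one smoothing, the function names, and the use of floating point (a kernel version would more likely rely on precomputed integer tables). Quanta are assumed to be non-negative, and `predict` expects exactly 4 inputs.

```python
N_LABELS = 10        # the slide mentions 10 labels
HISTORY = 4          # ... and the last 4 quanta as inputs
BIN_WIDTH_US = 1000  # hypothetical width of one time range, in microseconds

def to_bin(quantum_us):
    """Map a quantum length to the index of its time range."""
    return min(int(quantum_us // BIN_WIDTH_US), N_LABELS - 1)

def train(quanta):
    """Build a profile (count tables with add-one smoothing) from a trace."""
    bins = [to_bin(q) for q in quanta]
    prior = [1] * N_LABELS
    # cond[label][position][value]: how often `value` was seen at `position`
    # in the 4-quanta window that preceded `label`.
    cond = [[[1] * N_LABELS for _ in range(HISTORY)] for _ in range(N_LABELS)]
    for i in range(HISTORY, len(bins)):
        label = bins[i]
        prior[label] += 1
        for pos in range(HISTORY):
            cond[label][pos][bins[i - HISTORY + pos]] += 1
    return prior, cond

def predict(model, last_quanta):
    """Return the most probable time range for the next quantum."""
    prior, cond = model
    total = sum(prior)
    best_label, best_score = 0, -1.0
    for label in range(N_LABELS):
        score = prior[label] / total
        for pos, quantum in enumerate(last_quanta):
            row = cond[label][pos]
            score *= row[to_bin(quantum)] / sum(row)  # one multiplication per input
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Trained on a trace that alternates between a short and a long quantum, for instance `train([500, 3500] * 20)`, the predictor picks the short range after a window ending with a long quantum and the long range otherwise.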

slide-23
SLIDE 23

Plan

4. Results and analysis (perf and Linsched, Methodology and results, Analysis)

slide-25
SLIDE 25

perf

  • Performance analysis tools for Linux
  • Based on kernel-based performance counters
  • Can be used to extract many scheduling stats

slide-26
SLIDE 26

Linsched

Linux Scheduler Simulator (in userland...)
  ✓ Easy to use (cycle of development, debugging, ...) and fast
  ✓ Can replay records from perf
  ✗ Hard to quantify how much time is used by the scheduler

slide-28
SLIDE 28

Methodology of the tests

  • Use perf to extract records and datasets
  • Use WEKA to compute profiles for each process
  • Test using vanilla/modified Linsched to see the gain
  • Time the tests of vanilla/modified Linsched to estimate how costly each method is

slide-29
SLIDE 29

Results

Results of the simulation (without scheduler time), time used (base = 100):

  Vanilla        100
  Extrapolation   98
  Classifier      95

slide-30
SLIDE 30

Results

Results of the simulation (with scheduler time), time used (base = 100):

  Vanilla        100
  Extrapolation  102
  Classifier      98

slide-32
SLIDE 32

Analysis

  • CFS is already quite good
  • ML results are positive but very limited
  • More complex preprocessing/ML techniques would yield better results... but at what cost?

slide-33
SLIDE 33

Plan

5. Conclusion

slide-34
SLIDE 34

Conclusion

  • This was only one idea for one objective.
  • Using ML in scheduling is hard, because of the speed/results tradeoff
  • Real kernel integration raises difficulties (passing the models, limiting abuses, ...)
  • Basic rule in scheduling: "Simpler is Better"
  • Another idea: run a (kernel?) process every X hours to compute new profiles...

  • K. Kumar Pusukuri, A. Negi, Applying machine learning techniques to improve Linux process scheduling, Dec. 2005.
slide-35
SLIDE 35

Questions ?
