
A practical approach of different programming techniques to - PowerPoint PPT Presentation


  1. A practical approach of different programming techniques to implement a real-time application using Django. Dipl.-Math. Sebastian Stigler (sebastian.stigler@hs-aalen.de), Marina Burdack, MSc (marina.burdack@hs-aalen.de), Aalen University of Applied Sciences, Germany

  2. Motivation Dipl.-Math. Stigler, Burdack, MSc 2/18

  3. Aims for the Django Application: How far do we get with a Python-only approach? A tool to configure and run a DA / ML pipeline: data source, preprocessing tasks, machine learning tasks, presentation of the result.

  4. The Focus of this Paper: The preprocessing part of the pipeline. How does our application scale? What are the knobs we can use to scale the application?

  5. Preprocessing App (in German). Source: own graphic.

  6. Methodology

  7. Types of Scaling: single-threaded ✗; multithreaded [4, 8] ✗; multiprocessing ✓; distributed task queue ✓.

  8. Multiprocessing. Multiprocessing workflow (source: own graphic). The multiprocessing pool is realized with the ProcessPoolExecutor class of the concurrent.futures module [5] from Python 3.7's standard library.
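A minimal sketch of such a pool, assuming each message is a list of rows and each row a list of values. The helper names `fillna_zero` and `run_pool` and the sample data are illustrative assumptions, not taken from the slides; only `ProcessPoolExecutor` and its `map` method come from the standard library.

```python
from concurrent.futures import ProcessPoolExecutor


def fillna_zero(message):
    """Replace missing values (None) with 0 in every row of one message."""
    return [[0 if value is None else value for value in row] for row in message]


def run_pool(messages, max_workers=2):
    """Process each message in a worker process; results keep the input order."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fillna_zero, messages))


if __name__ == "__main__":
    # Hypothetical sample data: two messages with one and two rows.
    messages = [[[1.0, None]], [[None, 2.5], [3.0, 4.0]]]
    print(run_pool(messages))
```

The worker function lives at module level so it can be pickled and sent to the worker processes, and the `__main__` guard keeps the pool from being started again when a worker imports the module.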

  9. Task Queue. Celery workflow (source: own graphic). The task queue is realized with Celery 4.3 [7] and Redis [6].

  10. Structure of a Chained Task. Processing of a chained task (source: own graphic).
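The idea of a chained task can be illustrated without a broker: each subtask receives the output of the previous one. The sketch below is a plain-Python stand-in, not the Celery code from the talk (Celery offers a `chain` primitive for this kind of composition). The task names follow the test-run slide, while `task_result`, `run_chain`, and all function bodies are assumptions made for illustration.

```python
from functools import reduce
from math import sqrt


def task_prepare(message):
    """Assumed prepare step: copy the rows so later steps can modify them."""
    return [list(row) for row in message]


def task_fillna_zero(rows):
    """Replace missing values (None) with 0.0."""
    return [[0.0 if value is None else value for value in row] for row in rows]


def task_normalizer(rows):
    """Scale every row to unit Euclidean length; all-zero rows stay unchanged."""
    normalized = []
    for row in rows:
        norm = sqrt(sum(value * value for value in row))
        normalized.append([value / norm for value in row] if norm else row)
    return normalized


def task_result(rows):
    """Assumed result step: wrap the processed rows for presentation."""
    return {"rows": rows, "count": len(rows)}


def run_chain(message, tasks):
    """Feed the output of each task into the next one, left to right."""
    return reduce(lambda data, task: task(data), tasks, message)


pipeline = [task_prepare, task_fillna_zero, task_normalizer, task_result]
result = run_chain([[3.0, 4.0], [None, None]], pipeline)
print(result)
```

In a real task queue each step would run on a worker and the intermediate results would travel through the broker, which is where the per-message overhead comes from.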

  11. The Math: Queueing Theory [1]. A queue with $c$ servers is stable (won't grow without bound) if the following condition holds: $\rho = \frac{\lambda}{c\,\mu} < 1$ (1), where $\rho$ is the server utilization, $\lambda$ is the arrival rate and $\mu$ is the service rate (the inverse of the service time) for one task.
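The stability condition can be checked numerically. The following sketch uses made-up rates (100 messages per second arriving, 25 messages per second served by one worker); the function names are assumptions chosen for this example.

```python
import math


def utilization(arrival_rate, service_rate, servers):
    """Server utilization rho = lambda / (c * mu)."""
    return arrival_rate / (servers * service_rate)


def is_stable(arrival_rate, service_rate, servers):
    """True if the queue does not grow without bound (rho < 1)."""
    return utilization(arrival_rate, service_rate, servers) < 1


def min_servers(arrival_rate, service_rate):
    """Smallest integer c that satisfies lambda / (c * mu) < 1."""
    return math.floor(arrival_rate / service_rate) + 1


# Hypothetical rates, not measurements from the talk.
arrival_rate = 100.0  # messages per second
service_rate = 25.0   # messages per second and worker

print(min_servers(arrival_rate, service_rate))
print(is_stable(arrival_rate, service_rate, servers=4))
print(is_stable(arrival_rate, service_rate, servers=5))
```

With exactly four workers the utilization is 1, so the strict inequality fails and a fifth worker is needed.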

  12. Evaluation

  13. Test Data: 750,000 measurements (rows) from a Davis weather station, with 33 values per row in total, 26 of them numerical. 75 to 750,000 messages (msg) are the output of the buffer, with a rows/msg rate from 10,000 down to 1. 16 subtasks: the Prepare and Result tasks, 6 tasks which directly use methods from Pandas [3], and 8 tasks which use preprocessing methods from scikit-learn [2].
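The relation between the rows/msg rate and the number of messages can be reproduced with a small buffer sketch. `buffer_rows` and `count_messages` are hypothetical helpers, and the rows are stand-in integers rather than the weather-station data.

```python
from itertools import islice


def buffer_rows(rows, rows_per_msg):
    """Group an iterable of rows into messages of at most rows_per_msg rows."""
    if rows_per_msg < 1:
        raise ValueError("rows_per_msg must be at least 1")
    iterator = iter(rows)
    while True:
        message = list(islice(iterator, rows_per_msg))
        if not message:
            return
        yield message


def count_messages(total_rows, rows_per_msg):
    """Number of messages the buffer emits for total_rows incoming rows."""
    return sum(1 for _ in buffer_rows(range(total_rows), rows_per_msg))


print(count_messages(750_000, 10_000))  # largest buffer size in the test data
print(count_messages(750_000, 1))       # no buffering: one row per message
```

The two calls correspond to the ends of the range on the slide: 75 messages at 10,000 rows/msg and 750,000 messages at 1 row/msg.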

  14. Test Runs. [Figure: two rows of log-scale plots, "Mean Servicetime per Message" and "Mean Servicetime per Row", each plotted against rows/msg (10^0 to 10^4) in three panels: task_prepare (GEN), task_fillna_zero (PAN) and task_normalizer (SKN). Legend entries: task queue, multiprocessing, groundtruth, saturation.] Source: own graphic.

  15. Conclusion

  16. Python libraries are sophisticated enough for scaling real-time applications. Buffering incoming data rows can compensate for the overhead of task queues. $\frac{\lambda}{\mu} < c$ determines the scaling of the application. All results are applicable to the machine learning process too.
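One way to see how buffering and the scaling condition fit together is a short derivation. It assumes a linear cost model that is not stated on the slides: a message with $b$ rows takes $s_0 + b\,s_r$ seconds, where $s_0$ is a fixed per-message overhead and $s_r$ the cost per row. The symbols $b$, $s_0$, $s_r$, $\lambda_{\text{row}}$, $\lambda_{\text{msg}}$ and $\mu_{\text{msg}}$ are introduced here for illustration only.

```latex
% Assumption (not from the slides): service time per message is s_0 + b s_r.
\begin{align}
  \rho = \frac{\lambda}{c\,\mu} < 1
    &\iff \frac{\lambda}{\mu} < c, \\
  \lambda_{\text{msg}} = \frac{\lambda_{\text{row}}}{b}, \quad
  \mu_{\text{msg}} = \frac{1}{s_0 + b\,s_r}
    &\implies \frac{\lambda_{\text{msg}}}{\mu_{\text{msg}}}
      = \lambda_{\text{row}} \left( \frac{s_0}{b} + s_r \right) < c .
\end{align}
```

Under this assumed model, a larger buffer size $b$ shrinks the overhead term $s_0 / b$, so fewer workers $c$ are needed for the same row arrival rate.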

  17. Thank you for your attention! This was "A practical approach of different programming techniques to implement a real-time application using Django" by Dipl.-Math. Sebastian Stigler (sebastian.stigler@hs-aalen.de) and Marina Burdack, MSc (marina.burdack@hs-aalen.de).

  18. References
      [1] U. Narayan Bhat. An Introduction to Queueing Theory: Modelling and Analysis in Applications. Birkhäuser Basel, 2015. doi: 10.1007/978-0-8176-8421-1.
      [2] David Cournapeau and contributors. scikit-learn. url: https://scikit-learn.org.
      [3] Wes McKinney et al. Pandas: Python Data Analysis Library. url: https://pandas.pydata.org/.
      [4] Python Software Foundation. Thread State and the Global Interpreter Lock. url: https://docs.python.org/3/c-api/init.html#thread-state-and-the-global-interpreter-lock.
      [5] Brian Quinlan. concurrent.futures - Launching parallel tasks. url: https://docs.python.org/3/library/concurrent.futures.html.
      [6] Salvatore Sanfilippo and contributors. Redis. url: https://redis.io.
      [7] Ask Solem and contributors. Celery: Distributed Task Queue. url: www.celeryproject.org.
      [8] Thomas Wouters. GlobalInterpreterLock. url: https://wiki.python.org/moin/GlobalInterpreterLock.
