 
Failure Prediction for Decision Makers in Data Centers Using Data Mining
Group ID: 39WDIT
Team Members
• IT 11 6002 44  D.G.S.M. Wijayarathne
• IT 11 6049 90  W.K.S.D. Fernando
• IT 11 6005 58  A.S.M.S. Sharfaan
• IT 11 6073 42  J.S.D. Fernando
• IT 11 6104 58  M.P.L. Mendis

Internal Supervisor: Mr. Dilhan Manawadu
Topics to be covered
1. Introduction
2. Overall Description
3. Specific Requirements
4. References
5. Appendices
Introduction
Overview
• The development team will implement a system called “WinSeer” to predict data center failures.
• Why is it necessary to predict data center failures?
• The system targets decision makers.
Objectives
• Select the best data mining algorithm.
• Develop a data mining model.
• Predict data center failures.
• Notify decision makers about predicted failures.
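One way to approach the first objective is to score each candidate algorithm on held-out data and keep the best scorer. The sketch below is only an illustration in plain Python, not the project's actual method: the two toy classifiers, the temperature feature, and the sample records are all hypothetical.

```python
from statistics import mean

# Hypothetical monitoring records: (temperature reading, failed-later flag).
RECORDS = [(35, 0), (38, 0), (41, 0), (44, 0), (52, 1), (55, 1),
           (36, 0), (58, 1), (40, 0), (61, 1), (39, 0), (57, 1)]


def train_majority(train):
    """Baseline: always predict the most common label in the training fold."""
    ones = sum(label for _, label in train)
    majority = 1 if ones * 2 > len(train) else 0
    return lambda x: majority


def train_threshold(train):
    """Toy model: predict failure above the midpoint of the two class means."""
    pos = [x for x, label in train if label == 1]
    neg = [x for x, label in train if label == 0]
    if not pos or not neg:  # degenerate fold: fall back to the baseline
        return train_majority(train)
    cut = (mean(pos) + mean(neg)) / 2
    return lambda x: 1 if x > cut else 0


def cross_val_accuracy(trainer, records, k=4):
    """Average accuracy over k folds (fold i holds out every k-th record)."""
    scores = []
    for i in range(k):
        test = records[i::k]
        train = [r for j, r in enumerate(records) if j % k != i]
        model = trainer(train)
        scores.append(mean(1 if model(x) == y else 0 for x, y in test))
    return mean(scores)


def select_best(candidates, records):
    """Return (name, accuracy) of the candidate with the highest CV accuracy."""
    results = {name: cross_val_accuracy(t, records)
               for name, t in candidates.items()}
    best = max(results, key=results.get)
    return best, results[best]


CANDIDATES = {"majority": train_majority, "threshold": train_threshold}
best_name, best_score = select_best(CANDIDATES, RECORDS)
print(best_name, best_score)
```

Accuracy is used here only for simplicity. Because failures are rare events, a real comparison would probably also look at recall and false-alarm rate.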
Software Architecture Diagram
Product perspective
Existing research:
• Research 1: “Prediction of Hard Drive Failures via Rule Discovery from Auto Support Data” by V. Agrawal, C. Bhattacharyya, T. Niranjan, and S. Susarla, September 2009.
• Research 2: “Effective Failure Prediction in Hadoop Clusters” by R. Dudko, A. Sharma, and J. Tedesco.
• Research 3: “A Failure Detection and Prediction Mechanism for Enhancing Dependability of Data Centers” by Q. Guan, Z. Zhang, and S. Fu, October 2012.
• Research 4: “Host Load Prediction in a Google Compute Cloud with a Bayesian Model” by S. Di, D. Kondo, and W. Cirne, 2012.
Product perspective

| Features | “WinSeer” Project | Research 1 | Research 2 | Research 3 | Research 4 |
|---|---|---|---|---|---|
| Prediction factor | Data center server failures. | Hard drive failures. | Failure prediction in Hadoop clusters. | Enhancing dependability of data centers. | Host load prediction. |
| Target audience | Decision makers in organizations. | Data center administrators. | Operators and managers of the cluster. | Data center administrators. | Google users. |
| Business goal | To increase the data centers’ availability. | To avoid loss of data and performance degradation. | Data management and monitoring of large clusters. | Provide high accuracy to the data centers. | To increase the availability of the search engine. |
| Model type | Open source data mining models. | Rule learning algorithms. | Novel approach. | Bayesian and decision tree models. | Based on Bayes model. |
| User interface | Web interface. | Net Application. | Monitoring systems. | Monitoring systems. | Web interface. |
Product functions
• Login
• View predictions
Product functions
• Login
• View predictions
• Update profile
• Add a new user
User characteristics
• Two classes of users:
    • Organizational decision maker
    • System administrator
• Ability to read and understand English.
• Familiarity with the basic graphical user interface (GUI) of a web browser.
• An e-mail account to receive e-mail alerts.
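Since users receive e-mail alerts, the following sketch shows how such an alert might be composed with Python's standard `email.message` module. The sender address, server name, and message wording are hypothetical; the slides do not specify the alert format.

```python
from email.message import EmailMessage


def build_alert(recipient, server_name, days_ahead, probability):
    """Compose (but do not send) a failure-prediction alert e-mail."""
    msg = EmailMessage()
    msg["From"] = "winseer-alerts@example.com"  # hypothetical sender address
    msg["To"] = recipient
    msg["Subject"] = (
        f"WinSeer alert: {server_name} may fail within {days_ahead} days"
    )
    msg.set_content(
        f"Predicted failure for {server_name}.\n"
        f"Estimated probability: {probability:.0%}\n"
        f"Prediction horizon: {days_ahead} days\n"
    )
    return msg


# Hypothetical recipient and prediction values.
alert = build_alert("decision.maker@example.com", "db-server-01", 14, 0.82)
print(alert["Subject"])
```

Delivery is left out on purpose. In practice the message could be handed to `smtplib.SMTP(...).send_message(alert)` once a mail server is configured.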
Assumptions and dependencies
• Server monitoring is performed by the currently available monitoring system; the proposed system performs only pattern recognition and failure prediction.
Distribution of requirements
Milestones:
• Prototype: processing the data set and integrating the tools with ASP.NET.
• Mid review: WinSeer data mining model.
• Finals: complete project, with a focus on predicting failures at least two weeks in advance.
Task assignments:
• Process the XML data set: Shamini, Premeshini
• Integrate the open source data mining tools with ASP.NET: Samith, Sameera
• Research the mining model algorithms: Saumy, Sameera, Samith
• Generate reports: Saumy
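The two-week goal can be made concrete when preparing training data: a monitoring record counts as a positive example only if a failure follows at least 14 days later. The sketch below assumes this labelling scheme; the 7-day look-ahead window after the lead time and all timestamps are hypothetical choices, not values from the project.

```python
from datetime import datetime, timedelta

LEAD = timedelta(days=14)   # minimum warning time taken from the stated goal
WINDOW = timedelta(days=7)  # hypothetical look-ahead span after the lead time


def label_records(record_times, failure_times, lead=LEAD, window=WINDOW):
    """Label a record 1 if a failure falls in [t + lead, t + lead + window)."""
    labels = []
    for t in record_times:
        start, end = t + lead, t + lead + window
        labels.append(1 if any(start <= f < end for f in failure_times) else 0)
    return labels


# Hypothetical timestamps for a single server.
records = [datetime(2013, 3, 1), datetime(2013, 3, 10), datetime(2013, 3, 25)]
failures = [datetime(2013, 3, 20)]
print(label_records(records, failures))  # prints [1, 0, 0]
```

Only the 1 March record is positive: the failure on 20 March is 19 days away, which leaves more than two weeks of warning. The 10 March record is too close to the failure, and the 25 March record comes after it.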
Specific Requirements
External interface requirements
Detailed user interfaces
• Login interface
• Home page
• Register new user
• Edit profile
• Update and view user details
Software interface integrations
• Weka 3.6
• KNIME 2.6
• RapidMiner 5.3

Communication interface integrations
• An Internet connection is required to serve the web pages and for users to access the web interface.
Classes/Objects
Performance Requirements
Response time
• How fast the system handles individual requests.
• The system should not render the host computer unusable for other purposes.
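A response-time requirement is easier to verify when it is measured. The sketch below times individual requests with `time.perf_counter`; the handler is a hypothetical stand-in for a "view predictions" request and does not reflect the real WinSeer code.

```python
import time
from statistics import median


def measure_response_times(handler, requests):
    """Return the wall-clock latency of each request, in seconds."""
    latencies = []
    for request in requests:
        start = time.perf_counter()
        handler(request)
        latencies.append(time.perf_counter() - start)
    return latencies


def fake_prediction_handler(server_id):
    """Hypothetical stand-in for serving one 'view predictions' request."""
    return {"server": server_id, "risk": (server_id * 37 % 100) / 100}


latencies = measure_response_times(fake_prediction_handler, range(100))
print(f"median: {median(latencies):.6f}s  worst: {max(latencies):.6f}s")
```

Reporting both the median and the worst case helps, because a single slow request can hide behind a good average.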
Performance Requirements
Throughput
• How many requests the system can handle.
• “WinSeer” prediction handles data sets of up to 20 GB in size.
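Data sets of up to 20 GB will generally not fit in memory, so one option is to stream the XML data set instead of loading it whole. The sketch below uses the standard library's `xml.etree.ElementTree.iterparse`. The `<record>` element and its `server` and `status` attributes are a hypothetical layout, since the slides do not give the real schema.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical record layout; the real data set schema is not given.
SAMPLE = b"""<records>
  <record server="srv-01" status="ok"/>
  <record server="srv-02" status="failed"/>
  <record server="srv-01" status="failed"/>
</records>"""


def count_failures(source):
    """Stream <record> elements and count failed records per server."""
    counts = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "record":
            if elem.get("status") == "failed":
                server = elem.get("server")
                counts[server] = counts.get(server, 0) + 1
            elem.clear()  # drop the element's contents once processed
    return counts


print(count_failures(io.BytesIO(SAMPLE)))
```

For a real file, a path or an open binary file object can be passed in place of the in-memory sample.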
Design Constraints
• The system should be easy to access.
• The mining model is developed using open source tools.
• Software interfaces used in “WinSeer”.
Software System Attributes
• Reliability
• Availability
• Security
• Maintainability
Supporting Information
References
[1] HowStuffWorks.com Contributors, “Are data mining and data warehousing related?,” HowStuffWorks.com, Apr. 20, 2011. [Online]. Available: http://www.howstuffworks.com/are-data-mining-and-data-warehousing-related.htm. [Accessed: Mar. 23, 2013].
[2] “Database Fundamentals,” 2008. [Online]. Available: http://www.personal.psu.edu/glh10/ist110/topic/topic07/topic07_09.html. [Accessed: Mar. 23, 2013].
[3] B. Sudeshna, Georgia, “Data Mining,” 1997. [Online]. Available: http://www.siggraph.org/education/materials/HyperVis/applicat/data_mining/data_mining.html. [Accessed: Mar. 23, 2013].
[4] M. Bruno, “4 open source data mining tools (with GUI),” Apr. 21, 2009. [Online]. Available: http://www.analyticbridge.com/profiles/blogs/4-open-source-data-mining.
[5] “Data Mining: What is Data Mining?,” [Online]. Available: http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm. [Accessed: Mar. 24, 2013].
[6] C. G. Carrier and O. Povel, “Characterizing Data Mining software,” Intelligent Data Analysis 7, pp. 181-185, Aug. 2003.
[7] V. Agrawal, C. Bhattacharyya, T. Niranjan, and S. Susarla, “Prediction of Hard Drive Failures via Rule Discovery from Auto Support Data,” pp. 782-786, 2009 International Conference on Machine Learning and Applications, Dec. 2009.
[8] R. Dudko, A. Sharma, and J. Tedesco, “Effective Failure Prediction in Hadoop Clusters.” Available: https://wiki.engr.illinois.edu/download/attachments/195766887/JAR-2nd.pdf?version=3&modificationDate=1333424381000. [Accessed: Mar. 28, 2013].
[9] Q. Guan, Z. Zhang, and S. Fu, “A Failure Detection and Prediction Mechanism for Enhancing Dependability of Data Centers,” International Journal of Computer Theory and Engineering, vol. 4, no. 5, 2012.
[10] S. Di, D. Kondo, and W. Cirne, “Host Load Prediction in a Google Compute Cloud with a Bayesian Model,” in Proceedings of IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, Nov. 2012.
[11] “Performance requirements documentation,” [Online]. Available: http://pic.dhe.ibm.com/infocenter/aix/v7r1/index.jsp?topic=%2Fcom.ibm.aix.prftungd%2Fdoc%2Fprftungd%2Fdoc_perf_reqs.htm. [Accessed: Mar. 23, 2013].
[12] “How to write Performance Requirements with Example,” [Online]. Available: http://www.1202performance.com/atricles/how-to-write-performance-requirements-with-example/. [Accessed: Mar. 23, 2013].
[13] M. Bruno, “4 open source data mining tools (with GUI),” Apr. 21, 2009. [Online]. Available: http://www.analyticbridge.com/profiles/blogs/4-open-source-data-mining. [Accessed: Mar. 23, 2013].
[14] Z. Li, “Using data mining techniques to improve software reliability,” 2006.
[15] A. Alzghoul and M. Löfstrand, “Increasing availability of industrial systems through data stream mining,” Computers & Industrial Engineering, 2010. [Accessed: Mar. 23, 2013].
[16] “Non Functional Requirements,” 2, Aug. 26, 2010. [Online]. Available: http://c2.com/cgi/wiki?NonFunctionalRequirements. [Accessed: Mar. 22, 2013].
Interview Questions
• How do you define the scale of your company? (Scale of the servers.)
• What kind of data do your servers handle? (How critical is it?)
• Have you faced any server failures in your company?
• How often do failures happen?
• How do you find out when a failure has occurred in your company?
• How do failures affect your company?
• After a failure happens, what are your next steps?
• How long does it take to recover from a failure?
• Do you have any server failure prediction mechanism?
• If yes:
    • What kind of mechanism do you use to predict failures in your data centers?
    • Is it cost effective?
    • How early do you learn about an upcoming failure?
    • Are you satisfied with your system?
• If no:
    • If you had a failure prediction mechanism, would it be helpful to your decisions and your company?
    • What is your opinion of a failure prediction mechanism?