Mining Software Engineering Data
Ahmed E. Hassan
Queen’s University
www.cs.queensu.ca/~ahmed
ahmed@cs.queensu.ca
Tao Xie
North Carolina State University
www.csc.ncsu.edu/faculty/xie
xie@csc.ncsu.edu
Some slides are adapted from tutorial slides co-prepared by Jian Pei from Simon Fraser University, Canada
An up-to-date version of this tutorial is available at http://ase.csc.ncsu.edu/dmse/
Ahmed E. Hassan
- NSERC/RIM Software Engineering Research Chair, Queen’s University, Canada
- Leads the SAIL research group at Queen’s
- Co-chair for Workshop on Mining Software
Repositories (MSR) from 2004-2006
- Chair of the steering committee for MSR
- A. E. Hassan and T. Xie: Mining Software Engineering Data
Tao Xie
- Assistant Professor at North Carolina State
University, USA
- Leads the ASE research group at NCSU
- PC Co-Chair of ICSM 2009, MSR 2011
- Co-organizer of 2007 Dagstuhl Seminar on
Mining Programs and Processes
Acknowledgments
- Jian Pei, SFU
- Thomas Zimmermann, Microsoft Research
- Peter Rigby, U. of Victoria
- Sunghun Kim, HKUST
- John Anvik, U. of Victoria
Tutorial Goals
- Learn about:
– Recent and notable research and researchers in mining SE data
– Data mining and data processing techniques and how to apply them to SE data
– Risks in using SE data due to, e.g., noise and project culture
- By the end of the tutorial, you should be able to:
– Retrieve SE data
– Prepare SE data for mining
– Mine interesting information from SE data
Mining SE Data
- MAIN GOAL
– Transform static record-keeping SE data to active data
– Make SE data actionable by uncovering hidden patterns and trends
(Figure: SE data sources: mailings, Bugzilla, code repository, execution traces, CVS)