in Open Source Projects? Gerardo Massimiliano Rocco - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Who is going to Mentor Newcomers in Open Source Projects?

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella

slide-2
SLIDE 2

Context and Motivations

  • Software Development

How?

  • Training via Mentoring

Case Study

  • Explorative analysis
  • Recommendation system evaluation
slide-3
SLIDE 3

Training Project Newcomers

With GOOD TRAINING, a newcomer can immediately start to work ACTIVELY

slide-4
SLIDE 4

Zhou and Mockus Better training from Senior Developers

Newcomer

Previous Work...

Low Sociability

slide-5
SLIDE 5

Previous Work...

Dagenais et al.: mentoring of project newcomers is highly desirable

[Diagram: MENTOR and Newcomer]

slide-6
SLIDE 6

Characteristics of a good Mentor

slide-7
SLIDE 7

Sources of Information

SVN

GIT CVS

slide-8
SLIDE 8

Mentoring Small/Large Projects

  • Small projects: finding mentors is a trivial problem
  • Large projects: finding mentors is not a trivial problem

slide-9
SLIDE 9

YODA

(Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

slide-10
SLIDE 10

SVN GIT CVS

YODA: two phases

?

What factor can be used to identify mentors?

slide-11
SLIDE 11

What factor can be used to identify mentors?

RQ1: Identify past mentors

slide-12
SLIDE 12

How does ArnetMiner work?

Ranks pairs of researchers according to four factors:

  • f1: they published many papers together
  • f2: the advisor published more than the student
  • f3: the advisor is older than the student
  • f4: the student published her first paper(s) with the advisor

slide-13
SLIDE 13

Time

F1: Exchanged emails

Heuristics to identify Mentors

slide-14
SLIDE 14

When Alice joins the project

F1: Exchanged emails

Heuristics to identify Mentors

Time
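A minimal sketch of how F1 could be computed, assuming each email is reduced to a (sender, recipient, date) record and that only messages exchanged within a fixed window after the newcomer joins are counted. The record layout, the `count_exchanged_emails` name, and the 90-day window are illustrative assumptions, not details taken from the slides.

```python
from datetime import date, timedelta


def count_exchanged_emails(emails, newcomer, candidate, joined, window_days=90):
    """Count emails exchanged between two people after the newcomer joined.

    emails: iterable of (sender, recipient, sent_on) tuples (assumed layout).
    window_days: hypothetical observation window, not taken from the slides.
    """
    end = joined + timedelta(days=window_days)
    pair = {newcomer, candidate}
    return sum(
        1
        for sender, recipient, sent_on in emails
        if {sender, recipient} == pair and joined <= sent_on <= end
    )


# Hypothetical mailing-list records for illustration only.
emails = [
    ("alice", "bob", date(2002, 4, 2)),
    ("bob", "alice", date(2002, 4, 3)),
    ("alice", "carol", date(2002, 4, 5)),
    ("bob", "alice", date(2003, 1, 1)),  # outside the window, not counted
]
f1_alice_bob = count_exchanged_emails(emails, "alice", "bob", date(2002, 4, 1))
f1_alice_carol = count_exchanged_emails(emails, "alice", "carol", date(2002, 4, 1))
```

Counting in both directions treats a reply from the candidate the same as a question from the newcomer; a stricter variant could count only replies.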

slide-15
SLIDE 15

F2: overall amount of emails

Heuristics to identify Mentors

slide-18
SLIDE 18

F3: project age

Heuristics to identify Mentors

slide-19
SLIDE 19

Time

F3: project age

Heuristics to identify Mentors

slide-20
SLIDE 20

F4: newcomer early emails

Heuristics to identify Mentors

slide-21
SLIDE 21

Time

First emails by Alice in the project

F4: newcomer early emails

Heuristics to identify Mentors

slide-22
SLIDE 22

F5: Commits

Heuristics to identify Mentors

slide-23
SLIDE 23

When Alice joins the project

Time

F5: Commits

Heuristics to identify Mentors

slide-24
SLIDE 24

What factors can be used to identify mentors?

Aggregating the factors

\[ \sum_{i=1}^{5} w_i \cdot f_i \]
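The aggregation on this slide reads as a weighted sum of the five factors. A minimal sketch of that scoring and of ranking candidate pairs by it follows; the factor values, the equal weights, and the function names are hypothetical, and the factors are assumed to be already normalized to a common scale.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_pairs(pair_factors, weights):
    """Return (newcomer, candidate) pairs sorted by descending score."""
    return sorted(
        pair_factors,
        key=lambda pair: mentor_score(pair_factors[pair], weights),
        reverse=True,
    )


# Hypothetical, already-normalized values of f1..f5 for two candidate pairs.
pair_factors = {
    ("alice", "bob"): [1.0, 0.5, 1.0, 1.0, 0.0],
    ("alice", "carol"): [0.2, 1.0, 1.0, 0.0, 1.0],
}
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # equal weights, chosen only for illustration
ranking = rank_pairs(pair_factors, weights)
```

Setting a weight to zero drops the corresponding factor, so different factor combinations can be compared without changing the code.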

slide-25
SLIDE 25

Recommend Mentors

Time

slide-27
SLIDE 27

Recommend Mentors

Time

t

slide-28
SLIDE 28

Recommend Mentors

Time

t

Mentor with adequate skills

slide-29
SLIDE 29

Recommend Mentors

Time

Inspired by the work on bug triaging by J. Anvik et al. (2011)

slide-30
SLIDE 30

Recommend Mentors

Time

Inspired by the work on bug triaging by J. Anvik et al. (2011)

t

slide-34
SLIDE 34

Recommend Mentors

Time

Inspired by the work on bug triaging by J. Anvik et al. (2011)

t

DICE SIMILARITY
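The slide names Dice similarity as the measure used when recommending a mentor with adequate skills. A minimal sketch, assuming the similarity is computed between sets of terms, for example the terms of a newcomer's message and the terms of each candidate's past messages. The whitespace tokenizer, the sample texts, and the function names are assumptions for illustration.

```python
def dice_similarity(a, b):
    """Dice coefficient between two collections of terms: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def tokenize(text):
    """Naive tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


# Hypothetical newcomer request and candidate histories.
request = tokenize("patch for socket timeout handling")
history = {
    "bob": tokenize("socket timeout patch review"),
    "carol": tokenize("documentation build scripts"),
}
best = max(history, key=lambda name: dice_similarity(request, history[name]))
```

A real pipeline would likely add stop-word removal and stemming before comparing term sets; those steps are left out to keep the sketch short.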

slide-35
SLIDE 35

Empirical Study

  • Goal: analyze data from mailing lists and versioning systems
  • Purpose: investigate which factors can be used to identify mentors
  • Quality focus: recommend mentors in software projects
  • Context: mailing lists and versioning systems of five software projects
      • Apache, FreeBSD, PostgreSQL, Python and Samba
slide-36
SLIDE 36

Context

Training and Test sets for evaluating YODA.

                               Apache           FreeBSD          PostgreSQL       Python           Samba
Period (Training set)          08/2001-03/2002  11/1998-02/2000  10/1998-05/2001  05/2000-05/2001  04/1998-09/2000
Period (Test set)              04/2002-12/2008  03/2000-10/2008  06/2001-03/2008  06/2001-12/2008  10/2000-12/2008
# of Mentors (Training set)    19               65               10               28               17
# of Newcomers (Training set)  13               33               8                32               33
# of Newcomers (Test set)      13               33               7                31               33
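The training and test sets are separated by time period. A minimal sketch of such a chronological split, using the Apache periods listed above at month granularity; the record layout and the `split_by_period` name are assumptions for illustration.

```python
from datetime import date


def split_by_period(records, train_start, train_end, test_end):
    """Split dated records into a training set and a later test set.

    records: iterable of (when, payload) tuples (assumed layout).
    Training covers [train_start, train_end]; test covers (train_end, test_end].
    Records outside both periods are dropped.
    """
    train, test = [], []
    for when, payload in records:
        if train_start <= when <= train_end:
            train.append((when, payload))
        elif train_end < when <= test_end:
            test.append((when, payload))
    return train, test


# Hypothetical records; the period boundaries are the Apache ones.
train, test = split_by_period(
    [
        (date(2001, 9, 10), "email-1"),
        (date(2002, 3, 31), "email-2"),
        (date(2002, 4, 1), "email-3"),
        (date(2009, 1, 1), "email-4"),  # after the test period, dropped
    ],
    train_start=date(2001, 8, 1),
    train_end=date(2002, 3, 31),
    test_end=date(2008, 12, 31),
)
```

Splitting by date rather than at random keeps the evaluation honest: recommendations for the test period only use information that was available before it.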

slide-37
SLIDE 37

Research Questions

?

slide-38
SLIDE 38

RQ1: How can we identify mentors from the past history of a software project?

\[ \sum_{i=1}^{5} w_i \cdot f_i \]

[Table: candidate couples ranked by score (2.5, 1.5, 1.5, 1.5, 1.5, 1.5, ...)]

slide-39
SLIDE 39

RQ1: How can we identify mentors from the past history of a software project?

\[ \sum_{i=1}^{5} w_i \cdot f_i \]

[Table: candidate couples ranked by score (2.5, 1.5, 1.5, 1.5, 1.5, 1.5, ...)]

Manual Validation
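The ranked couples are checked by manual validation, and the following slides report precision against the number of newcomer-mentor pairs considered. A minimal sketch of that measure, assuming precision is the fraction of the top-k ranked pairs that the validation confirms; the sample pairs and the `precision_at_k` name are hypothetical.

```python
def precision_at_k(ranked_pairs, validated, k):
    """Fraction of the top-k ranked pairs confirmed as true mentoring pairs."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_pairs[:k]
    if not top:
        return 0.0
    return sum(1 for pair in top if pair in validated) / len(top)


# Hypothetical ranking (best score first) and manually validated pairs.
ranked = [("alice", "bob"), ("dave", "erin"), ("alice", "carol"), ("frank", "bob")]
validated = {("alice", "bob"), ("alice", "carol")}
p1 = precision_at_k(ranked, validated, 1)
p2 = precision_at_k(ranked, validated, 2)
```

Sweeping k over a range of values yields one precision value per number of pairs, which is the shape of the curves plotted for each project.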

slide-40
SLIDE 40

RQ1: How can we identify mentors from the past history of a software project?

[Figure: precision (0%-100%) versus number of newcomer-mentor pairs (18-24). Possible configurations: f1]

slide-41
SLIDE 41

RQ1: How can we identify mentors from the past history of a software project?

[Figure: precision (0%-100%) versus number of newcomer-mentor pairs (18-24). Possible configurations: f1 + f2 + f3]

slide-42
SLIDE 42

RQ1: How can we identify mentors from the past history of a software project?

[Figure: precision (0%-100%) versus number of newcomer-mentor pairs (18-24). Possible configurations: f1 + f2 + f4]

slide-43
SLIDE 43

RQ1: How can we identify mentors from the past history of a software project?

[Figure: precision (0%-100%) versus number of newcomer-mentor pairs (18-24). Possible configurations: f5 (Baseline)]

slide-44
SLIDE 44

RQ1: How can we identify mentors from the past history of a software project?

[Figure: precision (0%-100%) versus number of newcomer-mentor pairs, one panel per project: Apache (18-24 pairs), PostgreSQL (12-22 pairs). Configurations: f1; f1 + f2 + f3; f1 + f2 + f4; f5 (Baseline)]

slide-46
SLIDE 46

RQ1: How can we identify mentors from the past history of a software project?

[Figure: precision (0%-100%) versus number of newcomer-mentor pairs, one panel per project: FreeBSD (23-41 pairs), Python (24-48 pairs), Samba (30-42 pairs)]

slide-47
SLIDE 47

RQ1: How can we identify mentors from the past history of a software project?

[Figure: precision (0%-100%) versus number of newcomer-mentor pairs, one panel per project: FreeBSD (23-41 pairs), Python (24-48 pairs), Samba (30-42 pairs)]

USEFUL FACTORS FOR MENTORS IDENTIFICATION

  • 0.5*f1 + 0.25*f2 + 0.25*f4
  • 0.5*f1 + 0.25*f2 + 0.25*f3
  • f1
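Read against the weighted sum of five factors used earlier in the deck, each of these three configurations corresponds to a weight vector over f1..f5 in which the unused factors get weight zero. The vector notation below is an editorial restatement and does not appear on the slides.

```latex
\begin{aligned}
0.5\,f_1 + 0.25\,f_2 + 0.25\,f_4 &\;\Rightarrow\; w = (0.5,\ 0.25,\ 0,\ 0.25,\ 0) \\
0.5\,f_1 + 0.25\,f_2 + 0.25\,f_3 &\;\Rightarrow\; w = (0.5,\ 0.25,\ 0.25,\ 0,\ 0) \\
f_1 &\;\Rightarrow\; w = (1,\ 0,\ 0,\ 0,\ 0)
\end{aligned}
```

In all three, the commit-based factor f5 has weight zero; the deck uses f5 on its own only as the baseline.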

slide-48
SLIDE 48

RQ2: To what extent would it be possible to recommend mentors to newcomers joining a software project?

slide-50
SLIDE 50

RQ2: To what extent would it be possible to recommend mentors to newcomers joining a software project?

YODA makes it possible to recommend mentors

slide-51
SLIDE 51

Why not just use the top committers?

slide-53
SLIDE 53

Why not just use the top committers?

Not all Committers Are Good Mentors

slide-54
SLIDE 54

Questions Asked:

  • Done/received mentoring
  • Perceived importance of mentoring
  • What makes a good Mentor

Surveying Project Developers

slide-55
SLIDE 55

Sent to 114 Subjects…

FreeBSD, PostgreSQL, Python, Apache, Samba

[Figure: per-project counts as extracted: 37, 37, 15, 23, 23]

slide-56
SLIDE 56

Obtained Answers…

FreeBSD, PostgreSQL, Python, Apache, Samba

slide-57
SLIDE 57

Done/received mentoring?

[Bar chart: questions "Did mentoring?" and "Had a mentor?", answers YES/NO, axis 0%-100%. Data labels as extracted: 92%, 58%, 8%, 42%]

slide-58
SLIDE 58

Done/received mentoring?

[Bar chart: questions "Did mentoring?" and "Had a mentor?", answers YES/NO, axis 0%-100%. Data labels as extracted: 92%, 58%, 8%, 42%]

Yes, I received mentoring. My mentor was…

Yes, I did mentoring…

slide-59
SLIDE 59

Perceived importance of mentoring

[Bar chart: answers Very important, Important, Neutral, Not important, Useless at all; series "Effect of mentor" and "Effect on newcomer"; axis 0%-60%. Data labels as extracted: 18%, 36%, 45%, 0%, 0% and 33%, 56%, 11%, 0%, 0%]

slide-62
SLIDE 62

Perceived importance of mentoring

[Bar chart: answers Very important, Important, Neutral, Not important, Useless at all; series "Effect of mentor" and "Effect on newcomer"; axis 0%-60%. Data labels as extracted: 18%, 36%, 45%, 0%, 0% and 33%, 56%, 11%, 0%, 0%]

It is very important that a mentor share knowledge with a mentee…

slide-63
SLIDE 63

What makes a good Mentor

[Bar chart: categories Experience, Communication skills, Project knowledge, Others; axis 0%-50%. Data labels as extracted: 19%, 42%, 38%, 0%]

slide-65
SLIDE 65

What makes a good Mentor

[Bar chart: categories Experience, Communication skills, Project knowledge, Others; axis 0%-50%. Data labels as extracted: 19%, 42%, 38%, 0%]

My first Mentor had a very strong and technical background

slide-66
SLIDE 66

Conclusion

slide-71
SLIDE 71

Future Work