KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association SOFTWARE DESIGN AND QUALITY GROUP INSTITUTE FOR PROGRAM STRUCTURES AND DATA ORGANIZATION, FACULTY OF INFORMATICS
Palladio Days 2012
Reuse and Beyond: Innovative Software Retrieval Approaches
Oliver Hummel
SOFTWARE DESIGN AND QUALITY GROUP, sdq.ipd.kit.edu
INSTITUTE FOR PROGRAM STRUCTURES AND DATA ORGANIZATION, FACULTY OF INFORMATICS
Software Design and Quality Group Institute for Program Structures and Data Organization 2 2012-10-16 Software Retrieval Beyond Reuse Jun.-Prof. Dr. Oliver Hummel
Three Problems I
„I was having three problems: No job, no money, and no clue how to continue.“
Otto – The Movie (1986)
Photo: [Andreas Reiner, Wikipedia]
Three Problems II
“Software reuse is the process of creating software systems from existing software rather than building software systems from scratch.” [Krueger 92]
“A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties.” [Szyperski 02]
„[…] And there they were again, my three problems…“
No repository structure, no reusable software, and no clue how to retrieve something.
The Reuse Community (1999)
Overview
I’m going to share some insights on software search & retrieval with you:
1. Using the web as a reuse repository
2. How to build a software search engine with Lucene, and how to use it
3. How to add functional semantics to software searches
4. A word on proactive reuse recommendations in Eclipse
5. Can we utilize the wisdom of the crowd in software development?
6. Outlook, summary and conclusion
Solution Idea I
Using the Web (and Google) as a Reuse Repository (2006)
with some additional keywords
filetype:java "class stack" "void push" "Object pop"
+ simple, easy to use
+ no overhead
+ precise for simple requests
- still too imprecise
- „dark web“ not covered anymore
- no reliable API for IDE integration
Stack +push(o:Object):void +pop():Object
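The keyword query on this slide can be derived mechanically from the Stack interface. The following is a minimal sketch of that derivation, not the tooling used in the 2006 work; the class name `WebReuseQuery` and its `build` method are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class WebReuseQuery {

    /**
     * Builds a keyword query from a class name and its methods.
     * The map goes from method name to return type; insertion order is kept.
     */
    static String build(String className, Map<String, String> returnTypeByMethod) {
        List<String> parts = new ArrayList<>();
        parts.add("filetype:java");
        parts.add(quote("class " + className.toLowerCase(Locale.ROOT)));
        for (Map.Entry<String, String> e : returnTypeByMethod.entrySet()) {
            // One quoted phrase per method: "<return type> <method name>"
            parts.add(quote(e.getValue() + " " + e.getKey()));
        }
        return String.join(" ", parts);
    }

    private static String quote(String phrase) {
        return "\"" + phrase + "\"";
    }

    public static void main(String[] args) {
        Map<String, String> stack = new LinkedHashMap<>();
        stack.put("push", "void");
        stack.put("pop", "Object");
        // prints: filetype:java "class stack" "void push" "Object pop"
        System.out.println(build("Stack", stack));
    }
}
```

Parameter types are left out on purpose, mirroring the slide's query, which only uses return type and method name as phrases.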
Solution Idea II
Build a specialized software search engine (2008)
with Google-like interface and IDE integration, and support for programming languages, signatures, interfaces etc.
Use Database or Lucene or both?
relational model vs. field-based approach
„De-Normalizing“ allows effective software searches in Lucene
10 million files (~40 GB) in < 3 seconds with
Stack +push(o:Object):void +pop():Object
name    stack
method  push
method  pop
msign   1_pt:object_rt:void
msign   1_pt:void_rt:object
minter  1_mn:push_pt:object_rt:void
minter  1_mn:pop_pt:void_rt:object
.com
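The de-normalized field list shown for the Stack example can be generated from a simple class description. The sketch below uses plain Java strings as a stand-in for index fields and does not call the Lucene API; the class `DenormalizedIndexEntry` is hypothetical, and the `1_` prefix is copied verbatim from the slide, which does not explain its meaning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class DenormalizedIndexEntry {

    /** A method with a single parameter type; "void" stands for "no parameter", as on the slide. */
    static final class Method {
        final String name;
        final String paramType;
        final String returnType;

        Method(String name, String paramType, String returnType) {
            this.name = name;
            this.paramType = paramType;
            this.returnType = returnType;
        }
    }

    // Assumption: prefix taken literally from the slide; its semantics are not stated there.
    static final String PREFIX = "1_";

    /** Flattens a class into "field value" pairs: name, method, msign, minter. */
    static List<String> fields(String className, List<Method> methods) {
        List<String> out = new ArrayList<>();
        out.add("name " + lower(className));
        for (Method m : methods) {
            out.add("method " + lower(m.name));
        }
        for (Method m : methods) {
            out.add("msign " + PREFIX + signature(m));
        }
        for (Method m : methods) {
            out.add("minter " + PREFIX + "mn:" + lower(m.name) + "_" + signature(m));
        }
        return out;
    }

    private static String signature(Method m) {
        return "pt:" + lower(m.paramType) + "_rt:" + lower(m.returnType);
    }

    private static String lower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        List<Method> methods = Arrays.asList(
                new Method("push", "Object", "void"),
                new Method("pop", "void", "Object"));
        for (String field : fields("Stack", methods)) {
            System.out.println(field);
        }
    }
}
```

Storing the signature both with the method name (`minter`) and without it (`msign`) is what lets a field-based index answer interface and signature queries without joins.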
Simple Applications
What can we do with such a software search engine?
Keyword Searches, Name Matching, Signature Matching, Interface Matching, Open Source Lookup, Library Identification
Example queries:
Calculator lang:java method:push
$(float,float):float;
isLeapYear(int):boolean;
java.util.Stack
findjar:org.apache.lucene.search.Query
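Signature matching, as in the query `$(float,float):float;`, compares parameter and return types and ignores the method name. Reading `$` as a placeholder for any name is an assumption. The sketch below illustrates the idea with Java reflection over a single class instead of a search index; `SignatureMatcher` is a hypothetical helper, not part of the search engine.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

final class SignatureMatcher {

    /**
     * Returns the names of all public methods of the given type whose
     * return type and parameter types match exactly; names are ignored.
     */
    static SortedSet<String> match(Class<?> type, Class<?> returnType, Class<?>... paramTypes) {
        SortedSet<String> names = new TreeSet<>();
        for (Method m : type.getMethods()) {
            boolean sameReturn = m.getReturnType().equals(returnType);
            boolean sameParams = Arrays.equals(m.getParameterTypes(), paramTypes);
            if (sameReturn && sameParams) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Analogue of $(float,float):float; evaluated against java.lang.Math
        System.out.println(match(Math.class, float.class, float.class, float.class));
    }
}
```

The result contains `max` and `min` among others, which shows why a purely syntactic match returns many candidates with different behavior.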
Simple Applications II
Searches for full object interfaces are also possible
… even in Java/C# syntax
Only test cases „refuse“ to deliver test cases..?
public class Matrix {
    public Matrix add(Matrix) {}
    public Matrix multiply(Matrix) {}
}

import junit.framework.TestCase;

public class CalculatorTest extends TestCase {
    public void testAdd() {
        Calculator calc = new Calculator();
        assertEquals(calc.add(2, 1), 3);
    }
}
Precision from a Reuse Perspective
Preliminary comparison of retrieval approaches has shown limited top 25 precision
results for 13 method queries
even a full syntax matching is not precise enough
how can we support semantics in searches?
Semantic Software Retrieval
And what about „contractually specified interfaces“?
and a seamless integration in modern development processes and tools?
test-driven development
Test-Driven Reuse
- candidate selection based on interface of C.U.T.
- assessment of semantics based on test cases: very precise
- more candidates by ignoring names in interface: expensive, but still precise
[Hummel, 2008]
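The two steps of test-driven reuse (select candidates by interface, then assess them with the test case) can be sketched as follows. This is an illustration under simplifying assumptions, not the implementation from [Hummel, 2008]: candidates are in-memory lambdas instead of retrieved source files, signatures are plain strings, and all names (`TestDrivenReuse`, `Candidate`, `select`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

final class TestDrivenReuse {

    /** A retrieved candidate: its name, its signature and an executable implementation. */
    static final class Candidate {
        final String name;
        final String signature;
        final ToIntFunction<int[]> impl;

        Candidate(String name, String signature, ToIntFunction<int[]> impl) {
            this.name = name;
            this.signature = signature;
            this.impl = impl;
        }
    }

    /** Step 1: filter by signature, ignoring names. Step 2: keep candidates that pass the test. */
    static List<String> select(List<Candidate> pool, String wantedSignature,
                               Predicate<ToIntFunction<int[]>> test) {
        List<String> passed = new ArrayList<>();
        for (Candidate c : pool) {
            if (!c.signature.equals(wantedSignature)) {
                continue; // cheap syntactic filter, the candidate is never executed
            }
            boolean ok;
            try {
                ok = test.test(c.impl); // expensive semantic check
            } catch (RuntimeException e) {
                ok = false; // a crashing candidate counts as failed
            }
            if (ok) {
                passed.add(c.name);
            }
        }
        return passed;
    }

    public static void main(String[] args) {
        List<Candidate> pool = Arrays.asList(
                new Candidate("add", "(int,int):int", a -> a[0] + a[1]),
                new Candidate("sum", "(int,int):int", a -> a[0] + a[1]),
                new Candidate("multiply", "(int,int):int", a -> a[0] * a[1]),
                new Candidate("negate", "(int):int", a -> -a[0]));
        // Same check as testAdd(): add(2, 1) must yield 3
        System.out.println(select(pool, "(int,int):int", f -> f.applyAsInt(new int[] {2, 1}) == 3));
    }
}
```

Because names are ignored in step 1, `sum` is found alongside `add`, while `multiply` matches the signature but is rejected by the test.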
Success Examples & Discussion
[Hummel, 2008]
Eclipse Integration
Precise reuse recommendations on the fly…
[CodeConjurer.org]
The Wisdom of the Crowd
API Recommendations with ParseWeb
(Thummalapenta & Xie, 2007)
Book Recommendations @ Amazon
The Wisdom of the Crowd
Towards crowd-based design recommendations (2010)
via intersecting interfaces of search results
- or attributes
- or dependencies etc.
„Developers who have coded a Stack have implemented a push, pop and isEmpty method…“
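Intersecting the interfaces of search results can be approximated by counting how often each method name occurs across the results. The sketch below is an illustration under that assumption, not the published approach; `InterfaceIntersection`, `recommend` and the `minShare` threshold are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class InterfaceIntersection {

    /**
     * Recommends every method name that occurs in at least minShare
     * (between 0.0 and 1.0) of the given interfaces. A share of 1.0 is a strict intersection.
     */
    static List<String> recommend(List<Set<String>> interfaces, double minShare) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> methods : interfaces) {
            for (String method : methods) {
                counts.merge(method, 1, Integer::sum);
            }
        }
        double threshold = minShare * interfaces.size();
        List<String> recommended = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= threshold) {
                recommended.add(e.getKey());
            }
        }
        return recommended; // sorted alphabetically because of the TreeMap
    }

    public static void main(String[] args) {
        List<Set<String>> stacks = Arrays.asList(
                new HashSet<>(Arrays.asList("push", "pop", "isEmpty")),
                new HashSet<>(Arrays.asList("push", "pop", "isEmpty", "peek")),
                new HashSet<>(Arrays.asList("push", "pop", "isEmpty", "size")));
        // prints: [isEmpty, pop, push]
        System.out.println(recommend(stacks, 1.0));
    }
}
```

Lowering `minShare` trades precision for coverage: methods that only some developers implemented, such as `peek`, start to appear once the threshold drops far enough.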
Ongoing & Future Work
Component recognition
- tracing dependencies -> e.g. delivered a full-grown FTP client
- goal is to get this into the Merobase index
- [Kakarontzas & Stamelos] have been working on Facade detection in the Open SME EU project
Test case recommendation
- [Janjic & Atkinson subm.]
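Tracing dependencies for component recognition amounts to collecting everything reachable from a starting class. The following sketch shows this as a breadth-first traversal over a dependency map; it is an assumption about how such tracing could work, and the class names in the example (`FtpClient`, `FtpConnection` and so on) are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DependencyTracer {

    /** Returns the root plus all classes it depends on, directly or transitively. */
    static Set<String> closure(String root, Map<String, List<String>> dependsOn) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(root);
        while (!work.isEmpty()) {
            String current = work.poll();
            if (!seen.add(current)) {
                continue; // already visited, this also guards against cycles
            }
            for (String dep : dependsOn.getOrDefault(current, Collections.<String>emptyList())) {
                work.add(dep);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("FtpClient", Arrays.asList("FtpConnection", "FtpReply"));
        deps.put("FtpConnection", Arrays.asList("SocketFactory", "FtpReply"));
        deps.put("Unrelated", Arrays.asList("FtpReply"));
        // prints: [FtpClient, FtpConnection, FtpReply, SocketFactory]
        System.out.println(closure("FtpClient", deps));
    }
}
```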
Conclusion
Numerous applications for software retrieval beyond reuse
- Code Recommendations
- Design Recommendations
- Snippet Reuse
- Component Reuse
- Library Reuse
- Back-to-Back Testing
- Test Recommendation
- Program Understanding
- Impact Analysis
- Defect Information
- Open Source Lookup
- Library Lookup
concrete searches explorative searches [Janjic et al., 2010]
The End.
Thank you for your attention! Time for some questions?
… or maybe you want to contact me later
References
- C. Atkinson, O. Hummel, W. Janjic, "Search-enhanced testing: NIER track", in Software Engineering (ICSE), 2011 33rd International Conference on, 2011. p. 880--883.
- O. Hummel, C. Atkinson, "Extreme harvesting: Test driven discovery and reuse of software components", in Information Reuse and Integration, 2004. IRI 2004. Proceedings of the 2004 IEEE International Conference on, 2004. p. 66--72.
- O. Hummel, C. Atkinson, "Using the web as a reuse repository", Reuse of Off-the-Shelf Components, p. 298--311. 2006.
- O. Hummel, W. Janjic, C. Atkinson, "Evaluating the Efficiency of Retrieval Methods for Component Repositories", Proceedings of the Int. Conf on Software Engineering & Knowledge Engineering, 2007.
- O. Hummel, W. Janjic, C. Atkinson, "Code conjurer: Pulling reusable software out of thin air", IEEE Software, vol. 25, no. 5 p. 45--52. 2008.
- O. Hummel, "Semantic Component Retrieval in Software Engineering", PhD thesis, University of Mannheim, 2008.
- O. Hummel, W. Janjic, C. Atkinson, "Proposing software design recommendations based on component interface intersecting", in Proceedings of the 2nd International Workshop on Recommendation Systems for Software Engineering, 2010. p. 64--68.
- W. Janjic, O. Hummel, C. Atkinson, "More archetypal usage scenarios for software search engines", in Proceedings of 2010 ICSE Workshop on Search-driven Development: Users, Infrastructure, Tools and Evaluation, 2010. p. 21--24.
- G. Kakarontzas, I. Stamelos, S. Skalistis, A. Naskos, "Extracting Components from Open Source: The Component Adaptation Environment (COPE) Approach", in Software Engineering and Advanced Applications (SEAA), 2012 38th EUROMICRO Conference on, 2012. p. 192--199.
- S. Thummalapenta, T. Xie, "Parseweb: a programmer assistant for reusing open source code on the web", in Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering, 2007. p. 204--213.