Reuse and Beyond: Innovative Software Retrieval Approaches



SLIDE 1

KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association

SOFTWARE DESIGN AND QUALITY GROUP
INSTITUTE FOR PROGRAM STRUCTURES AND DATA ORGANIZATION, FACULTY OF INFORMATICS

www.kit.edu
sdq.ipd.kit.edu

Reuse and Beyond: Innovative Software Retrieval Approaches

Oliver Hummel

Palladio Days 2012

SLIDE 2

Software Design and Quality Group Institute for Program Structures and Data Organization 2 2012-10-16 Software Retrieval Beyond Reuse Jun.-Prof. Dr. Oliver Hummel

Three Problems I

“I was having three problems: no job, no money, and no clue how to continue.”

Otto – The Movie (1986)

Photo: [Andreas Reiner, Wikipedia]

SLIDE 3

Three Problems II

“Software reuse is the process of creating software systems from existing software rather than building software systems from scratch.” [Krueger 92]

“A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties.” [Szyperski 02]

“[…] And there they were again, my three problems…”

No repository structure, no reusable software, and no clue how to retrieve something.

The Reuse Community (1999)

SLIDE 4

Overview

I’m going to share some insights on software search & retrieval with you:

1. Using the web as a reuse repository
2. How to build a software search engine with Lucene
  • and how to use it
3. How to add functional semantics to software searches
4. A word on proactive reuse recommendations in Eclipse
5. Can we utilize the wisdom of the crowd in software development?
6. Outlook, summary and conclusion

SLIDE 5

Solution Idea I

Using the Web (and Google) as a Reuse Repository (2006)

with some additional keywords:

filetype:java "class stack" "void push" "Object pop"

+ simple, easy to use
+ no overhead
+ precise for simple requests

- still too imprecise
- “dark web” not covered anymore
- no reliable API for IDE integration

Stack
+push(o:Object):void
+pop():Object
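The keyword query above can be assembled mechanically from the Stack specification. The following is a minimal sketch of that idea only; the class name WebReuseQuery and its build method are hypothetical and not part of any tool mentioned in the talk.

```java
import java.util.ArrayList;
import java.util.List;

class WebReuseQuery {
    /**
     * Builds a keyword query in the style shown on the slide:
     * a filetype restriction followed by quoted phrases.
     */
    static String build(String fileType, String className, List<String> methodPhrases) {
        List<String> parts = new ArrayList<>();
        parts.add("filetype:" + fileType);
        parts.add("\"class " + className + "\"");
        for (String phrase : methodPhrases) {
            // Each phrase is quoted so the search engine treats it as one unit.
            parts.add("\"" + phrase + "\"");
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        // Reproduces the query from the slide for the Stack example.
        System.out.println(build("java", "stack", List.of("void push", "Object pop")));
    }
}
```

The sketch only produces the query string; sending it to a search engine is left out on purpose, since the slide notes that no reliable API was available for that.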

SLIDE 6

Solution Idea II

Build a specialized software search engine (2008)

with a Google-like interface, IDE integration, and support for programming languages, signatures, interfaces etc.

Use Database or Lucene or both?

relational model vs. field-based approach

„De-Normalizing“ allows effective software searches in Lucene

10 million files (~40 GB) in < 3 seconds with

Stack +push(o:Object):void +pop():Object

name    stack
method  push
method  pop
msign   1_pt:object_rt:void
msign   1_pt:void_rt:object
minter  1_mn:push_pt:object_rt:void
minter  1_mn:pop_pt:void_rt:object

.com
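The de-normalized field listing on this slide can be derived from a class description with a few string operations. The sketch below is one possible reading, not the actual indexer: the class and method names are hypothetical, the leading "1" is copied from the slide as an opaque prefix because its meaning is not explained there, and repeating "_pt:" for methods with several parameters is a guess. No Lucene calls are used; each resulting pair would become one field of an index document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class DenormalizedIndexSketch {
    // Assumption: prefix copied verbatim from the slide; its meaning is not stated there.
    static final String PREFIX = "1";

    /** Minimal method description: name, parameter types, return type. */
    static final class MethodSig {
        final String name;
        final List<String> paramTypes;
        final String returnType;

        MethodSig(String name, List<String> paramTypes, String returnType) {
            this.name = name;
            this.paramTypes = paramTypes;
            this.returnType = returnType;
        }
    }

    static String lower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    /** Encodes parameter and return types, e.g. "_pt:object_rt:void". */
    static String types(MethodSig m) {
        StringBuilder sb = new StringBuilder();
        if (m.paramTypes.isEmpty()) {
            sb.append("_pt:void"); // the slide encodes "no parameters" as void
        }
        for (String p : m.paramTypes) {
            sb.append("_pt:").append(lower(p)); // guess for more than one parameter
        }
        return sb.append("_rt:").append(lower(m.returnType)).toString();
    }

    /** Flattens a class into "field value" pairs in the order used on the slide. */
    static List<String> fields(String className, List<MethodSig> methods) {
        List<String> out = new ArrayList<>();
        out.add("name " + lower(className));
        for (MethodSig m : methods) out.add("method " + lower(m.name));
        for (MethodSig m : methods) out.add("msign " + PREFIX + types(m));
        for (MethodSig m : methods) {
            out.add("minter " + PREFIX + "_mn:" + lower(m.name) + types(m));
        }
        return out;
    }

    public static void main(String[] args) {
        List<MethodSig> stack = List.of(
            new MethodSig("push", List.of("Object"), "void"),
            new MethodSig("pop", List.of(), "Object"));
        fields("Stack", stack).forEach(System.out::println);
    }
}
```

Storing the signature and the full method interface as single tokens is what lets a field-based index answer such queries with plain term lookups instead of relational joins.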

SLIDE 7

Simple Applications

What can we do with such a software search engine?

Keyword Searches, Name Matching, Signature Matching, Interface Matching, Open Source Lookup, Library Identification

Example queries:

Calculator lang:java method:push
$(float,float):float;
isLeapYear(int):boolean;
java.util.Stack
findjar:org.apache.lucene.search.Query
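Queries of the shape name(paramTypes):returnType; as shown on this slide are easy to split into their parts before they are matched against an index. The parser below is a hypothetical sketch, not the engine's real query grammar. The "$" is kept as a literal name; reading it as a placeholder for "any method name" is an assumption, since the slide does not say so.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SignatureQuery {
    final String name;              // "$" is kept as-is (assumed to mean "any name")
    final List<String> paramTypes;
    final String returnType;

    // name or "$", parameter list in parentheses, ":" return type, optional ";"
    private static final Pattern SHAPE = Pattern.compile(
        "\\s*(\\$|[A-Za-z_]\\w*)\\(([^)]*)\\):\\s*([\\w.\\[\\]]+)\\s*;?\\s*");

    SignatureQuery(String name, List<String> paramTypes, String returnType) {
        this.name = name;
        this.paramTypes = paramTypes;
        this.returnType = returnType;
    }

    static SignatureQuery parse(String query) {
        Matcher m = SHAPE.matcher(query);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a signature query: " + query);
        }
        List<String> params = new ArrayList<>();
        for (String p : m.group(2).split(",")) {
            if (!p.trim().isEmpty()) {
                params.add(p.trim()); // skips the empty entry of "()"
            }
        }
        return new SignatureQuery(m.group(1), params, m.group(3));
    }

    public static void main(String[] args) {
        SignatureQuery q = parse("isLeapYear(int):boolean;");
        System.out.println(q.name + " " + q.paramTypes + " -> " + q.returnType);
    }
}
```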

SLIDE 8

Simple Applications II

Searches for full object interfaces are also possible

… even in Java/C# syntax

Only test cases „refuse“ to deliver test cases..?

public class Matrix {
    public Matrix add(Matrix) {}
    public Matrix multiply(Matrix) {}
}

import junit.framework.TestCase;

public class CalculatorTest extends TestCase {
    public void testAdd() {
        Calculator calc = new Calculator();
        // JUnit convention: expected value first, actual value second
        assertEquals(3, calc.add(2, 1));
    }
}

SLIDE 9

Precision from a Reuse Perspective

A preliminary comparison of retrieval approaches has shown limited top-25 precision (results for 13 method queries).

Even full syntax matching is not precise enough.

How can we support semantics in searches?
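For readers unfamiliar with the measure: top-25 precision is the share of the first 25 results that are actually usable for the request. A minimal sketch of the calculation follows; the class name and the sample data are hypothetical and are not the figures from the comparison. Dividing by the number of inspected results when fewer than k are returned is one common convention, chosen here as an assumption.

```java
import java.util.List;
import java.util.Set;

class PrecisionAtK {
    /**
     * Share of the first k results that are relevant.
     * If fewer than k results exist, only the available ones are inspected.
     */
    static double precisionAt(int k, List<String> rankedResults, Set<String> relevant) {
        int inspected = Math.min(k, rankedResults.size());
        if (inspected == 0) {
            return 0.0;
        }
        int hits = 0;
        for (String result : rankedResults.subList(0, inspected)) {
            if (relevant.contains(result)) {
                hits++;
            }
        }
        return (double) hits / inspected;
    }

    public static void main(String[] args) {
        // Made-up ranking: two of the four results fit the request.
        List<String> ranked = List.of("a", "b", "c", "d");
        System.out.println(precisionAt(25, ranked, Set.of("a", "c")));
    }
}
```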

SLIDE 10

Semantic Software Retrieval

And what about „contractually specified interfaces“?

and a seamless integration in modern development processes and tools?

test-driven development

Test-Driven Reuse

  • candidate selection based on interface of C.U.T.
  • assessment of semantics based on test cases: very precise
  • more candidates by ignoring names in interface: expensive, but still precise

[Hummel, 2008]
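The two steps named above, syntactic candidate selection followed by a semantic check through test cases, can be illustrated with plain Java reflection. This is a simplified sketch under stated assumptions, not the approach from [Hummel, 2008]: the candidate classes are made up and stand in for search results, the "test case" is hard-coded, and method names are deliberately ignored during selection.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TestDrivenReuseSketch {
    // Hypothetical candidates standing in for search results.
    public static class Adder      { public int add(int a, int b)   { return a + b; } }
    public static class Summer     { public int sum(int x, int y)   { return x + y; } }
    public static class Multiplier { public int times(int a, int b) { return a * b; } }
    public static class Greeter    { public String greet(String n)  { return "Hi " + n; } }

    /** Syntactic filter: public (int, int) -> int, method name ignored. */
    static boolean matchesSignature(Method m) {
        return Modifier.isPublic(m.getModifiers())
            && m.getReturnType() == int.class
            && Arrays.equals(m.getParameterTypes(), new Class<?>[] {int.class, int.class});
    }

    /** Semantic filter: the "test case" a candidate has to pass. */
    static boolean passesTest(Object candidate, Method m) throws ReflectiveOperationException {
        return Integer.valueOf(3).equals(m.invoke(candidate, 2, 1))
            && Integer.valueOf(0).equals(m.invoke(candidate, 0, 0));
    }

    /** Returns "Class.method" for every candidate method that passes both filters. */
    static List<String> select(List<Class<?>> candidates) throws ReflectiveOperationException {
        List<String> accepted = new ArrayList<>();
        for (Class<?> c : candidates) {
            Object instance = c.getDeclaredConstructor().newInstance();
            for (Method m : c.getDeclaredMethods()) {
                if (matchesSignature(m) && passesTest(instance, m)) {
                    accepted.add(c.getSimpleName() + "." + m.getName());
                }
            }
        }
        return accepted;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(select(List.of(
            Adder.class, Summer.class, Multiplier.class, Greeter.class)));
    }
}
```

Multiplier has the right signature but fails the test, and Summer passes although its method is not called add. That is the trade-off the slide describes: ignoring names yields more candidates, and executing every one of them is what makes the approach expensive.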

SLIDE 11

Success Examples & Discussion

[Hummel, 2008]

SLIDE 12

Eclipse Integration

Precise reuse recommendations on the fly…

[CodeConjurer.org]

SLIDE 13

The Wisdom of the Crowd

API Recommendations with ParseWeb

(Thummalapenta & Xie, 2007)

Book Recommendations @ Amazon

SLIDE 14

The Wisdom of the Crowd

Towards crowd-based design recommendations (2010)

via intersecting interfaces of search results

  • or attributes
  • or dependencies etc.

„Developers who have coded a Stack have implemented a push, pop and isEmpty method…“
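A recommendation like the one quoted above can be approximated by counting how often each method name occurs across the interfaces of the search results and keeping the frequent ones. The sketch below is a simplified illustration with made-up data; the class name, the threshold parameter and the sample interfaces are hypothetical, and a strict intersection corresponds to a threshold of 1.0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class InterfaceIntersectionSketch {
    /**
     * Returns the method names that occur in at least minShare (0..1)
     * of the given result interfaces, in alphabetical order.
     */
    static List<String> recommend(List<Set<String>> resultInterfaces, double minShare) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted for stable output
        for (Set<String> methods : resultInterfaces) {
            for (String method : methods) {
                counts.merge(method, 1, Integer::sum);
            }
        }
        List<String> recommended = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minShare * resultInterfaces.size()) {
                recommended.add(e.getKey());
            }
        }
        return recommended;
    }

    public static void main(String[] args) {
        // Made-up interfaces of four Stack implementations.
        List<Set<String>> stacks = List.of(
            Set.of("push", "pop", "isEmpty"),
            Set.of("push", "pop", "isEmpty", "peek"),
            Set.of("push", "pop", "size"),
            Set.of("push", "pop", "isEmpty"));
        System.out.println(recommend(stacks, 0.75));
    }
}
```

A threshold below 1.0 is used in the example because real search results rarely agree completely; with a strict intersection, the single implementation lacking isEmpty would remove it from the recommendation.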

SLIDE 15

Ongoing & Future Work

Component recognition

tracing dependencies -> e.g. delivered a full-grown FTP Client

goal is to get this into the Merobase index

[Kakarontzas & Stamelos] have been working on Facade detection in the Open SME EU project
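Tracing dependencies, as mentioned under component recognition, amounts to collecting everything reachable from a starting class in the dependency graph. The following sketch shows that traversal on a hand-written graph; the class names are hypothetical, and how the dependencies are extracted from source or byte code is outside its scope.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DependencyTraceSketch {
    /**
     * Collects the root class and everything it depends on, directly or
     * transitively, in breadth-first order. Cycles are handled by the
     * visited check.
     */
    static Set<String> closure(String root, Map<String, List<String>> dependsOn) {
        Set<String> component = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(root);
        while (!todo.isEmpty()) {
            String current = todo.poll();
            if (component.add(current)) { // true only on the first visit
                todo.addAll(dependsOn.getOrDefault(current, List.of()));
            }
        }
        return component;
    }

    public static void main(String[] args) {
        // Made-up dependency graph around an FTP client class.
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("FtpClient", List.of("Connection", "ReplyParser"));
        deps.put("Connection", List.of("SocketFactory"));
        deps.put("SocketFactory", List.of("Connection")); // cycle
        deps.put("Unrelated", List.of("FtpClient"));
        System.out.println(closure("FtpClient", deps));
    }
}
```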

Test case recommendation

[Janjic & Atkinson subm.]

SLIDE 16

Conclusion

Numerous applications for software retrieval beyond reuse

  • Code Recommendations
  • Design Recommendations
  • Snippet Reuse
  • Component Reuse
  • Library Reuse
  • Back-to-Back Testing
  • Test Recommendation
  • Program Understanding
  • Impact Analysis
  • Defect Information
  • Open Source Lookup
  • Library Lookup

concrete searches explorative searches [Janjic et al., 2010]

SLIDE 17

The End.

Thank you for your attention! Time for some questions?

… or maybe you want to contact me later

SLIDE 18

References

  • C. Atkinson, O. Hummel, W. Janjic, "Search-enhanced testing: NIER track", in Software Engineering (ICSE), 2011 33rd International Conference on, 2011, p. 880-883.
  • O. Hummel, C. Atkinson, "Extreme harvesting: Test driven discovery and reuse of software components", in Information Reuse and Integration, 2004. IRI 2004. Proceedings of the 2004 IEEE International Conference on, 2004, p. 66-72.
  • O. Hummel, C. Atkinson, "Using the web as a reuse repository", Reuse of Off-the-Shelf Components, p. 298-311, 2006.
  • O. Hummel, W. Janjic, C. Atkinson, "Evaluating the Efficiency of Retrieval Methods for Component Repositories", Proceedings of the Int. Conf. on Software Engineering & Knowledge Engineering, 2007.
  • O. Hummel, W. Janjic, C. Atkinson, "Code conjurer: Pulling reusable software out of thin air", IEEE Software, vol. 25, no. 5, p. 45-52, 2008.
  • O. Hummel, "Semantic Component Retrieval in Software Engineering", PhD thesis, University of Mannheim, 2008.
  • O. Hummel, W. Janjic, C. Atkinson, "Proposing software design recommendations based on component interface intersecting", in Proceedings of the 2nd International Workshop on Recommendation Systems for Software Engineering, 2010, p. 64-68.
  • W. Janjic, O. Hummel, C. Atkinson, "More archetypal usage scenarios for software search engines", in Proceedings of 2010 ICSE Workshop on Search-driven Development: Users, Infrastructure, Tools and Evaluation, 2010, p. 21-24.
  • G. Kakarontzas, I. Stamelos, S. Skalistis, A. Naskos, "Extracting Components from Open Source: The Component Adaptation Environment (COPE) Approach", in Software Engineering and Advanced Applications (SEAA), 2012 38th EUROMICRO Conference on, 2012, p. 192-199.
  • S. Thummalapenta, T. Xie, "Parseweb: a programmer assistant for reusing open source code on the web", in Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering, 2007, p. 204-213.

SLIDE 19

Three Problems Again

Sylvia: “Are you going to make me happy tonight?”
Otto: “Yeah.”
Sylvia: “Are you going to make me happy twice?”
Otto: “Yeah.”
Sylvia: “Even three times?”
Otto: “There they were again, my three problems.”