Copy-and-Paste Redeemed: Towards Adapting Abstractions

Christoph Reichenbach, Goethe University Frankfurt, 09 November 2015. Based on our ASE 2015 paper with Krishna Narasimhan.


SLIDE 1

Copy-and-Paste Redeemed

Towards Adapting Abstractions

Christoph Reichenbach

Goethe University Frankfurt

09 November 2015

based on our ASE 2015 paper with Krishna Narasimhan

SLIDE 5

Extending DSLs

◮ PQL/Java: Parallel Query Language for Java
◮ PQL-ESL: Extension Specification Language for PQL/Java
◮ Student Project: Add SQL support via PQL-ESL
  ◮ BSc student: “Using PQL-ESL abstractions was challenging”
  ◮ SQL support still not optimal: Differences in SQL implementations would require multiple backends
◮ Can we help introduce new abstractions?
◮ Can we help extend existing abstractions?

2 / 14

SLIDE 8

Duplication vs. Abstraction

int dist(coord a, coord b) {
    int dx = abs(a.x - b.x);
    int dy = abs(a.y - b.y);
    return std::max(dx, dy);
}

int dist2(coord a, coord b) {
    int dx = abs(a.x - b.x);
    int dy = abs(a.y - b.y);
    return std::hypot(dx, dy);
}

Alternative: Abstraction

int distG(coord a, coord b, int (*f)(int, int)) {
    int dx = abs(a.x - b.x);
    int dy = abs(a.y - b.y);
    return f(dx, dy);
}

3 / 14
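A minimal sketch of how the two clones might be re-expressed through distG. It assumes a plain `coord` struct with int fields and introduces two hypothetical wrapper functions, `chebyshev` and `euclidean`, that are not on the slides. The wrappers are needed because `std::max` is a function template taking references and `std::hypot` works on floating-point values, so neither converts directly to `int (*)(int, int)`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdlib>

// Assumed definition; the slides do not show the coord type.
struct coord { int x; int y; };

// Hypothetical adapters with the exact signature distG expects.
int chebyshev(int dx, int dy) { return std::max(dx, dy); }
int euclidean(int dx, int dy) { return static_cast<int>(std::hypot(dx, dy)); }

// The abstraction from the slide: the differing step is passed in as f.
int distG(coord a, coord b, int (*f)(int, int)) {
    int dx = std::abs(a.x - b.x);
    int dy = std::abs(a.y - b.y);
    return f(dx, dy);
}

// The former clones become one-line wrappers around the shared body.
int dist(coord a, coord b)  { return distG(a, b, chebyshev); }
int dist2(coord a, coord b) { return distG(a, b, euclidean); }
```

The cast in `euclidean` truncates towards zero, which matches the implicit conversion in the original `dist2`.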

SLIDE 11

A Small User Study

◮ We took a number of near-clones from existing programs
◮ We removed one of the clones from each program
◮ We asked five C++ programmers (graduate students) to reimplement missing functionality
  ◮ Experience: 2, 3, 12, 48, 120 months
◮ Abstraction:
  ◮ Users tend to prefer abstraction
  ◮ 7/10 tasks completed in time
◮ Copy-Paste:
  ◮ 15/15 tasks completed in time

4 / 14

SLIDE 13

A Small User Study

◮ We took a number of near-clones from existing programs
◮ We removed one of the clones from each program
◮ We asked five C++ programmers (graduate students) to reimplement missing functionality
  ◮ Experience: 2, 3, 12, 48, 120 months
◮ Abstraction:
  ◮ Users tend to prefer abstraction
  ◮ 12/15 tasks completed in time
  ◮ Factor 2-3 slower than copy-paste
◮ Copy-Paste:
  ◮ 15/15 tasks completed in time
  ◮ Universally faster (except for a single tie)


4 / 14

SLIDE 16

The Best of Both Worlds

◮ Hypothesis: More abstractions ⇒ slower programmers
◮ However:
  ◮ Literature on clone research mostly agrees that clones are a maintenance problem
  ◮ Practitioner literature strongly advocates abstraction
◮ Choice between:
  ◮ Fast development with copy-paste programming
  ◮ Maintainability with abstraction
◮ Generative Programming perspective:
  ◮ Fast development with generation + manual edits
  ◮ Maintainability by evolving code generators

Why not both?

5 / 14

SLIDE 17

An Example

void function1() {
    fake_db_init(&fakedb, true);
    run_mode = MODE_DEBUG;
    start(&service);
}

void function2() {
    fake_db_init(&fakedb, false);
    run_mode = MODE_INFO;
    start(&service);
}

void function3() {
    db_init();
    db_sanity_check();
    run_mode = MODE_REGULAR;
    start(&service);
}

6 / 14

SLIDE 19

An Example

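void function1() {
    b(c,d);
    y = f1;
    x(z);
}

void function2() {
    b(c,e);
    y = f2;
    x(z);
}

void function3() {
    b2();
    n();
    y = f3;
    x(z);
}

[Figure: syntax trees of the three functions, flattened here to node sequences]
function1: a b c d y f1 x z
function2: a b c e y f2 x z
function3: a b2 n y f3 x z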

6 / 14

SLIDE 22

Automating Abstraction

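[Figure: the three syntax trees and the merged tree, flattened here to node sequences]
function1: a b c d y f1 x z
function2: a b c e y f2 x z
function3: a b2 n y f3 x z
merge: a M(12,3) b c M(1,2) d e b2 M(3) n y M(1,2,3) f1 f2 f3 x z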

7 / 14
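A merge point can be pictured as a tree node that stores one alternative subtree per variant. The sketch below is purely illustrative and is not the tool's implementation: the `Node` type and the `node`, `mergePoint`, `specialise` and `labelsFor` helpers are hypothetical names, and the example hard-codes the merged `y` / `M(1,2,3)` subtree from the figure instead of computing a merge.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// Hypothetical tree node: an ordinary node has a label and children;
// a merge point has an empty label and one alternative per variant id.
struct Node {
    std::string label;
    std::vector<NodePtr> kids;
    std::map<int, NodePtr> alternatives;  // non-empty => merge point
};

NodePtr node(std::string label, std::vector<NodePtr> kids = {}) {
    auto n = std::make_shared<Node>();
    n->label = std::move(label);
    n->kids = std::move(kids);
    return n;
}

NodePtr mergePoint(std::map<int, NodePtr> alternatives) {
    auto n = std::make_shared<Node>();
    n->alternatives = std::move(alternatives);
    return n;
}

// Collect the labels of the tree as seen by one variant, in pre-order.
// At a merge point only the alternative for that variant is followed.
void specialise(const NodePtr& n, int variant, std::vector<std::string>& out) {
    if (!n->alternatives.empty()) {
        auto it = n->alternatives.find(variant);
        if (it != n->alternatives.end()) specialise(it->second, variant, out);
        return;
    }
    out.push_back(n->label);
    for (const auto& kid : n->kids) specialise(kid, variant, out);
}

// The merged assignment from the figure: y with M(1,2,3) over f1, f2, f3.
std::vector<std::string> labelsFor(int variant) {
    NodePtr tree = node("y", { mergePoint({{1, node("f1")},
                                           {2, node("f2")},
                                           {3, node("f3")}}) });
    std::vector<std::string> out;
    specialise(tree, variant, out);
    return out;
}
```

Specialising the merged subtree for variant 2 recovers the node sequence `y f2` of the original function2.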

SLIDE 25

An Example

void function1() {
    b(c,d);
    y = f1;
    x(z);
}

void function2() {
    b(c,e);
    y = f2;
    x(z);
}

void function3() {
    b2();
    n();
    y = f3;
    x(z);
}

void fnMerged(int functionId, int fValue, bool bParam) {
    if (functionId == 1 || functionId == 2) {
        b(c, bParam);
    }
    if (functionId == 3) {
        b2();
        n();
    }
    y = fValue;
    x(z);
}

void function1() {
    fnMerged(1,f1,d);
}

8 / 14
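The slide shows the rewritten wrapper only for function1. The following sketch fills in stand-in definitions so that the merged function can be compiled and exercised. Everything apart from `fnMerged` and the `function1` wrapper is an assumption: the types and values of `c`, `d`, `e`, `z`, `f1`, `f2`, `f3`, the `trace` string used to observe calls, and the wrappers for function2 and function3.

```cpp
#include <string>

// Stand-ins for the abstract names on the slide; types and values are assumed.
std::string trace;  // records which calls ran, for demonstration only
int c = 0, y = 0, z = 0;
const bool d = true, e = false;
const int f1 = 10, f2 = 20, f3 = 30;

void b(int, bool flag) { trace += flag ? "b(c,d);" : "b(c,e);"; }
void b2()              { trace += "b2();"; }
void n()               { trace += "n();"; }
void x(int)            { trace += "x(z);"; }

// Merged function as shown on the slide.
void fnMerged(int functionId, int fValue, bool bParam) {
    if (functionId == 1 || functionId == 2) {
        b(c, bParam);
    }
    if (functionId == 3) {
        b2();
        n();
    }
    y = fValue;
    x(z);
}

void function1() { fnMerged(1, f1, d); }      // as on the slide
void function2() { fnMerged(2, f2, e); }      // assumed by analogy
void function3() { fnMerged(3, f3, false); }  // assumed; bParam is unused for variant 3
```

Variant 3 has to pass some value for `bParam` even though it never reads it, which is one cost of realising every merge point as a value parameter.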

SLIDE 28

Realising Merge Points

◮ Merge Points (M) realised via abstraction
◮ Different forms of abstraction:
  ◮ C++:
    ◮ Parameters
    ◮ Template parameters
    ◮ Fields
    ◮ Delegate patterns
    ◮ Subclasses
    ◮ Function pointers
    ...
◮ We leave choice of abstraction to user
◮ Implemented as Eclipse plugin for C++ (based on CDT)

9 / 14
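As an illustration of how one merge point could be realised in more than one way, the sketch below expresses the choice between f1, f2 and f3 for `y` once as a value parameter and once as a template parameter. The function names `setYByParameter` and `setYByTemplate` and the concrete constants are hypothetical and not taken from the tool.

```cpp
int y = 0;
constexpr int f1 = 10, f2 = 20, f3 = 30;  // assumed values

// Value parameter: the caller supplies the variant's value at run time.
void setYByParameter(int fValue) { y = fValue; }

// Template parameter: the variant is fixed at compile time,
// giving one instantiation per variant.
template <int FValue>
void setYByTemplate() { y = FValue; }
```

A call such as `setYByParameter(f2)` keeps a single function body, while `setYByTemplate<f3>()` trades that for compile-time specialisation; which trade-off is preferable depends on the code base, which is consistent with leaving the choice to the user.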

SLIDE 30

Implementing Merges

// before merging
int f1() { return 1; }
int f2() { return 2; }
int main() { cout << f1() << endl; }

// after merging
int f(int k) {
    switch (k) {
        case 1: return 1;
        case 2: return 2;
    }
}
int main() { cout << f(1) << endl; }

[Annotations on the merged code: formal selector, actual selector, selection mechanism, selector channel]

◮ Selector: encodes the variant to choose
◮ Selection Mechanism: translates selector to alternative

10 / 14

SLIDE 31

Implementing Merges

// before merging
int f1() { return 1; }
int f2() { return 2; }
int main() { cout << f1() << endl; }

// after merging
int f(int k) { return k; }
int main() { cout << f(1) << endl; }

[Annotations on the merged code: formal selector, actual selector, selection mechanism, selector channel]

◮ Selector: encodes the variant to choose
◮ Selection Mechanism: translates selector to alternative

10 / 14
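One possible reading of the four annotation labels, written as comments on the switch-based merge. The mapping of labels to code elements is inferred from the two definitions on the slide and should be treated as an assumption. The `default` branch and the `callSite` helper are also additions that the slide code does not have.

```cpp
// Merged version of f1() and f2().
// Selector channel: a value parameter carries the selector into f.
int f(int k) {               // k: formal selector
    switch (k) {             // switch statement: selection mechanism
        case 1: return 1;    // alternative taken from f1
        case 2: return 2;    // alternative taken from f2
        default: return -1;  // assumption: avoids falling off the end of f
    }
}

int callSite() {
    return f(1);             // 1: actual selector, chooses the f1 variant
}
```

In the `return k;` version of the merge, the selector itself is the result, so no switch is needed to translate it into an alternative.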

SLIDE 35

Selectors and Selection Mechanisms in C++

Selector Channel Types

◮ Value Parameter
◮ Pointer Parameter
◮ Reference Parameter
◮ Template Parameter
◮ Global Variable
◮ Field
◮ Dynamic Type
◮ UNIX Environment Variable
...

Selection Mechanism Types

◮ Direct Parameter
◮ Switch Statement
◮ If Statement
◮ Function Pointer (partial)
◮ Closure
◮ Delegates
◮ Dynamic Dispatch
◮ Dynamic Linker
...

Our C++ prototype supports the boldfaced options

11 / 14
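To make two of the listed combinations concrete, here is a hedged sketch that merges the `f1`/`f2` pair from the earlier slides in two further ways: a function pointer passed as a value parameter, and dynamic dispatch selected by the dynamic type of an argument. All names other than `f1` and `f2` (`viaFunctionPointer`, `Variant`, `Variant1`, `Variant2`, `viaDynamicDispatch`) are hypothetical and do not reflect the code the prototype generates.

```cpp
int f1() { return 1; }
int f2() { return 2; }

// Selection mechanism: function pointer. Selector channel: value parameter.
int viaFunctionPointer(int (*variant)()) { return variant(); }

// Selection mechanism: dynamic dispatch. Selector channel: dynamic type.
struct Variant {
    virtual ~Variant() = default;
    virtual int run() const = 0;
};
struct Variant1 : Variant { int run() const override { return f1(); } };
struct Variant2 : Variant { int run() const override { return f2(); } };

int viaDynamicDispatch(const Variant& v) { return v.run(); }
```

In both cases the call site picks the variant, by passing `f2` or by constructing a `Variant2`, and the merged function no longer needs a switch or if statement.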

SLIDE 38

Evaluation

◮ Selected near-clones from Open Source C++ projects (Facebook rocksdb, Google protobuf, Oracle node-oracledb, mongodb, ...)
◮ Used our tool to merge
◮ Manual cleanup: Formatting, variable renaming
◮ Phase 1 (Early prototype, limited to two-way merge):

  Submitted  Accepted  Rejected  Pending
  8          1         4         3

  ⇒ Feedback: Need multi-way merge

◮ Phase 2:

  Submitted  Accepted  Rejected  Pending
  10 9 1

Effective as a Refactoring Tool

12 / 14

SLIDE 42

Lessons for Generative Programming?

Language-Level Abstraction
Template-based Code Generation
General Metaprogramming & DSLs

◮ Approach shows that we can automatically introduce abstraction
◮ Still need heuristics to choose abstraction mechanism
◮ Extending existing abstraction also feasible

Open Question: How do we pick the Actual Selector automatically?

13 / 14

SLIDE 43

Conclusions

◮ Hypothesis: More abstractions ⇒ harder for humans to work with
◮ Idea: Use automation to
  ◮ Abstract code after cloning
  ◮ Specialise abstract code prior to specialised modification (Inline refactoring)
  ◮ Merge modification back into abstract code
◮ For C++/Eclipse: https://marketplace.eclipse.org/content/clone-abstractor-c-methods-0

14 / 14