Intelligent Software Development, Courtesy of Intelligent Software




SLIDE 1

K1

Keynote 11/8/17 8:45 AM

Intelligent Software Development, Courtesy of Intelligent Software

Presented by:

Stephen Frein

Comcast

Brought to you by:

350 Corporate Way, Suite 400, Orange Park, FL 32073
888-268-8770 · 904-278-0524 · info@techwell.com · https://www.techwell.com/

SLIDE 2

Stephen Frein

Comcast

Stephen Frein is a senior director of software engineering at Comcast. He previously managed high profile software projects for the U.S. Department of Defense and the U.S. Treasury. For two decades, he has been leading development and testing teams to questionable success by dint of accidents he cannot reliably replicate. As an adjunct professor at Drexel and Villanova, Stephen delivers soporific lectures on machine learning, database development, and technology management to frequently inattentive students. He has presented at previous TechWell events by sneaking into unused rooms and deceiving the unsuspecting. Stephen enjoys polluting the hive mind via TechBeacon and other industry publications with questionable editorial standards. Visit his poorly maintained vanity website, where he practices writing vapid, self-congratulatory bios.

SLIDE 3

Intelligent Software Development

Courtesy of Intelligent Software

Stephen Frein

About Me

  • Sr. Director of Software Engineering @ Comcast
  • Adjunct @ Drexel & Villanova Universities
  • Contributor to TechBeacon.com
  • stephen_frein@cable.comcast.com
  • www.frein.com

SLIDE 4

Takeaways

  • Machine learning can help you build better software
  • Not hard to get started

SLIDE 5

What is machine learning?

When machines get better at tasks through experience.

Funny hats decrease the chances of finding gold by 50%

DANG!

What is data mining?

  • Subset of machine learning
  • Generation of novel insights by discovering previously unknown patterns
  • Not writing standard SQL (or similar) queries

SLIDE 6

“Big data” not required

Supervised learning: labeled examples

Feature 1   Feature 2   Feature 3   Feature 4   Target
AAA         38.54       1           0.37        Yes
CCC         117.16      1           -1.21       No
BBB         18.68                   .349        Yes
AAA         89.41       1           0.06        No

Predictors (x): Feature 1 through Feature 4
Target (y): Target
Model: y = f(x)

SLIDE 7

Supervised learning: model predictions

Feature 1   Feature 2   Feature 3   Feature 4   Target
CCC         24.14                   1.04        Yes
AAA         192.23      1           -0.28       No
BBB         84.01                   .551        Yes

New Data (x): Feature 1 through Feature 4
Prediction (y): Target
Model: y = f(x)

Supervised learning: model interpretation

Feature 1 = AAA: Prob No
Feature 3 = 1: Prob No
Feature 4 > 0: Prob Yes

Interpretation not always feasible

SLIDE 8

Unsupervised learning: no labels, find patterns

We do it all the time for others.

SLIDE 9

We rarely do it for ourselves.

Problem: Missed Source Changes

File 1
    --function foo (float x){
    ++function foo (int x){ … }

File 2
    float param = 7.0;
    foo(param);

Should be in sync

SLIDE 10

Market basket analysis (a.k.a. association rules)

“People who buy waffles are three times more likely to buy syrup than the average shopper.”

Sample Transactions
    {hammer, nails}
    {hammer, nails, rope}
    {nails, ladder}
    {ladder, rope}
    {screwdriver, hammer}

What is “Lift”?

Support: nails appear in 60% (3/5) of all transactions
Confidence: nails appear 67% (2/3) of the time when hammers do
Lift: nails are 11% ((67-60)/60) more likely to appear when hammers do

Hammers make it 1.11 times as likely that nails appear
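The three measures can be checked against the sample transactions with a few lines of Python. This is only a sketch: the function names `support`, `confidence`, and `lift` are illustrative choices, not part of the talk, and exact fractions are used so the rounding on the slide is easy to compare.

```python
from fractions import Fraction

# Sample transactions from the slide.
transactions = [
    {"hammer", "nails"},
    {"hammer", "nails", "rope"},
    {"nails", "ladder"},
    {"ladder", "rope"},
    {"screwdriver", "hammer"},
]

def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    hits = sum(1 for basket in baskets if items <= basket)
    return Fraction(hits, len(baskets))

def confidence(lhs, rhs, baskets):
    """How often `rhs` appears in the baskets that contain `lhs`."""
    return support(lhs | rhs, baskets) / support(lhs, baskets)

def lift(lhs, rhs, baskets):
    """Confidence of lhs => rhs relative to the baseline support of rhs."""
    return confidence(lhs, rhs, baskets) / support(rhs, baskets)

hammer, nails = {"hammer"}, {"nails"}
print(support(nails, transactions))                 # 3/5
print(confidence(hammer, nails, transactions))      # 2/3
print(round(float(lift(hammer, nails, transactions)), 2))  # 1.11
```

A lift above 1 means the items appear together more often than their individual frequencies would suggest; a lift of exactly 1 means no association.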

SLIDE 11

Treat code check-ins like shopping baskets?

Does the presence of some files make others more likely?

frein@ubuntu:~$ git commit -m "leaving for Orlando; good luck finding these bugs, suckers"

Sample Analysis

changes

Tomcat

SLIDE 12

Highest Lift

Rules                                                                  Lift
{/ajp/AjpAprProcessor.java} => {/ajp/AjpProcessor.java}                93.87
{/ajp/AjpProcessor.java} => {/ajp/AjpAprProcessor.java}                93.87
{/http11/Http11NioProcessor.java} => {/http11/Http11Processor.java}    46.42
{/http11/Http11Processor.java} => {/http11/Http11NioProcessor.java}    46.42
{/http11/Http11Processor.java} => {/http11/Http11AprProcessor.java}    43.10
{/http11/Http11AprProcessor.java} => {/http11/Http11Processor.java}    43.10

If you change AjpProcessor.java, you may need to change AjpAprProcessor.java.
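One way such single-file rules might be derived is to treat each check-in as a basket of file names and compute lift for every ordered pair. The sketch below is an assumption about the approach, not the tooling used in the talk: `pairwise_lift`, the `min_count` cutoff, and the tiny commit history are all hypothetical.

```python
from itertools import permutations

def pairwise_lift(commits, min_count=2):
    """Return {(a, b): lift} for rules {a} => {b} over commit file sets."""
    n = len(commits)
    file_count = {}
    pair_count = {}
    for files in commits:
        for f in files:
            file_count[f] = file_count.get(f, 0) + 1
        for a, b in permutations(sorted(files), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    rules = {}
    for (a, b), together in pair_count.items():
        if together < min_count:
            continue  # too few co-changes to trust the rule
        confidence = together / file_count[a]
        rules[(a, b)] = confidence / (file_count[b] / n)
    return rules

# Hypothetical history: each commit is the set of files it touched.
commits = [
    {"AjpProcessor.java", "AjpAprProcessor.java"},
    {"AjpProcessor.java", "AjpAprProcessor.java"},
    {"Http11Processor.java"},
    {"README.txt"},
]
rules = pairwise_lift(commits)
print(rules[("AjpProcessor.java", "AjpAprProcessor.java")])  # 2.0
```

The per-commit file sets would come from version-control history, for example the output of `git log --name-only`.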

Operationalize It

Diagram labels: Repo, Change, Rule Builder, CI, Rules, Warnings
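A CI step along these lines could compare the files in an incoming change against the mined rules and warn about companions that were left out. This is a rough sketch: `missing_companions` and the `min_lift` threshold of 10 are hypothetical, and the two rules are copied from the "Highest Lift" table.

```python
def missing_companions(changed, rules, min_lift=10.0):
    """List (file, companion, lift) for strong rules whose companion was not changed."""
    warnings = []
    for (a, b), lift in sorted(rules.items()):
        if lift >= min_lift and a in changed and b not in changed:
            warnings.append((a, b, lift))
    return warnings

# Two of the rules from the "Highest Lift" table.
rules = {
    ("/ajp/AjpProcessor.java", "/ajp/AjpAprProcessor.java"): 93.87,
    ("/http11/Http11Processor.java", "/http11/Http11NioProcessor.java"): 46.42,
}
changed = {"/ajp/AjpProcessor.java"}  # files in the incoming change
for a, b, lift in missing_companions(changed, rules):
    print(f"warning: {a} changed without {b} (lift {lift})")
```

The check only warns; whether the companion file really needs a change is still a human decision.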

SLIDE 13

Just Use a Database? Could, but Hard

Problem: Predicting Defects

SLIDE 14

Why try to predict defects?

Test Effort Complexity

What would we do if we could predict defects?

Experience Exploration Targeting Pairing Peer Review

SLIDE 15

How can we predict defects?

Code Measures Defect History Requirements

Using words in stories

Id    Story                                                       Defect?
101   As a customer, I want to order doughnuts with sprinkles.    Yes
102   As a customer, I want to pay with a credit card.            No
103   As admin, I want to configure available doughnut types.     Yes
104   As a customer, I want to order cupcakes with sprinkles.     No
105   As a vendor, I want to submit a doughnut invoice.           No
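To use words as predictors, each story is first reduced to the set of words it contains, and the words are then tallied by outcome. The sketch below does that for the five sample stories; the `tokenize` helper and the lowercase, letters-only tokenization are assumptions made for illustration.

```python
import re
from collections import Counter

# Stories and defect labels from the slide.
stories = [
    (101, "As a customer, I want to order doughnuts with sprinkles.", True),
    (102, "As a customer, I want to pay with a credit card.", False),
    (103, "As admin, I want to configure available doughnut types.", True),
    (104, "As a customer, I want to order cupcakes with sprinkles.", False),
    (105, "As a vendor, I want to submit a doughnut invoice.", False),
]

def tokenize(text):
    """Lowercase the text and keep runs of letters as words."""
    return re.findall(r"[a-z]+", text.lower())

defect_words = Counter()
clean_words = Counter()
for _story_id, text, has_defect in stories:
    target = defect_words if has_defect else clean_words
    target.update(set(tokenize(text)))  # count each word once per story

print(defect_words["sprinkles"], clean_words["sprinkles"])  # 1 1
print(defect_words["doughnuts"], clean_words["doughnuts"])  # 1 0
```

Note that "doughnut" and "doughnuts" are counted as different words here; a real analysis would likely stem or otherwise normalize them.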

SLIDE 16

Sample Analysis

Reason to think this will work

SLIDE 17

Bayes’ Theorem

$$P(\text{Hypothesis} \mid \text{Evidence}) = \frac{P(\text{Hypothesis}) \cdot P(\text{Evidence} \mid \text{Hypothesis})}{P(\text{Evidence})}$$

Heavily used in spam filters
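A word-based classifier of the kind used in spam filters can be sketched with Bayes' theorem by scoring each label as P(label) times the product of P(word | label), and dropping P(Evidence) because it is the same for every label. The version below is a minimal naive Bayes sketch, not the model used in the talk: the add-one smoothing, the function names, and the four training messages are all assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """examples: list of (text, label). Returns label counts, word counts, vocabulary."""
    label_counts = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in label_counts}
    for text, label in examples:
        word_counts[label].update(tokenize(text))
    vocab = set().union(*word_counts.values())
    return label_counts, word_counts, vocab

def predict(text, model):
    """Return the label with the highest log P(label) + sum of log P(word | label)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)  # add-one smoothing
        score = math.log(count / total)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training messages.
model = train([
    ("win cash now", "spam"),
    ("cheap cash prize", "spam"),
    ("meeting at noon", "ham"),
    ("lunch meeting tomorrow", "ham"),
])
print(predict("cash prize now", model))  # spam
print(predict("lunch meeting", model))   # ham
```

Swapping the labels for "defect" and "no defect" and the messages for user stories gives the same mechanics applied to defect prediction.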

Training and Testing

Data: 80% Training (X, Y), 20% Test
Training (X, Y) -> Model Y = f(X)
Test inputs (X) -> Model -> Predicted Y -> Compare with actual Y
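The 80/20 hold-out procedure can be sketched as follows. The helper names, the fixed shuffle seed, and the toy rows are assumptions for illustration; the point is only that the model is scored on rows it never saw during training.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, test_rows):
    """Share of held-out rows where model(x) equals the known y."""
    correct = sum(1 for x, y in test_rows if model(x) == y)
    return correct / len(test_rows)

rows = [(x, x >= 5) for x in range(10)]  # hypothetical (x, y) pairs
train, test = train_test_split(rows)
print(len(train), len(test))  # 8 2
```

A fitted model would be built from `train` only and then passed to `accuracy` together with `test`.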

SLIDE 18

Results

  • 44% (32/73) of predicted defects really were defects.
  • 21% of the stories are associated with a defect.
  • I found 27% of the defects.
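In standard terms, the first figure is the model's precision, the second is the base rate, and the third is its recall. The sketch below recomputes precision from the counts on the slide and compares it with the base rate; the function names are illustrative, and recall is only defined, since the slide does not give the underlying counts.

```python
def precision(true_positives, predicted_positives):
    """Share of predicted defects that really were defects."""
    return true_positives / predicted_positives

def recall(true_positives, actual_positives):
    """Share of real defects that the model flagged."""
    return true_positives / actual_positives

p = precision(32, 73)      # counts from the slide
base_rate = 0.21           # share of stories with a defect, from the slide
print(round(p, 2))              # 0.44
print(round(p / base_rate, 1))  # 2.1
```

Read this way, a story flagged by the model is roughly twice as likely to have a defect as a story picked at random.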

Understanding Factors with Decision Trees

TaskActualTotal > 28: [S1]
TaskActualTotal <= 28:
:...TaskActualTotal <= 5.5: No_Defect (42.3)
    TaskActualTotal > 5.5:
    :...DaysInProgress <= 1: No_Defect (20.3)
        DaysInProgress > 1:
        :...DaysTillAcceptance <= 3: Has_Defect (19.9/6)
            DaysTillAcceptance > 3: [S2]

Slide callouts: "goes to a sub-tree", "(# of hours)"
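One way to read the tree is as nested conditions, checked from the top down. The sketch below transcribes only the branches shown; the function name and snake_case parameter names are renderings chosen here, and "S1" and "S2" are returned as placeholders because those sub-trees are not shown on the slide.

```python
def classify(task_actual_total, days_in_progress, days_till_acceptance):
    """Walk the displayed branches; "S1"/"S2" stand for sub-trees not shown."""
    if task_actual_total > 28:
        return "S1"
    if task_actual_total <= 5.5:
        return "No_Defect"
    if days_in_progress <= 1:
        return "No_Defect"
    if days_till_acceptance <= 3:
        return "Has_Defect"
    return "S2"

print(classify(4, 3, 2))   # No_Defect
print(classify(12, 2, 1))  # Has_Defect
```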

SLIDE 19

Other Things to Try

  • Which test cases will find defects?
  • Will we finish this story in one sprint?
  • Is the application about to go down?
  • Will this feature get used?

Technology Choices

SLIDE 20

Data Science for Developers – TechBeacon.com

SLIDE 21

Takeaways

  • Machine learning can help you build better software
  • Not hard to get started