Qian Zhang Jiyuan Muhammad Rohan Miryung Wang Ali Gulzar - - PowerPoint PPT Presentation



slide-1
SLIDE 1
slide-2
SLIDE 2

Qian Zhang, Muhammad Ali Gulzar, Rohan Padhye, Miryung Kim, Jiyuan Wang

slide-3
SLIDE 3
slide-4
SLIDE 4

...
val locations = sc.textFile("zipcode.csv")
  .map { s =>
    val cols = s.split(",")
    (cols(0), cols(1))
  }
  .filter { s => s._2.equals("New York") }
...

slide-5
SLIDE 5
slide-6
SLIDE 6
slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9

...
val locations = sc.textFile("zipcode.csv")
  .map { s =>
    val cols = s.split(",")
    (cols(0), cols(1))
  }
  .filter { s => s._2 == "New York" }
...

slide-10
SLIDE 10

...
val locations = sc.textFile("zipcode.csv")
  .map { s =>
    val cols = s.split(",")
    (cols(0), cols(1))
  }
  .filter { s => s._2 == "New York" }
...

slide-11
SLIDE 11

...
val locations = sc.textFile("zipcode.csv")
  .map { s =>
    val cols = s.split(",")
    (cols(0), cols(1))
  }
  .filter { s => s._2 == "New York" }
...

public class Map1 {
    static final Map1 apply(String line2) {
        String[] cols = line2.split(",");
        return new Map1(cols[0], cols[1]);
    }
    ...
}

slide-12
SLIDE 12

...
ArrayList<Map1> results1 = LoanSpec.map1(inputs);
ArrayList<Map1> results2 = LoanSpec.filter2(results1);
...

public ArrayList<Map1> map1(ArrayList<String> input) {
    ArrayList<Map1> output = new ArrayList<>();
    for (String item : input) {
        output.add(Map1.apply(item));
    }
    return output;
}

...
val locations = sc.textFile("zipcode.csv")
  .map { s =>
    val cols = s.split(",")
    (cols(0), cols(1))
  }
  .filter { s => s._2 == "New York" }
...

public class Map1 {
    static final Map1 apply(String line2) {
        String[] cols = line2.split(",");
        return new Map1(cols[0], cols[1]);
    }
    ...
}
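The rewriting shown on this slide can be tried without Spark. The following is a minimal sketch, not the tool's actual generated code: the class name ZipcodeSpec, the field names zip and city, the Map1 constructor, and the body of filter2 are all assumptions filled in for illustration (the slide only names LoanSpec.map1 and LoanSpec.filter2 and shows Map1.apply).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the spec class the slide calls LoanSpec.
class ZipcodeSpec {

    // Mirrors the slide's Map1: the two columns kept by the Spark map step.
    // Field names and the constructor are assumptions; the slide does not show them.
    static final class Map1 {
        final String zip;
        final String city;

        Map1(String zip, String city) {
            this.zip = zip;
            this.city = city;
        }

        static Map1 apply(String line) {
            String[] cols = line.split(",");
            return new Map1(cols[0], cols[1]);
        }
    }

    // The Spark map operator rewritten as a loop over an in-memory list.
    static ArrayList<Map1> map1(List<String> input) {
        ArrayList<Map1> output = new ArrayList<>();
        for (String item : input) {
            output.add(Map1.apply(item));
        }
        return output;
    }

    // Assumed body: the Spark filter operator rewritten the same way.
    static ArrayList<Map1> filter2(List<Map1> input) {
        ArrayList<Map1> output = new ArrayList<>();
        for (Map1 item : input) {
            if (item.city.equals("New York")) {
                output.add(item);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("10001,New York", "90024,Los Angeles");
        ArrayList<Map1> results1 = map1(inputs);
        ArrayList<Map1> results2 = filter2(results1);
        for (Map1 r : results2) {
            System.out.println(r.zip + " " + r.city); // prints "10001 New York"
        }
    }
}
```

Because every operator becomes an ordinary method over an ArrayList, the program can be run repeatedly on small inputs without starting a Spark context.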

slide-13
SLIDE 13

[Diagram: data enters filter, which branches into True and False]

...

val pair = data.filter { s =>
  if (s._1 == 90024) A else B
}

...

slide-14
SLIDE 14

[Diagram: data enters filter, which branches into True and False]

...

val pair = data.filter { s =>
  if (s._1 == 90024) A else B
}

...
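One way to read the diagram is that the True and False outcomes of the filter are separate targets that the inputs should reach. The sketch below only illustrates that reading and is not taken from the tool: the class FilterBranches and the methods keep and outcomes are hypothetical, and the slide's placeholders A and B are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: records which outcomes of the filter predicate an input reaches.
class FilterBranches {

    // Predicate from the slide: is the first field equal to 90024?
    static boolean keep(int zip) {
        return zip == 90024;
    }

    // Returns the set of predicate outcomes observed over the whole input.
    static Set<Boolean> outcomes(List<Integer> data) {
        Set<Boolean> seen = new TreeSet<>();
        for (int zip : data) {
            seen.add(keep(zip));
        }
        return seen;
    }

    public static void main(String[] args) {
        // No row matches, so only the False outcome is reached.
        System.out.println(outcomes(Arrays.asList(10001, 10002))); // prints [false]
        // Changing one row to 90024 reaches the True outcome as well.
        System.out.println(outcomes(Arrays.asList(10001, 90024))); // prints [false, true]
    }
}
```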

slide-15
SLIDE 15
  • integer[0-30]

integer[0-30]

slide-16
SLIDE 16
  • .collect().foreach(println)
  • Division by zero
  • str.split("\t")[1]
  • str.split(",")[1]
  • str.substring(1,0)
  • if (age > 10 && age < 9)
  • LeftOuterJoin
  • (Value, Key)
  • Spark word2vec
  • one row join in spark
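Several of the snippets listed on this slide fail or misbehave in plain Java, which can be checked directly. The sketch below is illustrative only: the class ErrorPatterns, its helper methods, and the sample row are hypothetical and not part of the deck.

```java
// Hypothetical demo of how some of the listed snippets behave in plain Java.
class ErrorPatterns {

    // Runs the action and returns the simple name of the exception it throws, or "none".
    static String failure(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    static int divide(int a, int b) {
        return a / b;
    }

    // The condition from the slide: no integer is both greater than 10 and less than 9.
    static boolean inRange(int age) {
        return age > 10 && age < 9;
    }

    public static void main(String[] args) {
        String row = "90024,25"; // sample comma-separated row (assumption)

        // Splitting on tab leaves a single element, so index 1 is out of bounds.
        System.out.println(failure(() -> { String s = row.split("\t")[1]; }));
        // prints ArrayIndexOutOfBoundsException

        // The begin index is greater than the end index.
        System.out.println(failure(() -> row.substring(1, 0)));
        // prints StringIndexOutOfBoundsException

        // Integer division by zero.
        System.out.println(failure(() -> divide(1, 0)));
        // prints ArithmeticException

        // The branch guarded by this condition is never taken.
        System.out.println(inRange(10)); // prints false
    }
}
```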
slide-17
SLIDE 17
slide-18
SLIDE 18

AFL (with a 9216 MB memory limit and a 100 s timeout) runs extremely slowly, averaging 9.68 execs_per_sec

slide-19
SLIDE 19
slide-20
SLIDE 20

BigFuzz achieves a speedup of up to 1477X with framework abstraction

slide-21
SLIDE 21
slide-22
SLIDE 22

BigFuzz provides up to a 3.71X improvement in code coverage

slide-23
SLIDE 23

BigFuzz achieves up to a 2.57X improvement in error detection

slide-24
SLIDE 24
slide-25
SLIDE 25

Compared with BigTest, a symbolic-execution-based approach, BigFuzz detects 80.6% more injected errors

Muhammad Ali Gulzar, Shaghayegh Mardani, Madanlal Musuvathi, and Miryung Kim. 2019. White-Box Testing of Big Data Analytics with Complex User-Defined Functions. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019).

slide-26
SLIDE 26
slide-27
SLIDE 27
