Post-Debugging in Large Scale Analytic Systems, Eduard Bergen M.Sc.



SLIDE 1

Post-Debugging in Large Scale Analytic Systems

Eduard Bergen M.Sc. BTW 2017 BigBIA Workshop

SLIDE 2


Context: Distributed systems log file analysis

Distributed system

A lot of log file information

BTW2017 BigBIA Workshop – Eduard Bergen – 06.03.2017

SLIDE 3

    class SubText {
        public static void main(String[] args) {
            String s = "SubText";
            // beginIndex (7) is greater than endIndex (4):
            // throws StringIndexOutOfBoundsException at run time
            System.out.println(s.substring(7, 4));
        }
    }


Exception domain

Source code example with exception

Unchecked exceptions: Arithmetic, ArrayStore, ClassCast, IllegalArgument (subclasses: IllegalThreadState, NumberFormat), IllegalMonitorState, IndexOutOfBounds, NegativeArraySize, NullPointer, Security
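
The exception in the SubText example can be checked against this list with a minimal sketch. The class name UncheckedExceptionSketch and the helper provoke() are illustrative and not from the slides. It shows that substring(7, 4) raises an unchecked exception from the IndexOutOfBounds family, and that NumberFormat and IllegalThreadState sit below IllegalArgument.

```java
class UncheckedExceptionSketch {
    /** Returns the exception raised by "SubText".substring(7, 4), or null if none is raised. */
    static RuntimeException provoke() {
        try {
            "SubText".substring(7, 4); // beginIndex > endIndex
            return null;
        } catch (RuntimeException e) {
            // No throws clause was needed above: the exception is unchecked.
            return e;
        }
    }

    public static void main(String[] args) {
        RuntimeException e = provoke();
        System.out.println(e.getClass().getName());
        System.out.println(e instanceof IndexOutOfBoundsException);
        // Subclass relations from the list on this slide.
        System.out.println(IllegalArgumentException.class.isAssignableFrom(NumberFormatException.class));
        System.out.println(IllegalArgumentException.class.isAssignableFrom(IllegalThreadStateException.class));
    }
}
```

Because the compiler does not force a handler for these exceptions, they surface only at run time, which in a cluster means inside a worker's log file.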

SLIDE 4


Cluster program execution

SLIDE 5

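    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.util.Collector;

    public class Tokenizer extends RichFlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
            String[] subText = value.split("\\W+");
            for (String s : subText) {
                // substring(7, 4) throws StringIndexOutOfBoundsException inside the UDF;
                // the count 1 completes the (token, count) pair so the call compiles
                out.collect(new Tuple2<String, Integer>(s.substring(7, 4), 1));
            }
        }
    }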


Dataflow system on the JVM #1

User defined function

SLIDE 6


Dataflow system on the JVM #2

Job transformation

SLIDE 7

JVM


JVM Bytecode instrumentation

Operating system

Native Java agent attachment via operating system process id
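
Attachment by process id can be sketched with the JDK's Attach API. This is a swapped-in illustration: the slides describe a native out-of-process client, and whether the prototype uses this Java API is not stated. The class name AttachSketch and the agent library path passed on the command line are hypothetical. The sketch assumes a JDK 9 or later with the jdk.attach module available.

```java
import com.sun.tools.attach.VirtualMachine;

class AttachSketch {
    /** Process id of the current JVM as a string, the form the Attach API expects. */
    static String currentPid() {
        return Long.toString(ProcessHandle.current().pid());
    }

    /**
     * Attaches to the JVM with the given operating system process id and
     * loads a native (JVMTI) agent library into it.
     * agentLibraryPath must be an absolute path; the library itself is hypothetical here.
     */
    static void loadNativeAgent(String pid, String agentLibraryPath, String options) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgentPath(agentLibraryPath, options);
        } finally {
            vm.detach();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: AttachSketch <pid> <absolute path to agent library>");
            System.out.println("this JVM runs as pid " + currentPid());
            return;
        }
        loadNativeAgent(args[0], args[1], null);
    }
}
```

The target JVM keeps running while the agent is loaded, so no restart of the worker is needed for this step.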

SLIDE 8


Current prototype development

[Figure: prototype setup with Worker 1, Worker 2, and a Linux process; steps labeled (1), (2), (3)]

Record and replay are already possible manually; current work generalizes and automates this manual process.

SLIDE 9
  • 1. Create a bytecode patch of the transformed classes with invocations.
  • 2. Deploy the desired patch to the remote machines running the JVM.
  • 3. Instrument bytecode via JVMTI, using only the ClassFileLoadHook event, with a native out-of-process client.
  • 4. Copy the serialized records to the local machine.
  • 5. Start the replay locally from the deserialized records.


Programming model
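
Steps 4 and 5 of the programming model can be illustrated with a minimal sketch, assuming the records are plain strings and standard Java serialization is used. The slides do not state the record format. The class RecordReplaySketch, its methods, and the failing UDF in main are hypothetical. A byte array stands in for the file copied from the remote machine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class RecordReplaySketch {
    /** Serializes the recorded input records (stand-in for the file written on a worker). */
    static byte[] record(List<String> inputs) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ArrayList<>(inputs));
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deserializes the records on the local machine. */
    @SuppressWarnings("unchecked")
    static List<String> load(byte[] recorded) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(recorded))) {
            return (List<String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Replays every record through the UDF and returns the records that raise an exception. */
    static List<String> replay(byte[] recorded, Function<String, String> udf) {
        List<String> failing = new ArrayList<>();
        for (String record : load(recorded)) {
            try {
                udf.apply(record);
            } catch (RuntimeException e) {
                failing.add(record);
            }
        }
        return failing;
    }

    public static void main(String[] args) {
        byte[] recorded = record(Arrays.asList("analytics", "log", "debugging"));
        // Hypothetical UDF: fails on tokens shorter than 7 characters.
        List<String> failing = replay(recorded, s -> s.substring(0, 7));
        System.out.println("failing records: " + failing);
    }
}
```

Replaying locally narrows the search to the records that actually trigger the exception, without rerunning the job on the cluster.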

SLIDE 10
  • 1. Consider the order in which classes are loaded.
  • 2. Use only wait-free code to avoid deadlocks.
  • 3. Consider state corruption caused by non-reentrant instrumentation code.
  • 4. Beware of using uninitialized objects in constructors.


Considerations of problems during instrumentation
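
One common way to address point 3 is a per-thread reentrancy guard. The sketch below is an assumption about how such a guard could look, not the prototype's code. The class ReentrancyGuardSketch and the hook onInvoke are hypothetical names. If the hook's own work reaches instrumented code again on the same thread, the nested call returns immediately instead of touching the hook's state a second time.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ReentrancyGuardSketch {
    // Per-thread flag: true while the current thread is inside the hook.
    private static final ThreadLocal<Boolean> IN_HOOK = ThreadLocal.withInitial(() -> Boolean.FALSE);

    // Counts how many invocations were actually recorded.
    static final AtomicInteger recorded = new AtomicInteger();

    /** Hypothetical hook that the inserted bytecode would invoke. */
    static void onInvoke(Runnable recorder) {
        if (IN_HOOK.get()) {
            return; // nested call from inside the hook: skip it
        }
        IN_HOOK.set(Boolean.TRUE);
        try {
            recorded.incrementAndGet();
            recorder.run(); // may itself reach instrumented code
        } finally {
            IN_HOOK.set(Boolean.FALSE);
        }
    }

    public static void main(String[] args) {
        // The recorder re-enters the hook; only the outer call is counted.
        onInvoke(() -> onInvoke(() -> { }));
        System.out.println(recorded.get());
    }
}
```

The guard takes no locks, which fits point 2. Point 1 still applies: classes the guard depends on, such as ThreadLocal, would have to be loaded before the hook runs.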

SLIDE 11
  • 1. Instrumented code runs at full speed [1], since the inserted code (patch) is standard bytecode.
  • 2. Transformed classes can be returned to their original state during the instrumented execution.
  • 3. The analysis process takes place outside the base process (JVM).
  • 4. The instrumentation contains all files needed for execution, so there is no waiting until the transformation classes are ready.

Conclusion

[1] https://docs.oracle.com/javase/7/docs/platform/jvmti/jvmti.html#bci

SLIDE 12
  • 1. Declaration of patches
  • 2. Validation and provision of patches
  • 3. Automate distribution of patches


Future work

SLIDE 13

Zeller, A. (1999). Yesterday, My Program Worked. Today, It Does Not. Why? Proceedings of the 7th European Software Engineering Conference Held Jointly with the 7th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 253–267.

Dave, A., Zaharia, M., & Stoica, I. (2013). Arthur: Rich Post-Facto Debugging for Production Analytics Applications. Technical report, University of California, Berkeley.

Sen, T., & Mall, R. (2016). Extracting finite state representation of Java programs. Software & Systems Modeling, Springer Berlin Heidelberg, 497–511.

Gulzar, M., Interlandi, M., Yoo, S., Tetali, S., Condie, T., Millstein, T., & Kim, M. (2016). BigDebug: debugging primitives for interactive big data processing in Spark. Proceedings of the 38th International Conference on Software Engineering, 784–795.


Related work

SLIDE 14


Thank you for listening
