Post-Debugging in Large Scale Analytic Systems
Eduard Bergen M.Sc.
Context: Distributed systems log file analysis
Distributed system: a lot of log file information
BTW2017 BigBIA Workshop – Eduard Bergen – 06.03.2017
Exception domain
Unchecked exceptions: Arithmetic, ArrayStore, ClassCast, IllegalArgument [IllegalThreadState, NumberFormat], IllegalMonitorState, IndexOutOfBounds, NegativeArraySize, NullPointer, Security
Source code example with exception:
    class SubText {
        public static void main(String[] args) {
            String s = "SubText";
            // beginIndex 7 > endIndex 4, so this throws StringIndexOutOfBoundsException
            System.out.println(s.substring(7, 4));
        }
    }
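The example above compiles without complaint because the exception is unchecked. A minimal sketch of how that shows up at run time follows; the class name UncheckedDemo and the helper thrownBy are hypothetical names introduced only for illustration.

```java
// Illustrative sketch: the substring call from the slide fails at run time, not at compile time.
class UncheckedDemo {
    // Returns the simple class name of the exception thrown by s.substring(begin, end),
    // or "none" if the call succeeds.
    static String thrownBy(String s, int begin, int end) {
        try {
            s.substring(begin, end);
            return "none";
        } catch (RuntimeException e) { // unchecked: the compiler required no throws clause
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(thrownBy("SubText", 7, 4)); // prints StringIndexOutOfBoundsException
        System.out.println(thrownBy("SubText", 3, 7)); // prints none
    }
}
```

In a distributed job the same exception surfaces only in the log file of whichever worker processed the offending record.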
Cluster program execution
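Dataflow system on the JVM #1
User defined function
    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.util.Collector;

    public class Tokenizer extends RichFlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
            // Split the input line into words on runs of non-word characters.
            String[] subText = value.split("\\W+");
            for (String s : subText) {
                // substring(7, 4) throws StringIndexOutOfBoundsException at run time.
                // The count of 1 is an assumption; the slide shows only the first field.
                out.collect(new Tuple2<String, Integer>(s.substring(7, 4), 1));
            }
        }
    }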
Dataflow system on the JVM #2
Job transformation
JVM Bytecode instrumentation
Operating system
Native Java agent attachment via operating system process id
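One way to attach a native agent to a running JVM by its process id is the JDK Attach API (com.sun.tools.attach). The sketch below is an assumption about how such an attachment could look from Java; the slides describe a native client, which may work differently. The class name AgentAttacher, the helper isAlive, and any agent library path passed in are hypothetical.

```java
import com.sun.tools.attach.VirtualMachine;

// Illustrative sketch, not the prototype's code: attach to a JVM by process id
// and load a native (JVMTI) agent library into it.
class AgentAttacher {
    // True if the operating system currently reports a live process with this id.
    static boolean isAlive(long pid) {
        return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
    }

    // agentPath is the absolute path of a native agent library (hypothetical, supplied by the caller).
    static void attachNativeAgent(long pid, String agentPath, String options) throws Exception {
        if (!isAlive(pid)) {
            throw new IllegalArgumentException("no live process with id " + pid);
        }
        VirtualMachine vm = VirtualMachine.attach(Long.toString(pid));
        try {
            vm.loadAgentPath(agentPath, options); // invokes the library's Agent_OnAttach
        } finally {
            vm.detach();
        }
    }
}
```

The target JVM keeps running while the agent is loaded, which is what makes attachment by process id suitable for an already deployed worker.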
Current prototype development
[Figure: operators (M, R, K, S, F) running on Worker 1 and Worker 2, each worker a Linux process, with steps (1) to (3) marked]
Recording and replaying by hand already works; current work generalizes and automates this manual process.
Programming model
1. Create a bytecode patch of the transformed classes with invocations
2. Deploy the desired patch to the remote machines running the JVM
3. Instrument bytecode via JVMTI, using only ClassFileLoadHook, with a native out-of-process client
4. Copy the serialized records to the local machine
5. Start the replay locally from the deserialized records
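Steps 4 and 5 can be sketched in plain Java, assuming the records are strings and standard Java serialization is the wire format; the slides specify neither. The class RecordReplay and its methods are hypothetical names, and a byte array stands in for the file copied from the worker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative sketch of record (serialize) and replay (deserialize and re-run).
class RecordReplay {
    // Serializes the recorded input records, as the remote side would before the copy.
    static byte[] serialize(List<String> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ArrayList<>(records));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Restores the records on the local machine.
    @SuppressWarnings("unchecked")
    static List<String> deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (List<String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Replays every record through the user defined function and returns those that throw.
    static List<String> replay(List<String> records, Function<String, ?> udf) {
        List<String> failing = new ArrayList<>();
        for (String record : records) {
            try {
                udf.apply(record);
            } catch (RuntimeException e) {
                failing.add(record);
            }
        }
        return failing;
    }
}
```

Replaying locally narrows the search to the exact input records that trigger the exception, without rerunning the job on the cluster.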
Considerations of problems during instrumentation
1. Consider the order in which classes are loaded.
2. Use only wait-free code to avoid deadlocks.
3. Consider state corruption caused by non-reentrant instrumentation code.
4. Beware of using uninitialized objects in constructors.
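Points 1 to 3 can be illustrated with java.lang.instrument.ClassFileTransformer, the Java-level counterpart of the native ClassFileLoadHook. This is a sketch under that substitution, not the prototype's native code; the class LoadOrderRecorder and its method loadedOnThisThread are hypothetical.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: records class load order without locks and guards against re-entry.
class LoadOrderRecorder implements ClassFileTransformer {
    // Per-thread flag: true while this thread is already inside transform().
    private static final ThreadLocal<Boolean> ACTIVE = ThreadLocal.withInitial(() -> false);
    // Per-thread log, so recording never waits on another thread.
    private static final ThreadLocal<List<String>> LOADED = ThreadLocal.withInitial(ArrayList::new);

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> redefined,
                            ProtectionDomain domain, byte[] classfile) {
        if (ACTIVE.get()) {
            return null; // re-entered from the hook's own code: leave the class untouched
        }
        ACTIVE.set(true);
        try {
            LOADED.get().add(className); // remember the order in which classes arrive
            return null;                 // null tells the JVM the bytecode is unchanged
        } finally {
            ACTIVE.set(false);
        }
    }

    // Snapshot of the class names seen by the calling thread, in load order.
    static List<String> loadedOnThisThread() {
        return new ArrayList<>(LOADED.get());
    }
}
```

The guard matters because code inside the hook can itself trigger class loading, which would call the hook again on the same thread and corrupt half-updated state.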
Conclusion
1. Instrumented code runs at full speed [1], since the inserted code (patch) is standard bytecode.
2. Transformed classes can be returned to their original state during the instrumented execution.
3. The analysis takes place outside the base process (JVM).
4. The instrumentation contains all files needed for execution, so there is no waiting until the transformed classes are ready.
[1] https://docs.oracle.com/javase/7/docs/platform/jvmti/jvmti.html#bci
Future work
1. Declaration of patches
2. Validation and provision of patches
3. Automated distribution of patches
Related work
Zeller, A. (1999). Yesterday, My Program Worked. Today, It Does Not. Why? Proceedings of the 7th European Software Engineering Conference Held Jointly with the 7th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 253-267.
Dave, A., Zaharia, M., & Stoica, I. (2013). Arthur: Rich Post-Facto Debugging for Production Analytics Applications. Technical report, University of California, Berkeley.
Sen, T., & Mall, R. (2016). Extracting finite state representation of Java programs. Software & Systems Modeling, Springer Berlin Heidelberg, 497-511.
Gulzar, M., Interlandi, M., Yoo, S., Tetali, S., Condie, T., Millstein, T., & Kim, M. (2016). BigDebug: Debugging primitives for interactive big data processing in Spark. Proceedings of the 38th International Conference on Software Engineering, 784-795.
Thank you for listening