Chava: Reverse Engineering and Tracking of Java Applets
Jeffrey Korn
Princeton University
Dept. of Computer Science
Princeton, NJ 08544
jlk@cs.princeton.edu

Yih-Farn Chen
AT&T Labs - Research
180 Park Avenue
Florham Park, NJ 07932
chen@research.att.com

Eleftherios Koutsofios
AT&T Labs - Research
180 Park Avenue
Florham Park, NJ 07932
ek@research.att.com

Abstract
Java applets have been used increasingly on web sites to perform client-side processing and provide dynamic content. While many web site analysis tools are available, their focus has been on static HTML content and most ignore applet code completely. This paper presents Chava, a system that analyzes and tracks changes in Java applets. The tool extracts information from applet code about classes, methods, fields and their relationships into a relational database. Supplementary checksum information in the database is used to detect changes in two versions of a Java applet. Given our Java data model, a suite of programs that query, visualize, and analyze the structural information were generated automatically from CIAO, a retargetable reverse engineering system. Chava is able to process either Java source files or compiled class files, making it possible to analyze remote applets whose source code is unavailable. The information can be combined with HTML analysis tools to track both the static and dynamic content of many web sites. This paper presents our data model for Java and describes the implementation of Chava. Advanced reverse engineering tasks such as reachability analysis, clustering, and program differencing can be built on top of Chava to support design recovery and selective regression testing. In particular, we show how Chava is used to compare several Java Development Kit (JDK) versions to help spot changes that might impact Java developers. Performance numbers indicate that the tool scales well.
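The checksum-based change detection summarized above can be illustrated with a minimal sketch. It is not Chava's implementation: the choice of CRC32, the class name `ChecksumDiff`, the `Change` categories, and the entity names are all assumptions made for illustration. Each entity (a class, method, or field) is keyed by name and carries a checksum of its body; comparing the two maps classifies every entity as added, deleted, changed, or unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical sketch of checksum-based differencing between two versions.
final class ChecksumDiff {
    // Assumed change categories; the paper's own terminology may differ.
    enum Change { ADDED, DELETED, CHANGED, SAME }

    // Checksum of an entity body. CRC32 is an assumption, chosen only
    // because it ships with the standard library.
    static long checksum(String body) {
        CRC32 crc = new CRC32();
        crc.update(body.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Compare two versions, each mapping entity name -> checksum.
    static Map<String, Change> diff(Map<String, Long> oldVersion,
                                    Map<String, Long> newVersion) {
        Map<String, Change> result = new TreeMap<>();
        for (Map.Entry<String, Long> e : oldVersion.entrySet()) {
            Long updated = newVersion.get(e.getKey());
            if (updated == null) {
                result.put(e.getKey(), Change.DELETED);
            } else if (updated.equals(e.getValue())) {
                result.put(e.getKey(), Change.SAME);
            } else {
                result.put(e.getKey(), Change.CHANGED);
            }
        }
        for (String name : newVersion.keySet()) {
            if (!oldVersion.containsKey(name)) {
                result.put(name, Change.ADDED);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Entity names and bodies below are made up for the example.
        Map<String, Long> v1 = new HashMap<>();
        v1.put("Clock.paint", checksum("g.drawString(time, 0, 0);"));
        v1.put("Clock.stop", checksum("timer = null;"));

        Map<String, Long> v2 = new HashMap<>();
        v2.put("Clock.paint", checksum("g.drawString(time, 5, 5);"));
        v2.put("Clock.start", checksum("timer = new Thread(this);"));

        diff(v1, v2).forEach((name, change) ->
            System.out.println(name + ": " + change));
    }
}
```

Because only names and checksums are compared, the same procedure applies whether the entity bodies come from source files or from compiled class files, as long as both versions are processed in the same way.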
1. Introduction
The World Wide Web first started with web servers only presenting static HTML content. Later, Common Gateway Interface (CGI) scripts were introduced to run on web servers to dynamically compose content before presenting it to clients. Recently, Java applets have been used increasingly on web sites to provide rich user interfaces and perform client-side processing to generate dynamic content. While many web site analysis tools [14, 8] are available to analyze the structure of static HTML content, most of them completely ignore the applet code, which by its nature requires software analysis techniques.

Traditional software repositories [29, 30, 7, 13, 3] apply reverse engineering [12] techniques on the source code to build a central information source for maintaining code in a software system. Repositories are useful to developers as they make it possible to efficiently examine the structure and interaction between components of a system without having to delve through potentially hundreds of thousands of lines of source code. Advanced tools have also been built to perform reachability analysis [7], clustering analysis [20], selective regression testing [10] and even extraction of light-weight object models [28, 19].
This paper presents Chava, a reverse engineering and tracking system for Java [1]. The system presented has several noteworthy features:
- Data Model for both Byte Code and Source Code: