A Statistical Package Based on Pnuts Junji NAKANO The Institute of Statistical Mathematics Takeshi FUJIWARA The Graduate University for Advanced Studies Yoshikazu YAMAMOTO Tokushima Bunri University Ikunori KOBAYASHI Tokushima Bunri University


SLIDE 1

A Statistical Package Based on Pnuts

Junji NAKANO, The Institute of Statistical Mathematics
Takeshi FUJIWARA, The Graduate University for Advanced Studies
Yoshikazu YAMAMOTO, Tokushima Bunri University
Ikunori KOBAYASHI, Tokushima Bunri University

JAPAN

SLIDE 2

Overview

We are developing the Jasp (Java based statistical processor) system

Outline of my talk

  • 1. Motivations of making Jasp
  • 2. Jasp language
  • 3. Data based GUI
  • 4. Client/server (C/S) approach and distributed computing
  • 5. Foreign language interface
  • 6. Conclusion

SLIDE 3

Existing statistical systems

We have a lot of statistical systems, which

  • have a long history
  • are reliable
  • are sophisticated, adopting new technologies
      • GUI
      • Internet
  • are not so expensive (some are free!)

However, we want to have a new one. Why?

  • 1. Motivation of making Jasp

SLIDE 4

Reasons for new system (1)

We need a fully controllable system

  • The basic structure may have to be changed to realize new ideas in statistical computing
  • We have to understand the details of the system
  • We want to be free from the original author(s)

SLIDE 5

Reasons for new system (2)

Need for a newly designed system

  • In many available systems, new functions were added on top of their original design
  • As computers have been changing rapidly, a new design will be better suited to new technologies

SLIDE 6

Reasons for new system (3)

We (ISM) need a new statistical system

  • Many Fortran programs exist at ISM
      • They are not easy to use
  • Development of data mining
      • Requirement for a general purpose system
      • Statistical models using both mathematics and algorithms
  • We should have the know-how for making a large system

SLIDE 7

Tool we selected: Java language (1)

Java adopts many new technologies in a well organized way

  • Platform independent
  • Purely object oriented
  • Good libraries
      • Network (TCP/IP)
      • GUI support and graphics
  • Interface to other languages (Java Native Interface, JNI)

  • 2. Jasp language

SLIDE 8

Tool we selected: Java language (2)

  • Advanced network functions
      • Web based execution (Applet)
      • Using remote objects (Remote Method Invocation, RMI)
  • Security
  • Demerits
      • slow speed
      • politically unstable
  • Java is easy to use, at least compared with before
  • Merits are bigger than demerits

SLIDE 9

Statistical language

  • Function based languages (e.g. S, XploRe)
      • are intuitive and flexible
      • are not good at bundling similar notions
  • Object oriented languages (e.g. Java)
      • are good at arranging notions
      • are not easy to use tentatively
  • Both function based and object oriented abilities are required
      • Function based abilities are for interactive use
      • Object oriented abilities are for reusing programs

SLIDE 10

Jasp is based on Pnuts

Pnuts is a function based script language interpreter written in Java

  • Simple syntax
  • Easy access to Java classes
  • Built-in language extension facility, with examples for object oriented syntax
  • Pnuts -> Java translator
  • Source is freely available: http://javacenter.sun.co.jp/pnuts/

SLIDE 11

Statistical parts we used

  • “Jampack” for matrix manipulation
      • ftp://math.nist.gov/pub/Jampack/Jampack/AboutJampack.html
  • “Ptplot” for graphics
      • http://ptolemy.eecs.berkeley.edu/java/ptplot/
  • “Colt” for statistical distributions and random numbers
      • http://nicewww.cern.ch/hoschek/colt/

SLIDE 12

Example: Jasp functions

$b = (X'X)^{-1} X'y$

    function ols(y, x){
      coeff = (x.trans * x).inv * x.trans * y
      return coeff
    }

$\hat{y} = Xb$

    function forecast(x, b){
      y_hat = x * b
      return y_hat
    }
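The two Jasp functions on this slide can be mirrored in plain Java. What follows is a minimal sketch, not the Jasp or Jampack implementation: the class name `OlsSketch`, the array-based signatures, and the Gauss-Jordan solver are assumptions made for illustration. It solves the normal equations $(X'X) b = X'y$ directly instead of forming the inverse.

```java
// Illustrative sketch only; OlsSketch is a hypothetical name, not part of Jasp.
final class OlsSketch {

    /** Returns b solving (X'X) b = X'y, using Gauss-Jordan elimination with partial pivoting. */
    static double[] ols(double[][] x, double[] y) {
        int n = x.length;
        int p = x[0].length;
        double[][] a = new double[p][p + 1]; // augmented matrix [X'X | X'y]
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++) {
                for (int k = 0; k < n; k++) a[i][j] += x[k][i] * x[k][j];
            }
            for (int k = 0; k < n; k++) a[i][p] += x[k][i] * y[k];
        }
        for (int col = 0; col < p; col++) {
            // Pick the row with the largest entry in this column as pivot.
            int pivot = col;
            for (int r = col + 1; r < p; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < 1e-12) {
                throw new ArithmeticException("X'X is singular");
            }
            // Eliminate this column from every other row.
            for (int r = 0; r < p; r++) {
                if (r == col) continue;
                double f = a[r][col] / a[col][col];
                for (int c = col; c <= p; c++) a[r][c] -= f * a[col][c];
            }
        }
        double[] b = new double[p];
        for (int i = 0; i < p; i++) b[i] = a[i][p] / a[i][i];
        return b;
    }

    /** Returns y_hat = X b. */
    static double[] forecast(double[][] x, double[] b) {
        double[] yHat = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < b.length; j++) yHat[i] += x[i][j] * b[j];
        }
        return yHat;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 0}, {1, 1}, {1, 2}, {1, 3}}; // intercept column plus one regressor
        double[] y = {1, 3, 5, 7};                       // generated from y = 1 + 2 * t
        System.out.println(java.util.Arrays.toString(ols(x, y)));
    }
}
```

The Jasp version is far shorter because the matrix type supplies `trans`, `inv`, and `*`; the Java sketch has to spell those operations out.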

SLIDE 13

Example: Jasp class

    jaspclass LinearRegression {
      method LinearRegression(y, x){              // constructor
        this.beta = ols(y,x)                      // slot
        this.forecast = forecast(x,this.beta)     // slot
      }
      function ols(y, x){
        coeff = (x.trans * x).inv * x.trans * y
        return coeff
      }
      function forecast(x, b){
        y_hat = x * b
        return y_hat
      }
    }

Note that the functions are not modified
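For comparison, the same bundling can be sketched as an ordinary Java class: the constructor fills two fields (the counterparts of the slots) by calling static functions that could equally be used on their own. The class name `SimpleRegressionSketch` is hypothetical, and the model is restricted to an intercept plus one regressor so that the closed-form formulas keep the example short.

```java
// Illustrative sketch; class and method names are hypothetical, not part of Jasp.
final class SimpleRegressionSketch {
    final double[] beta;   // counterpart of the "beta" slot: {intercept, slope}
    final double[] fitted; // counterpart of the "forecast" slot

    // Constructor: stores the results of the two functions, which stay unchanged.
    SimpleRegressionSketch(double[] y, double[] x) {
        this.beta = ols(y, x);
        this.fitted = forecast(x, this.beta);
    }

    /** Least squares fit of y = b0 + b1 * x (closed form for one regressor). */
    static double[] ols(double[] y, double[] x) {
        int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i] / n;
            meanY += y[i] / n;
        }
        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        return new double[] {meanY - slope * meanX, slope};
    }

    /** Fitted values b0 + b1 * x. */
    static double[] forecast(double[] x, double[] b) {
        double[] yHat = new double[x.length];
        for (int i = 0; i < x.length; i++) yHat[i] = b[0] + b[1] * x[i];
        return yHat;
    }

    public static void main(String[] args) {
        SimpleRegressionSketch fit =
                new SimpleRegressionSketch(new double[] {1, 3, 5, 7}, new double[] {0, 1, 2, 3});
        System.out.println(java.util.Arrays.toString(fit.beta));
    }
}
```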

SLIDE 14

Graphical user interface

  • Recently, almost all software products are operated by GUIs
  • Most statistical systems also have GUIs
      • Show lists of functions as menus
      • Generate beautiful graphs by mouse operations

  • 3. Data based GUI

SLIDE 15

Character user interface

  • We need to manipulate data and models in complex ways
      • These operations are difficult to define through a GUI
      • They are properly described by programs
  • CUIs are easy to use for professional or frequent users

SLIDE 16

Mixed user interface

  • Both GUI and CUI are required for statistical systems
  • We propose a “mixed user interface”
      • GUI and CUI can be used almost independently
      • They can be used together seamlessly and alternately

SLIDE 17

The history of the analysis

  • We need trial and error to find appropriate models for data
  • Recording the history of the analysis is important
  • We implement it in our GUI in a data based way

SLIDE 18

User interface of Jasp

We develop the user interface of Jasp as an example of the mixed user interface

  • Jasp has two windows, one for GUI and one for CUI
  • Operations on one window are automatically recorded on the other window in appropriate forms
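One way to picture the recording from the GUI side to the CUI side is a small recorder that turns each mouse-driven operation into the command text a user would have typed. This is only a sketch under assumptions: `MixedUiSketch`, `guiInvoke`, and `cuiHistory` are hypothetical names, and the slides do not say how Jasp implements this internally.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; all names are hypothetical, not Jasp's actual classes.
final class MixedUiSketch {
    // Command lines as they would appear in the CUI window.
    private final List<String> cuiHistory = new ArrayList<>();

    /**
     * Called when a function is run from the GUI with the given argument icons.
     * Builds the equivalent command text and records it for the CUI window.
     */
    String guiInvoke(String function, String... argumentIcons) {
        String command = function + "(" + String.join(", ", argumentIcons) + ")";
        cuiHistory.add(command);
        return command;
    }

    /** Returns a copy of the recorded commands, oldest first. */
    List<String> cuiHistory() {
        return new ArrayList<>(cuiHistory);
    }

    public static void main(String[] args) {
        MixedUiSketch ui = new MixedUiSketch();
        ui.guiInvoke("ols", "y", "x");
        System.out.println(ui.cuiHistory());
    }
}
```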

SLIDE 19

Roles of CUI

CUI is an environment for writing and executing programs in the Jasp language

  • We can edit functions in the upper window
  • We can execute commands in the lower window and get results as text

SLIDE 20

The CUI window

[Screenshot of the CUI window, with labels: Pull-down menus; Buttons; Editor area; Input and output area]

SLIDE 21

Roles of GUI

  • The GUI window is another environment for operating Jasp, mainly using the mouse
      • All the functions and statistical objects are listed on the GUI window and can be used by mouse operations
      • Data are displayed as icons, which are arranged to express the history of the analysis
  • Almost all operations can be performed through the GUI window
      • Exceptions: defining new functions and new statistical objects
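The idea of icons arranged to express the analysis history can be sketched as a tree in which every data object remembers the operation that produced it and the object it was derived from. The names below (`HistorySketch`, `Node`, `lineage`) are hypothetical, and the real Jasp data structure may well differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; names are hypothetical, not Jasp's actual classes.
final class HistorySketch {

    /** One icon: a data object, the operation that created it, and its source. */
    static final class Node {
        final String name;
        final String operation;
        final Node parent; // null for data loaded from outside
        final List<Node> children = new ArrayList<>();

        Node(String name, String operation, Node parent) {
            this.name = name;
            this.operation = operation;
            this.parent = parent;
            if (parent != null) parent.children.add(this);
        }

        /** Path from the original data to this object. */
        String lineage() {
            return parent == null ? name : parent.lineage() + " -> " + name;
        }
    }

    public static void main(String[] args) {
        Node raw = new Node("raw", "load", null);
        Node logged = new Node("logged", "log(raw)", raw);
        Node fit = new Node("fit", "ols(logged)", logged);
        System.out.println(fit.lineage());
    }
}
```

Keeping the parent link on each object means the history needs no separate log: drawing the icons amounts to walking the tree.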

SLIDE 22

The GUI window

[Screenshot of the GUI window, with labels: Pull-down menu; Icons; Popup menu; Functions, methods; Graphs; Statistics]

SLIDE 23

Pop-up menu

  • Situation sensitive menu
      • Menu items are arranged according to the specified object and its state
      • Executed operations depend on the specified object and its state
  • Menu items are the applicable methods of the selected object
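Since Jasp runs on Java, one plausible way to build such a menu is reflection: ask the selected object's class for its public methods and show their names. The sketch below only illustrates that idea; `PopupMenuSketch` and `DataSet` are hypothetical, and the slides do not state that Jasp uses reflection for this.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch; names are hypothetical, not Jasp's actual classes.
final class PopupMenuSketch {

    /** Returns the sorted names of the public methods declared by the object's class. */
    static List<String> menuItemsFor(Object selected) {
        List<String> items = new ArrayList<>();
        for (Method m : selected.getClass().getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                items.add(m.getName());
            }
        }
        Collections.sort(items);
        return items;
    }

    /** A stand-in for a statistical object selected in the GUI. */
    static final class DataSet {
        private final double[] values;

        DataSet(double[] values) {
            this.values = values.clone();
        }

        public double mean() {
            double sum = 0;
            for (double v : values) sum += v;
            return sum / values.length;
        }

        public double max() {
            double m = values[0];
            for (double v : values) m = Math.max(m, v);
            return m;
        }
    }

    public static void main(String[] args) {
        System.out.println(menuItemsFor(new DataSet(new double[] {1, 2, 3})));
    }
}
```

A state-sensitive menu would add a filter on the object's current state before an item is shown; that part is left out here.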

SLIDE 24

Executing functions and methods

  • We can execute functions and methods in the lower left GUI window by mouse click
  • If the function or method requires some arguments, a window for specifying them is displayed
  • We can drag and drop icons from the upper left GUI window onto the arguments of the function or method

SLIDE 25

Client/server approach of Jasp

  • The Jasp system consists of a server and a client
      • The user interface runs as the client program
  • The Jasp client can be invoked both as an application and as a Java applet from Web browsers
  • Jasp runs on many platforms supported by Java virtual machines

  • 4. C/S approach and distributed computing

SLIDE 26

Client/server of Jasp

[Diagram: the client (GUI and CUI) and the server (calculation program) exchange messages by RMI]
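The client/server split by RMI can be illustrated with the standard `java.rmi` API. This is a minimal sketch, not Jasp's protocol: the interface `CalcService`, its `mean` method, the binding name "calc", and the use of port 1099 on localhost are all assumptions chosen for the example, and server and client run in one JVM only to keep it self-contained.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Illustrative sketch; the service and its names are hypothetical, not Jasp's API.
final class JaspRmiSketch {

    /** Remote interface: the operations a client may request from the server. */
    public interface CalcService extends Remote {
        double mean(double[] values) throws RemoteException;
    }

    /** Server-side calculation program. */
    static final class CalcServiceImpl implements CalcService {
        @Override
        public double mean(double[] values) {
            double sum = 0;
            for (double v : values) sum += v;
            return sum / values.length;
        }
    }

    public static void main(String[] args) throws Exception {
        // Server side: export the object and publish its stub in a registry.
        CalcServiceImpl impl = new CalcServiceImpl();
        CalcService stub = (CalcService) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(1099); // assumes the port is free
        registry.rebind("calc", stub);

        // Client side: look the service up and call it like a local object.
        CalcService remote =
                (CalcService) LocateRegistry.getRegistry("localhost", 1099).lookup("calc");
        System.out.println(remote.mean(new double[] {1, 2, 3}));

        // Unexport both objects so the JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
    }
}
```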

SLIDE 27

Distributed computing in statistics

  • Need for distributed computing
      • We can use many powerful computers connected by a network
      • Computer intensive statistical techniques are popular
          • simulation, resampling
          • maximization of complex likelihood functions
  • Existing distributed computing technologies are not easy to use for statisticians
      • MPI (Message Passing Interface)
      • PVM (Parallel Virtual Machine)

SLIDE 28

Functions for distributed computing in Jasp

  • remotes(commands)
      • commands: ArrayList of commands
      • Execute commands simultaneously
  • remote(command,server)
      • Execute command on server
  • send(var1,var2,server)
      • var1: variable on local computer
      • var2: variable on server
      • Execute “var2 = var1” on server

SLIDE 29

Example of distributed computing

    cmds = ArrayList()
    cmds.add("mean1 = ranMean(50000)")
    cmds.add("mean2 = remote(\"ranMean(50000)\",\"Serv2\")")
    cmds.add("mean3 = remote(\"ranMean(50000)\",\"Serv3\")")
    remotes(cmds)
    mean = (mean1 + mean2 + mean3) / 3

ranMean(n): Function to generate n random numbers and return their mean value
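The same split-and-combine pattern can be sketched in plain Java with local threads standing in for the remote servers. This is an analogy, not the Jasp implementation: `RemotesSketch` and `parallelMean` are hypothetical names, `ranMean` takes an extra seed argument (an assumption, added so that runs are reproducible), and the random numbers are assumed to be uniform on [0, 1).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch; threads stand in for the servers Serv2 and Serv3.
final class RemotesSketch {

    /** Generates n uniform random numbers and returns their mean. */
    static double ranMean(int n, long seed) {
        Random random = new Random(seed);
        double sum = 0;
        for (int i = 0; i < n; i++) sum += random.nextDouble();
        return sum / n;
    }

    /** Runs ranMean on several workers at once and averages the results. */
    static double parallelMean(int n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Double>> cmds = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                final long seed = i + 1;
                cmds.add(() -> ranMean(n, seed));
            }
            double sum = 0;
            // invokeAll returns only after every command has finished.
            for (Future<Double> result : pool.invokeAll(cmds)) sum += result.get();
            return sum / workers;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("a worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelMean(50000, 3));
    }
}
```

Averaging the three partial means is valid here because every worker draws the same number of values.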

SLIDE 30

Foreign language interface

  • We have many programs written in Fortran, C, and C++. For example, the TIMSAC (TIMe Series Analysis and Control) package is a time series analysis program written in Fortran at ISM
  • Jasp has an interface to such programs and uses them in the same way as other Java libraries
  • We implement this using JNI (Java Native Interface)

  • 5. Foreign language interface

SLIDE 31

JNI (Java Native Interface)

  • JNI is a Java API (Application Programming Interface) to interface Java with other languages (native code)
  • Using JNI, we can call native code from Java (and call Java code from native code)
  • This is useful to reuse existing programs and to increase processing speed

[Diagram: Java code and native code, connected by JNI]

SLIDE 32

Implementation of interface to TIMSAC programs

  • Build a DLL (Dynamic Link Library) or shared library from the Fortran programs (and the C programs that wrap them)
  • Create a Java class that accesses the DLL or shared library using JNI
  • Use this Java class from Jasp

[Diagram: Jasp uses the Java class, which calls through JNI into the TIMSAC DLL (Windows) or the TIMSAC shared library (Unix)]
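The Java side of these steps might look roughly like the sketch below. Everything specific in it is an assumption: the library name `timsac_jni`, the native method `autocovNative`, and the choice of a sample autocovariance as the wrapped routine are invented for illustration and are not taken from TIMSAC or Jasp. A pure Java fallback is included so the class still runs when no native library is installed.

```java
// Illustrative sketch; library, class, and method names are hypothetical.
final class TimsacBridgeSketch {
    // Would resolve to timsac_jni.dll on Windows or libtimsac_jni.so on Unix.
    private static final boolean NATIVE_LOADED = tryLoad("timsac_jni");

    private static boolean tryLoad(String library) {
        try {
            System.loadLibrary(library);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // library not installed: use the Java fallback
        }
    }

    // Hypothetical entry point, to be implemented by the C wrapper around Fortran.
    private static native double[] autocovNative(double[] series, int maxLag);

    /** Sample autocovariances for lags 0..maxLag, with divisor n. */
    static double[] autocovariance(double[] series, int maxLag) {
        if (NATIVE_LOADED) {
            return autocovNative(series, maxLag);
        }
        int n = series.length;
        double mean = 0;
        for (double v : series) mean += v / n;
        double[] c = new double[maxLag + 1];
        for (int lag = 0; lag <= maxLag; lag++) {
            for (int t = 0; t + lag < n; t++) {
                c[lag] += (series[t] - mean) * (series[t + lag] - mean);
            }
            c[lag] /= n;
        }
        return c;
    }

    public static void main(String[] args) {
        double[] c = autocovariance(new double[] {1, 2, 3, 4}, 1);
        System.out.println(java.util.Arrays.toString(c));
    }
}
```

Callers see only `autocovariance`, so from Jasp the wrapped routine would look like any other Java method, which is the point made on the slide.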

SLIDE 33

Conclusion

Jasp (Java based statistical processor)

  • Function based and object oriented language
  • New GUI
      • Connected tightly to CUI
      • Express the analysis history by icons
  • Distributed computing
      • Client/server
      • Web browser
      • Parallel processing
  • Use Fortran or C/C++ programs

Our web site: http://jasp.ism.ac.jp/ (still under construction!!)

  • 6. Conclusion