Implementing Legacy Implementing Legacy Statistical Algorithms in a - - PowerPoint PPT Presentation
Implementing Legacy Implementing Legacy Statistical Algorithms in a - - PowerPoint PPT Presentation
Implementing Legacy Implementing Legacy Statistical Algorithms in a Statistical Algorithms in a Spreadsheet Environment Spreadsheet Environment Stephen W. Liddle Liddle John S. Lawson Stephen W. John S. Lawson Information Systems Faculty
Overview Overview
- Introduction
Introduction
- Fundamentals of VBA in Excel
Fundamentals of VBA in Excel
- Retargeting traditional algorithms to a
Retargeting traditional algorithms to a spreadsheet environment spreadsheet environment
- Converting FORTRAN to VBA
Converting FORTRAN to VBA
- Conclusions
Conclusions
Why Convert FORTRAN Programs to Run Why Convert FORTRAN Programs to Run in a Spreadsheet Environment? in a Spreadsheet Environment?
- Useful code available that is not
Useful code available that is not implemented in standard statistical implemented in standard statistical packages packages
- FORTRAN compilers not usually available
FORTRAN compilers not usually available
- n normal Windows workstation
- n normal Windows workstation
- Many textbooks refer to published
Many textbooks refer to published FORTRAN algorithms FORTRAN algorithms
Sources for Published FORTRAN Sources for Published FORTRAN Algorithms Algorithms
- STATLIB (
STATLIB (http:// http://lib.stat.cmu.edu lib.stat.cmu.edu/ /) )
- General Archive
General Archive
- Applied Statistics Archive
Applied Statistics Archive
- Journal of Quality Technology Archive
Journal of Quality Technology Archive
- JASA Software Archive
JASA Software Archive
- JCGS Archive
JCGS Archive
Advantages of Running Legacy Advantages of Running Legacy FORTRAN Code in Excel FORTRAN Code in Excel
- Comfortable environment for practitioners
Comfortable environment for practitioners
- More user friendly input from spreadsheet
More user friendly input from spreadsheet
- Output to spreadsheet allows further
Output to spreadsheet allows further graphical and computational analysis of graphical and computational analysis of results with Excel functions results with Excel functions
VDG Inputs: Design X1 X2 X3 X4 X5 X6 Nickname: 1-hybrid 2.3094 Runs: 30
- 1
- 1
- 1
- 1
- 1
0.57735 Factors: 6 1 1
- 1
- 1
- 1
0.57735 Model Order(1/2): 2 1
- 1
1
- 1
- 1
0.57735 Design Region(S/C): s
- 1
1 1
- 1
- 1
0.57735 Weight by N (Y/N): y 1
- 1
- 1
1
- 1
0.57735 Number of Radii (20-99): 20
- 1
1
- 1
1
- 1
0.57735 Scaling Unit (suggest 1): 1
- 1
- 1
1 1
- 1
0.57735 Design Radius/Region Radius: 1 1 1 1 1
- 1
0.57735 1
- 1
- 1
- 1
1 0.57735
- 1
1
- 1
- 1
1 0.57735
- 1
- 1
1
- 1
1 0.57735 1 1 1
- 1
1 0.57735
- 1
- 1
- 1
1 1 0.57735 1 1
- 1
1 1 0.57735 1
- 1
1 1 1 0.57735
- 1
1 1 1 1 0.57735 2
- 1.1547
- 2
- 1.1547
2
- 1.1547
Run VDG
Proposed Methodology Proposed Methodology
- Understand original FORTRAN program
Understand original FORTRAN program
- Choose suitable I/O methods
Choose suitable I/O methods
- Convert original FORTRAN code to VBA
Convert original FORTRAN code to VBA
- Test and use resulting Excel code
Test and use resulting Excel code
Visual Basic For Applications Visual Basic For Applications
- Built on ANSI BASIC
Built on ANSI BASIC
- Language engine of Microsoft Office
Language engine of Microsoft Office
- Modern structured programming language
Modern structured programming language
- Has vast array of types, functions,
Has vast array of types, functions, programming helps programming helps
- Powerful support environment (Office platform)
Powerful support environment (Office platform)
- Popular in business contexts
Popular in business contexts
Excel Object Model Excel Object Model
- Objects in Excel are
Objects in Excel are addressable in VBA addressable in VBA
- Each object has:
Each object has:
- Properties
Properties
- Methods
Methods
Application Workbooks (Workbook) Range Chart Worksheets (Worksheet)
Clicking these buttons runs the ORPS1 and ORPS2 algorithms. Input Region Output Region
Input/Output Methods Input/Output Methods
- Non
Non-
- interactive
interactive
Files, databases
Files, databases
Worksheet cells
Worksheet cells
- Interactive
Interactive
Message boxes
Message boxes
Input boxes
Input boxes
Custom GUI forms
Custom GUI forms
FORTRAN vs. VBA FORTRAN vs. VBA
- VBA:
VBA:
( (-
- b+Sqr
b+Sqr (b^ 2 (b^ 2-
- 4*a*c))/(2*a)
4*a*c))/(2*a)
- FORTRAN:
FORTRAN: (
(-
- b+SQRT(b
b+SQRT(b**2 **2-
- 4*a*c))/(2*a)
4*a*c))/(2*a)
a ac b b 2 4
2 −
± − a ac b b 2 4
2 −
± −
More Operators More Operators
- .EQ.
.EQ. = =
- .NE.
.NE. <> <>
- .LT.
.LT. < <
- .LE.
.LE. <= <=
- .GT.
.GT. > >
- .GE.
.GE. >= >=
- .AND.
.AND. And And
- .OR.
.OR. Or Or
- .NOT.
.NOT. Not Not
- //
// & &
Data Types Data Types
- INTEGER
INTEGER Byte, Integer, Long Byte, Integer, Long
- REAL
REAL Single Single
- DOUBLE PRECISION
DOUBLE PRECISION Double Double
- COMPLEX
COMPLEX Non Non-
- primitive in VBA
primitive in VBA
- LOGICAL
LOGICAL Boolean Boolean
- CHARACTER
CHARACTER String String
- CHARACTER*
CHARACTER*length length String* String*length length
- Other notable VBA types:
Other notable VBA types:
Currency, Decimal, Date, Variant
Currency, Decimal, Date, Variant
Worksheet Functions Worksheet Functions
- ChiDist(x,deg_freedom)
Returns one-tailed probability of the ?2 distribution.
- Correl(array1,array2)
Returns the correlation coefficient of two cell ranges.
- Fisher(x)
Returns the Fisher transformation at a given x.
- Pearson(array1,array2)
Returns the Pearson product moment correlation coefficient for two sets.
- Quartile(array,quart)
Returns the requested quartile of a data set.
- StDev(array)
Returns the standard deviation of a data set.
- ZTest(array,x,sigma)
Returns the two-tailed P-value of a z-test.
Flow Flow-
- Control Statements
Control Statements
If expr1 Then stmt1 ElseIf expr2 Then stmt2 … Else stmtn EndIf IF (expr1) THEN stmt1 ELSE IF (expr2) THEN stmt2 … ELSE stmtn END IF Block if If expr Then stmt IF (expr) stmt Logical if VBA FORTRAN
Subtle Differences (“ Subtle Differences (“Gotchas Gotchas”) ”)
- Implicit conversion of real to integer values
Implicit conversion of real to integer values
- FORTRAN: truncate
FORTRAN: truncate
- VBA: round
VBA: round
- Solution: use
Solution: use VBA’s VBA’s Fix(), which truncates Fix(), which truncates
- Both languages allow implicit typing
Both languages allow implicit typing
- This introduces ambiguity
This introduces ambiguity
- Solution: supply explicit types everywhere
Solution: supply explicit types everywhere
Eliminating Eliminating Goto Goto Statements Statements
- Computer science accepts the axiom that
Computer science accepts the axiom that goto goto is generally “considered harmful” is generally “considered harmful”
- We advocate rewriting
We advocate rewriting alogrithms alogrithms to use to use structured programming techniques where structured programming techniques where feasible feasible
- Sine qua non
Sine qua non is “make it work” is “make it work”
- It’s a good idea for maintainability,
It’s a good idea for maintainability, understandability to move to structured form understandability to move to structured form
Eliminating Eliminating Goto Goto Statements Statements
DO 8 J=1,3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 IF(J.EQ.3) GO TO 8 XK=BESTK-STEP 8 CONTINUE
Eliminating Eliminating Goto Goto Statements Statements
For j=1 To 3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 IF(J.EQ.3) GO TO 8 XK=BESTK-STEP 8 Next j
Eliminating Eliminating Goto Goto Statements Statements
For j=1 To 3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 If j <> 3 Then xk = bestk - step End If Next j
Eliminating Eliminating Goto Goto Statements Statements
For j=1 To 3 ... Do Until objfn > bestfn ... Loop If j <> 3 Then xk = bestk - step End If Next j
Our Reasoning Our Reasoning
- Digital assets are fragile
Digital assets are fragile
- FORTRAN is not universally available
FORTRAN is not universally available
- Excel is a ubiquitous, powerful platform
Excel is a ubiquitous, powerful platform
- VBA is a full
VBA is a full-
- featured language capable of
featured language capable of handling sophisticated statistical handling sophisticated statistical computations computations
Conclusions Conclusions
- We recommend creating a Web
We recommend creating a Web-
- based
based repository of Excel/VBA implementations repository of Excel/VBA implementations
- f classic statistical algorithms
- f classic statistical algorithms
- We can preserve our legacy algorithms in
We can preserve our legacy algorithms in this modern spreadsheet environment this modern spreadsheet environment
- E
E-
- mail us if you want a copy of our
mail us if you want a copy of our manuscript ( manuscript (liddle liddle or
- r lawson@byu.edu