CSE 3341: Principles of Programming Languages
Core Concepts I
Jeremy Morris
Imperative Language Basics
The model of most imperative languages is straightforward:
Programs are a sequence of expressions or statements
Expressions read values from memory and produce a new value
Some expressions may have side effects that change values in memory
Statements read values from memory and do not produce new values
All statements operate by side effect
Assignment statements are the most basic example
Core idea: expressions are evaluated, statements are executed
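The evaluated/executed split can be seen directly in Python, whose built-in eval accepts only expressions while exec runs statements. This is a small illustrative sketch; the variable names are made up for the example.

```python
namespace = {}

# An expression is evaluated: it produces a value.
value = eval("1 + 2", namespace)

# A statement is executed: exec returns None and works only by side effect,
# here by storing a binding for x in the namespace.
result = exec("x = 1 + 2", namespace)

# In Python an assignment is a statement, not an expression, so eval rejects it.
try:
    eval("x = 3", namespace)
    rejected = False
except SyntaxError:
    rejected = True
```

Note that this is Python's choice. In C, C++ and Java assignment is itself an expression with a side effect.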
Imperative Language Basics
Early languages sought to break away from being machine dependent
Assembly languages are machine dependent
These language designers sought a higher level of abstraction
Separation of function from the details of machine implementation
Or separation of function from implementation details, period
(Sound familiar? It should. Think client vs. implementer)
Key to this idea are names
Strings of symbols used to refer to "things" in a programming language
Variables, procedures, constants, etc.
Binding and Binding Time
Binding is an association between two things
Typically we use it to mean binding a name to a target (or object, in the textbook's terminology)
Binding time is when a binding occurs
A few important ones:
Compile time
Compiler chooses how to bind high-level constructs to machine code
Link time
When separately compiled code, such as library code, is joined together by a linker
Load time
When the OS loads a program into memory
Run time
The entire span of execution of a program from beginning to end
Static vs. Dynamic
Static bindings are made before run time, dynamic bindings during run time
Object lifetime
Lifetime of an object
The period of time between when an object is created and when it is destroyed
Lifetime of a binding
The period of time between when a binding between a name and an object is created and when it is destroyed
Object lifetimes generally fall into one of three types based on how they are allocated
Static objects
Stack objects
Heap objects
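The difference between the lifetime of an object and the lifetime of a binding can be sketched in Python, where several names may be bound to one object. The names below are hypothetical and only illustrate the two definitions above.

```python
numbers = [1, 2, 3]     # an object is created and the name numbers is bound to it
alias = numbers         # a second binding to the same object
del numbers             # the binding 'numbers' ends here; the object lives on

# The old name is gone, but the object is still reachable through alias.
try:
    numbers
    binding_gone = False
except NameError:
    binding_gone = True
```

Here the object outlives one of its bindings. The reverse case, a binding that outlives its object, is what produces dangling references in languages with manual deallocation.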
Static Allocation
An absolute memory location retained through the program's execution
Typically global variables
Also (in some languages) variables that are local to a subroutine but retain value from one invocation to the next
Numeric and string-valued constant literals
The lifetime of a statically allocated variable is the entire program
Local variables are not typically statically allocated in modern languages
If a language doesn't have recursion, local variables can be statically allocated
Fortran as originally designed, for example
Recursion means we need to do something else
Stack allocation can help us solve this problem
Static local variable in C
#include <stdio.h>

void adder() {
    static int x = 0;   /* initialized once, keeps its value between calls */
    x++;
    printf("%d\n", x);
}

int main(int argc, char *argv[]) {
    adder(); // prints 1
    adder(); // prints 2
    adder(); // prints 3
    return 0;
}
Example courtesy Wikipedia: https://en.wikipedia.org/wiki/Static_variable
Stack Allocation
Image courtesy Programming Language Pragmatics p. 118
Heap Allocation
The heap is free memory allocated to the program by the OS
The programmer can allocate it however they want
Uses operators or library functions like new (Java/C++) or malloc (C) to allocate memory from the heap for the program
Uses delete (C++) or free (C) to free it up
Data in the heap is accessed via references (or pointers)
Addresses in the heap where data is stored
new and malloc both return a reference (pointer) to the memory they allocate
The lifetime of a heap allocated object is controlled by the programmer
May be as long or as short as the programmer needs
C program running in memory
Example courtesy Wikipedia: https://en.wikipedia.org/wiki/Data_segment
Scope
The region in the code where a binding is active is its scope
Scope can be determined statically or dynamically
Almost all modern languages have static scoping
Dynamic scoping comes with a run-time cost
Static scoping is also known as lexical scoping
Because we can determine it from the code at compile time
Just by reading the source, we can understand the scope of every variable
Dynamic scoping can only be determined at run-time
Static/Lexical Scope
Probably what you think of when you think of scope
Java, C, C++, Python, Pascal …
Most modern languages use lexical scope
Typically the current binding between a name and a target is based on the "block" of code that the name appears in
Scope can often be nested: variables in outer blocks can be accessed from within inner blocks (but not vice versa)
Static/Lexical Scope – Java Example
public class Sample {
    private static int avg = 33;

    private static int average(int[] list) {
        int x = avg;      // refers to the static field avg (33)
        int avg = 0;      // local avg hides the field from here on
        for (int i = 0; i < list.length; i++) {
            int y = list[i];
            avg += y;
        }
        if (list.length > 0) {
            avg /= list.length;
        }
        return avg;
    }

    public static void main(String[] args) {
        int[] list = { 2, 4, 6, 8, 10 };
        int avg = average(list);
        System.out.println("The average is: " + avg);
    }
}
Static/Lexical Scope – C++ Example
#include <iostream>
using namespace std;

int avg = 33;

double average(int list[], int size) {
    double x = avg;          // refers to the global avg (33)
    double avg = 0;          // local avg hides the global from here on
    for (int i = 0; i < size; i++) {
        int x = list[i];     // inner x hides the outer x inside the loop
        avg += x;
    }
    if (size > 0) {
        avg /= size;
    }
    return avg;
}

int main() {
    int list[5] = { 2, 4, 6, 8, 10 };
    double avg = average(list, 5);
    cout << "The average is: " << avg << endl;
    return 0;
}
Static/Lexical Scope – Python Example
avg = 33

def average(list):
    avg = 0
    for x in list:
        avg += x
    if ( len(list) > 0 ):
        avg /= len(list)
    return avg

def foo():
    local = 10
    def bar():
        local = 5
        print("Local2: "+str(local))
    bar()
    print("Local1: "+str(local))

print("The average is: "+str(avg))
list = [2,4,6,8,10]
avg = average(list)
print("The average is: "+str(avg))
Static/Lexical Scope – C/C++
In Java, you can refer to methods before you declare them
Scope is the entire block they are declared in
In C/C++, you cannot
Scope is only after they've been declared
This is a problem for mutually recursive types and subroutines
How can you refer to something if it isn't in scope?
C/C++ solve this problem by separating declaration from definition
Only the declaration has to exist before you can use it
The definition can come after you've used it
Static/Lexical Scope – C/C++
bool isEven(int x);
bool isOdd(int y);

bool isEven(int x) {
    if (x == 0) { return true; }
    else { return isOdd(x-1); }
}

bool isOdd(int x) {
    if (x == 0) { return false; }
    else { return isEven(x-1); }
}
Dynamic scope
Bindings between names and targets can only be determined at run-time
Depending on the flow of control of the program, bindings may be different
You can't tell from reading the code exactly what the scope of a variable is
It will depend on what was executed before the code reached that point
If you are used to static scoping, this idea may seem very odd
Dynamic scope example
int n

void first():
    n = 1

void second():
    int n
    first()

main():
    n = 2
    int x = readInteger()
    if (x > 0) then
        second()
    else
        first()
    println(n)
Example courtesy Michael Scott – Programming Language Pragmatics
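The example can be simulated in Python to compare the two scoping rules. This is a rough sketch rather than how a real implementation works: run, frames and find_frame are hypothetical names introduced here, and the integer that the program would read is passed in as the parameter x.

```python
def run(x, dynamic):
    """Simulate the example; return what println(n) would print."""
    global_frame = {"n": None}          # the global 'int n'
    frames = [global_frame]             # run-time stack of active frames

    def find_frame(name):
        if dynamic:
            # Dynamic scope: search the most recently entered frames first.
            for frame in reversed(frames):
                if name in frame:
                    return frame
        # Static scope: first() is declared at top level, so the n it
        # mentions can only be the global n.
        return global_frame

    def first():
        frames.append({})               # first() declares no locals
        find_frame("n")["n"] = 1        # n = 1
        frames.pop()

    def second():
        frames.append({"n": None})      # second() declares its own n
        first()
        frames.pop()

    global_frame["n"] = 2               # main: n = 2
    if x > 0:
        second()
    else:
        first()
    return global_frame["n"]            # main: println(n)
```

Under static scoping the result is 1 for any input. Under dynamic scoping a positive input gives 2, because first() assigns to the n declared by second(), which is the most recent active binding.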
Scope implementation
Scope rules in a compiler are implemented using the symbol table
If everything is global, it's just a mapping of symbols to internal keys
Think about the recursive descent algorithm discussed in class
Scope rules require some extra complexity
Augment the symbol table with "enter scope" and "leave scope" operations to keep track of visibility
Nothing is ever deleted from a symbol table, even when it falls out of scope
It's just marked "out of scope"
Retained through entire compilation
Scope rules in an interpreter are treated in a similar fashion
Interpreter also has to keep track of the targets tied to each binding
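A symbol table with "enter scope" and "leave scope" operations might look roughly like the following. This is a minimal sketch under simplifying assumptions: the class and method names are invented for illustration, and a real compiler would store richer information and use a faster lookup structure.

```python
class SymbolTable:
    """Toy symbol table: entries are never deleted, only marked inactive."""

    def __init__(self):
        self.entries = []        # every declaration seen so far
        self.scope_stack = [0]   # ids of the currently open scopes
        self.next_scope_id = 1

    def enter_scope(self):
        self.scope_stack.append(self.next_scope_id)
        self.next_scope_id += 1

    def leave_scope(self):
        closed = self.scope_stack.pop()
        for entry in self.entries:
            if entry["scope"] == closed:
                entry["active"] = False   # marked out of scope, not removed

    def declare(self, name, info):
        self.entries.append({"name": name, "info": info,
                             "scope": self.scope_stack[-1], "active": True})

    def lookup(self, name):
        # The most recent active declaration belongs to the innermost scope.
        for entry in reversed(self.entries):
            if entry["active"] and entry["name"] == name:
                return entry["info"]
        return None


table = SymbolTable()
table.declare("avg", "global int")
table.enter_scope()
table.declare("avg", "local double")
inner = table.lookup("avg")      # sees the local declaration
table.leave_scope()
outer = table.lookup("avg")      # the global declaration is visible again
```

After leave_scope the local entry is still stored in the table, which matches the "retained through entire compilation" point above.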
Expression Evaluation
An expression is a piece of code that is evaluated and produces a result
Expressions can be combinations of operands and operators
An operator in a language is a built-in function that uses special syntax
An operand is one of the arguments to the operator function
3 + 2, x + y, x++, --j, etc.
Expressions can also be function calls
Typically the function name followed by its arguments in parentheses
myFunction(x,y)
In some languages, operators are just syntactic sugar for function calls
Ada: "+"(a,b)
C++: a.operator+(b)
Scala: a.+(b)
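Python, which is not on the list above, behaves similarly and makes a convenient way to try the idea: the standard operator module exposes + as an ordinary function, and for integers the infix form dispatches to the __add__ method. A short sketch:

```python
import operator

a, b = 3, 2
infix = a + b                       # usual operator syntax
as_function = operator.add(a, b)    # the same operation as a plain function call
as_method = a.__add__(b)            # the method that + dispatches to for ints
```

All three forms compute the same value, which is the sense in which the operator syntax is "sugar" for a call.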
Expression Values & Assignment
Variables are a binding between a memory location and a value
Semantics of how we read the variable differ
Sometimes the variable designates a value
x * 3
multiply the value stored in x by 3
But sometimes it designates a memory location
x = 3
store the value three in the memory location denoted by x
Value model of variables
"l-value" denotes a memory location
"r-value" denotes a value
But … pointers/references can make this tricky
Pointers & Assignment
Pointer values make the r-value/l-value split trickier
A pointer value is a memory location (l-value)
In C: the address of the first byte in memory for the object
In C we can get the pointer value of any variable with the & (address of) operator
y = &x: y holds the l-value of x
In C we can also dereference the pointer value to get back to the value stored in memory (r-value)
*y: access the r-value stored at the l-value denoted by y
This allows us to use l-values on the right side
"pointer arithmetic"
References in Java
A Java reference is like a C pointer
Except without the ability to use l-values as if they were r-values
That's why people like to say that "Java has no pointers"
What they mean is "Java has no pointer arithmetic"
The semantics of a Java reference exactly follow the r-value/l-value scheme
When a variable of a reference type appears on the left side of an assignment, it refers to a memory location (l-value)
When a variable of a reference type appears on the right side of an assignment, it refers to the object stored at that location (r-value)
Java distinguishes between reference types and primitive (or value) types
This has turned out to cause some complications
Expressions
Expressions can be function calls, values or operations
Operations have operators and operands
An operator has arity (unary, binary or ternary)
An operator is used prefix, infix or postfix
Unary postfix: x++
Unary prefix: --x
Binary infix: x + y
Ternary: b ? 1 : 2
Operators will have precedence rules and associativity rules
And these will depend on the language
(Will they respect mathematical precedence and associativity? These days probably yes…)
Function calls are built-in or programmer defined
Prefix notation typical: foo(x) or (foo x)
x is an actual parameter if foo(x) is a function call
Return a value, and should not have side effects
Expressions and Side Effects
A side effect is when an expression affects computation in some way other than "returning a value"
1 + 2 has no side effects
x = 1 + 2 has a side effect
The value stored in the location referred to by x changes
Other examples: x++, x += y, --y
Expressions with no side effects are referentially transparent
An expression is referentially transparent if you can replace the expression with its value and not change program behavior
j = 10; k = 1 + j++; if (k == (j+1)) b = true;
j = 10; k = 1 + 10; if (k == (j+1)) b = true;
Not referentially transparent: replacing j++ with its value 10 gives different behavior
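The two fragments can be traced in Python. Python has no ++ operator, so this sketch assumes a hypothetical helper post_increment that mimics j++ (return the old value, then store old value + 1); b is modelled as the result of the comparison.

```python
def run(replace_with_value):
    state = {"j": 10}

    def post_increment():
        # Mimics j++: returns the old value, then stores old value + 1.
        old = state["j"]
        state["j"] = old + 1
        return old

    if replace_with_value:
        k = 1 + 10                  # j++ replaced by its value, 10
    else:
        k = 1 + post_increment()    # original expression, with its side effect
    b = (k == state["j"] + 1)
    return k, state["j"], b
```

In both versions k is 11, but only the original leaves j at 11, so the comparison k == j + 1 is false in the original and true after the substitution.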
Evaluation order
In the presence of side effects, precedence and associativity aren't enough
a - foo(b) + c
When will foo(b) be evaluated? What if it has a side effect that changes a or c?
foo(a, bar(b), foobar(c))
If bar and foobar have no side effects, it doesn't matter when they are evaluated
If they have side effects it could change the behavior of this code quite a lot
Language semantics must state this order
To clarify behavior in the presence of side effects
And to allow for compiler optimizations
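As one concrete data point, Python specifies left-to-right evaluation of arguments, whereas C and C++ leave the order of argument evaluation unspecified. The sketch below records the order with a log; the function bodies are made up purely to produce visible side effects.

```python
log = []

def bar(x):
    log.append("bar")       # side effect: record that bar ran
    return x + 1

def foobar(x):
    log.append("foobar")    # side effect: record that foobar ran
    return x * 2

def foo(a, b, c):
    log.append("foo")
    return a + b + c

result = foo(1, bar(2), foobar(3))
```

In Python the log always ends up as bar, foobar, foo. A language that leaves the order open could legally run foobar first, which only matters because both functions have side effects.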
Statements
Statements are pieces of code that do not themselves evaluate to a value
All of their work is done by side effect
Control flow is generally done by statement
Selection statements (if-else, switch, etc.)
Iteration statements (while, for, do, etc.)
Jump statements (goto, return, break, continue, throw)
Unstructured control flow
Using goto to jump to arbitrary points in the code
Based on the "jump" commands in assembly code
Structured control flow
Abstracting the control flow into a meaningful structure
If-else, while, etc.
Unstructured Flow Example
#include <iostream>
using namespace std;

int main() {
    int arr[] = {1, 2, 3, 4, 5};
    int size = 5, idx = 0, count = 0;
LAB1:
    if (idx >= size) goto LAB4;          // stop once every element is visited
    if (arr[idx] % 2 != 0) goto LAB3;    // skip odd elements
    count++;                             // count the even ones
LAB3:
    idx++;
    goto LAB1;
LAB4:
    cout << "Count: " << count << endl;
}
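For comparison, here is a structured version of the same loop, written in Python and assuming the goto example is meant to count the even elements of the array. Each label and jump is replaced by a loop or conditional construct.

```python
arr = [1, 2, 3, 4, 5]
count = 0
for value in arr:          # replaces LAB1, the bounds test, idx++ and goto LAB1
    if value % 2 == 0:     # replaces the conditional jump to LAB3
        count += 1
print("Count:", count)
```

The control flow can now be read top to bottom without tracing where each jump lands.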
Structured Programming
"Structured programming" was the hot buzzword of the 1970s
Kicked off by Edsger Dijkstra in the late 60s with his letter "Go To Statement Considered Harmful" (Communications of the ACM, March 1968)
In his letter, he identified a number of abstractions that GOTO statements implemented
Basically if-else and its relatives, and while-do and its relatives
The structured programming proponents basically won the battle
And modern programming languages implement these ideas
Subroutines
aka procedures, methods, functions
Subroutine is the general term
Functions return a value
Procedures work via side effect
Method is an object-oriented term
Procedural languages
Languages where the procedure is the main abstraction
(C, Fortran, Pascal…)
Most subroutines are parameterized
Subroutines have a name, formal parameters, and optional return value
Java: int myFunction(int numerator, String msg)
Calling subroutines
Calling a subroutine (aka invoking a method)
A subroutine is called from another subroutine
Caller subroutine vs. called subroutine
The caller invokes the subroutine with arguments
aka actual parameters
The actual parameters are mapped to the formal parameters of the called subroutine
A variety of parameter passing modes exist (we'll talk about these shortly)
Memory is allocated for the formal parameters and the local variables of the subroutine on the call stack
Flow of control passes to the start of the subroutine
When the routine finishes, flow of control returns to the caller
Run-time Call stack revisited
Image courtesy Programming Language Pragmatics p. 118
Using the call stack
The compiler must generate code for subroutine calls to manage the stack
Responsibility of the calling sequence to maintain the stack
Precise definition of calling sequence will vary from platform to platform
Typically broken up into responsibilities for the calling subroutine and the called subroutine
Using the call stack
When a subroutine S is called, the calling subroutine must:
Save the values currently stored in registers
Compute values for actual parameters and push them onto the stack
Push the value of the return address onto the stack
Jump to the prologue sequence of the called subroutine S
The prologue of S
Allocates memory for an activation record for S
Saves the address of the top of the stack (stack pointer or SP) and sets it to the new value
Saves the old activation record pointer (AP) to the stack and updates it to the new value
AP is also known as the frame pointer (FP)
Saves any registers that it may overwrite
Using the call stack (continued)
When S finishes, its epilogue is executed
Puts the return value into a register
Or a reserved location on the stack
Restores any registers it saved in the prologue
Restores the activation record pointer (AP) to its stored value
Restores the stack pointer (SP) to its stored value and deallocates the top of the stack
Jumps back to the return address
When control is restored to the calling routine:
Return value is moved to wherever it is needed
Registers are restored as needed
Structure of Activation Records
[Figure: structure of activation records. Code area: code for procedure P, code for procedure Q, code for procedure S (calls P). Call stack: surrounding procedure activation record (AR), calling procedure (S) AR, AR for this call to P, and free space on top of the stack, with SP and AP marked. The AR for P holds the return address, the caller's AR address (dynamic link), the surrounding procedure's AR address (static link), saved registers, and local variables.]
Example – variable scope (static)
Consider x=25; in some procedure P
If x is a local variable for P, we can find it in the activation record for P by looking at some offset from the AP
Is it the first local variable? The second? The third?
Compiler can generate the appropriate code
If x is a local variable in the surrounding block:
Follow the static link backwards through the stored AP values to get to the location for x in the stack
If x is a local variable nested two blocks out
Follow the static link backwards twice:
first to the immediate outer block
then to the next outer block
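Following static links can be sketched with a toy model in Python. The ActivationRecord class and resolve helper are hypothetical names for this illustration; a dictionary of local variables stands in for fixed offsets from the AP, and the number of hops is assumed to be known at compile time from the nesting depth.

```python
class ActivationRecord:
    def __init__(self, name, local_vars, static_link=None):
        self.name = name
        self.local_vars = local_vars      # stands in for offsets from the AP
        self.static_link = static_link    # AR of the enclosing block


def resolve(ar, hops):
    """Follow the static link 'hops' times, as generated code would."""
    for _ in range(hops):
        ar = ar.static_link
    return ar


outer = ActivationRecord("outer", {"x": 0})
middle = ActivationRecord("middle", {"y": 1}, static_link=outer)
p = ActivationRecord("P", {"z": 2}, static_link=middle)

# x is declared two blocks out from P, so the hop count is 2.
resolve(p, 2).local_vars["x"] = 25        # x = 25;
```

No name search happens at run time: the compiler has already decided both the number of hops and the slot within the target record.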
Dynamic scope?
Not as easy to implement dynamic scope
What if variable x is not local?
Need to search for it at runtime
Work backwards through dynamic links
Need to keep names at runtime too
So languages with dynamic scope tend to be interpreted rather than compiled (early Lisp dialects and Perl, for example)
Parameter Passing
Depending on the language, there are different mechanisms for passing parameters
Call-by-value
The value of each actual parameter is copied into the formal parameter when the call is made
The activation record holds a copy of the value as if it were a local variable
In the subroutine, changes are made to the copy, not to the original value
Call-by-reference
The address of each actual parameter is copied into the formal parameter when the call is made
The activation record holds a copy of the address
Subroutine must be written to properly dereference this address
Note – this terminology gets some Java programmers confused
Java is considered call-by-value (though non-Java programmers might call it "call by sharing" or "call by object reference")
Call-by-reference example
void swap(int &x, int &y) {
    int tmp = x;
    x = y;
    y = tmp;
}

int main() {
    int a = 10, b = 3;
    swap(a, b);
    // now b=10, a=3
}
Call-by-value example (Java primitives)
public static void swap(int x, int y) {
    int tmp = x;
    x = y;
    y = tmp;
}

public static void main(String[] args) {
    int a = 10, b = 3;
    swap(a, b);
    // a is still 10, b still 3
}
Java call-by-?
public static void swap(int[] arr, int x, int y) {
    int tmp = arr[x];
    arr[x] = arr[y];
    arr[y] = tmp;
}

public static void main(String[] args) {
    int[] a = {1, 2, 3, 4};
    swap(a, 0, 3);
    // a is now {4, 2, 3, 1}
}
Python call-by-?
def swap(list, x, y):
    tmp = list[x]
    list[x] = list[y]
    list[y] = tmp

list = [1, 2, 3, 4]
swap(list, 0, 3)
# list is now [4, 2, 3, 1]
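The swap above works because the function mutates an object that the caller also refers to. It does not mean the parameter is passed by reference, as a second sketch shows: rebinding the parameter inside the function has no effect on the caller. The function names here are invented for the illustration.

```python
def mutate(items):
    items[0] = 99          # changes the object the caller also refers to

def rebind(items):
    items = [0, 0, 0]      # only rebinds the local name items

data = [1, 2, 3]
mutate(data)
after_mutate = data.copy()
rebind(data)
after_rebind = data.copy()
```

With true call-by-reference, rebind would have replaced the caller's list. Here the reference itself is copied, which is why this behavior is often described as call by sharing.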
Parameter Passing (other modes)
Call-by-result
The calling routine provides an actual parameter that the called subroutine can write to
Value is copied into actual parameter on exit of the subroutine
Call-by-value-result
The calling routine provides an actual parameter that the called subroutine can read from and write to
Value is copied into actual parameter on exit from the subroutine
Like call by value, where the final value of the parameter gets copied back to the actual parameter when the routine is finished
Both call-by-result and call-by-value-result can be implemented by using a local variable that is copied back to the caller's actual parameter when the routine finishes
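Python has no call-by-value-result mode, but the copy-in/copy-out implementation described above can be simulated. In this sketch the caller's variables live in a dictionary named env, and call_by_value_result is a hypothetical helper; both are assumptions made for the example.

```python
def call_by_value_result(routine, env, name):
    local = env[name]                  # copy in, as in call-by-value
    local = routine(local, env)        # the routine works on the local copy
    env[name] = local                  # copy out when the routine finishes

seen_during_call = []

def add_five(x, env):
    x += 5
    seen_during_call.append(env["a"])  # the caller's a is still unchanged here
    return x

env = {"a": 10}
call_by_value_result(add_five, env, "a")
```

The caller's variable changes only at the copy-out step, which is what distinguishes this mode from call-by-reference, where the change would be visible during the call.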