CSE 3341: Principles of Programming Languages Core Concepts I - - PowerPoint PPT Presentation



slide-1
SLIDE 1

1

CSE 3341: Principles of Programming Languages Core Concepts I

Jeremy Morris

slide-2
SLIDE 2

Imperative Language Basics

• The model of most imperative languages is straightforward:
  • Programs are a sequence of expressions or statements
  • Expressions read values from memory and produce a new value
    • Some expressions may have side effects that change values in memory
  • Statements read values from memory and do not produce new values
    • All statements operate by side effect
    • Assignment statements: most basic example
• Core idea: Expressions are evaluated, Statements are executed
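The evaluated/executed split can be seen directly in Python, whose built-ins eval and exec draw the same line. This is only a sketch: the namespace dictionary and the variable x are made up for illustration.

```python
# A tiny "memory" holding one variable (illustrative only).
namespace = {"x": 4}

# An expression is evaluated: it reads x and produces a new value.
value = eval("x * 3", namespace)

# A statement is executed: it produces no value and works by side effect.
outcome = exec("x = x + 1", namespace)

print(value)            # the value produced by the expression
print(outcome)          # exec yields None; the work was the side effect
print(namespace["x"])   # x was changed in "memory" by the assignment
```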

slide-3
SLIDE 3

Imperative Language Basics

• Early languages sought to break away from being machine dependent
  • Assembly languages are machine dependent
• These language designers sought a higher level of abstraction
  • Separation of function from the details of machine implementation
    • Or separation of function from implementation details, period
    • (Sound familiar? It should. Think client vs. implementer.)
• Key to this idea are names
  • Strings of symbols used to refer to "things" in a programming language
    • Variables, procedures, constants, etc.

3

slide-4
SLIDE 4

Binding and Binding Time

• Binding is an association between two things
  • Typically we use it in reference to binding a name to a target (or object, in the textbook's terminology)
• Binding time is when a binding occurs
  • A few important ones:
    • Compile time: the compiler chooses how to bind high-level constructs to machine code
    • Link time: when code in libraries is joined together by a linker
    • Load time: when the OS loads a program into memory
    • Run time: the entire span of execution of a program from beginning to end
  • Static vs. dynamic
    • Static: before run time; dynamic: during run time

4

slide-5
SLIDE 5

Object lifetime

• Lifetime of an object
  • The period of time between when an object is created and when it is destroyed
• Lifetime of a binding
  • The period of time between when a binding between a name and an object is created and when it is destroyed
• Object lifetimes generally fall into one of three types based on how they are allocated
  • Static objects
  • Stack objects
  • Heap objects

slide-6
SLIDE 6

Static Allocation

An absolute memory location retained through the program's execution

Typically global variables

Also (in some languages) variables that are local to a subroutine but retain value from one invocation to the next

Numeric and string-valued constant literals

The lifetime of a statically allocated variable is the entire program

Local variables are not typically statically allocated in modern languages

If a language doesn't have recursion, local variables can be statically allocated

Fortran as originally designed, for example

Recursion means we need to do something else

Stack allocation can help us solve this problem

6

slide-7
SLIDE 7

Static local variable in C

#include <stdio.h>

void adder() {
    static int x = 0;   // initialized once; keeps its value between calls
    x++;
    printf("%d\n", x);
}

int main(int argc, char *argv[]) {
    adder();  // prints 1
    adder();  // prints 2
    adder();  // prints 3
    return 0;
}

7

Example courtesy Wikipedia: https://en.wikipedia.org/wiki/Static_variable

slide-8
SLIDE 8

Stack Allocation

8

Image courtesy Programming Language Pragmatics p. 118

slide-9
SLIDE 9

Heap Allocation

• The heap is free memory allocated to the program by the OS
  • The programmer can allocate it however they want
  • Uses the new operator (Java/C++) or the malloc library function (C) to allocate memory from the heap for the program
  • Uses the delete operator (C++) or the free library function (C) to free it up
• Data in the heap is accessed via references (or pointers)
  • Addresses in the heap where the data is stored
  • new and malloc both return a reference (pointer) to the memory they allocate
• The lifetime of a heap-allocated object is controlled by the programmer
  • May be as long or as short as the programmer needs

9

slide-10
SLIDE 10

C program running in memory

10

Example courtesy Wikipedia: https://en.wikipedia.org/wiki/Data_segment

slide-11
SLIDE 11

Scope

• The region in the code where a binding is active is its scope
  • Scope can be determined statically or dynamically
    • Almost all modern languages have static scoping
    • Dynamic scoping comes with a run-time cost
• Static scoping is also known as lexical scoping
  • Because we can determine it from the code at compile time
    • Just by reading the source, we can understand the scope of every variable
• Dynamic scoping can only be determined at run time

11

slide-12
SLIDE 12

Static/Lexical Scope

• Probably what you think of when you think of scope
  • Java, C, C++, Python, Pascal …
  • Most modern languages use lexical scope
  • Typically the current binding between a name and a target is based on the "block" of code that the name appears in
  • Scope can often be nested: variables in outer blocks can be accessed from within inner blocks (but not vice versa)

12

slide-13
SLIDE 13

Static/Lexical Scope – Java Example

public class Sample {
    private static int avg = 33;

    private static int average(int[] list) {
        int x = avg;   // the local avg is not declared yet, so this reads the field
        int avg = 0;   // from here on, the local avg shadows the field
        for (int i = 0; i < list.length; i++) {
            int y = list[i];
            avg += y;
        }
        if (list.length > 0) {
            avg /= list.length;
        }
        return avg;
    }

    public static void main(String[] args) {
        int[] list = { 2, 4, 6, 8, 10 };
        int avg = average(list);
        System.out.println("The average is: " + avg);
    }
}

13

slide-14
SLIDE 14

Static/Lexical Scope – C++ Example

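#include <iostream>
using namespace std;

int avg = 33;

double average(int list[], int size) {
    double x = avg;   // reads the global avg
    double avg = 0;   // the local avg now shadows the global
    for (int i = 0; i < size; i++) {
        int x = list[i];   // this x shadows the outer x inside the loop body
        avg += x;
    }
    if (size > 0) {
        avg /= size;
    }
    return avg;
}

int main() {
    int list[5] = { 2, 4, 6, 8, 10 };
    double avg = average(list, 5);   // the array size must be passed explicitly
    cout << "The average is: " << avg << endl;
    return 0;
}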

14

slide-15
SLIDE 15

Static/Lexical Scope – Python Example

avg = 33

def average(list):
    avg = 0   # assignment creates a new local avg; the global is untouched
    for x in list:
        avg += x
    if ( len(list) > 0 ):
        avg /= len(list)
    return avg

def foo():
    local = 10
    def bar():
        local = 5   # a new local in bar; foo's local is untouched
        print("Local2: "+str(local))
    bar()
    print("Local1: "+str(local))

print("The average is: "+str(avg))
list = [2,4,6,8,10]
avg = average(list)
print("The average is: "+str(avg))

slide-16
SLIDE 16

Static/Lexical Scope – C/C++

• In Java, you can refer to methods before you declare them
  • Scope is the entire block they are declared in
• In C/C++, you cannot
  • Scope begins only after they've been declared
  • This is a problem for recursive types and subroutines
    • How can you refer to something if it isn't in scope?
• C/C++ solve this problem by separating declaration from definition
  • Only the declaration has to exist before you can use it
  • The definition can come after you've used it

slide-17
SLIDE 17

Static/Lexical Scope – C/C++

bool isEven(int x);   // declarations: both names are in scope from here on
bool isOdd(int y);

bool isEven(int x) {
    if (x == 0) { return true; }
    else { return isOdd(x-1); }
}

bool isOdd(int x) {
    if (x == 0) { return false; }
    else { return isEven(x-1); }
}

17

slide-18
SLIDE 18

Dynamic scope

• Bindings between names and targets can only be determined at run time
  • Depending on the flow of control of the program, bindings may be different
  • You can't tell from reading the code exactly what the scope of a variable is
    • It will depend on what was executed before the code reached that point
• If you are used to static scoping, this idea may seem very odd

slide-19
SLIDE 19

Dynamic scope example

int n

void first():
    n = 1

void second():
    int n
    first()

main():
    n = 2
    int x = readInteger()
    if (x > 0) then
        second()
    else
        first()
    println(n)

19

Example courtesy Michael Scott – Programming Language Pragmatics
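Python itself is lexically scoped, so the dynamic-scope behavior of this example has to be simulated. The sketch below assumes a hypothetical DynamicEnv class that keeps a stack of name-to-value frames and always resolves a name in the most recently pushed frame that declares it. The input x is passed as an argument instead of being read.

```python
class DynamicEnv:
    """Stack of frames; names resolve to the most recent declaration."""

    def __init__(self):
        self.frames = [{}]              # frame 0 holds the globals

    def push(self):
        self.frames.append({})

    def pop(self):
        self.frames.pop()

    def declare(self, name, value=None):
        self.frames[-1][name] = value

    def assign(self, name, value):
        for frame in reversed(self.frames):   # newest frame first
            if name in frame:
                frame[name] = value
                return
        raise NameError(name)

    def lookup(self, name):
        for frame in reversed(self.frames):
            if name in frame:
                return frame[name]
        raise NameError(name)


def run(x):
    env = DynamicEnv()
    env.declare("n")                    # global: int n

    def first():
        env.push()
        env.assign("n", 1)              # hits whichever n is newest on the stack
        env.pop()

    def second():
        env.push()
        env.declare("n")                # local: int n
        first()
        env.pop()

    env.assign("n", 2)
    if x > 0:
        second()
    else:
        first()
    return env.lookup("n")              # the value println(n) would show
```

Under this simulation a positive input leaves the global n at 2, because first() writes to the n declared in second(). Any other input yields 1. With static scoping the result would be 1 in both cases.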

slide-20
SLIDE 20

Scope implementation

• Scope rules in a compiler are implemented using the symbol table
  • If everything is global, it's just a mapping of symbols to internal keys
    • Think about the recursive descent algorithm discussed in class
  • Scope rules require some extra complexity
    • Augment the symbol table with "enter scope" and "leave scope" operations to keep track of visibility
    • Nothing is ever deleted from a symbol table, even when it falls out of scope
      • It's just marked "out of scope"
      • Retained through entire compilation
• Scope rules in an interpreter are treated in a similar fashion
  • Interpreter also has to keep track of the targets tied to each binding
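One possible shape for such a symbol table is sketched below. The class and method names (SymbolTable, enter_scope, leave_scope, declare, lookup) are illustrative assumptions, not taken from any particular compiler. Entries are never removed; leaving a scope only flips a visibility flag.

```python
class SymbolTable:
    def __init__(self):
        self.entries = []         # every symbol ever declared; never deleted
        self.open_scopes = [0]    # ids of the scopes currently open
        self.next_scope_id = 1

    def enter_scope(self):
        self.open_scopes.append(self.next_scope_id)
        self.next_scope_id += 1

    def leave_scope(self):
        closing = self.open_scopes.pop()
        for entry in self.entries:
            if entry["scope"] == closing:
                entry["visible"] = False    # marked out of scope, not removed

    def declare(self, name, info):
        self.entries.append({
            "name": name,
            "info": info,
            "scope": self.open_scopes[-1],
            "visible": True,
        })

    def lookup(self, name):
        # Newest declarations are checked first, so inner scopes shadow outer ones.
        for entry in reversed(self.entries):
            if entry["visible"] and entry["name"] == name:
                return entry["info"]
        return None


table = SymbolTable()
table.declare("x", "global int")
table.enter_scope()
table.declare("x", "local double")
inner = table.lookup("x")     # sees the inner declaration
table.leave_scope()
outer = table.lookup("x")     # the outer declaration is visible again
```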

20

slide-21
SLIDE 21

Expression Evaluation

• An expression is a piece of code that is evaluated and produces a result
  • Expressions can be combinations of operands and operators
    • An operator in a language is a built-in function that uses special syntax
    • An operand is one of the arguments to the operator function
    • 3 + 2, x + y, x++, --j, etc.
  • Expressions can also be function calls
    • Typically the function name followed by its arguments in parentheses
    • myFunction(x,y)
  • In some languages, operators are just syntactic sugar for function calls
    • Ada: "+"(a,b)
    • C++: a.operator+(b)
    • Scala: a.+(b)
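Python follows the same pattern as the Ada, C++ and Scala forms above: the infix + stands for a call to the __add__ method, and the standard operator module exposes it as an ordinary function. The Money class here is a made-up example type.

```python
import operator


class Money:
    """Hypothetical class, used only to illustrate operator dispatch."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):          # invoked for: self + other
        return Money(self.cents + other.cents)


a, b = Money(150), Money(275)

total_infix = a + b                    # operator syntax
total_method = a.__add__(b)            # the method call it stands for
total_function = operator.add(a, b)    # the same operation as a plain function
```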

21

slide-22
SLIDE 22

Expression Values & Assignment

• Variables are a binding between a memory location and a value
  • Semantics of how we read the variable differ
  • Sometimes the variable designates a value
    • x * 3
    • multiply the value stored in x by 3
  • But sometimes it designates a memory location
    • x = 3
    • store the value three in the memory location denoted by x
• Value model of variables
  • "l-value" denotes a memory location
  • "r-value" denotes a value
  • But … pointers/references can make this tricky

slide-23
SLIDE 23

Pointers & Assignment

• Pointer values make the r-value/l-value split trickier
  • A pointer value is a memory location (l-value)
    • In C: the address of the first byte in memory for the object
• In C we can get the pointer value of any variable with the & (address-of) operator
  • y = &x: y holds the l-value of x
• In C we can also dereference the pointer value to get back to the value stored in memory (r-value)
  • *y: access the r-value stored at the l-value denoted by y
• This allows us to use l-values on the right side
  • "pointer arithmetic"

slide-24
SLIDE 24

References in Java

• A Java reference is like a C pointer
  • Except without the ability to use l-values as if they were r-values
  • That's why people like to say that "Java has no pointers"
    • What they mean is "Java has no pointer arithmetic"
• The semantics of a Java reference exactly follow the r-value/l-value scheme
  • When a variable of a reference type appears on the left side of an assignment, it refers to a memory location (l-value)
  • When a variable of a reference type appears on the right side of an assignment, it refers to the object stored at that location (r-value)
  • Java distinguishes between reference types and primitive (or value) types
    • Has turned out to cause some complications

24

slide-25
SLIDE 25

Expressions

• Expressions can be function calls, values or operations
  • Operations have operators and operands
    • An operator has arity (unary, binary or ternary)
    • An operator is used prefix, infix or postfix
      • Unary postfix: x++
      • Unary prefix: --x
      • Binary infix: x + y
      • Ternary: b ? 1 : 2
    • Operators will have precedence rules and associativity rules
      • And these will depend on the language
      • (Will they respect mathematical precedence and associativity? These days probably yes…)
  • Function calls are built-in or programmer defined
    • Prefix notation typical: foo(x) or (foo x)
    • x is an actual parameter if foo(x) is a function call
    • Return a value; should not have side effects

25

slide-26
SLIDE 26

Expressions and Side Effects

• A side effect is when an expression affects computation in some way other than "returning a value"
  • 1 + 2 has no side effects
  • x = 1 + 2 has a side effect
    • The value stored in the location referred to by x changes
  • Other examples: x++, x += y, --y
• Expressions with no side effects are referentially transparent
  • An expression is referentially transparent if you can replace the expression with its value and not change program behavior
  • j = 10; k = 1 + j++; if (k == (j+1)) b = true;
  • j = 10; k = 1 + 10; if (k == (j+1)) b = true;
    • Not referentially transparent: different behaviors
    • j++ yields 10 but leaves j at 11, so the first test compares 11 with 12 and fails, while the second compares 11 with 11 and succeeds
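The same comparison can be run in Python. Python has no ++ operator, so this sketch assumes a hypothetical post_increment function that mimics j++ on a shared counter.

```python
counter = {"j": 10}


def post_increment():
    """Return the current j, then add 1 to it (stands in for j++)."""
    old = counter["j"]
    counter["j"] = old + 1
    return old


def with_call():
    counter["j"] = 10
    k = 1 + post_increment()        # the call yields 10, but j becomes 11
    return k == counter["j"] + 1


def with_value_substituted():
    counter["j"] = 10
    k = 1 + 10                      # call replaced by the value it returned
    return k == counter["j"] + 1


print(with_call(), with_value_substituted())
```

The two functions disagree, which is what makes the call not referentially transparent: swapping it for its value changed what the program does.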

26

slide-27
SLIDE 27

Evaluation order

• In the presence of side effects, precedence and associativity aren't enough
  • a - foo(b) + c
    • When will foo(b) be evaluated? What if it has a side effect that changes a or c?
  • foo(a, bar(b), foobar(c))
    • If bar and foobar have no side effects, it doesn't matter when they are evaluated
    • If they have side effects, it could change the behavior of this code quite a lot
• Language semantics must state this order (or explicitly leave it unspecified)
  • To clarify behavior in the presence of side effects
  • And to allow for compiler optimizations
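Python is one language that fixes the order: operands and arguments are evaluated left to right. The sketch below makes the order visible by logging each evaluation. The traced and combine helpers are hypothetical stand-ins for a, foo(b), c and the three-argument call.

```python
log = []


def traced(name, value):
    """Record that `name` was evaluated, then produce its value."""
    log.append(name)
    return value


def combine(x, y, z):
    return (x, y, z)


# Stand-in for: a - foo(b) + c
result = traced("a", 10) - traced("foo(b)", 4) + traced("c", 1)
order_in_expression = list(log)

log.clear()

# Stand-in for: foo(a, bar(b), foobar(c))
combine(traced("a", 1), traced("bar(b)", 2), traced("foobar(c)", 3))
order_in_arguments = list(log)

print(order_in_expression)
print(order_in_arguments)
```

In a language that leaves argument order unspecified, such as C, the second log could legitimately come out in a different order.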

slide-28
SLIDE 28

Statements

• Statements are pieces of code that do not themselves evaluate to a value
  • All of their work is done by side effect
• Control flow is generally done by statement
  • Selection statements (if-else, switch, etc.)
  • Iteration statements (while, for, do, etc.)
  • Jump statements (goto, return, break, continue, throw)
• Unstructured control flow
  • Using goto to jump to arbitrary points in the code
    • Based on the "jump" commands in assembly code
• Structured control flow
  • Abstracting the control flow into a meaningful structure
    • If-else, while, etc.

28

slide-29
SLIDE 29

Unstructured Flow Example

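#include <iostream>
using namespace std;

int main() {
    int arr[] = {1, 2, 3, 4, 5};
    int size = 5, idx = 0, count = 0;
LAB1:
    if (idx >= size) goto LAB4;          // past the last element: done
    if (arr[idx] % 2 != 0) goto LAB3;    // odd element: skip the update
    count++;                             // even element found
LAB3:
    idx++;
    goto LAB1;
LAB4:
    cout << "Count: " << count << endl;
}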
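For contrast, here is a structured version of the same loop, written as a sketch in Python. It assumes the goto example is meant to count the even elements of the array. The function name count_even is made up for illustration.

```python
def count_even(values):
    count = 0
    for value in values:        # replaces LAB1, idx++ and "goto LAB1"
        if value % 2 == 0:      # replaces the conditional jump to LAB3
            count += 1
    return count                # replaces the jump to LAB4


print("Count:", count_even([1, 2, 3, 4, 5]))
```

The loop and the conditional each have a single entry and a single exit, so the control flow can be read straight off the indentation.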

29

slide-30
SLIDE 30

Structured Programming

• "Structured programming" was the hot buzzword of the 1970s
  • Kicked off by Edsger Dijkstra in the late 60s with his letter "Go To Statement Considered Harmful" (Communications of the ACM, March 1968)
  • In his letter, he identified a number of abstractions that GOTO statements implemented
    • Basically if-else and its relatives, and while-do and its relatives
  • The structured programming proponents basically won the battle
    • And modern programming languages implement these ideas

30

slide-31
SLIDE 31

Subroutines

• aka procedures, methods, functions
  • Subroutine is the general term
    • Functions return a value
    • Procedures work via side effect
    • Method is an object-oriented term
  • Procedural languages
    • Languages where the procedure is the main abstraction
    • (C, Fortran, Pascal…)
  • Most subroutines are parameterized
    • Subroutines have a name, formal parameters, and an optional return value
    • Java: int myFunction(int numerator, String msg)

31

slide-32
SLIDE 32

Calling subroutines

• Calling a subroutine (aka invoking a method)
  • A subroutine is called from another subroutine
    • Caller subroutine vs. called subroutine
  • The caller invokes the subroutine with arguments
    • aka actual parameters
  • The actual parameters are mapped to the formal parameters of the called subroutine
    • A variety of parameter passing modes exist; we'll talk about these shortly
  • Memory is allocated for the formal parameters and the local variables of the subroutine on the call stack
  • Flow of control passes to the start of the subroutine
    • When the routine finishes, flow of control returns to the caller

32

slide-33
SLIDE 33

Run-time Call stack revisited

33

Image courtesy Programming Language Pragmatics p. 118

slide-34
SLIDE 34

Using the call stack

• The compiler must generate code for subroutine calls to manage the stack
  • Responsibility of the calling sequence to maintain the stack
  • Precise definition of calling sequence will vary from platform to platform
  • Typically broken up into responsibilities for the calling subroutine and the called subroutine

34

slide-35
SLIDE 35

Using the call stack

When a subroutine S is called, the calling subroutine must:

Save the values currently stored in registers

Compute values for actual parameters and push them onto the stack

Push the value of the return address onto the stack

Jump to the prologue sequence of the called subroutine S

The prologue of S

Allocates memory for an activation record for S

Saves the address of the top of the stack (stack pointer or SP) and sets it to the new value

Saves the old activation record pointer (AP) to the stack and updates it to the new value

AP is also known as the frame pointer (FP)

Saves any registers that it may overwrite

35

slide-36
SLIDE 36

Using the call stack (continued)

• When S finishes, its epilogue is executed
  • Puts the return value into a register
    • Or a reserved location on the stack
  • Restores any registers it saved in the prologue
  • Restores the activation record pointer (AP) to its stored value
  • Restores the stack pointer (SP) to its stored value and deallocates the top of the stack
  • Jumps back to the return address
• When control is restored to the calling routine:
  • Return value is moved to wherever it is needed
  • Registers are restored as needed

slide-37
SLIDE 37

Structure of Activation Records

[Figure: code region and call stack. Code: code for procedure P, code for procedure Q, code for procedure S (calls P). Call stack: surrounding procedure activation record (AR); calling procedure (S) AR; activation record (AR) for this call to P, holding the return address, the caller's AR address (dynamic link), the surrounding procedure's AR address (static link), saved registers, and local variables; free space on top of the stack. AP and SP are marked on the call stack.]

slide-38
SLIDE 38

Example – variable scope (static)

• Consider x=25; in some procedure P
  • If x is a local variable for P, we can find it in the activation record for P by looking at some offset from the AP
    • Is it the first local variable? The second? The third?
    • Compiler can generate the appropriate code
  • If x is a local variable in the surrounding block:
    • Follow the static link backwards through the stored AP values to get to the location for x in the stack
  • If x is a local variable nested two blocks out
    • Follow the static link backwards twice:
      • first to the immediate outer block
      • then to the next outer block
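The three cases can be sketched with a simplified, hypothetical ActivationRecord class that holds locals by offset plus a static link. The record layout, the hop counts and the offsets are all assumptions chosen for illustration. A real compiler would emit this address arithmetic as machine code.

```python
class ActivationRecord:
    """Simplified AR: local slots addressed by offset, plus a static link."""

    def __init__(self, num_locals, static_link=None):
        self.locals = [None] * num_locals
        self.static_link = static_link    # AR of the lexically enclosing block


def resolve(ap, hops):
    """Follow `hops` static links starting from the current AR."""
    record = ap
    for _ in range(hops):
        record = record.static_link
    return record


def store(ap, hops, offset, value):
    resolve(ap, hops).locals[offset] = value


def load(ap, hops, offset):
    return resolve(ap, hops).locals[offset]


outermost = ActivationRecord(num_locals=1)
surrounding = ActivationRecord(num_locals=2, static_link=outermost)
p_record = ActivationRecord(num_locals=1, static_link=surrounding)

store(p_record, hops=0, offset=0, value=25)   # x is local to P
store(p_record, hops=1, offset=1, value=25)   # x is in the surrounding block
store(p_record, hops=2, offset=0, value=25)   # x is two blocks out
```

The hop count and offset for each name are known at compile time, which is why static scope needs no name lookup at run time.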

38

slide-39
SLIDE 39

Dynamic scope?

• Not as easy to implement dynamic scope
  • What if variable x is not local?
    • Need to search for it at runtime
    • Work backwards through dynamic links
    • Need to keep names at runtime too
  • So languages with dynamic scope tend to be implemented as interpreted rather than compiled (Lisp, Perl)

39

slide-40
SLIDE 40

Parameter Passing

• Depending on the language, there are different mechanisms for passing parameters
  • Call-by-value
    • The value of each actual parameter is copied into the formal parameter when the call is made
    • The activation record holds a copy of the value as if it were a local variable
    • In the subroutine, changes are made to the copy, not to the original value
  • Call-by-reference
    • The address of each actual parameter is copied into the formal parameter when the call is made
    • The activation record holds a copy of the address
    • Subroutine must be written to properly dereference this address
    • Note: this terminology gets some Java programmers confused
      • Java is considered call-by-value (though non-Java programmers might call it "call by sharing" or "call by object reference")
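Python passes arguments the same way Java passes object references, so "call by sharing" can be sketched there. The function names rebind and mutate are made up for illustration.

```python
def rebind(values):
    # Assigning to the parameter changes only the local copy of the reference.
    values = [0, 0, 0]


def mutate(values):
    # Writing through the reference changes the object the caller also sees.
    values[0] = 99


data = [1, 2, 3]

rebind(data)
after_rebind = list(data)     # snapshot taken after rebind returns

mutate(data)
after_mutate = list(data)     # snapshot taken after mutate returns

print(after_rebind, after_mutate)
```

The reference is copied (call-by-value), but the copy still points at the caller's object. That is why mutation shows through and rebinding does not.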

40

slide-41
SLIDE 41

Call-by-reference example

void swap(int &x, int &y) {
    int tmp = x;
    x = y;
    y = tmp;
}

int main() {
    int a = 10, b = 3;
    swap(a, b);   // now b=10, a=3
}

41

slide-42
SLIDE 42

Call-by-value example (Java primitives)

public static void swap(int x, int y) {
    int tmp = x;
    x = y;
    y = tmp;
}

public static void main(String[] args) {
    int a = 10, b = 3;
    swap(a, b);   // a is still 10, b still 3
}

42

slide-43
SLIDE 43

Java call-by-?

public static void swap(int[] arr, int x, int y) {
    int tmp = arr[x];
    arr[x] = arr[y];
    arr[y] = tmp;
}

public static void main(String[] args) {
    int[] a = {1, 2, 3, 4};
    swap(a, 0, 3);   // a is now {4, 2, 3, 1}
}

43

slide-44
SLIDE 44

Python call-by-?

def swap(list, x, y):
    tmp = list[x]
    list[x] = list[y]
    list[y] = tmp

list = [1, 2, 3, 4]
swap(list, 0, 3)   # list is now [4, 2, 3, 1]

44

slide-45
SLIDE 45

Parameter Passing (other modes)

 Call-by-result 

The calling routine provides an actual parameter that the called subroutine can write to

Value is copied into actual parameter on exit of the subroutine

 Call-by-value-result 

The calling routine provides an actual parameter that the called subroutine can read from and write to

Value is copied into actual parameter on exit from the subroutine

Like call by value, where the final value of the parameter gets copied back to the actual parameter when the routine is finished

Both call-by-result and call-by-value-result can be implemented by using a local variable that is copied back to the caller's actual parameter when the routine finishes
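Python has no built-in call-by-value-result, so the copy-in/copy-out mechanism has to be simulated. In this sketch the caller's variables live in a dictionary, and a hypothetical helper call_by_value_result copies the arguments in, runs the subroutine on the copies, and copies the final values back.

```python
def call_by_value_result(subroutine, caller_vars, arg_names):
    """Hypothetical helper: copy arguments in, run, copy final values out."""
    local_copies = [caller_vars[name] for name in arg_names]    # copy in
    final_values = subroutine(*local_copies)                     # runs on copies
    for name, value in zip(arg_names, final_values):             # copy out
        caller_vars[name] = value


def swap(x, y):
    x, y = y, x
    return x, y           # final values of the formal parameters


caller_vars = {"a": 10, "b": 3}
call_by_value_result(swap, caller_vars, ["a", "b"])
print(caller_vars)
```

Call-by-result would look the same with the copy-in step removed, so the subroutine would start from uninitialized locals.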

45