CS 270 Algorithms, Week 7: Arrays, lists, pointers and rooted trees. Oliver Kullmann. (PowerPoint presentation.)



CS 270 Algorithms Oliver Kullmann Binary search Lists Pointers Trees Implementing rooted trees Tutorial

Week 7: Arrays, lists, pointers and rooted trees

1. Binary search
2. Lists
3. Pointers
4. Trees
5. Implementing rooted trees
6. Tutorial


General remarks

We conclude elementary data structures by discussing and implementing arrays, lists, pointers and trees. We also consider binary search.

Reading from CLRS for week 7

Chapter 10, Sections 10.2, 10.3 and 10.4.

Arrays

Arrays are the most fundamental data structure: an array A is a static data structure, with a fixed length n ∈ N0, holding n objects of the same type. Access to elements happens via A[i] for indices i, typically 0-based (C-based languages), that is, i ∈ { 0, . . . , n − 1 }, or 1-based, that is, i ∈ { 1, . . . , n }. This access, called random access, happens in constant time, and can be used for reading and writing. Due to the fixed length of arrays, one cannot really speak of “insertion” and “deletion” for arrays. Search in general is slow (one has to run through all elements in the worst case), but it is fast in sorted arrays, via “binary search”.


Vectors

The dynamic form of an array (i.e., one that can grow) can be called a vector (as in C++; or “dynamic array”): the growth of the vector happens by internally holding an array, and, when the need arises, allocating a new, bigger array, copying the old content, and deleting the old array. When done “infrequently”, insertions (and deletions) at the end of the vector require only amortised constant time; see the tutorial. However, insertions and deletions at the beginning of the vector (or somewhere else) need time linear in the current size of the vector, since the elements need to be shifted. A vector with additional structure, where insertions and deletions at the beginning also happen in amortised constant time, is typically called a deque (a “double-ended queue”).
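The growth scheme just described can be sketched in Java as follows. This is a minimal illustration with our own names (IntVector, pushBack) and our own initial capacity; it is not the lecture's implementation, though real classes such as java.util.ArrayList grow in a similar way.

```java
// Minimal dynamic array of ints, growing by doubling the internal array.
// Sketch only: names and initial capacity are our own choices.
class IntVector {
  private int[] data = new int[1]; // internal array (the current capacity)
  private int size = 0;            // number of slots actually in use

  // Amortised O(1): over n pushes, at most 1 + 2 + 4 + ... + n < 2n
  // elements are copied in total.
  void pushBack(final int x) {
    if (size == data.length) { // full: allocate bigger array, copy, replace
      final int[] bigger = new int[2 * data.length];
      System.arraycopy(data, 0, bigger, 0, size);
      data = bigger;
    }
    data[size++] = x;
  }

  int get(final int i) { return data[i]; } // random access in O(1)
  int size() { return size; }
}
```

Note that an insertion at the beginning would still have to shift all elements, which is why only insertion at the end is amortised constant time.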



Searching in sorted vectors

Searching in general vectors takes linear time (running through all elements):

1. However, if the vector is sorted (we assume, as is the default, ascending order), then searching can be done in logarithmic time (in the length n of the vector).
2. We present the Java function binary_search, which searches for an element x in an array A.
3. Instead of just returning true or false (for found or not found), it is more informative to return an index i with A[i] = x, if found, and to return −1 otherwise.
4. Since it might not be so easy to (efficiently) form sub-arrays, our version of binary search allows one to specify a sub-array by its indices begin and end.
5. As is usually best, this so-called “range” is right-open, i.e., the beginning is included, but the end is excluded.
6. The role model for this is begin = 0 and end = n.

Binary search

class BinarySearch {

  public static int binary_search(final int[] A, int begin, int end, final int x) {
    if (A == null) return -1;
    if (begin == end) return -1;
    while (true) {
      final int mid = (begin + end) / 2;
      if (A[mid] == x) return mid;
      if (begin + 1 == end) return -1;
      if (A[mid] < x) {
        begin = mid + 1;
        if (begin == end) return -1;
      }
      else end = mid;
    }
  }


Binary search (cont.)

  public static int binary_search(final int[] A, final int x) {
    if (A == null) return -1;
    return binary_search(A, 0, A.length, x);
  }
}


Binary search with assertions

  public static int binary_search(final int[] A, int begin, int end, final int x) {
    if (A == null) return -1;
    assert 0 <= begin && begin <= end && end <= A.length;
    if (begin == end) return -1;
    while (true) {
      assert 0 <= begin && begin < end && end <= A.length;
      final int mid = (begin + end) / 2;
      assert begin <= mid && mid < end;
      if (A[mid] == x) return mid;
      if (begin + 1 == end) return -1;
      assert begin < mid;
      if (A[mid] < x) {
        begin = mid + 1;
        if (begin == end) return -1;
      }
      else end = mid;
    }
  }



Analysing binary search

We have a divide-and-conquer algorithm, with the characteristic recurrence T(n) = T(n/2) + 1. That is because we divide the array into two (nearly) equal parts, i.e., b = 2 in the standard form of the recurrence for the Master Theorem, while we only need to investigate one of the two parts (due to the sorting!), i.e., a = 1 for the Master Theorem. Finally, the work done for splitting happens in constant time, and thus c = 0 for the Master Theorem.


Analysing binary search (cont.)

We obtain the second case of the Master Theorem (log2(1) = 0), whence T(n) = Θ(lg n). Recall that this actually only implies an upper bound for the run-time of binary search — the lower bound implied by the implicit Ω holds only for the recurrence, but not necessarily for the run-time. However, it is not too hard to see that for this algorithm, and indeed for every search algorithm on a sorted array that uses only comparisons, at least lg(n) comparisons are needed in the worst case.
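The recurrence can also be checked directly by unrolling it, without invoking the Master Theorem (assuming for simplicity that n is a power of 2):

```latex
\begin{align*}
T(n) &= T(n/2) + 1 = T(n/4) + 2 = \dots = T(n/2^k) + k,\\
&\text{so with } k = \log_2 n:\quad T(n) = T(1) + \log_2 n = \Theta(\lg n).
\end{align*}
```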


Removing random access from vectors, gaining fast general insertion and deletion: Linked lists

With vectors we obtain random access (via indices, which are just natural numbers, so that arbitrary arithmetic can be performed with them) due to the contiguous and uniform storage scheme: underlying is an array, which is stored as one contiguous block of memory cells, all of the same size. But to maintain contiguity, only deletions and insertions at the end of the vector are efficient (amortised constant-time). If we give up contiguity, then we lose random access, but we gain efficient arbitrary deletions and insertions: (linked) lists. Lists formally implement a dictionary (search, insertion, deletion), but, different from “real” dictionaries, search is slow, while insertion and deletion are very fast, i.e., constant-time.


Pointers to next and previous elements

Like a vector, the elements of a list are arranged in a linear order. The basic idea here is that each element contains a pointer to the next and to the previous element of the list. So a list-object x is a triple:
- x.prev is a pointer to the previous element in the list;
- x.next is a pointer to the next element in the list;
- x.key contains the key (or the data, if there is no “key”).
For the first element of the list, x.prev is NIL, and for the last element, x.next is NIL. The whole list is represented by a pointer L to the first element (as usual, NIL if the list is empty).
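Such a triple can be sketched as a small Java class. This is our own minimal version with int keys for concreteness; the course's full List class (with a general Key type) is developed in the lab session.

```java
// A doubly-linked-list node: pointers to the previous and the next node,
// plus the key. Sketch only; names and the use of int keys are our choices.
class ListNode {
  ListNode prev; // previous element, or null (NIL) for the first element
  ListNode next; // next element, or null (NIL) for the last element
  int key;       // the key (or the data, if there is no "key")

  ListNode(final int key) { this.key = key; } // prev and next start as null
}
```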



Searching

The SEARCH-function in Java-like code, using List as the pointer-type (recall: nearly everything in Java is a pointer!):

  static List search(List L, final Key k) {
    while (L != null && L.key != k)
      L = L.next;
    return L;
  }

Note that if k is not found, then L automatically becomes NIL in the end (that is, null in Java).


Excursion: Searching, in C++

For comparison, the same code in C++:

  const List* search(const List* L, const Key k) {
    while (L != nullptr and (*L).key != k)
      L = (*L).next;
    return L;
  }

We see that in C/C++ we not only have pointers, but also values (such as the ints), and thus one can distinguish between pointers and values: the *-operator makes pointer-types from value-types, and dereferences pointers (to values). Further remarks: “const List*” means that we do not change the values. More idiomatic would be the use of the -> operator, writing for example L->key instead of (*L).key.


Insertion

Inserting a list-object x into list L, at the beginning, again as Java code:

  static List insert(List L, final List x) {
    assert x != null;
    x.next = L;
    x.prev = null;
    if (L != null) L.prev = x;
    L = x;
    return L;
  }

Note that the return value is the new list.


Deletion

Deleting the list-element x from list L:

  static List delete(List L, final List x) {
    assert x != null;
    assert L != null;
    if (x.prev != null) x.prev.next = x.next;
    else L = x.next;
    if (x.next != null) x.next.prev = x.prev;
    return L;
  }

Again the return value is the new list.



Other forms of linked lists

Our form of a linked list (the standard form) is more precisely called a doubly linked list, since we have back- and forth-pointers for every node. In a singly linked list we only have the next-pointer; see the tutorial for a discussion. A list can be sorted or unsorted, if a linear order on the keys is given. Finally, we can also have a circular list, if we link the first and last element appropriately.


Final remarks on lists

What we have outlined as the class List (see the lab session for the full implementation) would typically be considered a type ListNode, while the List class itself would be a kind of manager class, with the class ListNode as a private type hidden inside it. This ultimately yields more user convenience. The use of the field L.head in the book is a faint move in that direction. Our approach is rougher, a kind of swiss-army-knife approach (as one would do it in a simple-minded C implementation). But this also has its advantages ...


The concept of “pointer”

A “pointer” as a value is typically a memory address, and thus “points” to some value (at that memory address). Pointers are the basic means for indirection. In Java nearly everything (that is, besides the primitive types like int) is a pointer, so you can't see them! The situation is much better in C/C++, where we have full duality between pointers and values:
- For a type T we construct the pointer type T*.
- For a pointer p of type T*, via the dereference operator *p we get a value of type T (pointed to by p).
- For a value v of type T, via the reference operator &v we get a pointer of type T* (pointing to v).
So in C/C++ we have full control over values versus pointers. In Java, on the contrary, it is the compiler which decides for you:


The concept of “pointer” (cont.)

Variables for objects are always pointers. If you, for example, compare two such variables via x == y, the compiler assumes you mean the pointers x, y themselves. If on the other hand you write something like x.prev, then the value pointed to by x is taken.



Simulating pointers

The basic technique to simulate pointers of type T* is to use arrays, with indices replacing the pointers:

1. The objects (for example the list-objects) can be distributed over several arrays (in this case three arrays: two for the pointers (i.e., indices), one for the key), each array holding some part of the data, and with data belonging to the same object using the same index.
2. Or the objects can be put into a single array of type T (which might have advantages for caching).
3. Going to a more basic level, the single array might hold basic memory units (like real memory), where an object of type T then uses several units, and via an offset (stored in an offset-table) one accesses the parts (data members) of an object.


Allocation and deletion of objects

In Java, allocation of new objects (on the so-called “heap”) happens via the operator new; deallocation, however, is not under the control of the programmer, but happens via garbage collection. In C++, one has symmetry between allocation (operator new) and deallocation (operator delete), and furthermore new objects don't have to be created on the (program-)heap, but can also be created on the (program-)stack, namely as values (the slogan is “when in C++, do as the ints do”). Accordingly, Java only knows constructors, but no destructors, while C++ has both. C++ allows much more fine-grained memory control. Especially the time of deletion is under the full control of the programmer (which is important for time-critical applications, where garbage collection must not kick in).


The notion of a “tree”: origins

In Week 4 we defined the notion of a “tree” (see Sections B.5.1, B.5.2 and B.5.3 in CLRS for more information on this topic). This original notion of a tree (in our context) comes from mathematics, where a tree is a “connected undirected graph without cycles”. Our example T has vertices 1, . . . , 7 and the edges {1,2}, {2,3}, {3,4}, {3,5}, {5,6}, {5,7}:

    T =   1 - 2 - 3 - 4
                  |
                  5 - 6
                  |
                  7

From any vertex we can reach every other vertex, and this in an essentially unique way (this is the special property of trees).


The notion of a “tree”: roots

To obtain a rooted tree, one of the vertices of the tree has to be distinguished as a “root” (recall BFS and DFS), and then the tree is typically drawn growing downwards (in the direction of reading). For our example T we can make seven rooted trees out of T, for example choosing vertex 1 resp. vertex 6 as the root:

    (T, 1) =  1        (T, 6) =  6
              |                  |
              2                  5
              |                 / \
              3                7   3
             / \                  / \
            4   5                4   2
               / \                   |
              6   7                  1

The leaves of (T, 1) are 4, 6, 7; the leaves of (T, 6) are 7, 4, 1.



The height of rooted trees

Given a rooted tree T, its height ht(T) is the length of the longest path from the root to some leaf, where the length of a path is the number of edges on it. The recursive definition is as follows: the trivial tree, consisting just of the root, has height 0; if a rooted tree T has subtrees T1, . . . , Tm for m ≥ 1 (always at the root), then

    ht(T) = 1 + max(ht(T1), . . . , ht(Tm)).
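The recursive definition translates directly into code. Here is a sketch for binary trees; the minimal Node class and the handling of missing children are our own choices, not the lecture's implementation.

```java
// Height of a rooted (binary) tree, following the recursive definition:
// the trivial tree (just the root) has height 0; otherwise
// ht(T) = 1 + max over the heights of the subtrees at the root.
class TreeHeight {
  static class Node {
    Node left, right; // null = no child in that position
    Node(final Node l, final Node r) { left = l; right = r; }
  }

  static int ht(final Node t) {
    if (t.left == null && t.right == null) return 0; // trivial tree
    int m = -1; // maximum height among the subtrees that exist
    if (t.left != null)  m = Math.max(m, ht(t.left));
    if (t.right != null) m = Math.max(m, ht(t.right));
    return 1 + m;
  }
}
```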


The notion of a “tree”: order

The same two rooted trees as before can be drawn differently, since there is no order on the children of a node; for example, swapping the children of vertex 3 in both trees:

    (T, 1) =  1        (T, 6) =  6
              |                  |
              2                  5
              |                 / \
              3                7   3
             / \                  / \
            5   4                2   4
           / \                   |
          6   7                  1

However, if we consider ordered rooted trees, then the order of the children of a node is part of the tree, and the above re-drawings are different ordered rooted trees.


The notion of a “tree”: binary trees

To begin with, a binary tree is a rooted ordered tree, where every node which is not a leaf has either one or two children. But that is not quite complete: in case a node has one child, it must be said for this child whether it is a left child or a right child. For example, the rooted tree (T, 1) as drawn on the previous slide has 2^4 = 16 different versions as a binary tree, e.g. with 2 as the left resp. the right child of 1:

        1                1
       /                  \
      2                    2
       \                  /
        3                3
       / \              / \
      5   4            5   4
     / \              / \
    6   7            6   7


Nodes and pointers

A node of a binary tree has the following information naturally associated with it: its parent node, its left and its right child, and its label. There are four special cases:

1. the root has no parent node;
2. a node may have only a left child;
3. a node may have only a right child;
4. a leaf has no children.

The links to parents and children can be represented by pointers (possibly the null pointer), while the label is just an “attribute” (a “data member” or “instance variable” of the class).



Representing the example

Using as label just the key, we arrive at a class with four data members: “p” is the pointer to the parent, “left” and “right” are pointers to the left resp. right child, and “key” is the key. Considering the first binary tree from two slides before, we can represent the tree by 7 nodes v1, . . . , v7, where each node is a pointer and where the data members are given in order p, left, right, key:

    *v1 = (NIL, v2, NIL, 1)
    *v2 = (v1, NIL, v3, 2)
    *v3 = (v2, v5, v4, 3)
    *v4 = (v3, NIL, NIL, 4)
    *v5 = (v3, v6, v7, 5)
    *v6 = (v5, NIL, NIL, 6)
    *v7 = (v5, NIL, NIL, 7)
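In Java such a node might be sketched as follows, wiring up the seven nodes exactly as in the tuples above (the class name TreeNode and the helper buildExample are our own; null plays the role of NIL):

```java
// Node of a binary tree with parent pointer, child pointers and key,
// mirroring the data members p, left, right, key. Sketch only.
class TreeNode {
  TreeNode p, left, right; // parent, left child, right child (null = NIL)
  int key;

  TreeNode(final int key) { this.key = key; }

  // Wire up the seven nodes *v1, ..., *v7 exactly as listed above;
  // returns v1, the root.
  static TreeNode buildExample() {
    final TreeNode[] v = new TreeNode[8]; // index 0 unused
    for (int i = 1; i <= 7; ++i) v[i] = new TreeNode(i);
    v[1].left = v[2];                     v[2].p = v[1];
    v[2].right = v[3];                    v[3].p = v[2];
    v[3].left = v[5]; v[3].right = v[4];  v[5].p = v[3]; v[4].p = v[3];
    v[5].left = v[6]; v[5].right = v[7];  v[6].p = v[5]; v[7].p = v[5];
    return v[1];
  }
}
```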


Remark on pointers

Since in a Java environment students typically don't get a good understanding of pointers, here are some remarks on what we have already seen: note that dereferencing a pointer v is denoted by “*v” (the object to which it points). Since Java is “purely pointer-based”, one cannot get at the address of an object, or at the object behind a pointer: every variable is a pointer, and using it in most cases(!) means dereferencing it (an exception is for example the use of ==). Another exception in Java is the handling of primitive types, where we cannot have pointers. C and C++ have both values (objects) and pointers, and we have full symmetry (and full control): for an object x the address is “&x”; for a pointer p the object is “*p”.


Rooted trees with unbounded branching

A few remarks for the case where there are more than two children: for k-ary trees (generalised binary trees) we just need to add further data members to the class, with appropriate names; this works best for small k. For large or unbounded k (not known in advance), an easy way is to store the children in a list. Section 10.4 of CLRS contains a variation, where each element of the (singly-linked) list of children also contains a pointer to its parent.
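The children-in-a-list variant might be sketched like this in Java (our own names; we keep a parent pointer per node, which is simpler than, but in the spirit of, the CLRS Section 10.4 variation):

```java
import java.util.ArrayList;

// Rooted-tree node with unbounded branching: children are kept in a list.
// Sketch only; class and method names are our own choices.
class KTreeNode {
  KTreeNode parent;                                        // null = root
  final ArrayList<KTreeNode> children = new ArrayList<>(); // ordered children
  int key;

  KTreeNode(final int key) { this.key = key; }

  // Create a new child with key k, link it in, and return it.
  KTreeNode addChild(final int k) {
    final KTreeNode c = new KTreeNode(k);
    c.parent = this;
    children.add(c);
    return c;
  }
}
```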


How to implement vectors (dynamic arrays)

We already said that we could implement vectors by holding arrays, which are replaced by bigger arrays when needed (copying the old values over). We also said we could do this in such a way that insertion becomes amortised constant time. How is this possible?



On deletion in linked lists

Our implementation of delete returns the new value of the list L — when is this different from the original value?


A more powerful insertion

Our insert-function inserts at the beginning — how to insert anywhere?


Binary search in linked lists

Can we perform binary search for a sorted linked list? If yes, with what complexity?


Singly linked lists

A singly linked list has no prev-pointer (only the next-pointer): What are the advantages and disadvantages?