Data Structures continued (Tyler Moore, CSE 3353, SMU, Dallas, TX)


Data Structures continued

Tyler Moore

CSE 3353, SMU, Dallas, TX

February 7, 2013

Portions of these slides have been adapted from the slides written by Prof. Steven Skiena at SUNY Stony Brook, author of the Algorithm Design Manual. For more information see http://www.cs.sunysb.edu/~skiena/

POTD: Attempt parts (a) and (b) of Q1

Before class on Thursday, please attempt problem Q1 (a) and (b). You won’t turn anything in on Thursday, but I want to know whether you are able to code this first part of the problem successfully. It’s OK if you can’t get a working solution; in that case, bring me your errors! If you get stuck on an error, I want to know about it so we can discuss it with the class. There is a VERY good chance some of your classmates are experiencing similar trouble.


Variables in Python

Variables are better thought of as names or identifiers attached to an object. A nice explanation: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables


Key distinction: mutable vs. immutable objects

Immutable: objects whose value cannot change

  1. Tuples (makes sense)
  2. Booleans (surprise?)
  3. Numbers (surprise?)
  4. Strings (surprise?)

Mutable: objects whose value can change

  1. Dictionaries
  2. Lists
  3. User-defined objects (unless defined as immutable)

This distinction matters because it explains seemingly contradictory behavior.
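A minimal sketch of the distinction, using only built-in types. It assumes nothing beyond standard Python; the variable names are illustrative. Modifying an immutable object in place raises TypeError, while `+=` on a number rebinds the name to a different object.

```python
# Immutable objects reject in-place modification.
t = (1, 2, 3)
try:
    t[0] = 99
except TypeError as err:
    tuple_error = str(err)   # tuples do not support item assignment

s = "abc"
try:
    s[0] = "x"
except TypeError as err:
    string_error = str(err)  # strings do not support item assignment

# "Changing" a number really rebinds the name to another object.
n = 1000
m = n           # m and n name the same object
n += 1          # n now names a different object; m is untouched
same_after_increment = n is m

# Mutable objects can change while keeping their identity.
lst = [1, 2]
alias = lst
lst.append(3)   # modifies the one shared object
same_after_append = lst is alias

print(same_after_increment, same_after_append, m, alias)
```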


Variable assignment in action

>>> # variables are really names
... c = 4
>>> d = c
>>> c += 1
>>> c
5
>>> d  # d does not change because numbers are immutable
4
>>> # lists are mutable
... a = [1, 4, 2]
>>> b = a  # so this assigns the name b to the object attached to name a
>>> a.append(3)
>>> a
[1, 4, 2, 3]
>>> b  # b still points to the same object; its contents have just changed
[1, 4, 2, 3]


Im/mutability and function calls

>>> # let's try this in a function
... def increment(n):  # n is a name assigned to the function argument when called
...     # because numbers are immutable, the following
...     # reassigns n to the number represented by n+1
...     n += 1
...     return n
...
>>> a = 3
>>> increment(a)
4
>>> # a does not change
... a
3


Im/mutability and function calls

>>> def sortfun(s):
...     s.sort()
...     return s
...
>>> def sortfun2(s):
...     l = list(s)
...     l.sort()
...     return l
...
>>> a = [1, 4, 2]
>>> sortfun(a)
[1, 2, 4]
>>> a
[1, 2, 4]
>>> b = [3, 9, 1]
>>> sortfun2(b)
[1, 3, 9]
>>> b
[3, 9, 1]
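The same contrast shows up in Python's built-ins: `list.sort()` sorts the list it is called on and returns None, while `sorted()` builds a new list, much as sortfun2 does. A brief sketch (variable names are illustrative):

```python
a = [1, 4, 2]
result = a.sort()   # sorts a in place; the method itself returns None

b = [3, 9, 1]
c = sorted(b)       # builds a new sorted list; b is left alone

print(result, a)    # None [1, 2, 4]
print(c, b)         # [1, 3, 9] [3, 9, 1]
```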


Im/mutability and function calls

def selection_sort(s):
    """
    Input: list s to be sorted
    Output: sorted list
    """
    for i in range(len(s)):
        # don't name this min, since that would shadow the built-in function
        minidx = i
        for j in range(i + 1, len(s)):
            if s[j] < s[minidx]:
                minidx = j
        s[i], s[minidx] = s[minidx], s[i]
    return s

>>> b
[3, 9, 1]
>>> selection_sort(b)
[1, 3, 9]
>>> b
[1, 3, 9]


Empirically evaluating performance

Once you are confident that your algorithm is correct, you can evaluate its performance empirically. Python’s timeit module repeatedly runs code and reports the total execution time for all runs (divide by the number of runs to get the average).

timeit arguments:

  1. code to be executed, in string form
  2. any setup code that needs to be run before executing the code (note: setup code is only run once)
  3. parameter ‘number’, which indicates the number of times to run the code (default is 1000000)


Timeit in action: timing Python’s sort function and our selection sort

# store function in file called sortfun.py
import random

def sortfun(size):
    l = list(range(size))
    random.shuffle(l)
    l.sort()

>>> import timeit
>>> timeit.timeit("sortfun(1000)", "from sortfun import sortfun", number=100)
0.0516510009765625
>>> # here is the wrong way to test the built-in sort function:
... # setup runs only once, so after the first run l is already sorted
... timeit.timeit("l.sort()", "import random; l = list(range(1000)); random.shuffle(l)", number=100)
0.0010929107666015625
>>> # WRONG way to time selection sort
>>> timeit.timeit("selection_sort(l)", "from selection_sort import selection_sort; import random; l = list(range(1000)); random.shuffle(l)", number=100)
3.0629560947418213
>>> # RIGHT way to time selection sort
>>> timeit.timeit("import random; l = list(range(1000)); random.shuffle(l); selection_sort(l)", "from selection_sort import selection_sort", number=100)
3.0623178482055664
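An alternative sketch, assuming Python 3, where timeit also accepts a callable in place of a code string. Here every run sorts a fresh copy of one shuffled list, so no run sees already-sorted input and both candidates pay the same copying cost. The list size and run count are arbitrary choices for illustration, not values from the slides.

```python
import random
import timeit

def selection_sort(s):
    """Sort list s in place and return it."""
    for i in range(len(s)):
        minidx = i
        for j in range(i + 1, len(s)):
            if s[j] < s[minidx]:
                minidx = j
        s[i], s[minidx] = s[minidx], s[i]
    return s

data = list(range(200))
random.shuffle(data)

# sorted(data) and list(data) both copy the input before sorting,
# so the shuffled original is never modified between runs.
t_builtin = timeit.timeit(lambda: sorted(data), number=20)
t_selection = timeit.timeit(lambda: selection_sort(list(data)), number=20)

print("built-in sort:  %.6f s for 20 runs" % t_builtin)
print("selection sort: %.6f s for 20 runs" % t_selection)
```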


Dynamic Arrays

Unfortunately, we cannot adjust the size of simple arrays in the middle of a program’s execution.

Compensating by allocating extremely large arrays can waste a lot of space. With dynamic arrays we start with an array of size 1, and double its size from m to 2m each time we run out of space. How many times will we double for n elements? Answer: Only ⌈lg n⌉.
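The ⌈lg n⌉ answer can be checked with a short sketch. The helper name count_doublings is hypothetical; it simply simulates the capacity growing from 1 until it can hold n elements.

```python
import math

def count_doublings(n):
    """Return how many times a capacity-1 array doubles to hold n items."""
    capacity = 1
    doublings = 0
    while capacity < n:
        capacity *= 2
        doublings += 1
    return doublings

# Compare the simulated count with the closed form ceil(lg n).
for n in (1, 2, 5, 8, 1000):
    print(n, count_doublings(n), math.ceil(math.log2(n)))
```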


How Much Total Work?

The apparent waste in this procedure involves the recopying of the old contents on each expansion.

If half the elements move once, a quarter of the elements twice, and so on, the total number of movements M is given by

\[
M = \sum_{i=1}^{\lg n} i \cdot \frac{n}{2^i} = n \sum_{i=1}^{\lg n} \frac{i}{2^i} \le n \sum_{i=1}^{\infty} \frac{i}{2^i} = 2n
\]

Thus each of the n elements moves an average of only twice, and the total work of managing the dynamic array is the same O(n) as a simple array.
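To see the bound empirically, here is a toy dynamic array that counts every element copied during an expansion. The class and attribute names are hypothetical, and the accounting (copies performed per resize) differs slightly from the per-element view in the sum, but the total stays below 2n either way.

```python
class DynamicArray:
    """Toy dynamic array that counts element copies caused by resizing."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.slots = [None] * self.capacity
        self.moves = 0  # total elements copied during expansions

    def append(self, item):
        if self.size == self.capacity:
            self._grow()
        self.slots[self.size] = item
        self.size += 1

    def _grow(self):
        new_slots = [None] * (2 * self.capacity)
        for i in range(self.size):  # recopy the old contents
            new_slots[i] = self.slots[i]
            self.moves += 1
        self.slots = new_slots
        self.capacity *= 2

arr = DynamicArray()
n = 1024
for i in range(n):
    arr.append(i)

print(arr.moves, 2 * n)  # 1023 2048
```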


Advantages of Linked Lists

The relative advantages of linked lists over static arrays include:

  1. Overflow on linked structures can never occur unless the memory is actually full.
  2. Insertions and deletions are simpler than for contiguous (array) lists.
  3. With large records, moving pointers is easier and faster than moving the items themselves.

Dynamic memory allocation provides us with flexibility on how and where we use our limited storage resources.


Question

Are Python lists like dynamic arrays or linked lists?
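One way to probe this question empirically, assuming the CPython interpreter (the behavior is an implementation detail, not a language guarantee): watch the memory footprint reported by sys.getsizeof as items are appended. If the list were reallocated on every append, the size would change every time; in CPython it changes only occasionally, which is what an over-allocating dynamic array would do.

```python
import sys

lst = []
sizes = []
for i in range(64):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# Count how many appends actually changed the allocated size.
resizes = sum(1 for before, after in zip(sizes, sizes[1:]) if after != before)
print("appends:", len(sizes), "size changes:", resizes)
```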


Implementing Linked Lists in Python

You would never actually want to use linked lists in Python. Built-in lists are much more efficient. Nonetheless, implementing linked lists serves as a nice introduction to OOP in Python.

Code at http://lyle.smu.edu/~tylerm/courses/cse3353/code/linked_list.py

Compare to the C code in ADM pp. 68–70. Which do you prefer?


Implementing Linked Lists in Python

class Node:
    def __init__(self, item=None, next=None):
        self.item = item
        self.next = next

    def __str__(self):
        return str(self.item)


class LinkedList:
    def __init__(self):
        self.length = 0
        self.head = None

# code for Node and LinkedList in linked_list.py
import linked_list
lil = linked_list.LinkedList()


Inserting a Node

    # method of LinkedList: insert a new node at the head of the list
    def insert_list(self, item):
        node = Node(item)
        node.next = self.head
        self.head = node
        self.length = self.length + 1

lil.insert_list("a")

Searching the list

    # method of LinkedList: return the first node holding item, or None
    def search_list(self, item):
        node = self.head
        while node:
            if node.item == item:
                return node
            node = node.next
        return None

lil.search_list("b")

Deleting from the list

    # methods of LinkedList
    def predecessor_list(self, item):
        node = self.head
        while node.next:
            if node.next.item == item:
                return node
            node = node.next
        return None

    def delete_list(self, item):
        p = self.search_list(item)
        if p:
            if p is self.head:  # if p is the head, then there will be no predecessor
                self.head = p.next
            else:  # otherwise point predecessor to item's next element
                pred = self.predecessor_list(item)
                pred.next = p.next
            self.length = self.length - 1  # keep the stored length in sync

Representing the list as a string

    # method of LinkedList
    def __str__(self):
        node = self.head
        llstr = "["
        while node:
            llstr += "%s " % node.item
            node = node.next
        llstr += "]"
        return llstr
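Putting the pieces together, here is a compact, self-contained sketch of the whole class with a short usage run. It keeps the method names used above but is not the code in linked_list.py: deletion walks with a trailing pointer instead of a separate predecessor search, delete_list returns a boolean, and __str__ joins the items with spaces. Those are simplifying assumptions.

```python
class Node:
    def __init__(self, item=None, next=None):
        self.item = item
        self.next = next

class LinkedList:
    def __init__(self):
        self.length = 0
        self.head = None

    def insert_list(self, item):
        # New nodes go at the head, so insertion never walks the list.
        self.head = Node(item, self.head)
        self.length += 1

    def search_list(self, item):
        node = self.head
        while node:
            if node.item == item:
                return node
            node = node.next
        return None

    def delete_list(self, item):
        # Walk with a trailing pointer to remember the predecessor.
        prev, node = None, self.head
        while node and node.item != item:
            prev, node = node, node.next
        if node is None:
            return False           # item not present
        if prev is None:
            self.head = node.next  # deleting the head
        else:
            prev.next = node.next
        self.length -= 1
        return True

    def __str__(self):
        items = []
        node = self.head
        while node:
            items.append(str(node.item))
            node = node.next
        return "[" + " ".join(items) + "]"

lil = LinkedList()
for ch in "abc":
    lil.insert_list(ch)
print(lil)                        # [c b a]
print(lil.search_list("b").item)  # b
lil.delete_list("c")
print(lil, lil.length)            # [b a] 2
```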


Cost of operations in linked lists

What does node insertion cost in the worst case? What does node search cost in the worst case? What does node deletion cost in the worst case?


Stacks and Queues

Sometimes, the order in which we retrieve data is independent of its content, being only a function of when it arrived. A stack supports last-in, first-out operations: push and pop. A queue supports first-in, first-out operations: enqueue and dequeue. Lines in banks are based on queues, while food in my refrigerator is treated as a stack.


Python lists can be treated like stacks

Push: l.append()
Pop: l.pop()

What’s missing from list’s built-in methods to make queues possible?

List’s methods are ‘append’, ‘count’, ‘extend’, ‘index’, ‘insert’, ‘pop’, ‘remove’, ‘reverse’, ‘sort’.

enqueue():
dequeue():
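One possible way to fill in the two operations, offered as a sketch rather than the intended classroom answer. A plain list can act as a queue by appending at the back and popping from the front, but pop(0) shifts every remaining element. The standard library's collections.deque supports constant-time operations at both ends.

```python
from collections import deque

# Stack: last in, first out. Both operations work at the end of the list.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()         # "b"

# Queue on a plain list: enqueue at the back, dequeue from the front.
# pop(0) shifts every remaining element, so each dequeue costs O(n).
queue = []
queue.append("a")
queue.append("b")
first = queue.pop(0)      # "a"

# collections.deque offers O(1) appends and pops at both ends.
dq = deque()
dq.append("a")
dq.append("b")
first_dq = dq.popleft()   # "a"

print(top, first, first_dq)
```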


Dictionary

Perhaps the most important class of data structures maintains a set of items, indexed by keys.

Search(S, k): A query that, given a set S and a key value k, returns a pointer x to an element in S such that key[x] = k, or nil if no such element belongs to S.

Insert(S, x): A modifying operation that augments the set S with the element x.

Delete(S, x): Given a pointer x to an element in the set S, remove x from S. Observe we are given a pointer to an element x, not a key value.

Min(S), Max(S): Returns the element of the totally ordered set S which has the smallest (largest) key.

Next(S, x), Previous(S, x): Given an element x whose key is from a totally ordered set S, returns the next largest (smallest) element in S, or NIL if x is the maximum (minimum) element.

There are a variety of implementations of these dictionary operations, each of which yields different time bounds for various operations.


Different Ways to Implement Dictionaries

Array-based Sets: Unsorted Arrays

Operation                      Implementation                    Efficiency
Search(S, k)                   sequential search
Insert(S, x)                   place in first empty spot
Delete(S, x)                   copy nth item to the xth spot
Min(S), Max(S)                 sequential search
Successor(S, x), Pred(S, x)    sequential search

Array-based Sets: Sorted Arrays

Operation                      Implementation                    Efficiency
Search(S, k)                   binary search
Insert(S, x)                   search, then move to make space
Delete(S, x)                   move to fill up the hole
Min(S), Max(S)                 first or last element
Successor(S, x), Pred(S, x)    add or subtract 1 from pointer
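The sorted-array operations can be sketched with Python's standard bisect module. The helper names search and insert are hypothetical; bisect_left and insort are the real library calls. Binary search finds the slot, and insertion still has to shift later elements to make space.

```python
import bisect

keys = [3, 8, 15, 23, 42]   # a sorted array of keys

def search(sorted_keys, k):
    """Binary search: return the index of k, or None if absent."""
    i = bisect.bisect_left(sorted_keys, k)
    if i < len(sorted_keys) and sorted_keys[i] == k:
        return i
    return None

def insert(sorted_keys, k):
    """Find the slot by binary search; later items are shifted over."""
    bisect.insort(sorted_keys, k)

insert(keys, 10)
i = search(keys, 15)
print(keys)                       # [3, 8, 10, 15, 23, 42]
print(keys[0], keys[-1])          # min and max: 3 42
print(keys[i - 1], keys[i + 1])   # predecessor and successor of 15: 10 23
```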


How could you implement a dictionary in Python with an unsorted array?

class Item:
    def __init__(self, key, val):
        self.key = key
        self.val = val

    def __str__(self):
        return self.key + ", " + self.val


class Dictionary:
    def __init__(self):
        self.array = []

Working with a Dictionary object

>>> import dlist
>>> d = dlist.Dictionary()
>>> d.Insert("smu", "mustangs")
>>> d.Insert("texas", "longhorns")
>>> d.Insert("memphis", "tigers")
>>> d.Insert("tulsa", "golden hurricane")
>>> print(d)
{smu: mustangs, texas: longhorns, memphis: tigers, tulsa: golden hurricane, }
>>> print(d.Search("tulsa"))
tulsa, golden hurricane
>>> print(d.Delete("texas", "longhorns"))
>>> print(d)
{smu: mustangs, memphis: tigers, tulsa: golden hurricane, }


Inserting an element

Insert(S, x) A modifying operation that augments the set S with the element x


Search for an element

Search(S, k): A query that, given a set S and a key value k, returns a pointer x to an element in S such that key[x] = k, or nil if no such element belongs to S.

Code: http://lyle.smu.edu/~tylerm/courses/cse3353/code/dlist.txt
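For readers without access to the linked file, here is a minimal sketch of an unsorted-array dictionary. It borrows the Insert, Search, and Delete method names from the interactive session, but the bodies are assumptions, not the code in dlist. In particular, Delete uses the "copy nth item to the xth spot" trick from the unsorted-array table, so it does not preserve insertion order the way the session output appears to.

```python
class Item:
    def __init__(self, key, val):
        self.key = key
        self.val = val

    def __str__(self):
        return self.key + ", " + self.val

class Dictionary:
    """Hypothetical unsorted-array dictionary; not the code in dlist."""

    def __init__(self):
        self.array = []

    def Insert(self, key, val):
        # Place the new item in the first empty spot (the end): O(1).
        self.array.append(Item(key, val))

    def Search(self, key):
        # Sequential search: O(n) in the worst case.
        for item in self.array:
            if item.key == key:
                return item
        return None

    def Delete(self, key, val):
        # Overwrite the deleted slot with the last item, then shrink.
        for i, item in enumerate(self.array):
            if item.key == key and item.val == val:
                self.array[i] = self.array[-1]
                self.array.pop()
                return True
        return False

d = Dictionary()
d.Insert("smu", "mustangs")
d.Insert("texas", "longhorns")
d.Insert("tulsa", "golden hurricane")
print(d.Search("tulsa"))                # tulsa, golden hurricane
d.Delete("smu", "mustangs")
print([item.key for item in d.array])   # ['tulsa', 'texas']
```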
