CSE 326: Data Structures, Lecture #12, Bart Niswonger, Summer Quarter 2001



SLIDE 1

Today’s Outline

  • Unix Tutorial
    – What do you want covered?
  • Midterm
    – Amortized time
    – ADT vs. Data Structure

SLIDE 2

Intermediate Unix Tutorial

  • 2 minutes
  • 3 things you love about Unix
  • 3 things you hate
  • 5 things you wish you knew how to do
  • 1 gift idea

Amortized Time

  • Bounds worst-case running time
    – Over m operations
  • Worst case for a single operation may be really bad, but the worst case for m operations is bounded
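
One way to see the difference between a single bad operation and a bounded total is a growable array that doubles its capacity when full. The sketch below is only an illustration: the class `GrowArray` and its `copies()` counter are hypothetical names that do not appear in the lecture. A single `push` may copy every element, yet the total number of copies over m pushes stays below 2m.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical growable array, used only to illustrate amortized cost.
class GrowArray {
public:
    void push(int value) {
        if (count_ == capacity_) grow();  // rare, expensive step
        data_[count_++] = value;          // common, cheap step
    }
    int at(std::size_t i) const {
        assert(i < count_);
        return data_[i];
    }
    std::size_t size() const { return count_; }
    std::size_t copies() const { return copies_; }  // elements moved so far

private:
    // Double the capacity and copy the old contents across.
    void grow() {
        std::size_t newCap = capacity_ == 0 ? 1 : 2 * capacity_;
        std::unique_ptr<int[]> bigger(new int[newCap]);
        for (std::size_t i = 0; i < count_; ++i) {
            bigger[i] = data_[i];
            ++copies_;
        }
        data_ = std::move(bigger);
        capacity_ = newCap;
    }

    std::unique_ptr<int[]> data_;
    std::size_t count_ = 0;
    std::size_t capacity_ = 0;
    std::size_t copies_ = 0;
};
```

The push that triggers a resize costs time proportional to the current size, but resizes happen at sizes 1, 2, 4, 8, …, so their costs sum to less than twice the number of pushes.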

SLIDE 3

ADT vs. Data Structure

Abstract Data Type

  – Abstract
  – Operations & semantics
  – Data-less
  – One
  – No notion of running time or complexity

Data Structures

  – Concrete implementation
  – Set of algorithms
  – Holds data
  – Many
  – Very particular running times and complexities

Dictionary ADT

  • Dictionary operations
    – create
    – destroy
    – insert
    – find
    – delete
  • Stores values associated with user-specified keys
    – values may be any (homogeneous) type
    – keys may be any (homogeneous) comparable type

Example dictionary contents:

  • kimchi
    – spicy cabbage
  • Krispy Kreme
    – tasty doughnut
  • kiwi
    – Australian fruit
  • kale
    – leafy green
  • Krispix
    – breakfast cereal

Example operations:

  • insert: kohlrabi, upscale tuber
  • find(kiwi): returns kiwi, Australian fruit
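
The split between the ADT and a data structure that implements it can be sketched in C++ as an abstract interface plus one concrete class. Everything named here (`Dictionary`, `TreeDictionary`, `remove`, `at`) is an assumption made for illustration, not code from the course. `remove` stands in for the slide's "delete", which is a reserved word in C++, and create/destroy map onto the constructor and destructor.

```cpp
#include <cassert>
#include <map>
#include <string>

// The ADT: operations and their meaning. No data, no running times.
template <typename Key, typename Value>
class Dictionary {
public:
    virtual ~Dictionary() = default;
    virtual void insert(const Key& key, const Value& value) = 0;
    virtual const Value* find(const Key& key) const = 0;  // nullptr if absent
    virtual void remove(const Key& key) = 0;

    // Convenience lookup for keys the caller knows are present.
    const Value& at(const Key& key) const {
        const Value* v = find(key);
        assert(v != nullptr);
        return *v;
    }
};

// One of many possible data structures behind the ADT: it holds the data
// in a std::map, an ordered container with logarithmic-time operations.
template <typename Key, typename Value>
class TreeDictionary : public Dictionary<Key, Value> {
public:
    void insert(const Key& key, const Value& value) override {
        table_[key] = value;
    }
    const Value* find(const Key& key) const override {
        auto it = table_.find(key);
        return it == table_.end() ? nullptr : &it->second;
    }
    void remove(const Key& key) override { table_.erase(key); }

private:
    std::map<Key, Value> table_;
};
```

With a `TreeDictionary<std::string, std::string>`, inserting ("kohlrabi", "upscale tuber") and then calling `find("kiwi")` mirrors the operations on the slide. A hash table would be a second implementation of the same interface with different running times.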

SLIDE 4

Hash Table Approach

[Figure: a function f(x) maps the keys Kiwi, Kimchi, Kale, Kohlrabi, and Kumquat to table entries]

But… is there a problem in this pipe-dream?

Hash Table Dictionary Data Structure

  • Hash function: maps keys to integers
    – result: can quickly find the right spot for a given entry
  • Unordered and sparse table
    – result: cannot efficiently list all entries,
    – cannot find min and max efficiently,
    – cannot find all items within a specified range efficiently.

[Figure: f(x) maps the keys Kiwi, Kimchi, Kale, Kohlrabi, and Kumquat into the table]

SLIDE 5

Hash Table Terminology

  • hash function
  • collision
  • keys
  • load factor λ = (# of entries in table) / tableSize

[Figure: f(x) maps the keys Kimchi, Kale, Kohlrabi, Kumquat, and Kiwi into the table]

Hash Table Code (First Pass)

    Value& find(Key& key) {
      int index = hash(key) % tableSize;
      return Table[index];
    }

  • What should the hash function be? (for integers)
  • What should the table size be?
  • How should we resolve collisions?

SLIDE 6

A Good Hash Function…

…is easy (fast) to compute (O(1) and practically fast).

…distributes the data evenly (hash(a) ≠ hash(b)).

…uses the whole hash table (for all 0 ≤ k < size, there’s an i such that hash(i) % size = k).

A Good Hash Function for Integers

  • Choose
    – tableSize is prime
    – hash(n) = n % tableSize
  • Example:
    – tableSize = 7
    – insert(4), insert(17), find(12), insert(9), delete(17)

[Figure: hash table with slots numbered 0 to 6]
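
The slot each key in the example lands in can be checked with a few lines of C++. This is a minimal sketch: the function name `hashInt` and the adjustment for negative keys are assumptions added here, since the slide only gives hash(n) = n % tableSize.

```cpp
#include <cassert>

// hash(n) = n % tableSize, adjusted so that negative keys also
// land in the range [0, tableSize).
int hashInt(int n, int tableSize) {
    assert(tableSize > 0);
    int r = n % tableSize;
    return r < 0 ? r + tableSize : r;
}
```

With tableSize = 7, the keys 4, 17, 12, and 9 map to slots 4, 3, 5, and 2. No two of them collide, so find(12) looks in slot 5, which is still empty, and delete(17) clears slot 3.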

SLIDE 7

Good Hash Function for Strings?

  • I want to be able to:
    insert(“kale”)
    insert(“Krispy Kreme”)
    insert(“kim chi”)
  • Sum the ASCII values of the characters.
  • Consider only the first 3 characters.
    – Uses only 2871 out of 17,576 entries in the table on English words.
  • Let $s = s_1 s_2 s_3 s_4 \dots s_n$: choose
    – $\mathrm{hash}(s) = s_1 + s_2 \cdot 128 + s_3 \cdot 128^2 + s_4 \cdot 128^3 + \dots + s_n \cdot 128^{n-1}$
    – Think of the string as a base-128 number.
  • Problems:
    – hash(“really, really big”) = well… something really, really big
    – hash(“one thing”) % 128 = hash(“other thing”) % 128
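
Two of the candidates above can be tried directly to see where they go wrong. The sketch below is an illustration under assumptions: the names `sumHash` and `base128Hash` are made up here, and 64-bit unsigned arithmetic stands in for "something really, really big", so high-order digits are silently dropped once the power of 128 no longer fits.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Sum of the character codes. Order is ignored, so anagrams collide.
unsigned sumHash(const std::string& s) {
    unsigned h = 0;
    for (unsigned char c : s) h += c;
    return h;
}

// The string as a base-128 number with s[0] as the lowest-order digit.
// Unsigned arithmetic wraps around, so the value is only the low 64 bits
// of the true (possibly enormous) number.
std::uint64_t base128Hash(const std::string& s) {
    std::uint64_t h = 0;
    std::uint64_t power = 1;
    for (unsigned char c : s) {
        assert(c < 128);  // the scheme assumes 7-bit ASCII
        h += c * power;
        power *= 128;
    }
    return h;
}
```

`sumHash("kale")` equals `sumHash("leak")`, and `base128Hash(s) % 128` is just the first character of `s`, which is why “one thing” and “other thing” collide when the table size is 128. In this 64-bit sketch, characters past the tenth no longer affect the result at all.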

SLIDE 8

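Easy to Compute String Hash

  • Use Horner’s Rule

    int hash(const string& s) {
      int h = 0;
      // Work from the last character to the first, reducing mod
      // tableSize at every step so h never grows large.
      for (int i = (int)s.length() - 1; i >= 0; i--) {
        h = (s[i] + 128 * h) % tableSize;
      }
      return h;
    }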
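
Reducing mod tableSize inside the loop gives the same answer as building the whole base-128 number first and reducing once, and this can be checked on short strings. The sketch below is a self-contained version written under assumptions: `hornerHash` takes the table size as a parameter instead of reading a global, and `base128Value` is a hypothetical helper that is exact only for strings of at most 8 characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Horner's rule with a reduction at every step.
int hornerHash(const std::string& s, int tableSize) {
    assert(tableSize > 0);
    std::int64_t h = 0;
    for (std::size_t i = s.size(); i-- > 0;) {
        h = (static_cast<unsigned char>(s[i]) + 128 * h) % tableSize;
    }
    return static_cast<int>(h);
}

// Direct evaluation of s[0] + s[1]*128 + s[2]*128^2 + ...
// Exact only while the value fits in 64 bits, hence the length limit.
std::uint64_t base128Value(const std::string& s) {
    assert(s.size() <= 8);
    std::uint64_t value = 0;
    std::uint64_t power = 1;
    for (unsigned char c : s) {
        value += c * power;
        power *= 128;
    }
    return value;
}
```

For any string short enough for `base128Value`, `hornerHash(s, size)` equals `base128Value(s) % size`. Unlike the direct evaluation, the Horner version keeps working for strings of any length.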

Universal Hashing

  • For any fixed hash function, there will be some pathological sets of inputs
    – everything hashes to the same cell!
  • Solution: Universal Hashing
    – Start with a large (parameterized) class of hash functions
        • No sequence of inputs is bad for all of them!
    – When your program starts up, pick one of the hash functions to use at random (for the entire time)
    – Now: no bad inputs, only unlucky choices!
        • If the universal class is large, the odds of making a bad choice are very low
        • If you do find you are in trouble, just pick a different hash function and re-hash the previous inputs

SLIDE 9

“Random” Vector Universal Hash

  • Parameterized by prime size and vector: a = <a_0 a_1 … a_r> where 0 <= a_i < size
  • Represent each key as r + 1 integers where k_i < size
    – size = 11, key = 39752 ==> <3,9,7,5,2>
    – size = 29, key = “hello world” ==> <8,5,12,12,15,23,15,18,12,4>

$h_a(k) = \left( \sum_{i=0}^{r} a_i k_i \right) \bmod \text{size}$

dot product with a “random” vector!

Universal Hash Function

  • Strengths:
    – works on any type as long as you can form k_i’s
    – if we’re building a static table, we can try many a’s
    – a random a has guaranteed good properties no matter what we’re hashing
  • Weaknesses
    – must choose prime table size larger than any k_i
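
The dot-product hash can be sketched in a few lines of C++. The function names `dotHash` and `randomVector` are assumptions for illustration, as is the choice of `std::mt19937` as the random source; the lecture does not say how the vector a is drawn.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// h_a(k) = (a[0]*k[0] + a[1]*k[1] + ... + a[r]*k[r]) mod size
int dotHash(const std::vector<int>& a, const std::vector<int>& k, int size) {
    assert(size > 0 && a.size() == k.size());
    long long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum = (sum + static_cast<long long>(a[i]) * k[i]) % size;
    }
    return static_cast<int>(sum);
}

// Draw a "random" vector with 0 <= a[i] < size. Call once at start-up,
// and again only if the current choice turns out to be unlucky.
std::vector<int> randomVector(std::size_t length, int size, std::mt19937& rng) {
    assert(size > 0);
    std::uniform_int_distribution<int> pick(0, size - 1);
    std::vector<int> a(length);
    for (int& ai : a) ai = pick(rng);
    return a;
}
```

For the slide's key 39752 with size = 11, the key vector is <3,9,7,5,2>. Taking a = <1,2,3,4,5> as an arbitrary example, the dot product is 3 + 18 + 21 + 20 + 10 = 72, and 72 mod 11 = 6. If a table built with one vector shows too many collisions, drawing a fresh vector and re-hashing the stored keys is the recovery step described above.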

SLIDE 10

Hash Function Summary

  • Goals of a hash function
    – reproducible mapping from key to table entry
    – evenly distribute keys across the table
    – separate commonly occurring keys (neighboring keys?)
    – complete quickly
  • Example hash functions
    – h(n) = n % size
    – h(n) = string as base-128 number % size
    – One universal hash function: dot product with random vector

How to Design a Hash Function

  • Know what your keys are
  • Study how your keys are distributed
  • Try to include all important information in a key in the construction of its hash
  • Try to make “neighboring” keys hash to very different places
  • Prune the features used to create the hash until it runs “fast enough” (very application dependent)

SLIDE 11

Collisions

  • Pigeonhole principle says we can’t avoid all collisions
    – try to hash without collision m keys into n slots with m > n
    – try to put 6 pigeons into 5 holes
  • What do we do when two keys hash to the same entry?
    – open hashing: put little dictionaries in each entry (shove extra pigeons in one hole!)
    – closed hashing: pick a next entry to try
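
The two strategies can be compared side by side in a small sketch. Both classes below are hypothetical illustrations for integer keys with hash(n) = n % size. The "little dictionary" in each entry is a linked list, and "pick a next entry" is taken to mean the next slot in order (linear probing), which is only one of several possible choices.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Open hashing: each entry holds a list of the keys that hash there.
struct ChainedTable {
    std::vector<std::list<int>> buckets;
    explicit ChainedTable(int size) : buckets(size) { assert(size > 0); }
    int slot(int key) const {
        int n = static_cast<int>(buckets.size());
        return ((key % n) + n) % n;  // keep the index non-negative
    }
    bool find(int key) const {
        for (int k : buckets[slot(key)]) {
            if (k == key) return true;
        }
        return false;
    }
    void insert(int key) {
        if (!find(key)) buckets[slot(key)].push_back(key);
    }
};

// Closed hashing: every key lives in the table itself. On a collision,
// try the next entry, wrapping around at the end.
struct ProbingTable {
    std::vector<int> keys;
    std::vector<bool> used;
    explicit ProbingTable(int size) : keys(size), used(size, false) {
        assert(size > 0);
    }
    // Returns the table index examined at a given probe step.
    int probe(int key, int step) const {
        int n = static_cast<int>(keys.size());
        return (((key % n) + n) % n + step) % n;
    }
    // Returns false when the table is full.
    bool insert(int key) {
        for (int step = 0; step < static_cast<int>(keys.size()); ++step) {
            int i = probe(key, step);
            if (used[i] && keys[i] == key) return true;  // already present
            if (!used[i]) { keys[i] = key; used[i] = true; return true; }
        }
        return false;
    }
    bool find(int key) const {
        for (int step = 0; step < static_cast<int>(keys.size()); ++step) {
            int i = probe(key, step);
            if (!used[i]) return false;  // an empty entry ends the search
            if (keys[i] == key) return true;
        }
        return false;
    }
};
```

With a table of size 5, the keys 1 and 6 both hash to entry 1. The chained table stores both in that entry's list, while the probing table puts 6 in entry 2. Deletion is left out of the sketch because removing a key from a probing table needs extra care to keep later searches correct.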

To Do

  • Project II
  • Homework 4
  • Read Chapter 5 (fast!)

SLIDE 12

Coming Up

  • More hashing
  • Cool stuff!
  • Project III