22. Greedy Algorithms

Fractional Knapsack Problem, Huffman Coding [Cormen et al., Ch. 16.1, 16.3]


1. The Fractional Knapsack Problem

Given a set of n ∈ ℕ items {1, ..., n}. Each item i has a value v_i ∈ ℕ and a weight w_i ∈ ℕ. The maximum total weight (capacity) is given as W ∈ ℕ. The input is denoted as E = (v_i, w_i)_{i=1,...,n}.

Wanted: fractions 0 ≤ q_i ≤ 1 (1 ≤ i ≤ n) that maximise the sum ∑_{i=1}^{n} q_i · v_i under the constraint ∑_{i=1}^{n} q_i · w_i ≤ W.

Greedy heuristic: sort the items decreasingly by value per weight v_i / w_i, so that from now on v_i / w_i ≥ v_{i+1} / w_{i+1}. Let j = max{0 ≤ k ≤ n : ∑_{i=1}^{k} w_i ≤ W} and set

  q_i = 1 for all 1 ≤ i ≤ j,
  q_{j+1} = (W − ∑_{i=1}^{j} w_i) / w_{j+1},
  q_i = 0 for all i > j + 1.

This is fast: Θ(n log n) for sorting and Θ(n) for the computation of the q_i. A sketch of the procedure is given below; the correctness argument follows in the next section.
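The following is a minimal Python sketch of the greedy heuristic described above, assuming the input is given as a list of (value, weight) pairs; the function name, the return format and the example numbers are illustrative and not part of the slides.

def fractional_knapsack(items, W):
    """Greedy heuristic: items is a list of (v_i, w_i) pairs, W the capacity.
    Returns the fractions q_i (in input order) and the achieved total value."""
    n = len(items)
    # Sort indices decreasingly by value per weight v_i / w_i: Theta(n log n).
    order = sorted(range(n), key=lambda i: items[i][0] / items[i][1], reverse=True)
    q = [0.0] * n
    remaining = W
    value = 0.0
    for i in order:                  # Theta(n) for the computation of the q_i
        v, w = items[i]
        if w <= remaining:           # item fits completely: q_i = 1
            q[i] = 1.0
            remaining -= w
            value += v
        else:                        # first item that does not fit: take a fraction
            q[i] = remaining / w
            value += q[i] * v
            break                    # all remaining items get q_i = 0
    return q, value

# Example (assumed numbers): capacity 50, items given as (value, weight).
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))  # q = [1, 1, 2/3], value 240.0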

2. Correctness

Assume (r_i) (1 ≤ i ≤ n) is an optimal solution. In both solutions the knapsack is full: ∑_i r_i · w_i = ∑_i q_i · w_i = W. Consider k, the smallest index i with r_i ≠ q_i; by definition of the greedy choice, q_k > r_k. Let x = q_k − r_k > 0. Construct a new solution (r'_i): r'_i = r_i for all i < k, r'_k = q_k, and remove weight ∑_{i=k+1}^{n} δ_i = x · w_k from the items k+1, ..., n. This is possible because ∑_{i=k}^{n} r_i · w_i = ∑_{i=k}^{n} q_i · w_i. Then

  ∑_{i=k}^{n} r'_i · v_i = r_k · v_k + x · w_k · (v_k / w_k) + ∑_{i=k+1}^{n} (r_i · w_i − δ_i) · (v_i / w_i)
                         ≥ r_k · v_k + x · w_k · (v_k / w_k) + ∑_{i=k+1}^{n} (r_i · w_i · (v_i / w_i) − δ_i · (v_k / w_k))
                         = r_k · v_k + x · w_k · (v_k / w_k) − x · w_k · (v_k / w_k) + ∑_{i=k+1}^{n} r_i · v_i
                         = ∑_{i=k}^{n} r_i · v_i,

where the inequality uses v_i / w_i ≤ v_k / w_k for i > k and the following equality uses ∑_{i=k+1}^{n} δ_i = x · w_k. Thus (r'_i) is also optimal. Iterative application of this idea generates the solution (q_i).

Huffman Codes

Goal: memory-efficient storage of a sequence of characters using a binary code with code words.

Example: a file consisting of 100,000 characters from the alphabet {a, ..., f}.

                                    a     b     c     d     e     f
  Frequency (thousands)            45    13    12    16     9     5
  Code word with fixed length     000   001   010   011   100   101
  Code word with variable length    0   101   100   111  1101  1100

File size with the fixed-length code: 300,000 bits. File size with the variable-length code: 224,000 bits.

We consider prefix codes: no code word is a prefix of another code word. Compared with other codes, prefix codes can achieve optimal data compression (without proof here). Encoding is the concatenation of the code words without a stop character (in contrast to Morse code):

  affe → 0 · 1100 · 1100 · 1101 → 0110011001101

Decoding is simple because the code is a prefix code:

  0110011001101 → 0 · 1100 · 1100 · 1101 → affe

Code trees: both codes can be represented as binary trees whose leaves are the characters with their frequencies and whose inner nodes carry the sum of the frequencies below them (the slide shows the tree of the fixed-length code and the tree of the variable-length code, each with root frequency 100).
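As a concrete check of the example above, here is a short Python sketch (not part of the slides) that encodes and decodes with the variable-length code from the table and recomputes the two file sizes; the names CODE and FREQ_THOUSANDS are assumptions chosen for illustration.

# Variable-length prefix code and frequencies from the example table.
CODE = {'a': '0', 'b': '101', 'c': '100', 'd': '111', 'e': '1101', 'f': '1100'}
FREQ_THOUSANDS = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}

def encode(text, code=CODE):
    # Concatenation of the code words without a stop character.
    return ''.join(code[ch] for ch in text)

def decode(bits, code=CODE):
    # Prefix property: scanning left to right, at most one code word ever matches.
    inverse = {w: ch for ch, w in code.items()}
    decoded, word = [], ''
    for bit in bits:
        word += bit
        if word in inverse:      # unique match because no code word is a prefix of another
            decoded.append(inverse[word])
            word = ''
    return ''.join(decoded)

assert encode('affe') == '0110011001101'
assert decode('0110011001101') == 'affe'

# File sizes: 3 bits per character with the fixed-length code vs. the variable-length code.
fixed_bits = 3 * 1000 * sum(FREQ_THOUSANDS.values())                             # 300,000
variable_bits = 1000 * sum(f * len(CODE[c]) for c, f in FREQ_THOUSANDS.items())  # 224,000
print(fixed_bits, variable_bits)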

3. Properties of the Code Trees

An optimal coding of a file is always represented by a full binary tree: every inner node has two children. Let C be the set of all code words, f(c) the frequency of a code word c and d_T(c) the depth of a code word c in the tree T. Define the cost of a tree as

  B(T) = ∑_{c ∈ C} f(c) · d_T(c)

(the cost equals the number of bits of the encoded file). In the following, a code tree is called optimal if it minimises the cost.

Algorithm idea: construct the tree bottom-up. Start with the set C of code words and iteratively replace the two nodes with smallest frequency by a new parent node whose frequency is the sum of the two.

Algorithm Huffman(C)
  Input:  code words c ∈ C
  Output: root of an optimal code tree
  n ← |C|
  Q ← C
  for i = 1 to n − 1 do
      allocate a new node z
      z.left ← ExtractMin(Q)     // extract code word with minimal frequency
      z.right ← ExtractMin(Q)
      z.freq ← z.left.freq + z.right.freq
      Insert(Q, z)
  return ExtractMin(Q)

Analysis: use a heap. Building the heap costs O(n), and each Extract-Min costs O(log n) for n elements. This yields a runtime of O(n log n). A Python sketch using a binary heap is given below.
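The following is a minimal Python sketch of Huffman(C) using the standard-library heap module heapq; representing the tree as nested tuples and using a counter as a tie-breaker are implementation choices for this sketch, not part of the slides.

import heapq
from itertools import count

def huffman(freq):
    """Huffman(C): build an optimal code tree for the frequencies in `freq`
    (a dict symbol -> frequency); the tree is returned as nested tuples."""
    tie = count()   # tie-breaker so entries with equal frequency compare without error
    # Q initially contains one leaf per code word; heapify costs O(n).
    q = [(f, next(tie), sym) for sym, f in freq.items()]
    heapq.heapify(q)
    for _ in range(len(q) - 1):          # n - 1 merges
        f1, _, left = heapq.heappop(q)   # ExtractMin, O(log n)
        f2, _, right = heapq.heappop(q)  # ExtractMin, O(log n)
        heapq.heappush(q, (f1 + f2, next(tie), (left, right)))   # Insert the new parent z
    return heapq.heappop(q)[2]           # root of an optimal code tree

def code_words(tree, prefix=''):
    """Read the code words off the tree: left edge = 0, right edge = 1."""
    if not isinstance(tree, tuple):      # leaf: a single symbol
        return {tree: prefix or '0'}
    left, right = tree
    words = code_words(left, prefix + '0')
    words.update(code_words(right, prefix + '1'))
    return words

# Frequencies (in thousands) from the example above.
freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
code = code_words(huffman(freq))
print(code)
print(sum(freq[c] * len(w) for c, w in code.items()))   # cost B(T) = 224 (thousand bits)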

4. The greedy approach is correct

Theorem: Let x, y be two symbols with smallest frequencies in C, and let T'(C') be an optimal code tree for the alphabet C' = C − {x, y} + {z} with a new symbol z with f(z) = f(x) + f(y). Then the tree T(C) that is constructed from T'(C') by replacing the leaf z by an inner node with children x and y is an optimal code tree for the alphabet C.

Proof: It holds that

  f(x) · d_T(x) + f(y) · d_T(y) = (f(x) + f(y)) · (d_{T'}(z) + 1) = f(z) · d_{T'}(z) + f(x) + f(y),

and thus B(T') = B(T) − f(x) − f(y).

Assume T is not optimal. Then there is an optimal tree T'' with B(T'') < B(T). We may assume that x and y are siblings in T''. Let T''' be the tree in which the inner node of T'' with children x and y is replaced by the leaf z. Then

  B(T''') = B(T'') − f(x) − f(y) < B(T) − f(x) − f(y) = B(T'),

a contradiction to the optimality of T'. The assumption that x and y are siblings in T'' is justified because moving the elements with smallest frequency down to the lowest level of the tree can at most decrease the value of B.
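As a small numeric sanity check of the identity B(T') = B(T) − f(x) − f(y) (not part of the slides), consider the variable-length tree of the earlier example: the two smallest frequencies are f(e) = 9 and f(f) = 5, and replacing their parent by a leaf z with f(z) = 14 reduces the cost by exactly 14 (thousand bits). The depths below are read off that example tree and are assumptions tied to it.

# Depths d_T(c) in the variable-length code tree of the example (code word lengths).
depth_T = {'a': 1, 'b': 3, 'c': 3, 'd': 3, 'e': 4, 'f': 4}
freq    = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
B_T = sum(freq[c] * depth_T[c] for c in freq)            # 224 (thousand bits)

# T': the parent of e and f becomes a leaf z with f(z) = f(e) + f(f) at depth 3.
depth_Tp = {'a': 1, 'b': 3, 'c': 3, 'd': 3, 'z': 3}
freq_p   = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'z': freq['e'] + freq['f']}
B_Tp = sum(freq_p[c] * depth_Tp[c] for c in freq_p)      # 210

assert B_Tp == B_T - freq['e'] - freq['f']               # B(T') = B(T) - f(x) - f(y)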
