
Data Representation - PowerPoint PPT Presentation



  1. Data Representation

  2. Data Representation ● Types of data: ● Numbers ● Text ● Audio ● Images & Graphics ● Video

  3. Analog vs Digital data ● How is data represented? ● What is a signal? ● Transmission of data ● Analog vs Digital ● Analog: Continuous signal ● Digital: Discrete signal

  4. Analog vs Digital data [Figure: an analog signal and a digital signal, with a threshold marked]

  5. Representing Text ● Document: Paragraphs, sentences, words ● All made up of characters ● English language has 26 letters ● 52 if you consider upper and lower case ● Punctuation characters ● Space ● Character sets: ASCII and Unicode

  6. ASCII Character Set

  7. ASCII Character Set Standard ASCII: 128 characters (7 bits); extended 8-bit sets: 256 characters – 8 bits = 1 byte ASCII: Character a --> Dec: 97 --> Binary: 01100001
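The character --> decimal --> binary mapping on this slide can be checked with a minimal Python sketch. The function name `describe_char` is made up for this illustration; `ord` and `format` are Python built-ins.

```python
def describe_char(ch: str) -> tuple[int, str]:
    """Return the decimal code and the 8-bit binary string for one character."""
    code = ord(ch)                    # character --> decimal code
    return code, format(code, "08b")  # decimal --> binary, zero-padded to 8 bits


code, bits = describe_char("a")
print(code, bits)  # 97 01100001
```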

  8. Unicode Character Set As originally designed, 16 bits per character: 2^16 = 65,536 characters (about 65,000) ASCII is a subset of Unicode
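The claim that ASCII is a subset of Unicode can be illustrated with a short sketch: an ASCII character has the same code in both sets, while a character outside ASCII has a Unicode code point but no ASCII encoding. The helper name `is_ascii_char` and the choice of the euro sign as the non-ASCII example are assumptions made for this illustration.

```python
def is_ascii_char(ch: str) -> bool:
    """True if the character's Unicode code point falls in the 7-bit ASCII range."""
    return ord(ch) < 128


# An ASCII character encodes to the same single byte in ASCII and in UTF-8.
same_bytes = "a".encode("ascii") == "a".encode("utf-8")

# The euro sign has a Unicode code point (U+20AC) but cannot be encoded as ASCII.
euro_code_point = ord("€")
try:
    "€".encode("ascii")
    euro_fits_ascii = True
except UnicodeEncodeError:
    euro_fits_ascii = False

print(same_bytes, euro_code_point, euro_fits_ascii)  # True 8364 False
```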

  9. Unicode Character Set Why Unicode?

  10. Some terminology 1 gigabyte of storage 20 years ago!

  11. Some terminology

  12. Some terminology Up to this point we have been talking about data in either bits or bytes. 1 byte = 8 bits. While this is the correct way to talk about data, it is sometimes inefficient. Therefore, we use prefixes to give an order of magnitude, much the same way we do with the metric system.

  13. Some terminology Kilobyte (KB) = 10^3 = 1000 bytes; Megabyte (MB) = 10^6 = 1 million bytes; Gigabyte (GB) = 10^9 = 1 billion bytes; Terabyte (TB) = 10^12 = 1 trillion bytes
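As a rough sketch of how these prefixes are applied, the following converts a raw byte count into the largest decimal unit that fits. The function name `format_bytes` and the one-decimal rounding are choices made for this example, not part of the slides.

```python
# Decimal (power-of-ten) units from the slide, largest first.
UNITS = [("TB", 10**12), ("GB", 10**9), ("MB", 10**6), ("KB", 10**3)]


def format_bytes(n: int) -> str:
    """Express a byte count using the largest prefix that does not exceed it."""
    for name, size in UNITS:
        if n >= size:
            return f"{n / size:.1f} {name}"
    return f"{n} bytes"


print(format_bytes(512))        # 512 bytes
print(format_bytes(1500))       # 1.5 KB
print(format_bytes(2 * 10**9))  # 2.0 GB
```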

  14. Data Compression Why compress data? Storage, transmission within PC/over network

  15. Data Compression What is data compression? Reducing physical size of information blocks

  16. Data Compression Compression ratio: tells us how much compression occurs; a number between 0 and 1. ratio = compressed/uncompressed, so compressed = ratio * uncompressed. Lossless versus lossy compression: lossy is acceptable for images, sound files, videos; lossless is needed for a database of names, numbers
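The ratio formula can be written as a one-line function. This is a minimal sketch; the name `compression_ratio` is chosen for illustration, and the sample sizes are the ones quoted on the keyword-encoding and Huffman slides of this deck.

```python
def compression_ratio(compressed: int, uncompressed: int) -> float:
    """ratio = compressed / uncompressed; values closer to 0 mean more compression."""
    if uncompressed <= 0:
        raise ValueError("uncompressed size must be positive")
    return compressed / uncompressed


print(round(compression_ratio(317, 352), 2))  # 0.9
print(round(compression_ratio(25, 64), 2))    # 0.39
```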

  17. Text Compression Examine three types of text compression: Keyword encoding Run-length encoding Huffman encoding

  18. Keyword Encoding Frequently used words replaced by a single character --> Reversible
      Word --> Symbol: as --> ^, the --> ~, and --> +, that --> $, must --> &, well --> %, these --> #
      Text: The human body is composed of many independent systems, such as the circulatory system, the respiratory system, and the reproductive system. Not only must all systems work independently, but they must interact and cooperate as well. Overall health is a function of the well being of separate systems, as well as how these separate systems work in concert.

  19. Keyword Encoding Frequently used words replaced by a single character --> Reversible
      Word --> Symbol: as --> ^, the --> ~, and --> +, that --> $, must --> &, well --> %, these --> #
      Original: The human body is composed of many independent systems, such as the circulatory system, the respiratory system, and the reproductive system. Not only must all systems work independently, but they must interact and cooperate as well. Overall health is a function of the well being of separate systems, as well as how these separate systems work in concert.
      Encoded: The human body is composed of many independent systems, such ^ the circulatory system, ~ respiratory system, + ~ reproductive system. Not only & all systems work independently, but they & interact and cooperate ^ % . Overall health is a function of ~ % being of separate systems, ^ % ^ how # separate systems work in concert.

  20. Keyword Encoding Frequently used words replaced by a single character --> Reversible
      Word --> Symbol: as --> ^, the --> ~, and --> +, that --> $, must --> &, well --> %, these --> #
      Original: The human body is composed of many independent systems, such as the circulatory system, the respiratory system, and the reproductive system. Not only must all systems work independently, but they must interact and cooperate as well. Overall health is a function of the well being of separate systems, as well as how these separate systems work in concert.
      Encoded: The human body is composed of many independent systems, such ^ the circulatory system, ~ respiratory system, + ~ reproductive system. Not only & all systems work independently, but they & interact and cooperate ^ % . Overall health is a function of ~ % being of separate systems, ^ % ^ how # separate systems work in concert.
      Reduced from 352 to 317 characters. Compression ratio: 317/352 = 0.9. Is this efficient?

  21. Keyword Encoding Frequently used words replaced by a single character --> Reversible
      Word --> Symbol: as --> ^, the --> ~, and --> +, that --> $, must --> &, well --> %, these --> #
      Drawbacks: Symbols used for encoding must not appear in the text. ‘The’ and ‘the’ need to be represented by different symbols. Would not gain anything by encoding ‘a’ and ‘I’. Most frequently used words are often short.
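Keyword encoding with the word/symbol table from these slides might be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: the function names are made up, matching is case-sensitive and whole-word only (so ‘The’ and ‘they’ are left alone), and every match is replaced, so the output can differ slightly from the hand-encoded text on the slides.

```python
import re

# Word --> symbol table taken from the slides.
KEYWORDS = {"as": "^", "the": "~", "and": "+", "that": "$",
            "must": "&", "well": "%", "these": "#"}
SYMBOLS = {symbol: word for word, symbol in KEYWORDS.items()}

# Match any keyword as a whole word only (so "the" inside "they" is skipped).
_WORD_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b")


def keyword_encode(text: str) -> str:
    """Replace each keyword with its symbol; refuse text that already uses a symbol."""
    if any(symbol in text for symbol in SYMBOLS):
        raise ValueError("text contains a character reserved as an encoding symbol")
    return _WORD_PATTERN.sub(lambda m: KEYWORDS[m.group(0)], text)


def keyword_decode(encoded: str) -> str:
    """Reverse the encoding by expanding every symbol back into its word."""
    return "".join(SYMBOLS.get(ch, ch) for ch in encoded)


sample = ("Not only must all systems work independently, "
          "but they must interact and cooperate as well.")
encoded = keyword_encode(sample)
print(encoded)
print(len(encoded) / len(sample))  # compression ratio for this sentence
```

The `ValueError` check reflects the first drawback listed above: decoding is only reversible if the symbols never occur in the original text.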

  22. Run-Length Encoding Also known as recurrence coding. Encodes a single character that is repeated over and over again. For example: replacing ‘AAAAAAA’ with ‘*A7’, where ‘*’ flags an encoded run, ‘A’ is the repeated character, and 7 is the repeat count. Drawbacks? Uses: DNA sequences, simple images. Lossy or lossless compression?
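A minimal run-length encoder and decoder using the flag/character/count format from this slide could look like the sketch below. Several details are assumptions made for the example: the count is a single digit (so runs longer than 9 are split), runs shorter than 4 are left as-is because the three-character code would not save space, and the flag character ‘*’ must not occur in the input.

```python
from itertools import groupby

FLAG = "*"    # marks the start of an encoded run
MAX_RUN = 9   # assumption: the count is stored as a single digit
MIN_RUN = 4   # assumption: shorter runs are copied unchanged


def rle_encode(text: str) -> str:
    """Replace long runs of one character with FLAG + character + count."""
    if FLAG in text:
        raise ValueError("input must not contain the flag character")
    out = []
    for ch, group in groupby(text):
        remaining = len(list(group))
        while remaining > 0:
            chunk = min(remaining, MAX_RUN)
            if chunk >= MIN_RUN:
                out.append(f"{FLAG}{ch}{chunk}")
            else:
                out.append(ch * chunk)
            remaining -= chunk
    return "".join(out)


def rle_decode(encoded: str) -> str:
    """Expand every FLAG + character + count triple back into the full run."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            out.append(encoded[i + 1] * int(encoded[i + 2]))
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)


print(rle_encode("AAAAAAA"))     # *A7
print(rle_encode("AAAAAAABBC"))  # *A7BBC
```

Because decoding restores the input exactly, this scheme is lossless; it only pays off when the data contains long runs.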

  23. Huffman Encoding Variable bit lengths to represent characters: a --> Binary 01100001 – 8 bits. Why would character X take up as many bits as a? Represent it using 5 bits instead. Saving space: frequently appearing characters are represented by shorter bit lengths

  24. Huffman Encoding
      Huffman Code --> Character: 00 --> A, 01 --> E, 100 --> L, 110 --> O, 111 --> R, 1010 --> B, 1011 --> D
      DOORBELL: D = 1011, O = 110, O = 110, … --> 1011 110 110 111 1010 01 100 100
      If we used fixed size bit string: 64 bits
      With Huffman encoding: 25 bits
      Compression ratio: 25/64 = 0.39
      What about the decoding process?
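The closing question about decoding can be explored with a short sketch that uses the code table from this slide. Decoding works by reading bits one at a time until the bits collected so far match a code; this is unambiguous because no code in the table is a prefix of another. The function names are made up for this illustration, and the sketch only applies a given table; it does not build a Huffman tree from character frequencies.

```python
# Character --> code table taken from the slide.
HUFFMAN_CODES = {"A": "00", "E": "01", "L": "100", "O": "110",
                 "R": "111", "B": "1010", "D": "1011"}


def huffman_encode(text: str) -> str:
    """Concatenate the variable-length code of each character."""
    return "".join(HUFFMAN_CODES[ch] for ch in text)


def huffman_decode(bits: str) -> str:
    """Read bits left to right, emitting a character whenever a full code is seen."""
    decode_table = {code: ch for ch, code in HUFFMAN_CODES.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in decode_table:
            out.append(decode_table[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


encoded = huffman_encode("DOORBELL")
print(encoded, len(encoded))    # 1011110110111101001100100 25
print(huffman_decode(encoded))  # DOORBELL
print(len("DOORBELL") * 8)      # 64 bits with a fixed 8 bits per character
```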
