

urbino worldwide campus applied computer science

Computer Architecture

alessandro bogliolo isti information science and technology institute 1/12

02.03 Representation of non-numerical sets

02 Information theory

02.03 Representation of non-numerical sets

  • Texts
  • Images
  • Signals (Audio/Video)
  • Redundancy and compression


Text

1. A text is a sequence of characters
2. Each character is taken from a finite alphabet
3. Using a constant-size encoding for the characters, a text is encoded as a concatenation of character codes
4. ASCII: 7-bit encoding
5. Extended ASCII: 8-bit encoding
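As a minimal sketch of point 3, the snippet below encodes a text as a concatenation of fixed-size character codes and decodes it again. The function names `encode_text` and `decode_text` are hypothetical, and the sketch relies on Python's `ord`, whose values coincide with ASCII for codes below 128.

```python
def encode_text(text: str, bits_per_char: int = 7) -> str:
    """Concatenate fixed-size binary codes, one per character."""
    codes = []
    for ch in text:
        code = ord(ch)  # matches the ASCII code for characters below 128
        if code >= 2 ** bits_per_char:
            raise ValueError(f"{ch!r} does not fit in {bits_per_char} bits")
        codes.append(format(code, f"0{bits_per_char}b"))
    return "".join(codes)


def decode_text(bits: str, bits_per_char: int = 7) -> str:
    """Split the bit string into fixed-size codes and map each back to a character."""
    chunks = [bits[i:i + bits_per_char] for i in range(0, len(bits), bits_per_char)]
    return "".join(chr(int(chunk, 2)) for chunk in chunks)


encoded = encode_text("Hi")
print(encoded)               # 10010001101001
print(decode_text(encoded))  # Hi
```

Because every code has the same size, the decoder needs no separators: it simply cuts the bit string every `bits_per_char` bits.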


Images

1. An image is a matrix of points with assigned colors
2. A real image contains infinitely many points, and each point may take one of infinitely many colors
3. Both space and color discretization are required
4. Discretized points are called pixels
5. Pixels are organized in a matrix
6. Using a constant-size encoding for each pixel, an image is a concatenation of pixels, to be read in a given order


Color (gray) levels

[Figure: gray scale divided into 16 intervals, labeled with the 4-bit codes 0000 to 1111]

  • Encoding associates a unique code with an interval of gray levels
  • All gray levels within the interval are associated with the same code, thus losing information
  • The original gray level cannot be exactly reconstructed from the code
  • Decoding associates each code with a unique gray level (representative of a class)
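The loss of information can be illustrated with a small sketch. It assumes gray levels are real numbers in [0, 1], uses 4-bit codes (16 intervals), and picks the midpoint of each interval as its representative; these choices and the names `encode_gray` and `decode_gray` are illustrative assumptions, not taken from the slides.

```python
N_BITS = 4                # assumed code size, giving 16 intervals
N_LEVELS = 2 ** N_BITS


def encode_gray(gray: float) -> str:
    """Map a gray level in [0, 1] to the code of the interval containing it."""
    if not 0.0 <= gray <= 1.0:
        raise ValueError("gray level must lie in [0, 1]")
    index = min(int(gray * N_LEVELS), N_LEVELS - 1)  # 1.0 falls in the last interval
    return format(index, f"0{N_BITS}b")


def decode_gray(code: str) -> float:
    """Map a code to the representative (assumed: midpoint) of its interval."""
    return (int(code, 2) + 0.5) / N_LEVELS


print(encode_gray(0.26), encode_gray(0.30))  # 0100 0100
print(decode_gray("0100"))                   # 0.28125
```

Both 0.26 and 0.30 receive the same code, and decoding returns neither of them: only the representative of their class can be recovered.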


2D images

[Figure: a 2D image as an nx × ny grid of pixels, each pixel holding one of nlev gray levels]

$$ \text{size} = n_x \cdot n_y \cdot \log_2 n_{lev} $$


Example

[Figure: example images at 100x100x1bit, 100x100x8bit, 50x50x1bit, 50x50x8bit, 10x10x8bit, and 10x10x1bit]
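The sizes of the six example images follow from the relation size = nx · ny · log2(nlev). The sketch below computes them; the function name `image_size_bits` is hypothetical, and rounding the bits per pixel up to an integer is an assumption that only matters when nlev is not a power of two.

```python
import math


def image_size_bits(nx: int, ny: int, n_levels: int) -> int:
    """Size in bits of an nx-by-ny image with n_levels possible values per pixel."""
    bits_per_pixel = math.ceil(math.log2(n_levels))  # assumption: round up to whole bits
    return nx * ny * bits_per_pixel


examples = [(100, 100, 1), (100, 100, 8), (50, 50, 1), (50, 50, 8), (10, 10, 8), (10, 10, 1)]
for nx, ny, depth in examples:
    # A depth of d bits per pixel corresponds to 2**d levels.
    print(f"{nx}x{ny}x{depth}bit -> {image_size_bits(nx, ny, 2 ** depth)} bits")
```

The 100x100x8bit image takes 80000 bits, while the 10x10x1bit image takes only 100.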


Analog and digital signals

  • Signal: time-varying physical quantity
      – Analog: continuous-time, continuous-value
      – Digital: discrete-time, discrete-value
  • The digital encoding of a continuous signal entails:
      – Sampling (i.e., time discretization)
      – Quantization (i.e., value discretization)

$$ \text{size} = s_{rate} \cdot T \cdot s_{size} $$

where s_rate is the sampling rate, T is the duration, and s_size is the sample size.
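The two discretization steps can be sketched as follows. The test signal (a 1 Hz sine), the value range [-1, 1], the 8 Hz sampling rate, the 3-bit sample size, and the names `sample`, `quantize`, and `tone` are all illustrative assumptions.

```python
import math


def sample(signal, duration_s: float, sampling_rate_hz: float) -> list[float]:
    """Time discretization: read the signal at regularly spaced instants."""
    n_samples = int(duration_s * sampling_rate_hz)
    return [signal(k / sampling_rate_hz) for k in range(n_samples)]


def quantize(value: float, n_bits: int, lo: float = -1.0, hi: float = 1.0) -> int:
    """Value discretization: map a value in [lo, hi] to one of 2**n_bits levels."""
    n_levels = 2 ** n_bits
    index = int((value - lo) / (hi - lo) * n_levels)
    return max(0, min(index, n_levels - 1))  # clamp values at or beyond the range ends


def tone(t: float) -> float:
    """Assumed analog signal: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * t)


SAMPLE_SIZE_BITS = 3
samples = sample(tone, duration_s=1.0, sampling_rate_hz=8)
codes = [quantize(v, SAMPLE_SIZE_BITS) for v in samples]
size_bits = len(codes) * SAMPLE_SIZE_BITS  # sampling rate x duration x sample size
print(codes, size_bits)
```

The final line reproduces the size relation: 8 samples per second for 1 second at 3 bits each gives 24 bits.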


Audio: time series

[Figure: audio signal plotted as value over time]

$$ \text{size} = s_{rate} \cdot T \cdot s_{size} = s_{rate} \cdot T \cdot \log_2 n_{lev} $$
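As a worked instance of the audio size relation (sampling rate × duration × log2 of the number of levels), the sketch below uses illustrative values: a 44.1 kHz sampling rate and 2^16 levels, as in CD-quality mono audio, for a one-minute recording. The function name `audio_size_bits` is hypothetical.

```python
import math


def audio_size_bits(sampling_rate_hz: int, duration_s: int, n_levels: int) -> int:
    """Size in bits of a single-channel audio stream."""
    sample_size = math.ceil(math.log2(n_levels))  # bits per sample
    return sampling_rate_hz * duration_s * sample_size


# Assumed example values: 44.1 kHz, 60 s, 65536 levels (16 bits per sample).
print(audio_size_bits(44_100, 60, 2 ** 16))  # 42336000
```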


Video

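[Figure: video as a sequence over time of nx × ny color frames]

$$ \text{size} = s_{rate} \cdot T \cdot s_{size} = s_{rate} \cdot T \cdot n_x \cdot n_y \cdot \log_2 n_{col} $$

where s_rate is the frame rate, n_col is the number of colors, and n_x n_y is the frame size.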
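The video size relation treats each frame as one sample whose size is that of an image. The sketch below evaluates it for assumed example values (25 frames per second, 10 seconds, 640x480 pixels, 2^24 colors); the function name `video_size_bits` is hypothetical.

```python
import math


def video_size_bits(frame_rate_hz: int, duration_s: int,
                    nx: int, ny: int, n_colors: int) -> int:
    """Size in bits of an uncompressed video: frames x pixels per frame x bits per pixel."""
    bits_per_pixel = math.ceil(math.log2(n_colors))
    frame_size = nx * ny * bits_per_pixel  # the sample size of the video signal
    return frame_rate_hz * duration_s * frame_size


# Assumed example values, not taken from the slides.
print(video_size_bits(25, 10, 640, 480, 2 ** 24))  # 1843200000
```

Ten seconds of such uncompressed video already take about 1.8 billion bits, which motivates the compression techniques discussed later.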


Redundancy

  • Redundant encoding: encoding that makes use of more than the minimum number of digits required by an exact encoding

$$ N > \log_M S $$

  • Motivations for redundancy:
      – Providing more expressive/natural encoding/decoding rules
      – Reliability (error detection). Ex: parity encoding
      – Noise immunity / fault tolerance (error correction). Ex: triplication


Redundancy: examples

  • Parity encoding:
      – A parity bit is used to guarantee that all codewords have an even number of 1’s
      – Single errors are detected by means of a parity check
      – Example: irredundant codeword 0010, encoded as 00101 (parity check: 0); with a single error, 01101 (parity check: 1, error detected)

  • Triple redundancy:
      – Each character is repeated 3 times
      – Single errors are corrected by means of majority voting
      – Example: 0010 is encoded as 000000111000; with a single error, 000000111010; the voting result is still 0010
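Both schemes can be sketched in a few lines, using the codewords from the examples above. The function names are hypothetical, and codewords are represented as strings of '0' and '1' for readability.

```python
def add_parity(bits: str) -> str:
    """Append a parity bit so that the codeword has an even number of 1's."""
    return bits + str(bits.count("1") % 2)


def parity_check(codeword: str) -> int:
    """Return 0 if the number of 1's is even (no error detected), 1 otherwise."""
    return codeword.count("1") % 2


def triplicate(bits: str) -> str:
    """Repeat each bit 3 times."""
    return "".join(b * 3 for b in bits)


def majority_vote(codeword: str) -> str:
    """Recover each bit as the majority value within its group of 3."""
    groups = [codeword[i:i + 3] for i in range(0, len(codeword), 3)]
    return "".join("1" if g.count("1") >= 2 else "0" for g in groups)


print(add_parity("0010"))             # 00101
print(parity_check("00101"))          # 0
print(parity_check("01101"))          # 1
print(triplicate("0010"))             # 000000111000
print(majority_vote("000000111010"))  # 0010
```

The parity check reveals that an error occurred but not where, so it can only detect; majority voting also restores the original bit, at the cost of tripling the size.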


Compression

  • Lossy compression
      – Compression achieved at the cost of reducing the accuracy of the representation
      – The original representation cannot be restored
      – Always effective

  • Lossless compression
      – Compression achieved by either removing redundancy or leveraging content-specific opportunities
      – The original representation can be restored
      – Not always effective
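The contrast can be sketched with two toy schemes chosen here for illustration (they are not named in the slides): run-length encoding as a lossless scheme, and discarding the 4 least significant bits of 8-bit values as a lossy one. All function names are hypothetical.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Lossless: replace each run of equal characters with a (character, length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Restore the original text exactly."""
    return "".join(ch * count for ch, count in pairs)


def drop_low_bits(values: list[int], dropped: int = 4) -> list[int]:
    """Lossy: keep only the most significant bits of each value."""
    return [v >> dropped for v in values]


def restore_low_bits(codes: list[int], dropped: int = 4) -> list[int]:
    """Approximate reconstruction: the dropped bits are gone for good."""
    return [c << dropped for c in codes]


print(rle_encode("aaaaaaaabbbb"))  # [('a', 8), ('b', 4)]
print(rle_encode("abcd"))          # [('a', 1), ('b', 1), ('c', 1), ('d', 1)]
print(rle_decode(rle_encode("aaaaaaaabbbb")) == "aaaaaaaabbbb")  # True

pixels = [200, 203, 17]
print(restore_low_bits(drop_low_bits(pixels)))  # [192, 192, 16]
```

Run-length encoding shortens the first string but gains nothing on "abcd", which has no runs to exploit; dropping bits always halves the size of 8-bit values, but 200 and 203 can no longer be told apart.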