TKT-2431 SoC design: Introduction to exercises

SLIDE 1

TKT-2431 SoC design

Introduction to exercises

SoC design / September 09

SLIDE 2

Exercises and the project work

Assistants: Juha Arvio (juha.arvio@tut.fi), Tero Arpinen (tero.arpinen@tut.fi)

  • In the project work, a simplified H.263 video encoder is implemented on the Altera DE2 FPGA Development and Education board
  • The project work consists of a set of exercises
      • After successfully finishing all the exercises, one should have a working H.263 video encoder
  • Exercises: Wed 12-14, Thu 12-14, Fri 10-12 (TC417)
      • Assistance is not available at any other time
  • All needed software is installed on the PCs of the class and can be used whenever the class is not reserved for other courses

SLIDE 3

Exercises and the project work

  • Attending the exercise hours is voluntary
      • The next assignment is introduced
      • Tools and algorithms are introduced
      • Hints are given
      • Questions are answered
  • Completing each of the exercises is mandatory
      • The returns have to be on time
      • The returns have to be accepted
  • Project work is carried out in groups of 1-2 students
      • Groups of 2 persons are preferred

SLIDE 4

Exercises and the project work

  • The project work consists of several phases and sub-tasks
      • Receiving and understanding the system requirements
      • Writing a system specification
      • Software implementation of the encoder
      • Functional verification on a PC workstation
      • Migrating the SW implementation onto the FPGA
      • Verification and performance profiling for the pure SW implementation
      • HW/SW partitioning and hardware acceleration
      • Verification and performance profiling for the accelerated implementation
      • Documentation

SLIDE 5

Exercises and the project work

  • Completed project work is valid for three successive exams
  • Bonus points
      • The maximum number of bonus points is 6
      • Given according to the quality of the returned exercises
      • Bonus point criteria will be explained during the first exercises
  • A more detailed description of the project work will be given during the first exercises
  • http://www.tkt.cs.tut.fi/kurssit/2431

SLIDE 6

Exercise 1 / Part 1

Introduction to topic

SLIDE 7

Topic of the work

  • A simplified H.263 video encoder on the DE2 FPGA Development and Education board
  • The system design flow
      • The requirements for the video encoder are introduced
      • A functional specification is written
      • A software implementation of the video encoder algorithm is written in ANSI C and verified on a PC workstation
      • An initial hardware architecture containing a single Nios II soft-core CPU and the necessary peripherals is synthesized for the FPGA
      • The software version is migrated to the Nios II processor on the FPGA
      • The design is partitioned into software and hardware according to the profiling results of the software implementation
          • The DCT algorithm is accelerated with dedicated logic
      • The accelerated system is implemented and verified on the FPGA
      • Performance analysis is also carried out for the accelerated system and compared with the pure software implementation
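The profiling step that drives the HW/SW partitioning can be sketched on a PC with the standard C `clock()` function. This is only an illustration: the stage names and the `profile_add` and `stage_share` helpers are hypothetical, and the tools actually used on the Nios II may differ.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stage identifiers; the pre-given encoder may be organized
   differently. */
enum stage { STAGE_DCT, STAGE_QUANT, STAGE_RLE, STAGE_VLC, STAGE_COUNT };

static clock_t stage_ticks[STAGE_COUNT];

/* Add measured CPU time to one stage. Typical use:
       clock_t t0 = clock();
       ... run the DCT for one block ...
       profile_add(STAGE_DCT, clock() - t0);                           */
void profile_add(enum stage s, clock_t ticks)
{
    assert((int)s >= 0 && (int)s < STAGE_COUNT);
    stage_ticks[s] += ticks;
}

/* Fraction (0..1) of all measured time that was spent in stage s. */
double stage_share(enum stage s)
{
    clock_t total = 0;
    assert((int)s >= 0 && (int)s < STAGE_COUNT);
    for (int i = 0; i < STAGE_COUNT; i++)
        total += stage_ticks[i];
    return total ? (double)stage_ticks[s] / (double)total : 0.0;
}
```

The stage with the largest share is the natural candidate for hardware acceleration. Because `clock()` is coarse, times should be accumulated over many blocks before the shares are compared.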

SLIDE 8

H.263

  • The basics of H.263 video encoding are explained during the following exercises
      • Students are encouraged to get familiar with video encoding algorithms in general before they start the project
      • H.263 has a lot in common with algorithms like JPEG and MPEG-2
  • A very simplified version of the H.263 video encoder (resembling Motion JPEG) is used
      • Only INTRA coding (i.e. prediction of subsequent frames is not applied)
      • The algorithms used are DCT (Discrete Cosine Transform), quantization, RLE (Run-Length Encoding), and VLC coding

SLIDE 9

Software

  • Altera Quartus II v7.2
      • System development front-end
      • Schematic editing
      • FPGA synthesis
      • SOPC Builder for building Avalon/Nios based systems
      • Integrated logic analyzer
  • Nios II IDE
      • Software development environment for the Nios II processor
      • Part of the Nios II development kit
  • Mentor Graphics ModelSim
      • Simulating own VHDL blocks/designs
  • ffplay
      • Video player
  • tmndec
      • H.263 decoder
  • nios2-terminal
      • Terminal software for reading from the JTAG UART

SLIDE 10

Hardware

  • Altera DE2 Development and Education Board
  • Cyclone II 2C35 FPGA
      • 33,216 logic elements
      • 483,840 bits of embedded RAM
      • 35 embedded multipliers
      • 4 PLLs
      • 475 user I/O pins (at maximum)
  • External memory devices
      • 4 MB Flash
      • 512 KB SRAM
      • 8 MB SDRAM
  • RS-232 serial port
      • Used for communication between the PC and the Nios II processor
  • USB Blaster port
      • Used for programming the FPGA (memory contents and HW configuration)
  • In addition, the board contains the following peripherals (not so relevant for the project)
      • Ethernet MAC/PHY device
      • 4x user push-buttons, 18x toggle switches
      • 18x red user LEDs, 9x green user LEDs
      • 8x dual 7-segment display
      • 2x expansion headers (40 user I/O pins / header)
      • SD flash connector header
      • 50 MHz and 27 MHz oscillators

SLIDE 11

Exercise returns

  • Exercises are returned as follows:
      • The return for an exercise has to be made by e-mail before 12:15 on Friday of the following week
      • The return has to be made to the corresponding assistant (Juha Arvio or Tero Arpinen for the English groups)
      • All the required documents have to be in either PDF or plain text format
      • The subject of the e-mail has the following form: SOCD_Ex<exercise_number>_G<group_number>, where <exercise_number> is the number of the exercise in question and <group_number> is the number of your group
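As a small illustration of the subject format, the sketch below builds the string with `snprintf`. The function name `format_return_subject` is hypothetical, and writing both numbers as plain decimals without leading zeros is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Write the subject "SOCD_Ex<exercise>_G<group>" into buf.
   Returns the subject length, or -1 if buf is too small.
   Assumption: both numbers are written as plain decimals. */
int format_return_subject(char *buf, size_t size, int exercise, int group)
{
    assert(buf != NULL && exercise > 0 && group > 0);
    int n = snprintf(buf, size, "SOCD_Ex%d_G%d", exercise, group);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

For exercise 2 and group 7 this produces the subject `SOCD_Ex2_G7`.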

SLIDE 12

Bonus points

  • Three main exercise returns are rated
      • Excellent: 1 bonus point for the exam
          • The returned document is very good and/or the returned source code works correctly and is well written
      • Accepted: no bonus
          • The returned document or code is acceptable
      • Rejected: no bonus, the return has to be corrected
          • Use common sense: do not return rubbish!
  • All the exercises have to be accepted
  • At most six bonus points for the exam can be obtained
      • 1 point can be obtained from each of the exercises 2, 5, and 12
      • 1 point for the first functional (HW accelerated) encoder implementation
      • 2 points for the fastest encoder implementation
      • 1 point for the second fastest encoder

SLIDE 13

Exercise 1, Part 2

Introduction to algorithms

SLIDE 14

Requirements for Video Transmission

  • Communication delay
      • More important in video conferencing applications than in file-based streaming applications
      • Should be as low as possible (< 250 ms, even 150 ms)
      • Should be kept as constant as possible
          • Avoiding a burst of frames followed by a still image
          • Buffering
  • Frame rate
      • Affects the perceived smoothness of motion
      • A video stream under 10 fps is perceived as a “fast slide show”
  • Image resolution
      • The data size of a raw image is directly proportional to the resolution
      • Depends on the application
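The link between resolution, frame rate, and raw data size can be worked out with a few lines of C. The sketch assumes 8-bit samples and 4:2:0 chroma subsampling, which the slide does not state, and the function names are illustrative.

```c
#include <assert.h>

/* Bytes in one raw frame. Assumes 8-bit samples and 4:2:0 subsampling:
   one full-size luma plane plus two quarter-size chroma planes. */
unsigned long raw_frame_bytes(unsigned width, unsigned height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    unsigned long luma = (unsigned long)width * height;
    return luma + 2 * (luma / 4);
}

/* Raw (uncompressed) bit rate in bits per second at the given frame rate. */
unsigned long raw_bits_per_second(unsigned width, unsigned height, unsigned fps)
{
    return raw_frame_bytes(width, height) * 8UL * fps;
}
```

Under these assumptions a 176 x 144 frame takes 38,016 bytes, so 10 frames per second already needs about 3.04 Mbit/s uncompressed. Doubling both dimensions to 352 x 288 quadruples the figure.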

SLIDE 15

Introduction to H.263 Standard

  • May 1996, ITU-T recommendation v1
  • Block-based (macroblock size is 16 pixels by 16 lines)
  • Motion estimation for temporal redundancy reduction
      • The same objects are likely to be present in adjacent frames
      • Half-pixel accurate motion vectors
  • DCT for spatial redundancy reduction
      • 8 x 8 blocks
      • Adjacent pixel values differ only a little
  • Quantization (lossy)
      • Control of compression ratio
  • RLE and Huffman as entropy coding algorithms
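How quantization trades accuracy for compression can be illustrated with a generic uniform quantizer. The sketch assumes a step size of 2 * QP with truncation toward zero and midpoint reconstruction. It does not reproduce the exact rules of the H.263 recommendation (for example the separate handling of the intra DC coefficient), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Quantize one DCT coefficient with step size 2*qp, truncating toward zero.
   A larger qp maps more coefficients to zero (stronger compression). */
int quantize(int coeff, int qp)
{
    assert(qp >= 1);
    int level = abs(coeff) / (2 * qp);
    return coeff < 0 ? -level : level;
}

/* Approximate reconstruction: the midpoint of the quantization interval.
   The difference from the original coefficient is the (lossy) error. */
int dequantize(int level, int qp)
{
    assert(qp >= 1);
    if (level == 0)
        return 0;
    int mag = qp * (2 * abs(level) + 1);
    return level < 0 ? -mag : mag;
}
```

With qp = 8, a coefficient of -37 becomes level -2 and is reconstructed as -40, while any coefficient with magnitude below 16 becomes zero.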

SLIDE 16

Block Diagram of H.263 Encoder

[Figure: block diagram of the H.263 encoder. Forward path: pre-processing, DCT, Q, entropy coding, bits out (Huffman, VLC). Feedback path: Q-1 (inverse quantization), IDCT, motion compensation and motion estimation on the previous reconstructed pictures (the same image as the decoder observes); motion estimation outputs the motion vector v(u,v). Annotations: motion vectors are 1/2 pixel accurate (interpolation); prediction error computation; an example 8x8 block of quantized coefficients consisting mostly of zeros, with the note that there is no need to send the zeros in an 8x8 block to the decoder.]

In Intra mode, MBs are coded directly

SLIDE 17

Motion Estimation

[Figure: motion estimation illustrated with claire.qcif. Labels: current frame (claire.qcif original), previous reconstructed frame (claire.qcif previous reconstructed), 16 x 16 macroblock, search range p, detected motion, motion vector (u,v), macroblock's prediction error (to be encoded).]
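Block-matching motion estimation can be sketched as a full search that minimizes the sum of absolute differences (SAD). Using SAD as the matching criterion, the integer-pixel search, and all names below are assumptions made for illustration. Half-pixel refinement is omitted, and the project's INTRA-only encoder does not use motion estimation at all.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MB_SIZE 16

/* SAD between the 16x16 macroblock at (cx, cy) in cur and the 16x16 area
   at (rx, ry) in ref. Both frames use the same line stride. */
static unsigned sad16(const unsigned char *cur, const unsigned char *ref,
                      int stride, int cx, int cy, int rx, int ry)
{
    unsigned sum = 0;
    for (int y = 0; y < MB_SIZE; y++)
        for (int x = 0; x < MB_SIZE; x++)
            sum += (unsigned)abs(cur[(cy + y) * stride + cx + x] -
                                 ref[(ry + y) * stride + rx + x]);
    return sum;
}

/* Full search over displacements in [-range, range] in both directions.
   Candidates that fall outside the frame are skipped. Writes the best
   motion vector to (*u, *v) and returns its SAD. */
unsigned motion_search(const unsigned char *cur, const unsigned char *ref,
                       int width, int height, int mbx, int mby, int range,
                       int *u, int *v)
{
    assert(mbx >= 0 && mby >= 0);
    assert(mbx + MB_SIZE <= width && mby + MB_SIZE <= height);
    unsigned best = UINT_MAX;
    *u = 0;
    *v = 0;
    for (int dy = -range; dy <= range; dy++) {
        for (int dx = -range; dx <= range; dx++) {
            int rx = mbx + dx, ry = mby + dy;
            if (rx < 0 || ry < 0 ||
                rx + MB_SIZE > width || ry + MB_SIZE > height)
                continue;
            unsigned s = sad16(cur, ref, width, mbx, mby, rx, ry);
            if (s < best) {
                best = s;
                *u = dx;
                *v = dy;
            }
        }
    }
    return best;
}
```

The prediction error to be encoded is then the difference between the current macroblock and the reference area that the best vector points to.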

SLIDE 18

Discrete Cosine Transform (DCT)

  • Assumption: adjacent pixels differ only a little from each other
      • Thus, data in the frequency domain is easier to compress
  • Spatial domain compression
  • Pixels are grouped into blocks and the blocks are then transformed into the frequency domain
  • Essential information is then in a more compact form
      • Important DCT coefficients are in the upper-left corner, that is, at low frequencies
  • Compression is achieved by discarding the less important information of the transformed block
      • Quantization of coefficients
  • DCT itself is a lossless transform
      • Limited accuracy of the coefficients, however, leads to some loss of information

SLIDE 19

Entropy Encoding

  • After quantization, the quantized coefficients are compressed in a lossless manner using entropy encoding
  • Run-length coding
      • Low-amplitude coefficients are likely to be quantized to zero
      • Successive non-zero quantized coefficients are arranged into combinations of (LAST, RUN, LEVEL)
          • LAST = whether this is the final non-zero coefficient in the block
          • RUN = number of preceding zeros
          • LEVEL = sign and magnitude of the non-zero coefficient
      • Coefficients are processed in zig-zag order
          • Because runs of zeros are most likely located at the higher frequencies
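The zig-zag scan and the (LAST, RUN, LEVEL) events can be sketched as follows. The table is the conventional 8x8 zig-zag order. The struct and function names are hypothetical, and the sketch treats all 64 coefficients alike, whereas H.263 codes the intra DC coefficient separately.

```c
#include <assert.h>
#include <stddef.h>

/* Conventional 8x8 zig-zag order: zigzag[i] is the raster index of the
   i-th coefficient visited. */
static const unsigned char zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

struct rle_event {
    int last;   /* 1 for the final non-zero coefficient of the block */
    int run;    /* number of zeros preceding this coefficient */
    int level;  /* signed value of the non-zero coefficient */
};

/* Scan block (64 quantized coefficients in raster order) in zig-zag order
   and emit one event per non-zero coefficient. events needs room for 64
   entries. Returns the number of events written (0 for an all-zero block). */
size_t rle_encode(const int block[64], struct rle_event events[64])
{
    size_t count = 0;
    int run = 0;
    assert(block != NULL && events != NULL);
    for (int i = 0; i < 64; i++) {
        int level = block[zigzag[i]];
        if (level == 0) {
            run++;
            continue;
        }
        events[count].last = 0;
        events[count].run = run;
        events[count].level = level;
        count++;
        run = 0;
    }
    if (count > 0)
        events[count - 1].last = 1; /* trailing zeros are never sent */
    return count;
}
```

A block whose only non-zero coefficients sit in the upper-left corner produces a handful of events, and the long run of zeros after the last one costs nothing.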

  • Huffman coding (variable length coding)
      • After RLE, the coefficients are encoded based on their statistical characteristics
      • Shorter codewords for symbols which occur with high probability
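The idea of variable length coding can be sketched with a small bit writer and a toy prefix code. The four-entry table below is invented for illustration and is not the H.263 VLC table. The names `bit_writer`, `bw_init`, `bw_put`, and `vlc_put_symbol` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct bit_writer {
    unsigned char *buf; /* output buffer, zero-filled by bw_init */
    size_t size;        /* buffer size in bytes */
    size_t bitpos;      /* number of bits written so far */
};

void bw_init(struct bit_writer *bw, unsigned char *buf, size_t size)
{
    memset(buf, 0, size);
    bw->buf = buf;
    bw->size = size;
    bw->bitpos = 0;
}

/* Append the low len bits of code, most significant bit first. */
void bw_put(struct bit_writer *bw, unsigned code, int len)
{
    assert(len >= 0 && bw->bitpos + (size_t)len <= bw->size * 8);
    for (int i = len - 1; i >= 0; i--) {
        if ((code >> i) & 1u)
            bw->buf[bw->bitpos / 8] |=
                (unsigned char)(0x80u >> (bw->bitpos % 8));
        bw->bitpos++;
    }
}

/* Toy prefix code (NOT the H.263 table): symbols assumed to be more
   frequent get shorter codewords. */
static const struct { unsigned code; int len; } toy_vlc[4] = {
    { 0x0, 1 }, /* symbol 0: "0"   (assumed most frequent) */
    { 0x2, 2 }, /* symbol 1: "10" */
    { 0x6, 3 }, /* symbol 2: "110" */
    { 0x7, 3 }, /* symbol 3: "111" (assumed least frequent) */
};

void vlc_put_symbol(struct bit_writer *bw, int symbol)
{
    assert(symbol >= 0 && symbol < 4);
    bw_put(bw, toy_vlc[symbol].code, toy_vlc[symbol].len);
}
```

Because no codeword is a prefix of another, a decoder can split the bit stream back into symbols without separators. A fixed-length code would spend 2 bits on every symbol, so the toy code wins whenever symbol 0 dominates.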

SLIDE 20

H.263 - Project work

  • A simplified version of the H.263 video encoder (resembling Motion JPEG) is used
      • Only INTRA coding (i.e. prediction of subsequent frames is not applied)
      • The algorithms used are DCT (Discrete Cosine Transform), quantization, RLE (Run-Length Encoding), and VLC coding
  • Image resolution used is QCIF (176 x 144)

[Figure: encoder and decoder block diagrams. Encoder: pre-processing, DCT, Q, entropy coding, producing a bitstream (011001011). Decoder: entropy decoding, Q-1 (inverse quantization), IDCT, reconstructed pictures.]
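The amount of work per QCIF frame follows from the block sizes given earlier. The sketch assumes 4:2:0 sampling, so that each 16 x 16 macroblock holds four luma and two chroma 8x8 blocks. The function names are illustrative.

```c
#include <assert.h>

#define MB_SIZE 16      /* macroblock: 16 x 16 luma samples */
#define BLOCKS_PER_MB 6 /* assumes 4:2:0: four luma + two chroma 8x8 blocks */

/* Number of macroblocks in a frame whose dimensions are multiples of 16. */
unsigned macroblocks_per_frame(unsigned width, unsigned height)
{
    assert(width % MB_SIZE == 0 && height % MB_SIZE == 0);
    return (width / MB_SIZE) * (height / MB_SIZE);
}

/* Number of 8x8 blocks, that is DCT invocations, needed per frame. */
unsigned dct_blocks_per_frame(unsigned width, unsigned height)
{
    return macroblocks_per_frame(width, height) * BLOCKS_PER_MB;
}
```

For QCIF this gives 11 x 9 = 99 macroblocks and 594 DCT blocks per frame, which indicates how often the accelerated DCT will be invoked.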

SLIDE 21

Design flow

[Figure: design flow chart. Boxes: Requirements, Specification, SW Implementation, HW/SW partitioning, Final Implementation, Verification, Documentation. Performance analysis appears at several points in the flow.]

SLIDE 22

Specification

  • This week the specification of the encoder is started
  • The required C source code for the encoder is pre-given
      • Can be downloaded from the course web pages
  • You have to write a simple specification for the video encoder system you are going to implement
  • The specification does not have to be long
      • It is the quality of the contents that matters
      • 4-7 pages in total (including the chapters introduced next week)
  • The specification should be written before the implementation
      • An implementation document will be written later
  • A diagram of the video encoding flow is required
      • Control and data flow diagram describing how the pre-given H.263 functions are used

SLIDE 23

Specification (2)

  • 1. Introduction
      • What is being specified
  • 2. Flow of encoding
      • Present the different phases of the encoding
      • Explain the encoding flow briefly
      • A flow diagram of encoding is required!
  • 3. Encoder interface
      • Inputs and outputs of the encoder
      • What kind of data is read in?
      • What is the output data like?
  • 4. Description of algorithms
      • Function prototypes
      • Description of function parameters and return values
      • Description of function behavior and purpose in this design
      • At least DCT, quantization, RLE, and VLC have to be covered here

The subsequent sections will be written in exercise 2.

SLIDE 24

Links on H.263 related material

  • http://www.itu.int/rec/T-REC-H.263/
      • ITU-T specification of H.263
  • http://www.jaxstream.com/products/jaxspeed/wp_m4venc.pdf
      • Basics of MPEG-4 video encoding
  • http://www.ece.purdue.edu/~ace/jpeg-tut/jpegtut1.html
      • JPEG tutorial