A Comparison of Five Different Multiprocessor SoC Bus Architectures - PowerPoint PPT Presentation

A Comparison of Five Different Multiprocessor SoC Bus Architectures Kyeong Keol Ryu, Eung Shin and Vincent J. Mooney III School of Electrical and Computer Engineering Georgia Institute of Technology, Atlanta, USA {kkryu, eung, mooney}@ece.gatech.edu

Outline
• Introduction
• Motivation and Previous Work
• Five Bus Architectures for SoC: BFBA, GBIA, GBIIA, CSBA, and CCBA
• Application Examples: OFDM transmitter and MPEG2 decoder
• Experiment Environment
• Comparison in View of Algorithm and Architecture
• Comparison of Throughput of the Bus Architectures
• Conclusion

Introduction
[Figure: panels (A) and (B) on a PCB; four processor blocks A to D, each with an MPC750, SRAM, registers, and a Bi-FIFO on its own CPU bus (CPU Bus A to CPU Bus D)]

Motivation and Previous Work (I)
• CoreConnect (IBM): Processor Local Bus (PLB), On-chip Peripheral Bus (OPB)
• AMBA (ARM): Advanced High-performance Bus (AHB), Advanced Peripheral Bus (APB)
[Figure: Intellectual Property (IP) blocks IP1, IP2, and IP3 attached to the PLB and to the AHB]

Motivation and Previous Work (II)
• Sonics uNetwork: TDMA arbitration; IP reuse and integration
• Wishbone architecture (Silicore): one bus for all; supports multiple masters
• In terms of bus topology, uNetwork and Wishbone are similar to AMBA and CoreConnect

Five Bus Architectures for 4-Processor System (I)
• Bi-FIFO Bus Architecture (BFBA)
• Global Bus I Architecture (GBIA)
[Figure: panels (A) and (B); four processor blocks A to D, each with an MPC750, SRAM, registers, and a Bi-FIFO on its own CPU bus (CPU Bus A to CPU Bus D)]

Five Bus Architectures for 4-Processor System (II)
• Global Bus II Architecture (GBIIA)
• Crossbar Switch Bus Architecture (CSBA)
• IBM CoreConnect Bus Architecture (CCBA)

Application Examples (I)
OFDM Transmitter
• Block Diagram
• Data Format: 32 guard samples and 128 data samples
• Function Assignment
[Figure: function assignment over time on the compute nodes; Pro_A runs A1 to A4, Pro_B runs B1 to B4, Pro_C runs C1 to C4, Pro_D runs D1 to D4]
Reference: D. Kim and G. L. Stüber, "Performance of Multiresolution OFDM on Frequency-selective Fading Channels," IEEE Transactions on Vehicular Technology, vol. 48, no. 5, pp. 1740-1746, September 1999.

Application Examples (II)
MPEG2 Decoder
• Video Processing Example: 16 x 16 pixel resolution, M = 1, N = 2
[Figure: two timing diagrams of the compute nodes Pro_A to Pro_D processing SH, I, and P items over time; one diagram for BFBA and GBIA, one for GBIIA, CSBA, and CCBA]
SH: Sequence header, I: Intra decoding frame, P: Predictive decoding frame

Experiment Environment
Co-simulation Environment
• Seamless CVE: co-simulator from Mentor Graphics
• VCS: Verilog HDL simulator from Synopsys
• XRAY: high-level debugger from Mentor Graphics
• PowerPC C cross compiler: GCC
• External clock of PowerPC 750: 83.33 MHz (the internal clock speed can be much faster, e.g., 400 MHz)

Comparison in View of Algorithm and Architecture
Algorithm
OFDM Transmitter
• Strong output-data dependency between functions using many local variables
• Many short loops
• Few global variables
MPEG2 Decoder
• Many global variables for header information
• Hierarchical data structure which has a long loop with many nested loops
Architecture
BFBA and GBIA
• No method to access global data
• Fast data transfer between processor blocks
GBIIA, CSBA, and CCBA
• Efficient access of global data

Comparison of Throughput of the Bus Architectures (I)
OFDM Transmitter

Bus Architecture   Exe. Cycles/Packet   Exe. Time/Packet   Throughput
BFBA               378,348              4.5402 ms          1.1277 Mbps
GBIA               403,000              4.8360 ms          1.0588 Mbps
GBIIA              381,061              4.5727 ms          1.1197 Mbps
CSBA               380,199              4.5624 ms          1.1222 Mbps
CCBA               380,686              4.5682 ms          1.1208 Mbps

[Figure: bar chart of throughput in Mbps for BFBA, GBIA, GBIIA, CSBA, and CCBA]
Reference: 128 data samples and 32 guard samples per packet
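The execution times listed for the OFDM transmitter can be reproduced from the cycle counts and the external clock. The following is a minimal sketch, assuming the 83.33 MHz external clock corresponds to a 12 ns period; the function names are hypothetical and not from the slides. It also backs out the packet size implied by each throughput and time pair instead of assuming a sample width.

```python
# Assumption: an 83.33 MHz external clock is taken as a 12 ns period.
CLOCK_PERIOD_NS = 12

# Execution cycles per packet for the OFDM transmitter, copied from the slide.
OFDM_CYCLES = {
    "BFBA": 378_348,
    "GBIA": 403_000,
    "GBIIA": 381_061,
    "CSBA": 380_199,
    "CCBA": 380_686,
}


def cycles_to_ms(cycles: int) -> float:
    """Convert external-clock cycles to milliseconds."""
    return cycles * CLOCK_PERIOD_NS / 1e6


def implied_bits_per_packet(throughput_mbps: float, time_ms: float) -> float:
    """Packet size in bits implied by a throughput and a per-packet time."""
    return throughput_mbps * 1e6 * time_ms * 1e-3


if __name__ == "__main__":
    for name, cycles in OFDM_CYCLES.items():
        print(f"{name}: {cycles_to_ms(cycles):.4f} ms per packet")
    # Packet size implied by the BFBA row (1.1277 Mbps, 4.5402 ms).
    print(round(implied_bits_per_packet(1.1277, 4.5402)), "bits per packet")
```

Under this assumption the computed times agree with the listed values to four decimal places, which suggests the time column was derived directly from the cycle counts.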

Comparison of Throughput of the Bus Architectures (II)
MPEG2 Decoder

Bus Architecture   Exe. Cycles/Packet   Exe. Time/Packet   Throughput
BFBA               507,853              6.0942 ms          0.5041 Mbps
GBIA               527,545              6.3305 ms          0.4852 Mbps
GBIIA              377,562              4.5307 ms          0.6780 Mbps
CSBA               377,548              4.5306 ms          0.6781 Mbps
CCBA               378,181              4.5382 ms          0.6769 Mbps

[Figure: bar chart of throughput in Mbps for BFBA, GBIA, GBIIA, CSBA, and CCBA]
Reference: 128 data samples and 32 guard samples per packet
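To compare the two applications side by side, the throughput figures can be ranked and normalized. This is a small sketch using only the throughput values reported on the two throughput slides; the helper names and the choice of BFBA as the baseline are assumptions made for illustration.

```python
# Throughput in Mbps per bus architecture, copied from the two throughput slides.
THROUGHPUT_MBPS = {
    "OFDM": {"BFBA": 1.1277, "GBIA": 1.0588, "GBIIA": 1.1197,
             "CSBA": 1.1222, "CCBA": 1.1208},
    "MPEG2": {"BFBA": 0.5041, "GBIA": 0.4852, "GBIIA": 0.6780,
              "CSBA": 0.6781, "CCBA": 0.6769},
}


def best_architecture(results: dict) -> str:
    """Return the architecture with the highest throughput."""
    return max(results, key=results.get)


def relative_to(results: dict, baseline: str) -> dict:
    """Express every throughput as a ratio to the chosen baseline."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


if __name__ == "__main__":
    for app, results in THROUGHPUT_MBPS.items():
        print(app, "best:", best_architecture(results))
        for name, ratio in relative_to(results, "BFBA").items():
            print(f"  {name}: {ratio:.3f}x of BFBA")
```

With these numbers the spread across architectures is small for the OFDM transmitter, while for the MPEG2 decoder the three architectures with global data access deliver roughly a third more throughput than BFBA.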

Conclusion
Five bus architectures evaluated
• BFBA, GBIA, GBIIA, CSBA, and CCBA
Two application programs
• OFDM transmitter and MPEG2 decoder
Pipeline or parallel operation improves performance
BFBA best for OFDM
• pipelined applications
CSBA best for MPEG2
• parallel applications
Bus architecture performance heavily dependent on
• distribution of computation load
• algorithm style
Future work: combine the bus architectures with switching logic to maximize performance according to application characteristics
