IA-64, the Trillian project, and CERN’s involvement



SLIDE 1

IA-64, the Trillian project, and CERN’s involvement

S. Jarp, CERN
CERN Computing Seminar
17 May 2000

SLIDE 2

Main justifications

Contribute to Open Source
Be fully prepared to evaluate IA-64 for LHC computing
Influence key hardware/software vendors
  • CERN benchmarks for compiler improvements, etc.

SLIDE 3

Agenda

What is this new architecture (IA-64)?
History
Projects
Future objective
  • LHC Testbed/Grids
Conclusions

SLIDE 4

IA-64 Architecture

SLIDE 5

Some definitions

We define:
  • IA-32 as “x86”, i.e. Intel’s current 32-bit architecture
  • IA-64 as Intel’s new 64-bit architecture
  • IA-XX: “XX” refers to the width of the integer registers
So, IA-64 enables 64-bit “long” and “pointer” variables
  • In other words: Linux’s “LP64” programming model

Note:
In a 64-bit architecture, one can normally still use 32-bit “int” variables and 32-bit pointers (in a 4 GB address space)
Floating-point registers are already 80-bit on IA-32
  • Supporting 32-bit “float”, 64-bit “double”, and 80-bit “long double”
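The definitions above can be made concrete with a small table of type sizes. This is a minimal sketch, assuming the usual ILP32 model for IA-32 (the slide names only LP64); the dictionary and the helper `address_space_bytes` are illustrative names, not part of any library.

```python
# Sizes in bytes of the basic C types under two common data models.
# Assumption: IA-32 Linux uses ILP32; the slide itself names only LP64.
DATA_MODELS = {
    "ILP32": {"int": 4, "long": 4, "pointer": 4},
    "LP64": {"int": 4, "long": 8, "pointer": 8},
}

GIB = 2 ** 30


def address_space_bytes(model: str) -> int:
    """Number of bytes addressable with a pointer of the given data model."""
    return 2 ** (8 * DATA_MODELS[model]["pointer"])


for name, sizes in DATA_MODELS.items():
    widths = ", ".join(f"{ctype}={8 * nbytes} bits" for ctype, nbytes in sizes.items())
    print(f"{name}: {widths}; address space = {address_space_bytes(name) // GIB} GiB")
```

Under LP64 only “long” and pointers widen to 64 bits; “int” stays at 32 bits, which is why 32-bit integer code usually carries over unchanged. The 32-bit pointer case gives the 4 GB address space quoted in the note.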
SLIDE 6

IA-64 Highlights

Key innovations:
  • Rich instruction set
  • Bundled (parallel) execution
  • Predicated instructions
  • Large register files: register stack, rotating registers
  • Software-pipelined loops
  • Control/data speculation
  • Cache [I/D] control instructions
  • High-precision 82-bit floating-point

SLIDE 7

Compared to IA-32

Many advantages:

Clear, explicit programming
  • After all, this is EPIC: “Explicitly Parallel Instruction Computing”
Register-based programming
  • Keep everything in registers (as long as possible)
  • 128 integer + 128 floating-point registers
Architected renaming (“rotation”)
  • Architectural support for software pipelining (“modulo scheduling”)
(Almost) all instructions can be predicated
  • 64 1-bit predicate registers (“fire”/“do not fire”)
  • Much more general than conditional moves
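The “fire”/“do not fire” idea can be modelled in a few lines. This is a toy sketch, not IA-64 code: it assumes a compare that writes two complementary predicates and represents each instruction as a (predicate, value) pair; the function name `run_predicated` is made up for illustration.

```python
def run_predicated(a: int, b: int) -> int:
    """Model of computing max(a, b) with predicated moves instead of a branch."""
    # A compare writes a pair of complementary 1-bit predicates.
    p1 = a > b
    p2 = not p1

    result = 0
    # Both moves are issued; each takes effect only if its predicate is set.
    instructions = [(p1, a), (p2, b)]  # (qualifying predicate, value to move)
    for predicate, value in instructions:
        if predicate:  # "fire"
            result = value
        # otherwise "do not fire": the instruction has no effect
    return result


print(run_predicated(3, 7))
print(run_predicated(9, 2))
```

The `if` inside the loop stands in for the hardware discarding the result of an instruction whose predicate is clear. In the instruction stream itself there is no branch, which is what makes predication more general than a conditional move: any instruction, not just a move, can be guarded this way.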

SLIDE 8

Instruction Bundle

“Packaging entity” (16 bytes):
  • 3 × 41-bit instruction slots
  • 5 bits for the template
Template:
  • Typical examples: MFI (Memory/FLP/Integer), MIB (Memory/Integer/Branch)
  • Including “stop” bit

[Figure: bundle layout, left to right: Slot 2, Slot 1, Slot 0, T (template)]
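The bundle arithmetic (3 × 41 + 5 = 128 bits = 16 bytes) can be checked with a short packing routine. This is a minimal sketch, assuming the template occupies the five least significant bits with Slot 0, Slot 1 and Slot 2 above it, as the layout on the slide suggests; the slot and template values below are arbitrary placeholders, not real instruction encodings.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1
BUNDLE_BITS = TEMPLATE_BITS + 3 * SLOT_BITS  # 128 bits in total


def pack_bundle(template: int, slot0: int, slot1: int, slot2: int) -> int:
    """Pack a template and three instruction slots into one 128-bit integer."""
    if template & ~TEMPLATE_MASK:
        raise ValueError("template does not fit in 5 bits")
    for slot in (slot0, slot1, slot2):
        if slot & ~SLOT_MASK:
            raise ValueError("slot does not fit in 41 bits")
    return (
        template
        | (slot0 << TEMPLATE_BITS)
        | (slot1 << (TEMPLATE_BITS + SLOT_BITS))
        | (slot2 << (TEMPLATE_BITS + 2 * SLOT_BITS))
    )


def unpack_bundle(bundle: int) -> tuple:
    """Split a 128-bit bundle back into (template, slot0, slot1, slot2)."""
    template = bundle & TEMPLATE_MASK
    slots = [
        (bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK for i in range(3)
    ]
    return (template, slots[0], slots[1], slots[2])


# Placeholder values only; they do not correspond to real IA-64 encodings.
bundle = pack_bundle(0x1F, 0x123, 0x456, SLOT_MASK)
print(len(bundle.to_bytes(BUNDLE_BITS // 8, "little")), "bytes")
print(unpack_bundle(bundle))
```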

SLIDE 9

Instruction Delivery

Must match (in Itanium):
  • 6 instructions with 9 issue ports (with corresponding execution units attached)

[Figure: two bundles (slots S0 S1 S2, S0 S1 S2) feed a dispersal network (template interpretation), which routes the instructions to the issue ports M0 M1 F0 F1 I0 I1 B0 B1 B2]
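The matching of six slots to nine ports can be sketched with a deliberately simplified rule: each slot goes to the first free port of the type its template letter names. This is an assumption made for illustration; the real dispersal rules have more cases. The port names follow the figure, taking the third branch port to be B2, and `disperse` is a hypothetical helper.

```python
# Issue ports by type, named as in the figure (third branch port assumed to be B2).
PORTS = {
    "M": ["M0", "M1"],
    "F": ["F0", "F1"],
    "I": ["I0", "I1"],
    "B": ["B0", "B1", "B2"],
}


def disperse(templates):
    """Assign every slot of the given bundles to the first free port of its type.

    Simplified model: the template letter of a slot decides the port type.
    """
    free = {kind: list(names) for kind, names in PORTS.items()}
    assignment = []
    for template in templates:
        for slot, kind in enumerate(template):
            if not free[kind]:
                raise ValueError(f"no free {kind} port for {template} slot {slot}")
            assignment.append((f"{template}.S{slot}", free[kind].pop(0)))
    return assignment


for slot, port in disperse(["MFI", "MIB"]):
    print(slot, "->", port)
```

With the two example templates from the previous slide, all six instructions find a port. A pair such as MFI + MFI would also fit, whereas three memory operations in one cycle would not, since there are only two M ports.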

SLIDE 10

SW Pipelined Loops

Graphical example:
  • 7 loop traversals desired
  • Skewed execution: Stage 2 relative to Stage 1, Stage 3 relative to Stage 2

[Figure: Stages 1, 2 and 3 plotted against time, showing completed stages, the main loop and the epilogue]

See the presentation in the references for further details
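The skewed execution in the graphical example can be tabulated. This is a minimal sketch, assuming three stages of one cycle each and one new traversal started per cycle; `pipeline_schedule` is an illustrative name.

```python
def pipeline_schedule(iterations: int, stages: int):
    """Return, per cycle, which loop traversal each stage works on (None if idle)."""
    schedule = []
    for cycle in range(iterations + stages - 1):
        row = []
        for stage in range(stages):
            # Stage s lags stage 0 by s cycles: this is the skew.
            traversal = cycle - stage
            row.append(traversal + 1 if 0 <= traversal < iterations else None)
        schedule.append(row)
    return schedule


# 7 traversals through 3 stages, as in the slide.
for cycle, row in enumerate(pipeline_schedule(7, 3), start=1):
    cells = [f"it{it}" if it is not None else " - " for it in row]
    print(f"cycle {cycle}: " + "  ".join(cells))
```

Seven traversals through three stages take 7 + 3 - 1 = 9 cycles instead of 21. In this model the first two cycles fill the pipeline, five cycles run with all stages busy, and the last two cycles form the epilogue that drains it.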

SLIDE 11

The History

SLIDE 12

History - 1

Visit to HP Labs in November 1992
Great secret:
  • The PA-RISC successor is under development: PA-WW
HP architect: Bill Worley
Mathlib expert: Clemens Roothaan

SLIDE 13

History - 2

1994 – 1998
  • Architecture is renegotiated between HP and Intel
  • New name: IA-64
  • Merge of ideas
As of 1997/98
  • Huge effort across the IT industry to prepare OS, compilers, libraries, middleware, applications
Mid-1999
  • First chip becomes reality: Merced (Itanium)
Intel architect: Gadi Singer

SLIDE 14

CERN/HP projects

1994 – 1997:
  • Review of architecture
  • This “project” was so secret that hardly anybody knew about it!
1998 – 1999:
  • Joint projects
  • HP: Implement a vector math library
  • CERN: Random Number Generators (in vector mode)
1999 –
  • Linux/IA-64 porting project
  • Which grew into Trillian

SLIDE 15

TRILLIAN Project

SLIDE 16

Trillian

The porting goals:

Provision of:
  • Full support of hardware, firmware, boot process: the PC platform has undergone a complete review with IA-64.
  • Kernel: exploiting IA-64 features, such as huge address spaces and large page sizes; support of IA-32 binaries.
  • Native compilers: initially gcc
  • Libraries: glibc, optimised libm, etc.
  • Middleware: X-server, Performance Counter Library, etc.

SLIDE 17

Trillian (Initial phase)

Leading IT companies + CERN:
  • Port basic OS, utilities, compilers, libraries
  • CERN, Cygnus, HP, IBM, Intel, SGI, and VA Linux
CERN team:
  • Responsible for glibc (generic/specific)
  • Shared library support
Testing environment:
  • HP simulator (on top of Linux/IA-32)
  • Intel simulator also available (on top of Windows/NT)
Goal: Be ready for first hardware with a fully functional port

SLIDE 18

Trillian (Phase 2)

New companies joined:
  • First, distributors joined: Caldera, RedHat, SuSE, and Turbolinux
  • And very recently: Linuxcare and NEC
Intel distributed real prototype systems:
  • “Bigsur”: 2-way workstation
  • “Lion”: 4-way server
Trillian was ready:
  • Native kernel/compiler/libs/etc.
SGI added compilers:
  • sgicc, sgiCC, sgif90
Other compilers are expected to come

SLIDE 19

Trillian (now)

Final phase:

Glibc
  • Stabilise and fix bugs
  • Complete optimisation of time-critical routines: memory (e.g. memcpy) and string (e.g. strcpy) routines
  • Move from glibc 2.1 to 2.2
Porting real applications:
  • Solution stacks: GEANT4 (including CLHEP) using g++ and sgiCC; SIXTRACK using sgif90
  • Several benchmarks already “running”
Aim:
  • Be ready at first shipment (3Q 2000?) with well-running applications

SLIDE 20

Kernel

Now fully integrated in the standard distribution

Layout:
[Figure: kernel layout with the following blocks: Application 1, Application 2, X11 subsystem; System Calls; Interrupt Vector Table; interrupt subsystem, trap handling, signal subsystem, process subsystem, virtual memory subsystem, device drivers, network protocols, file systems, IA-32 subsystem]

SLIDE 21

Compiler technology

As critical as it was for “vector supercomputing”

Desired goal: Loops optimised through software pipelining
Currently: This seems easier for FORTRAN than C++
  • SIXTRACK (FORTRAN): loop optimisation takes place
  • GEANT4: deep level of method nesting makes the task harder
Too early to draw any conclusions

SLIDE 22

2005 Projections

IA-64 and LHC

SLIDE 23

Intel’s projected future - 1

[Figure: Intel processor roadmap, performance versus time (’99 to ’02; .25µ, .18µ and .13µ process generations). IA-64 performance line: Itanium, McKinley, Madison. IA-64 price/performance line: Deerfield (first price/performance version). IA-32 line: Pentium III, Cascades, Foster, future IA-32.]

SLIDE 24

Intel’s projected future - 2

[Figure: IA-64 (percent of total) plotted by quarter, with the “Pre LHC” and “LHC” periods marked. Forecast by Microprocessor Report.]

SLIDE 25

CMS Offline Farm at CERN circa 2006
(lmr, for Monarc study, April 1999)

[Figure: farm layout.
  • Processors: 5600 processors, 1400 boxes, 160 clusters, 40 sub-farms
  • Disks: 5400 disks, 340 arrays
  • Tapes: 100 drives
  • Networks: farm network, two storage networks, LAN-WAN routers
  • Link bandwidths shown: 0.8 Gbps (daq), 0.8 Gbps, 1.5 Gbps, 5 Gbps, 6 Gbps*, 8 Gbps, 12 Gbps, 24 Gbps*, 250 Gbps, 960 Gbps*]

Totals: 0.5 M SPECint95 (> 5K processors); 0.5 PByte disk (> 5K disks)
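The totals on this slide imply some per-unit figures that are easy to work out. This is a back-of-the-envelope sketch using only the numbers shown; it assumes decimal units (1 PByte = 10^15 bytes) and treats the rounded totals as exact, so the results are rough.

```python
# Figures taken from the slide.
processors = 5600
boxes = 1400
clusters = 160
sub_farms = 40
disks = 5400
arrays = 340
total_specint95 = 0.5e6    # 0.5 M SPECint95
total_disk_bytes = 0.5e15  # 0.5 PByte, assuming decimal units

processors_per_box = processors / boxes
clusters_per_sub_farm = clusters / sub_farms
specint95_per_processor = total_specint95 / processors
gbytes_per_disk = total_disk_bytes / disks / 1e9
disks_per_array = disks / arrays

print(f"processors per box:      {processors_per_box:.0f}")
print(f"clusters per sub-farm:   {clusters_per_sub_farm:.0f}")
print(f"SPECint95 per processor: {specint95_per_processor:.0f}")
print(f"GBytes per disk:         {gbytes_per_disk:.0f}")
print(f"disks per array:         {disks_per_array:.1f}")
```

The farm is thus planned around 4-way boxes, with each processor expected to deliver on the order of 90 SPECint95.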

SLIDE 26

Possible Grid structure

[Figure: tiered structure.
  • CERN: full data
  • Tier 1 regional centres (RC), each with full ESD: France, Italy, US, ??
  • 622 Mbps links + air freight
  • University/Department clusters
  • 622 Mbps links
  • Desktops]

SLIDE 27

Conclusions

Exciting new architecture
  • A full-fledged Linux port is available
  • IA-64 should, some day, replace IA-32, but watch out for ‘inflection points’
Understand full potential
  • HEP’s huge source base; some hand-coded routines
Working directly with the ‘creators’
  • HP, Intel, and many others
Maintain CERN in a leading role
  • With top IT companies
  • Inside key projects, like the LHC Testbed and the proposed European GRID project

SLIDE 28

Further references

  • Trillian:
      • http://www.linuxia64.org/ [Trillian home page]
      • http://www.turbolinux.com/ia64.html [Linux distribution]
      • http://oss.sgi.com/projects/Pro64/ [Linux compilers]
  • At CERN:
      • http://nicewww.cern.ch/~sverre/Linux_IA64_project.html
  • IA-64 programming:
      • http://developer.intel.com/design/Ia-64/ [Intel documentation]
      • http://nicewww.cern.ch/~sverre/IA64_1.pdf [My tutorial]
      • http://nicewww.cern.ch/~sverre/Intel_SW_Pipelining_Notes.pdf
  • Kernel:
      • http://www.kernel.org/pub/linux/kernel/ports/ia64 [Kernel source]
      • http://www.linuxia64.org/logos/IA64linuxkernel.PDF [Presentation]