17 May 2000
1 S.Jarp CERN
IA-64, the Trillian project,
and CERN’s involvement
CERN Computing Seminar
Main justifications
Contribute to Open Source
Be fully prepared to evaluate IA-64 for LHC computing
Influence key hardware/software vendors
  CERN benchmarks for compiler improvements, etc.
Agenda
What is this new architecture (IA-64)?
History
Projects
Future objective
  LHC Testbed/Grids
Conclusions
Some definitions
We define:
  IA-32 as “x86”, i.e. Intel’s current 32-bit architecture
  IA-64 as Intel’s new 64-bit architecture
  IA-XX: “XX” refers to the width of the integer registers
    So, IA-64 enables 64-bit “long” and “pointer” variables
    In other words: Linux’s “LP64” programming model
Note:
  In a 64-bit architecture, one can normally still use 32-bit “int” variables and 32-bit pointers (in a 4 GB address space)
  Floating-point registers are already 80-bit on IA-32
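The “LP64” model named above can be checked from the widths of int, long and pointer. The following is a minimal sketch in C; the function names and the returned strings are illustrative assumptions, not something from the talk.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Classify a data model from the bit widths of int, long and pointer.
   Names and return strings are hypothetical, chosen for illustration. */
const char *data_model_name(unsigned int_bits, unsigned long_bits,
                            unsigned ptr_bits)
{
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64) return "LP64";
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32) return "ILP32";
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 64) return "LLP64";
    return "other";
}

/* Data model of the platform this file is compiled for. */
const char *native_data_model(void)
{
    return data_model_name(CHAR_BIT * (unsigned)sizeof(int),
                           CHAR_BIT * (unsigned)sizeof(long),
                           CHAR_BIT * (unsigned)sizeof(void *));
}

/* Abort early if code that assumes LP64 is built for another model. */
void require_lp64(void)
{
    assert(strcmp(native_data_model(), "LP64") == 0);
}
```

On an LP64 platform such as Linux/IA-64, native_data_model() would be expected to report "LP64"; on IA-32 Linux it would report "ILP32".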
IA-64 Highlights
Key Innovations:
  Rich Instruction Set
  Bundled (parallel) Execution
  Predicated Instructions
  Large Register Files
    Register Stack
    Rotating Registers
  Software Pipelined Loops
  Control/Data Speculation
  Cache [I/D] Control Instructions
  High-precision 82-bit Floating-Point
Compared to IA-32
Many advantages:
  Clear, explicit programming
    After all, this is EPIC (Explicitly Parallel Instruction Computing)
  Register-based programming
    Keep everything in registers (as long as possible)
    128 integer + 128 floating-point registers
  Architected renaming (“Rotation”)
  Architectural support for software pipelining
    “Modulo scheduling”
  All instructions (almost) can be predicated
    64 1-bit registers (“Fire”/“Do not fire”)
    Much more general than CONDITIONAL MOVES
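The “Fire”/“Do not fire” idea can be modelled in C. This is only a sketch of the concept: the function names are hypothetical, and a C compiler is free to emit branches or conditional moves for either form. On IA-64 a compare would set a pair of 1-bit predicate registers, and each following instruction would be guarded by one of them.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form: the hardware has to predict which arm will run. */
int64_t clamp_sum_branchy(int64_t a, int64_t b, int64_t limit)
{
    int64_t r;
    if (a + b > limit)
        r = limit;
    else
        r = a + b;
    return r;
}

/* If-converted form: one compare produces two complementary predicates,
   both arms are issued, and only the arm whose predicate is 1 takes effect. */
int64_t clamp_sum_predicated(int64_t a, int64_t b, int64_t limit)
{
    int64_t sum = a + b;
    int p_true = sum > limit;     /* models predicate "fire"        */
    int p_false = !p_true;        /* models predicate "do not fire" */
    int64_t r = 0;
    r = p_true ? limit : r;       /* (p_true)  r = limit */
    r = p_false ? sum : r;        /* (p_false) r = sum   */
    return r;
}

/* Both forms must agree for inputs whose sum does not overflow. */
void check_same_result(int64_t a, int64_t b, int64_t limit)
{
    assert(clamp_sum_branchy(a, b, limit) == clamp_sum_predicated(a, b, limit));
}
```

Unlike a conditional move, a predicate can guard almost any instruction (loads, stores, arithmetic), which is the generality the slide refers to.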
Instruction Bundle
‘Packaging entity’ (16 bytes):
  3 * 41-bit Instruction Slots
  5 bits for Template
    Typical examples:
      MFI (Memory/FLP/Integer)
      MIB (Memory/Integer/Branch)
    Including “stop” bit
Layout: Slot 2 | Slot 1 | Slot 0 | T (template)
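The bundle arithmetic (3 * 41 + 5 = 128 bits = 16 bytes) can be made concrete with a small pack/unpack sketch. It assumes the template occupies the five least significant bits and the slots follow in ascending order, which matches the drawing when read from most to least significant. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_MASK ((UINT64_C(1) << 41) - 1)   /* 41 one-bits */

/* One 128-bit bundle held as two 64-bit halves (lo = bits 0..63). */
typedef struct { uint64_t lo, hi; } bundle_t;

/* Pack a 5-bit template and three 41-bit slots into a bundle. */
bundle_t bundle_pack(unsigned tmpl, uint64_t s0, uint64_t s1, uint64_t s2)
{
    bundle_t b;
    assert(tmpl < 32 && s0 <= SLOT_MASK && s1 <= SLOT_MASK && s2 <= SLOT_MASK);
    b.lo = (uint64_t)tmpl | (s0 << 5) | (s1 << 46);  /* low 18 bits of s1 */
    b.hi = (s1 >> 18) | (s2 << 23);                  /* high 23 bits of s1 */
    return b;
}

/* Template: bits 0..4. */
unsigned bundle_template(bundle_t b)
{
    return (unsigned)(b.lo & 0x1f);
}

/* Slot 0: bits 5..45, slot 1: bits 46..86, slot 2: bits 87..127. */
uint64_t bundle_slot(bundle_t b, int slot)
{
    assert(slot >= 0 && slot <= 2);
    if (slot == 0) return (b.lo >> 5) & SLOT_MASK;
    if (slot == 1) return ((b.lo >> 46) | (b.hi << 18)) & SLOT_MASK;
    return b.hi >> 23;
}
```

Slot 1 straddles the two 64-bit halves, which is why its extraction combines bits from both.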
Instruction Delivery
Must match (in Itanium):
  6 instructions with 9 issue ports
    w/ corresponding execution units attached
[Figure: two bundles (slots S0 S1 S2, S0 S1 S2) feed a dispersal network (template interpretation), which routes to the issue ports M0 M1 F0 F1 I0 I1 B0 B1 B2]
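The matching of six slots to nine ports can be sketched as a first-free-port assignment. This is a deliberately simplified model, not the actual Itanium dispersal rules: it assumes in-order issue that stops at the first slot with no free port of its type, and all names are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum {
    PORT_M0, PORT_M1, PORT_F0, PORT_F1, PORT_I0, PORT_I1,
    PORT_B0, PORT_B1, PORT_B2, PORT_COUNT
};

/* slots: up to six unit letters taken from two templates, e.g. "MFIMIB".
   ports_out[i] receives the port chosen for slot i.
   Returns how many slots were issued before a port type ran out. */
int disperse(const char *slots, int ports_out[6])
{
    int busy[PORT_COUNT] = {0};
    int issued = 0;
    assert(strlen(slots) <= 6);
    for (; slots[issued] != '\0'; issued++) {
        int first, last, p;
        switch (slots[issued]) {
        case 'M': first = PORT_M0; last = PORT_M1; break;
        case 'F': first = PORT_F0; last = PORT_F1; break;
        case 'I': first = PORT_I0; last = PORT_I1; break;
        case 'B': first = PORT_B0; last = PORT_B2; break;
        default:  return issued;           /* unknown unit letter */
        }
        /* Skip ports of this type that are already taken. */
        for (p = first; p <= last && busy[p]; p++) { }
        if (p > last) return issued;       /* no free port: stop issuing */
        busy[p] = 1;
        ports_out[issued] = p;
    }
    return issued;
}
```

In this model two MFI bundles fill all six slots, while two MII bundles stall at the fifth instruction because only two I ports exist.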
SW Pipelined Loops
Graphical example
  7 loop traversals desired
  Skewed execution
    Stage 2 relative to Stage 1
    Stage 3 relative to Stage 2
[Figure: Stages 1, 2 and 3 of each traversal plotted against time; labels: Completed stages, Main loop, Epilogue]
See presentation in the references for further details
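The skewed schedule for 7 traversals and 3 stages can be reproduced with a few lines of arithmetic. This sketch assumes one new traversal starts per cycle and each stage takes one cycle; the function names are hypothetical and the code only models the timing, not the rotating registers or predicates that implement it on IA-64.

```c
#include <assert.h>

/* Total cycles when traversals overlap, one stage per cycle. */
int pipelined_cycles(int traversals, int stages)
{
    assert(traversals > 0 && stages > 0);
    return traversals + stages - 1;
}

/* Traversal (0-based) occupying `stage` (0-based) at `cycle`,
   or -1 if that stage is idle in that cycle. */
int traversal_in_stage(int cycle, int stage, int traversals)
{
    int t = cycle - stage;   /* stage s runs s cycles behind stage 0 */
    return (t >= 0 && t < traversals) ? t : -1;
}

/* Number of traversals that have left the last stage by the end of `cycle`. */
int completed_after(int cycle, int traversals, int stages)
{
    int done = cycle - (stages - 1) + 1;
    if (done < 0) done = 0;
    if (done > traversals) done = traversals;
    return done;
}
```

For the slide's example this gives 7 + 3 - 1 = 9 cycles instead of 7 * 3 = 21 for unoverlapped execution; in the last two cycles the first stage is idle, which corresponds to the epilogue.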
History - 1
Visit to HP Labs in November 1992
Great secret:
  The PA-RISC successor is under development: PA-WW
HP architect: Bill Worley
Mathlib expert: Clemens Roothaan
History - 2
1994 – 1998
  Architecture is renegotiated between HP and INTEL
  New name: IA-64
  Merge of ideas
As of 1997/98
  Huge effort across IT industry to prepare OS, compilers, libraries, middleware, applications
Mid-1999
  First chip becomes reality: Merced (Itanium)
Intel architect: Gadi Singer
CERN/HP projects
1994 – 1997:
  Review of architecture
    This “project” was so secret that hardly anybody knew about it!
1998 – 1999:
  Joint projects
    HP: Implement a vector math library
    CERN: Random Number Generators (in vector mode)
1999 –
  Linux/IA-64 porting project
    Which grew into Trillian
Trillian
The porting goals:
Provision of:
  Full support of hardware, firmware, boot process
    The PC platform has undergone a complete review with IA-64.
  Kernel
    Exploiting IA-64 features, such as huge address spaces, large page sizes. Support of IA-32 binaries.
  Native compilers
    Initially gcc
  Libraries
    glibc, optimised libm, etc.
  Middleware
    X-server, Performance Counter Library, etc.
Trillian (Initial phase)
Leading IT companies + CERN:
  Port basic OS, utilities, compilers, libraries
  CERN, Cygnus, HP, IBM, INTEL, SGI, and VA Linux
CERN team:
  Responsible for glibc (generic/specific)
  Shared library support
Testing environment:
  HP simulator (on top of Linux/IA-32)
  Intel simulator also available
Goal: Be ready for first hardware with a fully functional port
Trillian (Phase 2)
New companies joined:
  First, distributors joined: Caldera, RedHat, SuSE, and Turbolinux
  and very recently: Linuxcare and NEC
Intel
  Distributed real prototype systems
    “Bigsur” - 2-way workstation
    “Lion” - 4-way server
Trillian was ready:
  Native kernel/compiler/libs/etc.
SGI added compilers:
  sgicc, sgiCC, sgif90
Other compilers are expected to come
Trillian (now)
Final phase:
  Glibc
    Stabilise and fix bugs
    Complete optimisation of time-critical routines:
      Memory (e.g. memcpy) and String (e.g. strcpy) routines
    Move from glibc 2.1 to 2.2
  Porting real applications:
    Solution stacks:
      GEANT4 (including CLHEP) using g++ and sgiCC
      SIXTRACK using sgif90
    Several benchmarks already “running”
Aim:
  Be ready at first shipment (3Q 2000?)
  With well-running applications
Kernel
Now fully integrated in standard distribution
Layout (diagram labels):
  Application 1, Application 2, X11 Subsystem
  System Calls
  Interrupt Vector Table
  Interrupt subsystem, Trap handling, Signal subsystem, Process subsystem, Virtual Mem subsystem, Device drivers, Network protocols, File Systems, IA-32 Subsystem
Compiler technology
As critical as it was for “vector supercomputing”
Desired goal: Loops optimised through Software Pipelining
Currently: This seems easier for FORTRAN than C++
  SIXTRACK (FORTRAN)
    Loop optimisation takes place
  GEANT4
    Deep level of method nesting makes the task harder
Too early to draw any conclusions
IA-64 and LHC
Intel’s projected future - 1
[Figure: Intel roadmap, performance versus time (’99, ’00, ’01, ’02; .25µ, .18µ, .13µ). Processors shown: Pentium III Processor, Cascades, Foster, Future IA-32; Itanium Processor, McKinley, Madison (IA-64 Perf); Deerfield (IA-64 Price/Perf). Annotation: First price/performance version]
Intel’s projected future - 2
[Figure: IA-64 (percent, axis 20 to 120) plotted over time, with “Pre LHC” and “LHC” periods marked. Forecast by Microprocessor Report]
[Figure: CMS Offline Farm at CERN circa 2006 (lmr for Monarc study, April 1999)]
  Processors: 5600 processors, 1400 boxes, 160 clusters, 40 sub-farms
  Disks: 5400 disks, 340 arrays
  Tapes: 100 drives
  Networks: farm network, storage network (shown twice), LAN-WAN routers
  Link bandwidths shown: 250 Gbps, 8 Gbps, 24 Gbps*, 960 Gbps*, 6 Gbps*, 12 Gbps, 5 Gbps
  Totals: 0.5 M SPECint95 (>5K processors), 0.5 PByte disk (>5K disks)
Possible Grid structure
[Figure: tiered structure]
  CERN: full data
  Tier 1 RC ??, Tier 1 RC France, Tier 1 RC Italy, Tier 1 RC US: full ESD each
  University/Department clusters
  Desktops
  Links: 622 Mbps links + air freight; 622 Mbps links
Conclusions
Exciting new architecture
  A full-fledged Linux port is available
  IA-64 should, some day, replace IA-32
    but watch out for ‘inflection points’
Understand full potential
  HEP’s huge source base; some hand-coded routines
Working directly with the ‘creators’
  HP, INTEL, and many others
Maintain CERN in a leading role
  With top IT companies
  Inside key projects, like the LHC Testbed and proposed European GRID project
Further references