SLIDE 10 FAM vs. CAM correlation for KNL
Configuration simulated:
- Xeon Phi “Knights Landing” core
- 1 to 8 cores
- 2 cores per tile
- 1 to 4 SMT threads per core
Metrics compared:
- IPC
- L1 and L2 cache miss rates
- Speedup
- FAM and CAM typically agree to within ~20% for one thread per core (1T), but agreement worsens with SMT
- FAM and CAM show similar speedup trends
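As a rough illustration of how the compared metrics could be put side by side, the sketch below computes speedup, misses per kilo-instruction (mpki), and the FAM/CAM ratio. It rests on assumptions not stated on the slide: speedup is taken as total IPC relative to the 1-core run, "correlation" is read as the FAM/CAM ratio of a metric, and all function names and input values are hypothetical placeholders, not measured results.

```python
def speedup(total_ipc_by_cores, baseline_cores=1):
    """Speedup of each core count relative to the baseline run.

    Assumption: speedup is the ratio of total IPC, which only holds
    when every run executes the same instruction stream at the same clock.
    """
    base = total_ipc_by_cores[baseline_cores]
    return {n: ipc / base for n, ipc in total_ipc_by_cores.items()}


def mpki(misses, instructions):
    """Cache misses per 1000 retired instructions."""
    return 1000.0 * misses / instructions


def fam_cam_ratio(fam_value, cam_value):
    """FAM result divided by CAM result; 1.0 means perfect agreement."""
    return fam_value / cam_value


def percent_difference(fam_value, cam_value):
    """Absolute FAM vs. CAM difference, as a percentage of the CAM value."""
    return 100.0 * abs(fam_value - cam_value) / cam_value


# Placeholder inputs for illustration only (not data from the slide).
cam_ipc = {1: 1.0, 2: 1.9, 4: 3.6}
fam_ipc = {1: 1.1, 2: 2.0, 4: 3.9}

cam_speedup = speedup(cam_ipc)
fam_speedup = speedup(fam_ipc)
ipc_ratio = {n: fam_cam_ratio(fam_ipc[n], cam_ipc[n]) for n in cam_ipc}
```

Comparing `fam_speedup` with `cam_speedup` shows whether the trends match, while `ipc_ratio` shows how far the absolute values diverge for each configuration.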
[Figure: three charts for Loop1]
- CAM Loop1 Speedup: y-axis Speedup; series 1smt, 2smt, 4smt
- FAM Loop1 Speedup: y-axis Speedup; series 1smt, 2smt, 4smt
- FAM vs. CAM for Loop1: series Total IPC, L1D mpki, L2 mpki; x-axis tpc (1, 2, 4) for each #cores (1, 2, 4, 6, 8); y-axis 0.00 to 1.60