

SLIDE 1

February 2017

AMBER 16 on P100s

SLIDE 2

PME-Cellulose_NPT on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-Cellulose_NPT, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                           2.35   baseline
1 node + 1x P100 PCIe (16GB) per node     21.85   9.3X
1 node + 2x P100 PCIe (16GB) per node     30.00   12.8X
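The speedup labels on these charts appear to be the GPU throughput divided by the CPU-only baseline. A minimal sketch of that arithmetic, using the PME-Cellulose_NPT figures from this slide (the function name `speedup` is illustrative, not part of AMBER):

```python
def speedup(gpu_ns_per_day: float, cpu_ns_per_day: float) -> float:
    """Return throughput relative to the CPU-only baseline node."""
    if cpu_ns_per_day <= 0:
        raise ValueError("baseline throughput must be positive")
    return gpu_ns_per_day / cpu_ns_per_day


# PME-Cellulose_NPT on P100 PCIe, values in ns/day taken from the slide.
BASELINE_NS_PER_DAY = 2.35
results = {"1x P100 PCIe": 21.85, "2x P100 PCIe": 30.00}

for label, ns_per_day in results.items():
    # Rounded to one decimal place, as on the chart labels.
    print(f"{label}: {speedup(ns_per_day, BASELINE_NS_PER_DAY):.1f}X")
```

The same ratio can be applied to any of the later slides to check the printed labels against the raw ns/day values.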

SLIDE 3

PME-Cellulose_NPT on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-Cellulose_NPT, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                           2.35   baseline
1 node + 1x P100 SXM2 per node            23.37   9.9X
1 node + 2x P100 SXM2 per node            32.22   13.7X
1 node + 4x P100 SXM2 per node            36.65   15.6X
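To compare the two P100 form factors at equal GPU counts, one can divide the SXM2 throughput by the PCIe throughput for the same benchmark. The sketch below does this for PME-Cellulose_NPT, using the ns/day values from this slide and the preceding PCIe slide. The helper name `sxm2_gain` is hypothetical, and the GPU nodes in the two tests use different host CPUs (E5-2698 v4 versus E5-2699 v4), so the ratio is not a pure GPU-to-GPU comparison.

```python
# PME-Cellulose_NPT throughput in ns/day, keyed by GPUs per node.
pcie = {1: 21.85, 2: 30.00}
sxm2 = {1: 23.37, 2: 32.22, 4: 36.65}


def sxm2_gain(n_gpus: int) -> float:
    """Fractional throughput gain of SXM2 over PCIe at the same GPU count."""
    return sxm2[n_gpus] / pcie[n_gpus] - 1.0


# Only GPU counts measured on both form factors can be compared.
for n in sorted(pcie.keys() & sxm2.keys()):
    print(f"{n}x P100: SXM2 is {sxm2_gain(n):.1%} faster than PCIe")
```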

SLIDE 4

PME-Cellulose_NVE on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-Cellulose_NVE, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                           2.47   baseline
1 node + 1x P100 PCIe (16GB) per node     23.34   9.4X
1 node + 2x P100 PCIe (16GB) per node     32.55   13.2X

SLIDE 5

PME-Cellulose_NVE on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-Cellulose_NVE, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                           2.47   baseline
1 node + 1x P100 SXM2 per node            24.94   10.1X
1 node + 2x P100 SXM2 per node            35.16   14.2X
1 node + 4x P100 SXM2 per node            40.88   16.6X

SLIDE 6

PME-FactorIX_NPT on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-FactorIX_NPT, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          11.43   baseline
1 node + 1x P100 PCIe (16GB) per node     98.77   8.6X
1 node + 2x P100 PCIe (16GB) per node    132.86   11.6X

SLIDE 7

PME-FactorIX_NPT on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-FactorIX_NPT, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          11.43   baseline
1 node + 1x P100 SXM2 per node           106.25   9.3X
1 node + 2x P100 SXM2 per node           144.11   12.6X
1 node + 4x P100 SXM2 per node           159.80   14.0X

SLIDE 8

PME-FactorIX_NVE on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-FactorIX_NVE, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          11.98   baseline
1 node + 1x P100 PCIe (16GB) per node    105.86   8.8X
1 node + 2x P100 PCIe (16GB) per node    145.83   12.2X

SLIDE 9

PME-FactorIX_NVE on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-FactorIX_NVE, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          11.98   baseline
1 node + 1x P100 SXM2 per node           114.88   9.6X
1 node + 2x P100 SXM2 per node           159.24   13.3X
1 node + 4x P100 SXM2 per node           178.02   14.9X

SLIDE 10

PME-JAC_NPT on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-JAC_NPT, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          45.89   baseline
1 node + 1x P100 PCIe (16GB) per node    283.60   6.2X
1 node + 2x P100 PCIe (16GB) per node    327.69   7.1X

SLIDE 11

PME-JAC_NPT on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-JAC_NPT, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          45.89   baseline
1 node + 1x P100 SXM2 per node           310.52   6.8X
1 node + 2x P100 SXM2 per node           360.64   7.9X
1 node + 4x P100 SXM2 per node           423.09   9.2X

SLIDE 12

PME-JAC_NVE on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-JAC_NVE, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          47.90   baseline
1 node + 1x P100 PCIe (16GB) per node    308.46   6.4X
1 node + 2x P100 PCIe (16GB) per node    363.79   7.6X

SLIDE 13

PME-JAC_NVE on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: PME-JAC_NVE, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          47.90   baseline
1 node + 1x P100 SXM2 per node           339.81   7.1X
1 node + 2x P100 SXM2 per node           402.18   8.4X
1 node + 4x P100 SXM2 per node           473.10   9.9X

SLIDE 14

GB-Myoglobin on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: GB-Myoglobin, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          28.86   baseline
1 node + 1x P100 PCIe (16GB) per node    483.37   16.7X
1 node + 4x P100 PCIe (16GB) per node    561.94   19.5X

SLIDE 15

GB-Myoglobin on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: GB-Myoglobin, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                          28.86   baseline
1 node + 1x P100 SXM2 per node           534.28   18.5X
1 node + 4x P100 SXM2 per node           639.37   22.2X

SLIDE 16

GB-Nucleosome on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: GB-Nucleosome, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                           0.40   baseline
1 node + 1x P100 PCIe (16GB) per node     11.91   29.8X
1 node + 2x P100 PCIe (16GB) per node     22.77   56.9X
1 node + 4x P100 PCIe (16GB) per node     39.91   99.8X
1 node + 8x P100 PCIe (16GB) per node     45.92   114.8X

SLIDE 17

GB-Nucleosome on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: GB-Nucleosome, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                           0.40   baseline
1 node + 1x P100 SXM2 per node            13.36   33.4X
1 node + 2x P100 SXM2 per node            25.53   63.8X
1 node + 4x P100 SXM2 per node            46.29   115.7X
1 node + 8x P100 SXM2 per node            48.29   120.7X
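The multi-GPU bars also show how well throughput scales with GPU count. A common way to express this is parallel efficiency: measured throughput divided by the ideal value of N times the single-GPU throughput. The sketch below applies that definition to the GB-Nucleosome SXM2 figures from this slide. The metric and the function name `scaling_efficiency` are not part of the original deck.

```python
def scaling_efficiency(throughput_by_gpus: dict) -> dict:
    """Map GPU count to measured / ideal throughput, ideal = N * single-GPU."""
    single_gpu = throughput_by_gpus[1]
    return {
        n: ns_per_day / (n * single_gpu)
        for n, ns_per_day in sorted(throughput_by_gpus.items())
    }


# GB-Nucleosome on P100 SXM2, ns/day keyed by GPUs per node (from the slide).
gb_nucleosome_sxm2 = {1: 13.36, 2: 25.53, 4: 46.29, 8: 48.29}

for n_gpus, eff in scaling_efficiency(gb_nucleosome_sxm2).items():
    print(f"{n_gpus}x P100 SXM2: {eff:.0%} of ideal linear scaling")
```

On these numbers, efficiency stays above 85% through four GPUs and falls to roughly 45% at eight, which matches the small gain from the 4x bar to the 8x bar.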

SLIDE 18

Rubisco-75K on P100s PCIe

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 PCIe (16GB) GPUs
➢ 1x P100 PCIe is paired with Single Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: Rubisco-75K, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                           0.01   baseline
1 node + 1x P100 PCIe (16GB) per node      0.71   71.0X
1 node + 2x P100 PCIe (16GB) per node      1.40   140.0X
1 node + 4x P100 PCIe (16GB) per node      2.69   269.0X
1 node + 8x P100 PCIe (16GB) per node      4.20   420.0X

SLIDE 19

Rubisco-75K on P100s SXM2

Running AMBER version 16.3
The blue node contains Dual Intel Xeon E5-2699 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs
The green nodes contain Dual Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell) CPUs + Tesla P100 SXM2 GPUs
➢ 1x P100 SXM2 is paired with Single Intel Xeon E5-2698 v4@2.2GHz [3.6GHz Turbo] (Broadwell)

Chart: Rubisco-75K, throughput in ns/day (speedup relative to 1 Broadwell node)

Configuration                            ns/day   Speedup
1 Broadwell node                           0.01   baseline
1 node + 1x P100 SXM2 per node             0.80   80.0X
1 node + 2x P100 SXM2 per node             1.57   157.0X
1 node + 4x P100 SXM2 per node             3.06   306.0X
1 node + 8x P100 SXM2 per node             4.46   446.0X
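A throughput in ns/day converts directly into the wall-clock time needed for a given amount of simulated time. The sketch below does that for the Rubisco-75K SXM2 figures on this slide. The 10 ns target is an arbitrary illustrative value, not taken from the deck, and `days_to_simulate` is a hypothetical helper name.

```python
def days_to_simulate(target_ns: float, ns_per_day: float) -> float:
    """Wall-clock days needed to produce target_ns of simulated time."""
    if ns_per_day <= 0:
        raise ValueError("throughput must be positive")
    return target_ns / ns_per_day


TARGET_NS = 10.0  # illustrative target, an assumption for this example

# Rubisco-75K throughput in ns/day, from the slide.
rubisco_sxm2 = {
    "1 Broadwell node": 0.01,
    "1x P100 SXM2": 0.80,
    "8x P100 SXM2": 4.46,
}

for label, ns_per_day in rubisco_sxm2.items():
    days = days_to_simulate(TARGET_NS, ns_per_day)
    print(f"{label}: {days:,.1f} days for {TARGET_NS:g} ns")
```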

SLIDE 20

Recommended GPU Node Configuration for AMBER Computational Chemistry

Workstation or Single Node Configuration

Parameter                        Recommendation
# of CPU sockets                 2
Cores per CPU socket             6+ (1 CPU core drives 1 GPU)
CPU speed (GHz)                  2.66+
System memory per node (GB)      16
GPUs                             P100, V100
# of GPUs per CPU socket         1-4
GPU memory preference (GB)       6
GPU to CPU connection            PCIe 3.0 16x or higher
Server storage                   2 TB
Network configuration            InfiniBand QDR or better

Scale to multiple nodes with same single node configuration
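The recommendations above can be turned into a simple checklist for a candidate node. The sketch below is only an illustration: the `NodeSpec` class and `check_node` function are hypothetical, and it assumes the listed figures are minimums (with 1-4 GPUs per socket as an allowed range), which the slide does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class NodeSpec:
    cpu_sockets: int
    cores_per_socket: int
    cpu_ghz: float
    system_memory_gb: int
    gpus_per_socket: int
    gpu_memory_gb: int


def check_node(spec: NodeSpec) -> list:
    """Return the ways a node falls short of the slide's figures."""
    problems = []
    if spec.cpu_sockets < 2:
        problems.append("fewer than 2 CPU sockets")
    if spec.cores_per_socket < 6:
        problems.append("fewer than 6 cores per socket")
    if spec.cores_per_socket < spec.gpus_per_socket:
        # The slide notes that 1 CPU core drives 1 GPU.
        problems.append("not enough CPU cores to drive every GPU")
    if spec.cpu_ghz < 2.66:
        problems.append("CPU clock below 2.66 GHz")
    if spec.system_memory_gb < 16:
        problems.append("less than 16 GB system memory")
    if not 1 <= spec.gpus_per_socket <= 4:
        problems.append("GPUs per socket outside the 1-4 range")
    if spec.gpu_memory_gb < 6:
        problems.append("less than 6 GB GPU memory")
    return problems


# A made-up node used only to exercise the check.
candidate = NodeSpec(cpu_sockets=2, cores_per_socket=8, cpu_ghz=3.0,
                     system_memory_gb=64, gpus_per_socket=2, gpu_memory_gb=16)
print(check_node(candidate) or "meets the listed figures")
```

Storage, interconnect, and PCIe link width are left out because they are harder to reduce to a single numeric comparison.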
