Quantum simulation with hardware acceleration (arXiv:2009.01845)
Stefano Carrazza 18th September 2020, QTI TH meeting, CERN.
Università degli Studi di Milano, INFN Milan, CERN, TII
N3PDF: Machine Learning • PDFs • QCD
1 How to prepare and execute quantum algorithms?
2 How to make quantum hardware accessible to users?
1 efficient simulation engine for:
2 designed with modern standards:
3 released as open-source code
$\ldots, \sigma'_{i_1}, \ldots, \sigma'_{i_{N_{\mathrm{targets}}}}, \ldots, \sigma_N),$
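The expression above replaces only the target-qubit indices of the state while the remaining indices are left untouched. A minimal sketch of that idea for a single target qubit, in plain Python, is given here. The function name `apply_single_qubit_gate`, the bit-ordering convention (qubit 0 as the most significant bit) and the list-based state are assumptions made for illustration; this is not Qibo's implementation.

```python
import math


def apply_single_qubit_gate(state, gate, target, nqubits):
    """Apply a 2x2 `gate` to qubit `target` of an `nqubits` state vector.

    Assumed convention: qubit 0 is the most significant bit of the basis index.
    """
    shift = nqubits - 1 - target
    mask = 1 << shift
    new_state = [0j] * len(state)
    for index in range(len(state)):
        bit = (index >> shift) & 1   # value of the target index in the output
        base = index & ~mask         # same index with the target bit cleared
        # Sum over the primed target index; all other indices stay fixed.
        new_state[index] = sum(
            gate[bit][b] * state[base | (b << shift)] for b in (0, 1)
        )
    return new_state


# Hadamard gate, used here only as a small illustrative input.
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

state = [1, 0, 0, 0]  # |00>
state = apply_single_qubit_gate(state, H, target=0, nqubits=2)
print(state)
```

Each output amplitude reads only two input amplitudes, so the cost of a single-qubit gate grows linearly with the state-vector length rather than with the size of a full matrix on all qubits.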
QFT: $|x\rangle \mapsto \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \omega_N^{xk} |k\rangle$
[Figure: total time (sec) vs. number of qubits for the QFT benchmark, with lower panels showing the ratio to Qibo (GPU) and to Qibo (CPU).
Left, QFT (complex64): Qibo (GPU), Qibo (multi-GPU), Qibo (CPU), Qibo (CPU-1), QCGPU (GPU), QCGPU (CPU), Cirq (CPU), TFQ (CPU).
Right, QFT (complex128): Qibo (GPU), Qibo (multi-GPU), Qibo (CPU), Qibo (CPU-1), Qulacs (GPU), Qulacs (CPU), IntelQS (CPU), Qiskit (CPU), PyQuil (CPU).]
Quantum Fourier Transform simulation performance comparison in single precision (left) and double precision (right).
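For reference, the transform being benchmarked acts on the amplitudes of a state as a unitary discrete Fourier transform. The sketch below applies it directly to a small state vector in plain Python. It is an illustration of the mathematical definition only, with the hypothetical helper name `qft_amplitudes`; simulators such as those compared above apply a gate-by-gate circuit instead of this quadratic-cost sum.

```python
import cmath
import math


def qft_amplitudes(state):
    """Return the QFT of `state`, computed from the definition.

    Output amplitude k is (1/sqrt(dim)) * sum_x state[x] * exp(2*pi*i*x*k/dim),
    where dim = len(state) = 2**nqubits.
    """
    dim = len(state)
    return [
        sum(state[x] * cmath.exp(2j * math.pi * x * k / dim) for x in range(dim))
        / math.sqrt(dim)
        for k in range(dim)
    ]


# Three qubits prepared in |000>: the QFT yields a uniform superposition.
nqubits = 3
state = [0j] * (2 ** nqubits)
state[0] = 1
transformed = qft_amplitudes(state)
print([round(abs(a) ** 2, 6) for a in transformed])
```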
[Figure: total time (sec) vs. number of qubits for the variational circuit benchmark with 5 layers, with lower panels showing the ratio to Qibo (GPU) and to Qibo (CPU).
Left, Variational 5 layers (complex64): Qibo (GPU), Qibo (CPU), Qibo (CPU-1), QCGPU (GPU), QCGPU (CPU), Cirq (CPU), TFQ (CPU).
Right, Variational 5 layers (complex128): Qibo (GPU), Qibo (CPU), Qibo (CPU-1), Qulacs (GPU), Qulacs (CPU), IntelQS (CPU), Qiskit (CPU), PyQuil (CPU).]
Variational circuit simulation performance comparison in single precision (left) and double precision (right).
[Figure: total time (sec) vs. number of qubits for GPU c64, GPU c128, CPU c64 and CPU c128, with lower panels showing the ratio to GPU c64 and to CPU c64.]
Comparison of simulation time when using single (complex64) and double (complex128) precision on GPU and on multi-threading CPU (40 threads).
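Besides run time, the choice of precision fixes the memory needed to hold the state vector: a complex64 amplitude occupies 8 bytes and a complex128 amplitude 16 bytes, and an N-qubit state has 2^N amplitudes. The short sketch below computes that footprint. The helper name `state_vector_bytes` is hypothetical, and the estimate deliberately ignores any extra workspace a simulator may allocate.

```python
# Bytes per complex amplitude for the two precisions compared above.
BYTES_PER_AMPLITUDE = {"complex64": 8, "complex128": 16}


def state_vector_bytes(nqubits, precision):
    """Memory needed to store a full state vector (amplitudes only)."""
    return (2 ** nqubits) * BYTES_PER_AMPLITUDE[precision]


for nqubits in (25, 30, 33):
    for precision in ("complex64", "complex128"):
        gib = state_vector_bytes(nqubits, precision) / 2 ** 30
        print(f"N = {nqubits:2d}  {precision:10s}  {gib:8.2f} GiB")
```

At 30 qubits this gives 8 GiB in single precision and 16 GiB in double precision, which is why the largest circuits depend on the memory available on a given device.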
[Figure: total time (sec) vs. number of shots (10^1 to 10^6) for N = 10, 12, ..., 30 qubits. Left panel: DGX CPU. Right panel: DGX V100.]
Example of measurement shots simulation on CPU (left) and GPU (right).
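Simulating measurement shots amounts to sampling bitstrings from the probability distribution given by the squared amplitudes of the final state. A minimal sketch using only the Python standard library is shown below. The function name `sample_shots`, the `seed` parameter and the bitstring ordering are assumptions for illustration, not the interface of any of the libraries benchmarked here.

```python
import math
import random
from collections import Counter


def sample_shots(state, nshots, seed=None):
    """Sample `nshots` measurement outcomes from a state vector.

    Returns a Counter mapping bitstrings to observed frequencies.
    """
    rng = random.Random(seed)
    nqubits = (len(state) - 1).bit_length()
    probabilities = [abs(amplitude) ** 2 for amplitude in state]
    outcomes = rng.choices(range(len(state)), weights=probabilities, k=nshots)
    return Counter(format(outcome, f"0{nqubits}b") for outcome in outcomes)


# Bell state (|00> + |11>) / sqrt(2): only "00" and "11" can be observed.
bell = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]
counts = sample_shots(bell, nshots=1000, seed=0)
print(counts)
```

The state vector is computed once and then reused for every shot, so the sampling cost is added on top of the circuit simulation rather than multiplying it.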
[Figure: histogram of time (sec) vs. number of qubits (25 to 33) for 1-thread, 10-threads, 20-threads, 40-threads, single-GPU and multi-GPU configurations.]
Comparison of Qibo performance for QFT on multiple hardware configurations. For the multi-GPU setup we include a label on top of each histogram bar summarizing the effective number of NVIDIA V100 cards used during the benchmark.
[Figure: time (sec) vs. number of qubits (14 to 16) for 1-thread, 40-threads and single-GPU configurations.]
Comparison of Qibo performance for small QFT circuits on single-thread CPU, multi-threading CPU and GPU. Single-thread CPU is the optimal choice for up to 15 qubits.
[Figure: total time (sec) vs. number of qubits for TFIM Adiabatic Evolution (δt = 0.01, T = 1, complex128), with lower panels showing the ratio to Trotter (GPU) and to Trotter (CPU). Curves: Trotter (GPU), Trotter (multi-GPU), Trotter (CPU), Exp (GPU), Exp (CPU), RK4 (GPU), RK4 (CPU), Trotter RK4 (GPU), Trotter RK4 (CPU).]
Adiabatic evolution performance in Qibo for the TFIM, comparing the exact and Trotter solutions.
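To make the solver names in this benchmark concrete, the sketch below integrates the time-dependent Schrödinger equation with a fourth-order Runge-Kutta (RK4) step for a toy single-qubit adiabatic Hamiltonian H(s) = (1 - s) H0 + s H1. The choices H0 = -X, H1 = -Z, the linear schedule s = t / T and all function names are assumptions made for illustration; only δt = 0.01 and T = 1 are taken from the benchmark settings. It is not the TFIM setup and not Qibo's solver.

```python
import math


def hamiltonian(s):
    """Toy H(s) = (1 - s) * (-X) + s * (-Z) as a 2x2 nested list (assumed model)."""
    return [[-s, -(1 - s)],
            [-(1 - s), s]]


def rhs(t, psi, total_time):
    """Right-hand side of d(psi)/dt = -i H(t / T) psi, with linear schedule s = t / T."""
    h = hamiltonian(t / total_time)
    return [-1j * (h[0][0] * psi[0] + h[0][1] * psi[1]),
            -1j * (h[1][0] * psi[0] + h[1][1] * psi[1])]


def rk4_step(t, psi, dt, total_time):
    """Advance psi by one classical RK4 step of size dt."""
    k1 = rhs(t, psi, total_time)
    k2 = rhs(t + dt / 2, [p + dt / 2 * k for p, k in zip(psi, k1)], total_time)
    k3 = rhs(t + dt / 2, [p + dt / 2 * k for p, k in zip(psi, k2)], total_time)
    k4 = rhs(t + dt, [p + dt * k for p, k in zip(psi, k3)], total_time)
    return [p + dt / 6 * (a + 2 * b + 2 * c + d)
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]


def evolve(dt=0.01, total_time=1.0):
    """Evolve the ground state of H0 = -X, i.e. |+>, from t = 0 to t = T."""
    psi = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]
    nsteps = round(total_time / dt)
    for step in range(nsteps):
        psi = rk4_step(step * dt, psi, dt, total_time)
    return psi


final_state = evolve()
norm = sum(abs(a) ** 2 for a in final_state)
print(final_state, norm)
```

RK4 is not exactly unitary, so the printed norm is a quick sanity check on the step size.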