
Combining Perfect Shuffle and Bitonic Networks for Efficient Quantum Sorting



  1. Combining Perfect Shuffle and Bitonic Networks for Efficient Quantum Sorting
     Naveed Mahmud, Bailey K. Srimoungchanh, Bennett Haase-Divine, Nolan Blankenau, Annika Kuhnke, and Esam El-Araby
     University of Kansas (KU)
     Fifth International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC'19)
     November 17-22, 2019, Denver, Colorado

  2. Outline
     ◆ Introduction and Motivation
     ◆ Background and Related Work
     ◆ Proposed Work
     ◆ Experimental Results
     ◆ Conclusions and Future Work
     H2RC 2019 – Nov. 17th, 2019

  3. Introduction and Motivation
     ◆ Why Quantum?
       ▪ Efficient quantum algorithms
       ▪ Solving NP-hard problems
       ▪ Speedup over classical
       ▪ Quantum supremacy
       ▪ Quantum Ready NISQ devices
       (source: https://learning.acm.org/techtalks/qiskit)
     ◆ Need for Quantum Emulation
       ▪ Difficult to control QC experiments
       ▪ Verification and benchmarking
       ▪ High cost of accessing QCs
         ◆ E.g., academic hourly rate of $1,250 up to 499 annual hours
     ◆ Emulation using FPGAs
       ▪ Greater speedup vs. SW
       ▪ Dynamic (reconfigurable) vs. fixed architectures
       ▪ Exploiting parallelism
       ▪ Limitation → Scalability

  4. Introduction and Motivation (same bullets as slide 3, with device images added)
     [Images: Google’s 72-qubit “Bristlecone”; Intel’s 49-qubit “Tangle Lake”; IBM-Q 53-qubit computer; Rigetti’s 16-qubit ASPEN-4; IonQ’s 79-qubit computer; D-Wave 2000Q]

  5. Outline
     ◆ Introduction and Motivation
     ◆ Background and Related Work
     ◆ Proposed Work
     ◆ Experimental Results
     ◆ Conclusions and Future Work

  6. Background (Quantum Computing)
     ◆ Qubits
       ▪ Physical implementations
         ◆ Electron (spin)
         ◆ Nucleus (spin through NMR; NMR ≡ Nuclear Magnetic Resonance)
         ◆ Photon (polarization encoding)
         ◆ Josephson junction (superconducting qubits)
         ◆ Trapped ions
         ◆ Anions
       ▪ Theoretical representation
         ◆ Bloch sphere
         ◆ Vector of complex coefficients
     ◆ Superposition
       ▪ Linear sum of distinct basis states
       ▪ Converts to classical logic when measured
       ▪ Applies to state with n qubits
     ◆ Entanglement
       ▪ Strong correlation between qubits
       ▪ Measuring a qubit gives information about other qubits
       ▪ Entangled state cannot be factored into a tensor product

     Basis states → $|0\rangle, |1\rangle$; pure states → $|\psi\rangle$

     Single-qubit superposition:
       $|\psi_1\rangle = \alpha|0\rangle + \beta|1\rangle = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$
       Born rule: $p(|\psi_1\rangle \to |0\rangle) = |\alpha|^2$, $p(|\psi_1\rangle \to |1\rangle) = |\beta|^2$

     Multi-qubit superposition:
       $|\psi_3\rangle = |q_2 q_1 q_0\rangle = |q_2\rangle \otimes |q_1\rangle \otimes |q_0\rangle = \begin{bmatrix} \alpha_2 \\ \beta_2 \end{bmatrix} \otimes \begin{bmatrix} \alpha_1 \\ \beta_1 \end{bmatrix} \otimes \begin{bmatrix} \alpha_0 \\ \beta_0 \end{bmatrix}$
       $|\psi_3\rangle = \alpha_2\alpha_1\alpha_0|000\rangle + \alpha_2\alpha_1\beta_0|001\rangle + \dots + \beta_2\beta_1\beta_0|111\rangle = c_0|0\rangle + c_1|1\rangle + \dots + c_7|7\rangle$
       $|\psi_n\rangle = \sum_{q=0}^{2^n-1} c_q |q\rangle$
       Born rule: $p(|\psi_n\rangle \to |q\rangle) = |c_q|^2$, with $\sum_{q=0}^{2^n-1} |c_q|^2 = 1$

     Multi-qubit entanglement:
       $|\psi_n\rangle_{\text{entangled}} = |q_{n-1} \dots q_1 q_0\rangle_{\text{entangled}} \neq |q_{n-1}\rangle \otimes \dots \otimes |q_1\rangle \otimes |q_0\rangle$ (un-entangled)
       For example: $|\psi_2\rangle_{\text{entangled}} = |q_1 q_0\rangle_{\text{entangled}} \neq |q_1\rangle \otimes |q_0\rangle = \begin{bmatrix} \alpha_1 \\ \beta_1 \end{bmatrix} \otimes \begin{bmatrix} \alpha_0 \\ \beta_0 \end{bmatrix}$
       $|\psi_2\rangle_{\text{entangled}} = c_0|00\rangle + c_3|11\rangle \neq \alpha_1\alpha_0|00\rangle + \alpha_1\beta_0|01\rangle + \beta_1\alpha_0|10\rangle + \beta_1\beta_0|11\rangle$
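The state-vector picture on this slide can be checked numerically. The following is a minimal sketch, assuming NumPy and plain complex vectors; the helper names (`qubit`, `tensor`, `born_probabilities`, `is_product_state_2q`) are hypothetical and are not taken from the presentation. It builds a 3-qubit product state with Kronecker products, applies the Born rule, and tests whether a 2-qubit state factors into a tensor product by checking the rank of its 2x2 coefficient matrix.

```python
import numpy as np

def qubit(alpha, beta):
    """Return the normalized single-qubit state alpha|0> + beta|1>."""
    v = np.array([alpha, beta], dtype=complex)
    return v / np.linalg.norm(v)

def tensor(*states):
    """Kronecker product of the given states, leftmost qubit most significant."""
    out = np.array([1.0 + 0j])
    for s in states:
        out = np.kron(out, s)
    return out

def born_probabilities(state):
    """Born rule: probability of measuring basis state |q> is |c_q|^2."""
    return np.abs(state) ** 2

def is_product_state_2q(state, tol=1e-12):
    """A 2-qubit state is a tensor product iff its 2x2 coefficient
    matrix has rank 1 (Schmidt rank 1)."""
    return np.linalg.matrix_rank(state.reshape(2, 2), tol=tol) == 1

# |q2> = |0>, |q1> = (|0> + |1>)/sqrt(2), |q0> = (|0> + i|1>)/sqrt(2)
q2, q1, q0 = qubit(1, 0), qubit(1, 1), qubit(1, 1j)
psi3 = tensor(q2, q1, q0)           # 8 coefficients c_0 .. c_7
probs = born_probabilities(psi3)    # sums to 1

# c_0|00> + c_3|11> with c_0 = c_3 = 1/sqrt(2): an entangled state
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
print(probs, is_product_state_2q(tensor(q1, q0)), is_product_state_2q(bell))
```

The rank test only covers the 2-qubit case; for more qubits one would have to check every bipartition, which this sketch does not attempt.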

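  7. Background (Quantum Gates)
     ◆ X (NOT) Gate
       ▪ 1-qubit gate
       ▪ Inverts the qubit (exchanges the amplitudes of $|0\rangle$ and $|1\rangle$)
       $X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$
     ◆ cX (Controlled NOT) Gate
       ▪ 2-qubit gate
       ▪ Control qubit and a target qubit
       ▪ Inverts target qubit based on value of control
       $cX = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}$
     ◆ SWAP Gate
       ▪ 2-qubit gate
       ▪ Exchanges positions of the two qubits
       $\mathrm{SWAP} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
     ◆ cSWAP (Controlled SWAP) Gate
       ▪ 3-qubit gate
       ▪ Exchanges positions of the two qubits based on the control qubit
       $c\mathrm{SWAP} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$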
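The four gate matrices on this slide can be built and exercised on basis states. This is a minimal sketch assuming NumPy; the helpers `controlled` and `basis` are hypothetical names introduced for illustration. It assumes the control is the most significant (leftmost) qubit, which is the convention the cX and cSWAP matrices on the slide follow.

```python
import numpy as np

# 1-qubit NOT gate
X = np.array([[0, 1],
              [1, 0]])

# 2-qubit SWAP gate: exchanges |01> and |10>
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]])

def controlled(U):
    """Block-diagonal diag(I, U): apply U only when the leading
    (most significant) control qubit is |1>."""
    d = U.shape[0]
    out = np.eye(2 * d, dtype=U.dtype)
    out[d:, d:] = U
    return out

cX = controlled(X)        # 4x4 controlled NOT
cSWAP = controlled(SWAP)  # 8x8 controlled SWAP

def basis(bits):
    """Basis state for a bit string, e.g. basis('101') is |101>."""
    v = np.zeros(2 ** len(bits), dtype=int)
    v[int(bits, 2)] = 1
    return v

print(cX @ basis("10"))      # control set, target flipped: |11>
print(cSWAP @ basis("101"))  # control set, last two qubits swapped: |110>
```

All four matrices are permutation matrices, so applying any of them to a state vector only reorders its coefficients.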

  8. Background (Sorting)
     ◆ Classical Sorting
       ▪ Quicksort
       ▪ Merge sort
       ▪ Insertion sort
       ▪ Bitonic sort with perfect shuffle

       Complexity | Quicksort | Merge sort | Insertion sort | Bitonic sort with perfect shuffle
       Time       | N log N   | N log N    | N²             | log² N
       Space      | log N     | N          | 1              | N

       source: https://www.bigocheatsheet.com/

  9. Background (Sorting)
     ◆ Classical Sorting (list and complexity table as on slide 8)
     ◆ Quantum Sorting
       ▪ Relatively new realm of research
       ▪ Based on encoding of data as coefficients of a superimposed quantum state ($N = 2^n$)
       ▪ Parallel architecture
       ▪ Speedup compared to classical sorters
       N ≡ number of states, n ≡ number of qubits
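Encoding data as the coefficients of a superposed state means treating the N = 2^n data values as a normalized state vector on n qubits. The sketch below is an illustration under that reading, assuming NumPy and real, non-negative-length-preserving scaling; `amplitude_encode` is a hypothetical helper, not the authors' encoder. Dividing by the vector norm keeps the relative order of the values, which is what a sorter acting on coefficients relies on.

```python
import numpy as np

def amplitude_encode(data):
    """Map N = 2**n real values to the coefficients of an n-qubit state.

    Returns (state, n) where state has unit norm, so the squared
    coefficients sum to 1 as the Born rule requires.
    """
    x = np.asarray(data, dtype=float)
    N = x.size
    if N == 0 or N & (N - 1):
        raise ValueError("number of values must be a power of two")
    norm = np.linalg.norm(x)
    if norm == 0:
        raise ValueError("cannot encode an all-zero vector")
    n = N.bit_length() - 1  # n = log2(N) qubits
    return x / norm, n

values = [3, 1, 4, 1, 5, 9, 2, 6]     # N = 8 values
state, n_qubits = amplitude_encode(values)
print(n_qubits, np.sum(state ** 2))   # 3 qubits, unit total probability
```

Only n = log2(N) qubits are needed for N values, which is where the n-qubit space figures in the quantum complexity comparison come from.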

  10. Background (Sorting)
      ◆ Classical Sorting and Quantum Sorting (bullets and classical complexity table as on slides 8 and 9)
      ◆ Quantum sorting complexity

        Complexity | Quantum merge sorting [Chen, et al.] | Quantum bitonic sort with perfect shuffle
        Time       | log² n                               | log² n
        Space      | n                                    | n
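The log² N time figure for bitonic sorting comes from the depth of the comparator network. The following is a classical Python sketch of the standard iterative bitonic sorting network together with a perfect shuffle; it is not the authors' quantum circuit, and the function names are hypothetical. Each pass of the inner loop is one network stage whose N/2 compare-exchange operations are independent, so a parallel implementation needs only as many time steps as there are stages.

```python
def perfect_shuffle(a):
    """Interleave the two halves of a (length must be even).
    The element at index i moves to the index obtained by rotating
    the bits of i left by one position."""
    half = len(a) // 2
    out = [None] * len(a)
    out[0::2] = a[:half]
    out[1::2] = a[half:]
    return out

def bitonic_sort(a):
    """Sort a list whose length is a power of two with the bitonic
    network. Returns (sorted_list, number_of_stages)."""
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = list(a)
    stages = 0
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # one network stage per value of j
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            stages += 1
            j //= 2
        k *= 2
    return a, stages

data = [3, 1, 4, 1, 5, 9, 2, 6]
sorted_data, stages = bitonic_sort(data)
print(sorted_data, stages)
print(perfect_shuffle(list(range(8))))
```

For N = 2^m inputs the network has m(m + 1)/2 stages, which is the O(log² N) time entry in the table. The perfect shuffle matters because a shuffle between stages lets every stage compare adjacent pairs, so the same fixed comparator layer can be reused.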

  11. Related Work (Quantum Sorting)
      ◆ Chen, et al., “Quantum switching and quantum merge sorting,” February 2006
        ▪ Bitonic merge sorting with a divide-and-conquer approach
        ▪ $O(\log^2 n)$ time complexity to sort n qubits
        ▪ Not enough details about ‘quantum comparator’
        ▪ No experimental evaluation
      ◆ Hoyer, et al., “Quantum complexities of ordered searching, sorting, and element distinctness,” November 2002
        ▪ Proof showing lower bound of general quantum sorting is $\Omega(N \log N)$
        ▪ Based on comparison matrix given as input oracle
        ▪ No circuit realizations or implementations

  12. Related Work (Parallel SW Simulators)
      List of quantum SW simulators: https://quantiki.org/wiki/list-qc-simulators
      ◆ Villalonga, et al., “Establishing the Quantum Supremacy Frontier with a 281 Pflop/s Simulation,” May 2019
        ▪ Simulation of 7x7 and 11x11 random quantum circuits (RQCs) of depth 42 and 26, respectively
        ▪ Summit supercomputer (ORNL, USA) with 4550 nodes
        ▪ 1.6 TB of non-volatile memory per node
        ▪ Power consumption of 7.3 MW
      ◆ Li, et al., “Quantum Supremacy Circuit Simulation on Sunway TaihuLight,” August 2018
        ▪ Simulation of 49-qubit random quantum circuits of depth 55
        ▪ Sunway supercomputer (NSC, China) with 131,072 nodes (32,768 CPUs)
        ▪ 1 PB total main memory
      ◆ J. Chen, et al., “Classical Simulation of Intermediate-Size Quantum Circuits,” May 2018
        ▪ Simulation of up to 144-qubit random quantum circuits of depth 27
        ▪ Supercomputing cluster (Alibaba Group, China) with 131,072 nodes
        ▪ 8 GB memory per node
      ◆ De Raedt, et al., “Massively parallel quantum computer simulator eleven years later,” May 2018
        ▪ Simulation of Shor’s algorithm using 48 qubits
        ▪ Various supercomputing platforms: IBM Blue Gene/Q (decommissioned), JURECA (Germany), K computer (Japan), Sunway TaihuLight (China)
        ▪ Up to 16-128 GB memory/node utilized
      ◆ T. Jones, et al., “QuEST and High Performance Simulation of Quantum Computers,” May 2018
        ▪ Simulation of random quantum circuits up to 38 qubits
        ▪ ARCUS supercomputer (ARCHER, UK) with 2048 nodes
        ▪ Up to 256 GB memory per node
