CUNQA: HPC System Distributed Quantum Computing Emulator
CUNQA Emulates HPC Distributed Quantum Computing, Enabling Scalable Hybrid Architectures
As the search for powerful quantum computers continues, researchers are focussing on Distributed Quantum Computing (DQC), which connects several quantum processors. Recognising that these processors will initially act as accelerators in HPC environments, CESGA and the University of Santiago de Compostela developed CUNQA.
The open-source emulator CUNQA was developed to evaluate DQC techniques on existing HPC systems. This strategy lets researchers study programming, architectural, and performance issues before quantum technology is widely available.
Emulating Three DQC Models
CUNQA is presented as the first tool to emulate all three DQC models in an HPC environment. The three models are:
No-communication (Embarrassingly Parallel): This model divides quantum tasks classically across many virtual Quantum Processing Units (vQPUs) without any communication at runtime.
Classical-communication: In addition to the classical distribution of quantum tasks, classical links between vQPUs allow instructions to be classically controlled. One QPU can therefore obtain classical information (such as a measurement result) from another during execution. An example is the Iterative Phase Estimation Algorithm (IPEA), which uses classical communication to reduce the number of ancilla qubits needed compared to QPE.
Quantum-communication: This model keeps the classical links and additionally connects QPUs through quantum channels. Fully quantum distribution requires quantum-communication protocols such as teledata and telegate. Asymmetric Quantum Networks with Qubit Control can read MIS/MWIS.
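To make the teledata idea concrete, here is a minimal NumPy sketch of the standard quantum teleportation protocol, which moves a qubit state from one QPU to another using a shared Bell pair plus two classical bits. It is an illustration under stated assumptions, not CUNQA code: the helper names (apply_1q, apply_cnot, measure, teledata) are hypothetical, and the three qubits are simulated in a single statevector rather than on separate vQPUs.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
N_QUBITS = 3  # q0: message on QPU A, q1: A's link qubit, q2: B's link qubit


def apply_1q(state, gate, qubit):
    """Apply a single-qubit gate; qubit 0 is the leftmost tensor factor."""
    psi = state.reshape([2] * N_QUBITS)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    return np.moveaxis(psi, 0, qubit).reshape(-1)


def apply_cnot(state, control, target):
    """Flip the target qubit on the part of the state where control is 1."""
    psi = state.reshape([2] * N_QUBITS).copy()
    idx = [slice(None)] * N_QUBITS
    idx[control] = 1
    axis = target - 1 if target > control else target  # control axis is dropped
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=axis).copy()
    return psi.reshape(-1)


def measure(state, qubit, rng):
    """Measure one qubit in the computational basis and collapse the state."""
    psi = state.reshape([2] * N_QUBITS)
    p1 = np.sum(np.abs(np.take(psi, 1, axis=qubit)) ** 2)
    outcome = int(rng.random() < p1)
    idx = [slice(None)] * N_QUBITS
    idx[qubit] = outcome
    collapsed = np.zeros_like(psi)
    collapsed[tuple(idx)] = psi[tuple(idx)]
    collapsed = collapsed / np.linalg.norm(collapsed)
    return outcome, collapsed.reshape(-1)


def teledata(alpha, beta, rng):
    """Teleport alpha|0> + beta|1> from q0 to q2; return q2's final state."""
    msg = np.array([alpha, beta], dtype=complex)
    zero = np.array([1, 0], dtype=complex)
    state = np.kron(np.kron(msg, zero), zero)
    # Quantum channel: Bell pair shared between q1 (QPU A) and q2 (QPU B).
    state = apply_1q(state, H, 1)
    state = apply_cnot(state, 1, 2)
    # Bell-basis measurement on QPU A.
    state = apply_cnot(state, 0, 1)
    state = apply_1q(state, H, 0)
    m0, state = measure(state, 0, rng)
    m1, state = measure(state, 1, rng)
    # Classical link: the two bits decide the corrections on QPU B.
    if m1:
        state = apply_1q(state, X, 2)
    if m0:
        state = apply_1q(state, Z, 2)
    return state.reshape([2] * N_QUBITS)[m0, m1, :]


rng = np.random.default_rng(0)
received = teledata(0.6, 0.8j, rng)
print(np.allclose(received, [0.6, 0.8j]))
```

The sketch shows why the quantum-communication model still depends on the classical links: without the two measurement bits, QPU B cannot apply the right corrections.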
Connecting Classical and Quantum Hardware
Virtual QPUs (vQPUs), classical processes that emulate a real QPU and run on HPC resources, are the foundation of CUNQA. They act as accelerators: they receive quantum tasks from the CPU, execute them, and return the results.
CUNQA supports key hybrid computing integration methods:
Co-located: The QPUs are distinct hardware accessed over a network, but they sit in the same HPC facility.
On-node: Like a GPU accelerator, the QPU is integrated into an HPC node.
The accelerator paradigm moves away from the earlier model of standalone quantum systems.
CUNQA's software architecture places resource management in the hands of the user rather than the middleware. Using SLURM wrappers such as qraise, users can reserve and customise resources, including the number of vQPUs and the maximum time they remain available, and so govern their life cycles.
Showing Capability with QPE
The researchers tested the emulator by running Quantum Phase Estimation (QPE) under all three DQC models. Each scheme showed its own trade-offs:
No-communication (Distribution of Shots): This scheme parallelises well and reduces simulation time compared to the baseline. The speed-up suffers if the overhead of distributing the shots and collecting the results is too high.
Classical-communication (IPEA): This scheme improves on the baseline but is slower than the optimised no-communication scheme, owing to synchronisation delays and to internal simulator optimisations that are lost when gate execution is extracted and modified.
Quantum-communication (Distributed QPE): Execution times were about two orders of magnitude longer than the baseline. Because the quantum-communication tasks must be simulated in a single executor process, and because additional protocols are needed to distribute controlled gates, simulation time grows in proportion to the number of vQPUs.
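The shot-distribution scheme from the first item can be sketched in a few lines of Python. This is an illustration only, not CUNQA's API: the function names are hypothetical, each "vQPU" is stood in for by an independent random sampler over the ideal QPE outcome distribution, and the workers run in a sequential loop where a real deployment would run them in parallel.

```python
from collections import Counter

import numpy as np


def qpe_distribution(theta, t):
    """Ideal QPE outcome probabilities for phase theta with t counting qubits."""
    size = 2 ** t
    k = np.arange(size)
    j = np.arange(size)
    # Amplitude of outcome k: (1/2^t) * sum_j exp(2*pi*i*j*(theta - k/2^t)).
    amps = np.exp(2j * np.pi * np.outer(theta - k / size, j)).sum(axis=1) / size
    probs = np.abs(amps) ** 2
    return probs / probs.sum()


def split_shots(total_shots, n_vqpus):
    """Divide the shots as evenly as possible among the vQPUs."""
    base, extra = divmod(total_shots, n_vqpus)
    return [base + (i < extra) for i in range(n_vqpus)]


def run_on_vqpu(probs, shots, seed):
    """Stand-in for one vQPU: sample its share of shots independently."""
    rng = np.random.default_rng(seed)
    outcomes = rng.choice(len(probs), size=shots, p=probs)
    return Counter(outcomes.tolist())


def distributed_qpe(theta, t, total_shots, n_vqpus, seed=0):
    """Distribute shots, merge the counts, and estimate the phase."""
    probs = qpe_distribution(theta, t)
    merged = Counter()
    for i, shots in enumerate(split_shots(total_shots, n_vqpus)):
        merged += run_on_vqpu(probs, shots, seed + i)  # no runtime communication
    best_outcome, _ = merged.most_common(1)[0]
    return best_outcome / 2 ** t, merged


estimate, counts = distributed_qpe(theta=0.3125, t=4, total_shots=1000, n_vqpus=4)
print(estimate, sum(counts.values()))
```

The only coordination happens before the run (splitting the shots) and after it (merging the counters), which is the distribution and collection overhead mentioned above.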
Despite performance differences arising from the complexity of the distributed architectures, CUNQA simulated QPE correctly under all three models, with an estimated phase that matched the theoretical value.
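As an illustration of how the iterative scheme recovers a phase with a single ancilla, the following Python sketch models IPEA analytically. It is a simplified model under stated assumptions, not the researchers' implementation: the function name ipea is hypothetical, the ancilla's measurement probability is computed in closed form instead of simulating a circuit, and the variable omega stands for the classical feedback that earlier measurement results contribute to each later round.

```python
import numpy as np


def ipea(theta, m, rng):
    """Estimate an eigenphase theta in [0, 1) to m bits with one ancilla.

    Bits are measured from least to most significant. In round k the ancilla
    picks up the phase 2*pi*2^(k-1)*theta from controlled-U^(2^(k-1)), and a
    feedback rotation removes the contribution of the bits already measured.
    """
    omega = 0.0  # classical feedback: 0.0 b_{k+1} ... b_m in binary
    bits = []    # collected as b_m, b_{m-1}, ..., b_1
    for k in range(m, 0, -1):
        angle = 2 * np.pi * (2 ** (k - 1) * theta - omega)
        # Probability of measuring 1 after H, controlled-U power, feedback, H.
        # Rounding removes floating-point noise around exact 0 and 1.
        p_one = float(np.clip(round(np.sin(angle / 2) ** 2, 12), 0.0, 1.0))
        bit = int(rng.random() < p_one)
        bits.append(bit)
        # Classical information passed on to the next round.
        omega = bit / 4 + omega / 2
    # bits[i] holds b_(m - i), which has binary weight 2^-(m - i).
    return sum(bit * 2.0 ** -(m - i) for i, bit in enumerate(bits))


rng = np.random.default_rng(0)
print(ipea(theta=0.3125, m=4, rng=rng))
```

When theta has an exact m-bit binary expansion, as in the call above, every round is deterministic and the estimate equals theta. The price is that the rounds must run one after another, since each depends on the classical results of the previous ones.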
CUNQA was created to address software and architectural concerns before they hinder the practical implementation of scalable, powerful hybrid quantum-classical computing. The CUNQA code is open-source.











