Massively parallel

Flexi is designed to make use of the modern parallel CPU-based architectures of HPC systems. It has demonstrated perfect weak and strong scaling on up to 260,000 cores, and is routinely used in production runs on 100,000 cores.
The simulation framework is parallelized with MPI. The domain is decomposed along a space-filling curve that connects the elements. Because DG methods have a small communication footprint, and because latency is hidden through non-blocking, overlapping message patterns, optimal strong scaling is achieved down to one element per core for moderate polynomial orders.
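The idea behind a space-filling-curve decomposition can be illustrated with a minimal sketch. It is not Flexi's implementation: it assumes a structured mesh of `nx * ny * nz` elements and uses a Morton (Z-order) curve as a simple stand-in, since the text above does not say which curve the framework uses. The function names `morton_index`, `partition_along_curve`, and `decompose` are hypothetical. Elements are sorted along the curve, and each rank receives one contiguous segment, so elements that are close on the curve (and therefore mostly close in space) end up on the same rank.

```python
def morton_index(i, j, k, bits=10):
    """Interleave the bits of integer element coordinates (i, j, k)
    into a single Z-order key (assumed stand-in for the actual curve)."""
    key = 0
    for b in range(bits):
        key |= ((i >> b) & 1) << (3 * b)
        key |= ((j >> b) & 1) << (3 * b + 1)
        key |= ((k >> b) & 1) << (3 * b + 2)
    return key


def partition_along_curve(n_elems, n_ranks):
    """Split a curve of n_elems elements into n_ranks contiguous segments.

    Returns offsets such that rank r owns the curve-sorted elements
    offsets[r] .. offsets[r + 1] - 1. Segment sizes differ by at most one.
    """
    base, extra = divmod(n_elems, n_ranks)
    offsets = [0]
    for r in range(n_ranks):
        offsets.append(offsets[-1] + base + (1 if r < extra else 0))
    return offsets


def decompose(nx, ny, nz, n_ranks):
    """Assign every element of a structured nx * ny * nz mesh to a rank."""
    elems = [(i, j, k) for i in range(nx) for j in range(ny) for k in range(nz)]
    # Order the elements along the space-filling curve.
    elems.sort(key=lambda c: morton_index(*c))
    offsets = partition_along_curve(len(elems), n_ranks)
    owner = {}
    for r in range(n_ranks):
        for c in elems[offsets[r]:offsets[r + 1]]:
            owner[c] = r
    return owner, offsets
```

For example, `decompose(4, 4, 4, 8)` gives each of the 8 ranks one compact 2x2x2 block of elements. Because the decomposition is only a set of offsets into a one-dimensional ordering, changing the number of ranks requires nothing more than recomputing those offsets.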
The framework uses the HDF5 library for fast parallel I/O. Flexi itself has a very low memory footprint and is generally CPU-bound, which makes it highly suitable for modern CPU-based HPC platforms. Among others, the framework has run successfully on the following HPC systems:


Domain Decomposition: optimal number of neighbors by space-filling curve


Strong Scaling on Hornet: perfect granular strong scaling for N > 5


Communication Strategy: latency hiding through non-blocking send/receive