Massively parallel


Flexi is designed to exploit the parallel, CPU-based architectures of modern HPC systems. It has demonstrated perfect weak and strong scaling on up to 260,000 cores and is routinely used for production runs on 100,000 cores.
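For readers unfamiliar with the terminology, the two scaling notions can be written as simple efficiency ratios. The following is a minimal sketch using the common textbook definitions; the function names and the timings in the usage note are hypothetical and are not measurements from Flexi.

```python
def strong_scaling_efficiency(t_ref, p_ref, t, p):
    """Strong scaling: the total problem size is fixed while cores increase.

    Ideally the run time falls as 1/p, so t * p stays equal to t_ref * p_ref.
    An efficiency of 1.0 corresponds to perfect strong scaling.
    """
    return (t_ref * p_ref) / (t * p)


def weak_scaling_efficiency(t_ref, t):
    """Weak scaling: the problem size per core is fixed while cores increase.

    Ideally the run time stays constant, so t equals t_ref.
    An efficiency of 1.0 corresponds to perfect weak scaling.
    """
    return t_ref / t
```

With hypothetical timings, a run that takes 100 s on 1 core and 25 s on 4 cores for the same problem has a strong scaling efficiency of 1.0. A run whose time grows from 10 s to 12.5 s as cores and problem size grow together has a weak scaling efficiency of 0.8.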
The simulation framework is parallelized with MPI, and the domain is decomposed along a space-filling curve that connects the elements. Because DG methods have a small communication footprint and latency is hidden through non-blocking, overlapping message patterns, optimal strong scaling is achieved down to one element per core for moderate polynomial orders.
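The idea behind a space-filling curve decomposition can be sketched in a few lines. The example below is an illustration only, not Flexi's implementation: it assumes a structured 2D grid of elements and uses a Morton (Z-order) curve because it is short to write down, whereas the text above does not specify which curve Flexi uses. All function names are hypothetical. Elements are sorted along the curve, and the sorted list is then cut into contiguous, equally sized pieces, one per MPI rank.

```python
def morton_index(ix, iy, bits=16):
    """Interleave the bits of (ix, iy) into a Z-order (Morton) key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)       # x bits go to even positions
        key |= ((iy >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return key


def partition_elements(n_elems, n_ranks):
    """Split n_elems curve-ordered elements into contiguous, balanced ranges.

    Returns one (first, last_exclusive) pair per rank. The first
    n_elems % n_ranks ranks receive one extra element.
    """
    base, extra = divmod(n_elems, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        count = base + (1 if rank < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


def decompose(nx, ny, n_ranks):
    """Map each element (ix, iy) of an nx-by-ny grid to its owning rank."""
    cells = [(ix, iy) for iy in range(ny) for ix in range(nx)]
    cells.sort(key=lambda c: morton_index(c[0], c[1]))  # order along the curve
    owner = {}
    for rank, (lo, hi) in enumerate(partition_elements(len(cells), n_ranks)):
        for cell in cells[lo:hi]:
            owner[cell] = rank
    return owner
```

For a 4-by-4 grid on 4 ranks, decompose(4, 4, 4) assigns each rank one compact 2-by-2 block. Elements that are close along the curve tend to be close in space, so each rank's subdomain stays compact and has few neighboring ranks to exchange data with.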
The framework uses the HDF5 library for fast parallel I/O. The framework itself has a very low memory footprint and is generally CPU-bound, which makes it highly suitable for modern CPU-based HPC platforms.
Among others, the framework has run successfully on the following HPC systems:

Figure (domain decomposition): optimal number of neighbors by space-filling curve.

Figure (strong scaling on Hornet): perfect granular strong scaling for N>5.

Figure (communication strategy): latency hiding through non-blocking send/receive.