

The fastest and most memory-efficient lattice Boltzmann CFD software, running on any GPU via OpenCL.

- CFD model: lattice Boltzmann method (LBM)
  - $f_i^{\mathrm{temp}}(\vec{x},t) = f_{(t\%2\ ?\ i\ :\ (i\%2\ ?\ i+1\ :\ i-1))}(i\%2\ ?\ \vec{x}\ :\ \vec{x}-\vec{e}_i,\ t)$ for $i \in$ …
  - $\vec{u}(\vec{x},t) = \frac{1}{\rho(\vec{x},t)} \sum_i \vec{c}_i\, f_i^{\mathrm{temp}}(\vec{x},t)$
- peak performance on most GPUs (datacenter/gaming/professional/laptop), validated with roofline model
- up to 4.29 billion (2³²) lattice points, or 1624³ resolution, on a single GPU (if it has 225 GB memory)
- optimized to minimize memory demand to 55 Bytes/node (~⅙ of conventional FP64 LBM solvers, ~⅓ of conventional FP32 LBM solvers)
  - in-place streaming with Esoteric-Pull: almost cuts memory demand in half and slightly increases performance due to implicit bounce-back boundaries; offers optimal memory access patterns for single-node in-place streaming
  - decoupled arithmetic precision (FP32) and memory precision (FP32 or FP16S or FP16C): all arithmetic is done in FP32 for compatibility on all hardware, but LBM density distribution functions (DDFs) in memory can be compressed to FP16S or FP16C; this almost cuts memory demand in half again and almost doubles performance, without impacting overall accuracy for most setups
- DDF-shifting and other algebraic optimization to minimize round-off error
- updating density and velocity in the stream_collide() kernel is optional (higher performance if disabled)
- single-relaxation-time (SRT/BGK) (default)
- only 8 flag bits per lattice point (can be used independently / at the same time):
  - TYPE_S (stationary or moving) solid boundaries
  - TYPE_E equilibrium boundaries (inflow/outflow)
  - TYPE_X remaining for custom use or further extensions
  - TYPE_Y remaining for custom use or further extensions
- stationary mid-grid bounce-back boundaries (stationary solid boundaries)
- moving mid-grid bounce-back boundaries (moving solid boundaries)
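The index selection in the Esoteric-Pull streaming rule listed above is easier to follow when written out as code. The following is a minimal Python sketch of that selection logic only, not the OpenCL kernel itself. The function name `esoteric_pull_source` and its return convention are hypothetical, and the restriction to `i >= 1` is an assumption, because the range of `i` is cut off in the text.

```python
def esoteric_pull_source(i: int, t: int) -> tuple[int, bool]:
    """Return (slot, from_neighbor) for loading f_i^temp at time step t.

    slot          -- index of the stored DDF that is read
    from_neighbor -- True if it is read at x - e_i, False if it is read at x
    """
    if i < 1:
        # Assumption: the rule applies to the moving directions only.
        raise ValueError("rule sketched for i >= 1 only")
    # t%2 ? i : (i%2 ? i+1 : i-1)
    slot = i if t % 2 else (i + 1 if i % 2 else i - 1)
    # i%2 ? x : x - e_i
    from_neighbor = (i % 2 == 0)
    return slot, from_neighbor


if __name__ == "__main__":
    for t in (0, 1):
        for i in (1, 2, 3, 4):
            slot, from_neighbor = esoteric_pull_source(i, t)
            where = "x - e_i" if from_neighbor else "x"
            print(f"t={t} i={i}: read slot {slot} at {where}")
```

On even time steps each direction reads from the slot of its paired direction, and on odd time steps it reads from its own slot. Odd directions always read at the local node, while even directions read at the neighbor `x - e_i`.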

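Because the 8 flag bits can be set independently or at the same time, they behave like a one-byte bitmask. The sketch below illustrates that idea in Python. The class name `NodeFlag`, the helper `has_flag`, and the specific bit positions are assumptions made for illustration; the text names the flags but does not state their values.

```python
from enum import IntFlag


class NodeFlag(IntFlag):
    # Bit positions are illustrative assumptions, not taken from the text.
    TYPE_S = 0b00000001  # (stationary or moving) solid boundaries
    TYPE_E = 0b00000010  # equilibrium boundaries (inflow/outflow)
    TYPE_X = 0b01000000  # remaining for custom use
    TYPE_Y = 0b10000000  # remaining for custom use


def has_flag(flags: int, flag: NodeFlag) -> bool:
    """Check whether a single flag bit is set in a node's flag byte."""
    return (flags & flag) != 0


if __name__ == "__main__":
    # A node that is both a solid boundary and an equilibrium boundary.
    node = NodeFlag.TYPE_S | NodeFlag.TYPE_E
    print(has_flag(node, NodeFlag.TYPE_S))
    print(has_flag(node, NodeFlag.TYPE_X))
```

Combining flags with bitwise OR keeps every combination within a single byte per lattice point, which is consistent with the memory budget described above.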