FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering

Shanghai AI Laboratory
Preprint

*Indicates Equal Contribution
Teaser figure: FlashGS design with algorithmic and low-level optimizations.

Abstract

We introduce FlashGS, an open-source CUDA Python library designed to facilitate the efficient differentiable rasterization of 3D Gaussian Splatting through algorithmic and kernel-level optimizations. FlashGS is built on observations from a comprehensive analysis of the rendering process, with the aim of improving computational efficiency and bringing the technique to wide adoption. The paper presents a suite of optimization strategies, encompassing redundancy elimination, efficient pipelining, refined control and scheduling mechanisms, and memory access optimizations, all integrated to improve the performance of the rasterization process. We evaluate FlashGS extensively across a diverse range of synthetic and real-world large-scale scenes at a variety of image resolutions. The empirical findings demonstrate that FlashGS consistently achieves an average 4× acceleration over mobile consumer GPUs, coupled with reduced memory consumption. These results underscore the performance and resource efficiency of FlashGS, positioning it as a strong tool for 3D rendering.

Evaluation

FlashGS average/slowest frame rendering time (ms) and corresponding FPS, with speedup relative to 3DGS, across different datasets and resolutions on an RTX 3090 GPU. FlashGS always achieves more than 100 FPS on the RTX 3090, even for high-resolution and large-scale datasets. For the slowest frame across all datasets, which occurs in Rubble, we achieve 107.3 FPS. This demonstrates that FlashGS can render in real time even in extremely large and high-resolution cases. FlashGS achieves up to 30.53× speedup, with an average of 12.18×, on the MatrixCity dataset at 4K resolution. We achieve a 7.2× average speedup over all 11 scenes and an 8.6× speedup on the 7 large-scale or high-resolution tests.
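The relation between the frame times, FPS values, and speedups reported above can be made concrete with a small sketch. The helper names below are hypothetical and are not part of FlashGS. The sketch also assumes that an "average speedup" is the arithmetic mean of per-scene ratios, which the text does not state.

```python
from statistics import mean


def fps_from_frame_time(frame_time_ms: float) -> float:
    """Convert a per-frame rendering time in milliseconds to frames per second."""
    if frame_time_ms <= 0:
        raise ValueError("frame time must be positive")
    return 1000.0 / frame_time_ms


def frame_time_from_fps(fps: float) -> float:
    """Convert frames per second to a per-frame rendering time in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Speedup of an optimized renderer relative to a baseline (ratio of frame times)."""
    if baseline_ms <= 0 or optimized_ms <= 0:
        raise ValueError("frame times must be positive")
    return baseline_ms / optimized_ms


def mean_speedup(pairs: list[tuple[float, float]]) -> float:
    """Arithmetic mean of per-scene speedups (assumed averaging rule, see lead-in).

    Each pair is (baseline_ms, optimized_ms) for one scene.
    """
    return mean(speedup(b, o) for b, o in pairs)


# The slowest frame quoted above (107.3 FPS) corresponds to roughly 9.32 ms,
# which is inside the 10 ms budget implied by a 100 FPS target.
slowest_ms = frame_time_from_fps(107.3)
budget_ms = frame_time_from_fps(100.0)
print(f"slowest frame: {slowest_ms:.2f} ms, budget: {budget_ms:.2f} ms")
```

Under these definitions, a speedup is simply the ratio of baseline to optimized frame time, so a 100 FPS guarantee is equivalent to every frame finishing within 10 ms.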

BibTeX

@article{feng2024flashgs,
        title={{FlashGS}: Efficient {3D} {Gaussian} Splatting for Large-scale and High-resolution Rendering},
        author={Feng, Guofeng and Chen, Siyan and Fu, Rong and Liao, Zimu and Wang, Yi and Liu, Tao and Pei, Zhilin and Li, Hengjie and Zhang, Xingcheng and Dai, Bo},
        journal={arXiv preprint arXiv:2408.07967},
        year={2024}
      }