AITemplate: A Python framework for converting deep neural networks into high-performance CUDA/HIP C++ code

xguru · 2023-06-10T10:31:01+09:00

(github.com/facebookincubator)

11 points by xguru, 2023-06-10

Converts deep neural networks into CUDA (NVIDIA GPU) / HIP (AMD GPU) C++ code for fast inference
Achieves close to roofline fp16 TensorCore/MatrixCore performance on major models such as ResNet, BERT, VisionTransformer, and Stable Diffusion
A unified, open, and flexible open-source project
Excellent backward compatibility (no dependency on third-party libraries or runtimes). Each model is compiled into a portable binary
Horizontal fusion / vertical fusion / memory fusion
Works with or without PyTorch
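
To give a rough idea of what the fusion terms above mean, here is a minimal sketch in plain Python. It is not AITemplate code and does not use its API. The function names (`add_bias`, `relu`, `unfused`, `vertically_fused`, `horizontally_fused`) are hypothetical, and Python lists stand in for GPU buffers. Memory fusion is not shown.

```python
# Illustrative only: plain-Python stand-ins for GPU kernels, not AITemplate code.
# All function names here are hypothetical.

def add_bias(xs, bias):
    """Stand-in for kernel 1: reads xs and writes a new intermediate buffer."""
    return [x + bias for x in xs]


def relu(xs):
    """Stand-in for kernel 2: reads the intermediate buffer and writes another."""
    return [max(x, 0.0) for x in xs]


def unfused(xs, bias):
    # Two separate passes over the data, with an intermediate list in between.
    return relu(add_bias(xs, bias))


def vertically_fused(xs, bias):
    # Vertical fusion: consecutive elementwise ops are chained per element,
    # so the data is traversed once and no intermediate list is built.
    return [max(x + bias, 0.0) for x in xs]


def horizontally_fused(inputs, biases):
    # Horizontal fusion: several independent ops of the same kind are grouped
    # behind one call, standing in for a single kernel launch covering all of them.
    return [vertically_fused(xs, b) for xs, b in zip(inputs, biases)]


if __name__ == "__main__":
    xs = [-1.5, 0.0, 2.0]
    # Fused and unfused versions compute the same result.
    assert unfused(xs, 1.0) == vertically_fused(xs, 1.0)
    print(vertically_fused(xs, 1.0))
```

On a real GPU the benefit of this kind of fusion would come from fewer kernel launches and fewer round trips to device memory, which this sketch can only imitate.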
