63 posts in total
2024
MLIR: Using TableGen for Passes
MLIR: Writing Our First Pass
MLIR: Running and Testing a Lowering
DistriFusion
DeepSpeed Ulysses
TVM Python/C++ Interaction
Efficient Large-Scale Language Model Training on GPU Clusters
Megatron-LM
PipeFusion: Displaced Patch Pipeline Parallelism for Inference of DiT Models
xDiT Principle