Course Learning  (3)

DeepSeek Related  (4)

Engineering Development Skills🔧  (7)

Leetcode  (10)

SJTU-xflops2024  (6)

Articles

ASTRA-sim

Source code reading of ASTRA-sim.

Jun-09-2025 · 14 min · 6842 words · WITHER

Transformer Family

Introduction of Transformer Family

Jun-07-2025 · 5 min · 2392 words · WITHER

ZeRO, ZeRO-Offload, ZeRO-Infinity

Paper reading of ZeRO.

Jun-07-2025 · 19 min · 9049 words · WITHER

xDiT Principle

This is a brief introduction to the xDiT Principle.

Jun-07-2025 · 14 min · 7008 words · WITHER

vLLM Source Code Reading

vLLM structure

Jun-07-2025 · 24 min · 12012 words · WITHER

Functional Test of Hugo

function test

Jun-07-2025 · 1 min · 239 words · WITHER

All2All Communication Cost

Introduction of Transformer Family

Jan-12-2025 · 5 min · 2403 words · WITHER

DistriFusion

Paper reading about DistriFusion.

Oct-23-2024 · 7 min · 3399 words · WITHER

DeepSpeed Ulysses

Paper reading of DeepSpeed Ulysses.

Oct-21-2024 · 2 min · 678 words · WITHER

Efficient Large-Scale Language Model Training on GPU Clusters

Paper reading about Efficient Large-Scale Language Model Training on GPU Clusters.

Oct-05-2024 · 9 min · 4182 words · WITHER