2025  34

September  1

Bithack

Sep-18-2025 · 6 min · 2664 words · WITHER

August  7

Hot100

Aug-23-2025 · 18 min · 8661 words · WITHER

09 Monotone Deque

Aug-21-2025 · 2 min · 606 words · WITHER

08 Monotone Stack

Aug-19-2025 · 4 min · 1829 words · WITHER

Cpp Lambda Expression

Aug-15-2025 · 3 min · 1021 words · WITHER

TX8 Inference Engine

Aug-07-2025 · 3 min · 1292 words · WITHER

07 DynamicProgramming

Aug-07-2025 · 20 min · 9561 words · WITHER

06 Backtracking

Aug-04-2025 · 14 min · 6638 words · WITHER

July  8

05 Binary Tree

Jul-28-2025 · 13 min · 6179 words · WITHER

04 Linked List

Jul-25-2025 · 11 min · 5198 words · WITHER

03 BinarySearch

Jul-24-2025 · 12 min · 5826 words · WITHER

TX8 Backend

Jul-23-2025 · 44 min · 21991 words · WITHER

02 Sliding Window

Jul-23-2025 · 2 min · 679 words · WITHER

01 Double-pointer

Jul-20-2025 · 6 min · 2958 words · WITHER

InternVideo2.5

Jul-10-2025 · 3 min · 1050 words · WITHER

Regular Expression Rules

Jul-04-2025 · 4 min · 1577 words · WITHER

June  17

🔑 Useful VSCode Shortcut keys

Jun-28-2025 · 2 min · 668 words · WITHER

DualPipe

Jun-21-2025 · 13 min · 6075 words · WITHER

DeepSeek-V3 Technical Report

Jun-20-2025 · 19 min · 9396 words · WITHER

DeepSeekMoE

Jun-19-2025 · 5 min · 2061 words · WITHER

DeepSeekMLA

Jun-19-2025 · 8 min · 3711 words · WITHER

PipeFusion

Jun-13-2025 · 8 min · 3663 words · WITHER

Fast-dLLM

Jun-12-2025 · 21 min · 10115 words · WITHER

LLaDA

Jun-12-2025 · 8 min · 3615 words · WITHER

Tx8read

Jun-11-2025 · 13 min · 6269 words · WITHER

astra-Sim

Jun-09-2025 · 14 min · 6842 words · WITHER

Transformer Family

Jun-07-2025 · 5 min · 2392 words · WITHER

ZeRO, ZeRO-Offload, ZeRO-Infinity

Jun-07-2025 · 19 min · 9049 words · WITHER

xDiT Principle

Jun-07-2025 · 14 min · 7008 words · WITHER

VLLM Source Code Reading

Jun-07-2025 · 24 min · 12012 words · WITHER

Functional Test of Hugo

Jun-07-2025 · 1 min · 239 words · WITHER

A Simple CMake Example

Jun-06-2025 · 5 min · 2223 words · WITHER

How to Use git rebase

Jun-06-2025 · 4 min · 1569 words · WITHER

January  1

All2All Communication Cost

Jan-12-2025 · 5 min · 2403 words · WITHER

2024  39

November  7

MLIR-Ch9 Dialect Conversion

Nov-12-2024 · 6 min · 2988 words · WITHER

MLIR-Ch8 Canonicalizers and Declarative Rewrite Patterns

Nov-11-2024 · 5 min · 2105 words · WITHER

MLIR-Ch7 Verifiers

Nov-10-2024 · 3 min · 1376 words · WITHER

MLIR-Ch6 Folders and Constant Propagation

Nov-09-2024 · 4 min · 1999 words · WITHER

MLIR-Ch5 Using Traits

Nov-08-2024 · 5 min · 2212 words · WITHER

MLIR-Ch4 Defining a New Dialect

Nov-07-2024 · 7 min · 3486 words · WITHER

MLIR-Ch3 Using Tablegen for Passes

Nov-06-2024 · 5 min · 2430 words · WITHER

October  5

MLIR-Ch2 Writing Our First Pass

Oct-30-2024 · 11 min · 5258 words · WITHER

DistriFusion

Oct-23-2024 · 7 min · 3399 words · WITHER

DeepSpeedUlysses

Oct-21-2024 · 2 min · 678 words · WITHER

Efficient Large-Scale Language Model Training on GPU

Oct-05-2024 · 9 min · 4182 words · WITHER

Megatron-LM

Oct-02-2024 · 4 min · 1866 words · WITHER

September  16

Ring Attention Principle

Sep-26-2024 · 6 min · 2551 words · WITHER

PMPP Learning-Chapter 15 Graph traversal

Sep-18-2024 · 8 min · 3608 words · WITHER

PMPP Learning-Chapter 14 Sparse Matrix Computation

Sep-18-2024 · 8 min · 3680 words · WITHER

PMPP Learning-Chapter 13 Sorting

Sep-14-2024 · 5 min · 2184 words · WITHER

PMPP Learning-Chapter 12 Merge-An Introduction to Dynamic Input Data Identification

Sep-13-2024 · 7 min · 3140 words · WITHER

PMPP Learning-Chapter 11 Prefix sum (scan)-An Introduction to Work Efficiency in Parallel Algorithms

Sep-11-2024 · 7 min · 3345 words · WITHER

PMPP Learning-Chapter 10 Reduction and Minimizing Divergence

Sep-10-2024 · 7 min · 3209 words · WITHER

PMPP Learning-Chapter 8 Stencil

Sep-09-2024 · 5 min · 2466 words · WITHER

PMPP Learning-Chapter 9 Parallel Histogram-An Introduction to Atomic Operations and Privatization

Sep-09-2024 · 7 min · 3150 words · WITHER

PMPP Learning-Chapter 7 Convolution-An Introduction to Constant Memory and Caching

Sep-06-2024 · 5 min · 2226 words · WITHER

PMPP Learning-Chapter 6 Performance Considerations

Sep-05-2024 · 6 min · 2954 words · WITHER

PMPP Learning-Chapter 4 Compute Architecture and Scheduling

Sep-05-2024 · 7 min · 3082 words · WITHER

PMPP Learning-Chapter 5 Memory Architecture and Data Locality

Sep-05-2024 · 8 min · 3747 words · WITHER

PMPP Learning-Chapter 3 Multidimensional Grids and Data

Sep-04-2024 · 4 min · 1828 words · WITHER

PMPP Learning-Chapter 2 Heterogeneous Data Parallel

Sep-03-2024 · 7 min · 3466 words · WITHER

PMPP Learning-Chapter 1 Introduction

Sep-03-2024 · 4 min · 1895 words · WITHER

August  10

TVM Learning (10)-Computational Graph Optimization

Aug-25-2024 · 10 min · 4810 words · WITHER

TVM Learning (9)-GPU and Hardware Acceleration, Part 2

Aug-25-2024 · 8 min · 3884 words · WITHER

TVM Learning (8)-GPU and Hardware Acceleration, Part 1

Aug-24-2024 · 7 min · 3024 words · WITHER

TVM Learning (6)-Exercise of End to End Model Execution

Aug-20-2024 · 7 min · 3271 words · WITHER

TVM Learning (5)-Automatic Program Optimization

Aug-19-2024 · 5 min · 2050 words · WITHER

TVM Learning (4)-End to End Model Execution

Aug-18-2024 · 7 min · 3447 words · WITHER

TVM Learning (3)-Schedule Analysis

Aug-17-2024 · 13 min · 6420 words · WITHER

TVM Learning (2)-Tensor Program Abstraction Case

Aug-15-2024 · 5 min · 2085 words · WITHER

TVM Learning (1)-Tensor Program Abstraction in Action

Aug-15-2024 · 8 min · 3770 words · WITHER

TVM Learning (11)-Add Model Architecture in MLC LLM

Aug-08-2024 · 8 min · 3855 words · WITHER

June  1

About Me

Jun-29-2024 · 1 min · 85 words · WITHER

2023  1

November  1

Comparison of Parallelism Methods in ViT

Nov-13-2023 · 15 min · 7045 words · WITHER