#compiler
18 articles
Advanced
SPIR-V Compilation and Level Zero Runtime
#intel
#spirv
#level-zero
#compiler
#runtime
#jit
#aot
Advanced
Graph Capture: TorchDynamo, AOTAutograd & Functionalization
#compiler
#pytorch
#torchdynamo
#aotautograd
#fx-graph
Advanced
IR Design (Part 1): SSA, FX IR & MLIR Dialects
#compiler
#ir
#ssa
#pytorch
#mlir
#fx-graph
#dialect
Advanced
IR Design (Part 2): Progressive Lowering and Multi-Level IR
#compiler
#mlir
#progressive-lowering
#dialect-conversion
#bufferization
Intermediate
Panorama: The World of ML Compilers
#compiler
#pytorch
#mlir
#triton
#optimization
Advanced
Graph Optimization Passes (Part 2): Advanced Optimizations & Pattern Matching
#compiler
#optimization
#layout
#pattern-matching
#memory-planning
Advanced
Graph Optimization Passes (Part 1): Data Flow Analysis & Pass Fundamentals
#compiler
#optimization
#pass
#dataflow-analysis
#dce
#cse
Advanced
Graph Optimization Passes (Part 2): Polyhedral Optimization & Loop Transformations
#compiler
#polyhedral
#loop-optimization
#affine
#mlir
#tiling
Advanced
Operator Fusion (Part II): Cost Models & Fusion in Practice
#compiler
#fusion
#cost-model
#flash-attention
#inductor
#optimization
Advanced
Operator Fusion (Part I): Taxonomy & Decision Algorithms
#compiler
#fusion
#operator-fusion
#kernel-fusion
#optimization
Advanced
Code Generation (Part I): Instruction Selection, Vectorization & Register Allocation
#compiler
#codegen
#instruction-selection
#vectorization
#register-allocation
#gpu
Advanced
Code Generation (Part II): Triton Pipeline, Compiler Backends & Numerical Correctness
#compiler
#codegen
#triton
#llvm
#ptx
#numerical-accuracy
#backends
Advanced
Dynamic Shapes: The Full-Pipeline Challenge from Capture to Execution
#compiler
#dynamic-shapes
#symbolic-shapes
#guards
#bucketing
#pytorch
Advanced
Tiling Strategies & Memory Hierarchy Optimization
#compiler
#tiling
#memory-hierarchy
#gpu
#shared-memory
#optimization
Advanced
Autotuning and End-to-End Practice
#compiler
#autotuning
#triton
#mlir
#transform-dialect
#end-to-end
#torch-compile
Advanced
Distributed Compilation and Graph Partitioning
#compiler
#distributed
#tensor-parallel
#pipeline-parallel
#gspmd
#sharding
#communication
Advanced
Quantization Compilation and Mixed-Precision Optimization
#compiler
#quantization
#mixed-precision
#kernel-generation
#fusion
Advanced
Scheduling and Execution Optimization
#compiler
#scheduling
#cuda-stream
#cuda-graph
#memory-planning
#activation-checkpointing
#multi-backend