Content on this site is AI-generated and may contain errors. If you find issues, please report them at GitHub Issues.

#compiler

18 articles

Advanced

SPIR-V Compilation and Level Zero Runtime

#intel #spirv #level-zero #compiler #runtime #jit #aot
Advanced

Graph Capture: TorchDynamo, AOTAutograd & Functionalization

#compiler #pytorch #torchdynamo #aotautograd #fx-graph
Advanced

IR Design (Part 1): SSA, FX IR & MLIR Dialects

#compiler #ir #ssa #pytorch #mlir #fx-graph #dialect
Advanced

IR Design (Part 2): Progressive Lowering and Multi-Level IR

#compiler #mlir #progressive-lowering #dialect-conversion #bufferization
Intermediate

Panorama: The World of ML Compilers

#compiler #pytorch #mlir #triton #optimization
Advanced

Graph Optimization Passes (Part 2): Advanced Optimizations & Pattern Matching

#compiler #optimization #layout #pattern-matching #memory-planning
Advanced

Graph Optimization Passes (Part 1): Data Flow Analysis & Pass Fundamentals

#compiler #optimization #pass #dataflow-analysis #dce #cse
Advanced

Graph Optimization Passes (Part 3): Polyhedral Optimization & Loop Transformations

#compiler #polyhedral #loop-optimization #affine #mlir #tiling
Advanced

Operator Fusion (Part II): Cost Models & Fusion in Practice

#compiler #fusion #cost-model #flash-attention #inductor #optimization
Advanced

Operator Fusion (Part I): Taxonomy & Decision Algorithms

#compiler #fusion #operator-fusion #kernel-fusion #optimization
Advanced

Code Generation (Part I): Instruction Selection, Vectorization & Register Allocation

#compiler #codegen #instruction-selection #vectorization #register-allocation #gpu
Advanced

Code Generation (Part II): Triton Pipeline, Compiler Backends & Numerical Correctness

#compiler #codegen #triton #llvm #ptx #numerical-accuracy #backends
Advanced

Dynamic Shapes: The Full-Pipeline Challenge from Capture to Execution

#compiler #dynamic-shapes #symbolic-shapes #guards #bucketing #pytorch
Advanced

Tiling Strategies & Memory Hierarchy Optimization

#compiler #tiling #memory-hierarchy #gpu #shared-memory #optimization
Advanced

Autotuning and End-to-End Practice

#compiler #autotuning #triton #mlir #transform-dialect #end-to-end #torch-compile
Advanced

Distributed Compilation and Graph Partitioning

#compiler #distributed #tensor-parallel #pipeline-parallel #gspmd #sharding #communication
Advanced

Quantization Compilation and Mixed-Precision Optimization

#compiler #quantization #mixed-precision #kernel-generation #fusion
Advanced

Scheduling and Execution Optimization

#compiler #scheduling #cuda-stream #cuda-graph #memory-planning #activation-checkpointing #multi-backend