#optimization
9 articles
Advanced
GEMM Optimization: From Naive to Peak Performance
#gpu
#gemm
#cuda
#optimization
#tensor-core
#xmx
#intel
Advanced
KV Cache Fundamentals
#inference
#kv-cache
#memory
#optimization
Advanced
Speculative Decoding: Accelerating LLM Inference via Guessing
#inference
#optimization
#speculative-decoding
Intermediate
Panorama: The World of ML Compilers
#compiler
#pytorch
#mlir
#triton
#optimization
Advanced
Graph Optimization Passes (Part 2): Advanced Optimizations & Pattern Matching
#compiler
#optimization
#layout
#pattern-matching
#memory-planning
Advanced
Graph Optimization Passes (Part 1): Data Flow Analysis & Pass Fundamentals
#compiler
#optimization
#pass
#dataflow-analysis
#dce
#cse
Advanced
Operator Fusion (Part II): Cost Models & Fusion in Practice
#compiler
#fusion
#cost-model
#flash-attention
#inductor
#optimization
Advanced
Operator Fusion (Part I): Taxonomy & Decision Algorithms
#compiler
#fusion
#operator-fusion
#kernel-fusion
#optimization
Advanced
Tiling Strategies & Memory Hierarchy Optimization
#compiler
#tiling
#memory-hierarchy
#gpu
#shared-memory
#optimization