
Intel iGPU Inference Deep Dive: Xe2 Architecture, oneDNN & OpenVINO

From Xe2 microarchitecture to oneDNN primitives, from SPIR-V compilation to OpenVINO graph optimization, and from performance analysis to GPU+NPU co-inference — a systematic deep dive into AI inference on Intel iGPUs.

  1. Xe2 GPU Architecture
     Advanced
     #intel #xe2 #gpu-architecture #igpu #lunar-lake #panther-lake
  2. Xe2 Execution Model and Programming Abstractions
     Advanced
     #intel #xe2 #simd #sycl #execution-model #workgroup
  3. SPIR-V Compilation and Level Zero Runtime
     Advanced
     #intel #spirv #level-zero #compiler #runtime #jit #aot
  4. oneDNN Primitive System
     Advanced
     #intel #onednn #primitive #memory-format #operator-library
  5. oneDNN GPU Kernel Optimization
     Advanced
     #intel #onednn #kernel-optimization #gemm #xmx #mixed-precision
  6. OpenVINO Graph Optimization Pipeline
     Advanced
     #intel #openvino #graph-optimization #model-compilation #plugin
  7. Intel Model Optimization Stack: Choosing Between Optimum Intel, NNCF, and OpenVINO
     Intermediate
     #intel #optimum #nncf #openvino #quantization #model-conversion
  8. Performance Analysis and Bottleneck Diagnosis
     Advanced
     #intel #performance #profiling #roofline #vtune #bottleneck
  9. NPU Architecture and GPU+NPU Co-Inference
     Advanced
     #intel #npu #openvino #hetero #multi-device #co-inference
  10. LLM Inference on NPU: KV Cache and the Software Stack
      Advanced
      #intel #npu #llm #kv-cache #openvino #npuw #static-shape
  11. NPU Execution Model and the Boundaries of Its Programming Model
      Advanced
      #intel #npu #execution-model #dma #tiling #attention #programming-model #cute
  12. Hands-On: HF → GGUF / ONNX / OpenVINO — Three End-to-End Paths
      Intermediate
      #quantization #model-conversion #hands-on #llama-cpp #onnx #openvino #intel-igpu