Recommended Resources

Reference resources organized by learning path, aggregated automatically from the citations in each article.

📄 Paper

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

arxiv.org · Source: MQA and GQA

📄 Paper

Fast Transformer Decoding: One Write-Head is All You Need

arxiv.org · Source: MQA and GQA

📄 Paper

Mistral 7B

arxiv.org · Source: Attention Variants: From Sliding Window to MLA

📄 Paper

Gemma 2 Technical Report

arxiv.org · Source: Attention Variants: From Sliding Window to MLA

📄 Paper

Jamba: A Hybrid Transformer-Mamba Language Model

arxiv.org · Source: Attention Variants: From Sliding Window to MLA , Hybrid Architectures: Fusing Mamba and Attention

📄 Paper

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5)

arxiv.org · Source: Attention Variants: From Sliding Window to MLA

📄 Paper

Flamingo: a Visual Language Model for Few-Shot Learning

arxiv.org · Source: Attention Variants: From Sliding Window to MLA

📄 Paper

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

arxiv.org · Source: Attention Variants: From Sliding Window to MLA , Mixture of Experts: Sparsely Activated Large-Model Architectures

📄 Paper

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

arxiv.org · Source: Attention Variants: From Sliding Window to MLA

📄 Paper

Retentive Network: A Successor to Transformer for Large Language Models

arxiv.org · Source: Attention Variants: From Sliding Window to MLA

📄 Paper

Parallelizing Linear Transformers with the Delta Rule over Sequence Length

arxiv.org · Source: Attention Variants: From Sliding Window to MLA , Qwen3-Coder-Next Architecture Explained: SSM, Attention, and MoE in One

📄 Paper

Gated Delta Networks: Improving Mamba2 with Delta Rule

arxiv.org · Source: Attention Variants: From Sliding Window to MLA , Qwen3-Coder-Next Architecture Explained: SSM, Attention, and MoE in One

📄 Paper

Efficient Memory Management for Large Language Model Serving with PagedAttention

arxiv.org · Source: How KV Cache Works

📄 Paper

Efficiently Scaling Transformer Inference

arxiv.org · Source: Prefill vs Decode Phases

📄 Paper

LLM Inference Unveiled: Survey and Roofline Model Insights

arxiv.org · Source: Prefill vs Decode Phases

📄 Paper

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

arxiv.org · Source: Flash Attention Tiling Principles

📄 Paper

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

arxiv.org · Source: Flash Attention Tiling Principles

📄 Paper

Self-Attention with Relative Position Representations

arxiv.org · Source: Positional Encoding: Teaching Transformers About Order

📄 Paper

RoFormer: Enhanced Transformer with Rotary Position Embedding

arxiv.org · Source: Positional Encoding: Teaching Transformers About Order

📄 Paper

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation

arxiv.org · Source: Positional Encoding: Teaching Transformers About Order

📄 Paper

The Curious Case of Neural Text Degeneration

arxiv.org · Source: Sampling & Decoding: From Probabilities to Text

📄 Paper

Hierarchical Neural Story Generation

arxiv.org · Source: Sampling & Decoding: From Probabilities to Text

📄 Paper

Perplexity — a Measure of the Difficulty of Speech Recognition Tasks

ieeexplore.ieee.org · Source: Sampling & Decoding: From Probabilities to Text

📄 Paper

Fast Inference from Transformers via Speculative Decoding

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing

📄 Paper

Accelerating Large Language Model Decoding with Speculative Sampling

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing

📄 Paper

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing

📄 Paper

Better & Faster Large Language Models via Multi-Token Prediction

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing

📄 Paper

EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing

📄 Paper

EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing

📄 Paper

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing

📄 Paper

DeepSeek-V3 Technical Report

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing , Mixture of Experts: Sparsely Activated Large-Model Architectures

📄 Paper

EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test

arxiv.org · Source: Speculative Decoding: Accelerating Decoding by Guessing

📄 Paper

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity

arxiv.org · Source: Mixture of Experts: Sparsely Activated Large-Model Architectures

📄 Paper

Mixtral of Experts

arxiv.org · Source: Mixture of Experts: Sparsely Activated Large-Model Architectures

📄 Paper

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

arxiv.org · Source: Mixture of Experts: Sparsely Activated Large-Model Architectures

📄 Paper

Efficiently Modeling Long Sequences with Structured State Spaces (S4)

arxiv.org · Source: State Space Models and Mamba

📄 Paper

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

arxiv.org · Source: State Space Models and Mamba

📄 Paper

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

arxiv.org · Source: State Space Models and Mamba

📄 Paper

HiPPO: Recurrent Memory with Optimal Polynomial Projections

arxiv.org · Source: State Space Models and Mamba

📄 Paper

Hungry Hungry Hippos: Towards Language Modeling with State Space Models (H3)

arxiv.org · Source: State Space Models and Mamba

📄 Paper

On the Parameterization and Initialization of Diagonal State Space Models (S4D)

arxiv.org · Source: State Space Models and Mamba

🌐 Website

Zamba2-Small: A Hybrid SSM-Transformer Model

zyphra.com · Source: Hybrid Architectures: Fusing Mamba and Attention

📄 Paper

Hymba: A Hybrid-head Architecture for Small Language Models

arxiv.org · Source: Hybrid Architectures: Fusing Mamba and Attention

📄 Paper

An Empirical Study of Mamba-based Language Models

arxiv.org · Source: Hybrid Architectures: Fusing Mamba and Attention

📄 Paper

Repeat After Me: Transformers are Better than State Space Models at Copying

arxiv.org · Source: Hybrid Architectures: Fusing Mamba and Attention

📄 Paper

Qwen3 Technical Report

arxiv.org · Source: Qwen3-Coder-Next Architecture Explained: SSM, Attention, and MoE in One

💻 Code

Ollama - Qwen3-Next Model Implementation

github.com · Source: Qwen3-Coder-Next Architecture Explained: SSM, Attention, and MoE in One

Cross-Modal Applications of Transformers 39 resources

📄 Paper

Efficient Estimation of Word Representations in Vector Space

arxiv.org · Source: From Text to Vectors: Tokenization and Word Embeddings

📄 Paper

Neural Machine Translation of Rare Words with Subword Units

arxiv.org · Source: From Text to Vectors: Tokenization and Word Embeddings

📄 Paper

GloVe: Global Vectors for Word Representation

nlp.stanford.edu · Source: From Text to Vectors: Tokenization and Word Embeddings

📄 Paper

SentencePiece: A simple and language independent subword tokenizer

arxiv.org · Source: From Text to Vectors: Tokenization and Word Embeddings

🌐 Website

The Illustrated Word2Vec

jalammar.github.io · Source: From Text to Vectors: Tokenization and Word Embeddings

🌐 Website

Hugging Face Tokenizer Summary

huggingface.co · Source: From Text to Vectors: Tokenization and Word Embeddings

📄 Paper

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

arxiv.org · Source: BERT and GPT: Two Paths, Understanding and Generation

📄 Paper

Improving Language Understanding by Generative Pre-Training

cdn.openai.com · Source: BERT and GPT: Two Paths, Understanding and Generation

📄 Paper

Language Models are Unsupervised Multitask Learners

cdn.openai.com · Source: BERT and GPT: Two Paths, Understanding and Generation

📄 Paper

Language Models are Few-Shot Learners

arxiv.org · Source: BERT and GPT: Two Paths, Understanding and Generation

📄 Paper

BERT for Joint Intent Classification and Slot Filling

arxiv.org · Source: BERT and GPT: Two Paths, Understanding and Generation

📄 Paper

Scaling Laws for Neural Language Models

arxiv.org · Source: BERT and GPT: Two Paths, Understanding and Generation

📄 Paper

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

arxiv.org · Source: Sentence Embeddings: From Token Level to Semantic Retrieval

📄 Paper

Text Embeddings by Weakly-Supervised Contrastive Pre-training

arxiv.org · Source: Sentence Embeddings: From Token Level to Semantic Retrieval

📄 Paper

C-Pack: Packaged Resources To Advance General Chinese Embedding

arxiv.org · Source: Sentence Embeddings: From Token Level to Semantic Retrieval

📄 Paper

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

arxiv.org · Source: Sentence Embeddings: From Token Level to Semantic Retrieval

📄 Paper

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

arxiv.org · Source: Vision Transformer: When Images Become Token Sequences

📄 Paper

Training data-efficient image transformers & distillation through attention

arxiv.org · Source: Vision Transformer: When Images Become Token Sequences

📄 Paper

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

arxiv.org · Source: Vision Transformer: When Images Become Token Sequences

📄 Paper

Learning Transferable Visual Models From Natural Language Supervision

arxiv.org · Source: Multimodal Alignment: CLIP and Cross-Modal Embedding Spaces

📄 Paper

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

arxiv.org · Source: Multimodal Alignment: CLIP and Cross-Modal Embedding Spaces

📄 Paper

Sigmoid Loss for Language Image Pre-Training

arxiv.org · Source: Multimodal Alignment: CLIP and Cross-Modal Embedding Spaces

📄 Paper

Visual Instruction Tuning

arxiv.org · Source: Multimodal Alignment: CLIP and Cross-Modal Embedding Spaces

📄 Paper

Denoising Diffusion Probabilistic Models

arxiv.org · Source: Diffusion Model Basics: Generating from Noise

📄 Paper

Denoising Diffusion Implicit Models

arxiv.org · Source: Diffusion Model Basics: Generating from Noise

📄 Paper

High-Resolution Image Synthesis with Latent Diffusion Models

arxiv.org · Source: Diffusion Model Basics: Generating from Noise

📄 Paper

Classifier-Free Diffusion Guidance

arxiv.org · Source: Diffusion Model Basics: Generating from Noise

📄 Paper

Scalable Diffusion Models with Transformers

arxiv.org · Source: Diffusion Transformer: Image Generation with Transformers

📄 Paper

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

arxiv.org · Source: Diffusion Transformer: Image Generation with Transformers

📄 Paper

Video generation models as world simulators

openai.com · Source: Video Generation: Spatiotemporal Attention and the Sora Architecture

📄 Paper

Make-A-Video: Text-to-Video Generation without Text-Video Data

arxiv.org · Source: Video Generation: Spatiotemporal Attention and the Sora Architecture

📄 Paper

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

arxiv.org · Source: Video Generation: Spatiotemporal Attention and the Sora Architecture

📄 Paper

Robust Speech Recognition via Large-Scale Weak Supervision

arxiv.org · Source: Speech and Transformers: From Whisper to VALL-E

📄 Paper

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

arxiv.org · Source: Speech and Transformers: From Whisper to VALL-E

📄 Paper

High Fidelity Neural Audio Compression

arxiv.org · Source: Speech and Transformers: From Whisper to VALL-E

📄 Paper

Simple and Controllable Music Generation

arxiv.org · Source: Music Generation: When Transformers Learn to Compose

📄 Paper

Jukebox: A Generative Model for Music

arxiv.org · Source: Music Generation: When Transformers Learn to Compose

📄 Paper

MusicLM: Generating Music From Text

arxiv.org · Source: Music Generation: When Transformers Learn to Compose

📄 Paper

Fast Timing-Conditioned Latent Audio Diffusion

arxiv.org · Source: Music Generation: When Transformers Learn to Compose

LLM Quantization Techniques 27 resources

📄 Paper

A Survey of Quantization Methods for Efficient Neural Network Inference

arxiv.org · Source: Quantization Basics

📄 Paper

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

arxiv.org · Source: Quantization Basics

📄 Paper

FP8 Formats for Deep Learning

arxiv.org · Source: Quantization Basics , Inference-Time Quantization: KV Cache and Activation Quantization

📄 Paper

GPTQ: Accurate Post-Training Quantization for Generative Pre-Trained Transformers

arxiv.org · Source: PTQ Weight Quantization: From GPTQ to AWQ , llama.cpp Quantization Schemes

📄 Paper

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

arxiv.org · Source: PTQ Weight Quantization: From GPTQ to AWQ , llama.cpp Quantization Schemes

📄 Paper

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

arxiv.org · Source: PTQ Weight Quantization: From GPTQ to AWQ

📄 Paper

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

arxiv.org · Source: Quantization-Aware Training (QAT)

📄 Paper

BitNet: Scaling 1-bit Transformers for Large Language Models

arxiv.org · Source: Quantization-Aware Training (QAT)

📄 Paper

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

arxiv.org · Source: Quantization-Aware Training (QAT)

📄 Paper

QLoRA: Efficient Finetuning of Quantized LLMs

arxiv.org · Source: Quantization-Aware Training (QAT)

📄 Paper

LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning

arxiv.org · Source: Quantization-Aware Training (QAT)

📄 Paper

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

arxiv.org · Source: Inference-Time Quantization: KV Cache and Activation Quantization

📄 Paper

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

arxiv.org · Source: Inference-Time Quantization: KV Cache and Activation Quantization

💻 Code

llama.cpp Quantization Types

github.com · Source: llama.cpp Quantization Schemes

💻 Code

K-quant PR

github.com · Source: llama.cpp Quantization Schemes

🌐 Website

NVIDIA Model Optimizer GitHub

github.com · Source: The Quantization and Model Conversion Toolchain Landscape

🌐 Website

vLLM Quantization - LLM Compressor

github.com · Source: The Quantization and Model Conversion Toolchain Landscape

🌐 Website

Microsoft Olive Documentation - Why Olive

microsoft.github.io · Source: The Quantization and Model Conversion Toolchain Landscape

🌐 Website

Apple coremltools Optimization Overview

apple.github.io · Source: The Quantization and Model Conversion Toolchain Landscape

🌐 Website

AMD Quark Documentation

quark.docs.amd.com · Source: The Quantization and Model Conversion Toolchain Landscape

🌐 Website

Google AI Edge Torch GitHub

github.com · Source: The Quantization and Model Conversion Toolchain Landscape

🌐 Website

NNCF GitHub Repository

github.com · Source: The Quantization and Model Conversion Toolchain Landscape

🌐 Website

Optimum Intel Documentation

huggingface.co · Source: The Quantization and Model Conversion Toolchain Landscape , Hands-On: HF → GGUF / ONNX / OpenVINO, Three Paths End to End

🌐 Website

llama.cpp GitHub Repository

github.com · Source: Hands-On: HF → GGUF / ONNX / OpenVINO, Three Paths End to End

🌐 Website

ONNX Runtime Documentation

onnxruntime.ai · Source: Hands-On: HF → GGUF / ONNX / OpenVINO, Three Paths End to End

🌐 Website

OpenVINO Documentation

docs.openvino.ai · Source: Hands-On: HF → GGUF / ONNX / OpenVINO, Three Paths End to End

🌐 Website

lm-evaluation-harness

github.com · Source: Hands-On: HF → GGUF / ONNX / OpenVINO, Three Paths End to End

vLLM + SGLang Inference Engines: A Deep Dive 13 resources

📄 Paper

Efficient Memory Management for Large Language Model Serving with PagedAttention

arxiv.org · Source: The LLM Inference Engine Landscape: vLLM, SGLang, Ollama, and TensorRT-LLM , PagedAttention and Continuous Batching , Scheduling and Preemption: The Inference Engine Scheduler

📄 Paper

SGLang: Efficient Execution of Structured Language Model Programs

arxiv.org · Source: The LLM Inference Engine Landscape: vLLM, SGLang, Ollama, and TensorRT-LLM , Prefix Caching and RadixAttention , The SGLang Programming Model and Structured Outputs

🌐 Website

NVIDIA TensorRT-LLM Documentation

nvidia.github.io · Source: The LLM Inference Engine Landscape: vLLM, SGLang, Ollama, and TensorRT-LLM

🌐 Website

Ollama GitHub Repository

github.com · Source: The LLM Inference Engine Landscape: vLLM, SGLang, Ollama, and TensorRT-LLM

📄 Paper

Orca: A Distributed Serving System for Transformer-Based Generative Models

arxiv.org · Source: PagedAttention and Continuous Batching

🌐 Website

vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention

blog.vllm.ai · Source: PagedAttention and Continuous Batching

📄 Paper

Sarathi: Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills

arxiv.org · Source: Scheduling and Preemption: The Inference Engine Scheduler

📄 Paper

Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve

arxiv.org · Source: Scheduling and Preemption: The Inference Engine Scheduler

🌐 Website

vLLM Automatic Prefix Caching

docs.vllm.ai · Source: Prefix Caching and RadixAttention

📄 Paper

Trie Memory

dl.acm.org · Source: Prefix Caching and RadixAttention

📄 Paper

Efficient Guided Generation for Large Language Models

arxiv.org · Source: The SGLang Programming Model and Structured Outputs

🌐 Website

Fast JSON Decoding for Local LLMs with Compressed Finite State Machine

lmsys.org · Source: The SGLang Programming Model and Structured Outputs

🌐 Website

SGLang Documentation — Structured Outputs

docs.sglang.io · Source: The SGLang Programming Model and Structured Outputs

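LLM Model Routing: Intelligent Model Selection and Hybrid Inference 16 resources

📄 Paper

RouteLLM: Learning to Route LLMs with Preference Data

arxiv.org · Source: The Model Routing Landscape: Why One Model Is Not Enough , Routing Classifiers: Letting a Small Model Decide Who Answers , RouteLLM in Practice: From Preference Data to Production Routing , Factorization Machines and LLM Routing: From FM Theory to the MF Router , Online Learning and Cost Optimization: Routing Must Keep Evolving

📄 Paper

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

arxiv.org · Source: The Model Routing Landscape: Why One Model Is Not Enough , Cascading and Self-Verification: Try the Cheap Model First and Escalate Only When Needed

📄 Paper

AutoMix: Automatically Mixing Language Models

arxiv.org · Source: The Model Routing Landscape: Why One Model Is Not Enough , Cascading and Self-Verification: Try the Cheap Model First and Escalate Only When Needed

🌐 Website

RouteLLM GitHub Repository

github.com · Source: The Model Routing Landscape: Why One Model Is Not Enough , RouteLLM in Practice: From Preference Data to Production Routing

📄 Paper

Evaluating Small Language Models for Front-Door Routing

arxiv.org · Source: Routing Classifiers: Letting a Small Model Decide Who Answers

🌐 Website

semantic-router: Superfast Decision-Making Layer

github.com · Source: Routing Classifiers: Letting a Small Model Decide Who Answers

📄 Paper

Factorization Machines

csie.ntu.edu.tw · Source: Factorization Machines and LLM Routing: From FM Theory to the MF Router

📄 Paper

Factorization Machines with libFM

dl.acm.org · Source: Factorization Machines and LLM Routing: From FM Theory to the MF Router

📄 Paper

Confidence-Driven LLM Router

arxiv.org · Source: Cascading and Self-Verification: Try the Cheap Model First and Escalate Only When Needed

📄 Paper

ConsRoute: Consistency-Driven LLM Routing for Cloud-Edge-Device

arxiv.org · Source: Hybrid LLM: Intelligent Routing Between Local and Cloud

📄 Paper

HybridFlow: Subtask-level DAG Routing

arxiv.org · Source: Hybrid LLM: Intelligent Routing Between Local and Cloud

📄 Paper

PRISM: Privacy-Sensitive Entity-Level LLM Routing

arxiv.org · Source: Hybrid LLM: Intelligent Routing Between Local and Cloud

📄 Paper

Bridging On-Device and Cloud LLMs for Collaborative Reasoning

arxiv.org · Source: Hybrid LLM: Intelligent Routing Between Local and Cloud

📄 Paper

Robust Batch-Level LLM Routing

arxiv.org · Source: Online Learning and Cost Optimization: Routing Must Keep Evolving

📄 Paper

Council Mode: Multi-LLM Collaboration for Hallucination Reduction

arxiv.org · Source: Multi-Model Collaboration: From Picking One to Using Many

🌐 Website

Mixture of Agents - Together AI

together.ai · Source: Multi-Model Collaboration: From Picking One to Using Many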

LLM Evaluation and Benchmarks: A Deep Dive 27 resources

📄 Paper

Measuring Massive Multitask Language Understanding (MMLU)

arxiv.org · Source: The Benchmark Landscape and Evaluation Methodology

🌐 Website

lm-evaluation-harness

github.com · Source: The Benchmark Landscape and Evaluation Methodology , How Optimization Affects Accuracy , lm-eval-harness Hands-On Guide

📄 Paper

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

arxiv.org · Source: The Benchmark Landscape and Evaluation Methodology

🌐 Website

LiveBench

livebench.ai · Source: The Benchmark Landscape and Evaluation Methodology , Reading Leaderboards and Choosing Models

📄 Paper

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

arxiv.org · Source: Knowledge and Reasoning Benchmarks

📄 Paper

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

arxiv.org · Source: Knowledge and Reasoning Benchmarks

📄 Paper

Measuring Mathematical Problem Solving With the MATH Dataset

arxiv.org · Source: Knowledge and Reasoning Benchmarks

📄 Paper

Evaluating Large Language Models Trained on Code

arxiv.org · Source: Code Benchmarks

📄 Paper

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

arxiv.org · Source: Code Benchmarks , SWE-bench Hands-On Guide

📄 Paper

Is Your Code Generated by ChatGPT Really Correct? (EvalPlus)

arxiv.org · Source: Code Benchmarks

🌐 Website

Berkeley Function Calling Leaderboard (BFCL)

gorilla.cs.berkeley.edu · Source: Agent and Tool Use Benchmarks , BFCL Hands-On Guide

📄 Paper

GAIA: A Benchmark for General AI Assistants

arxiv.org · Source: Agent and Tool Use Benchmarks

📄 Paper

WebArena: A Realistic Web Environment for Building Autonomous Agents

arxiv.org · Source: Agent and Tool Use Benchmarks

🌐 Website

Google Gemma 2 Technical Report

ai.google.dev · Source: Standard Benchmarks in Model Releases Explained

📄 Paper

Microsoft Phi-3 Technical Report

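arxiv.org · Source: Standard Benchmarks in Model Releases Explained

📄 Paper

Qwen2.5 Technical Report

arxiv.org · Source: Standard Benchmarks in Model Releases Explained

🌐 Website

Meta Llama 3.1 Model Card

huggingface.co · Source: Standard Benchmarks in Model Releases Explained

🌐 Website

Open LLM Leaderboard

huggingface.co · Source: Standard Benchmarks in Model Releases Explained , Reading Leaderboards and Choosing Models

🌐 Website

OpenVINO Neural Network Compression Framework (NNCF)

github.com · Source: How Optimization Affects Accuracy

🌐 Website

Optimum Intel

huggingface.co · Source: How Optimization Affects Accuracy

🌐 Website

llama.cpp

github.com · Source: How Optimization Affects Accuracy

🌐 Website

Chatbot Arena (LMSYS)

lmarena.ai · Source: Reading Leaderboards and Choosing Models

🌐 Website

Artificial Analysis LLM Leaderboard

artificialanalysis.ai · Source: Reading Leaderboards and Choosing Models

🌐 Website

lm-eval Documentation

lm-evaluation-harness.readthedocs.io · Source: lm-eval-harness Hands-On Guide

🌐 Website

SWE-bench GitHub

github.com · Source: SWE-bench Hands-On Guide

🌐 Website

SWE-agent GitHub

github.com · Source: SWE-bench Hands-On Guide

🌐 Website

Gorilla / BFCL GitHub

github.com · Source: BFCL Hands-On Guide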

Ollama + llama.cpp: A Deep Dive 20 resources

💻 Code

Ollama GitHub

github.com · Source: Ollama + llama.cpp Architecture Overview , The Full Journey of a Single Inference , KV Cache and Batch Scheduling , Serving Layer and Scheduling

💻 Code

llama.cpp GitHub

github.com · Source: Ollama + llama.cpp Architecture Overview , The Full Journey of a Single Inference , Compute Graph and Inference Engine

💻 Code

GGML GitHub

github.com · Source: Ollama + llama.cpp Architecture Overview , Compute Graph and Inference Engine , Hardware Backends

📄 Paper

Qwen3 Technical Report

arxiv.org · Source: The Full Journey of a Single Inference

💻 Code

GGUF Specification

github.com · Source: The GGUF Model Format

🌐 Website

Safetensors Documentation

huggingface.co · Source: The GGUF Model Format

🌐 Website

ONNX

onnx.ai · Source: The GGUF Model Format

💻 Code

llama.cpp Quantization Types

github.com · Source: llama.cpp Quantization Schemes

💻 Code

K-quant PR

github.com · Source: llama.cpp Quantization Schemes

📄 Paper

GPTQ: Accurate Post-Training Quantization

arxiv.org · Source: llama.cpp Quantization Schemes

📄 Paper

AWQ: Activation-aware Weight Quantization

arxiv.org · Source: llama.cpp Quantization Schemes

📄 Paper

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

arxiv.org · Source: Compute Graph and Inference Engine

📄 Paper

Efficient Memory Management for LLM Serving with PagedAttention

arxiv.org · Source: KV Cache and Batch Scheduling

🌐 Website

CUDA Programming Guide

docs.nvidia.com · Source: Hardware Backends

🌐 Website

Metal Shading Language

developer.apple.com · Source: Hardware Backends

🌐 Website

Vulkan Compute

khronos.org · Source: Hardware Backends

💻 Code

Ollama FAQ

github.com · Source: Serving Layer and Scheduling

💻 Code

Ollama Modelfile

github.com · Source: Model Ecosystem

💻 Code

Ollama API

github.com · Source: Model Ecosystem

📄 Paper

LLaVA: Visual Instruction Tuning

arxiv.org · Source: Model Ecosystem

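llama.cpp Source Code Close Reading 1 resource

💻 Code

llama.cpp GitHub

github.com · Source: llama.cpp Execution Flow Overview , Tool Landscape and GGUF Binary Parsing , Model Loading: From File to Device , Warmup, Tokenization, and Chat Template , Batch, Ubatch, and the Main Decode Loop , Compute Graph Construction and Architecture Dispatch , Backend Scheduling, Op Fusion, and Memory Allocation , Execution, Sampling, and Context Management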

AI Compute Stack 21 resources

🌐 Website

NVIDIA CUDA C++ Programming Guide

docs.nvidia.com · Source: AI Compute Stack Overview: From Inference Frameworks to Hardware Instruction Sets , GPU Architecture: From Transistors to Threads , The CUDA Programming Model: From Code to Hardware

🌐 Website

Khronos OpenCL Specification

khronos.org · Source: AI Compute Stack Overview: From Inference Frameworks to Hardware Instruction Sets

🌐 Website

Khronos SYCL Specification

khronos.org · Source: AI Compute Stack Overview: From Inference Frameworks to Hardware Instruction Sets

🌐 Website

Intel oneAPI Level Zero Specification

spec.oneapi.io · Source: AI Compute Stack Overview: From Inference Frameworks to Hardware Instruction Sets

🌐 Website

AMD ROCm HIP Programming Guide

rocm.docs.amd.com · Source: AI Compute Stack Overview: From Inference Frameworks to Hardware Instruction Sets

🌐 Website

Apple Metal Shading Language Specification

developer.apple.com · Source: AI Compute Stack Overview: From Inference Frameworks to Hardware Instruction Sets

💻 Code

ggml / llama.cpp

github.com · Source: AI Compute Stack Overview: From Inference Frameworks to Hardware Instruction Sets

🌐 Website

ONNX Runtime Documentation

onnxruntime.ai · Source: AI Compute Stack Overview: From Inference Frameworks to Hardware Instruction Sets

🌐 Website

NVIDIA H100 Tensor Core GPU Architecture Whitepaper

resources.nvidia.com · Source: GPU Architecture: From Transistors to Threads , Matrix Acceleration Units: Tensor Core and XMX

📄 Paper

Why Systolic Architectures? — H.T. Kung

cs.virginia.edu · Source: Matrix Acceleration Units: Tensor Core and XMX

🌐 Website

NVIDIA PTX ISA — Matrix Multiply-Accumulate

docs.nvidia.com · Source: Matrix Acceleration Units: Tensor Core and XMX

🌐 Website

Intel Xe2 Architecture — Xe-Core and XMX

intel.com · Source: Matrix Acceleration Units: Tensor Core and XMX , The CUDA Programming Model: From Code to Hardware

📄 Paper

DeepSeek-V3 Technical Report

arxiv.org · Source: Matrix Acceleration Units: Tensor Core and XMX

🌐 Website

NVIDIA Kernel Profiling Guide — Memory Coalescing

docs.nvidia.com · Source: The CUDA Programming Model: From Code to Hardware

🌐 Website

CUDA Occupancy Calculator

docs.nvidia.com · Source: The CUDA Programming Model: From Code to Hardware

🌐 Website

SYCL 2020 Specification

registry.khronos.org · Source: The CUDA Programming Model: From Code to Hardware

🌐 Website

CUTLASS: Fast Linear Algebra in CUDA C++

github.com · Source: GEMM Optimization: From Naive to Peak Performance

🌐 Website

How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance

siboehm.com · Source: GEMM Optimization: From Naive to Peak Performance

🌐 Website

CUDA C++ Programming Guide — Warp Matrix Functions

docs.nvidia.com · Source: GEMM Optimization: From Naive to Peak Performance

🌐 Website

Intel oneAPI DPC++ — joint_matrix Extension

github.com · Source: GEMM Optimization: From Naive to Peak Performance

📄 Paper

Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking

arxiv.org · Source: GEMM Optimization: From Naive to Peak Performance

Graph Compilation and Optimization 79 resources

📝 Blog

PyTorch 2.0: Our next generation release

pytorch.org · Source: The Big Picture: The World of ML Compilers , Graph Capture: TorchDynamo, AOTAutograd, and Functionalization

🌐 Website

MLIR: Multi-Level Intermediate Representation

mlir.llvm.org · Source: The Big Picture: The World of ML Compilers

🌐 Website

Triton Language and Compiler

triton-lang.org · Source: The Big Picture: The World of ML Compilers

📄 Paper

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning

arxiv.org · Source: The Big Picture: The World of ML Compilers

📄 Paper

MLIR: A Compiler Infrastructure for the End of Moore's Law

arxiv.org · Source: The Big Picture: The World of ML Compilers , IR Design (Part 1): SSA, FX IR, and MLIR Dialects , IR Design (Part 2): Progressive Lowering and Multi-Level IR

📝 Blog

TorchDynamo: An Experiment in Dynamic Python Bytecode Transformation

dev-discuss.pytorch.org · Source: The Big Picture: The World of ML Compilers , Graph Capture: TorchDynamo, AOTAutograd, and Functionalization

🌐 Website

PEP 523 – Adding a frame evaluation API to CPython

peps.python.org · Source: Graph Capture: TorchDynamo, AOTAutograd, and Functionalization

🌐 Website

torch.compiler — PyTorch Documentation

pytorch.org · Source: Graph Capture: TorchDynamo, AOTAutograd, and Functionalization

🌐 Website

AOT Autograd — How to use and optimize?

pytorch.org · Source: Graph Capture: TorchDynamo, AOTAutograd, and Functionalization

📄 Paper

Efficiently Computing Static Single Assignment Form and the Control Dependence Graph

dl.acm.org · Source: IR Design (Part 1): SSA, FX IR, and MLIR Dialects , Graph Optimization Passes (Part 1): Dataflow Analysis Basics and Common Pass Patterns

🌐 Website

torch.fx — PyTorch Documentation

pytorch.org · Source: IR Design (Part 1): SSA, FX IR, and MLIR Dialects , Graph Optimization Passes (Part 1): Dataflow Analysis Basics and Common Pass Patterns

🌐 Website

MLIR Language Reference

mlir.llvm.org · Source: IR Design (Part 1): SSA, FX IR, and MLIR Dialects

🌐 Website

MLIR Dialects

mlir.llvm.org · Source: IR Design (Part 1): SSA, FX IR, and MLIR Dialects

🌐 Website

MLIR Dialect Conversion

mlir.llvm.org · Source: IR Design (Part 2): Progressive Lowering and Multi-Level IR

🌐 Website

MLIR Bufferization

mlir.llvm.org · Source: IR Design (Part 2): Progressive Lowering and Multi-Level IR

🌐 Website

MLIR Pass Infrastructure

mlir.llvm.org · Source: IR Design (Part 2): Progressive Lowering and Multi-Level IR , Graph Optimization Passes (Part 1): Dataflow Analysis Basics and Common Pass Patterns

💻 Code

torch-mlir: PyTorch to MLIR compiler

github.com · Source: IR Design (Part 2): Progressive Lowering and Multi-Level IR

📄 Paper

A Unified Approach to Global Program Optimization

dl.acm.org · Source: Graph Optimization Passes (Part 1): Dataflow Analysis Basics and Common Pass Patterns

🌐 Website

MLIR Canonicalization

mlir.llvm.org · Source: Graph Optimization Passes (Part 1): Dataflow Analysis Basics and Common Pass Patterns

📄 Paper

Constant Propagation with Conditional Branches

dl.acm.org · Source: Graph Optimization Passes (Part 1): Dataflow Analysis Basics and Common Pass Patterns

🌐 Website

PyTorch FX Subgraph Rewriter

pytorch.org · Source: Graph Optimization Passes (Part 1): Dataflow Analysis Basics and Common Pass Patterns

🌐 Website

MLIR Declarative Rewrite Rules (DRR)

mlir.llvm.org · Source: Graph Optimization Passes (Part 2): Advanced Optimizations and Pattern Matching

🌐 Website

MLIR PDL — Pattern Description Language

mlir.llvm.org · Source: Graph Optimization Passes (Part 2): Advanced Optimizations and Pattern Matching

🌐 Website

torch.fx — Subgraph Rewriting

pytorch.org · Source: Graph Optimization Passes (Part 2): Advanced Optimizations and Pattern Matching

📄 Paper

Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions

arxiv.org · Source: Graph Optimization Passes (Part 2): Advanced Optimizations and Pattern Matching

🌐 Website

NVIDIA Tensor Core Programming

docs.nvidia.com · Source: Graph Optimization Passes (Part 2): Advanced Optimizations and Pattern Matching

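📄 Paper

A Practical Automatic Polyhedral Parallelizer and Locality Optimizer

dl.acm.org · Source: Graph Optimization Passes (Part 3): Polyhedral Optimization and Loop Transformations

🌐 Website

MLIR Affine Dialect

mlir.llvm.org · Source: Graph Optimization Passes (Part 3): Polyhedral Optimization and Loop Transformations

📄 Paper

Polyhedral Compilation as a Design Pattern for Compiler Construction

link.springer.com · Source: Graph Optimization Passes (Part 3): Polyhedral Optimization and Loop Transformations

🌐 Website

MLIR Transform Dialect

mlir.llvm.org · Source: Graph Optimization Passes (Part 3): Polyhedral Optimization and Loop Transformations , Autotuning and End-to-End Practice

📄 Paper

Optimizing Compilers for Modern Architectures

elsevier.com · Source: Graph Optimization Passes (Part 3): Polyhedral Optimization and Loop Transformations

🌐 Website

Polly - Polyhedral optimizations for LLVM

polly.llvm.org · Source: Graph Optimization Passes (Part 3): Polyhedral Optimization and Loop Transformations

📄 Paper

Integer Set Library: A Library for Manipulating Integer Sets

libisl.sourceforge.io · Source: Graph Optimization Passes (Part 3): Polyhedral Optimization and Loop Transformations

🌐 Website

MLIR Linalg Dialect

mlir.llvm.org · Source: Graph Optimization Passes (Part 3): Polyhedral Optimization and Loop Transformations , Operator Fusion (Part 2): Cost Models and Fusion in Practice

📄 Paper

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

arxiv.org · Source: Operator Fusion (Part 1): Fusion Taxonomy and Decision Algorithms , Operator Fusion (Part 2): Cost Models and Fusion in Practice , Tiling Strategies and Memory Hierarchy Optimization

📄 Paper

PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation

dl.acm.org · Source: Operator Fusion (Part 1): Fusion Taxonomy and Decision Algorithms , Operator Fusion (Part 2): Cost Models and Fusion in Practice , Dynamic Shapes: End-to-End Challenges from Capture to Execution

🌐 Website

TorchInductor: a PyTorch-native Compiler

dev-discuss.pytorch.org · Source: Operator Fusion (Part 1): Fusion Taxonomy and Decision Algorithms

📄 Paper

XLA: Optimizing Compiler for Machine Learning

tensorflow.org · Source: Operator Fusion (Part 1): Fusion Taxonomy and Decision Algorithms

🌐 Website

Roofline Model

docs.nersc.gov · Source: Operator Fusion (Part 1): Fusion Taxonomy and Decision Algorithms , Operator Fusion (Part 2): Cost Models and Fusion in Practice

📄 Paper

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

arxiv.org · Source: Operator Fusion (Part 2): Cost Models and Fusion in Practice

📄 Paper

FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

arxiv.org · Source: Operator Fusion (Part 2): Cost Models and Fusion in Practice

📄 Paper

Roofline: An Insightful Visual Performance Model for Multicore Architectures

www2.eecs.berkeley.edu · Source: Tiling Strategies and Memory Hierarchy Optimization

🌐 Website

NVIDIA CUDA C++ Programming Guide — Shared Memory

docs.nvidia.com · Source: Tiling Strategies and Memory Hierarchy Optimization

🌐 Website

CUTLASS: CUDA Templates for Linear Algebra Subroutines

github.com · Source: Tiling Strategies and Memory Hierarchy Optimization , Code Generation (Part 1): Instruction Selection, Vectorization, and Register Allocation

🌐 Website

Triton Language Documentation

triton-lang.org · Source: Tiling Strategies and Memory Hierarchy Optimization , Code Generation (Part 2): Triton Pipeline, Compiler Backends, and Numerical Correctness

🌐 Website

NVIDIA A100 GPU Architecture Whitepaper

images.nvidia.com · Source: Tiling Strategies and Memory Hierarchy Optimization

🌐 Website

torch.compile Dynamic Shapes Documentation

pytorch.org · Source: Dynamic Shapes: End-to-End Challenges from Capture to Execution

🌐 Website

TorchDynamo Deep Dive

pytorch.org · Source: Dynamic Shapes: End-to-End Challenges from Capture to Execution

🌐 Website

MLIR Tensor Type — Dynamic Dimensions

mlir.llvm.org · Source: Dynamic Shapes: End-to-End Challenges from Capture to Execution

🌐 Website

NVIDIA CUDA C++ Programming Guide — PTX ISA

docs.nvidia.com · Source: Code Generation (Part 1): Instruction Selection, Vectorization, and Register Allocation

🌐 Website

LLVM Code Generator Documentation

llvm.org · Source: Code Generation (Part 1): Instruction Selection, Vectorization, and Register Allocation

🌐 Website

NVIDIA GPU Architecture — Execution Units

docs.nvidia.com · Source: Code Generation (Part 1): Instruction Selection, Vectorization, and Register Allocation

📄 Paper

Triton: An Intermediate Language and Compiler for Tiled Neural Network Computations

eecs.harvard.edu · Source: Code Generation (Part 1): Instruction Selection, Vectorization, and Register Allocation , Code Generation (Part 2): Triton Pipeline, Compiler Backends, and Numerical Correctness , Autotuning and End-to-End Practice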

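🌐 Website

MLIR GPU Dialect

mlir.llvm.org · Source: Code Generation (Part 2): Triton Pipeline, Compiler Backends, and Numerical Correctness

🌐 Website

IREE Compiler and Runtime

iree.dev · Source: Code Generation (Part 2): Triton Pipeline, Compiler Backends, and Numerical Correctness

🌐 Website

TensorRT Developer Guide

docs.nvidia.com · Source: Code Generation (Part 2): Triton Pipeline, Compiler Backends, and Numerical Correctness

🌐 Website

What Every Computer Scientist Should Know About Floating-Point Arithmetic

docs.oracle.com · Source: Code Generation (Part 2): Triton Pipeline, Compiler Backends, and Numerical Correctness

📄 Paper

A Survey of Quantization Methods for Efficient Neural Network Inference

arxiv.org · Source: Quantization Compilation and Mixed-Precision Optimization

📄 Paper

GPTQ: Accurate Post-Training Quantization for Generative Pre-Trained Transformers

arxiv.org · Source: Quantization Compilation and Mixed-Precision Optimization

📄 Paper

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

arxiv.org · Source: Quantization Compilation and Mixed-Precision Optimization

📄 Paper

FP8 Formats for Deep Learning

arxiv.org · Source: Quantization Compilation and Mixed-Precision Optimization

🌐 Website

PyTorch Quantization Documentation

pytorch.org · Source: Quantization Compilation and Mixed-Precision Optimization

🌐 Website

TensorRT Quantization Toolkit

docs.nvidia.com · Source: Quantization Compilation and Mixed-Precision Optimization

📄 Paper

GSPMD: General and Scalable Parallelization for ML Computation Graphs

arxiv.org · Source: Distributed Compilation and Graph Partitioning

📄 Paper

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

arxiv.org · Source: Distributed Compilation and Graph Partitioning

📄 Paper

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

arxiv.org · Source: Distributed Compilation and Graph Partitioning

📄 Paper

PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

arxiv.org · Source: Distributed Compilation and Graph Partitioning

🌐 Website

PyTorch Distributed Overview

pytorch.org · Source: Distributed Compilation and Graph Partitioning

🌐 Website

XLA SPMD Partitioner

openxla.org · Source: Distributed Compilation and Graph Partitioning

🌐 Website

CUDA C++ Programming Guide — Streams

docs.nvidia.com · Source: Scheduling and Execution Optimization

🌐 Website

CUDA Graphs

docs.nvidia.com · Source: Scheduling and Execution Optimization

📄 Paper

Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization

arxiv.org · Source: Scheduling and Execution Optimization

📄 Paper

Dynamic Tensor Rematerialization

arxiv.org · Source: Scheduling and Execution Optimization

🌐 Website

TorchInductor: A PyTorch Native Compiler

dev-discuss.pytorch.org · Source: Scheduling and Execution Optimization

🌐 Website

PyTorch Activation Checkpointing

pytorch.org · Source: Scheduling and Execution Optimization

📄 Paper

Ansor: Generating High-Performance Tensor Programs for Deep Learning

arxiv.org · Source: Autotuning and End-to-End Practice

📄 Paper

Learning to Optimize Tensor Programs

arxiv.org · Source: Autotuning and End-to-End Practice

🌐 Website

Triton Autotune Documentation

triton-lang.org · Source: Autotuning and End-to-End Practice

🌐 Website

torch.compile Troubleshooting

pytorch.org · Source: Autotuning and End-to-End Practice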

强化学习：从基础到 LLM 对齐与推理 35 个资源

🌐 网站

Reinforcement Learning: An Introduction (Sutton & Barto, 2nd Edition)

incompleteideas.net · 来源: 强化学习基础：从 Agent 到 Bellman 方程

🌐 网站

David Silver UCL Reinforcement Learning Course

davidsilver.uk · 来源: 强化学习基础：从 Agent 到 Bellman 方程

🌐 网站

OpenAI Spinning Up: Introduction to RL

spinningup.openai.com · 来源: 强化学习基础：从 Agent 到 Bellman 方程

🌐 网站

Hugging Face Deep RL Course

huggingface.co · 来源: 强化学习基础：从 Agent 到 Bellman 方程 , Test-Time Scaling 与思维强化

🌐 网站

A (Long) Peek into Reinforcement Learning — Lilian Weng

lilianweng.github.io · 来源: 强化学习基础：从 Agent 到 Bellman 方程

🌐 网站

UC Berkeley CS285: Deep Reinforcement Learning

rail.eecs.berkeley.edu · 来源: 强化学习基础：从 Agent 到 Bellman 方程 , Policy Gradient：直接优化策略 , Actor-Critic 与 PPO：稳定的策略优化

🌐 网站

Deep Reinforcement Learning: Pong from Pixels — Andrej Karpathy

karpathy.github.io · 来源: 强化学习基础：从 Agent 到 Bellman 方程 , Test-Time Scaling 与思维强化

📄 论文

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning (Williams, 1992)

link.springer.com · 来源: Policy Gradient：直接优化策略

📄 论文

Policy Gradient Methods for Reinforcement Learning with Function Approximation (Sutton et al., 1999)

proceedings.neurips.cc · 来源: Policy Gradient：直接优化策略

🌐 网站

Policy Gradient Algorithms — Lilian Weng

lilianweng.github.io · 来源: Policy Gradient：直接优化策略 , Actor-Critic 与 PPO：稳定的策略优化 , 当 RL 遇上 LLM：从语言生成到策略优化

🌐 网站

OpenAI Spinning Up: Vanilla Policy Gradient

spinningup.openai.com · 来源: Policy Gradient：直接优化策略

📄 论文

Proximal Policy Optimization Algorithms (Schulman et al., 2017)

arxiv.org · 来源: Actor-Critic 与 PPO：稳定的策略优化

📄 论文

High-Dimensional Continuous Control Using Generalized Advantage Estimation (Schulman et al., 2016)

arxiv.org · 来源: Actor-Critic 与 PPO：稳定的策略优化

📄 论文

Trust Region Policy Optimization (Schulman et al., 2015)

arxiv.org · 来源: Actor-Critic 与 PPO：稳定的策略优化

🌐 网站

Hugging Face Deep RL Course: PPO

huggingface.co · 来源: Actor-Critic 与 PPO：稳定的策略优化

📄 论文

Training language models to follow instructions with human feedback (Ouyang et al., 2022)

arxiv.org · 来源: 当 RL 遇上 LLM：从语言生成到策略优化 , RLHF：从人类反馈中学习

📄 论文

Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Rafailov et al., 2023)

arxiv.org · 来源: 当 RL 遇上 LLM：从语言生成到策略优化 , 从 DPO 到 GRPO：直接偏好优化

📄 论文

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Shao et al., 2024)

arxiv.org · 来源: 当 RL 遇上 LLM：从语言生成到策略优化 , 从 DPO 到 GRPO：直接偏好优化

📄 论文

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (2025)

arxiv.org · 来源: 当 RL 遇上 LLM：从语言生成到策略优化 , Test-Time Scaling 与思维强化

📄 论文

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning (Ross et al., 2011)

arxiv.org · 来源: 当 RL 遇上 LLM：从语言生成到策略优化

📄 论文

Fine-Tuning Language Models from Human Preferences (Ziegler et al., 2019)

arxiv.org · 来源: 当 RL 遇上 LLM：从语言生成到策略优化 , RLHF：从人类反馈中学习

📄 论文

Learning to summarize from human feedback (Stiennon et al., 2020)

arxiv.org · 来源: 当 RL 遇上 LLM：从语言生成到策略优化

🌐 网站

RLHF: Reinforcement Learning from Human Feedback — Chip Huyen

huyenchip.com · 来源: 当 RL 遇上 LLM：从语言生成到策略优化 , RLHF：从人类反馈中学习

📄 论文

Let's Verify Step by Step (Lightman et al., 2023)

arxiv.org · 来源: 当 RL 遇上 LLM：从语言生成到策略优化 , Reward 设计与 Scaling , Test-Time Scaling 与思维强化

📄 论文

Deep Reinforcement Learning from Human Preferences (Christiano et al., 2017)

arxiv.org · 来源: RLHF：从人类反馈中学习

🌐 网站

RLHF series — Nathan Lambert (interconnects.ai)

interconnects.ai · 来源: RLHF：从人类反馈中学习 , Reward 设计与 Scaling

🌐 网站

Reward Hacking in Reinforcement Learning — Lilian Weng

lilianweng.github.io · 来源: RLHF：从人类反馈中学习 , 从 DPO 到 GRPO：直接偏好优化 , Reward 设计与 Scaling

📄 论文

A General Theoretical Paradigm to Understand Learning from Human Feedback (Azar et al., 2023)

arxiv.org · 来源: 从 DPO 到 GRPO：直接偏好优化

📄 论文

KTO: Model Alignment as Prospect Theoretic Optimization (Ethayarajh et al., 2024)

arxiv.org · 来源: 从 DPO 到 GRPO：直接偏好优化

🌐 网站

Hugging Face TRL Documentation: DPO Trainer

huggingface.co · 来源: 从 DPO 到 GRPO：直接偏好优化

📄 论文

Training Verifiers to Solve Math Word Problems (Cobbe et al., 2021)

arxiv.org · 来源: Reward 设计与 Scaling

📄 论文

Constitutional AI: Harmlessness from AI Feedback (Bai et al., 2022)

arxiv.org · 来源: Reward 设计与 Scaling

📄 论文

Scaling Laws for Reward Model Overoptimization (Gao et al., 2022)

arxiv.org · 来源: Reward 设计与 Scaling

📄 论文

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Snell et al., 2024)

arxiv.org · 来源: Test-Time Scaling 与思维强化

📄 论文

AlphaZero-like Tree-Search can Guide Large Language Model Decoding and Training (Feng et al., 2024)

arxiv.org · 来源: Test-Time Scaling 与思维强化

Intel iGPU Inference Deep Dive: Xe2 Architecture, oneDNN, and OpenVINO (40 resources)

🌐 网站

Intel Xe2 Architecture — Intel

intel.com · 来源: Xe2 GPU 架构

🌐 网站

Intel Data Center GPU Max Series Architecture — Intel

intel.com · 来源: Xe2 GPU 架构

🌐 网站

oneAPI GPU Optimization Guide — Intel

intel.com · 来源: Xe2 GPU 架构

🌐 网站

oneAPI GPU Optimization Guide — Thread Hierarchy — Intel

intel.com · 来源: Xe2 执行模型与编程抽象

🌐 网站

SYCL 2020 Specification — Khronos Group

registry.khronos.org · 来源: Xe2 执行模型与编程抽象

🌐 网站

Intel GPU Occupancy Calculator — Intel

intel.com · 来源: Xe2 执行模型与编程抽象

🌐 网站

DPC++ Language Extensions for SYCL — Intel

github.com · 来源: Xe2 执行模型与编程抽象

🌐 网站

SPIR-V Specification — Khronos Group

registry.khronos.org · 来源: SPIR-V 编译与 Level Zero 运行时

🌐 网站

oneAPI Level Zero Specification — Intel

spec.oneapi.io · 来源: SPIR-V 编译与 Level Zero 运行时 , NPU 上的 LLM 推理：KV Cache 与软件栈 , NPU 执行模型与编程模型的边界

💻 代码

Intel Graphics Compiler (IGC) — GitHub

github.com · 来源: SPIR-V 编译与 Level Zero 运行时

🌐 网站

SPIR-V Guide — Khronos

github.com · 来源: SPIR-V 编译与 Level Zero 运行时

🌐 网站

oneDNN Developer Guide — Intel

oneapi-src.github.io · 来源: oneDNN Primitive 体系

💻 代码

oneAPI Deep Neural Network Library (oneDNN) — GitHub

github.com · 来源: oneDNN Primitive 体系

🌐 网站

oneDNN Programming Model — Intel

oneapi-src.github.io · 来源: oneDNN Primitive 体系

🌐 网站

Memory Format Propagation — oneDNN

oneapi-src.github.io · 来源: oneDNN Primitive 体系

🌐 网站

oneDNN Performance Profiling and Inspection — Intel

oneapi-src.github.io · 来源: oneDNN GPU Kernel 优化

🌐 网站

oneAPI GPU Optimization Guide — GEMM — Intel

intel.com · 来源: oneDNN GPU Kernel 优化

🌐 网站

XMX and XVE Architecture — Intel

intel.com · 来源: oneDNN GPU Kernel 优化

🌐 网站

OpenVINO Architecture — Intel

docs.openvino.ai · 来源: OpenVINO 图优化 Pipeline

🌐 网站

OpenVINO GPU Plugin — Intel

docs.openvino.ai · 来源: OpenVINO 图优化 Pipeline

💻 代码

OpenVINO Toolkit — GitHub

github.com · 来源: OpenVINO 图优化 Pipeline

🌐 网站

Optimum Intel Documentation

huggingface.co · 来源: Intel 模型优化栈：Optimum Intel / NNCF / OpenVINO 三件套选型 , 动手：HF → GGUF / ONNX / OpenVINO 三条路径端到端

💻 代码

NNCF GitHub Repository

github.com · 来源: Intel 模型优化栈：Optimum Intel / NNCF / OpenVINO 三件套选型

🌐 网站

NNCF API Documentation

openvinotoolkit.github.io · 来源: Intel 模型优化栈：Optimum Intel / NNCF / OpenVINO 三件套选型

🌐 网站

OpenVINO Model Conversion

docs.openvino.ai · 来源: Intel 模型优化栈：Optimum Intel / NNCF / OpenVINO 三件套选型 , 动手：HF → GGUF / ONNX / OpenVINO 三条路径端到端

🌐 网站

Optimum Intel Source - Quantization

github.com · 来源: Intel 模型优化栈：Optimum Intel / NNCF / OpenVINO 三件套选型

🌐 网站

Intel VTune Profiler — GPU Analysis — Intel

intel.com · 来源: 性能分析与瓶颈诊断

🌐 网站

OpenVINO Benchmark Tool — Intel

docs.openvino.ai · 来源: 性能分析与瓶颈诊断

🌐 网站

Intel GPU Top — intel_gpu_top man page

manpages.ubuntu.com · 来源: 性能分析与瓶颈诊断

🌐 网站

OpenVINO Multi-Device Execution — Intel

docs.openvino.ai · 来源: NPU 架构与 GPU+NPU 协同推理

🌐 网站

OpenVINO AUTO Device — Intel

docs.openvino.ai · 来源: NPU 架构与 GPU+NPU 协同推理

🌐 网站

Intel NPU Device — OpenVINO Documentation

docs.openvino.ai · 来源: NPU 架构与 GPU+NPU 协同推理 , NPU 上的 LLM 推理：KV Cache 与软件栈

🌐 网站

Heterogeneous Execution — OpenVINO Docs

docs.openvino.ai · 来源: NPU 架构与 GPU+NPU 协同推理

🌐 网站

OpenVINO GenAI — Stateful LLM Pipeline

docs.openvino.ai · 来源: NPU 上的 LLM 推理：KV Cache 与软件栈

💻 代码

openvinotoolkit/npu_compiler — GitHub

github.com · 来源: NPU 上的 LLM 推理：KV Cache 与软件栈 , NPU 执行模型与编程模型的边界

📄 论文

Flash Attention — Tri Dao et al.

arxiv.org · 来源: NPU 执行模型与编程模型的边界

🌐 网站

CUTLASS 3.0 & CuTe — NVIDIA

github.com · 来源: NPU 执行模型与编程模型的边界

💻 代码

llama.cpp GitHub Repository

github.com · 来源: 动手：HF → GGUF / ONNX / OpenVINO 三条路径端到端

🌐 网站

ONNX Runtime Documentation

onnxruntime.ai · 来源: 动手：HF → GGUF / ONNX / OpenVINO 三条路径端到端

🌐 网站

lm-evaluation-harness

github.com · 来源: 动手：HF → GGUF / ONNX / OpenVINO 三条路径端到端

Matrix Mathematics: From Foundational Theory to Modern AI Architectures (127 resources)

🌐 网站

MIT 18.06 Linear Algebra (Gilbert Strang)

ocw.mit.edu · 来源: 矩阵数学全景图：ML 的通用语言 , 核心性质速查：概念关系图与公式速查表 , 数据矩阵分解概述：问题、工具与方法谱系 , 向量空间的几何：内积、投影、秩与子空间 , 矩阵结构的几何：二次型、正定性与协方差 , 特征分解与对角化：万物之基 , 奇异值分解：核心中的核心 , 矩阵范数、内积与条件数：度量的艺术 , 矩阵微积分：从 Jacobian 到损失曲面 , 优化算法：从梯度下降到牛顿法 , PCA 与 Eigenfaces：从方差最大化到人脸识别 , 随机化 SVD：当精确分解算不动的时候 , NMF：非负约束下的 Parts-Based 分解 , Word2Vec 与 GloVe：隐式 vs 显式矩阵分解 , 算子矩阵全景：当矩阵不再装数据 , 马尔可夫链与转移矩阵：当矩阵编码概率 , 连续时间线性系统与 Kalman 滤波：从离散步进到平滑流动 , PageRank 与幂迭代：图上的马尔可夫链 , 随机游走与图嵌入：DeepWalk/Node2Vec , Kernel 矩阵与再生核：数据定义的给定算子 , 图 Laplacian 与谱聚类：从图结构到最优分割 , 图扩散、热核与 GNN 消息传递：从热方程到图神经网络

📄 论文

The Matrix Cookbook

math.uwaterloo.ca · 来源: 矩阵数学全景图：ML 的通用语言 , 核心性质速查：概念关系图与公式速查表 , 数据矩阵分解概述：问题、工具与方法谱系 , 向量空间的几何：内积、投影、秩与子空间 , 矩阵结构的几何：二次型、正定性与协方差 , 特征分解与对角化：万物之基 , 奇异值分解：核心中的核心 , 矩阵范数、内积与条件数：度量的艺术 , 矩阵微积分：从 Jacobian 到损失曲面 , PCA 与 Eigenfaces：从方差最大化到人脸识别 , 算子矩阵全景：当矩阵不再装数据 , 马尔可夫链与转移矩阵：当矩阵编码概率 , 隐马尔可夫模型：当状态看不见 , 连续时间线性系统与 Kalman 滤波：从离散步进到平滑流动

📄 论文

Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

arxiv.org · 来源: 矩阵数学全景图：ML 的通用语言 , 数据矩阵分解概述：问题、工具与方法谱系 , 奇异值分解：核心中的核心 , 随机化 SVD：当精确分解算不动的时候

📄 论文

Exact Matrix Completion via Convex Optimization (Candès & Recht, 2009)

arxiv.org · 来源: 数据矩阵分解概述：问题、工具与方法谱系 , 矩阵范数、内积与条件数：度量的艺术 , 矩阵补全：从极少观测恢复低秩矩阵 , MF 与 FM：协同过滤的矩阵分解视角 , Robust PCA：低秩 + 稀疏分解

📄 论文

Robust Principal Component Analysis? (Candès, Li, Ma, Wright, 2011)

arxiv.org · 来源: 数据矩阵分解概述：问题、工具与方法谱系 , Robust PCA：低秩 + 稀疏分解

📄 论文

Efficient Estimation of Word Representations in Vector Space (Mikolov et al., 2013)

arxiv.org · 来源: 数据矩阵分解概述：问题、工具与方法谱系 , Word2Vec 与 GloVe：隐式 vs 显式矩阵分解

📄 论文

A Tutorial on Principal Component Analysis (Shlens, 2014)

arxiv.org · 来源: 数据矩阵分解概述：问题、工具与方法谱系 , 特征分解与对角化：万物之基 , PCA 与 Eigenfaces：从方差最大化到人脸识别 , Kernel 矩阵与再生核：数据定义的给定算子

🌐 网站

Inner product space — Wikipedia

en.wikipedia.org · 来源: 向量空间的几何：内积、投影、秩与子空间

🌐 网站

Projection (linear algebra) — Wikipedia

en.wikipedia.org · 来源: 向量空间的几何：内积、投影、秩与子空间

🌐 网站

Rank (linear algebra) — Wikipedia

en.wikipedia.org · 来源: 向量空间的几何：内积、投影、秩与子空间

🌐 网站

Kernel (linear algebra) — Wikipedia

en.wikipedia.org · 来源: 向量空间的几何：内积、投影、秩与子空间

🌐 网站

Orthogonal matrix — Wikipedia

en.wikipedia.org · 来源: 向量空间的几何：内积、投影、秩与子空间

🌐 网站

Essence of Linear Algebra — 3Blue1Brown

3blue1brown.com · 来源: 向量空间的几何：内积、投影、秩与子空间

🌐 网站

Quadratic form — Wikipedia

en.wikipedia.org · 来源: 矩阵结构的几何：二次型、正定性与协方差

🌐 网站

Definite matrix — Wikipedia

en.wikipedia.org · 来源: 矩阵结构的几何：二次型、正定性与协方差

🌐 网站

Covariance matrix — Wikipedia

en.wikipedia.org · 来源: 矩阵结构的几何：二次型、正定性与协方差

🌐 网站

Gramian matrix — Wikipedia

en.wikipedia.org · 来源: 矩阵结构的几何：二次型、正定性与协方差

🌐 网站

Determinant — Wikipedia

en.wikipedia.org · 来源: 矩阵结构的几何：二次型、正定性与协方差

🌐 网站

Interactive Linear Algebra — Eigenvalues and Eigenvectors (Georgia Tech)

textbooks.math.gatech.edu · 来源: 特征分解与对角化：万物之基

🌐 网站

Singular Value Decomposition — Wikipedia

en.wikipedia.org · 来源: 奇异值分解：核心中的核心

🌐 网站

Moore–Penrose Inverse — Wikipedia

en.wikipedia.org · 来源: 奇异值分解：核心中的核心

📄 论文

Implicit Regularization in Matrix Factorization (Gunasekar et al., 2017)

arxiv.org · 来源: 奇异值分解：核心中的核心

🌐 网站

Matrix norm — Wikipedia

en.wikipedia.org · 来源: 矩阵范数、内积与条件数：度量的艺术

🌐 网站

Condition number — Wikipedia

en.wikipedia.org · 来源: 矩阵范数、内积与条件数：度量的艺术

📄 论文

Spectral Normalization for Generative Adversarial Networks (Miyato et al., 2018)

arxiv.org · 来源: 矩阵范数、内积与条件数：度量的艺术

🌐 网站

What Is a Condition Number? (Nick Higham, 2020)

nhigham.com · 来源: 矩阵范数、内积与条件数：度量的艺术

🌐 网站

Matrix calculus — Wikipedia

en.wikipedia.org · 来源: 矩阵微积分：从 Jacobian 到损失曲面

🌐 网站

Jacobian matrix and determinant — Wikipedia

en.wikipedia.org · 来源: 矩阵微积分：从 Jacobian 到损失曲面

🌐 网站

Hessian matrix — Wikipedia

en.wikipedia.org · 来源: 矩阵微积分：从 Jacobian 到损失曲面

📄 论文

Measuring the Intrinsic Dimension of Objective Landscapes (Li et al., 2018)

arxiv.org · 来源: 矩阵微积分：从 Jacobian 到损失曲面 , 学习算子中的低秩结构：为什么神经网络权重是低秩的？ , LoRA：低秩分解在 LLM 微调中的应用

🌐 网站

Gradient descent — Wikipedia

en.wikipedia.org · 来源: 优化算法：从梯度下降到牛顿法

🌐 网站

Newton's method in optimization — Wikipedia

en.wikipedia.org · 来源: 优化算法：从梯度下降到牛顿法

📄 论文

Optimization Methods for Large-Scale Machine Learning (Bottou, Curtis, Nocedal, 2018)

arxiv.org · 来源: 优化算法：从梯度下降到牛顿法

📄 论文

Adam: A Method for Stochastic Optimization (Kingma & Ba, 2015)

arxiv.org · 来源: 优化算法：从梯度下降到牛顿法

📖 书籍

Numerical Optimization (Nocedal & Wright, 2006)

link.springer.com · 来源: 优化算法：从梯度下降到牛顿法

📄 论文

Eigenfaces for Recognition (Turk & Pentland, 1991)

face-rec.org · 来源: PCA 与 Eigenfaces：从方差最大化到人脸识别

🌐 网站

Principal component analysis — Wikipedia

en.wikipedia.org · 来源: PCA 与 Eigenfaces：从方差最大化到人脸识别

🌐 网站

Eigenface — Wikipedia

en.wikipedia.org · 来源: PCA 与 Eigenfaces：从方差最大化到人脸识别

📄 论文

Lecture Notes on Randomized Linear Algebra (Mahoney, 2016)

arxiv.org · 来源: 随机化 SVD：当精确分解算不动的时候

🌐 网站

Johnson-Lindenstrauss lemma — Wikipedia

en.wikipedia.org · 来源: 随机化 SVD：当精确分解算不动的时候

🌐 网站

Random projection — Wikipedia

en.wikipedia.org · 来源: 随机化 SVD：当精确分解算不动的时候

📄 论文

The Power of Convex Relaxation: Near-Optimal Matrix Completion (Candès & Tao, 2010)

arxiv.org · 来源: 矩阵补全：从极少观测恢复低秩矩阵

📄 论文

A Simpler Approach to Matrix Completion (Recht, 2011)

arxiv.org · 来源: 矩阵补全：从极少观测恢复低秩矩阵

📄 论文

A Singular Value Thresholding Algorithm for Matrix Completion (Cai, Candès & Shen, 2010)

arxiv.org · 来源: 矩阵补全：从极少观测恢复低秩矩阵

📄 论文

Low-rank Matrix Completion using Alternating Minimization (Jain, Netrapalli & Sanghavi, 2013)

arxiv.org · 来源: 矩阵补全：从极少观测恢复低秩矩阵

📄 论文

Restricted Strong Convexity and Weighted Matrix Completion: Optimal Bounds with Noise (Negahban & Wainwright, 2012)

arxiv.org · 来源: 矩阵补全：从极少观测恢复低秩矩阵

🌐 网站

Netflix Prize — Wikipedia

en.wikipedia.org · 来源: 矩阵补全：从极少观测恢复低秩矩阵

🌐 网站

Matrix completion — Wikipedia

en.wikipedia.org · 来源: 矩阵补全：从极少观测恢复低秩矩阵

📄 论文

Learning the parts of objects by non-negative matrix factorization (Lee & Seung, 1999)

doi.org · 来源: NMF：非负约束下的 Parts-Based 分解

📄 论文

Algorithms for Non-negative Matrix Factorization (Lee & Seung, 2000)

papers.nips.cc · 来源: NMF：非负约束下的 Parts-Based 分解

📄 论文

The Why and How of Nonnegative Matrix Factorization (Gillis, 2014)

arxiv.org · 来源: NMF：非负约束下的 Parts-Based 分解

🌐 网站

Non-negative matrix factorization — Wikipedia

en.wikipedia.org · 来源: NMF：非负约束下的 Parts-Based 分解

📄 论文

Matrix Factorization Techniques for Recommender Systems (Koren, Bell, Volinsky, 2009)

doi.org · 来源: MF 与 FM：协同过滤的矩阵分解视角

📄 论文

Factorization Machines (Rendle, 2010)

ismll.uni-hildesheim.de · 来源: MF 与 FM：协同过滤的矩阵分解视角

📄 论文

Factorization Machines with libFM (Rendle, 2012)

dl.acm.org · 来源: MF 与 FM：协同过滤的矩阵分解视角

🌐 网站

Matrix factorization (recommender systems) — Wikipedia

en.wikipedia.org · 来源: MF 与 FM：协同过滤的矩阵分解视角

📄 论文

Neural Word Embedding as Implicit Matrix Factorization (Levy & Goldberg, 2014)

papers.nips.cc · 来源: Word2Vec 与 GloVe：隐式 vs 显式矩阵分解 , 随机游走与图嵌入：DeepWalk/Node2Vec

📄 论文

Distributed Representations of Words and Phrases and their Compositionality (Mikolov et al., 2013)

arxiv.org · 来源: Word2Vec 与 GloVe：隐式 vs 显式矩阵分解

📄 论文

GloVe: Global Vectors for Word Representation (Pennington, Socher & Manning, 2014)

nlp.stanford.edu · 来源: Word2Vec 与 GloVe：隐式 vs 显式矩阵分解

📄 论文

Improving Distributional Similarity with Lessons Learned from Word Embeddings (Levy, Goldberg & Dagan, 2015)

aclanthology.org · 来源: Word2Vec 与 GloVe：隐式 vs 显式矩阵分解

📄 论文

word2vec Explained: Deriving Mikolov et al.'s Negative-Sampling Word-Embedding Method (Goldberg & Levy, 2014)

arxiv.org · 来源: Word2Vec 与 GloVe：隐式 vs 显式矩阵分解

📄 论文

The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices (Lin, Chen, Ma, 2010)

arxiv.org · 来源: Robust PCA：低秩 + 稀疏分解

📄 论文

Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers (Boyd, Parikh, Chu, Peleato, Eckstein, 2011)

stanford.edu · 来源: Robust PCA：低秩 + 稀疏分解

📄 论文

Stable Principal Component Pursuit (Zhou, Li, Wright, Candès, Ma, 2010)

arxiv.org · 来源: Robust PCA：低秩 + 稀疏分解

🌐 网站

Robust principal component analysis — Wikipedia

en.wikipedia.org · 来源: Robust PCA：低秩 + 稀疏分解

📄 论文

Tensor Decompositions and Applications (Kolda & Bader, 2009)

doi.org · 来源: 张量分解与知识图谱嵌入：从二维到高阶

📄 论文

Embedding Entities and Relations for Learning and Inference in Knowledge Bases (Yang et al., 2015) — DistMult

arxiv.org · 来源: 张量分解与知识图谱嵌入：从二维到高阶

📄 论文

Complex Embeddings for Simple Link Prediction (Trouillon et al., 2016) — ComplEx

arxiv.org · 来源: 张量分解与知识图谱嵌入：从二维到高阶

📄 论文

A Three-Way Model for Collective Learning on Multi-Relational Data (Nickel et al., 2011) — RESCAL

icml.cc · 来源: 张量分解与知识图谱嵌入：从二维到高阶

📄 论文

Analysis of individual differences in multidimensional scaling via an n-way generalization of 'Eckart-Young' decomposition (Carroll & Chang, 1970)

doi.org · 来源: 张量分解与知识图谱嵌入：从二维到高阶

🌐 网站

Tensor decomposition — Wikipedia

en.wikipedia.org · 来源: 张量分解与知识图谱嵌入：从二维到高阶

📄 论文

A Tutorial on Spectral Clustering (von Luxburg, 2007)

arxiv.org · 来源: 算子矩阵全景：当矩阵不再装数据 , 图 Laplacian 与谱聚类：从图结构到最优分割

📄 论文

An Intuitive Tutorial to Gaussian Process Regression (Wang, 2020)

arxiv.org · 来源: 算子矩阵全景：当矩阵不再装数据

📄 论文

Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Gu & Dao, 2023)

arxiv.org · 来源: 算子矩阵全景：当矩阵不再装数据 , 连续时间线性系统与 Kalman 滤波：从离散步进到平滑流动 , 学习算子中的低秩结构：为什么神经网络权重是低秩的？ , SSM / Mamba：矩阵对角化的胜利

📄 论文

A Comprehensive Survey on Graph Neural Networks (Wu et al., 2019)

arxiv.org · 来源: 算子矩阵全景：当矩阵不再装数据 , 图扩散、热核与 GNN 消息传递：从热方程到图神经网络

📖 书籍

Markov Chains and Mixing Times (Levin, Peres & Wilmer, 2nd ed., AMS, 2017)

pages.uoregon.edu · 来源: 马尔可夫链与转移矩阵：当矩阵编码概率 , PageRank 与幂迭代：图上的马尔可夫链

🌐 网站

Perron–Frobenius Theorem — Wikipedia

en.wikipedia.org · 来源: 马尔可夫链与转移矩阵：当矩阵编码概率

🌐 网站

Markov Chain — Wikipedia

en.wikipedia.org · 来源: 马尔可夫链与转移矩阵：当矩阵编码概率 , 随机游走与图嵌入：DeepWalk/Node2Vec

📄 论文

An Introduction to MCMC for Machine Learning (Andrieu et al., 2003)

link.springer.com · 来源: 马尔可夫链与转移矩阵：当矩阵编码概率

📄 论文

A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition (Rabiner, 1989)

cs.ubc.ca · 来源: 隐马尔可夫模型：当状态看不见

📄 论文

An Introduction to Hidden Markov Models (Rabiner & Juang, IEEE ASSP Magazine, 1986)

ieeexplore.ieee.org · 来源: 隐马尔可夫模型：当状态看不见

📖 书籍

Biological Sequence Analysis (Durbin, Eddy, Krogh & Mitchison, Cambridge University Press, 1998)

cambridge.org · 来源: 隐马尔可夫模型：当状态看不见

🌐 网站

Hidden Markov Model — Wikipedia

en.wikipedia.org · 来源: 隐马尔可夫模型：当状态看不见

🌐 网站

MIT 6.864 Advanced Natural Language Processing — HMM Lecture Notes

ocw.mit.edu · 来源: 隐马尔可夫模型：当状态看不见

📖 书籍

Feedback Systems: An Introduction for Scientists and Engineers (Åström & Murray, Princeton University Press, 2021)

cds.caltech.edu · 来源: 连续时间线性系统与 Kalman 滤波：从离散步进到平滑流动

📄 论文

A New Approach to Linear Filtering and Prediction Problems (Kalman, 1960)

cs.unc.edu · 来源: 连续时间线性系统与 Kalman 滤波：从离散步进到平滑流动

📄 论文

Efficiently Modeling Long Sequences with Structured State Spaces (Gu, Goel & Ré, ICLR 2022)

arxiv.org · 来源: 连续时间线性系统与 Kalman 滤波：从离散步进到平滑流动 , SSM / Mamba：矩阵对角化的胜利

🌐 网站

Kalman Filter — Wikipedia

en.wikipedia.org · 来源: 连续时间线性系统与 Kalman 滤波：从离散步进到平滑流动

📄 论文

The Anatomy of a Large-Scale Hypertextual Web Search Engine (Brin & Page, 1998)

infolab.stanford.edu · 来源: PageRank 与幂迭代：图上的马尔可夫链

📄 论文

The PageRank Citation Ranking: Bringing Order to the Web (Page et al., 1999)

ilpubs.stanford.edu · 来源: PageRank 与幂迭代：图上的马尔可夫链

🌐 网站

PageRank — Wikipedia

en.wikipedia.org · 来源: PageRank 与幂迭代：图上的马尔可夫链

📄 论文

Deeper Inside PageRank (Langville & Meyer, 2004)

doi.org · 来源: PageRank 与幂迭代：图上的马尔可夫链

📄 论文

DeepWalk: Online Learning of Social Representations (Perozzi, Al-Rfou & Skiena, 2014)

arxiv.org · 来源: 随机游走与图嵌入：DeepWalk/Node2Vec

📄 论文

node2vec: Scalable Feature Learning for Networks (Grover & Leskovec, 2016)

arxiv.org · 来源: 随机游走与图嵌入：DeepWalk/Node2Vec

📄 论文

Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec (Qiu et al., 2018)

arxiv.org · 来源: 随机游走与图嵌入：DeepWalk/Node2Vec

📄 论文

DeepWalk is equivalent to computing a specific matrix factorization (Yang & Qiu, 2015, see Qiu et al. 2018 for unified treatment)

dl.acm.org · 来源: 随机游走与图嵌入：DeepWalk/Node2Vec

🌐 网站

Learning with Kernels (Schölkopf & Smola, 2002)

en.wikipedia.org · 来源: Kernel 矩阵与再生核：数据定义的给定算子

📖 书籍

Gaussian Processes for Machine Learning (Rasmussen & Williams, 2006)

gaussianprocess.org · 来源: Kernel 矩阵与再生核：数据定义的给定算子

🌐 网站

Mercer's theorem — Wikipedia

en.wikipedia.org · 来源: Kernel 矩阵与再生核：数据定义的给定算子

🌐 网站

Kernel Principal Component Analysis (Schölkopf, Smola & Müller, 1998)

en.wikipedia.org · 来源: Kernel 矩阵与再生核：数据定义的给定算子

🌐 网站

Reproducing kernel Hilbert space — Wikipedia

en.wikipedia.org · 来源: Kernel 矩阵与再生核：数据定义的给定算子

📄 论文

Algebraic connectivity of graphs (Fiedler, 1973)

doi.org · 来源: 图 Laplacian 与谱聚类：从图结构到最优分割

📄 论文

Normalized Cuts and Image Segmentation (Shi & Malik, 2000)

people.eecs.berkeley.edu · 来源: 图 Laplacian 与谱聚类：从图结构到最优分割

📄 论文

On Spectral Clustering: Analysis and an Algorithm (Ng, Jordan & Weiss, 2001)

papers.nips.cc · 来源: 图 Laplacian 与谱聚类：从图结构到最优分割

📄 论文

Using the Nyström Method to Speed Up Kernel Machines (Williams & Seeger, 2001)

proceedings.neurips.cc · 来源: 图 Laplacian 与谱聚类：从图结构到最优分割

📄 论文

Semi-Supervised Classification with Graph Convolutional Networks (Kipf & Welling, ICLR 2017)

arxiv.org · 来源: 图扩散、热核与 GNN 消息传递：从热方程到图神经网络

📄 论文

Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (Defferrard, Bresson & Vandergheynst, NeurIPS 2016)

arxiv.org · 来源: 图扩散、热核与 GNN 消息传递：从热方程到图神经网络

📄 论文

Diffusion Kernels on Graphs and Other Discrete Structures (Kondor & Lafferty, ICML 2002)

cs.cmu.edu · 来源: 图扩散、热核与 GNN 消息传递：从热方程到图神经网络

📄 论文

The Emerging Field of Signal Processing on Graphs (Shuman, Narang, Frossard, Ortega & Vandergheynst, 2013)

arxiv.org · 来源: 图扩散、热核与 GNN 消息传递：从热方程到图神经网络

📄 论文

Denoising Diffusion Probabilistic Models (Ho, Jain & Abbeel, NeurIPS 2020)

arxiv.org · 来源: 图扩散、热核与 GNN 消息传递：从热方程到图神经网络

🌐 网站

Heat Equation — Wikipedia

en.wikipedia.org · 来源: 图扩散、热核与 GNN 消息传递：从热方程到图神经网络

📄 论文

Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning (Aghajanyan et al., ACL 2021)

arxiv.org · 来源: 学习算子中的低秩结构：为什么神经网络权重是低秩的？ , LoRA：低秩分解在 LLM 微调中的应用

📄 论文

LoRA: Low-Rank Adaptation of Large Language Models (Hu et al., ICLR 2022)

arxiv.org · 来源: 学习算子中的低秩结构：为什么神经网络权重是低秩的？ , LoRA：低秩分解在 LLM 微调中的应用

📄 论文

Traditional and Heavy-Tailed Self Regularization in Neural Network Models (Martin & Mahoney, 2019)

arxiv.org · 来源: 学习算子中的低秩结构：为什么神经网络权重是低秩的？ , Attention 的低秩结构与 Efficient Attention

📄 论文

Gradient Descent Happens in a Tiny Subspace (Gur-Ari et al., 2018)

arxiv.org · 来源: 学习算子中的低秩结构：为什么神经网络权重是低秩的？

📄 论文

Rethinking Attention with Performers (Choromanski et al., ICLR 2021)

arxiv.org · 来源: 学习算子中的低秩结构：为什么神经网络权重是低秩的？ , Attention 的低秩结构与 Efficient Attention

📄 论文

QLoRA: Efficient Finetuning of Quantized LLMs (Dettmers et al., NeurIPS 2023)

arxiv.org · 来源: LoRA：低秩分解在 LLM 微调中的应用

📄 论文

LoRA+: Efficient Low Rank Adaptation of Large Models (Hayou et al., ICML 2024)

arxiv.org · 来源: LoRA：低秩分解在 LLM 微调中的应用

📄 论文

A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA (Kalajdzievski, 2023)

arxiv.org · 来源: LoRA：低秩分解在 LLM 微调中的应用

📄 论文

LQ-LoRA: Low-Rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning (Guo et al., ICLR 2024)

arxiv.org · 来源: LoRA：低秩分解在 LLM 微调中的应用

📄 论文

Attention Is All You Need (Vaswani et al., NeurIPS 2017)

arxiv.org · 来源: Attention 的低秩结构与 Efficient Attention

📄 论文

Linformer: Self-Attention with Linear Complexity (Wang et al., 2020)

arxiv.org · 来源: Attention 的低秩结构与 Efficient Attention

📄 论文

A Unified Taxonomy and Evaluation of Efficient Transformers (Tay et al., 2022)

arxiv.org · 来源: Attention 的低秩结构与 Efficient Attention

📄 论文

HiPPO: Recurrent Memory with Optimal Polynomial Projections (Gu et al., NeurIPS 2020)

arxiv.org · 来源: SSM / Mamba：矩阵对角化的胜利

📄 论文

On the Parameterization and Initialization of Diagonal State Space Models (Gu et al., NeurIPS 2022)

arxiv.org · 来源: SSM / Mamba：矩阵对角化的胜利

📄 论文

It's Raw! Audio Generation with State-Space Models (Goel et al., ICML 2022)

arxiv.org · 来源: SSM / Mamba：矩阵对角化的胜利

📄 论文

How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections (Gu et al., ICLR 2023)

arxiv.org · 来源: SSM / Mamba：矩阵对角化的胜利

Graph Algorithms: From Structure Exploration to Combinatorial Optimization (162 resources)

🌐 网站

Introduction to Algorithms (CLRS), 4th Edition — Part VI: Graph Algorithms

mitpress.mit.edu · 来源: 图算法全景图：从结构探索到组合优化 , BFS 与 DFS：图的两种基本呼吸方式 , 连通性：图能拆成几块？ , 拓扑排序与 DAG：有依赖时的合法顺序 , 欧拉与哈密顿：遍历的两种完备性 , 树上算法：图的特殊骨架 , 最短路径：图上的距离 , 中心性：谁最重要？ , 社区发现：哪些节点抱团？ , 团与密子图：最紧密的子群 , 最小生成树：最便宜地连通所有人 , 网络流：管道能通多少？ , 匹配：最优配对 , 着色与划分：最少几种颜色？ , NP-hard 与近似算法：当最优解算不出来 , 图建模案例集：这个问题其实是图问题

🌐 网站

Graph Theory (Reinhard Diestel), 5th Edition — Free online version

diestel-graph-theory.com · 来源: 图算法全景图：从结构探索到组合优化 , BFS 与 DFS：图的两种基本呼吸方式 , 连通性：图能拆成几块？ , 拓扑排序与 DAG：有依赖时的合法顺序 , 欧拉与哈密顿：遍历的两种完备性 , 树上算法：图的特殊骨架 , 团与密子图：最紧密的子群 , 着色与划分：最少几种颜色？

🌐 网站

Network Science (Albert-László Barabási) — Free online textbook

networksciencebook.com · 来源: 图算法全景图：从结构探索到组合优化 , 随机图与网络模型：真实网络长什么样？

🌐 网站

NetworkX Documentation

networkx.org · 来源: 图算法全景图：从结构探索到组合优化

🌐 网站

Jeff Erickson — Algorithms (2019), Chapter 5: Whatever-First Search

jeffe.cs.illinois.edu · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

🌐 网站

Sedgewick, R. — Algorithms in C, Part 5: Graph Algorithms (1990/2003)

cs.princeton.edu · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架

🌐 网站

Sedgewick, R. & Wayne, K. — Algorithms, 4th Edition (2011)

algs4.cs.princeton.edu · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Pingali, K. et al. — The Tao of Parallelism in Algorithms (PLDI 2011)

doi.org · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Kildall, G. — A Unified Approach to Global Program Optimization (POPL 1973)

doi.org · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Cousot, P. & Cousot, R. — Abstract Interpretation: A Unified Lattice Model (POPL 1977)

doi.org · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Dijkstra, E.W. et al. — On-the-fly Garbage Collection: An Exercise in Cooperation (CACM 1978)

doi.org · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Dijkstra, E.W. — A note on two problems in connexion with graphs (1959)

doi.org · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

🌐 网站

Pearl, J. — Probabilistic Reasoning in Intelligent Systems (1988)

dl.acm.org · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Tarski, A. — A lattice-theoretical fixpoint theorem and its applications (1955)

doi.org · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Mohri, M. — Semiring Frameworks and Algorithms for Shortest-Distance Problems (2002)

doi.org · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

🌐 网站

Russell, S. & Norvig, P. — Artificial Intelligence: A Modern Approach, 4th Edition (2020)

aima.cs.berkeley.edu · 来源: 图上的通用迭代机器（上）：从数学问题到求解框架 , 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Xu, K. et al. — How Powerful are Graph Neural Networks? (ICLR 2019)

arxiv.org · 来源: 图上的通用迭代机器（下）：范式、领域与边界 , 图嵌入与图神经网络：把图变成向量

📄 论文

Gondran, M. & Minoux, M. — Graphs, Dioids and Semirings (2008)

link.springer.com · 来源: 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Malewicz, G. et al. — Pregel: A System for Large-Scale Graph Processing (SIGMOD 2010)

doi.org · 来源: 图上的通用迭代机器（下）：范式、领域与边界

📄 论文

Kipf, T.N. & Welling, M. — Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017)

arxiv.org · 来源: 图上的通用迭代机器（下）：范式、领域与边界 , 图嵌入与图神经网络：把图变成向量

🌐 网站

Algorithms, 4th Edition (Sedgewick & Wayne) — Section 4.1-4.2: Undirected/Directed Graphs

algs4.cs.princeton.edu · 来源: BFS 与 DFS：图的两种基本呼吸方式 , 树上算法：图的特殊骨架

🌐 网站

BFS and DFS — Wikipedia

en.wikipedia.org · 来源: BFS 与 DFS：图的两种基本呼吸方式

🌐 网站

NetworkX Documentation — Traversal

networkx.org · 来源: BFS 与 DFS：图的两种基本呼吸方式

🌐 网站

Algorithms, 4th Edition (Sedgewick & Wayne) — Section 4.2: Directed Graphs

algs4.cs.princeton.edu · 来源: 连通性：图能拆成几块？ , 拓扑排序与 DAG：有依赖时的合法顺序

🌐 网站

Robert Tarjan — Depth-First Search and Linear Graph Algorithms (1972)

doi.org · 来源: 连通性：图能拆成几块？

🌐 网站

NetworkX Documentation — Components

networkx.org · 来源: 连通性：图能拆成几块？

🌐 网站

2-SAT — CP-Algorithms

cp-algorithms.com · 来源: 连通性：图能拆成几块？

🌐 网站

Kahn, Arthur B. (1962) — Topological sorting of large networks, Communications of the ACM

dl.acm.org · 来源: 拓扑排序与 DAG：有依赖时的合法顺序

🌐 网站

Topological sorting — Wikipedia

en.wikipedia.org · 来源: 拓扑排序与 DAG：有依赖时的合法顺序

🌐 网站

NetworkX Documentation — DAG algorithms

networkx.org · 来源: 拓扑排序与 DAG：有依赖时的合法顺序

🌐 网站

Critical Path Method — Wikipedia

en.wikipedia.org · 来源: 拓扑排序与 DAG：有依赖时的合法顺序

🌐 网站

Euler, Leonhard (1736) — Solutio problematis ad geometriam situs pertinentis

scholarlycommons.pacific.edu · 来源: 欧拉与哈密顿：遍历的两种完备性

🌐 网站

Hierholzer, Carl (1873) — Ueber die Möglichkeit, einen Linienzug ohne Wiederholung und ohne Unterbrechung zu umfahren

en.wikipedia.org · 来源: 欧拉与哈密顿：遍历的两种完备性

🌐 网站

Hamiltonian path problem — Wikipedia

en.wikipedia.org · 来源: 欧拉与哈密顿：遍历的两种完备性

🌐 网站

Karp, Richard M. (1972) — Reducibility Among Combinatorial Problems

link.springer.com · 来源: 欧拉与哈密顿：遍历的两种完备性

🌐 网站

Pevzner, P.A. et al. (2001) — An Eulerian path approach to DNA fragment assembly

doi.org · 来源: 欧拉与哈密顿：遍历的两种完备性

🌐 网站

Chinese Postman Problem — Wikipedia

en.wikipedia.org · 来源: 欧拉与哈密顿：遍历的两种完备性

🌐 网站

Travelling salesman problem — Wikipedia

en.wikipedia.org · 来源: 欧拉与哈密顿：遍历的两种完备性 , NP-hard 与近似算法：当最优解算不出来

🌐 网站

NetworkX Documentation — Eulerian circuits

networkx.org · 来源: 欧拉与哈密顿：遍历的两种完备性

🌐 网站

Bender & Farach-Colton (2000) — The LCA Problem Revisited

ics.uci.edu · 来源: 树上算法：图的特殊骨架

🌐 网站

Tarjan, Robert E. (1979) — Applications of Path Compression on Balanced Trees, Journal of the ACM

dl.acm.org · 来源: 树上算法：图的特殊骨架

🌐 网站

Lengauer, Thomas; Tarjan, Robert E. (1979) — A Fast Algorithm for Finding Dominators in a Flowgraph

dl.acm.org · 来源: 树上算法：图的特殊骨架

🌐 网站

Lowest common ancestor — Wikipedia

en.wikipedia.org · 来源: 树上算法：图的特殊骨架

🌐 网站

Heavy path decomposition — Wikipedia

en.wikipedia.org · 来源: 树上算法：图的特殊骨架

🌐 网站

Dominator (graph theory) — Wikipedia

en.wikipedia.org · 来源: 树上算法：图的特殊骨架

🌐 网站

NetworkX Documentation — Tree algorithms

networkx.org · 来源: 树上算法：图的特殊骨架 , 最小生成树：最便宜地连通所有人

🌐 网站

Algorithms, 4th Edition (Sedgewick & Wayne) — Section 4.4: Shortest Paths

algs4.cs.princeton.edu · 来源: 最短路径：图上的距离

🌐 网站

Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1(1), 269-271

link.springer.com · 来源: 最短路径：图上的距离

🌐 网站

Bellman, Richard (1958). On a routing problem. Quarterly of Applied Mathematics, 16(1), 87-90

ams.org · 来源: 最短路径：图上的距离

🌐 网站

Floyd, Robert W. (1962). Algorithm 97: Shortest path. Communications of the ACM, 5(6), 345

dl.acm.org · 来源: 最短路径：图上的距离

🌐 网站

Hart, P. E., Nilsson, N. J., & Raphael, B. (1968). A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on SSC, 4(2), 100-107

ieeexplore.ieee.org · 来源: 最短路径：图上的距离

🌐 网站

Dijkstra's algorithm — Wikipedia

en.wikipedia.org · 来源: 最短路径：图上的距离

🌐 网站

Bellman-Ford algorithm — Wikipedia

en.wikipedia.org · 来源: 最短路径：图上的距离

🌐 网站

Floyd-Warshall algorithm — Wikipedia

en.wikipedia.org · 来源: 最短路径：图上的距离

🌐 网站

A* search algorithm — Wikipedia

en.wikipedia.org · 来源: 最短路径：图上的距离

🌐 网站

NetworkX Documentation — Shortest Paths

networkx.org · 来源: 最短路径：图上的距离

🌐 网站

Brandes, Ulrik (2001). A Faster Algorithm for Betweenness Centrality. Journal of Mathematical Sociology, 25(2), 163-177

doi.org · 来源: 中心性：谁最重要？

🌐 网站

Page, Lawrence; Brin, Sergey; Motwani, Rajeev; Winograd, Terry (1999). The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab Technical Report

ilpubs.stanford.edu · 来源: 中心性：谁最重要？

🌐 网站

Freeman, Linton C. (1977). A set of measures of centrality based on betweenness. Sociometry, 40(1), 35-41

doi.org · 来源: 中心性：谁最重要？

🌐 网站

Katz, Leo (1953). A new status index derived from sociometric analysis. Psychometrika, 18(1), 39-43

doi.org · 来源: 中心性：谁最重要？

🌐 网站

Bonacich, Phillip (1987). Power and centrality: A family of measures. American Journal of Sociology, 92(5), 1170-1182

doi.org · 来源: 中心性：谁最重要？

🌐 网站

Centrality — Wikipedia

en.wikipedia.org · 来源: 中心性：谁最重要？

🌐 网站

NetworkX Documentation — Centrality

networkx.org · 来源: 中心性：谁最重要？

🌐 网站

Neo4j Graph Data Science — Centrality Algorithms

neo4j.com · 来源: 中心性：谁最重要？

📄 论文

Fast unfolding of communities in large networks (Blondel et al., 2008) — Louvain algorithm

arxiv.org · 来源: 社区发现：哪些节点抱团？

📄 论文

Near linear time algorithm to detect community structures in large-scale networks (Raghavan et al., 2007) — Label Propagation

arxiv.org · 来源: 社区发现：哪些节点抱团？

📄 论文

Modularity and community structure in networks (Newman, 2006)

arxiv.org · 来源: 社区发现：哪些节点抱团？

📄 论文

Resolution limit in community detection (Fortunato & Barthélemy, 2007)

doi.org · 来源: 社区发现：哪些节点抱团？

📄 论文

From Louvain to Leiden: guaranteeing well-connected communities (Traag et al., 2019)

arxiv.org · 来源: 社区发现：哪些节点抱团？

🌐 网站

NetworkX Community Detection Documentation

networkx.org · 来源: 社区发现：哪些节点抱团？

🌐 网站

Network Science (Barabási) — Chapter 9: Communities

networksciencebook.com · 来源: 社区发现：哪些节点抱团？

📄 论文

Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks (Morris et al., 2019)

arxiv.org · 来源: 相似性与同构：两个图/节点有多像？ , 图嵌入与图神经网络：把图变成向量

📄 论文

Graph Isomorphism in Quasipolynomial Time (Babai, 2016)

arxiv.org · 来源: 相似性与同构：两个图/节点有多像？

📄 论文

SimRank: A Measure of Structural-Context Similarity (Jeh & Widom, 2002)

dl.acm.org · 来源: 相似性与同构：两个图/节点有多像？

📄 论文

Weisfeiler-Lehman Graph Kernels (Shervashidze et al., 2011)

jmlr.org · Source: Similarity and Isomorphism: How Alike Are Two Graphs or Nodes?

📄 Paper

An Improved Algorithm for Matching Large Graphs — VF2 (Cordella et al., 2004)

ieeexplore.ieee.org · Source: Similarity and Isomorphism: How Alike Are Two Graphs or Nodes?

🌐 Website

NetworkX — Graph Isomorphism

networkx.org · Source: Similarity and Isomorphism: How Alike Are Two Graphs or Nodes?

🌐 Website

GraKeL — Graph Kernels Library

ysig.github.io · Source: Similarity and Isomorphism: How Alike Are Two Graphs or Nodes?

🌐 Website

Algorithm Design (Kleinberg & Tardos) — Chapter 13: Randomized Algorithms

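cs.cornell.edu · Source: Cliques and Dense Subgraphs: The Most Tightly Knit Subgroups, Matching: Optimal Pairing, NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed, Graph Modeling Casebook: This Problem Is Really a Graph Problem

📄 Paper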

Bron, Kerbosch — Algorithm 457: Finding All Cliques of an Undirected Graph (1973)

doi.org · Source: Cliques and Dense Subgraphs: The Most Tightly Knit Subgroups

📄 Paper

Eppstein, Löffler, Strash — Listing All Maximal Cliques in Large Sparse Real-World Graphs (2013)

doi.org · Source: Cliques and Dense Subgraphs: The Most Tightly Knit Subgroups

📄 Paper

Batagelj, Zaveršnik — An O(m) Algorithm for Cores Decomposition of Networks (2003)

arxiv.org · Source: Cliques and Dense Subgraphs: The Most Tightly Knit Subgroups

📄 Paper

Cohen — Trusses: Cohesive Subgraphs for Social Network Analysis (2008)

doi.org · Source: Cliques and Dense Subgraphs: The Most Tightly Knit Subgroups

🌐 Website

NetworkX Documentation — Cliques

networkx.org · Source: Cliques and Dense Subgraphs: The Most Tightly Knit Subgroups

🌐 Website

Algorithms, 4th Edition (Sedgewick & Wayne) — Section 4.3: Minimum Spanning Trees

algs4.cs.princeton.edu · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Kruskal, Joseph B. (1956). On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem. Proceedings of the AMS, 7(1), 48-50

doi.org · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Prim, R. C. (1957). Shortest Connection Networks And Some Generalizations. Bell System Technical Journal, 36(6), 1389-1401

doi.org · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Borůvka, Otakar (1926). O jistém problému minimálním. Práce Moravské Přírodovědecké Společnosti, 3, 37-58

en.wikipedia.org · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Tarjan, Robert E. (1975). Efficiency of a Good But Not Linear Set Union Algorithm. JACM, 22(2), 215-225

doi.org · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Minimum spanning tree — Wikipedia

en.wikipedia.org · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Matroid — Wikipedia

en.wikipedia.org · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Christofides, Nicos (1976). Worst-Case Analysis of a New Heuristic for the Travelling Salesman Problem. Technical Report

en.wikipedia.org · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Steiner tree problem — Wikipedia

en.wikipedia.org · Source: Minimum Spanning Trees: Connecting Everyone at the Lowest Cost

🌐 Website

Ford, L. R. & Fulkerson, D. R. (1956). Maximal flow through a network. Canadian Journal of Mathematics, 8, 399-404

doi.org · Source: Network Flow: How Much Can the Pipes Carry?

🌐 Website

Edmonds, J. & Karp, R. M. (1972). Theoretical improvements in algorithmic efficiency for network flow problems. Journal of the ACM, 19(2), 248-264

dl.acm.org · Source: Network Flow: How Much Can the Pipes Carry?

🌐 Website

Dinic, E. A. (1970). Algorithm for solution of a problem of maximum flow in networks with power estimation. Soviet Mathematics Doklady, 11, 1277-1280

en.wikipedia.org · Source: Network Flow: How Much Can the Pipes Carry?

🌐 Website

Maximum flow problem — Wikipedia

en.wikipedia.org · Source: Network Flow: How Much Can the Pipes Carry?

🌐 Website

Max-flow min-cut theorem — Wikipedia

en.wikipedia.org · Source: Network Flow: How Much Can the Pipes Carry?

🌐 Website

NetworkX Documentation — Flow Algorithms

networkx.org · Source: Network Flow: How Much Can the Pipes Carry?

🌐 Website

Boykov, Y. & Kolmogorov, V. (2004). An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE TPAMI, 26(9), 1124-1137

ieeexplore.ieee.org · Source: Network Flow: How Much Can the Pipes Carry?

📄 Paper

Hopcroft, J. E. & Karp, R. M. (1973). An n^(5/2) Algorithm for Maximum Matchings in Bipartite Graphs. SIAM J. Comput., 2(4), 225-231

doi.org · Source: Matching: Optimal Pairing

📄 Paper

Edmonds, J. (1965). Paths, Trees, and Flowers. Canadian Journal of Mathematics, 17, 449-467

doi.org · Source: Matching: Optimal Pairing

📄 Paper

Kuhn, H. W. (1955). The Hungarian Method for the Assignment Problem. Naval Research Logistics Quarterly, 2(1-2), 83-97

doi.org · Source: Matching: Optimal Pairing

📄 Paper

Hall, P. (1935). On Representatives of Subsets. Journal of the London Mathematical Society, s1-10(1), 26-30

doi.org · Source: Matching: Optimal Pairing

📄 Paper

König, D. (1931). Gráfok és mátrixok. Matematikai és Fizikai Lapok, 38, 116-119

en.wikipedia.org · Source: Matching: Optimal Pairing

📄 Paper

Gale, D. & Shapley, L. S. (1962). College Admissions and the Stability of Marriage. The American Mathematical Monthly, 69(1), 9-15

doi.org · Source: Matching: Optimal Pairing

🌐 Website

Matching (graph theory) — Wikipedia

en.wikipedia.org · Source: Matching: Optimal Pairing

🌐 Website

NetworkX Documentation — Matching Algorithms

networkx.org · Source: Matching: Optimal Pairing

📄 Paper

Brélaz — New Methods to Color the Vertices of a Graph (1979)

doi.org · Source: Coloring and Partitioning: How Few Colors Suffice?

📄 Paper

Brooks — On Colouring the Nodes of a Network (1941)

doi.org · Source: Coloring and Partitioning: How Few Colors Suffice?

📄 Paper

Appel, Haken — Every Planar Map is Four Colorable (1976)

doi.org · Source: Coloring and Partitioning: How Few Colors Suffice?

📄 Paper

Karypis, Kumar — A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs (1998)

doi.org · Source: Coloring and Partitioning: How Few Colors Suffice?, Graph Modeling Casebook: This Problem Is Really a Graph Problem

🌐 Website

Kuratowski's Theorem — Wikipedia

en.wikipedia.org · Source: Coloring and Partitioning: How Few Colors Suffice?

🌐 Website

NetworkX Documentation — Coloring

networkx.org · Source: Coloring and Partitioning: How Few Colors Suffice?

🌐 Website

METIS — Serial Graph Partitioning and Fill-reducing Matrix Ordering

github.com · Source: Coloring and Partitioning: How Few Colors Suffice?

📄 Paper

Christofides, N. (1976). Worst-case analysis of a new heuristic for the travelling salesman problem. Technical Report 388, Graduate School of Industrial Administration, CMU

doi.org · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

📄 Paper

Karlin, Klein, Oveis Gharan (2021). A (Slightly) Improved Approximation Algorithm for Metric TSP. STOC 2021

doi.org · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

📄 Paper

Vazirani, V. V. (2001). Approximation Algorithms. Springer

link.springer.com · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

📄 Paper

Downey, R. G. & Fellows, M. R. (1999). Parameterized Complexity. Springer

link.springer.com · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

📄 Paper

Courcelle, B. (1990). The Monadic Second-Order Logic of Graphs. I. Recognizable Sets of Finite Graphs. Information and Computation, 85(1), 12-75

doi.org · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

🌐 Website

Approximation algorithm — Wikipedia

en.wikipedia.org · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

🌐 Website

Fixed-parameter tractability — Wikipedia

en.wikipedia.org · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

🌐 Website

Treewidth — Wikipedia

en.wikipedia.org · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

🌐 Website

NetworkX Documentation — Approximation Algorithms

networkx.org · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

🌐 Website

Concorde TSP Solver

math.uwaterloo.ca · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

🌐 Website

Google OR-Tools — Vehicle Routing

developers.google.com · Source: NP-hard Problems and Approximation Algorithms: When the Optimum Cannot Be Computed

📄 Paper

Erdős, P. & Rényi, A. (1959). On Random Graphs I. Publicationes Mathematicae, 6, 290-297

renyi.hu · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

📄 Paper

Erdős, P. & Rényi, A. (1960). On the Evolution of Random Graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5, 17-61

renyi.hu · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

📄 Paper

Watts, D. J. & Strogatz, S. H. (1998). Collective dynamics of 'small-world' networks. Nature, 393, 440-442

doi.org · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

📄 Paper

Barabási, A.-L. & Albert, R. (1999). Emergence of Scaling in Random Networks. Science, 286, 509-512

doi.org · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

📄 Paper

Albert, R., Jeong, H. & Barabási, A.-L. (2000). Error and attack tolerance of complex networks. Nature, 406, 378-382

doi.org · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

📄 Paper

Newman, M. E. J. (2003). The Structure and Function of Complex Networks. SIAM Review, 45(2), 167-256

doi.org · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

🌐 Website

Random graph — Wikipedia

en.wikipedia.org · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

🌐 Website

Small-world network — Wikipedia

en.wikipedia.org · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

🌐 Website

Barabási-Albert model — Wikipedia

en.wikipedia.org · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

🌐 Website

NetworkX Documentation — Graph Generators

networkx.org · Source: Random Graphs and Network Models: What Do Real Networks Look Like?

📖 Book

Probabilistic Graphical Models: Principles and Techniques (Koller & Friedman, 2009)

mitpress.mit.edu · Source: Probabilistic Graphical Models: Reasoning Under Uncertainty on Graphs

📖 Book

Pattern Recognition and Machine Learning, Chapter 8: Graphical Models (Bishop, 2006)

microsoft.com · Source: Probabilistic Graphical Models: Reasoning Under Uncertainty on Graphs

📄 Paper

An Introduction to Variational Methods for Graphical Models (Jordan et al., 1999)

doi.org · Source: Probabilistic Graphical Models: Reasoning Under Uncertainty on Graphs

📄 Paper

Constructing Free-Energy Approximations and Generalized Belief Propagation Algorithms (Yedidia, Freeman & Weiss, 2005)

doi.org · Source: Probabilistic Graphical Models: Reasoning Under Uncertainty on Graphs

📄 Paper

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data (Lafferty, McCallum & Pereira, 2001)

repository.upenn.edu · Source: Probabilistic Graphical Models: Reasoning Under Uncertainty on Graphs

🌐 Website

pgmpy — Probabilistic Graphical Models library for Python

pgmpy.org · Source: Probabilistic Graphical Models: Reasoning Under Uncertainty on Graphs

🌐 Website

Bayesian network — Wikipedia

en.wikipedia.org · Source: Probabilistic Graphical Models: Reasoning Under Uncertainty on Graphs

🌐 Website

Belief propagation — Wikipedia

en.wikipedia.org · Source: Probabilistic Graphical Models: Reasoning Under Uncertainty on Graphs

📄 Paper

DeepWalk: Online Learning of Social Representations (Perozzi et al., 2014)

arxiv.org · Source: Graph Embeddings and Graph Neural Networks: Turning Graphs into Vectors

📄 Paper

node2vec: Scalable Feature Learning for Networks (Grover & Leskovec, 2016)

arxiv.org · Source: Graph Embeddings and Graph Neural Networks: Turning Graphs into Vectors

📄 Paper

LINE: Large-scale Information Network Embedding (Tang et al., 2015)

arxiv.org · Source: Graph Embeddings and Graph Neural Networks: Turning Graphs into Vectors

📄 Paper

Graph Attention Networks (Veličković et al., 2018)

arxiv.org · Source: Graph Embeddings and Graph Neural Networks: Turning Graphs into Vectors

📄 Paper

Inductive Representation Learning on Large Graphs — GraphSAGE (Hamilton et al., 2017)

arxiv.org · Source: Graph Embeddings and Graph Neural Networks: Turning Graphs into Vectors

🌐 Website

PyTorch Geometric (PyG) Documentation

pytorch-geometric.readthedocs.io · Source: Graph Embeddings and Graph Neural Networks: Turning Graphs into Vectors

🌐 Website

Deep Graph Library (DGL) Documentation

dgl.ai · Source: Graph Embeddings and Graph Neural Networks: Turning Graphs into Vectors

🌐 Website

Stanford CS224W: Machine Learning with Graphs

web.stanford.edu · Source: Graph Embeddings and Graph Neural Networks: Turning Graphs into Vectors

📄 Paper

Aho, A. V., Lam, M. S., Sethi, R., & Ullman, J. D. (2006). Compilers: Principles, Techniques, and Tools (2nd Edition). Addison-Wesley

suif.stanford.edu · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

📄 Paper

Chaitin, G. J. (1982). Register Allocation & Spilling via Graph Coloring. SIGPLAN Notices, 17(6), 98-101

doi.org · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

📄 Paper

Compeau, P. E. C., Pevzner, P. A. & Tesler, G. (2011). How to apply de Bruijn graphs to genome assembly. Nature Biotechnology, 29, 987-991

doi.org · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

📄 Paper

Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd Edition). Cambridge University Press

doi.org · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

📄 Paper

Kempe, D., Kleinberg, J. & Tardos, É. (2003). Maximizing the Spread of Influence through a Social Network. KDD 2003

doi.org · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

🌐 Website

NetworkX Documentation — Algorithms

networkx.org · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

🌐 Website

Google OR-Tools — Optimization

developers.google.com · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

🌐 Website

Graph neural network — Wikipedia

en.wikipedia.org · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

🌐 Website

Causal inference — Wikipedia

en.wikipedia.org · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem

🌐 Website

De Bruijn graph — Wikipedia

en.wikipedia.org · Source: Graph Modeling Casebook: This Problem Is Really a Graph Problem
