Flops on CctoctoFX

Recent content in Flops on CctoctoFX: https://pillumina.github.io/tags/flops/

LLM Architecture Computation Methodology: From config.json to Inference GPU Memory
https://pillumina.github.io/posts/aiinfra/llm-computation-methodology/
Mon, 15 Jun 2026 00:00:00 +0000

A complete derivation from config.json to parameter count, FLOPs, KV cache, and inference GPU memory. Based on hands-on breakdowns of 8 open-source models (M2.7 / GLM-5.1 / V4-Flash / Qwen3.5 / Mimo / Kimi / Nemotron / M3). Covers the FLOPs and KV cache formulas for six attention architectures: Full Attention / MSA / MLA / Mamba-2 / SWA / GDN.