DeepSeek V3 vs. V4 Architecture Infographic
A clean side-by-side technical infographic comparing DeepSeek V3/R1 with DeepSeek V4, including architecture blocks, callout annotations, and a bottom comparison table. The design uses a white background, thin-line strokes, and color-coded highlights in an overall presentation-style layout.
Model: gpt-image-2 · Category: Infographic/Edu Visual · Style: Minimalist · Language: en
Prompt
{"type":"Side-by-side AI architecture comparison infographic","style":"clean technical diagram, white background, thin black outlines, rounded rectangles, dashed annotation boxes, color-coded highlights, presentation-slide style, vector infographic","canvas":{"aspect_ratio":"2:1","resolution":"wide landscape"},"title_row":{"left_title":"DeepSeek V3/R1 (671 billion)","right_title":"DeepSeek V4 (1.2 trillion)","left_title_color":"bright orange-red","right_title_color":"bright blue"},"layout":{"columns":2,"sections":[{"title":"DeepSeek V3/R1 (671 billion)","position":"left half","count":9,"labels":["Vocabulary size of 129k","FeedForward (SwiGLU) module","Intermediate hidden dimension of 2,048","MoE layer","Supported context length of 128k tokens","First 3 blocks use a dense FFN with hidden size 18,432 instead of MoE","Example input text","Embedding dimension of 7,168","128 heads"]},{"title":"DeepSeek V4 (1.2 trillion)","position":"right half","count":9,"labels":["Vocabulary size of 160k","FeedForward (SwiGLU) module","Intermediate hidden dimension of 3,072","MoE layer","Supported context length of 256k tokens","First 3 blocks use a dense FFN with hidden size 24,576 instead of MoE","Example input text","Embedding dimension of 8,192","128 heads"]},{"title":"Bottom comparison table","position":"bottom full width","count":10,"labels":["Total parameters","Active parameters per token","Hidden size","Dense FFN hidden size","DeepSeek V3/R1","Intermediate (FF)","Attention heads","Context length","Embedding dimension","Vocabulary size"]}]},"left_panel":{"background":"very light gray rounded rectangle","main_stack":{"count":8,"blocks":["Tokenized text","Token embedding layer","RMSNorm 1","Multi-head latent attention","RMSNorm 2","MoE","Final RMSNorm","Linear output layer"]},"side_module":"RoPE attached to the left of the attention block","attention_block":{"label":"Multi-head latent attention","accent":"the word Latent in orange-red text"},"feedforward_inset":{"title":"FeedForward (SwiGLU) module","count":4,"blocks":["Linear layer","SiLU activation","Linear layer","Linear layer"],"diagram":"two branches multiplied, then projected"},"moe_inset":{"title":"MoE layer","count":5,"blocks":["Top merge node","FeedForward","FeedForward","Router","Expert-count badge 256"],"details":"small black square marking 1 selected expert, arrows routing upward to the experts, dashed divider"},"annotations":{"vocab":"Vocabulary size of 129k","ff_dim":"Intermediate hidden dimension of 2,048","context":"Supported context length of 128k tokens","dense_first_blocks":"First 3 blocks use a dense FFN with hidden size 18,432 instead of MoE","resource_savings":"Resource savings: the model is 671B, but only 1 (shared) + 8 experts are active per token; only 37B parameters are active per inference step"},"bottom_stats":{"count":9,"items":["Total parameters: 671B","Active parameters per token: 37B (1 + 8 experts)","Hidden size: 7,168","Dense FFN hidden size: 18,432","Intermediate (FF): 2,048","Attention heads: 128","Context length: 128k","Embedding dimension: 7,168","Vocabulary size: 129k"]}},"right_panel":{"background":"very light blue rounded rectangle","main_stack":{"count":8,"blocks":["Tokenized text","Token embedding layer","RMSNorm 1","Multi-head latent attention","RMSNorm 2","MoE","Final RMSNorm","Linear output layer"]},"side_module":"RoPE attached to the left of the attention block","attention_block":{"label":"Multi-head latent attention","accent":"the word Latent in blue text"},"feedforward_inset":{"title":"FeedForward (SwiGLU) module","count":4,"blocks":["Linear layer","SiLU activation","Linear layer","Linear layer"],"diagram":"same as the left pan...
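The resource-savings annotation above (671B total parameters, but only 1 shared + 8 of 256 routed experts active per token, for ~37B active per inference step) can be sanity-checked with simple arithmetic. A minimal sketch; the `non_expert_share` split between always-on weights (attention, embeddings, shared expert) and routed-expert weights is a back-solved illustration, not an official figure:

```python
# MoE active-parameter accounting, using the DeepSeek V3/R1 figures from
# the infographic: 671B total, 256 routed experts, 8 routed active per
# token. non_expert_share is hypothetical, chosen so the result lands
# near the stated ~37B.

TOTAL_PARAMS = 671e9
N_ROUTED_EXPERTS = 256
ACTIVE_ROUTED = 8  # experts selected by the router per token

def active_params(total, n_experts, k_active, non_expert_share=0.025):
    """Rough active-parameter estimate: always-on weights count fully;
    routed-expert weights contribute only k_active / n_experts of
    their total mass."""
    always_on = total * non_expert_share
    expert_mass = total * (1 - non_expert_share)
    return always_on + expert_mass * k_active / n_experts

active = active_params(TOTAL_PARAMS, N_ROUTED_EXPERTS, ACTIVE_ROUTED)
print(f"~{active / 1e9:.0f}B active of 671B per token")  # ~37B
```

The same formula applied to the V4 panel's figures would show why a 1.2T-parameter model can still be served with only a fraction of its weights touched per token.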