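Generative media compute

Image and video generation depend on GPU throughput, the cost of each denoising iteration, and latent-space compression.

13 papers. Compute paradigm 7 of 10.

English source file: README.md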

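Device/setup

GPU/TPU training setups for high-dimensional image, video, and audio generation, where sampling cost and memory-intensive U-Net/Transformer models dominate.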

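Bottleneck

Stable generative training, high-resolution synthesis, latent-space efficiency, the number of diffusion sampling steps, and multimodal data throughput.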
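To make the sampling-step bottleneck concrete, here is a minimal sketch. It assumes a model trained with 1000 diffusion timesteps and sampled on an evenly spaced subsequence, similar in spirit to the strided sampling discussed for improved DDPMs. The function name `strided_timesteps` and the numbers 1000 and 50 are illustrative assumptions, not values taken from this page.

```python
def strided_timesteps(train_steps: int, sample_steps: int) -> list[int]:
    """Pick `sample_steps` evenly spaced timesteps out of `train_steps`.

    Hypothetical helper: each returned timestep costs one network
    evaluation at sampling time, so fewer timesteps means cheaper sampling.
    """
    if not 1 <= sample_steps <= train_steps:
        raise ValueError("need 1 <= sample_steps <= train_steps")
    if sample_steps == 1:
        return [train_steps - 1]
    stride = (train_steps - 1) / (sample_steps - 1)
    return [round(i * stride) for i in range(sample_steps)]


# Assumed numbers: 1000 training timesteps, 50 sampling timesteps.
schedule = strided_timesteps(train_steps=1000, sample_steps=50)
network_evals = len(schedule)          # one denoiser call per kept timestep
speedup = 1000 / network_evals         # relative to visiting every timestep
```

Under these assumptions the sampler calls the denoiser 50 times instead of 1000, so sampling cost falls roughly in proportion to the number of steps kept. Sample quality at the reduced step count is a separate question that this sketch does not address.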

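Methods that fit

VAE, GAN, DCGAN/pix2pix/CycleGAN/StyleGAN, DDPM/SDE diffusion, latent diffusion, DALL-E-style text-to-image, Improved DDPM, and DiT adapt generative tasks to the available accelerator budget.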

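Methods that became obsolete or less central

Uncompressed pixel-space generation and purely adversarial pipelines stopped being central once diffusion, latent-space, and Transformer setups proved easier to scale.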
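A small worked calculation can show why uncompressed pixel-space generation is expensive compared with working in a latent space. The sketch below assumes a 512×512 RGB image and an autoencoder with spatial downsampling factor 8 and 4 latent channels. These are illustrative settings chosen for the example, not values stated on this page, and `latent_compression` is a hypothetical helper.

```python
def latent_compression(height: int, width: int, channels: int = 3,
                       downsample: int = 8, latent_channels: int = 4):
    """Compare tensor sizes in pixel space and in an assumed latent space.

    Returns (pixel_values, latent_values, ratio). All defaults are
    illustrative assumptions about the autoencoder.
    """
    if height % downsample or width % downsample:
        raise ValueError("height and width must be divisible by downsample")
    pixel_values = height * width * channels
    latent_values = (height // downsample) * (width // downsample) * latent_channels
    return pixel_values, latent_values, pixel_values / latent_values


pixel_values, latent_values, ratio = latent_compression(512, 512)
print(pixel_values, latent_values, ratio)  # 786432 16384 48.0
```

With these assumed settings the denoiser operates on 48 times fewer values per image. The actual saving in compute also depends on the network architecture, which this count does not model.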

Representative papers

| Rank | Year | Paper | Priority | Status |
| --- | --- | --- | --- | --- |
| 75 | 2020 | Denoising Diffusion Probabilistic Models | 5 | downloaded / read_complete |
| 76 | 2021 | High-Resolution Image Synthesis with Latent Diffusion Models | 5 | downloaded / read_complete |
| 77 | 2013 | Auto-Encoding Variational Bayes | 4 | downloaded / read_complete |
| 78 | 2014 | Generative Adversarial Nets | 4 | downloaded / read_complete |
| 79 | 2018 | A Style-Based Generator Architecture for Generative Adversarial Networks | 4 | downloaded / read_complete |
| 80 | 2020 | Score-Based Generative Modeling through Stochastic Differential Equations | 4 | downloaded / read_complete |
| 81 | 2021 | Zero-Shot Text-to-Image Generation | 4 | downloaded / read_complete |
| 82 | 2022 | Scalable Diffusion Models with Transformers | 4 | downloaded / read_complete |
| 83 | 2015 | Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks | 3 | downloaded / read_complete |
| 84 | 2016 | Image-to-Image Translation with Conditional Adversarial Networks | 3 | downloaded / read_complete |
| 85 | 2017 | Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks | 3 | downloaded / read_complete |
| 86 | 2021 | Improved Denoising Diffusion Probabilistic Models | 3 | downloaded / read_complete |
| 132 | 2026 | Qwen3.5-Omni Technical Report | 4 | downloaded / read_complete |

Open questions

  • Distinguish improvements in training compute from improvements in inference/sampling compute in media models.
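One way to keep the two kinds of compute apart is to account for them with separate formulas. The sketch below is a rough bookkeeping aid under stated assumptions: the backward pass is taken to cost about twice the forward pass (a common rule of thumb, not a measured value), and `guidance_passes` stands for extra forward passes per step, such as two when classifier-free guidance is used. All names and numbers are hypothetical.

```python
def training_flops(forward_flops: float, num_examples: int,
                   backward_multiplier: float = 2.0) -> float:
    """Training cost: one forward plus one backward pass per example.

    Assumes backward ~= backward_multiplier * forward (rule of thumb).
    """
    return forward_flops * (1.0 + backward_multiplier) * num_examples


def sampling_flops(forward_flops: float, steps: int, num_samples: int,
                   guidance_passes: int = 1) -> float:
    """Sampling cost: `steps` denoiser calls per sample, no backward pass."""
    return forward_flops * steps * guidance_passes * num_samples


# Toy numbers only, chosen so the arithmetic is easy to follow.
train_cost = training_flops(forward_flops=10.0, num_examples=100)
sample_cost = sampling_flops(forward_flops=10.0, steps=50, num_samples=4)
guided_cost = sampling_flops(forward_flops=10.0, steps=50, num_samples=4,
                             guidance_passes=2)
```

Under this accounting, a change that reduces `steps` improves only sampling cost, a change that reduces `num_examples` improves only training cost, and a cheaper forward pass (for example through latent compression) improves both.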

Related papers: 13