New AI Model Boosts Image Generation Efficiency Without Sacrificing Quality
A new framework called SparseDiT has been introduced to address a major efficiency bottleneck in Diffusion Transformer (DiT) models, which are powerful but computationally expensive AI models used for generating images and videos.

The Challenge: DiTs produce high-quality results, but their high computational cost, which comes primarily from token-level self-attention, limits their practical, large-scale use.

The Solution: SparseDiT makes the model dynamically "sparser," cutting unnecessary computation in two ways.

Spatial Sparsity: It uses a three-stage architecture: a Poolingformer at the bottom for efficient global feature extraction, a Sparse-Dense Token Module in the middle to balance global structure and local details, and standard dense Transformer layers at the top to refine fine details.

Temporal Sparsity: It adjusts the number of tokens dynamically over the denoising process. It starts with very few tokens (a high pruning rate) while generating the rough structure and gradually uses more tokens (a lower pruning rate) to add intricate details.

Key Results: Experiments show that SparseDiT maintains high-quality output while substantially improving speed and reducing computational load (FLOPs). In 512x512 image generation, it reduced FLOPs by 55% and increased inference speed by 175%, with only a minimal impact on quality (an FID score increase of 0.09). Similar efficiency gains were demonstrated in video generation and text-to-image tasks.

Significance: SparseDiT provides a scalable way to make high-quality diffusion models more practical for real-world applications by balancing output quality against computational cost.

Resources:
Paper: https://arxiv.org/pdf/2412.06028
Code: https://github.com/changsn/SparseDiT
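The temporal sparsity idea described above (few tokens early in denoising, more tokens later) can be illustrated with a small sketch. This is not the SparseDiT implementation: the linear schedule, the function names `token_schedule` and `relative_attention_cost`, and the 25% starting keep ratio are all assumptions chosen for illustration. The cost estimate relies only on the fact that self-attention cost grows with the square of the token count.

```python
def token_schedule(num_steps, full_tokens, start_keep=0.25, end_keep=1.0):
    """Tokens kept at each denoising step, from the first step to the last.

    Assumption: the keep ratio rises linearly from start_keep to end_keep.
    The actual SparseDiT schedule may differ.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    counts = []
    for step in range(num_steps):
        # Progress through denoising, in [0, 1].
        progress = step / (num_steps - 1) if num_steps > 1 else 1.0
        keep = start_keep + (end_keep - start_keep) * progress
        counts.append(max(1, round(full_tokens * keep)))
    return counts


def relative_attention_cost(counts, full_tokens):
    """Average self-attention cost relative to always using full_tokens.

    Self-attention scales quadratically with token count, so a step
    that keeps n tokens costs (n / full_tokens) ** 2 of a dense step.
    """
    return sum((n / full_tokens) ** 2 for n in counts) / len(counts)


if __name__ == "__main__":
    schedule = token_schedule(num_steps=5, full_tokens=1024)
    print(schedule)
    print(relative_attention_cost(schedule, full_tokens=1024))
```

With these hypothetical settings the schedule keeps 256 tokens at the first step and all 1024 at the last, and the attention cost averaged over the five steps is a little under half that of a fully dense run. The sketch covers attention only and is not meant to reproduce the reported figures. For reference, a 55% FLOPs reduction leaves 45% of the original FLOPs, and a 175% speed increase corresponds to 2.75x the original throughput.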