Open Access

REVIEW

Anime Generation through Diffusion and Language Models: A Comprehensive Survey of Techniques and Trends

Yujie Wu1, Xing Deng1,*, Haijian Shao1, Ke Cheng1, Ming Zhang1, Yingtao Jiang2, Fei Wang1

1 School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, 212003, China
2 Department of Electrical and Computer Engineering, University of Nevada, Las Vegas, NV 89154, USA

* Corresponding Author: Xing Deng.

Computer Modeling in Engineering & Sciences 2025, 144(3), 2709-2778. https://doi.org/10.32604/cmes.2025.066647

Abstract

The application of generative artificial intelligence (AI) is bringing about notable changes in anime creation. This paper surveys recent advances and applications of diffusion and language models in anime generation, focusing on their demonstrated potential to enhance production efficiency through automation and personalization. Despite these benefits, the substantial initial computational investment required to train and deploy these models must be acknowledged. We conduct an in-depth survey of cutting-edge generative AI technologies, encompassing models such as Stable Diffusion and GPT, and appraise pivotal large-scale datasets alongside quantitative evaluation metrics. The surveyed literature indicates that AI models have reached considerable maturity in synthesizing high-quality, aesthetically compelling anime images from textual prompts, and have made discernible progress in generating coherent narratives. However, achieving perfect long-form consistency, mitigating artifacts such as flickering in video sequences, and enabling fine-grained artistic control remain critical ongoing challenges. Building on these advances, research efforts have increasingly pivoted towards the synthesis of higher-dimensional content, such as video and three-dimensional assets, with recent studies demonstrating significant progress in this emerging field. Nevertheless, formidable challenges remain. Foremost among these are the substantial computational demands of training and deploying these sophisticated models, which are particularly pronounced in high-dimensional generation such as video synthesis. Additional persistent hurdles include maintaining spatial-temporal consistency across complex scenes and addressing ethical concerns surrounding bias and the preservation of human creative autonomy.
This survey underscores the transformative potential and inherent complexities of AI-driven synergy within the creative industries. We posit that future research should focus on the synergistic fusion of diffusion and autoregressive models, the integration of multimodal inputs, and the balanced consideration of ethical implications, particularly regarding bias and the preservation of human creative autonomy, thereby establishing a robust foundation for the advancement of anime creation and the broader landscape of AI-driven content generation.

Keywords

Diffusion models; language models; anime generation; image synthesis; video generation; stable diffusion; AIGC

Cite This Article

APA Style
Wu, Y., Deng, X., Shao, H., Cheng, K., Zhang, M. et al. (2025). Anime Generation through Diffusion and Language Models: A Comprehensive Survey of Techniques and Trends. Computer Modeling in Engineering & Sciences, 144(3), 2709–2778. https://doi.org/10.32604/cmes.2025.066647
Vancouver Style
Wu Y, Deng X, Shao H, Cheng K, Zhang M, Jiang Y, et al. Anime Generation through Diffusion and Language Models: A Comprehensive Survey of Techniques and Trends. Comput Model Eng Sci. 2025;144(3):2709–2778. https://doi.org/10.32604/cmes.2025.066647
IEEE Style
Y. Wu et al., “Anime Generation through Diffusion and Language Models: A Comprehensive Survey of Techniques and Trends,” Comput. Model. Eng. Sci., vol. 144, no. 3, pp. 2709–2778, 2025. https://doi.org/10.32604/cmes.2025.066647



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.