Open Access

ARTICLE

Evaluating Domain Randomization Techniques in DRL Agents: A Comparative Study of Normal, Randomized, and Non-Randomized Resets

Abubakar Elsafi*

Department of Software Engineering, College of Computer Science and Engineering, University of Jeddah, Jeddah, 21959, Saudi Arabia

* Corresponding Author: Abubakar Elsafi. Email: email

Computer Modeling in Engineering & Sciences 2025, 144(2), 1749-1766. https://doi.org/10.32604/cmes.2025.066449

Abstract

Domain randomization is a widely adopted technique in deep reinforcement learning (DRL) for improving agent generalization by exposing policies to diverse environmental conditions. This paper investigates the impact of three reset strategies (normal, non-randomized, and randomized) on agent performance using the Deep Deterministic Policy Gradient (DDPG) and Twin Delayed DDPG (TD3) algorithms in the CarRacing-v2 environment. Two experimental setups were used: an extended training regime with DDPG for 1000 steps per episode across 1000 episodes, and a fast-execution setup comparing DDPG and TD3 over 30 episodes of 50 steps each under constrained computational resources. A step-based reward scaling mechanism was applied under the randomized reset condition to promote broader state exploration. Experimental results show that randomized resets significantly enhance learning efficiency and generalization, with DDPG demonstrating superior performance across all reset strategies. In particular, DDPG combined with randomized resets achieves the highest smoothed rewards (approximately 15), the best stability, and the fastest convergence. These differences are statistically significant, as confirmed by t-tests: DDPG outperforms TD3 under the randomized (t = −101.91, p < 0.0001), normal (t = −21.59, p < 0.0001), and non-randomized (t = −62.46, p < 0.0001) reset conditions. The findings underscore the critical role of reset strategy and reward shaping in enhancing the robustness and adaptability of DRL agents in continuous control tasks, particularly where computational efficiency and training stability are crucial.
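The paper's exact reset and reward-scaling code is not reproduced on this page; the sketch below illustrates what the three reset strategies and a step-based reward scaling could look like, assuming Gymnasium-style seed-on-reset semantics. `DummyEnv`, `make_reset`, `scaled_reward`, and the `1 + scale * step` scaling form are hypothetical names and choices for illustration, not the authors' implementation (real experiments would use the Gymnasium `CarRacing-v2` environment).

```python
import random


class DummyEnv:
    """Minimal stand-in for an environment such as CarRacing-v2.

    Records the seed passed at reset so the three strategies can be compared
    without installing Gymnasium/Box2D.
    """

    def __init__(self):
        self.seed_used = None

    def reset(self, seed=None):
        self.seed_used = seed
        return 0.0  # dummy initial observation


def make_reset(strategy, fixed_seed=0):
    """Return a reset function implementing one of the three strategies."""

    def reset(env):
        if strategy == "normal":
            # Library default: no explicit seed, environment varies freely.
            return env.reset()
        if strategy == "non_randomized":
            # Same seed every episode: identical initial conditions.
            return env.reset(seed=fixed_seed)
        if strategy == "randomized":
            # Fresh explicit seed each episode: deliberately varied starts.
            return env.reset(seed=random.randrange(10**6))
        raise ValueError(f"unknown strategy: {strategy}")

    return reset


def scaled_reward(reward, step, scale=0.01):
    """Hypothetical step-based scaling: rewards earned later in an episode
    weigh more, encouraging the agent to survive and explore further."""
    return reward * (1.0 + scale * step)
```

Under this sketch, the "randomized" strategy is what implements domain randomization at the reset level, while "non-randomized" serves as a fixed-conditions control and "normal" as the library baseline.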

Keywords

DDPG agent; TD3 agent; deep reinforcement learning; domain randomization; generalization; non-randomized reset; normal reset; randomized reset

Cite This Article

APA Style
Elsafi, A. (2025). Evaluating Domain Randomization Techniques in DRL Agents: A Comparative Study of Normal, Randomized, and Non-Randomized Resets. Computer Modeling in Engineering & Sciences, 144(2), 1749–1766. https://doi.org/10.32604/cmes.2025.066449
Vancouver Style
Elsafi A. Evaluating Domain Randomization Techniques in DRL Agents: A Comparative Study of Normal, Randomized, and Non-Randomized Resets. Comput Model Eng Sci. 2025;144(2):1749–1766. https://doi.org/10.32604/cmes.2025.066449
IEEE Style
A. Elsafi, “Evaluating Domain Randomization Techniques in DRL Agents: A Comparative Study of Normal, Randomized, and Non-Randomized Resets,” Comput. Model. Eng. Sci., vol. 144, no. 2, pp. 1749–1766, 2025. https://doi.org/10.32604/cmes.2025.066449



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.