6/5/2023

Super Hexagon final level

In order to train the agent efficiently, a C++ library was written. Noisy networks facilitate the exploration process; the noise is turned off after 500,000 training iterations. The distributional approach significantly increases the performance of the agent, and distributional RL with quantile regression gives similar results. Prioritized experience replay performs better at first; however, after roughly 300,000 training steps the agent trained without prioritized experience replay performs better.
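The noisy-network exploration mentioned above replaces epsilon-greedy action selection with learnable Gaussian noise on the network weights. The post's C++ library is not shown, so the sketch below is a minimal NumPy illustration of a noisy linear layer with factorized Gaussian noise; all names and the `sigma0` initialization are assumptions, not taken from the post.

```python
import numpy as np

def scale_noise(x):
    # Factorized-noise scaling function: sign(x) * sqrt(|x|).
    return np.sign(x) * np.sqrt(np.abs(x))

class NoisyLinear:
    """Illustrative linear layer with factorized Gaussian weight noise.

    The learnable parameters are the means (w_mu, b_mu) and noise scales
    (w_sigma, b_sigma); fresh noise is sampled on every forward pass while
    `noisy` is True.
    """

    def __init__(self, in_features, out_features, sigma0=0.5, rng=None):
        self.rng = rng or np.random.default_rng()
        bound = 1.0 / np.sqrt(in_features)
        self.w_mu = self.rng.uniform(-bound, bound, (out_features, in_features))
        self.b_mu = self.rng.uniform(-bound, bound, out_features)
        self.w_sigma = np.full((out_features, in_features), sigma0 * bound)
        self.b_sigma = np.full(out_features, sigma0 * bound)
        # Set to False to disable exploration noise late in training,
        # as the post does after 500,000 iterations.
        self.noisy = True

    def __call__(self, x):
        if not self.noisy:
            return self.w_mu @ x + self.b_mu
        # Factorized noise: one noise vector per input, one per output,
        # combined via an outer product.
        eps_in = scale_noise(self.rng.standard_normal(self.w_mu.shape[1]))
        eps_out = scale_noise(self.rng.standard_normal(self.w_mu.shape[0]))
        w = self.w_mu + self.w_sigma * np.outer(eps_out, eps_in)
        b = self.b_mu + self.b_sigma * eps_out
        return w @ x + b
```

Turning the noise off, as described in the post, then amounts to flipping `noisy` to `False`, after which the layer behaves as an ordinary deterministic linear layer using only the learned means.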
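The distributional variant mentioned above, quantile regression, trains the network to predict a set of quantiles of the return distribution rather than its mean, using an asymmetric Huber loss. As a hedged sketch (the post does not show its loss code), the standard quantile Huber loss can be written in NumPy like this; the argument layout is an assumption for illustration.

```python
import numpy as np

def quantile_huber_loss(td_errors, taus, kappa=1.0):
    """Quantile Huber loss used in quantile-regression distributional RL.

    td_errors: array of shape (N, M) of pairwise TD errors, where row i
               corresponds to the predicted quantile at level taus[i].
    taus:      array of shape (N,) of quantile levels in (0, 1).
    kappa:     Huber threshold; errors below it are penalized quadratically.
    """
    abs_err = np.abs(td_errors)
    # Standard Huber loss, elementwise.
    huber = np.where(abs_err <= kappa,
                     0.5 * td_errors ** 2,
                     kappa * (abs_err - 0.5 * kappa))
    # Asymmetric weight |tau - 1{td_error < 0}| makes each row regress
    # toward its own quantile level instead of the mean.
    weight = np.abs(taus[:, None] - (td_errors < 0).astype(float))
    return (weight * huber / kappa).mean()
```

The asymmetry is the key design point: over- and under-estimation are penalized with different weights depending on the target quantile, which is what lets a single network head represent the whole return distribution.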