KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies

Abstract

Soft Actor-Critic (SAC) has achieved notable success in continuous control tasks but struggles in sparse reward settings, where infrequent rewards make efficient exploration challenging. While novelty-based exploration methods address this issue by encouraging the agent to explore novel states, they are not trivial to apply to SAC. In particular, managing the interaction between novelty-based exploration and SAC’s stochastic policy can lead to inefficient exploration and redundant sample collection. In this paper, we propose KEA (Keeping Exploration Alive) which tackles the inefficiencies in balancing exploration strategies when combining SAC with novelty-based exploration. KEA integrates a novelty-augmented SAC with a standard SAC agent, proactively coordinated via a switching mechanism. This coordination allows the agent to maintain stochasticity in high-novelty regions, enhancing exploration efficiency and reducing repeated sample collection. We first analyze this potential issue in a 2D navigation task, and then evaluate KEA on the DeepSea hard-exploration benchmark as well as sparse reward control tasks from the DeepMind Control Suite. Compared to state-of-the-art novelty-based exploration baselines, our experiments show that KEA significantly improves learning efficiency and robustness in sparse reward setups.

KEA

BibTeX


    @inproceedings{yang2025kea,
    title={KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies},
    author={Shih-Min Yang and Martin Magnusson and Johannes A. Stork and Todor Stoyanov},
    booktitle={Forty-second International Conference on Machine Learning},
    year={2025}
    }