Cal-QL (Calibrated Q-Learning)

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

A compelling use case of offline reinforcement learning (RL) is to obtain a policy initialization from existing datasets followed by fast online fine-tuning with limited interaction. However, existing offline RL methods tend to behave poorly during fine-tu

arxiv.org

Soft actor-critic (SAC) overestimates its own values too much,

while CQL underestimates its own values too much.

Cal-QL's approach is to build a baseline and make use of it.
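The baseline idea can be sketched roughly as below. This is only a toy illustration in plain Python, not the paper's code: the function names (`monte_carlo_returns`, `cql_regularizer`, `calibrated_regularizer`, `pushed_down_mask`) are made up for this note, and using the dataset's Monte Carlo return-to-go as the baseline is my reading of the Cal-QL paper. The point is that the CQL-style penalty stops pushing Q down once it is already at or below the baseline.

```python
# Toy sketch of a Cal-QL-style calibrated penalty (illustrative names, not the official code).

def monte_carlo_returns(rewards, gamma=0.99):
    """Discounted return-to-go for one trajectory.

    Assumption: this is used as the baseline value V_ref(s) for each state.
    """
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def cql_regularizer(q_policy, q_data):
    """Plain CQL-style penalty: mean Q on policy actions minus mean Q on dataset actions."""
    n = len(q_policy)
    return sum(q_policy) / n - sum(q_data) / n


def calibrated_regularizer(q_policy, q_data, v_ref):
    """Same penalty, but Q on policy actions is clipped from below by the baseline.

    Where Q(s, a) <= V_ref(s), the clipped term no longer depends on Q,
    so minimizing the penalty does not push those values down any further.
    """
    n = len(q_policy)
    clipped = [max(q, v) for q, v in zip(q_policy, v_ref)]
    return sum(clipped) / n - sum(q_data) / n


def pushed_down_mask(q_policy, v_ref):
    """True where the penalty would still push Q down (Q is above the baseline)."""
    return [q > v for q, v in zip(q_policy, v_ref)]


# Tiny made-up example: one 3-step trajectory with a reward only at the end.
v_ref = monte_carlo_returns([0.0, 0.0, 1.0], gamma=0.5)  # [0.25, 0.5, 1.0]
q_policy = [0.0, 2.0, 0.5]   # hypothetical Q(s, a) for actions sampled from the policy
q_data = [0.5, 0.5, 0.5]     # hypothetical Q(s, a) for the dataset actions

print(cql_regularizer(q_policy, q_data))
print(calibrated_regularizer(q_policy, q_data, v_ref))
print(pushed_down_mask(q_policy, v_ref))  # [False, True, False]
```

In this example only the second state has Q above the baseline, so only that one would keep being pushed down. The other two are already at or below the baseline and are left alone, which is how the baseline keeps CQL from becoming overly pessimistic.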


JunHan's AI Factory
