Self-Rewarding Language Models

https://arxiv.org/abs/2401.10020

> We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal. Current approaches commonly train reward models from human preferences, which may then be bottlenecked by human performance…

In short: the idea is to move on from RLHF, where a human does the judging, to a stage where an AI judges another AI. In reinforcement learning this direction had already come up the previous year..
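The "AI judges AI" loop in the paper works roughly like this: the model samples several candidate responses to a prompt, scores each one itself via an LLM-as-a-Judge prompt, and the highest- and lowest-scored responses become a (chosen, rejected) preference pair for DPO training. A minimal toy sketch of that data-construction step, where `generate` and `judge_score` are hypothetical stand-ins for real model calls:

```python
def generate(prompt, n=4):
    # Hypothetical stand-in: a real system would sample n responses
    # from the current model. Here we return toy candidates of
    # varying length so the toy scorer below can distinguish them.
    return [prompt + " -- answer " + "detail. " * i for i in range(n)]

def judge_score(prompt, response):
    # Hypothetical stand-in for the paper's LLM-as-a-Judge step,
    # which asks the model itself to rate a response on a 0-5 scale.
    # Here: a deterministic toy score derived from response length.
    return len(response) % 6

def build_preference_pair(prompt):
    """Score all candidates with the model's own judge and keep the
    best and worst as a (chosen, rejected) pair -- the shape of the
    DPO training data the self-rewarding loop constructs."""
    candidates = generate(prompt)
    ranked = sorted(candidates, key=lambda r: judge_score(prompt, r))
    rejected, chosen = ranked[0], ranked[-1]
    return chosen, rejected

chosen, rejected = build_preference_pair("Explain RLHF briefly.")
print(judge_score("Explain RLHF briefly.", chosen))
print(judge_score("Explain RLHF briefly.", rejected))
```

In the actual method this pair construction and the DPO update are iterated, so the judge improves together with the policy; the sketch above only shows one pass of the data-building step.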