Preprint, Oct 17, 2023

Introduction

Motivation

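LLMs still produce factually inaccurate responses.
Retrieval-Augmented Generation (RAG) has mitigated this to some extent, but two problems remain:
1. Non-factual prompts: depending on the prompt, retrieval may not be needed at all. Retrieving anyway reduces the diversity of the LM's answers.
2. Fixed number of documents: always retrieving a fixed number of documents can force the model to use information of low relevance.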

Self-RAG is proposed to improve LLM generation quality
- on-demand retrieval and self-reflection
- introduces a self-check mechanism into the LLM and trains the model to use it
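
The two ideas above (retrieve only when needed, then check what was retrieved) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: in Self-RAG the self-check is learned by the LLM itself, whereas here every callable (`needs_retrieval`, `retrieve`, `is_relevant`, `generate`) and all of the `toy_*` stand-ins are hypothetical names introduced only for this example.

```python
from typing import Callable, List


def generate_on_demand(
    prompt: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Retrieve only when needed, and keep only passages judged relevant."""
    # Non-factual prompt: skip retrieval entirely.
    if not needs_retrieval(prompt):
        return generate(prompt, [])
    # Self-check step: filter passages instead of using a fixed number of them.
    passages = [p for p in retrieve(prompt) if is_relevant(prompt, p)]
    return generate(prompt, passages)


# --- Toy stand-ins (assumptions for illustration only) ---
CORPUS = [
    "Self-RAG adds on-demand retrieval and self-reflection.",
    "Bananas are rich in potassium.",
]


def _keywords(text: str) -> set:
    # Lowercased words longer than three characters, punctuation stripped.
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def toy_needs_retrieval(prompt: str) -> bool:
    # Crude heuristic: treat questions as factual prompts.
    return prompt.rstrip().endswith("?")


def toy_retrieve(prompt: str) -> List[str]:
    # A real retriever would rank passages; this one returns the whole corpus.
    return list(CORPUS)


def toy_is_relevant(prompt: str, passage: str) -> bool:
    # Relevant if the prompt and passage share at least one keyword.
    return bool(_keywords(prompt) & _keywords(passage))


def toy_generate(prompt: str, passages: List[str]) -> str:
    # Placeholder for the LM call; reports how much evidence was used.
    return f"{prompt} [evidence: {len(passages)}]"
```

With these stand-ins, a creative prompt such as "Write a poem about autumn" never triggers retrieval, while a question is answered using only the passages that pass the relevance check.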

Related work

Retrieval-Augmented Generation

Reinforcement Learning from Human Feedback (RLHF)

An LLM training method that uses human feedback; it drove the large performance jump from GPT-3 to ChatGPT.