Sorry, what is RLHF? | Hacker News


EVa5I7bHFq9mnYK on March 14, 2023 | on: Microsoft lays off one of its responsible AI teams

Sorry, what is RLHF?

spacebanana7 on March 14, 2023

RLHF is Reinforcement Learning from Human Feedback.

It usually refers to fine-tuning language models using feedback data labelled by humans, such as human rankings of model outputs.

Hugging Face has a good overview in this article: https://huggingface.co/blog/rlhf
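
For a rough sense of how human labels get used: in one common setup, labellers pick the better of two model outputs, and a reward model is trained so that the preferred output scores higher. Here is a minimal sketch of that pairwise loss in plain Python. The function name `preference_loss` and the example scores are hypothetical and only for illustration; they are not taken from any library or from the linked article.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Small when the human-preferred output scores higher than the
    rejected one, large when the ordering is reversed.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of
    # the margin so exp() never receives a large positive argument.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Hypothetical reward-model scores for two candidate answers.
agrees_with_label = preference_loss(2.0, 0.0)     # preferred answer scored higher
disagrees_with_label = preference_loss(0.0, 2.0)  # preferred answer scored lower
print(agrees_with_label < disagrees_with_label)
```

The reward model trained this way is then used as the reward signal when fine-tuning the language model with reinforcement learning.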
