
Sorry, what is RLHF?


RLHF is Reinforcement Learning from Human Feedback.

It usually refers to fine-tuning a language model on preference data labelled by humans.

Hugging Face has a good overview in this article: https://huggingface.co/blog/rlhf
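To make the idea concrete, here is a toy sketch of the feedback loop (my own illustration, not the pipeline from the article: real RLHF trains a learned reward model on human preference labels and then optimizes the language model against it with PPO). A "policy" with one parameter picks between two canned responses, a human-label reward says which one is preferred, and a REINFORCE-style update nudges the policy toward the preferred response:

```python
import math
import random

random.seed(0)

# Two canned "responses"; the reward dict stands in for human labels.
responses = ["helpful answer", "unhelpful answer"]
human_reward = {"helpful answer": 1.0, "unhelpful answer": 0.0}

logit = 0.0  # single policy parameter: preference for responses[0]
lr = 0.5


def p_first(logit):
    """Probability the policy picks responses[0]."""
    return 1.0 / (1.0 + math.exp(-logit))


for step in range(200):
    p = p_first(logit)
    pick = 0 if random.random() < p else 1
    reward = human_reward[responses[pick]]
    # REINFORCE: gradient of log pi(pick) with respect to the logit,
    # scaled by the human-feedback reward.
    grad = (1.0 - p) if pick == 0 else -p
    logit += lr * reward * grad

print(f"P(helpful answer) = {p_first(logit):.2f}")
```

After a couple of hundred updates the policy assigns nearly all its probability to the response humans rewarded, which is the core mechanic RLHF applies at the scale of full language models.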



