Hacker News

Yes, perfection is difficult, but it's relative. Models can definitely be made much safer. Analyses comparing models before and after alignment make this obvious, including those that compare the raw unaligned models to "uncensored" models.

