
What's "perplexity" a measure of? First I've heard of it.


e^loss. It's a bad name for a simple idea: perplexity is just loss restated. When the loss is the average per-token cross-entropy in nats, perplexity = e^loss, so a perplexity curve is the same curve as the loss curve on an exponential scale.
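A minimal sketch of the relationship (my illustration, not from the thread; the function name is made up):

```python
import math

def perplexity(avg_cross_entropy_nats: float) -> float:
    # Perplexity is e^loss when loss is the mean per-token
    # cross-entropy measured in nats.
    return math.exp(avg_cross_entropy_nats)

# A model giving each correct token probability 1/4 has
# loss = ln(4) nats, so perplexity = 4: it's "as confused"
# as choosing uniformly among 4 tokens.
print(perplexity(math.log(4)))  # → 4.0 (up to float rounding)
```

The intuition: a perplexity of N means the model is, on average, as uncertain as if it were picking uniformly among N tokens.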

Loss isn't the whole story, though -- runs where the loss drops steepest often yield the worst-quality language models. You want a gentle, steady downward slope.

SubsimulatorGPT2 (https://reddit.com/r/subsimulatorgpt2) continued to improve in terms of human evaluation even though the loss stayed flat for over a week.



