
Note that the model is based on RoBERTa and has only 125M parameters. It is not competing against any of the new popular models, not even small ones like Phi or Gemma.


It’s also not meant to be a generative model; it is only to be used as an encoder model (they list retrieval as a potential use case).
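For anyone unfamiliar with encoder-style retrieval, here is a minimal sketch of the idea, assuming the checkpoint loads through the standard Hugging Face AutoModel API. "roberta-base" below is a placeholder for the actual model, and mean pooling is just one common choice, not necessarily what the authors used:

    # Sketch: using an encoder-only model for retrieval via mean-pooled
    # embeddings. "roberta-base" is a stand-in checkpoint name.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModel.from_pretrained("roberta-base")
    model.eval()

    def embed(texts):
        # Tokenize, run the encoder, then mean-pool the last hidden
        # states while masking out padding tokens.
        batch = tokenizer(texts, padding=True, truncation=True,
                          return_tensors="pt")
        with torch.no_grad():
            out = model(**batch).last_hidden_state      # (B, T, H)
        mask = batch["attention_mask"].unsqueeze(-1)    # (B, T, 1)
        summed = (out * mask).sum(dim=1)
        counts = mask.sum(dim=1).clamp(min=1)
        return torch.nn.functional.normalize(summed / counts, dim=-1)

    docs = ["The cat sat on the mat.",
            "Transformers encode text into vectors."]
    query = embed(["How do encoder models represent text?"])
    # Vectors are L2-normalized, so the dot product is cosine similarity.
    scores = query @ embed(docs).T
    print(scores)

You rank documents by the similarity scores instead of generating tokens, which is why a 125M-parameter encoder is not playing the same game as a generative LLM.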


Given the current state of LLMs, I am not even sure this qualifies as an LLM.


Second opinion: the BERT family is transformer-based, and that is a big threshold right there. Secondly, I am not sure two one-minute comments can capture what exactly went on with the fine-tuning, or with graph-based constraint methods and the like, with respect to the fitness of these production tools for their intended purposes.



