Note that the model is based on RoBERTa and has only 125M parameters. It is not competing with any of the newer popular models, not even small ones like Phi or Gemma.
Second opinion: BERT-family models are transformer-based, and that is a significant threshold in itself. Secondly, I am not sure two one-minute comments can capture what exactly went on with the fine-tuning, or with graph-based constraint methods and the like, with respect to whether the production tools are fit for their intended purposes.