Right, but that's exactly what static typing does in most cases - it provides the model with more explicit context for what it's doing.
For the same reason, they tend to handle XML better than JSON - sure, both are trees, but XML has redundancy, and that redundancy helps keep the model on the rails, so to speak.
I actually wonder if the perfect LLM language would be something a lot more like COBOL - not in the sense of being similarly high-level, but rather in its verbosity. And perhaps also in being closer to natural English, which is, after all, still a large part of the model's training set, especially for reasoning stuff. For query languages they seem to like SQL the most of all the things I've tried, and I strongly suspect that it's the same underlying cause.
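To make the redundancy point concrete, here is a rough sketch that serializes the same small tree both ways. The `order` data and the `to_xml` helper are made up for illustration; only `json` and `xml.etree.ElementTree` come from the standard library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data, just for illustration.
order = {"customer": {"name": "Ada", "items": ["pen", "ink"]}}

def to_xml(tag, value):
    """Recursively turn nested dicts/lists/scalars into an XML element."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_xml(key, child))
    elif isinstance(value, list):
        for child in value:
            elem.append(to_xml("item", child))
    else:
        elem.text = str(value)
    return elem

json_text = json.dumps(order)
xml_text = ET.tostring(to_xml("order", order), encoding="unicode")

print(json_text)  # every closer is an anonymous "}" or "]"
print(xml_text)   # every closer repeats the name of what it closes
```

The JSON ends in a run of `}}` that says nothing about what is being closed, while the XML ends in `</items></customer></order>`, restating the nesting on the way out. That repetition is the redundancy I mean.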
Of course, a new language designed like that would have the fundamental problem that there's no existing corpus in such a language. Then again, if it is also designed such that one can reliably convert e.g. from Java to that language, then perhaps we can still pull that off.
> For query languages, at least, they seem to like SQL the most of all the things I've tried, and I strongly suspect that it's the same underlying cause.
The reason they like SQL more than other query languages is primarily that their training data has orders of magnitude more of it than of any other query language, and that advantage is so huge that any other possible advantage would probably have a comparatively negligible effect.
I'm not so sure about that. This was in comparison to e.g. walking object graphs in Python, C#, and JavaScript using the usual APIs (i.e. where querying a one-to-many relationship looks like Foo.Bars rather than a join, and using map/fold/filter or equivalents). I would expect there to be a lot more of that kinda stuff in the training set than SQL.
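For reference, this is roughly the kind of pair I mean, as a minimal sketch: the same one-to-many question asked by walking an object graph and by a SQL join. The `Foo`/`Bar` classes, the table layout, and the `size > 3` condition are all invented for the example; `sqlite3` is only there so the SQL actually runs.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Bar:
    size: int

@dataclass
class Foo:
    name: str
    bars: list = field(default_factory=list)

# Hypothetical data: each Foo owns zero or more Bars.
foos = [Foo("a", [Bar(1), Bar(5)]), Foo("b", [Bar(2)]), Foo("c", [])]

# Object-graph style: follow foo.bars and filter with comprehensions.
graph_result = sorted(f.name for f in foos if any(b.size > 3 for b in f.bars))

# SQL style: the same relationship expressed as an explicit join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY,
                      foo_id INTEGER REFERENCES foo(id),
                      size INTEGER);
""")
for foo_id, f in enumerate(foos, start=1):
    conn.execute("INSERT INTO foo (id, name) VALUES (?, ?)", (foo_id, f.name))
    conn.executemany(
        "INSERT INTO bar (foo_id, size) VALUES (?, ?)",
        [(foo_id, b.size) for b in f.bars],
    )

sql_result = [row[0] for row in conn.execute(
    "SELECT DISTINCT foo.name FROM foo "
    "JOIN bar ON bar.foo_id = foo.id "
    "WHERE bar.size > 3 ORDER BY foo.name"
)]

print(graph_result, sql_result)
```

Both give the same answer here. The SQL version just spells out the relationship and the condition in near-English keywords instead of nested expressions.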