
there’s decent work on the computational reasoning power of transformers, SSMs, etc.

some approximate snippets that come to mind are that decoder-only transformers recognize AC^0 and think in TC^0, that encoder-decoders are strictly more powerful than decoder-only, etc.

the author's last name is Miller iirc, if you poke around on arXiv, plus a few others. it's been a while since this was top of mind, so ymmv on the exact correctness of the above snippets



You are probably thinking of Merrill (whose work is referenced towards the end of the article).


ah yes Merrill thx!



