Oh, you said trained. If it's trained, the long-context issue may not be as severe. It might still go mad if you let it eat too much of a hundred-page lawsuit in one go, but if you feed it portions at a time (much as you'd window the input for a transformer anyway), RWKV can be vastly more economical than the larger models (requiring a much less powerful GPU, or even running on no GPU at all, thanks to rwkv.cpp).
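To make the "portions at a time" point concrete, here's a minimal sketch of why a recurrent model like RWKV can stream a long document cheaply: the only thing carried between chunks is a fixed-size state, so memory stays flat no matter how long the input gets. Everything here (rwkv_step, the decay constant, the chunk size) is a toy stand-in, not the real rwkv.cpp API.

    from typing import Iterator, List, Tuple

    def chunks(tokens: List[int], size: int) -> Iterator[List[int]]:
        # Yield the document one fixed-size slice at a time.
        for i in range(0, len(tokens), size):
            yield tokens[i:i + size]

    def rwkv_step(token: int, state: float) -> Tuple[float, float]:
        # Toy stand-in for one recurrent step. The real model carries a
        # fixed-size tensor state per layer; a single float keeps the shape
        # of the idea: decay the old memory, mix in the new token.
        new_state = 0.99 * state + 0.01 * token
        logits = new_state  # dummy output
        return logits, new_state

    def process_document(tokens: List[int], chunk_size: int = 4096) -> float:
        state = 0.0
        for chunk in chunks(tokens, chunk_size):
            for tok in chunk:
                _, state = rwkv_step(tok, state)
            # Between chunks the state can be checkpointed, summarized, or
            # reset; memory use never grows with document length.
        return state

    final_state = process_document(list(range(100_000)))

Contrast with a transformer, where attending over the whole lawsuit means a KV cache (and compute) that grows with every page; here the per-step cost is constant.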
rwkv.cpp in particular depends on a project (ggml) that would not have existed in its current form without LLaMA, even though ggml itself isn't LLaMA-specific. However, there are enough other implementations of CPU inference (at least two?) that I think RWKV could still exist even if LLaMA had never existed.