I don't think it meets your ask of "solve this particularly well" but the unlimi...

zaptrem · 2025-03-03T19:32:56 1741030376

I think this is a good start: X high-speed queries per hour, then unlimited low-priority ones after that. Do you know of any specific companies that do this that we could take a look at?
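For concreteness, here is a rough sketch of that tiering in Python. It is only an illustration of the idea, not how any particular service does it: the `TieredQuota` class, the `classify` method, the rolling one-hour window, and the `"fast"`/`"low"` labels are all assumptions made up for this example.

```python
import time
from collections import deque


class TieredQuota:
    """Hypothetical helper: the first `fast_per_hour` requests in a rolling
    window are tagged "fast"; anything beyond that is still accepted but
    tagged "low" so a scheduler can run it at low priority."""

    def __init__(self, fast_per_hour, window_s=3600.0):
        self.fast_per_hour = fast_per_hour
        self.window_s = window_s
        self._fast_times = {}  # user_id -> deque of timestamps of fast requests

    def classify(self, user_id, now=None):
        now = time.monotonic() if now is None else now
        times = self._fast_times.setdefault(user_id, deque())
        # Forget fast requests that have aged out of the rolling window.
        while times and now - times[0] >= self.window_s:
            times.popleft()
        if len(times) < self.fast_per_hour:
            times.append(now)
            return "fast"
        # Over quota: never rejected, just deprioritized.
        return "low"
```

A request handler would call `classify(user_id)` and push the job onto whichever queue the label names. Requests over the quota do not consume anything, so the fast tier comes back as soon as the oldest fast request leaves the window.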

throwup238 · 2025-03-04T01:59:43 1741053583

Remember that you’ve also got a nice natural limitation here: if it’s a hobbyist and not a (commercial) API consumer, there’s only so fast they can listen to the output. Even if they’re rapidly tweaking knobs in a DAW, you can use the play/pause signal to help prioritize the queue, depending on how expensive it is to serialize the GPU state and rehydrate it again. You also might not need to complete generation until the user reaches the play point, so you can shuffle the queue around a lot. For example, if the user skips after ten seconds, you might not need to generate the rest until they try to play that track again, and when they do, you usually have enough time before they reach the previous stopping point to generate some more sections.
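A minimal sketch of what playback-aware ordering could look like, in Python. Everything here is hypothetical: the `Track` fields, the `urgency` and `generation_order` helpers, and the `PAUSED_PENALTY_S` constant are invented for illustration, and the sketch ignores the cost of serializing and rehydrating GPU state, which would decide in practice how often it is worth switching jobs.

```python
from dataclasses import dataclass

# Assumed constant: paused or skipped tracks sort behind every playing track.
PAUSED_PENALTY_S = 3600.0


@dataclass
class Track:
    track_id: str
    playhead_s: float   # where the listener currently is
    generated_s: float  # how much audio has been generated so far
    total_s: float      # full length of the track
    playing: bool       # latest play/pause signal from the client


def urgency(track):
    """Lower value means generate sooner; None means nothing left to do."""
    if track.generated_s >= track.total_s:
        return None
    # Seconds of already-generated audio ahead of the listener.
    headroom = track.generated_s - track.playhead_s
    if not track.playing:
        # Paused or skipped: defer until the user comes back to it.
        return PAUSED_PENALTY_S + headroom
    return headroom


def generation_order(tracks):
    """Return track ids ordered from most to least urgent."""
    scored = [(urgency(t), t.track_id) for t in tracks]
    return [tid for u, tid in sorted(s for s in scored if s[0] is not None)]
```

With this ordering, a playing track that is about to run out of generated audio goes first, a playing track with plenty of buffer waits, and a paused track only gets GPU time once nothing else is pending.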

It might also be helpful to come up with some ways to segregate customers so that “prosumer” users get faster “cold starts” (so that they can iterate faster) at the expense of sometimes having to wait for generation to start back up again.

zoogeny · 2025-03-03T19:36:08 1741030568

runway.ai (video gen) is what I was thinking when I suggested this.