
While progress has been made in computer vision, that progress has been relatively narrow up until now, and I think the activation energy required to produce this level of quality would be more than it's worth. As others have mentioned, new footage comes out all the time.

However, I agree with the sentiment. Someday, we will have a massive foundation model capable of producing any video with a little conditioning on text. But we don't currently have such a model. In some sense, we're still in the era of easily verifiable video, and this era might end someday soon.


