It’s nonsensical to call it “zero-shot” when a sample of the voice is provided. The term “zero-shot cloning” implies you have some representation of the voice from another domain - e.g. a text description of the voice. What they’re doing is ABSOLUTELY one-shot cloning. I don’t care if lots of TTS folks use the term this way; they’re wrong.



