
...and I have no plans to add NLP tools to gensim. The connection between gensim and tokenizing/tagging/parsing libs is intentionally loose and flexible.

I'm a fan of "do one thing, do it well".

Having said that, it would be great to facilitate "spaCy + gensim" pipelines for users.

For example, word vector representations can be trained easily with gensim on arbitrary user-specified corpora, whereas spaCy loads pre-trained vectors in a specific format. Maybe there's room for some interoperability there?
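To make the "loose coupling" concrete, here is a rough sketch of what the glue could look like: a restartable corpus wrapper that takes any tokenizer callable and yields lists of tokens. The names `TokenizedCorpus` and `regex_tokenize` are made up for illustration, and the regex tokenizer is only a stand-in for whatever tokenizing library you prefer.

```python
import re
from typing import Callable, Iterable, Iterator, List


class TokenizedCorpus:
    """Restartable iterable of token lists (hypothetical helper, not part of gensim or spaCy).

    `tokenize` is any callable mapping a string to a list of string tokens,
    so the tokenizing library can be swapped without touching the training code.
    """

    def __init__(self, texts: Iterable[str], tokenize: Callable[[str], List[str]]):
        self.texts = texts
        self.tokenize = tokenize

    def __iter__(self) -> Iterator[List[str]]:
        # A fresh pass starts on every call, so multi-epoch training can re-read the corpus
        # (as long as `texts` itself is re-iterable, e.g. a list).
        for text in self.texts:
            tokens = self.tokenize(text)
            if tokens:  # skip documents that produce no tokens
                yield tokens


def regex_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer: lowercase runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


texts = ["Do one thing, do it well.", "spaCy parses; gensim trains vectors."]
corpus = TokenizedCorpus(texts, regex_tokenize)

# Assumption: with gensim installed, an iterable like this can be passed as the
# `sentences` argument, e.g. gensim.models.Word2Vec(sentences=corpus, min_count=1).
# Assumption: with spaCy and a loaded pipeline `nlp`, the tokenizer could instead be
# lambda text: [tok.text for tok in nlp(text)].
```

The only contract between the two sides is "an iterable of token lists", which is why the tokenizer can change without the training step caring.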


