Personalized Recommendations at Etsy (codeascraft.com)
73 points by kellegous on Nov 18, 2014 | 5 comments


Very interesting article with some cool approaches. I'd be really interested to know how often the model needs to be retrained. It seems like a lot of purchases are holiday or seasonally relevant, and you'd hate to be suggesting Valentine's Day gifts on Feb 20th because everybody was buying them two weeks ago. It would also be great to see any insights on how many features you need to get a good guess at a user's tastes and preferences. Are 20 numbers enough to represent most of the dimensionality of Etsy products, or 100, or 1000?


In their recent KDD article [1], Etsy used 200 SVD dimensions.

For those interested in trying this out in Python:

* `gensim` contains stochastic SVD for large data (fast online model training) [2]

* I wrote a benchmark of (approximate) nearest neighbour libraries in Python [3]

[1] https://dl.dropboxusercontent.com/u/2143857/papers/topics.pd...

[2] https://github.com/piskvorky/gensim/

[3] http://radimrehurek.com/2013/12/performance-shootout-of-near...
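
To give a feel for what the SVD and nearest-neighbour steps do, here is a minimal NumPy sketch. It is not Etsy's pipeline and does not use `gensim`: it runs an exact, dense SVD on a made-up 5x4 interaction matrix with `k=2` instead of 200, and the names `truncated_svd` and `most_similar_items` are just illustrative helpers. At real scale you would swap in a stochastic SVD and an approximate nearest-neighbour index.

```python
import numpy as np

def truncated_svd(matrix, k):
    """Return rank-k user and item factors from a dense user-item matrix."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    user_factors = u[:, :k] * s[:k]   # one k-dimensional row per user
    item_factors = vt[:k, :].T        # one k-dimensional row per item
    return user_factors, item_factors

def most_similar_items(item_factors, item_index, top_n=3):
    """Rank other items by cosine similarity to one item (brute force)."""
    norms = np.linalg.norm(item_factors, axis=1, keepdims=True)
    unit = item_factors / np.where(norms == 0, 1.0, norms)
    scores = unit @ unit[item_index]
    scores[item_index] = -np.inf      # exclude the query item itself
    return np.argsort(-scores)[:top_n].tolist()

# Toy data (an assumption for illustration): rows are users, columns are
# items, 1.0 means the user purchased or favorited the item.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
], dtype=float)

user_factors, item_factors = truncated_svd(interactions, k=2)
neighbours = most_similar_items(item_factors, item_index=0, top_n=1)
print(neighbours)
```

Items 0 and 1 are always bought together in the toy data, so they end up with the same latent vector and item 1 comes back as the closest neighbour of item 0.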


Fantastic! I had been looking for this kind of overview of basic recommendation systems for the past few months, and this is the first time I've seen so much understandable yet really helpful information in one place. If anyone else knows of similar kinds of articles, I'd appreciate suggestions.


I was expecting yet another article on collaborative filtering but was pleasantly surprised to find SVD and locality-sensitive hashing mentioned. If you're looking for a more thorough understanding of these topics (and other algorithms related to mining data), check out the course Mining Massive Datasets at Coursera - http://coursera.org/course/mmds
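
If locality-sensitive hashing is new to you, a rough sketch of one common variant (random hyperplanes for cosine similarity) looks like the following. This is my own toy version, not necessarily what the article or the course implements, and `signature`, `build_lsh_index`, and `candidates` are made-up names. Each vector gets one bit per random hyperplane depending on which side it falls on, and vectors sharing every bit land in the same bucket.

```python
import numpy as np
from collections import defaultdict

def signature(vec, planes):
    """One bit per hyperplane: 1 if the vector lies on its positive side."""
    return tuple((planes @ vec > 0).astype(int).tolist())

def build_lsh_index(vectors, n_planes=8, seed=0):
    """Hash every vector into a bucket keyed by its bit signature."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_planes, vectors.shape[1]))
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, planes)].append(idx)
    return planes, buckets

def candidates(query, planes, buckets):
    """Indices sharing the query's bucket; only these need exact scoring."""
    return buckets.get(signature(query, planes), [])

# Toy vectors (assumed data): 0 and 1 point the same way, 2 points opposite.
items = np.array([
    [1.0, 0.5],
    [2.0, 1.0],
    [-1.0, -0.5],
])
planes, buckets = build_lsh_index(items, n_planes=4, seed=0)
print(candidates(items[0], planes, buckets))
```

More hyperplanes give smaller, purer buckets but more missed neighbours, which is why practical setups usually query several independent hash tables.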


Not an article, but Apache Mahout is a machine learning library that includes things like recommendation systems. It's very quick and easy to get going, and for large data sets it can automatically scale out to use a Hadoop cluster.



