Good example of data engineering/MLOps for people who aren't familiar. I'd sugge...

zetazzed · on May 10, 2024

For folks with GPUs, note that HDBSCAN is heavily optimized in cuML (https://docs.rapids.ai/api/cuml/stable/api/#clustering / https://developer.nvidia.com/blog/faster-hdbscan-soft-cluste...).

jszymborski · on May 10, 2024

Ooo thanks for this

wilsonzlin · on May 9, 2024

Thanks for the great pointers! Unfortunately I didn't get time to look into hierarchical clustering, but it's on my TODO list. Your point about making the map clearer is a good one, and I think there's a lot of low-hanging fruit for improving it. Another thing for the TODO list :)