Good example of data engineering/MLOps for people who aren't familiar with it.
I'd suggest using HDBSCAN to generate hierarchical clusters for the points, then using a model to generate names for the interior clusters. That'll make it easy to explore topics out to the leaves, as you can just pop up refinements based on the connectivity to the current node, labeled with the summary names.
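Roughly what I have in mind, as a minimal sketch. It assumes the cluster hierarchy has already been exported as (parent, child) pairs, which is my assumption about how you'd store it and not something from the project. `EDGES`, `TITLES`, and `name_cluster` are hypothetical, and `name_cluster` is only a word-count stand-in for the model call that would write a real summary.

```python
from collections import Counter, defaultdict

# Hypothetical cluster tree as (parent, child) edges. Node 0 is the root.
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4)]

# Hypothetical article titles attached to the leaf clusters.
TITLES = {
    2: ["Rust async runtime internals", "Rust borrow checker tips"],
    3: ["Postgres database index tuning", "Postgres database vacuum",
        "Postgres replication"],
    4: ["SQLite database in production", "SQLite database journaling",
        "SQLite extensions"],
}


def build_children(edges):
    """Map each parent cluster id to the list of its child cluster ids."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    return children


def titles_under(node, children, titles):
    """Collect every article title in the subtree rooted at node."""
    if node in titles:
        return list(titles[node])
    collected = []
    for child in children.get(node, []):
        collected.extend(titles_under(child, children, titles))
    return collected


def name_cluster(cluster_titles):
    """Placeholder for a model call: pick the most frequent longer word."""
    words = Counter(
        word.lower()
        for title in cluster_titles
        for word in title.split()
        if len(word) > 3
    )
    return words.most_common(1)[0][0]


def refinements(node, children, titles):
    """Return {child_id: summary name} for the clusters directly below node."""
    return {
        child: name_cluster(titles_under(child, children, titles))
        for child in children.get(node, [])
    }


children = build_children(EDGES)
print(refinements(0, children, TITLES))  # options offered at the root
print(refinements(1, children, TITLES))  # options after picking cluster 1
```

The UI would call `refinements` on whatever node the user is currently at and show the returned names as the next level of choices. A leaf returns an empty dict, which is the signal to show the articles themselves.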
The groups need more distinct coloring, which I think having clusters could help with. The text size of each individual article should depend on how important or relevant the article is, either in general or based on the current search. Having more interior cluster summaries would also help cut down on some of the text clutter, as you could replace multiple posts with a single group summary until the user zooms in further.
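A sketch of the label logic, to make that concrete. Everything here is an assumption for illustration: the `Cluster` type, the `min_zoom` threshold, the relevance scores in [0, 1], and the 8 to 20 point size range are all made up, not taken from the project.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    summary: str      # model-generated name for the group
    min_zoom: float   # hypothetical zoom level at which the group expands
    articles: list = field(default_factory=list)  # (title, relevance) pairs


def font_size(relevance, min_pt=8.0, max_pt=20.0):
    """Linearly map a relevance score in [0, 1] to a point size, clamped."""
    clamped = min(max(relevance, 0.0), 1.0)
    return min_pt + clamped * (max_pt - min_pt)


def visible_labels(clusters, zoom):
    """Return (text, size) labels to draw at the given zoom level."""
    labels = []
    for cluster in clusters:
        if zoom < cluster.min_zoom:
            # Collapsed: one summary label, sized by the most relevant member.
            top = max((rel for _, rel in cluster.articles), default=0.0)
            labels.append((cluster.summary, font_size(top)))
        else:
            # Expanded: one label per article, sized by its own relevance.
            labels.extend(
                (title, font_size(rel)) for title, rel in cluster.articles
            )
    return labels


clusters = [
    Cluster("databases", 2.0, [("Postgres index tuning", 1.0),
                               ("SQLite extensions", 0.5)]),
]
print(visible_labels(clusters, zoom=1.0))  # zoomed out: group summary only
print(visible_labels(clusters, zoom=3.0))  # zoomed in: individual articles
```

The relevance score could be a global importance measure or the similarity to the current search query; the sizing code doesn't care which.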
Thanks for the great pointers! Unfortunately I didn't get the time to look into hierarchical clustering, but it's on my TODO list. Your comment about making the map clearer is great, and I think there's a lot of low-hanging fruit there. Another thing for the TODO list :)