Hacker News | Cian911's comments

> Shard by Git blob object ID which gives us a nice way of evenly distributing documents between the shards while avoiding any duplication. There won’t be any hot servers due to special repositories and we can easily scale the number of shards as necessary.

What exactly do they mean by "special repositories" here?


(I edited this post.) Maybe that could have been worded more clearly... What we mean is repositories with atypical activity patterns, for example a busy monorepo with lots of files and continuous pushes. If we sharded by repository, a single shard would be responsible for processing all the updated documents for this busy repository.
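To make the difference concrete, here is a rough sketch comparing the two sharding strategies. It is not the actual implementation: the shard count, the repository names, and the helper functions `shard_by_repo` and `shard_by_blob` are all made up for illustration. The only thing taken from Git itself is how a blob object ID is computed (SHA-1 over a `blob <size>\0` header plus the content).

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count, chosen only for this sketch


def git_blob_id(content: bytes) -> str:
    """Return the Git blob object ID: SHA-1 of b'blob <size>\\0' + content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def shard_by_repo(repo: str, num_shards: int = NUM_SHARDS) -> int:
    """Hypothetical strategy: every document in a repo lands on one shard."""
    digest = hashlib.sha1(repo.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def shard_by_blob(content: bytes, num_shards: int = NUM_SHARDS) -> int:
    """Hypothetical strategy: the shard is derived from the blob ID alone."""
    return int(git_blob_id(content), 16) % num_shards


def simulate() -> tuple[Counter, Counter]:
    """Count documents per shard for one busy monorepo plus 20 small repos."""
    repos = {"busy-monorepo": 10_000}
    repos.update({f"small-repo-{i}": 10 for i in range(20)})

    by_repo, by_blob = Counter(), Counter()
    for repo, file_count in repos.items():
        for i in range(file_count):
            content = f"{repo}/file-{i}".encode("utf-8")  # unique fake content
            by_repo[shard_by_repo(repo)] += 1
            by_blob[shard_by_blob(content)] += 1
    return by_repo, by_blob


if __name__ == "__main__":
    by_repo, by_blob = simulate()
    print("busiest shard, sharding by repo:", max(by_repo.values()))
    print("busiest shard, sharding by blob:", max(by_blob.values()))
```

With repository sharding, whichever shard owns the monorepo handles all of its documents. With blob ID sharding, the same documents spread across every shard, and identical file content always hashes to the same shard, so it only needs to be stored once.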


You don't throw out the basket for one bad egg.


The fact it's still there implies it's the basket that is bad.


Been following your blog for a while now, Peter, since I first came across the article Wired did about you. I'd love it if you wrote a post about this, and your success in general up until this point!


I will! My blog is a perpetual backlog of posts, but it'll happen :)

