BitTorrent's DHT

jwcrux · on Oct 29, 2015

An interesting talk [0] and paper [1] on crawling the DHT for different purposes were presented at security conferences. It's been a while, but I remember both being great reads.

[0] https://www.defcon.org/images/defcon-18/dc-18-presentations/...

[1] https://jhalderm.com/pub/papers/dht-woot10.pdf

synctext · on Oct 29, 2015

State-of-the-art in DHT measurements: http://www.p2p-conference.org/p2p14/wp-content/uploads/2014/... "100 Million DHT replies"

voltagex_ · on Oct 30, 2015

Am I missing it in the paper or was the code for this ever released?

jlouis · on Oct 30, 2015

Here is an Erlang implementation of, essentially, the BT DHT:

https://github.com/jlouis/dht

Its main feature is a full formal model, written in QuickCheck, that verifies its integrity. That is, the DHT code is simulated in a universe in which rarely occurring events are far more likely to happen.

lewisl9029 · on Oct 29, 2015

There's also a fairly robust pure JS implementation of the BitTorrent DHT that works in both Node and the browser: https://github.com/feross/bittorrent-dht

I'd love to see more decentralized web apps built on similar technologies.

DiThi · on Oct 30, 2015

How can it work in the browser? There's no reference to WebRTC in the code.

lewisl9029 · on Oct 30, 2015

Sorry, it looks like you're right, there's no browser support yet.

I had always simply assumed that since webtorrent works in the browser, the DHT it uses must also work in the browser, but apparently browser DHT support is still a work in progress: https://github.com/feross/webtorrent/issues/288

kauffj · on Oct 30, 2015

We're building one: http://lbry.io

motoboi · on Oct 30, 2015

Very interesting project. I'm curious why you decided to use a one-level namespace.

A multi-level namespace guarantees that if I manage to build a strong brand, I can keep using it by creating sub-domains. The alternative is people buying up names to sell them later.

Example:

lbry://avengers

The first movie comes out and is a big success. A brand is built. One year later, I make Avengers II: lbry://avengers.ageofultron, and so on.

But with a one-level namespace, once Avengers is a success, people will buy up every variant of the name.

TL;DR: Just let people buy namespace levels and do whatever they want with them.

kauffj · on Nov 3, 2015

Super slow to respond, but this is possible! Note, though, that there can be no speculation in names, since selling is mandatory.

runn1ng · on Oct 29, 2015

This is why I don't get the fuss about IPFS.

It's basically bittorrent and its DHT, just named differently! (and without the added value of trackers)

sliken · on Oct 29, 2015

I suggest you learn more about IPFS; you are missing quite a bit. Read up on Merkle DAGs, content-addressable storage, and universal namespaces.

It's a distributed filesystem. IPFS uses the DHT similarly to bittorrent, just to find peers. But there's much more to IPFS than DHT+torrent.
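To make "Merkle DAG" and "content-addressable storage" concrete, here is a minimal Python sketch. It is not IPFS's actual object format (IPFS uses its own encoding and hash format); the `put`, `add_file`, and `add_dir` helpers and the JSON encoding are assumptions made purely for illustration. The idea it shows is that every node is stored under the hash of its own bytes, and a directory node links to its children by hash.

```python
import hashlib
import json


def put(store, node):
    """Store a node under the SHA-256 of its canonical bytes; return that hash."""
    data = json.dumps(node, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def add_file(store, content):
    # Hypothetical "file" node: just the content.
    return put(store, {"type": "file", "data": content})


def add_dir(store, entries):
    # Hypothetical "dir" node: maps each name to the hash of a child node.
    return put(store, {"type": "dir", "links": entries})


store = {}
readme = add_file(store, "hello")
root_v1 = add_dir(store, {"README": readme})

# Changing a file changes its hash, which changes the hash of every parent.
root_v2 = add_dir(store, {"README": add_file(store, "hello, world")})
```

Identical content always maps to the same key, so it is stored once, and a changed file produces a new root hash while unchanged subtrees keep theirs. That is the Git-like property mentioned elsewhere in this thread.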

runn1ng · on Oct 29, 2015

All right, will try

InclinedPlane · on Oct 30, 2015

This is why I don't get the fuss about http.

It's basically just gopher and sgml, just named differently! (and without the added value of search)

Natanael_L · on Oct 29, 2015

Except you get a Git-like tree of files, not just one single prepackaged directory.

kpcyrd · on Oct 29, 2015

ipfs is way easier to use compared to libtorrent.

_ejov · on Oct 30, 2015

Is this actually true? I use libtorrent and would be interested to learn how exactly it is easier. Is it mainly due to avoiding C++? Is the IPFS implementation stable enough?

diggan · on Oct 30, 2015

Write out how libtorrent would be used in a CLI environment, and we'll see.

For IPFS:

ipfs add your_file

On the other computer

ipfs get $HASH_FROM_PREVIOUS_COMMAND

Done

jerguismi · on Oct 29, 2015

Could this DHT be used to store _any_ content?

sliken · on Oct 29, 2015

Yes, http://www.bittorrent.org/beps/bep_0044.html
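As a rough illustration of how BEP 44 addresses immutable items: the lookup key (the "target") is the SHA-1 hash of the bencoded value. The sketch below includes a small hand-written bencoder so it is self-contained. The function names are hypothetical, and the 1000-byte cap reflects my reading of the BEP rather than any particular client's behavior.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for bytes, str, int, list, and dict values."""
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def immutable_target(value, max_size=1000):
    """Return the 20-byte key under which an immutable item would be stored."""
    encoded = bencode(value)
    if len(encoded) > max_size:
        raise ValueError("value too large to store as a DHT item")
    return hashlib.sha1(encoded).digest()


target = immutable_target("Hello World!")
print(target.hex())
```

Because the key is derived from the value, anyone fetching the item can check that what they received hashes to the key they asked for.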

jerguismi · on Oct 29, 2015

Awesome, have to research this more.

LukeB42 · on Oct 29, 2015

These DHTs are also designed to store _any_ content:

https://github.com/LukeB42/Uroko

https://github.com/ipfs/go-ipfs

sliken · on Oct 30, 2015

go-ipfs uses a DHT, but it's not just a DHT. Nor does it actually store files in the DHT. Finding peers with a given file (named by hash) uses the DHT, but the files themselves are stored outside it.
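For readers wondering what "finding peers uses the DHT" looks like mechanically: Kademlia-style DHTs, including the Mainline DHT, measure closeness between 160-bit IDs with XOR, and a lookup walks toward the nodes closest to the target hash. The Python sketch below shows only the distance metric and the "pick the k closest" step. The helper names are hypothetical, and deriving node IDs by hashing a name is an assumption made to keep the example short.

```python
import hashlib


def make_id(name):
    """Hypothetical helper: derive a 160-bit integer ID from a string."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a, b):
    """Kademlia distance between two IDs, compared as unsigned integers."""
    return a ^ b


def closest_nodes(target, nodes, k=8):
    """Return the k known node IDs closest to the target under XOR distance."""
    return sorted(nodes, key=lambda n: xor_distance(n, target))[:k]


known = [make_id("node-%d" % i) for i in range(50)]
info_hash = make_id("some file")

# A lookup would query these nodes next, asking each for even closer ones.
for n in closest_nodes(info_hash, known, k=3):
    print(format(n, "040x"))
```

The nodes nearest the hash only hold pointers to peers that have the file; the file itself is then fetched from those peers directly.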

olympus · on Oct 29, 2015

You're probably better off using the Bitcoin blockchain (which is a distributed network). It's well documented that you can include data in Bitcoin transactions, and it's pretty secure.

But I'd bet you could store information in the DHT itself. Here's the issue: to store the information in the DHT you'd have to attach it to a torrent. To make it relatively persistent the torrent would have to be popular so that it will continue to be seeded and not disappear for a while (like a movie file or a popular music album). If you're doing that you might as well stick the info in the torrent itself instead of hacking it into the DHT. That's what torrents are for.

sliken · on Oct 29, 2015

First, blockchains are slow. Second, you can put things in the DHT without having anything to do with a torrent.

olympus · on Oct 29, 2015

Yeah, but if other nodes aren't passing it around, it won't get distributed, right? The client nodes don't pass information around for more than 10 minutes unless they have a good reason (i.e. announcing torrents), so it will die out pretty quickly and thus not be good for storage?

I suppose you could just keep your personal computer up and keep announcing, but then you are really just storing information on your computer and not really storing it in the DHT. Feel free to correct me if I'm wrong.

thisisrobv · on Oct 29, 2015

+1 not sure where "to store the information in the DHT you'd have to attach it to a torrent" comes from.

olympus · on Oct 29, 2015

It comes from the fact that the Mainline DHT is specifically about BitTorrent nodes and who has which torrents. The client programs don't just pass information around willy-nilly.

Torgo · on Oct 29, 2015

Well, as I understand it, the tradeoff is that with BEP44 you don't have any storage or retrieval guarantees, and the data is purged once nobody has asked for it for a couple of hours.
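The expiry behavior described above can be pictured with a toy store. This is a sketch under stated assumptions, not how any real client implements BEP 44: the class name, the two-hour default, and the rule that a read refreshes an item are all assumptions taken from the description in this comment.

```python
class ExpiringStore:
    """Toy key-value store that forgets items nobody has touched recently."""

    def __init__(self, ttl_seconds=2 * 60 * 60):
        self.ttl = ttl_seconds
        self.items = {}  # key -> (value, time of last put or get)

    def put(self, key, value, now):
        self.items[key] = (value, now)

    def get(self, key, now):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, last_touched = entry
        if now - last_touched > self.ttl:
            # Nobody asked for it in time, so the node drops it.
            del self.items[key]
            return None
        # Assumption: a request counts as interest and resets the clock.
        self.items[key] = (value, now)
        return value


node = ExpiringStore()
node.put("pubkey", b"my public key", now=0)
print(node.get("pubkey", now=3600))       # still there after one hour
print(node.get("pubkey", now=3600 * 10))  # gone: untouched for nine hours
```

Timestamps are passed in explicitly so the behavior is easy to test. The practical consequence is that a publisher who wants the data to stay available has to keep re-putting it.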

sliken · on Oct 30, 2015

Agreed. It makes it a bit easier to do something like publish your public key and have it available not just while you are intermittently online, but for somewhat longer.

Basically, it makes it a bit easier to layer your special application on top of the mainline DHT, help your peers find each other, and publish some metadata about your peers.

That way you can take advantage of millions of nodes in the DHT and not just the ones running your special application.

But if you need stronger guarantees or availability those should come from your application, not the mainline DHT.

jerguismi · on Oct 29, 2015

The Bitcoin blockchain makes sense only for content that I want stored on 5000+ nodes; it is not very cost-effective. You can store 40 bytes of data in one OP_RETURN transaction.

I was thinking more about storing bigger amounts of data, and more cost-effectively. Of course you can't rely on the data staying in DHTs. That could be fixed with incentivization (paying nodes in BTC to store your data).
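A quick back-of-the-envelope calculation shows why 40 bytes per transaction does not scale to larger data. The Python sketch below takes the 40-byte figure from the comment above as given; the helper names are hypothetical, and fees are deliberately left out because no fee figure appears in this thread.

```python
import math

OP_RETURN_PAYLOAD_BYTES = 40  # figure quoted in the comment above


def transactions_needed(size_bytes, payload_bytes=OP_RETURN_PAYLOAD_BYTES):
    """Number of OP_RETURN transactions needed to hold size_bytes of data."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    return math.ceil(size_bytes / payload_bytes)


def split_payload(data, payload_bytes=OP_RETURN_PAYLOAD_BYTES):
    """Cut data into chunks that each fit in one transaction."""
    return [data[i:i + payload_bytes] for i in range(0, len(data), payload_bytes)]


print(transactions_needed(1024 * 1024))  # one mebibyte of data
print(len(split_payload(b"x" * 100)))
```

At 40 bytes apiece, a single mebibyte needs 26215 transactions, each of which every full node then stores.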

sktrdie · on Oct 29, 2015

How's this news? BitTorrent's DHT has been around for years.

TazeTSchnitzel · on Oct 29, 2015

Things don't have to be new to appear on HN, just interesting.