twisterp2pblockchainnetworkbittorrentmicrobloggingipv6social-networkdhtdecentralizedp2p-networktwister-servertwister-ipv6twister-coretwisterarmy
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
266 lines
7.9 KiB
266 lines
7.9 KiB
============================================ |
|
BitTorrent extension for arbitrary DHT store |
|
============================================ |
|
|
|
:Author: Arvid Norberg, arvid@rasterbar.com |
|
:Version: 1.0.0 |
|
|
|
.. contents:: Table of contents |
|
:depth: 2 |
|
:backlinks: none |
|
|
|
This is a proposal for an extension to the BitTorrent DHT to allow |
|
storing and retrieving of arbitrary data. |
|
|
|
It supports both storing *immutable* items, where the key is |
|
the SHA-1 hash of the data itself, and *mutable* items, where |
|
the key is the public key of the key pair used to sign the data. |
|
|
|
There are two new proposed messages, ``put`` and ``get``. |
|
|
|
terminology |
|
----------- |
|
|
|
In this document, a *storage node* refers to the node in the DHT to which |
|
an item is being announced and stored on. A *subscribing node* refers to |
|
a node which makes look-ups in the DHT to find the storage nodes, to |
|
request items from them, and possibly re-announce those items to keep them |
|
alive. |
|
|
|
messages |
|
-------- |
|
|
|
The proposed new messages ``get`` and ``put`` are similar to the existing ``get_peers`` |
|
and ``announce_peer``. |
|
|
|
Responses to ``get`` should always include ``nodes`` and ``nodes6`` has the same |
|
semantics as in its ``get_peers`` response. It should also include a write token, |
|
``token``, with the same semantics as ``get_peers``. |
|
|
|
The ``id`` field in these messages has the same semantics as the standard DHT messages, |
|
i.e. the node ID of the node sending the message, to maintain the structure of the DHT |
|
network. |
|
|
|
The ``token`` field also has the same semantics as the standard DHT message ``get_peers`` |
|
and ``announce_peer``, when requesting an item and to write an item respectively. |
|
|
|
The ``k`` field is the PKCS#1 encoded 2048 bit RSA public key, which the signature |
|
can be authenticated with. When looking up a mutable item, the ``target`` field |
|
MUST be the SHA-1 hash of this key. |
|
|
|
The distinction between storing mutable and immutable items is the inclusion |
|
of a public key, a sequence number and signature (``k``, ``seq`` and ``sig``). |
|
|
|
``get`` requests for mutable items and immutable items cannot be distinguished from |
|
eachother. An implementation can either store mutable and immutable items in the same |
|
hash table internally, or in separate ones and potentially do two lookups for ``get`` |
|
requests. |
|
|
|
The ``v`` field is the *value* to be stored. It is allowed to be any bencoded type (list, |
|
dict, string or integer). When it's being hashed (for verifying its signature or to calculate |
|
its key), its flattened, bencoded, form is used. It is important to use the exact |
|
bencoded representation as it appeared in the message. decoding and then re-encoding |
|
bencoded structures is not necessarily an identity operation. |
|
|
|
Storing nodes SHOULD reject ``put`` requests where the bencoded form of ``v`` is longer |
|
than 767 bytes. |
|
|
|
immutable items |
|
--------------- |
|
|
|
Immutable items are stored under their SHA-1 hash, and since they cannot be modified, |
|
there is no need to authenticate the origin of them. This makes immutable items simple. |
|
|
|
A node making a lookup SHOULD verify the data it receives from the network, to verify |
|
that its hash matches the target that was looked up. |
|
|
|
put message |
|
........... |
|
|
|
Request: |
|
|
|
.. parsed-literal:: |
|
|
|
{ |
|
"a": |
|
{ |
|
"id": *<20 byte id of sending node (string)>*, |
|
"v": *<any bencoded type, whose encoded size < 768>* |
|
}, |
|
"t": *<transaction-id (string)>*, |
|
"y": "q", |
|
"q": "put" |
|
} |
|
|
|
Response: |
|
|
|
.. parsed-literal:: |
|
|
|
{ |
|
"r": { "id": *<20 byte id of sending node (string)>* }, |
|
"t": *<transaction-id (string)>*, |
|
"y": "r", |
|
} |
|
|
|
get message |
|
........... |
|
|
|
Request: |
|
|
|
.. parsed-literal:: |
|
|
|
{ |
|
"a": |
|
{ |
|
"id": *<20 byte id of sending node (string)>*, |
|
"target": *<SHA-1 hash of item (string)>*, |
|
}, |
|
"t": *<transaction-id (string)>*, |
|
"y": "q", |
|
"q": "get" |
|
} |
|
|
|
Response: |
|
|
|
.. parsed-literal:: |
|
|
|
{ |
|
"r": |
|
{ |
|
"id": *<20 byte id of sending node (string)>*, |
|
"token": *<write token (string)>*, |
|
"v": *<any bencoded type whose SHA-1 hash matches 'target'>*, |
|
"nodes": *<IPv4 nodes close to 'target'>*, |
|
"nodes6": *<IPv6 nodes close to 'target'>* |
|
}, |
|
"t": *<transaction-id>*, |
|
"y": "r", |
|
} |
|
|
|
|
|
mutable items |
|
------------- |
|
|
|
Mutable items can be updated, without changing their DHT keys. To authenticate |
|
that only the original publisher can update an item, it is signed by a private key |
|
generated by the original publisher. The target ID mutable items are stored under |
|
is the SHA-1 hash of the public key (as it appears in the ``put`` message). |
|
|
|
In order to avoid a malicious node to overwrite the list head with an old |
|
version, the sequence number ``seq`` must be monotonically increasing for each update, |
|
and a node hosting the list node MUST not downgrade a list head from a higher sequence |
|
number to a lower one, only upgrade. The sequence number SHOULD not exceed ``MAX_INT64``, |
|
(i.e. ``0x7fffffffffffffff``. A client MAY reject any message with a sequence number |
|
exceeding this. |
|
|
|
The signature is a 2048 bit RSA signature of the SHA-1 hash of the bencoded sequence |
|
number and ``v`` key. e.g. something like this:: ``3:seqi4e1:v12:Hello world!``. |
|
|
|
put message |
|
........... |
|
|
|
Request: |
|
|
|
.. parsed-literal:: |
|
|
|
{ |
|
"a": |
|
{ |
|
"id": *<20 byte id of sending node (string)>*, |
|
"k": *<RSA-2048 public key (PKCS#1 encoded)>*, |
|
"seq": *<monotonically increasing sequence number (integer)>*, |
|
"sig": *<RSA-2048 signature (256 bytes string)>*, |
|
"token": *<write-token (string)>*, |
|
"v": *<any bencoded type, whose encoded size < 768>* |
|
}, |
|
"t": *<transaction-id (string)>*, |
|
"y": "q", |
|
"q": "put" |
|
} |
|
|
|
Storing nodes receiving a ``put`` request where ``seq`` is lower than what's already |
|
stored on the node, MUST reject the request. |
|
|
|
Response: |
|
|
|
.. parsed-literal:: |
|
|
|
{ |
|
"r": { "id": *<20 byte id of sending node (string)>* }, |
|
"t": *<transaction-id (string)>*, |
|
"y": "r", |
|
} |
|
|
|
get message |
|
........... |
|
|
|
Request: |
|
|
|
.. parsed-literal:: |
|
|
|
{ |
|
"a": |
|
{ |
|
"id": *<20 byte id of sending node (string)>*, |
|
"target:" *<20 byte SHA-1 hash of public key (string)>* |
|
}, |
|
"t": *<transaction-id (string)>*, |
|
"y": "q", |
|
"q": "get" |
|
} |
|
|
|
Response: |
|
|
|
.. parsed-literal:: |
|
|
|
{ |
|
"r": |
|
{ |
|
"id": *<20 byte id of sending node (string)>*, |
|
"k": *<RSA-2048 public key (268 bytes string)>*, |
|
"nodes": *<IPv4 nodes close to 'target'>*, |
|
"nodes6": *<IPv6 nodes close to 'target'>*, |
|
"seq": *<monotonically increasing sequence number (integer)>*, |
|
"sig": *<RSA-2048 signature (256 bytes string)>*, |
|
"token": *<write-token (string)>*, |
|
"v": *<any bencoded type, whose encoded size < 768>* |
|
}, |
|
"t": *<transaction-id (string)>*, |
|
"y": "r", |
|
} |
|
|
|
signature verification |
|
---------------------- |
|
|
|
In order to make it maximally difficult to attack the bencoding parser, signing and verification of the |
|
value and sequence number should be done as follows: |
|
|
|
1. encode value and sequence number separately |
|
2. concatenate "3:seqi" ``seq`` "e1:v" and the encoded value. |
|
sequence number 1 of value "Hello World!" would be converted to: 3:seqi1e1:v12:Hello World! |
|
In this way it is not possible to convince a node that part of the length is actually part of the |
|
sequence number even if the parser contains certain bugs. Furthermore it is not possible to have a |
|
verification failure if a bencoding serializer alters the order of entries in the dictionary. |
|
3. hash the concatenated string with SHA-1 |
|
4. sign or verify the hash digest. |
|
|
|
On the storage node, the signature MUST be verified before accepting the store command. The data |
|
MUST be stored under the SHA-1 hash of the public key (as it appears in the bencoded dict). |
|
|
|
On the subscribing nodes, the key they get back from a ``get`` request MUST be verified to hash |
|
to the target ID the lookup was made for, as well as verifying the signature. If any of these fail, |
|
the response SHOULD be considered invalid. |
|
|
|
expiration |
|
---------- |
|
|
|
Without re-announcement, these items MAY expire in 2 hours. In order |
|
to keep items alive, they SHOULD be re-announced once an hour. |
|
|
|
Subscriber nodes MAY help out in announcing items the are interested in to the DHT, |
|
to keep them alive. |
|
|
|
test vectors |
|
------------ |
|
|
|
|
|
|