twister-core/libtorrent/docs/dht_store.html
2013-07-28 20:41:06 -03:00

303 lines
15 KiB
HTML

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.10: http://docutils.sourceforge.net/" />
<title>BitTorrent extension for arbitrary DHT store</title>
<meta name="author" content="Arvid Norberg, arvid&#64;rasterbar.com" />
<link rel="stylesheet" type="text/css" href="../../css/base.css" />
<link rel="stylesheet" type="text/css" href="../../css/rst.css" />
<script type="text/javascript">
/* <![CDATA[ */
(function() {
var s = document.createElement('script'), t = document.getElementsByTagName('script')[0];
s.type = 'text/javascript';
s.async = true;
s.src = 'http://api.flattr.com/js/0.6/load.js?mode=auto';
t.parentNode.insertBefore(s, t);
})();
/* ]]> */
</script>
<link rel="stylesheet" href="style.css" type="text/css" />
<style type="text/css">
/* Hides from IE-mac \*/
* html pre { height: 1%; }
/* End hide from IE-mac */
</style>
</head>
<body>
<div class="document" id="bittorrent-extension-for-arbitrary-dht-store">
<div id="container">
<div id="headerNav">
<ul>
<li class="first"><a href="/">Home</a></li>
<li><a href="../../products.html">Products</a></li>
<li><a href="../../contact.html">Contact</a></li>
</ul>
</div>
<div id="header">
<h1><span>Rasterbar Software</span></h1>
<h2><span>Software developement and consulting</span></h2>
</div>
<div id="main">
<h1 class="title">BitTorrent extension for arbitrary DHT store</h1>
<table class="docinfo" frame="void" rules="none">
<col class="docinfo-name" />
<col class="docinfo-content" />
<tbody valign="top">
<tr><th class="docinfo-name">Author:</th>
<td>Arvid Norberg, <a class="last reference external" href="mailto:arvid&#64;rasterbar.com">arvid&#64;rasterbar.com</a></td></tr>
<tr><th class="docinfo-name">Version:</th>
<td>1.0.0</td></tr>
</tbody>
</table>
<div class="contents topic" id="table-of-contents">
<p class="topic-title first">Table of contents</p>
<ul class="simple">
<li><a class="reference internal" href="#terminology" id="id3">terminology</a></li>
<li><a class="reference internal" href="#messages" id="id4">messages</a></li>
<li><a class="reference internal" href="#immutable-items" id="id5">immutable items</a><ul>
<li><a class="reference internal" href="#put-message" id="id6">put message</a></li>
<li><a class="reference internal" href="#get-message" id="id7">get message</a></li>
</ul>
</li>
<li><a class="reference internal" href="#mutable-items" id="id8">mutable items</a><ul>
<li><a class="reference internal" href="#id1" id="id9">put message</a></li>
<li><a class="reference internal" href="#id2" id="id10">get message</a></li>
</ul>
</li>
<li><a class="reference internal" href="#signature-verification" id="id11">signature verification</a></li>
<li><a class="reference internal" href="#expiration" id="id12">expiration</a></li>
<li><a class="reference internal" href="#test-vectors" id="id13">test vectors</a></li>
</ul>
</div>
<p>This is a proposal for an extension to the BitTorrent DHT to allow
storing and retrieving of arbitrary data.</p>
<p>It supports both storing <em>immutable</em> items, where the key is
the SHA-1 hash of the data itself, and <em>mutable</em> items, where
the key is the public key of the key pair used to sign the data.</p>
<p>There are two new proposed messages, <tt class="docutils literal">put</tt> and <tt class="docutils literal">get</tt>.</p>
<div class="section" id="terminology">
<h1>terminology</h1>
<p>In this document, a <em>storage node</em> refers to the node in the DHT to which
an item is being announced and stored on. A <em>subscribing node</em> refers to
a node which makes look-ups in the DHT to find the storage nodes, to
request items from them, and possibly re-announce those items to keep them
alive.</p>
</div>
<div class="section" id="messages">
<h1>messages</h1>
<p>The proposed new messages <tt class="docutils literal">get</tt> and <tt class="docutils literal">put</tt> are similar to the existing <tt class="docutils literal">get_peers</tt>
and <tt class="docutils literal">announce_peer</tt>.</p>
<p>Responses to <tt class="docutils literal">get</tt> should always include <tt class="docutils literal">nodes</tt> and <tt class="docutils literal">nodes6</tt> has the same
semantics as in its <tt class="docutils literal">get_peers</tt> response. It should also include a write token,
<tt class="docutils literal">token</tt>, with the same semantics as <tt class="docutils literal">get_peers</tt>.</p>
<p>The <tt class="docutils literal">id</tt> field in these messages has the same semantics as the standard DHT messages,
i.e. the node ID of the node sending the message, to maintain the structure of the DHT
network.</p>
<p>The <tt class="docutils literal">token</tt> field also has the same semantics as the standard DHT message <tt class="docutils literal">get_peers</tt>
and <tt class="docutils literal">announce_peer</tt>, when requesting an item and to write an item respectively.</p>
<p>The <tt class="docutils literal">k</tt> field is the PKCS#1 encoded 2048 bit RSA public key, which the signature
can be authenticated with. When looking up a mutable item, the <tt class="docutils literal">target</tt> field
MUST be the SHA-1 hash of this key.</p>
<p>The distinction between storing mutable and immutable items is the inclusion
of a public key, a sequence number and signature (<tt class="docutils literal">k</tt>, <tt class="docutils literal">seq</tt> and <tt class="docutils literal">sig</tt>).</p>
<p><tt class="docutils literal">get</tt> requests for mutable items and immutable items cannot be distinguished from
eachother. An implementation can either store mutable and immutable items in the same
hash table internally, or in separate ones and potentially do two lookups for <tt class="docutils literal">get</tt>
requests.</p>
<p>The <tt class="docutils literal">v</tt> field is the <em>value</em> to be stored. It is allowed to be any bencoded type (list,
dict, string or integer). When it's being hashed (for verifying its signature or to calculate
its key), its flattened, bencoded, form is used. It is important to use the exact
bencoded representation as it appeared in the message. decoding and then re-encoding
bencoded structures is not necessarily an identity operation.</p>
<p>Storing nodes SHOULD reject <tt class="docutils literal">put</tt> requests where the bencoded form of <tt class="docutils literal">v</tt> is longer
than 767 bytes.</p>
</div>
<div class="section" id="immutable-items">
<h1>immutable items</h1>
<p>Immutable items are stored under their SHA-1 hash, and since they cannot be modified,
there is no need to authenticate the origin of them. This makes immutable items simple.</p>
<p>A node making a lookup SHOULD verify the data it receives from the network, to verify
that its hash matches the target that was looked up.</p>
<div class="section" id="put-message">
<h2>put message</h2>
<p>Request:</p>
<pre class="literal-block">
{
&quot;a&quot;:
{
&quot;id&quot;: <em>&lt;20 byte id of sending node (string)&gt;</em>,
&quot;v&quot;: <em>&lt;any bencoded type, whose encoded size &lt; 768&gt;</em>
},
&quot;t&quot;: <em>&lt;transaction-id (string)&gt;</em>,
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;put&quot;
}
</pre>
<p>Response:</p>
<pre class="literal-block">
{
&quot;r&quot;: { &quot;id&quot;: <em>&lt;20 byte id of sending node (string)&gt;</em> },
&quot;t&quot;: <em>&lt;transaction-id (string)&gt;</em>,
&quot;y&quot;: &quot;r&quot;,
}
</pre>
</div>
<div class="section" id="get-message">
<h2>get message</h2>
<p>Request:</p>
<pre class="literal-block">
{
&quot;a&quot;:
{
&quot;id&quot;: <em>&lt;20 byte id of sending node (string)&gt;</em>,
&quot;target&quot;: <em>&lt;SHA-1 hash of item (string)&gt;</em>,
},
&quot;t&quot;: <em>&lt;transaction-id (string)&gt;</em>,
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;get&quot;
}
</pre>
<p>Response:</p>
<pre class="literal-block">
{
&quot;r&quot;:
{
&quot;id&quot;: <em>&lt;20 byte id of sending node (string)&gt;</em>,
&quot;token&quot;: <em>&lt;write token (string)&gt;</em>,
&quot;v&quot;: <em>&lt;any bencoded type whose SHA-1 hash matches 'target'&gt;</em>,
&quot;nodes&quot;: <em>&lt;IPv4 nodes close to 'target'&gt;</em>,
&quot;nodes6&quot;: <em>&lt;IPv6 nodes close to 'target'&gt;</em>
},
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>,
&quot;y&quot;: &quot;r&quot;,
}
</pre>
</div>
</div>
<div class="section" id="mutable-items">
<h1>mutable items</h1>
<p>Mutable items can be updated, without changing their DHT keys. To authenticate
that only the original publisher can update an item, it is signed by a private key
generated by the original publisher. The target ID mutable items are stored under
is the SHA-1 hash of the public key (as it appears in the <tt class="docutils literal">put</tt> message).</p>
<p>In order to avoid a malicious node to overwrite the list head with an old
version, the sequence number <tt class="docutils literal">seq</tt> must be monotonically increasing for each update,
and a node hosting the list node MUST not downgrade a list head from a higher sequence
number to a lower one, only upgrade. The sequence number SHOULD not exceed <tt class="docutils literal">MAX_INT64</tt>,
(i.e. <tt class="docutils literal">0x7fffffffffffffff</tt>. A client MAY reject any message with a sequence number
exceeding this.</p>
<p>The signature is a 2048 bit RSA signature of the SHA-1 hash of the bencoded sequence
number and <tt class="docutils literal">v</tt> key. e.g. something like this:: <tt class="docutils literal">3:seqi4e1:v12:Hello world!</tt>.</p>
<div class="section" id="id1">
<h2>put message</h2>
<p>Request:</p>
<pre class="literal-block">
{
&quot;a&quot;:
{
&quot;id&quot;: <em>&lt;20 byte id of sending node (string)&gt;</em>,
&quot;k&quot;: <em>&lt;RSA-2048 public key (PKCS#1 encoded)&gt;</em>,
&quot;seq&quot;: <em>&lt;monotonically increasing sequence number (integer)&gt;</em>,
&quot;sig&quot;: <em>&lt;RSA-2048 signature (256 bytes string)&gt;</em>,
&quot;token&quot;: <em>&lt;write-token (string)&gt;</em>,
&quot;v&quot;: <em>&lt;any bencoded type, whose encoded size &lt; 768&gt;</em>
},
&quot;t&quot;: <em>&lt;transaction-id (string)&gt;</em>,
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;put&quot;
}
</pre>
<p>Storing nodes receiving a <tt class="docutils literal">put</tt> request where <tt class="docutils literal">seq</tt> is lower than what's already
stored on the node, MUST reject the request.</p>
<p>Response:</p>
<pre class="literal-block">
{
&quot;r&quot;: { &quot;id&quot;: <em>&lt;20 byte id of sending node (string)&gt;</em> },
&quot;t&quot;: <em>&lt;transaction-id (string)&gt;</em>,
&quot;y&quot;: &quot;r&quot;,
}
</pre>
</div>
<div class="section" id="id2">
<h2>get message</h2>
<p>Request:</p>
<pre class="literal-block">
{
&quot;a&quot;:
{
&quot;id&quot;: <em>&lt;20 byte id of sending node (string)&gt;</em>,
&quot;target:&quot; <em>&lt;20 byte SHA-1 hash of public key (string)&gt;</em>
},
&quot;t&quot;: <em>&lt;transaction-id (string)&gt;</em>,
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;get&quot;
}
</pre>
<p>Response:</p>
<pre class="literal-block">
{
&quot;r&quot;:
{
&quot;id&quot;: <em>&lt;20 byte id of sending node (string)&gt;</em>,
&quot;k&quot;: <em>&lt;RSA-2048 public key (268 bytes string)&gt;</em>,
&quot;nodes&quot;: <em>&lt;IPv4 nodes close to 'target'&gt;</em>,
&quot;nodes6&quot;: <em>&lt;IPv6 nodes close to 'target'&gt;</em>,
&quot;seq&quot;: <em>&lt;monotonically increasing sequence number (integer)&gt;</em>,
&quot;sig&quot;: <em>&lt;RSA-2048 signature (256 bytes string)&gt;</em>,
&quot;token&quot;: <em>&lt;write-token (string)&gt;</em>,
&quot;v&quot;: <em>&lt;any bencoded type, whose encoded size &lt; 768&gt;</em>
},
&quot;t&quot;: <em>&lt;transaction-id (string)&gt;</em>,
&quot;y&quot;: &quot;r&quot;,
}
</pre>
</div>
</div>
<div class="section" id="signature-verification">
<h1>signature verification</h1>
<p>In order to make it maximally difficult to attack the bencoding parser, signing and verification of the
value and sequence number should be done as follows:</p>
<ol class="arabic simple">
<li>encode value and sequence number separately</li>
<li>concatenate &quot;3:seqi&quot; <tt class="docutils literal">seq</tt> &quot;e1:v&quot; and the encoded value.
sequence number 1 of value &quot;Hello World!&quot; would be converted to: 3:seqi1e1:v12:Hello World!
In this way it is not possible to convince a node that part of the length is actually part of the
sequence number even if the parser contains certain bugs. Furthermore it is not possible to have a
verification failure if a bencoding serializer alters the order of entries in the dictionary.</li>
<li>hash the concatenated string with SHA-1</li>
<li>sign or verify the hash digest.</li>
</ol>
<p>On the storage node, the signature MUST be verified before accepting the store command. The data
MUST be stored under the SHA-1 hash of the public key (as it appears in the bencoded dict).</p>
<p>On the subscribing nodes, the key they get back from a <tt class="docutils literal">get</tt> request MUST be verified to hash
to the target ID the lookup was made for, as well as verifying the signature. If any of these fail,
the response SHOULD be considered invalid.</p>
</div>
<div class="section" id="expiration">
<h1>expiration</h1>
<p>Without re-announcement, these items MAY expire in 2 hours. In order
to keep items alive, they SHOULD be re-announced once an hour.</p>
<p>Subscriber nodes MAY help out in announcing items the are interested in to the DHT,
to keep them alive.</p>
</div>
<div class="section" id="test-vectors">
<h1>test vectors</h1>
</div>
</div>
<div id="footer">
<span>Copyright &copy; 2005 Rasterbar Software.</span>
</div>
</div>
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-1599045-1";
urchinTracker();
</script>
</div>
</body>
</html>