If our tip hasn't updated in a while, that may be because our peers are
not relaying blocks to us that we would consider valid. Allow connection
to an additional outbound peer in that circumstance.
Also, periodically check to see if we are exceeding our target number of
outbound peers, and disconnect the one which has least recently
announced a new block to us (choosing the newest such peer in the case
of tie).
Currently we have no rotation of outbound peers. If an outbound peer
stops serving us blocks, or is on a consensus-incompatible chain with
less work than our tip (but otherwise valid headers), then we will never
disconnect that peer, even though that peer is using one of our 8
outbound connection slots. Because we rely on our outbound peers to
find an honest node in order to reach consensus, allowing an
incompatible peer to occupy one of those slots is undesirable,
particularly if it is possible for all such slots to be occupied by such
peers.
Protect against this by always checking to see if a peer's best known
block has less work than our tip, and if so, set a 20 minute timeout --
if the peer is still not known to have caught up to a chain with as much
work as ours after 20 minutes, then send a single getheaders message,
wait 2 more minutes, and if a better header hasn't been received by then,
disconnect that peer.
Note:
- we do not require that our peer sync to the same tip as ours, just an
equal or greater work tip. (Doing otherwise would risk partitioning the
network in the event of a chain split, and is also unnecessary.)
- we pick 4 of our outbound peers and do not subject them to this logic,
to be more conservative. We don't wish to permit temporary network
issues (or an attacker) to excessively disrupt network topology.
There are a few too many edge-cases here to make this a scripted diff.
The following commits will move a few functions into PeerLogicValidation, where
the local connman instance can be used. This change prepares for that usage.
At startup, we choose one peer to serve us the headers chain, until
our best header is close to caught up. Disconnect this peer if more
than 15 minutes + 1ms/expected_header passes and our best header
is still more than 1 day away from current time.
This simplifies fixing the wallet-returns-stale-info issue as we
can now hold cs_wallet across an entire block instead of only
per-tx (though we only actually do so in the next commit).
This change also removes the NOT_IN_BLOCK constant in favor of only
passing the CBlockIndex* parameter to SyncTransactions when a new
block is being connected, instead of also when a block is being
disconnected.
This change adds a parameter to BlockConnectedDisconnected which
lists the transactions which were removed from mempool due to
confliction as a result of this operation. While its somewhat of a
shame to make block-validation-logic generate a list of mempool
changes to be included in its generated callbacks, fixing this isnt
too hard.
Further in this change-set, CValidationInterface starts listening
to mempool directly, placing it in the middle and giving it a bit
of logic to know how to route notifications from block-validation,
mempool, etc (though not listening for conflicted-removals yet).
In order to sleep accurately, the message handler needs to know if _any_ node
has more processing that it should do before the entire thread sleeps.
Rather than returning a value that represents whether ProcessMessages
encountered a message that should trigger a disconnnect, interpret the return
value as whether or not that node has more work to do.
Also, use a global fProcessWake value that can be set by other threads,
which takes precedence (for one cycle) over the messagehandler's decision.
Note that the previous behavior was to only process one message per loop
(except in the case of a bad checksum or invalid header). That was changed in
PR #3180.
The only change here in that regard is that the current node now falls to the
back of the processing queue for the bad checksum/invalid header cases.