Status

COMMENTED

Stakeholders
Outcome

Due date
Owner

Background

Block synchronization is only briefly discussed in the whitepaper, but this feature needs considerably more detail. In addition, several design decisions taken during the implementation influence other core modules. This document therefore reviews the current design of block synchronization in detail and starts a discussion of which decisions are accepted by the core team and what needs to be corrected in the future.

Problem

Scenarios

The following are the scenarios for block synchronization:

  1. A new peer joins the validator network and needs to get the whole blockchain.
  2. A peer lost several commit messages due to a poor internet connection and needs to get the latest blocks.
  3. A peer was stopped and restarted after some time and needs to get the latest blocks.
  4. ...

Solution

Gossip

For peers to discover that they do not have the latest blocks, each peer sends its latest block hash to all other peers in the network every N seconds. If a peer discovers that it has a different latest block hash, it requests the missing blocks from other peers.
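
A minimal sketch of this flow, using illustrative message and type names rather than the actual Iroha 2 API:

```rust
use std::time::Duration;

type Hash = [u8; 32];

/// "Every N seconds" from the description above; the exact period is an assumption.
const GOSSIP_PERIOD: Duration = Duration::from_secs(10);

/// Messages exchanged during synchronization (names are illustrative).
enum SyncMessage {
    /// A peer announces its latest committed block hash to all other peers.
    LatestBlock(Hash),
    /// A peer asks another peer for all blocks committed after the given hash.
    GetBlocksAfter(Hash),
}

/// Broadcast to every other peer once per `GOSSIP_PERIOD`.
fn announce(latest_block_hash: Hash) -> SyncMessage {
    SyncMessage::LatestBlock(latest_block_hash)
}

/// Handle an incoming announcement: if the announced hash differs from ours,
/// request the blocks we are missing from the announcing peer.
fn on_announcement(our_latest: Hash, announced: Hash) -> Option<SyncMessage> {
    (our_latest != announced).then(|| SyncMessage::GetBlocksAfter(our_latest))
}
```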

Sequence Diagram

The sequence diagram here represents a scenario where a new peer has joined the peer network. The new peer's latest block hash will be `[0u8; 32]` (all zeros), while the other peers will have a different latest block hash, assuming they have committed several blocks. Therefore the new peer will request the missing blocks.

Activity Diagram

The activity diagram represents the synchronization process from the moment the latest block hash is received to the moment when all the new blocks are stored.

Block Validation

To validate the signatures of a block, the peer needs to have:

  1. The set of peers that were validators at the moment the block was being discussed in consensus
  2. The order of those peers, which defines their roles at the moment of discussion

Set of peers

The set of peers will already be up to date on the synchronizing peer for this block's validation: the initial peer set is taken from the `trusted_peers` file, and new peers are added only through the AddPeer ISI, i.e. only on block commits.
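
A rough illustration of this invariant; the `World` struct, the `PeerId` alias, and the function names are assumptions made for this sketch, not the real Iroha 2 types:

```rust
use std::collections::HashSet;

/// Placeholder for a real peer identity (address and public key).
type PeerId = String;

struct World {
    peers: HashSet<PeerId>,
}

impl World {
    /// On startup the peer set is read from the `trusted_peers` file.
    fn from_trusted_peers(trusted_peers: Vec<PeerId>) -> Self {
        Self {
            peers: trusted_peers.into_iter().collect(),
        }
    }

    /// The set only changes when a committed block contains an AddPeer ISI,
    /// so applying blocks in order keeps it up to date for validating the
    /// next received block.
    fn apply_add_peer(&mut self, new_peer: PeerId) {
        self.peers.insert(new_peer);
    }
}
```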

Order of peers

The order, and therefore the roles, are defined based on the previously committed block hash. However, the order is shifted every time a view change happens (when the leader or the proxy tail is faulty, or the voting fails).

The synchronizing peer will definitely have the previous block hash, but it does not know the number of view changes (order shifts) that happened between the commits of the previous block and the new block.

Therefore a new field was added to the block header: `number_of_view_changes`. It specifies the number of view changes that happened between the commit of the previous block and the commit of the block in whose header it is stored.


With this design both the set of validating peers and their roles are known to the synchronizing peer for each new block.
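
A rough sketch of how a synchronizing peer could use this field to recover the validator order; the header layout, the rotate-by-one shift, and all names are assumptions for illustration, not the actual Iroha 2 implementation:

```rust
type Hash = [u8; 32];
/// Placeholder for a real peer identity.
type PeerId = String;

struct BlockHeader {
    height: u64,
    previous_block_hash: Hash,
    /// The new field: number of view changes between the commit of the
    /// previous block and the commit of this block.
    number_of_view_changes: u32,
    // ... timestamp, merkle root, signatures, etc.
}

/// Derive the validator roles for a block: start from the order defined by
/// the previous block hash and shift it once per recorded view change.
fn topology_for_block(mut ordered_peers: Vec<PeerId>, header: &BlockHeader) -> Vec<PeerId> {
    for _ in 0..header.number_of_view_changes {
        // Each view change shifts the order, moving the leader and proxy
        // tail roles to the next peers (rotation by one is an assumption).
        ordered_peers.rotate_left(1);
    }
    ordered_peers
}
```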

Decisions

Alternatives

Concerns

Assumptions

Risks

Additional Information


Comments

  1. Therefore a new field was added to the block header: number_of_view_changes. It specifies the number of view changes that happened between the commit of the previous block and the commit of the block in which header it exists.

    I think we should find a way to calculate that value or stick to the difference between block heights.

    1. So you mean something like: a block at height 3 will be chained with a block at height 5 if there were 2 view changes (due to a faulty leader or something else) between the commits of these blocks? I think it will work; if the rest of the team agrees, we can implement it this way.

      1. But we have to pay attention that a view change does not always mean an invalidated block (a view change may happen if the leader does not accept a tx, for example). So the height difference might be confusing to comprehend.

        1. Yeah, I don't know what can work in this situation, because we can have a situation where the number of the network topology's view changes won't be equal to the difference in the latest block heights.

  2. I like the Iroha1 approach more. It is simpler and does not require introducing additional gossip messages to send the latest blocks.
    Please check how it is explained in this video (~1 minute long explanation):

    In short, in Iroha1 synchronization is done during block commits, after consensus. Whenever a block is finalized and broadcast, every peer checks whether it already has this block (in case it voted for the same block) or the block does not exist yet (for example, block #11 was received but the current peer's latest block is #4).

    If the block does not exist, we check the signatures of the finalized block to figure out who has this block and ask one of those peers to send the missing blocks.

    1. Thanks Salakhiev Kamil, how does it handle situations where a peer has block #11 but missed #9 and #10?

    2. This approach sounds good. The main reason I took the gossip approach is that it is mentioned in the Iroha2 design whitepaper. So if Makoto Takemiya agrees that we implement it as was done in Iroha1, we can put this into the whitepaper and add an issue to correct the implementation.

      1. I think we should do the gossip approach as in the Iroha2 design whitepaper. The main reason is that a gossip approach is more robust against collusion, and this is also the dogmatic way to do it in most blockchain systems. Also, without transactions, Iroha doesn't commit blocks, so you wouldn't want to wait until the next commit to sync, but to do it proactively.

        1. Doing it proactively might be beneficial only for slow nodes or nodes that missed some blocks and rejoined the network. In the majority of cases that kind of gossip message can be redundant, as peers will announce their latest state, which is very likely to be the latest state of the others as well.
          I think the time required to wait for a new committed block is negligible in comparison to the time required to synchronize all the missing blocks.

  3. It cannot have block #11 but miss blocks #9 and #10.
    You probably meant: what if a peer's latest block is #8, but consensus finalized #11? In that case we request the missing blocks when we see the commit for block #11.

    1. So we assume that our ledger always knows about all trusted peers and there won't be a case with 2 or more clusters (a network loss on the border of two regions, for example) building their own chains?

  4. Clusters cannot build their own chains, as they need a supermajority of votes to commit a block. There is always only one "cluster".
    Yes, every peer knows about the trusted peers corresponding to the state of its latest committed block.

  5. Moved from inline comments:

    Makoto Takemiya: Proof of a view change needs to be stored in a finalized block.
    Egor Ivkov:

    What kind of proof can we store?

    So, for example, a peer receives a BlockCreationTimeout message signed by f + 1 peers, so it changes the view. Should it store this message, or the hash of this message, ...?

    1. Egor Ivkov if you have reached an agreement, can I move this RFC into an ADR?


      1. I am still waiting for clarification from Makoto Takemiya. I have mentioned this in a chat several times.

        1. Egor Ivkov let's consider the modern approach of storing only hashes for proofs in this case.

          Please wrap up this RFC, change the status to `commented`, and fill in the empty sections. After that, let's have a meeting to split it into Jira issues and move it to an ADR.

          1. The thing is, I am not sure we can retroactively prove the necessity of a view change just from the hash of the received message. To prove the validity of a view change we would need to be sure that the message had enough signatures and that the signatures were correct (therefore we also need the payload). That's how I see it.

            But maybe there are other methods to do it, or we don't need to retroactively prove it, etc. Therefore I wanted to get an opinion of Makoto Takemiya on this.

            1. The view change count can just be stored in the block header. The proof is then in the block signature, signed by all the validators.

              For cases where a peer previously signed off on a block, please recall that, as per the whitepaper design, the hash needs to be stored in the block header as permanently blacklisted:

              the commit timer on nodes will go off and a new leader and proxy tail are elected; signatures from at least f+1 nodes saying that their commit timers went off invalidate a block hash forever; this invalidation is written into the next successfully created block, to prevent arbitrary rewriting of history in the future

  6. Moved from inline comments:

    > For peers to discover that they do not have latest blocks, peers send their latest block hashes to all other peers

    Salakhiev Kamil: All peers or only the new one?

    Egor Ivkov: All peers gossip about the latest block hash, so if some of them missed a commit message they can request those blocks.
    Makoto Takemiya: This is too much message passing. Instead, peers who want to sync should request randomly from peers that they know, rather than getting messages from other peers. It needs to be a "pull" modality, not push.

    Egor Ivkov

    So in terms of an algorithm:

    1. Every N seconds the peer requests the latest block hashes from a small random subset of all peers.
    2. The peers answer these requests with their latest block hashes.
    3. The peer compares its own block hash to the received ones.
    4. ...

    Something like that?
    Makoto Takemiya: Yes, I think that is reasonable.
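
For reference, a minimal sketch of this pull-based variant of the algorithm agreed above, again with assumed names rather than the actual Iroha 2 API:

```rust
type Hash = [u8; 32];
/// Placeholder for a real peer identity.
type PeerId = String;

enum SyncMessage {
    /// Ask a peer for its latest committed block hash.
    GetLatestBlockHash,
    /// Reply with the latest committed block hash.
    LatestBlockHash(Hash),
    /// Ask a peer for all blocks committed after the given hash.
    GetBlocksAfter(Hash),
}

/// Step 1: every N seconds, ask a small random subset of the known peers
/// for their latest block hashes (random selection is elided here).
fn request_latest_hashes(subset: &[PeerId]) -> Vec<(PeerId, SyncMessage)> {
    subset
        .iter()
        .map(|peer| (peer.clone(), SyncMessage::GetLatestBlockHash))
        .collect()
}

/// Steps 3-4: compare a received hash with our own; if they differ, pull
/// the missing blocks from the peer that answered.
fn on_latest_hash(our_latest: Hash, received: Hash) -> Option<SyncMessage> {
    (our_latest != received).then(|| SyncMessage::GetBlocksAfter(our_latest))
}
```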