Modularization of Hyperledger Besu

Goal of this document: 


  • Starting a conversation about modularizing Besu.
  • Keeping track of the discussions.

General context


We are getting various signals:

Erigon, Turbo Geth, MEV proposer/producer, ...


  • Pros: 
  • Cons:
  • Risks:
    • Hyperledger 
  • Mitigations:
  • Horizon: 
    • pre-merge ? 
    • post merge?
  • Duration: 
    • 9 months, but we should divide it into increments (module by module).
    • For example, how long would it take to modularize one core component of Besu?




General

Two step approach 

  • Open-minded approach: engineers attempt the modularization to see the real scope


Engineering effort around Besu

Cross team (internal and external) + Infura

Series of workshops to define the work

Technical organization

Communication organization - multi-stakeholder discussion, federating people around a modular Besu

  • Large engineering effort 

Use case: MEV, rollup

Modules: EVM, tx pool (pluggable storage, tracing/analytics/API layer?) 

A modular infrastructure would allow us to select the specific core parts of Besu that our customers need to run an Ethereum client, with ConsenSys plugins on top.

Reducing the dependency on HL?

Modularization

Pros: speeding up the release cycle, a better-defined scope for contributors who want to target a specific part of the codebase

  • Inversion of control (Justin)
    • Paradigm to help modularize the codebase that we need to 
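
A minimal sketch of what inversion of control could look like here, assuming a transaction pool is one of the modules. All names below are hypothetical and are not taken from the Besu codebase. The core depends only on a small interface and receives the implementation from outside, so a module can be swapped without touching the core.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical module contract; not an actual Besu interface.
interface TransactionPool {
    void add(String tx);
    List<String> pending();
}

// One possible implementation: keeps transactions in arrival order.
class FifoTransactionPool implements TransactionPool {
    private final List<String> txs = new ArrayList<>();

    @Override
    public void add(String tx) {
        txs.add(tx);
    }

    @Override
    public List<String> pending() {
        // Return a copy so callers cannot mutate the pool's internal list.
        return new ArrayList<>(txs);
    }
}

// The core never instantiates a pool itself; the caller injects one.
class NodeCore {
    private final TransactionPool pool;

    NodeCore(TransactionPool pool) {
        this.pool = pool;
    }

    void submit(String tx) {
        pool.add(tx);
    }

    int pendingCount() {
        return pool.pending().size();
    }
}
```

With this shape, `new NodeCore(new FifoTransactionPool())` wires the default pool, and a different pool (for example an MEV-oriented one) could be passed in without changing `NodeCore`.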


Solution

  • Does not add tech debt
  • Easy to maintain from a ConsenSys perspective
  • Gives ConsenSys some control over extra features
  • Provides a platform that makes it easier to create features (new modules)
    • Hackathon/Bounties “best MEV plugin on Besu”


Step 1: What are Besu's minimum viable components? Or should we think about it as a combination of modules? (brainstorming / workshop session)

Situations to assess:

  • Situation where only the EVM is needed and not the consensus layer. Ex: rollups, Hedera Hashgraph



  • Set up call with Revenant / Chupa
  • Ask folks to bring ideas around modularization
  • Find minimum viable components 
  • Catalog all components 
  • Scope MVP (minimum viable platform)
  • Test approach on one or more modules (see the timing and scoping/lessons learned)
  • Extrapolate out rough timeline on MVP scope and modules timing vs the catalog 






  • EVM Engine
  • Consensus Protocols
  • Peer-to-peer communications
  • JSON-RPC Communications
  • Data Storage
  • Block Production



Modularization technical braindump


(brainstorm) Modular Client for the Merge - draft

  1. Approach #1 - “Besu as Debian” - create distribution artifacts from modules
  2. Approach #2 - “Execution Engine Shell” - create an execution engine project leveraging hyperledger/besu modules
  3. ?


Synthesis of the conversation on Discord

Participants: Gary, Danno, Sajida, Tim

https://discord.com/channels/@me/804833347816914944/885936379618545724

(add notes)


Debrief of meeting with Erigon

Meeting #1 - 9/14/21

Participants: Alexey, Madeline, Sajida

  • Sentry component
  • C++ and Rust implementations are in progress
  • Each reimplementation takes less time than the previous one
  • Contrary to popular belief, it’s not hard to rewrite things from scratch. Might even be easier.
  • Alexey wants to start a Java reimplementation, and they don’t have anyone to do it in Java
  • Besu in 2-3 years: he sees a dead end for the monolithic model used by Besu, Nethermind, OpenEthereum
  • Geth snapshotter; Geth realised that traversing the tree
  • Collaboration would be:
    • Join their family of products
    • Reimplement core products like the EVM
    • Make them compatible with their other components
    • That would be a 4th compatible implementation in their portfolio



  • Erigon is funded by the EF, Gnosis, and small amounts from various organizations
  • They are hiring for the Go implementation. They have 2 active devs and might bring in a couple of others; it is a small team
    • C++ team: 4-5 people
    • Rust team: 2.5 people, some of whom are not employed but contribute part time


  • Cycles of modularization
    • 1st rewrite: 2017 - 4 years or 3.5 years
    • 2nd rewrite: May 2020 - C++ with a couple of people. After a year and a half they have almost finished the core component, which might be roughly done by the end of 2021
    • 3rd rewrite: Jan 2021 - Rust. Could reach the same level as the others by the end of 2021, so 1 year; Rust will be ahead of the C++ implementation
    • He predicts about 6 months for Besu, because we already have a codebase and would not start from scratch.
  • Should we join the effort? Should we invest in Erigon?




Meeting #2 - 10/6/21

Participants: Artem +1, Gary, Sajida

  • Starting from scratch is easier than refactoring existing code into the Erigon architecture.
  • Artem used to work on OE and has now been working on Akula (Rust), mainly alone, for 4 months, and it’s already passing consensus.
  • Modularization
    • Breaking the monolith - reusable parts: tx pool, consensus engine, sync module
    • The sync module is interesting on its own, to process by block or by stage
    • Might require a change of database: staged sync requires an MVCC database (LMDB, Badger LSM-based, B+2
    • it might be possible to start module by module. 
    • Data model could be a good start (might reduce space consumption). 
    • We already have a pluggable storage engine that we could
    • The interface of the pluggable storage resembles MDB/LMDB/MDBX
    • The peer-to-peer part (sentry) of Geth was re-used by Erigon, but the plumbing is totally different
    • Erigon is heavily optimized toward sequential writes. Random reads / Sequential write - very fast for MDBX.
  • An EVM bug exploiting a hole in memory was triggered by a tx that was broadcast everywhere and affected all clients (even on Binance Smart Chain): it spread like wildfire.
  • If they have a Clique Ethereum network, they can fork the module, modify it, connect it over JRPC, and connect the rest of Erigon. You only have to invest time in creating one module and you get the rest of the client for free.
  • Erigon can be run as a Kubernetes cluster.
  • The transaction pool should have the EVM inside and be able to be part of the consensus. It is a security parameter: if there is a DoS attack, the tx pool should guard the blockchain from it. Multiple tx pools could coexist: one for MEV, one maybe getting DoS'd in this scenario, and one running smoothly. Then you can pick the one that can do the work. Any tx pool could go down while the node is still up. The node is behind the “forest” of other P2P nodes. Ex: Besu sentry (x10 instances): all sentries go down, but the core that runs the database/blockchain and stores the chain stays up.
  • The idea of modularity; you make the core, the spec, and the rest is up to you.
  • Andrew: maintainer of yellow paper, has an enum that maps to yellow paper parts. He runs silkworm - very good resource to start the work. Should be interesting to Justin.
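
The idea above of several coexisting tx pools, where any pool can go down while the core stays up, could be sketched as follows. This is an illustration under assumptions, not Erigon's or Besu's design: `TxPool`, `SimpleTxPool`, and `PoolSelector` are hypothetical names, and the health flag stands in for real failure detection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical pool abstraction, for illustration only.
interface TxPool {
    String name();
    boolean isHealthy();
    List<String> pending();
}

class SimpleTxPool implements TxPool {
    private final String name;
    private final List<String> txs = new ArrayList<>();
    private boolean healthy = true;

    SimpleTxPool(String name) {
        this.name = name;
    }

    void add(String tx) {
        txs.add(tx);
    }

    // Simulates the pool being taken down, e.g. by a DoS attack.
    void markDown() {
        healthy = false;
    }

    @Override public String name() { return name; }
    @Override public boolean isHealthy() { return healthy; }
    @Override public List<String> pending() { return new ArrayList<>(txs); }
}

// Lives in the core: it survives pool failures and picks the first healthy pool.
class PoolSelector {
    private final List<TxPool> pools;

    PoolSelector(List<TxPool> pools) {
        this.pools = new ArrayList<>(pools);
    }

    // Empty when every pool is down; the selector itself keeps working.
    Optional<TxPool> pick() {
        return pools.stream().filter(TxPool::isHealthy).findFirst();
    }
}
```

If the MEV pool is marked down, `pick()` falls through to the next healthy pool, and if all pools are down it returns an empty `Optional` rather than failing, which mirrors the point that the core keeps running.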




  • Very fruitful to invest R&D in this: lots of work has already been done, so the reimplementation cycles are getting shorter
  • Refactor: use case -> modularity for L2, rollups, pluggable, MEV
  • Argument:
    • Database: we (Besu) have a trie within a trie in the MPT (access complexity is multiplied), so just switching to another data model would increase our performance.
    • Erigon threw out the MPT (Merkle Patricia Trie) completely and computes the state root post execution; other than that, the state is flat. Plain state table: value = account, key = account address. We are almost there with Bonsai on the flat storage, but we should work on simplifying
    • Using JRPC certainly adds communication overhead, but it brings so much value in other places that they (Erigon) can live with it. JRPC could of course be replaced by something else, like jar(?)
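
To make the "plain state table" argument concrete, here is a minimal sketch assuming an in-memory map in place of the on-disk store. `Account` and `PlainState` are hypothetical names, the account is simplified to nonce and balance, and state-root computation is deliberately left out.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Simplified account; a real account also carries a storage root and a code hash.
final class Account {
    final long nonce;
    final BigInteger balance;

    Account(long nonce, BigInteger balance) {
        this.nonce = nonce;
        this.balance = balance;
    }
}

// Flat "plain state" table: key = account address, value = account.
// A HashMap stands in for the underlying key-value database.
class PlainState {
    private final Map<String, Account> table = new HashMap<>();

    void put(String address, Account account) {
        table.put(address, account);
    }

    // A single direct lookup by address; no trie nodes are traversed.
    Optional<Account> get(String address) {
        return Optional.ofNullable(table.get(address));
    }
}
```

The design point is that reading an account is one key lookup instead of a walk through trie nodes; in this model the state root would be computed separately, after execution.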


