  1. Reduce gossip flakes
  2. Address time.Sleep usage in gossip tests
  3. Remove data races

Gossip Flakes / Connection Management

This is less a design document and more a "design discovery". We don't know why a lot of this is the way it is. There don't seem to be any contemporaneous records from three years ago when most of it was written.

Gossip Flakes

The general concept of gossip is that nodes pass data around. Delivery is best-effort: a given message is not guaranteed to get through.

Unfortunately, a lot of the tests behave as if the implementation can never fail.


Why are we tracking connections at all? Should gRPC be handling that? Why do we try to keep exactly one connection between any two peers with different PKI-IDs (pkiid != pkiid)?

Connection Management

Jira:

  • FAB-15570
  • FAB-14936, dup as FAB-14960
  • FAB-14048
  • FAB-13997
  • FAB-13539
  • FAB-15486

TestBasic FAB-13539
Assignee:

  • Bi-directional connections: one side hangs up on the other. There is additional connection racing similar to the TestAccept case below, which is uni-directional.


TestAccept:
Assignee: Swetha Repakula

  • The same client makes two connections, and the first connection is not fully serviced before the second one comes in, causing a race
  • Possible solution:
    • Separate the outgoing and incoming versions of the connection, so that if we see a second incoming connection we don’t cancel the first incoming one
    • https://jira.hyperledger.org/browse/FAB-15486
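
The possible solution above can be sketched as a connection store keyed by both the remote peer and the direction, so that an incoming connection only ever competes with another incoming one. This is an illustration under assumptions, not the Fabric connection store: `connStore`, `direction`, `register`, and `lookup` are hypothetical names, the `int` handle stands in for a real connection, and rejecting the second connection (rather than queueing it) is an assumed policy.

```go
package main

import "sync"

// direction distinguishes connections we dialed from ones we accepted.
type direction int

const (
	outgoing direction = iota
	incoming
)

// connKey identifies a connection by remote peer and direction.
type connKey struct {
	pkiID string
	dir   direction
}

// connStore tracks at most one connection per (peer, direction) pair.
type connStore struct {
	mu    sync.Mutex
	conns map[connKey]int // value is a hypothetical connection handle
}

func newConnStore() *connStore {
	return &connStore{conns: make(map[connKey]int)}
}

// register records conn unless one already exists for the same peer and
// direction. It never cancels the connection that is already registered
// and never touches a connection of the other direction.
func (s *connStore) register(pkiID string, dir direction, conn int) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	k := connKey{pkiID: pkiID, dir: dir}
	if _, exists := s.conns[k]; exists {
		return false // assumption: the caller closes the rejected newcomer
	}
	s.conns[k] = conn
	return true
}

// lookup returns the registered connection for a peer and direction.
func (s *connStore) lookup(pkiID string, dir direction) (int, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	c, ok := s.conns[connKey{pkiID: pkiID, dir: dir}]
	return c, ok
}
```

With this shape, a second incoming connection from the same client leaves the first one registered, and an outgoing connection to that client is tracked independently.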

...

  • stopFlag (atomic) and stopCh both track whether the connection is being closed. We should use one or the other, not both. Given how most of the code is written, we need stopCh, so we should get rid of the toDie() function and select on stopCh as appropriate.
  • When two peers try to connect to each other at the same time, both connections end up being closed. Solution: make a deterministic decision about which peer should drop its connection and which should keep making the connection. The logic is in getConnection. Assignee: Swetha Repakula
  • Single connection, single direction
    • A single connection is good for nodes behind a firewall
    • Single direction (gRPC uses either a client stream or a server stream) was the initial implementation. Changing that now would require a large refactor and may make the peers unable to talk to peers of older versions.
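
The first two points above can be sketched together. This is a minimal illustration under assumptions, not the existing code: the `conn` type, `send`, and `keepOutgoing` are hypothetical names, and comparing PKI-IDs byte-wise is just one possible deterministic rule (it assumes the two peers have different PKI-IDs, and the direction of the comparison is arbitrary).

```go
package main

import (
	"bytes"
	"sync"
)

// conn is a hypothetical stand-in for a gossip connection.
type conn struct {
	stopCh   chan struct{}
	stopOnce sync.Once
	outbound chan []byte
}

func newConn() *conn {
	return &conn{stopCh: make(chan struct{}), outbound: make(chan []byte)}
}

// close signals shutdown by closing stopCh. sync.Once makes repeated
// calls safe, so no separate atomic flag is needed.
func (c *conn) close() {
	c.stopOnce.Do(func() { close(c.stopCh) })
}

// send blocks until msg is handed off or the connection is stopped.
// Selecting on stopCh replaces a polled toDie()-style check.
func (c *conn) send(msg []byte) bool {
	select {
	case c.outbound <- msg:
		return true
	case <-c.stopCh:
		return false
	}
}

// keepOutgoing reports whether this peer should keep the connection it
// dialed when both peers dialed each other at the same time. Both sides
// evaluate the same comparison with the arguments swapped, so exactly
// one of them keeps its outgoing connection (assumes distinct PKI-IDs).
func keepOutgoing(selfPKIID, remotePKIID []byte) bool {
	return bytes.Compare(selfPKIID, remotePKIID) > 0
}
```

Because each peer can compute `keepOutgoing` locally from the two PKI-IDs, no extra round trip is needed to agree on which connection survives.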


DeMultiplexer

Jira:

  • FAB-13377
  • FAB-14956

The code isn't safe to call multiple times. Rewritten in whole, with the same interface.
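
To illustrate what "safe to call multiple times" can look like for a demultiplexer, here is a simplified sketch in which a mutex and a closed flag guard every operation. It is not the rewritten Fabric code: `deMux`, `addChannel`, `deMultiplex`, and `close` are hypothetical names, the buffer size is arbitrary, and dropping a message when a subscriber's buffer is full is an assumption made to keep the example short.

```go
package main

import "sync"

// subscriber pairs a predicate with the channel fed by it.
type subscriber struct {
	pred func(msg interface{}) bool
	ch   chan interface{}
}

// deMux is a hypothetical, simplified demultiplexer.
type deMux struct {
	mu     sync.Mutex
	closed bool
	subs   []subscriber
}

// addChannel registers a predicate and returns the channel that receives
// matching messages. After close it returns an already-closed channel.
func (d *deMux) addChannel(pred func(interface{}) bool) <-chan interface{} {
	d.mu.Lock()
	defer d.mu.Unlock()
	ch := make(chan interface{}, 10)
	if d.closed {
		close(ch)
		return ch
	}
	d.subs = append(d.subs, subscriber{pred: pred, ch: ch})
	return ch
}

// deMultiplex delivers msg to every matching subscriber. It is a no-op
// after close. A full buffer drops the message instead of blocking while
// the lock is held (assumption for this sketch).
func (d *deMux) deMultiplex(msg interface{}) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.closed {
		return
	}
	for _, s := range d.subs {
		if s.pred(msg) {
			select {
			case s.ch <- msg:
			default:
			}
		}
	}
}

// close closes every subscriber channel exactly once; later calls return
// immediately.
func (d *deMux) close() {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.closed {
		return
	}
	d.closed = true
	for _, s := range d.subs {
		close(s.ch)
	}
}
```

Holding the same lock in `deMultiplex` and `close` also rules out sending on a channel that another goroutine has just closed.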