Status

PROPOSED 

BlockingNone blocking, but Besu docs and Fabric docs are regularly impacted
Outcome
Minutes Link

Overview of Proposal

Multiple projects have faced issues with new contributors not correctly adding DCO sign off lines in their contributions, and when maintainers attempt to help the new contributors fix the issue the contributor withdraws from the process either implicitly (by ghosting) or explicitly (by saying this is too complex and then withdrawing).  This is most pronounced in the documentation repositories.

The proposed solution would allow for an author to make a DCO declaration outside of the git commits for workflows that support it. A comment in a Github PR would be an example.  Rather than asking the contributor to re-submit their git commits they can assert the DCO in the comments and the maintainers would squash merge the commit.

This is what Chef does with their DCO statements (see the next to last Q&A question).

Formal Proposal(s)

If a contributor does not make a DCO statement in each of their commits in a proposed change (via GitHub PR or other similar workflow) they can make a declaration in the comments of the review stream.  Ideally adding the "Signed-off-by: John Doe <john.doe@example.com>" line in the comment stream or another statement such as "this is my work and it complies with the DCO."  It is good form for the maintainers to directly link https://developercertificate.org/ when requesting the contributor provide an in-review stream sign-off. One such claim in a PR will be sufficient for each commit in the PR up to the point of attestation.

When merging such a commit the maintainer will either need to rebase the commit so that all commit entries have a DCO or perform a "squash" commit and enter the DCO. The "signed off by tag" in one of two forms:

  • If the contributor provided a "signed off by" tag in the comment stream that tag alone may be used.  The maintainer may optionally add their own tag in addition.
  • Otherwise the maintainer will need to provide their own signed off by tag (pursuant to clause c) in one of two forms:
    • Signed-off-by: Jane Maintainer <jmaintainer@example.org> 
    • Signed-off-by: Jane Maintainer <jmaintainer@example.org> on behalf of John Doe <john.doe@example.com> 

This proposal only applies to commits submitted for review.  All commits in branches that releases of any caliber are built off of (main/master, release branches, patch branches, alpha/beta/nightly) must still contain a signed off by tag in each commit.

Linear merges or linear rebases should not be done for commits covered by this policy.  Squash merges are recommended instead.

Tooling Impact

The current configuration of the DCO bot will need to be adjusted.  The quickest change is to make the DCO bot check not block merging, although that introduces risk that invalid merges may occur.  A more involved change would be for the DCO bot to check merge comments for a DCO tag.  If the maintainer accepts a non-tag statement they should reply with an "on behalf of" tag in the stream which the DCO bot can pick up.

Projects should be encouraged to add DCO checks into their CI build process.  Such a check would cause the CI to fail to produce a usable output if the DCO validation of the commits fails. These checks are not as universal as the DCO bot. If a project can demonstrate that the DCO checks are caught by their CI system then they would be allowed to drop DCO but errors to advisory errors in the Github PR flow.

Action Items

  • Type your task here, using "@" to assign to a user and "//" to select a due date

Reviewed By


20 Comments

  1. I like the fact that Chef limits this to only obvious fixes. I like that they define what an obvious fix is in that link. By limiting to obvious fixes, this helps protect the community and the code from risks due to inappropriate contributions of the intellectual property of others. If we are moving forward with this, we should have a similar requirement.  Also, chef seems to require the maintainer to rebase the branch and add an appropriate attestation to each commit. It seems like the devil is in the details of how this would be implemented. I suggest adding some additional commentary to how you expect this to be done.

    1. There are actually two rules.  The obvious fix rule doesn't require a DCO sign off.

      The rule I was referencing in chef was this one (quoted because it doesn't have a deep link) - https://blog.chef.io/introducing-developer-certificate-of-origin

      Rewrite my commit history?!  I am not a git guru, can you just accept my pull request?

      Every commit, that does not meet the criteria for an obvious fix, must have a Signed-off-by line.  Chef maintainers, listed in the project’s MAINTAINERS.md file, may ask you to attest to the DCO by way of a comment on your PR.  After you have done so, a maintainer will rebase your branch adding an appropriate attestation to each commit.  The resulting commit message would include something along the lines of:

      Signed-off-by: Jane Doe <jane@example.com> on behalf of Joe Smith <joe.smith@email.com>

      The contributor still provides a DCO, it is in the comment stream instead of in the commit stream.  They still assert that the origin fits the standards in the DCO, and such initial attestations are done in a traceable means, in this case a github commit.

      1. That is the one that I was looking at too. I am suggesting that the write up in the proposal does not include this language, and I believe it should. Specifically, the "obvious fix" being the only thing that does not require sign-off and what an obvious fix is. 

        1. I would prefer it to be a separate proposal, they really are two orthogonal issues.  Setting standards for what is an obvious fix is a separate possibly deep discussion.  Even though it impacts the same tooling in the end.

          1. I think that are one and the same. I would not support this proposal unless it is limited to "obvious fixes" since there is too great of a chance of bringing risk into the project by allowing inappropriate contributions of the intellectual property of others.

  2. As commented in chat, if DCO is required on the commits, it is important that tooling enforces that DCO is provided prior to the merge. If an invalid (no DCO) merge occurs, what would happen? Rewriting git history to fix missing DCOs on commits is not a good solution.

    1. For Besu if a non-DCO merge happens to main then the commit is rolled back and then replayed with proper DCO signatures.  Because we use squash and merge occasionally maintainers delete all the DCOs from the concatenated change lists when cleaning it up. When this happens the CI tooling will not produce any actionable artifacts as DCO is baked into the CI pipeline.

      In blockchain terms a commit without a DCO becomes a bad block and the chain rejects that block and builds off of the last valid block.  The chain is finalized on epoch boundaries, where such epoch boundaries in Besu are bi-weekly releases.

      1. occasionally maintainers delete all the DCOs from the concatenated change lists when cleaning it up

        This is a concerning statement. Why are maintainers deleting all DCOs? They should leave non-duplicates if they want to clean it up, but removing all means that you are losing the history of the DCO sign-offs when the originating branch is deleted.

        1. We don't lose the history, The GitHub PRs the squash origniated from are still in place, even if the original user deletes their branch.

          1. So history is not lost, but much harder to see. For example, if I wanted to write a script that used `git log` to get all of the DCO sign-off lines, I would no longer be able to do that and get an entire list of the people who have contributed.

            1. As I said in a parallel comment, when doing the squash merge the maintainer is supposed to retain all the DCO lines.  When doing their own commits some maintainers are over-zealous in their deletions.  Those are the situations I have observed it happening in Besu.  I am unaware of bad merges happening with non-maintainer contributions, they've all been self merges.

        2. Also, such deletions are accidental and policy is all unique DCO lines should remain when writing the squash comment.

  3. Comments as below:

    • I am with Troy, tooling should not allow to merge (including cases where it is unintentional) without a DCO sign-off. Rewriting history and missing out DCO sign-offs would be risky legally.
    • If maintainers have access to write/edit the commit messages, it is ok for maintainers to do it before merge. However, I have a concern on contributors adding a comment in PR, it isn't same as adding a line the commit description. PR comments are easily modifiable. Somebody can delete these comments. What if the developer/contributor claims not agreeing to DCO status once the commits are merged. This would require additional protection on who can delete/edit comments.
      • Github does not have pre-merge hooks for Squash commits.  I really wish they did.
      • contributors can similarly force-push to a PR to overwrite what is there. Only linear merges prevent this. And requiring linear rebasing by contributors will result in more contributors abandoning commits.
      • Also, comment edits are tracked by github - https://docs.github.com/en/github/building-a-strong-community/tracking-changes-in-a-comment - they cannot claim they never gave DCO attestation without leaving a trail, it would require collaboration of a maintainer.  However deletes do leave a mark.

      My main question is how does this square with clause C in the DCO.  Seems like something that should be referred to the legal committee if we are concerned about legal risk instead of speculating in comments.

      1. Sure, I agree. Let's ask Hyperledger team to check this with legal.

  4. Do we have a response yet from legal on this?  I'm worried that this is something we should not be answering ourselves.

    1. I didn't refer anything for input from LF legal yet as it seemed like the requirements and design were still being settled. If those are mostly settled I can pass it along.

      One point I don't entirely understand - PR comments aren't tracked in the git repo itself, only as metadata tracked in Github. Currently, the git repo contains all that is needed to audit for DCO (in theory) and so is not dependent upon GH data. Do I have that right? So audit tools would need to be GH aware, and should we ever have to move off of GH this means we need to pull forward PR data in order to remain auditable. That seems like a high albeit long-tail risk even if "LF Legal" says that's acceptable

      All of these shenanigans for lack of a simple by-default-set DCO grant in the Github pull request user interface, sigh.

      Brian

      1. I think moving away from github would be really hard, and we'd probably need to pull all of this data for other reasons as well.

        But yes, a simple by-default-set DCO in github would make this much better.

      2. The entire PR itself is github metadata.  All the git commits constituting the pr are only connected via GitHub when a squash commits are used.  Github does include the PR number in the title of the commit.   So if you clone and host the repo in some other git server you don't have any of the PR metadata to go with it, and that is the state today.  This proposal requires that all commits that would be in such a clone require the signed-off-by attestations in each commit for that clone. Rebase and direct merges bring in commits from such PRs however squash commits create a new "roll-up" commit.

        This proposal only allows for a separate workflow in github metadata that does not join that cloned repo, and the commit that joins the cloneable commits must still keep a valid signed-off-by line.