Status

PROPOSED 

Outcome
Minutes Link

What is the policy on accepting DCOs from contributors operating under pseudonyms?

Some possibilities:

  • Always.
  • Never; DCO fields must have "Real Names".
  • If the real identity is known to the TSC, it is acceptable.
  • If the real identity is known to a maintainer of the project, it is acceptable.
  • If the real identity is known to the maintainer merging the commit, it is acceptable.
  • If the commit contains a signed-off-by for someone using their real identity and who knows who the person represented by the pseudonym is, it is acceptable.
  • If the commit contains at least one signed-off-by by someone using their real identity, it is acceptable.
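The last two options depend on reading the sign-off trailers out of a commit message. A minimal sketch of that mechanical step follows, assuming the conventional trailer form "Signed-off-by: Name <email>". The function names and the `known_emails` allowlist are hypothetical, introduced only for illustration. Matching an email address does not establish anyone's real identity; that judgment stays with whoever maintains the list.

```python
import re

# Matches one DCO trailer line, e.g. "Signed-off-by: Jane Doe <jane@example.com>".
SIGNOFF_RE = re.compile(
    r"^Signed-off-by:[ \t]*(?P<name>[^<>\n]+?)[ \t]*"
    r"<(?P<email>[^<>\s]+@[^<>\s]+)>[ \t]*$",
    re.MULTILINE,
)


def signoffs(message):
    """Return (name, lowercased email) for every sign-off trailer in a commit message."""
    return [
        (m.group("name"), m.group("email").lower())
        for m in SIGNOFF_RE.finditer(message)
    ]


def has_known_signoff(message, known_emails):
    """True if at least one sign-off uses an email from the (hypothetical) allowlist.

    The allowlist stands in for "identities known to the project"; how it is
    built and verified is a policy question, not something this code decides.
    """
    known = {e.lower() for e in known_emails}
    return any(email in known for _, email in signoffs(message))
```

For example, a commit signed off by both a pseudonymous author and a known maintainer would pass `has_known_signoff` as long as the maintainer's address is on the list, which corresponds to the last option above.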

Background

Blockchains, cryptocurrencies, and cryptography in general have a long history of people contributing and operating under pseudonyms: from Captain Crunch to Satoshi Nakamoto. Some contributors prefer to operate under a pseudonym because they have been, or have reason to believe they would be, subject to online harassment.

Regardless of the reasons contributors have for "masking" themselves, we would benefit from clarity on how to handle contributions from anonymous and pseudonymous contributors. Are they required to "unmask", and if so, to what degree and to whom (completely, or only to trusted parties)?

To complicate things, there are also individuals who operate under pseudonyms that appear and act like real identities. If the policy is anything but "always", what duties do maintainers have when they discover that a contributor whose pseudonym passes as a real identity is contributing or has contributed?

A related issue is whether maintainers may operate under a pseudonym and, if so, what degree of unmasking would be required.

Reviewed By

13 Comments

  1. I think that the whole purpose of the DCO is to be able to trace a commit back to its author, who is asserting that they have the right to make the contribution, etc. IMO the signed-off-by needs to be a real email address that we can use to contact the submitter should we need to.

  2. You can operate under a pseudonym AND have a real email address. Case in point is Satoshi herself who was reachable using an email address. A person can assert that they have the right to contribute the code, but falsely. There is not much recourse to this state of affairs.

  3. I followed up with another email addressing some of this.  Is there any way we could get someone from legal to address DCO stuff?

    1. Yes, I brought it up and Brian's going to ask the LF lawyers for their take on this.


      1. Thanks a lot!  Do we have a timeline on when we might hear back?

        1. No but let's see what Brian reports back.

  4. LF legal looked at the item and were wondering what underlying need was motivating the ask.  "In the Linux kernel for example the maintainers are expected to know the identity of anyone whose patches they're contributing. The real issue is if there was ever a legal matter, would the person be identifiable and available because we have their identity."  I was going to bring that question back to here but fell behind. 

    The risk of taking a DCO from someone that can't be identified and reached is that a challenge to the provenance of that code can't be answered - basically anyone could claim "that was mine, you accepted stolen property" and there'd be no one to refute that or take the blame for it.  In which case there'd be a very difficult decision - fight in court without any testimony that the code wasn't stolen, or purge the code and require a clean-room rewrite.  Those seem like awful paths to have to take, for the price of more vigilance up front.

    Given this is a matter of legal liability, it's not a decision the TSC can make; at best it could recommend a change to the Governing Board and LF, but it's the GB and LF that need to weigh that risk as they're the ones who would bear the costs of any legal action.

    I wasn't on Hyperledger on day zero, but one thing I recall hearing is that one reason it was formed was to provide a space safe from anonymous contributors who may come along later seeking rent.  I remember specifically hearing that if it turned out Craig Wright was Satoshi, then the Australian patents he (much later) filed on Bitcoin architecture could be leveraged against anyone in the Bitcoin community, in part because the license on the code was MIT and thus came with no patent grants.  I think we want to avoid that risk. 

    However, I know the term "real identity" is highly problematic.  We aren't storing Social Security numbers or DNA or anything like that.  The DCO is attached to the commit or PR, from which we can get the GitHub account name, but that doesn't necessarily come with a real name or even a contactable email address, which is also a problem when we pull together the voter lists for the TSC election.  Are each of you sure you'd be able to get in contact with all submitters of PRs you've accepted?  Even good, real people have their email addresses go bad or change their names and then can't be reached.  So this isn't about providing a hermetic seal around the problem; it is more about showing good faith and intent in ensuring we don't receive stolen or patent-covered code.


    I'll try and get more clarity.  Til then, please document any instances where people refuse to offer PRs because they don't want to be contactable after the fact.

  5. If LF/Hyperledger has no interest in patenting any of its code, then I suggest that a simple way to preemptively disclose pertinent information is to require the committer to briefly describe what the code is supposed to do.  Such disclosure makes the code prior art and negates any attempt to claim novelty as part of a patent application.  A parallel is when an author discloses his/her invention in an academic publication prior to any patent submission; the invention automatically becomes "state of the art" information that anyone can use.

    Kent Lau.

    LLB,LLM (Intellectual Property).

    Hong Kong.

    1. This wouldn't address the case where a patent was filed before the code was contributed though, right?

      1. Making it easy for the Patent Examiner to find and understand the breadth of hyperledger code in the public domain is a useful strategy to make it harder for the patentee to claim novelty.  A search by the Patent Examiner will reveal areas of contention and the onus is on the patentee to satisfactorily explain to the Patent Examiner exactly how his/her claims are novel with respect to Prior Art.  Conversely, there is no legal requirement for LF to sustain or explain any commit message or other disclosure.  My suggestion is to maximise the barrier for any malicious actor to be granted a relevant patent.


        Regards


        Kent


        Hong Kong

    2. IANAL, but AIUI, the US recently moved from first-to-invent to first-to-file, which means even an openly disclosed invention developed by person A can be patented by person B if person A doesn't do it first (with the inevitable caveats).  Furthermore, how does a maintainer accepting the contribution know that all possible patentable claims have been declared in the commit message of "what the code is supposed to do"?  The threat here is from malicious contributors who intentionally don't disclose patents or make themselves uncontactable, so adding a new reporting requirement on contributors doesn't seem effective.

      1. There are 3 features of patent law to bear in mind:


        1)  Grace Period is a set time limit within which you can still file a patent after public disclosure of your invention.  There is no Grace Period in a first-to-file system.  Therefore, you must submit your patent application BEFORE public disclosure.


        2)  Novelty is the concept that you cannot patent anything already in the public domain (also known as Prior Art).  Public disclosure destroys novelty in a first-to-file system.


        3)  Scope is the breadth of claims for which you are applying for patent protection.  It is impossible to know all patentable claims, although patentees do try to maximise the scope.


        In response to your particular points:


        i)  Openly disclosed invention destroys novelty, so person B cannot get a patent for Prior Art by person A, unless person B makes a new and different claim or an improvement on the Prior Art.


        ii)  Maintainer does not need to know all patentable claims because we are creating a "spoiler system" and the Law of Obviousness increases the zone around the public disclosure.  The maintainer need only speculate (rightly or wrongly) about the function of the code, since LF is not applying for a patent, and no actual evidence is needed.  


        iii)  I agree that it is difficult to preempt a malicious actor submitting a patent before committing that code.  However, a "spoiler system" does not need the malicious contributor to be contactable, or even identifiable.


        I submit the above points for your consideration and commend a "spoiler system" of public disclosure as an effective "shield" with minimum effort and overhead.


        Regards,


        Kent.


        Hong Kong.

  6. In short how can you make the DCO stick?

    1. DCO signer identity is unknown
    2. DCO signer signs without owning rights to the code (this can happen when a mass of code with multiple authors is signed off by one person after a squash); it happens especially when a new project with existing code joins the system.

    For 1. Suggestions: Hart/Chris - make it correspond to the LFID, which has many hidden and verified attributes that are not public but known only to LF (see below for SSI).

    For 2. There is no known antidote (yet). Going after the DCO signer is possible with 1, but then what is the recourse?

    For 1. and 2., Kent G Lau suggests creating prior art (Brian Behlendorf has already raised the first-to-file objection to this, but the prior art defense does trump it, as noted by Kent G Lau), IF the implementation is not patented yet. Pieces of code cannot be patented, only copyrighted; nor can ideas, only implementations or inventions.

    Finally, MOSS solutions can look for similarity in software (usually used for anti-plagiarism in universities); I don't know how this would work or how accurate it would be.

    In the end this remains a conundrum. Basically the aim is to start off with clean unencumbered code (Apache 2) and accept only unencumbered code signed off properly, with properly publicly disclosed details.

    On another note: I wonder if these verifiable claims can somehow be issued by the signer and held by LF (Hyperledger) for later verification using Self Sovereign Identity and DIDs; a question for Nathan George. Let us look at this as a use case for SSI, though I doubt we will solve the DCO issue in the short term with this solution.