Sources of data to support the various aspects of project health

  • Automation friendly
    • github (contributors, code activities, PRs, issues)
    • discord (engagement level, responsiveness)
  • Manual collection
    • information contributed by project maintainers and members (technology usage)
    • meet-ups and other outside evangelism
    • translations of collaterals to languages other than English
    • quality of docs
    • are there project meetings in place on a regular basis
    • number of research publications it generates
    • performance and reliability testing data
  • Survey-only (?)
    • community involvement in roadmap definition
    • can new ideas be accommodated
    • time to respond to questions
    • (we need to be careful not to make this type of input subjective; on the other hand, "feeling" is very important to how much people want to keep engaging)
    • (external tools: hackerXXX, etc.; are the reported issues being treated properly?)

Legend:

  • Easy to collect
  • Harder to collect
  • Hardest to collect

Focus | Aspect | Detail | Supporting Data | Actions
Community / Growth: new interested individuals and conversion to contributor
  • the number of contributors to the code base (github PRs)
  • the number of contributors to requirements (github issues)
  • the number of contributors to design discussions (discord)
  • people working on docs (github PRs?)
  • page views of the contributor's guide
  • meet-up, outside evangelism (people leading meetups in their region)
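As a rough sketch of the first bullet: distinct PR authors can be counted from pull-request records. The dict shape below mirrors items from GitHub's pull-request list endpoint (`user` -> `login`); fetching and paging through the live API is left out, and the sample data is illustrative.

```python
def distinct_contributors(pulls):
    """Count unique PR authors; skips PRs whose author account was deleted."""
    return len({p["user"]["login"] for p in pulls if p.get("user")})

# Illustrative sample shaped like GET /repos/{owner}/{repo}/pulls?state=all items.
sample = [
    {"user": {"login": "alice"}},
    {"user": {"login": "bob"}},
    {"user": {"login": "alice"}},  # repeat author counted once
    {"user": None},                # author account deleted
]
print(distinct_contributors(sample))  # 2
```

The same counting applies to issue authors (contributors to requirements) by swapping the input for issue records.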



Diversity: no single organization keeps the project alive
  • the number of organizations contributing to the code base and roadmap (github PRs, issues, discord discussions)
  • contributions to RFC/improvement requests
  • should be recorded as the percentage of contributions from each org
  • language translations of the collaterals
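The per-org percentage in the bullets above could be computed as follows; mapping a GitHub handle to an org (e.g. via OSSInsight) is assumed to have happened upstream, so the input here is already (org, count) pairs.

```python
from collections import Counter

def org_share(contributions):
    """contributions: iterable of (org, count) pairs, e.g. merged PRs per org.
    Returns each org's share as a percentage of the total."""
    totals = Counter()
    for org, n in contributions:
        totals[org] += n
    grand = sum(totals.values())
    return {org: round(100 * n / grand, 1) for org, n in totals.items()}

print(org_share([("OrgA", 60), ("OrgB", 30), ("OrgC", 10)]))
# {'OrgA': 60.0, 'OrgB': 30.0, 'OrgC': 10.0}
```

A single org near 100% would flag the diversity risk this row describes.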


Retention: interesting/useful projects attract contributors; healthy projects retain them
  • active contributor longevity (github PRs, discord)
  • conversely, the contributors who have become inactive should also be recorded
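One way to record both sides of this row, active contributors and those who have gone quiet, is to split on the date of each person's most recent contribution. The 180-day threshold is an arbitrary assumption for illustration, not a project standard.

```python
from datetime import date

def split_by_activity(last_seen, today, inactive_after_days=180):
    """last_seen: {contributor: date of most recent contribution}.
    Returns (active, inactive) contributor lists, sorted for stable output."""
    active, inactive = [], []
    for person, last in last_seen.items():
        if (today - last).days > inactive_after_days:
            inactive.append(person)
        else:
            active.append(person)
    return sorted(active), sorted(inactive)

seen = {"alice": date(2023, 5, 20), "bob": date(2022, 3, 1)}
print(split_by_activity(seen, date(2023, 6, 1)))  # (['alice'], ['bob'])
```

Longevity itself would use the gap between each contributor's first and last contribution instead.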



Maturity: reflects the lifecycle phase a project is in. This gives context to the other aspects: what may be a red flag for a mature project may not be one for a young project.
  • when was the first commit
  • frequency of releases (more mature projects have more regular cadence and have a higher success rate to achieve the cadence)
  • docs (high-quality docs reflect higher maturity)
  • following best practices (security, test coverage, guidance on becoming a contributor etc.)
  • are the main committers involving the community in defining the roadmap?
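The release-cadence bullet above can be made concrete: given release dates (e.g. from GitHub's releases API), a steady mean gap with low spread suggests a regular cadence. This is a sketch; what counts as "regular" is left to the reader.

```python
from datetime import date
from statistics import mean, pstdev

def release_cadence(release_dates):
    """Returns (mean gap in days, spread of the gaps) between consecutive
    releases; mature projects tend to show a small spread around a steady mean."""
    ds = sorted(release_dates)
    gaps = [(b - a).days for a, b in zip(ds, ds[1:])]
    return round(mean(gaps), 1), round(pstdev(gaps), 1)

dates = [date(2023, 1, 1), date(2023, 4, 1), date(2023, 7, 1), date(2023, 10, 1)]
print(release_cadence(dates))  # roughly quarterly: gaps of 90, 91, 92 days
```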


Friendliness: to new contributors and new ideas
  • number of good-first-issues, help-wanted
  • new contributors onboarded
  • can new ideas be accommodated, even if that may lead to forking of the code base (how far are core committers willing to go to help)
  • is a project defining a consistent way for new ideas and issues to be raised (github issues, etc.)
  • are there PRs without comments?
  • time to resolve PRs and issues, new vs. core contributors (github)
  • no special treatment for core committers' PRs; ideally all PRs receive equal attention and are addressed in a well-defined order that takes severity, importance, and creation time into account
  • are there project meetings in place on a regular basis
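The "PRs without comments" check above is straightforward to automate. The `comments` and `review_comments` fields below mirror GitHub's pull-request object (as returned when fetching a single PR); the sample records are illustrative.

```python
def unanswered_prs(pulls):
    """Return numbers of PRs that have received no discussion at all."""
    return [p["number"] for p in pulls
            if p.get("comments", 0) == 0 and p.get("review_comments", 0) == 0]

sample = [
    {"number": 1, "comments": 4, "review_comments": 2},
    {"number": 2, "comments": 0, "review_comments": 0},  # nobody has engaged
]
print(unanswered_prs(sample))  # [2]
```

A non-empty result for PRs older than a few days would be a friendliness red flag.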


Responsiveness: how long until proposed changes (code, design, bug reports, etc.) are given attention?
  • time to resolve PRs and issues across the board (github)
  • time to respond to questions (discord) - this may not be possible to measure (it is hard to tell the difference between someone making a statement and a question that never got a response)
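The time-to-resolve measurement above can be sketched as a median over closed items; timestamps use GitHub's ISO-8601 format (e.g. 2023-01-01T00:00:00Z), and open items are skipped. The sample data is illustrative.

```python
from datetime import datetime
from statistics import median

def median_days_to_close(issues):
    """Median days from creation to close, over closed issues/PRs only."""
    def parse(ts):
        return datetime.fromisoformat(ts.replace("Z", "+00:00"))
    spans = [(parse(i["closed_at"]) - parse(i["created_at"])).days
             for i in issues if i.get("closed_at")]
    return median(spans)

sample = [
    {"created_at": "2023-01-01T00:00:00Z", "closed_at": "2023-01-03T00:00:00Z"},
    {"created_at": "2023-01-01T00:00:00Z", "closed_at": "2023-01-11T00:00:00Z"},
    {"created_at": "2023-01-01T00:00:00Z", "closed_at": None},  # still open
]
print(median_days_to_close(sample))  # 6.0
```

Splitting the input by author (new vs. core contributors) gives the Friendliness comparison as well.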

Code / Usefulness: is the project being adopted by customers and tire kickers?
  • usage information provided by customers and developers
  • number of questions from clients trying to use the code
  • docker pulls
  • release binary downloads
  • tagged online resources: case studies, presentations, mentorship programs
  • number of research publications it generates
  • ask SIGs
  • hackathons
  • certified vendors?
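The release-binary-download bullet above is one of the easier signals to automate; the shape below mirrors GitHub's releases API response, where each release carries `assets` with a `download_count`. The sample numbers are illustrative.

```python
def total_release_downloads(releases):
    """Sum download_count across all binary assets of all releases."""
    return sum(asset.get("download_count", 0)
               for rel in releases
               for asset in rel.get("assets", []))

sample = [
    {"tag_name": "v1.0.0", "assets": [{"download_count": 120},
                                      {"download_count": 30}]},
    {"tag_name": "v1.1.0", "assets": [{"download_count": 250}]},
]
print(total_release_downloads(sample))  # 400
```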


Production Readiness: is the current code base coherent enough to be usable in a real-world scenario?
  • release number (is the latest 1.0.0 or later?)
  • test coverage
  • performance and reliability testing data
  • user documentation
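The release-number check above can be automated against release tags; under semver, a pre-1.0 version signals an API that may still change. This sketch strips a leading "v" and any pre-release suffix (e.g. "-rc.1") before comparing, which is an assumption about the project's tagging convention.

```python
def is_ga(tag):
    """True if a release tag such as 'v1.2.0' is 1.0.0 or later."""
    core = tag.lstrip("v").split("-")[0]  # drop 'v' prefix and '-rc.1' suffix
    major = int(core.split(".")[0])
    return major >= 1

print(is_ga("v0.9.3"), is_ga("v2.1.0"))  # False True
```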


Fundamental Metrics
  • commit rate: number of commits per month etc.
  • maybe indicators that would allow us to catch when a project starts to "cool down" or "people are leaving it for other options"
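One possible "cooling down" indicator along the lines above: compare the most recent few months of commits against the preceding window and flag a sharp drop. The window size and the 50% drop threshold are illustrative assumptions, not agreed values.

```python
def cooling_down(monthly_commits, window=3, drop=0.5):
    """True if the last `window` months' commits fell below `drop` times
    the preceding window's total; returns False when history is too short."""
    recent = monthly_commits[-window:]
    prior = monthly_commits[-2 * window:-window]
    if len(prior) < window:
        return False  # not enough history to judge
    return sum(recent) < drop * sum(prior)

print(cooling_down([30, 28, 25, 10, 5, 3]))    # True: activity fell sharply
print(cooling_down([12, 10, 11, 12, 10, 11]))  # False: steady cadence
```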


Docs
  • does it exist
  • quality
  • LMDWG is creating a badging system for documentation.


Amount of Innovation: how cutting edge is the project?
  • ePrint, arxiv
  • measures academic interests

Automated data collections:

  • github APIs, discord APIs
  • Ry: if implementable as github actions, hyperledger staff can be responsible (preferably in a repo dedicated to this); if LFX, then it is questionable for now, as that team is behind
  • Hart: would be great input to the LFX team
  • David: for Meet-up, Hyperledger has a paid org for all the meet up groups, the platform also has APIs
  • Tracy: "OSSInsight" is a tool that can tell you the org for a github handle
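One small detail any github-API collector (whether a github action or an LFX job) has to handle is pagination: GitHub returns results in pages and advertises the next page in the `Link` response header. A sketch of parsing that header (the header string below is a typical example, not captured output):

```python
import re

def next_page_url(link_header):
    """Extract the rel="next" URL from a GitHub API Link header,
    or return None on the last page (or when the header is absent)."""
    if not link_header:
        return None
    m = re.search(r'<([^>]+)>;\s*rel="next"', link_header)
    return m.group(1) if m else None

hdr = ('<https://api.github.com/repositories/1/pulls?page=2>; rel="next", '
       '<https://api.github.com/repositories/1/pulls?page=5>; rel="last"')
print(next_page_url(hdr))  # https://api.github.com/repositories/1/pulls?page=2
```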

Manual data collection:

  • should we consider asking each project to self-report, or designating a team to collect the data for all projects?
  • Tracy: our initial goal was to find all automatable data sources
  • Peter: supports the proposal to put aside the sources that are not automatable

Would it make sense to allow folks to report data that are manually collected?

  • Peter: we should consider whether they should be taken into account when evaluating a project's health; for now that should be optional
  • Bobbie: there's an existing page that does some of this already (link will be posted to the chat)

