Measuring Repo Community Health with GitHub’s API
With that in mind, I wanted to look at GitHub’s Community Health measure of the repositories I’m responsible for. You can view each repo’s community page separately through the web interface (look under “Insights”) but that’s not especially scalable if you have a lot of projects to track.
So for a more repeatable approach, I wanted to use the GitHub API. Frustratingly, the community health metric isn’t in the collection endpoints, so you need to fetch data per repository to get it. Here’s the script that I use; it fetches the first 100 public repos of the given org or user (if you have more than 100, you’ll need to loop and fetch additional pages of results). It prints every repo with its number of stars, and for repos with 10 or more stars, it also fetches and outputs the community health measure.
The whole thing outputs some sort of pipe-separated-sort-of format … one day I will make my hacky code more perfect before I share it, but today is not that day. LibreOffice had no problem ingesting the result when stored in a text file, so I’m calling it “good enough”!
Anyway, here’s the script. You should set a GitHub access token as GITHUB_TOKEN and the org or user to use as GITHUB_ORG:
import json
import os
import requests

# read the access token and the org name from the environment
token = os.getenv("GITHUB_TOKEN")
org = os.getenv("GITHUB_ORG")

headers = {
    "Authorization": "token " + token,
    "Accept": "application/vnd.github.v3+json"
}

# all public repos
# (this is the orgs endpoint; for a user account the path is users/<name>/repos instead)
base_url = "https://api.github.com/"
repos_url = base_url + "orgs/" + org + "/repos?type=public&per_page=100"
repos_req = requests.get(repos_url, headers=headers)
repos_list = json.loads(repos_req.content)

# print header row
print("Project | Stars | Health")

i = 0
for r in repos_list:
    label = r['full_name'] + " | " + str(r['stargazers_count'])
    if r['stargazers_count'] >= 10:
        # the health measure needs one extra request per repo
        url = base_url + "repos/" + r['full_name'] + "/community/profile"
        req = requests.get(url, headers=headers)
        data = json.loads(req.content)
        print(label + " | " + str(data['health_percentage']) + "% health")
    else:
        print(label)
    i = i + 1

# if this is 100, it's time to build pagination
print(str(i) + " public repos in total")
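If that final count does come out at 100, the pagination loop could look something like the sketch below. It is only a sketch, not part of the original script: the names fetch_all_repos, github_page_getter and get_page are made up for illustration, and it assumes the repos endpoint accepts the usual page and per_page query parameters and that a page with fewer than per_page entries is the last one.

```python
def fetch_all_repos(get_page, per_page=100):
    """Collect repos page by page until a short (or empty) page comes back.

    get_page(page, per_page) is a caller-supplied function (hypothetical
    name) that returns the decoded JSON list for one page of results.
    """
    repos = []
    page = 1
    while True:
        batch = get_page(page, per_page)
        repos.extend(batch)
        # fewer items than requested means there is nothing left to fetch
        if len(batch) < per_page:
            break
        page += 1
    return repos


def github_page_getter(org, headers):
    """Build a get_page function for the org repos endpoint (assumed setup)."""
    import requests  # imported here so the loop above has no dependencies

    url = "https://api.github.com/orgs/" + org + "/repos"

    def get_page(page, per_page):
        params = {"type": "public", "per_page": per_page, "page": page}
        resp = requests.get(url, headers=headers, params=params)
        resp.raise_for_status()  # stop on auth or rate-limit errors
        return resp.json()

    return get_page
```

To try it, the single requests.get call for repos_url in the script would be swapped for repos_list = fetch_all_repos(github_page_getter(org, headers)), leaving the rest of the loop unchanged. Keeping the page fetching separate from the loop also makes the loop easy to check with canned data.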
I was surprised when I looked around that I couldn’t find an existing script for this, so I thought I had better share mine. That’s how the open source community works, after all! Tweaks, suggestions and additions are all welcome via the comments box. I’d be happy to hear whether this is useful and how you evolved it for your own needs.