Create the "fact tables" for our metrics in data/metric-definitions #5970

@ccerv1

Developer Metrics Definitions

Activity

data/metric-definitions/activity.py

Scope

Define canonical measures of developer activity and thresholds for determining “active” status.

Metrics

  • Commits: count of distinct days with at least one commit
  • Push events: count of distinct days with at least one push event
  • Issues / PRs: count of distinct days with issue or pull request activity

All activity metrics are expressed in terms of distinct active days, not raw event counts.

Activity Thresholds

  • Electric Capital definition
      - Active if ≥10 distinct commit days within a 28-day rolling window
  • OSO definition
      - Active if ≥10 distinct days with any commit, push event, or pull request activity within a 30-day rolling window
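A minimal sketch of how the two thresholds could be evaluated, using only the standard library. The function names, the input shape (sets of dates per event type), and the choice to anchor each rolling window on a given end date (for example the last day of a month) are assumptions for illustration, not something this issue specifies.

```python
from datetime import date, timedelta


def active_days_in_window(days, window_end, window_days):
    """Count distinct dates in `days` inside the window ending on `window_end` (inclusive)."""
    window_start = window_end - timedelta(days=window_days - 1)
    return sum(1 for d in set(days) if window_start <= d <= window_end)


def is_active_electric_capital(commit_days, window_end):
    # Electric Capital: >= 10 distinct commit days in a 28-day window.
    return active_days_in_window(commit_days, window_end, 28) >= 10


def is_active_oso(commit_days, push_days, pr_days, window_end):
    # OSO: >= 10 distinct days with any commit, push, or PR activity in a 30-day window.
    any_days = set(commit_days) | set(push_days) | set(pr_days)
    return active_days_in_window(any_days, window_end, 30) >= 10


# Hypothetical developer: 7 commit days and 4 separate push days in March 2025.
window_end = date(2025, 3, 31)
commit_days = {date(2025, 3, d) for d in range(10, 17)}
push_days = {date(2025, 3, d) for d in range(20, 24)}

ec_flag = is_active_electric_capital(commit_days, window_end)
oso_flag = is_active_oso(commit_days, push_days, set(), window_end)
```

In this example the developer has only 7 commit days, so the Electric Capital flag is false, while the 11 combined activity days satisfy the OSO definition.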

Outputs

  • Boolean activity flags per developer per month:
      - is_active_electric_capital
      - is_active_oso
  • Supporting metrics:
      - active_days_commits
      - active_days_pushes
      - active_days_issues_prs
      - active_days_any

Goal

It should be possible to produce line charts comparing Electric Capital MAU and OSO MAU over time, with trends that are directionally consistent and ideally closely overlapping.


Lifecycle

data/metric-definitions/lifecycle.py

Scope

Define developer lifecycle stages and capture all valid state transitions over time.

Lifecycle Stages

  • New developer
  • Full-time active developer
  • Part-time developer
  • Dormant
  • Churned

Each developer must be assigned exactly one lifecycle stage per time period.

Transitions to Capture

  • New developer → Full-time active developer
  • Full-time active developer → Part-time developer
  • Part-time developer → Full-time active developer
  • Any active state → Dormant
  • Dormant → Active (full-time or part-time)
  • Any state → Churned

Outputs

  • Lifecycle stage per developer per month
  • Explicit transition records:
      - developer_id
      - from_state
      - to_state
      - transition_month
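One possible way to derive the transition records from the per-month stage assignments is sketched below. The input shape (a dict of `developer_id` to `{"YYYY-MM": stage}`), the stage label strings, and the convention that `transition_month` is the first month in which the new stage appears are all assumptions made for illustration.

```python
def derive_transitions(stages_by_month):
    """Emit one record per change of lifecycle stage between consecutive observed months.

    stages_by_month: dict of developer_id -> dict of 'YYYY-MM' -> stage label.
    """
    records = []
    for developer_id, stages in stages_by_month.items():
        months = sorted(stages)  # 'YYYY-MM' strings sort chronologically
        for prev_month, curr_month in zip(months, months[1:]):
            if stages[prev_month] != stages[curr_month]:
                records.append({
                    "developer_id": developer_id,
                    "from_state": stages[prev_month],
                    "to_state": stages[curr_month],
                    "transition_month": curr_month,
                })
    return records


# Hypothetical stage history for a single developer.
example_stages = {
    "dev_1": {
        "2025-01": "new",
        "2025-02": "full_time",
        "2025-03": "full_time",
        "2025-04": "dormant",
    }
}
transitions = derive_transitions(example_stages)
```

Months with an unchanged stage produce no record, so the output stays sparse even for long histories.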

Goal

It should be possible to generate bar charts showing the distribution of developers across lifecycle stages using either ODD- or OSO-derived activity metrics, with comparable aggregate patterns.


Alignment

data/metric-definitions/alignment.py

Scope

Measure how a developer’s activity is distributed across ecosystems within a given time interval.

Definition

Given:

  • A predefined list of ecosystems
  • A developer
  • A time interval (e.g., calendar month)

Compute the percentage of the developer’s total activity attributable to each ecosystem during that interval.

Percentages must sum to 100% per developer per time period.

Example

  • March 2025:
      - Ethereum: 70%
      - AI: 30%
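The share computation could look roughly like the sketch below. It assumes activity is measured in distinct active days per (developer, ecosystem, period) and emits `activity_share` on a 0–1 scale. Both choices, along with the function name and row layout, are assumptions for illustration.

```python
from collections import defaultdict


def activity_shares(rows):
    """Compute each ecosystem's share of a developer's activity per time period.

    rows: list of (developer_id, ecosystem, time_period, active_days) tuples.
    Developers with zero total activity in a period are omitted.
    """
    totals = defaultdict(float)
    for developer_id, _ecosystem, time_period, active_days in rows:
        totals[(developer_id, time_period)] += active_days

    shares = []
    for developer_id, ecosystem, time_period, active_days in rows:
        total = totals[(developer_id, time_period)]
        if total > 0:
            shares.append({
                "developer_id": developer_id,
                "ecosystem": ecosystem,
                "time_period": time_period,
                "activity_share": active_days / total,
            })
    return shares


# Hypothetical input reproducing the March 2025 example: 7 and 3 active days.
example_rows = [
    ("dev_1", "Ethereum", "2025-03", 7),
    ("dev_1", "AI", "2025-03", 3),
]
shares = activity_shares(example_rows)
```

Because every share is divided by the same per-developer, per-period total, the shares for each developer and period sum to 1 (up to floating-point rounding).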

Outputs

  • developer_id
  • ecosystem
  • time_period
  • activity_share (0–1 or 0–100)

Retention

data/metric-definitions/retention.py (TBD)

Scope

Measure developer retention over time within a given ecosystem.

Definition

For each (developer, ecosystem) pair:

  • Define Month 0 as the month of first observed contribution
  • Track activity presence for Month 0 through Month N using OSO activity definitions

Outputs

  • Cohort tables keyed by first-activity month
  • Retention counts or rates at each month offset
  • Optional boolean flags for active vs inactive per offset month
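A rough sketch of how the cohort tables could be assembled is shown below. It assumes the input is already reduced to the set of months in which each (developer, ecosystem) pair counts as active, and that Month 0 is the earliest such month. The function names and data shapes are hypothetical.

```python
def month_index(year_month):
    """Convert 'YYYY-MM' to a running month number so offsets can be subtracted."""
    year, month = year_month.split("-")
    return int(year) * 12 + int(month) - 1


def retention_counts(active_months):
    """Build ecosystem -> cohort month -> month offset -> number of active developers.

    active_months: dict of (developer_id, ecosystem) -> set of 'YYYY-MM' strings.
    """
    table = {}
    for (_developer_id, ecosystem), months in active_months.items():
        if not months:
            continue
        month_zero = min(months)  # assumed: first active month defines the cohort
        base = month_index(month_zero)
        cohort = table.setdefault(ecosystem, {}).setdefault(month_zero, {})
        for m in months:
            offset = month_index(m) - base
            cohort[offset] = cohort.get(offset, 0) + 1
    return table


def retention_rates(cohort):
    """Divide each offset's count by the cohort size at offset 0."""
    size = cohort[0]
    return {offset: count / size for offset, count in cohort.items()}


# Hypothetical activity for two developers in one ecosystem.
example_activity = {
    ("dev_a", "ethereum"): {"2024-12", "2025-01", "2025-02"},
    ("dev_b", "ethereum"): {"2024-12", "2025-02"},
}
cohorts = retention_counts(example_activity)
rates = retention_rates(cohorts["ethereum"]["2024-12"])
```

Each inner dict is one row of a cohort table, and plotting `rates` against the offset gives one retention curve per cohort.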

Goal

It should be possible to generate cohort tables and line charts showing developer retention curves over time for each ecosystem.
