adding-new-metric
Guides systematic implementation of new sustainability metrics in OSS Sustain Guard using the plugin-based metric system. Use when adding metric functions to evaluate project health aspects like issue responsiveness, test coverage, or security response time.
Best use case
adding-new-metric is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Guides systematic implementation of new sustainability metrics in OSS Sustain Guard using the plugin-based metric system. Use when adding metric functions to evaluate project health aspects like issue responsiveness, test coverage, or security response time.
Teams using adding-new-metric should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/adding-new-metric/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How adding-new-metric Compares
| Feature / Agent | adding-new-metric | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Guides systematic implementation of new sustainability metrics in OSS Sustain Guard using the plugin-based metric system. Use when adding metric functions to evaluate project health aspects like issue responsiveness, test coverage, or security response time.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Add New Metric
This skill provides a systematic workflow for adding new sustainability metrics to the OSS Sustain Guard project using the **plugin-based metric system**.
## When to Use
- User wants to add a new metric to evaluate project health
- Implementing metrics from NEW_METRICS_IDEA.md
- Extending analysis capabilities with additional measurements
- Creating custom external metrics via plugins
## Critical Principles
1. **No Duplication**: Always check existing metrics to avoid measuring the same thing
2. **10-Point Scale**: ALL metrics use max_score=10 for consistency and transparency
3. **Integer Weights**: Metric importance is controlled via profile weights (integers ≥1)
4. **Project Philosophy**: Use "observation" language, not "risk" or "critical"
5. **CHAOSS Alignment**: Reference CHAOSS metrics when applicable
6. **Plugin Architecture**: Metrics are discovered via entry points and MetricSpec
## Implementation Workflow
### 1. Verify No Duplication
```bash
# Search for similar metrics in the metrics directory
ls oss_sustain_guard/metrics/
grep -rn "def check_" oss_sustain_guard/metrics/
# Check entry points in pyproject.toml
grep -A 30 '\[project.entry-points."oss_sustain_guard.metrics"\]' pyproject.toml
```
**Check**: Does any existing metric measure the same aspect?
### 2. Create Metric Module
Create a new file in `oss_sustain_guard/metrics/`:
```bash
touch oss_sustain_guard/metrics/my_metric.py
```
**Template**:
```python
"""My metric description."""
from typing import Any
from oss_sustain_guard.metrics.base import Metric, MetricContext, MetricSpec
def check_my_metric(repo_data: dict[str, Any]) -> Metric:
"""
Evaluates [metric purpose].
[Description of what this measures and why it matters.]
Scoring:
- [Condition]: X/10 ([Label])
- [Condition]: X/10 ([Label])
CHAOSS Aligned: [CHAOSS metric name] (if applicable)
"""
max_score = 10 # ALWAYS use 10 for all metrics
# Extract data from repo_data
data = repo_data.get("fieldName", {})
if not data:
return Metric(
"My Metric Name",
score_on_no_data,
max_score,
"Note: [Reason for default score].",
"None",
)
# Calculate metric
# ...
# Score logic with graduated thresholds (0-10 scale)
if condition_excellent:
score = 10 # Excellent
risk = "None"
message = f"Excellent: [Details]."
elif condition_good:
score = 8 # Good (80%)
risk = "Low"
message = f"Good: [Details]."
elif condition_moderate:
score = 5 # Moderate (50%)
risk = "Medium"
message = f"Moderate: [Details]."
elif condition_needs_attention:
score = 2 # Needs attention (20%)
risk = "High"
message = f"Observe: [Details]. Consider improving."
else:
score = 0 # Critical issue
risk = "Critical"
message = f"Note: [Details]. Immediate attention recommended."
return Metric("My Metric Name", score, max_score, message, risk)
def _check(repo_data: dict[str, Any], _context: MetricContext) -> Metric:
"""Wrapper for metric spec."""
return check_my_metric(repo_data)
def _on_error(error: Exception) -> Metric:
"""Error handler for metric spec."""
return Metric(
"My Metric Name",
0,
10,
f"Note: Analysis incomplete - {error}",
"Medium",
)
# Export MetricSpec for automatic discovery
METRIC = MetricSpec(
name="My Metric Name",
checker=_check,
on_error=_on_error,
)
```
**Key Decisions**:
- `max_score`: **ALWAYS 10** for all metrics (consistency)
- Score range: **0-10** (use integers or decimals)
- Importance: Controlled by **profile weights** (integers ≥1)
- Risk levels: "None", "Low", "Medium", "High", "Critical"
- Use supportive language: "Observe", "Consider", "Monitor" not "Failed", "Error"
### 3. Register Entry Point
Add to `pyproject.toml` under `[project.entry-points."oss_sustain_guard.metrics"]`:
```toml
[project.entry-points."oss_sustain_guard.metrics"]
# ... existing entries ...
my_metric = "oss_sustain_guard.metrics.my_metric:METRIC"
```
### 4. Add to Built-in Registry
Update `oss_sustain_guard/metrics/__init__.py`:
```python
_BUILTIN_MODULES = [
# ... existing modules ...
"oss_sustain_guard.metrics.my_metric",
]
```
**Why both entry points and built-in registry?**
- Entry points: Enable external plugins
- Built-in registry: Fallback for direct imports and faster loading
### 5. Update ANALYSIS_VERSION
**CRITICAL**: Before integrating your new metric, increment `ANALYSIS_VERSION` in `cli.py`.
```python
# In cli.py, update the version
ANALYSIS_VERSION = "1.2" # Increment from previous version
```
**Why this is required:**
- New metrics change the total score calculation
- Old cached data won't include your new metric
- Without version increment, users get inconsistent scores (cache vs. real-time)
- Version mismatch automatically invalidates old cache entries
**Always increment when:**
- Adding/removing metrics
- Changing metric weights in profiles
- Modifying scoring thresholds
- Changing max_score values
### 6. Add Metric to Scoring Profiles
Update `SCORING_PROFILES` in `core.py` to include your new metric:
```python
SCORING_PROFILES = {
"balanced": {
"name": "Balanced",
"description": "...",
"weights": {
# Existing metrics...
"Contributor Redundancy": 3,
"Security Signals": 2,
# Add your new metric
"My Metric Name": 2, # Assign appropriate weight (1+)
# ...
},
},
# Update all 4 profiles...
}
```
**Weight Guidelines**:
- **Critical metrics**: 3-5 (bus factor, security)
- **Important metrics**: 2-3 (activity, responsiveness)
- **Supporting metrics**: 1-2 (documentation, governance)
### 7. Test Implementation
```bash
# Create test file
touch tests/metrics/test_my_metric.py
# Write tests (see section below)
# Run tests
uv run pytest tests/metrics/test_my_metric.py -v
# Syntax check
python -m py_compile oss_sustain_guard/metrics/my_metric.py
# Run analysis on test project
uv run os4g check fastapi --insecure --no-cache -o detail
# Verify metric appears in output
# Check score is reasonable
# Run all tests
uv run pytest tests/ -x --tb=short
# Lint check
uv run ruff check oss_sustain_guard/metrics/my_metric.py
uv run ruff format oss_sustain_guard/metrics/my_metric.py
```
### 8. Write Comprehensive Tests
Create `tests/metrics/test_my_metric.py`:
```python
"""Tests for my_metric module."""
from oss_sustain_guard.metrics.my_metric import check_my_metric
def test_check_my_metric_excellent():
"""Test metric with excellent conditions."""
mock_data = {"fieldName": {"value": 100}}
result = check_my_metric(mock_data)
assert result.score == 10
assert result.max_score == 10
assert result.risk == "None"
assert "Excellent" in result.message
def test_check_my_metric_good():
"""Test metric with good conditions."""
mock_data = {"fieldName": {"value": 80}}
result = check_my_metric(mock_data)
assert result.score == 8
assert result.max_score == 10
assert result.risk == "Low"
def test_check_my_metric_no_data():
"""Test metric with missing data."""
mock_data = {}
result = check_my_metric(mock_data)
assert result.max_score == 10
assert "Note:" in result.message
```
### 9. Update Documentation (if needed)
Consider updating:
- `docs/local/NEW_METRICS_IDEA.md` - Mark as implemented
- Metric count in README.md
- `docs/SCORING_PROFILES_GUIDE.md` - If significant new metric
## Plugin Architecture Details
### MetricSpec Structure
```python
class MetricSpec(NamedTuple):
"""Specification for a metric check."""
name: str # Metric display name
checker: Callable[[dict[str, Any], MetricContext], Metric | None] # Main logic
on_error: Callable[[Exception], Metric] | None = None # Error handler
error_log: str | None = None # Error log format
```
### MetricContext
Context provided to metric checkers:
```python
class MetricContext(NamedTuple):
"""Context provided to metric checks."""
owner: str # GitHub owner
name: str # Repository name
repo_url: str # Full GitHub URL
platform: str | None # Platform (e.g., "pypi", "npm")
package_name: str | None # Original package name
```
### Metric Discovery Flow
1. **Built-in loading**: `_load_builtin_metric_specs()` imports from `_BUILTIN_MODULES`
2. **Entry point loading**: `_load_entrypoint_metric_specs()` discovers via `importlib.metadata`
3. **Deduplication**: Built-in metrics take precedence over external metrics with same name
4. **Integration**: `load_metric_specs()` returns combined list to `core.py`
### External Plugin Example
For external plugins (separate packages):
**`my_custom_metric/pyproject.toml`:**
```toml
[project]
name = "my-custom-metric"
version = "0.1.0"
dependencies = ["oss-sustain-guard>=0.13.0"]
[project.entry-points."oss_sustain_guard.metrics"]
my_custom = "my_custom_metric:METRIC"
```
**`my_custom_metric/__init__.py`:**
```python
from oss_sustain_guard.metrics.base import Metric, MetricContext, MetricSpec
def check_custom(repo_data, context):
return Metric("Custom Metric", 10, 10, "Custom logic", "None")
METRIC = MetricSpec(name="Custom Metric", checker=check_custom)
```
**Installation:**
```bash
pip install my-custom-metric
```
Metrics are automatically discovered and loaded!
```python
from datetime import datetime
created_at = datetime.fromisoformat(created_str.replace("Z", "+00:00"))
completed_at = datetime.fromisoformat(completed_str.replace("Z", "+00:00"))
duration_days = (completed_at - created_at).total_seconds() / 86400
```
### Ratio/Percentage Metrics
```python
ratio = (count_a / total) * 100
# Use graduated scoring
if ratio < 15:
score = max_score # Excellent
elif ratio < 30:
score = max_score * 0.6 # Acceptable
```
### Median Calculations
```python
values.sort()
median = (
values[len(values) // 2]
if len(values) % 2 == 1
else (values[len(values) // 2 - 1] + values[len(values) // 2]) / 2
)
```
### GraphQL Data Access
```python
# Common paths in repo_data
issues = repo_data.get("issues", {}).get("edges", [])
prs = repo_data.get("pullRequests", {}).get("edges", [])
commits = repo_data.get("defaultBranchRef", {}).get("target", {}).get("history", {})
funding = repo_data.get("fundingLinks", [])
```
## Score Budget Guidelines
| Importance | Max Score | Use Case |
|-----------|-----------|----------|
| Critical | 20 | Core sustainability (Bus Factor, Activity) |
| High | 10 | Important health signals (Funding, Retention) |
| Medium | 5 | Supporting metrics (CI, Community Health) |
| Low | 3-5 | Supplementary observations |
**Total Budget**: 100 points across ~20-25 metrics
## Validation Checklist
- [ ] **ANALYSIS_VERSION incremented in cli.py**
- [ ] No duplicate measurement with existing metrics
- [ ] Total max_score budget ≤ 100
- [ ] Uses supportive "observation" language
- [ ] Has graduated scoring (not binary)
- [ ] Handles missing data gracefully
- [ ] Error handling in integration
- [ ] Syntax check passes
- [ ] Real-world test shows metric in output
- [ ] Unit tests pass
- [ ] Lint checks pass
## Example: Stale Issue Ratio
For a complete, production-ready implementation example, see [examples/stale-issue-ratio.md](examples/stale-issue-ratio.md).
**Quick overview:**
- **Measures**: Percentage of issues not updated in 90+ days
- **Max Score**: 5 points
- **Scoring**: <15% stale (5pts), 15-30% (3pts), 30-50% (2pts), >50% (1pt)
- **Key patterns**: Time-based calculation, graduated scoring, graceful error handling
- **Real results**: fastapi (8.2% stale, 5/5), requests (23.4%, 3/5)
## Score Validation with Real Projects
After implementing a new metric, validate scoring behavior with diverse real-world projects.
### Validation Script
Create `scripts/validate_scoring.py`:
```python
#!/usr/bin/env python3
"""
Score validation script for testing new metrics against diverse projects.
Usage:
uv run python scripts/validate_scoring.py
"""
import subprocess
import json
from typing import Any
VALIDATION_PROJECTS = {
"Famous/Mature": {
"requests": "psf/requests",
"react": "facebook/react",
"kubernetes": "kubernetes/kubernetes",
"django": "django/django",
"fastapi": "fastapi/fastapi",
},
"Popular/Active": {
"angular": "angular/angular",
"numpy": "numpy/numpy",
"pandas": "pandas-dev/pandas",
},
"Emerging/Small": {
# Add smaller projects you want to test
},
}
def analyze_project(owner: str, repo: str) -> dict[str, Any]:
"""Run analysis on a project and return results."""
cmd = [
"uv", "run", "os4g", "check",
f"{owner}/{repo}",
"--insecure", "--no-cache", "-o", "json"
]
result = subprocess.run(cmd, capture_output=True, text=True)
if result.returncode != 0:
return {"error": result.stderr}
# Parse JSON output
try:
return json.loads(result.stdout)
except json.JSONDecodeError:
return {"error": "Failed to parse JSON output"}
def main():
print("=" * 80)
print("OSS Sustain Guard - Score Validation Report")
print("=" * 80)
print()
for category, projects in VALIDATION_PROJECTS.items():
print(f"\n## {category}\n")
print(f"{'Project':<25} {'Score':<10} {'Status':<15} {'Key Observations'}")
print("-" * 80)
for name, repo_path in projects.items():
result = analyze_project(*repo_path.split("/"))
if "error" in result:
print(f"{name:<25} {'ERROR':<10} {result['error'][:40]}")
continue
score = result.get("total_score", 0)
status = "✓ Healthy" if score >= 80 else "⚠ Monitor" if score >= 60 else "⚡ Needs attention"
observations = result.get("key_observations", "N/A")[:40]
print(f"{name:<25} {score:<10} {status:<15} {observations}")
print("\n" + "=" * 80)
print("\nValidation complete. Review scores for:")
print(" - Famous projects should score 70-95")
print(" - New metrics should show reasonable distribution")
print(" - No project should score >100")
if __name__ == "__main__":
main()
```
### Quick Validation Command
```bash
# Test specific famous projects
uv run os4g check requests react fastapi kubernetes --insecure --no-cache
# Compare before/after metric changes
uv run os4g check requests --insecure --no-cache -o detail > before.txt
# ... make changes ...
uv run os4g check requests --insecure --no-cache -o detail > after.txt
diff before.txt after.txt
```
### Expected Score Ranges
| Category | Expected Score | Examples |
|----------|----------------|----------|
| Famous/Mature | 75-95 | requests, kubernetes, react |
| Popular/Active | 65-85 | angular, numpy, pandas |
| Emerging/Small | 45-70 | New projects with activity |
| Problematic | 20-50 | Abandoned or struggling projects |
### Validation Checklist
After implementing a new metric:
- [ ] Test on 3-5 famous projects (requests, react, kubernetes, etc.)
- [ ] Verify scores remain within 0-100
- [ ] Check that famous projects score reasonably high (70+)
- [ ] Ensure new metric contributes meaningfully to total score
- [ ] Review that metric differentiates well between projects
- [ ] Confirm no single metric dominates the total score
## Troubleshooting
**Score calculation issues**: Verify all metrics have max_score=10 and check profile weights
**Metric not appearing**: Check integration in `_analyze_repository_data()`
**Tests fail**: Update expected metric names in test files
**Data not available**: Add proper null checks and default handling
**Scores too similar across projects**: Adjust scoring thresholds for better differentiation
**Famous project scores low**: Review metric logic and thresholdsRelated Skills
adding-tweets
Add tweets to the Second Brain. Use when the user provides a Twitter/X URL and pasted tweet content, asking to "add a tweet", "save this tweet", or "capture this tweet".
adding-templates
Use when adding new stacks, libraries, or project addons to create-faster CLI tool - covers META entries, template creation, and testing for all addon types
adding-milestones
Use this skill when adding a milestone to an existing project, starting a new milestone cycle, creating the first milestone after project init, or defining what's next after completing work. Triggers include "add milestone", "new milestone", "start milestone", "create milestone", "first milestone", "next milestone", and "milestone cycle".
application-metrics
Guide for instrumenting applications with metrics. Use when adding observability, monitoring, metrics, counters, gauges, or instrumentation to code. Covers API endpoints, databases, queues, caching, and locks.
adding-service-documentation
Documents new Coolify one-click services by creating markdown pages in docs/services/, downloading logos to docs/public/images/services/, and updating List.vue catalog. Use when adding service documentation, creating service pages, onboarding services from templates/compose/, or updating the services catalog with new entries.
adding-nango-provider-support
Use when adding support for a new Nango provider - configures provider in providers.yaml, creates documentation (main page, setup guide, connect guide), and updates docs.json following established patterns
parametric-scribe
Enables "Time Machine" coding. Records tasks as a Recipe and allows intelligent replay/modification of history.
agile-metrics
Master agile metrics with velocity, burn-down charts, cycle time, and team health indicators for data-driven improvement.
sentry-setup-metrics
Setup Sentry Metrics in any project. Use this when asked to add Sentry metrics, track custom metrics, setup counters/gauges/distributions, or instrument application performance metrics. Supports JavaScript, TypeScript, Python, React, Next.js, and Node.js.
aggregating-gauge-metrics
Aggregate pre-computed metrics (gauge, counter, delta types) using OPAL. Use when analyzing request counts, error rates, resource utilization, or any numeric metrics over time. Covers align + m() + aggregate pattern, summary vs time-series output, and common aggregation functions. For percentile metrics (tdigest), see analyzing-tdigest-metrics skill.
adding-stacks
Use when adding a new framework/stack to create-faster CLI tool - addresses copy-first mentality, incomplete implementations, and missing dependencies
adding-phases
Use this skill to add planned work discovered during execution to the end of the current milestone in the roadmap. This skill appends sequential phases to the current milestone's phase list, automatically calculating the next phase number. Triggers include "add phase", "append phase", "new phase", and "create phase". This skill updates ROADMAP.md and STATE.md accordingly.