CodeHawk

CI Docker Docker Pulls Docker Image Size License: MIT Python 3.11+

AI-powered pull request review pipeline that runs in CI, produces structured findings, scores code quality, and posts inline comments to Azure DevOps or GitHub.


How It Works

flowchart LR
    A[CI trigger] --> B[Phase 1:\nOpenAI Agent]
    B --> C[findings.json]
    C --> D[Phase 2:\npost_findings.py]
    D --> E[Inline comments\n+ PR summary]
    D --> F[CI gate\npass / fail]

Phase 1 — An OpenAI agent reads the PR diff, fetches changed files, optionally performs AST-based graph analysis, and writes findings.json.

Phase 2 — post_findings.py validates findings, filters by confidence, scores the PR using a penalty matrix, posts inline comments, and outputs a structured result for CI gating.
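The Phase 2 steps of filtering, scoring, and gating can be pictured with a small sketch. This is not the code in post_findings.py: the Finding fields, the severity names, the penalty weights, and the 0.7 confidence cutoff are all assumptions made for illustration. Only min_star_rating and fail_on_critical are names taken from .codereview.yml.

```python
from dataclasses import dataclass

# Hypothetical penalty weights per severity (assumed values, not the real matrix).
PENALTIES = {"critical": 2.0, "warning": 0.5, "suggestion": 0.1}


@dataclass
class Finding:
    file: str
    line: int
    severity: str      # assumed severity labels: critical / warning / suggestion
    confidence: float  # assumed to be a value between 0.0 and 1.0
    message: str


def filter_findings(findings, min_confidence=0.7):
    """Drop findings the agent was not confident about (cutoff is an assumption)."""
    return [f for f in findings if f.confidence >= min_confidence]


def star_rating(findings, max_stars=5.0):
    """Subtract a per-finding penalty from the maximum, clamped at zero."""
    penalty = sum(PENALTIES.get(f.severity, 0.0) for f in findings)
    return max(0.0, max_stars - penalty)


def gate(findings, min_star_rating=3.0, fail_on_critical=True):
    """Return True when the PR passes the CI gate."""
    if fail_on_critical and any(f.severity == "critical" for f in findings):
        return False
    return star_rating(findings) >= min_star_rating


findings = [
    Finding("app.py", 10, "critical", 0.9, "SQL built by string concatenation"),
    Finding("app.py", 42, "warning", 0.8, "Broad except clause"),
    Finding("util.py", 7, "suggestion", 0.4, "Consider a clearer name"),
]
kept = filter_findings(findings)
print(len(kept), star_rating(kept), gate(kept))  # prints: 2 2.5 False
```

Because this step involves no model call, the same findings.json should always yield the same score and gate result.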


Quick Start

Docker (Azure DevOps)

docker run --rm \
  -e PR_ID=42 \
  -e REPO=MyRepo \
  -e VCS=ado \
  -e OPENAI_API_KEY=sk-... \
  -e AZURE_DEVOPS_ORG=my-org \
  -e AZURE_DEVOPS_PROJECT=my-project \
  -e AZURE_DEVOPS_PAT=... \
  -v /workspace:/workspace \
  dsiddharth2/codehawk

Docker (GitHub)

docker run --rm \
  -e PR_ID=42 \
  -e REPO=owner/repo \
  -e VCS=github \
  -e OPENAI_API_KEY=sk-... \
  -e GH_TOKEN=ghp_... \
  -v /workspace:/workspace \
  dsiddharth2/codehawk

Python CLI

python src/run_agent.py \
  --pr-id 42 --repo MyRepo --workspace /workspace \
  --model o3 --max-turns 40 --prompt-file commands/review-pr-core.md

Key Features

  • Two-phase architecture — agent analysis is separated from deterministic posting
  • Azure DevOps + GitHub — inline comments, PR summaries, thread resolution
  • Model-agnostic — works with o3, gpt-4.1, gpt-4o, claude-sonnet-4, gemini-2.5-pro, and more
  • Fix verification — re-push detects fixed/dismissed/still-present findings and resolves threads
  • Developer dismissals — reply to a finding with reasoning; CodeHawk evaluates and accepts or rejects
  • Scoring — penalty-based star rating (0-5 stars) with mode-aware severity multipliers
  • Cost tracking — per-run token usage and cost estimation
  • Developer controls — # cr: intentional, .codereview.md conventions, .codereview.yml gate thresholds
  • Deduplication — cr-id based; re-runs skip already-posted findings
  • Dry run — DRY_RUN=true scores and prints without posting to VCS
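One way to picture the deduplication feature is a stable identifier per finding that is checked against the identifiers already posted to the PR. The sketch below is an assumption about how such an id could be built (a truncated SHA-256 over file, line, and message). The actual cr-id format used by CodeHawk is not described here, and the helper names are hypothetical.

```python
import hashlib


def cr_id(file, line, message):
    """Hypothetical stable id: a short hash of the finding's location and text."""
    digest = hashlib.sha256(f"{file}:{line}:{message}".encode("utf-8")).hexdigest()
    return digest[:12]


def new_findings(findings, posted_ids):
    """Return only findings whose id has not been posted yet, preserving order."""
    seen = set(posted_ids)
    fresh = []
    for finding in findings:
        fid = cr_id(finding["file"], finding["line"], finding["message"])
        if fid not in seen:
            seen.add(fid)  # also guards against duplicates within one run
            fresh.append(finding)
    return fresh


run = [
    {"file": "app.py", "line": 10, "message": "SQL built by string concatenation"},
    {"file": "app.py", "line": 42, "message": "Broad except clause"},
]

first = new_findings(run, posted_ids=[])
already_posted = [cr_id(f["file"], f["line"], f["message"]) for f in first]
second = new_findings(run, posted_ids=already_posted)
print(len(first), len(second))  # prints: 2 0
```

A content-based id like this changes when the line number shifts, so a real implementation would likely need something more robust to code movement.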

Developer Controls

| Control | Effect |
| --- | --- |
| # cr: intentional | Suppress a specific line finding |
| # cr: ignore-next-line | Suppress the following line |
| # cr: ignore-block start/end | Suppress a block |
| .codereview.md | Per-repo conventions injected into the agent prompt |
| .codereview.yml | Gate thresholds: min_star_rating, fail_on_critical |
| DRY_RUN=true | Score and print without posting to VCS |
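As an illustration of how the inline suppression comments might be interpreted, the sketch below scans a source file and collects the line numbers on which findings would be suppressed. It assumes the block markers are spelled exactly "# cr: ignore-block start" and "# cr: ignore-block end" and that a block includes its marker lines. The function name is hypothetical and this is not CodeHawk's parser.

```python
def suppressed_lines(source: str) -> set[int]:
    """Return 1-based line numbers covered by a '# cr:' suppression directive."""
    suppressed = set()
    in_block = False
    skip_next = False
    for number, text in enumerate(source.splitlines(), start=1):
        if "# cr: ignore-block start" in text:
            in_block = True
        # A line is suppressed if it sits in a block, follows an
        # ignore-next-line directive, or carries '# cr: intentional' itself.
        if in_block or skip_next or "# cr: intentional" in text:
            suppressed.add(number)
        skip_next = "# cr: ignore-next-line" in text
        if "# cr: ignore-block end" in text:
            in_block = False
    return suppressed


example = "\n".join([
    "x = eval(data)  # cr: intentional",
    "# cr: ignore-next-line",
    "y = eval(data)",
    "z = eval(data)",
    "# cr: ignore-block start",
    "a = eval(data)",
    "# cr: ignore-block end",
    "b = eval(data)",
])
print(sorted(suppressed_lines(example)))  # prints: [1, 3, 5, 6, 7]
```

Lines 4 and 8 stay reviewable, since no directive covers them.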

Environment Variables

See CI Integration docs for the full reference. The essentials:

| Variable | Required | Description |
| --- | --- | --- |
| PR_ID | Yes | Pull request number |
| REPO | Yes | Repository name or owner/repo |
| VCS | Yes | ado or github |
| OPENAI_API_KEY | Yes | OpenAI API key |
| OPENAI_MODEL | No | Model to use (default: o3) |
| DRY_RUN | No | Skip VCS writes |
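A minimal sketch of validating these variables at startup is shown below. The project's config.py uses Pydantic settings. This version uses only the standard library, and the load_settings name and the returned dictionary keys are hypothetical. Treating only the literal value "true" (case-insensitive) as enabling DRY_RUN is also an assumption.

```python
import os

REQUIRED = ("PR_ID", "REPO", "VCS", "OPENAI_API_KEY")


def load_settings(env=None):
    """Read the essential variables, failing early with a clear message."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    vcs = env["VCS"]
    if vcs not in ("ado", "github"):
        raise ValueError(f"VCS must be 'ado' or 'github', got {vcs!r}")
    return {
        "pr_id": int(env["PR_ID"]),
        "repo": env["REPO"],
        "vcs": vcs,
        "openai_api_key": env["OPENAI_API_KEY"],
        "openai_model": env.get("OPENAI_MODEL", "o3"),  # documented default
        "dry_run": env.get("DRY_RUN", "").lower() == "true",
    }


settings = load_settings({
    "PR_ID": "42",
    "REPO": "owner/repo",
    "VCS": "github",
    "OPENAI_API_KEY": "sk-...",
    "DRY_RUN": "true",
})
print(settings["pr_id"], settings["openai_model"], settings["dry_run"])
# prints: 42 o3 True
```

Passing a plain dictionary instead of os.environ keeps the function easy to test without touching the real environment.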

Project Structure

codehawk/
├── src/
│   ├── run_agent.py          # CLI entry point
│   ├── review_job.py         # Phase 1 orchestrator
│   ├── post_findings.py      # Phase 2 engine
│   ├── fix_verifier.py       # Per-file fix verification
│   ├── score_comparison.py   # Before/after score comparison
│   ├── config.py             # Pydantic settings
│   ├── agents/               # OpenAI agent runner
│   ├── tools/                # VCS, graph, and workspace tools
│   ├── activities/           # VCS activity classes (ADO + GitHub)
│   └── models/               # Pydantic/dataclass models
├── commands/                 # Agent prompts and JSON schema
├── docs/                     # Full documentation
├── ci/                       # CI pipeline templates
├── Dockerfile
└── pyproject.toml

Documentation

Full docs live in docs/:

| Document | Description |
| --- | --- |
| Architecture | Two-phase design, component interactions, data flow |
| Agent Runner | Agent internals, turn budget, findings extraction |
| Post Findings | Phase 2: filtering, scoring, posting, summary |
| Fix Verification | Re-push detection, fix/dismissal classification |
| Scoring | Penalty matrix, star ratings, configuration |
| Review Modes | Mode detection and severity multipliers |
| CI Integration | Pipeline setup, env vars, Docker usage |
| VCS CLI | VCS command wrappers for ADO and GitHub |

Development

git clone https://github.com/dsiddharth2/codehawk.git
cd codehawk
python -m venv .venv && .venv\Scripts\activate   # Windows; on Linux/macOS use: source .venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt
pytest tests/

License

MIT
