How it works
This page explains the full scan lifecycle — from your CI push to findings in the dashboard.
Architecture overview
Your CI pipeline
         │
         │ POST /v1/scans (tarball + metadata)
         ▼
┌─────────────────────┐
│ Musha API           │ ← validates API key, project ownership, rate limits
└────────┬────────────┘
         │ enqueues job
         ▼
┌─────────────────────┐
│ Scan Worker         │ ← picks up job from queue
│                     │
│ 1. Extract tarball  │
│ 2. Discover files   │
│ 3. Run secplat      │ ← SCA + IaC + Secrets in parallel
│ 4. Parse SARIF      │
│ 5. Deduplicate      │ ← stable fingerprints per finding
│ 6. Persist results  │
│ 7. Post PR comment  │ ← if scan has a pr_id
└─────────────────────┘
         │
         ▼
┌─────────────────────┐
│ Security dashboard  │ ← findings visible with full triage workflow
└─────────────────────┘
Step by step
1. Your CI pipeline packages and uploads the code
The workflow packages the repository as a tarball (excluding .git, node_modules, etc.) and sends it to POST /v1/scans along with metadata: project ID, branch, commit hash, PR ID (for pull request scans), and scan type.
Musha validates the API key, verifies the project belongs to your tenant, checks rate limits, and enqueues the job. The API responds immediately with a scan_id — the scan runs asynchronously.
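The packaging half of this step can be sketched as follows. This is a minimal illustration, not Musha's client: the helper names `build_tarball` and `scan_metadata`, the exclusion set, and the metadata field names are all assumptions made for the example.

```python
import io
import os
import tarfile

# Assumed exclusion list; the text above names only .git and node_modules.
EXCLUDED_DIRS = {".git", "node_modules"}


def build_tarball(root, excluded=EXCLUDED_DIRS):
    """Return a gzipped tarball of `root`, skipping excluded directories."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for dirpath, dirnames, filenames in os.walk(root):
            # Prune in place so os.walk never descends into excluded directories.
            dirnames[:] = sorted(d for d in dirnames if d not in excluded)
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                tar.add(full, arcname=os.path.relpath(full, root))
    return buf.getvalue()


def scan_metadata(project_id, branch, commit, scan_type, pr_id=None):
    """Build the metadata sent with the tarball (field names are hypothetical)."""
    meta = {
        "project_id": project_id,
        "branch": branch,
        "commit": commit,
        "scan_type": scan_type,
    }
    if pr_id is not None:
        # Only pull request scans carry a PR ID.
        meta["pr_id"] = pr_id
    return meta
```

The resulting bytes and metadata would then be sent to POST /v1/scans with your API key; the exact request encoding is not specified on this page, so it is left out of the sketch.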
2. The worker extracts the code
The worker picks up the job and extracts the tarball into an ephemeral temp directory. The directory is deleted automatically when the scan finishes, regardless of success or failure.
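The cleanup guarantee can be illustrated with a context-managed temporary directory. This is a sketch under assumptions: `scan_tarball` and the `run_scan` callback are hypothetical names, and the path checks shown here are a common precaution for untrusted archives, not a description of Musha's worker.

```python
import io
import os
import tarfile
import tempfile


def _safe_members(tar, dest):
    """Yield only regular files and directories that stay inside `dest`."""
    dest = os.path.realpath(dest)
    for member in tar.getmembers():
        if not (member.isfile() or member.isdir()):
            continue  # skip symlinks, devices, and other special entries
        target = os.path.realpath(os.path.join(dest, member.name))
        if os.path.commonpath([dest, target]) != dest:
            raise ValueError(f"unsafe path in tarball: {member.name}")
        yield member


def scan_tarball(tarball, run_scan):
    """Extract `tarball` (bytes) into a temp dir and call run_scan(path).

    TemporaryDirectory removes the directory when the block exits,
    whether run_scan returns normally or raises.
    """
    with tempfile.TemporaryDirectory(prefix="scan-") as workdir:
        with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:*") as tar:
            tar.extractall(workdir, members=_safe_members(tar, workdir))
        return run_scan(workdir)
```

Because the directory lives only inside the `with` block, a failing scan cannot leave extracted source code behind on the worker.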
3. File discovery
Musha walks the extracted directory looking for manifest files it knows how to scan:
- SCA: go.mod, package-lock.json, yarn.lock, pnpm-lock.yaml, requirements.txt, Pipfile, pyproject.toml, Cargo.lock, Gemfile.lock, composer.lock, pom.xml, build.gradle, packages.lock.json, and more.
- IaC: *.tf files (Terraform), CloudFormation YAML/JSON (detected by content heuristic), Kubernetes YAML/JSON (detected by apiVersion/kind fields).
- Secrets: all text files, excluding .git/, node_modules/, vendor/.
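The routing rules in the list above can be sketched as a single classification function. Everything here is illustrative: `classify` is a hypothetical name, the manifest set is a subset of the list, the Kubernetes check is a crude stand-in for the real heuristic, CloudFormation detection is omitted, and treating "no NUL byte" as "text file" is an assumption.

```python
from pathlib import PurePosixPath

# Subset of the SCA manifest names listed above.
SCA_MANIFESTS = {
    "go.mod", "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
    "requirements.txt", "Pipfile", "pyproject.toml", "Cargo.lock",
}
# Directories the Secrets scanner skips, per the list above.
SECRETS_SKIPPED_DIRS = {".git", "node_modules", "vendor"}


def looks_like_k8s(content):
    """Crude stand-in for the apiVersion/kind heuristic."""
    return b"apiVersion" in content and b"kind" in content


def classify(rel_path, content):
    """Return the set of scanners that should see this file."""
    path = PurePosixPath(rel_path)
    scanners = set()
    if path.name in SCA_MANIFESTS:
        scanners.add("sca")
    if path.suffix == ".tf":
        scanners.add("iac")
    elif path.suffix in {".yaml", ".yml", ".json"} and looks_like_k8s(content):
        scanners.add("iac")
    is_text = b"\x00" not in content  # assumed text detection
    if is_text and not SECRETS_SKIPPED_DIRS.intersection(path.parts[:-1]):
        scanners.add("secrets")
    return scanners
```

A file can be routed to more than one scanner: a go.mod at the repository root is both an SCA manifest and a text file eligible for secrets scanning.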
4. The secplat engine runs
secplat is Musha's open-core security engine, written in Rust. It runs three scanners in parallel:
| Scanner | Technology | Output |
|---|---|---|
| SCA | OSV advisory database | CVE IDs, affected versions, fix versions |
| IaC | Open Policy Agent (Rego rules) | Misconfiguration findings with remediation |
| Secrets | Regex catalog + Shannon entropy | Redacted snippets, provider identification |
Each scanner outputs SARIF — a standard JSON format for security findings.
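To make the SARIF step concrete, here is a minimal sketch of flattening a SARIF 2.1.0 document into finding records. The SARIF field names (`runs`, `results`, `ruleId`, `level`, `message.text`, `physicalLocation`) come from the standard; the `parse_sarif` helper, the flattened record shape, and the sample rule ID are assumptions for illustration, not secplat's actual output.

```python
import json


def parse_sarif(doc):
    """Flatten a SARIF 2.1.0 JSON string into a list of finding dicts."""
    findings = []
    for run in json.loads(doc).get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results", []):
            # Use the first reported location, if any.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            findings.append({
                "tool": tool,
                "rule_id": result.get("ruleId"),
                # Assumed fallback when a result carries no explicit level.
                "level": result.get("level", "warning"),
                "message": result.get("message", {}).get("text", ""),
                "file": physical.get("artifactLocation", {}).get("uri"),
                "line": physical.get("region", {}).get("startLine"),
            })
    return findings


# Hand-written sample; the rule ID is a placeholder, not a real rule.
SAMPLE = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "secplat"}},
        "results": [{
            "ruleId": "EXAMPLE-RULE-001",
            "level": "error",
            "message": {"text": "Example finding"},
            "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "infra/main.tf"},
                "region": {"startLine": 12},
            }}],
        }],
    }],
})
```

Calling `parse_sarif(SAMPLE)` yields one record pointing at line 12 of infra/main.tf.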
5. Deduplication via fingerprints
Every finding gets a stable fingerprint: a SHA-256 hash of rule_id + location + tenant_id. If the same vulnerability appears in two consecutive scans, it maps to the same row in the database — state, history, and assignments are preserved.
This means resolving a finding once keeps it resolved across future scans, unless the vulnerability genuinely reappears.
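A minimal sketch of the fingerprint and the upsert it enables. The SHA-256 over rule_id, location, and tenant_id is taken from the description above; the separator, the string form of `location`, the in-memory `db` dict, and the `upsert` helper are assumptions for illustration.

```python
import hashlib


def fingerprint(rule_id, location, tenant_id):
    """SHA-256 over rule_id, location, and tenant_id.

    The unit-separator join is an assumption; it keeps ("ab", "c") and
    ("a", "bc") from hashing to the same value.
    """
    payload = "\x1f".join([rule_id, location, tenant_id]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def upsert(db, finding, scan_id):
    """Insert a finding or refresh the existing row with the same fingerprint."""
    key = fingerprint(finding["rule_id"], finding["location"], finding["tenant_id"])
    # setdefault leaves an existing row (and its triage state) untouched.
    row = db.setdefault(key, {"state": "open", "first_seen": scan_id})
    row["last_seen"] = scan_id
    return row
```

If a row is marked resolved after the first scan and the same finding is reported again, `upsert` returns that same row with its state intact and only `last_seen` updated. How a genuine reappearance reopens a finding is not covered by this sketch.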
6. PR comment
If the scan includes a pr_id, Musha posts a comment to the PR on your Git platform with:
- A pass ✅ or fail ❌ status based on your configured block_severity threshold.
- Counts of blocking findings (new, above threshold), non-blocking findings, accepted risks, and technical debt (pre-existing vulns not introduced by this PR).
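The pass/fail decision and the four counts can be sketched as below. The severity scale, the finding fields (`severity`, `new`, `accepted`), the `pr_status` helper, and the choice to treat a finding exactly at the threshold as blocking are all assumptions; only the four categories and the block_severity setting come from the description above.

```python
# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def pr_status(findings, block_severity):
    """Return ("pass" | "fail", counts) for a PR scan."""
    threshold = SEVERITY_ORDER.index(block_severity)
    counts = {"blocking": 0, "non_blocking": 0, "accepted_risk": 0, "technical_debt": 0}
    for f in findings:
        if f.get("accepted"):
            counts["accepted_risk"] += 1
        elif not f["new"]:
            # Pre-existing findings never block the PR.
            counts["technical_debt"] += 1
        elif SEVERITY_ORDER.index(f["severity"]) >= threshold:
            counts["blocking"] += 1
        else:
            counts["non_blocking"] += 1
    status = "fail" if counts["blocking"] else "pass"
    return status, counts
```

With block_severity set to "high", a new critical finding fails the PR, while the same finding already present on the base branch is counted as technical debt and does not.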
7. Security dashboard
All findings appear in the Security page with full lifecycle tracking: triage, assignment, SLA tracking, and state transitions (open → in progress → resolved).
Key design decisions
Zero-agent: Musha never installs agents in your infrastructure. Your code is uploaded as a tarball — Musha has no persistent access to your repos.
Webhook-based repo scanning: For repository scanning, you push code via a CI pipeline you control. Musha does not have a persistent GitHub App or OAuth token that can read arbitrary repos.
Fingerprint stability: Findings are designed to survive refactors. Moving a file can change the fingerprint, because the location is part of the hash, but the vulnerability record in the database links the history together.
Technical debt isolation: Musha distinguishes between vulnerabilities your PR introduced and pre-existing ones in the codebase. Pre-existing vulns never block a PR. See Technical debt.
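One way to picture technical debt isolation is as a set comparison between the fingerprints found in the PR scan and those already known on the base branch. This page does not say how Musha computes the baseline, so the `split_by_origin` helper and the idea of a baseline fingerprint set are assumptions.

```python
def split_by_origin(pr_fingerprints, baseline_fingerprints):
    """Split PR scan fingerprints into (introduced, pre_existing) sets."""
    pr = set(pr_fingerprints)
    baseline = set(baseline_fingerprints)
    introduced = pr - baseline    # only these can block the PR
    pre_existing = pr & baseline  # reported as technical debt
    return introduced, pre_existing
```

Findings that exist only in the baseline are absent from both sets: the PR scan no longer reports them.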