How it works

This page explains the full scan lifecycle — from your CI push to findings in the dashboard.

Architecture overview

Your CI pipeline
      │
      │  POST /v1/scans  (tarball + metadata)
      ▼
┌─────────────────────┐
│   Musha API         │  ← validates API key, project ownership, rate limits
└────────┬────────────┘
         │  enqueues job
         ▼
┌─────────────────────┐
│   Scan Worker       │  ← picks up job from queue
│                     │
│  1. Extract tarball │
│  2. Discover files  │
│  3. Run secplat     │  ← SCA + IaC + Secrets in parallel
│  4. Parse SARIF     │
│  5. Deduplicate     │  ← stable fingerprints per finding
│  6. Persist results │
│  7. Post PR comment │  ← if scan has a pr_id
└─────────────────────┘
         │
         ▼
┌─────────────────────┐
│   Security dashboard│  ← findings visible with full triage workflow
└─────────────────────┘

Step by step

1. Your CI pipeline packages and uploads the code

The workflow packages the repository as a tarball (excluding .git, node_modules, etc.) and sends it to POST /v1/scans with metadata: project ID, branch, commit hash, PR ID (for pull request scans), and scan type.

Musha validates the API key, verifies the project belongs to your tenant, checks rate limits, and enqueues the job. The API responds immediately with a scan_id — the scan runs asynchronously.
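As a rough illustration of the upload step, the sketch below packs a repository into an in-memory tarball and builds (without sending) the POST /v1/scans request. The X-Scan-Metadata header, the bearer-token scheme, and the metadata keys are hypothetical names chosen for this example; the real request format may differ, so treat this as a sketch rather than the official client.

```python
import io
import json
import os
import tarfile
import urllib.request

# The page names .git and node_modules; the full exclusion list is not specified.
EXCLUDED_DIRS = {".git", "node_modules"}


def build_tarball(repo_root: str) -> bytes:
    """Pack repo_root into an in-memory .tar.gz, skipping excluded directories."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for dirpath, dirnames, filenames in os.walk(repo_root):
            # Prune in place so os.walk never descends into excluded directories.
            dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                tar.add(full, arcname=os.path.relpath(full, repo_root))
    return buf.getvalue()


def build_scan_request(base_url: str, api_key: str, tarball: bytes, metadata: dict):
    """Build (but do not send) the upload request.

    Assumption: metadata travels in a JSON header and the API key as a bearer
    token. Both are placeholders for whatever the real API expects.
    """
    return urllib.request.Request(
        url=f"{base_url}/v1/scans",
        data=tarball,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/gzip",
            "X-Scan-Metadata": json.dumps(metadata),  # hypothetical header
        },
    )
```

Passing the returned request to urllib.request.urlopen would send it. Because the scan runs asynchronously, a client would keep the scan_id from the response rather than wait for results.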

2. The worker extracts the code

The worker picks up the job and extracts the tarball into an ephemeral temp directory. The directory is deleted automatically when the scan finishes, regardless of success or failure.
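The cleanup guarantee described above can be sketched with a context manager. This is a minimal illustration, not the worker's actual code (the worker's implementation is not shown on this page); scan_tarball and run_scan are hypothetical names.

```python
import io
import tarfile
import tempfile


def scan_tarball(tarball: bytes, run_scan):
    """Extract the tarball into a temp directory, run the scan, always clean up."""
    # TemporaryDirectory deletes the directory when the with-block exits,
    # whether run_scan returns normally or raises.
    with tempfile.TemporaryDirectory(prefix="scan-") as workdir:
        with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:gz") as tar:
            # Uploaded archives are untrusted input. Where the running Python
            # supports extraction filters, use the "data" filter to reject
            # absolute paths and entries that escape the target directory.
            kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            tar.extractall(workdir, **kwargs)
        return run_scan(workdir)
```

Tying the directory's lifetime to a single scope means a crashed scanner cannot leave extracted source code behind on the worker.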

3. File discovery

Musha walks the extracted directory looking for manifest files it knows how to scan:

SCA: go.mod, package-lock.json, yarn.lock, pnpm-lock.yaml, requirements.txt, Pipfile, pyproject.toml, Cargo.lock, Gemfile.lock, composer.lock, pom.xml, build.gradle, packages.lock.json, and more.
IaC: *.tf files (Terraform), CloudFormation YAML/JSON (detected by content heuristic), Kubernetes YAML/JSON (detected by apiVersion/kind fields).
Secrets: all text files, excluding .git/, node_modules/, vendor/.
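The discovery rules above could be approximated as follows. This sketch makes several simplifying assumptions: the skip list is applied to the whole walk (the page only states it for secrets), "text file" means "decodes as UTF-8", Kubernetes detection is a line-prefix check on YAML files only, and CloudFormation detection is left out. The manifest set is the subset named on this page.

```python
import os
import re

SCA_MANIFESTS = {
    "go.mod", "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
    "requirements.txt", "Pipfile", "pyproject.toml", "Cargo.lock",
    "Gemfile.lock", "composer.lock", "pom.xml", "build.gradle",
    "packages.lock.json",
}
SKIPPED_DIRS = {".git", "node_modules", "vendor"}
K8S_API_VERSION = re.compile(r"^apiVersion:", re.MULTILINE)
K8S_KIND = re.compile(r"^kind:", re.MULTILINE)


def read_text(path):
    """Return the file contents as text, or None if it is not valid UTF-8."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except (UnicodeDecodeError, OSError):
        return None


def discover(root):
    """Walk root and sort files into per-scanner target lists."""
    targets = {"sca": [], "iac": [], "secrets": []}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so they are never visited.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIPPED_DIRS)
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            text = read_text(full)
            if text is None:
                continue  # binary or unreadable: ignored in this sketch
            targets["secrets"].append(rel)
            if name in SCA_MANIFESTS:
                targets["sca"].append(rel)
            elif name.endswith(".tf"):
                targets["iac"].append(rel)
            elif (name.endswith((".yaml", ".yml"))
                  and K8S_API_VERSION.search(text) and K8S_KIND.search(text)):
                targets["iac"].append(rel)
    return targets
```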

4. The secplat engine runs

secplat is Musha's open-core security engine, written in Rust. It runs three scanners in parallel:

Scanner   Technology                        Output
SCA       OSV advisory database             CVE IDs, affected versions, fix versions
IaC       Open Policy Agent (Rego rules)    Misconfiguration findings with remediation
Secrets   Regex catalog + Shannon entropy   Redacted snippets, provider identification
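To make the "regex catalog + Shannon entropy" combination concrete, here is a small Python sketch of the general technique. It is not secplat's implementation (secplat is written in Rust, and its catalog and thresholds are not documented here). The single-pattern catalog, the 3.0 threshold, and the redaction format are all assumptions for illustration.

```python
import math
import re
from collections import Counter

# A one-entry stand-in for a regex catalog: the well-known AWS access key ID shape.
PATTERNS = {"aws": re.compile(r"\bAKIA[0-9A-Z]{16}\b")}
ENTROPY_THRESHOLD = 3.0  # hypothetical cutoff, in bits per character


def shannon_entropy(s: str) -> float:
    """Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def redact(token: str, keep: int = 4) -> str:
    """Keep the first few characters so the finding is recognizable, mask the rest."""
    return token[:keep] + "*" * (len(token) - keep)


def find_secrets(line: str):
    """Return (provider, redacted_token) pairs for high-entropy regex matches."""
    hits = []
    for provider, pattern in PATTERNS.items():
        for match in pattern.finditer(line):
            token = match.group(0)
            # The regex finds candidates; the entropy check drops low-randomness
            # strings such as placeholders made of one repeated character.
            if shannon_entropy(token) >= ENTROPY_THRESHOLD:
                hits.append((provider, redact(token)))
    return hits
```

Pairing the two checks is a common way to reduce false positives: the regex identifies the provider, and the entropy score filters out strings that match the shape but are clearly not random.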

Each scanner outputs SARIF — a standard JSON format for security findings.
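For readers unfamiliar with SARIF, the sketch below flattens a SARIF 2.1.0 document into simple finding records. The SARIF property names (runs, results, ruleId, locations, physicalLocation) come from the standard; the output dictionary shape and the parse_sarif name are assumptions, not Musha's internal model.

```python
import json


def parse_sarif(document: str):
    """Flatten a SARIF 2.1.0 JSON document into a list of finding dicts."""
    sarif = json.loads(document)
    findings = []
    for run in sarif.get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results", []):
            # A result may carry several locations; this sketch keeps the first.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            findings.append({
                "tool": tool,
                "rule_id": result.get("ruleId"),
                # Assumption: fall back to "warning" when no level is given.
                "level": result.get("level", "warning"),
                "message": result.get("message", {}).get("text", ""),
                "path": physical.get("artifactLocation", {}).get("uri"),
                "line": physical.get("region", {}).get("startLine"),
            })
    return findings
```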

5. Deduplication via fingerprints

Every finding gets a stable fingerprint: a SHA-256 hash of rule_id + location + tenant_id. If the same vulnerability appears in two consecutive scans, it maps to the same row in the database — state, history, and assignments are preserved.

This means resolving a finding once keeps it resolved across future scans, unless the vulnerability genuinely reappears.
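A minimal sketch of fingerprint-based deduplication follows. The page states only that the fingerprint is a SHA-256 hash of rule_id + location + tenant_id; the NUL separator, the string form of location, and the in-memory dictionary standing in for the database are assumptions.

```python
import hashlib


def fingerprint(rule_id: str, location: str, tenant_id: str) -> str:
    """SHA-256 over the three identifying fields, as a hex string."""
    # Assumption: join with a NUL byte so ("ab", "c") and ("a", "bc")
    # cannot collapse into the same input.
    material = "\x00".join((rule_id, location, tenant_id))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def upsert(store: dict, fp: str, scan_id: str) -> dict:
    """Create the finding on first sight; otherwise reuse the existing record."""
    record = store.setdefault(fp, {"state": "open", "seen_in": []})
    record["seen_in"].append(scan_id)
    return record
```

Because the second scan computes the same fingerprint, upsert returns the existing record, so a state such as "resolved" set after the first scan is still there after the second.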

6. PR comment

If the scan includes a pr_id, Musha posts a comment to the PR on your Git platform with:

A pass ✅ or fail ❌ status based on your configured block_severity threshold.
Counts of blocking findings (new, above threshold), non-blocking findings, accepted risks, and technical debt (pre-existing vulns not introduced by this PR).
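The pass/fail decision and the four counts can be sketched like this. The severity scale, the finding fields (severity, is_new, accepted), the precedence between categories, and the comment layout are all assumptions; only block_severity and the four categories come from this page.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]  # assumed scale


def summarize(findings, block_severity):
    """Count findings per category relative to the block_severity threshold."""
    threshold = SEVERITY_ORDER.index(block_severity)
    counts = {"blocking": 0, "non_blocking": 0, "accepted": 0, "technical_debt": 0}
    for finding in findings:
        if finding.get("accepted"):
            counts["accepted"] += 1            # risk explicitly accepted
        elif not finding["is_new"]:
            counts["technical_debt"] += 1      # pre-existing, never blocks
        elif SEVERITY_ORDER.index(finding["severity"]) >= threshold:
            counts["blocking"] += 1            # new and at or above threshold
        else:
            counts["non_blocking"] += 1        # new but below threshold
    return counts


def render_comment(counts):
    """Render a plain-text PR comment body from the counts."""
    status = "❌ fail" if counts["blocking"] else "✅ pass"
    return "\n".join([
        f"Musha scan: {status}",
        f"Blocking: {counts['blocking']}",
        f"Non-blocking: {counts['non_blocking']}",
        f"Accepted risks: {counts['accepted']}",
        f"Technical debt: {counts['technical_debt']}",
    ])
```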

7. Security dashboard

All findings appear in the Security page with full lifecycle tracking: triage, assignment, SLA tracking, and state transitions (open → in progress → resolved).
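The state transitions can be modeled as a small lookup table. The open → in progress → resolved path is from this page; the resolved → open edge is an assumption based on the note that a resolved finding can reappear, and the state identifiers are illustrative.

```python
ALLOWED_TRANSITIONS = {
    "open": {"in_progress"},
    "in_progress": {"resolved"},
    # Assumption: a resolved finding reopens if a later scan reports it again.
    "resolved": {"open"},
}


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```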

Key design decisions

Zero-agent: Musha never installs agents in your infrastructure. Your code is uploaded as a tarball — Musha has no persistent access to your repos.

Webhook-based repo scanning: For repository scanning, you push code via a CI pipeline you control. Musha does not have a persistent GitHub App or OAuth token that can read arbitrary repos.

Fingerprint stability: Findings are designed to survive refactors. Moving a file may change a finding's fingerprint, but the vulnerability record in the database links the history together.

Technical debt isolation: Musha distinguishes between vulnerabilities your PR introduced and pre-existing ones in the codebase. Pre-existing vulns never block a PR. See Technical debt.
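One plausible way to separate introduced findings from pre-existing ones is to compare fingerprints between the base branch and the PR. How Musha actually makes this distinction is not described on this page, so the sketch below is an assumption, and split_by_origin is a hypothetical name.

```python
def split_by_origin(base_fingerprints, pr_fingerprints):
    """Partition PR findings by whether the base branch already had them."""
    base = set(base_fingerprints)
    pr = set(pr_fingerprints)
    return {
        "introduced": sorted(pr - base),      # only in the PR: may block
        "technical_debt": sorted(pr & base),  # already on the base: never blocks
    }
```

Under this scheme, only the "introduced" group would be checked against the blocking threshold.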
