Series: Securing MedScribe-R-Us | Part 3 of 5
Security tooling in CI/CD has a failure mode that I've seen at every client that tries it without a prior threat model: it produces findings that nobody acts on.
The pipeline runs. Semgrep (or whatever tool they use) flags 400 things. Developers learn to ignore the pipeline. The security team starts treating the report as a checkbox. The tool is "integrated" in the sense that it runs, but not in the sense that it changes what gets shipped.
The root cause is almost always the same: the tooling wasn't configured against the actual threat model. Default rulesets catch generic vulnerabilities, not the specific risks of your specific system. Without that specificity, the signal-to-noise ratio is bad enough that real findings get buried alongside hundreds of irrelevant ones.
Phase 1 (P1) gave us the threat model. Phase 2 (P2) is where it gets enforced. Follow along on GitHub here: https://github.com/LeSpookyHacker/medscribe-r-us-appsec
Five Tools, One Pipeline
MedScribe-R-Us's CI/CD pipeline runs five security workflows in GitHub Actions. Each one is a separate file. Each has a different trigger strategy, a different failure policy, and a direct line back to specific findings in the P1 STRIDE register.
```
.github/workflows/
├── sast.yml        ← Semgrep (custom rules + community packs)
├── sca.yml         ← pip-audit, npm audit, OWASP Dependency-Check
├── secrets.yml     ← Gitleaks (full history scan)
├── container.yml   ← Trivy (CVEs + misconfigs + delta report)
└── dast.yml        ← OWASP ZAP (3 authenticated scan contexts)

semgrep-rules/
├── phi-in-logs.yml         ← T-006: PHI in application logs
├── auth-missing.yml        ← T-011: missing auth, tenant scope bypass
└── llm-output-handling.yml ← LLM02: insecure LLM output consumption

.gitleaks.toml ← Custom patterns: GCP SA keys, MongoDB URIs, FHIR secrets
```
SAST: Rules That Address Actual Threats
The Semgrep workflow runs two passes. The first uses the custom ruleset in `semgrep-rules/` — three files, each mapped to specific STRIDE findings. The second runs community packs (`p/python`, `p/owasp-top-ten`, `p/jwt`).
The gate logic is different for each. Custom rules: any finding at any severity blocks the PR. Community rules: ERROR severity blocks merge to `main`, and WARNING creates an issue. The asymmetry is intentional: custom rules are tuned to this codebase with zero false-positive tolerance, while community rules are generic and need some noise tolerance.
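To make the asymmetry concrete, here is a minimal sketch of that decision as a plain Python function. The `Finding` type, its `source` field, and the `gate_action` name are hypothetical and not taken from the repo. The handling of community ERROR findings on branches other than `main` is my assumption, since the policy above only specifies `main`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: str  # "ERROR", "WARNING", or "INFO"
    source: str    # "custom" or "community" (hypothetical field)


def gate_action(finding: Finding, target_branch: str) -> str:
    """Return 'block', 'issue', or 'pass' for a single Semgrep finding."""
    # Custom rules: any finding at any severity blocks the PR.
    if finding.source == "custom":
        return "block"
    # Community rules: ERROR blocks merges into main.
    if finding.severity == "ERROR" and target_branch == "main":
        return "block"
    # Community rules: WARNING is tracked as an issue instead of blocking.
    if finding.severity == "WARNING":
        return "issue"
    # Assumption: everything else is reported without gating.
    return "pass"
```

The point of the sketch is that the rule's origin is checked before its severity, so a custom-rule WARNING still blocks while a community WARNING only opens an issue.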
phi-in-logs.yml — addressing T-006
T-006 in the STRIDE register is "PHI in Application Logs" — rated Critical because application logs are not subject to the same access controls as the clinical data stores, are aggregated in Datadog, and are accessible to a broader set of engineers. It's one of the most common HIPAA violations in SaaS products.
The rule targets four patterns:
```yaml
- id: phi-variable-in-logger
  severity: ERROR
  patterns:
    - pattern: $LOGGER.$METHOD(..., $VAR, ...)
    - metavariable-regex:
        metavariable: $LOGGER
        regex: '^(logger|log|logging|LOGGER|LOG)$'
    - metavariable-regex:
        metavariable: $VAR
        regex: '.*(transcript|patient|phi|note|audio|deid|soap|clinical|encounter|mrn).*'
```
The other three patterns cover f-string interpolation in log calls, exception handlers that log the full request body, and print() calls with PHI-pattern variable names — the classic debug statement that makes it to production.
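A quick way to get a feel for what the two `metavariable-regex` filters accept is to run the same expressions through Python's `re` module. This is only an approximation of Semgrep's matching, assuming its regex behaves like a left-anchored match for these patterns, and `would_flag` is a hypothetical helper, not part of the ruleset.

```python
import re

# Same expressions as the rule's two metavariable-regex filters.
LOGGER_NAME = re.compile(r'^(logger|log|logging|LOGGER|LOG)$')
PHI_VAR = re.compile(
    r'.*(transcript|patient|phi|note|audio|deid|soap|clinical|encounter|mrn).*'
)


def would_flag(logger_name: str, var_name: str) -> bool:
    """Approximate whether both filters accept a logger/variable pair."""
    return bool(LOGGER_NAME.match(logger_name)) and bool(PHI_VAR.match(var_name))


print(would_flag("logger", "patient_name"))     # PHI-style name on a known logger
print(would_flag("logger", "session_id"))       # no PHI keyword in the name
print(would_flag("logger", "graphics_config"))  # "graphics" contains "phi"
print(would_flag("audit_log", "patient_name"))  # logger name is not in the list
```

The third case shows why suppressions need a process: substring matching is deliberately broad, so a harmless name can trip the rule. The fourth shows the opposite edge, where a logger object with an unlisted name would slip past this particular pattern.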
A false positive suppression looks like this:
```python
# nosemgrep: phi-variable-in-logger
# Justification: `session_id` is a UUID generated by MedScribe;
# no patient data. Confirmed in code review 2024-01-15.
logger.info("Session completed", session_id=session_id)
```
No justification comment means the suppression gets removed at the next quarterly review. The gate enforces this: a bare `# nosemgrep` without a justification comment still triggers in CI.
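The post doesn't show how the gate detects a bare suppression, so the following is only one way it could be done: scan each source file for `# nosemgrep` and require a `# Justification:` comment on the next line. The function name and the "next line" convention are assumptions based on the suppression example above.

```python
import re

NOSEMGREP = re.compile(r'#\s*nosemgrep\b')
JUSTIFICATION = re.compile(r'#\s*Justification:\s*\S')


def unjustified_suppressions(source: str) -> list[int]:
    """Return 1-based line numbers of nosemgrep comments that are not
    followed by a '# Justification: ...' comment on the next line."""
    lines = source.splitlines()
    bad = []
    for index, line in enumerate(lines):
        if NOSEMGREP.search(line):
            next_line = lines[index + 1] if index + 1 < len(lines) else ""
            if not JUSTIFICATION.search(next_line):
                bad.append(index + 1)
    return bad
```

A CI step could run this over changed files and fail when the returned list is non-empty.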
auth-missing.yml — addressing T-011
T-011 is "Admin Portal Horizontal Privilege Escalation" — an admin endpoint that validates "is this user an admin?" rather than "is this admin scoped to this tenant?" means a Clinic Admin can traverse into another health system's data.
Two rules here. The first detects FastAPI route handlers that lack an authentication dependency:
```yaml
- id: fastapi-route-missing-auth-dependency
  patterns:
    - pattern: |
        @$ROUTER.$METHOD("...")
        async def $FUNC($PARAMS):
          ...
    - pattern-not: |
        @$ROUTER.$METHOD("...")
        async def $FUNC(..., $DEP: $TYPE = Depends(...), ...):
          ...
```
The second catches admin routes that pull `tenant_id` from path parameters rather than from the authenticated user's JWT claims — the exact pattern that enables horizontal privilege escalation:
```yaml
- id: admin-route-missing-tenant-scope
  severity: ERROR
  patterns:
    - pattern: |
        @$ROUTER.$METHOD("/admin/...")
        async def $FUNC(..., tenant_id: $TYPE, ...):
          ...
    - pattern-not: |
        @$ROUTER.$METHOD("/admin/...")
        async def $FUNC(..., $USER = Depends(...), ...):
          ...
          $USER.tenant_id
```
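What the second rule is pushing developers toward can be sketched without any framework: the tenant to query comes from the verified token, and a tenant ID in the path is at most something to compare against. The function name and the `tenant_id` claim key are assumptions for illustration, and the claims dict is presumed to have been verified already.

```python
from typing import Optional


def resolve_tenant_id(jwt_claims: dict, path_tenant_id: Optional[str] = None) -> str:
    """Return the tenant ID to query, always taken from verified JWT claims."""
    claim_tenant = jwt_claims.get("tenant_id")  # assumed claim name
    if not claim_tenant:
        raise PermissionError("token carries no tenant scope")
    # A path parameter may be present, but it never widens the scope.
    if path_tenant_id is not None and path_tenant_id != claim_tenant:
        raise PermissionError("tenant in path does not match token scope")
    return claim_tenant
```

With this shape, a Clinic Admin who edits the tenant ID in the URL gets a permission error instead of another health system's data.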
llm-output-handling.yml — addressing LLM02
This ruleset catches the pattern where raw LLM output gets consumed before it clears the Output Validation Service. Three variants: direct field access on the raw response object, writing raw LLM content directly to MongoDB, and json.loads() on LLM output outside a try/except block.
```yaml
- id: llm-output-used-before-validation
  severity: ERROR
  patterns:
    - pattern: $VAR.$FIELD
    - metavariable-regex:
        metavariable: $VAR
        regex: '^(raw_response|llm_output|vertex_response|gemini_response)$'
    - metavariable-regex:
        metavariable: $FIELD
        regex: '^(text|content|candidates|assessment|plan|subjective|objective)$'
```
The fourth rule catches the PHI scrubbing bypass pattern — a raw transcript variable passed directly to prompt construction functions without going through the scrubbing layer first. This is the automated enforcement of the most critical control in the L2 DFD.
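For contrast with the patterns these rules flag, here is a rough sketch of what "clearing validation" might look like for a SOAP note returned as JSON. The real Output Validation Service isn't shown in this post, so the function, the exception class, and the assumption that each SOAP section is a string are all hypothetical.

```python
import json

REQUIRED_FIELDS = ("subjective", "objective", "assessment", "plan")


class LLMOutputRejected(ValueError):
    """Raised when model output fails validation and must not be stored."""


def validate_llm_output(raw_text: str) -> dict:
    """Parse and shape-check model output before anything else consumes it."""
    try:
        parsed = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise LLMOutputRejected(f"not valid JSON: {exc.msg}") from exc
    if not isinstance(parsed, dict):
        raise LLMOutputRejected("expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if not isinstance(parsed.get(name), str)]
    if missing:
        raise LLMOutputRejected(f"missing or non-string fields: {missing}")
    # Return only the expected fields so unexpected keys never reach storage.
    return {name: parsed[name] for name in REQUIRED_FIELDS}
```

Downstream code then works with the returned dict rather than touching the raw response object, which is the access pattern the Semgrep rule looks for.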
SCA: The Supply Chain Surface
The SCA workflow runs a separate scan for each ecosystem (pip-audit for Python, npm audit for Node.js) and unifies the results into the same SARIF upload path feeding GitHub Advanced Security.
The gate threshold is CVSS ≥ 9.0 with an available fix. This keeps the gate's signal-to-noise ratio high — it only blocks on Critical CVEs that are both severe and actionable. A CVSS 8.x CVE in a PHI-touching package (MongoDB driver, FHIR client, JWT library) gets escalated manually during triage because the contextual risk is higher than the base score implies.
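As a sketch of that triage policy in code: block only when the score is 9.0 or higher and a fix exists, and route 8.x findings in PHI-touching packages to a human. The package names in `PHI_TOUCHING`, the `Vuln` type, and the `sca_action` function are hypothetical. Cases the policy doesn't mention, such as a 9.x CVE with no fix, fall through to plain reporting here as an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names standing in for the MongoDB driver, FHIR client, and JWT library.
PHI_TOUCHING = {"pymongo", "fhirclient", "pyjwt"}


@dataclass(frozen=True)
class Vuln:
    package: str
    cvss: float
    fixed_version: Optional[str]  # None when no fix has been released


def sca_action(vuln: Vuln) -> str:
    """Return 'block', 'escalate', or 'report' for one dependency finding."""
    # Gate: severe and actionable.
    if vuln.cvss >= 9.0 and vuln.fixed_version:
        return "block"
    # Manual escalation: 8.x in a package that handles PHI.
    if 8.0 <= vuln.cvss < 9.0 and vuln.package.lower() in PHI_TOUCHING:
        return "escalate"
    # Assumption: everything else is reported without gating.
    return "report"
```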
The SCA workflow also runs daily at 06:00 UTC on a schedule — not just on PRs. This catches newly published CVEs against pinned dependencies in the window between code changes. A CVE published against cryptography 41.0.3 on a Tuesday afternoon fires on the Wednesday morning scan even if no code changed overnight.
One finding that surfaced during pipeline testing: a transitive dependency in the FHIR client library pinned to a version of requests with a known SSRF vulnerability. The FHIR client itself doesn't use the vulnerable code path, but MedScribe's Audio Ingestion Service uses requests for outbound webhook delivery — and if webhook URLs are configurable by Clinic Admins, the SSRF surface becomes real. The threat model didn't flag this specifically. The SCA tool did. That's the right division of labor between the two.
Secrets Detection: The Pre-commit Hook Is the Real Gate
The Gitleaks workflow scans the full commit history on every push — not just the diff. A secret committed three months ago and "deleted" in a later commit is still in git history and still requires rotation regardless of the current file state.
But the CI gate is the wrong place to catch secrets. By the time a secret reaches CI, it's in the repository, it's been pushed to GitHub's servers, and it needs to be rotated whether or not the build fails. The pre-commit hook catches it before the commit exists.
```bash
# Setup from repo root
pip install pre-commit
pre-commit install
```
The `.gitleaks.toml` adds MedScribe-specific patterns on top of the default ruleset: GCP service account key files (the JSON `"type": "service_account"` pattern), MongoDB Atlas connection strings (`mongodb+srv://user:pass@cluster.mongodb.net`), Vertex AI API keys (`AIza...`), and SMART on FHIR OAuth client secrets.
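To show the kind of matching two of those custom patterns perform, here is a small Python sketch. These regexes are my own illustrations, not the ones in `.gitleaks.toml`, and real rules would usually be stricter about context and entropy.

```python
import re

# Illustrative patterns only; the repo's actual Gitleaks rules are not reproduced here.
PATTERNS = {
    # mongodb+srv://<user>:<password>@<host>
    "mongodb-atlas-uri": re.compile(r'mongodb\+srv://[^\s:/@]+:[^\s@]+@[^\s/]+'),
    # The marker field present in GCP service account key files.
    "gcp-service-account-key": re.compile(r'"type"\s*:\s*"service_account"'),
}


def find_secrets(text: str) -> list[str]:
    """Return the names of all patterns that match somewhere in the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))
```

Note that the MongoDB pattern requires embedded credentials, so a connection string without a `user:pass@` part does not match.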
There is no warn-only mode for secrets. A detected secret means the credential is assumed compromised, rotation is required immediately, and the security team is notified before any merge discussion happens. The PR comment that fires on detection makes this explicit — it calls it a security incident, not a build failure.
Container Scanning: The Surface SAST Can't See
Trivy scans three surfaces per image: OS-level CVEs in the base image, library CVEs in installed packages, and Dockerfile misconfigurations.
The OS-level surface is what SAST misses entirely — vulnerabilities in packages installed via apt that the Python and Node.js dependency scanners don't know about. It also catches secrets baked into image layers: a COPY .env that was removed in a later RUN layer is still present in the image history and extractable by anyone with pull access to Artifact Registry.
Both MedScribe service images use distroless base images (`gcr.io/distroless/python3`, `gcr.io/distroless/nodejs20`). Distroless removes the shell, package manager, and most OS-level packages, which eliminates large portions of the CVE surface and takes away the shell an attacker would rely on after compromising a container.
The delta report is the most useful feature for day-to-day developer experience. Instead of showing all CVEs in the image, it shows only the CVEs introduced by the current change compared to the previously deployed image. PR review stays focused on new risk:
```
## 🛡️ Container CVE Delta Report

⚠️ 1 new CVE(s) introduced:

| CVE           | Severity | Package         | Fixed In |
|---------------|----------|-----------------|----------|
| CVE-2024-1234 | HIGH     | urllib3 1.26.18 | 2.0.7    |
```
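The delta itself is just a set difference between two scan results. A minimal sketch follows, assuming each scan has already been parsed into simple records. The `Cve` type, the function names, and the choice to key on CVE ID plus package name are mine, not the workflow's.

```python
from typing import NamedTuple


class Cve(NamedTuple):
    cve_id: str
    severity: str
    package: str
    version: str
    fixed_in: str


def cve_delta(previous: list[Cve], current: list[Cve]) -> list[Cve]:
    """Return CVEs in the current image that the previous image did not have."""
    # Key on (CVE, package) so a version bump alone does not look like new risk.
    seen = {(c.cve_id, c.package) for c in previous}
    return sorted(c for c in current if (c.cve_id, c.package) not in seen)


def render_rows(delta: list[Cve]) -> str:
    """Render the delta as Markdown table rows for a PR comment."""
    header = ["| CVE | Severity | Package | Fixed In |", "|---|---|---|---|"]
    rows = [
        f"| {c.cve_id} | {c.severity} | {c.package} {c.version} | {c.fixed_in} |"
        for c in delta
    ]
    return "\n".join(header + rows)
```

Feeding it the previously deployed image's scan and the PR image's scan yields only the rows a reviewer needs to look at.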
DAST: Testing the Running Application
DAST is qualitatively different from the other four tools — it's the only one that tests what the application actually does at runtime, not what the code says it should do. OWASP ZAP runs post-deploy to staging, not on PR, because it needs a running application.
Three scan contexts:
Unauthenticated baseline: covers security headers (CSP, HSTS, X-Frame-Options), information disclosure in error responses, open redirects, and common injection patterns. This is the surface any internet scanner would see.
Authenticated (clinician role): covers the highest-risk authenticated surface, namely PHI access controls on the note retrieval endpoints, the approval gate enforcement, the audio upload endpoint, and the SOAP note editor (stored XSS surface). It uses a dedicated test clinician account in staging.
Authenticated (admin role): specifically targets T-011, horizontal privilege escalation. ZAP's active scanner manipulates path parameters and query strings on tenant-scoped admin endpoints, attempting to access data outside the test account's tenant. This is the automated equivalent of what a penetration tester would try on the Admin Portal.
Gate logic: High severity ZAP findings block the next environment promotion. Medium severity findings auto-create GitHub Issues with `severity:medium`, `source:dast`, and `status:open` labels and a 30-day SLA.
The ZAP rules file (.zap/rules.tsv) documents every intentional IGNORE decision — findings suppressed because they're accepted risks, not because they're false positives. Every IGNORE entry has a comment. Undocumented ignores don't exist in this configuration.
The Developer Guide
All of this tooling only works if engineers understand it. The developer guide at `docs/sdlc/developer-guide.md` covers how to read each tool's output, how to write a valid suppression, and when to escalate. The suppression policy is specific: a justification comment is required, and suppressions without one get removed at the quarterly review.
The guide closes with five golden rules short enough to memorize:
1. PHI never goes in logs. Ever.
2. Every FastAPI route that touches PHI requires an auth dependency.
3. LLM output is untrusted until it clears the Output Validation Service.
4. Secrets go in Secret Manager. Never hardcoded, never in `.env` files.
5. If you're unsure, ask.
The Full Gate Summary
| Tool | Trigger | Blocks On |
|---|---|---|
| Semgrep custom rules | Every PR | Any finding |
| Semgrep community | Every PR | ERROR severity on main |
| pip-audit / npm audit | Every PR | CVSS ≥ 9.0 with fix |
| Gitleaks | Every push | Any secret detected |
| Trivy | Push to main/staging | Critical or High with fix |
| ZAP | Post-deploy to staging | High severity alert |
That's the pipeline. Not a checkbox. A loop with teeth.
What P3 Builds On This
P2 generates findings. P3 defines what happens to them — the vulnerability management program that assigns SLAs, tracks remediation, and produces the security metrics that make the program legible to leadership. The gate policy already references SLA tiers that don't formally exist until P3 defines them. P3 closes that loop.
*All workflows, custom Semgrep rules, Gitleaks config, and policy documents are in the repo under .github/workflows/, semgrep-rules/, and docs/sdlc/. All companies, scenarios, and clinical details are fictional.*
The full repo is on GitHub at https://github.com/LeSpookyHacker/medscribe-r-us-appsec/
— LeSpookyHacker
Small Glossary of Acronyms
Since you are reading this, I am going to assume you know most general AppSec acronyms, so I will only define the medical-specific ones and newer acronyms that some people in the security field may not know yet.
- ATLAS: Adversarial Threat Landscape for AI Systems (MITRE framework)
- BA: Business Associate (under HIPAA)
- CSF: Cybersecurity Framework (NIST) / Common Security Framework (HITRUST)
- EMR: Electronic Medical Record
- FHIR: Fast Healthcare Interoperability Resources (R4 refers to Release 4)
- HIPAA: Health Insurance Portability and Accountability Act
- HITECH: Health Information Technology for Economic and Clinical Health Act
- HITRUST: Health Information Trust Alliance
- MRN: Medical Record Number
- PHI: Protected Health Information
- PII: Personally Identifiable Information
- SOAP: Subjective, Objective, Assessment, and Plan (Medical clinical note format)
- SOC: System and Organization Controls
- STRIDE: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege (Threat modeling framework)