Essay: Agentic Coding Needs Human Sign-Off Anchored in Physical Reality

2026-05-21 · Reposted from LinkedIn · AI Governance, Software Security, AI Coding, Application Security, Zero Trust, DevSecOps, coding-ethos

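As AI coding agents move from passive autocomplete tools to autonomous contributors that execute entire feature branches, we are heading straight for a massive security blind spot: how do we prove that a real human actually reviewed and validated agent-generated code before it shipped? This is not a new problem, but it has clearly become more urgent.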

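In my project coding-ethos, we focus on building policy-as-code guardrails for AI agents: CEL policies, Git hooks, sandboxing, and MCP servers that ensure autonomous agents cannot ship code that violates team standards, even when no human is in the loop.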

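But even the most robust automated gates are only half the job. The final layer of defence in depth still requires real human eyes on critical code. In a fully agentic workflow, traditional SSH or GPG commit signing is no longer sufficient, and it has often been automated already. If the agent process or local environment is compromised, or redirected by a sophisticated prompt injection, those stored credentials can be misused. Humans can also simply get lazy.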

We need a zero-trust developer confirmation model that is cryptographically tied to physical reality:

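Biometric verification: fast, low-friction validation, such as Face ID or Touch ID, that proves a living, authorized developer is at the screen.
Temporal verification: ensuring the human approval happens precisely within the commit window, which eliminates replay attacks.
Geophysical verification: confirming that the developer's physical location matches expected telemetry and trusted boundaries.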
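
To make the temporal and geophysical checks concrete, here is a minimal Python sketch of what a verifier might look like. Everything in it is an assumption for illustration, not coding-ethos code: the `SignOff`, `TrustedZone`, and `SignOffVerifier` names, the single-use nonce, the 120-second window, and the circular geofence are all hypothetical. The location is whatever the client reports, so on its own it is a weak signal that would need corroborating telemetry.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SignOff:
    commit_hash: str    # commit the human claims to have reviewed
    nonce: str          # single-use challenge issued by the verifier
    approved_at: float  # Unix time of the biometric prompt
    lat: float          # client-reported latitude (hypothetical telemetry)
    lon: float          # client-reported longitude


@dataclass(frozen=True)
class TrustedZone:
    lat: float
    lon: float
    radius_km: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class SignOffVerifier:
    def __init__(self, zones, max_age_s=120.0):
        self.zones = zones
        self.max_age_s = max_age_s  # assumed size of the commit window
        self._issued = {}           # nonce -> commit hash it was issued for

    def issue(self, commit_hash, nonce):
        self._issued[nonce] = commit_hash

    def verify(self, signoff, now):
        """Return a list of failed checks; an empty list means all passed."""
        failures = []
        # pop() consumes the nonce, so a replayed sign-off fails this check.
        if self._issued.pop(signoff.nonce, None) != signoff.commit_hash:
            failures.append("nonce")
        age = now - signoff.approved_at
        if not 0 <= age <= self.max_age_s:
            failures.append("stale")
        in_zone = any(
            haversine_km(signoff.lat, signoff.lon, z.lat, z.lon) <= z.radius_km
            for z in self.zones
        )
        if not in_zone:
            failures.append("location")
        return failures
```

The biometric leg is deliberately absent here; it would come from the platform authenticator rather than from application code like this.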

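When an autonomous agent proposes a critical architectural change, the final gate should not be just a green checkmark in a CI pipeline. It needs an unspoofable human assertion.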

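I am currently designing this defence layer for coding-ethos, and I want to open the discussion to my network: how does your engineering team draw the line between automated policy enforcement and mandatory human sign-off? As agents handle ever larger slices of the codebase, how do we keep reviewer fatigue from turning human verification into a rubber stamp?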

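Discussion welcome. I am actively moving this verification framework from design pattern to real platform integration. If you are building biometric fast-identity products, or running an enterprise software supply-chain security platform, and would like to explore a pilot integration with coding-ethos, please get in touch.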

Threat Modeling Autonomous Dev Agents: How do we cryptographically prove a human actually reviewed a commit?

2026-05-21 · Reposted from Reddit (r/cybersecurity) · FOSS Tool

Hey everyone,

I’ve been spending a lot of time lately threat-modelling fully agentic coding workflows. As tools move from passive autocomplete to autonomous agents that execute entire feature branches, we are opening a massive supply-chain blind spot.

I maintain an open-source project called coding-ethos, which focuses on building policy-as-code guardrails for AI agents (using CEL policies, Git hooks, sandboxing, and MCP servers) to ensure agents can’t ship code that violates team standards. But even with robust automated gates, I keep hitting a wall with the ultimate layer of defence-in-depth: human verification.

* I have some very mathy thoughts about this, but I've kept them out of the post for now *

The Threat Vector

Traditional SSH or GPG commit signing is no longer sufficient. If a local environment or agent process is compromised—say, via a sophisticated prompt injection or a malicious package—those stored credentials can be hijacked by the agent to sign off on a malicious commit. If it passes the automated CI/CD tests, it merges.

How do we prove that "real eyes" actually reviewed critical code before it hits production?

The Proposed Defence Layer

I'm working on integrating a zero-trust developer confirmation model for critical commits that is cryptographically tied to physical reality. To actually trust an agent's output, the human sign-off needs to be:

Biometrically Verified: Fast, low-friction validation (e.g., WebAuthn/Passkeys via TouchID/FaceID) that proves a living, authorized developer is actively at the glass, signing the specific commit hash.
Temporally Verified: Ensuring the human approval happens precisely at the moment of the commit window to eliminate replay attacks or asynchronous approvals.
Geophysically Verified: Confirming the physical location/telemetry of the developer aligns with expected trusted boundaries at the time of signing.
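
For the "signing the specific commit hash" part, one way this could work with WebAuthn is to derive the challenge from the commit hash, then check that the assertion echoes that challenge and carries the user-verified flag. The sketch below is a rough illustration under assumptions: `challenge_for` and `check_assertion` are hypothetical helpers, the nonce-plus-hash challenge construction is my own choice, and the origin and RP ID would be whatever the sign-off service uses. It only inspects `clientDataJSON` and the authenticator data flags. A real verifier must also check the authenticator's signature over the authenticator data and the SHA-256 of `clientDataJSON`, which needs a crypto library and is left out here.

```python
import base64
import hashlib
import json

UP_FLAG = 0x01  # WebAuthn "user present" bit in the flags byte
UV_FLAG = 0x04  # WebAuthn "user verified" bit (biometric or PIN succeeded)


def b64url(data: bytes) -> str:
    """Base64url without padding, as used for the challenge in clientDataJSON."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def challenge_for(commit_hash: str, nonce: bytes) -> bytes:
    # Hypothetical construction: bind a server nonce to the exact commit.
    return hashlib.sha256(nonce + bytes.fromhex(commit_hash)).digest()


def check_assertion(client_data_json, authenticator_data,
                    commit_hash, nonce, origin, rp_id):
    """Return a list of failed checks. Signature verification is NOT done here."""
    problems = []
    client = json.loads(client_data_json)
    if client.get("type") != "webauthn.get":
        problems.append("type")
    if client.get("challenge") != b64url(challenge_for(commit_hash, nonce)):
        problems.append("challenge")
    if client.get("origin") != origin:
        problems.append("origin")
    # Authenticator data layout: 32-byte rpIdHash, 1 flags byte, 4-byte counter.
    if len(authenticator_data) < 37:
        problems.append("authenticator_data")
        return problems
    if authenticator_data[:32] != hashlib.sha256(rp_id.encode()).digest():
        problems.append("rp_id")
    flags = authenticator_data[32]
    if not flags & UP_FLAG:
        problems.append("user_presence")
    if not flags & UV_FLAG:
        problems.append("user_verification")
    return problems
```

Because the challenge is a hash of the commit, an assertion captured for one commit fails the `challenge` check when presented for any other.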

The Problem

When an autonomous agent proposes a critical architectural change, a green checkmark from a CI pipeline isn't enough. It needs to be an un-spoofable human assertion, but it also can't be so high-friction that developers just blindly spam their fingerprint reader out of "reviewer fatigue."
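
One possible way to keep the friction down is to reserve the biometric prompt for changes that touch critical paths and let automated policy handle the rest. The Python sketch below shows the idea only; the pattern list, the `critical_paths` and `required_gate` names, and the two gate labels are all hypothetical, and a real policy would more likely live in CEL alongside the other guardrails.

```python
from fnmatch import fnmatchcase

# Hypothetical examples of paths a team might treat as critical.
CRITICAL_PATTERNS = (
    "auth/*",
    "infra/*",
    "migrations/*",
    ".github/workflows/*",
    "*.tf",
)


def critical_paths(changed, patterns=CRITICAL_PATTERNS):
    """Return the changed paths that match any critical pattern."""
    # fnmatchcase's "*" also matches "/" so "auth/*" covers nested files.
    return [p for p in changed if any(fnmatchcase(p, pat) for pat in patterns)]


def required_gate(changed, agent_authored):
    """Decide which gate applies to a change set."""
    if agent_authored and critical_paths(changed):
        return "human_signoff"
    return "automated_policy"
```

With a rule like this, an agent-authored docs change would pass through automated gates alone, while an agent-authored change under `auth/` would trigger the sign-off prompt.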

I'm currently trying to take this from a design pattern into a live architecture within coding-ethos, but I want a sanity check from this sub:

How are your AppSec teams drawing the line between automated policy enforcement and hard human sign-off for AI-generated code?
Has anyone started integrating biometric auth directly into pre-commit/pre-push git hooks for critical branch merges?
What are the obvious bypasses to this triad (Biometric/Temporal/Geophysical) that I am missing in my threat model?

I would love to hear your thoughts or see if anyone else is building in this exact IAM/AppSec intersection.

Permalink: https://patrickaudley.com/posts/human-signoff-for-agentic-code.html