Skip to content
AI Culture
9 min readMara Voss

Good Engineers Don't Need Rules Written Down. AI Changed That.

Teams adopted AI coding tools without rewriting the unwritten norms that governed human collaboration. Those norms are being written the hard way now: through public refusals, terse review comments, and Reddit threads with hundreds of replies.

Good Engineers Don't Need Rules Written Down. AI Changed That.

Quiet tension, no rule to point to. The friction at PR review is organizational, not technical.

An engineer on a popular developer forum announced to her team that she would no longer review AI-generated PRs. Not quietly declined. Announced it in a team meeting. The thread that followed got hundreds of replies. [4]

No rule was violated. That is the point.

The refusal landed so hard because no rule existed to govern what she owed the author, or what the author owed her. Everyone felt the friction and nobody could name the source. On another thread from the same community, a different developer complained that their coworker was using AI to reply to review comments. [5] Same friction, opposite direction. Same root cause: the AI coding team norms that governed this exchange were never rewritten for a world where AI is one of the authors.

This is not a code-quality argument. Code quality is the second-order concern here, and the threads are the evidence: the replies were not about bug rates or diff readability. They were about reciprocity. Effort. What a person owes another person when the work arrives in their queue. Those norms survived for decades without being written down because they did not need to be. Everyone who submitted a PR understood what they were submitting. AI disrupts that at the authorship stage and leaves the reviewer holding an expectation that was never updated.

Teams adopted AI coding tools without rewriting any of this. The organizational debt is now coming due at every PR review.

If you are watching your team navigate what professional norms survive when AI enters the discipline, this is the upstream version of that question.

The refusals started before anyone made a rule

The PR-refusal thread is not an isolated event. It is a norm being declared out loud because none existed to declare it quietly. [4]

That is what a norm vacuum looks like in public. When a rule is missing, someone eventually says it out loud at a meeting. The declaration feels dramatic because it is. The underlying situation is not dramatic; it is just overdue. The team needed a shared standard and was never given one, so the first person to reach a limit named it for everyone.

The second thread is different in tone but identical in structure. The frustration there is about AI-generated review responses: a developer feels disrespected when a coworker uses AI to respond to their careful feedback. [5] The complaint is not that the responses are wrong. The complaint is that the reviewer put in effort and the author outsourced the reciprocation.

That is an effort-asymmetry problem, not a code-correctness problem. The reviewer is applying a norm that says: if I think carefully about your work, you owe me considered engagement back. The author never agreed to that norm, not because they are careless, but because nobody ever held the conversation.

Both threads got hundreds of replies because everyone recognized the situation. The recognition is the signal. These are not edge cases. These are teams working through the same unwritten-rules gap, one tense PR review at a time.

You predicted 24% faster. The research says 19% slower.

The friction at PR review is not imaginary. There is a data shadow behind it.

A randomized controlled trial by METR, published in 2025, tracked 16 experienced open-source developers across 246 tasks. Before starting, the developers predicted that AI would make them 24% faster. The measured result: completion time increased by 19%. [1]

Stat card: Predicted 24% faster. Measured 19% slower. Source: METR, 246 tasks, 16 experienced open-source developers. The perception gap is large and consistent. Developers predicted a 24% speed gain; the measured result was a 19% slowdown across 246 tasks. Source: METR [1].

A 43-percentage-point gap between expectation and outcome is not a rounding error. It is a structural mismatch between how developers think AI changes their work and how it actually performs on measured tasks.

Reviewers can feel the discrepancy, even when they cannot cite a paper.

AI-generated code arrives with an implied confidence. The diff is clean. The commit message is articulate. Everything looks polished. But the reviewer's instinct, developed over years of reading human-authored code, picks up something off. The estimation feels wrong. The choices feel unexplained. The author seems not to know the code as well as the surface suggests.

That instinct is not paranoia. The METR data gives it a name. The reviewer is sensing the gap between the implied confidence and the actual cost of the work. They reach for the only lever available: refusal, or a terse comment, or an unusually long review queue. Not because they want to obstruct, but because the shared language for what they are sensing does not exist yet.

The AI coding team norms were never rewritten

The unwritten norm that organized PR review for the last two decades was simple: the author understands what they submitted.

That norm was load-bearing. It governed how long reviewers spent, what questions were reasonable to ask, how much context the author was expected to provide. It survived without being written down because it was almost always true. When you wrote the code, you understood it. You might have made mistakes. You might have chosen a bad approach. But you knew what you put in.

AI disrupts that norm at the source. Simon Willison, one of the most careful AI practitioners writing publicly, acknowledged in May 2026 that even his own standards are shifting: "as the coding agents get more reliable, I'm not reviewing every line of code that they write anymore, even for my production level stuff." [2]

Willison is not a careless engineer. He is describing the natural drift of a thoughtful person adapting to new tools. The norm is moving under everyone, not just the people who are being sloppy about it.

The problem is that the norm is moving asymmetrically. The author's expectation of their own authorship is updating. The reviewer's expectation of what the author owes them is not. Nobody held the conversation that would have updated both at once.

The result is a gap at the handoff point: the author submits work produced partly through AI augmentation, under the old norm (which said: you wrote it, you own it, you know it). The reviewer reads it under the same old norm. The norm was built for a world where authorship and understanding were bundled together. AI unbundles them. The norm was never rewritten to reflect that.

This is organizational debt. It accrued quietly, at the moment of tool adoption, when the team decided to add AI to the workflow without adding a ten-minute conversation about what that means for what people owe each other. The PR queue is where the interest comes due.

This happened before. It landed the same way.

Open source communities went through this in the 1990s and 2000s.

The initial governance pattern was informal. Projects had norms, but they were tacit: if you had been there long enough, you knew how things operated. New contributors arrived, often in large numbers, without access to the tacit knowledge. Conflicts happened. Maintainers burned out. Prominent contributors made public exits and posted essays about what went wrong.

The resolution was not "try harder to communicate." The resolution was: write it down. CONTRIBUTINGs, Codes of Conduct, RFC processes, explicit policies about what maintainers owed submitters and submitters owed maintainers. The documents did not fix the underlying friction immediately. But they created a shared reference point. They made it possible to say "we have a norm for this" instead of "I personally believe."

Remote work had the same arc a decade later. The early default was "just communicate like you would in the office, but use Slack." That worked until it did not. The question of whether a message required an immediate reply, the norm around meetings, the Slack-versus-email debate: all of it had to be made explicit because the old tacit norms assumed physical colocation.

The Stack Overflow Blog addressed the current version of this directly in March 2026. Guidelines for AI agents, the post argues, need to be "more explicit, demonstrative of patterns, and obvious" than human guidelines because agents lack tacit knowledge. [3] Explicit norms are now the mainstream recommendation. The community has arrived at the same conclusion the open source and remote-work communities arrived at: tacit is not stable.

Flow diagram: The same crisis, different decade. Open source in the 1990s-2000s: public conflict led to explicit CONTRIBUTINGs and Codes of Conduct. Remote work in the 2010s: Slack-vs-email friction led to async norms and no-meeting days. Both transitions followed the same arc: tacit norms met a new working pattern, friction went public, and the resolution was writing the rule down.

The argument against this analogy is worth taking seriously: "just review AI code like any other code." It was made during both earlier transitions too. "Just communicate like you're in the office." "Just follow the same OSS etiquette." In both cases, the resolution was not stricter application of the old norm. It was acknowledgment that the old norm no longer fit, followed by replacement with one that did.

AI coding tools are the same situation. The question of where control lives in the tooling itself is a separate, technical conversation (that one is covered in the companion post on AI coding tool control surfaces). The question here is what we owe each other when those tools are in the workflow. That is a cultural question. And the pattern from two prior transitions is clear: it requires an explicit conversation, not just intent to behave better.

Three questions to have instead of one more CONTRIBUTING.md

The resolution is not a document. It is a conversation that produces one.

A top-down CONTRIBUTING.md drafted by one tech lead and posted to the repo without discussion will not resolve this. The open source and remote-work precedents show why: the document without the conversation behind it does not earn the trust that makes people refer to it. The conversation is the point. The document is the artifact.

Here are three questions worth running with your team before the next conflict surfaces in a PR review. These are not universal answers. That is the point: the conversation produces answers that fit your team's actual working context, which no external document can know. [6] The Stack Overflow Blog's finding holds here too: what works is explicit, pattern-grounded, and obviously applicable. Not vague. [3]

Who owns an AI-generated change?

Not philosophically. Practically. If a reviewer finds a significant problem in an AI-generated diff, who is accountable for having submitted it? The author? The author's judgment about which AI output to accept? Someone else? The answer to this question determines what the review process is actually reviewing. It also determines whether "the AI wrote it" is a valid explanation for a decision the author can no longer defend.

What does a reviewer owe an AI-authored PR?

Does the review bar change? Does the time estimate change? Does the explanation requirement change? Reviewers are implicitly answering this question every day. The frustration in the refusal threads is partly the frustration of answering it without any team backing. Naming the answer together takes it out of each reviewer's private judgment and puts it on the table where it can be revised.

What is our shared rule for declaring AI usage in a PR description?

Not a policing mechanism. A reciprocity signal. If the author notes which parts of a diff were AI-generated and what their own review process was, the reviewer has a different kind of conversation to have. The goal is not transparency for its own sake; it is the minimum shared information that lets two people apply effort proportionately.

The teams having this conversation are not being precious about AI. They are paying down organizational debt before it compounds. The teams avoiding it, leaving their AI coding team norms unwritten, are paying interest. It comes due at every PR review.

If this is the week your team needs to have the conversation, send this.


References

Becker, A., Rush, H., Barnes, J., Rein, J. et al. (METR). "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity." https://arxiv.org/abs/2507.09089 . Published 2025. Verified on arxiv.org 2026-06-26. Randomized controlled trial; 16 experienced open-source developers; 246 tasks. Developers predicted 24% faster; measured result: 19% slower (19% increase in completion time).

Willison, Simon. "Vibe coding and agentic engineering are getting closer than I'd like." https://simonwillison.net/2026/May/6/vibe-coding-and-agentic-engineering/ . Published 6 May 2026. Verified 2026-06-26. Quote: "as the coding agents get more reliable, I'm not reviewing every line of code that they write anymore, even for my production level stuff."

Stack Overflow Blog. "Building shared coding guidelines for AI (and people too)." https://stackoverflow.blog/2026/03/26/coding-guidelines-for-ai-agents-and-people-too/ . Published 26 March 2026. Verified 2026-06-26. Key claim: guidelines for AI agents must be "more explicit, demonstrative of patterns, and obvious" than human guidelines because agents lack tacit knowledge.

r/ExperiencedDevs. Community post: "Today I announced that I won't be reviewing AI generated PRs at company meeting" (observed in top-month results, June 2026). https://www.reddit.com/r/ExperiencedDevs/ . Observed 2026-06-26. Community signal: an engineer publicly announced to her team she would not review AI-generated PRs. No usernames, vote counts, or direct quotes reproduced.

r/ExperiencedDevs. Community post: "My coworker uses AI to reply to my PR review and I hate it" (observed in top-year results). https://www.reddit.com/r/ExperiencedDevs/ . Observed 2026-06-26. Community signal: developer frustrated by AI-generated responses to their review comments; effort-asymmetry complaint, not a code-correctness complaint. No usernames, vote counts, or direct quotes reproduced.

Author's inference. The framing of "organizational debt accruing faster than technical debt," "the PR queue as the interest payment on skipped rule-writing," and the three-question conversation structure are the author's analytical conclusions drawn from the evidence cited above. Not a sourced finding.

Put this project board inside ChatGPT

Open Agiflow in ChatGPT to plan campaigns, create tasks, and check what needs attention. Create a free Agiflow account when you are ready to keep the board for your team.