RFC 239: Policy on LLM assistance in contributions #239

Open
jugglinmike wants to merge 1 commit into web-platform-tests:main from bocoup:llm-policy

Contributor

@jugglinmike jugglinmike commented May 20, 2026

This initial draft takes a maximalist approach (and a permissive stance) to promote a robust and grounded discussion.

Rendered

@gsnedders
Member

Fixes #202

Member

@gsnedders gsnedders left a comment

Thanks for trying to tackle this!

I think we should also write something about use of LLMs for review, if nothing else.

Comment thread rfcs/llm_policy.md
Comment on lines +20 to +29
A few examples of policies on LLM use in FOSS contributions:

- permissive
  - [ghostty/AI_POLICY.md at main · ghostty-org/ghostty](https://github.com/ghostty-org/ghostty/blob/main/AI_POLICY.md)
  - [Policy about LLM generated code from PRs · Issue #28335 · opencv/opencv](https://github.com/opencv/opencv/issues/28335)
  - [CONTRIBUTING.md: Guidelines relevant to AI-assisted contributions by gasche · Pull Request #14052 · ocaml/ocaml](https://github.com/ocaml/ocaml/pull/14052)
  - [LLVM AI Tool Use Policy — LLVM 23.0.0git documentation](https://llvm.org/docs/AIToolPolicy.html)
- prohibitive
  - [Code of Conduct ⚡ Zig Programming Language](https://ziglang.org/code-of-conduct/#strict-no-llm-no-ai-policy)
  - [Getting Started - The Servo Book](https://book.servo.org/contributing/getting-started.html#ai-contributions)
Member

I feel like there are two others which are notably relevant here: Chromium's and Firefox's, given they are two of the five repos which have approval to land changes in WPT without further review. (WebKit and Test262 do not currently have policies; TC39's explicitly does not apply to code.)

Comment thread rfcs/llm_policy.md
Comment on lines +44 to +45
> Commits generated entirely by an LLM must be attributed to the LLM in the
> "Author" field.
Member

This feels problematic. If we attribute a PR to Claude, Gemini, or OpenAI's GPT and I then try to contact the author… well, I don't think Anthropic, Google, or OpenAI are going to be very helpful?

Both Chromium's and Firefox's policies are crystal clear that humans are still the authors and must self-review before submitting.

Therefore, when there's still a human very much in the loop who is required to self-review, it does not seem reasonable to consider the LLM the author — and the Chromium policy is explicit that, "Authors must attest that the code they submit is their original creation, regardless of whether AI tooling was used".


Strong agree on this point for all the reasons you give. The Author field's purpose is to give a contact for problems/questions, not to assign blame. Listing an LLM is worthless there.

And also, yeah, assigning authorship to an LLM is abdicating your responsibility as an engineer to commit useful code that you understand.

Comment thread rfcs/llm_policy.md
Comment on lines +39 to +40
> Contributions that contain substantial amounts of tool-generated content must
> be labeled as such.
Member

Neither Chromium nor Firefox requires this today, and it's entirely plausible we've already had commits land in WPT via exports which don't meet this bar.

That said, Chromium's policy here is currently:

> To aid reviewers, authors should flag areas that they are not confident about that had AI assistance.

This is maybe a weaker form, and hopefully something more in line with existing contributions.

Comment thread rfcs/llm_policy.md
Comment on lines +53 to +60
> ### For Trusted External Review
>
> Some external projects conduct review which the WPT maintainers recognize as
> authoritative. From rendering engines like Gecko to dedicated test suites
> like WASM, patches merged in these projects are incorporated into WPT without
> further review. The policy outlined by this document does not apply to these
> contributions; the external projects are trusted to determine their own
> mechanisms for quality assurance.
Member

This feels like it should probably be at least in part in another RFC that tries to define our existing policies?

As far as I'm aware, there are currently five repos which have approval to incorporate based on downstream review: Chromium, Firefox, Servo, Test262, and WebKit.

My understanding of the unwritten policy is we trust downstream reviewers; I can't even find the various places where we've elucidated parts of the policy over the years.

Comment thread rfcs/llm_policy.md

## Details

Proposed text:
Member

Proposed for where?

Comment thread rfcs/llm_policy.md
> contributions; the external projects are trusted to determine their own
> mechanisms for quality assurance.

## Risks
Member

I think it's worthwhile including at least a few more technical risks:

  1. Contributions of tests generated by an LLM closely looking at a specific implementation's code, matching that implementation, rather than the spec. (This is, of course, already an issue — but could inevitably become more of a problem if we get more, larger contributions.)
  2. Contributions not matching the spec at all. I've seen this mostly with attempts to generate tests that assert the ordering of things which end up using HTML's parallelism and HTML's event loops; that case is especially annoying because it can lead to flaky tests.

3 participants