[CORE?;KOTLIN] Pre-escaping vs. Template-targeted Escaping in openapi-generator #23962

## The Problem

The codegen Java layer currently bakes **language-specific syntax and escaping** directly into data model fields (`defaultValue`, `value` in `enumVars`, etc.) before they reach the Mustache templates. I think this is the wrong layer for that responsibility: it makes context-specific escaping unnecessarily complex, unpredictable, and in some cases impossible.

---

## Concrete Examples

### 1. `defaultValue` — pre-escaped for the wrong context

`AbstractKotlinCodegen.toDefaultValue()` and `AbstractJavaCodegen.toDefaultValue()` return strings like:
- `"\"hello\""` — raw value already wrapped in language string-literal quotes
- `"42l"` — Java long literal suffix baked in
- `"new BigDecimal(\"3.14\")"` — full constructor expression
- `"URI.create(\"...\")"` — static factory call

The template then receives a code-ready expression, not a value. This makes it impossible for the template to render the same value in a different context (e.g. annotation attribute vs. field initializer vs. comment) without getting double-escaped or incorrectly escaped output.
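To make the failure mode concrete, here is a minimal sketch in plain Java. `PreEscapedDefaultDemo` and its `toDefaultValue` are hypothetical stand-ins that only mimic the "escape, then wrap in quotes" pattern described above; they are not the actual openapi-generator implementation, and the three render methods simply imitate what a template would concatenate.

```java
/** Simplified, hypothetical stand-in for a codegen that pre-escapes string defaults. */
final class PreEscapedDefaultDemo {

    // Mimics the pattern described above: escape for a string literal, then add the quotes.
    static String toDefaultValue(String raw) {
        String escaped = raw.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Field initializer: the one context the pre-escaped form was designed for.
    static String fieldInitializer(String defaultValue) {
        return "val greeting: String = " + defaultValue;
    }

    // A template that supplies its own quotes ends up with two pairs of them.
    static String annotation(String defaultValue) {
        return "@DefaultValue(\"" + defaultValue + "\")";
    }

    // A comment wants the human-readable value but receives literal syntax.
    static String comment(String defaultValue) {
        return "// default: " + defaultValue;
    }

    public static void main(String[] args) {
        String dv = toDefaultValue("say \"hi\"");
        System.out.println(fieldInitializer(dv)); // val greeting: String = "say \"hi\""
        System.out.println(annotation(dv));       // @DefaultValue(""say \"hi\""")
        System.out.println(comment(dv));          // // default: "say \"hi\""
    }
}
```

Only the first line is what the author of a template would want; the other two show the quoting leaking into contexts it was never meant for.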

### 2. `value` in `enumVars` — same pattern, fragile workaround

`AbstractKotlinCodegen.toEnumValue()` returns `"\"available\""` for string types — i.e. the value already includes the surrounding Kotlin string-literal quotes. In `kotlin-client/enum_class.mustache`, `{{{value}}}` is used in two contexts with conflicting needs:

- **Enum constructor** (line 93): `{{name}}({{{value}}})` → `AVAILABLE("available")` — the pre-quoted form works here by accident
- **Annotations** (lines 65, 68, 71, 75, 81): `@SerializedName(value = {{#lambda.doublequote}}{{{value}}}{{/lambda.doublequote}})` → `@SerializedName(value = "available")` — this works, but only because `DoubleQuoteLambda` detects the value is already quoted and passes it through unchanged (i.e. it is a no-op here)

The annotation lines only produce correct output because the lambda happens to be idempotent for already-quoted input. If the template ever needs the *raw* value (e.g. in a Javadoc comment or a non-string context), there is no way to get it — there is no `unescapedValue` counterpart for `enumVars`.

### 3. `DoubleQuoteLambda` — a symptom, not a solution

Because `defaultValue` is pre-quoted in some cases (string types) and not in others (numeric types), the `{{#lambda.doublequote}}` lambda was introduced to normalize the two with an "add quotes unless already present" rule. This is a state-detection workaround, not a principled design: the template has to guess whether the Java layer already applied quoting.
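The guessing problem can be sketched in a few lines. `quoteUnlessQuoted` below is my own re-creation of the "add quotes unless already present" idea, written from the description above; it is an assumption about the general shape of such a heuristic, not the actual `DoubleQuoteLambda` source.

```java
/** Hypothetical re-creation of a "quote unless already quoted" heuristic. */
final class QuoteHeuristicDemo {

    static String quoteUnlessQuoted(String s) {
        boolean alreadyQuoted = s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"");
        return alreadyQuoted ? s : "\"" + s + "\"";
    }

    public static void main(String[] args) {
        // Unquoted input gets quotes added.
        System.out.println(quoteUnlessQuoted("available"));     // "available"
        // Pre-quoted input passes through unchanged.
        System.out.println(quoteUnlessQuoted("\"available\"")); // "available"
        // Raw data that happens to start and end with a quote is indistinguishable
        // from pre-quoted input, so its quotes silently turn into delimiters.
        System.out.println(quoteUnlessQuoted("\"x\" or \"y\"")); // "x" or "y"
    }
}
```

The last case is the important one: the heuristic cannot tell data from syntax, so the result is no longer a single string literal.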

### 4. `unescapedDefaultValue` — acknowledgment of the problem

The existence of a parallel `unescapedDefaultValue` field (set from `schema.getDefault()` directly) shows the codebase already recognizes that `defaultValue` is "too processed" for some uses — but this is a workaround, not a fix.

---

## Why Pre-escaping Is Wrong

Escaping is **context-dependent**:

| Context | String `hello's` needs |
|---------|------------------------|
| Kotlin/Java string literal | `"hello's"` |
| Single-quoted annotation | `'hello\'s'` |
| Kotlin multiline string | `"""hello's"""` |
| JSON value | `"hello's"` |
| XML attribute | `hello&apos;s` |
| Single-line comment | `hello's` (no escaping) |
| URL | `hello%27s` |

When a value is pre-escaped in Java for one assumed context, it:
- Cannot be reused for other contexts without double-escaping
- Requires detection hacks (like `DoubleQuoteLambda`) to "un-guess" whether quoting was applied
- Breaks cross-cutting uses (same field in a comment, a string literal, and an annotation in the same template)
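As an illustration of the table, the sketch below derives four of its rows from the same raw value using one small function per context. The class and method names are hypothetical, and each escaper is deliberately minimal (for example, the string-literal one ignores control characters); the point is only that each context needs its own function applied to the *raw* input.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical per-context escapers, each starting from the raw value. */
final class ContextEscapers {

    // Kotlin/Java double-quoted string literal.
    static String stringLiteral(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Single-quoted form.
    static String singleQuoted(String raw) {
        return "'" + raw.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // XML attribute value (without the surrounding delimiters).
    static String xmlAttribute(String raw) {
        return raw.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&apos;");
    }

    // Form-style URL encoding; note that URLEncoder turns spaces into '+'.
    static String urlComponent(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "hello's";
        System.out.println(stringLiteral(raw)); // "hello's"
        System.out.println(singleQuoted(raw));  // 'hello\'s'
        System.out.println(xmlAttribute(raw));  // hello&apos;s
        System.out.println(urlComponent(raw));  // hello%27s
    }
}
```

None of these outputs can be derived reliably from one of the others, which is why the value has to reach the template unescaped.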

---

## Security: Pre-escaping Creates Injection Vulnerabilities

Pre-escaping for one assumed context actively undermines safe handling in other contexts. Because the template author cannot tell what escaping has already been applied, they face a dilemma: apply a sanitizing lambda and risk double-escaping, or skip it and risk an injection. This creates a class of vulnerabilities in **generated code**:

### Kotlin string template injection (`$`)

Kotlin string literals treat `$` as the start of a string template (`$variable`, `${expression}`). The `escapeText` method used during pre-escaping does **not** escape `$`; that is handled separately by `lambda.escapeDollar`. A raw value such as `hello $world`, pre-escaped and stored as `"\"hello $world\""`, ends up in the generated code as a string that interpolates the variable `world` (or fails to compile if no such variable is in scope) instead of containing the literal text `$world`. A template author using the value in a different context (e.g. a multiline string or a comment) may assume escaping was already handled and skip `lambda.escapeDollar`, leaving the interpolation active.
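A small sketch of the difference, under the assumption stated above that the pre-escaping step handles quotes and backslashes but leaves `$` alone. Both methods are hypothetical helpers written for this issue, not the real `escapeText` or `lambda.escapeDollar`.

```java
/** Hypothetical helpers contrasting quote-only escaping with '$'-aware escaping. */
final class KotlinDollarDemo {

    // Stand-in for a pre-escaping step that ignores '$' (simplified assumption).
    static String quoteOnly(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Additionally escapes '$' so Kotlin treats it as literal text.
    static String kotlinStringLiteral(String raw) {
        String escaped = raw
                .replace("\\", "\\\\") // backslashes first, so later escapes are not doubled
                .replace("\"", "\\\"")
                .replace("$", "\\$");
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        String raw = "hello $world";
        System.out.println(quoteOnly(raw));           // "hello $world"   (template still active in Kotlin)
        System.out.println(kotlinStringLiteral(raw)); // "hello \$world"  (literal text)
    }
}
```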

### Premature termination of triple-quoted strings (`"""`)

Kotlin multiline strings are delimited by `"""`. A value containing `"""` (e.g. a description or default value from a spec) would prematurely close the string, injecting arbitrary content outside it. Pre-escaping with `escapeText` targets regular string literals (`\"`) — it does not produce the `${"\"\"\""}` construct required to safely embed triple-quotes inside a multiline string. A template author reusing a pre-escaped value in a `"""..."""` context has no safe path: the escaping that was applied is wrong for this context, and applying the right escaping on top would double-escape everything else.
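For comparison, this is roughly what a dedicated escaper for the `"""..."""` context could look like when it starts from the raw value. `kotlinRawString` is a hypothetical helper, and it is a sketch rather than a complete solution: it covers `$` and `"""` only and does not try to handle every edge case.

```java
/** Hypothetical escaper for the body of a Kotlin triple-quoted string. */
final class KotlinRawStringDemo {

    static String kotlinRawString(String raw) {
        String body = raw
                // '$' first, so the '$' introduced by the next step is not escaped again.
                .replace("$", "${'$'}")
                // Each """ becomes a template expression that evaluates to three quotes.
                .replace("\"\"\"", "${\"\\\"\\\"\\\"\"}");
        return "\"\"\"" + body + "\"\"\"";
    }

    public static void main(String[] args) {
        System.out.println(kotlinRawString("cost: $5"));  // """cost: ${'$'}5"""
        System.out.println(kotlinRawString("a\"\"\"b"));  // """a${"\"\"\""}b"""
    }
}
```

Note that backslashes are left untouched here, which is exactly what a pre-escaped value gets wrong in this context: `\"` inside a triple-quoted string is two literal characters, not an escaped quote.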

### General principle

The root issue is that **a template author cannot reason safely about a value whose escaping state is unknown**. With raw values and explicit lambdas, the contract is clear: the value is always unescaped, and the template applies exactly the lambdas required for the target context — no guessing, no double-escaping, no missed injection vectors.

---

## The Correct Contract

> **The Java codegen layer stores raw semantic values. Mustache templates are solely responsible for context-appropriate escaping via lambdas.**

```
// Java layer — raw value only:
enumVar.put("value", "available");        // not "\"available\""
property.defaultValue = "hello world";   // not "\"hello world\""

// Template layer — escaping is explicit and context-targeted:
{{name}}({{#lambda.kotlinString}}{{{value}}}{{/lambda.kotlinString}})   // → AVAILABLE("available")
@SerializedName(value = "{{value}}")                                    // → @SerializedName(value = "available")
// default: {{defaultValue}}                                            // → // default: hello world
@DefaultValue("{{#lambda.escapeInNormalString}}{{{defaultValue}}}{{/lambda.escapeInNormalString}}")
```

---

## Benefits of the Change

1. **Correctness** — eliminates the `""available""` class of bugs where two layers both add quotes
2. **No more `DoubleQuoteLambda`** — it becomes unnecessary; the template always knows whether it's adding quotes
3. **No more `unescapedDefaultValue`** — `defaultValue` is already raw; the parallel field disappears
4. **Predictability** — any contributor reading a template knows exactly what they're getting: raw values, and explicit lambdas for escaping
5. **Security** — template authors can apply exactly the right escaping for each context without ambiguity
6. **Extensibility** — adding a new target language/context just means adding a new lambda, not forking `toDefaultValue()` overrides across dozens of codegen subclasses

---

## Migration Considerations

This is a **breaking change for custom templates**. The migration path would be:
- Deprecate the pre-escaped behavior with a flag
- Add a `rawValue` / `rawDefaultValue` field alongside the existing ones as a transition bridge
- Update all bundled templates to use explicit lambdas
- Remove the pre-escaped fields in a future major version
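The transition bridge could be as small as one extra key per entry. The sketch below is hypothetical: the helper name and the way the legacy value is built are my assumptions, and only the `value` / `rawValue` key names come from the proposal above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an enumVars entry during the transition period. */
final class EnumVarBridgeDemo {

    static Map<String, Object> enumVar(String name, String raw) {
        Map<String, Object> enumVar = new LinkedHashMap<>();
        enumVar.put("name", name);
        // Legacy pre-quoted form, kept so existing custom templates keep working.
        enumVar.put("value", "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"");
        // Raw form for templates that apply escaping lambdas explicitly.
        enumVar.put("rawValue", raw);
        return enumVar;
    }

    public static void main(String[] args) {
        System.out.println(enumVar("AVAILABLE", "available"));
        // {name=AVAILABLE, value="available", rawValue=available}
    }
}
```

Bundled templates would switch to `rawValue` first, and `value` would only be redefined or removed in the major version that drops the pre-escaped behavior.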

**I am willing to try to tackle this in the Kotlin codegens.**
