The Problem
The codegen Java layer currently bakes language-specific syntax and escaping directly into data model fields (defaultValue, value in enumVars, etc.) before they reach Mustache templates. I think this is the wrong layer for that responsibility. It makes context-specific escaping unnecessarily complex, unpredictable, and in some cases impossible.
Concrete Examples
1. defaultValue — pre-escaped for the wrong context
AbstractKotlinCodegen.toDefaultValue() and AbstractJavaCodegen.toDefaultValue() return strings like:
- "\"hello\"" (raw value already wrapped in language string-literal quotes)
- "42l" (Java long literal suffix baked in)
- "new BigDecimal(\"3.14\")" (full constructor expression)
- "URI.create(\"...\")" (static factory call)
The template then receives a code-ready expression, not a value. This makes it impossible for the template to render the same value in a different context (e.g. annotation attribute vs. field initializer vs. comment) without getting double-escaped or incorrectly escaped output.
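To make the reuse problem concrete, here is a minimal Java sketch. It is not the actual openapi-generator code: the class and helper names are hypothetical, and the three helpers only mimic what a template might emit in each context when it receives a pre-quoted value.

```java
// Hypothetical illustration; none of these names exist in openapi-generator.
final class PreEscapedReuseDemo {
    /** Mimics a toDefaultValue()-style method that returns a code-ready literal. */
    static String preEscapedDefault(String raw) {
        return "\"" + raw + "\"";
    }

    /** What a template would emit for a field initializer. */
    static String fieldInitializer(String value) {
        return "val greeting: String = " + value;
    }

    /** What a template would emit if it adds its own quotes in an annotation. */
    static String annotation(String value) {
        return "@DefaultValue(\"" + value + "\")";
    }

    /** What a template would emit in a single-line comment. */
    static String comment(String value) {
        return "// default: " + value;
    }

    public static void main(String[] args) {
        String value = preEscapedDefault("hello");
        System.out.println(fieldInitializer(value)); // val greeting: String = "hello"
        System.out.println(annotation(value));       // @DefaultValue(""hello"")
        System.out.println(comment(value));          // // default: "hello"
    }
}
```

Only the field initializer comes out right. The annotation is double-quoted and the comment carries quotes that were never part of the value.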
2. value in enumVars — same pattern, fragile workaround
AbstractKotlinCodegen.toEnumValue() returns "\"available\"" for string types — i.e. the value already includes the surrounding Kotlin string-literal quotes. In kotlin-client/enum_class.mustache, {{{value}}} is used in two contexts with conflicting needs:
- Enum constructor (line 93):
{{name}}({{{value}}}) → AVAILABLE("available") — the pre-quoted form works here by accident
- Annotations (lines 65, 68, 71, 75, 81):
@SerializedName(value = {{#lambda.doublequote}}{{{value}}}{{/lambda.doublequote}}) → @SerializedName(value = "available") — this works, but only because DoubleQuoteLambda detects the value is already quoted and passes it through unchanged (i.e. it is a no-op here)
The annotation lines only produce correct output because the lambda happens to be idempotent for already-quoted input. If the template ever needs the raw value (e.g. in a Javadoc comment or a non-string context), there is no way to get it — there is no unescapedValue counterpart for enumVars.
3. DoubleQuoteLambda — a symptom, not a solution
Because some codegens pre-quote defaultValue (for string types) and others don't (for numeric types), the {{#lambda.doublequote}} lambda was introduced to normalize "add quotes unless already present." This is fundamentally a state-detection workaround, not a principled design: the template has to guess whether the Java layer already applied quoting.
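The ambiguity of state detection can be shown with a small sketch. The function below is an assumption written from the behaviour described above ("add quotes unless already present"); it is not copied from the real DoubleQuoteLambda source, and the class name is hypothetical.

```java
// Hypothetical re-creation of an "add quotes unless already present" rule.
final class QuoteDetectionDemo {
    static String quoteUnlessQuoted(String s) {
        boolean alreadyQuoted = s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"");
        return alreadyQuoted ? s : "\"" + s + "\"";
    }

    public static void main(String[] args) {
        // Unquoted input gets quotes added.
        System.out.println(quoteUnlessQuoted("available"));     // "available"
        // Pre-quoted input passes through unchanged.
        System.out.println(quoteUnlessQuoted("\"available\"")); // "available"
        // Raw data that itself begins and ends with a quote character is
        // mistaken for a pre-quoted literal, so the data quotes become delimiters.
        System.out.println(quoteUnlessQuoted("\"yes\""));       // "yes"
    }
}
```

The third call is the problem case: from the string alone, the function cannot tell a raw value that contains quote characters apart from a value the Java layer already quoted.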
4. unescapedDefaultValue — acknowledgment of the problem
The existence of a parallel unescapedDefaultValue field (set from schema.getDefault() directly) shows the codebase already recognizes that defaultValue is "too processed" for some uses — but this is a workaround, not a fix.
Why Pre-escaping Is Wrong
Escaping is context-dependent:
| Context | String hello's needs |
|---|---|
| Kotlin/Java string literal | "hello's" |
| Single-quoted annotation | 'hello\'s' |
| Kotlin multiline string | """hello's""" |
| JSON value | "hello's" |
| XML attribute | hello&apos;s |
| Single-line comment | hello's (no escaping) |
| URL | hello%27s |
When a value is pre-escaped in Java for one assumed context, it:
- Cannot be reused for other contexts without double-escaping
- Requires detection hacks (like DoubleQuoteLambda) to "un-guess" whether quoting was applied
- Breaks cross-cutting uses (same field in a comment, a string literal, and an annotation in the same template)
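As a rough illustration of why one pre-escaped form cannot serve every context, the sketch below applies a different escaper to the same raw value per target. The class and method names are hypothetical and the escapers are deliberately minimal (for example, the literal escapers ignore control characters); only java.net.URLEncoder is a real JDK API, and it performs form-style encoding.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical per-context escapers, each taking the same raw value.
final class ContextEscapers {
    static String doubleQuotedLiteral(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String singleQuotedLiteral(String raw) {
        return "'" + raw.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    static String xmlAttribute(String raw) {
        // '&' must be replaced first so later entities are not re-escaped.
        return raw.replace("&", "&amp;").replace("<", "&lt;")
                  .replace("\"", "&quot;").replace("'", "&apos;");
    }

    static String urlComponent(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "hello's";
        System.out.println(doubleQuotedLiteral(raw)); // "hello's"
        System.out.println(singleQuotedLiteral(raw)); // 'hello\'s'
        System.out.println(xmlAttribute(raw));        // hello&apos;s
        System.out.println(urlComponent(raw));        // hello%27s
    }
}
```

Each output is only reachable from the raw value. Starting from any one of the escaped forms, the others would require first undoing the escaping that was already applied.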
Security: Pre-escaping Creates Injection Vulnerabilities
Pre-escaping for one assumed context actively undermines safe handling in other contexts. Because the template author cannot tell what escaping has already been applied, they face a dilemma: apply a sanitizing lambda and risk double-escaping, or skip it and risk an injection. This creates a class of vulnerabilities in generated code:
Kotlin string template injection ($)
Kotlin string literals treat $ as the start of a string interpolation ($variable, ${expression}). The escapeText method used during pre-escaping does not escape $; that is handled separately by lambda.escapeDollar. A raw value such as hello $world, pre-escaped and stored as "\"hello $world\"", therefore compiles to an interpolated string that references the variable world (or fails to compile if no such variable is in scope) rather than to the literal text $world. A template author using the value in a different context (e.g. a multiline string or a comment) may assume escaping was already handled and skip lambda.escapeDollar, leaving the interpolation active.
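The difference can be sketched in Java as two quoting helpers. Both are hypothetical stand-ins: the first imitates a pre-escaping step that handles quotes and backslashes but not $, and the second adds the $ handling that the text attributes to lambda.escapeDollar. Neither is the real escapeText or lambda implementation.

```java
// Hypothetical helpers that produce Kotlin source text from a raw value.
final class KotlinDollarDemo {
    /** Escapes quotes and backslashes only; '$' is left untouched. */
    static String quoteWithoutDollarHandling(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Same as above, but also escapes '$' so Kotlin reads it as literal text. */
    static String quoteForKotlin(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"").replace("$", "\\$") + "\"";
    }

    public static void main(String[] args) {
        String raw = "hello $world";
        System.out.println(quoteWithoutDollarHandling(raw)); // "hello $world"
        System.out.println(quoteForKotlin(raw));             // "hello \$world"
    }
}
```

Pasted into Kotlin source, the first output is a string template and the second is a plain literal.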
Premature termination of triple-quoted strings (""")
Kotlin multiline strings are delimited by """. A value containing """ (e.g. a description or default value from a spec) would prematurely close the string, injecting arbitrary content outside it. Pre-escaping with escapeText targets regular string literals (\") — it does not produce the ${"\"\"\""} construct required to safely embed triple-quotes inside a multiline string. A template author reusing a pre-escaped value in a """...""" context has no safe path: the escaping that was applied is wrong for this context, and applying the right escaping on top would double-escape everything else.
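A raw-string escaper needs different rules from a regular-literal escaper, which the following sketch tries to show. It assumes the raw value is available and uses the ${"\"\"\""} construct mentioned above plus the common ${'$'} idiom for a literal dollar sign; the class and method names are hypothetical, and this is not an existing openapi-generator lambda.

```java
// Hypothetical escaper for embedding a raw value in a Kotlin """...""" string.
final class KotlinRawStringDemo {
    static String embedInRawString(String raw) {
        String body = raw
            // Must run first: the next replacement emits '$' characters itself.
            .replace("$", "${'$'}")
            // Replace each run of three quotes with an interpolated literal.
            .replace("\"\"\"", "${\"\\\"\\\"\\\"\"}");
        return "\"\"\"" + body + "\"\"\"";
    }

    public static void main(String[] args) {
        System.out.println(embedInRawString("hello's"));
        System.out.println(embedInRawString("$name"));
        System.out.println(embedInRawString("a\"\"\"b"));
    }
}
```

Backslashes and single quote characters are left alone because they are ordinary text inside a raw string, which is exactly why a value already escaped with \" cannot be reused here.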
General principle
The root issue is that a template author cannot reason safely about a value whose escaping state is unknown. With raw values and explicit lambdas, the contract is clear: the value is always unescaped, and the template applies exactly the lambdas required for the target context — no guessing, no double-escaping, no missed injection vectors.
The Correct Contract
The Java codegen layer stores raw semantic values. Mustache templates are solely responsible for context-appropriate escaping via lambdas.
// Java layer — raw value only:
enumVar.put("value", "available"); // not "\"available\""
property.defaultValue = "hello world"; // not "\"hello world\""
// Template layer — escaping is explicit and context-targeted:
{{name}}({{#lambda.kotlinString}}{{{value}}}{{/lambda.kotlinString}}) // → AVAILABLE("available")
@SerializedName(value = "{{value}}") // → @SerializedName(value = "available")
// default: {{defaultValue}} // → // default: hello world
@DefaultValue("{{#lambda.escapeInNormalString}}{{{defaultValue}}}{{/lambda.escapeInNormalString}}")
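For illustration, the escaping logic behind a kotlinString-style lambda could look roughly like the sketch below. It is written as a plain java.util.function.UnaryOperator so that it stays self-contained; wiring it into the template engine as a real lambda is not shown, and the class and helper names are hypothetical.

```java
import java.util.function.UnaryOperator;

// Hypothetical core of a kotlinString-style lambda: raw text in, Kotlin literal out.
final class RawValueContractDemo {
    static final UnaryOperator<String> KOTLIN_STRING = raw -> {
        StringBuilder out = new StringBuilder("\"");
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '$':  out.append("\\$");  break; // prevent string-template interpolation
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:   out.append(c);
            }
        }
        return out.append('"').toString();
    };

    /** Enum constructor context: the escaper is applied explicitly. */
    static String enumConstant(String name, String rawValue) {
        return name + "(" + KOTLIN_STRING.apply(rawValue) + ")";
    }

    /** Comment context: the raw value is used as-is. */
    static String comment(String rawValue) {
        return "// default: " + rawValue;
    }

    public static void main(String[] args) {
        System.out.println(enumConstant("AVAILABLE", "available")); // AVAILABLE("available")
        System.out.println(comment("hello world"));                 // // default: hello world
    }
}
```

Because the input is always raw, the escaper never has to check whether quoting was already applied.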
Benefits of the Change
- Correctness: eliminates the ""available"" class of bugs where two layers both add quotes
- No more DoubleQuoteLambda: it becomes unnecessary; the template always knows whether it's adding quotes
- No more unescapedDefaultValue: defaultValue is already raw; the parallel field disappears
- Predictability: any contributor reading a template knows exactly what they're getting: raw values, and explicit lambdas for escaping
- Security: template authors can apply exactly the right escaping for each context without ambiguity
- Extensibility: adding a new target language/context just means adding a new lambda, not forking toDefaultValue() overrides across dozens of codegen subclasses
Migration Considerations
This is a breaking change for custom templates. The migration path would be:
- Deprecate the pre-escaped behavior with a flag
- Add a rawValue / rawDefaultValue field alongside the existing ones as a transition bridge
- Update all bundled templates to use explicit lambdas
- Remove the pre-escaped fields in a future major version
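The transition bridge from the list above might be populated along these lines. This is only a sketch under assumptions: the helper and class names are hypothetical, the map stands in for an enumVars entry, and the legacy quoting shown is simplified.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that fills both the legacy and the raw field during migration.
final class EnumVarBridgeDemo {
    static Map<String, Object> enumVar(String name, String raw) {
        Map<String, Object> enumVar = new LinkedHashMap<>();
        enumVar.put("name", name);
        // Legacy pre-quoted form, kept so existing custom templates keep working.
        enumVar.put("value", "\"" + raw + "\"");
        // New raw field for templates that apply escaping lambdas themselves.
        enumVar.put("rawValue", raw);
        return enumVar;
    }

    public static void main(String[] args) {
        System.out.println(enumVar("AVAILABLE", "available"));
    }
}
```

Bundled templates could then move to rawValue one at a time, and the legacy value entry could be dropped in the major version that removes the pre-escaped fields.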
I am willing to try to tackle this in the Kotlin codegens.