xds: Implementation of Unified Matcher and CEL Integration#12640
xds: Implementation of Unified Matcher and CEL Integration#12640shivaspeaks wants to merge 20 commits intogrpc:masterfrom
Conversation
082243d to
f472c63
Compare
l46kok
left a comment
There was a problem hiding this comment.
Sorry about the drive-by -- I happened to notice this while putting together an unrelated PR in CEL-Java. I'll hold off on submitting it on our end.
| # checkstyle 10.0+ requires Java 11+ | ||
| # See https://checkstyle.sourceforge.io/releasenotes_old_8-35_10-26.html#Release_10.0 | ||
| # checkForUpdates: checkstylejava8:9.+ | ||
| cel-runtime = "dev.cel:runtime:0.11.1" |
There was a problem hiding this comment.
If you plan on using both the compilation and runtime packages, you could consider just taking a dependency on dev.cel:cel instead, which includes both. It will likely make your dependency management much easier.
On a separate note -- let me know if you'd like us to cut a new release.
There was a problem hiding this comment.
We purposefully don't want a dependency on the cel parser. Also, it seems dev.cel:cel is busted, in that it re-defines dev.cel.runtime instead of depending on the other artifact. That will cause hard-to-debug duplicate classes in the class path.
There was a problem hiding this comment.
For artifact issue -- it's tracked, we can look into prioritizing it: (b/328302799).
The PR here is referencing CelCompiler, so I think you'll need the compiler dependency (unless if it shouldn't have been included here).
There was a problem hiding this comment.
The problem is pretty serious, as it prevents dependency management from working properly and it is not something regular people are able to debug. It's the sort of thing where everything looks fine, but then you make an unrelated change and somehow suddenly breaks CEL. So the concern is in support costs.
Really, I'm wary about merging this with the broken cel jars; it's best not to have them in our dependency tree. Even with the code disabled they can still harm users. Maybe it is best to make them compileOnly until things are fixed, which also means parts of testing and feature stabilization would be delayed until it is fixed.
The PR is only using CelCompiler in tests. It will need to get reorganized to avoid the dependency for production.
@shivaspeaks, you may want to glance at go/grpc-cel-integration. It was supposed to have been superseded by the gRFC, but I'm seeing the gRFC doesn't make a point of calling out some of the goals (like not depending on the CEL compiler). It's longer than you probably want to read in full, but you may just skim it to find the "interesting" parts.
There was a problem hiding this comment.
I'll look into this next week. I was actually under the impression this wasn't a hard blocker. Thanks for surfacing it up.
There was a problem hiding this comment.
There is an observation @ejona86! I was trying what the doc said:
"
Production code: no need to depend on the full dev.cel:cel package.
- Use dev.cel:runtime for CEL evaluation.
- Use dev.cel:protobuf for creating CEL programs from proto.
Testing code: it's acceptable to use unstable CEL methods and depend on dev.cel:cel / dev.cel:compiler in testing code.
"
When ChannelAndServerBuilderTest runs, it uses Guava's ClassPath to scan all io.grpc.* classes. When the JVM Verifier loads CelMatcher or PredicateEvaluator, it traverses the method signatures and fields, eventually hitting CodePointStream. Because CodePointStream implements CharStream, the JVM tries to load CharStream. Since dev.cel:runtime didn't bring ANTLR transitively, and I removed dev.cel:compiler (which does bring ANTLR) from implementation, the JVM throws NoClassDefFoundError. I am being forced to add compileOnly(libraries.cel.compiler) or else we can add implementation 'org.antlr:antlr4-runtime'
There was a problem hiding this comment.
I've merged in the fix for this: google/cel-java#954. Runtime pulling in the antlr dependency is unintended, which is also fixed with this PR. I'll cut a new release sometime next week.
In the interim, I've pushed a snapshot version if you'd like to locally verify the fix (and let me know if you encounter any issues): https://central.sonatype.com/repository/maven-snapshots/dev/cel/runtime/0.12.0-SNAPSHOT/runtime-0.12.0-SNAPSHOT.pom
There was a problem hiding this comment.
Yes its working with the snapshot version.
I assume there is no periodic release schedule for cel-java, like every 6 weeks for grpc (https://github.com/grpc/grpc-java/milestones or like go/grpc-oss-release-dates)? Is it done as and when required a new release for cel?
There was a problem hiding this comment.
For the most part, we cut a release whenever as needed. At a minimum, we'll cut one once per quarter.
| .enableStringConversion(false) | ||
| .enableStringConcatenation(false) | ||
| .enableListConcatenation(false) |
There was a problem hiding this comment.
Consider subsetting the environment via CelStandardDeclarations (compiler) and CelStandardFunctions (runtime) instead. Examples:
- https://github.com/google/cel-java/blob/b3ee6a6599b20814488a83bb660cfaace158c0d2/checker/src/test/java/dev/cel/checker/CelStandardDeclarationsTest.java#L135
- https://github.com/google/cel-java/blob/b3ee6a6599b20814488a83bb660cfaace158c0d2/runtime/src/test/java/dev/cel/runtime/CelStandardFunctionsTest.java#L116
This is the preferred mechanism, as it gives you more custom control over what functions/overloads are exposed. It's the direction cel-go went too, so this is likely desirable from a consistency standpoint as well.
| * Compiles the CEL expression string into a CelMatcher. | ||
| */ | ||
| @VisibleForTesting | ||
| public static CelMatcher compile(String expression) |
There was a problem hiding this comment.
Move this to the testing code so we don't need a dependency on the Cel compiler. It looks unused for production.
| # checkstyle 10.0+ requires Java 11+ | ||
| # See https://checkstyle.sourceforge.io/releasenotes_old_8-35_10-26.html#Release_10.0 | ||
| # checkForUpdates: checkstylejava8:9.+ | ||
| cel-runtime = "dev.cel:runtime:0.11.1" |
There was a problem hiding this comment.
The problem is pretty serious, as it prevents dependency management from working properly and it is not something regular people are able to debug. It's the sort of thing where everything looks fine, but then you make an unrelated change and somehow suddenly breaks CEL. So the concern is in support costs.
Really, I'm wary about merging this with the broken cel jars; it's best not to have them in our dependency tree. Even with the code disabled they can still harm users. Maybe it is best to make them compileOnly until things are fixed, which also means parts of testing and feature stabilization would be delayed until it is fixed.
The PR is only using CelCompiler in tests. It will need to get reorganized to avoid the dependency for production.
@shivaspeaks, you may want to glance at go/grpc-cel-integration. It was supposed to have been superseded by the gRFC, but I'm seeing the gRFC doesn't make a point of calling out some of the goals (like not depending on the CEL compiler). It's longer than you probably want to read in full, but you may just skim it to find the "interesting" parts.
| String id = over.toString(); | ||
| return !id.equals("ADD_STRING") && !id.equals("ADD_LIST"); |
There was a problem hiding this comment.
You should be able to compare directly against the overload enum instead of the stringified form (ex: over != AddOverload.ADD_STRING).
Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 872010923
| String id = over.toString(); | ||
| return !id.equals("ADD_STRING") && !id.equals("ADD_LIST"); | ||
| return !over.equals( | ||
| (Object) dev.cel.runtime.standard.AddOperator.AddOverload.ADD_STRING) |
There was a problem hiding this comment.
Why are we needing to cast to Object? If you're working around ErrorProne, then something seems broken and casting to Object is just hiding it.
There was a problem hiding this comment.
Yes it was an errorProne workaround.
over is statically known as a StandardOverload interface, but ADD_STRING is an AddOverload enum. ErrorProne threw an [EqualsIncompatibleType] build error so we need to cast the ADD_STRING enum specifically to (Object) before passing it into equals.
There was a problem hiding this comment.
ADD_STRING is a CelStandardOverload, which isn't a StandardOverload. It seems ErrorProne is correct; these will never be equal.
You should comment cases like this. If this were a bug in ErrorProne we'd probably want to file a bug with ErrorProne, but even if not, the comment helps us know when it is no longer necessary. (We don't have to guess why it was done, so we can confirm if that is no longer the case.)
It looks like that has been adjusted on cel-java master, as the function is now passed CelStandardOverload.
@sergiitk, the interface change looks like a breaking API change. That's surprising since I thought we were using stable APIs. Is this PR using an API that it shouldn't be, was this a known change that just hadn't been released yet, or should we be concerned about the actual stability of the API?
There was a problem hiding this comment.
This is a relatively new API that didn't exist when API stabilization doc was made (go/cel-api-stabilization). It's stable now (I'll update the doc when I get a chance). Reason why I suggested this API is that it follows the recommended pattern for how environment subsetting is supposed to be done across all CEL stacks.
As for EP -- I'm surprised it's actually flagging this (internally, it doesn't). AddOperator enum implements CelStandardOverload, so they should equal. My only guess is that CelStandardOverload is a functional interface, so theoretically it's possible to provide a lambda that's not equals comparable. Though we control the standard overload implementations, so in practice this won't happen. Checking for reference identity should likely get around the issue over == AddOverload.ADD_STRING
There was a problem hiding this comment.
Oh, I see, this is the new setup flow. I didn't look where this FUNCTIONS constant was used. That's fine; I'm less surprised that changed. (And yes, it's good to add it to the doc.)
I'm surprised it's actually flagging this (internally, it doesn't)
I don't think it'd get flagged on master either. It only looked to be a problem because we're still using 0.11.1; in 0.11.1 I agree with ErrorProne that the equals() will always return false.
| import javax.annotation.Nullable; | ||
|
|
||
| /** | ||
| * Interface for extracting values from a match context (e.g. HTTP headers). |
There was a problem hiding this comment.
This says explicitly "from a match context", and the only two implementations use MatchContext, so why is T present? Shouldn't this just have an argument that is MatchContext?
| return result.actions; | ||
| } | ||
|
|
||
| public interface MatchContext { |
There was a problem hiding this comment.
Nothing implements this. Is this still TODO? There's no mention this is missing in the PR description.
There was a problem hiding this comment.
Matcher Context: The input to the Matcher evaluation tree. Provided at runtime by the filter, contains the input data.
MatchContext is the representation of the "Matcher Context" described in gRFC. It is currently an empty shell (no implementations in this PR) because it is designed to be implemented by the specific xDS filters (e.g., RLQS or Composite Filter) that will be the primary consumers of this matching logic, as noted in the gRFC's Implementation section
There was a problem hiding this comment.
The implementation section saying "as part of" is saying "it will be part of that effort." We aren't saying CEL will be part of RLQS and the Composite Filter will call the RLQS code if RLQS was implemented first.
I think the filter will construct the Matcher Context and provide its contents, but I don't think each filter will have its own implementation, or at least an implementation from scratch. Headers will be Metadata, Method is always POST, Path will be pulled from MethodDescriptor, etc. I could understand if there was a server-side vs client-side implementation, but even that doesn't seem likely; worst-case looks to be different factory methods as convenience.
| /** | ||
| * Represents a compiled xDS Matcher. | ||
| */ | ||
| public abstract class UnifiedMatcher { |
There was a problem hiding this comment.
How do we have enough information to satisfy:
If set, the return type of the
inputmust match the input type
of thematcher.
There was a problem hiding this comment.
I see now. UnifiedMatcher has input+matcher combined together, so there's no need to check types when using it. But I think I got confused because there is not Matcher API. There should be a matcher API; in no way do we want to hard-code each matcher type.
| final class OnMatch { | ||
| @Nullable private final UnifiedMatcher nestedMatcher; | ||
| @Nullable private final TypedExtensionConfig action; | ||
| final boolean keepMatching; |
There was a problem hiding this comment.
Overall, it seems like keepMatching is incomplete (beyond this one instance). But that's not mentioned in the PR description. What is the current status of keepMatching?
There was a problem hiding this comment.
In MatcherList: keep_matching is used to continue the loop and accumulate actions.
In MatcherTree (Prefix): It is also implemented for traversing the trie/list of prefixes. Check line 133 in MatcherTree (if (!onMatch.keepMatching))
In MatcherTree (Exact): If a key matches, it returns immediately without checking if it should "keep matching" because thats the only logical way it seems. Should it fall back to on_no_match?
There was a problem hiding this comment.
keepMatching here is literally not used. Clearly that is wrong. Note that I'm most bothered that there's no TODO or mention in the PR description. There's nothing that calls out "this didn't make sense" or "this was not possible" for us to work through.
It seems you're missing the fact keep_matching means multiple completely different matchers can contribute to the result. You need to accumulate keep_matching results across matchers.
This is probably more clear after reading my comments at https://github.com/grpc/proposal/pull/520/changes#r2683391713 . Basically, the most important thing to understand about keep_matching is that it is a semantically different result. You shouldn't semantically mix the results between a terminal result with those of keep_matching; they aren't interchangeable (if C++ wants to and have an encoding for them to determine which results are what, they are free to, but I don't want the security vulnerability risk). We should either use a callback or a separate list in the result (MatchResult {TypedExecutionConfig action; List<TypedExecutionConfig> keepMatchingActions; }, where action == null means matched == false, or use Optional<TypedExecutionConfig> to track it)
| try { | ||
| com.github.xds.type.matcher.v3.CelMatcher celProto = customConfig.getTypedConfig() | ||
| .unpack(com.github.xds.type.matcher.v3.CelMatcher.class); | ||
| if (celProto.hasExprMatch()) { |
There was a problem hiding this comment.
Invert this condition so that we can more easily see what exception is matched with with if and also can get rid of the else. Ditto elsewhere in this class.
| @Test | ||
| public void celMatcher_match() { | ||
| // Construct CelMatcher with checked expression | ||
| CelMatcher celMatcher = createCelMatcher("request.path == '/good'"); |
There was a problem hiding this comment.
There's no reasons for the CEL-specific tests to be bunched with the general matcher logic. We should separate them out.
| import org.junit.runners.JUnit4; | ||
|
|
||
| @RunWith(JUnit4.class) | ||
| public class UnifiedMatcherTest { |
There was a problem hiding this comment.
We need to separate (or at least make a clear distinction between) the tests for:
- the proto validation logic (construction the matcher object from protos for the first time)
- the matching logic (when a concrete instance of the matcher object is applied to different inputs.
| @Nullable private final Matchers.StringMatcher stringMatcher; | ||
| @Nullable private final CelMatcher celMatcher; |
There was a problem hiding this comment.
This class design is unsustainable. We'll be adding support for different matcher extensions, not just these two. Do we expect hardcoding a separate nullable field for each extension we add?
|
|
||
| if (proto.hasValueMatch()) { | ||
| if (proto.getInput().getTypedConfig().getTypeUrl() | ||
| .equals(TYPE_URL_HTTP_ATTRIBUTES_CEL_INPUT)) { |
There was a problem hiding this comment.
There's no reason for any CEL logic to be tightly coupled with SinglePredicateEvaluator, or any other specific input/matcher extensions.
Instead, we should introduce two registries: one for inputs, one for matchers. Then we need to generalize logic, like so:
- Check if the input contains a type in the input registry. If so, get corresponding class
- Check if the matcher (either from
value_matchorcustom_match) contains a type in the matcher registry. If so, get corresponding class - Compare that the output type of the input is the same as the input type of the matcher.
- If yes, store the input object and the matcher object as Single Predicate object fields. These object fields types must not be concrete (like
CelMatcher celMatcherright now), but interfaces/abstract classes.
| * Returns the type of value extracted by this input. | ||
| */ | ||
| default Class<?> outputType() { | ||
| return Object.class; |
There was a problem hiding this comment.
This disables the check by default and it seems extremely doubtful any implementation would legitimately use Object. Delete the default implementation?
| } | ||
|
|
||
| if (!input.outputType().isAssignableFrom(matcher.inputType()) | ||
| && !matcher.inputType().isAssignableFrom(input.outputType())) { |
There was a problem hiding this comment.
The only way for the input to be assignable from the output and the output to be assignable from the input is for it to be the same class. This is just a verbose equals() check.
And just using equals() sounds perfect. I don't think we need any complex object hierarchy stuff going on. We just need to make sure they match. Equals() is what I'd hope we use, as it is unlikely to be broken.
| this.matcher = new Matcher() { | ||
| @Override | ||
| public boolean match(Object value) { | ||
| if (value instanceof String) { |
There was a problem hiding this comment.
If the input isn't compatible, this should throw. That is a programmer error, as the class said which type it can handle and that wasn't provided. That lets us find bugs. Note that silently ignoring a failure like that can be a security vulnerability. We need to be just as precise with match and non-match; we can't favor one way or the other, as we have no knowledge of what this matching implies.
| } | ||
|
|
||
| @Override | ||
| public Object apply(MatchContext context) { |
There was a problem hiding this comment.
You can have this method return String even when the interface returns Object. That would help guarantee we don't accidentally return any other type.
Implements gRFC A106: xDS Unified Matcher and CEL Integration
grpc/proposal#520