fix: allow numeric resource IDs in _VALID_RESOURCE_NAME_REGEX #6440
Open
urmzd wants to merge 5 commits into googleapis:main from
The regex required the first character to be a lowercase letter [a-z], which rejected bare numeric IDs (e.g. "1234567890") that the API assigns to resources like RAG corpora and files. Updated to accept any alphanumeric first character [a-zA-Z0-9].
Fixes all three definitions of _VALID_RESOURCE_NAME_REGEX:
- vertexai/preview/rag/utils/_gapic_utils.py
- vertexai/rag/utils/_gapic_utils.py
- google/cloud/aiplatform/vertex_ray/util/_validation_utils.py
Adds tests for get_corpus and get_file with numeric IDs to verify the regex fix accepts API-assigned numeric resource identifiers.
The vertex_ray _VALID_RESOURCE_NAME_REGEX intentionally requires a lowercase letter first for persistent resource names, which is a different context from RAG resource IDs.
Description
_VALID_RESOURCE_NAME_REGEX in the RAG SDK requires the first character to be a lowercase letter ([a-z]), which rejects bare numeric IDs (e.g. 1234567890) that the Vertex AI API assigns to resources like RAG corpora and files.
Calling get_corpus() or get_file() with a valid numeric resource ID raises ValueError.
The ID is valid, but parse_rag_corpus_path returns {} (the input is not a full path), and the regex then rejects it because "1" doesn't match [a-z]. As a result, the branch that would call rag_corpus_path() to expand the short ID is never reached.
Fixes #6442
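The failure path described above can be sketched in isolation. This is a minimal, self-contained approximation and not the SDK's actual code: resolve_corpus_name, the local parse_rag_corpus_path stand-in, the error message, and the project and location defaults are all hypothetical. Only the two regex strings are taken from this PR.

```python
import re

# The two patterns quoted in this PR.
OLD_REGEX = "[a-z][a-zA-Z0-9._-]{0,127}"
NEW_REGEX = "[a-zA-Z0-9][a-zA-Z0-9._-]{0,127}"

# Hypothetical stand-in for the client's path parser: returns a dict for a
# full resource path and an empty dict ({}) for anything else.
_FULL_PATH = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<rag_corpus>[^/]+)$"
)


def parse_rag_corpus_path(path):
    match = _FULL_PATH.match(path)
    return match.groupdict() if match else {}


def resolve_corpus_name(name, regex, project="my-project", location="us-central1"):
    """Hypothetical helper: accept a full path or expand a short ID."""
    if parse_rag_corpus_path(name):
        # Already a full resource path; return it unchanged.
        return name
    if re.fullmatch(regex, name):
        # Short ID: expand it to a full resource path.
        return f"projects/{project}/locations/{location}/ragCorpora/{name}"
    raise ValueError(f"invalid corpus name: {name!r}")


def accepts(name, regex):
    """Return True if resolve_corpus_name accepts the name under this regex."""
    try:
        resolve_corpus_name(name, regex)
        return True
    except ValueError:
        return False


old_accepts_numeric = accepts("1234567890", OLD_REGEX)
new_accepts_numeric = accepts("1234567890", NEW_REGEX)
expanded = resolve_corpus_name("1234567890", NEW_REGEX)
```

Under these assumptions, the bare numeric ID is rejected with the old pattern and expanded to a full path with the new one, while names that start with a lowercase letter and full paths behave the same under both.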
Changes
Updated the regex from [a-z][a-zA-Z0-9._-]{0,127} to [a-zA-Z0-9][a-zA-Z0-9._-]{0,127} in both definitions:
- vertexai/rag/utils/_gapic_utils.py
- vertexai/preview/rag/utils/_gapic_utils.py
Testing
- Added test_get_corpus_numeric_id_success and test_get_file_numeric_id_success to both test_rag_data.py and test_rag_data_preview.py
- Set side_effect on path helpers so bare numeric IDs exercise the regex code path (the shared fixture's Mock() always returns truthy for parse_rag_corpus_path, bypassing the regex entirely)
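The point about the shared fixture can be illustrated with unittest.mock alone. This is a rough sketch, not the PR's test code: the client object and fake_parse function are hypothetical, and only the parse_rag_corpus_path attribute name comes from the description above.

```python
from unittest import mock

# A bare Mock() returns another Mock from any method call, and Mock objects
# are truthy, so a check like `if client.parse_rag_corpus_path(name):`
# always takes the "full path" branch and the regex is never consulted.
client = mock.Mock()
default_is_truthy = bool(client.parse_rag_corpus_path("1234567890"))


def fake_parse(path):
    """Hypothetical parser: a dict for full paths, {} for short IDs."""
    if path.startswith("projects/"):
        return {"rag_corpus": path.rsplit("/", 1)[-1]}
    return {}


# With side_effect set to a function, the mock returns that function's
# result, so a short ID now yields {} (falsy) and falls through to the
# regex check.
client.parse_rag_corpus_path.side_effect = fake_parse
short_id_result = client.parse_rag_corpus_path("1234567890")
full_path_result = client.parse_rag_corpus_path(
    "projects/p/locations/l/ragCorpora/1234567890"
)
```

With the side_effect in place, a test that passes a bare numeric ID exercises the regex branch instead of short-circuiting on the mock's truthy return value.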