GH-3969: Native stream() implementations for DatasetGraph #3970
Open
arne-bdt wants to merge 1 commit into
Contributor
Author
I took an extra look at TDB2; not much to do there, since it mostly builds on But you hinted me towards
Replace the find()-wrapping default of DatasetGraph.stream(g,s,p,o) with first-class stream support across the hierarchy. DatasetGraphBaseFind gets a stream() path mirroring the find() default-/named-/union-graph split, backed by streamInDftGraph / streamInSpecificNamedGraph / streamInAnyNamedGraphs primitives implemented per dataset. GraphView streams graph-over-dataset through DatasetGraph#stream; adds G stream helpers. The interface method stays default, so implementors are not broken. Includes stream()==find() parity tests and a forEachRemaining check in IteratorTxnTracker so stream bulk operations stay inside their transaction.
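The compatibility point in the commit message (a default stream() that wraps find(), which a dataset may override natively) can be sketched as follows. This is a minimal illustration, not Jena code: `Quad`, `QuadStore`, `FindOnlyStore` and `NativeStore` are hypothetical stand-ins, and `null` plays the role of a wildcard.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class StreamDefaultSketch {

    /** Hypothetical stand-in for a quad; a null pattern slot means "any". */
    record Quad(String g, String s, String p, String o) {
        boolean matches(String pg, String ps, String pp, String po) {
            return (pg == null || pg.equals(g)) && (ps == null || ps.equals(s))
                && (pp == null || pp.equals(p)) && (po == null || po.equals(o));
        }
    }

    /** Hypothetical stand-in for the dataset interface. */
    interface QuadStore {
        Iterator<Quad> find(String g, String s, String p, String o);

        // Default: wrap the find() iterator in a stream.
        // Implementors that only provide find() keep compiling and working.
        default Stream<Quad> stream(String g, String s, String p, String o) {
            Spliterator<Quad> split =
                Spliterators.spliteratorUnknownSize(find(g, s, p, o), Spliterator.NONNULL);
            return StreamSupport.stream(split, false);
        }
    }

    /** An implementor that predates stream(): it only knows find(). */
    static class FindOnlyStore implements QuadStore {
        protected final List<Quad> quads;

        FindOnlyStore(List<Quad> quads) { this.quads = List.copyOf(quads); }

        @Override
        public Iterator<Quad> find(String g, String s, String p, String o) {
            return quads.stream().filter(q -> q.matches(g, s, p, o)).iterator();
        }
    }

    /** An implementor with a native stream() that never goes through find(). */
    static class NativeStore extends FindOnlyStore {
        NativeStore(List<Quad> quads) { super(quads); }

        @Override
        public Stream<Quad> stream(String g, String s, String p, String o) {
            return quads.stream().filter(q -> q.matches(g, s, p, o));
        }
    }

    public static void main(String[] args) {
        List<Quad> data = List.of(new Quad("g1", "s", "p", "o1"), new Quad("g2", "s", "p", "o2"));
        QuadStore legacy = new FindOnlyStore(data);
        QuadStore nat = new NativeStore(data);
        // Parity check in the spirit of the stream()==find() tests.
        boolean same = legacy.stream("g1", null, null, null).toList()
            .equals(nat.stream("g1", null, null, null).toList());
        System.out.println("parity: " + same);
    }
}
```

Because both paths must return the same quads for every access pattern, a parity comparison like the one in `main` is a natural test shape for such a change.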
24a3587 to 4bc8494
Member
In principle, the PR is a "good thing" ™️
I will review this PR ... there is quite a lot of reviewing at the moment, so there is a bit of a queue.
GitHub issue resolved #3969
Pull request Description:
Replaces the find()-wrapping default with native streaming across the DatasetGraph hierarchy.

- DatasetGraphBaseFind: stream() path (streamNG / streamAny / streamQuadsInUnionGraph) mirroring find(), backed by new primitives streamInDftGraph / streamInSpecificNamedGraph / streamInAnyNamedGraphs, implemented by each dataset (in-memory, map, one, null, collection, buffering, dyadic, ordered, storage, TDB1).
- GraphView.stream(...) routes graph-over-dataset access through DatasetGraph#stream.
- G stream helpers (quads2triples, triples2quads, triples2quadsDftGraph).
- IteratorTxnTracker.forEachRemaining now checks the transaction, keeping stream bulk operations (forEach/count/collect) inside their originating transaction.

Compatibility: DatasetGraph.stream(g,s,p,o) remains a default method, so there is no break for external implementors.

Tests: stream()==find() parity across access patterns for the in-memory / map / one / filtered-view / storage / TDB1 datasets and GraphView, plus buffered-overlay parity tests for BufferingDatasetGraph.

Note: I did not expose the native stream support of the StorageRDF implementations to DatasetGraphStorage. If one plans to do that, the streams would need to be transaction-isolated, as in IteratorTxnTracker.

AI Disclaimer
The production code is written by hand, with only AI-assisted code completion.
Most of the new tests are written by an AI coding assistant.
The issue description, PR description and commit comment are AI-generated.
By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.
See the Apache Jena "Contributing" guide.
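The forEachRemaining point in the description can be illustrated with a small sketch. It assumes a wrapper iterator that checks a transaction in hasNext()/next() and forwards bulk traversal to the wrapped iterator. Streams built on an iterator typically run terminal operations such as forEach, count and collect through forEachRemaining, so a forwarding override without its own check would skip the per-element checks. `Txn` and `CheckedIterator` are hypothetical names for illustration only, not the actual IteratorTxnTracker implementation.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class TxnCheckedIteratorSketch {

    /** Hypothetical transaction handle: just an "active" flag. */
    static final class Txn {
        private boolean active = true;

        void end() { active = false; }

        void check() {
            if (!active)
                throw new IllegalStateException("Iterator used outside its transaction");
        }
    }

    /** Hypothetical wrapper that verifies the transaction on every access path. */
    static final class CheckedIterator<T> implements Iterator<T> {
        private final Iterator<T> inner;
        private final Txn txn;

        CheckedIterator(Iterator<T> inner, Txn txn) {
            this.inner = inner;
            this.txn = txn;
        }

        @Override
        public boolean hasNext() { txn.check(); return inner.hasNext(); }

        @Override
        public T next() { txn.check(); return inner.next(); }

        // Bulk traversal is forwarded to the inner iterator, which bypasses
        // hasNext()/next() above, so the check has to be repeated here.
        @Override
        public void forEachRemaining(Consumer<? super T> action) {
            txn.check();
            inner.forEachRemaining(action);
        }
    }

    /** Builds a sequential stream over an iterator of unknown size. */
    static <T> Stream<T> stream(Iterator<T> it) {
        return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, 0), false);
    }

    public static void main(String[] args) {
        Txn txn = new Txn();
        Stream<String> s = stream(new CheckedIterator<>(List.of("a", "b").iterator(), txn));
        txn.end(); // the stream escapes its transaction before being consumed
        try {
            s.count();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check runs once per bulk call rather than once per element, which keeps the bulk path cheap while still rejecting a stream consumed after its transaction has ended.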