[Confluence] Fix pagination for get_all_* methods and unify _get_paged across Cloud/Server #1616
Zircoz wants to merge 7 commits into atlassian-api:master
…d across Cloud/Server
Fixes atlassian-api#1598
Fixes atlassian-api#1480
- Switch 10 get_all_* methods to use _get_paged for full pagination
- Unify _get_paged into ConfluenceBase (remove Cloud/Server duplicates)
- Handle _links.next as both string and dict formats
- Fix relative pagination URLs by prepending base URL correctly
- Fix Cloud api_root from wiki/api/v2 to wiki/rest/api (endpoints use v1 paths)
- Recognize api.atlassian.com in Cloud detection; support explicit cloud= kwarg
- Add routing tests and pagination edge-case tests for both Cloud and Server
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
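The pagination behaviour this commit describes can be sketched roughly as follows. This is a minimal illustration, not the library's actual _get_paged: the names next_link, iter_paged, and FAKE_PAGES are hypothetical, and the HTTP call is replaced by a dictionary lookup so the example runs on its own.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def next_link(page: Dict[str, Any]) -> Optional[str]:
    """Return the next-page URL from a response, or None on the last page.

    Accepts _links.next as either a plain string or a {"href": "..."} dict.
    """
    nxt = (page.get("_links") or {}).get("next")
    if isinstance(nxt, dict):
        nxt = nxt.get("href")
    return nxt or None


def iter_paged(fetch: Callable[[str], Dict[str, Any]], url: Optional[str]) -> Iterator[Any]:
    """Yield every item in "results", following _links.next until it is absent."""
    while url:
        page = fetch(url)
        yield from page.get("results", [])
        url = next_link(page)


# Hypothetical canned responses standing in for real HTTP calls.
FAKE_PAGES = {
    "/content?start=0": {"results": [1, 2], "_links": {"next": "/content?start=2"}},
    "/content?start=2": {"results": [3], "_links": {"next": {"href": "/content?start=3"}}},
    "/content?start=3": {"results": [4], "_links": {}},
}

all_items = list(iter_paged(FAKE_PAGES.__getitem__, "/content?start=0"))
print(all_items)
```

Normalizing the next link in one small helper keeps the loop itself identical for both response shapes, which is presumably what makes a single shared implementation practical.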
So let's check how AI will work; fortunately, it works for me.
@Zircoz looks like a lot of extra code
I'll tell Claude to review and rethink. :P More seriously tho, I'll take a closer look myself with assistance from Claude and come back to ya on this.
Use urlparse to extract and check the hostname directly instead of naive substring matching, preventing spoofing via paths like evil.com/atlassian.net/...
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
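To illustrate why the hostname check matters, here is a small sketch comparing substring matching with a parsed-hostname check. The function names and the exact matching rules (suffix .atlassian.net, exact host api.atlassian.com) are assumptions for illustration and may differ from what the PR implements.

```python
from urllib.parse import urlparse


def naive_is_cloud(url: str) -> bool:
    """Substring check: matches "atlassian.net" anywhere, including the path."""
    return "atlassian.net" in url


def hostname_is_cloud(url: str) -> bool:
    """Check only the parsed hostname (hypothetical rules, see lead-in)."""
    hostname = (urlparse(url).hostname or "").lower()
    return (
        hostname == "api.atlassian.com"
        or hostname == "atlassian.net"
        or hostname.endswith(".atlassian.net")
    )


spoofed = "https://evil.com/atlassian.net/wiki"
genuine = "https://example.atlassian.net/wiki"

print(naive_is_cloud(spoofed), hostname_is_cloud(spoofed))
print(naive_is_cloud(genuine), hostname_is_cloud(genuine))
```

The suffix test includes the leading dot so that a host such as atlassian.net.evil.com or notatlassian.net does not match.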
…itization [Confluence] Fix pagination for get_all_* methods and CodeQL URL sanitization
- Revert Cloud api_version from "latest" back to "2" (original)
- Revert Cloud api_root from "wiki/rest/api" back to "wiki/api/v2" (original)
- Revert Cloud URL construction: remove api_root suffix appended to self.url
- Simplify _get_paged relative URL resolution: drop api_root-stripping branch (was only needed due to the Cloud URL change) and use urlparse(self.url).netloc directly
- Update test_init_defaults assertions to match reverted Cloud defaults
The Cloud api_version/api_root/URL changes were unrelated to atlassian-api#1598 and constituted a breaking change for existing Cloud users. The complex api_root stripping logic in _get_paged was a direct consequence of that change and is no longer needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
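The simplified relative-URL resolution mentioned in this commit might look something like the sketch below. The helper name resolve_next and the example host are hypothetical. The sketch only prepends scheme and host, as the commit message describes; whether a context path in the base URL should also be preserved is not something it tries to settle.

```python
from urllib.parse import urlparse


def resolve_next(base_url: str, next_url: str) -> str:
    """Turn a relative next-page link into an absolute URL.

    Absolute links are returned unchanged; relative ones get the scheme and
    netloc of base_url prepended.
    """
    if urlparse(next_url).netloc:
        return next_url  # already absolute
    base = urlparse(base_url)
    path = next_url if next_url.startswith("/") else "/" + next_url
    return f"{base.scheme}://{base.netloc}{path}"


# Hypothetical Server-style base URL.
relative = resolve_next("https://confluence.example.com", "/rest/api/content?start=25")
absolute = resolve_next(
    "https://confluence.example.com",
    "https://confluence.example.com/rest/api/content?start=50",
)
print(relative)
print(absolute)
```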
…tion-11278499126321725972 Follow-up: CodeQL fix + PR scope simplification
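Migration note for maintainers
Hi, a note for when this gets merged, in case it's useful for release notes or a changelog entry:
What changed
Several get_all_* methods now return generators instead of dicts. Affected methods: those listed under "Breaking change" in the PR summary.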
Upgrading
# Before: broken (only first page returned, dict access required)
result = confluence.get_all_pages_from_space("MYSPACE")
pages = result["results"]
# After — iterate the generator directly
for page in confluence.get_all_pages_from_space("MYSPACE"):
process(page)
# or collect everything at once
pages = list(confluence.get_all_pages_from_space("MYSPACE"))
Cloud users on v1 REST API
If you use the v1 REST API on Cloud, pass api_root and api_version explicitly to the constructor:
confluence = Confluence(
url="https://api.atlassian.com/ex/confluence/<tenant-id>",
username=email,
password=api_token,
cloud=True,
api_root="wiki/rest/api",
api_version="latest",
)
A couple of requests:
Thanks!
🤖 Generated with Claude Code
Summary
Fixes #1598
get_all_pages_from_space and related get_all_* methods only returned the first page of results because they called self.get() directly instead of self._get_paged(). Additionally, pagination broke when the Confluence API returned _links.next as a plain URL string rather than a {"href": "..."} dict.
Changes
Pagination fix (core — #1598)
- Switched 10 get_all_* methods to use _get_paged on Server so they return fully-paginated generators instead of single-page dicts
- Added get_all_pages_from_space and get_all_blog_posts_from_space to Cloud; these were missing from the Cloud implementation entirely
- Unified _get_paged into ConfluenceBase: removed duplicate implementations from ConfluenceCloudBase and ConfluenceServerBase, replacing them with a single method that handles both str and dict formats for _links.next
- Fixed relative URL resolution: when _links.next is a relative path (e.g. /rest/api/content?start=25), the scheme and host are extracted from self.url via urlparse and correctly prepended
Routing improvements (Confluence wrapper)
- Replaced "atlassian.net" in url checks with urlparse(url).hostname to prevent false matches on paths like evil.com/fake/atlassian.net/
- Recognized api.atlassian.com: the Confluence wrapper now correctly routes OAuth2 API gateway URLs to ConfluenceCloud
- Added an explicit cloud= kwarg: allows callers to force Cloud or Server routing regardless of URL heuristics
Out of scope (intentionally not in this PR)
The Cloud api_root and api_version defaults (wiki/api/v2, "2") are unchanged. An earlier iteration of this branch changed these to wiki/rest/api / "latest" (v1 REST API), but that was reverted: it was unrelated to #1598 and would break existing Cloud users who have already set up v2 OAuth scopes. Users who need v1 REST API access can pass api_root="wiki/rest/api", api_version="latest" explicitly to the constructor.
Breaking change
The following methods now return generators instead of dicts:
get_all_pages_from_space, get_all_blog_posts_from_space, get_all_pages_by_label, get_all_blog_posts_by_label, get_all_draft_pages_from_space, get_all_draft_blog_posts_from_space, get_trash_content, get_all_pages_from_space_trash, get_all_blog_posts_from_space_trash
Before (broken, only first page):
After (correct, all pages):
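What the dict-to-generator change means for calling code can be shown with stand-in functions. Both functions below are hypothetical stubs, not the real client methods; they only mimic the return shapes described above so the example runs without a Confluence instance.

```python
from typing import Any, Dict, Iterator


def get_all_pages_old_style() -> Dict[str, Any]:
    """Stub for the old behaviour: one dict holding only the first page."""
    return {"results": [{"id": "1"}, {"id": "2"}], "size": 2}


def get_all_pages_new_style() -> Iterator[Dict[str, Any]]:
    """Stub for the new behaviour: a generator over every result."""
    for page_id in ("1", "2", "3", "4"):
        yield {"id": page_id}


# Old callers indexed into the dict and silently saw only the first page.
old_ids = [p["id"] for p in get_all_pages_old_style()["results"]]

# New callers iterate directly, or wrap in list() to collect everything.
new_ids = [p["id"] for p in get_all_pages_new_style()]

# A generator can be consumed only once; a second pass yields nothing.
gen = get_all_pages_new_style()
first_pass = list(gen)
second_pass = list(gen)

print(old_ids, new_ids, len(first_pass), len(second_pass))
```

Code that needs to traverse the results more than once, or index into them, would need to materialize the generator with list() first.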
🤖 Generated with Claude Code