Fix missing taskId filter in pprof task log query and its DB implementations#13807
Fix missing taskId filter in pprof task log query and its DB implementations#13807wu-sheng merged 2 commits intoapache:masterfrom
taskId filter in pprof task log query and its DB implementations#13807Conversation
…anDB/Elasticsearch implementations
There was a problem hiding this comment.
Pull request overview
This PR fixes pprof task log retrieval to apply the taskId filter at the storage/query layer (rather than querying all logs and filtering in-memory), updating the core DAO contract and the JDBC/Elasticsearch/BanyanDB implementations accordingly.
Changes:
- Update
IPprofTaskLogQueryDAO#getTaskLogListto accepttaskIdand document the new contract. - Push
taskIdfiltering down into JDBC / Elasticsearch / BanyanDB task log queries. - Simplify
PprofQueryService#queryPprofTaskLogsby removing in-memory filtering and updatingCHANGES.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| oap-server/server-storage-plugin/storage-jdbc-hikaricp-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/jdbc/common/dao/JDBCPprofTaskLogQueryDAO.java | Add taskId predicate to SQL used to fetch pprof task logs. |
| oap-server/server-storage-plugin/storage-elasticsearch-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/elasticsearch/query/PprofTaskLogQueryEsDAO.java | Add taskId term filter to the ES bool query for pprof task logs. |
| oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/stream/BanyanDBPprofTaskLogQueryDAO.java | Add taskId equality condition to BanyanDB stream query. |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/profiling/pprof/IPprofTaskLogQueryDAO.java | Change DAO API to require taskId for task log queries. |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/profiling/pprof/PprofQueryService.java | Remove in-memory filtering; delegate filtering to DAO and map results. |
| docs/en/changes/changes.md | Document the fix in the 10.5.0 change log. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * Search task logs by the given task id. | ||
| * | ||
| * @param taskId the task id to filter by, must not be null or blank | ||
| * @return the task logs associated with the given task id | ||
| * @throws IOException if the query fails | ||
| */ | ||
| List<PprofTaskLog> getTaskLogList() throws IOException; | ||
| List<PprofTaskLog> getTaskLogList(String taskId) throws IOException; |
There was a problem hiding this comment.
The Javadoc states taskId must not be null or blank, but there is no validation at the service/DAO boundary. This leads to inconsistent behavior across implementations (e.g., Elasticsearch Query.term throws NPE on null, while JDBC/BanyanDB may silently return empty results). Either enforce this contract (preferred: validate and throw an IllegalArgumentException with a clear message) or relax the Javadoc to match actual behavior.
| public List<PprofTaskLog> queryPprofTaskLogs(String taskId) throws IOException { | ||
| List<PprofTaskLog> taskLogList = getTaskLogQueryDAO().getTaskLogList(); | ||
| return findMatchedLogs(taskId, taskLogList); | ||
| } | ||
|
|
||
| private List<PprofTaskLog> findMatchedLogs(final String taskID, final List<PprofTaskLog> allLogs) { | ||
| return allLogs.stream() | ||
| .filter(l -> Objects.equals(l.getId(), taskID)) | ||
| .map(this::extendTaskLog) | ||
| .collect(Collectors.toList()); | ||
| return getTaskLogQueryDAO().getTaskLogList(taskId).stream() | ||
| .map(this::extendTaskLog) | ||
| .collect(Collectors.toList()); |
There was a problem hiding this comment.
taskId is passed directly to the storage layer without validation. Since the Elasticsearch implementation will throw on null (and the DAO Javadoc claims non-null/non-blank), add an explicit check here (or earlier at the GraphQL boundary) to fail fast with a clear error message and consistent behavior across storage backends.
There was a problem hiding this comment.
Added the task id checker in each implementation.
| public List<PprofTaskLog> getTaskLogList(String taskId) throws IOException { | ||
| final String index = IndexController.LogicIndicesRegister.getPhysicalTableName(PprofTaskLogRecord.INDEX_NAME); | ||
| final BoolQueryBuilder query = Query.bool(); | ||
| if (IndexController.LogicIndicesRegister.isMergedTable(PprofTaskLogRecord.INDEX_NAME)) { | ||
| query.must(Query.term(IndexController.LogicIndicesRegister.RECORD_TABLE_NAME, PprofTaskLogRecord.INDEX_NAME)); | ||
| } | ||
|
|
||
| query.must(Query.term(PprofTaskLogRecord.TASK_ID, taskId)); | ||
|
|
There was a problem hiding this comment.
Query.term(..., taskId) requires a non-null value and will throw a NullPointerException if taskId is null. Add an explicit argument validation (or ensure callers validate) so failures are consistent and produce a clearer error than an NPE.
CHANGESlog.