feat: Add sync API endpoints for device synchronization #164
Greptile Summary
This PR adds a new
Confidence Score: 1/5
Not safe to merge: the feature is entirely non-functional and introduces a security regression. The blueprint is never registered, so no endpoint is reachable. The import handler crashes on every call because of a mismatched heartbeat() signature, and two other bugs (get_bucket_metadata raising an exception instead of returning None, and transposed create_bucket arguments) would prevent correct bucket creation even if the crash were fixed. In addition, the new routes bypass the DNS-rebinding guard that protects every other endpoint in the server, exposing the full-data export to unauthenticated cross-origin reads. aw_server/sync_api.py requires significant rework, and aw_server/server.py needs a blueprint registration call.
| Filename | Overview |
|---|---|
| aw_server/sync_api.py | New sync API module with three endpoints (export, import, status). Contains two crash-level bugs (blueprint not registered; heartbeat called with wrong signature), two data-corruption bugs (get_bucket_metadata raises instead of returning None; create_bucket args swapped), a DNS-rebinding security gap, and an O(N×M) deduplication scan capped at 1,000 events. |
Sequence Diagram
sequenceDiagram
participant Client
participant SyncAPI as sync_api.py
participant ServerAPI as api.py (ServerAPI)
participant DB as Datastore
Note over Client,DB: GET /api/0/sync/export
Client->>SyncAPI: GET /api/0/sync/export
SyncAPI->>ServerAPI: get_buckets()
ServerAPI->>DB: buckets()
DB-->>ServerAPI: bucket dict
loop each bucket_id
SyncAPI->>ServerAPI: "get_events(bucket_id, limit=None)"
ServerAPI->>DB: "get(limit=-1)"
DB-->>SyncAPI: events list
end
SyncAPI-->>Client: JSON export payload
Note over Client,DB: POST /api/0/sync/import (BROKEN)
Client->>SyncAPI: POST /api/0/sync/import (JSON)
loop each bucket
SyncAPI->>ServerAPI: get_bucket_metadata(bucket_id)
Note right of ServerAPI: raises NotFound if bucket missing
alt bucket exists
loop each event
SyncAPI->>ServerAPI: "get_events(bucket_id, limit=1000)"
Note right of ServerAPI: Only checks first 1000 events
SyncAPI->>ServerAPI: "heartbeat(bucket_id, dict, duration, timestamp=ts)"
Note right of ServerAPI: CRASH: wrong type and args
end
end
end
Note over Client,DB: GET /api/0/sync/status
Client->>SyncAPI: GET /api/0/sync/status
SyncAPI->>ServerAPI: get_buckets()
SyncAPI->>ServerAPI: get_info()
SyncAPI-->>Client: status JSON
Reviews (1): Last reviewed commit: "feat: Add sync API endpoints for device ..."
    """
    ActivityWatch Sync API
    Allows exporting and importing bucket data between devices.
    """
    import json
    from datetime import datetime, timezone
    from flask import Blueprint, jsonify, request, current_app

    from .api import ServerAPI

    sync_blueprint = Blueprint("sync", __name__, url_prefix="/api")
sync_blueprint is defined but never imported or registered in server.py. AWFlask.__init__ only calls self.register_blueprint(rest.blueprint); sync_blueprint is absent. All three sync endpoints are therefore unreachable at runtime, and the entire feature is a no-op until the blueprint is registered.
    if exists:
        skipped_count += 1
heartbeat() called with wrong signature: runtime crash on every import
ServerAPI.heartbeat has the signature heartbeat(bucket_id, heartbeat: Event, pulsetime: float). This call passes a plain dict (event_payload) where an Event object is required, passes duration as pulsetime (semantically unrelated: pulsetime is a merge window, not an event duration), and passes timestamp=ts, which is not a parameter of heartbeat at all. Every call will raise a TypeError. The correct approach is to construct an Event object and call api.create_events(bucket_id, [event]) for bulk import, or to use the existing api.import_bucket helper.
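A minimal sketch of the suggested approach: convert each imported payload into an event object, then insert the whole batch with one create_events call. The `Event` dataclass below is a simplified stand-in, not the server's real Event class, and `StubAPI` is a hypothetical substitute for ServerAPI that exposes only `create_events(bucket_id, events)`. The payload field names (`timestamp`, `duration`, `data`) and the ISO-8601 timestamp format are assumptions based on the quoted import code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Any, Dict, List


@dataclass
class Event:
    """Simplified stand-in for the server's Event model (assumption)."""
    timestamp: datetime
    duration: timedelta
    data: Dict[str, Any] = field(default_factory=dict)


class StubAPI:
    """Hypothetical stand-in for ServerAPI; it only records inserted events."""

    def __init__(self) -> None:
        self.stored: Dict[str, List[Event]] = {}

    def create_events(self, bucket_id: str, events: List[Event]) -> None:
        self.stored.setdefault(bucket_id, []).extend(events)


def to_event(event_data: Dict[str, Any]) -> Event:
    """Build an Event from one exported payload (field names are assumed)."""
    return Event(
        timestamp=datetime.fromisoformat(event_data["timestamp"]),
        duration=timedelta(seconds=float(event_data.get("duration", 0))),
        data=event_data.get("data", {}),
    )


def import_events(api: StubAPI, bucket_id: str, payloads: List[Dict[str, Any]]) -> int:
    """Convert all payloads and insert them with a single create_events call."""
    events = [to_event(payload) for payload in payloads]
    api.create_events(bucket_id, events)
    return len(events)
```

Unlike heartbeat, which may merge an incoming event into the previous one within the pulsetime window, a bulk insert of this kind keeps each imported event separate, which is usually what an import wants.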
    imported_count = 0
    skipped_count = 0

    for bucket_id, bucket_data in data["buckets"].items():
get_bucket_metadata raises NotFound, never returns None
ServerAPI.get_bucket_metadata is decorated with @check_bucket_exists, which raises a NotFound exception (HTTP 404) when the bucket is absent. It never returns a falsy value, so the if not existing: branch is dead code: an import targeting a new bucket will always fail with a 404 error before the create_bucket call is reached.
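One way to handle this, sketched under assumptions, is to treat the exception as the "bucket is missing" signal instead of testing the return value. `NotFound` and `StubAPI` below are stand-ins written for this illustration, not the server's real classes, and the `ensure_bucket` helper is hypothetical.

```python
from typing import Dict


class NotFound(Exception):
    """Stand-in for the exception raised when a bucket is missing (assumption)."""


class StubAPI:
    """Hypothetical stand-in for ServerAPI with only the two calls used here."""

    def __init__(self) -> None:
        self.buckets: Dict[str, Dict[str, str]] = {}

    def get_bucket_metadata(self, bucket_id: str) -> Dict[str, str]:
        # Mirrors the described behaviour: raise instead of returning None.
        if bucket_id not in self.buckets:
            raise NotFound(bucket_id)
        return self.buckets[bucket_id]

    def create_bucket(self, bucket_id: str, event_type: str, client: str, hostname: str) -> None:
        self.buckets[bucket_id] = {"type": event_type, "client": client, "hostname": hostname}


def ensure_bucket(api: StubAPI, bucket_id: str, meta: Dict[str, str], source_device: str) -> bool:
    """Create the bucket if it does not exist; return True when it was created."""
    try:
        api.get_bucket_metadata(bucket_id)
        return False
    except NotFound:
        # event_type is passed second, ahead of client.
        api.create_bucket(
            bucket_id,
            meta.get("type", "unknown"),
            meta.get("client", "sync"),
            meta.get("hostname", source_device),
        )
        return True
```

An alternative with the same effect would be to check whether bucket_id is a key of the dict returned by get_buckets(), which avoids using an exception for control flow.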
    api.create_bucket(
        bucket_id,
        meta.get("client", "sync"),
        meta.get("type", "unknown"),
        meta.get("hostname", source_device),
    )
create_bucket's second positional parameter is event_type, not client. As written, the client value is passed as event_type and the type value as client, so both are stored in the wrong DB columns for every imported bucket.
Suggested change:
    - api.create_bucket(
    -     bucket_id,
    -     meta.get("client", "sync"),
    -     meta.get("type", "unknown"),
    -     meta.get("hostname", source_device),
    - )
    + api.create_bucket(
    +     bucket_id,
    +     meta.get("type", "unknown"),
    +     meta.get("client", "sync"),
    +     meta.get("hostname", source_device),
    + )
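Passing the arguments by keyword is another way to rule out this kind of transposition. The sketch below uses a hypothetical stand-in for create_bucket. Only the event_type parameter name is confirmed by the comment above; the names client and hostname, and the sample bucket values, are assumptions made for illustration.

```python
from typing import Dict


def create_bucket(bucket_id: str, event_type: str, client: str, hostname: str) -> Dict[str, str]:
    """Hypothetical stand-in for ServerAPI.create_bucket; returns the stored fields."""
    return {"id": bucket_id, "type": event_type, "client": client, "hostname": hostname}


# Sample metadata as it might appear in an export payload (illustrative values).
meta = {"client": "aw-watcher-window", "type": "currentwindow"}
source_device = "laptop"

# Keyword arguments make the mapping explicit, so argument order no longer matters.
bucket = create_bucket(
    "aw-watcher-window_laptop",
    client=meta.get("client", "sync"),
    event_type=meta.get("type", "unknown"),
    hostname=meta.get("hostname", source_device),
)
```

Here client is deliberately listed before event_type, and each value still lands in the intended field.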
    sync_blueprint = Blueprint("sync", __name__, url_prefix="/api")
Missing host_header_check: DNS rebinding vulnerability
Every route in rest.py is protected against DNS rebinding via the host_header_check decorator (applied at the Api level). The new sync_blueprint bypasses this entirely, exposing all three sync endpoints, including the full-data export, to DNS rebinding attacks from any malicious web page. The fix is to apply host_header_check as a before_request hook on sync_blueprint or to decorate each view function.
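To illustrate the kind of check a DNS-rebinding guard performs, here is a minimal sketch of Host header validation. It is not the actual host_header_check implementation: the `is_allowed_host` helper, the allow-list, and the port handling are all assumptions made for this example.

```python
from typing import Optional


def is_allowed_host(host_header: Optional[str], server_host: str) -> bool:
    """Return True if the Host header names this server or a loopback alias.

    Assumption: only "localhost", "127.0.0.1" and the configured server host
    are acceptable. IPv6 literals are not handled in this sketch.
    """
    if host_header is None:
        return False
    # Drop an optional ":port" suffix and compare case-insensitively.
    hostname = host_header.split(":")[0].lower()
    allowed = {"localhost", "127.0.0.1", server_host.lower()}
    return hostname in allowed
```

In a rebinding attack the browser still sends the attacker's domain in the Host header, so a check like this rejects the request even though it arrives on the loopback interface. With Flask, such a check could run in a function registered through the blueprint's before_request hook, returning a 400 response when it fails.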
    duration = event_data.get("duration", 0)
    event_payload = event_data.get("data", {})

    # Last-write-wins: check if event with same id exists
    event_id = event_data.get("id")
    if event_id:
Deduplication checks only 1,000 existing events per imported event, silently re-importing the rest
get_events(bucket_id, limit=1000) is called inside the per-event loop, so for a bucket with more than 1,000 events the deduplication check only covers the first 1,000. Events beyond that window will be silently re-imported on every sync, causing duplicates. The repeated fetch also makes the import O(N×M) in the number of incoming and existing events, which becomes very slow for large buckets.
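A sketch of one way to avoid both problems: fetch the existing events once, outside the loop and without a limit, build a set of their ids, and test each incoming event against that set. This brings the work down to roughly O(N+M). Events are modelled here as plain dicts with an optional "id" key, and the `split_new_events` helper is hypothetical. Like the quoted code, the sketch assumes event ids are comparable across devices.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple

EventDict = Dict[str, Any]


def split_new_events(
    existing: Iterable[EventDict], incoming: Iterable[EventDict]
) -> Tuple[List[EventDict], int]:
    """Return (events_to_import, skipped_count) with one pass over each input."""
    # Built once, so each membership test below is O(1) on average.
    known_ids: Set[Any] = {e["id"] for e in existing if e.get("id") is not None}

    to_import: List[EventDict] = []
    skipped = 0
    for event in incoming:
        event_id = event.get("id")
        if event_id is not None and event_id in known_ids:
            skipped += 1
            continue
        to_import.append(event)
        if event_id is not None:
            # Also guards against duplicates within the incoming batch itself.
            known_ids.add(event_id)
    return to_import, skipped
```

Events without an id cannot be matched, so this sketch always imports them; a real implementation would need to decide how to treat them.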
    # Last-write-wins: check if event with same id exists
The comment says "Last-write-wins", but the code does the opposite: it skips an event when a matching ID already exists, which makes it first-write-wins. This mismatch will mislead anyone trying to understand or extend the conflict-resolution logic.
Suggested change:
    - # Last-write-wins: check if event with same id exists
    + # First-write-wins: skip event if it already exists locally
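The difference between the two policies can be sketched as follows. Events are modelled as dicts keyed by id with a datetime "timestamp" field; both merge helpers are hypothetical and only illustrate the semantics, not the PR's code.

```python
from datetime import datetime
from typing import Any, Dict, Iterable

EventDict = Dict[str, Any]


def merge_first_write_wins(
    local: Dict[Any, EventDict], incoming: Iterable[EventDict]
) -> Dict[Any, EventDict]:
    """Keep the local event whenever an id already exists (skip-on-match)."""
    merged = dict(local)
    for event in incoming:
        merged.setdefault(event["id"], event)
    return merged


def merge_last_write_wins(
    local: Dict[Any, EventDict], incoming: Iterable[EventDict]
) -> Dict[Any, EventDict]:
    """Replace the local event when the incoming one has a later timestamp."""
    merged = dict(local)
    for event in incoming:
        current = merged.get(event["id"])
        if current is None or event["timestamp"] > current["timestamp"]:
            merged[event["id"]] = event
    return merged


# Illustrative data: the same id exists locally and in the incoming batch.
local_events = {1: {"id": 1, "timestamp": datetime(2024, 1, 1), "data": "old"}}
incoming_events = [{"id": 1, "timestamp": datetime(2024, 1, 2), "data": "new"}]

first = merge_first_write_wins(local_events, incoming_events)
last = merge_last_write_wins(local_events, incoming_events)
```

With this data, the first-write-wins merge keeps the "old" payload while the last-write-wins merge takes the "new" one, which is the behavioural gap between the code and its comment.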
Summary
Adds a sync API to ActivityWatch for device synchronization.
Endpoints
- GET /api/0/sync/export - Export all buckets/events
- POST /api/0/sync/import - Import data from another device
- GET /api/0/sync/status - Sync status
Conflict Resolution
Last-write-wins based on event timestamps.
Closes #35