feat: Add sync API endpoints for device synchronization #164

Open
mkcash wants to merge 1 commit into ActivityWatch:master from mkcash:feature/sync-api
Conversation

@mkcash mkcash commented May 26, 2026

Summary

Adds a sync API to ActivityWatch for device synchronization.

Endpoints

  • GET /api/0/sync/export - Export all buckets/events
  • POST /api/0/sync/import - Import data from another device
  • GET /api/0/sync/status - Sync status

Conflict Resolution

Last-write-wins based on event timestamps.

Closes #35

greptile-apps Bot commented May 26, 2026

Greptile Summary

This PR adds a new sync_api.py module with three endpoints for exporting, importing, and checking the status of ActivityWatch data across devices. The implementation has several correctness and security defects that would prevent the feature from working at all in its current state.

  • Blueprint not wired up: sync_blueprint is never imported or registered in server.py, so no endpoint is reachable.
  • Broken import handler: The handler calls api.heartbeat() with an incorrect signature (wrong types, wrong argument names), causing a TypeError on every import. get_bucket_metadata raises NotFound rather than returning None, so the bucket-creation branch is unreachable. create_bucket is called with its event_type and client arguments transposed.
  • Missing DNS-rebinding protection: All existing routes apply host_header_check; the new blueprint omits it, exposing the full-data export and import endpoints to DNS rebinding attacks.

Confidence Score: 1/5

Not safe to merge — the feature is entirely non-functional and introduces a security regression.

The blueprint is never registered so no endpoint is reachable. The import handler crashes on every call due to a mismatched heartbeat() signature, and two other bugs (exception instead of None from get_bucket_metadata, transposed create_bucket args) would prevent correct bucket creation even if the crash were fixed. On top of that, the new routes bypass the DNS-rebinding guard that protects every other endpoint in the server, exposing the full-data export to unauthenticated cross-origin reads.

aw_server/sync_api.py requires significant rework, and aw_server/server.py needs a blueprint registration call added.

Security Review

  • DNS Rebinding (aw_server/sync_api.py): The sync_blueprint routes bypass the host_header_check decorator that protects all existing API routes in rest.py. A malicious web page on the LAN could rebind its DNS to 127.0.0.1 and call /api/0/sync/export to exfiltrate all stored events, or /api/0/sync/import to inject arbitrary data — without any host validation.
  • Unauthenticated full data export: /api/0/sync/export returns all buckets and all events with no authentication or rate-limiting. The absence of the host header check makes it a larger attack surface than the existing /api/0/export endpoint.

Important Files Changed

Filename | Overview
aw_server/sync_api.py | New sync API module with three endpoints (export, import, status). Contains two feature-breaking bugs (blueprint not registered; heartbeat called with the wrong signature), two bucket-creation bugs (get_bucket_metadata raises instead of returning None; create_bucket arguments swapped), a DNS-rebinding security gap, and an O(N×M) deduplication scan capped at 1,000 events.

Sequence Diagram

sequenceDiagram
    participant Client
    participant SyncAPI as sync_api.py
    participant ServerAPI as api.py (ServerAPI)
    participant DB as Datastore

    Note over Client,DB: GET /api/0/sync/export
    Client->>SyncAPI: GET /api/0/sync/export
    SyncAPI->>ServerAPI: get_buckets()
    ServerAPI->>DB: buckets()
    DB-->>ServerAPI: bucket dict
    loop each bucket_id
        SyncAPI->>ServerAPI: "get_events(bucket_id, limit=None)"
        ServerAPI->>DB: "get(limit=-1)"
        DB-->>SyncAPI: events list
    end
    SyncAPI-->>Client: JSON export payload

    Note over Client,DB: POST /api/0/sync/import (BROKEN)
    Client->>SyncAPI: POST /api/0/sync/import (JSON)
    loop each bucket
        SyncAPI->>ServerAPI: get_bucket_metadata(bucket_id)
        Note right of ServerAPI: raises NotFound if bucket missing
        alt bucket exists
            loop each event
                SyncAPI->>ServerAPI: "get_events(bucket_id, limit=1000)"
                Note right of ServerAPI: Only checks first 1000 events
                SyncAPI->>ServerAPI: "heartbeat(bucket_id, dict, duration, timestamp=ts)"
                Note right of ServerAPI: CRASH: wrong type and args
            end
        end
    end

    Note over Client,DB: GET /api/0/sync/status
    Client->>SyncAPI: GET /api/0/sync/status
    SyncAPI->>ServerAPI: get_buckets()
    SyncAPI->>ServerAPI: get_info()
    SyncAPI-->>Client: status JSON

Reviews (1): Last reviewed commit: "feat: Add sync API endpoints for device ..."

Comment thread aw_server/sync_api.py
Comment on lines +1 to +11
"""
ActivityWatch Sync API
Allows exporting and importing bucket data between devices.
"""
import json
from datetime import datetime, timezone
from flask import Blueprint, jsonify, request, current_app

from .api import ServerAPI

sync_blueprint = Blueprint("sync", __name__, url_prefix="/api")

P0 Blueprint never registered

sync_blueprint is defined but never imported or registered in server.py. The AWFlask.__init__ method only calls self.register_blueprint(rest.blueprint); sync_blueprint is absent. All three sync endpoints are completely unreachable at runtime, so the entire feature is a no-op until this is wired up.

Comment thread aw_server/sync_api.py
Comment on lines +79 to +80
if exists:
    skipped_count += 1

P0 heartbeat() called with wrong signature — runtime crash on every import

ServerAPI.heartbeat has the signature heartbeat(bucket_id, heartbeat: Event, pulsetime: float). This call passes a plain dict (event_payload) where an Event object is required, passes duration as pulsetime (semantically unrelated — pulsetime is a merge window, not an event duration), and passes timestamp=ts which is not a parameter of heartbeat at all. Every call will raise a TypeError. The correct approach is to construct an Event object and call api.create_events(bucket_id, [event]) for bulk import, or use the existing api.import_bucket helper.
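The failure mode and one possible fix can be sketched without the server. In the snippet below, `Event`, `heartbeat`, and `event_from_export` are hypothetical stand-ins: the stub only copies the parameter list quoted above, and the exported field names (`timestamp`, `duration`, `data`) are assumed from the PR's import code rather than taken from the real model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Event:
    """Hypothetical stand-in for the server's Event model."""
    timestamp: datetime
    duration: timedelta
    data: dict = field(default_factory=dict)


def heartbeat(bucket_id: str, heartbeat: Event, pulsetime: float) -> Event:
    """Stub with the parameter list quoted in the review; it has no `timestamp`."""
    return heartbeat


def event_from_export(event_data: dict) -> Event:
    """Build an Event from one exported event dict (field names are assumed)."""
    return Event(
        timestamp=datetime.fromisoformat(event_data["timestamp"]),
        duration=timedelta(seconds=event_data.get("duration", 0)),
        data=event_data.get("data", {}),
    )


raw = {
    "timestamp": "2026-05-26T12:00:00+00:00",
    "duration": 30,
    "data": {"app": "editor"},
}

# Same call shape as the PR: the unexpected keyword argument is rejected
# before the function body ever runs.
try:
    heartbeat("bucket-1", raw["data"], raw["duration"], timestamp=raw["timestamp"])
    call_failed = False
except TypeError:
    call_failed = True

# Converting the dict first gives an object that a bulk-insert call could accept.
event = event_from_export(raw)
```

With an object like `event` in hand, the import loop could collect events per bucket and hand the list to a single bulk call, as the comment above suggests.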

Comment thread aw_server/sync_api.py
Comment on lines +53 to +56
imported_count = 0
skipped_count = 0

for bucket_id, bucket_data in data["buckets"].items():

P1 get_bucket_metadata raises NotFound, never returns None

ServerAPI.get_bucket_metadata is decorated with @check_bucket_exists, which raises a NotFound exception (HTTP 404) when the bucket is absent. It never returns a falsy value. The if not existing: branch is dead code — imports targeting a new bucket will always surface as a 404 error before the create_bucket call is reached.
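A minimal sketch of the exception-based check follows. `NotFound`, `FakeAPI`, and `ensure_bucket` are hypothetical names used for illustration only; the stub mimics the raise-on-missing behaviour described above and is not the real ServerAPI.

```python
class NotFound(Exception):
    """Hypothetical stand-in for the 404 exception raised by the server API."""


class FakeAPI:
    """Minimal in-memory stub; not the real ServerAPI."""

    def __init__(self):
        self.buckets = {}

    def get_bucket_metadata(self, bucket_id):
        # Mirrors the behaviour described in the review: raise, never return None.
        if bucket_id not in self.buckets:
            raise NotFound(bucket_id)
        return self.buckets[bucket_id]

    def create_bucket(self, bucket_id, event_type, client, hostname):
        self.buckets[bucket_id] = {
            "type": event_type,
            "client": client,
            "hostname": hostname,
        }


def ensure_bucket(api, bucket_id, meta, source_device):
    """Create the bucket only when the lookup raises NotFound.

    Returns True if a bucket was created, False if it already existed.
    """
    try:
        api.get_bucket_metadata(bucket_id)
        return False
    except NotFound:
        api.create_bucket(
            bucket_id,
            event_type=meta.get("type", "unknown"),
            client=meta.get("client", "sync"),
            hostname=meta.get("hostname", source_device),
        )
        return True


api = FakeAPI()
meta = {"type": "currentwindow"}
created_first = ensure_bucket(api, "aw-watcher-window_laptop", meta, "laptop")
created_again = ensure_bucket(api, "aw-watcher-window_laptop", meta, "laptop")
```

Catching the specific exception keeps other failures (for example a datastore error) visible instead of silently treating them as "bucket missing".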

Comment thread aw_server/sync_api.py
Comment on lines +61 to +66
api.create_bucket(
    bucket_id,
    meta.get("client", "sync"),
    meta.get("type", "unknown"),
    meta.get("hostname", source_device),
)

P1 create_bucket's second positional parameter is event_type, not client. As written, the client value is passed as event_type and the type value as client, so both are stored in the wrong DB columns for every imported bucket.

Suggested change
- api.create_bucket(
-     bucket_id,
-     meta.get("client", "sync"),
-     meta.get("type", "unknown"),
-     meta.get("hostname", source_device),
- )
+ api.create_bucket(
+     bucket_id,
+     meta.get("type", "unknown"),
+     meta.get("client", "sync"),
+     meta.get("hostname", source_device),
+ )
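
As an alternative to reordering the positional arguments, keyword arguments would make this class of mistake harder to reintroduce. The sketch below uses a hypothetical stub whose parameter order follows the description above; the keyword names are an assumption and should be checked against the real method before use.

```python
def create_bucket(bucket_id: str, event_type: str, client: str, hostname: str) -> dict:
    """Stub with the parameter order described in the review (assumed, not the real API)."""
    return {"id": bucket_id, "type": event_type, "client": client, "hostname": hostname}


meta = {"client": "aw-watcher-window", "type": "currentwindow", "hostname": "laptop"}

# Positional call in the PR's order: the client value lands in the event_type slot.
transposed = create_bucket("b1", meta["client"], meta["type"], meta["hostname"])

# Keyword call: the order in which arguments are written no longer matters.
by_keyword = create_bucket(
    "b1",
    client=meta["client"],
    event_type=meta["type"],
    hostname=meta["hostname"],
)
```

Here `transposed["type"]` holds the client name, which is the wrong-column symptom described above, while `by_keyword` stores each value under its own key.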

Comment thread aw_server/sync_api.py
Comment on lines +1 to +11
"""
ActivityWatch Sync API
Allows exporting and importing bucket data between devices.
"""
import json
from datetime import datetime, timezone
from flask import Blueprint, jsonify, request, current_app

from .api import ServerAPI

sync_blueprint = Blueprint("sync", __name__, url_prefix="/api")

P1 security Missing host_header_check — DNS rebinding vulnerability

Every route in rest.py is protected against DNS rebinding via the host_header_check decorator (applied at the Api level). The new sync_blueprint bypasses this entirely, exposing all three sync endpoints — including the full-data export — to DNS rebinding attacks from any malicious web page. The fix is to apply host_header_check as a before_request hook on sync_blueprint or decorate each view function.
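
To illustrate the idea behind a Host-header check, here is a minimal, framework-free sketch. `host_is_allowed` and its default allow-list are hypothetical; this is not the project's host_header_check, whose actual rules are not shown in this PR.

```python
def host_is_allowed(host_header, allowed_hosts=("localhost", "127.0.0.1")):
    """Return True when the Host header names a host the server expects.

    A rebinding attack reaches the server under the attacker's own domain
    name, so the Host header carries that name and is rejected here.
    """
    if not host_header:
        return False
    host = host_header.strip().lower()
    # Drop an optional ":port" suffix. Bracketed IPv6 literals are not handled.
    hostname = host.rsplit(":", 1)[0] if ":" in host else host
    return hostname in allowed_hosts


ok_local = host_is_allowed("localhost:5600")
ok_ip = host_is_allowed("127.0.0.1")
rejected = host_is_allowed("evil.example.com:5600")
```

A hook that runs before every request on the blueprint could call a function like this and return an error response when it yields False. Reusing the existing decorator is preferable to a second implementation, so that both sets of routes enforce the same rules.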

Comment thread aw_server/sync_api.py
Comment on lines +71 to +76
duration = event_data.get("duration", 0)
event_payload = event_data.get("data", {})

# Last-write-wins: check if event with same id exists
event_id = event_data.get("id")
if event_id:

P1 Deduplication fetches only 1,000 existing events for each imported event, silently re-importing the rest

get_events(bucket_id, limit=1000) is called inside the per-event loop, so for a bucket with more than 1,000 events the deduplication check only covers the first 1,000. Events beyond that window will be silently re-imported on every sync, causing duplicates. This also makes the import O(N×M) in the number of events — for large buckets this becomes very slow.
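
One way to avoid the repeated scan is to fetch the existing events once per bucket and build a set of IDs, so each membership test is constant time on average. The sketch below is illustrative: `split_new_and_known` is a hypothetical helper, and it assumes (as the PR does) that event IDs are comparable across devices.

```python
def split_new_and_known(incoming_events, existing_events):
    """Partition incoming events by whether their id already exists locally.

    Builds the id set once (O(M)), then checks each incoming event in O(1)
    on average, for O(N + M) overall instead of O(N×M).
    """
    existing_ids = {e["id"] for e in existing_events if e.get("id") is not None}
    new, known = [], []
    for event in incoming_events:
        if event.get("id") in existing_ids:
            known.append(event)
        else:
            # Events without an id cannot be matched, so they count as new.
            new.append(event)
    return new, known


existing = [{"id": 1}, {"id": 2}]
incoming = [{"id": 2}, {"id": 3}, {"data": {}}]
new_events, known_events = split_new_and_known(incoming, existing)
```

The set must be built from the full list of existing events for the bucket; building it from a capped query would reproduce the duplicate problem described above.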

Comment thread aw_server/sync_api.py
duration = event_data.get("duration", 0)
event_payload = event_data.get("data", {})

# Last-write-wins: check if event with same id exists

P2 The comment says "Last-write-wins" but the code does the opposite — it skips an event when a matching ID already exists, making it first-write-wins. This mismatch will mislead anyone trying to understand or extend the conflict-resolution logic.

Suggested change
- # Last-write-wins: check if event with same id exists
+ # First-write-wins: skip event if it already exists locally
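
The difference between the two policies can be made concrete with a small sketch. `merge_events` is a hypothetical helper operating on plain dicts keyed by event ID; it assumes each event carries an ISO 8601 `timestamp` string and is not code from the PR.

```python
from datetime import datetime


def _ts(event):
    """Parse the event's ISO 8601 timestamp (field name assumed)."""
    return datetime.fromisoformat(event["timestamp"])


def merge_events(local, incoming, policy="last-write-wins"):
    """Merge two {event_id: event} mappings under the named policy."""
    if policy not in ("last-write-wins", "first-write-wins"):
        raise ValueError(f"unknown policy: {policy}")
    merged = dict(local)
    for event_id, event in incoming.items():
        if event_id not in merged:
            merged[event_id] = event
        elif policy == "last-write-wins" and _ts(event) > _ts(merged[event_id]):
            # Replace the local copy only when the incoming one is newer.
            merged[event_id] = event
        # first-write-wins: the existing local copy is always kept.
    return merged


local = {1: {"timestamp": "2026-05-26T10:00:00+00:00", "data": {"title": "old"}}}
incoming = {
    1: {"timestamp": "2026-05-26T11:00:00+00:00", "data": {"title": "new"}},
    2: {"timestamp": "2026-05-26T11:30:00+00:00", "data": {"title": "other"}},
}

lww = merge_events(local, incoming, policy="last-write-wins")
fww = merge_events(local, incoming, policy="first-write-wins")
```

Under last-write-wins, event 1 ends up with the newer "new" title; under first-write-wins it keeps "old". Event 2 is added in both cases because it has no local counterpart.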

Comment thread aw_server/sync_api.py
Comment on lines +5 to +6
import json
from datetime import datetime, timezone

P2 import json is unused — jsonify handles all JSON serialization in this module.

Suggested change
- import json
- from datetime import datetime, timezone
+ from datetime import datetime, timezone
