The coordinator (coordinator.py) is the supervisor process that manages the
lifecycle of every springtail service running on a host. One coordinator runs
per host and is responsible for a single service type: ingestion, fdw, or
proxy.
Note: the coordinator is intended as a reference example of how to deploy Springtail end-to-end (binary install from S3, systemd-managed postgres, Redis-driven lifecycle, SNS notifications, etc.). It encodes the assumptions of one particular deployment — paths, AWS services, secret layout, systemd unit naming, and so on — and will likely need to be adjusted (or replaced) to match a different target environment.
The coordinator:
- Downloads and installs binaries from S3 (production only).
- Registers the components for its service type with the `Scheduler`.
- Starts the components in dependency order.
- Monitors components via PID checks, Redis liveness timeouts, and a Redis pub/sub channel, restarting on failure.
- Reacts to state transitions written to Redis (`coordinator_state` and, for FDW, `fdw_state`) so that operators can drive lifecycle changes from outside the host.
State and orchestration are coordinated through Redis. The coordinator reads/writes two state values that drive the four operations described in this document:
| Key | Field | Type | Values |
|---|---|---|---|
| `<db_instance_id>:coordinator_state` | `<service>:<instance_key>` | `CoordinatorState` | `startup`, `reloading`, `running`, `reload`, `shutdown`, `dead` |
| `<db_instance_id>:fdw` | `<fdw_id>` (JSON `state` field) | string | `initialize`, `running`, `draining`, `stopped` |
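For concreteness, here is a minimal sketch of reading both state values with the redis-py client. The key and field layout comes from the table above; the connection parameters mirror the `REDIS_*` env vars listed later in this document, and the lowercase string serialization of the enum is an assumption.

```python
import json
import os

import redis  # assumes the redis-py client is installed

r = redis.Redis(
    host=os.environ["REDIS_HOSTNAME"],
    port=int(os.environ["REDIS_PORT"]),
    username=os.environ.get("REDIS_USER"),
    password=os.environ.get("REDIS_PASSWORD"),
    decode_responses=True,
)

db = os.environ["DATABASE_INSTANCE_ID"]

# Coordinator state: one field per coordinator on the host,
# keyed <service>:<instance_key>.
state = r.hget(f"{db}:coordinator_state", f"fdw:{os.environ['INSTANCE_KEY']}")
print("coordinator state:", state)  # e.g. "running"

# FDW state: each <fdw_id> field holds a JSON blob whose "state"
# member moves through initialize -> running -> draining -> stopped.
raw = r.hget(f"{db}:fdw", os.environ["FDW_ID"])
if raw is not None:
    print("fdw state:", json.loads(raw)["state"])
```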
A springtail database instance consists of three service tiers, each launched on its own host(s) by an independent coordinator process:
| Service | Components started (in order) |
|---|---|
| `ingestion` | `pg_log_mgr_daemon` |
| `fdw` | `postgres` → `pg_xid_subscriber_daemon` → `pg_ddl_daemon` |
| `proxy` | `proxy` |
The coordinator is launched as:

```
python coordinator.py -c config.yaml -s <ingestion|fdw|proxy> [--debug] [--manual] [-f /path/to/postmaster.pid]
```

In production it is normally started by systemd as the `springtail-coordinator` unit; the same command runs interactively from the shell for development.
Flags:
- `-c, --config-file` — path to the local `config.yaml` (defaults to `config.yaml` in the working directory). The YAML supplies `system_json_path`, `install_dir`, the `production` flag, and log rotation settings.
- `-s, --service` — one of `ingestion`, `fdw`, or `proxy`. May be supplied via the `SERVICE_NAME` environment variable instead.
- `--debug` — verbose logging.
- `--manual` — in production mode, skip the loader self-restart so the coordinator can be driven from a shell.
- `-f, --postgres-pid-file` — explicit path to the postgres PID file (FDW service only).
`Coordinator.startup()` (in `coordinator.py`) executes:

- Read coordinator state from Redis. The state field is keyed by `<service>:<instance_key>` so each coordinator on the host has its own record.
- Install binaries (production only). If state is `STARTUP`, the `Production` helper downloads the latest `springtail-*.tgz` from `s3://<S3_BUCKET>/packages/`, validates the package's `Config Hash` in `INFO.txt` against the value Redis has for the deployment, and rsyncs it to `install_dir`. The state is then advanced to `RELOADING` and (unless `--manual` was passed) `loader.startup` is invoked to relaunch the coordinator from the freshly installed code, after which the current process exits. The loader will bring the coordinator back up under the new binary set.
- Sanity-check paths. The mount path, log path, and `<install>/bin/system` directory are verified to exist.
- Stop stragglers. `stop_daemons` kills any orphaned daemons listed in `ALL_DAEMONS` so the coordinator starts from a clean slate.
- Build the scheduler and register components. A `ComponentFactory` produces the `Component` objects for the selected service. For `fdw`, the coordinator first calls `Production.install_pgfdw` to install the `springtail_fdw` extension into the local PostgreSQL (see Host PostgreSQL prerequisites for what this assumes about the cluster), then waits for the ingestion service to be reachable (pings `XidMgrClient` and `SysTblMgrClient` until both respond) before starting postgres and the FDW daemons.
- Start components in order. `Scheduler.start_all` launches each component by ascending `startup_order` and waits for each to be running before moving on. Once everything is up, the coordinator state is set to `RUNNING`.
- Enter the monitor loop. `Scheduler.monitor_timeouts` loops every ~5 seconds:
  - Reacts to coordinator-state changes (see "Reload" and "Stopping" below).
  - Tracks per-database state changes and emits SNS notifications.
  - Reads the liveness hash in Redis and flags components whose heartbeat is older than `allowed_timeout` (40s by default).
  - Drains the `pubsub:liveness_notify` channel for failure messages from the daemons.
  - Calls `Component.is_alive()` on every registered component.
  - On any failure, calls `Scheduler.restart_all` (subject to the `MAX_FAILURES`/`FAILURE_WINDOW_THRESHOLD` rate limiter — 5 failures within 5 seconds aborts the loop and waits 5 minutes before retrying).
  - For the FDW service, also runs `_process_fdw_state_change` to honor the external "remove replica" flow described below.
Once `monitor_timeouts` exits, `shutdown_all` is invoked, the coordinator state is set to `DEAD`, and the process exits.
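As a rough illustration, the monitor loop has the following shape. This is a simplified sketch, not the actual `Scheduler.monitor_timeouts`: the `scheduler.*` method names below are placeholders for the checks listed above, and the real loop also reacts to coordinator-state changes, emits SNS notifications, and runs the FDW drain path.

```python
import time

# Approximations of the constants described above.
MONITOR_INTERVAL = 5           # seconds per monitor tick (~5s)
MAX_FAILURES = 5               # failures tolerated...
FAILURE_WINDOW_THRESHOLD = 5   # ...within this many seconds
BACKOFF_SECONDS = 300          # wait 5 minutes before retrying

def monitor_loop(scheduler):
    """Simplified shape of the monitor loop; scheduler.* methods are
    placeholders for the liveness checks described in this document."""
    failure_times: list[float] = []
    while not scheduler.shutdown_event.is_set():
        failed = (
            scheduler.stale_heartbeats()      # Redis liveness hash
            or scheduler.pubsub_failures()    # pubsub:liveness_notify
            or scheduler.dead_components()    # Component.is_alive()
        )
        if failed:
            now = time.monotonic()
            failure_times = [t for t in failure_times
                             if now - t < FAILURE_WINDOW_THRESHOLD]
            failure_times.append(now)
            if len(failure_times) >= MAX_FAILURES:
                time.sleep(BACKOFF_SECONDS)   # rate limiter tripped
                failure_times.clear()
            scheduler.restart_all()
        time.sleep(MONITOR_INTERVAL)
```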
To pick up a new build without manually restarting, set the coordinator state to `RELOAD`:

```python
props.set_coordinator_state(CoordinatorState.RELOAD)
```

The monitor loop's `_check_coordinator_state` will:
- Call `shutdown_all` on every registered component.
- Re-run `install_binaries` (and `install_pgfdw` for the FDW service).
- Call `restart_all` to bring the components back under the new binaries.
- Set the state back to `RUNNING`.
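When no `Properties` instance is handy, the same transition can be driven with a raw Redis client from anywhere that can reach the deployment's Redis. A minimal sketch, assuming the `CoordinatorState` values are stored as the lowercase strings shown in the state table:

```python
import redis  # redis-py

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Raw-Redis equivalent of props.set_coordinator_state(CoordinatorState.RELOAD).
# Key and field follow the state table above; the lowercase value is an
# assumption about how CoordinatorState is serialized.
r.hset(
    "mydb:coordinator_state",  # <db_instance_id>:coordinator_state
    "fdw:host-b",              # <service>:<instance_key> (placeholder)
    "reload",
)
```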
Springtail "replica nodes" are FDW hosts: instances running the fdw service
that expose a PostgreSQL endpoint backed by the springtail foreign data
wrapper. Each FDW host is identified by a unique FDW_ID and has its own
config record under the Redis hash <db_instance_id>:fdw.
The coordinator does not provision new hosts itself — that is the responsibility of the controller / infrastructure layer. From the coordinator's perspective, adding a replica is simply "stand up a new host and start an FDW coordinator on it." See Host PostgreSQL prerequisites first for what each FDW host must have in place before the coordinator can run; the flow below assumes those prerequisites are met.
The FDW coordinator does not install or initialize PostgreSQL — it expects a specific cluster to already be running on the host. The local-cluster ansible at `local-cluster/opt/springtail/helpers/customize-pg.yml` is the canonical reference for how that cluster is laid down; the requirements below summarize what the coordinator itself actually depends on.
1. The custom (patched) PostgreSQL 16 build. Springtail does not run on stock PostgreSQL. The FDW host must run the patched PG 16 fork (currently `postgresql-16.9`, distributed as `springtail-pg.tar.gz`), which adds the RLS support for foreign tables that the `springtail_fdw` extension relies on. The reference build is in `docker/ansible/roles/custom-pg/tasks/main.yml`: it downloads the tarball from `custom_pg_package_url`, configures with `--prefix=/usr/lib/postgresql/16` (with `--bindir` / `--datadir` / `--libdir` under that prefix), and runs `make install-world`.

   `install_pgfdw` and `PostgresComponent` discover the install layout at runtime via `pg_config --sharedir`, `--pkglibdir`, `--bindir`, and `--version`, so the `pg_config` first on the coordinator's `PATH` must resolve to the custom build. If a stock `pg_config` shadows it, the FDW extension files will be copied into the wrong cluster.
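A quick way to confirm the right build wins on `PATH` is to interrogate `pg_config` directly. A small check sketch, assuming only that `pg_config` is on `PATH`:

```python
import shutil
import subprocess

# Print what the pg_config first on PATH resolves to; expect paths under
# the custom prefix (/usr/lib/postgresql/16), not a stock install.
pg_config = shutil.which("pg_config")
print("pg_config:", pg_config)

for flag in ("--version", "--bindir", "--sharedir", "--pkglibdir"):
    out = subprocess.run([pg_config, flag],
                         capture_output=True, text=True, check=True)
    print(f"{flag:>12}  {out.stdout.strip()}")
```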
2. Cluster naming driven by the FDW superuser. The data directory, systemd unit, OS owner, and environment file are all named after the `fdw_superuser` username pulled from AWS Secrets Manager (secret `sk/{org_id}/{account_id}/aws/dbi/{db_instance_id}/primary_db_password`, read via `Properties.get_role(DB_USER_ROLE_FDW)`). Throughout this section `{fdw_user}` is that username and `{pg_version}` is the PG major version returned by `pg_config --version` (currently 16):

   | What | Path / name |
   |---|---|
   | Data directory | `/var/lib/postgresql/{pg_version}/{fdw_user}` |
   | `pg_hba.conf` | `/var/lib/postgresql/{pg_version}/{fdw_user}/pg_hba.conf` |
   | `postmaster.pid` | `/var/lib/postgresql/{pg_version}/{fdw_user}/postmaster.pid` (overridable via the `-f` flag) |
   | Environment file | `/etc/postgresql/{pg_version}/{fdw_user}/environment` |
   | systemd unit | `postgresql-{fdw_user}.service` |
   The OS user `{fdw_user}` must exist and have passwordless sudo — the coordinator runs `sudo cp`, `sudo systemctl start/stop ...`, and `sudo -u {fdw_user} psql ...` for liveness and connection-count probes. The maintenance database that `is_alive` / `get_connection_count` connect to comes from `PGDATABASE` (defaults to `__springtail` in the env-driven path, `postgres` when properties are loaded from a `config.yaml`) and must already exist in the cluster.
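To make the naming scheme concrete, the sketch below derives every path in the table from `{fdw_user}` and `pg_config --version`. It is illustrative only: the coordinator computes these paths internally, and the username passed in here is hypothetical.

```python
import re
import subprocess

def fdw_cluster_paths(fdw_user: str) -> dict:
    """Derive the per-cluster paths from the naming scheme above."""
    version_line = subprocess.run(
        ["pg_config", "--version"], capture_output=True, text=True, check=True
    ).stdout  # e.g. "PostgreSQL 16.9"
    pg_version = re.search(r"(\d+)", version_line).group(1)

    data_dir = f"/var/lib/postgresql/{pg_version}/{fdw_user}"
    return {
        "data_dir": data_dir,
        "pg_hba": f"{data_dir}/pg_hba.conf",
        "postmaster_pid": f"{data_dir}/postmaster.pid",
        "env_file": f"/etc/postgresql/{pg_version}/{fdw_user}/environment",
        "systemd_unit": f"postgresql-{fdw_user}.service",
    }

print(fdw_cluster_paths("springtail_fdw_user"))  # hypothetical username
```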
3. What `install_pgfdw` mutates on every coordinator start/reload. It does not create the cluster, but it does modify it on each invocation (see `python/coordinator/production.py:201`):

   - Copies the FDW extension into the cluster's `sharedir/extension` (`springtail_fdw--1.0.sql`, `springtail_fdw.control`) and the shared library into `pkglibdir` as `springtail_fdw.so`.
   - Removes any stale `libspringtail_pgext.so` from the springtail install's `shared-lib` directory.
   - Rewrites the local-auth line in `pg_hba.conf` to `scram-sha-256`.
   - Writes the springtail runtime env vars (the `ENV_VARS` list in `production.py` — Redis, AWS, mount, FDW, `LD_LIBRARY_PATH`) into the postgres environment file so the systemd unit picks them up.
   - Stops postgres, so the scheduler can start it cleanly under the refreshed environment in the next step of the startup sequence.
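The `pg_hba.conf` rewrite is the most invasive of these mutations, so here is a sketch of what "rewrites the local-auth line to `scram-sha-256`" amounts to. This is an approximation of the behavior, not the actual `install_pgfdw` code, which runs under sudo and may match lines differently:

```python
from pathlib import Path

def rewrite_local_auth(pg_hba_path: str) -> None:
    """Switch the auth method on "local" lines to scram-sha-256.
    Sketch only; the real rewrite lives in production.py."""
    path = Path(pg_hba_path)
    lines = []
    for line in path.read_text().splitlines():
        fields = line.split()
        if fields and fields[0] == "local":
            fields[-1] = "scram-sha-256"  # last column is the auth method
            line = "\t".join(fields)
        lines.append(line)
    path.write_text("\n".join(lines) + "\n")
```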
- Provision an FDW config record. Add the new FDW to `system.json` under `fdws` (or insert it directly into Redis at `<db_instance_id>:fdw` and `<db_instance_id>:fdw_ids`). Newly added FDW configs are written with `state = "initialize"` (see `Properties._load_redis`); a raw-Redis sketch of this step follows the list.
- Provision the host. Bring up an EC2 instance (or equivalent) with the custom PostgreSQL cluster configured per Host PostgreSQL prerequisites and the springtail environment variables set. The coordinator requires:
  - `SERVICE_NAME=fdw`
  - `FDW_ID=<the new fdw id>`
  - `INSTANCE_KEY` (used as part of the coordinator-state record key)
  - The standard Redis / AWS / mount-path env vars (`REDIS_HOSTNAME`, `REDIS_PORT`, `REDIS_USER`, `REDIS_PASSWORD`, `REDIS_USER_DATABASE_ID`, `REDIS_CONFIG_DATABASE_ID`, `ORGANIZATION_ID`, `ACCOUNT_ID`, `DATABASE_INSTANCE_ID`, `LUSTRE_DNS_NAME`, `LUSTRE_MOUNT_NAME`, `MOUNT_POINT`, `SNS_TOPIC_ARN`, `AWS_REGION`).
- Start the coordinator on the new host. Either via `systemctl start springtail-coordinator` (production) or directly: `python coordinator.py -c config.yaml -s fdw`

  The coordinator will run the standard FDW startup sequence:
  - Download the binaries from S3 and install them locally (`Production.install_binaries`).
  - Install the `springtail_fdw` extension into the host's PostgreSQL and rewrite `pg_hba.conf` / the postgres environment file (`Production.install_pgfdw`).
  - Wait for the ingestion service to be reachable.
  - Start `postgres`, `pg_xid_subscriber_daemon`, and `pg_ddl_daemon`.
  - Set the coordinator state to `RUNNING`.

  The FDW daemons themselves transition the FDW config's `state` field away from `initialize` (toward `running`) once they have synchronized their state.
- Route traffic. Once the new FDW is healthy, the proxy can begin routing connections to it. The proxy reads its target list from Redis, so no coordinator-side action is needed beyond the new FDW reaching `running`.
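As referenced in the first step, here is a minimal sketch of provisioning the config record directly in Redis. Only the `state` field is documented above; the hash/set types and any other JSON members are assumptions that must match your deployment's `system.json` schema.

```python
import json

import redis  # redis-py

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

db = "mydb"          # <db_instance_id> (placeholder)
fdw_id = "fdw-003"   # the new FDW's id (placeholder)

# Write the config record with the documented initial state. Any other
# members of the JSON blob depend on your system.json schema.
r.hset(f"{db}:fdw", fdw_id, json.dumps({"state": "initialize"}))

# Register the id; assumes <db_instance_id>:fdw_ids is a Redis set.
r.sadd(f"{db}:fdw_ids", fdw_id)
```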
There is no "add replica" RPC inside the coordinator: each FDW is added by
spinning up a new host and pointing a new coordinator at the right
FDW_ID.
Removal is the case the coordinator itself implements explicitly, in `Scheduler._process_fdw_state_change` (called every monitor iteration when the service type is `fdw`). The flow is operator-driven via the FDW `state` field:
- Operator marks the FDW as draining. Externally (controller, admin tool, or `Properties.set_fdw_state('draining')`), set the FDW config's `state` to `draining` in Redis at `<db_instance_id>:fdw[<fdw_id>]`.
- Coordinator detects the draining state. On its next monitor tick the FDW coordinator reads the FDW config (`get_fdw_config(nocache=True)`), sees `state == 'draining'`, and enters the drain path.
- Wait for connections to drain. The coordinator polls `PostgresComponent.get_connection_count()` (which runs `select count(client_port) from pg_stat_activity where client_port != -1`) every 5 seconds and only proceeds once it returns 0. While waiting, the proxy is responsible for steering new connections away from this FDW.
- Shut down all components. `Scheduler.shutdown_all` stops `pg_ddl_daemon`, `pg_xid_subscriber_daemon`, and `postgres` in reverse startup order.
- Wait for `stopped`. The coordinator calls `Properties.wait_for_fdw_state('stopped', 30)`. The expectation is that the operator (or controller) advances the FDW state to `stopped` once they have confirmed the host is quiesced. If the wait times out, the coordinator forces the state to `stopped` itself.
- Clean up Redis state. The coordinator removes everything in Redis that this FDW owned (sketched after this list):
  - `DELETE <db_instance_id>:queue:ddl:fdw:<fdw_id>` (DDL work queue).
  - `HDEL` matching members of `<db_instance_id>:hash:ddl:fdw` whose key ends in `:<fdw_id>`.
  - `HDEL` matching members of `<db_instance_id>:fdw_min_xids` whose key starts with `<fdw_id>:`.
  - `SREM` matching members of `<db_instance_id>:fdw_pids` whose value starts with `<fdw_id>:`.
- Stay idle. The coordinator sets `services_stopped = True`. The monitor loop continues running but skips its component checks (`IDLE_SLEEP_INTERVAL = 5s`), so the host can be safely terminated by the controller. To fully stop the coordinator, follow the "Stopping Springtail" flow below.
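For reference, the cleanup step translates to the following redis-py operations. This is a sketch built from the key patterns listed above, with placeholder ids:

```python
import redis  # redis-py

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
db, fdw_id = "mydb", "fdw-003"  # placeholders

# DELETE the DDL work queue for this FDW.
r.delete(f"{db}:queue:ddl:fdw:{fdw_id}")

# HDEL hash members whose key ends in :<fdw_id>.
stale = [k for k in r.hkeys(f"{db}:hash:ddl:fdw") if k.endswith(f":{fdw_id}")]
if stale:
    r.hdel(f"{db}:hash:ddl:fdw", *stale)

# HDEL hash members whose key starts with <fdw_id>:.
stale = [k for k in r.hkeys(f"{db}:fdw_min_xids") if k.startswith(f"{fdw_id}:")]
if stale:
    r.hdel(f"{db}:fdw_min_xids", *stale)

# SREM set members whose value starts with <fdw_id>:.
stale = [m for m in r.smembers(f"{db}:fdw_pids") if m.startswith(f"{fdw_id}:")]
if stale:
    r.srem(f"{db}:fdw_pids", *stale)
```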
The FDW config record itself is left in Redis — the controller is responsible for removing it from `system.json` / `<db_instance_id>:fdw` once the host is gone if the FDW should not be re-used.
There are two ways to stop a coordinator and its components, depending on whether the stop should be initiated locally (process signal) or remotely (Redis state).
The coordinator installs handlers for `SIGINT` and `SIGTERM` (`make_signal_handler` in `coordinator.py`). Either signal:

- Sets `coordinator.shutdown_event`, which causes the startup wait loops (e.g. `_wait_for_ingestion`) to exit early.
- Calls `Scheduler.shutdown()`, which sets the scheduler's `shutdown_event` and breaks `monitor_timeouts` out of its loop.
- Once the loop exits, `Scheduler.shutdown_all` shuts down components in reverse startup order (`shutdown` first, then `kill` if the graceful shutdown times out).
- The Redis pub/sub subscription and connection are closed, and the coordinator state is set to `DEAD`.
- After `Coordinator.startup()` returns, `coordinator.py`'s `__main__` calls `stop_daemons` for `ALL_DAEMONS` to be sure no orphans are left, then sends an SNS `shutdown` notification (production only).
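The handler pattern itself is small. A minimal sketch of the shape described above; the real handler in `coordinator.py` also triggers `Scheduler.shutdown()` rather than only setting an event:

```python
import signal
import threading

shutdown_event = threading.Event()

def make_signal_handler(event: threading.Event):
    """Both signals funnel into one handler that just sets an event,
    which the startup wait loops and monitor loop check."""
    def handler(signum, frame):
        event.set()
    return handler

signal.signal(signal.SIGINT, make_signal_handler(shutdown_event))
signal.signal(signal.SIGTERM, make_signal_handler(shutdown_event))
```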
In production, this is normally done with:

```
sudo systemctl stop springtail-coordinator
```

To stop a running coordinator from outside the host, write `SHUTDOWN` to Redis:

```python
props.set_coordinator_state(CoordinatorState.SHUTDOWN)
```

On the next monitor tick `_check_coordinator_state` matches `SHUTDOWN` and calls `Scheduler.shutdown()`, which then follows the same teardown path as the signal-driven case (reverse-order component shutdown, pub/sub close, state set to `DEAD`).
To stop springtail across all hosts:
- Drain proxies first by stopping their coordinators (signal or Redis-state). With proxies down, no new client traffic can reach FDW hosts.
- Drain each FDW host using the "Removing Replica Nodes" flow (`set_fdw_state('draining')`), which gracefully waits for in-flight connections, shuts down `pg_ddl_daemon`, `pg_xid_subscriber_daemon`, and `postgres`, and cleans up the per-FDW Redis state. Then stop each FDW coordinator (signal or `coordinator_state = SHUTDOWN`).
- Stop the ingestion coordinator (signal or `coordinator_state = SHUTDOWN`); this brings down `pg_log_mgr_daemon`.
When every coordinator has reached the `DEAD` state in `<db_instance_id>:coordinator_state`, the instance is fully stopped.