ADR-018: Plugin Lifecycle, Watchers, and Developer Experience

Status: Accepted
Date: 2026-05-03
Deciders: QNTX Core Team
Related: ADR-003 (Plugin Communication), ADR-004 (Plugin-Pulse Integration)

Context

Plugins are separate processes that communicate with QNTX over gRPC. The lifecycle — boot, initialization, health monitoring, restart, shutdown — has implicit contracts that caused real bugs: double initialization, stale connections, silent watcher failures. This ADR documents the full plugin lifecycle and the watcher subsystem as canonical references.

Problems observed

  1. Double initialization. ForceInitialize bypassed initOnce without consuming it. The HTTP routing lazy-init then called Initialize again, causing plugins to start their work twice and tear down what they just built.

  2. WebSocket ping-pong is undocumented. Plugins implementing HandleWebSocket must read the incoming gRPC stream and respond to PING messages with PONG. If they ignore the stream (the natural first instinct), QNTX logs "WebSocket pong timeout" and the connection dies.

  3. Watcher lifecycle is implicit. The full path — InitializeResponse.watchers → DB → engine → ExecuteJob — is spread across four files. A plugin developer sees only the proto field and ExecuteJob.

  4. Predicate matching rules are undocumented. Matching semantics (exact, OR, rate limiting) are only discoverable by reading engine.go.
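
The first problem above comes down to two initialization paths that do not share one guard. The following is a minimal sketch of the idea, not the QNTX implementation: the proxy type, its methods, and the call counter are hypothetical stand-ins, and the only assumption taken from the text is that initOnce behaves like a Go sync.Once. Routing the forced path through the same Once means the later lazy call from HTTP routing becomes a no-op.

```go
package main

import (
	"fmt"
	"sync"
)

// proxy is a hypothetical stand-in for the per-plugin proxy.
type proxy struct {
	initOnce  sync.Once
	initCalls int // counts how often the (simulated) Initialize RPC ran
}

// initialize stands in for the Initialize RPC; here it only counts calls.
func (p *proxy) initialize() { p.initCalls++ }

// EnsureInitialized is the lazy path, as used by HTTP routing.
func (p *proxy) EnsureInitialized() { p.initOnce.Do(p.initialize) }

// ForceInitialize is the eager path. It goes through the same Once,
// so it consumes the guard instead of bypassing it.
func (p *proxy) ForceInitialize() { p.initOnce.Do(p.initialize) }

func main() {
	p := &proxy{}
	p.ForceInitialize()   // eager init at boot
	p.EnsureInitialized() // lazy init on first HTTP request: no second call
	fmt.Println("Initialize calls:", p.initCalls)
}
```

Because a restart builds a new proxy, the new instance starts with an unconsumed Once and initializes exactly once again.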

Decision

Document the plugin lifecycle and watcher system as first-class concepts.

Plugin Lifecycle

Binary launch          gRPC connect          Initialize RPC         Health poll (10s)
     |                      |                      |                      |
  process starts        Metadata()           Initialize()            Health()
  binds port            validates name       plugin starts work      monitors liveness
  prints QNTX_PLUGIN_PORT                    returns watchers,       restarts on 2
                                             routes, handlers        consecutive failures

Boot sequence

  1. QNTX launches the plugin binary and waits for it to bind a port
  2. gRPC connection established, Metadata() called to validate plugin identity
  3. Initialize(InitializeRequest) sent with config, ATS endpoint, auth token
  4. Plugin returns InitializeResponse with watchers, routes, handlers, schedules
  5. QNTX registers watchers in DB, sets up HTTP proxy routes, registers async handlers
  6. Health polling begins (every 10s)

Initialize contract

Restart

A restart is triggered by POST /api/plugins/{name}/restart or by running make install from a plugin directory.

  1. QNTX unregisters the plugin from the registry (state → Restarting). Browser requests receive a 503 response ("Plugin still loading") while the restart is in progress.
  2. Mux cache invalidated so the next request after restart builds a fresh HTTP proxy.
  3. Active ATS streams cancelled to release the database mutex.
  4. Old process killed (SIGKILL to process group), old gRPC connection closed.
  5. New binary discovered, launched, gRPC connected — new proxy with fresh initOnce.
  6. registerRestarted: re-registers in registry, calls Initialize, re-registers handlers/schedules/providers, marks ready.
  7. Banner emitted with health status after async health check.

A restart always produces a new process, new gRPC connection, new proxy. The single PluginManager is reused — LoadPluginsFromConfig receives the existing manager, no second manager is created.
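
Step 1 of the sequence above can be pictured as a small guard in front of the plugin's HTTP proxy. This is a sketch under stated assumptions: the registry type, the state names, and the guard function are hypothetical, and treating every non-ready state the same way is a simplification made here, not something the source specifies. Only the 503 status and the "Plugin still loading" text come from this ADR.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

type pluginState int

const (
	stateUnknown pluginState = iota // zero value: plugin not registered
	stateReady
	stateRestarting
)

// registry is a hypothetical, minimal plugin state table.
type registry struct {
	mu     sync.RWMutex
	states map[string]pluginState
}

func newRegistry() *registry { return &registry{states: map[string]pluginState{}} }

func (r *registry) setState(name string, s pluginState) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.states[name] = s
}

// guard answers 503 unless the named plugin is ready (assumption: all
// non-ready states are handled alike), otherwise it forwards the request.
func (r *registry) guard(name string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		r.mu.RLock()
		s := r.states[name]
		r.mu.RUnlock()
		if s != stateReady {
			http.Error(w, "Plugin still loading", http.StatusServiceUnavailable)
			return
		}
		next.ServeHTTP(w, req)
	})
}

// okHandler stands in for the real proxy to the plugin.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
}

// probe sends one GET through h and returns the status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	reg := newRegistry()
	h := reg.guard("demo", okHandler())
	reg.setState("demo", stateReady)
	fmt.Println("ready:", probe(h))
	reg.setState("demo", stateRestarting)
	fmt.Println("restarting:", probe(h))
}
```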

Health polling

Shutdown

Watcher Lifecycle

Plugin                          QNTX Core                        Watcher Engine
  |                                |                                  |
  |-- InitializeResponse -------->|                                  |
  |   (watchers: [...])           |                                  |
  |                               |-- SetupPluginWatchers() -------->|
  |                               |   (write to DB, idempotent)      |
  |                               |                                  |
  |                               |-- ReloadWatchers() ------------>|
  |                               |   (load from DB into memory)     |
  |                               |                                  |
  |                               |   attestation arrives            |
  |                               |                                  |
  |                               |   <-- predicate match --------  |
  |                               |                                  |
  |<-- ExecuteJob(attestation) ---|                                  |
  |   (handler_name routes it)    |                                  |
  |                                                                  |
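
On the plugin side, the last arrow in the diagram amounts to a lookup by handler_name. The sketch below illustrates that routing with simplified, hypothetical types: the job struct, the string payload, and the dispatcher map are not the generated gRPC types, which this ADR does not show.

```go
package main

import (
	"errors"
	"fmt"
)

// job is a hypothetical, simplified view of an ExecuteJob request.
type job struct {
	HandlerName string
	Payload     string // stand-in for the matched attestation
}

type handlerFunc func(payload string) error

var errUnknownHandler = errors.New("unknown handler_name")

// dispatcher maps handler_name values to plugin-side handlers.
type dispatcher map[string]handlerFunc

// ExecuteJob routes the job to the handler registered under its name.
func (d dispatcher) ExecuteJob(j job) error {
	h, ok := d[j.HandlerName]
	if !ok {
		return fmt.Errorf("%w: %q", errUnknownHandler, j.HandlerName)
	}
	return h(j.Payload)
}

func main() {
	d := dispatcher{
		"on-commit": func(p string) error {
			fmt.Println("handling:", p)
			return nil
		},
	}
	fmt.Println(d.ExecuteJob(job{HandlerName: "on-commit", Payload: "attestation-1"}))
	fmt.Println(d.ExecuteJob(job{HandlerName: "missing", Payload: "attestation-2"}))
}
```

Returning an error for an unknown name surfaces a mismatch between the handler_name declared in a watcher and the handlers the plugin actually registers.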

WatcherRegistration fields

| Field | Required | Description |
|---|---|---|
| id | yes | Unique suffix. Core prefixes with plugin name: {plugin}-{id} |
| handler_name | yes | Which ExecuteJob handler receives the match |
| predicates | yes | Attestation predicates to match (exact match) |
| contexts | no | Additional context filters |
| max_fires_per_second | no | Rate limit. 0 = zero fires allowed (QNTX LAW: zero means zero). Default: 0 |

Validation

A watcher must declare at least one filter dimension. The storage layer (validateWatcher) rejects watchers with no structural filters (subjects, predicates, contexts, actors), no temporal filters (time_start, time_end), no attribute filters, no ax_query, and no semantic_query. Empty-filter watchers — which would match every attestation — cannot be created.
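
The rule above can be sketched as a single check over all filter dimensions. This is an illustration, not the actual validateWatcher: the watcher struct, its field types, and the error value are assumptions, with field names taken from the filters listed in the paragraph.

```go
package main

import (
	"errors"
	"fmt"
)

// watcher is a hypothetical shape holding the filter dimensions named above.
type watcher struct {
	Subjects, Predicates, Contexts, Actors []string          // structural
	TimeStart, TimeEnd                     string            // temporal
	Attributes                             map[string]string // attribute filters
	AxQuery                                string
	SemanticQuery                          string
}

var errNoFilters = errors.New("watcher must declare at least one filter")

// validateWatcherSketch rejects watchers that would match every attestation.
func validateWatcherSketch(w watcher) error {
	hasStructural := len(w.Subjects)+len(w.Predicates)+len(w.Contexts)+len(w.Actors) > 0
	hasTemporal := w.TimeStart != "" || w.TimeEnd != ""
	hasAttributes := len(w.Attributes) > 0
	hasQuery := w.AxQuery != "" || w.SemanticQuery != ""
	if hasStructural || hasTemporal || hasAttributes || hasQuery {
		return nil
	}
	return errNoFilters
}

func main() {
	fmt.Println(validateWatcherSketch(watcher{}))
	fmt.Println(validateWatcherSketch(watcher{Predicates: []string{"committed"}}))
}
```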

Predicate matching

UI signal: ⏿

The watcher symbol ⏿ appears inline next to predicates that have active watchers. One eye per watcher. Hover shows watcher names. Eye color follows dilation — bright spice-blue when strained, deep sea blue when relaxed, faded white when the watcher has never fired.

Hot-swap behavior

Watchers survive plugin restart. On every Initialize:

WebSocket keepalive contract

QNTX sends PING messages on the gRPC stream and expects PONG responses. This tells the plugin whether a browser client is still connected.

Plugins must read the incoming gRPC stream and reply to PING with PONG (echo the timestamp). Spawn a reader task or thread that checks the message type and responds accordingly.

Failure to respond causes QNTX to log "WebSocket pong timeout" and drop the connection. The keepalive interval and timeout are configurable in am.toml under [plugin.websocket.keepalive]:

[plugin.websocket.keepalive]
ping_interval_secs = 30
pong_timeout_secs = 60
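
The reader task described above could look roughly like the following. The message and stream types are hypothetical stand-ins for the generated gRPC types, which this ADR does not show; only the behavior is taken from the text: read every incoming message and answer PING with a PONG that echoes the timestamp.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

type msgType int

const (
	msgData msgType = iota
	msgPing
	msgPong
)

// wsMessage is a hypothetical stand-in for the WebSocket stream message.
type wsMessage struct {
	Type      msgType
	Timestamp int64
	Data      []byte
}

// wsStream is the subset of a bidirectional stream this sketch needs.
type wsStream interface {
	Recv() (*wsMessage, error)
	Send(*wsMessage) error
}

// readLoop drains the stream until EOF, replying to every PING with a PONG.
func readLoop(s wsStream, onData func([]byte)) error {
	for {
		m, err := s.Recv()
		if errors.Is(err, io.EOF) {
			return nil // peer closed the stream cleanly
		}
		if err != nil {
			return err
		}
		switch m.Type {
		case msgPing:
			pong := &wsMessage{Type: msgPong, Timestamp: m.Timestamp}
			if err := s.Send(pong); err != nil {
				return err
			}
		case msgData:
			onData(m.Data)
		}
	}
}

// fakeStream replays queued messages and records replies, for demonstration.
type fakeStream struct{ in, out []*wsMessage }

func (f *fakeStream) Recv() (*wsMessage, error) {
	if len(f.in) == 0 {
		return nil, io.EOF
	}
	m := f.in[0]
	f.in = f.in[1:]
	return m, nil
}

func (f *fakeStream) Send(m *wsMessage) error { f.out = append(f.out, m); return nil }

func main() {
	s := &fakeStream{in: []*wsMessage{
		{Type: msgData, Data: []byte("hello")},
		{Type: msgPing, Timestamp: 42},
	}}
	done := make(chan error, 1)
	go func() { done <- readLoop(s, func(b []byte) { fmt.Printf("data: %s\n", b) }) }()
	fmt.Println("reader exited with:", <-done)
	fmt.Println("replies sent:", len(s.out))
}
```

Running the loop in its own goroutine keeps the plugin's write path free, so outgoing data and keepalive replies do not block each other.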

Error flow

When things go wrong, QNTX emits:

| Log message | Meaning | Plugin action |
|---|---|---|
| Failed to parse AX query for watcher | Watcher predicate is malformed | Fix the predicate string in WatcherRegistration |
| gRPC ExecuteJob failed | Plugin returned an error from ExecuteJob | Check plugin-side handler logic |
| Max retries exceeded, giving up | ExecuteJob failed repeatedly | Check plugin health, logs |
| WebSocket pong timeout | Plugin ignores incoming WebSocket stream | Read the incoming stream and reply to PING with PONG |
| Failed to setup plugin watchers | DB write failed during Initialize | Check DB connectivity |

Consequences

Positive

Negative

Related