Handshake state machine
A short story of how the SDK keeps your widget in sync with Grist, and how to lean on it when the default status (booting / ready / …) is not granular enough.
If you're happy with <GristBoundary> and useGrist().status, skip this page — those already work great. Read this if you've ever wanted to:
- Show a distinct UI while column mappings are being configured.
- React to a degrading network (stale→lost) before the widget fully fails.
- Gate write actions on "do I actually have write access right now", not just "did the handshake complete".
- Implement a custom retry / backoff that knows what generation of the state it's looking at.
The five orthogonal axes
The SDK models the widget ↔ Grist relationship as five independent state machines running in parallel rather than one big enum. The public status is a derived projection over all five — but each axis carries strictly more information.
┌──────────────────────────────────────────────────────────────┐
│ GristWidgetSnapshot │
│ │
│ lifecycle : idle → detecting → negotiating → online │
│ ↓ │
│ terminated (not_embedded | script_timeout | │
│ negotiate_failed | manual_stop) │
│ │
│ link : unknown ↔ connected ↔ stale ↔ lost │
│ │
│ authz : { requested, granted } — widens monotonically │
│ │
│ config : { requiredAccess, declared, mappings: pending │
│ / resolved / unreported / invalidated } │
│ │
│ sync : record / records / options / newRecord freshness│
│ + currentTable state │
└──────────────────────────────────────────────────────────────┘
Each axis transitions independently: for example, the link can go connected → stale → lost without the lifecycle leaving online. The top-level status reflects the worst-case across axes, but the snapshot gives you the full truth.
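One way to picture the "worst case across axes" projection is as a maximum over per-axis severities. The sketch below is only an illustration, not the SDK's implementation: `MiniSnapshot`, `projectStatus`, the severity ranking, and the per-axis rules for the lifecycle are assumptions, and it models just two of the five axes.

```typescript
// Hypothetical, simplified model of two axes; not the SDK's real types.
type LifecyclePhase = "idle" | "detecting" | "negotiating" | "online" | "terminated";
type LinkState = "unknown" | "connected" | "stale" | "lost";
type Status = "booting" | "ready" | "error";

interface MiniSnapshot {
  lifecycle: { phase: LifecyclePhase };
  link: { state: LinkState };
}

// Assumed ranking: a higher number is "worse".
const SEVERITY: Record<Status, number> = { ready: 0, booting: 1, error: 2 };

// Assumption: anything before online counts as booting, terminated as error.
function lifecycleStatus(phase: LifecyclePhase): Status {
  if (phase === "terminated") return "error";
  return phase === "online" ? "ready" : "booting";
}

// Only lost escalates to error; stale keeps the widget running.
function linkStatus(state: LinkState): Status {
  return state === "lost" ? "error" : "ready";
}

// The public status is the worst status contributed by any axis.
function projectStatus(s: MiniSnapshot): Status {
  const parts: Status[] = [lifecycleStatus(s.lifecycle.phase), linkStatus(s.link.state)];
  return parts.reduce((worst, x) => (SEVERITY[x] > SEVERITY[worst] ? x : worst));
}
```

The projection loses information by design: online with a stale link and online with a connected link both come out as ready, which is why the snapshot is worth reading when you need the difference.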
Capabilities, the actionable booleans
Reading the snapshot directly is fine for telemetry. For UI gating, prefer the derived capabilities — they fold the five axes into boolean answers to the questions widgets actually ask:
import { useGristCapabilities } from "grist-widget-sdk/advanced"
function App() {
const { canRead, canRender, canWriteRecords, canWriteSchema } = useGristCapabilities()
if (!canRender) return <Skeleton />
return canWriteRecords ? <Editable /> : <ReadOnly />
}
The capability chain is conjunctive:
canWriteSchema ⇒ canWriteRecords ⇒ canRender ⇒ canRead
If canWriteRecords is true, all the strictly weaker capabilities are guaranteed to be true too. That property is exercised by the property-based tests (fast-check) on every CI run.
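The chain can also be checked mechanically. Here is a minimal sketch, assuming a plain object carrying the four flags named above; the `Capabilities` interface and the helper names are made up for illustration, and this is not the SDK's fast-check suite.

```typescript
// Hypothetical shape holding the four capability flags.
interface Capabilities {
  canRead: boolean;
  canRender: boolean;
  canWriteRecords: boolean;
  canWriteSchema: boolean;
}

// Logical implication: a ⇒ b is equivalent to (!a || b).
const implies = (a: boolean, b: boolean): boolean => !a || b;

// True when every stronger capability implies the next weaker one.
function chainHolds(c: Capabilities): boolean {
  return (
    implies(c.canWriteSchema, c.canWriteRecords) &&
    implies(c.canWriteRecords, c.canRender) &&
    implies(c.canRender, c.canRead)
  );
}

// Enumerate all 16 flag combinations and keep those the chain allows.
function validCombinations(): Capabilities[] {
  const bools = [false, true];
  const out: Capabilities[] = [];
  for (const canRead of bools)
    for (const canRender of bools)
      for (const canWriteRecords of bools)
        for (const canWriteSchema of bools) {
          const c: Capabilities = { canRead, canRender, canWriteRecords, canWriteSchema };
          if (chainHolds(c)) out.push(c);
        }
  return out;
}
```

Only five of the sixteen combinations survive (nothing, read only, read plus render, and so on up to everything), so a UI never has to handle a case like "can write records but cannot render".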
Lifecycle phases in practice
const { snapshot } = useGristHandshake()
switch (snapshot.lifecycle.phase) {
case "idle": return null // before any detection
case "detecting": return <Connecting /> // polling for grist
case "negotiating": return <Connecting label="Negotiating" />// grist.ready in-flight
case "online": return <App /> // happy path
case "terminated":
switch (snapshot.lifecycle.reason.kind) {
case "not_embedded": return <OpenInGristNotice />
case "script_timeout": return <ScriptTimeoutNotice />
case "negotiate_failed": return <NegotiateFailNotice err={snapshot.lifecycle.reason.cause} />
case "manual_stop": return null
}
}
terminated is absorbing: nothing moves the lifecycle out of it within the same generation. The way out is a reload. <GristBoundary>'s default reload() button calls the manager's reload(), which bumps the generation and re-enters detecting.
The link and the heartbeat
When the lifecycle is online, the manager runs a heartbeat in the background — by default a grist.docApi.getDocName() every 30 seconds — and feeds the result into the link axis:
| link.state | Meaning |
|---|---|
| unknown | No probe issued yet (just turned online, or pause-on-hidden). |
| connected | Last probe (or natural RPC) succeeded. |
| stale | Missed probes ≥ staleAfterMissed (default 2). |
| lost | Missed probes ≥ lostAfterMissed (default 4). |
lost escalates to the top-level status === "error". stale does not — the widget keeps running, but hasFreshSelection may drop to false to reflect that the record stream is going cold.
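The threshold rules in the table can be read as a small pure function. This is a sketch under assumptions: the function and type names are hypothetical, the miss counter is taken to mean consecutive misses, and a single miss below staleAfterMissed is treated as still connected, which the table does not spell out.

```typescript
type LinkState = "unknown" | "connected" | "stale" | "lost";

interface LinkThresholds {
  staleAfterMissed: number;
  lostAfterMissed: number;
}

// Defaults as documented on this page.
const DEFAULT_THRESHOLDS: LinkThresholds = { staleAfterMissed: 2, lostAfterMissed: 4 };

// Hypothetical helper: classify the link from probe counters.
function linkStateFor(
  probesIssued: number,
  consecutiveMissed: number,
  t: LinkThresholds = DEFAULT_THRESHOLDS,
): LinkState {
  if (probesIssued === 0) return "unknown"; // no probe issued yet
  if (consecutiveMissed >= t.lostAfterMissed) return "lost";
  if (consecutiveMissed >= t.staleAfterMissed) return "stale";
  return "connected"; // assumption: one miss is not yet stale
}
```

With the default 30-second interval and these defaults, a dead link would surface as stale after roughly a minute and as lost after roughly two.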
Auto-coalescence
Every successful Grist RPC the SDK issues through useGristWrites(), useGrist().fetchTable(…), useGrist().setCursorPosition(…), etc. is reported to the manager. The heartbeat then skips its next scheduled probe because natural traffic already proved the link healthy.
The practical effect: chatty widgets cost zero extra requests; quiet widgets still get a baseline 30 s health check.
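To make the coalescing idea concrete, here is a minimal sketch of a heartbeat that decides on each scheduled tick whether a probe is still needed. `createHeartbeat` and `shouldProbe` are hypothetical names, and time is passed in explicitly to keep the example deterministic. The SDK's actual scheduler may work differently, for instance by skipping exactly one probe per reported success rather than comparing timestamps.

```typescript
// Hypothetical coalescing heartbeat; timestamps are in milliseconds.
function createHeartbeat(intervalMs: number = 30_000) {
  // Time of the most recent successful RPC, probe or natural traffic.
  let lastSuccessAt = Number.NEGATIVE_INFINITY;
  return {
    // Report any successful RPC so the next probe can be skipped.
    recordRpcSuccess(nowMs: number): void {
      lastSuccessAt = nowMs;
    },
    // Called on each scheduled tick: probe only if the link has been
    // silent for at least one full interval.
    shouldProbe(nowMs: number): boolean {
      return nowMs - lastSuccessAt >= intervalMs;
    },
  };
}
```

A widget that writes every few seconds keeps resetting the timestamp, so `shouldProbe` stays false; a quiet widget falls through to the baseline probe.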
Talking to Grist outside the SDK
If you call grist.* directly, feed the manager yourself:
const { recordRpcSuccess, recordRpcFailure } = useGristHandshakeContext()
try {
const out = await grist.docApi.applyUserActions(actions)
recordRpcSuccess()
return out
} catch (err) {
recordRpcFailure()
throw err
}
This is rarely necessary: the SDK's slice hooks already cover the common paths.
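If several direct calls need this treatment, the try/catch can be factored into one helper. The sketch below assumes only that the two callbacks behave as in the snippet above; `withLinkReporting` and `LinkReporter` are hypothetical names, not part of the SDK.

```typescript
// The pair of callbacks obtained from the handshake context.
interface LinkReporter {
  recordRpcSuccess: () => void;
  recordRpcFailure: () => void;
}

// Hypothetical helper: run any direct grist.* call and report the
// outcome to the manager, preserving the result or the thrown error.
async function withLinkReporting<T>(
  reporter: LinkReporter,
  call: () => Promise<T>,
): Promise<T> {
  try {
    const out = await call();
    reporter.recordRpcSuccess();
    return out;
  } catch (err) {
    reporter.recordRpcFailure();
    throw err; // the caller still sees the original failure
  }
}
```

Assuming `reporter` is the pair returned by useGristHandshakeContext(), the direct call above shrinks to withLinkReporting(reporter, () => grist.docApi.applyUserActions(actions)).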
Mappings, beyond columnMappingStatus
useGrist().columnMappingStatus collapses the mapping config into the familiar { ok, pending, missing, emptyMultiples } shape. The snapshot gives you the underlying state machine if you need it:
const { snapshot } = useGristHandshake({ columns: [{ name: "Title" }, { name: "Done" }] })
switch (snapshot.config.mappings.state) {
case "not_declared": // we didn't declare columns at all
case "pending": // declared, waiting for the host
case "resolved": // host reported mappings — read snapshot.config.mappings.state2
case "unreported": // 5 s elapsed with no report — host probably old/buggy
case "invalidated": // we received a mapping update that wiped previous ones
}
The unreported state is the main reason to look at this: it lets you show a friendly "the Grist host didn't report column mappings" hint without falling back to a hard error.
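A rough sketch of how the 5-second rule and the resulting hint could fit together follows. Everything here is illustrative: `resolvePending`, `mappingHint`, and the hint wording are hypothetical, and the SDK tracks this state for you, so a widget would normally implement only the second function.

```typescript
type MappingsState = "not_declared" | "pending" | "resolved" | "unreported" | "invalidated";

// Assumed timeout rule: a pending declaration becomes "unreported"
// once timeoutMs elapse without the host reporting mappings.
function resolvePending(
  elapsedMs: number,
  hostReported: boolean,
  timeoutMs: number = 5_000,
): MappingsState {
  if (hostReported) return "resolved";
  return elapsedMs >= timeoutMs ? "unreported" : "pending";
}

// Map each state to a soft, non-fatal hint (or null for no hint).
function mappingHint(state: MappingsState): string | null {
  switch (state) {
    case "pending":
      return "Waiting for column mappings…";
    case "unreported":
      return "The Grist host didn't report column mappings.";
    case "invalidated":
      return "Column mappings changed; please check the widget configuration.";
    default:
      return null; // not_declared and resolved need no hint
  }
}
```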
Generation: how stale callbacks are dropped
Every snapshot carries an integer generation. Every effect dispatch carries the generation it was scheduled in. The reducer drops actions whose action.generation is older than the current snapshot's. The manager bumps generation on reload() / restart().
This means a grist.ready() from the prior generation that resolves 4 seconds after the user clicked "Retry" can never flip the lifecycle to online: its action carries the old generation and is silently dropped.
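The drop rule is easy to see in a toy reducer. This is a simplified sketch, not the SDK's reducer: `MiniSnapshot`, `MiniAction`, and the two action types are hypothetical, and only two lifecycle phases are modelled.

```typescript
// Hypothetical minimal snapshot: a generation plus two lifecycle phases.
interface MiniSnapshot {
  generation: number;
  phase: "detecting" | "online";
}

type MiniAction =
  | { type: "ready_resolved"; generation: number } // carries its scheduling generation
  | { type: "reload" };

function reduce(s: MiniSnapshot, a: MiniAction): MiniSnapshot {
  if (a.type === "reload") {
    // A reload bumps the generation and re-enters detecting.
    return { generation: s.generation + 1, phase: "detecting" };
  }
  if (a.generation < s.generation) {
    return s; // stale action from an older generation: dropped silently
  }
  return { ...s, phase: "online" };
}
```

Returning the same snapshot object for a dropped action means subscribers that compare by reference see no change at all.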
If you write your own effects (custom RPC layer, optimistic retry…), follow the same discipline: capture snapshot.generation before sending, and ignore the result if the generation has changed by the time the call returns.
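For custom effects, that discipline might look like the wrapper below. It is a sketch under assumptions: `runGuarded` and `Guarded` are hypothetical names, and `currentGeneration` stands in for however your code reads the latest snapshot.generation.

```typescript
// Result of a guarded effect: either stale, or fresh with a value.
type Guarded<T> = { stale: true } | { stale: false; value: T };

// Hypothetical wrapper: capture the generation before sending and
// discard the result if the generation moved while the call was in flight.
async function runGuarded<T>(
  currentGeneration: () => number,
  effect: () => Promise<T>,
): Promise<Guarded<T>> {
  const startedIn = currentGeneration(); // capture before sending
  const value = await effect();
  if (currentGeneration() !== startedIn) {
    return { stale: true }; // a reload happened meanwhile: ignore the result
  }
  return { stale: false, value };
}
```

Reading the generation through a function rather than a captured value matters here: a value captured in a closure would never observe the bump.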
Choosing your provider
| Provider | Slice hooks driven by FSM? | Capabilities exposed? | Heartbeat coalesces SDK RPCs? |
|---|---|---|---|
| <GristWidgetProvider> (default) | yes | implicitly via status | yes |
| <GristHandshakeProvider> (advanced) | no (no slice hooks) | yes (via useGristHandshakeContext) | yes (for the manager you own) |
| Both, nested | yes (legacy provider's manager drives slices) | yes (each provider has its own manager) | yes for both |
Pick one in practice:
- Most widgets: <GristWidgetProvider> + <GristBoundary> is enough. Add useGristCapabilities() inside if you need finer gating.
- Pure handshake: drop <GristWidgetProvider> and use <GristHandshakeProvider> + your own subscriptions. Useful for the smallest possible widgets.
- Custom retry / observability: <GristWidgetProvider> for the slices, and useGristHandshake() standalone for the observable snapshot; both share the page-level grist.ready singleton so there's no double handshake.
See the API reference for the full type contract.