Webhook Receiver
Canvas FHIR bridge. Bidirectional. Receives Canvas webhooks on one side, pushes back to Canvas on the other. The only service with Canvas webhooks pointed at it.
31c7367.
841 tests passing (535 unit + 302 integration + 9 real-Postgres-skipped). First egress handler is patient.archived → Patient.active = false push to Canvas. Additional handlers land per unit as we build them out.
Overview
Webhook-receiver is the Canvas-platform boundary. It does two jobs:
- Ingress. Canvas POSTs FHIR events (e.g. MedicationRequest.activated, Appointment.booked) to our public endpoint. We HMAC-verify, persist verbatim, normalize into our internal event envelope, and emit to EventBridge.
- Egress. Our internal services emit events like patient.archived. An SQS-backed CanvasSyncWorker consumes these and writes back to Canvas via canvas-client, with strong idempotency and typed-error-based disposition.
Status
| Field | Value |
|---|---|
| Wave | A4 — Canvas bridge |
| Shipped | 2026-04-21 |
| Commit | 31c7367 |
| Tests | 841 (535 unit + 302 integration + 9 real-Postgres-skipped) |
| Ingress endpoint | POST /webhooks/canvas (public, HMAC-authed) |
| Egress dispatcher | CanvasSyncWorker, SQS consumer |
| Phase 1 egress handlers | patient.archived → Patient.active=false |
Ingress: Canvas → us
The POST /webhooks/canvas endpoint is the sole Canvas-facing entrypoint.
Flow
- Canvas POSTs to /webhooks/canvas with header X-Canvas-Signature.
- We read the raw body (before any JSON parsing).
- We INSERT a canvas_webhook_events row immediately, before deciding whether the signature is valid. This is a deliberate forensic choice: even invalid payloads are captured.
- HMAC verify via @yourera/canvas-client's verifyCanvasWebhook. signature_valid is persisted as the result.
- If signature_valid = false: log a structured warn (CloudWatch Metric Filter hook) and return 401. Do NOT dispatch. The row is retained for forensics.
- If valid but malformed JSON: persist, log, return 400. No dispatch.
- Otherwise: parse, normalize into the canonical envelope, INSERT an outbox row with the normalized event, update canvas_webhook_events.dispatched_outbox_id to link them.
- Return 200. Outbox worker ships the event to EventBridge.
Every request, valid or not, is logged to canvas_webhook_events. This gives ops a complete record when Canvas changes a payload
schema or an attacker probes the endpoint.
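The branch order of the ingress flow can be sketched as follows. This is a minimal illustration, not the actual verifyCanvasWebhook implementation: the HMAC-SHA256 algorithm, the hex-encoded signature header, and the names verifySignatureSketch and decideIngress are all assumptions. Persistence of the canvas_webhook_events row is omitted; in the real flow it happens before either function would be called.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical stand-in for verifyCanvasWebhook. The algorithm (SHA-256) and
// the header encoding (hex) are assumptions, not confirmed Canvas behavior.
export function verifySignatureSketch(
  rawBody: Buffer,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  if (!signatureHeader) return false;
  // Sign the raw bytes exactly as received, before any JSON parsing.
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

export type IngressOutcome = { status: 200 | 400 | 401; dispatch: boolean };

// Mirrors the documented order: signature check first, then JSON parsing.
export function decideIngress(rawBody: Buffer, signatureValid: boolean): IngressOutcome {
  if (!signatureValid) return { status: 401, dispatch: false };
  try {
    JSON.parse(rawBody.toString("utf8"));
  } catch {
    return { status: 400, dispatch: false };
  }
  return { status: 200, dispatch: true };
}
```

Comparing with timingSafeEqual rather than === avoids leaking how many leading bytes of a forged signature were correct.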
Event naming
Canvas webhooks carry a FHIR resource type and an action. We emit internal events as
canvas.{resource_snake}.{action}:
| Canvas event | Emitted as |
|---|---|
| MedicationRequest activated | canvas.medication_request.activated |
| MedicationDispense completed | canvas.medication_dispense.completed |
| Appointment booked | canvas.appointment.booked |
| Patient updated | canvas.patient.updated |
| Task completed | canvas.task.completed |
Consumers filter via EventBridge rules on source=canvas + detail-type. No fan-out
is done at the emitter side — webhook-receiver just publishes; rules route.
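The naming rule in the table can be expressed as a small pure function. This is a sketch that assumes the Canvas event name arrives as "ResourceType.action" (as in the MedicationRequest.activated example); the function names toSnake and toInternalEventName are hypothetical.

```typescript
// Converts a PascalCase FHIR resource type to snake_case,
// e.g. "MedicationRequest" becomes "medication_request".
export function toSnake(resourceType: string): string {
  return resourceType.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// Maps an assumed "ResourceType.action" Canvas name to canvas.{resource_snake}.{action}.
export function toInternalEventName(canvasEventName: string): string {
  const dot = canvasEventName.indexOf(".");
  if (dot <= 0 || dot === canvasEventName.length - 1) {
    throw new Error(`Unexpected Canvas event name: ${canvasEventName}`);
  }
  const resource = canvasEventName.slice(0, dot);
  const action = canvasEventName.slice(dot + 1);
  return `canvas.${toSnake(resource)}.${action}`;
}
```

For instance, toInternalEventName("MedicationDispense.completed") yields "canvas.medication_dispense.completed", matching the second row of the table.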
Egress: us → Canvas
Internal events (e.g. patient.archived) fan out via EventBridge. An SQS queue owned by
webhook-receiver has an EventBridge rule subscription for the events that need to write back to
Canvas. The CanvasSyncWorker consumes the queue.
Flow
- Worker polls SQS. Receives a message with the envelope.
- Computes an idempotency key (see Idempotency Key).
- UPSERT into canvas_sync_attempts keyed by the idempotency key. If existing row is terminal (succeeded, failed, abandoned), ack and move on.
- Atomically bump attempts++ via SQL expression (race-safe; two workers cannot double-count).
- Dispatch to the appropriate handler. For Phase 1, the only handler is patient.archived → canvas-client.patient.update(id, { active: false }).
- Disposition based on the result (see Error Disposition).
Only patient.archived is wired today. Additional handlers (e.g. Canvas task assignment on
physician membership grant, MedicationRequest cancellation on saga abort) land per unit in subsequent
waves.
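The worker steps can be sketched with an in-memory ledger standing in for canvas_sync_attempts. Everything here is illustrative: the Envelope shape, the processMessage name, and the Map-based ledger are assumptions, and the real worker performs the upsert and the attempts bump in SQL rather than in application memory. Handler failures propagate to the caller and are not classified in this sketch.

```typescript
type Status = "pending" | "succeeded" | "failed" | "abandoned";

interface Attempt {
  idempotencyKey: string;
  status: Status;
  attempts: number;
}

// Hypothetical envelope shape; the real internal envelope is not shown in this document.
interface Envelope {
  name: string;
  canvasPatientId: string;
}

type Handler = (envelope: Envelope) => Promise<void>;

const TERMINAL = new Set<Status>(["succeeded", "failed", "abandoned"]);

export async function processMessage(
  ledger: Map<string, Attempt>,
  handlers: Record<string, Handler>,
  envelope: Envelope,
  key: string,
): Promise<"skipped" | "succeeded" | "unhandled"> {
  // Upsert: create the ledger row on first sight of this idempotency key.
  let row = ledger.get(key);
  if (!row) {
    row = { idempotencyKey: key, status: "pending", attempts: 0 };
    ledger.set(key, row);
  }
  // A terminal row means the work is already decided: ack and move on.
  if (TERMINAL.has(row.status)) return "skipped";
  // In-memory stand-in for the SQL-side "attempts = attempts + 1".
  row.attempts += 1;
  const handler = handlers[envelope.name];
  if (!handler) return "unhandled";
  await handler(envelope);
  row.status = "succeeded";
  return "succeeded";
}
```

A redelivered message with the same key hits the terminal check and returns "skipped" without invoking the handler a second time.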
Database Schema
canvas_webhook_events (ingress log)
Immutable audit log of every Canvas webhook we received, valid or not.
| Column | Type | Notes |
|---|---|---|
| id | uuid | PK. |
| received_at | timestamptz | Request arrival. |
| signature_valid | boolean | Result of HMAC verify. |
| raw_body | bytea | Verbatim bytes. Contains PHI. Access gated to superadmin. |
| signature_header | text | The signature string we received. |
| resource_type | text | Parsed from body if valid JSON; else null. |
| resource_id | text | Parsed. Null if unparseable. |
| canvas_event_name | text | e.g. MedicationRequest.activated. |
| canvas_event_timestamp | timestamptz | Canvas-side event time. |
| dispatched_outbox_id | uuid | Null if not dispatched (e.g. invalid signature). |
canvas_sync_attempts (egress ledger)
Per-egress-work-unit ledger with idempotency key.
| Column | Type | Notes |
|---|---|---|
| id | uuid | PK. |
| idempotency_key | text | Unique index. See below. |
| internal_event_name | text | e.g. patient.archived. |
| target_resource_type | text | Canvas resource we're updating. |
| target_resource_id | text | Canvas resource id. |
| status | enum | pending / succeeded / failed / abandoned. |
| attempts | int | Incremented atomically via SQL expression. |
| canvas_response_status | int | Last Canvas HTTP status. Operational metadata, no PHI. |
| last_error | text | Class + short message. No payload. |
| created_at | timestamptz | |
| updated_at | timestamptz | |
Also present: outbox and audit_events, shared-pattern tables.
HTTP Endpoints
Public. Canvas ingress. HMAC auth only (no JWT). Always logs verbatim, even on invalid signature or malformed JSON. See Ingress for the full flow.
Superadmin only. Contains PHI in raw_body. List + filter.
Superadmin only. Full verbatim payload for incident review.
Support + superadmin (M-5). Operational metadata only — no PHI. Support can read to triage stuck syncs.
Support + superadmin. Single attempt detail.
Superadmin only. Flips failed or abandoned back to
pending and kicks worker.tick() best-effort. Used when a Canvas-side fix
(e.g. unstuck resource) makes a previously terminal error now recoverable.
Superadmin only. Synchronous outbox tick for incident response.
Superadmin only. Synchronous egress tick for incident response.
Event Contracts
Defined in @yourera/contracts/events/canvas.ts. The canonical envelope for every Canvas-origin
event we emit:
import { z } from "zod";

export const canvasEventEnvelopeSchema = z.object({
  webhookEventId: z.string().uuid(),
  canvasEventName: z.string(), // e.g. "MedicationRequest.activated"
  resourceType: z.string(), // e.g. "MedicationRequest"
  resourceId: z.string(),
  canvasEventTimestamp: z.string().datetime(),
  receivedAt: z.string().datetime(),
  resource: z.unknown(), // the FHIR resource body as received
});
Consumers use EventBridge rules with detail-type = canvas.{resource_snake}.{action}.
For example, orchestrator subscribes to canvas.medication_request.activated; pharmacy-router
subscribes to canvas.medication_dispense.completed.
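A consumer-side rule for these events might use an event pattern like the one below. The source and detail-type keys are standard EventBridge pattern fields, and the "canvas" source value comes from the Event naming section; the helper names patternFor and matchesPattern are hypothetical, and the matcher is a simplified local check for tests, not a reimplementation of EventBridge matching.

```typescript
// The two EventBridge pattern fields used for routing Canvas-origin events.
interface CanvasEventPattern {
  source: string[];
  "detail-type": string[];
}

// Builds a pattern that matches the given internal event names from source "canvas".
export function patternFor(...detailTypes: string[]): CanvasEventPattern {
  return { source: ["canvas"], "detail-type": detailTypes };
}

// Simplified local matcher: each pattern field must list the event's value.
export function matchesPattern(
  pattern: CanvasEventPattern,
  event: { source: string; "detail-type": string },
): boolean {
  return (
    pattern.source.indexOf(event.source) !== -1 &&
    pattern["detail-type"].indexOf(event["detail-type"]) !== -1
  );
}

// What an orchestrator-side rule could look like under these assumptions.
export const orchestratorPattern = patternFor("canvas.medication_request.activated");
```

Serialized with JSON.stringify, orchestratorPattern has the shape EventBridge expects for a rule's event pattern.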
Idempotency Key
The egress idempotency key is deterministic from (internal event name, target resource id, intent).
For patient.archived the shape is:
canvas-sync:patient.archived:canvasPatientId:<canvasPatientId>:active=false
The canvas_sync_attempts.idempotency_key column has a unique index. Two concurrent workers
receiving the same SQS message (or a re-delivery after an ack timeout) collide on INSERT and short-circuit.
attempts is incremented via SQL expression (UPDATE … SET attempts = attempts + 1),
never read-modify-write at the application level. This matters when SQS visibility-timeout expires and two
workers pick up the same message.
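The key derivation and the two SQL statements it relies on can be sketched as follows. The key format is taken from the shape shown above; the function name patientArchivedKey and the exact SQL text are assumptions (Postgres syntax, with the id column assumed to have a database-side default), not the service's actual queries.

```typescript
// Deterministic key for the patient.archived egress intent.
export function patientArchivedKey(canvasPatientId: string): string {
  return `canvas-sync:patient.archived:canvasPatientId:${canvasPatientId}:active=false`;
}

// Assumed insert: a concurrent duplicate collides on the unique index
// and becomes a no-op instead of raising an error.
export const INSERT_ATTEMPT_SQL = `
INSERT INTO canvas_sync_attempts
  (idempotency_key, internal_event_name, target_resource_type, target_resource_id, status, attempts)
VALUES ($1, $2, $3, $4, 'pending', 0)
ON CONFLICT (idempotency_key) DO NOTHING`;

// Assumed bump: the increment happens inside the database, so two workers
// racing on the same row cannot both write the same stale value.
export const BUMP_ATTEMPTS_SQL = `
UPDATE canvas_sync_attempts
SET attempts = attempts + 1, updated_at = now()
WHERE idempotency_key = $1
RETURNING attempts`;
```

Because the key contains only the event name, the target id, and the intent, two deliveries of the same logical work always derive the same string regardless of which worker handles them.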
Error Disposition
The CanvasSyncWorker branches on the error class returned from canvas-client:
| Error | Disposition | Reason |
|---|---|---|
| CanvasNotFoundError | Terminal — status=failed + ACK SQS | Target doesn't exist. Retrying will never succeed. |
| CanvasValidationError | Terminal — status=failed + ACK SQS | Payload is wrong. Bug to fix, not retry. |
| CanvasConflictError | Terminal — status=failed + ACK SQS | Concurrent write won. Human must reconcile. |
| CanvasAuthError | Terminal — status=failed + ACK SQS | Credentials broken. Page ops. Retrying doesn't help. |
| CanvasTransientError | Leave status=pending, do NOT ACK SQS | SQS redelivers after visibility timeout. |
| CanvasRateLimitError | Leave status=pending, change SQS visibility to retryAfterSeconds | Respect Canvas backoff guidance. |
| Canvas 5xx after client retries | Leave status=pending, do NOT ACK SQS | Transient; SQS redelivers. |
A failed attempt does not retry automatically. Ops reviews via
/canvas-sync-attempts, fixes the root cause, then calls /canvas-sync-attempts/:id/retry
to flip back to pending.
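The disposition table maps naturally onto an instanceof chain. The sketch below defines local stand-in error classes so it is self-contained; the real classes live in canvas-client and may differ in shape. The Disposition type, the dispositionFor name, and the choice to treat unrecognized errors as retryable are all assumptions, since the table does not cover that case.

```typescript
// Local stand-ins for the canvas-client error classes (assumed shapes).
class CanvasError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Keeps instanceof working when compiled to ES5 targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class CanvasNotFoundError extends CanvasError {}
class CanvasValidationError extends CanvasError {}
class CanvasConflictError extends CanvasError {}
class CanvasAuthError extends CanvasError {}
class CanvasTransientError extends CanvasError {}
class CanvasRateLimitError extends CanvasError {
  retryAfterSeconds: number;
  constructor(message: string, retryAfterSeconds: number) {
    super(message);
    this.retryAfterSeconds = retryAfterSeconds;
  }
}

type Disposition =
  | { kind: "terminal"; status: "failed"; ackSqs: true }
  | { kind: "retry"; status: "pending"; ackSqs: false; visibilityTimeoutSeconds?: number };

export function dispositionFor(err: unknown): Disposition {
  if (
    err instanceof CanvasNotFoundError ||
    err instanceof CanvasValidationError ||
    err instanceof CanvasConflictError ||
    err instanceof CanvasAuthError
  ) {
    // Retrying cannot help: mark failed and ACK so SQS stops redelivering.
    return { kind: "terminal", status: "failed", ackSqs: true };
  }
  if (err instanceof CanvasRateLimitError) {
    // Stay pending and push the redelivery out by the backoff Canvas asked for.
    return { kind: "retry", status: "pending", ackSqs: false, visibilityTimeoutSeconds: err.retryAfterSeconds };
  }
  // CanvasTransientError and, by assumption, anything unrecognized:
  // leave pending and let the SQS visibility timeout trigger redelivery.
  return { kind: "retry", status: "pending", ackSqs: false };
}
```

Returning a plain value instead of performing the side effects keeps the branching testable without SQS or a database.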
Testing
841 tests:
- 535 unit — ingress parsing, signature verification hooks, envelope normalization, error-disposition branching, idempotency key derivation.
- 302 integration — end-to-end ingress and egress with canvas-client stubbed; outbox and attempts ledger verified.
- 9 real-Postgres-skipped — run locally for the atomic attempts-counter update, idempotency-key unique-index collision behavior, and concurrent worker races.
TODOs & Future Waves
- More egress handlers. Phase 1 is only patient.archived. Forthcoming: physician membership grants routed to Canvas Task assignments; saga-abort MedicationRequest cancellation; consent revocation pushback.
- Ingress fan-out to orchestrator. canvas.medication_request.activated needs to kick the saga. The subscriber lives in orchestrator, not here; webhook-receiver only emits.
- Dead-letter visibility. SQS DLQ is configured; a first-class admin UI for DLQ inspection is pending.
- Canvas-side subscription management. Webhooks are configured in the Canvas console today. Programmatic registration via FHIR Subscription resource is a Wave C item; requires canvas-client to add the wrapper.
- PHI retention policy for canvas_webhook_events. Currently retained indefinitely. Need legal-approved retention + pruning job.
- CloudWatch alarms. The structured warn on signature-failure has a metric filter hook wired; the alarm threshold + routing is a follow-up.