Solo Operator Runbook

This page is the operating procedure for a controlled GiveCare pilot while one person is reviewing the system. It is internal-facing: it defines what the operator checks, when to intervene, and when to pause enrollment.

Operating Posture

GiveCare is SMS support for caregivers. Mira can remember context, help sort one next step, check in, and point to resources or benefits worth checking. Mira is not an emergency service, clinician, lawyer, benefits adjudicator, or care coordinator.

During a solo pilot, keep enrollment small enough that one operator can review every exception the same day. If the queue cannot be reviewed reliably, pause new enrollment.

Daily Checks

Check these at least once per active pilot day:

Area | Admin action | What to look for
Production flow | productionFlow | failed outbound messages, stuck jobs, fallback traces, human-review traces
Safety queue | listSafetyEvents | open critical or high-severity events
Review queue | listReviewQueue | blocked model replies, benefits certainty, resource issues, clinical/legal boundary flags
Population health | listOrgs, getOrgPopulationHealth | enrollment, consent rate, score coverage, pressure zones, open loops, opt-outs
Caregiver trace | getCaregiverByPhone, getCaregiverTraces, getCaregiverMemory | whether Mira is using memory correctly and staying inside boundaries
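
One way to make sure no area is skipped on a given day is to track which admin actions have been run. The sketch below is illustrative only: the action names come from the table above, but the `DAILY_CHECKS` mapping and the `unchecked_areas` helper are hypothetical and not part of the admin tooling.

```python
# Areas and admin action names are taken from the Daily Checks table.
# The mapping and helper are a hypothetical tracking aid, not an existing API.
DAILY_CHECKS = {
    "Production flow": ["productionFlow"],
    "Safety queue": ["listSafetyEvents"],
    "Review queue": ["listReviewQueue"],
    "Population health": ["listOrgs", "getOrgPopulationHealth"],
    "Caregiver trace": [
        "getCaregiverByPhone",
        "getCaregiverTraces",
        "getCaregiverMemory",
    ],
}


def unchecked_areas(completed_actions: set[str]) -> list[str]:
    """Return areas where at least one admin action has not been run today."""
    return [
        area
        for area, actions in DAILY_CHECKS.items()
        if not set(actions) <= completed_actions
    ]
```

An area counts as checked only when every action listed for it has been run, so a partial caregiver-trace review still shows up as outstanding.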

Safety Event Procedure

  1. Open the safety event and recent caregiver traces.
  2. Confirm whether the deterministic safety response was sent.
  3. If there is imminent danger, confirm that the SMS response routed the caregiver to emergency/crisis resources; do not attempt to manage the emergency yourself.
  4. If a human follow-up is appropriate, send a short message that reinforces safety and encourages local/emergency support.
  5. Resolve the safety event only after the record has an outcome note.
  6. If multiple safety events appear or review falls behind, pause the org or public signup path.
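
Steps 5 and 6 can be expressed as simple guards. The following is a minimal sketch under stated assumptions: the `SafetyEvent` fields, the function names, and the `max_open` threshold of one unresolved event are all hypothetical, chosen only to illustrate the rule, and do not describe the real schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SafetyEvent:
    # Hypothetical record shape; the real safety event schema may differ.
    event_id: str
    severity: str  # e.g. "critical" or "high"
    deterministic_response_sent: bool
    outcome_note: Optional[str] = None
    resolved: bool = False


def resolve_safety_event(event: SafetyEvent) -> SafetyEvent:
    """Step 5: refuse to resolve an event that has no outcome note."""
    if not (event.outcome_note and event.outcome_note.strip()):
        raise ValueError(
            f"event {event.event_id}: record an outcome note before resolving"
        )
    event.resolved = True
    return event


def should_pause(
    events: list[SafetyEvent], review_is_behind: bool, max_open: int = 1
) -> bool:
    """Step 6: pause when several events are unresolved or review is behind.

    max_open is an assumed threshold; "multiple" is read here as more than one.
    """
    unresolved = [e for e in events if not e.resolved]
    return review_is_behind or len(unresolved) > max_open
```

Raising an error instead of silently resolving keeps the outcome note mandatory rather than a convention the operator has to remember.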

Operator messages should be short, factual, and boundary-safe. Do not diagnose, promise availability, or imply human clinical monitoring.

Review Queue Procedure

Use the review queue for messages blocked by evaluator policy or routed to human review.

  1. Read the triggering trace, model draft, flags, and recent context.
  2. Choose one outcome: approve with no send, send a safer operator reply, or mark resolved with no reply.
  3. If replying, keep it to one useful next step or one clarifying question.
  4. Resolve the queue item with the reason.

Common reasons not to send a model draft: claims of legal or benefits certainty, unsupported resource claims, medical advice, excessive length, or too many steps.

Failed Delivery Procedure

If production flow shows failed outbound messages:

  1. Check whether failures are isolated or systemic.
  2. Confirm Twilio status if available.
  3. For isolated failures, leave the caregiver record intact and retry only if the failure reason is transient.
  4. For systemic failures, pause proactive outreach until delivery recovers.
  5. Do not manually re-send multiple texts to the same caregiver without checking recent outbound history.
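
The isolated-versus-systemic decision in steps 1 to 4 can be sketched as a small classifier. Everything here is an assumption for illustration: the `OutboundAttempt` shape, the transient reason strings, and the thresholds (25% failure rate across at least three caregivers) are hypothetical and should be replaced with values that match the actual delivery data.

```python
from dataclasses import dataclass


@dataclass
class OutboundAttempt:
    # Hypothetical shape for one outbound delivery attempt.
    caregiver_id: str
    failed: bool
    reason: str = ""


# Assumed labels for transient failures; not real provider error codes.
TRANSIENT_REASONS = {"timeout", "carrier_unavailable", "rate_limited"}


def classify_failures(
    attempts: list[OutboundAttempt],
    systemic_rate: float = 0.25,
    min_caregivers: int = 3,
) -> str:
    """Return "healthy", "isolated", or "systemic" for a batch of attempts."""
    failed = [a for a in attempts if a.failed]
    if not failed:
        return "healthy"
    rate = len(failed) / len(attempts)
    affected = len({a.caregiver_id for a in failed})
    if rate >= systemic_rate and affected >= min_caregivers:
        return "systemic"
    return "isolated"


def may_retry(attempt: OutboundAttempt) -> bool:
    """Step 3: retry only failed attempts whose reason is transient."""
    return attempt.failed and attempt.reason in TRANSIENT_REASONS
```

Requiring both a failure rate and a minimum number of affected caregivers avoids treating one caregiver with several failed messages as a systemic outage.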

Enrollment And Pilot Health

For each partner cohort, review weekly:

  • enrolled caregivers
  • consent granted / pending / revoked
  • active caregivers in the last 7 days
  • inbound and outbound message counts
  • failed or pending outbound messages
  • assessment completion and score coverage
  • top pressure zones
  • open caregiver loops
  • suggested benefit/resource opportunities
  • safety and review events

This is a manual report until a partner-facing dashboard is intentionally released.
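
While the report is manual, two of the figures above are easy to compute from exported data. This is a rough sketch, not an existing export: the input dictionaries, the function names, and the definition of "active" as having sent an inbound message within the last 7 days are all assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta


def consent_breakdown(consent_by_caregiver: dict[str, str]) -> dict[str, int]:
    """Count caregivers by consent status (granted / pending / revoked)."""
    counts = Counter(consent_by_caregiver.values())
    return {s: counts.get(s, 0) for s in ("granted", "pending", "revoked")}


def active_last_7_days(
    last_inbound_at: dict[str, datetime], now: datetime
) -> list[str]:
    """Return caregiver ids whose last inbound message is within 7 days.

    "Active" is assumed to mean inbound activity; adjust if the pilot
    defines it differently.
    """
    cutoff = now - timedelta(days=7)
    return sorted(cid for cid, ts in last_inbound_at.items() if ts >= cutoff)
```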

Data Requests

For deletion requests, use the admin wipe path only after confirming the caregiver identity and recording the request outside the caregiver record. For export requests, collect conversation, assessment, score, memory, and screening data manually until an explicit export workflow exists.

Pause Conditions

Pause enrollment or the affected org when any of these are true:

  • safety or review queues cannot be checked that day
  • outbound delivery is failing broadly
  • repeated evaluator fallbacks indicate a broken model or prompt release
  • a partner cohort exceeds the solo review capacity
  • public copy or partner expectations no longer match actual system behavior
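
The pause conditions above amount to an any-of check. A minimal sketch follows, assuming a hypothetical `PilotStatus` record that the operator fills in by hand; none of these field names exist in the system, and judgments such as "failing broadly" remain the operator's call.

```python
from dataclasses import dataclass


@dataclass
class PilotStatus:
    # Hypothetical daily snapshot, filled in by the operator.
    queues_checked_today: bool
    delivery_failing_broadly: bool
    repeated_evaluator_fallbacks: bool
    cohort_size: int
    solo_review_capacity: int
    copy_matches_behavior: bool


def pause_reasons(status: PilotStatus) -> list[str]:
    """Return every pause condition that currently holds (empty = keep going)."""
    reasons = []
    if not status.queues_checked_today:
        reasons.append("safety or review queues not checked today")
    if status.delivery_failing_broadly:
        reasons.append("outbound delivery failing broadly")
    if status.repeated_evaluator_fallbacks:
        reasons.append("repeated evaluator fallbacks")
    if status.cohort_size > status.solo_review_capacity:
        reasons.append("cohort exceeds solo review capacity")
    if not status.copy_matches_behavior:
        reasons.append("public copy or partner expectations out of date")
    return reasons
```

Returning the full list of reasons, not just a boolean, gives the operator something to record when pausing an org or the signup path.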

InvisibleBench Use

InvisibleBench is the regression and safety-evaluation layer. Use it before changing model, prompt, tool, safety, or benefits behavior. It does not replace live operations: benchmark pass rates show expected behavior in known scenarios, while this runbook covers what the operator does when real caregivers trigger exceptions.