Implementation Plan — Inspired Applied Learning Platform (ALP)

Grounded in knowledge-tree/ and source-docs/. The phasing follows knowledge-tree/reference/implementation-roadmap.md. Open questions referenced inline live in knowledge-tree/questions/.

Problem statement

Inspired Instruction runs in-person professional development for teachers. Today there is no digital system that follows a teacher from “learned a strategy in a workshop” to “applied it in the classroom.” Existing thinking treats the platform like an LMS — it would measure completion (registered, attended, survey done) and stop there.

The actual product goal, verbatim from the v3 strategy, is to reduce the time between learning and improved practice. The platform must therefore measure and drive behavioral change — did the teacher apply the strategy, how fast, how many times before mastery — and intervene when expected behavior is absent.

Premises (to confirm before building)

  1. Behavior over completion. The platform’s reason to exist is application speed/frequency, not course completion. Every metric ties to application. (If wrong, this is just another LMS.)
  2. The 6-stage cycle is the spine. Prepare → Engage → Practice → Apply → Reflect → Adjust is the shared vocabulary on every screen. The v2 7-stage naming is deprecated.
  3. Five roles. Participant, Learning Administrator, Instructional Strategist, Platform Manager, Content Creator. Content Creator’s day-to-day flow is undefined and out of scope for v1 (see content-creator-role).
  4. HubSpot stays commercial; the platform owns delivery. v1 enters engagement data manually (PM); the HubSpot pull is a later adapter, not a v1 dependency (hubspot-sync).
  5. Patterns, not individuals. Every Instructional-Strategist-facing analytic is aggregate by default; surfacing an individual requires an explicit, logged action.
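
Premise 5 can be illustrated with a small sketch: an aggregate that is suppressed for small groups, plus a reveal path that always leaves an audit record. This is a minimal Python model, not the plan's implementation. The names `ApplicationRow`, `cohort_apply_rate`, `reveal_individual`, and `reveal_log` are hypothetical, the threshold of 5 is borrowed from the k-anonymity value named in the Eng review, and the review itself calls for enforcing this at the database rather than in application code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

K_MIN = 5  # assumed minimum group size; mirrors the k>=5 named in the Eng review


@dataclass(frozen=True)
class ApplicationRow:
    participant_id: str
    cohort_id: str
    applied: bool


def cohort_apply_rate(rows: list[ApplicationRow], cohort_id: str) -> Optional[float]:
    """Share of a cohort's participants with at least one application.

    Returns None when fewer than K_MIN distinct participants are present,
    so a small group can never be read as individuals.
    """
    applied_by_participant: dict[str, bool] = {}
    for row in rows:
        if row.cohort_id == cohort_id:
            seen = applied_by_participant.get(row.participant_id, False)
            applied_by_participant[row.participant_id] = seen or row.applied
    if len(applied_by_participant) < K_MIN:
        return None
    return sum(applied_by_participant.values()) / len(applied_by_participant)


reveal_log: list[dict] = []  # stand-in for an audit table (hypothetical)


def reveal_individual(rows: list[ApplicationRow], participant_id: str,
                      actor_id: str, reason: str) -> list[ApplicationRow]:
    """Return one participant's rows, always writing an audit entry first."""
    reveal_log.append({
        "actor_id": actor_id,
        "participant_id": participant_id,
        "reason": reason,
        "at": datetime.now(timezone.utc),
    })
    return [r for r in rows if r.participant_id == participant_id]
```

The design choice worth noting is that the reveal function has no code path that returns data without logging, which is the property the premise asks for.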

Goal / non-goals

Premise gate resolved (Phase 1 CEO review): v1 narrowed to Phase 1 only, shipped to one pilot cohort, with IS-observed application evidence added alongside teacher self-report so the headline time-to-apply metric is corroborated rather than trusted blindly. Phase 2 (context layer) is funded only after the pilot proves applications actually get logged. Both the Claude and Codex strategy voices recommended this; the user confirmed.

Goals (v1 = Phase 1 + measurement integrity): ship the behavioral core (Practice + Apply phases, 48h gate, nudge cascade, mark-complete → survey → certificate, locked resources) to a single pilot cohort. The ApplicationEvent backbone records evidence_type ∈ {self_report, artifact, observation}; the IS logs observed application on a sample so self-report drift is measured during the pilot. Seed a minimal application-oriented microlearning set so Apply-phase nudges have something to point at.
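
One way the pilot could quantify self-report drift is sketched below. The three evidence types come from the plan; the `ApplicationEvent` fields, the `corroboration_rate` function, and the definition of drift it encodes (the share of self-reported applications inside the IS's observation sample that the IS also observed) are assumptions for illustration, not a metric the plan defines.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EvidenceType(str, Enum):
    SELF_REPORT = "self_report"
    ARTIFACT = "artifact"
    OBSERVATION = "observation"


@dataclass(frozen=True)
class ApplicationEvent:
    participant_id: str
    strategy_id: str
    evidence_type: EvidenceType


def corroboration_rate(events: list[ApplicationEvent],
                       sampled: set[tuple[str, str]]) -> Optional[float]:
    """Fraction of sampled (participant, strategy) self-reports backed by an observation.

    `sampled` holds the pairs the IS actually went to observe. Returns None
    when no sampled pair has a self-report, so there is nothing to corroborate.
    """
    self_reported = {(e.participant_id, e.strategy_id) for e in events
                     if e.evidence_type is EvidenceType.SELF_REPORT}
    observed = {(e.participant_id, e.strategy_id) for e in events
                if e.evidence_type is EvidenceType.OBSERVATION}
    claimed = self_reported & sampled
    if not claimed:
        return None
    return len(claimed & observed) / len(claimed)
```

A rate well below 1.0 on the sample would suggest that self-reported time-to-apply overstates real classroom application.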

Non-goals (deferred): Phase 2 context layer (School Context Profile, Cohort Snapshot, Context Brief, pulse check, intro video) — deferred until the pilot validates the loop; full Platform Manager automation + pattern detection (Phase 3); HubSpot pull/push + case-study pipeline (Phase 4); Content Creator authoring workflow (post-Phase-4, blocked on a working session); native mobile apps; SCORM/LMS import.

Architecture (proposed, see data-model)

Phased delivery

Phase 1 — Foundation (behavioral core) — v1

Phase 2 — Context layer — deferred until the pilot validates the loop (not in v1)

Phase 3 — Platform Manager + automation — v2

Phase 4 — Business + Content loop — v2

Metrics that define success

Test strategy (high level)

Key risks

Rollout

Ship Phase 1 to one pilot school/cohort, instrument time-to-apply, validate the cycle approach produces application evidence, then add Phase 2 context. Phases 3–4 only after the pilot proves the behavioral loop works.

# /autoplan Review
Pipeline: CEO → Design → Eng (DX skipped — no developer-facing scope; the only DX keyword hits were the internal HubSpot integration, not a shipped API/SDK/CLI). Each phase ran a Claude independent subagent + a Codex voice. Auto-decisions use the 6 principles; taste decisions and user challenges are surfaced at gates.
## Phase 1 — CEO Review (Strategy & Scope)
Mode: SELECTIVE EXPANSION (autoplan override) — hold the plan’s scope as baseline, surface scope changes as explicit decisions.
### 0A. Premise challenge
| # | Premise | Verdict | Note |
|---|---|---|---|
| 1 | Behavior over completion, measured via teacher self-report in v1 | CHALLENGED | Both models: self-report “applied” ≠ improved practice. The metric that defines the product is also its weakest signal, and it sits in v1. → User Challenge. |
| 2 | 6-stage cycle is the spine | Accept | Sound; low-risk shared vocabulary. |
| 3 | Five roles; Content Creator deferred | Partially challenged | Codex: intervention layer needs microlearning to point at; deferring Content Creator too far makes nudges empty. Keep role deferred but seed a minimal application-oriented library. |
| 4 | HubSpot manual entry in v1 | Accept (watch) | Pragmatic; PM duplication could bottleneck — keep a CSV/link path in reach. |
| 5 | Patterns, not individuals | Accept (tension noted) | Codex: conflicts with the coaching loop if stuck individuals are hidden. Resolve with consent + explicit norms + logged reveal, not aggregate-only. |
### 0B. What already exists (leverage map)
Greenfield project — no prior code. Leverage is external/library, not internal:

- Auth + RLS + Postgres + scheduled functions: Supabase (don’t hand-roll).
- Job scheduling for the fire-on-absence cascade: a durable due-job table + worker (don’t invent a bespoke scheduler).
- Email/in-app nudge delivery: a transactional provider (Resend/Postmark) + an in-app inbox table.
- TTS: a managed TTS API, not a custom model.

Nothing in the plan duplicates existing functionality (DRY check passes — there is no codebase yet).
### 0C. Dream-state trajectory
| CURRENT | THIS PLAN (v1 = Ph1+2) | 12-MONTH IDEAL |
|---|---|---|
| In-person PD, no digital follow-through. Workshop = done at attendance. | Behavioral core + full context layer: 6-stage cycle, 48h gate, nudge cascade, School Context Profile, pulse, intro video, certificate. | Every engagement produces rigorous, IS-corroborated evidence of classroom transfer; renewal/expansion driven by proven behavioral lift; content library evolves from gap signals. |

Delta this plan leaves: it builds the measurement engine but bets the headline metric on self-report and bundles the largest new build (context layer) before the loop is proven.
### 0C-bis. Implementation alternatives (the strategic fork)
| Approach | Effort (human / CC) | Risk | Pros | Cons |
|---|---|---|---|---|
| A. Plan as written — v1 = Phase 1 + Phase 2 | ~8-10 wk / ~days | High | Complete, matches v3 doc | Bets headline metric on self-report; builds context layer before loop is proven; large unvalidated surface |
| B. Phase-1-only pilot (recommended by both models) | ~3-4 wk / ~1-2 days | Medium | Proves application data actually flows before funding the context layer; smaller blast radius | Less impressive demo; context layer slips |
| C. IS-evidence wedge — instrument the IS to log observed application; thin teacher self-report | ~3-4 wk / ~1-2 days | Low-Med | Rigorous data by default; leverages the paid expert in the room; sidesteps teacher compliance | Diverges from v3 doc’s participant-centric framing; fewer teacher-facing features |
### 0D. Scope decisions (SELECTIVE EXPANSION)
- Reduction surfaced (User Challenge): split v1 → Phase 1 pilot first, then Phase 2. Both models independently recommend this; the plan’s own Rollout section already says “Phase 2 only after pilot proves the loop,” then contradicts it by defining v1 = Phases 1+2.
- Measurement surfaced (User Challenge): add IS-observed evidence (Approach C elements) alongside/instead of pure self-report for the validity backbone.
- Expansion (small, in blast radius, auto-approved): seed a minimal application-oriented microlearning set so Phase-1 nudges have something to point at (P2 boil-lakes; <1 day CC). Logged.
- Deferred (to TODOS): full Content Creator workflow, HubSpot pull/push, case-study pipeline, pattern detection. Unchanged.

### 0E. Temporal interrogation
- Hour 1: participant invited, builds profile, watches intro — fine.
- Hour 24–120: the 48h/5d windows fire. Risk: nothing compels a teacher to log; dashboards go empty. Risk: tz/weekend math pushes nudges to Saturday.
- Week 2–6: drift nudges. Risk: nudge fatigue / surveillance feeling.
- Month 6: if self-report didn’t flow, the whole metric story is empty — the 6-month regret both models name.

### 0F. Mode confirmation
SELECTIVE EXPANSION holds. The one decision that must go to the user before Design/Eng proceed: the v1 scope + measurement fork (premise gate below).
### Phase 1 dual voices — CEO consensus
CEO DUAL VOICES — CONSENSUS TABLE
═══════════════════════════════════════════════════════════════════
  Dimension                              Claude   Codex   Consensus
  ────────────────────────────────────── ──────── ─────── ──────────
  1. Premises valid?                     No(P1)   No(P1)  CONFIRMED (self-report premise weak)
  2. Right problem to solve?             Reframe  Reframe CONFIRMED (leverage IS / make transfer observable)
  3. Scope calibration correct?          No       No      CONFIRMED (v1=Ph1+2 too big → Phase-1 pilot)
  4. Alternatives explored?              No       No      CONFIRMED (plan has no alternatives section)
  5. Competitive/market risk covered?    No(thin) No(LMS gravity) CONFIRMED (moat = IS relationship, under-leveraged)
  6. 6-month trajectory sound?           At risk  At risk CONFIRMED (empty-dashboards regret)
═══════════════════════════════════════════════════════════════════
6/6 dimensions flagged by BOTH voices. Unusually strong agreement → these are User Challenges.
### CEO review sections (auto-decided issues)
- S1 Problem/demand: No commercial thesis for why a school renews because time-to-apply dropped. Decision: add the thesis to the plan + validate with the pilot (P1 completeness). Logged.
- S2 Error & Rescue Registry: see table below.
- S3 Scope discipline: “NOT in scope” section added below. Examined; the plan already defers Phases 3–4 cleanly — only the v1/Phase-2 boundary is contested.
- S4 Metrics/proxy skepticism: “applied within 48h” is a proxy that can drift from “improved practice.” Decision: add a quality signal (IS spot-observation or a light rubric) so the proxy is checked, not trusted (P1). → folded into User Challenge.
- S5 People sequencing: the IS is the highest-leverage actor and is under-used as an evidence source. → User Challenge (measurement).
- S6 Content dependency: intervention layer needs microlearning to point at. Decision: seed a minimal library in Phase 1 (auto-approved expansion). Logged.
- S7 LMS-gravity check: certificates + locked resources risk pulling back toward completion incentives. Decision: keep them (they drive return-to-platform per knowledge tree) but do not let them become the success metric; success = application, not certificate issuance. Logged, no scope change.
- S8 Privacy vs coaching tension: examined; resolved by Eng phase (consent + logged reveal). No CEO-level change.
- S9 Observability: nudge-send success, empty-dashboard detection, unmarked-complete detection are first-class — pushed to Eng failure-modes registry.
- S10 Reversibility: nearly all decisions here are two-way doors (greenfield). The one one-way-ish door is the metric definition exported to clients — don’t publish a time-to-apply number externally until self-report drift is measured. Logged.
### Error & Rescue Registry (CEO-surfaced, refined in Eng)
| Failure | Trigger | Visible to | Rescue |
|---|---|---|---|
| Empty dashboards | Teachers don’t log applications | IS, PM | Pilot measures log-rate before external claims; IS-observation fallback |
| Stuck cohort, silent | IS never marks complete | PM | Stale “delivered-but-unmarked” escalation (an absence-trigger) |
| Nudge fatigue | Cascade + cap interaction | Participant, IS | Bounded 1–3 + weekly cap + participant self-snooze |
| False precision | LA context inaccurate/late | IS | Treat context as optional enrichment, never gate prep on it |
| Metric drift | Self-report ≠ practice quality | Company | Validate self-report against IS observation in pilot |
### NOT in scope (v1)
Full PM automation + pattern detection (Phase 3); HubSpot pull/push + case-study pipeline (Phase 4); Content Creator authoring workflow (post-Phase-4, blocked on working session); native mobile; SCORM/LMS import. Plus (pending premise gate) possibly the entire Phase 2 context layer.

### Phase 1 transition summary
Codex: 11 strategic concerns. Claude subagent: 6 findings. Consensus: 6/6 dimensions flagged by both → two User Challenges (v1 scope split; self-report→IS-evidence). Premise gate next.

Phase 2 — Design Review

In-scope v1 surfaces (post premise gate): Participant dashboard and Instructional Strategist dashboard (behavioral-core subset). Context Brief, intro video, Strategist Insights, pulse, and the LA/PM dashboards are Phase 2 — mark them “future” in any v1 build so the implementer doesn’t build Phase 2 by accident (Codex scope-leakage catch).

Phase 2 dual voices — design consensus (litmus scorecard)

DESIGN DUAL VOICES — CONSENSUS TABLE
═══════════════════════════════════════════════════════════════════
  Dimension                              Claude   Codex   Consensus
  ────────────────────────────────────── ──────── ─────── ──────────
  1. Information hierarchy serves user?  Partial  Partial CONFIRMED (Participant ok; IS tone wrong)
  2. Interaction states specified?       No(crit) No      CONFIRMED (loading/empty/error/partial absent)
  3. Empty/success state designed?       No(crit) No(crit)CONFIRMED (IS "zero stuck" must read as success)
  4. Specific UI vs generic patterns?    Generic  Generic CONFIRMED (no copy, no hierarchy spec)
  5. Trust posture safe in UI?           No(high) No(high)CONFIRMED (timer/nudge = trust-risk center)
  6. Accessibility sufficient?           No(high) —       Claude-only (mobile-first, captions, contrast)
  7. Scope correct for v1?               —        No(med) Codex-only (Phase-2 widgets leaking in)
═══════════════════════════════════════════════════════════════════
5/7 CONFIRMED by both. Dims 6,7 single-voice → flagged regardless.

7-pass design scorecard (auto-decided, P5 explicit + P1 completeness)

| Pass | Dimension | Score | Auto-decision (principle) |
|---|---|---|---|
| 1 | Information hierarchy | 6/10 | Participant (phase + one next action) is right. Fix: IS dashboard top-level = “which cohorts need support” + the support action; individual reveal behind a logged coaching action — never a “stuck participants” roster (P5). |
| 2 | Interaction states | 3/10 | Fix (critical): add a states matrix per in-scope dashboard — loading / empty / partial / error / success — before build (P1). |
| 3 | Empty/success states | 2/10 | Fix (critical): the IS “zero stuck” state is the GOOD state — design it as “All active cohorts on track” with secondary actions, not an empty queue (P1). |
| 4 | Specificity | 4/10 | Fix: spec microcopy, screen ordering, reveal affordances, nudge copy, evidence-state visuals. In this product the wording IS the design (P5). |
| 5 | Trust posture (nudge/timer) | 4/10 | Fix (high): support language not deadline language — “first-try window” not “48h countdown”; “not logged yet” not “delayed”; per-branch nudge tone (warm for attempted-failed); add participant self-snooze (“not this week”); no red/overdue badges, no rankings, no watchlists (P1+P5). |
| 6 | Accessibility | 4/10 | Fix: commit to mobile-first responsive web (teachers on phones between classes); TTS on nudges + reflection prompts, not just intro text; captions on the IS intro video (Phase 2, ADA); WCAG contrast on the locked/greyed state; keyboard/SR paths (P1). |
| 7 | Consent in the flow | 5/10 | Fix: mark-complete (IS action) silently starts the participant’s clock — open the application window TO the participant as an invitation, not a back-office trigger (P5, design-for-trust). |

Design — what was examined, nothing flagged

Design taste decisions → gate

Phase 2 transition summary

Codex: 5 design concerns. Claude subagent: 6 findings. Consensus: 5/7 confirmed by both, 2 single-voice flagged. All structural fixes auto-decided (states matrix, success states, trust-safe microcopy, consent reframe). 1 taste decision (nudge tone) → gate. Passing to Phase 3 (Eng).


Phase 3 — Eng Review + Dual Voices

What already exists / leverage (greenfield)

No prior code. Reuse libraries, don’t hand-roll: Supabase (auth + Postgres + RLS + pg_cron), a durable job table + FOR UPDATE SKIP LOCKED worker (don’t invent a scheduler), a transactional email/in-app provider. DRY check passes (no codebase to duplicate).

Phase 3 dual voices — eng consensus

ENG DUAL VOICES — CONSENSUS TABLE
═══════════════════════════════════════════════════════════════════
  Dimension                              Claude   Codex   Consensus
  ────────────────────────────────────── ──────── ─────── ──────────
  1. Architecture / data model sound?    No(crit) No(crit)CONFIRMED (metric backbone missing joins)
  2. Test coverage sufficient?           No       No      CONFIRMED (scheduler/privacy/tz untested)
  3. Fire-on-absence implementable?      No(crit) No(crit)CONFIRMED (need durable due-job table)
  4. Privacy enforceable at DB?          No(crit) No(crit)CONFIRMED (RLS + k-anon + logged reveal)
  5. Timezone/window correct?            No(high) No(crit)CONFIRMED (school tz + calendar)
  6. Critical-path decisions resolved?   No(crit) No(crit)CONFIRMED (timer anchor deferred but blocking)
═══════════════════════════════════════════════════════════════════
6/6 flagged by BOTH. Exceptional agreement → all auto-decided (structural, one right answer).

Corrected architecture (supersedes the proposed entity list above)

                         ┌──────────────────────────────────────────────┐
   School ──< Engagement >── InstructionalStrategist                     │
     │           │  │                                                    │
     │           │  └──< EngagementStrategy >── Strategy   (NEW: mastery │
     │           │                                          curves need  │
     │           │                                          a 1st-class   │
     │           │                                          Strategy)     │
     │           ├──< Cohort/CohortSnapshot >──< Participant + LearnerProfile
     │           │                                    │                   │
     │           │   mark-complete ─────► ApplicationTrackingWindow (NEW) │
     │           │   (records metric ANCHOR = engagement-end, audited;    │
     │           │    NOT the late button-click time)                     │
     │           │        │                                               │
     │           │        ├─► enqueue intervention_job × {24h,48h,5d}     │
     │           │        │   (NEW: due_at, status, idempotency_key,      │
     │           │        │    locked_at, attempt_count, suppressed_reason)│
     │           │        └─► survey + certificate eligibility            │
     │           │                                                        │
   ActionPlan ──┴──< ApplicationEvent  (FK: action_plan_id NEW,          │
     (what/when/whom)    strategy_id→Strategy, evidence_type ∈            │
                         {self_report, artifact, observation},            │
                         attempt_number via unique constraint OR          │
                         ROW_NUMBER() — no stored race)                   │
                                                                          │
   Worker:  SELECT … FROM intervention_job WHERE due_at<=now()           │
            AND status='pending' FOR UPDATE SKIP LOCKED                   │
            → re-check absence in same txn → insert nudge_delivery        │
            (idempotency_key) → decrement weekly-cap ledger (txn)         │
   PRIVACY: IS/LA roles DENIED direct SELECT on pulse/application rows;   │
            only SECURITY DEFINER aggregate views w/ HAVING count(distinct│
            participant)>=5 (k-anon); individual reveal = logged function │
            writing individual_data_reveal in the same txn.               │
                         └──────────────────────────────────────────────┘
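
The worker step in the diagram (claim due jobs, re-check absence, send at most once per idempotency key) can be modeled in a few lines. The sketch below is an in-memory Python stand-in under stated assumptions. `InterventionJob` mirrors the column names in the diagram, but `run_due_jobs`, the status values, and the `deliveries` dict (standing in for a unique `idempotency_key` on `nudge_delivery`) are hypothetical. It does not model the row locking or single-transaction guarantees that `FOR UPDATE SKIP LOCKED` provides in Postgres.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class InterventionJob:
    idempotency_key: str
    participant_id: str
    due_at: datetime
    status: str = "pending"            # pending | sent | suppressed (assumed values)
    suppressed_reason: Optional[str] = None
    attempt_count: int = 0


def run_due_jobs(jobs: list[InterventionJob], applied_participants: set[str],
                 deliveries: dict[str, datetime], now: datetime) -> list[str]:
    """Process every pending job whose due_at has passed; return keys actually sent.

    Selecting on due_at <= now (rather than "due this minute") means a worker
    that was down catches up on restart.
    """
    sent: list[str] = []
    for job in jobs:
        if job.status != "pending" or job.due_at > now:
            continue
        job.attempt_count += 1
        # Re-check absence: the participant may have logged an application
        # between enqueue time and now.
        if job.participant_id in applied_participants:
            job.status = "suppressed"
            job.suppressed_reason = "application_logged"
            continue
        # Idempotent send: a retry of the same job must not deliver twice.
        if job.idempotency_key not in deliveries:
            deliveries[job.idempotency_key] = now
            sent.append(job.idempotency_key)
        job.status = "sent"
    return sent
```

Running the function twice over the same job, even after its status is reset to simulate a crash before commit, produces a single delivery because the key is already present.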

Auto-decided architecture fixes (P5 explicit + P3 pragmatic)

| # | Fix | Severity | Principle |
|---|---|---|---|
| E1 | Add intervention_job (due_at, status, idempotency_key, locked_at, attempt_count, suppressed_reason); create jobs transactionally; process with FOR UPDATE SKIP LOCKED; re-check absence before send. | critical | P5 |
| E2 | Enforce weekly nudge cap in Postgres via a delivery ledger keyed by participant + local week; overflow jobs rescheduled/suppressed by priority. | critical | P5 |
| E3 | Store school timezone + school calendar on engagement/window; compute due times in school-local, persist UTC due_at, roll sends out of weekends/after-hours; keep metric elapsed-time separate from send-time. | critical | P1 |
| E4 | Add Strategy + EngagementStrategy; add action_plan_id FK on ApplicationEvent; attempt_number via unique (participant, engagement, strategy, attempt_number) or read-time ROW_NUMBER(). | critical | P5 |
| E5 | Privacy at DB: deny IS/LA direct SELECT on individual pulse/application rows; aggregate-only SECURITY DEFINER views with k-anonymity (k≥5) + anti-differencing; reveal = logged Postgres function writing individual_data_reveal in-txn. | critical | P1 |
| E6 | Add ApplicationTrackingWindow; timer anchor = engagement-end time (audited), not the late mark-complete click. Nudge due-dates may respect ActionPlan.when; time-to-apply must not. Resolves the deferred timer-anchor on the Phase-1 critical path. | critical | P5 |
| E7 | Stale-engagement detector: “delivered but never marked complete” is itself an absence-trigger so a cohort can’t go silently dead. | high | P1 |
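
Fix E3 (compute in school-local time, persist UTC, roll sends out of weekends and after-hours) can be sketched as follows. This is a hedged Python illustration, not the planned implementation: the function name `nudge_due_at_utc`, the 08:00 to 17:00 send window, and the `closed_days` parameter are assumptions. Any `tzinfo` works; in practice one would likely pass something like `zoneinfo.ZoneInfo("America/Chicago")` so DST is handled.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo

SEND_START = time(8, 0)    # hypothetical school-day send window, local time
SEND_END = time(17, 0)


def nudge_due_at_utc(anchor_utc: datetime, offset: timedelta, school_tz: tzinfo,
                     closed_days: frozenset[date] = frozenset()) -> datetime:
    """Return the UTC send time for a nudge due `offset` after the anchor.

    The raw due time is converted to school-local time, rolled forward past
    weekends, closed days, and after-hours, then converted back to UTC.
    """
    local = (anchor_utc + offset).astimezone(school_tz)
    while True:
        next_morning = datetime.combine(local.date() + timedelta(days=1),
                                        SEND_START, tzinfo=school_tz)
        if local.weekday() >= 5 or local.date() in closed_days:
            local = next_morning                 # weekend or school closed
        elif local.time() >= SEND_END:
            local = next_morning                 # too late today
        elif local.time() < SEND_START:
            local = datetime.combine(local.date(), SEND_START, tzinfo=school_tz)
        else:
            return local.astimezone(timezone.utc)
```

For example, with a UTC-6 school and an engagement ending Thursday at 15:00 local, the 48h nudge would land on Saturday afternoon and is rolled to Monday 08:00 local. The send time returned here is only for delivery; elapsed time for the metric is still measured from the anchor.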

Failure modes registry

| Failure | Detection | Visible to | Mitigation | Tested? |
|---|---|---|---|---|
| Missed nudge (worker down) | due_at<=now() catch-up query | ops alarm | catch-up on restart, not “this-minute” firing | test #2 |
| Double-fire on retry | idempotency_key unique | | dedup insert | test #2 |
| Cap blown by burst | ledger count per local week | participant | txn cap decrement + priority suppression | test #3 |
| Saturday nudge | school calendar check | participant | roll to next school day | test #4 |
| Silent dead cohort | unmarked-complete detector | PM | escalation prompt | test #5 |
| Individual data leak | RLS denies base SELECT | | k-anon view + logged reveal | test #7 |
| attempt_number race | unique constraint | | constraint fails loudly / ROW_NUMBER | test #8 |
| Half mark-complete fan-out | idempotency | IS | transactional or retryable effects | test #1 |
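
The "cap blown by burst" row depends on a ledger keyed by participant and local week. A minimal Python sketch of that keying is shown below, assuming ISO weeks in the school's timezone. The `NudgeLedger` class, its `try_send` method, and the default cap of 3 are hypothetical, and an in-memory counter stands in for the transactional Postgres ledger the plan calls for.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone, tzinfo

WEEKLY_CAP = 3  # assumed value for illustration; the plan does not fix the number


class NudgeLedger:
    """Counts nudges per (participant, local ISO year, local ISO week)."""

    def __init__(self, cap: int = WEEKLY_CAP) -> None:
        self.cap = cap
        self._sent: Counter = Counter()

    @staticmethod
    def week_key(participant_id: str, when_utc: datetime,
                 school_tz: tzinfo) -> tuple[str, int, int]:
        # Use the school-local date so a Sunday-evening send is not
        # counted against the following week just because UTC rolled over.
        iso = when_utc.astimezone(school_tz).isocalendar()
        return (participant_id, iso[0], iso[1])

    def try_send(self, participant_id: str, when_utc: datetime,
                 school_tz: tzinfo) -> bool:
        """Record a send if the weekly cap allows it; return whether it was allowed."""
        key = self.week_key(participant_id, when_utc, school_tz)
        if self._sent[key] >= self.cap:
            return False
        self._sent[key] += 1
        return True
```

Because the check and the increment share one key, a catch-up burst after worker downtime hits the same counter as normal sends. What happens to a refused nudge (reschedule or suppress by priority) is left to the caller.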

NOT in scope (Eng, v1)

Phase 2 context-layer tables (SchoolContextProfile versioning + pinning, PulseResponse, IntroVideo) — schema-stubbed only; their RLS policies designed now (E5) but UIs deferred. Phase 3/4 entirely out.

Test plan

Written to disk: ~/.gstack/projects/inspired-applied-learninh/brandon-main-test-plan-20260527-100703.md (test diagram + 8 critical cases + unit/integration/E2E breakdown).

Phase 3 transition summary

Codex: 6 findings (5 critical, 1 high). Claude subagent: ~12 findings across 6 areas. Consensus: 6/6 dimensions confirmed by both. All auto-decided (E1–E7 are structural). No eng taste decisions — the answers are determined. Passing to Final Gate.


Cross-Phase Themes (flagged independently in 2+ phases — high-confidence)

  1. “Applied” ≠ “improved practice.” CEO (metric is a proxy that can drift) + Eng (timer anchor must be engagement-end, not button-click; add observation evidence) + Design (a 48h countdown reads as a test). All three phases. → addressed by premise gate (IS evidence) + E6.
  2. The “not evaluative” trust posture is fragile and crosses the stack. Design (nudge/timer microcopy, no individual watchlists) + Eng (RLS k-anonymity + logged reveal, not app-code promises) + CEO (patterns-vs-coaching tension). All three. → addressed by E5 + design fixes + a “consent + explicit norms” note.
  3. Absence is the core mechanic and the core risk. CEO (empty-dashboards 6-month regret) + Design (empty/success states unspecified — the IS “zero stuck” state IS the product) + Eng (fire-on-absence needs a durable due-job table). All three. → addressed by states matrix + E1 + E7 + the pilot’s log-rate validation.
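
Theme 1's engineering half (anchor the metric on engagement-end, and number attempts at read time rather than storing a racy counter) can be sketched like this. The Python below is illustrative only: `TimedApplication`, `time_to_apply`, and `number_attempts` are hypothetical names, and ignoring events logged before the anchor is an assumption rather than a rule stated in the plan.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class TimedApplication:
    participant_id: str
    strategy_id: str
    occurred_at: datetime


def time_to_apply(engagement_end: datetime,
                  events: list[TimedApplication]) -> Optional[timedelta]:
    """Elapsed time from the engagement-end anchor to the first application.

    The anchor is the audited engagement-end time, not the moment the IS
    clicked mark-complete. Returns None if nothing was logged after the anchor.
    """
    after = [e.occurred_at for e in events if e.occurred_at >= engagement_end]
    return min(after) - engagement_end if after else None


def number_attempts(events: list[TimedApplication]) -> list[tuple[int, TimedApplication]]:
    """Assign attempt numbers per (participant, strategy) in time order.

    This is the read-time equivalent of ROW_NUMBER() partitioned by
    participant and strategy, so no counter has to be stored.
    """
    counters: dict[tuple[str, str], int] = {}
    numbered: list[tuple[int, TimedApplication]] = []
    for e in sorted(events, key=lambda ev: ev.occurred_at):
        key = (e.participant_id, e.strategy_id)
        counters[key] = counters.get(key, 0) + 1
        numbered.append((counters[key], e))
    return numbered
```

Deriving attempt numbers from ordering means two concurrent inserts cannot both claim the same number, which is the race the Eng review flags.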

Decision Audit Trail

| # | Phase | Decision | Class | Principle | Rationale |
|---|---|---|---|---|---|
| 1 | CEO | Mode = SELECTIVE EXPANSION | Mechanical | override | autoplan default |
| 2 | CEO | Run Codex + Claude subagent | Mechanical | P6 | dual voices |
| 3 | CEO | v1 → Phase-1 pilot + IS evidence | User Challenge | | both models; user confirmed at premise gate |
| 4 | CEO | Accept premises 2,4,5 as-is | Mechanical | P6 | low-risk, sound |
| 5 | CEO | Seed minimal microlearning in v1 | Auto (expansion) | P2 | in blast radius, <1d CC; nudges need targets |
| 6 | CEO | Keep certificate/locked-resources but not as success metric | Auto | P5 | drives return; success = application |
| 7 | CEO | Don’t publish time-to-apply externally until drift measured | Auto | P7 | one-way door (proxy skepticism) |
| 8 | Design | States matrix per dashboard required | Auto | P1 | empty/success states ARE the product |
| 9 | Design | IS top-level = cohort patterns, individual behind logged reveal | Auto | P5 | trust posture; no watchlists |
| 10 | Design | Support microcopy not deadline copy; add participant self-snooze | Auto | P1+P5 | trust; nudge fatigue |
| 11 | Design | Mark-complete opens window AS invitation to participant | Auto | P5 | consent / design-for-trust |
| 12 | Design | Commit mobile-first; TTS on nudges; video captions; contrast | Auto | P1 | accessibility (teachers on phones) |
| 13 | Design | T1 nudge-tone calibration | Taste | | → final gate |
| 14 | Eng | E1 durable intervention_job table + SKIP LOCKED + re-check | Auto | P5 | fire-on-absence needs obligations, not just a log |
| 15 | Eng | E2 weekly cap enforced in Postgres ledger | Auto | P5 | survives catch-up burst |
| 16 | Eng | E3 school tz + calendar; UTC due_at; roll out of weekends | Auto | P1 | no Saturday nudges; metric≠send time |
| 17 | Eng | E4 add Strategy + EngagementStrategy + action_plan_id; fix attempt_number | Auto | P5 | mastery curves need a Strategy entity |
| 18 | Eng | E5 RLS deny + k-anon views + logged reveal | Auto | P1 | privacy enforced at DB not app |
| 19 | Eng | E6 timer anchor = engagement-end (audited), not button-click | Auto | P5 | both models; resolves blocking open question |
| 20 | Eng | E7 stale unmarked-complete detector | Auto | P1 | no silently-dead cohorts |