Implementation Plan — Inspired Applied Learning Platform (ALP)
Grounded in knowledge-tree/ and source-docs/. The phasing follows knowledge-tree/reference/implementation-roadmap.md. Open questions referenced inline live in knowledge-tree/questions/.
Problem statement
Inspired Instruction runs in-person professional development for teachers. Today there is no digital system that follows a teacher from “learned a strategy in a workshop” to “applied it in the classroom.” Existing thinking treats the platform like an LMS — it would measure completion (registered, attended, survey done) and stop there.
The actual product goal, verbatim from the v3 strategy, is to reduce the time between learning and improved practice. The platform must therefore measure and drive behavioral change — did the teacher apply the strategy, how fast, how many times before mastery — and intervene when expected behavior is absent.
Premises (to confirm before building)
- Behavior over completion. The platform’s reason to exist is application speed/frequency, not course completion. Every metric ties to application. (If wrong, this is just another LMS.)
- The 6-stage cycle is the spine. Prepare → Engage → Practice → Apply → Reflect → Adjust is the shared vocabulary on every screen. The v2 7-stage naming is deprecated.
- Five roles. Participant, Learning Administrator, Instructional Strategist, Platform Manager, Content Creator. Content Creator’s day-to-day flow is undefined and out of scope for v1 (see content-creator-role).
- HubSpot stays commercial; the platform owns delivery. v1 enters engagement data manually (PM); the HubSpot pull is a later adapter, not a v1 dependency (hubspot-sync).
- Patterns, not individuals. Every Instructional-Strategist-facing analytic is aggregate by default; surfacing an individual requires an explicit, logged action.
Goal / non-goals
Premise gate resolved (Phase 1 CEO review): v1 narrowed to Phase 1 only, shipped to one pilot cohort, with IS-observed application evidence added alongside teacher self-report so the headline time-to-apply metric is corroborated rather than trusted blindly. Phase 2 (context layer) is funded only after the pilot proves applications actually get logged. Both the Claude and Codex strategy voices recommended this; the user confirmed.
Goals (v1 = Phase 1 + measurement integrity): ship the behavioral core (Practice + Apply phases, 48h gate, nudge cascade, mark-complete → survey → certificate, locked resources) to a single pilot cohort. The ApplicationEvent backbone records evidence_type ∈ {self_report, artifact, observation}; the IS logs observed application on a sample so self-report drift is measured during the pilot. Seed a minimal application-oriented microlearning set so Apply-phase nudges have something to point at.
Non-goals (deferred): Phase 2 context layer (School Context Profile, Cohort Snapshot, Context Brief, pulse check, intro video) — deferred until the pilot validates the loop; full Platform Manager automation + pattern detection (Phase 3); HubSpot pull/push + case-study pipeline (Phase 4); Content Creator authoring workflow (post-Phase-4, blocked on a working session); native mobile apps; SCORM/LMS import.
Architecture (proposed, see data-model)
- Stack: TypeScript end to end. Next.js (App Router) web app; Postgres (Supabase) with row-level security; Supabase Auth for the five roles; a job runner for the timed nudge cascade (Supabase scheduled functions or a small worker).
- Core entities: School, SchoolContextProfile (versioned), Engagement, Cohort/CohortSnapshot, Participant + LearnerProfile, ActionPlan, ApplicationEvent (the metric backbone), PulseResponse, Reflection/Survey, Certificate, ContentAsset + VisibilityTransition, IntroVideo, InterventionTrigger, NextEngagementSignal.
- Enums frozen up front: the 6 cycle phases, readiness 1–5, change-pressure tags, priority focus areas, attendee groups, evidence types, content visibility states.
- Privacy enforced at the query layer, not just UI: pulse aggregates never expose individual scores; individual-data access is logged.
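One way to freeze the enums up front is to derive literal union types from `as const` arrays, so a single list feeds both the type checker and runtime validation. The following is a minimal sketch under that assumption. The identifier names (`CYCLE_PHASES`, `EVIDENCE_TYPES`, `isEvidenceType`, `isReadiness`, `phaseIndex`) are hypothetical; only the six phase names, the three evidence types, and the 1–5 readiness range come from the plan.

```typescript
// Sketch only: identifier names are illustrative, values are taken from the plan text.
const CYCLE_PHASES = ["Prepare", "Engage", "Practice", "Apply", "Reflect", "Adjust"] as const;
type CyclePhase = (typeof CYCLE_PHASES)[number];

const EVIDENCE_TYPES = ["self_report", "artifact", "observation"] as const;
type EvidenceType = (typeof EVIDENCE_TYPES)[number];

type Readiness = 1 | 2 | 3 | 4 | 5;

// Runtime guard for values arriving from forms or API payloads.
function isEvidenceType(value: string): value is EvidenceType {
  return (EVIDENCE_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Readiness must be a whole number between 1 and 5 inclusive.
function isReadiness(value: number): value is Readiness {
  return Number.isInteger(value) && value >= 1 && value <= 5;
}

// Position of a phase in the cycle, e.g. for ordering a shared phase ribbon.
function phaseIndex(phase: CyclePhase): number {
  return CYCLE_PHASES.indexOf(phase);
}
```

The same arrays could also seed Postgres enum types or check constraints, so the database and the UI cannot drift apart on vocabulary.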
Phased delivery
Phase 1 — Foundation (behavioral core) — v1
- Adopt 6-stage cycle vocabulary across all dashboards (shared phase-ribbon component + enum).
- Practice phase: ActionPlan object (what / when / with whom), submitted in-session.
- Apply phase: 48-hour application gate with three timing branches (≤24h / 24–48h / delayed) plus an attempted-failed branch. ApplicationEvent capture with evidence_type ∈ {self_report, artifact, observation}; IS logs observed application on a pilot sample to measure self-report drift.
- Nudge cascade at 24h / 48h / 5d (Apply-phase subset of the intervention layer).
- Mark-complete (IS action) → opens application-tracking window, triggers participant survey, triggers certificate eligibility.
- Survey gate → certificate (visible-but-locked until survey done).
- Locked-resources mechanism (D1 visible-greyed pre; D2 IS release post).
- Decisions needed first: cycle naming, application evidence, timer anchor.
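The Apply-phase gate can be read as a pure function from timestamps to a branch, which keeps the timer-window math easy to unit test. The sketch below makes several assumptions: the anchor is passed in rather than chosen here, since the timer anchor is listed as an open decision; "delayed" is taken to mean "nothing logged within 48 hours"; and the `pending` state and `succeeded` flag are hypothetical additions needed to make the function total.

```typescript
// Sketch only: branch names mirror the plan; "pending" and `succeeded` are assumptions.
type ApplyBranch = "within_24h" | "within_48h" | "delayed" | "attempted_failed" | "pending";

interface FirstAttempt {
  occurredAt: Date;
  succeeded: boolean; // false = the teacher tried the strategy and it did not work
}

const HOUR_MS = 60 * 60 * 1000;

function classifyApplyBranch(anchor: Date, first: FirstAttempt | null, now: Date): ApplyBranch {
  if (first === null) {
    // Nothing logged yet: still inside the window, or past it.
    return now.getTime() - anchor.getTime() > 48 * HOUR_MS ? "delayed" : "pending";
  }
  if (!first.succeeded) {
    return "attempted_failed";
  }
  const elapsedMs = first.occurredAt.getTime() - anchor.getTime();
  if (elapsedMs <= 24 * HOUR_MS) return "within_24h";
  if (elapsedMs <= 48 * HOUR_MS) return "within_48h";
  return "delayed";
}
```

Keeping the anchor as a parameter means the classification does not change if the anchor decision lands on engagement end rather than on the mark-complete click.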
Phase 2 — Context layer — deferred until the pilot validates the loop (originally scoped into v1)
- LA School Context Profile (A1–A5): readiness, change tags, top-3 priorities, narrative note; <5 min, reusable, versioned, school-level with workshop-level override.
- Cohort Snapshot (B1–B2) per engagement.
- IS Context Brief (C1) + Strategist Insights (E2–E4): % logged in, % watched intro, pulse aggregates, outliers; visible ≥3 days before engagement.
- Pulse check (C1–C3): 1 Likert + 3 MCQs, “diagnostic, not a test”, patterns-only guardrails.
- IS intro video (A1, A2): ≤30s upload → PM approval → auto-associate to assignments.
Phase 3 — Platform Manager + automation — v2
- Full PM flow; publishing-rule engine (4 visibility states + transition log); full intervention layer (all phases, not just Apply); cross-engagement pattern detection + at-risk/high-perf flags; Briefing Report in-platform intake.
Phase 4 — Business + Content loop — v2
- Next-engagement signal routed LA → PM → HubSpot (LA-first); case-study capture; Content Creator backlog feed; time-to-apply + mastery-curve metrics surfaced to sales.
Metrics that define success
- % applied within 24h / 48h / 5d; repeat-application rate; avg & median time-to-first-application; applications-to-mastery. Filterable by readiness / change load / priority once Phase 2 lands.
- Guardrail metric: nudge volume per participant per week stays under a cap (no fatigue).
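The headline metrics reduce to simple arithmetic over each participant's time to first application. The sketch below shows one possible reading, with assumptions stated: the denominator is every participant in the cohort (including those who never applied), the 24h / 48h / 5d windows are cumulative, and 5d is treated as 120 hours. The `FirstApplication` shape and `summarize` function are hypothetical names.

```typescript
// Sketch only: field and function names are illustrative.
interface FirstApplication {
  participantId: string;
  hoursToFirstApplication: number | null; // null = no application logged
}

function median(values: number[]): number | null {
  if (values.length === 0) return null;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function summarize(rows: FirstApplication[]) {
  const applied: number[] = [];
  rows.forEach((r) => {
    if (r.hoursToFirstApplication !== null) applied.push(r.hoursToFirstApplication);
  });
  // Percentage of the whole cohort that applied within `limitHours`.
  const pctWithin = (limitHours: number): number =>
    rows.length === 0 ? 0 : (applied.filter((h) => h <= limitHours).length / rows.length) * 100;
  const meanHours =
    applied.length === 0 ? null : applied.reduce((sum, h) => sum + h, 0) / applied.length;
  return {
    pctWithin24h: pctWithin(24),
    pctWithin48h: pctWithin(48),
    pctWithin5d: pctWithin(120),
    meanHours,
    medianHours: median(applied),
  };
}
```

Reporting the median alongside the mean matters here because a few very late first applications can pull the mean far from the typical teacher's experience.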
Test strategy (high level)
- Unit: enum integrity, visibility state-machine transitions, timer-window math.
- Integration: mark-complete fan-out (survey + certificate + tracking window); nudge cascade firing on absence; pulse-aggregate query never leaks individual scores.
- E2E: full participant happy path (invite → apply → certificate) + each Apply branch.
Key risks
- Self-report-only application data is soft; v1 mitigates this with optional artifact upload and IS-observed application on a pilot sample.
- Nudge fatigue — bounded 1–3 nudges + global weekly cap + IS mute.
- Context-layer adoption — if LAs don’t fill the profile, IS prep stays generic; keep it <5 min.
- Privacy — teacher-facing “not evaluative” trust posture is load-bearing; enforce patterns-only at the data layer.
Rollout
Ship Phase 1 to one pilot school/cohort, instrument time-to-apply, validate the cycle approach produces application evidence, then add Phase 2 context. Phases 3–4 only after the pilot proves the behavioral loop works.
# /autoplan Review
Pipeline: CEO → Design → Eng (DX skipped: no developer-facing scope; the only DX keyword hits were the internal HubSpot integration, not a shipped API/SDK/CLI). Each phase ran a Claude independent subagent + a Codex voice. Auto-decisions use the 6 principles; taste decisions and user challenges are surfaced at gates.
## Phase 1 — CEO Review (Strategy & Scope)
Mode: SELECTIVE EXPANSION (autoplan override). Hold the plan’s scope as baseline; surface scope changes as explicit decisions.
### 0A. Premise challenge
| # | Premise | Verdict | Note |
|---|---|---|---|
| 1 | Behavior over completion, measured via teacher self-report in v1 | CHALLENGED | Both models: self-report “applied” ≠ improved practice. The metric that defines the product is also its weakest signal, and it sits in v1. → User Challenge. |
| 2 | 6-stage cycle is the spine | Accept | Sound; low-risk shared vocabulary. |
| 3 | Five roles; Content Creator deferred | Partially challenged | Codex: intervention layer needs microlearning to point at; deferring Content Creator too far makes nudges empty. Keep role deferred but seed a minimal application-oriented library. |
| 4 | HubSpot manual entry in v1 | Accept (watch) | Pragmatic; PM duplication could bottleneck, so keep a CSV/link path in reach. |
| 5 | Patterns, not individuals | Accept (tension noted) | Codex: conflicts with the coaching loop if stuck individuals are hidden. Resolve with consent + explicit norms + logged reveal, not aggregate-only. |
### 0B. What already exists (leverage map)
Greenfield project, no prior code. Leverage is external/library, not internal:
- Auth + RLS + Postgres + scheduled functions: Supabase (don’t hand-roll).
- Job scheduling for the fire-on-absence cascade: a durable due-job table + worker (don’t invent a bespoke scheduler).
- Email/in-app nudge delivery: a transactional provider (Resend/Postmark) + an in-app inbox table.
- TTS: a managed TTS API, not a custom model.

Nothing in the plan duplicates existing functionality (DRY check passes; there is no codebase yet).
### 0C. Dream-state trajectory
| CURRENT | THIS PLAN (v1 = Ph1+2) | 12-MONTH IDEAL |
|---|---|---|
| In-person PD, no digital follow-through. Workshop = done at attendance. | Behavioral core + full context layer: 6-stage cycle, 48h gate, nudge cascade, School Context Profile, pulse, intro video, certificate. | Every engagement produces rigorous, IS-corroborated evidence of classroom transfer; renewal/expansion driven by proven behavioral lift; content library evolves from gap signals. |

Delta this plan leaves: it builds the measurement engine but bets the headline metric on self-report and bundles the largest new build (context layer) before the loop is proven.
### 0C-bis. Implementation alternatives (the strategic fork)
| Approach | Effort (human / CC) | Risk | Pros | Cons |
|---|---|---|---|---|
| A. Plan as written: v1 = Phase 1 + Phase 2 | ~8-10 wk / ~days | High | Complete, matches v3 doc | Bets headline metric on self-report; builds context layer before loop is proven; large unvalidated surface |
| B. Phase-1-only pilot (recommended by both models) | ~3-4 wk / ~1-2 days | Medium | Proves application data actually flows before funding the context layer; smaller blast radius | Less impressive demo; context layer slips |
| C. IS-evidence wedge: instrument the IS to log observed application; thin teacher self-report | ~3-4 wk / ~1-2 days | Low-Med | Rigorous data by default; leverages the paid expert in the room; sidesteps teacher compliance | Diverges from v3 doc’s participant-centric framing; fewer teacher-facing features |
### 0D. Scope decisions (SELECTIVE EXPANSION)
- Reduction surfaced (User Challenge): split v1 → Phase 1 pilot first, then Phase 2. Both models independently recommend this; the plan’s own Rollout section already says “Phase 2 only after pilot proves the loop,” then contradicts it by defining v1 = Phases 1+2.
- Measurement surfaced (User Challenge): add IS-observed evidence (Approach C elements) alongside/instead of pure self-report for the validity backbone.
- Expansion (small, in blast radius, auto-approved): seed a minimal application-oriented microlearning set so Phase-1 nudges have something to point at (P2 boil-lakes; <1 day CC). Logged.
- Deferred (to TODOS): full Content Creator workflow, HubSpot pull/push, case-study pipeline, pattern detection. Unchanged.
### 0E. Temporal interrogation
- Hour 1: participant invited, builds profile, watches intro. Fine.
- Hour 24–120: the 48h/5d windows fire. Risk: nothing compels a teacher to log; dashboards go empty. Risk: tz/weekend math pushes nudges to Saturday.
- Week 2–6: drift nudges. Risk: nudge fatigue / surveillance feeling.
- Month 6: if self-report didn’t flow, the whole metric story is empty (the 6-month regret both models name).
### 0F. Mode confirmation
SELECTIVE EXPANSION holds. The one decision that must go to the user before Design/Eng proceed: the v1 scope + measurement fork (premise gate below).
### Phase 1 dual voices — CEO consensus
CEO DUAL VOICES — CONSENSUS TABLE
═══════════════════════════════════════════════════════════════════
Dimension Claude Codex Consensus
────────────────────────────────────── ──────── ─────── ──────────
1. Premises valid? No(P1) No(P1) CONFIRMED (self-report premise weak)
2. Right problem to solve? Reframe Reframe CONFIRMED (leverage IS / make transfer observable)
3. Scope calibration correct? No No CONFIRMED (v1=Ph1+2 too big → Phase-1 pilot)
4. Alternatives explored? No No CONFIRMED (plan has no alternatives section)
5. Competitive/market risk covered? No(thin) No(LMS gravity) CONFIRMED (moat = IS relationship, under-leveraged)
6. 6-month trajectory sound? At risk At risk CONFIRMED (empty-dashboards regret)
═══════════════════════════════════════════════════════════════════
6/6 dimensions flagged by BOTH voices. Unusually strong agreement → these are User Challenges.
### CEO review sections (auto-decided issues)
- S1 Problem/demand: No commercial thesis for why a school renews because time-to-apply dropped. Decision: add the thesis to the plan + validate with the pilot (P1 completeness). Logged.
- S2 Error & Rescue Registry: see table below.
- S3 Scope discipline: “NOT in scope” section added below. Examined; the plan already defers Phases 3–4 cleanly, and only the v1/Phase-2 boundary is contested.
- S4 Metrics/proxy skepticism: “applied within 48h” is a proxy that can drift from “improved practice.” Decision: add a quality signal (IS spot-observation or a light rubric) so the proxy is checked, not trusted (P1). → folded into User Challenge.
- S5 People sequencing: the IS is the highest-leverage actor and is under-used as an evidence source. → User Challenge (measurement).
- S6 Content dependency: intervention layer needs microlearning to point at. Decision: seed a minimal library in Phase 1 (auto-approved expansion). Logged.
- S7 LMS-gravity check: certificates + locked resources risk pulling back toward completion incentives. Decision: keep them (they drive return-to-platform per knowledge tree) but do not let them become the success metric; success = application, not certificate issuance. Logged, no scope change.
- S8 Privacy vs coaching tension: examined; resolved by Eng phase (consent + logged reveal). No CEO-level change.
- S9 Observability: nudge-send success, empty-dashboard detection, unmarked-complete detection are first-class; pushed to Eng failure-modes registry.
- S10 Reversibility: nearly all decisions here are two-way doors (greenfield). The one one-way-ish door is the metric definition exported to clients: don’t publish a time-to-apply number externally until self-report drift is measured. Logged.
### Error & Rescue Registry (CEO-surfaced, refined in Eng)
| Failure | Trigger | Visible to | Rescue |
|---|---|---|---|
| Empty dashboards | Teachers don’t log applications | IS, PM | Pilot measures log-rate before external claims; IS-observation fallback |
| Stuck cohort, silent | IS never marks complete | PM | Stale “delivered-but-unmarked” escalation (an absence-trigger) |
| Nudge fatigue | Cascade + cap interaction | Participant, IS | Bounded 1–3 + weekly cap + participant self-snooze |
| False precision | LA context inaccurate/late | IS | Treat context as optional enrichment, never gate prep on it |
| Metric drift | Self-report ≠ practice quality | Company | Validate self-report against IS observation in pilot |
### NOT in scope (v1)
Full PM automation + pattern detection (Phase 3); HubSpot pull/push + case-study pipeline (Phase 4); Content Creator authoring workflow (post-Phase-4, blocked on working session); native mobile; SCORM/LMS import. Plus (pending premise gate) possibly the entire Phase 2 context layer.
### Phase 1 transition summary
Codex: 11 strategic concerns. Claude subagent: 6 findings. Consensus: 6/6 dimensions flagged by both → two User Challenges (v1 scope split; self-report→IS-evidence). Premise gate next.
Phase 2 — Design Review
In-scope v1 surfaces (post premise gate): Participant dashboard and Instructional Strategist dashboard (behavioral-core subset). Context Brief, intro video, Strategist Insights, pulse, and the LA/PM dashboards are Phase 2 — mark them “future” in any v1 build so the implementer doesn’t build Phase 2 by accident (Codex scope-leakage catch).
Phase 2 dual voices — design consensus (litmus scorecard)
DESIGN DUAL VOICES — CONSENSUS TABLE
═══════════════════════════════════════════════════════════════════
Dimension Claude Codex Consensus
────────────────────────────────────── ──────── ─────── ──────────
1. Information hierarchy serves user? Partial Partial CONFIRMED (Participant ok; IS tone wrong)
2. Interaction states specified? No(crit) No CONFIRMED (loading/empty/error/partial absent)
3. Empty/success state designed? No(crit) No(crit) CONFIRMED (IS "zero stuck" must read as success)
4. Specific UI vs generic patterns? Generic Generic CONFIRMED (no copy, no hierarchy spec)
5. Trust posture safe in UI? No(high) No(high) CONFIRMED (timer/nudge = trust-risk center)
6. Accessibility sufficient? No(high) — Claude-only (mobile-first, captions, contrast)
7. Scope correct for v1? — No(med) Codex-only (Phase-2 widgets leaking in)
═══════════════════════════════════════════════════════════════════
5/7 CONFIRMED by both. Dims 6,7 single-voice → flagged regardless.
7-pass design scorecard (auto-decided, P5 explicit + P1 completeness)
| Pass | Dimension | Score | Auto-decision (principle) |
|---|---|---|---|
| 1 | Information hierarchy | 6/10 | Participant (phase + one next action) is right. Fix: IS dashboard top-level = “which cohorts need support” + the support action; individual reveal behind a logged coaching action — never a “stuck participants” roster (P5). |
| 2 | Interaction states | 3/10 | Fix (critical): add a states matrix per in-scope dashboard — loading / empty / partial / error / success — before build (P1). |
| 3 | Empty/success states | 2/10 | Fix (critical): the IS “zero stuck” state is the GOOD state — design it as “All active cohorts on track” with secondary actions, not an empty queue (P1). |
| 4 | Specificity | 4/10 | Fix: spec microcopy, screen ordering, reveal affordances, nudge copy, evidence-state visuals. In this product the wording IS the design (P5). |
| 5 | Trust posture (nudge/timer) | 4/10 | Fix (high): support language not deadline language — “first-try window” not “48h countdown”; “not logged yet” not “delayed”; per-branch nudge tone (warm for attempted-failed); add participant self-snooze (“not this week”); no red/overdue badges, no rankings, no watchlists (P1+P5). |
| 6 | Accessibility | 4/10 | Fix: commit to mobile-first responsive web (teachers on phones between classes); TTS on nudges + reflection prompts, not just intro text; captions on the IS intro video (Phase 2, ADA); WCAG contrast on the locked/greyed state; keyboard/SR paths (P1). |
| 7 | Consent in the flow | 5/10 | Fix: mark-complete (IS action) silently starts the participant’s clock — open the application window TO the participant as an invitation, not a back-office trigger (P5, design-for-trust). |
Design — what was examined, nothing flagged
- Color/shape language: the flowchart legend (blue phase / green step / purple new / orange nudge) is a coherent inherited system — no change needed for v1.
- Certificate gating + locked-resources “visible-but-locked”: examined; the pattern is sound and drives return-to-platform. The only risk (LMS-gravity) is a CEO-phase metric concern, not a UI defect.
Design taste decisions → gate
- T1 (nudge tone calibration): how soft is too soft? Recommend warm-but-clear with a self-snooze; reasonable people could prefer more or less hand-holding. → final gate.
Phase 2 transition summary
Codex: 5 design concerns. Claude subagent: 6 findings. Consensus: 5/7 confirmed by both, 2 single-voice flagged. All structural fixes auto-decided (states matrix, success states, trust-safe microcopy, consent reframe). 1 taste decision (nudge tone) → gate. Passing to Phase 3 (Eng).
Phase 3 — Eng Review + Dual Voices
What already exists / leverage (greenfield)
No prior code. Reuse libraries, don’t hand-roll: Supabase (auth + Postgres + RLS + pg_cron), a durable job table + FOR UPDATE SKIP LOCKED worker (don’t invent a scheduler), a transactional email/in-app provider. DRY check passes (no codebase to duplicate).
Phase 3 dual voices — eng consensus
ENG DUAL VOICES — CONSENSUS TABLE
═══════════════════════════════════════════════════════════════════
Dimension Claude Codex Consensus
────────────────────────────────────── ──────── ─────── ──────────
1. Architecture / data model sound? No(crit) No(crit) CONFIRMED (metric backbone missing joins)
2. Test coverage sufficient? No No CONFIRMED (scheduler/privacy/tz untested)
3. Fire-on-absence implementable? No(crit) No(crit) CONFIRMED (need durable due-job table)
4. Privacy enforceable at DB? No(crit) No(crit) CONFIRMED (RLS + k-anon + logged reveal)
5. Timezone/window correct? No(high) No(crit) CONFIRMED (school tz + calendar)
6. Critical-path decisions resolved? No(crit) No(crit) CONFIRMED (timer anchor deferred but blocking)
═══════════════════════════════════════════════════════════════════
6/6 flagged by BOTH. Exceptional agreement → all auto-decided (structural, one right answer).
Corrected architecture (supersedes the proposed entity list above)
┌──────────────────────────────────────────────┐
School ──< Engagement >── InstructionalStrategist │
│ │ │ │
│ │ └──< EngagementStrategy >── Strategy (NEW: mastery │
│ │ curves need │
│ │ a 1st-class │
│ │ Strategy) │
│ ├──< Cohort/CohortSnapshot >──< Participant + LearnerProfile
│ │ │ │
│ │ mark-complete ─────► ApplicationTrackingWindow (NEW) │
│ │ (records metric ANCHOR = engagement-end, audited; │
│ │ NOT the late button-click time) │
│ │ │ │
│ │ ├─► enqueue intervention_job × {24h,48h,5d} │
│ │ │ (NEW: due_at, status, idempotency_key, │
│ │ │ locked_at, attempt_count, suppressed_reason)│
│ │ └─► survey + certificate eligibility │
│ │ │
ActionPlan ──┴──< ApplicationEvent (FK: action_plan_id NEW, │
(what/when/whom) strategy_id→Strategy, evidence_type ∈ │
{self_report, artifact, observation}, │
attempt_number via unique constraint OR │
ROW_NUMBER() — no stored race) │
│
Worker: SELECT … FROM intervention_job WHERE due_at<=now() │
AND status='pending' FOR UPDATE SKIP LOCKED │
→ re-check absence in same txn → insert nudge_delivery │
(idempotency_key) → decrement weekly-cap ledger (txn) │
PRIVACY: IS/LA roles DENIED direct SELECT on pulse/application rows; │
only SECURITY DEFINER aggregate views w/ HAVING count(distinct│
participant)>=5 (k-anon); individual reveal = logged function │
writing individual_data_reveal in the same txn. │
└──────────────────────────────────────────────┘
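The k-anonymity rule in the PRIVACY note (only aggregates with at least 5 distinct participants are exposed) can be stated as a small reference function. The sketch below is an in-memory illustration in TypeScript, not the database enforcement the diagram calls for; it could serve as a test oracle for the real aggregate views. The `PulseRow` and `CohortAggregate` shapes and the `aggregatePulse` name are hypothetical.

```typescript
// Sketch only: mirrors HAVING count(distinct participant) >= 5 in application code.
interface PulseRow {
  cohortId: string;
  participantId: string;
  score: number;
}

interface CohortAggregate {
  cohortId: string;
  participants: number;
  meanScore: number;
}

const K_MIN = 5; // threshold taken from the diagram's k-anon note

function aggregatePulse(rows: PulseRow[], k: number = K_MIN): CohortAggregate[] {
  const groups = new Map<string, { ids: Set<string>; sum: number; n: number }>();
  rows.forEach((r) => {
    let g = groups.get(r.cohortId);
    if (!g) {
      g = { ids: new Set<string>(), sum: 0, n: 0 };
      groups.set(r.cohortId, g);
    }
    g.ids.add(r.participantId);
    g.sum += r.score;
    g.n += 1;
  });
  const out: CohortAggregate[] = [];
  groups.forEach((g, cohortId) => {
    // Suppress the whole group when fewer than k distinct participants contributed.
    if (g.ids.size >= k) {
      out.push({ cohortId, participants: g.ids.size, meanScore: g.sum / g.n });
    }
  });
  return out;
}
```

Counting distinct participants rather than rows is the important detail: a cohort of three teachers who each answered twice has six rows but must still be suppressed. This sketch does not address differencing attacks across overlapping filters.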
Auto-decided architecture fixes (P5 explicit + P3 pragmatic)
| # | Fix | Severity | Principle |
|---|---|---|---|
| E1 | Add intervention_job (due_at, status, idempotency_key, locked_at, attempt_count, suppressed_reason); create txn-ally; process FOR UPDATE SKIP LOCKED; re-check absence before send. | critical | P5 |
| E2 | Enforce weekly nudge cap in Postgres via a delivery ledger keyed by participant + local week; overflow jobs rescheduled/suppressed by priority. | critical | P5 |
| E3 | Store school timezone + school calendar on engagement/window; compute due times in school-local, persist UTC due_at, roll sends out of weekends/after-hours; keep metric elapsed-time separate from send-time. | critical | P1 |
| E4 | Add Strategy + EngagementStrategy; add action_plan_id FK on ApplicationEvent; attempt_number via unique (participant, engagement, strategy, attempt_number) or read-time ROW_NUMBER(). | critical | P5 |
| E5 | Privacy at DB: deny IS/LA direct SELECT on individual pulse/application rows; aggregate-only SECURITY DEFINER views with k-anonymity (k≥5) + anti-differencing; reveal = logged Postgres function writing individual_data_reveal in-txn. | critical | P1 |
| E6 | Add ApplicationTrackingWindow; timer anchor = engagement-end time (audited), not the late mark-complete click. Nudge due-dates may respect ActionPlan.when; time-to-apply must not. Resolves the deferred timer-anchor on the Phase-1 critical path. | critical | P5 |
| E7 | Stale-engagement detector: “delivered but never marked complete” is itself an absence-trigger so a cohort can’t go silently dead. | high | P1 |
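Fixes E1 and E3 together imply a planning step that turns one tracking-window anchor into three due jobs, each with a UTC `due_at` and a stable idempotency key, and with send times rolled off weekends in the school's timezone. The sketch below illustrates that step under stated simplifications: it handles only Saturday and Sunday (no school calendar, no after-hours logic), and it rolls forward in 24-hour steps, which can shift the local send time by an hour across a DST change. The names `planNudgeJobs`, `rollOutOfWeekend`, and the key format are hypothetical.

```typescript
// Sketch only: weekend roll via Intl.DateTimeFormat; holidays and after-hours are omitted.
interface InterventionJob {
  idempotencyKey: string;
  dueAt: Date; // an instant; would be persisted as UTC due_at
  status: "pending";
  attemptCount: number;
}

const HOUR = 60 * 60 * 1000;
const OFFSETS_HOURS = [24, 48, 120]; // 24h / 48h / 5d cascade

function isWeekend(instant: Date, timeZone: string): boolean {
  const day = new Intl.DateTimeFormat("en-US", { timeZone, weekday: "short" }).format(instant);
  return day === "Sat" || day === "Sun";
}

function rollOutOfWeekend(instant: Date, timeZone: string): Date {
  let candidate = instant;
  while (isWeekend(candidate, timeZone)) {
    candidate = new Date(candidate.getTime() + 24 * HOUR);
  }
  return candidate;
}

function planNudgeJobs(windowId: string, anchor: Date, timeZone: string): InterventionJob[] {
  return OFFSETS_HOURS.map((h) => ({
    // Same window + offset always yields the same key, so a retried enqueue cannot duplicate.
    idempotencyKey: `${windowId}:${h}h`,
    dueAt: rollOutOfWeekend(new Date(anchor.getTime() + h * HOUR), timeZone),
    status: "pending" as const,
    attemptCount: 0,
  }));
}
```

Only the send time is rolled; the anchor itself is left untouched, which keeps the time-to-apply metric independent of when nudges happen to go out.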
Failure modes registry
| Failure | Detection | Visible to | Mitigation | Tested? |
|---|---|---|---|---|
| Missed nudge (worker down) | due_at<=now() catch-up query | ops alarm | catch-up on restart, not “this-minute” firing | test #2 |
| Double-fire on retry | idempotency_key unique | — | dedup insert | test #2 |
| Cap blown by burst | ledger count per local week | participant | txn cap decrement + priority suppression | test #3 |
| Saturday nudge | school calendar check | participant | roll to next school day | test #4 |
| Silent dead cohort | unmarked-complete detector | PM | escalation prompt | test #5 |
| Individual data leak | RLS denies base SELECT | — | k-anon view + logged reveal | test #7 |
| attempt_number race | unique constraint | — | constraint fails loudly / ROW_NUMBER | test #8 |
| Half mark-complete | fan-out idempotency | IS | transactional or retryable effects | test #1 |
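Several of these failure modes (missed nudge, double-fire, cap blown by burst) come down to what a single worker pass does with overdue jobs. The following is an in-memory, single-threaded model of that pass, intended as a reasoning aid rather than the Postgres implementation: in the real system the claim would use row locking and the whole pass would run inside one transaction. The `Job` and `WorkerState` shapes, the `runTick` name, and the cap value passed in are all hypothetical, and the ledger here tracks only the current week.

```typescript
// Sketch only: models catch-up, absence re-check, dedup, and the weekly cap in memory.
type SuppressedReason = "already_applied" | "duplicate" | "weekly_cap";

interface Job {
  idempotencyKey: string;
  participantId: string;
  dueAt: Date;
  status: "pending" | "sent" | "suppressed";
  suppressedReason?: SuppressedReason;
}

interface WorkerState {
  delivered: Set<string>; // idempotency keys already sent
  sentThisWeek: Map<string, number>; // participantId -> nudges sent this week
  applied: Set<string>; // participants who have logged an application
}

function suppress(job: Job, reason: SuppressedReason): void {
  job.status = "suppressed";
  job.suppressedReason = reason;
}

// Returns the idempotency keys of nudges sent during this pass.
function runTick(jobs: Job[], state: WorkerState, now: Date, weeklyCap: number): string[] {
  const sent: string[] = [];
  jobs.forEach((job) => {
    // Catch-up semantics: anything overdue is eligible, not just jobs due this minute.
    if (job.status !== "pending" || job.dueAt.getTime() > now.getTime()) return;
    // Re-check absence: the nudge is pointless if the behavior already happened.
    if (state.applied.has(job.participantId)) return suppress(job, "already_applied");
    if (state.delivered.has(job.idempotencyKey)) return suppress(job, "duplicate");
    const count = state.sentThisWeek.get(job.participantId) || 0;
    if (count >= weeklyCap) return suppress(job, "weekly_cap");
    state.delivered.add(job.idempotencyKey);
    state.sentThisWeek.set(job.participantId, count + 1);
    job.status = "sent";
    sent.push(job.idempotencyKey);
  });
  return sent;
}
```

Jobs are processed in array order here; the plan's priority-based suppression of overflow jobs would need an explicit sort before the loop.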
NOT in scope (Eng, v1)
Phase 2 context-layer tables (SchoolContextProfile versioning + pinning, PulseResponse, IntroVideo) — schema-stubbed only; their RLS policies designed now (E5) but UIs deferred. Phase 3/4 entirely out.
Test plan
Written to disk: ~/.gstack/projects/inspired-applied-learninh/brandon-main-test-plan-20260527-100703.md (test diagram + 8 critical cases + unit/integration/E2E breakdown).
Phase 3 transition summary
Codex: 6 findings (5 critical, 1 high). Claude subagent: ~12 findings across 6 areas. Consensus: 6/6 dimensions confirmed by both. All auto-decided (E1–E7 are structural). No eng taste decisions — the answers are determined. Passing to Final Gate.
Cross-Phase Themes (flagged independently in 2+ phases — high-confidence)
- “Applied” ≠ “improved practice.” CEO (metric is a proxy that can drift) + Eng (timer anchor must be engagement-end, not button-click; add observation evidence) + Design (a 48h countdown reads as a test). All three phases. → addressed by premise gate (IS evidence) + E6.
- The “not evaluative” trust posture is fragile and crosses the stack. Design (nudge/timer microcopy, no individual watchlists) + Eng (RLS k-anonymity + logged reveal, not app-code promises) + CEO (patterns-vs-coaching tension). All three. → addressed by E5 + design fixes + a “consent + explicit norms” note.
- Absence is the core mechanic and the core risk. CEO (empty-dashboards 6-month regret) + Design (empty/success states unspecified — the IS “zero stuck” state IS the product) + Eng (fire-on-absence needs a durable due-job table). All three. → addressed by states matrix + E1 + E7 + the pilot’s log-rate validation.
Decision Audit Trail
| # | Phase | Decision | Class | Principle | Rationale |
|---|---|---|---|---|---|
| 1 | CEO | Mode = SELECTIVE EXPANSION | Mechanical | override | autoplan default |
| 2 | CEO | Run Codex + Claude subagent | Mechanical | P6 | dual voices |
| 3 | CEO | v1 → Phase-1 pilot + IS evidence | User Challenge | — | both models; user confirmed at premise gate |
| 4 | CEO | Accept premises 2,4,5 as-is | Mechanical | P6 | low-risk, sound |
| 5 | CEO | Seed minimal microlearning in v1 | Auto (expansion) | P2 | in blast radius, <1d CC; nudges need targets |
| 6 | CEO | Keep certificate/locked-resources but not as success metric | Auto | P5 | drives return; success = application |
| 7 | CEO | Don’t publish time-to-apply externally until drift measured | Auto | P7 | one-way door (proxy skepticism) |
| 8 | Design | States matrix per dashboard required | Auto | P1 | empty/success states ARE the product |
| 9 | Design | IS top-level = cohort patterns, individual behind logged reveal | Auto | P5 | trust posture; no watchlists |
| 10 | Design | Support microcopy not deadline copy; add participant self-snooze | Auto | P1+P5 | trust; nudge fatigue |
| 11 | Design | Mark-complete opens window AS invitation to participant | Auto | P5 | consent / design-for-trust |
| 12 | Design | Commit mobile-first; TTS on nudges; video captions; contrast | Auto | P1 | accessibility (teachers on phones) |
| 13 | Design | T1 nudge-tone calibration | Taste | — | → final gate |
| 14 | Eng | E1 durable intervention_job table + SKIP LOCKED + re-check | Auto | P5 | fire-on-absence needs obligations, not just a log |
| 15 | Eng | E2 weekly cap enforced in Postgres ledger | Auto | P5 | survives catch-up burst |
| 16 | Eng | E3 school tz + calendar; UTC due_at; roll out of weekends | Auto | P1 | no Saturday nudges; metric≠send time |
| 17 | Eng | E4 add Strategy + EngagementStrategy + action_plan_id; fix attempt_number | Auto | P5 | mastery curves need a Strategy entity |
| 18 | Eng | E5 RLS deny + k-anon views + logged reveal | Auto | P1 | privacy enforced at DB not app |
| 19 | Eng | E6 timer anchor = engagement-end (audited), not button-click | Auto | P5 | both models; resolves blocking open question |
| 20 | Eng | E7 stale unmarked-complete detector | Auto | P1 | no silently-dead cohorts |