Infrastructure Adoption - Week 1 Plan
Make infrastructure usage HABITUAL for all 22 orchestrators + 32 personas.
We deployed mandatory infrastructure on 2026-02-03. This plan ensures agents actually USE it in their daily work.
What We Deployed
Already deployed:
- ✅ Incremental sync (90-95% fewer API calls)
- ✅ Rate limiting (80% safety margin, zero errors)
- ✅ WPEngine metrics (228 sites, every 2 hours)
- ✅ Sync profiles (10 profiles for all use cases)
Needs manual deployment:
- ⚠️ Memory system Supabase tables (ready, needs Supabase SQL - 10-15 min)
- ⚠️ WPEngine alerts (ready, needs SSL investigation - 40 min)
Total deployment time remaining: 50-55 minutes
Day 1: Complete Manual Steps + First Usage
Morning (1-2 hours)
Task 1: Complete Deployment (65-75 min, including 1Password setup)
Follow these guides in order:
- Memory System Deployment (10-15 min)
  - Guide: docs/planning/MANUAL-DEPLOY-memory-system-2026-02-03.md
  - Apply 2 SQL migrations in Supabase
  - Test with quick-test script
- 1Password Authentication (15-20 min)
  - Guide: docs/planning/1password-sa-setup.md
  - Create service account: fp-tech-automation
  - Grant Engineering vault access
  - Update .env file
- WPEngine Alerts (40 min)
  - Guide: docs/planning/NEXT-STEPS-wpengine-ssl-2026-02-03.md
  - Test SSL filtering (should show <10 alerts)
  - Deploy to cron (every 15 min)
  - Monitor first run in Slack #tech-alerts
Checkpoint: All systems operational ✅
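The "Deploy to cron (every 15 min)" step might look roughly like the sketch below. The script path is a hypothetical placeholder (the real entry point is whatever the WPEngine guide specifies); the log path is the one used by the monitoring checks later in this plan. The snippet only prints the crontab line for review and installs nothing.

```shell
#!/usr/bin/env bash
# Sketch only: builds a crontab line for the WPEngine alert job.
# ASSUMPTION: the alert script path is hypothetical; use the path given in
# docs/planning/NEXT-STEPS-wpengine-ssl-2026-02-03.md.

# cron_line SCRIPT LOGFILE -> prints a crontab entry that runs every 15 min
cron_line() {
  local script="$1" logfile="$2"
  # "*/15 * * * *" = at minutes 0, 15, 30 and 45 of every hour
  printf '*/15 * * * * %s >> %s 2>&1\n' "$script" "$logfile"
}

# Print the entry for review; nothing is installed by this snippet.
cron_line "$HOME/GitHub/flypilot/scripts/wpengine-alerts.sh" \
          "$HOME/logs/flypilot-wpengine-alerts.log"
```

Once the printed line looks right, it can be pasted in via `crontab -e`.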
Afternoon (2-3 hours)
Task 2: First Real Usage - Demonstrate the Workflow
Pick a REAL task and demonstrate the full mandatory workflow.
Example Task: "Add backup monitoring dashboard to fp-interface"
# Step 1: SEARCH FOR RELATED DECISIONS (MANDATORY)
cd ~/GitHub/flypilot
grep -ril "dashboard" docs/planning/decisions/
grep -ril "backup" docs/planning/decisions/
grep -ril "monitoring" docs/planning/decisions/
grep -ril "wpengine" docs/planning/decisions/
# Step 2: CHECK HANDOFFS
ls handoffs/
# Step 3: RECORD ARCHITECTURAL DECISION
# Unquoted EOF so that $(date ...) expands inside the file body
cat > docs/planning/decisions/$(date +%Y-%m-%d)-use-wpengine-backup-status-view.md << EOF
# Decision: Use v_wpengine_backup_status view for dashboard
**Date:** $(date +%Y-%m-%d)
**Status:** Active
**Impact:** medium
**Category:** architecture
**Department:** fp
## Decision
Use the existing v_wpengine_backup_status Supabase view for the backup
monitoring dashboard in fp-interface.
## Rationale
View already exists from WPEngine Phase 1. Provides unified backup status
across 228 sites without additional query complexity.
## Alternatives Considered
- **Query wpengine_metrics directly:** More complex query, duplicates view logic
- **Build custom aggregation:** Unnecessary given the view already exists
EOF
# Step 4: Do the Work
# ... implement the feature ...
# Step 5: DOCUMENT IMPORTANT FINDINGS
echo "- $(date +%Y-%m-%d): v_wpengine_backup_status view has 228 sites, updated every 2 hours via Mac Studio cron — tags: wpengine,dashboard,monitoring,backup" >> docs/planning/research-notes.md
echo "- $(date +%Y-%m-%d): WPEngine health views use 0-100 score calculation: backup=40%, ssl=30%, disk=20%, bandwidth=10% — tags: wpengine,monitoring,health-score" >> docs/planning/research-notes.md
# Step 6: Verify Decisions Were Recorded
ls -lt docs/planning/decisions/ | head -5
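Steps 1 and 3 above could be wrapped in two small helpers so the search runs as one command and the filename always carries today's date. This is a minimal sketch: `search_decisions` and `new_decision_path` are hypothetical names for illustration, not part of the deployed infrastructure.

```shell
#!/usr/bin/env bash
# Sketch only: wraps Steps 1 and 3 of the workflow above.
# ASSUMPTION: both function names are hypothetical helpers.

# search_decisions DIR KEYWORD... -> lists decision files matching any keyword
search_decisions() {
  local dir="$1"; shift
  local pattern
  # Join the keywords with "|" to form one extended-regex alternation
  pattern="$(IFS='|'; echo "$*")"
  # -r recurse, -i ignore case, -l print file names only, -E extended regex
  grep -rilE -- "$pattern" "$dir"
}

# new_decision_path DIR SLUG -> prints DIR/YYYY-MM-DD-SLUG.md (today's date)
new_decision_path() {
  local dir="$1" slug="$2"
  printf '%s/%s-%s.md\n' "${dir%/}" "$(date +%Y-%m-%d)" "$slug"
}
```

For the example task, the search would be `search_decisions docs/planning/decisions dashboard backup monitoring wpengine`, which replaces the four separate grep calls. Like plain grep, it exits non-zero when nothing matches.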
Document This Example
Create: docs/planning/infrastructure-workflow-example.md
Include:
- Full command sequence from above
- When to record decisions (any choice between 2+ alternatives)
- When to document findings (key facts, non-obvious insights)
- When to search first (ALWAYS before deciding)
This becomes the reference example for all other agents to follow.
Day 2-3: Agent Training
Goal: Ensure all orchestrators know about the new workflows
Training Protocol (For Each Orchestrator)
Before Training:
- Start session in tmux: tmux new -s training-[role]
- Load their role file: cat docs/planning/agent-roles/[role].md
- Review their "Before Every Session" checklist
During Training:
- Pick a real task from their backlog (not a fake task)
- Walk through the mandatory workflow:
  - Search for related decisions with grep
  - Check handoffs for ongoing context
  - Record decisions made during work as markdown files
  - Document key findings in research notes
- Observe what worked / what was confusing
- Note any workflow improvements needed
After Training:
- Document in training log
- If issues found, update workflows or documentation
- Celebrate successful adoption
Recommended Training Schedule
Day 2 Orchestrators:
- FP-Infrastructure (infrastructure role)
  - Task: "Investigate Mac Studio backup automation"
  - Focus: Decision recording for infrastructure choices
- OMG-Accounts (client role)
  - Task: "Audit OMG client Toggl time entries"
  - Focus: Using sync profiles instead of custom API calls
- FP-Research (research role)
  - Task: "Compare LLM context window offerings"
  - Focus: Documenting research findings in notes files
Day 3 Orchestrators:
- FP-Core (development role)
  - Task: "Refactor sync rate limiting"
  - Focus: Full workflow (search → decide → record → document)
- LRND-PM (project management role)
  - Task: "Generate weekly project status report"
  - Focus: Using sync profiles for data access
- All-Accounts (accounts director)
  - Task: "Review client time allocation patterns"
  - Focus: Cross-client decision coordination
Day 4-5: Monitor Adoption
Goal: Ensure infrastructure is being used, not bypassed
Daily Checks
Decision Usage:
cd ~/GitHub/flypilot
# Should see 2-5 new decision files per day
ls -lt docs/planning/decisions/ | head -10
# Count decision files added or modified in the last 7 days
find docs/planning/decisions/ -name '*.md' -mtime -7 | wc -l
# Target by end of week: 10+ decisions
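To compare against the "2-5 new decision files per day" expectation, the date prefix in each filename can be tallied. The sketch below assumes every decision file follows the YYYY-MM-DD-slug.md naming from the example workflow; `decisions_per_day` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: counts decision files per day from the YYYY-MM-DD filename
# prefix. ASSUMPTION: files without a date prefix are not decisions and
# are ignored.

# decisions_per_day DIR -> prints "<count> <date>" lines, oldest first
decisions_per_day() {
  local dir="$1"
  ls "$dir" \
    | grep -oE '^[0-9]{4}-[0-9]{2}-[0-9]{2}' \
    | sort \
    | uniq -c \
    | awk '{ print $1, $2 }'
}
```

Running `decisions_per_day docs/planning/decisions/` prints one line per day, which makes a quiet day easy to spot.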
Sync Operations:
# Should see cron jobs running
tail -100 ~/logs/flypilot-sync.log | grep "Rate Limit Statistics"
# Target: Zero rate limit errors
grep "Rate limit hit" ~/logs/flypilot-sync.log
Research Notes:
# Should see findings being documented
tail -20 docs/planning/research-notes.md
# Target: 10+ entries stored (one "- " bullet per entry)
grep -c '^- ' docs/planning/research-notes.md
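Notes are only useful if they can be found again. Since each entry in the example workflow ends with a comma-separated "tags:" list, a lookup by tag could be sketched as below. `notes_with_tag` is a hypothetical helper, and it assumes the tag list is the last thing on the line with no spaces between tags.

```shell
#!/usr/bin/env bash
# Sketch only: finds research-note entries by tag.
# ASSUMPTION: each entry ends with "tags: a,b,c" (no spaces between tags).

# notes_with_tag FILE TAG -> prints note lines whose tag list contains TAG
notes_with_tag() {
  local file="$1" tag="$2"
  # Match TAG as a whole comma-separated item after "tags: ", so that
  # "health" does not match "health-score"
  grep -E -- "tags: ([^,]+,)*${tag}(,|\$)" "$file"
}
```

For example, `notes_with_tag docs/planning/research-notes.md wpengine` would list every WPEngine finding recorded so far.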
WPEngine Alerts:
# Should see alerts running every 15 min
tail -100 ~/logs/flypilot-wpengine-alerts.log
# Target: less than 10 SSL alerts per run, no backup/disk critical alerts
grep -c "Critical" ~/logs/flypilot-wpengine-alerts.log
Red Flags to Watch For
❌ No decisions being recorded
- Issue: Agents bypassing workflow
- Fix: Re-emphasize mandatory nature, add to role checklists
❌ Custom API clients appearing
- Issue: Agents not using sync profiles
- Fix: Remove custom clients, redirect to sync profiles
❌ Rate limit errors
- Issue: Rate limiting not being used
- Fix: Review sync profile usage, adjust limits if needed
❌ Empty research notes
- Issue: Research not being preserved
- Fix: Emphasize documentation in research role
❌ Infrastructure bypasses in code
- Issue: Developers building custom solutions
- Fix: Code review enforcement, update contribution guidelines
Green Lights to Celebrate
✅ Decisions growing steadily
- Indicator: Cross-orchestrator learning happening
- Celebrate: Share interesting decisions in Slack
✅ Zero rate errors
- Indicator: Rate limiting working perfectly
- Celebrate: Highlight performance improvement
✅ Research notes being reused
- Indicator: Reduced duplicate research
- Celebrate: Show time savings from finding an answer already in notes
✅ Slack alerts working
- Indicator: Proactive monitoring operational
- Celebrate: Share first caught issue that was auto-detected
✅ Sync performance improvements
- Indicator: Incremental sync reducing API calls 90%+
- Celebrate: Calculate API cost savings
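The savings calculation is simple integer arithmetic once a baseline call count is known. The sketch below uses a baseline of 10000 calls purely as a hypothetical illustration; the real figure would come from the sync logs before incremental sync was enabled. The function names are also hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: API calls remaining and avoided for a given reduction.
# ASSUMPTION: the 10000-call baseline is an illustrative number, not a
# measured value.

# calls_after BASELINE REDUCTION_PCT -> API calls still made after the reduction
calls_after() {
  local baseline="$1" pct="$2"
  echo $(( baseline * (100 - pct) / 100 ))
}

# calls_saved BASELINE REDUCTION_PCT -> API calls avoided
calls_saved() {
  local baseline="$1" pct="$2"
  echo $(( baseline - baseline * (100 - pct) / 100 ))
}

calls_after 10000 90   # prints 1000
calls_saved 10000 95   # prints 9500
```

Multiplying the saved-call count by a per-call price, where one applies, gives the cost figure to share.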
Week 1 Success Metrics
By end of Week 1 (2026-02-10), we should see:
| Metric | Target | Measurement | Current |
|---|---|---|---|
| Decisions recorded | 10+ | ls docs/planning/decisions/ | 0 |
| Orchestrators trained | 10+ | Training log | 0 |
| Sync operations | Zero errors | grep "error" ~/logs/flypilot-sync.log | - |
| Research notes | 10+ entries | wc -l docs/planning/research-notes.md | 0 |
| WPEngine alerts | <10 per run | Alert logs | - |
| Infrastructure bypasses | 0 | Code reviews | - |
| Rate limit hits | 0 | Sync logs | ✅ 0 |
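The three file-based rows of the table (decisions, research notes, rate limit hits) can be checked in one pass. This is a sketch under stated assumptions: `check_week1` is a hypothetical helper, research notes are counted as one "- " bullet per entry, and a missing file is treated as a count of 0.

```shell
#!/usr/bin/env bash
# Sketch only: evaluates the file-based Week 1 targets.
# ASSUMPTIONS: one "- " bullet per research note; a missing notes file or
# log counts as 0; all paths are supplied by the caller.

# check_week1 DECISIONS_DIR NOTES_FILE SYNC_LOG
# Prints PASS/FAIL per metric; returns non-zero if any target is missed.
check_week1() {
  local dir="$1" notes="$2" log="$3" failed=0
  local decisions entries hits

  # Decision files: one markdown file per decision
  decisions=$(find "$dir" -maxdepth 1 -type f -name '*.md' 2>/dev/null | wc -l | tr -d ' ')
  # grep -c exits 1 on zero matches, so "|| true" keeps the count usable
  entries=$(grep -c '^- ' "$notes" 2>/dev/null || true)
  hits=$(grep -c 'Rate limit hit' "$log" 2>/dev/null || true)
  entries=${entries:-0}
  hits=${hits:-0}

  if [ "$decisions" -ge 10 ]; then echo "PASS decisions=$decisions (target 10+)"
  else echo "FAIL decisions=$decisions (target 10+)"; failed=1; fi

  if [ "$entries" -ge 10 ]; then echo "PASS research_notes=$entries (target 10+)"
  else echo "FAIL research_notes=$entries (target 10+)"; failed=1; fi

  if [ "$hits" -eq 0 ]; then echo "PASS rate_limit_hits=$hits (target 0)"
  else echo "FAIL rate_limit_hits=$hits (target 0)"; failed=1; fi

  return "$failed"
}
```

From the repository root this would be called as `check_week1 docs/planning/decisions docs/planning/research-notes.md ~/logs/flypilot-sync.log`. The remaining rows (training log, alert volume, code review) still need a manual check.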
Common Pitfalls to Avoid
Pitfall 1: "Too Busy to Follow Checklist"
- Symptom: Agents skip decision search to "save time"
- Reality: Spend 2x as long reinventing the wheel
- Solution: The checklist SAVES time by preventing duplicate work
Pitfall 2: "Decision Recording Feels Slow"
- Symptom: Agents skip recording decisions
- Reality: Explaining the same decision 5 times later takes longer
- Solution: Record once, reference forever
Pitfall 3: "Not Sure What to Record"
- Symptom: Agents only record "big" decisions
- Reality: Small decisions matter too (e.g., "use sync profile X for task Y")
- Solution: Record ANY choice between 2+ alternatives
Pitfall 4: "Forgetting to Search First"
- Symptom: Conflicting decisions across orchestrators
- Reality: Searching reveals past decisions and prevents conflicts
- Solution: Make it muscle memory: search before deciding
Pitfall 5: "Building Custom Solutions"
- Symptom: New API clients, custom sync scripts appear
- Reality: Infrastructure already exists and is production-ready
- Solution: Tool escalation protocol: use existing before building new
Week 2 Preview
Goals:
- Measure impact (API calls reduced, decisions reused, time saved)
- Refine workflows based on Week 1 feedback
- Expand to remaining orchestrators (all 22 total)
- Build decision library (target: 30+ decisions)
- Add infrastructure usage dashboards to fp-interface
Success Indicators:
- Agents proactively searching decisions without being reminded
- Infrastructure usage becomes automatic, not forced
- New orchestrators ask "what decisions have we made about X?"
- Cross-orchestrator learning evident in decision patterns
Remember
The infrastructure is MANDATORY, not optional.
The goal of Week 1 is to make it HABITUAL.
By Week 2, agents should:
- Search for decisions automatically before starting work
- Record decisions as markdown files without being reminded
- Document research findings by default
- Use sync profiles instead of custom API calls
- Refer to past decisions when making new choices
This is cultural change, not just technical deployment.
Week 1 is about building the habit. Week 2 is about measuring the impact.