Infrastructure Adoption - Week 1 Plan

Goal

Make infrastructure usage HABITUAL for all 22 orchestrators and 32 personas.

We deployed mandatory infrastructure on 2026-02-03. This plan ensures agents actually USE it in their daily work.

What We Deployed

Already deployed:

✅ Incremental sync (90-95% fewer API calls)
✅ Rate limiting (80% safety margin, zero errors)
✅ WPEngine metrics (228 sites, every 2 hours)
✅ Sync profiles (10 profiles for all use cases)

Needs manual deployment:

⚠️ Memory system Supabase tables (ready, needs 2 SQL migrations applied in Supabase - 10-15 min)
⚠️ WPEngine alerts (ready, needs SSL investigation - 40 min)

Total deployment time remaining: 50-55 minutes

Day 1: Complete Manual Steps + First Usage

Morning (1-2 hours)

Task 1: Complete Deployment (50-55 min, plus 15-20 min for 1Password setup)

Follow these guides in order:

1. Memory System Deployment (10-15 min)
   - Guide: docs/planning/MANUAL-DEPLOY-memory-system-2026-02-03.md
   - Apply 2 SQL migrations in Supabase
   - Test with quick-test script
2. 1Password Authentication (15-20 min)
   - Guide: docs/planning/1password-sa-setup.md
   - Create service account: fp-tech-automation
   - Grant Engineering vault access
   - Update .env file
3. WPEngine Alerts (40 min)
   - Guide: docs/planning/NEXT-STEPS-wpengine-ssl-2026-02-03.md
   - Test SSL filtering (should show <10 alerts)
   - Deploy to cron (every 15 min)
   - Monitor first run in Slack #tech-alerts
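
For the cron step, the schedule itself is the only fixed part: `*/15 * * * *` runs a job every 15 minutes. The sketch below only prints a crontab entry; `alert_cron_entry` is a hypothetical helper name, and the command path you pass in is whatever the WPEngine alerts guide specifies.

```shell
# Print a crontab entry that runs COMMAND every 15 minutes and appends
# both stdout and stderr to LOGFILE.
# Usage: alert_cron_entry COMMAND LOGFILE
alert_cron_entry() {
  printf '*/15 * * * * %s >> %s 2>&1\n' "$1" "$2"
}
```

To install it once, append the printed line to the existing crontab, for example `( crontab -l 2>/dev/null; alert_cron_entry /path/to/alert-script "$HOME/logs/flypilot-wpengine-alerts.log" ) | crontab -`, where `/path/to/alert-script` is a placeholder. Running that command twice adds the entry twice.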

Checkpoint: All systems operational ✅

Afternoon (2-3 hours)

Task 2: First Real Usage - Demonstrate the Workflow

Pick a REAL task and demonstrate the full mandatory workflow.

Example Task: "Add backup monitoring dashboard to fp-interface"

# Step 1: SEARCH FOR RELATED DECISIONS (MANDATORY)
cd ~/GitHub/flypilot
grep -ril "dashboard" docs/planning/decisions/
grep -ril "backup" docs/planning/decisions/
grep -ril "monitoring" docs/planning/decisions/
grep -ril "wpengine" docs/planning/decisions/

# Step 2: CHECK HANDOFFS
ls handoffs/

# Step 3: RECORD ARCHITECTURAL DECISION
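# The heredoc delimiter is unquoted so $(date ...) expands inside the record
# and the Date field always matches the date in the filename.
cat > docs/planning/decisions/$(date +%Y-%m-%d)-use-wpengine-backup-status-view.md << EOF
# Decision: Use v_wpengine_backup_status view for dashboard

**Date:** $(date +%Y-%m-%d)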
**Status:** Active
**Impact:** medium
**Category:** architecture
**Department:** fp

## Decision

Use the existing v_wpengine_backup_status Supabase view for the backup
monitoring dashboard in fp-interface.

## Rationale

View already exists from WPEngine Phase 1. Provides unified backup status
across 228 sites without additional query complexity.

## Alternatives Considered

- **Query wpengine_metrics directly:** More complex query, duplicates view logic
- **Build custom aggregation:** Unnecessary given the view already exists
EOF

# Step 4: Do the Work
# ... implement the feature ...

# Step 5: DOCUMENT IMPORTANT FINDINGS
echo "- $(date +%Y-%m-%d): v_wpengine_backup_status view has 228 sites, updated every 2 hours via Mac Studio cron — tags: wpengine,dashboard,monitoring,backup" >> docs/planning/research-notes.md
echo "- $(date +%Y-%m-%d): WPEngine health views use 0-100 score calculation: backup=40%, ssl=30%, disk=20%, bandwidth=10% — tags: wpengine,monitoring,health-score" >> docs/planning/research-notes.md

# Step 6: Verify Decisions Were Recorded
ls -lt docs/planning/decisions/ | head -5
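
The four searches in Step 1 can be folded into a single call. This is a minimal sketch, not an existing project command: `search_decisions` is a hypothetical name, and each keyword is treated as an extended regular expression, so keywords with special characters would need escaping.

```shell
# List decision files that mention any of the given keywords (case-insensitive).
# Usage: search_decisions DIR KEYWORD [KEYWORD...]
search_decisions() {
  if [ "$#" -lt 2 ]; then
    echo "usage: search_decisions DIR KEYWORD [KEYWORD...]" >&2
    return 2
  fi
  local dir=$1 pattern
  shift
  # Join the keywords with "|" to build one extended-regex alternation.
  pattern=$(printf '%s|' "$@")
  pattern=${pattern%|}
  # -r recurse, -i ignore case, -l print matching file names only.
  grep -rilE -- "$pattern" "$dir" | sort
}
```

From the repository root, `search_decisions docs/planning/decisions dashboard backup monitoring wpengine` would list each matching file once, even when it matches several keywords.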

Document This Example

Create: docs/planning/infrastructure-workflow-example.md

Include:

- Full command sequence from above
- When to record decisions (any choice between 2+ alternatives)
- When to document findings (key facts, non-obvious insights)
- When to search first (ALWAYS before deciding)

This becomes the reference example for all other agents to follow.
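
To keep decision recording quick, the record skeleton from Step 3 could be generated by a small helper. The sketch below assumes the same filename pattern and header fields as the example above; `new_decision` is a hypothetical name, and the `TODO` placeholders are meant to be filled in by hand.

```shell
# Create a dated decision record skeleton and print its path.
# Usage: new_decision DIR SLUG "TITLE"
new_decision() {
  local dir=$1 slug=$2 title=$3 today file
  today=$(date +%Y-%m-%d)
  file="$dir/$today-$slug.md"
  # Refuse to overwrite an existing record.
  if [ -e "$file" ]; then
    echo "already exists: $file" >&2
    return 1
  fi
  mkdir -p "$dir"
  # Unquoted heredoc delimiter so $title and $today expand.
  cat > "$file" << EOF
# Decision: $title

**Date:** $today
**Status:** Active
**Impact:** TODO
**Category:** TODO
**Department:** TODO

## Decision

TODO

## Rationale

TODO

## Alternatives Considered

- **TODO:** why it was not chosen
EOF
  printf '%s\n' "$file"
}
```

For example, `new_decision docs/planning/decisions use-wpengine-backup-status-view "Use v_wpengine_backup_status view for dashboard"` creates the file and prints its path so it can be opened in an editor.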

Day 2-3: Agent Training

Goal: Ensure all orchestrators know about the new workflows

Training Protocol (For Each Orchestrator)

Before Training:

1. Start session in tmux: tmux new -s training-[role]
2. Load their role file: cat docs/planning/agent-roles/[role].md
3. Review their "Before Every Session" checklist

During Training:

1. Pick a real task from their backlog (not a fake task)
2. Walk through the mandatory workflow:
   - Search for related decisions with grep
   - Check handoffs for ongoing context
   - Record decisions made during work as markdown files
   - Document key findings in research notes
3. Observe what worked and what was confusing
4. Note any workflow improvements needed

After Training:

1. Document in training log
2. If issues found, update workflows or documentation
3. Celebrate successful adoption
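
This plan does not fix a format or location for the training log, so the sketch below assumes a plain tab-separated text file. `log_training` and `trained_count` are hypothetical helpers; the second one yields the number behind the "Orchestrators trained" metric.

```shell
# Append one training session to a tab-separated log: date, role, task, notes.
# Usage: log_training LOGFILE ROLE "TASK" "NOTES"
log_training() {
  printf '%s\t%s\t%s\t%s\n' "$(date +%Y-%m-%d)" "$2" "$3" "$4" >> "$1"
}

# Count distinct roles that have at least one logged session.
# Usage: trained_count LOGFILE
trained_count() {
  cut -f2 "$1" | sort -u | wc -l | tr -d ' '
}
```

Counting distinct roles rather than lines means a follow-up session with the same orchestrator does not inflate the total.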

Recommended Training Schedule

Day 2 Orchestrators:

1. FP-Infrastructure (infrastructure role)
   - Task: "Investigate Mac Studio backup automation"
   - Focus: Decision recording for infrastructure choices
2. OMG-Accounts (client role)
   - Task: "Audit OMG client Toggl time entries"
   - Focus: Using sync profiles instead of custom API calls
3. FP-Research (research role)
   - Task: "Compare LLM context window offerings"
   - Focus: Documenting research findings in notes files

Day 3 Orchestrators:

1. FP-Core (development role)
   - Task: "Refactor sync rate limiting"
   - Focus: Full workflow (search → decide → record → document)
2. LRND-PM (project management role)
   - Task: "Generate weekly project status report"
   - Focus: Using sync profiles for data access
3. All-Accounts (accounts director)
   - Task: "Review client time allocation patterns"
   - Focus: Cross-client decision coordination

Day 4-5: Monitor Adoption

Goal: Ensure infrastructure is being used, not bypassed

Daily Checks

Decision Usage:

cd ~/GitHub/flypilot

# Should see 2-5 new decision files per day
ls -lt docs/planning/decisions/ | head -10

# Count decisions recorded in the last 7 days
find docs/planning/decisions/ -name '*.md' -mtime -7 | wc -l

# Target by end of week: 10+ decisions
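
Because decision files are named with a `YYYY-MM-DD-` prefix, the "2-5 new decision files per day" expectation can be checked from the filenames alone. A minimal sketch, assuming every record follows that naming pattern; `decisions_per_day` is a hypothetical helper name.

```shell
# Tally decision records per day using the YYYY-MM-DD filename prefix.
# Prints one "DATE COUNT" line per day, oldest first.
# Usage: decisions_per_day DIR
decisions_per_day() {
  local dir=$1
  # Keep only names that start with a date, cut the date out, and count.
  ls "$dir" \
    | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}-' \
    | cut -c1-10 \
    | sort \
    | uniq -c \
    | awk '{ print $2, $1 }'
}
```

Files without a date prefix are ignored, so a README or index file in the same directory does not skew the tally.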

Sync Operations:

# Should see cron jobs running
tail -100 ~/logs/flypilot-sync.log | grep "Rate Limit Statistics"

# Target: Zero rate limit errors
grep "Rate limit hit" ~/logs/flypilot-sync.log

Research Notes:

# Should see findings being documented
tail -20 docs/planning/research-notes.md

# Target: 10+ entries stored (each entry is one line starting with "- ")
grep -c '^- ' docs/planning/research-notes.md
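
Notes are only reused if they can be found again. Assuming every entry ends with a comma-separated `tags:` list with no spaces after the commas, as in the Day 1 example, entries can be filtered by tag. `findings_by_tag` is a hypothetical helper, and the tag is interpolated into a regular expression, so it should contain only letters, digits, and hyphens.

```shell
# Print research-note entries that carry the given tag.
# Usage: findings_by_tag NOTES_FILE TAG
findings_by_tag() {
  local notes=$1 tag=$2
  # Match TAG as a whole item in the trailing "tags: a,b,c" list,
  # so "health" does not match "health-score".
  grep -E -- "tags: ([^,]+,)*${tag}(,|\$)" "$notes"
}
```

For example, `findings_by_tag docs/planning/research-notes.md wpengine` would print every finding tagged `wpengine`; the command exits non-zero when nothing matches.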

WPEngine Alerts:

# Should see alerts running every 15 min
tail -100 ~/logs/flypilot-wpengine-alerts.log

# Target: fewer than 10 SSL alerts per run, no backup/disk critical alerts
# Note: this counts "Critical" lines across the whole log, not per run
grep -c "Critical" ~/logs/flypilot-wpengine-alerts.log

Red Flags to Watch For

❌ No decisions being recorded

Issue: Agents bypassing workflow
Fix: Re-emphasize mandatory nature, add to role checklists

❌ Custom API clients appearing

Issue: Agents not using sync profiles
Fix: Remove custom clients, redirect to sync profiles

❌ Rate limit errors

Issue: Rate limiting not being used
Fix: Review sync profile usage, adjust limits if needed

❌ Empty research notes

Issue: Research not being preserved
Fix: Emphasize documentation in research role

❌ Infrastructure bypasses in code

Issue: Developers building custom solutions
Fix: Code review enforcement, update contribution guidelines

Green Lights to Celebrate

✅ Decisions growing steadily

Indicator: Cross-orchestrator learning happening
Celebrate: Share interesting decisions in Slack

✅ Zero rate errors

Indicator: Rate limiting working perfectly
Celebrate: Highlight performance improvement

✅ Research notes being reused

Indicator: Reduced duplicate research
Celebrate: Show time savings from finding an answer already in notes

✅ Slack alerts working

Indicator: Proactive monitoring operational
Celebrate: Share first caught issue that was auto-detected

✅ Sync performance improvements

Indicator: Incremental sync reducing API calls 90%+
Celebrate: Calculate API cost savings

Week 1 Success Metrics

By end of Week 1 (2026-02-10), we should see:

| Metric | Target | Measurement | Current |
|---|---|---|---|
| Decisions recorded | 10+ | `ls docs/planning/decisions/` | 0 |
| Orchestrators trained | 10+ | Training log | 0 |
| Sync operations | Zero errors | `grep "error" ~/logs/flypilot-sync.log` | - |
| Research notes | 10+ entries | `wc -l docs/planning/research-notes.md` | 0 |
| WPEngine alerts | <10 per run | Alert logs | - |
| Infrastructure bypasses | 0 | Code reviews | - |
| Rate limit hits | 0 | Sync logs | ✅ 0 |
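
The numeric rows of this table are either minimums (decisions, research notes) or maximums (rate limit hits, bypasses). A minimal sketch of turning a measured value into a pass/fail line follows; `check_metric` is a hypothetical helper, and the caller supplies the measured value from the commands listed in the table.

```shell
# Print PASS or FAIL for a metric with an integer minimum or maximum target.
# Usage: check_metric NAME ACTUAL min|max TARGET
check_metric() {
  local name=$1 actual=$2 kind=$3 target=$4 status=FAIL
  case $kind in
    min)
      if [ "$actual" -ge "$target" ]; then status=PASS; fi
      ;;
    max)
      if [ "$actual" -le "$target" ]; then status=PASS; fi
      ;;
    *)
      echo "unknown target kind: $kind" >&2
      return 2
      ;;
  esac
  printf '%-24s %6s  %s\n' "$name" "$actual" "$status"
}
```

For example, `check_metric "Rate limit hits" "$(grep -c "Rate limit hit" ~/logs/flypilot-sync.log)" max 0` reports PASS only while the log contains no such line.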

Common Pitfalls to Avoid

Pitfall 1: "Too Busy to Follow Checklist"

Symptom: Agents skip decision search to "save time"
Reality: Spend 2x as long reinventing the wheel
Solution: The checklist SAVES time by preventing duplicate work

Pitfall 2: "Decision Recording Feels Slow"

Symptom: Agents skip recording decisions
Reality: Explaining the same decision 5 times later takes longer
Solution: Record once, reference forever

Pitfall 3: "Not Sure What to Record"

Symptom: Agents only record "big" decisions
Reality: Small decisions matter too (e.g., "use sync profile X for task Y")
Solution: Record ANY choice between 2+ alternatives

Pitfall 4: "Forgetting to Search First"

Symptom: Conflicting decisions across orchestrators
Reality: Searching reveals past decisions and prevents conflicts
Solution: Make it muscle memory: search before deciding

Pitfall 5: "Building Custom Solutions"

Symptom: New API clients, custom sync scripts appear
Reality: Infrastructure already exists and is production-ready
Solution: Tool escalation protocol: use existing before building new

Week 2 Preview

Goals:

- Measure impact (API calls reduced, decisions reused, time saved)
- Refine workflows based on Week 1 feedback
- Expand to remaining orchestrators (all 22 total)
- Build decision library (target: 30+ decisions)
- Add infrastructure usage dashboards to fp-interface

Success Indicators:

- Agents proactively searching decisions without being reminded
- Infrastructure usage becomes automatic, not forced
- New orchestrators ask "what decisions have we made about X?"
- Cross-orchestrator learning evident in decision patterns

Remember

The Infrastructure is MANDATORY

The infrastructure is MANDATORY, not optional.

The goal of Week 1 is to make it HABITUAL.

By Week 2, agents should:

- Search for decisions automatically before starting work
- Record decisions as markdown files without being reminded
- Document research findings by default
- Use sync profiles instead of custom API calls
- Refer to past decisions when making new choices

This is cultural change, not just technical deployment.

Week 1 is about building the habit. Week 2 is about measuring the impact.
