Cherry-picking BMAD-METHOD: 3 patterns I stole for a personal AI agent (skip the ceremony, keep the memory)

Contents · 10 sections

TL;DR
The story: this was not planned research
Context: a team dev framework meets a solo personal agent
Head-to-head comparison
Pattern 1 adopted: Auto Next Action Hint
Pattern 2 adopted: Adversarial QA gate
Pattern 3 adopted: 3-layer TOML config
3 patterns I dropped
Reservations
Conclusion

BMAD-METHOD (46.6k stars, an Agile AI development framework) is well organized for software teams in regulated environments. But it has no persistent memory, and its 12-persona ceremony is heavy for a solo founder. After deep research I adopted 3 patterns (next-action hint, adversarial QA gate, 3-layer TOML config) and dropped 3 (persona handoff, no-memory design, enterprise token cost). Cross-pollinate frameworks; do not migrate.

Key points

BMAD-METHOD and JARVIS solve two different problems: BMad serves dev teams of 3-10 people in regulated environments that need an artifact audit trail; JARVIS serves one founder running polyglot, cross-domain ops (BOQ pricing, Drive automation, Zalo, finance, blog).
BMad has no persistent semantic memory: state lives in markdown plus sprint-status.yaml. That is an expensive tradeoff for a personal agent because cross-session learning is lost. JARVIS has ChromaDB, a knowledge graph, and an age vault, and will not trade those away.
3 BMad patterns have transferable value: the bmad-help orchestrator that suggests a next action at the end of a workflow, the adversarial QA gate before a phase handoff, and 3-layer TOML config customization (default < team < user).
3 BMad patterns I dropped: the 12+ persona handoff (Analyst/PM/Architect/Dev/QA), which is heavy ceremony for a solo founder; the no-memory, artifact-only design, which runs against the personal-agent philosophy; and the enterprise token cost, reported in the mid three figures USD per month.

TL;DR

I did deep research on GitHub bmad-code-org/BMAD-METHOD v6.6.0 (46.6k stars), the “Breakthrough Method Agile AI-Driven Development” framework for software teams in regulated environments.

After a head-to-head comparison with JARVIS, I adopted 3 patterns:

Auto Next Action Hint: after each workflow completes, the AI appends a “Next” section with 2-3 suggested next steps (this counters the “clueless agent” pattern, where the agent goes silent once a task is done)
Adversarial QA gate: an explicit checklist of N gates before a phase handoff; the AI scores itself PASS/BORDERLINE/HARD-STOP with a specific gap list (instead of asking the user a vague question)
3-layer TOML config: customize.toml (default) < {skill}.toml (team) < {skill}.user.toml (per-machine). Tables are deep-merged; arrays are key-merged by code/id

And I dropped 3 patterns:

12+ persona handoff (Analyst → PM → Architect → Dev → QA → SM): heavy ceremony for a solo founder
No-memory artifact-only design (markdown + sprint-status.yaml): it runs against the philosophy of a personal agent that has ChromaDB + a knowledge graph
Enterprise token cost as reported (roughly mid three figures USD per month + 230M tokens per week at scale)

Cross-pollinate, do not migrate.

The story: this was not planned research

This was not a research plan. That day I was working from a cafe with a friend who is also building an AI agent, and out of nowhere he said, “there’s a framework called BMAD-METHOD trending on GitHub, want to check it out?” A peer recommendation, unplanned.

I had no intention of adopting a dev framework. JARVIS focuses on polyglot ops for a solo founder; BMad targets software teams in regulated environments, which is a different equation. But since my friend said it had a few good patterns, I opened the repo and had a look over coffee.

I compared the two head-to-head, and to my surprise a few things did apply. Transferable patterns do not depend on domain: state machine, orchestration, gate validation, and config layering are problems common to every AI agent with ≥3 sub-skills.

This post is the output of one cafe session, one day of deep research, and applying 3 patterns.

Context: a team dev framework meets a solo personal agent

I built the JARVIS system for one founder running polyglot, cross-domain ops: BOQ pricing, Drive automation, native Zalo, MISA finance, a GEO blog, and tactical/strategic Google Ads. It has a catalog of ~25 skills, ~33 slash commands, a ChromaDB collection for semantic embeddings, and an entity-edge knowledge graph.

BMAD-METHOD positions itself as the cure for “vibe coding” (the user fires disconnected prompts into a single LLM IDE and then loses context). It claims 2 breakthroughs:

Agentic Planning: Analyst → PM → Architect collaborate to produce a coherent PRD + Architecture
Context-Engineered Development: the implementation agents (Dev/QA) receive a story file with the full “what + why” → “zero context loss” between planning ↔ coding

BMad state machine: markdown artifacts plus sprint-status.yaml. Filesystem RAG, no vector embeddings, no knowledge graph.

The JARVIS state machine is different: ChromaDB jarvis_brain (HNSW, L2) + a knowledge graph of 35 entities and 60 edges + an age vault holding 21 credentials + filesystem markdown. Multi-layer.

→ The 2 frameworks solve 2 different problems, not the same equation. The question: which BMad patterns are transferable?

Head-to-head comparison

| Axis | BMAD-METHOD | JARVIS |
|---|---|---|
| Target user | Dev team of 3-10 people, regulated (SOC 2 / govtech) | 1 polyglot, cross-domain founder |
| Agent count | 12+ personas (Analyst/PM/Architect/Dev/QA/SM/UX) | ~25-skill catalog + 4 subagents |
| Orchestrator | Centralized BMad Master + automatic bmad-help at end of workflow | Routing table A-G + capability map |
| Workflow phases | Planning → Story-cycle (Greenfield/Brownfield) | Sprint co-arising multi-stream |
| MCP | Native (Context7, Linear, Jira, Notion) | FastMCP server with 4 tools |
| Plugin format | Anthropic Skills + 3-layer TOML customize | Anthropic Skills (folder + symlink) |
| Persistent memory | ❌ markdown + sprint-status.yaml | ✅ ChromaDB + knowledge graph + age vault |
| RAG / vector | ❌ document sharding only | ✅ ChromaDB embedding |
| HITL | Mandatory adversarial review + Party Mode | HARD-STOP gate + multi-step approval |
| Domain coverage | Software dev + game + govtech + creative | Cross-domain ops + personal life + VN-localized |
| Cost profile | Reported mid three figures USD/month at enterprise scale | Claude Max flat |

The sweet spots are clear:

BMad: dev teams shipping features in a regulated industry, willing to pay the ceremony tax for traceability
JARVIS: a solo polyglot founder, no persona ceremony, with cross-session semantic memory

Pattern 1 adopted: Auto Next Action Hint

BMad has a concept called bmad-help: a skill that runs automatically at the end of each workflow to guide the user to the “next step”. Routing is centralized, not a decentralized swarm.

JARVIS pain before adoption: after every /close_session, /generate_boq, or /publish_post, the AI went silent. The user had to work out the next step alone. This was the recurring “clueless agent” pattern.

Adopted form: the file brain/personas/next_action_hints.md, a declarative mapping from 10 slash commands to 2-3 options each:

| Trigger | Next action options |
|---|---|
| /generate_boq | 1. User reviews forensic_report.md (HARD-STOP 1) → replies "OK"<br>2. After Phase 4 ships → /boq-readiness-check <project_code><br>3. Customer agrees → /generate_contract <project_code> |
| /close_session | 1. /morning-brief next time (the 7:00 cron runs it automatically)<br>2. Manual /dream Light Sleep if not yet consolidated<br>3. git push if PHASE 5 failed |

Rule MASTER_AGENT §63: after a mutating workflow completes, the AI MUST append a ## Next section with 2-3 options drawn from the map. Skip it if the user has already stated the next step, the output is a pure query, or the workflow failed with a HARD-STOP.
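The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the JARVIS implementation: the `Outcome` enum, the in-memory `NEXT_ACTION_HINTS` dict (standing in for the parsed next_action_hints.md), and the function name are all hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical workflow outcomes, named after the skip conditions in the rule."""
    MUTATION_DONE = "mutation_done"
    PURE_QUERY = "pure_query"
    HARD_STOP = "hard_stop"


# Hypothetical in-memory copy of the declarative map (one entry shown).
NEXT_ACTION_HINTS = {
    "/generate_boq": [
        'User reviews forensic_report.md (HARD-STOP 1), replies "OK"',
        "After Phase 4 ships: /boq-readiness-check <project_code>",
        "Customer agrees: /generate_contract <project_code>",
    ],
}


def append_next_section(output, command, outcome, user_stated_next=False):
    """Append a '## Next' section unless one of the skip conditions applies."""
    options = NEXT_ACTION_HINTS.get(command, [])
    if outcome is not Outcome.MUTATION_DONE or user_stated_next or not options:
        return output  # pure query, HARD-STOP, explicit next step, or unmapped command
    lines = [f"{i}. {option}" for i, option in enumerate(options[:3], start=1)]
    return output.rstrip() + "\n\n## Next\n" + "\n".join(lines) + "\n"


print(append_next_section("BOQ generated.", "/generate_boq", Outcome.MUTATION_DONE))
```

Keeping the map declarative means adding a hint for a new slash command is a data change, not a code change.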

Adoption effort: ~2-3 hours. High ROI: no more back-and-forth with the user about “what to do next”.

Pattern 2 adopted: Adversarial QA gate

BMad’s QA / Test Architect agent scores an artifact against N gates before a phase handoff. Adversarial mode (NO soft “good enough for now”) forces critical gaps into the open before proceeding.

JARVIS already had a HARD-STOP HITL pattern (e.g. the Phase 1 forensic report, which the user approves with “OK, go on”). It differs from the adversarial QA gate as follows:

HARD-STOP HITL: a vague dialog, the user answers in free form
Adversarial QA: N explicit gates, the AI scores each PASS/⚠/❌ itself, and the verdict is structured output

Adopted form: the slash command /boq-readiness-check <project_code> with 10 gates:

| Gate | Check |
|---|---|
| 1 | Real drawing dimensions (DWG/DXF/PDF with measurements) |
| 2 | 3-source pricing (history + competitor + bottom-up) |
| 3 | Margin policy fits the class (capital-locked 20-28% / Grade A 30-35% / Standard 12-18%) |
| 4 | PCCC (fire protection) scope is clear (Path A “added later as a variation” or Path B hard-included) |
| 5 | HVAC scope is specific |
| 6 | 5 invisible items for high-end buildings |
| 7 | Design contract signed before 3D revisions |
| 8 | BOQ schema matches by the `code` field, not by normalized name |
| 9 | Sheet destination folder is the correct “02. Bao gia & BOQ” |
| 10 | Naming convention `aic_YYYY_NNN_{slug}` |

3-tier verdict: PASS (10/10 ✅), ready to push to the customer; BORDERLINE (1-2 ⚠, 0 ❌), the user decides whether to override and the decision is logged; HARD-STOP (≥1 ❌), pushing is forbidden, fix the gaps and re-run.
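The verdict logic is small enough to sketch directly. The names below (`GateResult`, `verdict`) are hypothetical. The text does not say what happens with 3 or more ⚠ and no ❌, so this sketch assumes that case escalates to HARD-STOP; that is a guess, not the documented rule.

```python
from enum import Enum


class GateResult(Enum):
    PASS = "✅"
    WARN = "⚠"
    FAIL = "❌"


def verdict(results):
    """Map per-gate results {gate_number: GateResult} to (verdict, gap_list)."""
    gaps = sorted(gate for gate, result in results.items() if result is not GateResult.PASS)
    fails = sum(result is GateResult.FAIL for result in results.values())
    warns = sum(result is GateResult.WARN for result in results.values())
    if fails >= 1:
        return "HARD-STOP", gaps   # any ❌ forbids the push
    if warns == 0:
        return "PASS", gaps        # every gate is ✅
    if warns <= 2:
        return "BORDERLINE", gaps  # user decides whether to override
    return "HARD-STOP", gaps       # assumption: 3+ ⚠ is treated as a stop


# A run with gates 2 and 7 flagged as warnings and the other eight passing.
run = {gate: GateResult.PASS for gate in range(1, 11)}
run[2] = GateResult.WARN
run[7] = GateResult.WARN
print(verdict(run))
```

Returning the gap list alongside the verdict is what gives the user a full report up front instead of a question-by-question dialog.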

The first run on a real ACTIVE project came back BORDERLINE at 8/10 (gate 2, 3-source pricing, was only partial; gate 7, the design contract before revisions, was unclear). The pattern works as intended.

Adoption effort: ~4 hours to build and test on a real project. The gate traces back to a retro on one project where a HARD-STOP exposed 4 critical gaps right before the push to the customer (the originating incident).

Pattern 3 adopted: 3-layer TOML config

BMad has 3-layer customization: customize.toml (default, committed) < {skill}.toml (team, committed) < {skill}.user.toml (personal, gitignored), with later layers winning. Tables are deep-merged; arrays are key-merged by code/id.

JARVIS pain before adoption: the setup spans a Mac, a Windows PC, and a VPS, so config thresholds (e.g. the anomaly detection ratio) should differ per environment. Mac dev: alert early with a low threshold; VPS prod: filter noise with a high threshold. A hard-coded Python dict does not allow a leak-free per-machine override.

Adopted form: the helper .agents/lib/jarvis_config.py:

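import tomllib  # standard library since Python 3.11
from pathlib import Path

# _deep_merge is defined elsewhere in this module (not shown here).

def load_skill_config(skill_dir, skill_name=None):
    """Load 3-layer config, later wins, deep-merge dict, key-merge dict-list by code/id."""
    skill_dir = Path(skill_dir)
    # Assumed fallback: use the folder name so a missing skill_name never yields "None.toml".
    skill_name = skill_name or skill_dir.name
    layers = [
        skill_dir / "customize.toml",            # default, committed
        skill_dir / f"{skill_name}.toml",        # team, committed
        skill_dir / f"{skill_name}.user.toml",   # personal, gitignored
    ]
    config = {}
    for layer in layers:
        if layer.exists():
            with layer.open("rb") as fh:  # tomllib needs a binary handle; close it promptly
                config = _deep_merge(config, tomllib.load(fh))
    return config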

.gitignore blocks *.user.toml. Demo customize.toml for one ops skill (Google Ads anomaly thresholds + bid safety):

[thresholds]
cost_spike_ratio = 2.0       # yesterday's cost / 7-day average > 2.0 → flag
click_spike_ratio = 3.0
kw_cost_threshold_vnd = <NOISE_FLOOR>   # filter out keywords spending below the noise floor

[bid_safety]
min_bid_vnd = <FLOOR>         # if bid micros < floor → abort, wrong unit
max_bid_vnd = <CEILING>       # if bid > ceiling → confirm before mutating
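To make the comments in that config concrete, here is a rough sketch of how a skill might consume the two tables. The function names are hypothetical, the floor and ceiling are passed in as parameters because the post elides their real values, and treating a zero 7-day average as "no spike" is an assumption.

```python
def is_cost_spike(cost_yesterday, avg_7d, cost_spike_ratio=2.0):
    """Flag when yesterday's cost exceeds the 7-day average by more than the ratio."""
    if avg_7d <= 0:
        return False  # assumption: no baseline means nothing to compare against
    return cost_yesterday / avg_7d > cost_spike_ratio


def check_bid(bid_vnd, min_bid_vnd, max_bid_vnd):
    """Return 'abort', 'confirm', or 'ok' following the bid_safety comments."""
    if bid_vnd < min_bid_vnd:
        return "abort"    # below the floor usually signals a unit mix-up
    if bid_vnd > max_bid_vnd:
        return "confirm"  # above the ceiling needs a human confirmation before mutating
    return "ok"


# Illustrative numbers only; they are not taken from the real config.
print(is_cost_spike(cost_yesterday=250, avg_7d=100))
print(check_bid(bid_vnd=50, min_bid_vnd=10, max_bid_vnd=100))
```

With the 3-layer loader, a Mac dev machine could pass a lower `cost_spike_ratio` from its *.user.toml while the VPS keeps a higher one, with no change to this code.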

12 unit tests PASS for deep-merge + key-merge. Migrating the hard-coded production code is deferred to a later Sprint (to avoid breaking the production runner; per the Trace rule, every changed line MUST trace back to a request).

Adoption effort: ~4-6 hours to build and test. Medium ROI: a per-machine override is not really needed today, but it future-proofs the setup for when the VPS runs unattended.

3 patterns I dropped

1. 12+ persona handoff, heavy ceremony

BMad has Analyst → PM → Architect → PO → SM → Dev → QA → UX. Each persona has a specific role and output, forming a producer-consumer artifact chain.

A solo founder does not need that. JARVIS has 4 subagents (Explore, general-purpose, account-agent, boq-agent) plus 25 atomic skills, which is enough. Adopting 12 personas means a high ceremony tax and high workflow latency.

BMad’s own issue #2003 admits it: “non-technical users can’t handle requirements, technical users can’t trace AI code”, a paradox given the heavy ceremony.

2. No-memory artifact-only design

BMad state lives in markdown plus sprint-status.yaml. Reloading each session means re-reading the files.

This runs against the personal-agent philosophy. I need cross-session episodic memory (e.g. “the user shipped X two weeks ago” becomes context for query Y today). ChromaDB plus the knowledge graph allow semantic embedding queries rather than grepping the filesystem.

Migrating to artifact-only means losing the moat. ChromaDB stays.

3. Token cost enterprise scale

Users report for BMad: roughly mid three figures USD per month plus 230M tokens per week at enterprise team scale. One workflow run costs ~31.7k tokens (multiple persona handoffs).

JARVIS runs solo on a flat Claude Max plan. Adopting the persona handoff with its 12× multiplier would break the economic model. This is a structural constraint, not a micro-optimization.

Reservations

Cherry-picking carries a risk of impedance mismatch: an adopted pattern has to be adapted to JARVIS’s format, not copied literally. The adversarial QA gate I adopted is structured differently from BMad’s QA agent (a markdown skill instead of subagent isolation).

A 4th pattern is not yet adopted: dual-discovery document sharding (doc.md with a fallback to doc/index.md). It is deferred until a single knowledge file exceeds 2500 lines. Before that it is needless abstraction.
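For reference, the deferred pattern could look roughly like this. It is only a sketch: the function names are hypothetical, and the lookup order (single file first, sharded index second) is inferred from the "doc.md fallback doc/index.md" wording rather than taken from BMad's source.

```python
import tempfile
from pathlib import Path


def resolve_doc(base_dir, name):
    """Find a document as <name>.md, falling back to <name>/index.md; None if neither exists."""
    base = Path(base_dir)
    single = base / f"{name}.md"
    if single.is_file():
        return single
    index = base / name / "index.md"
    if index.is_file():
        return index
    return None


def should_shard(path, max_lines=2500):
    """True once a knowledge file grows past the line threshold named above."""
    with Path(path).open(encoding="utf-8") as fh:
        return sum(1 for _ in fh) > max_lines


# Usage against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pricing.md").write_text("line\n" * 3, encoding="utf-8")
    (root / "contracts").mkdir()
    (root / "contracts" / "index.md").write_text("# index\n", encoding="utf-8")

    found_single = resolve_doc(root, "pricing").name
    found_index = resolve_doc(root, "contracts").name
    found_missing = resolve_doc(root, "unknown")
    needs_shard = should_shard(root / "pricing.md")
    print(found_single, found_index, found_missing, needs_shard)
```

Callers only ever ask for a document by name, so splitting a large file into a folder later would not require touching them.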

There is also the GAN-style Generator-Discriminator concept, which BMad suggested for a future self-improving BOQ loop. I have saved a design doc and will build it next Sprint (~24 hours ETA). The risk is reward hacking when an LLM scores an LLM; the mitigation is 3 layers of guardrails.

I do not automatically follow every BMad v7+ release. A passive cron job checks the changelog once every 30 days, and if a new pattern has value I re-evaluate. This avoids a fork/refactor on every version.

Conclusion

Cross-pollinating frameworks is a good pattern when 2 frameworks solve 2 different problems but share a transferable subset of patterns. Migrating to another framework means losing the moat (semantic memory, domain context, integration cost). Cherry-picking keeps the moat and adds new value.

The 3 patterns I adopted (next-action hint + adversarial QA gate + 3-layer TOML config) work as intended after adoption: the first fixes the “clueless agent” pain, the second prevents retroactive HARD-STOPs by catching gaps before the push to the customer, and the third future-proofs the multi-machine setup.

The 3 patterns I dropped (persona ceremony + no-memory design + enterprise token cost) are structural constraints for a solo polyglot founder. They are not bad patterns; they just do not fit the problem I am solving.

The cherry-pick threshold I set: 3. I stop before number 4 to review impedance mismatch. If a 4th pattern is really needed, I will adopt it once there is a real signal (e.g. a knowledge file exceeds 2500 lines → sharding).

FAQ

Why deep-research a dev framework when I am building a personal ops agent?

Transferable patterns do not depend on domain. State machine, orchestration, gate validation, and config layering are problems common to every AI agent with ≥3 sub-skills. The best approach is not to reinvent: research a mature framework that has already been through its own failures, then cherry-pick the patterns that fit your problem.

Why cherry-pick instead of migrating to BMad?

BMad is mature in process discipline for software teams, which suits dev teams with clearly separated PM/Architect/QA roles. JARVIS, for a solo founder, does not need that persona handoff. Plus: BMad has no semantic memory, which runs against the personal AI agent pattern. Migrating would lose the moat of ChromaDB + graph + VN domain context. Cherry-picking while keeping the moat is the best option.

How does the adversarial QA gate pattern differ from HARD-STOP HITL?

I already had HARD-STOP (e.g. the Phase 1 forensic report → the user approves with 'OK, go on'). The adversarial QA gate differs in that the AI scores its own output against N explicit gates (instead of asking the user a vague question) and returns a PASS/BORDERLINE/HARD-STOP verdict with a specific gap list. The AI still does not bypass the user, but the user gets a full report instead of a question-by-question dialog.

Is 3-layer TOML overkill for a solo single user?

It does feel like overkill. But the setup spans a Mac, a Windows PC, and a VPS, and this pattern lets Mac dev override (1.5x threshold to debug early) versus VPS prod (2.5x threshold to avoid noise) without leaking the vault. A gitignored *.user.toml is a per-machine sandbox. For a purely solo, single-machine setup, one customize.toml layer is enough.

Am I worried about a bias toward adopting too many BMad patterns?

Yes. Cherry-picking 3 patterns is the threshold where I stop and review before stealing a 4th. The 4th pattern would be dual-discovery sharding (a single doc.md with a fallback to doc/index.md); I am deferring it until a knowledge file exceeds 2500 lines. Before that it is needless abstraction. The Karpathy guideline applies: three similar lines beat a premature abstraction.