Human Review creates a valuable artifact: the difference between what the AI produced and what a human actually approved. Prompt Learning is Fluxo’s way of turning that real-world feedback into durable improvements, without sacrificing human control.
The goal is practical. If reviewers keep making the same edit over and over, the system should learn the pattern and propose a fix. But Fluxo never silently changes prompts. It generates suggestions, explains why, and lets workflow owners decide.
The loop I implemented
Fluxo continuously mines reviewed tasks and looks for consistent correction patterns. The pipeline:
- loads recent reviewed tasks in a rolling time window
- groups feedback by workflow and human review node
- resolves nearest upstream AI node through graph traversal
- generates suggestion candidates
- stores deduplicated suggestions with hashes and status tracking
I intentionally limit pending suggestions per workflow to avoid recommendation spam.
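The pipeline above can be sketched in TypeScript. All names here (`ReviewedTask`, `suggestionHash`, the cap constant) are illustrative assumptions, not Fluxo's actual API; the sketch shows the rolling window, the grouping key, and the dedup hash.

```typescript
// Sketch of the mining loop; shapes and names are assumed for illustration.
import { createHash } from "crypto";

interface ReviewedTask {
  workflowId: string;
  reviewNodeId: string;
  feedback: string;
  reviewedAt: Date;
}

const MAX_PENDING_PER_WORKFLOW = 5; // assumed cap to avoid recommendation spam

// Stable hash used to deduplicate repeat suggestions for the same node/patch.
function suggestionHash(workflowId: string, nodeId: string, patchText: string): string {
  return createHash("sha256")
    .update(`${workflowId}:${nodeId}:${patchText}`)
    .digest("hex");
}

// Load tasks inside a rolling window and group by (workflow, review node).
function groupRecentFeedback(
  tasks: ReviewedTask[],
  windowDays: number,
  now: Date = new Date(),
): Map<string, ReviewedTask[]> {
  const cutoff = new Date(now.getTime() - windowDays * 86_400_000);
  const groups = new Map<string, ReviewedTask[]>();
  for (const t of tasks) {
    if (t.reviewedAt < cutoff) continue; // outside the rolling time window
    const key = `${t.workflowId}:${t.reviewNodeId}`;
    (groups.get(key) ?? groups.set(key, []).get(key)!).push(t);
  }
  return groups;
}
```

Grouping by review node (rather than by workflow alone) is what lets the later steps attribute a correction pattern to one specific upstream prompt.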
Graph-aware targeting
I do not guess where to apply a suggestion. I resolve the nearest upstream AI node (OPENAI, GEMINI, ANTHROPIC) using breadth-first search (BFS) over workflow connections.
That single decision eliminated most of the accidental patches to wrong nodes I saw in early iterations. A suggestion that lands in the wrong place is worse than no suggestion at all, because it erodes trust.
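A minimal sketch of that resolution step, assuming a simple node/connection schema (the real Fluxo schema is not shown in this post): walk connections in reverse from the human review node and return the first AI node reached, which BFS guarantees is the nearest by hop count.

```typescript
// Hedged sketch: node and connection shapes are assumptions for illustration.
type NodeType = "OPENAI" | "GEMINI" | "ANTHROPIC" | "HUMAN_REVIEW" | "OTHER";

interface WorkflowNode { id: string; type: NodeType }
interface Connection { from: string; to: string }

const AI_TYPES = new Set<NodeType>(["OPENAI", "GEMINI", "ANTHROPIC"]);

function nearestUpstreamAiNode(
  startId: string,
  nodes: Map<string, WorkflowNode>,
  connections: Connection[],
): WorkflowNode | null {
  // Reverse adjacency list: target node -> nodes feeding into it.
  const upstream = new Map<string, string[]>();
  for (const c of connections) {
    (upstream.get(c.to) ?? upstream.set(c.to, []).get(c.to)!).push(c.from);
  }
  const queue = [startId];
  const seen = new Set<string>([startId]);
  while (queue.length > 0) {
    const id = queue.shift()!;
    const node = nodes.get(id);
    // BFS visits nodes in hop order, so the first AI node found is the nearest.
    if (node && id !== startId && AI_TYPES.has(node.type)) return node;
    for (const prev of upstream.get(id) ?? []) {
      if (!seen.has(prev)) { seen.add(prev); queue.push(prev); }
    }
  }
  return null; // no AI node upstream: better to skip than to guess
}
```

Returning `null` instead of falling back to "any AI node in the workflow" is the conservative choice implied above: no suggestion beats a misplaced one.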
Suggestion generation strategies
I built two modes:
- an LLM mode using organization credentials (OpenAI, Anthropic, Gemini)
- a deterministic heuristic mode as a resilient fallback
Both produce additive patch objects (systemPromptAppend, userPromptAppend) with normalized keywords and evidence snippets.
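The patch object might look like the following sketch. The field names `systemPromptAppend` and `userPromptAppend` come from the post; everything else (the keyword/evidence handling, the limits) is an assumed normalization pass, not Fluxo's actual code.

```typescript
// Illustrative shape of an additive patch and a strict normalizer for it.
interface PromptPatch {
  systemPromptAppend?: string;
  userPromptAppend?: string;
  keywords: string[];  // lowercased, deduplicated
  evidence: string[];  // short reviewer-edit snippets supporting the suggestion
}

function normalizePatch(raw: unknown): PromptPatch | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  const sys = typeof r.systemPromptAppend === "string" ? r.systemPromptAppend.trim() : undefined;
  const user = typeof r.userPromptAppend === "string" ? r.userPromptAppend.trim() : undefined;
  if (!sys && !user) return null; // an additive patch must append something
  const keywords = Array.isArray(r.keywords)
    ? [...new Set(
        r.keywords
          .filter((k): k is string => typeof k === "string")
          .map((k) => k.toLowerCase().trim()),
      )]
    : [];
  const evidence = Array.isArray(r.evidence)
    ? r.evidence.filter((e): e is string => typeof e === "string").slice(0, 3) // assumed cap
    : [];
  return { systemPromptAppend: sys, userPromptAppend: user, keywords, evidence };
}
```

Because both generation modes emit the same normalized shape, the review UI and the apply step do not need to know whether a suggestion came from an LLM or from the heuristic fallback.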
Safety and quality controls
Prompt learning can become unsafe if it leaks sensitive text or creates broken prompt syntax. Fluxo includes protections:
- redact common PII patterns from feedback snippets
- cap snippet length and sample volume
- parse and validate JSON responses robustly
- normalize patch schema strictly
- enforce append-only patch semantics
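Two of the controls above, snippet redaction and robust JSON parsing, can be sketched briefly. The specific regexes, the 280-character cap, and the fence-stripping heuristic are assumptions for illustration; a production PII list would be broader.

```typescript
// Minimal redaction pass over a few common PII shapes (illustrative, not exhaustive).
const PII_PATTERNS: [RegExp, string][] = [
  [/[\w.+-]+@[\w-]+\.[\w.]+/g, "[EMAIL]"],
  [/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]"],
  [/\+?\d[\d\s().-]{7,}\d/g, "[PHONE]"],
];

const MAX_SNIPPET_CHARS = 280; // assumed snippet-length cap

function redactSnippet(text: string): string {
  let out = text;
  for (const [pattern, label] of PII_PATTERNS) out = out.replace(pattern, label);
  return out.slice(0, MAX_SNIPPET_CHARS);
}

// Robust JSON extraction for LLM responses that wrap JSON in prose or code fences.
function parseJsonLoose(response: string): unknown | null {
  const candidate = response.match(/\{[\s\S]*\}/)?.[0] ?? response;
  try {
    return JSON.parse(candidate);
  } catch {
    return null; // caller falls back to the deterministic heuristic mode
  }
}
```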
When applying a patch, I revalidate the template syntax of the resulting prompts, so a suggestion can never introduce a broken template expression.
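As a toy version of that revalidation step, assuming a `{{variable}}`-style template syntax (the post does not name the actual template engine), a balanced-delimiter check is enough to reject a patch whose append would leave an unterminated expression:

```typescript
// Toy validator for {{variable}}-style templates: checks balanced delimiters only.
function templateSyntaxOk(prompt: string): boolean {
  let depth = 0;
  for (let i = 0; i < prompt.length - 1; i++) {
    if (prompt[i] === "{" && prompt[i + 1] === "{") { depth++; i++; }
    else if (prompt[i] === "}" && prompt[i + 1] === "}") {
      depth--; i++;
      if (depth < 0) return false; // close without a matching open
    }
  }
  return depth === 0;
}

// Append-only semantics: the existing prompt is never edited, only extended,
// and the combined result must still parse.
function applyAppendOnly(current: string, append: string): string | null {
  const next = `${current}\n${append}`;
  return templateSyntaxOk(next) ? next : null; // null = reject the patch
}
```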
Human control over application
Suggestions are never silently applied. Workflow editors can:
- list pending and recent suggestions by workflow
- apply a suggestion (which writes a workflow version snapshot)
- dismiss a suggestion
Status transitions (PENDING, APPLIED, DISMISSED) are persisted so teams can analyze adoption and quality over time.
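That lifecycle is small enough to enforce as an explicit transition table. Treating APPLIED and DISMISSED as terminal is my reading of the post, stated here as an assumption:

```typescript
// Persisted suggestion statuses; PENDING is the only non-terminal state (assumed).
type Status = "PENDING" | "APPLIED" | "DISMISSED";

const TRANSITIONS: Record<Status, Status[]> = {
  PENDING: ["APPLIED", "DISMISSED"],
  APPLIED: [],    // terminal: applied suggestions are part of history
  DISMISSED: [],  // terminal: dismissals are kept for adoption analysis
};

function transition(current: Status, next: Status): Status {
  if (!TRANSITIONS[current].includes(next)) {
    throw new Error(`Illegal status transition: ${current} -> ${next}`);
  }
  return next;
}
```

Keeping terminal states immutable is what makes adoption metrics trustworthy: an APPLIED row always corresponds to exactly one workflow version snapshot.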
Why this approach works
I chose prompt patching over model fine-tuning for three reasons:
- it is explainable to operators
- it is fast to deploy and reverse
- it works with mixed model providers in one platform
This is practical machine learning for operations teams: learn from real edits, improve prompts, preserve control.