Placeholder Text
Rule ID: content-placeholder-text
Detect TODO markers, bracket placeholders, and unfilled template text
| Severity | warning (auto) |
| Autofix | - |
| Since | v0.9.0 |
Research Basis
Detects TODO markers, bracket placeholders, and unfilled template text in
instruction files (e.g., TODO, FIXME, [Insert API key here], *TBD*).
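As a rough illustration of the kind of check described above, the following is a minimal sketch, not the rule's actual implementation. The pattern list and the `find_placeholders` helper are hypothetical; the real rule may recognize more (or different) forms, and this version will also flag ordinary Markdown link text that happens to start with "your" or "insert".

```python
import re

# Hypothetical patterns for illustration only; the rule's real pattern
# list is not given in this document.
PLACEHOLDER_PATTERNS = [
    ("todo-marker", re.compile(r"\b(?:TODO|FIXME|TBD)\b")),
    (
        "bracket-placeholder",
        re.compile(r"\[(?:insert|your|add|fill in)\b[^\]]*\]", re.IGNORECASE),
    ),
]


def find_placeholders(text):
    """Return (line_number, kind, matched_text) for each placeholder found."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PLACEHOLDER_PATTERNS:
            for match in pattern.finditer(line):
                findings.append((lineno, kind, match.group(0)))
    return findings


if __name__ == "__main__":
    sample = (
        "Use the staging API.\n"
        "TODO: document auth\n"
        "Endpoint: [Insert API key here]\n"
        "Status: *TBD*\n"
    )
    for finding in find_placeholders(sample):
        print(finding)
```

Matching line by line keeps the reported line numbers exact, which is what a linter warning needs.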
Placeholder text in committed instruction files is unfinished work that the
agent treats as real context. An LLM cannot distinguish between a deliberate
instruction and an unfilled template — it processes [Insert your API endpoint
here] as a literal instruction, potentially generating code that references a
nonexistent endpoint or asking the user to fill in information that should
already be present.
This is standard software engineering hygiene applied to a new file type.
TODO and FIXME markers have been tracked by linters (ESLint's no-warning-comments,
SonarQube's "Track uses of 'TODO' tags") for decades because they
indicate incomplete implementation. The same principle applies to instruction
files: if the content isn't ready, it shouldn't be in the agent's context.
References:
- ESLint: no-warning-comments — Tracks TODO/FIXME as code quality signals; the same pattern applies to instruction files
- SonarSource: Track uses of "TODO" tags — "TODO tags are commonly used to mark places where some more code is required, but which the developer wants to implement later"
- Anthropic: Effective Context Engineering — "Keep your context informative, yet tight" — placeholder text is uninformative noise that consumes context budget
Instruction Budget vs. Context Budget
skillsaw has two separate budget rules that measure different things:
content-instruction-budget — How many directives?
Counts discrete imperative instructions per file using regex matching on imperative verb patterns (lines starting with "use", "always", "never", "ensure", etc.). Code blocks are stripped first.
| Threshold | Severity |
|---|---|
| 80–119 instructions | INFO |
| 120–150 instructions | WARNING |
| 150+ instructions | ERROR |
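The counting approach and thresholds above could be sketched roughly as follows. This is an assumption-laden illustration: the verb list contains only the four verbs named in the text (the real list is longer), `count_instructions` and `severity` are hypothetical names, and because the table lists 150 in both the WARNING and ERROR rows, the sketch assumes a count of exactly 150 is an ERROR.

```python
import re

# Only the verbs named in the text; the rule's full list is not shown here.
IMPERATIVE_START = re.compile(
    r"^\s*(?:[-*+]\s+|\d+\.\s+)?(?:use|always|never|ensure)\b",
    re.IGNORECASE,
)

# Fenced code blocks are removed before counting, as the rule describes.
FENCED_BLOCK = re.compile(r"```.*?```", re.DOTALL)


def count_instructions(text):
    """Count lines that start with an imperative verb, ignoring code blocks."""
    prose = FENCED_BLOCK.sub("", text)
    return sum(1 for line in prose.splitlines() if IMPERATIVE_START.match(line))


def severity(count):
    """Map an instruction count to a severity (assumes 150 itself is ERROR)."""
    if count >= 150:
        return "ERROR"
    if count >= 120:
        return "WARNING"
    if count >= 80:
        return "INFO"
    return None


if __name__ == "__main__":
    sample = (
        "Always run tests.\n"
        "- Never commit secrets.\n"
        "```\nuse this\n```\n"
        "The build uses make.\n"
    )
    print(count_instructions(sample), severity(count_instructions(sample)))
```

Stripping code blocks first matters because shell snippets and sample configs often begin with words like "use" that are not directives to the agent.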
Why it matters: The "Curse of Instructions" (ICLR 2025) showed that if a model follows each individual instruction with probability p, the probability of following all N instructions is p^N, which decays exponentially in N. At p = 0.99 and N = 150, the probability of following all instructions is only ~22%. The IFScale benchmark confirmed that primacy bias (selectively ignoring later instructions) becomes dominant at 150–200 instructions.
This is about cognitive load on the model — too many simultaneous directives exceed the model's instruction-following capacity regardless of how many tokens they occupy.
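The p^N decay cited above is easy to check numerically. The short sketch below (the function name is hypothetical) evaluates it at p = 0.99 for several instruction counts, including the N = 150 case from the text.

```python
def all_followed_probability(p, n):
    """Probability that all n instructions are followed, given per-instruction rate p."""
    return p ** n


if __name__ == "__main__":
    for n in (10, 50, 80, 120, 150, 200):
        prob = all_followed_probability(0.99, n)
        print(f"N={n:>3}: {prob:.1%}")
    # At N=150 this is roughly 22%, matching the figure quoted above.
```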
context-budget — How many tokens?
Measures estimated token count (chars ÷ 4) of each individual file, checked per-file against category-specific thresholds.
| Category | Warn | Error |
|---|---|---|
| CLAUDE.md, AGENTS.md, GEMINI.md | 6,000 | 12,000 |
| Instruction files (Cursor, Copilot, Kiro) | 4,000 | 8,000 |
| Skills | 3,000 | 6,000 |
| Commands, agents, rules | 2,000 | 4,000 |
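A minimal sketch of the per-file check, using the chars ÷ 4 estimate and the thresholds from the table above. The category keys and function names are hypothetical labels chosen for this example, and the sketch assumes integer division and inclusive thresholds (a file exactly at a threshold triggers it); the actual rule may round or compare differently.

```python
# (warn, error) token thresholds copied from the table above.
# The dictionary keys are hypothetical labels, not the tool's identifiers.
THRESHOLDS = {
    "root-instructions": (6000, 12000),      # CLAUDE.md, AGENTS.md, GEMINI.md
    "instruction-files": (4000, 8000),       # Cursor, Copilot, Kiro
    "skills": (3000, 6000),
    "commands-agents-rules": (2000, 4000),
}


def estimate_tokens(text):
    """Estimate token count as characters divided by four."""
    return len(text) // 4


def context_severity(text, category):
    """Return "ERROR", "WARNING", or None for one file in the given category."""
    warn, error = THRESHOLDS[category]
    tokens = estimate_tokens(text)
    if tokens >= error:
        return "ERROR"
    if tokens >= warn:
        return "WARNING"
    return None


if __name__ == "__main__":
    skill_text = "x" * 16000  # about 4,000 estimated tokens
    print(estimate_tokens(skill_text), context_severity(skill_text, "skills"))
```

The character-based estimate avoids depending on any particular tokenizer, at the cost of some error for code-heavy or non-English files.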
Why it matters: Raw token count determines how much of the context window the file consumes and how severely attention degrades. Levy et al. showed reasoning performance degrades at ~3,000 tokens. Chroma's "Context Rot" study found that attention dilution is quadratic in token count — doubling the tokens more than doubles the accuracy loss.
This is about context window consumption — a single file that's too large will crowd out other context and degrade attention across the board.
The distinction
A file with 50 instructions in 5,000 tokens (verbose prose around each one) stays well under the instruction-budget thresholds but exceeds the context-budget warning threshold for every category except the top-level CLAUDE.md, AGENTS.md, and GEMINI.md files. A file with 200 terse one-line instructions in 2,000 tokens triggers an instruction-budget ERROR while staying at or below every context-budget warning threshold. Both degrade model performance, but through different mechanisms.
| | Instruction Budget | Context Budget |
|---|---|---|
| Measures | Discrete imperative count | Estimated token count |
| Scope | Per-file | Per-file |
| Degradation mechanism | Instruction-following capacity | Attention dilution |
| Research basis | Curse of Instructions (ICLR 2025) | Same Task, More Tokens (ACL 2024) |
Key Papers (Cross-Cutting)
These papers justify multiple rules simultaneously:
| Paper | Venue | Rules |
|---|---|---|
| Liu et al., Lost in the Middle | TACL 2024 | critical-position, section-length, cognitive-chunks |
| Curse of Instructions | ICLR 2025 | instruction-budget, contradiction, inconsistent-terminology |
| Jaroslawicz et al., How Many Instructions Can LLMs Follow at Once? | arXiv 2025 | instruction-budget |
| Levy, Jacoby & Goldberg, Same Task, More Tokens | ACL 2024 | tautological, redundant-with-tooling, instruction-budget, section-length |
| Bsharat et al., Principled Instructions Are All You Need | arXiv 2023 | weak-language, negative-only, actionability-score |
| Suppressing Pink Elephants | arXiv 2024 | negative-only |
| Chroma, Context Rot | 2025 | critical-position, instruction-budget, section-length |
| When Prompts Go Wrong | arXiv 2025 | contradiction |
| Anthropic: Effective Context Engineering | 2025 | tautological, redundant-with-tooling, instruction-budget, broken-internal-reference, placeholder-text |
| Wang et al., LLMs Meet Library Evolution | ICSE 2025 | banned-references |