Placeholder Text

Rule ID: content-placeholder-text

Detect TODO markers, bracket placeholders, and unfilled template text


Severity	warning (auto)
Autofix	-
Since	v0.9.0

Research Basis

Detects TODO markers, bracket placeholders, and unfilled template text in instruction files (e.g., TODO, FIXME, [Insert API key here], *TBD*).
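The kind of matching described above can be sketched in a few lines of Python. This is only an illustration, not the rule's actual implementation: the pattern set, the `find_placeholders` name, and the list of bracket trigger words are all assumptions made for the example.

```python
import re

# Hypothetical patterns for illustration; the rule's real pattern set
# is not documented on this page.
PLACEHOLDER_PATTERNS = [
    # Bare markers such as TODO, FIXME, or *TBD* (case-sensitive on purpose).
    re.compile(r"\b(?:TODO|FIXME|TBD)\b"),
    # Bracketed template text such as "[Insert API key here]".
    re.compile(r"\[(?:insert|your|add|fill in|replace)\b[^\]\n]*\]", re.IGNORECASE),
]


def find_placeholders(text: str) -> list:
    """Return (line_number, matched_text) pairs for suspected placeholders."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for pattern in PLACEHOLDER_PATTERNS:
            for match in pattern.finditer(line):
                hits.append((line_number, match.group(0)))
    return hits


sample = (
    "Call the API at [Insert your API endpoint here].\n"
    "TODO: document retries\n"
    "Status: *TBD*\n"
)
for line_number, matched in find_placeholders(sample):
    print(f"line {line_number}: {matched}")
```

A real checker would likely need to skip code blocks and Markdown link text such as `[your guide](...)`, which this sketch would flag as a false positive.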

Placeholder text in committed instruction files is unfinished work that the agent treats as real context. An LLM cannot distinguish between a deliberate instruction and an unfilled template — it processes [Insert your API endpoint here] as a literal instruction, potentially generating code that references a nonexistent endpoint or asking the user to fill in information that should already be present.

This is standard software engineering hygiene applied to a new file type. TODO and FIXME markers have been tracked by linters (ESLint's no-warning-comments, SonarQube's "Track uses of 'TODO' tags") for decades because they indicate incomplete implementation. The same principle applies to instruction files: if the content isn't ready, it shouldn't be in the agent's context.

References:

ESLint: no-warning-comments — Tracks TODO/FIXME as code quality signals; the same pattern applies to instruction files
SonarSource: Track uses of "TODO" tags — "TODO tags are commonly used to mark places where some more code is required, but which the developer wants to implement later"
Anthropic: Effective Context Engineering — "Keep your context informative, yet tight" — placeholder text is uninformative noise that consumes context budget

Instruction Budget vs. Context Budget

skillsaw has two separate budget rules that measure different things:

`content-instruction-budget`: How many directives?

Counts discrete imperative instructions per file using regex matching on imperative verb patterns (lines starting with "use", "always", "never", "ensure", etc.). Code blocks are stripped first.

Threshold	Severity
80–119 instructions	INFO
120–149 instructions	WARNING
150+ instructions	ERROR
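A rough sketch of the counting approach described above, under stated assumptions: only the four verbs named in the text ("use", "always", "never", "ensure") are included, list bullets before the verb are tolerated, and a count of exactly 150 is treated as ERROR. The function names are hypothetical and the real rule's verb list is presumably longer.

```python
import re
from typing import Optional

FENCE = "`" * 3  # fence marker, built this way so the example embeds cleanly

# Assumed verb list: only the four verbs named in the text above.
IMPERATIVE_LINE = re.compile(
    r"^\s*(?:[-*+]\s+|\d+\.\s+)?(?:use|always|never|ensure)\b",
    re.IGNORECASE,
)


def strip_fenced_code(text: str) -> str:
    """Drop fenced code blocks so commands inside them are not counted."""
    kept, in_fence = [], False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if not in_fence:
            kept.append(line)
    return "\n".join(kept)


def count_instructions(text: str) -> int:
    """Count lines that start with an imperative verb, outside code fences."""
    lines = strip_fenced_code(text).splitlines()
    return sum(1 for line in lines if IMPERATIVE_LINE.match(line))


def severity(count: int) -> Optional[str]:
    """Map a count to a severity; 150 and above is assumed to be ERROR."""
    if count >= 150:
        return "ERROR"
    if count >= 120:
        return "WARNING"
    if count >= 80:
        return "INFO"
    return None


sample = "\n".join([
    "Use tabs for indentation.",
    "- Always run the tests before committing.",
    "This paragraph is background, not a directive.",
    FENCE + "bash",
    "use sudo with care",
    FENCE,
    "Never commit secrets.",
])
print(count_instructions(sample), severity(count_instructions(sample)))
```

The line inside the fence starts with "use" but is not counted, which is the point of stripping code blocks first.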

Why it matters: The "Curse of Instructions" (ICLR 2025) showed that if each instruction is followed with probability p, the probability of following all N instructions is p^N, which decays exponentially in N. At p = 0.99 and N = 150, the probability of following all instructions is only ~22%. The IFScale benchmark confirmed that primacy bias (selectively ignoring later instructions) becomes dominant at 150–200 instructions.
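The arithmetic behind that figure is easy to reproduce. The snippet below is a small illustration that assumes each instruction is followed independently with the same probability, which is what the p^N form implies; the function name is hypothetical.

```python
def prob_all_followed(p: float, n: int) -> float:
    """Probability that all n instructions are followed, assuming each one
    is followed independently with probability p."""
    return p ** n


# Show how quickly the joint probability falls as the count grows.
for n in (10, 80, 120, 150):
    print(f"N={n:>3}: {prob_all_followed(0.99, n):.1%}")
```

At N = 150 this evaluates to roughly 22%, matching the figure quoted above.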

This is about cognitive load on the model — too many simultaneous directives exceed the model's instruction-following capacity regardless of how many tokens they occupy.

`context-budget`: How many tokens?

Measures estimated token count (chars ÷ 4) of each individual file, checked per-file against category-specific thresholds.

Category	Warn	Error
CLAUDE.md, AGENTS.md, GEMINI.md	6,000	12,000
Instruction files (Cursor, Copilot, Kiro)	4,000	8,000
Skills	3,000	6,000
Commands, agents, rules	2,000	4,000
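As a minimal sketch of this check: the thresholds are copied from the table above, while the category keys, the function names, the use of integer division, and the choice to treat a value exactly at a threshold as exceeding it are all assumptions made for illustration.

```python
from typing import Optional

# (warn, error) token thresholds from the table above.
# The dictionary keys are hypothetical labels, not the tool's own names.
THRESHOLDS = {
    "root": (6000, 12000),         # CLAUDE.md, AGENTS.md, GEMINI.md
    "instruction": (4000, 8000),   # Cursor, Copilot, Kiro instruction files
    "skill": (3000, 6000),
    "command": (2000, 4000),       # commands, agents, rules
}


def estimate_tokens(text: str) -> int:
    """Estimate tokens as characters divided by 4 (rounded down here)."""
    return len(text) // 4


def check_context_budget(text: str, category: str) -> Optional[str]:
    """Return "ERROR", "WARNING", or None for a single file's content."""
    warn, error = THRESHOLDS[category]
    tokens = estimate_tokens(text)
    if tokens >= error:
        return "ERROR"
    if tokens >= warn:
        return "WARNING"
    return None


# A 14,000-character skill file is about 3,500 estimated tokens.
print(check_context_budget("x" * 14000, "skill"))
```

Because the estimate depends only on character count, the same file can pass as a CLAUDE.md and still warn as a skill.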

Why it matters: Raw token count determines how much of the context window the file consumes and how severely attention degrades. Levy et al. showed reasoning performance degrades at ~3,000 tokens. Chroma's "Context Rot" study found that attention dilution is quadratic in token count — doubling the tokens more than doubles the accuracy loss.

This is about context window consumption — a single file that's too large will crowd out other context and degrade attention across the board.

The distinction

A file with 50 instructions in 5,000 tokens (verbose prose around each one) is well under the instruction budget but heavy on context budget. A file with 200 terse one-line instructions in 2,000 tokens is the reverse: far over the instruction budget but light on context budget. Both degrade model performance, but through different mechanisms.

	Instruction Budget	Context Budget
Measures	Discrete imperative count	Estimated token count
Scope	Per-file	Per-file
Degradation mechanism	Instruction-following capacity	Attention dilution
Research basis	Curse of Instructions (ICLR 2025)	Same Task, More Tokens (ACL 2024)

Key Papers (Cross-Cutting)

These papers justify multiple rules simultaneously:

Paper	Venue	Rules
Liu et al., Lost in the Middle	TACL 2024	critical-position, section-length, cognitive-chunks
Curse of Instructions	ICLR 2025	instruction-budget, contradiction, inconsistent-terminology
Jaroslawicz et al., How Many Instructions Can LLMs Follow at Once?	arXiv 2025	instruction-budget
Levy, Jacoby & Goldberg, Same Task, More Tokens	ACL 2024	tautological, redundant-with-tooling, instruction-budget, section-length
Bsharat et al., Principled Instructions Are All You Need	arXiv 2023	weak-language, negative-only, actionability-score
Suppressing Pink Elephants	arXiv 2024	negative-only
Chroma, Context Rot	2025	critical-position, instruction-budget, section-length
When Prompts Go Wrong	arXiv 2025	contradiction
Anthropic: Effective Context Engineering	2025	tautological, redundant-with-tooling, instruction-budget, broken-internal-reference, placeholder-text
Wang et al., LLMs Meet Library Evolution	ICSE 2025	banned-references