content-section-length

Warn about markdown sections longer than ~500 tokens


Severity	info (auto)
Autofix	-
Since	v0.7.0
Category	Content Intelligence

Why

Long, unbroken sections strain the model's ability to hold a single topic in focus. When a section runs past ~500 tokens, instructions near its end compete with instructions near its start for the model's attention, and the ones in the middle tend to lose.

Examples

Bad:

A single ## Setup section spanning 200 lines covering environment, dependencies, database, Docker, and CI configuration.

Good:

## Environment setup
...

## Database setup
...

## Docker
...

How to fix

Split long sections into focused sections, each under its own heading. Either replace the parent with sibling headings, as in the example above, or nest subsections one level deeper than the parent. Aim for roughly 10–30 lines per section. A coding agent can add the headings automatically.
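To decide where to split, it can help to see which sections are over the limit. The following is a minimal Python sketch, not skillsaw's implementation: the function names, the heading regex, and the 4-characters-per-token estimate are all assumptions for illustration, and the rule's own estimator may count differently.

```python
import re

# Assumption: roughly 4 characters per token, a common rule of thumb for
# English text. skillsaw's actual estimator may use a different method.
CHARS_PER_TOKEN = 4

# Matches ATX headings such as "## Setup". This simple sketch does not
# skip fenced code blocks, so a "#" comment line inside a fence that
# looks like a heading would start a new section.
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def estimate_tokens(text: str) -> int:
    """Crude token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def long_sections(markdown: str, max_tokens: int = 500) -> list[tuple[str, int]]:
    """Return (heading, estimated_tokens) for each section over max_tokens."""
    sections = []
    title, body = "(preamble)", []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the previous section.
            sections.append((title, "\n".join(body)))
            title, body = match.group(1).strip(), []
        else:
            body.append(line)
    sections.append((title, "\n".join(body)))

    flagged = []
    for title, text in sections:
        tokens = estimate_tokens(text)
        if tokens > max_tokens:
            flagged.append((title, tokens))
    return flagged


doc = "## Setup\n" + "word " * 600 + "\n## Docker\nshort\n"
print(long_sections(doc))                  # [('Setup', 750)]
print(long_sections(doc, max_tokens=800))  # []
```

Each flagged heading is a candidate for splitting. Treat the numbers as approximate, since a character-based estimate will not match any particular tokenizer exactly.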

Tuning

Raise or lower the maximum estimated tokens allowed per section with max-tokens:

rules:
  content-section-length:
    max-tokens: 800

Configuration

rules:
  content-section-length:
    enabled: auto  # true | false | auto
    severity: info

Parameter	Description	Default
`max-tokens`	Maximum estimated tokens per section before triggering a warning	`500`

Research Basis

Warns about markdown sections exceeding ~500 estimated tokens.

Long monolithic text blocks degrade both human readability and LLM attention. The lost-in-the-middle effect operates within sections: the longer a contiguous block of text, the worse recall becomes for information in its interior. Breaking content into smaller sections with headings creates natural retrieval anchors.

The ~500 token threshold aligns with RAG chunking research. Pinecone's chunking guide recommends ~512 tokens as the standard baseline for optimal retrieval and comprehension. The threshold is configurable via the max-tokens parameter.

References:

Liu et al., Lost in the Middle. Attention degrades within long contiguous blocks.
Chroma, Context Rot. Attention dilution is quadratic in token count.
Pinecone, Chunking Strategies for LLM Applications. 512 tokens as standard chunking baseline.
Miller, G. A. (1956), The Magical Number Seven, Plus or Minus Two. Working memory limits and the value of chunking.

Run `skillsaw explain content-section-length` to see this documentation and the rule's effective configuration in your terminal.