Built for readers whose attention is stretched — ADHD, dyslexia, fatigue, a second language, or an accessibility-sensitive context.
lucid-lint reads your Markdown or plain text and flags the moments that
make prose hard to process. It does not rewrite your voice. It hands you a
short list and gets out of the way.
Before
The caching subsystem, which was introduced in an earlier milestone, turned out to interact poorly with the new request pipeline under sustained load, and the investigation that followed required multiple rounds of profiling.
After
The caching subsystem was introduced earlier. It interacts poorly with the new request pipeline under sustained load. The investigation required several rounds of profiling.
lucid-lint flagged
sentence-too-long (33 words) and
consecutive-long-sentences. It did not propose the
rewrite — that's yours.
Most prose tools measure style (write-good), grammar (Antidote), or a
surface readability score (Flesch). lucid-lint measures cognitive
load — the mental effort a reader spends to understand a sentence. It
flags the patterns that the research behind Sweller, Gibson, Graesser,
and Coh-Metrix single out.
Pick a profile: dev-doc, public, or falc (Easy-to-Read),
then override per rule if you want.
lucid-lint is at v0.2 (released 2026-04-22). All 25 rules listed in
RULES.md
are shipped (17 from v0.1, 8 added during the v0.2 cycle), alongside
the hybrid scoring model —
a global X / max score plus five per-category sub-scores, computed on
top of the diagnostics. Pre-1.0: breaking changes remain possible
between minor versions. See the roadmap for what
comes next.
A clean file earns the full 100/100 and a wordmark banner — the peak-end moment of a passing lint run:

~~~~~ ⟨ • ⟩ ───── lucid-lint v0.2.0
cognitive accessibility linter · prose · EN / FR
────────────────────────────────────────────────
No issues found.
────────────────────────────────────────────────────────────
score: 100/100
structure █████ 20/20
rhythm █████ 20/20
lexicon █████ 20/20
syntax █████ 20/20
readability █████ 20/20
cargo install lucid-lint
# Lint a file
lucid-lint check README.md
# Strictest profile (Easy-to-Read / FALC)
lucid-lint check --profile=falc docs/
# Stdin
echo "This is a test sentence." | lucid-lint check -
# JSON for CI
lucid-lint check --format=json docs/
# Fail the build if the aggregate score drops below 85/100
lucid-lint check --min-score=85 docs/
The whole site is built as a reading companion. Pick the font that reads best for you — it will stick across pages.
Line spacing and text size are on the way as sliders. Until then, pick a font and your browser's zoom is honoured.
Dual-licensed under MIT or Apache-2.0, at your option.
lucid-lint ships through four routes. Pick the one that matches your environment.
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.sh | sh
The script is generated by cargo-dist for every tagged release. It detects your platform, downloads the matching prebuilt binary from the GitHub release, and places it on $PATH (default: $CARGO_HOME/bin if set, else ~/.cargo/bin).
curl … | sh is fast but opaque. To read the script before executing it:
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.sh -o install.sh
less install.sh
sh install.sh
The script is short — under 200 lines of POSIX shell — so a quick read is realistic. It pins the release version it was generated for, verifies the downloaded archive’s expected size, and exits non-zero on any mismatch.
latest resolves to the most recent release. To pin a known-good version:
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/download/v0.2.2/lucid-lint-installer.sh | sh
powershell -ExecutionPolicy Bypass -c "irm https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.ps1 | iex"
Same cargo-dist machinery, PowerShell flavour. Drops the binary into %CARGO_HOME%\bin when CARGO_HOME is set, else %USERPROFILE%\.cargo\bin.
To audit before running, save the script and inspect it:
irm https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.ps1 -OutFile install.ps1
notepad install.ps1
.\install.ps1
cargo install lucid-lint
This compiles from source via crates.io and places the binary in your Cargo bin directory (default ~/.cargo/bin/). Slower than the prebuilt installer but useful when the prebuilt targets don’t match your platform.
git clone https://github.com/bastien-gallay/lucid-lint
cd lucid-lint
cargo install --path .
Each release ships pre-built binaries for:
- Linux (x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl)
- macOS (aarch64-apple-darwin, x86_64-apple-darwin)
- Windows (x86_64-pc-windows-msvc)
The shell and PowerShell installers above pick the right archive automatically. To install manually, download from the GitHub releases page and put the extracted binary on $PATH.
lucid-lint --version
cargo install).
This page walks through linting your first document.
lucid-lint check README.md
Output:
warning <path>/README.md:14:1 Sentence is 27 words long (maximum 22). Consider splitting it into shorter sentences. [structure.sentence-too-long]
summary: 1 warnings.
→ run 'lucid-lint explain <rule-id>' — seen here: structure.sentence-too-long
────────────────────────────────────────────────────────────
score: 88/100
structure ██▏░░ 8/20
rhythm █████ 20/20
lexicon █████ 20/20
syntax █████ 20/20
readability █████ 20/20
The trailing block is the scoring summary — a global
X / 100 score followed by the full per-category breakdown.
lucid-lint check docs/*.md CHANGELOG.md
lucid-lint check docs/
All files with .md, .markdown, or .txt extensions will be processed.
echo "This is a test sentence." | lucid-lint check -
For formats that lucid-lint does not parse natively yet:
pandoc report.docx -t markdown | lucid-lint check -
# Strictest: Easy-to-Read
lucid-lint check --profile=falc docs/
# Looser: developer documentation
lucid-lint check --profile=dev-doc docs/
See Profiles for details.
# JSON for CI
lucid-lint check --format=json docs/
See CI integration for CI recipes.
| Code | Meaning |
|---|---|
| 0 | No issues (or only info) and score at or above --min-score (if set) |
| 1 | Warnings found or score below --min-score |
| 2 | Runtime error (invalid args, unreadable file) |
The two gates stack. See CI integration for combination recipes.
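The exit-code table and the two stacked gates can be modelled in a few lines of Python. This is an illustrative sketch, not lucid-lint's implementation: the function name and its parameters are hypothetical, and runtime errors (exit code 2) are left out.

```python
from typing import Optional


def exit_code(warnings: int, score: int,
              fail_on_warning: bool = True,
              min_score: Optional[int] = None) -> int:
    """Hypothetical model of the documented gates (not the real CLI code)."""
    # Severity gate: trips on any warning unless it has been switched off.
    severity_gate = fail_on_warning and warnings > 0
    # Score gate: trips only when a threshold is set and the score is below it.
    score_gate = min_score is not None and score < min_score
    # The gates stack: either one is enough to fail the run.
    return 1 if (severity_gate or score_gate) else 0
```

Under this model, `exit_code(3, 45, fail_on_warning=False)` passes while adding `min_score=85` makes the same run fail.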
A profile is a preset bundle of rule thresholds tuned for a specific audience.
dev-doc
For technical documentation, API references, ADRs, and developer-facing content.
Thresholds are loose: technical readers have higher tolerance for long sentences, nominalizations, and domain-specific jargon.
public (default)
For general-audience content: marketing pages, product descriptions, blog posts.
Thresholds are moderate. Plain-language guidelines apply.
falc
For content that follows the Facile À Lire et à Comprendre / Easy-to-Read European standard.
Thresholds are strict: short sentences, simple vocabulary, no passive voice, no undefined acronyms.
Start with the profile that matches the intent of the content. Override specific rules if needed via lucid-lint.toml.
See the rule reference for exact thresholds per rule and per profile.
The overall pattern is:
- dev-doc: 30 words per sentence, 4 commas, 7 sentences per paragraph
- public: 22 words per sentence, 3 commas, 5 sentences per paragraph
- falc: 15 words per sentence, 2 commas, 3 sentences per paragraph
The same file linted three times under dev-doc, public, and
falc in turn; the score drops as the profile tightens:

Any per-rule threshold set in lucid-lint.toml takes precedence over the profile preset.
[default]
profile = "public"
[rules.sentence-too-long]
max_words = 18 # stricter than public's 22
A condition tag describes the cognitive condition a rule primarily targets. Conditions are orthogonal to profiles: a profile (dev-doc, public, falc) sets the strictness of the always-on rules; conditions enable additional rules tuned for a specific audience.
| Tag | Targets |
|---|---|
| general | Always-on rules. The v0.2 baseline. |
| a11y-markup | Prose-adjacent markup signals (e.g. all-caps shouting). |
| dyslexia | Dyslexia-targeted signals. Source: BDA Dyslexia Style Guide. |
| dyscalculia | Numeric format and anchoring. Source: CDC Clear Communication Index. |
| aphasia | Aphasia-targeted signals. Source: FALC, plain-language guides. |
| adhd | Attention-fragility signals. |
| non-native | Non-native reader signals (vocabulary rarity, idioms). |
The set is fixed. New tags are a deliberate, versioned change.
For every rule the engine evaluates:
- A rule tagged general is always enabled.
- A rule without general runs only when at least one of its tags appears in the user’s active condition list.
All 17 v0.2 rules carry general, so the default behavior is unchanged. Future tagged rules (e.g. lexicon.all-caps-shouting for a11y-markup, syntax.nested-negation for aphasia + adhd) opt in via this list.
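The enabling logic described above reduces to a set intersection. A minimal sketch, assuming tags and active conditions are plain sets of strings; `rule_enabled` is a hypothetical helper, not part of lucid-lint:

```python
def rule_enabled(rule_tags: set, active_conditions: set) -> bool:
    """Hypothetical model of the condition-tag gate."""
    # Rules tagged "general" are always on.
    if "general" in rule_tags:
        return True
    # Otherwise at least one tag must be in the active condition list.
    return bool(rule_tags & active_conditions)


# A rule tagged for aphasia and adhd, as in the example above.
nested_negation = {"aphasia", "adhd"}
print(rule_enabled(nested_negation, set()))
print(rule_enabled(nested_negation, {"dyslexia", "aphasia"}))
```

With no active conditions the tagged rule stays off; listing aphasia turns it on.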
In lucid-lint.toml:
[default]
profile = "falc"
conditions = ["dyslexia", "aphasia"]
On the command line (comma-separated, repeatable):
lucid-lint check --profile falc --conditions dyslexia,aphasia docs/
FALC retains its regulatory meaning. Adding dyslexia does not relax or rename it — it layers dyslexia-specific signals on top.
Three strictness levels × N conditions explodes combinatorially. Keeping the two axes orthogonal preserves the regulatory meaning of falc while letting users compose audience-specific overlays. See ROADMAP entries F71 and F72.
lucid-lint is configured via a lucid-lint.toml file at the project root (optional) and CLI flags (overrides the file).
# lucid-lint.toml
[default]
profile = "public"
[rules.sentence-too-long]
max_words = 22
[rules.passive-voice]
max_per_paragraph = 2
[default]
Top-level defaults applied to the whole run.
| Field | Type | Default | Description |
|---|---|---|---|
| profile | string | "public" | One of dev-doc, public, falc |
| conditions | array of strings | [] | Active condition tags. See Conditions. |
| exclude | array of glob strings | [] | Paths to skip during directory recursion. See Excluding paths. |
[rules.<rule-id>]
Per-rule configuration. The fields available depend on the rule. See the rule pages in Rules reference.
[scoring]
Tunables for the hybrid scoring model. All fields are
optional; missing fields fall back to the shipped defaults
(category_max = 20, category_cap = 15).
[scoring]
category_max = 20
category_cap = 15
[scoring.weights]
sentence-too-long = 3
weasel-words = 2
The [scoring.weights] sub-table is keyed by rule id. Unknown ids are
ignored, so removing a rule in a future version does not break older
configs.
From lowest to highest:
1. Profile preset (default: public)
2. lucid-lint.toml overrides
3. CLI flags
An unset CLI flag defers to the TOML value; an unset TOML field defers to the profile preset.
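The precedence chain can be illustrated for one threshold, max_words of sentence-too-long. This is a sketch under assumptions: the TOML file is represented as an already-parsed dict, and `effective_max_words` is a hypothetical helper rather than anything lucid-lint exposes.

```python
# Per-profile presets for sentence-too-long, as documented.
PROFILE_MAX_WORDS = {"dev-doc": 30, "public": 22, "falc": 15}


def effective_max_words(cli_profile=None, toml=None):
    """Hypothetical resolver: CLI flag, then TOML value, then profile preset."""
    toml = toml or {}
    # An unset CLI flag defers to the TOML value, then to the default profile.
    profile = cli_profile or toml.get("default", {}).get("profile", "public")
    # An unset TOML field defers to the profile preset.
    rule_cfg = toml.get("rules", {}).get("sentence-too-long", {})
    return rule_cfg.get("max_words", PROFILE_MAX_WORDS[profile])


config = {"default": {"profile": "public"},
          "rules": {"sentence-too-long": {"max_words": 18}}}
print(effective_max_words(toml=config))
```

Here the per-rule TOML value (18) wins over the public preset (22), and passing `cli_profile="falc"` with no per-rule override would yield the falc preset instead.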
lucid-lint walks up from the current working directory to the first lucid-lint.toml it finds, stopping at the nearest .git repo boundary. Passing --config <path> skips auto-discovery and loads the given file directly; a missing explicit path is an error, but a missing auto-discovered file is not.
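The walk-up discovery can be sketched with pathlib. This is one reading of the description, stated as an assumption: each directory is checked for lucid-lint.toml first, and the search stops after a directory that contains .git. `find_config` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def find_config(start: Path) -> Optional[Path]:
    """Hypothetical sketch of config auto-discovery."""
    current = start.resolve()
    while True:
        candidate = current / "lucid-lint.toml"
        if candidate.is_file():
            return candidate
        # Assumed boundary rule: do not search above the repository root.
        if (current / ".git").exists():
            return None
        # Filesystem root reached without finding a config.
        if current.parent == current:
            return None
        current = current.parent
```

A return value of None mirrors the documented behaviour that a missing auto-discovered file is not an error.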
Large documentation repositories routinely contain generated output,
vendored text, and snapshots that would drown the linter in noise. Use
the exclude field in [default] — or the --exclude <GLOB> CLI flag
— to skip them at discovery time, before parsing.
[default]
exclude = [
"vendor/**",
"**/fixtures/**",
"CHANGELOG.md",
]
Equivalently on the command line:
lucid-lint check --exclude 'vendor/**,**/fixtures/**,CHANGELOG.md' docs
Notes:
- Patterns match relative to the path being walked: lucid-lint check docs with exclude = ["drafts/**"] skips docs/drafts/....
- If you pass docs/CHANGELOG.md directly on the command line, it is linted even when CHANGELOG.md is in the exclude list. If you named the path, you meant it.
- CLI --exclude and TOML exclude are unioned, not overridden. Comma-separate multiple patterns in a single flag, or repeat --exclude.
Markdown documents support
inline-disable directives for local silencing, but
plain text and stdin have no such escape hatch. [[ignore]] fills
that gap and works uniformly across all input formats.
[[ignore]]
rule_id = "unexplained-abbreviation"
[[ignore]]
rule_id = "weasel-words"
Each [[ignore]] entry removes every diagnostic whose rule_id
matches, across Markdown files, plain text, and stdin. The filter is
applied after all rules have run but before scoring, so the score
reflects the post-filter view too.
Notes:
- Use [[ignore]] only when a rule is genuinely noisy project-wide.
- A reason = "..." field on each entry is tracked as F-suppression-reason-field. When it lands it will be surfaced in reports and optionally required via config.
TOML-driven config is wired rule-by-rule as each Config gains a dedicated accessor. Three rules honour it today:
[rules.readability-score]
[rules.readability-score]
formula = "kandel-moles" # or "flesch-kincaid", "auto"
Pins the readability formula regardless of detected language. auto (default) preserves the F-readability-formulas-extra per-language selection.
[rules.unexplained-abbreviation]
[rules.unexplained-abbreviation]
whitelist = ["WCAG", "ARIA", "ADHD", "LLM"]
Entries are additive over the profile baseline (F31). Use this to restore project-specific acronyms — accessibility standards, domain initialisms, engineering-practice terms — that the v0.2 baseline no longer ships. Each entry is silenced globally across the document, same as if it had been defined inline via Expansion (ACRONYM).
[rules."structure.excessive-commas"]
[rules."structure.excessive-commas"]
max_commas = 2
Overrides the per-sentence comma ceiling (default: 4 / 3 / 2 for dev-doc / public / falc). Must be a positive integer — 0 or negative values are rejected at load time. The override replaces the profile preset; it is not additive.
Tables for other rules parse without error but have no runtime effect. Extending this list is a mechanical per-rule change and will continue through the v0.2.x cycle.
v0.2 adds a hybrid scoring model on top of the existing diagnostics. Every run now answers two questions at once:
The two surfaces are complementary. Scores are summaries; diagnostics remain the actionable signal.
The score takes the form X / max — an arbitrary maximum rather than a
0–100 normalized number. v0.2 ships with max = 100 (five categories ×
twenty points), but the number is treated as a test-and-learn calibration:
the scale may shift in a future minor release as rule weights are tuned
against real corpora.
The rules of thumb for today’s calibration:
| Range | Reading |
|---|---|
| 80 – 100 | Score reads green in the terminal. Nothing blocking. |
| 60 – 79 | Score reads yellow. A handful of hits worth reviewing. |
| 0 – 59 | Score reads red. Dense issues or a runaway rule. |
The colour bands are a reader aid, not a pass / fail contract. For CI
gating, use --min-score with a concrete
number you picked.
Every rule belongs to exactly one category. v0.2 fixes the taxonomy at five buckets:
| Category | Covers |
|---|---|
| structure | Length, nesting, punctuation, document skeleton |
| rhythm | Cadence and repetition across adjacent sentences |
| lexicon | Vocabulary, terminology, acronyms, lexical diversity |
| syntax | Sentence-level style and syntactic clarity |
| readability | Document-level readability metrics |
See the rules reference for the rule-to-category mapping.
For a single document:
per_rule_cost = Σ (weight × severity_multiplier) over hits
per_category_cost = min(Σ per_rule_cost / (words / 1000), ← density
category_cap) ← cap
category_score = category_max − per_category_cost (clamped ≥ 0)
global_score = Σ category_score
Three mechanics stack:
1. Weighted sum. Each hit costs weight × severity_multiplier. The default weight table lives in scoring::default_weight_for and emphasises rules whose hits carry the most cognitive load (readability-score = 5, length / subordination / passive / unclear-antecedent = 2, everything else = 1).
2. Density normalization. The summed cost is divided by words / 1000 so a 10 000-word handbook is not punished for having more hits than a 400-word README. Documents shorter than 200 words are treated as 200-word documents, so tiny fixtures are not artificially penalized.
3. Category cap. Each category loses at most category_cap out of category_max. A single noisy rule eats at most 75 % of its own category (15 / 20 by default) and cannot leak into the others.
The severity multiplier is info = 1, warning = 3, error = 5.
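The formula and the three mechanics can be worked through in Python. This is a sketch of the documented arithmetic, not the code in the scoring module: the function names are hypothetical, rounding of fractional scores is not modelled, and the sample document is assumed to be under 200 words.

```python
SEVERITY_MULTIPLIER = {"info": 1, "warning": 3, "error": 5}


def category_score(hits, words, category_max=20, category_cap=15):
    """hits: list of (weight, severity) pairs for one category."""
    # Weighted sum over hits.
    raw = sum(weight * SEVERITY_MULTIPLIER[sev] for weight, sev in hits)
    # Density: cost per 1000 words, with a 200-word floor for tiny documents.
    density = raw * 1000 / max(words, 200)
    # Cap: one category never loses more than category_cap points.
    cost = min(density, category_cap)
    return max(category_max - cost, 0)


def global_score(hits_by_category, words):
    return sum(category_score(h, words) for h in hits_by_category.values())


# Hits from the examples/sample.md run shown on this page, using the
# documented default weights.
sample = {
    "structure": [(2, "warning"), (1, "warning")],  # too-long + line-length
    "rhythm": [],
    "lexicon": [(1, "warning")],                    # weasel-words
    "syntax": [(2, "info")],                        # unclear-antecedent
    "readability": [(5, "info")],                   # readability-score
}
print(global_score(sample, words=150))
```

Under the stated word-count assumption, the per-category results are 5, 20, 5, 10, and 5, which matches the 45/100 breakdown in the sample output.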
The terminal formatter prints each diagnostic, a short summary line, then a score block: the global number followed by every category score with an eight-step sparkline bar.

The same run rendered as plain text, for screen readers and copy-paste:
warning examples/sample.md:7:1 Sentence is 35 words long (maximum 30). Consider splitting it into shorter sentences. [section: A paragraph with a long sentence] [structure.sentence-too-long]
warning examples/sample.md:7:11 Weasel phrase "rather" weakens the statement. Replace with concrete language or remove it. [section: A paragraph with a long sentence] [lexicon.weasel-words]
info examples/sample.md:1:1 Flesch-Kincaid grade 6.8 (target ≤ 14.0). [readability.score]
info examples/sample.md:7:1 Sentence starts with a bare demonstrative "this". Name the referent to avoid forcing the reader to guess. [section: A paragraph with a long sentence] [syntax.unclear-antecedent]
warning examples/sample.md:7:1 Line is 210 characters wide (maximum 120). [section: A paragraph with a long sentence] [structure.line-length-wide]
summary: 3 warnings, 2 info.
→ run 'lucid-lint explain <rule-id>' — seen here: structure.sentence-too-long, lexicon.weasel-words, readability.score + 2 more
────────────────────────────────────────────────────────────
score: 45/100
structure █▎░░░ 5/20
rhythm █████ 20/20
lexicon █▎░░░ 5/20
syntax ██▌░░ 10/20
readability █▎░░░ 5/20
All five categories are always displayed so the breakdown stays
structurally stable run-to-run. A perfect document reads score: 100/100 with every bar full (█████). When the same rule fires two
or more times on one file, the hits cluster under a compact header
and any shared message or section is hoisted up so it only appears
once.
The JSON schema is at version = 2 in v0.2. New fields:
{
"version": 2,
"diagnostics": [
{
"rule_id": "structure.sentence-too-long",
"severity": "warning",
"location": { "file": { "kind": "path", "path": "draft.md" }, "line": 12, "column": 1, "length": 42 },
"section": "Introduction",
"message": "Sentence is 27 words long (maximum 22).",
"weight": 2
}
],
"summary": { "info": 0, "warning": 1, "error": 0, "total": 1 },
"score": { "value": 88, "max": 100 },
"category_scores": [
{ "category": "structure", "value": 8, "max": 20 },
{ "category": "rhythm", "value": 20, "max": 20 },
{ "category": "lexicon", "value": 20, "max": 20 },
{ "category": "syntax", "value": 20, "max": 20 },
{ "category": "readability", "value": 20, "max": 20 }
]
}
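A CI script might consume the report like this. The sketch below uses only the fields shown in the sample above; `check_report` and `weakest_category` are hypothetical helpers, and the embedded report is a trimmed copy of that sample.

```python
import json

REPORT = """
{
  "version": 2,
  "summary": {"info": 0, "warning": 1, "error": 0, "total": 1},
  "score": {"value": 88, "max": 100},
  "category_scores": [
    {"category": "structure", "value": 8, "max": 20},
    {"category": "rhythm", "value": 20, "max": 20},
    {"category": "lexicon", "value": 20, "max": 20},
    {"category": "syntax", "value": 20, "max": 20},
    {"category": "readability", "value": 20, "max": 20}
  ]
}
"""


def weakest_category(report: dict) -> str:
    """Name of the category with the lowest value / max ratio."""
    worst = min(report["category_scores"], key=lambda c: c["value"] / c["max"])
    return worst["category"]


def check_report(raw: str, min_score: int):
    """Return (passed, weakest category) for a version-2 report."""
    report = json.loads(raw)
    if report["version"] != 2:
        raise ValueError("expected schema version 2")
    return report["score"]["value"] >= min_score, weakest_category(report)


print(check_report(REPORT, 85))
```

Checking the version first keeps a consumer from silently misreading a report produced by a different schema.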
Category values are lowercase strings in the fixed order listed above. Consumers that parsed the v0.1 schema should:
- bump the expected version from 1 to 2;
- rename the old categories (length → structure,
lexical → lexicon, style → syntax, global → readability).
--min-score
The check subcommand takes an optional --min-score=N flag. The run
exits 1 if the aggregate global score is below N, independently of
the severity-based gate.
# Fail the build if overall quality drops below 85/100
lucid-lint check --min-score=85 docs/
Both gates stack: the run fails if either the severity gate trips or the score gate trips. Pick one or both depending on your workflow:
- Severity gate only (default): catches newly introduced warnings.
- Score gate only (--fail-on-warning=false --min-score=85):
tolerates individual warnings but fails when density drifts past your
threshold.
- Both (default + --min-score=85): both spikes and drifts fail the
build.
lucid-lint.toml
Projects can override the calibration in their lucid-lint.toml:
[scoring]
category_max = 20
category_cap = 15
[scoring.weights]
sentence-too-long = 3
weasel-words = 2
Missing fields fall back to the shipped defaults. The [scoring.weights]
sub-table is keyed by rule id; unknown ids are ignored so removing a rule
later doesn’t break older configs.
The brainstorm that shaped F14 (see
brainstorm/20260420-score-semantics.md)
kept the model minimal. Decorations are promoted only when user feedback
requires them.
lucid-lint supports two inline directives for silencing diagnostics in Markdown input. They are intended for the rare cases where a rule fires on intentional prose (a quoted weasel word, a didactic heavy-nominalization example, a legitimate passive). Prefer rewriting the prose first; reach for a directive when the detection is a known false positive or when the author has considered the warning and chosen to keep the text.
<!-- lucid-lint disable-next-line structure.sentence-too-long -->
A long sentence that is intentional and should not be flagged.
An optional reason can be attached: <!-- lucid-lint disable-next-line lexicon.weasel-words reason="quoting the style guide" -->. It is surfaced in JSON output and will be required via config in a future release (tracked as F-suppression-reason-field in the roadmap).
<!-- lucid-lint-disable structure.sentence-too-long -->
A long sentence.
Another long sentence in the same scope.
<!-- lucid-lint-enable -->
- <!-- lucid-lint-disable <rule-id> --> opens a scope for one rule.
- <!-- lucid-lint-enable --> closes every currently-open scope. Passing a rule id (<!-- lucid-lint-enable <rule-id> -->) closes only that rule’s scope, which lets overlapping disables for different rules nest cleanly.
- Whole-file silencing will use the disable-file directive (F-suppression-disable-file) once it lands.
- Config-based ignores ([[ignore]] in lucid-lint.toml) covering .txt and stdin are tracked as F19.
The following extensions are tracked on the roadmap:
| ID | Item |
|---|---|
| F19 | Config-based ignores ([[ignore]] in lucid-lint.toml) for .txt and stdin inputs |
| F-suppression-reason-field | Optional-then-required reason="..." field, surfaced in reports |
| F-suppression-disable-file | File-level directive (disable-file) and multi-rule comma lists |
See the ## Suppression section on any rule page under Rules reference.
lucid-lint is designed for CI. It returns:
- 0 when no issues (or only info) are found
- 1 when warnings are found
- 2 on runtime error (invalid args, unreadable file)
name: Docs lint
on:
  pull_request:
    paths:
      - '**/*.md'
  push:
    branches: [main]
    paths:
      - '**/*.md'
jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install lucid-lint
        run: cargo install lucid-lint
      - name: Lint docs
        run: lucid-lint check --profile=public docs/ README.md
Add to your .pre-commit-config.yaml:
repos:
  - repo: local
    hooks:
      - id: lucid-lint
        name: lucid-lint
        entry: lucid-lint check --profile=public
        language: system
        types: [markdown]
To surface diagnostics as pull request review comments:
lucid-lint check --format=json docs/ | reviewdog -f=rdjson -reporter=github-pr-review
Note: RDJSON adapter is not shipped. For native code-review surfacing, prefer the GitHub Code Scanning workflow below.
--format=sarif emits a SARIF v2.1.0 log that GitHub’s Code Scanning ingests directly: each diagnostic becomes a code-scanning alert annotated on the PR diff.
name: Lucid lint (code scanning)
on:
  pull_request:
    paths: ['**/*.md']
  push:
    branches: [main]
    paths: ['**/*.md']
permissions:
  security-events: write
  contents: read
jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install lucid-lint
      - name: Run lucid-lint and emit SARIF
        run: |
          lucid-lint check \
            --profile=public \
            --format=sarif \
            --fail-on-warning=false \
            docs/ README.md > lucid-lint.sarif
      - name: Upload SARIF to Code Scanning
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: lucid-lint.sarif
          category: lucid-lint
Notes:
- --fail-on-warning=false lets the upload step always run; rely on Code Scanning’s own gating in the PR UI rather than the lint exit code.
- Every rule is listed under runs[0].tool.driver.rules with its category, default severity, default scoring weight, and a helpUri pointing at the per-rule mdBook page.
- On each result, properties.weight and properties.section carry the scoring weight and the heading the diagnostic was found under.
To avoid failing CI on warnings (e.g., during a gradual adoption phase), you can invert the default:
lucid-lint check --fail-on-warning=false docs/
This always returns 0 except on runtime error.
You can also gate the build on the aggregate
scoring model. The run exits 1 if the global score is
below the threshold, independently of the severity gate.
jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install lucid-lint
      - name: Lint and gate on score
        run: lucid-lint check --min-score=85 docs/ README.md
Both gates stack — the run fails if either trips. Pick the combination that fits your adoption curve:
| Goal | Flags |
|---|---|
| Catch newly introduced warnings (default behaviour) | default |
| Tolerate individual warnings but fail on drift | --fail-on-warning=false --min-score=85 |
| Fail on both spikes and drift | default + --min-score=85 |
A gated run that fails — lucid-lint prints its usual summary, then the shell surfaces the non-zero exit code:

$ lucid-lint check --min-score=85 examples/sample.md
…
score: 45/100
structure █▎░░░ 5/20
rhythm █████ 20/20
lexicon █▎░░░ 5/20
syntax ██▌░░ 10/20
readability █▎░░░ 5/20
$ echo "exit: $?"
exit: 1
lucid-lint ships 25 rules as of v0.2 (17 from v0.1 plus 8 added during the v0.2 cycle). Each rule has a dedicated page below with category, severity, default weight, thresholds per profile, examples, and suppression guidance.
The compact reference at RULES.md remains the single-file overview kept in the repository root. The academic and normative sources behind every rule are consolidated on the References page.
Every rule belongs to exactly one of five fixed buckets. The taxonomy is authoritative — the scoring model composes per-category sub-scores into the global X / max.
Authoritative source. The category of each rule is determined by
Category::for_rule in src/types.rs. The mapping above mirrors that function. A coverage test (tests/rule_docs_coverage.rs) keeps the per-rule pages, the category helper, and the scoring weights in lock-step.
| Level | Meaning | Effect |
|---|---|---|
| info | Signal worth knowing, not a defect | Reported; does not fail CI |
| warning | Quality issue worth fixing | Reported; fails CI unless --fail-on-warning=false |
| error | Reserved for v0.3+ | Not emitted in v0.2 |
Each rule bundles its reference page into the binary. Run lucid-lint explain <rule-id> to print the same content this website serves —
useful when CI only gives you a diagnostic id and no browser:

lucid-lint explain structure.sentence-too-long
See Contributing for the rule-addition checklist — every new rule must land with a page in this section.
structure.sentence-too-long
Sentences whose length exceeds a per-profile ceiling. The intrinsic cognitive load of a sentence grows non-linearly with its word count (Graesser et al. 2004, Coh-Metrix); FALC caps at 15 words, Plain English at 20. Long sentences make a reader under attentional load more likely to lose the thread mid-read.
| Category | structure |
| Default severity | warning |
| Default weight | 2 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/sentence_too_long.rs |
Split text into sentences via strong punctuation (., !, ?, …, paragraph breaks). Count Unicode word tokens, excluding punctuation. Contractions (don't) and elisions (l'accessibilité) count as one word when the apostrophe sits between two letters. Code blocks are skipped.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_words | int | 30 | 22 | 15 |
| exclude_code_blocks | bool | true | true | true |
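The detection described above can be approximated with two regular expressions. This is a rough sketch, not the tokenizer in src/rules/sentence_too_long.rs: the patterns are assumptions, and code blocks are not skipped. On the English and French examples on this page it happens to reproduce the reported counts of 33 and 29 words.

```python
import re

MAX_WORDS = {"dev-doc": 30, "public": 22, "falc": 15}

# A word is a run of word characters; an apostrophe between two letters
# (don't, l'accessibilité) keeps both halves in one token.
WORD = re.compile(r"\w+(?:['’]\w+)*")
# Sentences end at strong punctuation followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def long_sentences(text, profile="public"):
    """Return (word_count, sentence) for each sentence over the ceiling."""
    limit = MAX_WORDS[profile]
    flagged = []
    for sentence in SENTENCE_END.split(text.strip()):
        count = len(WORD.findall(sentence))
        if count > limit:
            flagged.append((count, sentence))
    return flagged


before = ("The caching subsystem, which was introduced in an earlier "
          "milestone, turned out to interact poorly with the new request "
          "pipeline under sustained load, and the investigation that "
          "followed required multiple rounds of profiling.")
print(long_sentences(before)[0][0])
```

Note that a hyphenated compound such as sous-système counts as two tokens in this sketch, because the hyphen is not part of the word pattern.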
Three ideas, colour-matched across the rewrite — position already pairs them, the tint just confirms the rewrite loses none. lucid-lint reports; the rewrite is always yours.
Before (flagged):
The caching subsystem, which was introduced in an earlier milestone, turned out to interact poorly with the new request pipeline under sustained load, and the investigation that followed required multiple rounds of profiling.
What lucid-lint check --profile public reports:
warning input.md:1:1 Sentence is 33 words long (maximum 22). Consider splitting it into shorter sentences. [structure.sentence-too-long]
After (your rewrite):
The caching subsystem was introduced earlier. It interacts poorly with the new request pipeline under sustained load. The investigation required several rounds of profiling.
Before (flagged):
Le sous-système de cache introduit lors d’un jalon précédent interagit mal avec le nouveau pipeline de requêtes sous charge soutenue, et l’enquête a nécessité plusieurs rondes de profilage.
What lucid-lint check --profile public reports:
warning input.md:1:1 Sentence is 29 words long (maximum 22). Consider splitting it into shorter sentences. [structure.sentence-too-long]
After (your rewrite):
Le cache a été introduit lors d’un jalon précédent. Il interagit mal avec le nouveau pipeline sous charge soutenue. L’enquête a nécessité plusieurs rondes de profilage.
See Suppressing diagnostics for the inline and block forms.
rhythm.consecutive-long-sentences: catches rhythm; its threshold must stay lower than max_words here.
structure.sentence-too-long carries weight 2 because the cognitive cost compounds with length.
See References for the full bibliography.
structure.paragraph-too-long
Paragraphs that overrun either a sentence-count or a word-count threshold. A paragraph is a visual reprise unit: long paragraphs dilute the reprise point for readers who interrupt often. Both metrics are checked so that a short-but-dense paragraph (one 80-word sentence) is still caught; structure.sentence-too-long covers the complementary case.
| Category | structure |
| Default severity | warning |
| Default weight | 2 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/paragraph_too_long.rs |
Split on blank lines (Markdown paragraph convention). Count sentences and words per paragraph. Flag paragraphs exceeding either threshold.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_sentences | int | 7 | 5 | 3 |
| max_words | int | 150 | 100 | 60 |
A paragraph of eight medium sentences under the public profile will fire on max_sentences. A paragraph containing a single 120-word sentence will fire on max_words (and also on structure.sentence-too-long).
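Both checks can be sketched together. The splitting patterns below are simplifying assumptions rather than the logic in src/rules/paragraph_too_long.rs, and `paragraph_issues` is a hypothetical name.

```python
import re

# (max_sentences, max_words) per profile, as documented.
LIMITS = {"dev-doc": (7, 150), "public": (5, 100), "falc": (3, 60)}


def paragraph_issues(text, profile="public"):
    """Return (paragraph_index, exceeded_key, observed_value) tuples."""
    max_sentences, max_words = LIMITS[profile]
    issues = []
    # Paragraphs are separated by blank lines.
    for index, para in enumerate(re.split(r"\n\s*\n", text.strip()), 1):
        sentences = [s for s in re.split(r"(?<=[.!?…])\s+", para.strip()) if s]
        words = re.findall(r"\w+(?:['’]\w+)*", para)
        if len(sentences) > max_sentences:
            issues.append((index, "max_sentences", len(sentences)))
        if len(words) > max_words:
            issues.append((index, "max_words", len(words)))
    return issues


eight_sentences = " ".join(["This sentence has five words."] * 8)
one_long_sentence = " ".join(["word"] * 120) + "."
print(paragraph_issues(eight_sentences + "\n\n" + one_long_sentence))
```

The first paragraph trips max_sentences only, and the second trips max_words only, mirroring the two cases described above.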
See References for the full bibliography.
structure.heading-jump
Heading-level jumps that break the document’s mental map (e.g. H2 → H4). Each level must follow the previous by at most +1. Readers with attentional difficulties lean heavily on heading hierarchy to reposition after an interruption; a broken hierarchy destroys that cue. Also flags the first heading being deeper than H2 when allow_first_heading_any_level is false, and missing H1 when require_h1 is true.
References. WCAG 2.1 SC 1.3.1 (Info and Relationships) and 2.4.6 (Headings and Labels); RGAA 9.1.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Languages | language-agnostic |
| Source | src/rules/heading_jump.rs |
Parse Markdown headings (#, ##, …). Walk them in source order; report each heading whose level exceeds the previous by more than one. Deterministic, no false positives.
| Key | Type | Default |
|---|---|---|
| allow_first_heading_any_level | bool | true |
| require_h1 | bool | false |
A binary rule — no per-profile thresholds.
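The core walk is short enough to sketch. This version is an assumption-laden approximation: it recognises ATX headings with a regular expression instead of a Markdown parser, does not skip fenced code blocks, and leaves out the allow_first_heading_any_level and require_h1 options.

```python
import re

# One to six '#' characters followed by whitespace and some text.
HEADING = re.compile(r"^(#{1,6})\s+\S")


def heading_jumps(markdown):
    """Return (line_number, previous_level, level) for each jump over +1."""
    jumps = []
    previous = None
    for line_no, line in enumerate(markdown.splitlines(), 1):
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if previous is not None and level > previous + 1:
            jumps.append((line_no, previous, level))
        previous = level
    return jumps


print(heading_jumps("# Overview\n#### Details"))
```

Moving back up the hierarchy (for example H3 to H1) is never reported, since only increases greater than one level count as jumps.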
Flagged:
# Overview
#### Details ← jumps from H1 to H4
Clean:
# Overview
## Section
### Subsection
structure.deeply-nested-lists: the list-level equivalent signal.
See References for the full bibliography.
structure.deeply-nested-lists
Bulleted list items nested beyond a reasonable depth. A deeply nested list forces the reader to reconstruct a complex mental hierarchy: horizontal indentation stops being a positional cue and becomes noise. Four levels of indent are too many for readers with attentional difficulties to track.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Languages | language-agnostic |
| Source | src/rules/deeply_nested_lists.rs |
Parse Markdown via pulldown-cmark; extract list items with their indentation level; flag items deeper than max_depth. Deterministic, no false positives.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_depth | int | 4 | 3 | 2 |
Under public (max depth 3):
- Level 1
  - Level 2
    - Level 3
      - Level 4 ← flagged
The diagnostic includes repair guidance: flatten the structure, split it into multiple lists, or promote sub-items to subsections with headings.
See References for the full bibliography.
structure.line-length-wide
Author-chosen lines wider than the per-profile ceiling. WCAG 1.4.8 (AAA) caps rendered text at roughly 80 characters per line because longer lines force the eye to track further between saccades and increase re-reading on return sweep, a known difficulty for dyslexic readers (BDA Dyslexia Style Guide).
“Author-chosen” matters: in Markdown, soft-wrapped lines collapse to spaces at parse time because the renderer reflows them to fit the viewport. Their source length tells us nothing about what the reader sees. Only line breaks the author kept on purpose are checked here — Markdown hard breaks (<br> or two trailing spaces) and explicit newlines in plain-text input. A soft-wrapped Markdown paragraph is exempt no matter how long its joined text is. Use structure.paragraph-too-long to bound paragraph density.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | dyslexia, general |
| Languages | EN · FR (script-agnostic) |
| Source | src/rules/line_length_wide.rs |
For every paragraph that carries an authorial line break, scan each line’s width in grapheme clusters and report lines above max_line_length.
A Markdown paragraph with no hard break inside it (the common case for prose) is exempt — the parser collapses its soft breaks to spaces, so what remains is one logical line whose source length tracks the viewport, not the rendered width WCAG 1.4.8 targets. Plain-text input is treated symmetrically: a paragraph with no inner \n is exempt; one with internal newlines is checked line by line.
Fenced and indented code blocks are excluded upstream by the Markdown parser. Headings, list items, and table cells are out of scope by construction — paragraph-too-long, sentence-too-long, and the heading rules cover the cognitive-load concerns that apply to those blocks.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_line_length | int | 120 | 100 | 80 |
FALC matches the WCAG 1.4.8 AAA recommendation of 80 characters.
Long single-line prose paragraphs in Markdown source are intentionally exempt. The rule used to fire on them and produced large amounts of noise on real prose; v0.2.x narrows the rule to author-chosen breaks only. Pair this rule with structure.paragraph-too-long if you also want a ceiling on the joined paragraph length.
Headings and list items are not measured by this rule. Their wrap behavior depends on the renderer (display type, list indent), and the underlying cognitive concerns are covered by other rules.
See References for the full bibliography.
structure.mixed-numeric-format
Sentences that mix digit numerals (42, 3.14, 1,000, 1 000) with spelled-out numerals (two, trois, twenty, cent) within the same sentence. Presenting numbers inconsistently forces the reader to switch surface forms mid-clause and re-anchor the referent, a known load for readers with dyscalculia and a plain-language anti-pattern.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | dyscalculia, general |
| Languages | EN · FR |
| Source | src/rules/mixed_numeric_format.rs |
For each sentence emitted by the tokenizer, scan for digit-numeric tokens and for entries in the per-language spelled-numeral list. If at least one of each kind co-occurs, emit a single diagnostic for the sentence citing one representative token of each kind.
Digit tokens accept ASCII digits plus an optional decimal (.) or thousands separator (,, narrow space U+0020) when flanked by digits on both sides. Spelled-out matches are case-insensitive ASCII compares against en::SPELLED_NUMERALS and fr::SPELLED_NUMERALS.
The ambiguous forms one (EN) and un / une (FR) are excluded from the spelled-numeral list because they double as indefinite pronouns and articles. This keeps the false-positive rate manageable at the cost of missing genuine mixed-format cases whose only spelled-out numeral is one. Metropolitan French and Swiss / Belgian regional forms (septante, huitante, octante, nonante) are all included.
Sentences are produced by the shared tokenizer (see src/parser/tokenizer.rs), so abbreviations, decimals, and ellipses do not spuriously split sentences. Fenced and indented code blocks are excluded upstream by the Markdown parser.
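The co-occurrence test can be illustrated with a small sketch. It is not the shipped detector: SPELLED_EN is a hypothetical six-word subset standing in for en::SPELLED_NUMERALS, the function name mixed_numeric_pair is invented for the example, and tokens are split on every non-alphanumeric character, so 1,000 is seen as two digit tokens rather than one.

```rust
/// Hypothetical subset of an English spelled-numeral list.
/// "one" is left out on purpose, as the rule description requires.
const SPELLED_EN: [&str; 6] = ["two", "three", "ten", "twenty", "hundred", "thousand"];

/// Return one representative digit token and one spelled-out numeral
/// when both kinds occur in the sentence, otherwise None.
fn mixed_numeric_pair(sentence: &str, spelled: &[&str]) -> Option<(String, String)> {
    let mut digit: Option<String> = None;
    let mut word: Option<String> = None;
    for token in sentence.split(|c: char| !c.is_alphanumeric()) {
        if token.is_empty() {
            continue;
        }
        // Leading ASCII digits make a digit token ("2nd" contributes "2").
        let leading: String = token.chars().take_while(|c| c.is_ascii_digit()).collect();
        if !leading.is_empty() {
            if digit.is_none() {
                digit = Some(leading);
            }
        } else {
            let lower = token.to_lowercase();
            if word.is_none() && spelled.iter().any(|s| *s == lower.as_str()) {
                word = Some(lower);
            }
        }
    }
    match (digit, word) {
        (Some(d), Some(w)) => Some((d, w)),
        _ => None,
    }
}
```

A sentence whose only spelled-out numeral is one returns None here, which mirrors the documented trade-off.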
None. The rule has no configurable threshold — a single co-occurrence of the two surface forms is sufficient.
- one / un / une are not flagged, by design (see Detection).
- Ordinals (first, premier, 2nd, 3e) are out of scope. 2nd currently reads as a digit token (2) followed by a word (nd), which does not match the spelled-numeral list, so there is no false positive.
- Roman numerals (IV, XIV) are neither digits nor spelled-out numerals for this rule.
See References for the full bibliography.
structure.excessive-commas
Sentences whose comma count exceeds a per-profile ceiling. The comma is the most frequent marker of syntactic complexity; rather than disentangle the cause (subordination, apposition, enumeration, parenthetical), the rule uses density as a leading indicator of overload.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/excessive_commas.rs |
Count commas per sentence and report the sentences whose count exceeds max_commas.
Interaction. When structure.long-enumeration fires on the same sentence, this rule is suppressed for that sentence to avoid double-reporting. The shared enumeration detector discounts Oxford-style enumeration commas (3+ short items, plus a relaxed rhythmic pass for 1–4-word items, plus runs closed by plus as well as and / or — see “Known false positives” below) and commas inside (A, B, C, …) parenthesised token lists (3+ short comma-separated segments inside balanced parens) — all language-agnostic.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_commas | int | 4 | 3 | 2 |
Remaining false positives mostly come from bare lists with no terminal connector (e.g. Rules touched: A, B, C) and Oxford runs interrupted by an interleaved parenthetical; these are tracked as F22 in the roadmap for further v0.3 sub-slices.
See References for the full bibliography.
structure.long-enumeration
Inline prose enumerations that would be clearer as a bulleted list: 5+ comma-separated items closed by a coordinator (and, or, et, ou).
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/long_enumeration.rs, shared helper src/rules/enumeration.rs |
Sequence of min_items or more short comma-separated segments ending with , and / , or / , plus / , et / , ou (Oxford comma optional). Shared detector also informs structure.excessive-commas.
| Key | Type | Default |
|---|---|---|
| min_items | int | 5 |
Suggests converting the enumeration to a bulleted list.
lucid-lint reports; the rewrite is always yours.
Six items, colour-matched across the rewrite — each inline term lines up with its bullet.
Before (flagged):
The dish contains tomato, onion, garlic, basil, parsley, and thyme.
What lucid-lint check --profile public reports:
warning input.md:1:1 Inline enumeration of 5 items. Consider converting it into a bulleted list so readers can scan the items. [structure.long-enumeration]
After (your rewrite):
The dish contains:
- tomato
- onion
- garlic
- basil
- parsley
- thyme
See References for the full bibliography.
structure.deep-subordination
Cascading subordinate clauses: multiple relative pronouns or subordinating conjunctions chained without a strong-punctuation break. Each open referent has to sit in working memory until it closes; Gibson’s Dependency Locality Theory (1998) ties processing cost directly to that distance.
| Category | structure |
| Default severity | warning |
| Default weight | 2 |
| Languages | EN · FR (separate lists) |
| Source | src/rules/deep_subordination.rs |
Walk the sentence between strong-punctuation breaks; count consecutive subordinators. Flag when the count exceeds max_consecutive_subordinators. Pronoun enumerations (qui, que, dont, où) are skipped — the detector recognises the list form and does not treat it as cascading.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_consecutive_subordinators | int | 3 | 2 | 2 |
Each highlighted token is one subordinator counted by the rule. Four in a row exceeds the dev-doc threshold (3); three in a row exceeds the public and falc threshold (2).
Flagged (FR):
Le document qui a été rédigé par l’équipe que nous avons constituée et qui couvre les points que nous avions discutés…
Flagged (EN):
The report that was drafted by the team which we formed last month and which covers the topics that we had discussed…
Not flagged (enumeration form, recognised by the detector):
Les pronoms relatifs en français sont : qui, que, dont, où.
And the matching English form:
The English relative pronouns are: which, that, who, whom, whose.
See References for the full bibliography.
structure.italic-span-long
Experimental in v0.2.x. Off by default; opt in via --experimental structure.italic-span-long or [experimental] enabled = ["structure.italic-span-long"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the dyslexia condition tag that gates this rule under user-active conditions.
Italic spans (*…* / _…_) longer than a configurable word threshold. Slanted glyphs degrade letter-shape recognition for readers with dyslexia — a robust finding behind the British Dyslexia Association’s recommendation to keep italic emphasis to a short phrase rather than running a full sentence in italics. Long italic runs also harm scanability for readers whose attention is already taxed (fatigue, second-language reading, low-vision conditions).
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at v0.3 cut |
| Condition tag | dyslexia (gated; runs only under matching --conditions) |
| Languages | EN · FR (identical detection — substrate is language-agnostic) |
| Source | src/rules/structure/italic_span_long.rs |
Walks the typed inline tree captured on each Paragraph (F143 substrate) and flags every Inline::Emphasis span whose visible word count exceeds the per-profile threshold. Code blocks and inline code are excluded by the parser, so an italic span inside a code fence never fires. Strong (**bold**) does not trigger this rule — only emphasis (*italic* / _italic_).
The diagnostic location points at the opening delimiter, so the squiggle in your editor lands on the visible * or _ rather than an arbitrary column inside the paragraph.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_words | int | 12 | 8 | 5 |
Tune via lucid-lint.toml:
[rules."structure.italic-span-long"]
max_words = 6
Before (flagged):
The team eventually concluded that *the proposed migration plan would require careful coordination across three regional offices and an extended freeze window* before any deployment could begin.
What lucid-lint check --profile public --experimental structure.italic-span-long --conditions dyslexia reports:
warning input.md:1:36 Italic span is 17 words long (maximum 8). Long italic runs strain dyslexic readers; consider shortening the emphasized phrase or removing the italics. [structure.italic-span-long]
After (your rewrite):
The team eventually concluded that the proposed migration plan would require careful coordination. Three regional offices and an extended freeze window are prerequisites before any deployment.
The italics now mark a single load-bearing word — the kind of emphasis the BDA style guide endorses.
Before (flagged):
L’équipe a fini par conclure que le plan de migration proposé nécessiterait une coordination soignée entre trois bureaux régionaux et une fenêtre de gel prolongée avant tout déploiement.
What lucid-lint check --profile public --experimental structure.italic-span-long --conditions dyslexia reports:
warning input.md:1:35 Italic span is 18 words long (maximum 8). Long italic runs strain dyslexic readers; consider shortening the emphasized phrase or removing the italics. [structure.italic-span-long]
After (your rewrite):
L’équipe a fini par conclure que le plan de migration nécessiterait une coordination soignée. Trois bureaux régionaux et une fenêtre de gel prolongée sont indispensables avant tout déploiement.
See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:
<!-- lucid-lint disable-next-line structure.italic-span-long -->
A *deliberately long italic span that the rule would normally flag* lives here.
- Conditions: the dyslexia tag that gates this rule.
See References for the full bibliography.
structure.number-run
Experimental in v0.2.x. Off by default; opt in via --experimental structure.number-run or [experimental] enabled = ["structure.number-run"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the dyscalculia condition tag that gates this rule under user-active conditions.
Sentences that pack more than a configurable number of numeric tokens together. plainlanguage.gov is explicit on the framing — “Don’t put a lot of numbers together in one sentence” and “Avoid placing too many statistics close together” — and readers with dyscalculia carry the cost first: each numeric token forces a quantity-to-symbol re-anchoring that does not benefit from running prose context the way ordinary words do. Citation salads ((Smith 2020, Jones 2021, Wei 2022, Park 2023)), benchmark tables flattened into prose, and statistic-heavy paragraphs are the typical hits.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at v0.3 cut |
| Condition tag | dyscalculia (gated; runs only under matching --conditions) |
| Languages | EN · FR (identical detection — digits are language-agnostic) |
| Source | src/rules/structure/number_run.rs |
Walks each paragraph’s sentence stream (post-flattening, so fenced code blocks are already excluded by the parser) and counts numeric tokens per sentence. A numeric token is a contiguous run of ASCII digits, optionally containing one decimal separator (. or ,) followed by more digits. Hyphen, colon, slash, and whitespace split tokens.
| Input | Tokens counted | Note |
|---|---|---|
| 42 | 1 | Bare integer |
| 3.14 | 1 | Decimal separator kept |
| 1,000 | 1 | Comma separator kept |
| 2026-05-04 | 3 | Hyphens split; a date is three numbers from a load standpoint |
| $3.50 | 1 | Currency prefix is non-digit and ignored |
| 1st | 1 | Trailing letters split; the digits still count |
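The token definition above is compact enough to sketch directly. The function below is an illustrative reading of that definition, not the code in src/rules/structure/number_run.rs; the name count_numeric_tokens is an assumption made for the example. It scans bytes, which is safe on UTF-8 input because only ASCII digits and separators are compared.

```rust
/// Count numeric tokens: a run of ASCII digits, optionally followed by
/// one `.` or `,` and more digits. Everything else splits tokens.
fn count_numeric_tokens(sentence: &str) -> usize {
    let bytes = sentence.as_bytes();
    let mut count = 0;
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i].is_ascii_digit() {
            count += 1;
            // Consume the leading digit run.
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            // Accept a single separator only when a digit follows it.
            if i + 1 < bytes.len()
                && (bytes[i] == b'.' || bytes[i] == b',')
                && bytes[i + 1].is_ascii_digit()
            {
                i += 1;
                while i < bytes.len() && bytes[i].is_ascii_digit() {
                    i += 1;
                }
            }
        } else {
            i += 1;
        }
    }
    count
}
```

Each row of the table can be checked against it: count_numeric_tokens("2026-05-04") yields 3 and count_numeric_tokens("$3.50") yields 1. A rule built on this would compare the per-sentence count with max_numbers.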
The diagnostic location points at the first numeric token in the offending sentence, so the squiggle in your editor lands on the visible cluster rather than the start of the sentence.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_numbers | int | 6 | 4 | 3 |
Tune via lucid-lint.toml:
[rules."structure.number-run"]
max_numbers = 5
Before (flagged):
The 2024 cohort sat 1,200 students across 4 campuses, posted a 92.5% pass rate on the 3 reviewed papers, and improved 18 points over the prior year.
What lucid-lint check --profile public --experimental structure.number-run --conditions dyscalculia reports:
warning input.md:1:5 Sentence packs 8 numeric tokens (maximum 4). plain-language guidance recommends not placing many numbers or statistics together in one sentence; split the sentence or move some figures to a list or table. [structure.number-run]
After (your rewrite):
The 2024 cohort sat 1,200 students across 4 campuses. They posted a 92.5% pass rate on the reviewed papers and improved 18 points over the prior year.
The figures still travel together, but each sentence carries a load a dyscalculic reader can re-anchor without losing the running referent.
Before (flagged):
La promotion 2024 a réuni 1 200 étudiants sur 4 campus, affiché un taux de réussite de 92,5 % sur les 3 copies revues, et progressé de 18 points par rapport à l’année précédente.
After (your rewrite):
La promotion 2024 a réuni 1 200 étudiants sur 4 campus. Le taux de réussite atteint 92,5 % sur les copies revues et progresse de 18 points par rapport à l’année précédente.
See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:
<!-- lucid-lint disable-next-line structure.number-run -->
The 2024 cohort sat 1,200 students across 4 campuses, posted a 92.5% pass rate on the 3 reviewed papers, and improved 18 points.
- Conditions: the dyscalculia tag that gates this rule.
- structure.mixed-numeric-format: sibling rule on numeric form consistency. Atomic split: mixed-numeric-format cares whether digits and spelled-out numerals share one sentence; number-run cares about how many numeric tokens cluster regardless of form.
- … mixed-numeric-format).
See References for the full bibliography.
rhythm.consecutive-long-sentences
Streaks of long sentences within the same paragraph. An isolated long sentence is manageable; several in a row fatigue attention even when each individual sentence is under the structure.sentence-too-long ceiling. This rule catches the rhythm.
| Category | rhythm |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/consecutive_long_sentences.rs |
Walk sentences sequentially inside each paragraph. Maintain a running count of consecutive sentences above word_threshold. Fire once per streak that reaches max_consecutive.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| word_threshold | int | 20 | 15 | 10 |
| max_consecutive | int | 3 | 2 | 2 |
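A rough sketch of the streak walk, assuming the sentences of one paragraph have already been reduced to word counts. The function name long_sentence_streaks is hypothetical, and the sketch reads "reaches max_consecutive" as "streak length is at least max_consecutive"; the shipped rule may draw that boundary differently.

```rust
/// Return (start index, length) for every run of consecutive sentences
/// whose word count is above `word_threshold`, keeping only runs that
/// are at least `max_consecutive` long. One entry per streak.
fn long_sentence_streaks(
    word_counts: &[usize],
    word_threshold: usize,
    max_consecutive: usize,
) -> Vec<(usize, usize)> {
    let mut streaks = Vec::new();
    let mut run_start = 0;
    let mut run_len = 0;
    for (i, &words) in word_counts.iter().enumerate() {
        if words > word_threshold {
            if run_len == 0 {
                run_start = i;
            }
            run_len += 1;
        } else {
            // A short sentence closes the current run.
            if run_len > 0 && run_len >= max_consecutive {
                streaks.push((run_start, run_len));
            }
            run_len = 0;
        }
    }
    // Close a run that extends to the end of the paragraph.
    if run_len > 0 && run_len >= max_consecutive {
        streaks.push((run_start, run_len));
    }
    streaks
}
```

Reporting once per streak, rather than once per sentence, keeps the diagnostic count proportional to the number of fatiguing passages.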
structure.sentence-too-long
Both rules look at sentence length but catch different problems:
| Rule | Threshold (dev-doc / public / falc) | Fires on |
|---|---|---|
| structure.sentence-too-long | max_words 30 / 22 / 15 | a single sentence past the ceiling |
| rhythm.consecutive-long-sentences | word_threshold 20 / 15 / 10 | a streak of max_consecutive sentences each above the lower threshold |
Because word_threshold sits below max_words, this rule catches the rhythm even when no individual sentence trips sentence-too-long. The invariant word_threshold < max_words (per profile) keeps the two from co-firing on the same sentence.
Five ideas, colour-matched across the rewrite — only the rhythm changes. lucid-lint reports; the rewrite is always yours.
Before (flagged):
The migration introduced a caching layer that sits in front of every read from the primary database. The team observed unexpected latency spikes whenever the cache invalidated under sustained write load. A subsequent investigation traced the regression to a thundering-herd pattern that fired on every cold key. The metrics dashboard misreported the issue as a generic timeout because the trace propagation was incomplete. The fix coalesced concurrent fills, added jittered TTLs, and instrumented the cache layer with a dedicated span emitter.
Five sentences, each over 20 words — the streak fatigues attention.
What lucid-lint check --profile dev-doc reports:
warning input.md:1:1 5 consecutive sentences exceed 20 words (max 3). Vary sentence length or split the streak. [rhythm.consecutive-long-sentences]
After (your rewrite):
The migration introduced a caching layer in front of the primary database. Latency spiked whenever the cache invalidated under heavy writes. The cause was a thundering-herd pattern on cold keys. Metrics misreported it as a generic timeout — trace propagation was broken. The fix coalesced concurrent fills, added jittered TTLs, and emitted a dedicated span.
Before (flagged):
La migration a introduit une couche de cache qui se place devant chaque lecture de la base primaire. L’équipe a observé des pics de latence inattendus chaque fois que le cache s’invalidait sous une charge d’écriture soutenue. Une enquête ultérieure a relié la régression à un motif de troupeau tonnant qui se déclenchait sur chaque clé froide. Le tableau de bord des métriques signalait à tort un délai d’attente générique parce que la propagation de la trace était incomplète. Le correctif a fusionné les remplissages concurrents, ajouté un TTL avec gigue, et instrumenté la couche de cache avec un émetteur de span dédié.
What lucid-lint check --profile dev-doc reports:
warning input.md:1:1 5 consecutive sentences exceed 20 words (max 3). Vary sentence length or split the streak. [rhythm.consecutive-long-sentences]
After (your rewrite):
La migration a introduit une couche de cache devant la base primaire. La latence montait dès que le cache s’invalidait sous écritures soutenues. Le coupable : un troupeau tonnant sur les clés froides. Les métriques signalaient un délai générique — la trace était cassée. Le correctif fusionne les remplissages, ajoute un TTL avec gigue et émet un span dédié.
See Suppressing diagnostics for the inline and block forms.
- structure.sentence-too-long: catches individual long sentences; this rule catches the streak even when each sentence is under that ceiling.
- rhythm.consecutive-long-sentences carries the default weight 1; the cognitive cost is the cumulative streak, not any single sentence.
See References for the full bibliography.
rhythm.repetitive-connectors
Overuse of a single logical connector inside a short window of sentences. Connectors (opposition, cause, consequence, sequence, illustration, addition) are attentional anchors; repeated, they flatten the sense of progression. Sanders & Noordman (2000), Connectives as processing signals; Graesser et al. (2004), local cohesion.
| Category | rhythm |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (separate lists) |
| Source | src/rules/repetitive_connectors.rs |
Sliding window of window_size sentences. Per connector, count occurrences in the window. Fire once per cluster that crosses max_per_window.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_per_window | int | 4 | 3 | 2 |
| window_size | int | 5 | 5 | 5 |
| custom_connectors | list | [] | [] | [] |
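The sliding-window count can be sketched as follows. This is an illustration under simplifying assumptions, not the shipped detector: the function names are hypothetical, only single-word connectors are handled, the connector must be passed in lowercase, and "crosses max_per_window" is read as "the count is greater than max_per_window".

```rust
/// Count whole-word, case-insensitive occurrences of a single-word
/// connector (given in lowercase) in one sentence.
fn connector_hits(sentence: &str, connector: &str) -> usize {
    sentence
        .split(|c: char| !c.is_alphabetic())
        .filter(|w| !w.is_empty() && w.to_lowercase() == connector)
        .count()
}

/// Highest number of occurrences of `connector` found in any window of
/// `window_size` consecutive sentences (or in the whole passage when it
/// is shorter than the window).
fn peak_connector_count(sentences: &[&str], connector: &str, window_size: usize) -> usize {
    let hits: Vec<usize> = sentences
        .iter()
        .map(|s| connector_hits(s, connector))
        .collect();
    if hits.is_empty() || window_size == 0 {
        return 0;
    }
    let width = window_size.min(hits.len());
    hits.windows(width)
        .map(|w| w.iter().sum::<usize>())
        .max()
        .unwrap_or(0)
}
```

A caller would compare the peak with max_per_window for the active profile: a peak of 4 is over the public ceiling of 3 but not over the dev-doc ceiling of 4.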
lucid-lint reports; the rewrite is always yours.
Five actions, colour-matched across the rewrite — only the connectors change.
Before (flagged):
We analysed the data. Then we built the model. Then we validated the results. Then we published the report. Then we archived the raw data.
Four then in five sentences — no progression felt.
What lucid-lint check --profile public reports:
warning input.md:1:1 Connector "then" appears 4 times within 5 consecutive sentences (max 3). Vary the connector or restructure the passage. [rhythm.repetitive-connectors]
After (your rewrite):
We analysed the data. From it we built the model. Validation followed, and once the results held up we published the report. The raw data was archived last.
Five actions, colour-matched across the rewrite — only the connectors change.
Before (flagged):
Nous avons analysé les données. Ensuite nous avons construit le modèle. Ensuite nous avons validé les résultats. Ensuite nous avons publié le rapport. Ensuite nous avons archivé les données brutes.
Four ensuite in five sentences: no progression felt.
What lucid-lint check --profile public reports:
warning input.md:1:1 Connector "ensuite" appears 4 times within 5 consecutive sentences (max 3). Vary the connector or restructure the passage. [rhythm.repetitive-connectors]
After (your rewrite):
Nous avons analysé les données. À partir de là nous avons construit le modèle. La validation a suivi, et dès que les résultats ont tenu nous avons publié le rapport. Les données brutes ont été archivées en dernier.
See Suppressing diagnostics for the inline and block forms.
- structure.sentence-too-long: long sentences and connector overuse often co-occur; flagging both surfaces a richer rhythm signal.
- rhythm.repetitive-connectors carries the default weight 1; the cost is local rather than compounding.
See References for the full bibliography.
lexicon.low-lexical-diversity
Passages with excessive repetition of content words. A monotonous text loses reader attention and often signals unstructured thinking. The rule is not an anti-jargon detector: technical terms (API, request, cache) are expected to repeat; the signal targets non-technical content words.
| Category | lexicon |
| Default severity | info |
| Default weight | 1 |
| Languages | EN · FR (separate stoplists) |
| Source | src/rules/low_lexical_diversity.rs |
Sliding window of window_size words. Within the window, compute unique_words / total_words over non-stopword, non-code-block tokens. Fire when the ratio falls below min_ratio.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| window_size | int | 100 | 100 | 80 |
| min_ratio | float | 0.40 | 0.50 | 0.55 |
| use_stoplist | bool | true | true | true |
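The ratio itself is a type-token ratio over the content words of a window. The sketch below shows one way to compute it; diversity_ratio and low_diversity_windows are hypothetical names, the stoplist is passed in by the caller, and code-block exclusion is assumed to have happened before the words reach these functions.

```rust
use std::collections::HashSet;

/// unique / total over the non-stopword tokens of one window.
/// Returns None when the window holds no content words.
fn diversity_ratio(window: &[&str], stoplist: &[&str]) -> Option<f64> {
    let content: Vec<String> = window
        .iter()
        .map(|w| w.to_lowercase())
        .filter(|w| !stoplist.iter().any(|s| *s == w.as_str()))
        .collect();
    if content.is_empty() {
        return None;
    }
    let unique: HashSet<&str> = content.iter().map(String::as_str).collect();
    Some(unique.len() as f64 / content.len() as f64)
}

/// Start offsets of every window whose ratio falls below `min_ratio`.
fn low_diversity_windows(
    words: &[&str],
    stoplist: &[&str],
    window_size: usize,
    min_ratio: f64,
) -> Vec<usize> {
    if window_size == 0 || words.len() < window_size {
        return Vec::new();
    }
    words
        .windows(window_size)
        .enumerate()
        .filter_map(|(start, w)| match diversity_ratio(w, stoplist) {
            Some(r) if r < min_ratio => Some(start),
            _ => None,
        })
        .collect()
}
```

Overlapping windows below the threshold would normally be merged into one diagnostic; that step is omitted here for brevity.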
See References for the full bibliography.
lexicon.excessive-nominalization
Sentences densely packed with nominalizations (verbs turned into abstract nouns). Two problems compound: nominalized text is more abstract (costlier to process) and hides the agent (“who does what” is obscured). FALC and the US Plain Writing Act both recommend strong verbs over nominalizations.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (overlapping suffix lists) |
| Source | src/rules/excessive_nominalization.rs |
Walk the sentence. Flag words whose suffix matches the language’s nominalization list. Fire when the count per sentence crosses max_per_sentence.
- FR: -tion, -sion, -ment, -ance, -ence, -age, -ité, -isme, -ure
- EN: -tion, -sion, -ment, -ance, -ence, -ity, -ism, -ness, -al
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_per_sentence | int | 4 | 3 | 2 |
| suffixes | list | language defaults | language defaults | language defaults |
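Suffix matching of this kind can be sketched briefly. The constants below copy the suffix lists given above; the function name nominalizations and the tokenisation (split on any non-alphabetic character, compare in lowercase) are assumptions made for the example, not the behaviour of src/rules/excessive_nominalization.rs.

```rust
const FR_SUFFIXES: [&str; 9] = [
    "tion", "sion", "ment", "ance", "ence", "age", "ité", "isme", "ure",
];
const EN_SUFFIXES: [&str; 9] = [
    "tion", "sion", "ment", "ance", "ence", "ity", "ism", "ness", "al",
];

/// Return the words of a sentence that end with one of the suffixes.
/// A word must be longer than the suffix itself to count.
fn nominalizations<'a>(sentence: &'a str, suffixes: &[&str]) -> Vec<&'a str> {
    sentence
        .split(|c: char| !c.is_alphabetic())
        .filter(|word| {
            let lower = word.to_lowercase();
            suffixes
                .iter()
                .any(|s| lower.len() > s.len() && lower.ends_with(*s))
        })
        .collect()
}
```

The rule would then compare the length of the returned list with max_per_sentence. Pure suffix matching over-reports, which is the limitation the -al note below describes.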
Technical vocabulary (function, implementation, configuration) contains many legitimate nominalizations, which is why dev-doc relaxes the threshold. The -al suffix in English is too broad (flags crucial, horizontal, positional despite these not being abstract nouns) and is tracked for review in F-excessive-nominalization-suffix-refine on the roadmap.
Nominalizations colour-matched to their active-verb counterparts in the rewrite.
Before (heavy):
La réalisation de l’analyse de la conformité permettra l’identification des axes d’amélioration.
After (lighter):
Nous analyserons la conformité. Cela permettra d’identifier les axes à améliorer.
See References for the full bibliography.
lexicon.unexplained-abbreviation
Acronyms used without a nearby definition. Each forced interruption to guess or look up an acronym breaks the flow and raises the risk of losing attention.
References. WCAG 2.1 SC 3.1.4 (Abbreviations); RGAA 9.4.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (whitelists differ) |
| Source | src/rules/unexplained_abbreviation.rs |
1. Pre-scan the whole document for acronyms defined in either canonical form:
   - Full Expansion (ACRONYM), for example World Wide Web (WWW)
   - ACRONYM (Full Expansion), for example WWW (World Wide Web)
   The “expansion” side must contain at least two alphabetic words, so short parenthetical notes like (TBD) or (check later) do not count as definitions.
2. Match sequences of 2+ consecutive uppercase letters (optionally with digits) in the main text.
3. Filter each candidate against three layers, in order:
   - … [rules.unexplained-abbreviation].whitelist.
4. Flag each remaining occurrence.
A single definition anywhere in the document silences every occurrence of the same acronym — matching how readers actually use documentation (scroll back once to find the expansion, remember it thereafter).
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_length | int | 3 | 2 | 2 |
| whitelist | list | extended | minimal | empty |
Default whitelist (v0.2, narrowed by F31): the infrastructure stack — URL, HTML, CSS, JSON, XML, HTTP, HTTPS, UTF, IO, API, CLI, GUI, OS, CPU, RAM, SSD, USB, IDE, SDK, CI, CD — plus common FR/EN acronyms and RFC 2119 emphasis keywords (PDF, SMS, GPS, ID, OK, FAQ, MUST, SHALL, SHOULD, …).
[rules.unexplained-abbreviation]
whitelist = ["WCAG", "ARIA", "ADHD", "LLM"]
User-whitelist entries are additive over the baseline — they extend it, never replace it.
- lexicon.jargon-undefined: the content-word equivalent.
See References for the full bibliography.
lexicon.weasel-words
Vague qualifiers that weaken a statement. A weasel word adds an invisible cognitive load: the reader has to decide whether the claim matters, is true, or measurable. References: Wikipedia style guide (Avoid weasel words), Strunk & White, FALC.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (separate lists) |
| Source | src/rules/weasel_words.rs |
Word-boundary match against a per-language list. Case-insensitive. One diagnostic per occurrence.
- Inline code (`…`) is skipped. Wrap a weasel term in backticks when you are discussing the word itself.
- rather than (EN) and plutôt que (FR) are conjunctions meaning “instead of”, not hedges, and are skipped.
| Key | Type | Default |
|---|---|---|
| custom_weasels_fr | list | [] |
| custom_weasels_en | list | [] |
| disable_weasels | list | [] |
Two patterns still fire in v0.2: straight-quoted terms ("many X" without backticks) and "many X" where X is a concrete noun. Both are queued under F23 on the roadmap. Wrap the quoted term in backticks, or use an inline disable comment, to opt out.
Use <!-- lucid-lint disable-next-line lexicon.weasel-words --> when the weasel is intentional (quotation, legitimate subset reference, meta-discussion). See Suppressing diagnostics.
See References for the full bibliography.
lexicon.jargon-undefined
Domain-specific terms used without definition. Jargon is contextual: acceptable among specialists, exclusionary otherwise. Like acronyms, jargon creates reading interruptions for the non-specialist; unlike acronyms, these are content words, not uppercase sequences.
References. US Plain Language, FALC, WCAG 2.1 SC 3.1.3 (Unusual Words).
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (separate lists per language and domain) |
| Source | src/rules/jargon_undefined.rs |
Match words against the per-language, per-domain jargon lists (tech, legal, medical, admin).
| Profile | Lists active |
|---|---|
| dev-doc | none (developers understand their own jargon) |
| public | tech, legal, medical, admin |
| falc | tech, legal, medical, admin, strict mode |
In v0.2, the active lists are set by the profile and are not yet user-overridable from lucid-lint.toml. Per-rule TOML overrides — adding custom domain terms, silencing specific entries, or activating a non-default list combination — are tracked as F126 on the roadmap.
See References for the full bibliography.
lexicon.all-caps-shouting
Runs of consecutive ALL-CAPS words.
ALL-CAPS prose strips the shape cues that dyslexic readers rely on to disambiguate words:
- Ascenders: b, d, h, k, l.
- Descenders: g, p, q, y.
- The height contrast between short letters like a, e, o and tall ones like h, l.
In all-caps, every letter sits on the same baseline at the same height. The reader loses the silhouette of the word and has to decode letter by letter. ALL-CAPS also triggers many screen readers to spell out the run letter by letter unless the surrounding markup says otherwise.
WCAG 3.1.5 and the BDA Dyslexia Style Guide both recommend lowercase or sentence case for emphasis.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | a11y-markup, dyslexia, general |
| Languages | EN · FR (script-only detection — language-agnostic) |
| Source | src/rules/all_caps_shouting.rs |
Per paragraph, scan for runs of consecutive ALL-CAPS words. Minor connectors (,, ;, :, -, whitespace) keep a run alive; a lowercase word, a period, or paragraph break ends it.
A word is ALL-CAPS when it is at least 2 letters long and contains no lowercase letter. Single ALL-CAPS tokens are treated as abbreviations and are the responsibility of lexicon.unexplained-abbreviation.
Code blocks are excluded by the Markdown parser before the rule runs.
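The run detection above can be sketched in a few lines of standalone Rust. This is a minimal illustration, not the code in src/rules/all_caps_shouting.rs: the names `is_all_caps`, `flush` and `all_caps_runs` are hypothetical, and the sketch assumes whitespace tokenisation with no Markdown handling.

```rust
/// A word counts as ALL-CAPS when it has at least two letters
/// and contains no lowercase letter.
fn is_all_caps(word: &str) -> bool {
    let letters = word.chars().filter(|c| c.is_alphabetic()).count();
    letters >= 2 && !word.chars().any(|c| c.is_lowercase())
}

/// Records the current run if it is long enough, then resets the counter.
fn flush(runs: &mut Vec<usize>, current: &mut usize, min_run_length: usize) {
    if *current >= min_run_length {
        runs.push(*current);
    }
    *current = 0;
}

/// Returns the word count of every ALL-CAPS run that reaches `min_run_length`.
/// (Hypothetical helper: whitespace tokens only, one paragraph at a time.)
fn all_caps_runs(paragraph: &str, min_run_length: usize) -> Vec<usize> {
    let mut runs = Vec::new();
    let mut current = 0;
    for token in paragraph.split_whitespace() {
        // Strip connectors and other punctuation from the edges of the token.
        let word = token.trim_matches(|c: char| !c.is_alphanumeric());
        if !word.is_empty() {
            if is_all_caps(word) {
                current += 1;
            } else {
                // Any other word ends the run.
                flush(&mut runs, &mut current, min_run_length);
            }
        }
        // A period ends the run; commas, semicolons, colons and hyphens do not.
        if token.ends_with('.') {
            flush(&mut runs, &mut current, min_run_length);
        }
    }
    // The paragraph break ends whatever run is still open.
    flush(&mut runs, &mut current, min_run_length);
    runs
}
```

With `min_run_length = 2` (the public value) the sentence "Please DO NOT touch this." yields one run of two words; with `min_run_length = 3` (dev-doc) it yields none.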
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_run_length | int | 3 | 2 | 2 |
dev-doc tolerates a 2-word emphasis run (DO NOT) common in technical docs.
lucid-lint reports; the rewrite is always yours.
One emphasis phrase, colour-matched across the rewrite — the shouting becomes typographic emphasis without losing the stress.
Before (flagged):
Please DO NOT touch this.
DO NOT reads as shouting.
What lucid-lint check --profile public reports:
warning input.md:1:8 2 consecutive ALL-CAPS words read as shouting and degrade legibility for dyslexic readers. Use sentence case and rely on emphasis (italics, bold) or a callout instead. [lexicon.all-caps-shouting]
After (your rewrite):
Please do not touch this.
A chain of three or more acronyms in prose (API HTTP TLS) is structurally indistinguishable from shouting and will fire. Suppress on the line if the chain is intentional, or restructure the prose.
See References for the full bibliography.
lexicon.redundant-intensifier
Intensifiers: adverbs that try to upgrade the confidence of a statement without adding information. very important reduces to important, or better, to a quantified claim. plainlanguage.gov (Chapter 4) and the CDC Clear Communication Index flag intensifiers as a plain-language anti-pattern.
The rule is a deliberate sibling of lexicon.weasel-words: weasel words downgrade confidence (hedges, qualifiers); redundant intensifiers upgrade it. The two lists are disjoint by construction.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | general |
| Languages | EN · FR |
| Source | src/rules/redundant_intensifier.rs |
Per paragraph, lowercase the text and look for each intensifier phrase in the per-language list (en::INTENSIFIERS, fr::INTENSIFIERS) using the shared word-bounded search. Hits inside fenced or inline code spans are ignored. Documents whose language is Unknown are skipped rather than guessed, matching lexicon.weasel-words.
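A word-bounded search of the kind described above might look like the sketch below. The function name `find_phrase_hits` is hypothetical and stands in for the shared helper, whose real signature is not shown here; the sketch also assumes the phrase is already lowercase and does not skip code spans.

```rust
/// Returns the byte offsets (in the lowercased text) of every word-bounded
/// occurrence of `phrase`. `phrase` is expected to be lowercase already.
/// (Hypothetical helper, not the crate's shared search.)
fn find_phrase_hits(paragraph: &str, phrase: &str) -> Vec<usize> {
    let mut hits = Vec::new();
    if phrase.is_empty() {
        return hits;
    }
    let text = paragraph.to_lowercase();
    let mut from = 0;
    while let Some(pos) = text[from..].find(phrase) {
        let start = from + pos;
        let end = start + phrase.len();
        // A hit only counts when it is not glued to a letter or digit on either side.
        let before_ok = text[..start]
            .chars()
            .next_back()
            .map_or(true, |c| !c.is_alphanumeric());
        let after_ok = text[end..]
            .chars()
            .next()
            .map_or(true, |c| !c.is_alphanumeric());
        if before_ok && after_ok {
            hits.push(start);
        }
        from = end;
    }
    hits
}
```

The boundary checks are what keep very from matching inside every or delivery.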
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| custom_intensifiers_en | list<string> | [] | [] | [] |
| custom_intensifiers_fr | list<string> | [] | [] | [] |
| disable | list<string> | [] | [] | [] |
custom_intensifiers_en / _fr add phrases to the defaults. disable removes phrases from them (exact lowercase match).
very in the fixed phrase very well (as acknowledgment) still triggers: plain-language guides flag it anyway, so the rule does not carve out an exception. Suppress via inline directive if the context genuinely calls for it.
See References for the full bibliography.
lexicon.consonant-cluster
Words whose longest run of consecutive consonants meets or exceeds a per-profile threshold. Dense consonant clusters are a known decoding barrier for dyslexic readers (BDA Dyslexia Style Guide): the reader must hold more phonemes in working memory before the next vowel “releases” the syllable.
Typical English offenders at the public threshold of 5 include strengths (n-g-t-h-s) and twelfths (l-f-t-h-s); sixths (x-t-h-s) has a run of only 4, so it fires under falc but not under public. Typical French offenders at the falc threshold of 4 include constructions (n-s-t-r).
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | dyslexia, general |
| Languages | EN · FR |
| Source | src/rules/consonant_cluster.rs |
Per source line, walk the grapheme stream once. A word is a maximal run of alphabetic characters; hyphens, apostrophes, and whitespace close the word (so dys-lexic is two words, not one ten-letter cluster). Within a word, track the longest run of consecutive consonants. Emit one diagnostic per word whose longest run meets min_run_length.
Vowels are language-aware — French accented forms (é, è, ê, à, â, î, ï, ô, ö, ù, û, ü, ÿ, œ, æ) count as vowels. The English fallback still accepts common latin-1 accented vowels so borrowed words (café, naïve) decode correctly. y is treated as a vowel in every language (lenient), which avoids awkward false positives on words like fly, rhythm.
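As a rough illustration of the walk described above, the sketch below tracks the longest consonant run per word. It is a simplification under stated assumptions: it iterates over `char`s rather than grapheme clusters, uses one merged vowel set for both languages, and the names `is_vowel`, `longest_consonant_run` and `flagged_words` are hypothetical.

```rust
/// Vowel test covering plain vowels, `y`, and the accented forms listed above.
fn is_vowel(c: char) -> bool {
    let lower = c.to_lowercase().next().unwrap_or(c);
    matches!(
        lower,
        'a' | 'e' | 'i' | 'o' | 'u' | 'y'
            | 'é' | 'è' | 'ê' | 'à' | 'â' | 'î' | 'ï'
            | 'ô' | 'ö' | 'ù' | 'û' | 'ü' | 'ÿ' | 'œ' | 'æ'
    )
}

/// Length of the longest run of consecutive consonants in one word.
fn longest_consonant_run(word: &str) -> usize {
    let mut longest = 0;
    let mut current = 0;
    for c in word.chars() {
        if c.is_alphabetic() && !is_vowel(c) {
            current += 1;
            longest = longest.max(current);
        } else {
            current = 0;
        }
    }
    longest
}

/// Words on a line whose longest run meets `min_run_length`.
/// Any non-alphabetic character (hyphen, apostrophe, space) closes a word.
fn flagged_words(line: &str, min_run_length: usize) -> Vec<&str> {
    line.split(|c: char| !c.is_alphabetic())
        .filter(|w| !w.is_empty() && longest_consonant_run(w) >= min_run_length)
        .collect()
}
```

Because `y` counts as a vowel, rhythm scores a run of 3 rather than 6, and dys-lexic is split at the hyphen before any run is measured.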
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_run_length | int | 6 | 5 | 4 |
dev-doc is tolerant — technical prose regularly names things like strengths and benchmarks. falc (plain-language audience) catches any 4-consonant run.
matchstick (t-c-h-s-t, a run of 5) reads fluently to most readers because tch is a single English trigraph. Suppress with an inline directive when a hit is unavoidable.
en or fr; in practice such content is out of scope for a bilingual EN/FR linter.
See References for the full bibliography.
lexicon.homophone-density
Experimental in v0.2.x. Off by default; opt in via --experimental lexicon.homophone-density or [experimental] enabled = ["lexicon.homophone-density"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the dyslexia and aphasia condition tags that gate this rule under user-active conditions.
Paragraphs whose share of homophones — words that sound alike but spell differently (their / there / they're, to / too / two, cours / court, amande / amende) — exceeds a configurable percentage. Homophones force a phonological-then-orthographic disambiguation pass: the ear resolves the word, the eye must then pick the right spelling from context. That extra hop is cheap on its own and expensive in a cluster. The British Dyslexia Association style guide flags homophones as a known friction point for dyslexic readers, and the FALC orthographic-clarity guidelines recommend rephrasing dense homophone runs for aphasic and plain-language audiences.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at v0.3 cut |
| Condition tags | dyslexia, aphasia (gated; runs only under matching --conditions) |
| Languages | EN · FR (curated per-language homophone lists) |
| Source | src/rules/lexicon/homophone_density.rs |
For each paragraph, walk the word stream once, count alphabetic words as the denominator, and count words that appear in the per-language homophone table as hits. If hits / total strictly exceeds the per-profile threshold, emit one diagnostic anchored at the paragraph’s start line. Paragraphs with fewer than 20 content words are skipped — below that floor, a single homophone produces a misleading double-digit percentage. The diagnostic message names up to two example homophones the rule actually saw, so the location is the paragraph but the fix candidates are concrete.
The homophone tables (HOMOPHONE_GROUPS_EN, HOMOPHONE_GROUPS_FR in src/language/) lean toward content-word pairs whose orthographic confusion genuinely distorts meaning. Ultra-frequent French function-word homophones (et / est, a / à, ou / où) are intentionally excluded: they appear in nearly every sentence and would push baseline density past every threshold, drowning out the signal the rule is meant to catch.
When the document’s detected language is Unknown the rule has no table to apply and skips silently rather than guessing.
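The density check can be sketched as follows. The word list in `sample_homophones_en` is a hypothetical stand-in chosen only for illustration; it is not the shipped HOMOPHONE_GROUPS_EN table. The sketch also treats every alphabetic token as a content word, which is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the per-language table; not the shipped list.
fn sample_homophones_en() -> HashSet<&'static str> {
    ["their", "there", "to", "too", "two", "affect", "lose"]
        .iter()
        .copied()
        .collect()
}

/// Returns the homophone share in percent, or `None` when the paragraph
/// has fewer than 20 words and is skipped.
fn homophone_density(paragraph: &str, table: &HashSet<&str>) -> Option<f64> {
    let words: Vec<String> = paragraph
        .split_whitespace()
        .map(|t| t.trim_matches(|c: char| !c.is_alphabetic()).to_lowercase())
        .filter(|w| !w.is_empty())
        .collect();
    if words.len() < 20 {
        return None;
    }
    let hits = words.iter().filter(|w| table.contains(w.as_str())).count();
    Some(100.0 * hits as f64 / words.len() as f64)
}

/// The rule fires only when the density strictly exceeds the threshold.
fn exceeds_threshold(paragraph: &str, table: &HashSet<&str>, max_density_percent: f64) -> bool {
    matches!(homophone_density(paragraph, table), Some(d) if d > max_density_percent)
}
```

The 20-word floor is what keeps a short sentence with a single their from registering as a double-digit density.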
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_density_percent | float | 8.0 | 5.0 | 3.0 |
Tune via lucid-lint.toml:
[rules."lexicon.homophone-density"]
max_density_percent = 4.0
Before (flagged):
Their report shows there were too many decisions to make and two teams could not affect the launch nor lose the schedule despite careful planning across each region and product line every quarter.
What lucid-lint check --profile public --experimental lexicon.homophone-density --conditions dyslexia reports:
warning input.md:1:1 Paragraph density of homophones is 21.2% (7 of 33 content words (e.g. their, there)); maximum 5.0%. Dense homophone runs raise the phonological-decoding load for dyslexic and aphasic readers; rephrase to disambiguate. [lexicon.homophone-density]
After (your rewrite):
The report shows that the team made many decisions and that the two squads kept the launch on schedule despite careful planning across each region and product line every quarter.
The rephrase swaps their / there / to / too / two for context-anchored alternatives (the report, that, the team, kept, the two squads), bringing density well below the threshold.
Before (flagged):
Pendant le cours du matin la cuisinière prépare le foie de veau avant la pause de midi puis revient à sa tâche après avoir rangé les ustensiles sur la grande table en bois clair.
What lucid-lint check --profile public --experimental lexicon.homophone-density --conditions dyslexia reports:
warning input.md:1:1 Paragraph density of homophones is 11.8% (4 of 34 content words (e.g. cours, foie)); maximum 5.0%. Dense homophone runs raise the phonological-decoding load for dyslexic and aphasic readers; rephrase to disambiguate. [lexicon.homophone-density]
After (your rewrite):
Pendant la séance du matin la cuisinière prépare le foie de veau avant la coupure de midi puis reprend son travail après avoir rangé les ustensiles sur la grande table en bois clair.
cours becomes séance, pause becomes coupure, tâche becomes travail — three of the four homophone hits disappear without losing meaning.
See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:
<!-- lucid-lint disable-next-line lexicon.homophone-density -->
Their report shows there were too many decisions to make and two teams could not lose the launch.
dyslexia and aphasia tags that gate this rule.
See References for the full bibliography.
syntax.passive-voice
Passive-voice constructions. Passive hides the agent and lengthens the sentence without adding information. Legitimate exceptions exist (unknown agent, scientific style, intentional focus on the action): the rule flags, the author decides.
References. US Plain Language; Strunk & White; FALC.
| Category | syntax |
| Default severity | warning |
| Default weight | 2 |
| Languages | EN · FR (separate heuristics) |
| Source | src/rules/passive_voice.rs |
- EN: be (conjugated) + past participle [+ by …]. Handles regular -ed and the irregular-participle table.
- FR: être (conjugated) + past participle [+ par …], plus se faire + infinitive. Harder than EN because of participle agreement (gender/number) and confusion with (a) subject attribute (il est content vs il est vu) and (b) compound-tense être auxiliary (elle est partie: passé composé, active).
Expect ~70–80% precision. A POS-parser-based replacement is planned for a future lucid-lint-nlp plugin.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_per_paragraph | int | 3 | 1 | 0 |
| ignore_scientific_style | bool | false | false | false |
Use inline disables on intentional passives. See Suppressing diagnostics.
See References for the full bibliography.
syntax.unclear-antecedent
Pronouns whose antecedent is not obvious in the immediate context. Ambiguous pronominal reference is one of the costliest comprehension breaks for readers with attentional difficulties: each ambiguity forces a conscious return-and-search.
References. Strunk & White; FALC (“prefer name repetition over pronouns”); Graesser et al. Coh-Metrix (referential cohesion).
| Category | syntax |
| Default severity | info |
| Default weight | 2 |
| Languages | EN · FR (separate pronoun lists) |
| Source | src/rules/unclear_antecedent.rs |
Exact detection requires anaphora resolution (advanced NLP). v0.1 catches the two most frequent patterns:
- Demonstratives (This/That/These/Those, Ceci/Cela/Ce) not followed by a noun.
Severity is info because the heuristic is approximate; the noise level warrants a soft signal.
| Key | Type | Default |
|---|---|---|
| check_demonstratives | bool | true |
| check_paragraph_start_pronouns | bool | true |
Les performances étaient médiocres avec le cache LRU. Cela a motivé le changement.
Does Cela refer to the performance or to the cache? The reference is ambiguous.
See References for the full bibliography.
syntax.nested-negation
Sentences that stack multiple negations. Two or more negations in the same sentence force the reader to mentally toggle truth values, a known burden for readers with aphasia and attention-fragile readers (ADHD), and a load multiplier for everyone reading under cognitive pressure. Plain-language guidelines (FALC, CDC Clear Communication Index, plainlanguage.gov) recommend rewriting double negatives as positives.
| Category | syntax |
| Default severity | warning |
| Default weight | 2 |
| Condition tags | aphasia, adhd, general |
| Languages | EN · FR (language-specific counting) |
| Source | src/rules/nested_negation.rs |
Count the negations per sentence; report sentences whose count exceeds max_negations.
- EN: a fixed word list (not, no, never, none, nothing, nobody, nowhere, neither, nor, cannot, without) plus occurrences of the contracted n't suffix (don't, won't, isn't, doesn't, …).
- FR: each ne / n' clitic contributes one negation and pairs with its nearest second-position particle (pas, rien, jamais, plus, personne, aucun, aucune, guère, nulle part) within a short window; the pairing just consumes the particle to avoid double-counting. Unpaired particles in a ne-sentence contribute one more; this catches forms like rien used as a nominal negative subject. Guards: pas / plus never count when unpaired (too ambiguous outside ne …); rien preceded by de is treated as the idiom de rien and skipped; particles in a sentence with no ne clitic are skipped too (plus de courage, personne d'autre). Standalones sans / non always count.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_negations | int | 3 | 2 | 1 |
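The English side of the count is simple enough to sketch. The version below is a minimal illustration that assumes whitespace tokenisation; `NEGATIONS_EN` and `count_negations_en` are hypothetical names, and the French clitic pairing is deliberately left out.

```rust
/// Standalone negation words, as listed for English above.
const NEGATIONS_EN: [&str; 11] = [
    "not", "no", "never", "none", "nothing", "nobody",
    "nowhere", "neither", "nor", "cannot", "without",
];

/// Counts negations in one sentence: listed words plus any n't contraction.
/// (Hypothetical helper; assumes whitespace-separated tokens.)
fn count_negations_en(sentence: &str) -> usize {
    sentence
        .split_whitespace()
        .map(|t| t.trim_matches(|c: char| !c.is_alphabetic()).to_lowercase())
        .filter(|w| {
            NEGATIONS_EN.contains(&w.as_str())
                || w.ends_with("n't")
                || w.ends_with("n’t") // typographic apostrophe
        })
        .count()
}

/// A sentence is reported when its count exceeds `max_negations`.
fn stacks_negations(sentence: &str, max_negations: usize) -> bool {
    count_negations_en(sentence) > max_negations
}
```

Under the public value of `max_negations = 2`, a sentence with three negations is reported and one with two is not.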
lucid-lint reports; the rewrite is always yours.
Three negations → three affirmatives, colour-matched across the rewrite. The not simply drops — the simplification shows.
Before (flagged):
We do not say nothing is never possible.
Three negations (not, nothing, never).
What lucid-lint check --profile public reports:
warning input.md:1:1 Sentence stacks 3 negations (maximum 2). Rewrite as a positive statement or split the negations across separate sentences. [syntax.nested-negation]
After (your rewrite):
We say something is possible.
Passes under public:
Nous ne sommes pas prêts.
Bipartite ne ... pas counts as one negation.
Before (flagged):
Nous ne disons pas que rien n’est jamais possible.
Three negations: ne…pas (one bipartite), rien (unpaired), n'…jamais (one bipartite).
What lucid-lint check --profile public reports:
warning input.md:1:1 Sentence stacks 3 negations (maximum 2). Rewrite as a positive statement or split the negations across separate sentences. [syntax.nested-negation]
After (your rewrite):
Nous disons que quelque chose est possible.
See References for the full bibliography.
syntax.conditional-stacking
Sentences that chain multiple conditional clauses. Each if / when / unless / quand / si opens a branch the reader must keep on a mental stack until the outer clause resolves; two or three of them stacked in one sentence is a known load multiplier for readers with aphasia, ADHD, and anyone reading under cognitive pressure. Plain-language guidelines (FALC, plainlanguage.gov) recommend splitting conditional chains into separate sentences or a bullet list.
| Category | syntax |
| Default severity | warning |
| Default weight | 2 |
| Condition tags | aphasia, adhd, general |
| Languages | EN · FR (language-specific lists) |
| Source | src/rules/conditional_stacking.rs |
Per sentence, count the conditional connectors and report counts above max_conditionals.
- EN: a fixed connector list (if, unless, when, whenever, while, until, provided, assuming, in case, as long as, as soon as, even if, only if).
- FR: a fixed connector list (si, sauf si, à moins que, à moins de, quand, lorsque, lorsqu', dès que, tant que, pourvu que, à condition que, à condition de, au cas où, même si, en cas de) plus the elliptic clitics s'il / s'ils.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_conditionals | int | 3 | 2 | 1 |
Three conditionals, colour-matched across the rewrite — position already pairs them, the tint just confirms each branch carries through. lucid-lint reports; the rewrite is always yours.
Before (flagged):
If we ship, when the build passes, unless the gate fails, we deploy.
What lucid-lint check --profile public reports:
warning input.md:1:1 Sentence stacks 3 conditional clauses (maximum 2). Split the conditions across separate sentences or convert them to a bullet list. [syntax.conditional-stacking]
After (your rewrite):
We deploy when all three checks hold:
- the ship command ran,
- the build passes,
- the gate does not fail.
Before (flagged):
Si nous expédions, quand le test passe, à moins que la barrière échoue, nous déployons.
Three conditional connectors (si, quand, à moins que). French rewrite to come with the FR translation pass.
The English list mixes pure conditionals with temporal conjunctions (when, while) that often introduce conditional-like sub-clauses. Pure-temporal usages may produce a false positive on long sentences. Use disable-next-line when the temporal reading is unambiguous.
See References for the full bibliography.
syntax.dense-punctuation-burst
Local bursts of punctuation: a sliding window of grapheme clusters that contains too many qualifying marks (,, ;, :, —, –). Tight clusters of marks signal layered subordination, parenthetical interjections, or list-within-list constructions that are hard to parse for readers with cognitive or attentional difficulties (IFLA easy-to-read guidelines).
Distinct from structure.excessive-commas, which counts commas across an entire sentence. A sentence with 8 commas spread evenly across 200 characters does not trigger here, while a sentence with 3 commas inside a 30-character span does.
| Category | syntax |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | general |
| Languages | EN · FR (script-agnostic) |
| Source | src/rules/dense_punctuation_burst.rs |
Per source line, walk the grapheme stream once and collect the column of every qualifying mark. When a window of window_graphemes graphemes holds min_marks or more marks, emit a burst spanning the first to the last mark in the window, then advance past that last mark so overlapping windows do not double-fire on the same cluster.
Code blocks (fenced and indented) are excluded upstream by the Markdown parser. Sentence terminators (., !, ?) and brackets do not count toward the burst.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_marks | int | 4 | 3 | 3 |
| window_graphemes | int | 30 | 30 | 40 |
dev-doc tolerates a 3-mark cluster (often unavoidable in technical lists adjacent to prose). FALC keeps the same density floor as public but widens the window to catch slightly looser bursts.
Em dash (—, U+2014) and en dash (–, U+2013) qualify; the ASCII double-hyphen surrogate (--) does not, on the assumption that authors who care about readability use the proper Unicode forms.
See References for the full bibliography.
syntax.parenthetical-depth
Experimental in v0.2.x. Off by default; opt in via --experimental syntax.parenthetical-depth or [experimental] enabled = ["syntax.parenthetical-depth"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the adhd and general condition tags.
A sentence whose maximum balanced-bracket nesting depth across () and [] reaches the profile threshold. Stacked parentheticals force the reader to track multiple suspended frames at once — a recognised “hard sentence” signal in the plainlanguage.gov and Hemingway editing traditions, and a particular cost for ADHD readers, who carry the working-memory load first.
The rule complements structure.excessive-commas, which already discounts flat (A, B, C) enumerations at depth 1. This rule fires only at depth 2 or more, so the two rules are mechanically orthogonal: one flat parenthesised list never trips this rule.
| Category | syntax |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at v0.3 cut |
| Condition tags | adhd, general (gated; runs only under matching --conditions) |
| Languages | EN · FR (language-agnostic — bracket families are the same in both) |
| Source | src/rules/syntax/parenthetical_depth.rs |
For each sentence, the rule walks the post-flattening paragraph text (so fenced code blocks are already excluded by the parser) and tracks a single running depth counter.
- Increment on ( or [; decrement on ) or ].
- parenthesised_list_comma_count helper used by structure.excessive-commas.
- max_depth ≥ the profile threshold, anchored at the deepest opener.
Em-dash pairs (— … —), curly braces ({}), and comma-flanked appositives are intentionally out of scope at v0.2.x. Em-dash pair detection is fragile (en/em-dash confusion, hyphen ambiguity) and would smuggle scope back in.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_depth | int | 4 | 3 | 2 |
max_depth is the inclusive nesting depth at which the rule fires. A sentence whose deepest bracket frame is one level shallower stays silent.
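The depth counter can be sketched in a few lines. The function names `max_bracket_depth` and `fires` are hypothetical, and the sketch ignores unbalanced closers by clamping at zero, which is an assumption about how stray brackets are handled.

```rust
/// Deepest nesting level reached across ( ) and [ ] in one sentence.
fn max_bracket_depth(sentence: &str) -> usize {
    let mut depth: usize = 0;
    let mut deepest = 0;
    for c in sentence.chars() {
        match c {
            '(' | '[' => {
                depth += 1;
                deepest = deepest.max(depth);
            }
            // Assumption: a stray closer never drives the counter below zero.
            ')' | ']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    deepest
}

/// The threshold is inclusive: the rule fires when depth reaches `max_depth`.
fn fires(sentence: &str, max_depth: usize) -> bool {
    max_bracket_depth(sentence) >= max_depth
}
```

A flat (A, B, C) list stays at depth 1, so it never reaches any of the profile thresholds.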
Tune via lucid-lint.toml:
[rules."syntax.parenthetical-depth"]
max_depth = 3
Before (flagged):
The migration tool (which now supports rollbacks (see `--reverse`, added in 0.4.2 [tracked in #312])) is opt-in.
What lucid-lint check --profile public --experimental syntax.parenthetical-depth --conditions adhd reports:
warning input.md:1:21 Nested parentheticals reach depth 3; readers must hold 3 suspended thoughts to reach the close. Split the sentence or unnest the inner bracket (plainlanguage.gov, Hemingway). [syntax.parenthetical-depth]
After (your rewrite):
The migration tool is opt-in. It now supports rollbacks via `--reverse`, added in 0.4.2 (tracked in #312).
The two outer parentheticals are gone; the remaining one sits flat at depth 1. A reader no longer has to push three suspended thoughts on the stack to reach the close.
Before (flagged):
Le module (qui dépend du noyau (chargé au démarrage [voir le manuel])) est facultatif.
What lucid-lint check --profile public --experimental syntax.parenthetical-depth --conditions adhd reports:
warning input.md:1:23 Nested parentheticals reach depth 3; readers must hold 3 suspended thoughts to reach the close. Split the sentence or unnest the inner bracket (plainlanguage.gov, Hemingway). [syntax.parenthetical-depth]
After (your rewrite):
Le module est facultatif. Il dépend du noyau, chargé au démarrage. Voir le manuel pour les détails.
Three sentences, no nested brackets. The dependency chain is now linear and the reader recovers each fact in the order it appears.
See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:
<!-- lucid-lint disable-next-line syntax.parenthetical-depth -->
The migration tool (which now supports rollbacks (see `--reverse`, added in 0.4.2 [tracked in #312])) is opt-in.
adhd and general tags that gate this rule.
- structure.excessive-commas: sibling rule on flat parenthesised enumerations. Atomic split: excessive-commas discounts depth-1 (A, B, C) lists; this rule fires only at depth ≥ 2.
- syntax.dense-punctuation-burst: sibling rule on local punctuation density. Both signal hard-to-parse sentences from different angles.
See References for the full bibliography.
readability.score
A document-level readability index. Readability formulas are the historical synthetic signal for text complexity: simple, reproducible, recognized by US/UK government guidelines and WCAG. Treat it like cyclomatic complexity: a metric first, a warning second.
| Category | readability |
| Default severity | info (always reported) · warning when above max_grade_level |
| Default weight | 5 |
| Languages | EN — Flesch-Kincaid · FR — Kandel-Moles (auto-selected per detected language; v0.2+) |
| Source | src/rules/readability_score.rs |
The formula is selected by the document’s detected language:
English — Flesch-Kincaid Grade Level:
0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59
The result is a US-school grade. Compared directly to max_grade_level.
French — Kandel & Moles (1958):
207 − 1.015 × (words / sentences) − 73.6 × (syllables / words)
The result is an ease score on roughly 0..100 (higher = easier), Flesch-style. To stay comparable across languages, the rule converts it to a grade-equivalent with the standard linear approximation (100 − score) / 10, and compares that against max_grade_level. The diagnostic message surfaces both the native ease score and the grade-equivalent.
Unknown language falls back to Flesch-Kincaid.
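The two formulas and the grade conversion translate directly into code. The sketch below takes word, sentence and syllable counts as given; how the crate counts syllables is not covered here, and the function names are hypothetical.

```rust
/// Flesch-Kincaid Grade Level (English): the result is a US-school grade.
fn flesch_kincaid_grade(words: f64, sentences: f64, syllables: f64) -> f64 {
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
}

/// Kandel & Moles ease score (French): higher means easier.
fn kandel_moles_ease(words: f64, sentences: f64, syllables: f64) -> f64 {
    207.0 - 1.015 * (words / sentences) - 73.6 * (syllables / words)
}

/// Linear approximation that maps an ease score to a grade equivalent.
fn ease_to_grade(score: f64) -> f64 {
    (100.0 - score) / 10.0
}

/// The warning applies when the grade is above `max_grade_level`.
fn above_threshold(grade: f64, max_grade_level: f64) -> bool {
    grade > max_grade_level
}
```

For a text of 100 words, 5 sentences and 150 syllables, Flesch-Kincaid gives a grade of about 9.91, which is above the public limit of 9. The same counts give a Kandel-Moles ease score of about 76.3, or a grade equivalent of about 2.37.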
| Grade | US school equivalent |
|---|---|
| < 6 | Elementary |
| 6–9 | Middle school |
| 9–12 | High school |
| 12–16 | College |
| > 16 | Expert |
Additional formulas (Gunning Fog, SMOG, Dale-Chall, Scolarius) and multi-formula --readability-verbose reports remain on the roadmap.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_grade_level | float | 14 | 9 | 6 |
| always_report | bool | true | true | true |
| formula | auto / flesch-kincaid / kandel-moles | auto | auto | auto |
Override formula via --readability-formula on the CLI; auto uses the detected language, other values pin the formula.
- info (for observability, even when under the threshold).
- warning when the grade level exceeds max_grade_level.
Suppressing a document-level metric is rarely the right answer; adjust max_grade_level in lucid-lint.toml instead. See Configuration.
See References for the full bibliography.
readability.large-number-unanchored
Experimental in v0.2.x. Off by default; opt in via --experimental readability.large-number-unanchored or [experimental] enabled = ["readability.large-number-unanchored"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the dyscalculia and general condition tags.
A large numeral or magnitude word that appears in a sentence with no nearby anchor — no unit, no percentage, no currency symbol, no ratio, no comparator phrase. The CDC Clear Communication Index asks whether numbers are clear and meaningful for the primary audience; plainlanguage.gov is more direct on the mechanism — “Use Numbers Effectively” recommends giving every large figure a comparison or denominator the reader can ground. Readers with dyscalculia carry the cost first: a context-free “4.8 milliards” forces an unaided magnitude estimate that ordinary prose context does not provide.
The rule complements structure.number-run, which fires on numeric clusters (≥ N tokens in one sentence). This rule fires on a single large or magnitude-word numeral that lacks anchoring context.
| Category | readability |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at v0.3 cut |
| Condition tags | dyscalculia, general (gated; runs only under matching --conditions) |
| Languages | EN · FR (per-language comparator and figure-ref lexicons) |
| Source | src/rules/readability/large_number_unanchored.rs |
For each sentence, the rule walks the post-flattening paragraph text (so fenced code blocks are already excluded by the parser) and searches for unanchored candidates.
A sentence-level candidate is either:
,, ., ASCII space, NBSP, thin space, narrow NBSP) between digit groups so 1 000 (FR) and 1,000 (EN) both count as one 4-digit token with value 1000.
million(s), billion(s), trillion(s) in EN; million(s), milliard(s), billion(s), trillion(s) in FR. Whole-word, case-insensitive.
1000..=2999. 2024 and 1789 are years, not magnitudes.
1st, 12th).
Figure, Fig., Page, Section, §, p., pp., #, or the FR equivalents (figure, page, section, tableau, chapitre, annexe).
Any of the following anywhere in the sentence anchors all candidates in that sentence:
%).
$, €, £, ¥).
km, kg, m², °C, L, Hz, MB, Mo, …).
X out of Y, X sur Y, or X / Y between digits.
roughly, approximately, more than, the size of, …; FR: soit environ, équivalent à, environ, plus de, par rapport à, …).
The diagnostic location points at the first surviving candidate in the offending sentence, so the squiggle in your editor lands on the visible numeral rather than the start of the sentence.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_value | int | 100000 | 10000 | 1000 |
min_value is the inclusive lower bound on the integer value of a numeric candidate. Tokens that meet the digit-count gate but fall below min_value are skipped — page-number-like quantities already get the figure-ref skip; this is a second safety net.
Tune via lucid-lint.toml:
[rules."readability.large-number-unanchored"]
min_value = 50000
Before (flagged):
The proposal mentions several billion in vague spending across regions.
What lucid-lint check --profile public --experimental readability.large-number-unanchored --conditions dyscalculia reports:
warning input.md:1:32 Magnitude word appears with no anchor in this sentence (no unit, percentage, ratio, or comparison phrase). plain-language guidance recommends pairing magnitude words with a unit or a comparison the reader can ground. [readability.large-number-unanchored]
After (your rewrite):
The proposal mentions several billion dollars in vague spending across regions, roughly the annual budget of a mid-sized state agency.
The figure now sits next to a unit (dollars) and a comparator phrase (roughly the annual budget); both anchor the magnitude for a reader who cannot ground it from raw scale.
Before (flagged):
Le budget atteint 4 800 000 000 selon le rapport final.
What lucid-lint check --profile public --experimental readability.large-number-unanchored --conditions dyscalculia reports:
warning input.md:1:19 Large numeral (10-digit, value ≈ 4800000000) appears with no anchor in this sentence (no unit, percentage, ratio, or comparison phrase). plain-language guidance recommends giving large numbers a comparison or denominator the reader can ground. [readability.large-number-unanchored]
After (your rewrite):
Le budget atteint 4,8 milliards d’euros, soit environ 6 % du PIB selon le rapport final.
The figure is now accompanied by a unit (euros), a percentage (6 %), and a comparator phrase (soit environ). A reader who cannot estimate “4,8 milliards” raw now has three independent anchors.
See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:
<!-- lucid-lint disable-next-line readability.large-number-unanchored -->
The proposal mentions several billion in vague spending across regions.
dyscalculia and general tags that gate this rule.
- structure.number-run: sibling rule on numeric clustering. Atomic split: number-run fires on clusters of numeric tokens; this rule fires on a single unanchored large numeral.
- structure.mixed-numeric-format: another sibling on numeric form consistency (digits vs spelled-out).
See References for the full bibliography.
lucid-lint is a small Rust crate with a deliberately simple pipeline.
input text
│
▼
┌──────────────────────────┐
│ Language detection │ stop-word ratio heuristic
└─────────────┬────────────┘
│
▼
┌──────────────────────────┐
│ Parser │ pulldown-cmark or plain text
│ (Markdown | plain) │
└─────────────┬────────────┘
│
▼
┌──────────────────────────┐
│ Document model │ Section > Paragraph > Sentence
└─────────────┬────────────┘
│
▼
┌──────────────────────────┐
│ Rules │ Each rule gets the document + language
│ (sentence-too-long, ...) │
└─────────────┬────────────┘
│
▼
┌──────────────────────────┐
│ Diagnostics │ rule_id, severity, location, section,
│ │ message, weight
└─────────────┬────────────┘
│
▼
┌──────────────────────────┐ v0.2+
│ Scoring │ density-normalized, category-capped
│ (Scorecard) │ 5 fixed categories
└─────────────┬────────────┘
│
▼
┌──────────────────────────┐
│ Output formatter │ TTY (default) or JSON
│ │ — carries diagnostics + scorecard
└──────────────────────────┘
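
A minimal sketch of how the Rules and Diagnostics stages above could fit together. Only the `check(document, language) -> Vec<Diagnostic>` shape and the diagnostic field names come from this page. The field types, the `SentenceTooLong` struct, its word-count logic, and the `run_rules` helper are hypothetical stand-ins, not the crate's actual API.

```rust
// Simplified stand-ins for the real domain types (assumptions for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Language { En, Fr, Unknown }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Severity { Info, Warning }

#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub rule_id: &'static str,
    pub severity: Severity,
    pub line: u32,               // hypothetical: the real Location type is richer
    pub section: Option<String>,
    pub message: String,
    pub weight: u32,
}

// Document model: Section > Paragraph > Sentence.
pub struct Sentence { pub text: String, pub line: u32 }
pub struct Paragraph { pub sentences: Vec<Sentence> }
pub struct Section { pub title: Option<String>, pub paragraphs: Vec<Paragraph> }
pub struct Document { pub sections: Vec<Section> }

// Each rule gets the document and the detected language.
pub trait Rule {
    fn check(&self, document: &Document, language: Language) -> Vec<Diagnostic>;
}

// Hypothetical rule: flags sentences with more than `max_words` words.
pub struct SentenceTooLong { pub max_words: usize }

impl Rule for SentenceTooLong {
    fn check(&self, document: &Document, _language: Language) -> Vec<Diagnostic> {
        let mut out = Vec::new();
        for section in &document.sections {
            for paragraph in &section.paragraphs {
                for sentence in &paragraph.sentences {
                    let words = sentence.text.split_whitespace().count();
                    if words > self.max_words {
                        out.push(Diagnostic {
                            rule_id: "sentence-too-long",
                            severity: Severity::Warning,
                            line: sentence.line,
                            section: section.title.clone(),
                            message: format!("sentence has {} words (max {})", words, self.max_words),
                            weight: 1,
                        });
                    }
                }
            }
        }
        out
    }
}

// Hypothetical engine step: run every rule and collect the diagnostics in rule order.
pub fn run_rules(rules: &[Box<dyn Rule>], document: &Document, language: Language) -> Vec<Diagnostic> {
    rules.iter().flat_map(|rule| rule.check(document, language)).collect()
}
```

In this sketch the section title is copied into each diagnostic at emission time, so a formatter never has to walk the document again to find it.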
- Diagnostic — the output unit. Carries weight (seeded from scoring::default_weight_for) as of v0.2.
- Rule (trait) — fn check(document, language) -> Vec<Diagnostic>.
- Document — the parser’s output. Section-aware.
- Scorecard — global: Score plus [CategoryScore; 5] in fixed Structure · Rhythm · Lexicon · Syntax · Readability order.
- Report — diagnostics + scorecard + word_count, returned by Engine::lint_* since v0.2.
- Engine — bundles a profile, rule set, and optional ScoringConfig; exposes lint_str, lint_file, lint_stdin.

These principles are enforced in code review. See Design decisions for background.

NonZeroU32.

src/
├── lib.rs — library root
├── main.rs — binary entry point
├── cli.rs — clap CLI
├── config.rs — profile presets, config file parsing
├── engine.rs — orchestration
├── language/ — detection + per-language data
├── parser/ — Markdown + plain + tokenizer + document model
├── rules/ — one file per rule
├── scoring.rs — hybrid scoring model (v0.2+)
├── output/ — TTY + JSON formatters
└── types.rs — domain types (Diagnostic, Severity, Location, ...)
This page records design decisions made during v0.1. Revisit the rationale for each one before changing it.
Decision: v0.1 shipped as a classic linter with info / warning
severities. v0.2 added a hybrid scoring model (global score +
per-category sub-scores + diagnostics) on top, without removing the
linter form.
Rationale: shipping the linter form first let us validate detection quality on real corpora before adding the aggregation layer. The scoring layer is additive — consumers that only care about diagnostics ignore the scorecard.
Decision: global + 5 per-category sub-scores, all in X / max form.
Composition stacks a weighted sum, density normalization (per 1 000
words, floored at 200), and a per-category cap. 5 fixed categories:
Structure · Rhythm · Lexicon · Syntax · Readability.
New Diagnostic.weight field, new --min-score=N CLI flag.
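
The composition described above can be sketched as follows. The three ingredients (weighted sum, density per 1 000 words with a 200-word floor, per-category cap) are from this page. How they combine into the final number, the function names, and the `cap` and `max` parameters are assumptions made for illustration, not the shipped formula.

```rust
/// The five fixed categories, in the order stated on this page.
pub const CATEGORIES: [&str; 5] = ["Structure", "Rhythm", "Lexicon", "Syntax", "Readability"];

/// Penalty for one category (hypothetical helper).
/// `weights` holds the weight of each diagnostic emitted in that category.
pub fn category_penalty(weights: &[u32], word_count: usize, cap: f64) -> f64 {
    // 1. Weighted sum of the diagnostics.
    let weighted_sum: f64 = weights.iter().map(|&w| f64::from(w)).sum();
    // 2. Density normalization per 1 000 words; the word count is floored at 200
    //    so that very short texts are not over-penalised.
    let effective_words = word_count.max(200) as f64;
    let density = weighted_sum * 1000.0 / effective_words;
    // 3. Per-category cap: one noisy category cannot consume the whole score.
    density.min(cap)
}

/// Global score in X / max form (assumed: max minus the summed category penalties).
pub fn global_score(per_category_weights: &[Vec<u32>; 5], word_count: usize, cap: f64, max: f64) -> f64 {
    let total: f64 = per_category_weights
        .iter()
        .map(|weights| category_penalty(weights, word_count, cap))
        .sum();
    (max - total).max(0.0)
}
```

Under these assumptions, a file with no diagnostics keeps the full `max`, and a 50-word file is scored as if it had 200 words.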
Rationale (full brainstorm at brainstorm/20260420-score-semantics.md):
- X / max over 0–100: arbitrary max lets us re-tune without claiming the 80 we ship today is the same 80 next release. The /impeccable skill already uses this convention.
- category_of(rule_id) helper already decided in v0.1. Derive-from-prefix (plan B) was rejected because it would require renaming 17 rules for F14 alone.

Decision: a Diagnostic carries rule_id, severity, location, section, message, and (as of v0.2) weight.
What’s NOT stored and why:
- category — derivable from rule_id via Category::for_rule. Storing it would duplicate information and risk drift.
- suggestion — still deferred; current messages are actionable on their own.

What IS stored and why:
- section — recomputing it after the fact would require re-parsing the document to walk headings and match locations. The storage cost is an Option<String> per diagnostic; the recompute cost is a second full parse.
- weight (v0.2) — seeded at emission from scoring::default_weight_for so that user overrides (via config) and rule-level overrides (via with_weight) both flow through aggregation without a second lookup.

Decision: the core ships only deterministic rules. LLM-based rules, network-backed rules, or ML-model-backed rules live in optional plugin crates (planned v0.3).
Rationale: a pre-commit hook that takes 5 seconds and varies between runs is worse than no hook. Determinism is non-negotiable in the happy path.
Decision: every language-dependent rule supports English and French from v0.1.
Rationale: most French-speaking OSS developers write docs in English. Targeting French only would miss the majority. Supporting both from day one is cheap and signals the ambition.
Decision: v0.1 uses Flesch-Kincaid Grade Level for all languages. Language-specific formulas (Kandel-Moles for French, SMOG, Coleman-Liau) are deferred to v0.2.
Rationale: Flesch-Kincaid is understood, reproducible, and well-behaved. Adding three more formulas before validating the basics would be premature optimization.
Decision: native support for .md, .markdown, .txt, and stdin in v0.1. Other formats (AsciiDoc, HTML, docx, PDF) use Pandoc as a pre-processor.
Rationale: Markdown covers the overwhelming majority of open-source and technical writing. Pandoc is free, ubiquitous, and removes the burden of maintaining multiple parsers.
Decision: each rule lives in its own file under src/rules/ with a consistent structure (struct, config, Rule impl, tests).
Rationale: makes adding a rule a well-defined operation (new file from template), and makes reviewing easy (one rule, one PR, one file to read).
Decision: v0.1 detects language by stop-word ratio. No external dependency.
Rationale: short, deterministic, no runtime cost. For the cases where it fails (very short texts, code-heavy docs), the unknown fallback is safe.
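
A minimal sketch of a stop-word ratio detector, to make the heuristic concrete. The word lists, the `MIN_TOKENS` and `MIN_RATIO` thresholds, and the `detect_language` name are all assumptions chosen for the example; the real per-language data and cut-offs are not specified on this page.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language { En, Fr, Unknown }

// Tiny illustrative lists (assumption: the real lists are longer).
const EN_STOP_WORDS: &[&str] = &["the", "and", "of", "to", "is", "in", "that", "it", "with", "for"];
const FR_STOP_WORDS: &[&str] = &["le", "la", "les", "et", "de", "des", "est", "un", "une", "que"];

// Hypothetical thresholds: below these the detector falls back to Unknown.
const MIN_TOKENS: usize = 5;
const MIN_RATIO: f64 = 0.15;

pub fn detect_language(text: &str) -> Language {
    // Split on anything that is not a letter and lowercase each token.
    let tokens: Vec<String> = text
        .split(|c: char| !c.is_alphabetic())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect();

    // Very short texts carry too little signal.
    if tokens.len() < MIN_TOKENS {
        return Language::Unknown;
    }

    // Share of tokens that appear in a given stop-word list.
    let ratio = |list: &[&str]| {
        tokens.iter().filter(|t| list.contains(&t.as_str())).count() as f64 / tokens.len() as f64
    };
    let (en, fr) = (ratio(EN_STOP_WORDS), ratio(FR_STOP_WORDS));

    // Code-heavy or ambiguous input: neither language stands out.
    if en.max(fr) < MIN_RATIO || en == fr {
        Language::Unknown
    } else if en > fr {
        Language::En
    } else {
        Language::Fr
    }
}
```

The function is pure and has no dependencies, so the same input always gives the same answer. Short or code-heavy input falls through to `Unknown` rather than guessing.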
Decision: profiles are Profile::DevDoc | Public | Falc. They cannot be defined in user config in v0.1.
Rationale: adding custom profiles is a speculative abstraction until someone asks for it. Per-rule overrides are enough to cover 95% of the “I want a slightly different preset” cases.
Decision: ROADMAP.md is demoted from edited source to generated artifact. The source-of-truth becomes a structured set of files under .roadmap/ (gitignored), one markdown file per feature with TOML front-matter, plus narrative chunks. A small Rust workspace member (crates/roadmap-cli) provides add / generate / validate / rename subcommands. The generator is invoked locally during release prep; the regenerated ROADMAP.md is committed on the release-prep PR. CI does not regenerate. Scoped under F-roadmap-toml-source.
Rationale:
- main (in place since 2026-05-03 via F-repo-config-hardening) forces every ROADMAP.md tweak through the worktree → branch → PR → CI → merge → cleanup cycle. Forecast steady-state was 10–30 ROADMAP-only edits per week. The PR review value on those edits is null (solo author), so the ceremony was pure overhead.
- ROADMAP.md would weaken the branch-protection signals tracked by the OpenSSF Scorecard / Best Practices badges. Demoting the file from main source preserves those signals untouched.
- pulldown-cmark already in dependencies, folds tests into cargo test, single-toolchain maintenance, and stays extractable as a standalone crate if the tool matures.
- .roadmap/ (gitignored and machine-local). Release cadence — not real-time — was an accepted trade-off; the public ROADMAP.md artifact updates per v* tag.
- <a id="…"> anchor emission (so existing [F46](#f46)-style cross-links from PRs and commits keep resolving), an add templating subcommand (so creating a feature is one keystroke, not a regression), and a round-trip determinism test (regenerate the artifact, diff against committed, fail on drift).

Emergency fallback: if crates/roadmap-cli work overruns budget, the file moves instead to a roadmap orphan branch with direct push and the same .md shape — preserves Scorecard signals via a different mechanism, at the cost of a non-standard branch layout. Documented as the escape hatch but not the chosen path.
- RULES.md — the authoritative rule reference
- ROADMAP.md — future work tracked
- CODING_STANDARDS.md — day-to-day conventions

Future rules, refinements, and platform extensions tracked from v0.1 onwards.
Status as of 2026-05-02: v0.1 shipped 2026-04-20 (17 rules). v0.2.0
shipped 2026-04-22 (25 rules + hybrid scoring + SARIF + condition tags),
v0.2.1 + v0.2.2 shipped 2026-04-23, v0.2.3 shipped 2026-04-29
(structure.line-length-wide author-break-aware + encoding hygiene
F110/F111/F112 + correctness wins). The v0.2.x patch cycle is active:
F25 closed 2026-05-01 (FR pair-completeness 41/41); the FR
content-staleness gate is --strict on main since 2026-05-01 and
on PRs since 2026-05-02 (F92 sub-task fully closed);
F35b/F35c, F104, F105, F107, F123 all shipped.
v0.3 strategy locked 2026-05-02: the breaking change is the 5-rule
cohort (F46 / F49 / F51 / F53 / F57) flipping from default-off to
default-on. Each rule ships in v0.2.x as Experimental via the F-experimental-rule-status
substrate — visible, opt-in for dogfooding, no score regression — then
flips to Stable at the v0.3 cut. v0.4 is a horizon bet list.
| Status | Meaning |
|---|---|
| ✅ | Done (merged on main) |
| 🚧 | In progress |
| ☐ | Not started |
| Priority | Meaning |
|---|---|
| 🔴 Next | Actively queued for the next cut |
| 🟡 Later | Likely someday, not scheduled |
| 🟢 Speculative | Nice-to-have, exploratory |
| — | Shipped; priority meaningless once the item has landed |
Version-centric and topic-centric summary views. The sections below this one are the authoritative topic-centric tables; use them when you need origin, rationale, or full history. Use this section when you need to answer “what’s next?” or “what’s the 0.3 shape?” in a glance.
| Version | State | Breaking? | Headline content |
|---|---|---|---|
| v0.1 | ✅ Released 2026-04-20 | — | 17 rules across 5 phases, minimal inline-disable, mdBook site with FR stub |
| v0.2.0 | ✅ Released 2026-04-22 | Yes (rule-id harmonisation) | Hybrid scoring (F14), SARIF (F32), condition tags (F71/F72), 8 new rules (25 total), F-readability-formulas-extra EN/FR auto-formula |
| v0.2.1 | ✅ Released 2026-04-23 | No | Localhost 404.html fix, 3rd per-rule TOML override, fixtures pipeline, TTY GIFs, v0.1/v0.2 prose sweep |
| v0.2.2 | ✅ Released 2026-04-23 | No | FR syntax.nested-negation pair-based counting |
| v0.2.3 | ✅ Released 2026-04-29 | No | structure.line-length-wide author-break-aware (60+ FR FPs killed), encoding hygiene at the engine boundary (F110/F111/F112 — UTF-8 BOM strip + NFC normalisation), strict whitelist validation, library expect() removal, scoring clamp invariant |
| v0.2.x | 🚧 In progress | No | FR translations (F25 ✅ closed 2026-05-01), responsive (F-docs-responsive), F-rule-mention-linking rule-mention linking, F-example-fixtures-part2 part 2, F-project-scoring-rollup project roll-up |
| v0.3 | ☐ Scoped | Yes | F22 v0.3 slice, F-readability-formulas-extra remainder, 5 condition-tag rules (F46/F49/F51/F53/F57) |
| v0.4 | ☐ Horizon | Varies | LLM plugin (F-llm-plugin), alternative formats (F-asciidoc-support–F-pandoc-companion), feedback-driven items |
Filtered to 🔴 Next + 🚧 In-progress. The narrative sections later in this file are the source of truth; this catalog is a derived index, hand-maintained alongside the narrative. If you spot drift, the narrative wins.
Sort: target version (current cycle first) → status (🚧 in-progress before ☐ next) → F-ID.
| ID | Topic | Status | Target | Summary |
|---|---|---|---|---|
| F22 | Rules refinement | 🚧 | v0.2.x → v0.3 | Parenthesised-list (Oxford ✅; non-Oxford + interleaved deferred to v0.3 slice) |
| F-example-fixtures-part2 | Example fixtures | 🚧 | v0.2.x | Part 2 — redistributable replacements (3/N closed 2026-05-01) |
| F-experimental-rule-status | Architecture | ☐ | v0.2.x | Experimental rule status substrate — gates the v0.3 cohort, opens dogfood window |
| F143 | Architecture | ☐ | v0.2.x | Inline AST layer over pulldown-cmark — substrate for F49 (gates the cohort lead) |
| F-weasel-words-severity-tiering | Rules refinement | ☐ | v0.2.x | Severity tiering for lexicon.weasel-words (quantifier info, hedge warning) — unblocks the audit-and-PR play |
| F-redundant-intensifier-bullet-fix | Rules refinement | ☐ | v0.2.x | lexicon.redundant-intensifier parser miss in bullet / **strong** spans (verification slice for F-tight-list-paragraphs) |
| F-severity-floor-flag | Suppression / config | ☐ | v0.2.x | --severity-floor=warning CLI flag — narrow-audit shape for external PRs |
| F-roadmap-slug-ids | Architecture | ☐ | v0.2.x | New ROADMAP entries adopt F-<slug> form (legacy F1–F146 stay numeric); slug-uniqueness Rust test enforces, runs offline |
| F-repo-discoverability-polish | Adoption channels | ☐ | v0.2.x | README badge row (crates.io / docs / CI / license) + GitHub social preview image (1280×640) — first-impression surfaces for crates.io and link unfurls |
| F-report-quick-wins | Reporting / DX | ☐ | v0.2.x | TTY quick-wins block under the diagnostic list — acronym whitelist hint + single-rule hot-spot hint; non-breaking, additive |
| F-project-scoring-rollup | Architecture | ☐ | v0.2.x | Project-level scoring roll-up (per-file + summary) |
| F-rule-mention-linking | Docs — content | ☐ | v0.2.x | Rule-mention linking pass across guide-prose pages |
| F-docs-responsive | Docs — reading prefs | ☐ | v0.2.x | Responsive / mobile adaptation |
| F-github-action | Adoption channels | 🚧 | v0.3 | GitHub Action — composite scaffold internal; v0.3 first cut emits ::warning:: |
| F-readability-formulas-extra | Rules refinement | ☐ | v0.3 | SMOG / Dale-Chall / Scolarius / --readability-verbose |
| F-tight-list-paragraphs | Architecture | ☐ | v0.3 | Markdown parser emits paragraphs for tight list items (correctness) |
| F46 | New rules (v0.3) | ✅ | v0.3 | lexicon.homophone-density — shipped 2026-05-03 (PR #41) as Status::Experimental; flips to Stable at v0.3 cut |
| F49 | New rules (v0.3) | ✅ | v0.3 | structure.italic-span-long — shipped 2026-05-02 (PR #26) as Status::Experimental (cohort lead); flips to Stable at v0.3 cut |
| F51 | New rules (v0.3) | ✅ | v0.3 | structure.number-run — shipped 2026-05-04 (PR #46) as Status::Experimental; flips to Stable at v0.3 cut |
| F53 | New rules (v0.3) | ✅ | v0.3 | readability.large-number-unanchored — shipped 2026-05-04 (PR #50) as Status::Experimental; flips to Stable at v0.3 cut |
| F57 | New rules (v0.3) | ✅ | v0.3 | syntax.parenthetical-depth — shipped 2026-05-04 (PR #55) as Status::Experimental; flips to Stable at v0.3 cut |
| F-npm-wrapper | Adoption channels | ☐ | v0.3 | npm wrapper (@lucid-lint/cli-{platform} optionalDependencies pattern) |
| F-docsrs-metadata–F-public-api-audit | Docs.rs polish | ☐ | v0.3 | [package.metadata.docs.rs], logo + favicon, doctests, cargo public-api audit |
Where the active energy is. Counts include 🔴 Next only; shipped items excluded.
just check + CI;
any 🔴-tagged row is eligible to ride the next patch cut.

Routed 2026-04-24 in .personal/brainstorm/20260424-next-cycles.md.
Each bet lists the signal that unlocks it, so horizon items don’t
drift into Must by tenure alone. No commitments; this is “what could
be true in ~6 months if 0.2 and 0.3 land cleanly”.
| Bet | Unlock signal |
|---|---|
| F-llm-plugin — lucid-lint-llm plugin | ≥ 2 concrete LLM-as-Judge rules designed on paper; deterministic-core base stable enough that non-determinism is a clear opt-in |
| F-asciidoc-support / F-html-support / F-docx-support / F-pandoc-companion — alternative formats (AsciiDoc / HTML / .docx / pandoc bridge) | External user requests; pick the single format with most pull and ship it alone, not the set |
| F-rule-fixture-coverage-map + F-reference-auto-discovery — fixture coverage maps + auto-discovery | Referential has stabilised (F-example-fixtures-part2 part 2 done) and rule set stops churning |
| F-lexicon-vocabulary-rarity — vocabulary-rarity | Lexique.org + COCA frequency lexicons built and licence-cleared |
| F-rhythm-forward-reference-heavy – F-syntax-address-inconsistency — remaining condition-tag rules | F46 / F49 / F51 / F53 / F57 validated in the wild at 0.3 |
| F-section-scoring — section-level scoring | Document + project level proven; users ask “which H2 is the problem?” |
| F-reading-time-score — reading-time unit | Validated heuristic exists; companion metrics (comfort, fatigue, understandability) defined |
| F-score-evolution-dashboard — score-evolution dashboard | CI users explicitly ask for trend view (not delta — delta is F-diff-mode / --compare) |
| F-interop-suppression — interop suppression (if not shipped in 0.3) | A second rule joins deeply-nested-lists as a markdownlint overlap |
| F-rule-discovery-corpus — rule-discovery corpus mining | Student / intern resource available; separate research track |
| LSP server | Editor demand visible (Cursor / VSCode issues); would change the deployment story |
| F-lede-buried / F-paragraph-landmark-density — research-track rules | Only if someone codes them for fun |
| F-external-feedback-top3 — top 3 items from first-10-external-users feedback (TBD) | 0.2.0 ships and ≥ 10 non-maintainer users exist — placeholder reserved so the horizon isn’t 100 % maintainer bets (renumbered from F98 post-collision with stream-2 cargo-mutants) |
| F-figurative-language — metaphor / analogy / comparison detection (NLP or LLM plugin). Cognitive-load grounded: figurative language costs extra inference for tired readers, aphasia, L2 readers, and is a known axis for ASD (currently out of v0.2/v0.3 scope). Belongs in lucid-lint-nlp (F-nlp-plugin) or lucid-lint-llm (F-llm-plugin) — non-deterministic, so plugin-only per prime directive #4. Bilingual-viable concern: idiomatic FR vs EN metaphors don’t map; FR + EN paths need separate corpora at proposal time. | Either NLP or LLM plugin scaffolding lands AND a dogfood / external case surfaces a missed metaphor that confused a reader |
Deliberately off the 0.4 list:
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-llm-plugin | lucid-lint-llm plugin (LLM-as-Judge rules) | 🟢 Speculative | Research on existing tools |
The plugin would add rules like unclear-antecedent-semantic that use an LLM to detect semantic ambiguities the pattern-based heuristics miss.
Disabled by default due to non-determinism, API cost, and latency incompatible with pre-commit hooks.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-nlp-plugin | lucid-lint-nlp plugin specification and scaffolding (Python subprocess or WASM-based). Replaces heuristic rules with POS- / dependency-tree- / anaphora-backed precise versions. Ship only when the first plugin rule is concretely scheduled — scaffolding-without-consumer is the red flag from AGENTS.md directive #1 (2026-04-24 brainstorm-next-cycles). | 🟡 Later | Rule-system-growth brainstorm (2026-04-20) |
Candidate rules for the plugin:
- syntax.passive-voice detection (replaces v0.1 heuristic)
- syntax.unclear-antecedent
- structure.deep-subordination

Deferred from v0.2 because they require corpus work, lexicon builds, or depend on earlier features (F9, F14). Naming uses the provisional category.rule-name prefix pending F29.
| ID | Rule | Category | Tags | Grounding | Depends on |
|---|---|---|---|---|---|
| F46 | lexicon.homophone-density | Lexicon | dyslexia | BDA (dyslexia) | FR corpus tuning; ships as info. Slip-flag (2026-04-24): if FR corpus tuning exceeds ~2 days, slides to 0.3.x. Ships as Experimental in v0.2.x via F-experimental-rule-status; flips to Stable at v0.3 cut. |
| F49 | structure.italic-span-long | Structure | dyslexia | BDA | Cohort lead (2026-05-02) — first rule on the F-experimental-rule-status substrate, depends on F143 (inline AST layer). Ships as Experimental in v0.2.x; flips to Stable at v0.3 cut. |
| F51 | structure.number-run | Structure | dyscalculia | plainlanguage.gov | Ships as Experimental in v0.2.x via F-experimental-rule-status; flips to Stable at v0.3 cut. |
| F53 | readability.large-number-unanchored | Readability | dyscalculia, general | CDC CCI, plainlanguage.gov “Use Numbers Effectively” | Ships as Experimental in v0.2.x via F-experimental-rule-status; flips to Stable at v0.3 cut. Scoped 2026-05-04 (.personal/feature-torture/reports/F53.md): MVP fires when a numeric token (≥ 4 digits) or magnitude word (million / milliard / billion / trillion) appears with no same-sentence anchor — unit, percentage, ratio (X out of Y / X sur Y), or curated comparator phrase (≤ 30 entries per language under src/language/{en,fr}/). Excludes year-shaped numerals (1700–2100), ordinals, and figure / page / section refs. Profile thresholds: dev-doc = 100 000, public = 10 000, falc = 1 000. Severity at flip TBD (Warning cohort default vs Suggestion); pre-flip dogfood pass on examples/public/ gates the call. Boundary with F51 structure.number-run: F51 fires on numeric clusters; F53 fires on isolated unanchored numerals. |
| F57 | syntax.parenthetical-depth | Syntax | adhd, general | plainlanguage.gov, Hemingway | Ships as Experimental in v0.2.x via F-experimental-rule-status; flips to Stable at v0.3 cut. Scoped 2026-05-04 (.personal/feature-torture/reports/F57.md): MVP fires when a sentence’s maximum balanced-bracket nesting depth across () and [] reaches the profile threshold. Profile thresholds: dev-doc = 4, public = 3, falc = 2. Em-dash pairs and comma-flanked appositives deferred (filed as F-syntax-appositive-depth if dogfood demands). Code spans / blocks already excluded by parser helper; unbalanced brackets fail open (mirrors parenthesised_list_comma_count in src/rules/enumeration.rs). Severity at flip TBD (Warning cohort default vs Suggestion); pre-flip dogfood pass on examples/public/ gates the threshold tuning. Boundary with F22 structure.excessive-commas: F22 already discounts flat (A, B, C) enumerations at depth 1 via parenthesised_list_comma_count; F57 fires only at depth ≥ 2, so the rules are mechanically orthogonal. |
| F58 | syntax.front-loaded-subject-delay | Syntax | adhd, general | plainlanguage.gov | FR corpus validation (dislocation FP risk) |
| F-rhythm-pronoun-density | rhythm.pronoun-density | Rhythm | aphasia, general | FALC | — |
| F-rhythm-topic-shift-cluster | rhythm.topic-shift-cluster | Rhythm | adhd, general | Hemingway | May merge into F-missing-connectors after corpus review |
| F-lexicon-falc-idiom | lexicon.falc-idiom | Lexicon | aphasia, non-native | IFLA, FALC | Curated bilingual idiom lexicon |
| F-lexicon-vocabulary-rarity | lexicon.vocabulary-rarity | Lexicon | non-native, general | — | Frequency lexicon per language (Lexique.org for FR, COCA / Google-Books for EN). Tiered weights: common / context-dependent / expert. LLM-built fallback only. |
| F-rhythm-forward-reference-heavy | rhythm.forward-reference-heavy | Rhythm | adhd, general | Working-memory load | — |
| F-lexicon-acronym-distance | lexicon.acronym-distance-from-definition | Lexicon | adhd, non-native | Memory decay | F9 (definition-aware abbreviation) |
| F-syntax-complex-tense | syntax.complex-tense | Syntax | non-native, aphasia | FALC tense restrictions | FR morphology primary; EN lighter |
| F-syntax-impersonal-voice-heavy | syntax.impersonal-voice-heavy | Syntax | aphasia | FALC direct-address rule | — |
| F-syntax-address-inconsistency | syntax.address-inconsistency | Syntax | non-native, general | Register consistency | FR primary (tu / vous); EN weaker (you / one) |
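
The F57 scoping note in the table above (maximum balanced-bracket nesting depth across `()` and `[]`, fail open on unbalanced input, thresholds of 4 / 3 / 2 per profile) can be sketched as below. The function names and the string-keyed profile lookup are hypothetical, and the sketch does not reproduce the code-span exclusion that the note attributes to a parser helper.

```rust
/// Maximum nesting depth across `()` and `[]` in one sentence.
/// Returns `None` when the brackets are unbalanced or mismatched,
/// so the caller can fail open and emit nothing.
pub fn max_bracket_depth(sentence: &str) -> Option<usize> {
    let mut stack: Vec<char> = Vec::new();
    let mut max_depth: usize = 0;
    for c in sentence.chars() {
        match c {
            '(' | '[' => {
                stack.push(c);
                max_depth = max_depth.max(stack.len());
            }
            ')' | ']' => {
                let expected = if c == ')' { '(' } else { '[' };
                // A closer with no matching opener means the sentence is unbalanced.
                if stack.pop() != Some(expected) {
                    return None;
                }
            }
            _ => {}
        }
    }
    // Leftover openers also count as unbalanced.
    if stack.is_empty() { Some(max_depth) } else { None }
}

/// Profile thresholds as listed in the F57 row (lookup by name is an assumption).
pub fn threshold_for(profile: &str) -> Option<usize> {
    match profile {
        "dev-doc" => Some(4),
        "public" => Some(3),
        "falc" => Some(2),
        _ => None,
    }
}

/// True when the sentence's depth reaches the profile threshold.
pub fn exceeds_depth(sentence: &str, profile: &str) -> bool {
    match (max_bracket_depth(sentence), threshold_for(profile)) {
        (Some(depth), Some(limit)) => depth >= limit,
        _ => false, // unbalanced input or unknown profile: stay silent
    }
}
```

A flat `(A, B, C)` enumeration has depth 1 and never reaches even the strictest threshold, which matches the stated boundary with structure.excessive-commas.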
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-diff-mode | Differential diagnostics — --compare=<ref> CLI mode. Runs against two revisions of the same text(s) and reports score-delta + diagnostic-delta. Pitch: CI/PR comment framing (“this PR adds 2 warnings, removes 5, net −3”), inverting alarm fatigue the way coverage tools do. CLI + JSON + SARIF-run-comparison. No dashboard (that is F-score-evolution-dashboard). | 🟡 Later | Rule-system-growth brainstorm (2026-04-20). Depends on F14 stabilising. |
| F-explain-fancy-rendering | Fancy terminal rendering for lucid-lint explain — pipe the bundled markdown through termimad (or a custom pulldown-cmark + owo-colors walker) so headings, tables, code fences, bullets, and inline code render with proper typography instead of raw markdown. Ship a toned Skin that matches the existing warning-yellow / info-cyan palette rather than termimad’s magenta defaults — the brand direction is calm, typographic, not “rich CLI”. Defer past v0.2 so the check output polish (F?) lands first. | 🟡 Later | TTY-output critique (2026-04-22) |
Motivation: lucid-lint and Markdown-syntax linters (markdownlint, Vale, proselint, textlint) can flag the same line from different angles. Cognitive-load rules that happen to share a substrate with a structural check should stay shipped in core — users without markdownlint, users who disabled the matching markdownlint rule, and users feeding non-Markdown input (plain text, .docx via F-docx-support, HTML via F-html-support) all rely on lucid-lint for that coverage. The pain point is editor LSP sessions where two servers report the same span with different severities and different wording, not CLI pipelines where tools run sequentially.
Scope audit at 2026-04-20: after the structure.heading-jump reframing (cognitive
“comprehension cliff” at skip ≥ 2 levels, distinct from MD001’s strict
+1 rule), structure.deeply-nested-lists is the only lucid-lint rule that
remains functionally equivalent to a markdownlint rule (MD007). The
mechanism below is designed to scale — Vale, proselint, textlint
overlaps are likely as the rule set grows — rather than to solve a
single-rule problem.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F77 | ✅ Shipped in v0.2 — main.rs now auto-discovers lucid-lint.toml walking up from the CWD (stopping at the nearest .git boundary) and applies [default].profile, [default].conditions, [scoring] via ScoringFileConfig::into_scoring_config, and [rules.readability-score].formula. New --config <path> flag overrides discovery. Precedence: built-in profile defaults → TOML → CLI flags. Per-rule TOML overrides beyond readability.score extend rule-by-rule as each Config gains Deserialize. See docs/src/guide/configuration.md. | — | F11 follow-up (2026-04-21) |
| F-interop-suppression | Interop suppression mechanism. Rules declare overlapping external linter rules in their metadata (e.g. Rule::external_overlaps() -> &[(Linter, &'static str)], enum Linter::Markdownlint | Vale | Proselint | Textlint). Users opt in via [interop] suppress_when = ["markdownlint"] in lucid-lint.toml (CLI equivalent: --interop-suppress=markdownlint); opt-out is default, so coverage never silently drops. When active, affected rules are skipped at emission time with an info-level trace in --verbose. Ships CLI + LSP (the LSP path is the real motivator: two servers squiggling the same span with different severities and wording erodes trust in both). Only structure.deeply-nested-lists qualifies at time of writing (MD007); framework is designed to scale to future overlaps. Non-goal: detecting whether the external linter is actually installed or configured — the config field is the signal. | 🟡 Later | Markdownlint-overlap scan (2026-04-20) |
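
A sketch of the interop-suppression idea from the F-interop-suppression row, assuming simplified types. The `external_overlaps` signature and the `Linter` variants follow the row; the `id` method, the two example rule structs, and the `active_rule_ids` filter are hypothetical and only show the opt-in behaviour.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Linter { Markdownlint, Vale, Proselint, Textlint }

pub trait Rule {
    fn id(&self) -> &'static str;
    /// External rules that cover the same ground. Empty by default.
    fn external_overlaps(&self) -> &[(Linter, &'static str)] {
        &[]
    }
}

pub struct DeeplyNestedLists;

impl Rule for DeeplyNestedLists {
    fn id(&self) -> &'static str { "structure.deeply-nested-lists" }
    fn external_overlaps(&self) -> &[(Linter, &'static str)] {
        // The only overlap named in the roadmap at time of writing.
        const OVERLAPS: &[(Linter, &str)] = &[(Linter::Markdownlint, "MD007")];
        OVERLAPS
    }
}

pub struct WeaselWords;

impl Rule for WeaselWords {
    fn id(&self) -> &'static str { "lexicon.weasel-words" }
}

/// Ids of the rules that still run, given the linters listed in
/// `[interop] suppress_when`. An empty list (the default) suppresses nothing.
pub fn active_rule_ids(rules: &[Box<dyn Rule>], suppress_when: &[Linter]) -> Vec<&'static str> {
    rules
        .iter()
        .filter(|rule| {
            !rule
                .external_overlaps()
                .iter()
                .any(|(linter, _)| suppress_when.contains(linter))
        })
        .map(|rule| rule.id())
        .collect()
}
```

Because suppression is driven only by the configured list, the sketch never checks whether markdownlint is installed, in line with the stated non-goal.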
Filed 2026-04-25 from the adoption-channels brainstorm
(.personal/brainstorm/20260425-adoption-channels.md). This section
tracks distribution and integration channels — work that lives
in this repo (release artifacts, plugins, docs pages, IDE / CI
integrations).
Pure promotion / outreach plays (DINUM submission, awesome-list PRs,
audit-and-PR on famous OSS docs, W3C COGA submission, conference
talks, social-media cadence, Hacker News, etc.) moved to
.personal/promotion-channels.md on 2026-05-01. The freed F-IDs
(F111, F112, F113, F117, F118, F119) are considered lost — not
reused. F110 (Vale style pack — code) was renumbered to F137 (since
renamed to F-vale-style-pack under the slug
convention) to free F110 for the encoding-hygiene canonical entry
already shipped in v0.2.3.
The regulatory tailwind (EAA enforceable since 2025-06-28; RGAA 5 ships end-2026 with DGCCRF / Arcom sanctions up to 50k€ + renewable) shapes the must-list — F-vale-style-pack (Vale pack) leans directly on it. Bilingual EN/FR is the differentiator that makes the FR-government channel viable.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-vale-style-pack | Vale style pack — subset of rules → vale-cli/packages topic. Map only the rules that fit Vale’s existence / substitution / occurrence checks (target list: lexicon.weasel-words, lexicon.redundant-intensifier, lexicon.jargon-undefined, lexicon.unexplained-abbreviation, lexicon.all-caps-shouting — plus a couple thresholded structure rules if Vale’s conditional extends cleanly). The cognitive-load core (sentence-too-long thresholds, structure.deep-subordination, scoring engine, FALC profile) stays standalone-only. Pack is generated from the rule registry (~50 lines of Rust emitting Vale YAML) — zero hand-maintenance, regenerated per release. Each rule’s Vale link: field points to docs/src/rules/<id>.md so curiosity about gaps surfaces the standalone tool. Pack README opens with: “This is a subset of lucid-lint for Vale users. For sentence-shape, paragraph rhythm, scoring and the FALC profile, use lucid-lint standalone — see [link].” The Vale pack is intentionally a “trailer.” Risks (discovery dilution, identity blur, maintenance drag) all fall on the README + per-rule link surfaces; not cannibalisation — Vale users are a new audience, not poached existing users. | 🔴 Next | Adoption-channels brainstorm 2026-04-25 |
| F-github-action | GitHub Action in Marketplace (promoted to 🔴 Next, targeted at v0.3 from 2026-04-27 Block E recon — early-adoption feedback channel). Verified peer shape: both astral-sh/ruff-action and biomejs/setup-biome are thin composite actions (yaml-only) that download the prebuilt binary from upstream Releases, add it to PATH, optionally run it. Composite > Docker container for sub-second cold start; pure JS action avoided (no Node runtime needed). Proposed contract: uses: lucid-lint/lucid-lint-action@v1 with with: inputs version (default latest), paths, profile (falc / dev-doc / public), format (tty / json / sarif), min-score. v0.3 first cut emits ::warning file=…,line=…:: workflow commands for inline PR annotations; v0.4 swaps to SARIF upload via github/codeql-action/upload-sarif once the SARIF output stabilises, feeding GitHub Code Scanning natively. Risk: a composite action coupled to cargo-dist release-tarball naming — any rename breaks consumers, so pin the manifest contract. Internal scaffold landed 2026-04-28 — action.yml at the repo root implements the locked input contract (version, paths, profile, format, min-score, plus working-directory and passthrough args); a smoke workflow (.github/workflows/action-smoke.yml) exercises it on Linux / macOS / Windows runners against this repo’s own docs/src/. Not yet published, not yet v1-tagged, not yet listed in the Marketplace. Bake-in plan: dogfood the contract internally for 2–3 weeks, revise inputs that don’t survive contact with reality, then split out to a dedicated bastien-gallay/lucid-lint-action repo (the canonical ruff / biome pattern) and tag v1 alongside the v0.3 release. v0.3 first cut still emits ::warning::; SARIF upload deferred to v0.4 behind F32. | 🔴 Next | Adoption-channels brainstorm 2026-04-25 + Block E recon 2026-04-27 + scaffold 2026-04-28 |
| F-falc-readiness-guide | FALC-readiness guide page — new docs page docs/src/guide/falc-readiness.md (FR mirror at docs/src/fr/guide/falc-readiness.md) explaining how lucid-lint --profile=falc maps to the Inclusion Europe European Easy-to-Read standards. Cite the European Easy-to-Read logo program (logo use is free if conditions met: document follows the standards + at least one person with intellectual disability validated readability). Do not claim certification — claim readiness. The guide drives qualified traffic from disability-federation networks (UNAPEI, Inclusion Europe, etc.). | 🟡 Later | Adoption-channels brainstorm 2026-04-25 |
| F-mdbook-lint-coexistence | mdbook-lint coexistence guide. Short page in our docs (and a one-liner cross-PR to mdbook-lint’s README) explaining “use both”: mdbook-lint = markdown structure, lucid-lint = prose / cognitive load. Different niches, complementary. Free, opportunistic. | 🟢 Could | Adoption-channels brainstorm 2026-04-25 |
| F-pre-commit-hook-listing | Pre-commit hook listing in pre-commit/pre-commit registry. Fires once --check mode is stable across our CLI surface (currently most surfaces use --format=json and exit codes; hook-friendly summary + fast-fail mode is the prerequisite). | 🟢 Could | Adoption-channels brainstorm 2026-04-25 |
| F-wasm-playground | WASM playground for in-browser linting. Peer pattern (ruff play.ruff.rs, biome biomejs.dev/playground): single-page React/Preact + Vite app driving a Monaco editor, with a dedicated *_wasm Rust crate built via wasm-pack (ruff publishes ruff_wasm; biome publishes @biomejs/wasm-web from biome_wasm). Source layout: a playground/ workspace at repo root with wasm/ and web/ sub-trees. Hosting: Cloudflare Pages or GitHub Pages on a subdomain (e.g. play.lucid-lint.dev). Proposed shape for lucid-lint: crates/lucid-lint-wasm exposing lint(text, lang, profile) -> Diagnostic[] via wasm-bindgen; tiny Vite+Preact UI; estimated 300–600 kB gzipped given our deterministic core (no network, no LLM). Phase: v0.4+ — the surface needs its own brainstorm before scoping (UX shape, share-link encoding, persistence, mobile experience, contribution channel) and is best framed as a traction / acquisition lever once v0.3 distribution is in place. Risks: (1) bundle-size cliff if regex + unicode-segmentation push past 1 MB; (2) ongoing maintenance of a JS surface that can drift from CLI behaviour. | 🟢 Speculative | 2026-04-27 Block E recon |
| F123 | ✅ Shipped 2026-04-28 — curl-pipe-sh + PowerShell one-liners are surfaced in README.md and docs/src/guide/installation.md. The cargo-dist installer flip itself was a no-op — installers = ["shell", "powershell"] has been in Cargo.toml [workspace.metadata.dist] since the initial scaffold (d153ad8), so v0.1.1 / v0.2.0 / v0.2.1 / v0.2.2 have all been attaching lucid-lint-installer.sh and lucid-lint-installer.ps1 to their GitHub Releases. Yesterday’s Block E recon mis-filed F123 as a config flip; today’s reconnaissance confirmed the actual gap was discoverability. Documentation now covers both one-liners (Linux / macOS / WSL via curl … | sh; Windows via PowerShell irm | iex), the --check / audit-before-running pattern (download to a file, less/notepad, then execute), version pinning (releases/download/v<version>/… instead of releases/latest/…), and how each installer drops the binary on $PATH. The cargo install and source-build routes stay alongside as fallbacks. README’s stale “Once released to crates.io” lead-in dropped. Vanity sh.lucid-lint.dev redirect remains a v0.5 concern. | — | 2026-04-27 Block E recon |
| F-npm-wrapper | npm wrapper with platform optionalDependencies — promoted to 🔴 Next, targeted at v0.3 (early-adoption feedback channel for the JS-toolchain audience: Prettier / ESLint / Husky / package.json scripts users). Canonical pattern verified on the npm registry: biome (@biomejs/biome 2.4.13) and dprint (0.54.0) both publish a thin root package whose optionalDependencies lists one sub-package per target; npm resolves only the matching platform; root bin shim execs the binary; dprint additionally runs a postinstall install.cjs as fallback. Proposed shape: root lucid-lint (~10 kB) + five platform-specific @lucid-lint/cli-{aarch64-apple-darwin, x86_64-apple-darwin, x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, x86_64-pc-windows-msvc}. Version stays in lockstep with the Rust crate; release workflow gains an npm publish --provenance step using OIDC (biome already does this). Risks: (1) 5+ packages per release multiply publish-failure surface — release workflow needs all-or-nothing semantics; (2) npm registry outages would block JS users — document fallback to direct binary download (F123). | 🔴 Next | 2026-04-27 Block E recon |
| F-repo-discoverability-polish | Repo discoverability polish — README badges + GitHub social preview. Two adjacent first-impression surfaces, bundled because they share the audience (drive-by visitors on crates.io, link unfurls on Mastodon / LinkedIn / HN). Checklist: (1) README badge row at the top of README.md: crates.io version (img.shields.io/crates/v/lucid-lint), docs (mdBook URL), CI (ci.yml badge), license. Standard Rust crate signal — first thing crates.io users scan to gauge a project. (2) Social preview image uploaded under Settings → General → Social preview (1280×640, 40pt safe zone per GitHub template). Replaces the auto-generated avatar+name card that GitHub serves on Twitter / Mastodon / LinkedIn / Slack / HN unfurls. Two viable directions — pure brand card (wordmark + tagline), or terminal-screenshot card (real lucid-lint diagnostic on a real sentence, more “shows what it does”). Pick at design time; brand assets in .impeccable.md. Both items are non-breaking, no code touched — fits any v0.2.x patch slot or rides alongside an unrelated PR. | 🔴 Next | Repo-config session 2026-05-03 |
| F-homebrew-tap | Homebrew distribution (own tap → core). macOS-first audiences (writers, designers, docs teams) reach for brew install before cargo. Path is well-trodden: ship a tap on <org>/homebrew-tap immediately, graduate to homebrew-core once eligibility is met (current acceptable-formula policy needs a manual cross-check on homebrew/brew docs/Acceptable-Formulae.md — sandboxed during Block E recon; the old “75 stars” line was removed but maintainers still gate on adoption signal). Implementation: enable cargo-dist’s homebrew installer — it generates a Ruby formula referencing the same release tarballs we already build (aarch64-apple-darwin, x86_64-apple-darwin, plus Linux bottles) and opens a PR against our tap on each tag. Bottle building runs free on macos-latest runners. v0.4 launches the tap; homebrew-core submission deferred to v0.5+ behind real adoption signal. Risks: (1) tap fragmentation if we never graduate to core; (2) core review can take weeks. | 🟡 Later | 2026-04-27 Block E recon |
Hardened quality stack for the project’s internal hygiene, balancing CI speed with high-signal security and prose audits. Routed 2026-05-04.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-python-hygiene | Python scripting hygiene — ruff + mypy. Enforce high quality on the scripts/ layer (lang-sync, text-conversion). ruff for linting/formatting (replaces black/isort/flake8); mypy for type checking (leverages existing type hints). Pre-commit + CI integration. | 🔴 Next | Static-analysis session 2026-05-04 |
| F-prose-audits | Prose & Doc audits — typos + lychee. typos for fast source-code spell checking (aligned with lucid-lint mission); lychee for link integrity across docs/ and README.md. | ✅ Done | Static-analysis session 2026-05-04 |
| F-cargo-deny | Supply-chain gate — cargo-deny (+ actionlint). Replace the cargo-audit step in .github/workflows/ci.yml with cargo-deny for unified CVE + license + crate-ban checks (e.g. ban lazy_static in favour of OnceLock). Pairs with actionlint in the same PR — both are CI-workflow edits, shared review surface. Adds deny.toml (advisories + MIT/Apache-2.0/BSD-3-Clause/Unicode-DFS-2016 license whitelist + bans). Split out of former F-security-hardening 2026-05-05 (see report). | 🔴 Next | Static-analysis session 2026-05-04; split 2026-05-05 |
| F-gitleaks-precommit | Secret-leak guard — gitleaks pre-commit hook. Add gitleaks to .pre-commit-config.yaml alongside ruff/mypy/typos (PR #58). Minimal .gitleaks.toml. Mirror as a non-blocking CI warning step for one cycle, then promote to required. Open question on first-run scope (history vs staged-only) and blocking-vs-warning posture for the first week. Split out of former F-security-hardening 2026-05-05. | 🔴 Next | Static-analysis session 2026-05-04; split 2026-05-05 |
| F-infra-audit | API-stability gate — cargo-semver-checks. Protect the [lib] API surface before major releases by failing CI on accidental breaking changes. Narrowed from the original actionlint + cargo-semver-checks pairing 2026-05-05 — actionlint moved into F-cargo-deny (shared ci.yml edit). | 🔴 Next | Static-analysis session 2026-05-04; narrowed 2026-05-05 |
| F-cargo-udeps | Unused dependency audit — cargo-udeps. Identify and prune unused crates to minimize binary size and compile times. Runs weekly or manually (requires nightly). | 🟡 Later | Static-analysis session 2026-05-04 |
Bets that don’t commit to a ship date. Tracked to ensure they’re not forgotten.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-paragraph-landmark-density | structure.paragraph-landmark-density — reprise-points for attention-fragile readers. Research needed to define “landmark” (bold / italic / headers / list-starts / code spans?). | 🟢 Speculative | Rule-system-growth brainstorm (2026-04-20) |
| F-lede-buried | structure.lede-buried — journalistic inverted-pyramid check. Strong candidate for a future lucid-lint-journalism plugin rather than core. | 🟢 Speculative | Rule-system-growth brainstorm (2026-04-20) |
| F-rule-discovery-corpus | Rule-discovery corpus project — mine writer-heavy git histories for patterns that authors repeatedly rewrite. Source of evidence-grounded rule proposals. Intern / student project scale. | 🟢 Speculative | Rule-system-growth brainstorm (2026-04-20) |
Additional research directions captured for posterity but not yet ID’d:
- --fix / quickfix suggestions: safe rules only (e.g. structure.long-enumeration → concrete list skeleton). Controversial for prose; needs guardrails.
- lucid-lint baseline: record per-project medians; rules flag regressions rather than absolutes (ESLint-style).
- extends = "falc"): reduce duplication across projects.
- lucid-lint-style plugin: adverb overuse, show-don’t-tell, and other aesthetic rules excluded from core by design.
- lucid-lint-a11y plugin: alternative home for a11y-markup-tagged rules if the tag proves insufficient to separate them from prose rules.

The 2026-04-22 reprioritisation favoured a tight 0.2.0 cut over a
fat one: anything non-blocking slides to 0.2.x patch releases, which
exist precisely to absorb per-rule polish and per-surface slices.
v0.2.0, v0.2.1, and v0.2.2 are shipped; v0.2.x remains open as a
rolling patch cycle. 0.2.x routing was reviewed on 2026-04-24
in .personal/brainstorm/20260424-next-cycles.md (not tracked;
.personal/ is gitignored).
| ID | Summary |
|---|---|
| F29-slim | Rule IDs moved to category.rule-name form (25 rules); src/rules/<cat>/ subdirectories; Category::for_rule derives from prefix. Hard break — suppression directives, [rules.<id>] TOML keys, JSON/SARIF ruleId all use the new form. |
| F35a | theme/index.hbs forked from upstream mdBook; skip link + EN / FR switch server-rendered. WCAG 2.4.1 Bypass Blocks passes with JS disabled. |
| F35d | Accessibility statement page (docs/src/accessibility.md + FR counterpart). |
| F-fail-on-warning-bool | --fail-on-warning accepts optional boolean; hidden mirror --no-fail-on-warning. --min-score now testable in isolation on documents with warnings. |
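
The F29-slim row above says that `Category::for_rule` derives the category from the rule-ID prefix. A minimal sketch of that idea follows, assuming the five categories named elsewhere in this roadmap. The `Option` return type, the lowercase prefix spellings (in particular `rhythm`), and the handling of malformed IDs are illustrative assumptions, not the shipped signature.

```rust
/// The five fixed scoring categories named in this roadmap.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Structure,
    Rhythm,
    Lexicon,
    Syntax,
    Readability,
}

impl Category {
    /// Derive the category from the `category.` prefix of a rule ID.
    /// Hypothetical behaviour: unknown prefixes and empty rule names yield `None`.
    pub fn for_rule(rule_id: &str) -> Option<Category> {
        // Split at the first dot only, so the rule name may itself contain dots.
        let (prefix, name) = rule_id.split_once('.')?;
        if name.is_empty() {
            return None;
        }
        match prefix {
            "structure" => Some(Category::Structure),
            "rhythm" => Some(Category::Rhythm), // assumed spelling of the prefix
            "lexicon" => Some(Category::Lexicon),
            "syntax" => Some(Category::Syntax),
            "readability" => Some(Category::Readability),
            _ => None,
        }
    }
}
```

With this shape, a pre-F29 ID such as `sentence-too-long` no longer resolves to a category, which is one way to see why the rename is a hard break for suppression directives and TOML keys.
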
Localhost 404.html rendering fix (F-example-fixtures-part2 part 1), per-rule TOML override
for structure.excessive-commas (third rule wired after
readability.score.formula and lexicon.unexplained-abbreviation.whitelist),
scraped-prose fixtures pipeline (examples/texts.yaml + just texts),
TTY-capture GIFs via vhs tapes, v0.1 / v0.2 staleness sweep, idea-highlight
motif extended to the structure.sentence-too-long rule page. First
crates.io publish since v0.1.1 — packaging switched from exclude to
an explicit include list so docs/src/rules/*.md reach the tarball
(needed by src/explain.rs’s include_str!).
F87 — FR syntax.nested-negation pair-based counting over ne / n'
clitics and second-position particles (pas, rien, jamais, …).
Routed 2026-04-24 from the active-work view. Each row here has a full entry under a topic section below; priority column reflects the routing decision.
| ID | Topic | Item |
|---|---|---|
| F25 | Docs — bilingual | ✅ Closed 2026-05-01 — per-rule 25/25 + guides 8/8 + architecture 2/2 + contributing |
| F-docs-responsive | Docs — reading prefs | Responsive / mobile adaptation |
| F35b | Docs — reading prefs | Drop role="radiogroup" on reading chips (P2 a11y) |
| F-example-fixtures-part2 part 2 | Example-text fixtures | Redistributable replacements for load-bearing slots |
| F-vale-style-pack | Adoption channels | Vale style pack (subset of rules → vale-cli/packages topic) |
| F-experimental-rule-status | Architecture | Experimental rule status substrate — gates v0.3 cohort, opens dogfood window |
| F143 | Architecture | Inline AST layer over pulldown-cmark — substrate for F49 (cohort lead) |
| F-weasel-words-severity-tiering | Rules refinement | Severity tiering for lexicon.weasel-words (quantifier info, hedge warning) — unblocks the audit-and-PR play |
| F-redundant-intensifier-bullet-fix | Rules refinement | Fix lexicon.redundant-intensifier parser miss in bullet / **strong** spans — unblocks the audit-and-PR play |
| F-severity-floor-flag | Suppression / config | --severity-floor CLI flag — unblocks the audit-and-PR play narrow-audit shape |
| F-report-quick-wins | Reporting / DX | TTY quick-wins block under the diagnostic list — acronym whitelist hint + single-rule hot-spot hint |
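
The experimental rule status listed above (F-experimental-rule-status) is described in its full entry as a Stable / Experimental lifecycle flag, with experimental rules off unless the configuration opts in by ID or with `*`. The sketch below illustrates only that filtering step. `RuleInfo`, `ExperimentalOptIn`, and `active_rules` are hypothetical names invented for the example; the real substrate is planned as a `Status` on the `Rule` trait.

```rust
/// Lifecycle status of a rule; `Stable` is the default in the roadmap entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Stable,
    Experimental,
}

/// Hypothetical registry record, for illustration only.
pub struct RuleInfo {
    pub id: &'static str,
    pub status: Status,
}

/// Hypothetical model of an `[experimental]` opt-in: nothing, everything (`*`), or a list of IDs.
pub enum ExperimentalOptIn {
    None,
    All,
    Ids(Vec<String>),
}

/// Keep every stable rule, plus the experimental rules the user opted into.
pub fn active_rules<'a>(registry: &'a [RuleInfo], opt_in: &ExperimentalOptIn) -> Vec<&'a str> {
    registry
        .iter()
        .filter(|rule| match rule.status {
            Status::Stable => true,
            Status::Experimental => match opt_in {
                ExperimentalOptIn::None => false,
                ExperimentalOptIn::All => true,
                ExperimentalOptIn::Ids(ids) => ids.iter().any(|id| id.as_str() == rule.id),
            },
        })
        .map(|rule| rule.id)
        .collect()
}
```

Promoting a rule then amounts to changing its status from `Experimental` to `Stable`, which matches the one-line-per-rule change the entry expects for v0.3.
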
| ID | Topic | Item |
|---|---|---|
| F-project-scoring-rollup | Architecture | Project-level scoring roll-up (per-file + summary) |
| — | Suppression / config | Per-rule TOML plumbing, rule-by-rule as each Config gains Deserialize |
| F-suppression-reason-field | Suppression / config | reason="..." field on suppression directives |
| F-rule-mention-linking | Docs — content | Rule-mention linking audit + coverage test (F-rule-mention-coverage-test) |
| F-github-action | Adoption channels | GitHub Action published to Marketplace (depends on stable SARIF output) |
| F-falc-readiness-guide | Adoption channels | FALC-readiness guide page citing Inclusion Europe standards |
| F-roadmap-slug-ids | Architecture | ROADMAP feature IDs adopt F-<kebab-slug> for new entries; legacy F1–F146 stay numeric; slug-uniqueness CI test (offline-runnable) |
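
The slug-uniqueness test mentioned in the F-roadmap-slug-ids row can be approximated with the standard library alone. The sketch below assumes that definition sites are marked by explicit `<a id="f-...">` anchors, and it checks only the duplicate-definition case. Both function names are hypothetical, and the planned `tests/roadmap_id_uniqueness.rs` may parse the files differently.

```rust
use std::collections::HashMap;

/// Collect the value of every `<a id="f-...">` anchor, in document order.
/// Anchors whose id does not start with `f-` (legacy numeric IDs, other anchors) are skipped.
fn definition_anchors(markdown: &str) -> Vec<&str> {
    const OPEN: &str = "<a id=\"";
    let mut anchors = Vec::new();
    let mut rest = markdown;
    while let Some(start) = rest.find(OPEN) {
        let after = &rest[start + OPEN.len()..];
        // Stop on an unterminated attribute instead of guessing where it ends.
        let Some(end) = after.find('"') else { break };
        let id = &after[..end];
        if id.starts_with("f-") {
            anchors.push(id);
        }
        rest = &after[end..];
    }
    anchors
}

/// Return the slug anchors defined more than once, sorted for stable output.
fn duplicate_anchors(markdown: &str) -> Vec<&str> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for id in definition_anchors(markdown) {
        *counts.entry(id).or_insert(0) += 1;
    }
    let mut duplicates: Vec<&str> = counts
        .into_iter()
        .filter(|(_, count)| *count > 1)
        .map(|(id, _)| id)
        .collect();
    duplicates.sort_unstable();
    duplicates
}
```

A test built on this would read the roadmap file and assert that `duplicate_anchors` returns an empty list, which needs no network access and so can run offline.
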
F-excessive-nominalization-suffix-refine (nominalization suffix refine), F43 (RULES.md drift cleanup), F73
(font-leak CI gate), F-docs-final-polish (final polish pass), F-explain-fancy-rendering (fancy explain
rendering), F-suppression-disable-file (disable-file), F-text-source-adapters / F-text-before-after-refine / F-texts-yaml-url-maintenance (fixture hygiene),
F-mdbook-lint-coexistence (mdbook-lint coexistence guide),
F-pre-commit-hook-listing (pre-commit hook listing once --check mode stabilises).
F-score-letter-grade (letter grade), F-score-traffic-light (traffic light), F-per-family-subscores (per-family sub-scores), F-lucid-stance-unify
(.lucid-stance unify), F-fix-mode (--fix mode, narrow).
Detail under “New rules (v0.3 candidates)” and the ## v0.4 — horizon
section below.
--readability-verbose.

STR-001) only earn their cost on a real rename, and there are
zero scheduled renames. Revisit when one actually happens.

| ID | Item | Priority | Origin |
|---|---|---|---|
| F14 | ✅ Hybrid scoring model shipped in v0.2 (global score + per-category sub-scores + diagnostics). X/max arbitrary-max at both levels, 5 fixed categories (Structure · Rhythm · Lexicon · Syntax · Readability), composition = weighted sum × density-normalization × per-category cap, weight field added to Diagnostic, --min-score=N CLI flag. See docs/src/guide/scoring.md. Letter-grade / traffic-light / reading-time decorations deferred (F-score-letter-grade–F-reading-time-score). | — | Architecture decision discussion |
| F-project-scoring-rollup | 🚧 Document-level scoring shipped in v0.2 (multi-path runs are aggregated as one document). Project-level roll-up (per-file breakdown + project summary) still open. Section-level deferred → F-section-scoring. | 🔴 Next | Linked to F14 |
| F-per-family-subscores | Per-family sub-scores | 🟡 Later | Linked to F14 |
| F32 | ✅ Shipped in v0.2 — lucid-lint check --format=sarif emits a SARIF v2.1.0 log compatible with GitHub Code Scanning. One rule descriptor per observed rule id (category, default severity, default weight, helpUri to the per-rule mdBook page); per-result properties carry weight + section. Workflow snippet in docs/src/guide/ci-integration.md. | — | v0.1 AGENTS.md audit |
| F37 | ✅ Rule-message clarity audit completed: all 17 rules reviewed against “what do I change?” bar. 15 rules already actionable; structure.heading-jump updated (first-heading-not-H1 and missing-H1 variants now include repair guidance). readability.score info variant left observational by design (fires only when always_report is set). | — | F14 brainstorm/20260420-score-semantics.md |
| F-section-scoring | Section-level granularity for scoring (deferred from F-project-scoring-rollup) — per-heading sub-scores once document + project are proven in the wild. | 🟡 Later | F14 brainstorm/20260420-score-semantics.md |
| F-score-letter-grade | Letter-grade decoration (A–F) on the X/max score — promote when user feedback shows the numbers feel noisy or hard to compare across docs. | 🟡 Later | F14 brainstorm/20260420-score-semantics.md |
| F-score-traffic-light | Traffic-light (🔴🟡🟢) + pass/fail margin in the TTY output — promote when CI users ask for a stronger glance signal than the number alone. | 🟡 Later | F14 brainstorm/20260420-score-semantics.md |
| F-reading-time-score | Reading-time-seconds as an alternative score unit — ties score to concrete user outcome. Requires validated heuristic + companion metrics (comfort, fatigue, understandability) so the time unit doesn’t monopolize the read. | 🟢 Speculative | F14 brainstorm/20260420-score-semantics.md |
| F71 | ✅ Shipped in v0.2 — ConditionTag enum (fixed 7-variant ontology: a11y-markup, dyslexia, dyscalculia, aphasia, adhd, non-native, general) plus Rule::condition_tags() trait method (default &[General]). All 17 v0.2 rules are general; future tagged rules (F48, F55, F56) opt in by overriding. See docs/src/guide/conditions.md. | — | Rule-system-growth brainstorm (2026-04-20) |
| F-tight-list-paragraphs | Markdown parser — emit paragraphs for tight list items (correctness fix). Discovered 2026-05-01 while verifying the F22 third-tranche dogfood metric: the same bullet content that triggers excessive-commas / dense-punctuation-burst / readability.score in a loose list (multiple items separated by blank lines) is silent in a tight list (single item, or items without separating blank lines). Root cause: pulldown-cmark only emits Tag::Paragraph events for items in loose lists; tight-list text events fire directly inside Tag::Item. The parser at src/parser/markdown.rs only buffers text inside heading or paragraph contexts, so tight-list content goes into the void and every paragraph-level rule (all 17 in v0.1) inherits the blindspot. Same pre-existing limitation flagged in F126 for structure.line-length-wide — F-tight-list-paragraphs resolves it once for every rule. Fix: synthesize a paragraph for each list-item span when no Tag::Paragraph event fires inside it. Expected dogfood impact: many CHANGELOG / release-note / README bullets become newly visible to rules — some genuine new diagnostics, some snapshot updates. | 🔴 Next | F22 third-tranche verification (2026-05-01) |
| F72 | ✅ Shipped in v0.2 — [default] conditions = [...] config field and --conditions CLI flag (comma-separated). Filter semantics: rules tagged general always run; tagged-only rules run iff their tags intersect the active list. Profiles unchanged; FALC retains its regulatory meaning. See docs/src/guide/conditions.md. | — | Rule-system-growth brainstorm (2026-04-20) |
| F143 | Inline AST layer over pulldown-cmark — substrate for inline-positional rules. Routed 2026-05-02 (.personal/brainstorm/20260502-parser-substrate-choice.md). The current Markdown parser at src/parser/markdown.rs flattens emphasis, strong, and link spans into the Paragraph.text string before rules see it — visible text preserved, structure lost. F49 (structure.italic-span-long, cohort lead) needs italic-span boundaries; future inline-positional rules (F-paragraph-landmark-density speculative, F-lexicon-acronym-distance conditional on F9) would hit the same wall. Decision: introduce a thin typed inline AST on top of pulldown-cmark, not swap the engine for comrak / markdown-rs. Reasons: pulldown stays the perf-leading parser; the AST is the domain model the rules walk (CUPID-aligned: composable, predictable, domain-based); the engine swap regresses bench by ≈ 2–3× and collides with the lightning-fast positioning pillar. Minimal viable substrate (YAGNI applied inside the layer): enum Inline { Text(String), Emphasis(Vec<Inline>) } plus a Paragraph.inline: Vec<Inline> field captured during the existing pulldown walk. Not modeled yet: Strong, Link, Code, footnotes, task-list markers, hard breaks inside emphasis. Each gets added when a second rule actually demands it; today only F49 does, and the steel-man check confirmed the cohort is non-uniform (F51 / F53 / F57 don’t need inline spans). Plain-text parser path: empty inline vec (no Markdown semantics). Estimated effort: half a day for the substrate, then F49 ships on top in a follow-up PR. Bench gate: PR-1 against the existing bench corpus; > 5 % regression triggers a profile pass before merge. Reversibility: the layer can be deleted and folded back into per-rule fields if a year passes with one consumer; the engine swap (comrak) was rejected partly because that reversal is much harder. | 🔴 Next | 2026-05-02 parser substrate brainstorm; F49 cohort lead unblocked |
| F-experimental-rule-status | Experimental rule status — registry substrate for the v0.3 cohort. Routed 2026-05-02 (.personal/brainstorm/20260502-v03-breaking-change.md). Soft-breaking changes (new default-active rules) are the SemVer-major signal for linters; lucid-lint has 5 such rules queued for v0.3 (F46 / F49 / F51 / F53 / F57). Rather than smear 5 score regressions across v0.2.x patches or hold all 5 until a single v0.3 cut, this entry adds a rule lifecycle status (Stable / Experimental) and ships the cohort in v0.2.x as Experimental (off by default). Users — including this repo’s own dogfood loop on adjacent projects — opt in via a [experimental] config section (enabled = ["structure.italic-span-long", …] or enabled = "*") or --experimental <id> CLI flag. v0.3’s breaking change is then a single-line per rule (Status::Experimental → Status::Stable) plus a CHANGELOG cohort entry. Why this shape, not per-rule default = false knobs: Status is one concept that maps to a known industry pattern (clippy nursery, biome nursery, ESLint experimental rules, rust #[unstable]); per-rule booleans would add five toggles for the same concept and pre-figure no lifecycle. Minimal viable substrate (resist gold-plating): Status enum on the Rule trait (default Stable); default_rules() filters Experimental unless config opts in; [experimental] TOML section parsing; --experimental CLI flag (multi-occur + *); experimental tagging visible in --list-rules output; one snapshot test for the experimental-off vs experimental-on diff. No rule-group / preset / category-toggle machinery yet — the biome-style recommendedRules preset is filed as a v0.4 question. Estimated effort: half a day for the substrate, then one line per rule once F49 / F51 / F53 / F57 ship on top of it. F46 keeps its original FR-corpus slip-flag (independent of the experimental status). | 🔴 Next | 2026-05-02 v0.3 breaking-change brainstorm; user-proposed dogfood window |
| F-repo-config-hardening | ✅ Shipped 2026-05-03: full pass closed. What landed today: tag ruleset on v* pattern (block deletion + force-push, Active; matches all 7 release tags v0.1.0 → v0.2.4). Pre-existing (verified via API audit before clicking): .github/dependabot.yml (cargo + github-actions, weekly, grouped); Actions pinned-SHA required (sha_pinning_required: true); secret scanning + push protection both enabled; private vulnerability reporting enabled; CodeQL configured via Advanced setup workflow (.github/workflows/codeql.yml, weekly Rust scan, last run 0 results / 29 rules); Scorecard workflow also running. Retro on the routed-vs-actual gap: the entry as routed listed 6 items as if all were unconfigured; the API audit revealed 5 of 6 already shipped via earlier hardening passes that didn’t surface to the ROADMAP. Net work today: 1 click (the tag ruleset) + 1 retro audit. Lesson for future “GH Settings hardening” entries: verify state via gh api before drafting the checklist; the repo’s actual posture had drifted past the assumed baseline. Heads-up resolved 2026-05-03: the legacy main-protection branch ruleset transiently showed enforcement: disabled during the hardening session; re-verified via gh api repos/:owner/:repo/rulesets later the same day: both main-protection (branch, 5 rules → main) and v-tag-protection (tag, 2 rules → 7 v* tags) report enforcement: active. Branch protection is enforced on main + tags v*. Branch-Protection Scorecard ceiling (post-hoc note). OSSF Scorecard’s Branch-Protection check warns that main requires no approvers, no CODEOWNERS review, and no last-push approval. These three sub-controls assume ≥ 2 humans; a solo-maintainer repo cannot satisfy them without admin bypass (theatre) or auto-approve (which circumvents the control’s intent). Decision 2026-05-03: accept the 4/10 ceiling as structural while solo. Revisit on co-maintainer onboarding, or pair with F-adversarial-review to add bot-driven review without faking human gating. Original checklist (preserved for retro reference): (1) Tag ruleset on v*. (2) Dependabot version updates (cargo + github-actions). (3) Actions: Require pinned-SHA. (4) Secret scanning + push protection. (5) CodeQL default setup. (6) Private vulnerability reporting. | — | Repo-config session 2026-05-03 |
| F-adversarial-review | Adversarial PR review: a bot as a second pair of eyes. While the repo stays solo-maintainer, every PR is self-authored and self-reviewed; the OSSF Scorecard Branch-Protection check is capped at 4/10 (see F-repo-config-hardening). An adversarial bot reviewer on each PR adds real signal without faking human-review-as-gate. Two complementary tracks: (1) LLM review: Claude Code or Gemini Code Assist via GitHub integration, comment-only mode, included in existing subscriptions; catches semantic and context-aware issues. (2) Rule-engine review: Semgrep, custom CodeQL queries, cargo-deny, cargo-audit, danger.js, or similar, run as required PR checks; deterministic, complementary to LLM, catches structural issues (unsafe patterns, license drift, CHANGELOG gaps). Hard scope rule: review-only, no auto-approve; auto-approve crosses into Scorecard theatre and is explicitly out. Reassess if a co-maintainer joins or a release-managers group is added. | 🟡 Later | Repo-config session 2026-05-03 |
| F-roadmap-slug-ids | ROADMAP feature IDs adopt F-<kebab-slug> form for all new entries. Routed 2026-05-02 (.personal/brainstorm/20260502-roadmap-id-attribution.md). Numeric F<n> IDs collided when two branches independently picked the same free number; reservation-on-main was ruled out because new features are routinely discovered mid-implementation inside an existing feature branch. Decision: new ROADMAP entries use a slug-as-ID form (F-inline-ast-substrate); legacy F1–F146 stay numeric (no migration: Devil’s-Advocate verified no programmatic parser in src/, tests/, scripts/, .github/, justfile depends on the format; mixed taxonomy is cosmetic). Slugs are coined locally with no coordination. The cross-branch race that survives (two offline branches independently coining the same slug) is detected at PR time and resolved by a one-line slug rename in ROADMAP + CHANGELOG, with no branch rename, no rebase, and no commit-history rewrite, because the new convention drops the F- prefix from branch names and commit subjects. Minimal viable substrate: (a) tests/roadmap_id_uniqueness.rs parses ROADMAP.md + CHANGELOG.md and asserts every F-<slug> appears uniquely as a definition site, no slug shadows the legacy F<number> namespace, and every referenced [F-foo](#f-foo) resolves; runs offline via cargo test, re-runs in CI as a backstop. (b) Explicit <a id="f-..."></a> anchors on first definition (matches the existing convention for numeric IDs). (c) F- prefix becomes optional in branch names and commit subjects: branches use plain feature slugs (feat/<slug>), commits use scope syntax (feat(parser): <subject>). Surfaces touched: tests/roadmap_id_uniqueness.rs (new), AGENTS.md Conventions section, CHANGELOG.md [Unreleased], this entry itself (first dogfood). Reversibility: if the mixed taxonomy ever bites (it shouldn’t, since no parser depends on it), a one-shot rename script could fold slug entries into a numeric scheme at any future v0.x cut. Estimated effort: ~1 h total, split as uniqueness test (30 min), AGENTS.md update (15 min), this ROADMAP wiring (15 min). | 🔴 Next | 2026-05-02 ID-attribution brainstorm |
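
The minimal inline AST described in the F143 row above can be sketched as follows. The `Inline` enum is taken from that entry. The two helper functions are hypothetical illustrations of how a rule such as structure.italic-span-long might walk the tree, and counting words by whitespace is a simplifying assumption.

```rust
/// Minimal inline AST from the F143 entry: plain text and (possibly nested) emphasis.
#[derive(Debug, Clone, PartialEq)]
pub enum Inline {
    Text(String),
    Emphasis(Vec<Inline>),
}

/// Visible text of a run of inline nodes, with emphasis markup removed.
pub fn visible_text(nodes: &[Inline]) -> String {
    let mut out = String::new();
    for node in nodes {
        match node {
            Inline::Text(text) => out.push_str(text),
            Inline::Emphasis(children) => out.push_str(&visible_text(children)),
        }
    }
    out
}

/// Word count of the longest top-level emphasis span (hypothetical metric).
/// Nested emphasis counts toward its enclosing span; plain text contributes zero.
pub fn longest_emphasis_words(nodes: &[Inline]) -> usize {
    nodes
        .iter()
        .map(|node| match node {
            Inline::Text(_) => 0,
            Inline::Emphasis(children) => visible_text(children).split_whitespace().count(),
        })
        .max()
        .unwrap_or(0)
}
```

Because `visible_text` reproduces the flattened string the current parser already emits, a paragraph could carry both representations side by side, and the plain-text path would simply pass an empty slice.
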
The linter is a UTF-8 → diagnostics function. Encoding conversion is
the user’s responsibility, exactly once, before lint-time (iconv
or “save as UTF-8”). Invalid UTF-8 fails at the read boundary
(std::fs::read_to_string returns an io::Error). Other encodings
(Windows-1252, Latin-1, Shift-JIS, …) are explicit non-goals: any
in-process transcoder would violate the deterministic-core prime
directive (charset detection is heuristic, “same input, same output”
no longer holds). The entries below cover the valid-UTF-8 edge
cases the test surface should pin.
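
As an illustration of the read boundary described above, the sketch below rejects non-UTF-8 bytes without attempting any charset detection. `decode_utf8` is a hypothetical helper, not part of the codebase; it mirrors the error kind that `std::fs::read_to_string` is documented to return for invalid UTF-8.

```rust
use std::io;

/// Decode raw bytes as UTF-8, or fail with `InvalidData`.
/// No transcoding is attempted, so the same bytes always give the same result.
pub fn decode_utf8(bytes: Vec<u8>) -> io::Result<String> {
    String::from_utf8(bytes).map_err(|err| io::Error::new(io::ErrorKind::InvalidData, err))
}
```

A Latin-1 encoded `café` (bytes `63 61 66 E9`) fails here, which is the intended outcome: the user converts the file once, and the linter never guesses.
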
| ID | Item | Priority | Origin |
|---|---|---|---|
| F110 | ✅ Shipped 2026-04-28 — leading \u{FEFF} stripped once at the engine boundary (Engine::lint_with_source, via the normalize_input helper). Funnels every input path (string, stdin, file) through the same boundary so rules never see the BOM. Regression test in src/engine.rs::tests::bom_prefix_does_not_shift_diagnostics proves identical diagnostics + line/column locations with and without a leading BOM on a sentence-too-long fixture. | — | 2026-04-25 encoding survey |
| F111 | ✅ Shipped 2026-04-28 — unicode-normalization = "0.1" added; Engine::lint_with_source NFC-normalizes input at the same boundary as F110, fast-pathing already-NFC text via is_nfc_quick. NFC café and NFD cafe + U+0301 now hash identically in every HashMap-using rule. Regression test in src/engine.rs::tests::nfd_input_yields_same_diagnostics_as_nfc exercises a 4-sentence FR fixture and asserts diagnostic count + per-diagnostic rule id and line match across NFC and NFD inputs. | — | 2026-04-25 encoding survey |
| F112 | ✅ Shipped 2026-04-28 — src/engine.rs::tests::lone_cr_line_endings_are_normalized pins parity between LF and lone-CR three-paragraph fixtures (word count + diagnostic count). src/engine.rs::tests::zero_width_chars_inside_words_pin_behaviour pins observed behaviour for U+200B / 200C / 200D inside words: the engine round-trips without panicking and produces a valid Report; exact word count is intentionally not asserted because nfc() does not strip them and tokenisation is owned by unicode-segmentation. | — | 2026-04-25 encoding survey |
| F-mixed-script-fixtures | Mixed-script test fixtures. Pin behaviour on EN + CJK and LTR + RTL prose mixed within one paragraph. unicode_words() should handle the boundaries correctly (UAX-29), but no regression test exists. Filed as Speculative — no known bug, just a coverage gap. Open if a real-world bilingual corpus surfaces edge cases. | 🟢 Speculative | 2026-04-25 encoding survey |
| F126 | ✅ Shipped — Markdown parser maps <br> to \n in paragraph.text. Pulldown-cmark emits <br> as Event::InlineHtml, not Event::HardBreak, so the v0.2.x author-break-aware fix for structure.line-length-wide silently dropped <br> despite advertising it as a measured hard break. Helper html_is_br_tag recognises <br>, <br/>, <br /> (any case, optional whitespace); HTML comments (suppression directives) flow through unchanged. Five new tests pin the contract: br_tag_inside_paragraph_is_a_hard_break and html_comment_directives_do_not_inject_newlines (parser); markdown_br_tag_is_checked, list_item_text_is_out_of_scope, table_cell_text_is_out_of_scope (rule). The two out-of-scope tests pin the parser-construction contract that list-item content and GFM table cells are not emitted as paragraphs today, so the rule is silent on over-length content inside them — a future parser change that starts emitting either as paragraphs would need to revisit this rule. | — | 2026-04-30 audit follow-up to the structure.line-length-wide author-break-aware fix (.personal/2026-04-30-today.md:125) |
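
The boundary normalisation described in F110 and F112 can be sketched with the standard library. This is not the actual `normalize_input` helper: the NFC step from F111 is left out because it needs the unicode-normalization crate, and folding CRLF into LF is an assumption added here for completeness. The function name is hypothetical.

```rust
use std::borrow::Cow;

/// Strip one leading BOM and normalise CR-based line endings to LF.
/// Borrows the input when nothing needs rewriting.
pub fn normalize_input_sketch(input: &str) -> Cow<'_, str> {
    // Only a BOM at the very start is removed; U+FEFF elsewhere is left alone.
    let without_bom = input.strip_prefix('\u{FEFF}').unwrap_or(input);
    if without_bom.contains('\r') {
        // Replace CRLF first so that a pair does not become two newlines.
        Cow::Owned(without_bom.replace("\r\n", "\n").replace('\r', "\n"))
    } else {
        Cow::Borrowed(without_bom)
    }
}
```

Doing this once at the engine boundary means every input path sees the same text, so line and column numbers in diagnostics do not shift depending on how the file was saved.
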
| ID | Item | Priority | Origin |
|---|---|---|---|
| F9 | ✅ Shipped in v0.2 — definition-aware lexicon.unexplained-abbreviation is now two-pass. A pre-scan collects acronyms defined anywhere in the document in either canonical form (Expansion (ACRONYM) or ACRONYM (Expansion); expansion side ≥ 2 alphabetic words to reject (TBD)-shaped noise), and a single definition silences every occurrence of that token. Silencing precedence: defined-in-doc → user whitelist → baseline. See docs/src/rules/unexplained-abbreviation.md. | — | Rule 10 simplified in v0.1 |
| F-readability-formulas-extra | 🚧 Must-ship slice shipped in v0.2 — readability.score auto-selects the formula by detected language: Flesch-Kincaid for EN (kept), Kandel & Moles (1958) for FR. Kandel-Moles ease scores are converted to a grade-equivalent so per-profile max_grade_level stays comparable across languages. Unknown language → Flesch-Kincaid. See docs/src/rules/readability-score.md. Still open: Gunning Fog / SMOG / Dale-Chall (EN), Scolarius / Flesch-Kandel (FR), --readability-verbose multi-formula reports, per-file override (covered by F11). | 🟡 Later | Rule 11 simplified in v0.1; scope expanded in rule-system-growth brainstorm (2026-04-20) |
| F11 | ✅ Shipped in v0.2 — --readability-formula {auto,flesch-kincaid,kandel-moles} CLI flag + FormulaChoice enum on readability_score::Config + Engine::with_readability_formula(choice). auto (default) keeps F-readability-formulas-extra per-language selection; flesch-kincaid / kandel-moles pin a formula for cross-document comparison. TOML config wiring is tracked separately as F77. | 🟡 Later | Rule 11 |
| F-missing-connectors | missing-connectors rule (15b not shipped in v0.1) | 🟡 Later | Rule 15 decomposition |
| F-low-diversity-stoplist | Custom stoplist parameter for lexicon.low-lexical-diversity | 🟡 Later | Rule 5 |
| F-sentence-diversity-density | Sentence-level low-lexical-diversity density | 🟢 Speculative | Rule 5 |
| F-comma-density-relative | Comma density metric (relative) for structure.excessive-commas | 🟢 Speculative | Rule 3a |
| F22 | 🚧 First slice shipped in v0.2.x — structure.excessive-commas now discounts commas inside (A, B, C, …) parenthesised token lists (3+ short comma-separated segments inside balanced parens, language-agnostic). Sibling helper parenthesised_list_comma_count in src/rules/enumeration.rs. Dogfood drops from 25 → 15 hits (10 FPs killed, ~40% reduction). Deferred to v0.3: relaxing MAX_SEGMENT_WORDS = 2 for 3–4-word Oxford items, non-Oxford / “plus”-closed lists, interleaved parentheticals inside Oxford runs. See research note in .personal/research/F22.md. | 🔴 Next | v0.1 dogfood: 5 false-ish positives on technical docs |
| F23 | ✅ Shipped in v0.2 — false-positive cleanup complete for v0.2. Hits inside inline code spans, straight "..." quotes, paired curly "..." quotes, and directional rather than / plutôt que pairings are now skipped. Single quotes / apostrophes are deliberately not recognised (possessives, contractions, FR elisions). The “concrete noun” semantic check ("many X" where X is a concrete noun) stays unshipped — needs POS data and belongs in the lucid-lint-nlp plugin (F-nlp-plugin) rather than the deterministic core. | — | v0.1 dogfood: 11 false-ish positives on this repo’s own docs |
| F-excessive-nominalization-suffix-refine | Refine lexicon.excessive-nominalization suffix list (drop or gate -al; many adjectives — crucial, horizontal, positional, attentional — are flagged despite not being abstract nouns) | 🟡 Later | v0.1 dogfood |
| F87 | ✅ Shipped in 0.2.x — FR syntax.nested-negation now uses pair-based counting over ne / n' clitics and the second-position particles pas, rien, jamais, plus, personne, aucun, aucune, guère, nulle part. Each clitic contributes one negation and consumes its nearest particle within a 6-token window; unpaired particles in a ne-sentence contribute one more — so Nous ne disons pas que rien n'est jamais possible now counts as 3 (was 2). Guards: pas / plus never count when unpaired, de rien idiom is skipped, particles in ne-less sentences are skipped. Fixture at tests/corpus/fr/nested-negation.md anchors the behaviour. | — | 2026-04-23 docs clarity session — FR pedagogical example surfaced the detection gap |
| F31 | ✅ Shipped in v0.2 — dev-doc baseline narrowed to the infrastructure stack (URL, HTML, CSS, JSON, XML, HTTP, HTTPS, UTF, IO, API, CLI, GUI, OS, CPU, RAM, SSD, USB, IDE, SDK, CI, CD). Accessibility standards, engineering-practice initialisms, and AI/language-tech terms moved to project config via new [rules.unexplained-abbreviation].whitelist in lucid-lint.toml (additive over baseline). Breaking change for downstream users, flagged in CHANGELOG with the recovery snippet. Dogfooded in this repo’s own lucid-lint.toml. | — | v0.1 review feedback |
| F126 | TOML overrides for lexicon.jargon-undefined. In v0.2 the active jargon lists are baked into the profile preset and there is no [rules."lexicon.jargon-undefined"] deserializer in src/config.rs — users can’t add custom domain terms, silence individual entries, or activate a non-default list combination from lucid-lint.toml. Wire the same shape unexplained-abbreviation already uses (validated whitelist, plus custom_jargon for additive terms and an explicit active_lists enum array). The rule’s underlying Config struct already exposes the fields (active_lists, custom, whitelist) — this is a config-layer wiring task, not a rule rewrite. Definition of done: TOML round-trip test, docs page (docs/src/rules/jargon-undefined.md + FR mirror) describing the schema, drop the F126 forward-link in those pages. | 🟡 Later | 2026-04-28 FR-translation review surfaced the gap |
| F-weasel-words-severity-tiering | Severity tiering for lexicon.weasel-words. Routed 2026-05-02 (.personal/brainstorm/20260502-async-book-pr-timing.md); blocks the async-book audit-and-PR play (tracked in .personal/promotion-channels.md). The current rule fires uniform warning on every entry in the EN/FR weasel-list, conflating two distinct linguistic functions: quantifiers (some, many, often, most, several) which are legitimate technical hedging in reference docs, and hedges (a bit, just, quite, rather, pretty, kind of) which signal under-confident prose. Stripping all of them in a Rust async-reference produces prose reviewers reject as artisan (“over-edited” — surfaced in .personal/f113-async-book/READABILITY_REVIEW.md). Fix: split WEASEL_WORDS_EN / _FR into two sub-lists, emit Severity::Info on quantifier hits and Severity::Warning on hedge hits. Per-rule TOML override stays available for users who want stricter / looser bands. Surfaces the pattern other lexical rules can adopt later (no architectural lift; a per-match severity decision inside the rule body). Definition of done: split lists in src/language/{en,fr}/weasel.rs, severity routing in src/rules/lexicon/weasel_words.rs, snapshot regen for both languages, docs page (docs/src/rules/weasel-words.md + FR mirror) describing the two bands and the rationale, CHANGELOG ## [Unreleased] entry. Pairs with F-severity-floor-flag: once --severity-floor=warning exists, an external auditor running on a Rust-reference repo gets the “no contested edits” view in one flag. | 🔴 Next | F113 audit-and-PR play (2026-05-02) |
| F-redundant-intensifier-bullet-fix | lexicon.redundant-intensifier parser miss inside bullet items / **strong** spans. Routed 2026-05-02 (.personal/brainstorm/20260502-async-book-pr-timing.md); blocks the async-book audit-and-PR play (tracked in .personal/promotion-channels.md). Surfaced while linting rust-lang/async-book/src/why_async.md: very inside - **OS threads** are very … (bullet + strong span) does not fire, while highly in a flat paragraph does. Same family of misses as F-tight-list-paragraphs — paragraph-level rules go silent when the surrounding event is Tag::Item with no enclosing Tag::Paragraph. F-tight-list-paragraphs is the right substrate to fix once for every paragraph-level rule; this entry is the verification slice that pins the regression for redundant-intensifier so the case cannot regress when F-tight-list-paragraphs lands. Definition of done: corpus fixture tests/corpus/en/redundant-intensifier-bullet.md + FR mirror, snapshot covering very / highly / really inside - **strong** ... and * **strong** ... shapes, comment in the test linking to F-tight-list-paragraphs so the slot stays after F129 lands, CHANGELOG entry. | 🔴 Next | F113 audit-and-PR play (2026-05-02); same family as F-tight-list-paragraphs |
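
The pair-based counting described in the F87 row can be sketched as below. This is an illustration under stated assumptions, not the shipped rule: the tokenizer is deliberately naive, "nearest" is read as nearest in either direction, the two-word particle `nulle part` is left out, and the names `count_negations`, `tokenize`, `PARTICLES` and `WINDOW` are hypothetical.

```rust
/// Second-position particles from the F87 entry ("nulle part" omitted here).
const PARTICLES: &[&str] = &[
    "pas", "rien", "jamais", "plus", "personne", "aucun", "aucune", "guère",
];
/// Pairing window in tokens, as quoted in the entry.
const WINDOW: usize = 6;

/// Naive tokenizer: lowercases, splits on non-letters, and separates an
/// elided clitic from its verb ("n'est" becomes "n'" and "est").
fn tokenize(sentence: &str) -> Vec<String> {
    let lower = sentence.to_lowercase();
    let mut tokens = Vec::new();
    for raw in lower.split(|c: char| !(c.is_alphabetic() || c == '\'' || c == '’')) {
        if raw.is_empty() {
            continue;
        }
        match raw.strip_prefix("n'").or_else(|| raw.strip_prefix("n’")) {
            Some(rest) => {
                tokens.push("n'".to_string());
                if !rest.is_empty() {
                    tokens.push(rest.to_string());
                }
            }
            None => tokens.push(raw.to_string()),
        }
    }
    tokens
}

fn is_clitic(token: &str) -> bool {
    token == "ne" || token == "n'"
}

/// Counts negations in one French sentence using clitic/particle pairing.
fn count_negations(sentence: &str) -> usize {
    let tokens = tokenize(sentence);
    let mut clitics: Vec<usize> = Vec::new();
    let mut particles: Vec<usize> = Vec::new();
    for (i, token) in tokens.iter().enumerate() {
        if is_clitic(token) {
            clitics.push(i);
        } else if PARTICLES.contains(&token.as_str()) {
            // Guard: the "de rien" idiom is not a negation.
            let de_rien = token == "rien" && i > 0 && tokens[i - 1] == "de";
            if !de_rien {
                particles.push(i);
            }
        }
    }
    // Guard: particles in a ne-less sentence are skipped entirely.
    if clitics.is_empty() {
        return 0;
    }
    let mut count = 0;
    for &c in &clitics {
        // Each clitic contributes one negation.
        count += 1;
        // It also consumes its nearest unpaired particle within the window.
        let mut best: Option<(usize, usize)> = None; // (slot, distance)
        for (slot, &p) in particles.iter().enumerate() {
            let distance = p.abs_diff(c);
            if distance <= WINDOW && best.map_or(true, |(_, d)| distance < d) {
                best = Some((slot, distance));
            }
        }
        if let Some((slot, _)) = best {
            particles.remove(slot);
        }
    }
    // Leftover particles add one each, except "pas" / "plus".
    for &p in &particles {
        if tokens[p] != "pas" && tokens[p] != "plus" {
            count += 1;
        }
    }
    count
}
```

On the sentence quoted in the entry, `ne` pairs with `pas`, `n'` pairs with `rien`, and the leftover `jamais` adds one more, which gives the documented count of 3.
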
F22 context. The v0.1 rule is a flat comma-per-sentence threshold. In technical docs that routinely enumerate short items, this fires often even when the sentence is perfectly scannable. Candidate relaxations to evaluate (needs corpus research; don’t pick blindly):

- Ignore commas inside parentheticals (`(...)`, `[...]`, en/em-dash pairs). A parenthetical enumeration is already visually bracketed; its commas are not adding subordination load.
- Ignore commas after a `:` when what follows is a list of short items. Colon + short items is idiomatic prose-enumeration and reads well.
- Relax the threshold for short items (`max_short_enum_items` parameter, or implicit).
- Coordinate with structure.long-enumeration: the shared enumeration::detect_enumerations helper already discounts Oxford-style enumeration commas from structure.excessive-commas (3+ short items). F22 is specifically about the cases that helper still misses: parentheticals, post-colon lists, and non-Oxford enumerations (“A, B, C and D” without the final comma).

Research inputs to gather before deciding: FR/EN corpus samples of technical docs, a handful of real false positives from dogfooding and downstream projects, how textlint / Vale / write-good handle parentheticals. Decide between relaxation parameters vs. a smarter token-aware counter.
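
The parenthetical discount from the F22 entry (3+ short comma-separated segments inside balanced parens) can be illustrated with a small counter. This is a sketch under assumptions: the names `parenthesised_list_comma_count` and `MAX_SEGMENT_WORDS` are quoted from the entry, but the body, `MIN_SEGMENTS`, `list_commas` and `effective_comma_count` are hypothetical, and nested parentheses are simply skipped.

```rust
/// Upper bound on words per list segment, as quoted in the F22 entry.
const MAX_SEGMENT_WORDS: usize = 2;
/// Minimum number of segments for a parenthetical to count as a list.
const MIN_SEGMENTS: usize = 3;

/// Commas to discount for one parenthetical body (text between the parens).
fn list_commas(inner: &str) -> usize {
    // Nested parentheticals are out of scope for this sketch.
    if inner.contains('(') {
        return 0;
    }
    let segments: Vec<&str> = inner.split(',').map(str::trim).collect();
    let all_short = segments.iter().all(|segment| {
        let words = segment.split_whitespace().count();
        (1..=MAX_SEGMENT_WORDS).contains(&words)
    });
    if segments.len() >= MIN_SEGMENTS && all_short {
        segments.len() - 1
    } else {
        0
    }
}

/// Counts commas that sit inside `(A, B, C, …)`-style token lists.
fn parenthesised_list_comma_count(sentence: &str) -> usize {
    let mut discounted = 0;
    let mut depth = 0usize;
    let mut start = 0usize; // byte offset just after the outermost '('
    for (i, ch) in sentence.char_indices() {
        match ch {
            '(' => {
                depth += 1;
                if depth == 1 {
                    start = i + 1;
                }
            }
            ')' if depth > 0 => {
                depth -= 1;
                if depth == 0 {
                    discounted += list_commas(&sentence[start..i]);
                }
            }
            _ => {}
        }
    }
    discounted
}

/// Comma count a threshold check would see after the discount.
fn effective_comma_count(sentence: &str) -> usize {
    let total = sentence.matches(',').count();
    total.saturating_sub(parenthesised_list_comma_count(sentence))
}
```

A parenthetical aside with a long segment, such as "(which, frankly, took several painful weeks)", gets no discount because one segment exceeds the word limit.
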
Findings filed from the 2026-04-24 code-review stream-2 pass on
src/. Each has a concrete source reference so it survives past the
.personal/<date>-today.md scratchpad.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F93 | Parser hot-path allocations. src/parser/mod.rs:43 (Paragraph::new(trimmed.to_string(), …)) and src/parser/tokenizer.rs:~88/109 (current.trim().to_string() per sentence) allocate in hot loops. Proposed fix: impl Into<String>; pass the already-owned buffer where possible. Refuted by profiling: Paragraph::new does not appear in the profile; to_string() in tokenizer = 3 samples / 0.03%. Real hot spots are F102 (detect_language 7.5%) and F103 (per-rule split_sentences). | ✅ Done (refuted) | 2026-04-24 code review (stream-2 #3); refuted 2026-04-25 |
| F94 | Tokenizer Vec<char> per sentence. src/parser/tokenizer.rs:~60 collects a full Vec<char> for lookahead. Proposed fix: Peekable<CharIndices>. Refuted by profiling: Vec<char> drop = 3 samples / 0.03% on the engine path. The 2026-04-24 “low ceiling” note (~5%) was generous; real ceiling is ~0.1%. Skip. | ✅ Done (refuted) | 2026-04-24 code review (stream-2 #5); refuted 2026-04-25 |
| F102 | detect_language cost. Single function showed 7.5% inclusive in samply profile 2026-04-25. Rewrote as single-pass, alloc-light: scalar counters, to_lowercase() only for words containing an uppercase character, no intermediate vectors. Bench delta on engine_lint_str/en_long_devdoc vs stream2-noisy: −0.56 % (p = 0.00, ~20 µs) — smaller than profile suggested because most of the inclusive cost is unicode_words() itself, which the rewrite cannot touch. | ✅ Done | 2026-04-25 samply profile; landed 2026-04-25 |
| F103 | Per-rule split_sentences re-parse. 8 rules called split_sentences(&paragraph.text, …) directly. Moved sentence splitting into Paragraph::new; rules now read &paragraph.sentences. Bench delta vs stream2-noisy: engine_lint_str/en_long_devdoc −11.58 % (~394 µs); parse_markdown/en_long +17.67 % (~38 µs, intentional — split cost moved into the parser phase, where it pays for itself across the eight consumers). Net user-facing win ~360 µs. New baseline saved as stream2-after-f103. | ✅ Done | 2026-04-25 samply profile; landed 2026-04-25 |
| F95 | ✅ Shipped 2026-04-24 in commit 925ffb5. Two non-literal expects fixed: consecutive_long_sentences.rs (streak_start unwrap when streak_len > max) and all_caps_shouting.rs::flush_run (first()/last() on a Vec already verified len >= min_run). The originally flagged parser/tokenizer.rs:177 candidate is now an if let Some(...) pattern. Remaining expect("non-zero literal") sites are all NonZeroU32::new(LITERAL) — idiomatic compile-time invariants, explicitly out of audit scope. | ✅ Done | 2026-04-24 code review (stream-2 #2) |
| F96 | ✅ Shipped 2026-04-24 in commit 925ffb5. src/scoring.rs:199-209 now carries an explicit safety-contract comment naming the [0, cap] clamp dependency, plus a debug_assert!(normalized.is_finite() && (0.0..=cap).contains(&normalized)) that trips in debug builds if a future edit loosens the clamp. The #[allow(clippy::cast_possible_truncation, clippy::cast_sign_loss)] stays — it masks a lint, not a real bug — but the invariant is now load-bearing in tests. | ✅ Done | 2026-04-24 code review (stream-2 #1) |
| F-config-whitelist-normalize | Config whitelist normalization at load time. src/config.rs — normalize (trim, case-fold per rule needs) on load instead of per invocation; catches user typos early. Small win; fits a v0.3 config-plumbing pass rather than a 0.2.x patch. | 🟡 Later | 2026-04-24 code review (stream-2 #6) |
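
The F96 pattern (a clamp that makes a float-to-integer cast safe, guarded by a `debug_assert!`) can be sketched in isolation. The function name `to_points` and its parameters are hypothetical; the sketch only illustrates the shape of the safety contract and does not claim to match the code in src/scoring.rs.

```rust
/// Converts a raw score into whole points in `[0, cap]`.
/// Hypothetical helper: assumes `cap` is finite and non-negative.
#[allow(clippy::cast_possible_truncation, clippy::cast_sign_loss)]
fn to_points(raw: f64, cap: f64) -> u32 {
    // NaN would survive `clamp`, so map it to 0 before clamping.
    let normalized = if raw.is_nan() { 0.0 } else { raw.clamp(0.0, cap) };
    // Safety contract: the cast below is only sound because `normalized`
    // is finite and inside [0, cap]. The assert trips in debug builds if a
    // later edit loosens the clamp above.
    debug_assert!(normalized.is_finite() && (0.0..=cap).contains(&normalized));
    normalized.round() as u32
}
```

The `#[allow]` silences the cast lints, and the assert keeps the reason for silencing them checkable instead of leaving it as a comment alone.
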
New rule candidates raised in the rule-system-growth brainstorm
(2026-04-20). Naming uses a provisional category.rule-name prefix
pending F29 harmonisation. Grounding column points at the standard or
research that justifies the rule.
Must-ship v0.2 (blocking release):
| ID | Rule | Category | Tags | Grounding | Priority |
|---|---|---|---|---|---|
| F48 | ✅ lexicon.all-caps-shouting shipped in v0.2 — see docs/src/rules/all-caps-shouting.md | Lexicon | a11y-markup, dyslexia, general | WCAG 3.1.5, BDA Dyslexia Style Guide | — |
| F55 | ✅ syntax.nested-negation shipped in v0.2 — see docs/src/rules/nested-negation.md | Syntax | aphasia, adhd, general | FALC, CDC Clear Communication Index | — |
| F56 | ✅ syntax.conditional-stacking shipped in v0.2 — see docs/src/rules/conditional-stacking.md | Syntax | aphasia, adhd, general | FALC, plainlanguage.gov | — |
Should-ship v0.2 (cuttable under time pressure, in suggested cut order):
| ID | Rule | Category | Tags | Grounding | Priority |
|---|---|---|---|---|---|
| F62 | ✅ lexicon.redundant-intensifier shipped in v0.2 — see docs/src/rules/redundant-intensifier.md | Lexicon | general | Plain-language guides | 🟡 Later |
| F52 | ✅ structure.mixed-numeric-format shipped in v0.2 — see docs/src/rules/mixed-numeric-format.md | Structure | dyscalculia, general | CDC Clear Communication Index | 🟡 Later |
| F50 | ✅ structure.line-length-wide shipped in v0.2 — see docs/src/rules/line-length-wide.md | Structure | dyslexia, general | WCAG 1.4.8 (AAA) | 🟡 Later |
| F47 | ✅ lexicon.consonant-cluster shipped in v0.2 — see docs/src/rules/consonant-cluster.md | Lexicon | dyslexia, general | BDA Dyslexia Style Guide | 🟡 Later |
| F54 | ✅ syntax.dense-punctuation-burst shipped in v0.2 — see docs/src/rules/dense-punctuation-burst.md | Syntax | general | IFLA easy-to-read guidelines | 🟡 Later |
Cut order if schedule slips: F47 → F54 → F62 → F52 → F50 → F11. F55 and F56 are non-negotiable (trivial implementation cost, strong grounding).
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-asciidoc-support | Native AsciiDoc support | 🟡 Later | Format scope v0.1 |
| F-html-support | Native HTML support | 🟡 Later | Relevant for EAA compliance |
| F-docx-support | .docx support via Pandoc integration | 🟡 Later | FALC institutional target |
| F-pandoc-companion | Companion script pandoc → lucid-lint | 🟡 Later | Documented in v0.1 README |
Scraper + cleaner + converter triplet under scripts/texts_*.py
populates examples/public/ (committable public_ok sources) from
examples/texts.yaml. First batch landed 21 fixtures. The follow-ups
below close the remaining rough edges.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-text-source-adapters | Per-source adapters for git-cloned upstreams. The generic clean / convert path doesn’t know how to extract text from shallow-cloned repos (proselint checks, Vale style packs, write-good / alex / retext / textlint-rule fixtures, ASSET / OneStopEnglish / EASSE / CLEAR-corpus datasets). Each needs a small extractor that walks the repo and emits one or more .md files per rule / excerpt. | 🟡 Later | First scraper batch, 2026-04-22 |
| F-text-before-after-refine | Refine texts_convert._split_before_after. The current heuristic looks for literal ## Before / ## After (EN/FR) headings; no upstream page in the current batch uses that shape, so every before_after source fell back to a single content.md with a warning. Replace with a per-source pair-extraction rule (plainlanguage.gov, EC How to write clearly, Canada.ca, OneStopEnglish, ASSET, Inclusion Europe) that emits before.md + after.md. | 🟡 Later | First scraper batch, 2026-04-22 |
| F-texts-yaml-url-maintenance | Maintenance pass on examples/texts.yaml URLs. 12 sources failed on the first batch — 404s from moved landing pages (canada.ca × 2, BDA Dyslexia, Center for Plain Language, Newsela, HuggingFace wiki_auto), UA-/bot-blocking (Légifrance 403, Orthodidacte 403, ADHD Foundation 400), and a DNS error for the specific 18F post. Audit and update entries; for sources that genuinely require a browser-flavoured UA, add a per-source override in the fetcher. Fold in the opportunistic hygiene tasks from the 2026-04-23 brainstorm: (a) dedupe overlapping canada.ca / plainlanguage.gov entries, (b) add a licence-drift guard that flags when a source’s redistribution changes between fetches. | 🟡 Later | First scraper batch, 2026-04-22 + referential brainstorm, 2026-04-23 |
| F-example-fixtures-part2 | Desired-fixture-shapes coverage table + replacements for high-value local-only entries. Part 1 — coverage tables: ✅ Shipped (2026-04-23) — scripts/texts_coverage.py splits output by audience: the committed examples/texts.md shows public_ok counts only (no totals, no names that would leak local-only existence), spliced between <!-- coverage:begin/end --> markers; the gitignored examples/local/COVERAGE.md carries the full matrices plus the load-bearing local-only list. Wired as just texts-coverage / just texts-coverage-check. Part 2 — replacement hunting: 🟡 In progress. First addition (2026-04-25): a French government FALC source under Etalab Open Licence 2.0 — knock-on lifted aphasia × FR and gov_guide × FR out of 0 / N ⚠. Second addition (2026-04-27): three US-federal public-domain ADHD sources — NIMH ADHD topic page (mixed shape, ~780 words), CDC About ADHD (good, ~920 words), CDC Treatment of ADHD (good, ~1040 words). All three covered by the explicit reproduction notices in NIMH and CDC reuse policies (17 USC § 105 + agency policy pages). Knock-on: adhd × EN lifted from the load-bearing list; public-coverage gov_guide × EN and condition adhd × EN rise to non-zero counts. Remaining load-bearing slots: dyscalculia × EN (one BDA link_only) and aphasia × EN+FR (three plain-language standards as link_only). | 🟡 In progress | Referential brainstorm, 2026-04-23 |
| F-rule-fixture-coverage-map | Bidirectional rule ↔ fixture coverage map. Generate examples/COVERAGE.md from each content.md’s rules_relevant frontmatter, rendered as two views: rule → fixtures that exercise it (surfaces under-fixtured rules) and fixture → rules it covers (surfaces untagged or mis-tagged fixtures). Once stable, embed or link the canonical fixture per rule from docs/src/rules/<rule-id>.md. Optional follow-up: calibrated snapshot tests that lock expected lint output per canonical fixture. | 🟡 Later | Referential brainstorm, 2026-04-23 |
| F-reference-auto-discovery | Auto-discovery of new references with triage queue. Crawler (sitemaps, RSS, GitHub search, ACL Anthology API) surfaces candidate sources against a relevance filter derived from rules_relevant keywords; a lightweight triage file lists candidates with accept / ignore / defer. Mini-product — revisit post-v0.3 once the referential has stabilised. | 🟢 Speculative | Referential brainstorm, 2026-04-23 |
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-code-block-without-lang | code-block-without-lang rule | 🟡 Later | Rule 8 dropped from v0.1, candidate for lucid-lint-docs plugin |
Polish items for the auto-generated rustdoc surface at https://docs.rs/lucid-lint. The crate-level banner pointing readers to the mdBook + repo + RULES.md was added 2026-05-01 (src/lib.rs); module-level //! headers are already in place and #![warn(missing_docs)] is satisfied. Items below are deferred extras.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-docsrs-metadata | [package.metadata.docs.rs] block in Cargo.toml. Pin the toolchain and feature set docs.rs builds with; add rustdoc-args = ["--cfg", "docsrs"] so any future feature-gated items can carry #[cfg_attr(docsrs, doc(cfg(feature = "x")))] and render the “available with feature X” badge. Cheap, lands the day a real feature flag is introduced. Renumbered from F129 (collision with the parser tight-list fix, F-tight-list-paragraphs, which landed in parallel). | 🟢 Speculative (0.2.x or 0.3) | 2026-05-01 docs.rs polish discussion |
| F-docsrs-logo | Logo + favicon on docs.rs via #![doc(html_logo_url = "…")] and #![doc(html_favicon_url = "…")] at crate root. Reuses an asset hosted under the repo’s raw URL. Tiny visual identity win on the docs.rs landing page. Renumbered from F130. | 🟢 Speculative (0.2.x) | 2026-05-01 docs.rs polish discussion |
| F-doctest-entrypoints | One runnable doctest per major entry point (Engine::with_profile, Engine::lint_str, Report field access, key Profile variants). /// blocks render as code samples on docs.rs and run under cargo test --doc, so they cannot rot. ~5 lines each. Lifts the API page from “list of names” to self-explanatory reference. Renumbered from F131. | 🟡 Later (0.3) | 2026-05-01 docs.rs polish discussion |
| F-public-api-audit | Public-API audit with cargo public-api: surface candidates that should carry #[doc(hidden)] (re-exports for macros, internal helpers leaked via pub) so the rustdoc index reflects the intended surface, not the current surface. Pair with a CI gate later if the surface becomes load-bearing for SemVer. Renumbered from F132. | 🟡 Later (0.3) | 2026-05-01 docs.rs polish discussion |
| ID | Item | Priority | Origin |
|---|---|---|---|
| F25 | French mirror of the mdBook docs (/fr/ tree). First slice shipped 2026-04-22: translated introduction + rules-index, short FR accessibility and roadmap pages pointing at EN, SUMMARY sidebar entry. Second slice shipped post-0.2.1 (2026-04-23): fr/rules-index.md renamed to fr/rules/index.md for EN-parity, first FR per-rule page landed (structure.sentence-too-long), parallel-version sidebar and EN↔FR deep-link toggle (F-summary-per-locale plan slot A, F92). Third slice shipped 2026-04-24: four more FR per-rule pages landed (structure.excessive-commas, structure.long-enumeration, lexicon.weasel-words, lexicon.unexplained-abbreviation), locked template honoured, SUMMARY.md + fr/rules/index.md rewired to point at the local FR versions. Fourth slice shipped 2026-04-25: six more FR per-rule pages landed (structure.paragraph-too-long, structure.line-length-wide, structure.mixed-numeric-format, structure.deeply-nested-lists, structure.heading-jump, structure.deep-subordination), closing out the structure category (9 / 9 rules FR-complete). Fifth slice shipped 2026-04-27: two more FR per-rule pages landed (rhythm.consecutive-long-sentences, rhythm.repetitive-connectors), closing out the rhythm category (2 / 2 rules FR-complete). Both EN pages were brought up to canonical template first (Examples + See also added). Sixth slice shipped 2026-04-28: six more FR per-rule pages landed (lexicon.low-lexical-diversity, lexicon.excessive-nominalization, lexicon.jargon-undefined, lexicon.all-caps-shouting, lexicon.redundant-intensifier, lexicon.consonant-cluster), closing out the lexicon category (8 / 8 rules FR-complete). Three of five categories now at 100 % (structure + rhythm + lexicon). Seventh slice shipped 2026-04-30: six more FR per-rule pages landed (syntax.passive-voice, syntax.unclear-antecedent, syntax.dense-punctuation-burst, syntax.conditional-stacking, syntax.nested-negation, readability.score), closing out the syntax (5 / 5) and readability (1 / 1) categories — all 5 categories now 100 % FR-complete (25 / 25 per-rule pages). SUMMARY.md was missing FR Syntaxe + Lisibilité subsections entirely; added in the same commit. Also fixed an EN/FR logic bug in syntax.nested-negation example (After clause now something is possible / quelque chose est possible, matching the predicate-logic-faithful inversion of the Before clause). Eighth slice shipped 2026-05-01 (Block C slice A): first two FR guide pages landed (fr/guide/installation.md, fr/guide/quick-start.md); new Premiers pas draft-chapter group in SUMMARY.md; both pages stamped with the F92 sub-task en-source-sha HTML comment. Ninth slice shipped 2026-05-01 (Block C slice B): two more FR guide pages landed (fr/guide/profiles.md, fr/guide/suppression.md) — Block C now half-done (4 / 8). 4 EN-only guide pages remain (conditions, configuration, scoring, ci-integration); FR pair-completeness now 35 / 42 (untranslated EN: 7, down from 11 at start of day). Tenth slice shipped 2026-05-01 (Block C slice C — closing slice): four FR guide pages landed (fr/guide/conditions.md, fr/guide/configuration.md, fr/guide/scoring.md, fr/guide/ci-integration.md); SUMMARY.md Premiers pas group now lists all 8 children. Block C complete (8 / 8). All 8 EN guide pages now have FR mirrors; FR pair-completeness 39 / 42 — only the architecture overview, design-decisions, and contributing pages remain untranslated (these are next-tier surfaces, not part of the user-facing guide). Eleventh slice shipped 2026-05-01 (next-tier close): three FR pages landed (fr/architecture/overview.md, fr/architecture/design-decisions.md, fr/contributing.md); SUMMARY.md gains an Architecture draft-chapter group + Contribuer entry under Version française. F25 closes — pair-completeness 41 / 41 (only roadmap.md remains intentionally asymmetric). | ✅ Closed 2026-05-01 | v0.1 docs /shape session, bilingual-equality prime directive |
| F-summary-per-locale | Split SUMMARY.md per locale (EN + FR) via a small preprocessor. v0.2.1 ships the single-SUMMARY.md + CSS :has() locale-hiding approach (1.A); both language trees coexist in the built HTML and each viewer only sees theirs. A clean separation would maintain SUMMARY.en.md + SUMMARY.fr.md and stitch them at build. Benefit: smaller per-page sidebar payload; clearer authoring story; no :has() browser-support floor. Cost: build-time stitcher, tooling to keep the two files in pair-sync. File when the FR tree outgrows the hide-via-CSS approach. | 🟢 Speculative | 2026-04-23 FR per-rule pages session |
| F-multi-book-mdbook | Multi-book mdBook layout (one book per locale). The truest “parallel version” — / redirects to /en/, /fr/ is its own mdBook with its own theme inheritance. Benefit: each locale has its own table of contents, its own search index, its own navigation neighbour hints; no cross-locale bleed in any surface. Cost: biggest surgery — book.toml per locale, build orchestration, shared theme / asset de-duplication, sitemap updates, redirects. Revisit only if F-summary-per-locale isn’t enough. | 🟢 Speculative | 2026-04-23 FR per-rule pages session |
| F92 | ✅ Shipped post-0.2.1 (2026-04-23) — scripts/sync_lang_counterparts.py walks docs/book/**/*.html after mdbook build and rewrites both hreflang="en" and hreflang="fr" anchors so the lang-switch deep-links to the matching page (e.g. /fr/rules/sentence-too-long.html ↔ /rules/sentence-too-long.html). Wired into just docs-build, the Deploy-docs workflow, and a new just docs-lang-check CI gate that runs with --check and fails on orphaned FR pages (FR without EN counterpart). The invariant is asymmetric by design: EN is canonical, FR is a translation layer — untranslated EN pages are informational and tracked as F25, not gated. No front-matter flag yet; add a counterpart: none flag only when a truly asymmetric page appears. Sub-task — FR content-staleness gate (shipped 2026-05-01): filename parity is gated; content drift was not. Every FR page now carries an en-source-sha HTML-comment stamp on its first line (<!-- en-source-sha: 5e24f614… -->), recording the EN counterpart’s last commit SHA at translation time. mdBook passes HTML comments through unchanged so the stamp is invisible in the rendered page; YAML front-matter was tried first but mdBook renders --- as <hr> and the body as text. scripts/check_lang_staleness.py walks every FR page, compares the stored SHA to git log -n1 --pretty=%H -- <EN counterpart>, reports drift soft (PR ci.yml + main docs-deploy.yml) and fails on main with STRICT=1 once the existing stale backlog clears. Wired as just docs-lang-staleness. scripts/backfill_en_source_sha.py (one-shot) stamped the 29 already-translated FR pages with the EN SHA at their introduction commit. Reconcile shipped 2026-05-01 (commit 438fa48b, “F92 — reconcile stale FR backlog (13 → 0) + flip gate to strict”): of the 13 pages reported stale, 12 were cosmetic stamp drift only (the F105/F105b references-section sweep, the F35b/F35c a11y fix, and the line-length-wide author-break-aware fix all touched FR counterparts in the same commits — only the en-source-sha stamps lagged), 1 was substantive (fr/index.md had drifted on three sections — État du projet v0.2 numbers, Aperçu peak-end demo block, Pour aller plus loin guide-links update). Same commit flipped docs-deploy.yml from soft to --strict. PR-side ci.yml flipped to --strict on 2026-05-02 — both surfaces now strict-gated, sub-task fully closed. Optional further layers: an mdBook preprocessor banner above stale FR pages; a needs-fr-translation PR label automation for EN edits without FR counterparts. | — (sub-task: ✅ Closed 2026-05-02) | 2026-04-23 FR per-rule pages session, option 2.B; 2026-05-01 Block C planning |
| F-docs-i18n-substrate | Docs i18n substrate evaluation (Starlight vs Sphinx). mdBook is a twin-tree at the file level with a post-build hreflang patcher (F92); it cannot deliver page-keyed translations or identical section numbering across languages by construction. A real i18n model needs either route-keyed translations (Astro Starlight: defaultLocale + locales, language dropdown built in, Markdown sources kept) or message-catalogue translations (Sphinx + sphinx-intl / gettext PO files: FR is the same file with strings substituted, headings and numbering identical by construction; weblate-style flow). Don’t migrate now — F92 + F25 + the F92 staleness sub-task carry through v0.2.x. Migration triggers (any one): (a) a third language is requested (Spanish or German via the EU disability-federation play, F-falc-readiness-guide); (b) docs surface crosses ~50 pages; (c) contributors complain about FR/EN drift after the staleness gate is in place. Default pick on trigger: Starlight (lightest migration, keeps Markdown). Sphinx only if RGAA-mandated structural parity becomes a contractual requirement. Placeholder entry — no work scheduled. | 🟡 Later | 2026-05-01 Block C planning, F25 follow-up |
| F107 | ✅ Shipped 2026-04-27 — Two-part fix without aliasing the rule ID. (1) Page subtitle: every shipped FR rule page opens with a short italic gloss directly under the H1 (e.g. *Phrase trop longue.*); 13 pages received the subtitle, the remaining 12 land alongside their translation. (2) Index gloss: fr/rules/index.md “Catégories” block reshaped into 5 per-category sub-tables (Structure / Rythme / Lexique / Syntaxe / Lisibilité), each Règle \| Libellé two-column. All 25 rules carry a FR label even when the page still points to the EN version (marked (en) inline). One-line note clarifies the kebab-case ID is the stable contract; the FR label is a reading aid. Sidebar TOC labels stay in EN — translating them would force a per-locale SUMMARY.md (F-summary-per-locale, parked Speculative). | — | 2026-04-25 docs UX critique (Block E) |
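
The staleness check described in the F92 row reduces to parsing a first-line stamp and comparing it with the EN counterpart's current SHA. The project does this in a Python script; the Rust sketch below only illustrates the comparison, takes the current SHA as a parameter instead of calling git, and uses hypothetical names (`parse_en_source_sha`, `check_staleness`, `Staleness`) and an exact-match comparison that is an assumption.

```rust
/// Outcome of comparing a translated page against its EN counterpart.
#[derive(Debug, PartialEq)]
enum Staleness {
    /// Stored SHA matches the EN page's latest commit.
    Fresh,
    /// Stored SHA differs: the EN page changed after translation.
    Stale,
    /// No stamp found on the first line.
    Unstamped,
}

/// Extracts the SHA from a first-line stamp shaped like
/// `<!-- en-source-sha: <sha> -->`.
fn parse_en_source_sha(page: &str) -> Option<&str> {
    let first_line = page.lines().next()?.trim();
    let inner = first_line
        .strip_prefix("<!--")?
        .strip_suffix("-->")?
        .trim();
    let sha = inner.strip_prefix("en-source-sha:")?.trim();
    if sha.is_empty() {
        None
    } else {
        Some(sha)
    }
}

/// Compares the stored stamp with the EN page's current commit SHA.
/// Assumes both are full SHAs and must match exactly.
fn check_staleness(fr_page: &str, en_head_sha: &str) -> Staleness {
    match parse_en_source_sha(fr_page) {
        None => Staleness::Unstamped,
        Some(stored) if stored == en_head_sha => Staleness::Fresh,
        Some(_) => Staleness::Stale,
    }
}
```

Keeping the stamp in an HTML comment means the parser never has to distinguish it from page content: anything other than a well-formed comment on line one is reported as unstamped.
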
| ID | Item | Priority | Origin |
|---|---|---|---|
| F27 | ✅ Shipped in v0.2 — docs/src/roadmap.md is auto-generated from the root ROADMAP.md by scripts/sync-roadmap.py. just docs-build / just docs-serve run the sync first, so the mdBook site always ships the current roadmap. Relative links are rewritten (targets under docs/src/ become docs-relative; others become absolute GitHub URLs) so the docs_links_stay_inside_docs gate still passes. | — | v0.1 docs review |
| F28 | ✅ Shipped in v0.2 — one page per rule under docs/src/rules/, wired into docs/src/SUMMARY.md, enforced by tests/rule_docs_coverage.rs. Each page carries category, severity, default weight, parameters per profile, EN/FR examples where applicable, and suppression guidance. | — | v0.1 docs review |
| F29 | Rule ID harmonisation. F29-slim ✅ shipped 2026-04-22 in v0.2.0: the 25 rule IDs now use category.rule-name form (structure.excessive-commas, lexicon.weasel-words, readability.score, …) and rule source files moved into category subdirectories under src/rules/<cat>/. Category::for_rule derives the category from the id prefix rather than a hand-maintained match arm (F43-style drift now impossible by construction). Hard break — suppression directives, [rules.<id>] TOML keys, JSON/SARIF ruleId fields all use the new form; no alias layer. mdBook filenames and docs URLs still use the flat kebab slug; docs-tree rearchitecture into category subdirs is a separate slice. F29-full (parked 2026-04-24) would add a stable category-numbered code (STR-001, LEX-002, SYN-003) that survives renames — slim already makes drift impossible by construction, and numeric codes only earn their cost on a real rename. Revisit only when a rename actually happens. | — (slim) / 🟢 Speculative (full) | v0.1 docs review; 2026-04-22 reprioritisation; 2026-04-24 brainstorm-next-cycles |
| F-rule-mention-linking | Audit every rule mention across the docs and link it to its reference page (F28). Requires F28 to land first. References-page surface (rule IDs in → Relevant to: lines + rule → reference summary table) covered by F105b 2026-04-27; remaining surface is rule mentions in docs/src/guide/* prose pages, RULES.md, and the introduction. | 🟡 Later | v0.1 docs review |
| F42 | ✅ Shipped in v0.2 — rule documentation coverage gate. tests/rule_docs_coverage.rs cross-checks every shipped rule id against its mdBook page, Category::for_rule, scoring::WEIGHTED_RULE_IDS, and (on CI, gated by RULE_DOCS_GATE_GIT=1) the ## [Unreleased] section of CHANGELOG.md. Contract documented in CONTRIBUTING.md. | — | v0.2 interlude |
| F43 | ✅ Shipped in v0.2 — RULES.md category drift fixed. Per-rule **Category** lines and the Categories table now match Category::for_rule: structure.excessive-commas and structure.deep-subordination are structure, rhythm.repetitive-connectors is rhythm, syntax.unclear-antecedent is syntax. The drift banners on the four per-rule mdBook pages are removed. | 🟡 Later | Surfaced by F42 interlude |
| F-rule-mention-coverage-test | Coverage test for F-rule-mention-linking: assert that each rule id mentioned in docs/src/**/*.md is linked on its first occurrence in each section. Follow-up from F-rule-mention-linking. | 🟡 Later | F-rule-mention-linking follow-up |
| F104 | ✅ Shipped 2026-04-27 — SUMMARY.md reshaped into 5 collapsible sub-trees (Structure / Rhythm / Lexicon / Syntax / Readability) using mdBook draft chapters (- [Title]()) as non-clickable group headers; FR Version française block mirrors the same shape (Structure / Rythme / Lexique — Syntaxe and Lisibilité materialise as those FR translations land). markdownlint MD042 disabled globally to permit the empty-link draft-chapter syntax (matches the pre-existing MD025 carve-out for SUMMARY-required multiple H1s). Picked over (B) “one sub-page per category” — B doubles the page count without adding clarity the index table doesn’t already provide. | — | 2026-04-25 docs UX critique (Block E) |
| F105 | ✅ Shipped 2026-04-27 — docs/src/references.md (EN, under Project) and docs/src/fr/references.md (FR, under Version française) consolidate every cited source into one informative surface, preserving the full taxonomy of examples/REFERENCES.md (legend, per-domain sections, rule → reference summary table) and the scholarly-honesty note. examples/REFERENCES.md becomes a thin redirect to the docs sources — kept because external citations may already point there. Both rule indexes (EN + FR) cross-link to the new page next to the existing RULES.md pointer. Per-citation anchors deferred — readers scan the page or use browser search; if a need surfaces, file a follow-up. | — | 2026-04-25 docs UX critique (Block E) |
| F105b | ✅ Shipped 2026-04-27 — Per-citation anchors (<a id="author-year">) on every entry of references.md + fr/references.md, plus a ## References / ## Références section on every rule page (25 EN + 13 FR) listing the relevant citations as anchored links. The references page now links rule IDs in → Relevant to: lines and the rule → reference summary table to their per-rule mdBook pages — bidirectional rules ↔ references. Verified canonical URLs (DOI, publisher landing page, official archive — researched in 2026-04-27 lap, 26 of 34 academic citations carry one) added inline as raw HTML anchors with rel="nofollow noopener noreferrer" target="_blank": nofollow so the docs site does not vouch for outside content, noopener noreferrer for new-tab safety. Sources without a verifiable canonical URL stay text-only — no guessed links. Subsumes the F-rule-mention-linking rule-mention linking pass for the references-page surface; wider F-rule-mention-linking audit (rule mentions in docs/src/guide/* prose pages) stays open. | — | F105 follow-up filed 2026-04-27 |
| F-docs-codegen | Code → docs codegen for data-heavy surfaces. Several docs surfaces are hand-maintained today but derivable from the rule registry and config types: per-rule pages (defaults, weight, condition tags, category, severity), docs/src/rules/index.md table, docs/src/guide/profiles.md threshold tables, docs/src/guide/conditions.md tag list, docs/src/guide/suppression.md directive list, JSON output schema page. Proposed shape: a lucid-lint manifest --format=json subcommand emits one document with everything pulled from default_rules, Category::for_rule, scoring::default_weight_for, the Condition enum, profile presets, and schemars::JsonSchema derives. A just docs-gen script renders marked regions in existing prose pages (<!-- BEGIN: lucid-gen rule-defaults id=structure.sentence-too-long lang=en --> … <!-- END: lucid-gen -->) so prose around the data stays hand-authored. CI runs just docs-gen and fails on non-empty git diff (same shape as F27 for the roadmap sync). Translation surface shrinks to prose only — labels (Default, Profile, Threshold) come from a small i18n.toml keyed by (lang, key), the data is identical across languages by construction. Block on F25 guide translations landing first so we don’t change the substrate mid-translation; open after Block C closes. | 🟡 Later | 2026-05-01 Block C planning, F25 / F28 / F42 follow-up |
| F-landing-page | Landing-page polish. docs/src/introduction.md already plays both roles today: lens-motif hero, before/after figure, “what makes it different”, quick-taste terminal capture, “where to next”. A real landing-page push only earns its cost when there’s a first consumer outside the maintainer (project gets adopted, traffic shows up). Until then, polishing is design work without a forcing function. Notes for when triggered: more positioning above the fold, demo grid for the rule families (one canonical example per category), CTA toward profiles + quick-start, lens-motif extension already validated for use across the page. | 🟢 Speculative | 2026-04-25 docs UX critique (Block E) |
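
The marked-region rendering proposed in F-docs-codegen can be sketched as below. This is a minimal illustration, not the project's implementation: `fill_region` is a hypothetical helper name, and the sketch assumes each BEGIN marker appears once per page and is passed in verbatim. It keeps both markers, swaps only the body between them, and is idempotent, which is the property the "fail on non-empty git diff" gate relies on.

```rust
/// Closing marker, copied from the roadmap entry.
const END_MARKER: &str = "<!-- END: lucid-gen -->";

/// Replace the text between `begin_marker` and the next END marker with
/// `generated`. Returns `None` when either marker is missing, so the caller
/// can report a malformed page instead of silently skipping it.
/// (Hypothetical helper for illustration only.)
pub fn fill_region(page: &str, begin_marker: &str, generated: &str) -> Option<String> {
    let begin_at = page.find(begin_marker)?;
    let body_start = begin_at + begin_marker.len();
    let body_len = page[body_start..].find(END_MARKER)?;
    let body_end = body_start + body_len;

    let mut out = String::with_capacity(page.len() + generated.len());
    out.push_str(&page[..body_start]); // hand-authored prose + BEGIN marker
    out.push('\n');
    out.push_str(generated); // data rendered from the registry
    out.push('\n');
    out.push_str(&page[body_end..]); // END marker + trailing prose
    Some(out)
}
```

Running the helper twice with the same generated text yields the same page, so a CI job can regenerate and compare without churn.
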
| ID | Item | Priority | Origin |
|---|---|---|---|
| F26 | ✅ MVP shipped in v0.2 via DOM-level trim in lucid-navigation.js — the picker now shows three honest items (Auto · Lucid light · Lucid dark); the stock Rust / Navy / Ayu <li>s are marked hidden so they’re inert for keyboard and screen-reader. CSS class mapping is unchanged (.light / .rust → lucid-light, .coal / .navy / .ayu → lucid-dark), so pre-existing localStorage selections still render correctly. Follow-up (optional): a full index.hbs override to drop the stock markup entirely rather than hide it; preferred once the mdBook upgrade cadence settles. | 🟡 Later | v0.1 docs /colorize session; mdBook stock limitation |
| F73 | ✅ Pre-deploy font-leak gate shipped in v0.2 — just docs-check-clean rebuilds the book, runs scripts/sanitize-stock-css.py, and greps the output for active font-family / --*-font / local() references to Open Sans or Source Code Pro. Not wired into just check (mdbook build is too slow for the dev loop); wire it into the docs-publish CI workflow before any release-candidate goes live. | 🟡 Later | v0.2 /critique polish pass follow-up |
| F-example-fixtures-part2 | ✅ Shipped in v0.2.1 — fixed localhost 404.html rendering under mdbook serve. book.toml sets site-url = "/lucid-lint/" for GitHub Pages, and mdBook emits <base href="/lucid-lint/"> into 404.html (only there). On localhost that prefix doesn’t exist, so the browser’s preload scanner fired 18 stylesheet/script requests with the wrong prefix before the page recovered via a second fetch. The previous JS workaround in docs/theme/head.hbs rewrote <base> at parse time, but ran after the preload scanner. Fix: just docs-serve now sets MDBOOK_OUTPUT__HTML__SITE_URL=/ for the serve process, so 404.html carries <base href="/"> on localhost and the correct <base href="/lucid-lint/"> in production builds; the JS workaround is removed. | — | 2026-04-23 Block A |
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-reading-prefs-popover | Full reading-preferences popover UI — cog button in the header opens a popover with font radio (Atkinson / Standard / OpenDyslexic), line-spacing slider (1.4–2.0, 0.05 step) and text-size slider (90–130 %, 5 % step). v0.1 ships only the Introduction-page demonstrator; the CSS-variable plumbing (--reading-scale, --reading-line-height, [data-font]) is already in place, so this is UI work only. | 🟡 Later | v0.1 docs /shape + /typeset sessions |
| F-docs-responsive | Responsive / mobile adaptation — right-rail page TOC and header controls collapse gracefully below 700 px; touch targets verified ≥ 44 × 44 px; sidebar drawer behaviour polished. | 🔴 Next | v0.1 docs /layout session, deferred to /adapt |
| F-a11y-audit-sweep | Accessibility audit sweep — full AAA pass on both themes (contrast, focus order, prefers-reduced-motion coverage, keyboard-only walk-through, skip-link), plus a published accessibility statement page. First audit pass ran 2026-04-22 (17/20, 0 P0, 2 P1, 3 P2); findings filed as F35a–F35d below. F-a11y-audit-sweep stays open until the statement page ships and P1s are cleared. | 🟡 In progress | v0.1 docs /audit plan |
| F35a | ✅ Shipped 2026-04-22 — theme/index.hbs is now forked from mdBook v0.5.2’s upstream template (minimal-diff approach, documented so future mdBook upgrades stay a mechanical re-sync). The skip link and EN / FR language switch are emitted as server-rendered HTML inside <body> and inside .right-buttons; both language variants are rendered and CSS in lucid-layout.css hides the wrong-locale copy based on html[lang] (which head.hbs sets synchronously before first paint on /fr/ pages). The previous skipLink() and langSwitch() IIFEs in lucid-navigation.js are gone; the only remaining JS on the skip-link path is a progressive-enhancement smooth-scroll handler. WCAG 2.4.1 Bypass Blocks now passes with JS disabled. Unblocks F26 (stock theme labels can be collapsed at the markup level). | — | F-a11y-audit-sweep audit 2026-04-22 |
| F35b | Drop role="radiogroup"/role="radio" on reading-demo chips (P2 from F-a11y-audit-sweep audit). Current markup declares radiogroup semantics but the JS only binds click — arrow-key traversal is missing, so the ARIA contract is broken. Simpler fix is to switch to plain buttons with aria-pressed (the chips are preset toggles, not radios) rather than add a keyboard handler. Promoted to 🔴 Next on 2026-04-24 (brainstorm-next-cycles). | 🔴 Next | F-a11y-audit-sweep audit 2026-04-22 |
| F35c | ✅ Closed 2026-05-01 as audit false-positive. The 2026-04-22 audit reported that .lucid-stance__idea lost its colour tint under prefers-reduced-motion. Re-audit on 2026-05-01 against docs/theme/css/lucid-layout.css:567-622 and docs/theme/css/lucid-typography.css:424-431: no @media (prefers-reduced-motion: reduce) rule touches .lucid-stance__idea; the global reduced-motion reset zeroes animation-duration / transition-duration only and never overrides background-color. The only rule that strips the tint is @media (forced-colors: active) (line 620–622), which is intentional (Windows High Contrast users get the OS palette, position-based pairing carries the meaning). The original audit appears to have conflated forced-colors: active with prefers-reduced-motion: reduce. No code change needed; accessibility.md known-limitation bullet removed in the same commit. | — | F-a11y-audit-sweep audit 2026-04-22 |
| F35d | Publish an accessibility statement page (docs/src/accessibility.md, FR counterpart at docs/src/fr/accessibility.md). EN page carries the stated bar (WCAG 2.2 AAA), first audit pass result (2026-04-22, 17/20), a “Known limitations” block listing F35a/b/c pending, report route, and audit cadence. FR stub mirrors the limitations block. Shipped 2026-04-22. | 🟢 Shipped | F-a11y-audit-sweep audit 2026-04-22 |
| F-docs-final-polish | Final polish pass — optical alignment, spacing rhythm, edge-state copy, favicon PNG fallback, social-card refinement, re-running /critique to verify the score moves above 30/40. | 🟡 Later | v0.1 docs /polish plan |
| F-terminal-demo-a11y | Terminal-demo accessibility — keep VHS, add motion + transcript fallbacks. Audited VHS (charmbracelet/vhs, active 2026-04-27, headless+CI-reproducible) vs. terminalizer (~16k stars, last commit 2024-08-29, effectively unmaintained). Verdict: keep VHS — .tape files are text-diffable, the build is reproducible, and the motion-handling problem is the same on both tools, so it is not a recorder choice but a wrapping problem. AAA gap to close: every embedded GIF on the docs site (today: docs/src/assets/tty/explain.gif plus future captures) must (1) honour prefers-reduced-motion — browsers do not pause animated GIFs automatically, so a static <picture> source-set with a still PNG fallback served when (prefers-reduced-motion: reduce) is the right shape; (2) carry the per-step transcript inside the page so non-sighted, screen-reader, and reduced-motion readers reach the same content as motion viewers — a stepwise prose block (e.g. <details><summary>Transcript</summary>…</details> with each tape command + its visible output as a list) sitting next to the GIF, plus an alt= summary on the image itself. The .tape source already encodes the steps deterministically — a small generator can emit the transcript from the same file the GIF is built from, keeping motion view and transcript view pair-locked. Phase: v0.3 marketing. | 🟡 Later | 2026-04-27 Block E recon |
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-score-evolution-dashboard | Score evolution dashboard across runs | 🟢 Speculative | Rule 11, inspired by coverage reports |
| F98 | Mutation testing via cargo-mutants. ✅ Baseline shipped 2026-04-25 — dev-tool installed, just mutants <file> recipe added (timeout 60 s, no-shuffle for reproducibility), four-file probe run: sentence_too_long.rs 6 caught / 0 missed / 4 unviable (100 %), scoring.rs 18 / 0 / 2 (100 %), engine.rs 5 / 0 / 12 (100 %), low_lexical_diversity.rs 29 / 47 / 5 (36 %). Canonical reference rule + cross-cutting layer score perfectly; the lexical-diversity rule has two clear test gaps surfaced as F108 + F109. Triage methodology: cluster missed mutants by site → one ROADMAP entry per root cause, not per mutant. | ✅ Done | Stream-2 testing brainstorm, 2026-04-24 |
| F108 | low_lexical_diversity::ratio_at_anchor_min — assert reported ratio in tests. ✅ Shipped 2026-04-25. Added reported_ratio() helper (parses the documented message format) and three new test fixtures: reported_ratio_is_minimum_observed_in_cluster (50 W + 100 cache + 50 V → cluster-exit path with min ratio 0.01 deep mid-slide, not at anchor), flush_path_reports_final_ratio (cache-only doc → flush path), and exactly_window_size_tokens_runs_the_check (boundary on the early-return guard). Ratio assertion uses (ratio - 0.01).abs() < 1e-9 so floating-point shifts from arithmetic mutations are caught. Bonus refactor (typed-ratio field on Diagnostic) deferred — string parsing is fine for the test-only consumer. | ✅ Done | F98 baseline 2026-04-25 |
| F109 | low_lexical_diversity::check — borderline-cluster fixtures. ✅ Shipped 2026-04-25 alongside F108. Added cluster_starts_at_strict_inequality and ratio_exactly_at_threshold_does_not_trigger — the latter uses 49 W + 51 cache so the only full window has unique=50 → ratio exactly 0.50 = min_ratio. With strict < the rule must not trigger; a < → <= flip would emit a diagnostic and fail the test. Combined effect: the rule’s mutation score moved from 36 % (29 / 47 / 5) at F98 baseline to 89 % (68 / 8 / 5). The remaining 8 missed mutants are equivalent under the current rule logic — defensive guards (start_index + window > tokens.len() is unreachable in normal flow because anchor.index ≤ len − window), or initial values the slide loop unconditionally overwrites (let mut best = unique / window is replaced as soon as a lower ratio appears, which it always does in a real cluster). Closing those would require rule refactoring (e.g. starting best at f64::INFINITY to prove the initial computation is dead) — diminishing returns; deferred. | ✅ Done | F98 baseline 2026-04-25 |
| F-proptest-invariants | Property-based tests via proptest (dep already in [dev-dependencies], zero call sites today — paid for, unused). Four invariants in tests/properties.rs, deliberately small: (1) split_sentences never drops a non-whitespace character on round-trip, (2) re-linting an identical string yields identical diagnostics (engine idempotence), (3) for threshold-driven rules, public-profile diagnostics are a superset of dev-doc-profile diagnostics on the same input (profile monotonicity), (4) Engine::lint_str never panics on arbitrary valid UTF-8 ≤ 10KB. Goal: fortify tokenizer / engine seams, not rewrite the suite. | 🟡 Later | Stream-2 testing brainstorm, 2026-04-24 |
| F-llm-fp-miner | LLM false-positive miner via Claude Code. Dev-only audit script (not a test, not a CI gate) that runs lucid-lint across the CC corpus, asks Claude to flag diagnostics that look wrong, writes a triage report to .personal/audits/. Reframed from the original “LLM-as-Judge harness” after Devil’s Advocate surfaced three blockers on the gating form: non-determinism across Claude model versions, ambiguity about whether a disagreement indicts the rule or the judge, cost / wall-clock at 600×N scale. The miner form sheds all three — human triages, Claude suggests. Respects prime directive #4 (deterministic core, no LLM) because it lives entirely outside the library crate and never blocks just check. Wait until v0.3 lucid-lint-nlp plugin work surfaces the need for correctness review at scale. | 🟢 Speculative | Stream-2 testing brainstorm, 2026-04-24 |
| F93 | Tokenizer split_sentences Vec<char> allocation. The helper collects the full input into a Vec<char> per call to support lookbehind (chars[idx-1]) and arbitrary lookahead (chars[idx+1..].find(!ws) for ellipsis-continuation). Nominal waste on real corpus is ~5% of the split_sentences budget (bench shows 35µs total, Vec<char> alloc ~1–2µs). Refactor to a small ring-buffer + Peekable<CharIndices> is feasible but high-churn for low ceiling. Revisit only if profiling pins the tokenizer as a bottleneck. | 🟢 Speculative | Stream-2 code review 2026-04-24 (measured; deferred) |
| F-lucid-stance-unify | Unify rule-page example figures on the .lucid-stance component. Today the intro page uses a custom .lucid-stance figure (Before / After side-by-side, colour-matched ideas, diagnostic in the figcaption), while rule pages use plain H3 + blockquote + fenced text for the diagnostic (see docs/src/rules/sentence-too-long.md). The H3 form works and is cheap to roll out, but wide screens could show stronger Before↔After pairing with the side-by-side figure. Scope: extract .lucid-stance into a reusable component (mdBook include or raw HTML pattern), tune the styling for in-content width (rule pages sit inside the narrower content column, not the landing-page hero), one figure per language, drop the H3 subsections in favour of a data-lang attribute surfaced as a chip on the figure. Ship only after the H3-based rollout has landed across all example-bearing rule pages and the unified pairing is confirmed as the dominant reader complaint. | 🟢 Speculative | 2026-04-23 docs clarity session — H3 subsections landed as the lightweight option; F-lucid-stance-unify parks the heavier unify-the-components path |
| F-fix-mode | --fix mode for the mechanical subset of rules — promoted to 🟡 Later on 2026-04-24 (brainstorm-next-cycles, 0.3 Should). Narrow scope locked: lexicon.all-caps-shouting (lowercase the run), lexicon.redundant-intensifier (drop the intensifier), structure.mixed-numeric-format (normalise to the detected majority style), structure.line-length-wide (rewrap to max_chars). All other rules stay report-only — cognitive-load judgments need the author to choose the rewrite. Borderline structure.heading-jump stays out of the initial cut. Design: per-rule fixable: bool metadata on the Rule trait, --fix flag walks diagnostics in document order applying only those with concrete replacements, writes files in place (or emits a unified diff with --fix=print), exits with count of fixes applied. Conservative default: --fix only touches the explicitly-fixable set, never guesses. | 🟡 Later | 2026-04-23 docs clarity session — framing “lucid-lint reports, you rewrite” surfaced the question |
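
The boundary that F109 pins down (a ratio exactly at `min_ratio` must not trigger) can be illustrated with a small sketch. It is an assumption-laden simplification: the function names are hypothetical, the window scan is the naive O(n × window) form, and it skips the stopword filtering and cluster reporting that the shipped rule performs.

```rust
use std::collections::HashSet;

/// Lowest unique-token ratio over every full window of `window` tokens.
/// Returns `None` when the input is shorter than one window, so an input of
/// exactly `window` tokens still runs the check.
pub fn min_window_ratio(tokens: &[&str], window: usize) -> Option<f64> {
    if window == 0 || tokens.len() < window {
        return None;
    }
    let min = tokens
        .windows(window)
        .map(|w| {
            let unique: HashSet<&str> = w.iter().copied().collect();
            unique.len() as f64 / window as f64
        })
        .fold(f64::INFINITY, f64::min);
    Some(min)
}

/// Strict comparison: only a ratio strictly below `min_ratio` triggers.
/// Flipping `<` to `<=` would fire on the exact-threshold fixture.
pub fn triggers(tokens: &[&str], window: usize, min_ratio: f64) -> bool {
    matches!(min_window_ratio(tokens, window), Some(r) if r < min_ratio)
}
```

With 49 distinct words plus 51 repeats of one token, the single 100-token window holds 50 unique tokens, so the ratio is exactly 0.50 and the sketch stays silent at `min_ratio = 0.50`.
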
File/directory discovery. Distinct from suppression (below): scope control excludes inputs before they are scanned; suppression hides diagnostics after scanning.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F78 | ✅ Shipped in v0.2 — exclude = [...] glob list in [default] of lucid-lint.toml and --exclude <GLOB> CLI flag (comma-delimited, repeatable). Patterns match against paths relative to the walked root; matching directories are pruned, not descended. Explicit file args bypass exclusion. Backed by globset. See docs/src/guide/configuration.md. .lucidignore (gitignore-style file) deferred to F78b if user demand surfaces. | — | Dogfood feedback 2026-04-21 |
| F78b | .lucidignore file (gitignore-style, with negations and nested files). Different crate (ignore) and a larger test matrix than the glob-list MVP. Ship only if users ask — the exclude list in lucid-lint.toml covers the dominant use case. | 🟢 Speculative | F78 deferral, 2026-04-21 |
v0.1 ships the minimal inline-disable directive (see brainstorm
brainstorm/20260419-inline-disable-feature.md). Extensions deferred:
| ID | Item | Priority | Origin |
|---|---|---|---|
| F18 | ✅ Block form shipped in v0.2: <!-- lucid-lint-disable <rule-id> --> … <!-- lucid-lint-enable --> silences one rule across every line in the scope. enable with no argument closes every open scope; with a rule id, closes only that rule’s scope (so overlapping disables for different rules can nest). Unterminated disable extends to end-of-document. See RULES.md → Suppressing diagnostics. | — | v0.1 inline-disable brainstorm |
| F19 | ✅ Shipped in v0.2 — top-level [[ignore]] array-of-tables in lucid-lint.toml, each entry with a required rule_id silences every diagnostic for that rule across Markdown, plain text, and stdin. Unknown ids tolerated. Applied post-engine, pre-scoring, so scoring / rendering / exit-code logic all see the filtered view. Scope broadened from the roadmap’s original “.txt and stdin” wording because a global filter is simpler and more useful; Markdown users can still prefer inline directives for local silencing. reason field tracked as F-suppression-reason-field. See docs/src/guide/configuration.md. | — | v0.1 inline-disable brainstorm |
| F-suppression-reason-field | reason="..." field, optional in v0.1, surfaced in reports and optionally required via config | 🟡 Later | v0.1 inline-disable brainstorm |
| F-suppression-disable-file | File-level directive (disable-file) and multi-rule lists | 🟡 Later | v0.1 inline-disable brainstorm |
| F-severity-floor-flag | --severity-floor=warning CLI flag. Routed 2026-05-02 (.personal/brainstorm/20260502-async-book-pr-timing.md); supports the async-book audit-and-PR play (tracked in .personal/promotion-channels.md). Need: external audit PRs (async-book and adjacent) want a “narrow audit” mode that drops info diagnostics from output and from score impact, so the PR demonstrates value on the unambiguous wins (sentence-too-long, redundant-intensifier, unclear-antecedent, paragraph-too-long) without the contested ones (info-tier weasel words after F-weasel-words-severity-tiering lands). Shape: --severity-floor={info,warning,error} with default info (current behavior). Pairs with F-weasel-words-severity-tiering: once weasel-words emits info on quantifiers, an auditor running --severity-floor=warning ships a PR where reviewers see only the prose changes the tool is most confident about. Implementation is a post-engine filter (mirrors F19 [[ignore]] post-engine pre-scoring shape) so JSON / SARIF / TTY all see the same filtered view; scoring excludes filtered diagnostics so --min-score interacts correctly. Definition of done: CLI flag in src/cli.rs, filter in src/engine.rs post-rule pre-score, two snapshot tests (info-included default vs warning-floor), docs in docs/src/guide/configuration.md + FR mirror with a “running a narrow audit on someone else’s repo” worked example, CHANGELOG entry. | 🔴 Next | F113 audit-and-PR play (2026-05-02) |
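
The block-form scoping rules described under F18 can be sketched as a small state machine. This is an illustration under stated assumptions, not the shipped parser: the `Directive` type and both function names are hypothetical, each directive is assumed to sit alone on its line, and a `disable` without a rule id is simply ignored here.

```rust
use std::collections::BTreeSet;

/// Parsed form of one block directive (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Directive {
    Disable(String),
    Enable(Option<String>),
}

/// Recognise `<!-- lucid-lint-disable <rule-id> -->` and
/// `<!-- lucid-lint-enable [<rule-id>] -->` on a line of their own.
fn parse_directive(line: &str) -> Option<Directive> {
    let inner = line.trim().strip_prefix("<!--")?.strip_suffix("-->")?.trim();
    let mut parts = inner.split_whitespace();
    let keyword = parts.next()?;
    let rule = parts.next().map(str::to_string);
    match keyword {
        "lucid-lint-disable" => rule.map(Directive::Disable),
        "lucid-lint-enable" => Some(Directive::Enable(rule)),
        _ => None,
    }
}

/// For each line of `doc`, the set of rule ids silenced on that line.
pub fn disabled_per_line(doc: &str) -> Vec<BTreeSet<String>> {
    let mut open: BTreeSet<String> = BTreeSet::new();
    let mut out = Vec::new();
    for line in doc.lines() {
        match parse_directive(line) {
            Some(Directive::Disable(rule)) => {
                open.insert(rule);
            }
            // With a rule id, `enable` closes only that rule's scope.
            Some(Directive::Enable(Some(rule))) => {
                open.remove(&rule);
            }
            // With no argument, `enable` closes every open scope.
            Some(Directive::Enable(None)) => open.clear(),
            None => {}
        }
        out.push(open.clone());
    }
    // An unterminated `disable` stays in `open`, so it reaches end-of-document.
    out
}
```

Tracking a set of open rule ids, instead of a stack, is what lets overlapping disables for different rules close independently.
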
TTY-output decoration on the lucid-lint check summary surface.
Distinct from the rule engine (no diagnostic semantics change), from
suppression (which hides diagnostics), and from scoring (which weights
them) — this section covers what the user reads after the diagnostic
list. JSON / SARIF stay structural until a second consumer asks.
| ID | Item | Priority | Origin |
|---|---|---|---|
| F-report-quick-wins | TTY report quick-wins block: actionable hint hooks under the diagnostic list. Routed 2026-05-03 from Block C of .personal/2026-05-03-today.md (originally surfaced in the 2026-05-02 deferred buffer). Dogfood loops on this repo and adjacent docs surfaced the gap: high-density runs already produce a complete summary (score line + per-category breakdown + diagnostic list), but the next action a user can take is buried inside the list. This entry adds a small “quick wins” block rendered after the diagnostic list, TTY only in v0.2.x, with two seed shapes both grounded in observed dogfood patterns: (1) Acronym whitelist hint: when ≥ 3 occurrences of lexicon.unexplained-abbreviation share one token, surface → add "X" to whitelist (N hits suppressed) (one line per acronym, top 3); (2) Single-rule hot-spot hint: when one rule fires ≥ 10 times in one file, surface → <rule-id> dominates this file — see <docs URL> (one line per rule per file). Why a single ROADMAP entry, not two: one shape (a quick-wins reporter), two seed heuristics that share the threshold-based fire rule, the TTY-only render path, and the test pattern; a third heuristic (e.g. category dominance) earns its own line later without new scaffolding. Threshold heuristics (sketch, finalised in PR): acronym hint requires ≥ 3 hits sharing a token; hot-spot hint requires ≥ 10 hits of the same rule in one file. Each hint is one line; the block caps at ≤ 5 lines so it never crowds the score banner. Output surfaces: TTY only in v0.2.x; JSON / SARIF stay structural (separate ROADMAP entry if a CI consumer asks; no current ask). Definition of done: new report::quick_wins module (or extension of the existing TTY renderer) with the two seed hints wired, snapshot tests pinning the fires-vs-silent path for each shape, threshold parameters live next to the hint definition (no central config knob until a second consumer asks), CHANGELOG [Unreleased] entry. Non-breaking: purely additive output below the diagnostic list, no JSON / SARIF schema change, no scoring change, no new CLI flag (existing --quiet already suppresses TTY decoration if a consumer wants the bare list). | 🔴 Next | Block C 2026-05-03 (deferred from 2026-05-02 buffer) |
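
The two seed heuristics can be sketched as follows. Everything here is a hypothetical illustration of the thresholds quoted in the entry: the `Hit` struct stands in for the real diagnostic type, the constant names are invented, and the hot-spot line omits the docs URL because the entry does not fix its form.

```rust
use std::collections::BTreeMap;

// Thresholds from the roadmap entry; constant names are assumptions.
const ACRONYM_MIN_HITS: usize = 3;
const ACRONYM_TOP_N: usize = 3;
const HOT_SPOT_MIN_HITS: usize = 10;
const MAX_LINES: usize = 5;

/// Minimal stand-in for a diagnostic (hypothetical; the real struct differs).
pub struct Hit<'a> {
    pub rule_id: &'a str,
    pub file: &'a str,
    /// Offending token when the rule reports one (for example an acronym).
    pub token: Option<&'a str>,
}

/// Build the quick-wins lines for one run, capped at `MAX_LINES`.
pub fn quick_wins(hits: &[Hit]) -> Vec<String> {
    let mut acronyms: BTreeMap<&str, usize> = BTreeMap::new();
    let mut per_file_rule: BTreeMap<(&str, &str), usize> = BTreeMap::new();
    for h in hits {
        if h.rule_id == "lexicon.unexplained-abbreviation" {
            if let Some(tok) = h.token {
                *acronyms.entry(tok).or_insert(0) += 1;
            }
        }
        *per_file_rule.entry((h.file, h.rule_id)).or_insert(0) += 1;
    }

    let mut lines = Vec::new();

    // Acronym hint: most frequent first, ties broken alphabetically, top 3.
    let mut top: Vec<(&str, usize)> = acronyms
        .into_iter()
        .filter(|&(_, n)| n >= ACRONYM_MIN_HITS)
        .collect();
    top.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    for (tok, n) in top.into_iter().take(ACRONYM_TOP_N) {
        lines.push(format!("→ add \"{tok}\" to whitelist ({n} hits suppressed)"));
    }

    // Hot-spot hint: one line per rule per file at or above the threshold.
    for ((file, rule), n) in per_file_rule {
        if n >= HOT_SPOT_MIN_HITS {
            lines.push(format!("→ {rule} dominates this file ({file})"));
        }
    }

    lines.truncate(MAX_LINES);
    lines
}
```

Keeping the thresholds as constants beside the hint logic mirrors the entry's "no central config knob" constraint.
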
Shipped in the tag: all 17 rules across 5 phases, the minimal inline-disable directive, and the mdBook documentation site (Lucid light / Lucid dark themes, Atkinson Hyperlegible Next / Literata / Commit Mono / OpenDyslexic typography layer, reading-preferences demonstrator, accessibility page, EN/FR header switch with v0.2 FR-stub). See CHANGELOG.md for the full release notes.
| Status | Rule | Notes |
|---|---|---|
| ✅ | structure.paragraph-too-long | Sentence-count + word-count thresholds per profile (src/rules/paragraph_too_long.rs) |
| ✅ | structure.deeply-nested-lists | Flags list items nested beyond profile depth (src/rules/deeply_nested_lists.rs) |
| ✅ | structure.heading-jump | Walks section depths, flags jumps > +1 level (src/rules/heading_jump.rs) |
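
As an illustration of the "jumps > +1 level" check in structure.heading-jump, here is a minimal sketch. It assumes the heading levels have already been extracted in document order, the function name is hypothetical, and it only compares consecutive headings; how the shipped rule treats the first heading of a document is not covered.

```rust
/// Return the 0-based positions of headings that go more than one level
/// deeper than the heading before them. Moving back up any number of
/// levels is never flagged.
pub fn heading_jumps(levels: &[u8]) -> Vec<usize> {
    levels
        .windows(2)
        .enumerate()
        .filter(|(_, pair)| pair[1] > pair[0].saturating_add(1))
        .map(|(i, _)| i + 1) // index of the offending (second) heading
        .collect()
}
```

For the sequence H1, H2, H4, H2, H3, H5 the sketch reports the third and sixth headings, since H2 to H4 and H3 to H5 each skip a level.
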
| Status | Rule | Notes |
|---|---|---|
| ✅ | structure.sentence-too-long | Reference implementation: template for the 16 others (src/rules/sentence_too_long.rs) |
| ✅ | structure.excessive-commas | Per-profile comma-per-sentence threshold (src/rules/excessive_commas.rs) |
| ✅ | rhythm.consecutive-long-sentences | Intra-paragraph streak of long sentences (src/rules/consecutive_long_sentences.rs) |
| Status | Rule | Notes |
|---|---|---|
| ✅ | lexicon.weasel-words | Per-language phrase list, word-boundary match (src/rules/weasel_words.rs) |
| ✅ | lexicon.unexplained-abbreviation | Pattern-based (v0.1); definition-awareness tracked as F9 (src/rules/unexplained_abbreviation.rs) |
| ✅ | lexicon.jargon-undefined | Pattern-based, profile-activated category lists (src/rules/jargon_undefined.rs) |
| ✅ | lexicon.excessive-nominalization | Per-sentence suffix-based density check (src/rules/excessive_nominalization.rs) |
| ✅ | rhythm.repetitive-connectors | Sliding-window connector frequency, one diagnostic per cluster (src/rules/repetitive_connectors.rs) |
| Status | Rule | Notes |
|---|---|---|
| ✅ | readability.score | Per-document Flesch-Kincaid grade; info under threshold, warning above (src/rules/readability_score.rs) |
| Status | Rule | Notes |
|---|---|---|
| ✅ | structure.long-enumeration | Shared enumeration detector with structure.excessive-commas; suggests list conversion (src/rules/long_enumeration.rs, src/rules/enumeration.rs) |
| ✅ | structure.deep-subordination | Counts subordinators between strong-punct breaks; skips pronoun enumerations (src/rules/deep_subordination.rs) |
| ✅ | syntax.passive-voice | Heuristic be/être+past-participle detector; POS-based detection remains a lucid-lint-nlp plugin candidate (src/rules/passive_voice.rs) |
| ✅ | syntax.unclear-antecedent | Info-level heuristic: bare demonstrative + verb, or paragraph-start personal pronoun (src/rules/unclear_antecedent.rs) |
| ✅ | lexicon.low-lexical-diversity | Sliding-window TTR over non-stopword content tokens (src/rules/low_lexical_diversity.rs) |
| Status | Feature | Notes |
|---|---|---|
| ✅ | Minimal inline-disable | <!-- lucid-lint disable-next-line <rule-id> --> for Markdown inputs, single rule id, optional reason. See RULES.md → Suppressing diagnostics. Block form, config ignores, file-level scope and required reason= are tracked as F18–F-suppression-disable-file below. |
| ✅ | Accessibility page in the docs | docs/src/accessibility.md covers the WCAG 2.2 AAA bar, the reading-preferences control, typography credits (Atkinson Hyperlegible Next — Braille Institute; OpenDyslexic — Abelardo Gonzalez; Literata — TypeTogether), keyboard shortcuts, and how the site dogfoods the project’s mission. Linked from the sidebar and the footer. |
Decided: v0.1 diagnostics carry only what cannot be trivially recomputed.
```rust
pub struct Diagnostic {
    pub rule_id: String,
    pub severity: Severity,
    pub location: Location,
    /// H2 (or configured level) containing the diagnostic.
    pub section: Option<String>,
    pub message: String,
}
```
- Kept: section is stored at emission. Recomputing it afterwards would require re-parsing the Markdown to walk headings and match locations, which is expensive. Storing it is cheap.
- Omitted: category is a pure function of rule_id. A category_of(rule_id) -> Category utility derives it in O(1), so diagnostics carry no duplicate.
- Omitted: weight and suggestion are not used in v0.1. They will be introduced when the hybrid scoring model (F14) lands.
This aligns with the “open to change, not abstracted for change” principle applied earlier to format handling: struct fields can be added later without breaking JSON serialization compatibility.
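
The "category is a pure function of rule_id" point can be sketched for the prefixed ids adopted in v0.2 (for example structure.excessive-commas). This is an assumption-based illustration: the enum variants follow the five categories named in this roadmap, and returning `Option` for unknown prefixes is a choice made for the sketch, not a statement about the real `Category::for_rule` signature.

```rust
/// The five rule categories named in the roadmap.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Structure,
    Rhythm,
    Lexicon,
    Syntax,
    Readability,
}

/// Derive the category from the id prefix (the part before the first dot).
/// Returns `None` for ids without a known prefix (sketch-only behaviour).
pub fn category_of(rule_id: &str) -> Option<Category> {
    let (prefix, _name) = rule_id.split_once('.')?;
    match prefix {
        "structure" => Some(Category::Structure),
        "rhythm" => Some(Category::Rhythm),
        "lexicon" => Some(Category::Lexicon),
        "syntax" => Some(Category::Syntax),
        "readability" => Some(Category::Readability),
        _ => None,
    }
}
```

Because the category is read off the id itself, there is no per-rule table to keep in sync, and the `Diagnostic` struct does not need a `category` field.
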
A number of configuration and ergonomics questions were raised but postponed. They will be addressed before or during v0.2:
- … (dev-doc, public, falc confirmed)
- … (LL001 vs. name-only)
- … (# lucid-lint disable-next, block disable/enable, ignore file)
- … (whatlang)
- rayon for multi-file processing
- .lucidignore (now tracked as F78)
- lucid-lint-core for third-party integration

Future rules and plugins can be proposed by the community. The default jargon and stoplists (lexicon.jargon-undefined, lexicon.weasel-words, lexicon.low-lexical-diversity) are especially welcome targets for community pull requests to expand coverage across domains and languages.
lucid-lint is a cognitive-accessibility tool. The docs site you are
reading is its first proof of concept. If the site itself is not
comfortable to read for the audiences the project claims to serve, the
pitch does not hold.
This page lists the bar, the controls, and the credits.
WCAG (Web Content Accessibility Guidelines) is the international standard for web accessibility. It defines three conformance levels; AAA (the strictest) is the ceiling, not the floor.
The stated bar for this site is WCAG 2.2 Level AAA. In practice:
- prefers-reduced-motion: reduce absolutely (no decorative animation, no parallax, no auto-playing content).
Both themes (Lucid light and Lucid dark) clear AAA for body text (14:1 and above) and inline links (7.4:1 and above).
Where AAA is impractical — for example contrast on a third-party
embed — the exception is documented in
.impeccable.md.
The first audit pass (2026-04-22) scored 17 / 20 against the AAA bar: 0 blockers, 2 P1 items, 3 P2, 2 P3. Each open item below has a roadmap ticket; fixes land in subsequent v0.2.x slices.
- lucid-navigation.js at end-of-body. Users with JS disabled, or readers on the pre-paint frame, do not see them. WCAG 2.4.1 (Bypass Blocks) asks for the skip link without JS. A theme/index.hbs override that server-renders both is tracked as F35a.
- <html lang="en"> at build time because mdBook supports a single book-wide language. A small script corrects lang="fr" on load; screen readers that respect dynamic changes pick it up. Proper per-locale builds land with the full French mirror in F25.
A small set of controls tunes the site to your own reading profile. Selections persist across visits via localStorage.
Three choices, picked from the Introduction page demonstrator or from the reading-preferences popover (on the way — see the roadmap).
| Option | When it helps |
|---|---|
| Atkinson Hyperlegible Next (default) | A humanist sans built by the Braille Institute for maximum character differentiation. Reads well for most readers and especially for readers with low vision or reading-speed fatigue. Every surface on the site uses it by default. |
| Standard | The same Atkinson for body prose, paired with Literata serif for headings — a traditional bookish pairing for readers who prefer serif display contrast. |
| OpenDyslexic | A typeface whose letters are weighted at the bottom to reduce swapping and rotating. Preferred by some dyslexic readers; not universally helpful. |
Line height is adjustable from 1.4 to 2.0 in 0.05 steps. The default is 1.7; the research range for low-fatigue reading sits between 1.6 and 1.8.
Text size is adjustable from 90 % to 130 % in 5 % steps. Browser zoom is honoured in addition.
The site inherits mdBook’s keyboard map:
| Key | Action |
|---|---|
| / or s | Focus the search box |
| ← | Previous chapter |
| → | Next chapter |
| Escape | Close the search or theme popover |
| Tab | Follow the focus order. The first focusable element is always the Skip to main content link. |
Every font on the site is self-hosted under
docs/src/_fonts/.
All four ship under the SIL Open Font License 1.1, issued by the Summer Institute of Linguistics.
- rn vs m, I vs l vs 1).
The prose on this site is linted by lucid-lint itself at the public profile, via just dogfood. A page cannot regress below the bar the tool sets for its users without the build failing.
If something on this site is harder to use than it should be, open
an issue on
GitHub
with the accessibility label. Reports are triaged against the
v0.2 milestone unless they block a release. If an email route suits
you better, write to the maintainer listed in
CONTRIBUTING.md.
Audit cadence: a full AAA sweep runs at least once per minor release (v0.1, v0.2, …). The last pass was 2026-04-22. Findings and their status live in the roadmap under the F35 family.
Academic, normative, and practical sources that inform the design of
lucid-lint.
This page lists the references that shaped lucid-lint’s rules, profiles, and design decisions. Each entry states where the reference matters in the project. The French mirror lives at fr/references.md.
External links open in a new tab; we mark them rel="nofollow noopener noreferrer" so the new-tab is safe and the docs site does not vouch for outside content.
| Status | Meaning |
|---|---|
| ✅ | Verified — canonical reference |
| ⚠️ | To verify — likely correct, confirm citation details |
| 🔍 | Opportunistic — sound rationale, citation may be looser |
| 📖 | Book / secondary source |
| 🌐 | Normative standard |
| 🧪 | Practical source (style guide, tool) |
The theoretical core of lucid-lint: prose imposes a mental cost on the reader, and this cost can be measured and reduced.
✅ Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285. ↗
Foundational paper. Distinguishes intrinsic, extraneous, and germane load. Justifies the core premise that poor prose imposes extraneous load that can be reduced through better structure.
→ Relevant to: most rules, especially structure.*, rhythm.*, syntax.nested-negation, syntax.conditional-stacking.
📖 Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive Load Theory. Springer. ↗
Modern synthesis of 30 years of research.
✅ Graesser, A. C., McNamara, D. S., Louwerse, M. M., & Cai, Z. (2004). Coh-Metrix: Analysis of text on cohesion and language. Behavior Research Methods, Instruments, & Computers, 36(2), 193–202. ↗
The reference paper for automated cohesion analysis. Over 200 linguistic indices measuring local and global cohesion. Our rules are simplified, deterministic versions of several Coh-Metrix metrics.
→ Relevant to: rhythm.repetitive-connectors, syntax.unclear-antecedent, lexicon.low-lexical-diversity.
📖 McNamara, D. S., Graesser, A. C., McCarthy, P. M., & Cai, Z. (2014). Automated evaluation of text and discourse with Coh-Metrix. Cambridge University Press. ↗
✅ Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68(1), 1–76. ↗
Foundational paper on Dependency Locality Theory. Formalizes the cost of holding distant grammatical referents in working memory.
→ Relevant to: structure.deep-subordination, syntax.unclear-antecedent, syntax.conditional-stacking.
✅ Sanders, T. J. M., & Noordman, L. G. M. (2000). The role of coherence relations and their linguistic markers in text processing. Discourse Processes, 29(1), 37–60. ↗
Central reference on how logical connectors guide or confuse readers.
→ Relevant to: rhythm.repetitive-connectors.
✅ Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221–233. ↗
Original paper for the Flesch Reading Ease formula.
✅ Kincaid, J. P., Fishburne, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas for Navy enlisted personnel. Technical Report, Naval Technical Training Command. ↗
Origin of the Flesch-Kincaid Grade Level formula used in v0.1.
→ Relevant to: readability.score.
📖 McLaughlin, G. H. (1969). SMOG grading: A new readability formula. Journal of Reading, 12(8), 639–646. ↗
Alternative readability formula. Candidate for v0.2.
📖 Herdan, G. (1960). Type-Token Mathematics: A Textbook of Mathematical Linguistics.
Origin of the Type-Token Ratio used in lexical diversity analysis.
→ Relevant to: lexicon.low-lexical-diversity.
✅ McCarthy, P. M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–392. ↗
✅ Clark, H. H., & Chase, W. G. (1972). On the process of comparing sentences against pictures. Cognitive Psychology, 3(3), 472–517. ↗
Classic experimental work showing that negative sentences take longer to verify than affirmative ones. Foundational evidence that negation carries a comprehension cost.
→ Relevant to: syntax.nested-negation.
✅ Carpenter, P. A., & Just, M. A. (1975). Sentence comprehension: A psycholinguistic processing model of verification. Psychological Review, 82(1), 45–73. ↗
Extends Clark & Chase with a formal model of sentence processing. Stacked negations compound the verification cost.
→ Relevant to: syntax.nested-negation.
🔍 Kaup, B., Lüdtke, J., & Zwaan, R. A. (2006). Processing negated sentences with contradictory predicates: Is a door that is not open mentally closed? Journal of Pragmatics, 38(7), 1033–1050. ↗
Modern reference on negation processing. Useful if you want to go deeper.
🔍 Johnson-Laird, P. N., & Byrne, R. M. J. (1991). Deduction. Psychology Press. ↗
Mental models theory of conditional reasoning. Stacked conditionals multiply the number of mental models the reader must maintain.
→ Relevant to: syntax.conditional-stacking.
🔍 Evans, J. St. B. T., & Over, D. E. (2004). If. Oxford University Press. ↗
Comprehensive review of the psychology of conditionals. More accessible than Johnson-Laird for non-specialists.
🔍 Caveat: the link between chained conditionals and reader cognitive load is intuitive and well-supported by the broader reasoning literature, but the specific rule “more than N conditionals per sentence is harmful” is a practitioner heuristic, not a directly tested threshold. Treat the threshold as configurable and empirically calibrated.
🔍 Arditi, A., & Cho, J. (2007). Letter case and text legibility in normal and low vision. Vision Research, 47(19), 2499–2505. ↗
Empirical evidence on the reading-speed cost of all-caps text: readers lose the word-shape cues that mixed-case ascenders and descenders provide.
→ Relevant to: lexicon.all-caps-shouting.
🧪 Nielsen, J. (Nielsen Norman Group). Multiple articles on all-caps readability in user interfaces.
Industry-standard reference on why ALL-CAPS text reduces reading speed.
→ Relevant to: lexicon.all-caps-shouting.
📖 Bringhurst, R. (2013). The Elements of Typographic Style (4th ed.). Hartley & Marks.
Canonical reference on typography. Supports the principle that uniform-height text (all-caps) slows reading compared to mixed-case.
✅ Legge, G. E., & Bigelow, C. A. (2011). Does print size matter for reading? A review of findings from vision science and typography. Journal of Vision, 11(5). ↗
Review of vision-science evidence on reading. Covers line-length effects among other factors.
→ Relevant to: structure.line-length-wide.
🔍 Seidenberg, M. S., Waters, G. S., Barnes, M. A., & Tanenhaus, M. K. (1984). When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning and Verbal Behavior, 23(3), 383–404. ↗
Classic work showing that unusual letter patterns slow word recognition.
🔍 Treiman, R., Kessler, B., Zevin, J. D., Bick, S., & Davis, M. (2006). Influence of consonantal context on the reading of vowels: Evidence from children. Journal of Experimental Child Psychology, 93(1), 1–24. ↗
Work showing that consonant clusters and their context affect reading accuracy and speed.
🔍 Caveat: the lexicon.consonant-cluster rule is grounded in the broader literature on word-form complexity, but a specific validated threshold like “4+ consonants in a row is harmful” does not come from a single canonical paper. The rule is a practitioner heuristic informed by the literature, not a direct transposition of a published metric.
🔍 Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A Comprehensive Grammar of the English Language. Longman.
Classical grammar reference classifying intensifiers as “amplifiers” whose semantic contribution is often marginal. Justifies flagging them as low-value words.
→ Relevant to: lexicon.redundant-intensifier.
🧪 Zinsser, W. (2006). On Writing Well (30th anniversary ed.). HarperCollins.
Practical guide that famously argues against adverb intensifiers (“very”, “really”, “quite”) as clutter. Not academic, but widely cited in writing pedagogy.
📖🧪 Strunk, W., & White, E. B. (1999). The Elements of Style (4th ed.). Longman.
The canonical English writing guide. Codifies active voice, concision, clear pronouns, and warns against qualifiers like weasel words and intensifiers.
→ Relevant to: syntax.passive-voice, lexicon.weasel-words, lexicon.redundant-intensifier, syntax.unclear-antecedent.
🧪 US Plain Language Action and Information Network (2011). Federal Plain Language Guidelines. ↗
Grounds short sentences, active voice, no nominalization, no jargon.
→ Relevant to: structure.sentence-too-long, structure.paragraph-too-long, lexicon.excessive-nominalization, lexicon.jargon-undefined, syntax.passive-voice.
🧪 European Commission (2011). How to write clearly. Publications Office of the European Union. ↗
European plain-language equivalent in all EU languages.
🌐 International Organization for Standardization (2022). ISO 80000-1:2022 — Quantities and units — Part 1: General. ↗
International standard on numeric formatting, including digit grouping and decimal separators. Grounds the idea that mixing formats within a single text impairs scanning.
→ Relevant to: structure.mixed-numeric-format.
🧪 The Chicago Manual of Style (17th ed., 2017). University of Chicago Press. ↗
Canonical style guide covering when to spell numbers out vs. use digits, and why consistency matters.
→ Relevant to: structure.mixed-numeric-format.
⚠️ Martinussen, R., Hayden, J., Hogg-Johnson, S., & Tannock, R. (2005). A meta-analysis of working memory impairments in children with attention-deficit/hyperactivity disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 44(4), 377–384. ↗
⚠️ Caveat: direct research on “text readability for ADHD readers” is dispersed and of variable quality. The cognitive accessibility angle is sound, but treat specific ADHD claims carefully.
📖 Barkley, R. A. (2012). Executive Functions: What They Are, How They Work, and Why They Evolved. The Guilford Press. ↗
✅ Rello, L., & Baeza-Yates, R. (2013). Good fonts for dyslexia. Proceedings of ASSETS ’13. ↗
Empirical research on font choice impact for dyslexic readers.
✅ Brysbaert, M., Warriner, A. B., & Kuperman, V. (2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3), 904–911. ↗
→ Relevant to: possible future rule “abstractness density” (not in v0.1).
🌐 W3C (2018). Web Content Accessibility Guidelines (WCAG) 2.1. ↗
Key criteria invoked:
- structure.heading-jump
- structure.line-length-wide
- structure.heading-jump
- lexicon.jargon-undefined
- lexicon.unexplained-abbreviation
- readability.score
⚠️ Verify exact criterion numbers against the WCAG version you want to cite (2.1 or 2.2).
🌐 Accessibility Standards Canada (2025). CAN-ASC-3.1:2025 — Plain Language (first edition). ↗
First-edition Canadian national standard on plain language, published bilingually by Accessibility Standards Canada under the Accessible Canada Act. Prescriptive (shall / should / may) requirements over five areas: audience identification, evaluation methods, structure, wording, design. Grounds many of our lexicon.*, structure.*, and readability.score defaults independently of the US / EU plain-language canons.
→ Relevant to: lexicon.jargon-undefined, lexicon.unexplained-abbreviation, lexicon.weasel-words, structure.sentence-too-long, structure.paragraph-too-long, syntax.passive-voice, readability.score.
🌐 Directive (EU) 2019/882 of the European Parliament and of the Council of 17 April 2019 — European Accessibility Act (EAA). ↗
Legal framework extending accessibility requirements to private-sector services from 28 June 2025.
| Rule | Primary references |
|---|---|
| readability.score | Flesch (1948); Kincaid et al. (1975); Henry (1975); Kandel & Moles (1958); CAN-ASC-3.1:2025 |
| Rule | Primary references |
|---|---|
| rhythm.consecutive-long-sentences | Sweller (1988); Sweller et al. (2011) |
| rhythm.repetitive-connectors | Sanders & Noordman (2000); Graesser et al. (2004) |
| Rule | Primary references |
|---|---|
| structure.deep-subordination | Gibson (1998); FALC |
| structure.deeply-nested-lists | WCAG 2.1; cognitive load heuristics |
| structure.excessive-commas | Gibson (1998); 🔍 practitioner heuristic |
| structure.heading-jump | WCAG 1.3.1 & 2.4.6; RGAA 9.1 |
| structure.line-length-wide | WCAG 1.4.8 (AAA); Legge & Bigelow (2011) |
| structure.long-enumeration | FALC; Plain Language US |
| structure.mixed-numeric-format | ISO 80000-1; Chicago Manual of Style |
| structure.paragraph-too-long | Sweller (1988); Graesser et al. (2004); CAN-ASC-3.1:2025 |
| structure.sentence-too-long | Sweller (1988); Plain Language US; FALC; CAN-ASC-3.1:2025 |
| Rule | Primary references |
|---|---|
| syntax.conditional-stacking | Johnson-Laird & Byrne (1991); Evans & Over (2004); Gibson (1998); 🔍 threshold is practitioner heuristic |
| syntax.dense-punctuation-burst | Sweller (1988); Gibson (1998); 🔍 purely heuristic |
| syntax.nested-negation | Clark & Chase (1972); Carpenter & Just (1975); Kaup et al. (2006) |
| syntax.passive-voice | Strunk & White; Plain Language US; FALC; CAN-ASC-3.1:2025 |
| syntax.unclear-antecedent | Strunk & White; Gibson (1998); Graesser et al. (2004) |
lucid-lint is an engineering project informed by research, not a research project itself. The references above ground our design choices but we do not claim to validate new findings. Several rules (lexicon.consonant-cluster, syntax.conditional-stacking, syntax.dense-punctuation-burst, structure.excessive-commas) are practitioner heuristics informed by the literature rather than direct transpositions of published metrics — we mark these with 🔍 in the summary table.
Where we simplify an academic metric (e.g., syntax.unclear-antecedent as a pattern heuristic vs. full anaphora resolution), we document the simplification in RULES.md and plan richer versions in the roadmap.
If you are a researcher and spot an error, an outdated citation, or a misattribution, please open an issue — we will correct it promptly and credit you.
See CONTRIBUTING.md for the full contribution guide.
- just check locally.
Built for readers whose attention is stretched: ADHD, dyslexia, fatigue, a second language, or an accessibility-sensitive context.
lucid-lint reads your Markdown or plain text and flags the moments that
make prose hard to process. It does not rewrite your voice. It hands you a
short list and gets out of the way.
Before
The caching subsystem, which was introduced in an earlier milestone, turned out to interact poorly with the new request pipeline under sustained load, and the investigation that followed required multiple rounds of profiling.
After
The caching subsystem was introduced earlier. It interacts poorly with the new request pipeline under sustained load. The investigation required several rounds of profiling.
lucid-lint flagged sentence-too-long
(43 words) and consecutive-long-sentences. It did not
propose the rewrite: that's yours.
Most prose tools measure style (write-good), grammar (Antidote), or a
surface readability score (Flesch). lucid-lint measures cognitive
load: the mental effort a reader spends to understand a sentence. It
flags the patterns that the research behind Sweller, Gibson, Graesser,
and Coh-Metrix single out.
- dev-doc, public, or falc (Facile À Lire et à Comprendre, Easy-to-Read), then override per rule if you want.
lucid-lint is at v0.2 (released 2026-04-22). All 25 rules listed in
RULES.md
are shipped (17 from v0.1, 8 added during the v0.2 cycle), alongside
the hybrid scoring model: a global X / max score plus five per-category
sub-scores, computed on top of the diagnostics. Pre-1.0: breaking changes
remain possible between minor versions. See the roadmap for what
comes next.
A file with no diagnostics earns the full 100/100 and the wordmark banner, the high point of a passing lint run:

~~~~~ ⟨ • ⟩ ───── lucid-lint v0.2.0
cognitive accessibility linter · prose · EN / FR
────────────────────────────────────────────────
No issues found.
────────────────────────────────────────────────────────────
score: 100/100
structure █████ 20/20
rhythm █████ 20/20
lexicon █████ 20/20
syntax █████ 20/20
readability █████ 20/20
cargo install lucid-lint
# Lint a file
lucid-lint check README.md
# Strictest profile (FALC)
lucid-lint check --profile=falc docs/
# Standard input
echo "This is a test sentence." | lucid-lint check -
# JSON for CI
lucid-lint check --format=json docs/
# Fail the build if the overall score drops below 85/100 (v0.2+)
lucid-lint check --min-score=85 docs/
The whole site is designed as a reading companion. Pick the typeface that suits you best; it is remembered across pages.
Atkinson Hyperlegible Next
A dense paragraph can ask a lot of a stretched mind. Every comma, every clause, every bracketed aside adds a little cost. Good prose keeps that cost low.
Line height and text size will arrive soon as sliders. Until then, pick a typeface; browser zoom is honoured.
Dual-licensed under MIT or Apache-2.0, at your option.
lucid-lint offers four installation routes. Pick the one that matches your environment.
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.sh | sh
The script is generated by cargo-dist for each published release. It detects your platform, downloads the matching pre-built binary from the GitHub release, and places it on $PATH (by default $CARGO_HOME/bin if set, otherwise ~/.cargo/bin).
curl … | sh is fast but opaque. To read the script before running it:
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.sh -o install.sh
less install.sh
sh install.sh
The script is short: fewer than 200 lines of POSIX shell, so a quick read is realistic. It pins the release it was generated for. It checks the expected size of the downloaded archive. It exits with an error if a value differs.
latest points to the most recent release. To pin a known, stable version:
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/download/v0.2.2/lucid-lint-installer.sh | sh
powershell -ExecutionPolicy Bypass -c "irm https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.ps1 | iex"
Same cargo-dist mechanism, PowerShell flavour. The binary lands in %CARGO_HOME%\bin if CARGO_HOME is set, otherwise in %USERPROFILE%\.cargo\bin.
To audit before running, save the script and inspect it:
irm https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.ps1 -OutFile install.ps1
notepad install.ps1
.\install.ps1
cargo install lucid-lint
This route compiles from the sources published on crates.io. It places the binary in your Cargo bin directory (by default ~/.cargo/bin/). It is slower than the pre-built installer, but useful when the pre-built targets do not cover your platform.
git clone https://github.com/bastien-gallay/lucid-lint
cd lucid-lint
cargo install --path .
Each release publishes pre-built binaries for:
- Linux (x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl)
- macOS (aarch64-apple-darwin, x86_64-apple-darwin)
- Windows (x86_64-pc-windows-msvc)
The shell and PowerShell installers above pick the correct archive automatically. To install by hand, download from the GitHub releases page and place the extracted binary on $PATH.
lucid-lint --version
- cargo install).
This page walks through linting your first document.
lucid-lint check README.md
Output:
warning <path>/README.md:14:1 Sentence is 27 words long (maximum 22). Consider splitting it into shorter sentences. [structure.sentence-too-long]
summary: 1 warnings.
→ run 'lucid-lint explain <rule-id>' — seen here: structure.sentence-too-long
────────────────────────────────────────────────────────────
score: 88/100
structure ██▏░░ 8/20
rhythm █████ 20/20
lexicon █████ 20/20
syntax █████ 20/20
readability █████ 20/20
The final block is the score summary. It shows an overall X / 100 score, then the per-category breakdown.
lucid-lint check docs/*.md CHANGELOG.md
lucid-lint check docs/
Every file with a .md, .markdown, or .txt extension is processed.
echo "This is a test sentence." | lucid-lint check -
For formats that lucid-lint cannot yet read natively:
pandoc report.docx -t markdown | lucid-lint check -
# Strictest: Facile À Lire et à Comprendre (Easy-to-Read)
lucid-lint check --profile=falc docs/
# Most lenient: developer documentation
lucid-lint check --profile=dev-doc docs/
See Profiles for details.
# JSON for continuous integration
lucid-lint check --format=json docs/
See Continuous integration for CI recipes.
| Code | Meaning |
|---|---|
| 0 | No issues (or only info diagnostics) and score at or above --min-score (if set) |
| 1 | Warnings found, or score below --min-score |
| 2 | Runtime error (invalid arguments, unreadable file) |
The two gates combine. See Continuous integration for combined recipes.
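The exit-code table can be read as a small decision function. The sketch below is an illustration only: RunOutcome, exit_code, and the use of a String for runtime errors are hypothetical names and types, not lucid-lint's internals. It assumes a score exactly equal to --min-score passes, since the table only fails scores below it.

```rust
/// Outcome of a lint run, reduced to what the exit code depends on.
/// (Hypothetical type for illustration.)
pub struct RunOutcome {
    /// Number of warning-level diagnostics (info diagnostics are not counted).
    pub warnings: usize,
    /// Overall score of the run.
    pub score: u32,
}

/// Maps a run to the documented exit codes:
/// 0 = clean, 1 = warnings or score below the floor, 2 = runtime error.
pub fn exit_code(outcome: Result<RunOutcome, String>, min_score: Option<u32>) -> i32 {
    match outcome {
        // Invalid arguments, unreadable file, and similar failures.
        Err(_) => 2,
        Ok(run) => {
            // The score gate only applies when a floor was requested.
            let below_floor = min_score.map_or(false, |floor| run.score < floor);
            if run.warnings > 0 || below_floor {
                1
            } else {
                0
            }
        }
    }
}
```

Either gate alone is enough to fail the run, which is what "the two gates combine" means here.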
A profile is a pre-configured set of rule thresholds, tuned for a specific audience.
dev-doc
For technical documentation, API references, ADRs, and developer-facing content.
Thresholds are lenient. Technical readers tolerate long sentences, nominalizations, and domain jargon better.
public (default)
For general-audience content: marketing pages, product descriptions, blog posts.
Thresholds are moderate. Plain-language principles apply.
falc
For content that follows the Facile À Lire et à Comprendre / European Easy-to-Read standard.
Thresholds are strict: short sentences, simple vocabulary, no passive voice, no undefined acronyms.
Start with the profile that matches the intent of the content. Override individual rules as needed via lucid-lint.toml.
See the rules reference for the exact thresholds per rule and per profile.
The general pattern:
- dev-doc: 30 words per sentence, 4 commas, 7 sentences per paragraph
- public: 22 words per sentence, 3 commas, 5 sentences per paragraph
- falc: 15 words per sentence, 2 commas, 3 sentences per paragraph
The same file linted three times under dev-doc, public, then falc; the score drops as the profile tightens:

Any per-rule threshold set in lucid-lint.toml takes precedence over the profile preset.
[default]
profile = "public"
[rules.sentence-too-long]
max_words = 18 # stricter than public's 22
A condition tag describes the cognitive condition that a rule primarily targets. Conditions are orthogonal to profiles: a profile (dev-doc, public, falc) sets the strictness of the always-on rules; conditions add rules targeted at a specific audience.
| Tag | Target |
|---|---|
| general | Always-on rules. The v0.2 baseline. |
| a11y-markup | Markup signals close to the prose (for example, all-caps shouting). |
| dyslexia | Signals targeting dyslexia. Source: BDA Dyslexia Style Guide. |
| dyscalculia | Number formatting and anchor points. Source: CDC Clear Communication Index. |
| aphasia | Signals targeting aphasia. Source: FALC, plain-language guides. |
| adhd | Signals tied to fragile attention. |
| non-native | Signals for non-native readers (rare words, figurative expressions). |
The set is closed. Adding a tag is a deliberate, versioned decision.
For each rule, the engine evaluates:
- A rule tagged general is always active.
- A rule without general runs only if one of its tags appears in the active-conditions list of the person running the tool.
The 17 rules in v0.2 all carry general, so the default behaviour does not change. Future tagged rules (for example lexicon.all-caps-shouting for a11y-markup, syntax.nested-negation for aphasia + adhd) are enabled through this list.
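The activation test described above fits in a few lines. This is a minimal sketch under stated assumptions: is_active is a hypothetical helper, and tags are modelled as plain string slices rather than whatever type the engine actually uses.

```rust
/// Decides whether a rule runs, given its condition tags and the
/// conditions the user activated (via `conditions = [...]` or `--conditions`).
/// (Hypothetical helper for illustration.)
pub fn is_active(rule_tags: &[&str], active_conditions: &[&str]) -> bool {
    // A rule tagged `general` is always on, whatever the user selected.
    if rule_tags.contains(&"general") {
        return true;
    }
    // Otherwise the rule needs at least one tag in the active list.
    rule_tags.iter().any(|tag| active_conditions.contains(tag))
}
```

With conditions = ["dyslexia", "aphasia"], a rule tagged aphasia + adhd would run under this sketch, while a rule tagged only a11y-markup would stay off.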
In lucid-lint.toml:
[default]
profile = "falc"
conditions = ["dyslexia", "aphasia"]
On the command line (comma-separated, repeatable):
lucid-lint check --profile falc --conditions dyslexia,aphasia docs/
FALC keeps its regulatory meaning. Adding dyslexia neither relaxes nor renames it: the condition layers dyslexia signals on top.
Three strictness levels × N conditions explodes into combinations. Keeping the two axes orthogonal preserves the regulatory meaning of falc while letting audience-specific layers compose. See entries F71 and F72 on the roadmap.
lucid-lint is configured through a lucid-lint.toml file at the project root (optional) and through command-line options (which take precedence over the file).
# lucid-lint.toml
[default]
profile = "public"
[rules.sentence-too-long]
max_words = 22
[rules.passive-voice]
max_per_paragraph = 2
[default]
Default settings applied to the whole run.
| Field | Type | Default | Description |
|---|---|---|---|
| profile | string | "public" | One of dev-doc, public, falc |
| conditions | array of strings | [] | Active condition tags. See Conditions. |
| exclude | array of glob patterns | [] | Paths skipped during the recursive walk. See Excluding paths. |
[rules.<rule-id>]
Per-rule configuration. The available fields depend on the rule. See the rule pages in the rules reference.
[scoring]
Tunable parameters of the hybrid scoring model. All fields are optional; a missing field falls back to the shipped default (category_max = 20, category_cap = 15).
[scoring]
category_max = 20
category_cap = 15
[scoring.weights]
sentence-too-long = 3
weasel-words = 2
The [scoring.weights] sub-table is keyed by rule id. Unknown ids are ignored, so removing a rule in a future version does not break older files.
From weakest to strongest:
- Profile preset (default: public)
- lucid-lint.toml
- Command-line options
An option not passed on the command line falls back to the TOML value; a missing TOML field falls back to the profile preset.
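The fallback chain can be sketched with Option combinators. The function names below are hypothetical, and the sentence-length presets (30 / 22 / 15 words) are the ones quoted on the Profiles page; the real resolution code may be organised differently.

```rust
/// Resolves the profile name: command line first, then TOML, then the default.
/// (Hypothetical helper for illustration.)
pub fn resolve_profile<'a>(cli: Option<&'a str>, toml: Option<&'a str>) -> &'a str {
    cli.or(toml).unwrap_or("public")
}

/// Sentence-length preset for each profile, as listed on the Profiles page.
pub fn preset_max_words(profile: &str) -> Option<u32> {
    match profile {
        "dev-doc" => Some(30),
        "public" => Some(22),
        "falc" => Some(15),
        _ => None,
    }
}

/// A per-rule TOML value wins; a missing field falls back to the preset.
pub fn resolve_max_words(toml_override: Option<u32>, profile: &str) -> Option<u32> {
    toml_override.or_else(|| preset_max_words(profile))
}
```

For example, `--profile=falc` on the command line wins over `profile = "dev-doc"` in the file, and `max_words = 18` in the file wins over the 22 of public.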
lucid-lint walks up from the current directory to the first lucid-lint.toml it finds, and stops at the boundary of the nearest .git repository. The --config <path> option skips discovery and loads the given file; a missing explicit path is an error, but a missing auto-discovered file is not.
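The upward walk can be approximated with the standard library alone. In this sketch, discover_config is a hypothetical name, and checking for the config file before checking for .git in the same directory is an assumption (it lets a config at the repository root be found). Returning None models the "missing auto-discovered file is not an error" case.

```rust
use std::path::{Path, PathBuf};

/// Walks up from `start` and returns the first `lucid-lint.toml` found.
/// Stops after the directory that contains `.git` (the repository boundary).
/// (Hypothetical helper for illustration.)
pub fn discover_config(start: &Path) -> Option<PathBuf> {
    for dir in start.ancestors() {
        let candidate = dir.join("lucid-lint.toml");
        if candidate.is_file() {
            return Some(candidate);
        }
        // Repository root reached without finding a config: stop here
        // rather than picking up an unrelated file further up the tree.
        if dir.join(".git").exists() {
            return None;
        }
    }
    None
}
```

An explicit --config path would bypass this function entirely and fail loudly if the file is absent.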
Large documentation repositories often contain generated output, vendored text, and snapshots that would drown the linter in noise. Use the exclude field in [default], or the --exclude <GLOB> command-line option, to drop them at discovery time, before analysis.
[default]
exclude = [
"vendor/**",
"**/fixtures/**",
"CHANGELOG.md",
]
The command-line equivalent:
lucid-lint check --exclude 'vendor/**,**/fixtures/**,CHANGELOG.md' docs
Notes:
- lucid-lint check docs with exclude = ["drafts/**"] skips docs/drafts/....
- If docs/CHANGELOG.md is named directly on the command line, it is linted even when CHANGELOG.md is in the exclusion list. If you name it, you want it.
- --exclude and the TOML exclude field accumulate; they do not replace each other. Separate multiple patterns with commas within one option, or repeat --exclude.
Markdown documents accept inline disable directives for local silencing, but plain text and standard input lack that escape hatch. [[ignore]] fills the gap, and works the same on every input format.
[[ignore]]
rule_id = "unexplained-abbreviation"
[[ignore]]
rule_id = "weasel-words"
Each [[ignore]] entry removes every diagnostic whose rule_id matches, in Markdown files, plain text, and standard input. The filter applies after all rules have run but before scoring, so the score reflects the post-filter view.
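The ordering (run rules, filter, then score) can be pictured as a plain filter over the diagnostics. The sketch below uses a trimmed, hypothetical Diagnostic with only two fields and a hypothetical apply_ignores function; it illustrates the idea, not the project's pipeline.

```rust
/// Trimmed-down diagnostic for illustration (the real struct has more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub rule_id: String,
    pub message: String,
}

/// Drops every diagnostic whose rule id appears in the `[[ignore]]` entries.
/// The result is what scoring would see.
pub fn apply_ignores(diagnostics: Vec<Diagnostic>, ignored_rule_ids: &[&str]) -> Vec<Diagnostic> {
    diagnostics
        .into_iter()
        .filter(|d| !ignored_rule_ids.iter().any(|id| *id == d.rule_id.as_str()))
        .collect()
}
```

Because the filter works on diagnostics rather than on source text, it behaves the same for Markdown, plain text, and standard input.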
Notes:
- Use [[ignore]] only when a rule is genuinely noisy across the whole project.
- A reason = "..." field on each entry is tracked by F-suppression-reason-field; once it lands, it will be shown in reports and can be made mandatory through configuration.
TOML configuration is wired up rule by rule, as each Config gains its dedicated accessor. Three rules honour it today:
[rules.readability-score]
[rules.readability-score]
formula = "kandel-moles" # or "flesch-kincaid", "auto"
Pins the readability formula, whatever the detected language. auto (the default) keeps the per-language selection from F-readability-formulas-extra.
[rules.unexplained-abbreviation]
[rules.unexplained-abbreviation]
whitelist = ["WCAG", "ARIA", "ADHD", "LLM"]
Entries are additive on top of the profile baseline (F31). Use this field to reintroduce project-specific acronyms (accessibility standards, business acronyms, engineering-practice terms) that the v0.2 baseline no longer ships. Each entry silences the acronym throughout the document, as if you had defined it inline as Expansion (ACRONYM).
[rules."structure.excessive-commas"]
[rules."structure.excessive-commas"]
max_commas = 2
Overrides the per-sentence comma ceiling (default: 4 / 3 / 2 for dev-doc / public / falc). The value must be a positive integer; 0 or a negative value is rejected at load time. The override replaces the profile preset; it is not additive.
Tables for other rules parse without error but have no effect at runtime. Extending this list is a mechanical, per-rule change that will continue throughout the v0.2.x cycle.
v0.2 adds a hybrid scoring model on top of the existing diagnostics. Each run now answers two questions at once, one through the diagnostics and one through the scores.
The two surfaces are complementary. Scores are summaries; diagnostics remain the signal to act on.
The score takes the form X / max: an arbitrary maximum, not a number normalised to 0–100. v0.2 ships max = 100 (five categories × twenty points), but that number is treated as a test-and-learn calibration: the scale may move in a future minor version as rule weights are tuned on real corpora.
Rules of thumb for today's calibration:
| Range | Reading |
|---|---|
| 80 – 100 | The score is shown in green in the terminal. Nothing blocking. |
| 60 – 79 | The score is shown in yellow. A few findings to review. |
| 0 – 59 | The score is shown in red. Dense problems, or a rule running away. |
The colour bands aid reading; they are not a pass/fail contract. To gate CI, use --min-score with a concrete number of your choosing.
Each rule belongs to exactly one category. v0.2 freezes the taxonomy at five buckets:
| Category | Covers |
|---|---|
| structure | Length, nesting, punctuation, document skeleton |
| rhythm | Cadence and repetition across neighbouring sentences |
| lexicon | Vocabulary, terminology, acronyms, lexical diversity |
| syntax | Sentence-level style and clarity |
| readability | Document-level readability metrics |
See the rule reference for the rule → category mapping.
For a single document:
rule_cost = Σ (weight × severity_multiplier) over hits
category_cost = min(Σ rule_cost / (words / 1000), ← density
                    category_cap) ← cap
category_score = category_max − category_cost (clamped ≥ 0)
global_score = Σ category_score
Three mechanics stack:
- Weight: each hit costs weight × severity_multiplier. The default weight table lives in scoring::default_weight_for; it emphasises the rules whose hits carry the heaviest cognitive load (readability-score = 5, length / subordination / passive / unclear-antecedent = 2, everything else = 1).
- Density: the cost is divided by words / 1000, so that a 10,000-word manual is not punished for having more hits than a 400-word README. Documents under 200 words are treated as 200-word documents, so small fixtures are not artificially penalised.
- Cap: category_cap against category_max. A noisy rule eats at most 75% of its own category (15 / 20 by default) and does not spill over into the others.
The severity multiplier is info = 1, warning = 3, error = 5.
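As a rough illustration of the three mechanics above, the per-document arithmetic can be sketched in Python. This is a minimal sketch, not the code behind scoring::default_weight_for: the function names, the (weight, severity) tuple shape, and the absence of any rounding are assumptions made for the example. The sample hits are modelled on the examples/sample.md run shown on this page, assuming that file is under 200 words.

```python
SEVERITY_MULTIPLIER = {"info": 1, "warning": 3, "error": 5}
CATEGORIES = ("structure", "rhythm", "lexicon", "syntax", "readability")


def category_score(hits, word_count, category_max=20, category_cap=15):
    """Score one category from a list of (weight, severity) hits."""
    raw_cost = sum(weight * SEVERITY_MULTIPLIER[severity] for weight, severity in hits)
    # Density: cost per 1,000 words; short documents count as 200 words.
    effective_words = max(word_count, 200)
    density_cost = raw_cost * 1000 / effective_words
    # Cap: a category can lose at most category_cap points.
    cost = min(density_cost, category_cap)
    return max(category_max - cost, 0)


def global_score(hits_by_category, word_count):
    """Sum the five category scores; categories without hits keep full marks."""
    return sum(
        category_score(hits_by_category.get(name, []), word_count)
        for name in CATEGORIES
    )


# Hypothetical input shaped after the examples/sample.md run (assumed < 200 words).
sample_hits = {
    "structure": [(2, "warning"), (1, "warning")],
    "lexicon": [(1, "warning")],
    "syntax": [(2, "info")],
    "readability": [(5, "info")],
}
print(global_score(sample_hits, word_count=120))
```

Under those assumptions the sketch lands on the same 45 / 100 breakdown as the sample run (5, 20, 5, 10, 5), which suggests the density floor and the cap are doing most of the work on very short files.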
The terminal formatter prints each diagnostic, a short summary line, then a score block: the global number, followed by each category score with an eight-step sparkline bar.
The same run rendered as plain text, for screen readers and copy-paste:
warning examples/sample.md:7:1 Sentence is 35 words long (maximum 30). Consider splitting it into shorter sentences. [section: A paragraph with a long sentence] [structure.sentence-too-long]
warning examples/sample.md:7:11 Weasel phrase "rather" weakens the statement. Replace with concrete language or remove it. [section: A paragraph with a long sentence] [lexicon.weasel-words]
info examples/sample.md:1:1 Flesch-Kincaid grade 6.8 (target ≤ 14.0). [readability.score]
info examples/sample.md:7:1 Sentence starts with a bare demonstrative "this". Name the referent to avoid forcing the reader to guess. [section: A paragraph with a long sentence] [syntax.unclear-antecedent]
warning examples/sample.md:7:1 Line is 210 characters wide (maximum 120). [section: A paragraph with a long sentence] [structure.line-length-wide]
summary: 3 warnings, 2 info.
→ run 'lucid-lint explain <rule-id>' — seen here: structure.sentence-too-long, lexicon.weasel-words, readability.score + 2 more
────────────────────────────────────────────────────────────
score: 45/100
structure █▎░░░ 5/20
rhythm █████ 20/20
lexicon █▎░░░ 5/20
syntax ██▌░░ 10/20
readability █▎░░░ 5/20
All five categories are always shown, so that the breakdown stays structurally stable from one run to the next. A perfect document shows score: 100/100 with every bar full (█████). When the same rule fires twice or more in a file, the hits are grouped under a compact header, and the shared message or section is hoisted so that it appears only once.
The JSON schema is at version = 2 in v0.2. New fields:
{
"version": 2,
"diagnostics": [
{
"rule_id": "structure.sentence-too-long",
"severity": "warning",
"location": { "file": { "kind": "path", "path": "draft.md" }, "line": 12, "column": 1, "length": 42 },
"section": "Introduction",
"message": "Sentence is 27 words long (maximum 22).",
"weight": 2
}
],
"summary": { "info": 0, "warning": 1, "error": 0, "total": 1 },
"score": { "value": 88, "max": 100 },
"category_scores": [
{ "category": "structure", "value": 8, "max": 20 },
{ "category": "rhythm", "value": 20, "max": 20 },
{ "category": "lexicon", "value": 20, "max": 20 },
{ "category": "syntax", "value": 20, "max": 20 },
{ "category": "readability", "value": 20, "max": 20 }
]
}
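A consumer of a payload like the one above might be sketched as follows. It is only an illustration: the helper name weakest_categories and its threshold argument are hypothetical, and the embedded payload is a trimmed copy of the example rather than real tool output.

```python
import json

# Trimmed copy of the example payload, kept inline so the sketch is self-contained.
PAYLOAD = """
{
  "version": 2,
  "score": { "value": 88, "max": 100 },
  "category_scores": [
    { "category": "structure", "value": 8, "max": 20 },
    { "category": "rhythm", "value": 20, "max": 20 }
  ]
}
"""


def weakest_categories(report, threshold=0.5):
    """Return the categories whose value / max ratio is below the threshold."""
    if report.get("version") != 2:
        raise ValueError(f"unsupported schema version: {report.get('version')}")
    return [
        entry["category"]
        for entry in report["category_scores"]
        if entry["value"] / entry["max"] < threshold
    ]


report = json.loads(PAYLOAD)
print(report["score"]["value"], "/", report["score"]["max"])
print(weakest_categories(report))
```

Checking the version field first keeps a tool written for one schema from silently misreading another.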
Category values are lowercase strings, in the fixed order listed above. Tools that read the v0.1 schema must:
- bump the expected version from 1 to 2;
- map the old category names to the new ones (length → structure, lexical → lexicon, style → syntax, global → readability).
--min-score
The check subcommand accepts an optional --min-score=N option. The run exits with 1 if the aggregate global score is below N, independently of severity gating.
# Fail the build if overall quality drops below 85/100
lucid-lint check --min-score=85 docs/
The two gates stack: the run fails if either one trips. Pick one, the other, or both depending on your workflow:
- --fail-on-warning=false --min-score=85: tolerates isolated warnings, but fails when density exceeds your threshold.
- default plus --min-score=85: spikes and drift both fail the build.
lucid-lint.toml
Projects can override the calibration in their lucid-lint.toml:
[scoring]
category_max = 20
category_cap = 15
[scoring.weights]
sentence-too-long = 3
weasel-words = 2
Absent fields fall back to the shipped defaults. The [scoring.weights] sub-table is keyed by rule identifier; unknown identifiers are ignored, so removing a rule later does not break old files.
The brainstorm that shaped F14 (see brainstorm/20260420-score-semantics.md) kept the model minimal. Extras will only be promoted if user feedback calls for them.
lucid-lint offers two inline directives for silencing diagnostics in Markdown input. They are meant for the rare cases where a rule fires on intentional prose (a quoted weasel word, a teaching example of heavy nominalisation, a legitimate passive). Prefer rewriting the prose first. Reach for a directive when the detection is a known false positive, or when the author has seen the warning and chosen to keep the text.
<!-- lucid-lint disable-next-line structure.sentence-too-long -->
A long sentence that is intentional and must not be flagged.
- An optional reason can be attached, as in <!-- lucid-lint disable-next-line lexicon.weasel-words reason="style guide quote" -->. It is surfaced in the JSON output and will be required via configuration in a future version (tracked by F-suppression-reason-field on the roadmap).
<!-- lucid-lint-disable structure.sentence-too-long -->
A long sentence.
Another long sentence in the same scope.
<!-- lucid-lint-enable -->
- <!-- lucid-lint-disable <rule-id> --> opens a scope for one rule.
- <!-- lucid-lint-enable --> closes every open scope. Passing a rule identifier (<!-- lucid-lint-enable <rule-id> -->) closes only that rule's scope, which lets overlapping disables for different rules nest cleanly.
- To silence a whole file, move to the planned disable-file directive (F-suppression-disable-file) once it lands.
- Config-level ignores ([[ignore]] in lucid-lint.toml) covering .txt and standard input are tracked by F19.
The following extensions are tracked on the roadmap:
| ID | Item |
|---|---|
| F19 | Config-level ignores ([[ignore]] in lucid-lint.toml) for .txt input and standard input |
| F-suppression-reason-field | reason="..." field, optional at first and then required, surfaced in reports |
| F-suppression-disable-file | File-level directive (disable-file) and comma-separated multi-rule lists |
See the ## Suppression section on each rule page in the rule reference.
lucid-lint is designed for CI. It returns:
- 0 when no problem (or only info) is found
- 1 when warnings are found
- 2 on a runtime error (invalid arguments, unreadable file)
name: Docs lint
on:
  pull_request:
    paths:
      - '**/*.md'
  push:
    branches: [main]
    paths:
      - '**/*.md'
jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install lucid-lint
        run: cargo install lucid-lint
      - name: Lint docs
        run: lucid-lint check --profile=public docs/ README.md
Add this to your .pre-commit-config.yaml:
repos:
  - repo: local
    hooks:
      - id: lucid-lint
        name: lucid-lint
        entry: lucid-lint check --profile=public
        language: system
        types: [markdown]
To surface diagnostics as pull-request review comments:
lucid-lint check --format=json docs/ | reviewdog -f=rdjson -reporter=github-pr-review
Note: the RDJSON adapter is not shipped. For native reporting in code review, prefer the GitHub Code Scanning flow below.
--format=sarif emits a SARIF v2.1.0 log that GitHub Code Scanning reads directly: each diagnostic becomes a code-scanning alert, annotated on the pull-request diff.
name: Lucid lint (code scanning)
on:
  pull_request:
    paths: ['**/*.md']
  push:
    branches: [main]
    paths: ['**/*.md']
permissions:
  security-events: write
  contents: read
jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install lucid-lint
      - name: Run lucid-lint and emit SARIF
        run: |
          lucid-lint check \
            --profile=public \
            --format=sarif \
            --fail-on-warning=false \
            docs/ README.md > lucid-lint.sarif
      - name: Upload SARIF to Code Scanning
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: lucid-lint.sarif
          category: lucid-lint
Notes:
- --fail-on-warning=false lets the upload step always run; rely on the Code Scanning gates in the pull-request UI rather than on the linter's exit code.
- Each rule is listed in runs[0].tool.driver.rules, with its category, default severity, default scoring weight, and a helpUri pointing to the rule's mdBook page.
- On each result, properties.weight and properties.section carry the scoring weight and the title of the section under which the diagnostic was found.
To avoid failing CI on warnings (for example during a gradual adoption phase), you can flip the default:
lucid-lint check --fail-on-warning=false docs/
The run then always returns 0, except on a runtime error.
You can also gate the build on the aggregate scoring model. The run exits with 1 if the global score is below the threshold, independently of the severity gate.
jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install lucid-lint
      - name: Lint and gate on score
        run: lucid-lint check --min-score=85 docs/ README.md
The two gates stack: the run fails if either one trips. Pick the combination that fits your adoption curve:
| Goal | Options |
|---|---|
| Catch newly introduced warnings (default behaviour) | default |
| Tolerate isolated warnings but fail on drift | --fail-on-warning=false --min-score=85 |
| Fail on spikes and drift | default + --min-score=85 |
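The way the two gates combine can be sketched as a small decision function. This is an illustrative model of the exit codes described on this page, not the tool's implementation; the function and parameter names are made up for the example.

```python
def exit_code(warnings, score, fail_on_warning=True, min_score=None, runtime_error=False):
    """Model the documented exit codes: 0 = pass, 1 = gate tripped, 2 = runtime error."""
    if runtime_error:
        return 2
    # Severity gate: on by default, switched off with --fail-on-warning=false.
    if fail_on_warning and warnings > 0:
        return 1
    # Score gate: only active when --min-score is given.
    if min_score is not None and score < min_score:
        return 1
    return 0


# The three rows of the table, applied to a run with 3 warnings and a score of 90.
print(exit_code(3, 90))
print(exit_code(3, 90, fail_on_warning=False, min_score=85))
print(exit_code(3, 90, min_score=85))
```

Because either gate alone is enough to return 1, relaxing the severity gate only changes the outcome when the score gate is also satisfied.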
A gated run that fails: lucid-lint prints its usual summary, then the shell exposes the non-zero exit code:

$ lucid-lint check --min-score=85 examples/sample.md
…
score: 45/100
structure █▎░░░ 5/20
rhythm █████ 20/20
lexicon █▎░░░ 5/20
syntax ██▌░░ 10/20
readability █▎░░░ 5/20
$ echo "exit: $?"
exit: 1
lucid-lint ships 25 rules in v0.2 (17 carried over from v0.1, 8 added in v0.2). Each rule has a dedicated page covering its category, severity, default weight, per-profile thresholds, examples, and suppression guidance.
The compact reference RULES.md remains the single-file overview, kept at the repository root. The academic and normative sources behind each rule are consolidated on the References page.
French translation: complete. Each of the 25 rules has its dedicated page in French (milestone F25 on the roadmap).
Each rule belongs to exactly one of the five fixed categories. The taxonomy is authoritative: the scoring model composes the per-category sub-scores into the global X / max score.
The kebab-case identifier (e.g. structure.sentence-too-long) is the stable contract used everywhere: CLI option, JSON output, configuration key, citation in the docs. The label below is a human-readable aid; it never aliases the identifier.
| Rule | Label |
|---|---|
| structure.sentence-too-long | Sentence too long |
| structure.paragraph-too-long | Paragraph too long |
| structure.heading-jump | Heading level jump |
| structure.deeply-nested-lists | Lists nested too deeply |
| structure.excessive-commas | Excessive commas |
| structure.long-enumeration | Enumeration too long |
| structure.deep-subordination | Deep subordination |
| structure.line-length-wide | Lines too wide |
| structure.mixed-numeric-format | Mixed numeric formats |
| Rule | Label |
|---|---|
| rhythm.consecutive-long-sentences | Consecutive long sentences |
| rhythm.repetitive-connectors | Repeated connectors |
| Rule | Label |
|---|---|
| lexicon.low-lexical-diversity | Low lexical diversity |
| lexicon.excessive-nominalization | Excessive nominalisations |
| lexicon.unexplained-abbreviation | Unexplained abbreviations |
| lexicon.weasel-words | Weasel words |
| lexicon.jargon-undefined | Undefined jargon |
| lexicon.all-caps-shouting | All-caps shouting |
| lexicon.redundant-intensifier | Redundant intensifiers |
| lexicon.consonant-cluster | Consonant clusters |
| Rule | Label |
|---|---|
| syntax.passive-voice | Passive voice |
| syntax.unclear-antecedent | Unclear antecedent |
| syntax.nested-negation | Nested negations |
| syntax.conditional-stacking | Stacked conditionals |
| syntax.dense-punctuation-burst | Punctuation burst |
| Rule | Label |
|---|---|
| readability.score | Readability score |
Source of authority. Each rule's category is determined by Category::for_rule in src/types.rs. The tables above mirror that function. A coverage test (tests/rule_docs_coverage.rs) keeps the per-rule pages, the category helper, and the scoring weights in sync.
| Level | Meaning | Effect |
|---|---|---|
| info | Signal worth knowing, not a defect | Reported; does not fail CI |
| warning | Quality problem to fix | Reported; may fail CI depending on --min-score |
| error | Reserved for v0.3+ | Not emitted in v0.2 |
See Contributing for the rule-addition checklist: every new rule must ship with a page in this section.
structure.sentence-too-long
Sentence too long.
Sentences whose length exceeds a per-profile ceiling. The intrinsic cognitive load of a sentence grows non-linearly with its word count (Graesser et al. 2004, Coh-Metrix); FALC caps sentences at 15 words, Plain English at 20. Long sentences make it more likely that a reader with fragile attention loses the thread partway through.
| Category | structure |
| Default severity | warning |
| Default weight | 2 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/sentence_too_long.rs |
The text is split into sentences on strong punctuation (., !, ?, …, paragraph breaks). Unicode word tokens are counted, excluding punctuation. Contractions (don't) and elisions (l'accessibilité) count as a single word when the apostrophe sits between two letters. Code blocks are ignored.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_words | int | 30 | 22 | 15 |
| exclude_code_blocks | bool | true | true | true |
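The counting convention described above can be approximated with two regular expressions. This is a minimal sketch under stated assumptions, not the logic in src/rules/sentence_too_long.rs: the names WORD, BOUNDARY, count_words and long_sentences are invented here, the sentence splitter is deliberately naive (it would break on abbreviations and decimals), and code blocks are not handled.

```python
import re

# A word is a run of letters or digits, optionally joined by an apostrophe
# that sits between two such characters (don't, l'accessibilité).
WORD = re.compile(r"[^\W_]+(?:['’][^\W_]+)*")
# Naive sentence boundary: strong punctuation or a blank line.
BOUNDARY = re.compile(r"[.!?…]+|\n\s*\n")


def count_words(sentence):
    """Count word tokens, ignoring punctuation."""
    return len(WORD.findall(sentence))


def long_sentences(text, max_words=22):
    """Return (word_count, sentence) pairs above max_words (22 is the public ceiling)."""
    sentences = (part.strip() for part in BOUNDARY.split(text))
    return [(count_words(s), s) for s in sentences if count_words(s) > max_words]


print(count_words("Don't panic, l'accessibilité compte."))
```

Treating the apostrophe as part of the token is what keeps French elisions from inflating the count relative to English.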
Three ideas, in matching tints from one end of the rewrite to the other: position already paired them, and colour confirms that the rewrite loses none.
Before (FR, flagged):
Le sous-système de cache introduit lors d’un jalon précédent interagit mal avec le nouveau pipeline de requêtes sous charge soutenue, et l’enquête a nécessité plusieurs rondes de profilage.
After:
Le cache a été introduit lors d’un jalon précédent. Il interagit mal avec le nouveau pipeline sous charge soutenue. L’enquête a nécessité plusieurs rondes de profilage.
Before (EN, flagged):
The caching subsystem, which was introduced in an earlier milestone, turned out to interact poorly with the new request pipeline under sustained load, and the investigation that followed required multiple rounds of profiling.
After:
The caching subsystem was introduced earlier. It interacts poorly with the new request pipeline under sustained load. The investigation required several rounds of profiling.
See Suppressing diagnostics (English page for now) for the inline and block forms.
- rhythm.consecutive-long-sentences: captures rhythm; its threshold must stay below this rule's max_words.
- structure.sentence-too-long carries a weight of 2 because cognitive cost compounds with length.
See References for the full bibliography.
structure.paragraph-too-long
Paragraph too long.
Paragraphs that exceed a threshold in sentence count or in word count. The paragraph is the visual unit of resumption: an overlong paragraph dilutes that resumption point for readers who are interrupted often. Both measures are checked so that a short but dense paragraph (a single 80-word sentence) is also caught; structure.sentence-too-long covers the complementary case.
| Category | structure |
| Default severity | warning |
| Default weight | 2 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/paragraph_too_long.rs |
Paragraphs are split on blank lines (the Markdown paragraph convention). Sentences and words are counted per paragraph, and any paragraph exceeding either threshold is flagged.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_sentences | int | 7 | 5 | 3 |
| max_words | int | 150 | 100 | 60 |
A paragraph of eight medium-length sentences under the public profile will trip max_sentences. A paragraph containing a single 120-word sentence will trip max_words (and also structure.sentence-too-long).
See Suppressing diagnostics (English page for now).
See References for the full bibliography.
structure.heading-jump
Heading level jump.
Heading-level jumps that break the document's mental map (for example H2 → H4). Each level must follow the previous one by at most +1. Readers with attention difficulties rely heavily on the heading hierarchy to find their place again after an interruption; a broken hierarchy destroys that cue. The rule also flags the very first heading if it is deeper than H2 when allow_first_heading_any_level is false, and the absence of an H1 when require_h1 is true.
References. WCAG 2.1 SC 1.3.1 (Info and Relationships) and 2.4.6 (Headings and Labels); RGAA 9.1.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Languages | language-independent |
| Source | src/rules/heading_jump.rs |
The rule parses Markdown headings (#, ##, …) and walks them in source order, flagging each heading whose level exceeds the previous one by more than one. Deterministic, no false positives.
| Key | Type | Default |
|---|---|---|
| allow_first_heading_any_level | bool | true |
| require_h1 | bool | false |
Binary rule: no per-profile thresholds.
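The walk described above fits in a few lines. The sketch below is an approximation for illustration only: heading_jumps and its return shape are hypothetical, it recognises ATX headings with a simple regular expression rather than a Markdown parser, it does not skip lines inside code blocks, and it leaves require_h1 out.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then some text.
HEADING = re.compile(r"^(#{1,6})\s+\S")


def heading_jumps(markdown, allow_first_heading_any_level=True):
    """Return (line_number, previous_level, level) for each heading that skips a level."""
    jumps = []
    previous = None
    for line_number, line in enumerate(markdown.splitlines(), start=1):
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if previous is None:
            # First heading: only checked when the option is switched off.
            if not allow_first_heading_any_level and level > 2:
                jumps.append((line_number, None, level))
        elif level > previous + 1:
            jumps.append((line_number, previous, level))
        previous = level
    return jumps


print(heading_jumps("# Overview\n#### Details"))
```

Moving back up the hierarchy (H4 → H2) is never reported, since only increases larger than one level count as a jump.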
Flagged:
# Overview
#### Details ← jump from H1 to H4
Clean:
# Overview
## Section
### Subsection
See Suppressing diagnostics (English page for now).
- structure.deeply-nested-lists: the equivalent signal at list level.
See References for the full bibliography.
structure.deeply-nested-lists
Lists nested too deeply.
Bullet-list items nested beyond a reasonable depth. A deeply nested list forces the reader to rebuild a complex mental hierarchy: horizontal indentation stops being a positional cue and becomes noise. Four levels of indentation is too many for readers with attention difficulties.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Languages | language-independent |
| Source | src/rules/deeply_nested_lists.rs |
Markdown is parsed with pulldown-cmark; list items are extracted with their indentation level, and items beyond max_depth are flagged. Deterministic, no false positives.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_depth | int | 4 | 3 | 2 |
Under the public profile (max depth 3):
- Level 1
  - Level 2
    - Level 3
      - Level 4 ← flagged
The diagnostic includes repair guidance: flatten the structure, split it into several lists, or promote the sub-items to subsections with headings.
See Suppressing diagnostics (English page for now).
See References for the full bibliography.
structure.line-length-wide
Lines too wide.
Author-chosen lines that are wider than the profile ceiling. WCAG 1.4.8 (AAA) caps rendered text at about 80 characters per line, because longer lines force the eye to travel further between saccades and increase re-reading at the line return, a known difficulty for dyslexic readers (BDA Dyslexia Style Guide).
"Author-chosen" matters. In Markdown, soft breaks are replaced by spaces during parsing, because rendering reflows text to the screen width. The width of a source line therefore says nothing about what the reader sees. This rule only measures breaks kept on purpose: Markdown hard breaks (<br> or two trailing spaces) and explicit line breaks in plain text. A soft-wrapped Markdown paragraph is exempt, however long its joined text. To bound paragraph density, see structure.paragraph-too-long.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | dyslexia, general |
| Languages | EN · FR (script-independent) |
| Source | src/rules/line_length_wide.rs |
For each paragraph that contains an author-intended line break, the rule measures the width of each line in grapheme clusters and flags lines beyond max_line_length.
A Markdown paragraph with no hard break (the common case in prose) is exempt. Soft breaks are replaced by spaces during parsing: what remains is one logical line whose source length follows the editor width, not the rendering targeted by WCAG 1.4.8. Plain text follows the same logic: a paragraph with no internal \n is exempt; a paragraph with internal line breaks is measured line by line.
Code blocks (fenced or indented) are excluded upstream by the Markdown parser. Headings, list items, and table cells are out of scope by construction: paragraph-too-long, sentence-too-long, and the heading rules cover the cognitive loads that apply to those blocks.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_line_length | int | 120 | 100 | 80 |
The FALC profile aligns with the WCAG 1.4.8 AAA recommendation of 80 characters.
Prose paragraphs written on a single source line are exempt on purpose. The rule used to fire on them and generated a lot of noise on real prose; v0.2.x restricts it to author-chosen breaks. Combine it with structure.paragraph-too-long if you also want a ceiling on the joined length of the paragraph.
Headings and list items are not measured by this rule. Their wrap width depends on rendering (heading font size, list indentation), and the underlying cognitive loads are already covered by other rules.
See Suppressing diagnostics (English page for now).
- structure.paragraph-too-long
- structure.sentence-too-long
See References for the full bibliography.
structure.mixed-numeric-format
Mixed numeric formats.
Sentences that mix numerals written as digits (42, 3.14, 1,000, 1 000) with numerals spelled out (two, trois, twenty, cent) within the same sentence. Presenting numbers inconsistently forces the reader to switch visual form mid-clause and re-anchor the referent: a known burden for dyscalculic readers and a plain-language anti-pattern.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | dyscalculia, general |
| Languages | EN · FR |
| Source | src/rules/mixed_numeric_format.rs |
For each sentence produced by the tokenizer, the rule scans for digit tokens and for entries from the language's spelled-numeral list. If at least one of each kind co-occurs, a single diagnostic is emitted for the sentence, citing one representative token of each kind.
Digit tokens accept ASCII digits plus an optional decimal separator (.) or thousands separator (a comma, or a space U+0020) when it is framed by digits on both sides. Spelled-out matches are case-insensitive ASCII comparisons against en::SPELLED_NUMERALS and fr::SPELLED_NUMERALS.
The ambiguous forms one (EN) and un / une (FR) are excluded from the spelled-numeral list because they also serve as indefinite pronouns and articles. This keeps the false-positive rate manageable, at the cost of missing genuine mixed-format cases whose only spelled numeral is one / un / une. Regional forms (Switzerland / Belgium: septante, huitante, octante, nonante) are included alongside the metropolitan forms.
Sentences come from the shared tokenizer (see src/parser/tokenizer.rs), so that abbreviations, decimals, and ellipses do not fragment sentences unduly. Code blocks (fenced or indented) are excluded upstream by the Markdown parser.
None. The rule has no configurable threshold: a single co-occurrence of the two forms is enough.
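To make the co-occurrence test concrete, here is a small sketch that works on one sentence at a time. It is an approximation under loud assumptions: SPELLED is a tiny hand-picked subset standing in for en::SPELLED_NUMERALS and fr::SPELLED_NUMERALS (whose real contents are not reproduced here), the regular expressions are guesses at the token shapes described above, and the function name mixed_numeric is invented.

```python
import re

# ASCII digits, optionally joined by '.', ',' or a plain space framed by digits.
DIGIT_TOKEN = re.compile(r"[0-9]+(?:[., ][0-9]+)*")
# Alphabetic words only, so "2nd" contributes "nd" and never a numeral.
WORD = re.compile(r"[^\W\d_]+")
# Illustrative subset; "one", "un" and "une" are deliberately absent.
SPELLED = {"two", "three", "ten", "twenty", "deux", "trois", "cent", "septante", "nonante"}


def mixed_numeric(sentence):
    """Return (digit_token, spelled_numeral) if both forms co-occur, else None."""
    digit = DIGIT_TOKEN.search(sentence)
    spelled = next((w for w in WORD.findall(sentence) if w.lower() in SPELLED), None)
    if digit and spelled:
        return digit.group(), spelled
    return None


print(mixed_numeric("We shipped 42 fixes in two weeks."))
```

Returning one representative token of each kind mirrors the single diagnostic per sentence described above, however many numerals the sentence contains.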
- Sentences whose only spelled numeral is one / un / une are not flagged, by design (see Detection).
- Ordinals (first, premier, 2nd, 3e) are out of scope. 2nd currently reads as a digit token (2) followed by a word (nd), which does not match the spelled-numeral list: no false positive.
- Roman numerals (IV, XIV) are neither digits nor spelled numerals for this rule.
See Suppressing diagnostics (English page for now).
- readability.score
See References for the full bibliography.
structure.excessive-commas
Excessive commas.
Sentences whose comma count exceeds a per-profile ceiling. The comma is the most frequent marker of syntactic complexity; rather than untangling the cause (subordination, apposition, enumeration, parenthetical), the rule uses density as a leading indicator of overload.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/excessive_commas.rs |
Count the commas per sentence and flag those that exceed max_commas.
Interaction. When structure.long-enumeration fires on the same sentence, this rule is suppressed for that sentence to avoid double reporting. The shared enumeration detector discounts two kinds of commas. The first kind is Oxford-style enumeration commas: 3 or more short items, plus a relaxed rhythm pass for items of 1 to 4 words, plus lists closed by plus on the same footing as et / ou (see "Known false positives" below). The second kind is commas inside parenthesised token lists (A, B, C, …): 3 or more short comma-separated segments inside balanced parentheses. All discounts are language-agnostic.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_commas | int | 4 | 3 | 2 |
The remaining false positives come mostly from lists with no closing connector (for example Rules touched: A, B, C) and from Oxford enumerations interrupted by an interleaved parenthesis; they are tracked under F22 on the roadmap for the upcoming v0.3 sub-slices.
See Suppressing diagnostics (English page for now).
See References for the full bibliography.
structure.long-enumeration
Enumeration too long.
Inline prose enumerations that would be clearer as a bullet list: 5 or more items separated by commas and closed by a coordinator (et, ou, and, or).
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/long_enumeration.rs, shared helper src/rules/enumeration.rs |
A sequence of min_items or more short segments, separated by commas and terminated by , and / , or / , plus / , et / , ou (Oxford comma optional). The shared detector also feeds structure.excessive-commas.
| Key | Type | Default |
|---|---|---|
| min_items | int | 5 |
The diagnostic suggests converting the enumeration into a bullet list.
Six items, in matching tints from one end of the rewrite to the other: each inline term lines up with its bullet.
Before (FR, flagged):
Le plat contient tomate, oignon, ail, basilic, persil et thym.
After:
Le plat contient :
- tomate
- oignon
- ail
- basilic
- persil
- thym
Before (EN, flagged):
The dish contains tomato, onion, garlic, basil, parsley, and thyme.
After:
The dish contains:
- tomato
- onion
- garlic
- basil
- parsley
- thyme
See Suppressing diagnostics (English page for now).
See References for the full bibliography.
structure.deep-subordination
Deep subordination.
Cascades of subordinate clauses: several relative pronouns or subordinating conjunctions chained without a strong punctuation break. Each opened referent must stay in working memory until it is closed; Gibson's Dependency Locality Theory (1998) ties processing cost directly to that distance.
| Category | structure |
| Default severity | warning |
| Default weight | 2 |
| Languages | EN · FR (separate lists) |
| Source | src/rules/deep_subordination.rs |
The rule walks the sentence between strong punctuation breaks and counts consecutive subordinators. It flags the sentence when the count exceeds max_consecutive_subordinators. Enumerations of pronouns (qui, que, dont, où) are ignored: the detector recognises the listed form and does not treat it as a cascade.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_consecutive_subordinators | int | 3 | 2 | 2 |
Each highlighted token is a subordinator counted by the rule. Four in a row trip the dev-doc threshold (3); two in a row trip public and falc.
Flagged (FR):
Le document qui a été rédigé par l’équipe que nous avons constituée et qui couvre les points que nous avions discutés…
Flagged (EN):
The report that was drafted by the team which we formed last month and which covers the topics that we had discussed…
Not flagged (enumeration form, recognised by the detector):
Les pronoms relatifs en français sont : qui, que, dont, où.
And the equivalent form in English:
The English relative pronouns are: which, that, who, whom, whose.
See Suppressing diagnostics (English page for now).
See References for the full bibliography.
structure.italic-span-long
Italic span too long.
Experimental in v0.2.x. Off by default; enable it with --experimental structure.italic-span-long or [experimental] enabled = ["structure.italic-span-long"] in lucid-lint.toml. It moves to Stable at the v0.3 tag as part of the F-experimental-rule-status cohort. See Conditions for the dyslexia tag, which gates this rule according to the active conditions.
Italic spans (*…* / _…_) that exceed a configurable word count. Slanted glyphs hinder letter-shape recognition for dyslexic readers. This well-established finding motivates the British Dyslexia Association recommendation to keep italics for short phrases and not for whole passages. Long italic passages also hurt visual tracking for any reader whose attention is already stretched (fatigue, reading in a second language, low vision).
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at the v0.3 tag |
| Condition tag | dyslexia (gated; runs only with a matching --conditions) |
| Languages | EN · FR (identical detection: the substrate is language-agnostic) |
| Source | src/rules/structure/italic_span_long.rs |
Walks the typed inline tree attached to each Paragraph (F143 substrate) and flags any Inline::Emphasis span whose visible word count exceeds the profile threshold. Code blocks and inline code are excluded by the parser; italics inside a code block never trigger the rule. Bold (**bold**) does not trigger this rule either: only italics do (*italic* / _italic_).
The diagnostic position points at the opening delimiter: the highlight in your editor lands on the visible * or _, not on an arbitrary column in the paragraph.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_words | int | 12 | 8 | 5 |
To adjust it in lucid-lint.toml:
[rules."structure.italic-span-long"]
max_words = 6
Before (flagged):
The team eventually concluded that the proposed migration plan would require careful coordination across three regional offices and an extended freeze window before any deployment could begin.
What lucid-lint check --profile public --experimental structure.italic-span-long --conditions dyslexia reports:
warning input.md:1:36 Italic span is 17 words long (maximum 8). Long italic runs strain dyslexic readers; consider shortening the emphasized phrase or removing the italics. [structure.italic-span-long]
After (suggested rewrite):
The team eventually concluded that the proposed migration plan would require careful coordination. Three regional offices and an extended freeze window are prerequisites before any deployment.
The italics now mark a single load-bearing word, which is the use the BDA guide recommends.
Before (flagged):
L’équipe a fini par conclure que le plan de migration proposé nécessiterait une coordination soignée entre trois bureaux régionaux et une fenêtre de gel prolongée avant tout déploiement.
What lucid-lint check --profile public --experimental structure.italic-span-long --conditions dyslexia reports:
warning input.md:1:35 Italic span is 18 words long (maximum 8). Long italic runs strain dyslexic readers; consider shortening the emphasized phrase or removing the italics. [structure.italic-span-long]
After (suggested rewrite):
L’équipe a fini par conclure que le plan de migration nécessiterait une coordination soignée. Trois bureaux régionaux et une fenêtre de gel prolongée sont indispensables avant tout déploiement.
See Suppressing diagnostics for the inline and block forms. The inline directive works on this rule:
<!-- lucid-lint disable-next-line structure.italic-span-long -->
Une *phrase volontairement longue en italique que la règle signalerait normalement* est ici.
See Conditions for the dyslexia tag that gates this rule.
See References for the full bibliography.
structure.number-run
Too many numbers in one sentence.
Experimental in v0.2.x. Off by default; enable it with --experimental structure.number-run or [experimental] enabled = ["structure.number-run"] in lucid-lint.toml. It moves to Stable at the v0.3 tag as part of the F-experimental-rule-status cohort. See Conditions for the dyscalculia tag, which gates this rule according to the active conditions.
Sentences that stack more than a configurable number of numeric tokens. plainlanguage.gov is explicit about the framing: "Don’t put a lot of numbers together in one sentence" and "Avoid placing too many statistics close together". Dyscalculic readers pay the cost first: each numeric token forces a fresh quantity-to-symbol anchoring that gets no help from the surrounding prose the way an ordinary word does. Citation strings such as (Smith 2020, Jones 2021, Wei 2022, Park 2023), measurement tables flattened into prose, and statistics-heavy paragraphs are the typical cases.
| Category | structure |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at the v0.3 tag |
| Condition tag | dyscalculia (gated; runs only with a matching --conditions) |
| Languages | EN · FR (identical detection: digits are language-agnostic) |
| Source | src/rules/structure/number_run.rs |
Walks the sentence stream of each paragraph (after flattening; fenced code blocks are already excluded by the parser) and counts numeric tokens per sentence. A numeric token is a contiguous run of ASCII digits, optionally containing a decimal separator (. or ,) followed by digits. Hyphens, colons, slashes, and spaces separate tokens.
| Input | Tokens counted | Note |
|---|---|---|
| 42 | 1 | Bare integer |
| 3.14 | 1 | Decimal separator kept |
| 1,000 | 1 | Comma kept |
| 2026-05-04 | 3 | Hyphens split: a date weighs as three numbers in cognitive load |
| $3.50 | 1 | Non-digit currency prefix, ignored |
| 1st | 1 | Trailing letters split off; the digits count |
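The tokenisation rule above maps onto a single regular expression. The sketch below is a Python illustration of that rule as documented, not the Rust implementation in src/rules/structure/number_run.rs; the function names are hypothetical.

```python
import re

# ASCII digits, optionally one decimal separator (. or ,) followed by more digits.
NUMERIC_TOKEN = re.compile(r"[0-9]+(?:[.,][0-9]+)?")


def count_numeric_tokens(sentence: str) -> int:
    """Count numeric tokens in one sentence, following the table above."""
    return len(NUMERIC_TOKEN.findall(sentence))


def exceeds_number_run(sentence: str, max_numbers: int) -> bool:
    """True when the sentence packs more tokens than the profile allows."""
    return count_numeric_tokens(sentence) > max_numbers
```

Each row of the table can be checked against count_numeric_tokens: a date such as 2026-05-04 yields three tokens because the hyphen is not part of the pattern.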
The diagnostic position points at the first numeric token of the offending sentence: the editor highlight lands on the visible cluster, not on the start of the sentence.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_numbers | int | 6 | 4 | 3 |
To adjust it in lucid-lint.toml:
[rules."structure.number-run"]
max_numbers = 5
Before (flagged):
The 2024 cohort sat 1,200 students across 4 campuses, posted a 92.5% pass rate on the 3 reviewed papers, and improved 18 points over the prior year.
What lucid-lint check --profile public --experimental structure.number-run --conditions dyscalculia reports:
warning input.md:1:5 Sentence packs 8 numeric tokens (maximum 4). plain-language guidance recommends not placing many numbers or statistics together in one sentence; split the sentence or move some figures to a list or table. [structure.number-run]
After (your rewrite):
The 2024 cohort sat 1,200 students across 4 campuses. They posted a 92.5% pass rate on the reviewed papers and improved 18 points over the prior year.
The figures still travel together, but each sentence carries a load that a dyscalculic reader can re-anchor without losing the referent.
Before (flagged):
La promotion 2024 a réuni 1 200 étudiants sur 4 campus, affiché un taux de réussite de 92,5 % sur les 3 copies revues, et progressé de 18 points par rapport à l’année précédente.
After (your rewrite):
La promotion 2024 a réuni 1 200 étudiants sur 4 campus. Le taux de réussite atteint 92,5 % sur les copies revues et progresse de 18 points par rapport à l’année précédente.
See Suppressing diagnostics for the inline and block forms. Inline suppression also works on this rule:
<!-- lucid-lint disable-next-line structure.number-run -->
The 2024 cohort sat 1,200 students across 4 campuses, posted a 92.5% pass rate on the 3 reviewed papers, and improved 18 points.
- Conditions: the dyscalculia tag that gates this rule.
- structure.mixed-numeric-format: sister rule on the consistency of numeric form. Atomic split: mixed-numeric-format checks whether digits and spelled-out numerals coexist; number-run checks how many numeric tokens cluster together, whatever their form.
- … mixed-numeric-format).
See References for the full bibliography.
rhythm.consecutive-long-sentences
Consecutive long sentences.
Runs of long sentences inside the same paragraph. One long sentence on its own stays manageable; several in a row tire attention even when each sentence stays under the structure.sentence-too-long ceiling. This rule catches the rhythm.
| Category | rhythm |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (identical detection) |
| Source | src/rules/consecutive_long_sentences.rs |
Walk the sentences in order inside each paragraph. Keep a counter of consecutive sentences above word_threshold. Emit a single diagnostic per run that reaches max_consecutive.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| word_threshold | int | 20 | 15 | 10 |
| max_consecutive | int | 3 | 2 | 2 |
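The streak counter can be sketched as follows. This is a Python illustration, not the Rust source. It assumes the caller has already split one paragraph into sentences and counted their words, and it reads "reaches max_consecutive" as "run length is at least max_consecutive"; the real comparison may differ. The function name is hypothetical.

```python
from typing import List, Tuple


def long_sentence_streaks(
    word_counts: List[int], word_threshold: int, max_consecutive: int
) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each run of long sentences.

    word_counts holds the word count of each sentence of ONE paragraph,
    in order. One entry is produced per run, mirroring "a single
    diagnostic per run".
    """
    streaks = []
    start, length = 0, 0
    # A trailing 0 acts as a sentinel so a run at the paragraph end is closed.
    for index, words in enumerate(word_counts + [0]):
        if words > word_threshold:
            if length == 0:
                start = index
            length += 1
        else:
            if length >= max_consecutive:
                streaks.append((start, length))
            length = 0
    return streaks
```

With the dev-doc values (word_threshold 20, max_consecutive 3), five sentences of 25 to 29 words produce one streak starting at sentence 0 with length 5.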
Compared with structure.sentence-too-long
Both rules look at sentence length but flag different problems:
| Rule | Threshold (dev-doc / public / falc) | Fires on |
|---|---|---|
| structure.sentence-too-long | max_words 30 / 22 / 15 | a single sentence above the ceiling |
| rhythm.consecutive-long-sentences | word_threshold 20 / 15 / 10 | a run of max_consecutive sentences, each above the lower threshold |
Because word_threshold stays below max_words, this rule catches the rhythm even when no single sentence crosses sentence-too-long. The invariant word_threshold < max_words (per profile) keeps the two rules from firing together on the same sentence.
Five ideas, colour-matched from one end of the rewrite to the other: only the rhythm changes. lucid-lint flags; the rewrite is yours.
Before (flagged):
La migration a introduit une couche de cache qui se place devant chaque lecture de la base de données primaire. L’équipe a observé des pics de latence inattendus chaque fois que le cache s’invalidait sous une charge d’écriture soutenue. Une enquête ultérieure a relié la régression à un effet thundering-herd qui se déclenchait sur chaque clé froide. Le tableau de bord des métriques signalait à tort un délai d’attente générique parce que la propagation de la trace était incomplète. Le correctif a fusionné les remplissages concurrents, randomisé les TTL, et instrumenté la couche de cache avec un émetteur de span dédié.
Five sentences, each over 20 words: the streak wears down attention.
What lucid-lint check --profile dev-doc reports:
warning input.md:1:1 5 consecutive sentences exceed 20 words (max 3). Vary sentence length or split the streak. [rhythm.consecutive-long-sentences]
After (your rewrite):
La migration a introduit une couche de cache devant la base de données primaire. La latence montait dès que le cache s’invalidait sous écritures soutenues. Le coupable : un thundering-herd sur les clés froides. Les métriques signalaient un délai générique — la trace était cassée. Le correctif fusionne les remplissages, randomise les TTL et émet un span dédié.
Before (flagged):
The migration introduced a caching layer that sits in front of every read from the primary database. The team observed unexpected latency spikes whenever the cache invalidated under sustained write load. A subsequent investigation traced the regression to a thundering-herd pattern that fired on every cold key. The metrics dashboard misreported the issue as a generic timeout because the trace propagation was incomplete. The fix coalesced concurrent fills, added jittered TTLs, and instrumented the cache layer with a dedicated span emitter.
What lucid-lint check --profile dev-doc reports:
warning input.md:1:1 5 consecutive sentences exceed 20 words (max 3). Vary sentence length or split the streak. [rhythm.consecutive-long-sentences]
After (your rewrite):
The migration introduced a caching layer in front of the primary database. Latency spiked whenever the cache invalidated under heavy writes. The cause was a thundering-herd pattern on cold keys. Metrics misreported it as a generic timeout — trace propagation was broken. The fix coalesced concurrent fills, added jittered TTLs, and emitted a dedicated span.
See Suppressing diagnostics (EN page for now) for the inline and block forms.
- structure.sentence-too-long: catches isolated long sentences; this rule catches the run even when each sentence stays under that ceiling.
- rhythm.consecutive-long-sentences carries the default weight 1; the cognitive cost is cumulative, not per sentence.
See References for the full bibliography.
rhythm.repetitive-connectors
Repeated connectors.
Overuse of the same logical connector within a short window of sentences. Connectors (contrast, cause, consequence, sequence, illustration, addition) are attention points; repeated, they flatten the sense of progression. Sanders & Noordman (2000), Connectives as processing signals; Graesser et al. (2004), local cohesion.
| Category | rhythm |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (separate lists) |
| Source | src/rules/repetitive_connectors.rs |
A sliding window of window_size sentences. For each connector, count the occurrences inside the window. Emit one diagnostic per cluster that exceeds max_per_window.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_per_window | int | 4 | 3 | 2 |
| window_size | int | 5 | 5 | 5 |
| custom_connectors | list | [] | [] | [] |
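The sliding window can be sketched as below. This is a Python illustration under simplifying assumptions: connectors are single lowercase words, sentences are already split, and the function name and return shape are hypothetical. The Rust source may handle multi-word connectors and clustering differently.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List


def repeated_connectors(
    sentences: List[str],
    connectors: Iterable[str],
    window_size: int = 5,
    max_per_window: int = 3,
) -> Dict[str, int]:
    """Map each over-used connector to its peak count in any window."""
    known = set(connectors)
    per_sentence = [
        Counter(w for w in re.findall(r"\w+", s.lower()) if w in known)
        for s in sentences
    ]
    flagged: Dict[str, int] = {}
    last_start = max(len(per_sentence) - window_size, 0)
    for start in range(last_start + 1):
        window: Counter = Counter()
        for counts in per_sentence[start:start + window_size]:
            window.update(counts)
        for connector, n in window.items():
            if n > max_per_window:  # "exceeds max_per_window"
                flagged[connector] = max(flagged.get(connector, 0), n)
    return flagged
```

Four uses of "then" across five sentences exceed the public limit of 3 but not the dev-doc limit of 4.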
lucid-lint flags; the rewrite is yours.
Five actions, colour-matched from one end of the rewrite to the other: only the connectors change.
Before (flagged):
Nous avons analysé les données. Ensuite nous avons construit le modèle. Ensuite nous avons validé les résultats. Ensuite nous avons publié le rapport. Ensuite nous avons archivé les données brutes.
Four ensuite in five sentences: no felt progression.
What lucid-lint check --profile public reports:
warning input.md:1:1 Connector "ensuite" appears 4 times within 5 consecutive sentences (max 3). Vary the connector or restructure the passage. [rhythm.repetitive-connectors]
After (your rewrite):
Nous avons analysé les données. À partir de là nous avons construit le modèle. La validation a suivi, et dès que les résultats ont tenu nous avons publié le rapport. Les données brutes ont été archivées en dernier.
Five actions, colour-matched from one end of the rewrite to the other: only the connectors change.
Before (flagged):
We analysed the data. Then we built the model. Then we validated the results. Then we published the report. Then we archived the raw data.
Four then in five sentences: no felt progression.
What lucid-lint check --profile public reports:
warning input.md:1:1 Connector "then" appears 4 times within 5 consecutive sentences (max 3). Vary the connector or restructure the passage. [rhythm.repetitive-connectors]
After (your rewrite):
We analysed the data. From it we built the model. Validation followed, and once the results held up we published the report. The raw data was archived last.
See Suppressing diagnostics (EN page for now) for the inline and block forms.
- structure.sentence-too-long: long sentences and connector overuse often co-occur; flagging both surfaces a richer rhythm signal.
- rhythm.repetitive-connectors carries the default weight 1; the cost is local, not cumulative.
See References for the full bibliography.
lexicon.weasel-words
Weasel words.
Vague qualifiers that weaken a claim. A weasel word adds an invisible cognitive load: the reader has to decide whether the assertion matters, is true, or is measurable. References: Wikipedia style guide (Avoid weasel words), Strunk & White, FALC.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (separate lists) |
| Source | src/rules/weasel_words.rs |
Word-boundary match against a per-language list. Case-insensitive. One diagnostic per occurrence.
- An occurrence inside inline code (`…`) is ignored. Wrap a weasel term in backticks when you are discussing the word itself.
- plutôt que (FR) and rather than (EN) are conjunctions meaning "instead of", not hedges, and are ignored.
| Key | Type | Default |
|---|---|---|
| custom_weasels_fr | list | [] |
| custom_weasels_en | list | [] |
| disable_weasels | list | [] |
Two patterns still fire in v0.2: terms in straight quotes ("many X" without backticks) and "many X" where X is a concrete noun. Both are tracked under F23 in the roadmap. To opt out of the rule, wrap the quoted term in backticks or use an inline suppression comment.
Use <!-- lucid-lint disable-next-line lexicon.weasel-words --> when the weasel word is intentional (a quotation, a legitimate reference to a subset, a meta-discussion). See Suppressing diagnostics (EN page for now).
See References for the full bibliography.
lexicon.unexplained-abbreviation
Unexplained abbreviations.
Acronyms used without a nearby definition. Every forced pause to guess or look up an acronym breaks the thread and raises the risk of losing the reader's attention.
References. WCAG 2.1 SC 3.1.4 (Abbreviations); RGAA 9.4.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (separate whitelists) |
| Source | src/rules/unexplained_abbreviation.rs |
Pre-scan the whole document for acronyms defined in either canonical form:
- Full expansion (ACRONYM), for example: World Wide Web (WWW)
- ACRONYM (Full expansion), for example: WWW (World Wide Web)
The "expansion" side must contain at least two alphabetic words, so that short parenthetical notes such as (TBD) or (à vérifier) are not counted as definitions.
Match sequences of 2 or more consecutive capital letters (optionally with digits) in the main text.
Filter each candidate through three layers, in order, including the user whitelist [rules.unexplained-abbreviation].whitelist.
Flag every remaining occurrence.
A single definition anywhere in the document silences every occurrence of the same acronym. This matches how readers actually use documentation: scroll back once to find the expansion, then remember it.
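The pre-scan and filtering steps can be sketched as follows. This is a Python illustration, not the Rust source: the regular expressions and the function name are assumptions, only two of the filter layers are modelled (document-level definitions and a caller-supplied whitelist), and the "expansion first" pattern is deliberately naive, so it accepts any two preceding words as an expansion.

```python
import re

ACRONYM = r"[A-Z]{2,}[A-Z0-9]*"   # 2+ capitals, optional trailing digits
WORD = r"[^\W\d_]+"               # one alphabetic word

# "Full expansion (ACRONYM)": at least two words before the parenthesis.
EXPANSION_FIRST = re.compile(rf"{WORD}(?:\s+{WORD})+\s+\(({ACRONYM})\)")
# "ACRONYM (Full expansion)": at least two words inside the parenthesis.
ACRONYM_FIRST = re.compile(rf"\b({ACRONYM})\s+\({WORD}(?:\s+{WORD})+\)")
CANDIDATE = re.compile(rf"\b{ACRONYM}\b")


def unexplained(text: str, whitelist=frozenset(), min_length: int = 2) -> list:
    """Return every acronym occurrence with no definition in the document."""
    defined = set(EXPANSION_FIRST.findall(text)) | set(ACRONYM_FIRST.findall(text))
    hits = []
    for match in CANDIDATE.finditer(text):
        token = match.group(0)
        if len(token) < min_length:
            continue
        if token in defined or token in whitelist:
            continue
        hits.append(token)
    return hits
```

Because definitions are collected over the whole text before any candidate is checked, an acronym defined late in the document still silences earlier occurrences.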
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_length | int | 3 | 2 | 2 |
| whitelist | list | extended | minimal | empty |
Default whitelist (v0.2, tightened by F31): the infrastructure stack (URL, HTML, CSS, JSON, XML, HTTP, HTTPS, UTF, IO, API, CLI, GUI, OS, CPU, RAM, SSD, USB, IDE, SDK, CI, CD), plus common FR/EN acronyms and the RFC 2119 emphasis keywords (PDF, SMS, GPS, ID, OK, FAQ, MUST, SHALL, SHOULD, …).
[rules.unexplained-abbreviation]
whitelist = ["WCAG", "ARIA", "ADHD", "LLM"]
User whitelist entries are additive to the base list: they extend it and never replace it.
See Suppressing diagnostics (EN page for now).
- lexicon.jargon-undefined: the equivalent for content words.
See References for the full bibliography.
lexicon.low-lexical-diversity
Low lexical diversity.
Passages that repeat their content words excessively. Monotonous text loses the reader's attention and often betrays poorly structured thinking. The rule is not an anti-jargon rule: technical terms (API, requête, cache) are expected to recur, and the signal targets non-technical content words.
| Category | lexicon |
| Default severity | info |
| Default weight | 1 |
| Languages | EN · FR (separate stop-word lists) |
| Source | src/rules/low_lexical_diversity.rs |
A sliding window of window_size words. Inside the window, the rule computes unique_words / total_words over tokens that are neither stop words nor inside code blocks. The diagnostic fires when the ratio drops below min_ratio.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| window_size | int | 100 | 100 | 80 |
| min_ratio | float | 0.40 | 0.50 | 0.55 |
| use_stoplist | bool | true | true | true |
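The ratio can be sketched as follows. This is a Python illustration, not the Rust source. It assumes the text is already tokenised with code blocks removed, that the window slides one word at a time, and that texts shorter than one window are skipped; those choices and the function names are assumptions.

```python
from typing import Iterable, List, Tuple


def diversity_ratios(
    tokens: List[str], stopwords: Iterable[str], window_size: int
) -> List[Tuple[int, float]]:
    """Return (window_start, unique/total) over non-stop-word tokens."""
    stop = set(stopwords)
    ratios = []
    for start in range(len(tokens) - window_size + 1):
        window = [t.lower() for t in tokens[start:start + window_size]]
        content = [t for t in window if t not in stop]
        if content:
            ratios.append((start, len(set(content)) / len(content)))
    return ratios


def low_diversity(
    tokens: List[str],
    stopwords: Iterable[str],
    window_size: int = 100,
    min_ratio: float = 0.40,
) -> List[Tuple[int, float]]:
    """Keep only the windows whose ratio drops below min_ratio."""
    return [
        (start, ratio)
        for start, ratio in diversity_ratios(tokens, stopwords, window_size)
        if ratio < min_ratio
    ]
```

In a ten-word window where six content words include "cache" three times, the ratio is 4 unique words over 6, about 0.67.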
See Suppressing diagnostics (EN page for now).
See References for the full bibliography.
lexicon.excessive-nominalization
Excessive nominalizations.
Sentences densely packed with nominalizations, that is, verbs turned into abstract nouns. Two problems compound: nominalized text is more abstract (costlier to process) and it hides the agent ("who does what" disappears). FALC and the US Plain Writing Act recommend strong verbs over nominalizations.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (overlapping suffix lists) |
| Source | src/rules/excessive_nominalization.rs |
Walks the sentence and flags words whose suffix appears in the language's list. The diagnostic fires when the count per sentence crosses max_per_sentence.
- FR: -tion, -sion, -ment, -ance, -ence, -age, -ité, -isme, -ure
- EN: -tion, -sion, -ment, -ance, -ence, -ity, -ism, -ness, -al
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_per_sentence | int | 4 | 3 | 2 |
| suffixes | list | per-language defaults | per-language defaults | per-language defaults |
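The suffix count can be sketched as below. This is a Python illustration, not the Rust source. It uses the default suffix lists given above, reads "crosses max_per_sentence" as "strictly greater than", and applies no minimum stem length, so very short words ending in a listed suffix would also count. The function names are hypothetical.

```python
import re

# Default suffix lists as documented for each language.
SUFFIXES = {
    "fr": ("tion", "sion", "ment", "ance", "ence", "age", "ité", "isme", "ure"),
    "en": ("tion", "sion", "ment", "ance", "ence", "ity", "ism", "ness", "al"),
}


def count_nominalizations(sentence: str, lang: str) -> int:
    """Count words in one sentence that end with a listed suffix."""
    words = re.findall(r"[^\W\d_]+", sentence.lower())
    return sum(1 for word in words if word.endswith(SUFFIXES[lang]))


def too_nominalized(sentence: str, lang: str, max_per_sentence: int) -> bool:
    return count_nominalizations(sentence, lang) > max_per_sentence
```

On the French sentence used as the "heavy" example in this section, the sketch counts four hits (réalisation, conformité, identification, amélioration), which crosses the public limit of 3 but not the dev-doc limit of 4.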
Technical vocabulary (function, implementation, configuration) contains many legitimate nominalizations, which justifies the relaxed dev-doc threshold. The English -al suffix is too broad (it flags crucial, horizontal, positional, none of which are abstract nouns) and remains tracked under F-excessive-nominalization-suffix-refine in the roadmap.
Nominalizations are colour-coded and paired with the matching active verbs in the rewritten version.
Before (heavy):
La réalisation de l’analyse de la conformité permettra l’identification des axes d’amélioration.
After (lighter):
Nous analyserons la conformité. Cela permettra d’identifier les axes à améliorer.
See Suppressing diagnostics (EN page for now).
See References for the full bibliography.
lexicon.jargon-undefined
Undefined jargon.
Specialised terms used without a definition. Jargon is contextual: acceptable among specialists, exclusionary elsewhere. Like acronyms, jargon forces non-specialists to interrupt their reading; unlike acronyms, these are content words, not capitalised sequences.
References. Plain Language (US), FALC, WCAG 2.1 SC 3.1.3 (Unusual Words).
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Languages | EN · FR (separate lists per language and per domain) |
| Source | src/rules/jargon_undefined.rs |
Matches against per-language domain lists (tech, legal, medical, admin).
| Profile | Active lists |
|---|---|
| dev-doc | none (developers know their own jargon) |
| public | tech, legal, medical, admin |
| falc | tech, legal, medical, admin, strict mode |
In v0.2, the active lists are fixed by the profile and cannot yet be overridden from lucid-lint.toml. Per-rule TOML overrides (adding domain terms, silencing specific entries, or enabling a combination of lists that differs from the profile) are tracked under F126 in the roadmap.
See Suppressing diagnostics (EN page for now).
See References for the full bibliography.
lexicon.all-caps-shouting
All-caps shouting.
Runs of consecutive words in ALL CAPS.
All-caps text removes the shape cues that dyslexic readers rely on to tell words apart:
- Ascenders: b, d, h, k, l.
- Descenders: g, p, q, y.
- The height contrast between short letters like a, e, o and tall ones like h, l.
In all caps, every letter sits on the same baseline at the same height. The reader loses the word's silhouette and must decode letter by letter. All caps also prompts many screen readers to spell the run out letter by letter, unless the markup says otherwise.
WCAG 3.1.5 and the BDA Dyslexia Style Guide recommend lowercase or sentence case for emphasis.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | a11y-markup, dyslexia, general |
| Languages | EN · FR (script-based detection, language-agnostic) |
| Source | src/rules/all_caps_shouting.rs |
For each paragraph, the rule looks for runs of consecutive ALL-CAPS words. Minor connectors (,, ;, :, -, spaces) keep the run alive; a lowercase word, a period, or a paragraph break ends it.
A word is ALL CAPS when it has at least 2 letters and contains no lowercase letter. Isolated ALL-CAPS tokens are treated as abbreviations and belong to lexicon.unexplained-abbreviation.
Code blocks are excluded by the Markdown parser before the rule runs.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_run_length | int | 3 | 2 | 2 |
dev-doc tolerates a 2-word emphasis (DO NOT), which is common in technical documentation.
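The run detection can be sketched as follows. This is a Python illustration, not the Rust source; the tokenisation regex and function names are assumptions. It treats any token that is neither an ALL-CAPS word nor a minor connector as the end of a run, which covers lowercase words and periods.

```python
import re

# Punctuation that keeps a run alive; whitespace is skipped by the tokeniser.
MINOR_CONNECTORS = {",", ";", ":", "-"}


def is_caps_word(token: str) -> bool:
    """At least 2 letters and no lowercase letter."""
    letters = [c for c in token if c.isalpha()]
    return len(letters) >= 2 and not any(c.islower() for c in letters)


def caps_runs(paragraph: str, min_run_length: int = 2) -> list:
    """Return each run of consecutive ALL-CAPS words of sufficient length."""
    runs, current = [], []

    def close_run() -> None:
        if len(current) >= min_run_length:
            runs.append(" ".join(current))
        current.clear()

    # Tokens are either words or single punctuation marks.
    for token in re.findall(r"\w+|[^\w\s]", paragraph):
        if is_caps_word(token):
            current.append(token)
        elif token not in MINOR_CONNECTORS:
            close_run()
    close_run()
    return runs
```

"Please DO NOT touch this." yields one run at the public threshold of 2 and none at the dev-doc threshold of 3, while a chain such as "API HTTP TLS" is reported even at 3.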
lucid-lint flags; the rewrite is always yours.
A single emphasis phrase, colour-coded in the rewritten version: the shout becomes typographic emphasis without losing its insistence.
Before (flagged):
Please DO NOT touch this.
DO NOT reads as a shout.
What lucid-lint check --profile public reports:
warning input.md:1:8 2 consecutive ALL-CAPS words read as shouting and degrade legibility for dyslexic readers. Use sentence case and rely on emphasis (italics, bold) or a callout instead. [lexicon.all-caps-shouting]
After (your rewrite):
Please do not touch this.
A chain of three or more acronyms in prose (API HTTP TLS) is structurally indistinguishable from a shout and will trigger the rule. Suppress it on the line if the chain is intentional, or restructure the sentence.
See Suppressing diagnostics (EN page for now).
- lexicon.unexplained-abbreviation
See References for the full bibliography.
lexicon.redundant-intensifier
Redundant intensifiers.
Intensifiers are adverbs that try to boost the confidence of a claim without adding any information. très important reduces to important, or better, to a quantified claim. plainlanguage.gov (chapter 4) and the CDC Clear Communication Index list intensifiers as a plain-language anti-pattern.
The rule is the deliberate counterpart of lexicon.weasel-words: weasel words weaken confidence (hedges, qualifications); redundant intensifiers inflate it. The two lists are disjoint by construction.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | general |
| Languages | EN · FR |
| Source | src/rules/redundant_intensifier.rs |
For each paragraph, the text is lowercased, then each intensifier in the per-language list (en::INTENSIFIERS, fr::INTENSIFIERS) is looked up with the shared word-boundary search. Hits inside a code span (fenced or inline) are ignored. Documents whose language is Unknown are skipped, not guessed, in parallel with lexicon.weasel-words.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| custom_intensifiers_en | list<string> | [] | [] | [] |
| custom_intensifiers_fr | list<string> | [] | [] | [] |
| disable | list<string> | [] | [] | [] |
custom_intensifiers_en / _fr add phrases to the defaults. disable removes phrases from those defaults (exact lowercase match).
- très in the set phrase très bien (as an acknowledgement) still fires: plain-language guides flag it anyway, and the rule carves out no exception. Suppress it with an inline directive if the context truly requires it.
See Suppressing diagnostics (EN page for now).
- lexicon.weasel-words
- lexicon.jargon-undefined
See References for the full bibliography.
lexicon.consonant-cluster
Consonant clusters.
Words whose longest run of consecutive consonants reaches or exceeds a per-profile threshold. Dense consonant clusters are a known decoding barrier for dyslexic readers (BDA Dyslexia Style Guide): the reader must hold more phonemes in working memory before the next vowel "releases" the syllable.
Typical English examples at the public threshold of 5: strengths (n-g-t-h-s) and twelfths (l-f-t-h-s); sixths (x-t-h-s) has a run of 4 and fires only at the falc threshold. Typical French example at the falc threshold of 4: constructions (n-s-t-r).
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | dyslexia, general |
| Languages | EN · FR |
| Source | src/rules/consonant_cluster.rs |
For each source line, the grapheme stream is walked once. A word is a maximal run of alphabetic characters; hyphens, apostrophes, and spaces close the word (so dys-lexique counts as two words, not a ten-letter cluster). Inside a word, the rule tracks the longest run of consecutive consonants. One diagnostic is emitted per word whose longest run reaches min_run_length.
Vowels are language-sensitive: the French accented forms (é, è, ê, à, â, î, ï, ô, ö, ù, û, ü, ÿ, œ, æ) count as vowels. The English fallback accepts the common accented Latin-1 vowels so that loanwords (café, naïve) decode correctly. y is treated as a vowel in every language (a lenient choice), which avoids annoying false positives on words like fly and rhythm.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_run_length | int | 6 | 5 | 4 |
dev-doc is lenient: technical prose routinely names things like strengths or benchmarks. falc (general-public audience) catches any run of 4 consonants.
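The longest-run computation can be sketched as follows. This is a Python illustration, not the Rust source. It uses one shared vowel set (plain vowels, y, and the accented forms listed above) for both languages, which is a simplification; the function names are hypothetical.

```python
import re

# Plain vowels, y (treated as a vowel), and the accented forms listed above.
VOWELS = set("aeiouy" "éèêàâîïôöùûüÿœæ")


def longest_consonant_run(word: str) -> int:
    """Length of the longest run of consecutive consonants in one word."""
    best = run = 0
    for ch in word.lower():
        if ch.isalpha() and ch not in VOWELS:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


def flagged_words(line: str, min_run_length: int) -> list:
    """Words on the line whose longest run reaches min_run_length."""
    # Hyphens, apostrophes, and spaces end a word, as described above.
    words = re.findall(r"[^\W\d_]+", line)
    return [w for w in words if longest_consonant_run(w) >= min_run_length]
```

Under this sketch strengths and twelfths reach 5, sixths and constructions reach 4, and rhythm stays at 3 because y counts as a vowel.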
- A run that contains a familiar letter group such as tch (as in hatching) reads fluently for most readers, even though the rule counts each consonant separately. Suppress with an inline directive when a hit is unavoidable.
- Content in a language other than en or fr: in practice, such content falls outside the scope of a bilingual EN/FR linter.
See Suppressing diagnostics (EN page for now).
- lexicon.all-caps-shouting
- readability.score (EN page for now)
See References for the full bibliography.
lexicon.homophone-density
Homophone density too high.
Experimental in v0.2.x. Disabled by default; enable it with --experimental lexicon.homophone-density or [experimental] enabled = ["lexicon.homophone-density"] in lucid-lint.toml. It moves to Stable at the v0.3 tag as part of the F-experimental-rule-status cohort. See Conditions for the dyslexia and aphasia tags, which gate this rule according to the active conditions.
Paragraphs whose share of homophones exceeds a configurable percentage. Homophones are words that sound the same but are spelled differently (their / there / they're, to / too / two, cours / court, amande / amende). They force a double pass: the ear recognises the word, then the eye must pick the right spelling from context. That detour is harmless in isolation and costly in clusters. The British Dyslexia Association guide cites homophones as a known friction point for dyslexic reading, and the FALC clear-spelling recommendations advise rephrasing dense passages for aphasic readers and "easy-to-read" audiences.
| Category | lexicon |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at the v0.3 tag |
| Condition tags | dyslexia, aphasia (gated; runs only with a matching --conditions) |
| Languages | EN · FR (language-specific homophone lists) |
| Source | src/rules/lexicon/homophone_density.rs |
For each paragraph, the rule walks the word stream once, counts alphabetic words for the denominator, and counts as "occurrences" the words that appear in the language's homophone table. If occurrences / total strictly exceeds the profile threshold, it emits a diagnostic anchored on the first line of the paragraph. Paragraphs with fewer than 20 content words are skipped: below that floor, a single homophone produces a misleading double-digit percentage. The diagnostic message quotes up to two homophones actually found, so the location stays the paragraph while the rewrite hints stay concrete.
The homophone tables (HOMOPHONE_GROUPS_EN, HOMOPHONE_GROUPS_FR in src/language/) favour content-word pairs whose spelling confusion really changes the meaning. Very frequent French function-word homophones (et / est, a / à, ou / où) are deliberately excluded: they appear in almost every sentence and would push the baseline density above every threshold, drowning the signal the rule is meant to capture.
When the detected language is Unknown, the rule has no table to apply and silently abstains rather than guessing.
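As a rough sketch of the density check, assuming a flat slice of lowercase homophones stands in for the real HOMOPHONE_GROUPS_* tables (whose exact shape and contents are not shown on this page), the computation could look like this. The function name and the tokenisation are hypothetical.

```rust
/// Paragraphs below this many words are skipped, as described in the text.
const MIN_CONTENT_WORDS: usize = 20;

/// Returns `Some(density_percent)` when the paragraph would be flagged, `None` otherwise.
/// `table` is an illustrative flat list of lowercase homophones.
fn homophone_density(paragraph: &str, table: &[&str], max_density_percent: f64) -> Option<f64> {
    // Words are runs of letters; an apostrophe is kept so that "they're" stays one word.
    let words: Vec<String> = paragraph
        .split(|c: char| !(c.is_alphabetic() || c == '\''))
        .filter(|w| w.chars().any(char::is_alphabetic))
        .map(|w| w.to_lowercase())
        .collect();

    if words.len() < MIN_CONTENT_WORDS {
        return None;
    }

    let hits = words.iter().filter(|w| table.contains(&w.as_str())).count();
    let density = 100.0 * hits as f64 / words.len() as f64;

    // Strictly greater than the threshold, matching the rule description.
    (density > max_density_percent).then_some(density)
}
```

Returning the density rather than a bare boolean keeps the percentage available for the diagnostic message.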
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_density_percent | float | 8.0 | 5.0 | 3.0 |
To adjust it in lucid-lint.toml:
[rules."lexicon.homophone-density"]
max_density_percent = 4.0
Before (flagged):
Their report shows there were too many decisions to make and two teams could not affect the launch nor lose the schedule despite careful planning across each region and product line every quarter.
What lucid-lint check --profile public --experimental lexicon.homophone-density --conditions dyslexia reports:
warning input.md:1:1 Paragraph density of homophones is 21.2% (7 of 33 content words (e.g. their, there)); maximum 5.0%. Dense homophone runs raise the phonological-decoding load for dyslexic and aphasic readers; rephrase to disambiguate. [lexicon.homophone-density]
After (proposed rewrite):
The report shows that the team made many decisions and that the two squads kept the launch on schedule despite careful planning across each region and product line every quarter.
The rewrite replaces their / there / to / too with context-anchored wording (the report, that, the team, kept, the two squads), which brings the density well below the threshold.
Before (flagged):
Pendant le cours du matin la cuisinière prépare le foie de veau avant la pause de midi puis revient à sa tâche après avoir rangé les ustensiles sur la grande table en bois clair.
What lucid-lint check --profile public --experimental lexicon.homophone-density --conditions dyslexia reports:
warning input.md:1:1 Paragraph density of homophones is 11.8% (4 of 34 content words (e.g. cours, foie)); maximum 5.0%. Dense homophone runs raise the phonological-decoding load for dyslexic and aphasic readers; rephrase to disambiguate. [lexicon.homophone-density]
After (proposed rewrite):
Pendant la séance du matin la cuisinière prépare le foie de veau avant la coupure de midi puis reprend son travail après avoir rangé les ustensiles sur la grande table en bois clair.
cours becomes séance, pause becomes coupure, tâche becomes travail: three of the four occurrences disappear with no loss of meaning.
See Suppressing diagnostics for the inline and block forms. The inline directive works on this rule:
<!-- lucid-lint disable-next-line lexicon.homophone-density -->
Their report shows there were too many decisions to make and two teams could not lose the launch.
dyslexia and aphasia, which gate this rule.
See References for the full bibliography.
syntax.passive-voice
Passive voice.
Passive-voice constructions. The passive hides the agent and lengthens the sentence without adding information. Legitimate exceptions exist (unknown agent, scientific style, deliberate emphasis on the action): the rule flags, the author decides.
References. US Plain Language; Strunk & White; FALC.
| Category | syntax |
| Default severity | warning |
| Default weight | 2 |
| Languages | EN · FR (separate heuristics) |
| Source | src/rules/passive_voice.rs |
EN: be (conjugated) + past participle [+ by …]. Handles the regular -ed form and the table of irregular participles.
FR: être (conjugated) + past participle [+ par …], plus se faire + infinitive. Harder than English because of participle agreement (gender/number) and confusion with (a) the subject complement (il est content vs il est vu) and (b) the auxiliary être in compound tenses (elle est partie: passé composé, active).
Expected precision is about 70–80 %. A replacement based on a morphosyntactic parser is planned for a future lucid-lint-nlp plugin.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_per_paragraph | int | 3 | 1 | 0 |
| ignore_scientific_style | bool | false | false | false |
For deliberate passives, use an inline directive. See Suppressing diagnostics (EN page for now).
See References for the full bibliography.
syntax.unclear-antecedent
Unclear antecedent.
Pronouns whose antecedent is not obvious from the immediate context. Ambiguous pronoun reference is one of the costliest comprehension breaks for readers with attention disorders: each ambiguity forces a conscious backtrack to find the antecedent.
References. Strunk & White; FALC ("prefer repeating the noun to using a pronoun"); Graesser et al., Coh-Metrix (referential cohesion).
| Category | syntax |
| Default severity | info |
| Default weight | 2 |
| Languages | EN · FR (separate pronoun lists) |
| Source | src/rules/unclear_antecedent.rs |
Exact detection requires anaphora resolution (an advanced natural-language-processing problem). v0.1 catches the two most frequent patterns:
Demonstratives (This/That/These/Those, Ceci/Cela/Ce) not followed by a noun.
Severity is info because the heuristic is approximate: the noise level justifies a gentle severity.
| Key | Type | Default |
|---|---|---|
| check_demonstratives | bool | true |
| check_paragraph_start_pronouns | bool | true |
Les performances étaient médiocres avec le cache LRU. Cela a motivé le changement.
What does cela refer to? The performance? The cache? Ambiguous.
See Suppressing diagnostics (EN page for now).
See References for the full bibliography.
syntax.nested-negation
Nested negations.
Sentences that stack several negations. Two or more negations in one sentence force the reader to flip truth values mentally. The load is documented for aphasic readers and for readers with attention deficit hyperactivity disorder (ADHD). The cost multiplies under cognitive pressure. Plain-language guides (FALC, CDC Clear Communication Index, plainlanguage.gov) recommend rewriting double negatives as positive statements.
| Category | syntax |
| Default severity | warning |
| Default weight | 2 |
| Condition tags | aphasia, adhd, general |
| Languages | EN · FR (language-specific counting) |
| Source | src/rules/nested_negation.rs |
The rule counts negations per sentence and flags sentences whose count exceeds max_negations.
EN: negation words (not, no, never, none, nothing, nobody, nowhere, neither, nor, cannot, without) plus occurrences of the contracted suffix n't (don't, won't, isn't, doesn't, …).
FR: ne / n' contributes one negation and pairs with the nearest second-position particle (pas, rien, jamais, plus, personne, aucun, aucune, guère, nulle part) within a short window; pairing simply consumes the particle to avoid double counting. Unpaired particles in a sentence that contains ne add one more each, which catches forms such as rien used as a negative nominal subject. Safeguards: pas / plus never count without pairing (too ambiguous outside ne …); rien preceded by de is treated as the idiom de rien and ignored; particles in a sentence with no ne clitic are ignored as well (plus de courage, personne d'autre). The standalone sans / non always count.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_negations | int | 3 | 2 | 1 |
lucid-lint flags; the rewrite is up to the author.
Passes under public:
Nous ne sommes pas prêts.
The two-part ne … pas counts as one negation.
Before (flagged):
Nous ne disons pas que rien n’est jamais possible.
Three negations: ne…pas (one two-part negation), rien (unpaired), n'…jamais (one two-part negation).
What lucid-lint check --profile public reports:
warning input.md:1:1 Sentence stacks 3 negations (maximum 2). Rewrite as a positive statement or split the negations across separate sentences. [syntax.nested-negation]
After (your rewrite):
Nous disons que quelque chose est possible.
Three negations become three affirmations, with matching tints from one end of the rewrite to the other. The not simply disappears, and the simplification is visible.
Before (flagged):
We do not say nothing is never possible.
Three negations (not, nothing, never).
After:
We say something is possible.
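The English side of the counting can be approximated with a short Rust sketch. It assumes a simple letter-and-apostrophe tokeniser and covers only the word list and the n't suffix named earlier in this section; the French pairing logic is deliberately left out, and the function names are hypothetical.

```rust
/// English negation words listed in the rule description.
const NEGATION_WORDS: [&str; 11] = [
    "not", "no", "never", "none", "nothing", "nobody",
    "nowhere", "neither", "nor", "cannot", "without",
];

/// Counts negations in one English sentence: listed words plus the contracted suffix n't.
fn count_negations_en(sentence: &str) -> usize {
    sentence
        // Keep straight and typographic apostrophes inside tokens so "don't" survives.
        .split(|c: char| !(c.is_alphabetic() || c == '\'' || c == '’'))
        .filter(|token| !token.is_empty())
        .map(|token| token.to_lowercase().replace('’', "'"))
        .filter(|token| NEGATION_WORDS.contains(&token.as_str()) || token.ends_with("n't"))
        .count()
}

/// A sentence is flagged when its count exceeds `max_negations`.
fn stacks_negations(sentence: &str, max_negations: usize) -> bool {
    count_negations_en(sentence) > max_negations
}
```

On the flagged English example this counts three negations (not, nothing, never), which exceeds the public maximum of 2; the rewritten sentence counts none.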
See Suppressing diagnostics (EN page for now).
syntax.passive-voice
structure.deep-subordination
See References for the full bibliography.
syntax.conditional-stacking
Conditional stacking.
Sentences that chain several conditional clauses. Each if / when / unless / quand / si opens a branch that the reader must keep on a mental stack until the enclosing clause resolves. Two or three stacked in one sentence are a known load multiplier. The effect hits readers with aphasia or attention deficit hyperactivity disorder (ADHD), and anyone under cognitive pressure. Plain-language guides (FALC, plainlanguage.gov) recommend splitting conditional chains into separate sentences or a bullet list.
| Category | syntax |
| Default severity | warning |
| Default weight | 2 |
| Condition tags | aphasia, adhd, general |
| Languages | EN · FR (language-specific lists) |
| Source | src/rules/conditional_stacking.rs |
For each sentence, the rule counts conditional connectors and flags counts above max_conditionals.
EN: connectors (if, unless, when, whenever, while, until, provided, assuming, in case, as long as, as soon as, even if, only if).
FR: connectors (si, sauf si, à moins que, à moins de, quand, lorsque, lorsqu', dès que, tant que, pourvu que, à condition que, à condition de, au cas où, même si, en cas de) plus the elided clitics s'il / s'ils.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_conditionals | int | 3 | 2 | 1 |
Three conditions, with matching tints from one end of the rewrite to the other: position already paired them, and colour confirms that the rewrite keeps every branch. lucid-lint flags; the rewrite is up to the author.
Before (flagged):
Si nous expédions, quand le test passe, à moins que la barrière échoue, nous déployons.
Three conditional connectors (si, quand, à moins que).
What lucid-lint check --profile public reports:
warning input.md:1:1 Sentence stacks 3 conditional clauses (maximum 2). Split the conditions across separate sentences or convert them to a bullet list. [syntax.conditional-stacking]
After (your rewrite):
Nous déployons quand les trois conditions tiennent :
- la commande d’expédition a tourné,
- le test passe,
- la barrière n’échoue pas.
Before (flagged):
If we ship, when the build passes, unless the gate fails, we deploy.
After:
We deploy when all three checks hold:
- the ship command ran,
- the build passes,
- the gate does not fail.
The English list mixes pure conditionals with temporal conjunctions (when, while) that can introduce clauses with conditional force. A purely temporal use can produce a false positive on long sentences. Use disable-next-line (EN page for now) when the temporal reading is unambiguous.
See Suppressing diagnostics (EN page for now).
syntax.nested-negation
structure.deep-subordination
See References for the full bibliography.
syntax.dense-punctuation-burst
Punctuation burst.
Local bursts of punctuation: a sliding window of graphemes that contains too many qualifying marks (,, ;, :, —, –). Tight clusters of marks point to stacked subordination, parenthetical asides or lists within lists. These constructions are hard to parse for readers with cognitive or attention disorders (IFLA guidelines for easy-to-read materials).
Not to be confused with structure.excessive-commas, which counts commas over a whole sentence. A sentence with 8 commas spread over 200 characters does not fire here, whereas a sentence with 3 commas within 30 characters does.
| Category | syntax |
| Default severity | warning |
| Default weight | 1 |
| Condition tags | general |
| Languages | EN · FR (script-agnostic) |
| Source | src/rules/dense_punctuation_burst.rs |
For each source line, the rule walks the grapheme stream once and records the column of every qualifying mark. When a window of window_graphemes graphemes contains min_marks marks or more, it emits a burst that spans from the first to the last mark in the window. It then advances past that last mark so that overlapping windows do not fire twice on the same cluster.
Code blocks (fenced and indented) are excluded upstream by the Markdown parser. Sentence terminators (., !, ?) and parentheses do not count towards the burst.
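The window logic above can be sketched as follows. This is an illustration under stated assumptions: columns are 0-based char offsets rather than grapheme offsets, the window is taken to open at a mark and cover the next window columns, and find_bursts is a hypothetical name, not the function in src/rules/dense_punctuation_burst.rs.

```rust
/// Marks that qualify for a burst, as listed in the rule description.
const QUALIFYING_MARKS: [char; 5] = [',', ';', ':', '—', '–'];

/// Returns (first_column, last_column) for each burst on a line.
fn find_bursts(line: &str, min_marks: usize, window: usize) -> Vec<(usize, usize)> {
    // Column of every qualifying mark, in order of appearance.
    let columns: Vec<usize> = line
        .chars()
        .enumerate()
        .filter(|(_, c)| QUALIFYING_MARKS.contains(c))
        .map(|(column, _)| column)
        .collect();

    let mut bursts = Vec::new();
    let mut start = 0;
    while start < columns.len() {
        // Extend `end` while the next mark still falls inside the window opened at `start`.
        let mut end = start;
        while end + 1 < columns.len() && columns[end + 1] - columns[start] < window {
            end += 1;
        }
        if end - start + 1 >= min_marks {
            bursts.push((columns[start], columns[end]));
            // Skip past the last mark so overlapping windows do not fire twice.
            start = end + 1;
        } else {
            start += 1;
        }
    }
    bursts
}
```

Three marks packed into a few columns produce one burst at min_marks = 3, while the same three marks spread 40 columns apart produce none.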
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_marks | int | 4 | 3 | 3 |
| window_graphemes | int | 30 | 30 | 40 |
dev-doc tolerates a cluster of 3 marks, typical of technical lists that brush against prose. FALC keeps the same density threshold as public but widens the window to catch slightly looser bursts.
The em dash (—, U+2014) and the en dash (–, U+2013) qualify; the ASCII double-hyphen stand-in (--) does not, on the assumption that authors who care about readability use the proper Unicode forms.
See Suppressing diagnostics (EN page for now).
structure.excessive-commas
structure.sentence-too-long
See References for the full bibliography.
syntax.parenthetical-depth
Experimental in v0.2.x. Disabled by default; enable it with --experimental syntax.parenthetical-depth or [experimental] enabled = ["syntax.parenthetical-depth"] in lucid-lint.toml. It moves to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort. See Conditions for the adhd and general tags.
A sentence whose maximum nesting depth across balanced brackets () and [] reaches the profile threshold. Stacked parentheses force the reader to hold several suspended ideas in memory at once. This is a recognised "hard sentence" signal in the plainlanguage.gov and Hemingway tradition, and a particular cost for readers with ADHD, who bear the working-memory load first.
The rule complements structure.excessive-commas, which already ignores flat enumerations (A, B, C) at depth 1. This rule fires only from depth 2 upwards; the two rules are mechanically orthogonal.
| Category | syntax |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at the v0.3 cut |
| Condition tags | adhd, general (filtered: runs only when --conditions matches) |
| Languages | EN · FR (language-independent: the bracket families are identical) |
| Source | src/rules/syntax/parenthetical_depth.rs |
For each sentence, the rule walks the paragraph text once it has been flattened by the parser (so code blocks are already excluded upstream) and keeps a single running depth counter.
Increment on ( or [; decrement on ) or ].
parenthesised_list_comma_count used by structure.excessive-commas.
max_depth ≥ the profile threshold, anchored on the deepest opening bracket.
Long-dash pairs (— … —), braces ({}) and comma-delimited appositions are deliberately out of scope in v0.2.x. Detecting a long-dash pair is fragile (en dash / em dash confusion, ambiguity with the hyphen) and would let scope creep back in through the window.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_depth | int | 4 | 3 | 2 |
max_depth is the inclusive nesting depth at which the rule fires. A sentence whose deepest bracket stays one level below it remains silent.
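A minimal sketch of the single-counter walk, assuming the sentence is already isolated as a string. The function names are hypothetical, and the anchoring of the diagnostic on the deepest opening bracket is left out.

```rust
/// Maximum nesting depth reached across `()` and `[]`, tracked with one shared counter.
fn max_bracket_depth(sentence: &str) -> usize {
    let mut depth: usize = 0;
    let mut max_depth = 0;
    for c in sentence.chars() {
        match c {
            '(' | '[' => {
                depth += 1;
                max_depth = max_depth.max(depth);
            }
            // saturating_sub keeps a stray closing bracket from underflowing the counter.
            ')' | ']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    max_depth
}

/// The rule fires when the deepest bracket reaches the profile threshold (inclusive).
fn reaches_threshold(sentence: &str, max_depth: usize) -> bool {
    max_bracket_depth(sentence) >= max_depth
}
```

A flat enumeration such as (A, B, C) stays at depth 1 and never reaches the falc threshold of 2.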
To tune it in lucid-lint.toml:
[rules."syntax.parenthetical-depth"]
max_depth = 3
Before (flagged):
The migration tool (which now supports rollbacks (see
--reverse, added in 0.4.2 [tracked in #312])) is opt-in.
What lucid-lint check --profile public --experimental syntax.parenthetical-depth --conditions adhd reports:
warning input.md:1:21 Nested parentheticals reach depth 3; readers must hold 3 suspended thoughts to reach the close. Split the sentence or unnest the inner bracket (plainlanguage.gov, Hemingway). [syntax.parenthetical-depth]
After (proposed rewrite):
The migration tool is opt-in. It now supports rollbacks via
--reverse, added in 0.4.2 (tracked in #312).
The two outer parentheticals are gone; only one flat parenthesis at depth 1 remains. The reader no longer has to stack three suspended thoughts to reach the full stop.
Before (flagged):
Le module (qui dépend du noyau (chargé au démarrage [voir le manuel])) est facultatif.
What lucid-lint check --profile public --experimental syntax.parenthetical-depth --conditions adhd reports:
warning input.md:1:23 Nested parentheticals reach depth 3; readers must hold 3 suspended thoughts to reach the close. Split the sentence or unnest the inner bracket (plainlanguage.gov, Hemingway). [syntax.parenthetical-depth]
After (proposed rewrite):
Le module est facultatif. Il dépend du noyau, chargé au démarrage. Voir le manuel pour les détails.
Three sentences, no nested brackets. The dependency chain is now linear, and the reader picks up each fact in the order it appears.
See Suppressing diagnostics for the inline and block forms. Inline disabling works on this rule too:
<!-- lucid-lint disable-next-line syntax.parenthetical-depth -->
The migration tool (which now supports rollbacks (see `--reverse`, added in 0.4.2 [tracked in #312])) is opt-in.
adhd and general, which filter this rule.
structure.excessive-commas: sibling rule on flat parenthesised enumerations. Atomic split: excessive-commas ignores (A, B, C) lists at depth 1; this rule fires only from depth 2 upwards.
syntax.dense-punctuation-burst: sibling rule on local punctuation density. Both rules flag hard-to-parse sentences from two different angles.
See References for the full bibliography.
readability.score
Readability score.
A document-level readability index. Readability formulas are the historical synthetic signal of text complexity: simple, reproducible, and recognised by US/UK government guides and by WCAG. Treat it like cyclomatic complexity: a metric first, a warning second.
| Category | readability |
| Default severity | info (always reported) · warning when above max_grade_level |
| Default weight | 5 |
| Languages | EN: Flesch-Kincaid · FR: Kandel-Moles (auto-selected from the detected language; v0.2+) |
| Source | src/rules/readability_score.rs |
The formula is selected from the detected language of the document:
English: Flesch-Kincaid Grade Level:
0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59
The result is a US school grade level. It is compared directly with max_grade_level.
French: Kandel & Moles (1958):
207 − 1.015 × (words / sentences) − 73.6 × (syllables / words)
The result is a reading-ease score, typically in 0..100 (higher = easier), in the style of Flesch. To stay comparable across languages, the rule converts it into a grade-level equivalent with the standard linear approximation (100 − score) / 10 and compares that level with max_grade_level. The diagnostic message reports both the native ease score and the grade-level equivalent.
Unknown language: falls back to Flesch-Kincaid.
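The two formulas and the ease-to-grade conversion translate directly into code. The sketch below assumes the word, sentence and syllable counts are already available as floating-point numbers; the function names are illustrative and syllable counting itself is out of scope.

```rust
/// Flesch-Kincaid Grade Level (English): a US school grade.
fn flesch_kincaid_grade(words: f64, sentences: f64, syllables: f64) -> f64 {
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
}

/// Kandel & Moles (French): a reading-ease score, higher means easier.
fn kandel_moles_ease(words: f64, sentences: f64, syllables: f64) -> f64 {
    207.0 - 1.015 * (words / sentences) - 73.6 * (syllables / words)
}

/// Linear approximation from an ease score to a grade-level equivalent.
fn ease_to_grade(ease: f64) -> f64 {
    (100.0 - ease) / 10.0
}

/// The severity rises to warning when the grade is above `max_grade_level`.
fn exceeds_max_grade(grade: f64, max_grade_level: f64) -> bool {
    grade > max_grade_level
}
```

For a text of 100 words, 5 sentences and 150 syllables, Flesch-Kincaid gives 0.39 × 20 + 11.8 × 1.5 − 15.59 = 9.91, just above the public limit of 9. Kandel-Moles gives 207 − 20.3 − 110.4 = 76.3, a grade equivalent of 2.37.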
| Level | School equivalent (FR) |
|---|---|
| < 6 | Primary school (primaire) |
| 6–9 | Lower secondary (collège) |
| 9–12 | Upper secondary (lycée) |
| 12–16 | Higher education |
| > 16 | Expert |
Other formulas (Gunning Fog, SMOG, Dale-Chall, Scolarius) and a multi-formula --readability-verbose report remain on the roadmap.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| max_grade_level | float | 14 | 9 | 6 |
| always_report | bool | true | true | true |
| formula | auto \| flesch-kincaid \| kandel-moles | auto | auto | auto |
formula can be overridden with --readability-formula on the command line; auto follows the detected language, and the other values pin the formula.
info (for observability, even below the threshold).
warning when the level exceeds max_grade_level.
Suppressing a document-level metric is rarely the right answer; adjust max_grade_level in lucid-lint.toml instead. See Configuration (EN page for now).
See References for the full bibliography.
readability.large-number-unanchored
Experimental in v0.2.x. Disabled by default; enable it with --experimental readability.large-number-unanchored or [experimental] enabled = ["readability.large-number-unanchored"] in lucid-lint.toml. It moves to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort. See Conditions for the dyscalculia and general tags.
A large number or an order-of-magnitude word that appears in a sentence with no nearby anchor: no unit, no percentage, no currency symbol, no ratio, no comparison phrase. The CDC Clear Communication Index asks whether numbers are clear and useful for the intended audience; plainlanguage.gov is more direct about the mechanism: "Use Numbers Effectively" recommends pairing every large number with a comparison or a denominator the reader can place. Readers with dyscalculia bear this cost first: a "4,8 milliards" out of context forces a blind order-of-magnitude estimate, where ordinary prose usually provides support.
The rule complements structure.number-run, which fires on numeric clusters (≥ N tokens in one sentence). This rule fires on a single large number or order-of-magnitude word with no anchor.
| Category | readability |
| Default severity | warning |
| Default weight | 1 |
| Status | experimental (v0.2.x) → stable at the v0.3 cut |
| Condition tags | dyscalculia, general (filtered; runs only with a matching --conditions) |
| Languages | EN · FR (per-language lexicons of comparators and figure/page references) |
| Source | src/rules/readability/large_number_unanchored.rs |
For each sentence, the rule walks the paragraph text (after flattening, so fenced code blocks are already excluded by the parser) and looks for unanchored candidates.
A sentence-level candidate is one of:
,, ., ASCII space, NBSP, thin space, narrow NBSP) between digit groups, so 1 000 (FR) and 1,000 (EN) both count as a single 4-digit token with value 1000.
million(s), milliard(s), billion(s), trillion(s) in FR; million(s), billion(s), trillion(s) in EN. Whole word, case-insensitive.
1000..=2999. 2024 and 1789 are years, not orders of magnitude.
1st, 12th).
figure, page, section, tableau, chapitre, annexe, §, p., pp., n°, #, or the EN equivalents.
Any one of the items below, anywhere in the sentence, anchors every candidate in the sentence:
%).
€, $, £, ¥).
km, kg, m², °C, L, Hz, Mo, …).
X sur Y, X out of Y, or X / Y between digits.
soit environ, équivalent à, environ, plus de, par rapport à, … ; EN: roughly, approximately, more than, the size of, …).
The diagnostic position points at the first surviving candidate in the offending sentence, so that the highlight lands on the visible number rather than on the start of the sentence.
| Key | Type | dev-doc | public | falc |
|---|---|---|---|---|
| min_value | int | 100000 | 10000 | 1000 |
min_value is the inclusive lower bound on the integer value of a numeric candidate. Tokens that pass the digit-count filter but whose value is below min_value are ignored. Page-number-like quantities already go through the figure/page reference filter; this parameter is a second safety net.
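To make the separator tolerance and the min_value bound concrete, here is a small sketch. It covers only the numeric value of a grouped token; the year, ordinal and reference exclusions and the anchor search are not modelled, and the names are hypothetical.

```rust
/// Separators tolerated between digit groups, as listed in the rule description:
/// comma, full stop, ASCII space, NBSP, thin space, narrow NBSP.
const GROUP_SEPARATORS: [char; 6] = [',', '.', ' ', '\u{00A0}', '\u{2009}', '\u{202F}'];

/// Integer value of a grouped numeral such as "1 000" or "1,000".
/// Returns None when the token contains anything other than digits and separators.
fn grouped_value(token: &str) -> Option<u64> {
    let mut digits = String::new();
    for c in token.chars() {
        if c.is_ascii_digit() {
            digits.push(c);
        } else if !GROUP_SEPARATORS.contains(&c) {
            return None;
        }
    }
    // An empty or overflowing digit string fails to parse and yields None.
    digits.parse().ok()
}

/// Inclusive lower bound: values below `min_value` are ignored.
fn is_large_enough(token: &str, min_value: u64) -> bool {
    grouped_value(token).map_or(false, |value| value >= min_value)
}
```

Under these assumptions, "4 800 000 000" evaluates to 4800000000 and clears every profile's min_value, while "1,000" clears only the falc bound.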
To tune it in lucid-lint.toml:
[rules."readability.large-number-unanchored"]
min_value = 50000
Before (flagged):
Le budget atteint 4 800 000 000 selon le rapport final.
What lucid-lint check --profile public --experimental readability.large-number-unanchored --conditions dyscalculia reports:
warning input.md:1:19 Large numeral (10-digit, value ≈ 4800000000) appears with no anchor in this sentence (no unit, percentage, ratio, or comparison phrase). plain-language guidance recommends giving large numbers a comparison or denominator the reader can ground. [readability.large-number-unanchored]
After (your rewrite):
Le budget atteint 4,8 milliards d’euros, soit environ 6 % du PIB selon le rapport final.
The number now comes with a unit (euros), a percentage (6 %) and a comparator phrase (soit environ). A reader who cannot estimate "4,8 milliards" unaided now has three independent anchors.
Before (flagged):
The proposal mentions several billion in vague spending across regions.
After (your rewrite):
The proposal mentions several billion dollars in vague spending across regions, roughly the annual budget of a mid-sized state agency.
The order of magnitude now comes with a unit (dollars) and a comparator phrase (roughly the annual budget).
See Suppressing diagnostics for the inline and block forms. Inline disabling works on this rule too:
<!-- lucid-lint disable-next-line readability.large-number-unanchored -->
Le budget atteint 4 800 000 000 selon le rapport final.
dyscalculia and general, which filter this rule.
structure.number-run: sibling rule on numeric clusters. Atomic split: number-run fires on clusters of numeric tokens; this rule fires on a single large number with no anchor.
structure.mixed-numeric-format: another sibling rule, on numeric-form consistency (digits vs words).
See References for the full bibliography.
lucid-lint is a small Rust crate with a deliberately simple pipeline.
input text
              │
              ▼
┌──────────────────────────┐
│ Language detection       │ stop-word ratio heuristic
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Parser                   │ pulldown-cmark or plain text
│ (Markdown | plain)       │
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Document model           │ Section > Paragraph > Sentence
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Rules                    │ Each rule receives the document + the language
│ (sentence-too-long, ...) │
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Diagnostics              │ rule_id, severity, location, section,
│                          │ message, weight
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐ v0.2+
│ Score                    │ density-normalised, capped per category
│ (Scorecard)              │ 5 fixed categories
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Output formatter         │ TTY (default) or JSON
│                          │ carries the diagnostics + the scorecard
└──────────────────────────┘
Diagnostic: the unit of output. Carries weight (initialised from scoring::default_weight_for) since v0.2.
Rule (trait): fn check(document, language) -> Vec<Diagnostic>.
Document: the parser's output. Section-aware.
Scorecard: the field global: Score, plus [CategoryScore; 5] in the fixed order Structure · Rhythm · Lexicon · Syntax · Readability.
Report: diagnostics + scorecard + word_count, returned by Engine::lint_* since v0.2.
Engine: bundles a profile, a rule set and an optional ScoringConfig; exposes lint_str, lint_file, lint_stdin.
These principles are enforced in code review. See Design decisions for context.
NonZeroU32.
src/
├── lib.rs      - library root
├── main.rs     - binary entry point
├── cli.rs      - clap CLI
├── config.rs   - profile presets, configuration file loading
├── engine.rs   - orchestration
├── language/   - detection + per-language data
├── parser/     - Markdown + plain text + tokeniser + document model
├── rules/      - one file per rule
├── scoring.rs  - hybrid scoring model (v0.2+)
├── output/     - TTY + JSON formatters
└── types.rs    - domain types (Diagnostic, Severity, Location, ...)
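To show how the core types named on this page might fit together, here is a deliberately simplified sketch. Only the field names of Diagnostic, the Option<String> type of section and the shape of Rule::check come from the text. Everything else is an assumption made for illustration: the other field types, the &self receiver, the Language variants, the use of NonZeroU32 for positions, the flat Document and the toy rule. It is not the crate's actual code.

```rust
use std::num::NonZeroU32;

/// Severity levels named in the text (info / warning).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Info,
    Warning,
}

/// Detected language; the variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Language {
    En,
    Fr,
    Unknown,
}

/// 1-based position; NonZeroU32 makes a line or column of 0 unrepresentable.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Location {
    line: NonZeroU32,
    column: NonZeroU32,
}

/// The unit of output. Field names follow the text; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Diagnostic {
    rule_id: &'static str,
    severity: Severity,
    location: Location,
    section: Option<String>,
    message: String,
    weight: u32,
}

/// Heavily simplified stand-in for the Section > Paragraph > Sentence model.
struct Document {
    sentences: Vec<String>,
}

/// Each rule receives the document and the detected language.
trait Rule {
    fn check(&self, document: &Document, language: Language) -> Vec<Diagnostic>;
}

/// Toy rule: flags sentences with more than `max_words` whitespace-separated words.
struct SentenceTooLong {
    max_words: usize,
}

impl Rule for SentenceTooLong {
    fn check(&self, document: &Document, _language: Language) -> Vec<Diagnostic> {
        let mut diagnostics = Vec::new();
        for (index, sentence) in document.sentences.iter().enumerate() {
            let words = sentence.split_whitespace().count();
            if words > self.max_words {
                // One sentence per line in this toy model, so line = index + 1.
                let line = NonZeroU32::new(index as u32 + 1).expect("index + 1 is never zero");
                let column = NonZeroU32::new(1).expect("1 is never zero");
                diagnostics.push(Diagnostic {
                    rule_id: "structure.sentence-too-long",
                    severity: Severity::Warning,
                    location: Location { line, column },
                    section: None,
                    message: format!("Sentence has {words} words (maximum {}).", self.max_words),
                    weight: 1,
                });
            }
        }
        diagnostics
    }
}
```

Keeping check free of I/O and returning a plain Vec<Diagnostic> is what lets an engine run every rule over the same parsed document and aggregate the results afterwards.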
This page records the design decisions made during v0.1 that are worth revisiting before any change.
Decision: v0.1 shipped the classic linter shape, with info / warning severities. v0.2 added a hybrid scoring model (global score + per-category sub-scores + diagnostics) on top, without removing the linter shape.
Reason: shipping the linter shape first let us validate detection quality on real corpora before adding the aggregation layer. The scoring layer is additive: tools that only care about diagnostics ignore the scorecard.
Decision: one global score + 5 per-category sub-scores, all in the form X / max. The composition stacks a weighted sum, density normalisation (per 1,000 words, floor at 200) and a per-category cap. 5 fixed categories: Structure · Rhythm · Lexicon · Syntax · Readability. New field Diagnostic.weight, new --min-score=N command-line option.
Reason (full brainstorm in brainstorm/20260420-score-semantics.md):
X / max plutôt que 0–100 : un maximum arbitraire nous laisse réajuster sans prétendre que le 80 d’aujourd’hui est le 80 de la prochaine version. La compétence /impeccable utilise déjà cette convention.category_of(rule_id) déjà décidée en v0.1. Dériver depuis le préfixe (plan B) a été rejeté : il aurait fallu renommer 17 règles rien que pour F14.DiagnosticDécision : un Diagnostic porte rule_id, severity, location, section, message et (depuis v0.2) weight.
What is NOT stored, and why:
- category: derivable from rule_id via Category::for_rule. Storing it would duplicate information and create a risk of drift.
- suggestion: still deferred; the current messages are actionable on their own.

What IS stored, and why:
- section: recomputing it afterwards would require reparsing the document to walk the headings and match positions. The storage cost is one Option<String> per diagnostic; the recomputation cost is a second full parse.
- weight (v0.2): initialized at emission from scoring::default_weight_for, so that user overrides (via configuration) and rule-level overrides (via with_weight) flow through aggregation without a second lookup.

Decision: the core ships deterministic rules only. LLM-based rules, network-dependent rules and learned-model rules live in optional extension crates (planned for v0.3).
Reason: a pre-commit hook that takes 5 seconds and varies from one run to the next is worse than no hook at all. Determinism is non-negotiable on the happy path.
Decision: every language-dependent rule supports English and French from v0.1 onward.
Reason: most French-speaking open-source developers write their documentation in English. Targeting French alone would miss the majority. Supporting both from day one costs little and signals the ambition.
Decision: v0.1 uses the Flesch-Kincaid grade for all languages. Per-language formulas (Kandel-Moles for French, SMOG, Coleman-Liau) are deferred to v0.2.
Reason: Flesch-Kincaid is well known, reproducible and well understood. Adding three formulas before validating the basics would be premature optimization.
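For reference, the Flesch-Kincaid grade is a fixed linear formula over two ratios: words per sentence and syllables per word. The sketch below applies the published coefficients (Kincaid et al., 1975); the function name and the choice to take pre-computed counts are assumptions, and syllable counting, which is language-dependent, is left to the caller.

```rust
/// Flesch-Kincaid grade level from raw counts.
/// Returns None when the text has no words or no sentences.
fn flesch_kincaid_grade(words: usize, sentences: usize, syllables: usize) -> Option<f64> {
    if words == 0 || sentences == 0 {
        return None;
    }
    let words_per_sentence = words as f64 / sentences as f64;
    let syllables_per_word = syllables as f64 / words as f64;
    // Published coefficients: 0.39, 11.8 and 15.59.
    Some(0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59)
}
```

Because the same coefficients are applied to every language, the result for French text is best read as a relative signal rather than a calibrated school grade.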
Decision: native support for .md, .markdown, .txt and standard input in v0.1. Other formats (AsciiDoc, HTML, docx, PDF) go through Pandoc as a pre-processing step.
Reason: Markdown covers the vast majority of open-source and technical writing. Pandoc is free, ubiquitous, and removes the burden of maintaining several parsers.
Decision: each rule lives in its own file under src/rules/, with a consistent structure (struct, config, impl Rule, tests).
Reason: adding a rule becomes a well-defined operation (one new file from a template), and review is easy (one rule, one PR, one file to read).
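The four-part layout (config, struct, impl Rule, tests) might look like the sketch below. Everything here is hypothetical: the real `Rule` trait is not shown on this page, so this one is a stand-in that returns plain strings instead of diagnostics, and the naive sentence splitting is only there to make the example run.

```rust
// Stand-in trait for illustration; not the crate's actual `Rule` trait.
trait Rule {
    fn id(&self) -> &'static str;
    fn check(&self, text: &str) -> Vec<String>;
}

// 1. Config: the knobs a profile or a user override can set.
struct SentenceTooLongConfig {
    max_words: usize,
}

// 2. Struct: the rule itself, holding its config.
struct SentenceTooLong {
    config: SentenceTooLongConfig,
}

// 3. impl Rule: the detection logic.
impl Rule for SentenceTooLong {
    fn id(&self) -> &'static str {
        "structure.sentence-too-long"
    }

    fn check(&self, text: &str) -> Vec<String> {
        // Naive split on terminal punctuation (assumption for the sketch).
        text.split(|c: char| c == '.' || c == '!' || c == '?')
            .map(|sentence| sentence.split_whitespace().count())
            .filter(|&n| n > self.config.max_words)
            .map(|n| format!("{} ({} words)", self.id(), n))
            .collect()
    }
}

// 4. Tests: kept in the same file as the rule.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn flags_only_the_long_sentence() {
        let rule = SentenceTooLong { config: SentenceTooLongConfig { max_words: 5 } };
        let hits = rule.check("Short one. This sentence has clearly more than five words in it.");
        assert_eq!(hits.len(), 1);
    }
}
```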
Decision: v0.1 detects the language from the stop-word ratio. No external dependency.
Reason: short, deterministic, no runtime cost. For the cases where it fails (very short texts, documents full of code), the unknown fallback value is safe.
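A stop-word ratio detector can be sketched in a few lines. The word lists, the minimum of 5 words and the 0.15 threshold below are illustrative assumptions, not the values the project uses; only the technique and the unknown fallback come from the decision above.

```rust
#[derive(Debug, PartialEq)]
enum Lang {
    En,
    Fr,
    Unknown,
}

// Tiny illustrative lists; a real detector would use longer ones.
const EN_STOPWORDS: &[&str] = &["the", "and", "of", "to", "is", "in", "that", "it"];
const FR_STOPWORDS: &[&str] = &["le", "la", "les", "et", "de", "des", "est", "un", "une"];

/// Share of `words` that appear in `list`. Callers ensure `words` is non-empty.
fn stopword_ratio(words: &[String], list: &[&str]) -> f64 {
    let hits = words.iter().filter(|w| list.contains(&w.as_str())).count();
    hits as f64 / words.len() as f64
}

fn detect_language(text: &str) -> Lang {
    // Lowercased alphabetic tokens; punctuation and digits act as separators.
    let words: Vec<String> = text
        .split(|c: char| !c.is_alphabetic())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect();

    // Hypothetical guard: too little text to decide.
    if words.len() < 5 {
        return Lang::Unknown;
    }

    let en = stopword_ratio(&words, EN_STOPWORDS);
    let fr = stopword_ratio(&words, FR_STOPWORDS);

    // Hypothetical threshold: code-heavy text has few stop words in any language.
    if en.max(fr) < 0.15 {
        Lang::Unknown
    } else if en > fr {
        Lang::En
    } else if fr > en {
        Lang::Fr
    } else {
        Lang::Unknown
    }
}
```

The same input always yields the same answer, and ties or sparse evidence fall back to Unknown rather than guessing.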
Decision: profiles are Profile::DevDoc | Public | Falc. They cannot be defined in the user's configuration in v0.1.
Reason: custom profiles are a speculative abstraction until someone asks for them. Per-rule overrides are enough to cover 95% of the "I want a slightly different preset" cases.
Decision: ROADMAP.md is demoted from edited source to generated artifact. The source of truth becomes a structured set of files under .roadmap/ (git-ignored), one markdown file per feature with TOML front matter, plus narrative fragments. A small Rust workspace member (crates/roadmap-cli) provides the add / generate / validate / rename subcommands. The generator is invoked locally during release preparation; the regenerated ROADMAP.md is committed on the preparation PR. CI does not regenerate. Scoped under F-roadmap-toml-source.
Reason:
- Branch protection on main (in place since 2026-05-03 via F-repo-config-hardening) forces every ROADMAP.md change through the worktree → branch → PR → CI → merge → cleanup cycle. The expected steady-state throughput was 10 to 30 ROADMAP-only changes per week. PR review adds no value on these changes (single author), so the ceremony was pure overhead.
- […] ROADMAP.md would weaken the branch-protection signals tracked by the OpenSSF Scorecard / Best Practices badges. Demoting the file out of main keeps those signals intact.
- […] pulldown-cmark already in the dependencies, folds the tests into cargo test, keeps maintenance on a single toolchain, and remains extractable into a standalone crate if the tool matures.
- […] .roadmap/ (git-ignored and local to the machine). Release cadence, not real time, was an accepted trade-off; the public ROADMAP.md artifact updates on every v* tag.
- […] <a id="…"> (so that existing cross-links of the form [F46](#f46) in PRs and commits keep resolving), an add subcommand that serves as a template (so that creating a feature is a single keystroke, not a regression), and a round-trip determinism test (regenerate the artifact, compare it with the committed version, fail on drift).

Emergency fallback: if the work on crates/roadmap-cli exceeds its budget, the file migrates instead to an orphan roadmap branch with direct push and the same .md shape. This preserves the Scorecard signals through another mechanism, at the cost of a non-standard branch layout. Documented as an escape hatch, not as the chosen path.
- RULES.md: the authoritative rule reference
- ROADMAP.md: upcoming work
- CODING_STANDARDS.md: day-to-day conventions

Translation in progress. The full roadmap is currently available in English only. Its French translation is tracked in F25, the very task that drives the rollout of this French version.
The English version remains the reference until the translation is complete.
Translation in progress. The detailed accessibility page is currently available in English only. Its French translation is tracked in F25 on the roadmap.
In short: the site targets WCAG 2.2 level AAA. It dogfoods lucid-lint on its own prose. Contrast, sizes, keyboard shortcuts and screen-reader compatibility are tested on every release.
First full audit on 2026-04-22: 17 / 20, 0 blockers.
[…] theme/index.hbs is planned (F35a).
Open an issue on GitHub with the accessibility label. Reports are handled on the v0.2 milestone, unless they block a release.
Academic, normative and practical sources that underpin the design of lucid-lint.
This page lists the references that shaped lucid-lint's rules, profiles and design decisions. Each entry states where the reference comes into play in the project. The English mirror is at references.md.
External links open in a new tab; they carry rel="nofollow noopener noreferrer" so that the new tab stays safe and the documentation site does not endorse third-party content.
| Status | Meaning |
|---|---|
| ✅ | Verified: canonical reference |
| ⚠️ | To verify: probably correct, details to confirm |
| 🔍 | Opportunistic: sound reasoning, looser citation |
| 📖 | Book / secondary source |
| 🌐 | Normative standard |
| 🧪 | Practical source (style guide, tool) |

The theoretical foundation of lucid-lint: a text imposes a mental cost on the reader, and that cost can be measured and reduced.
✅ Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285. ↗
Foundational paper of cognitive load theory. Later work in this line distinguishes intrinsic, extraneous and germane load.
→ Applies to: most rules, notably structure.*, rhythm.*, syntax.nested-negation, syntax.conditional-stacking.
📖 Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive Load Theory. Springer. ↗
✅ Graesser, A. C., McNamara, D. S., Louwerse, M. M., & Cai, Z. (2004). Coh-Metrix: Analysis of text on cohesion and language. Behavior Research Methods, Instruments, & Computers, 36(2), 193–202. ↗
Reference paper for automated cohesion analysis.
→ Applies to: rhythm.repetitive-connectors, syntax.unclear-antecedent, lexicon.low-lexical-diversity.
📖 McNamara, D. S., Graesser, A. C., McCarthy, P. M., & Cai, Z. (2014). Automated evaluation of text and discourse with Coh-Metrix. Cambridge University Press. ↗
✅ Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68(1), 1–76. ↗
Foundational paper of Dependency Locality Theory.
→ Applies to: structure.deep-subordination, syntax.unclear-antecedent, syntax.conditional-stacking.
✅ Sanders, T. J. M., & Noordman, L. G. M. (2000). The role of coherence relations and their linguistic markers in text processing. Discourse Processes, 29(1), 37–60. ↗
→ Applies to: rhythm.repetitive-connectors.
✅ Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221–233. ↗
✅ Kincaid, J. P., Fishburne, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas for Navy enlisted personnel. Technical Report, Naval Technical Training Command. ↗
→ Applies to: readability.score.
⚠️ Kandel, L., & Moles, A. (1958). Application de l’indice de Flesch à la langue française. Cahiers Études de Radio-Télévision, 19, 253–274.
⚠️ To verify: pagination and exact title of the periodical. To be checked on Cairn or in a university library.
✅ Henry, G. (1975). Comment mesurer la lisibilité. Labor, Bruxelles. ↗ (review)
French-language reference work proposing the Henry formula. The Persée link points to the review by De Landsheere (1976), for lack of an online publisher page for the book.
→ Applies to: candidate for v0.2 of readability.score.
✅ François, T., & Fairon, C. (2012). An “AI readability” formula for French as a foreign language. EMNLP-CoNLL 2012. ↗
⚠️ Correction: "Scolarius", mentioned during a design session, is a commercial tool from Quebec and not a published academic formula. Do not cite it as a scientific reference.
📖 Herdan, G. (1960). Type-Token Mathematics: A Textbook of Mathematical Linguistics.
→ Applies to: lexicon.low-lexical-diversity.
✅ McCarthy, P. M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–392. ↗
✅ Clark, H. H., & Chase, W. G. (1972). On the process of comparing sentences against pictures. Cognitive Psychology, 3(3), 472–517. ↗
Classic experimental work showing that negative sentences take longer to process than affirmative ones. Foundational evidence that negation carries a comprehension cost.
→ Applies to: syntax.nested-negation.
✅ Carpenter, P. A., & Just, M. A. (1975). Sentence comprehension: A psycholinguistic processing model of verification. Psychological Review, 82(1), 45–73. ↗
Extends Clark & Chase with a formal model of sentence processing. Stacked negations compound the verification cost.
🔍 Kaup, B., Lüdtke, J., & Zwaan, R. A. (2006). Processing negated sentences with contradictory predicates: Is a door that is not open mentally closed? Journal of Pragmatics, 38(7), 1033–1050. ↗
🔍 Johnson-Laird, P. N., & Byrne, R. M. J. (1991). Deduction. Psychology Press. ↗
Mental-model theory of conditional reasoning. Stacked conditionals multiply the number of models the reader has to hold.
→ Applies to: syntax.conditional-stacking.
🔍 Evans, J. St. B. T., & Over, D. E. (2004). If. Oxford University Press. ↗
🔍 Caution: the link between chained conditionals and the reader's cognitive load is intuitive and well supported by the broader literature on reasoning, but the specific rule "more than N conditionals per sentence is harmful" is a practitioner heuristic, not a directly tested threshold. Treat the threshold as configurable and empirically calibrated.
🔍 Arditi, A., & Cho, J. (2007). Letter case and text legibility in normal and low vision. Vision Research, 47(19), 2499–2505. ↗
Empirical evidence for the reading cost of all-caps text: the reader loses the word-shape cues that the ascenders and descenders of mixed case provide.
→ Applies to: lexicon.all-caps-shouting.
🧪 Nielsen, J. (Nielsen Norman Group). Multiple articles on the legibility of all-caps text in interfaces.
→ Applies to: lexicon.all-caps-shouting.
📖 Bringhurst, R. (2013). The Elements of Typographic Style (4th ed.). Hartley & Marks.
Canonical reference in typography.
✅ Legge, G. E., & Bigelow, C. A. (2011). Does print size matter for reading? A review of findings from vision science and typography. Journal of Vision, 11(5). ↗
Review of the vision-science evidence on reading. Covers line-length effects.
→ Applies to: structure.line-length-wide.
🔍 Seidenberg, M. S., Waters, G. S., Barnes, M. A., & Tanenhaus, M. K. (1984). When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning and Verbal Behavior, 23(3), 383–404. ↗
Classic work showing that unusual letter patterns slow down word recognition.
🔍 Treiman, R., Kessler, B., Zevin, J. D., Bick, S., & Davis, M. (2006). Influence of consonantal context on the reading of vowels: Evidence from children. Journal of Experimental Child Psychology, 93(1), 1–24. ↗
Work showing that consonant clusters and their context affect reading accuracy and speed.
🔍 Caution: the rule lexicon.consonant-cluster is grounded in the broader literature on word-form complexity, but a specific validated threshold such as "4+ consonants in a row is harmful" does not come from a single canonical paper. It is a practitioner heuristic informed by the literature, not the direct transposition of a published metric.
🔍 Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A Comprehensive Grammar of the English Language. Longman.
Classic grammar that classifies intensifiers as "amplifiers" whose semantic contribution is often marginal.
→ Applies to: lexicon.redundant-intensifier.
🧪 Zinsser, W. (2006). On Writing Well (30th anniversary ed.). HarperCollins.
Practical guide that argues against intensifying adverbs as clutter.
📖🧪 Strunk, W., & White, E. B. (1999). The Elements of Style (4th ed.). Longman.
→ Applies to: syntax.passive-voice, lexicon.weasel-words, lexicon.redundant-intensifier, syntax.unclear-antecedent.
🧪 US Plain Language Action and Information Network (2011). Federal Plain Language Guidelines. ↗
→ Applies to: structure.sentence-too-long, structure.paragraph-too-long, lexicon.excessive-nominalization, lexicon.jargon-undefined, syntax.passive-voice.
🧪 European Commission (2011). Rédiger clairement. Office des publications de l’Union européenne. ↗
🌐 International Organization for Standardization (2022). ISO 80000-1:2022 — Quantities and units — Part 1: General. ↗
International standard on number formatting, including digit grouping and decimal separators.
→ Applies to: structure.mixed-numeric-format.
🧪 The Chicago Manual of Style (17th ed., 2017). University of Chicago Press. ↗
Canonical style guide covering when to write numbers as words or as digits, and why consistency matters.
⚠️ Martinussen, R., Hayden, J., Hogg-Johnson, S., & Tannock, R. (2005). A meta-analysis of working memory impairments in children with attention-deficit/hyperactivity disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 44(4), 377–384. ↗
⚠️ Caution: research specifically on "text readability for readers with ADHD" is scattered and of variable quality. The "cognitive accessibility" angle is sound, but ADHD-specific claims should be treated with caution.
📖 Barkley, R. A. (2012). Executive Functions: What They Are, How They Work, and Why They Evolved. The Guilford Press. ↗
✅ Rello, L., & Baeza-Yates, R. (2013). Good fonts for dyslexia. Proceedings of ASSETS ’13. ↗
🌐 W3C (2018). Web Content Accessibility Guidelines (WCAG) 2.1. ↗
Key criteria invoked:
- structure.heading-jump
- structure.line-length-wide
- structure.heading-jump
- lexicon.jargon-undefined
- lexicon.unexplained-abbreviation
- readability.score

⚠️ Check the criterion numbers against the WCAG version you want to cite (2.1 or 2.2).
🌐 DINUM (2023). Référentiel Général d’Amélioration de l’Accessibilité (RGAA) version 4.1. ↗
- structure.heading-jump
- lexicon.unexplained-abbreviation

🌐 Inclusion Europe (2009, updated 2014). Information pour tous : Règles européennes pour une information facile à lire et à comprendre.
FALC (Facile À Lire et à Comprendre, Easy-to-Read) reference framework.
→ Applies to: the falc profile is directly inspired by these rules.
🌐 Accessibility Standards Canada (2025). CAN-ASC-3.1:2025, Plain Language (first edition). ↗
First Canadian national standard on plain language, published in a bilingual edition by Accessibility Standards Canada under the Accessible Canada Act. Prescriptive requirements (shall / should / may) across five areas: audience identification, evaluation methods, structure, wording, design. Independently grounds several of our default thresholds for lexicon.*, structure.* and readability.score.
→ Applies to: lexicon.jargon-undefined, lexicon.unexplained-abbreviation, lexicon.weasel-words, structure.sentence-too-long, structure.paragraph-too-long, syntax.passive-voice, readability.score.
🌐 Directive (EU) 2019/882 of the European Parliament and of the Council of 17 April 2019, European Accessibility Act (EAA). ↗
Legal framework extending accessibility requirements to private-sector services from 28 June 2025.
| Rule | Main references |
|---|---|
| readability.score | Flesch (1948); Kincaid et al. (1975); Henry (1975); Kandel & Moles (1958); CAN-ASC-3.1:2025 |

| Rule | Main references |
|---|---|
| rhythm.consecutive-long-sentences | Sweller (1988); Sweller et al. (2011) |
| rhythm.repetitive-connectors | Sanders & Noordman (2000); Graesser et al. (2004) |

| Rule | Main references |
|---|---|
| syntax.conditional-stacking | Johnson-Laird & Byrne (1991); Evans & Over (2004); Gibson (1998); 🔍 practitioner-heuristic threshold |
| syntax.dense-punctuation-burst | Sweller (1988); Gibson (1998); 🔍 purely heuristic |
| syntax.nested-negation | Clark & Chase (1972); Carpenter & Just (1975); Kaup et al. (2006) |
| syntax.passive-voice | Strunk & White; Plain Language US; FALC; CAN-ASC-3.1:2025 |
| syntax.unclear-antecedent | Strunk & White; Gibson (1998); Graesser et al. (2004) |
lucid-lint is an engineering project informed by research, not a research project in itself. The references above ground our design choices, but we do not claim to validate new results. Several rules (lexicon.consonant-cluster, syntax.conditional-stacking, syntax.dense-punctuation-burst, structure.excessive-commas) are practitioner heuristics informed by the literature, not direct transpositions of published metrics; we mark them 🔍 in the summary table.
When we simplify an academic metric (for example syntax.unclear-antecedent as a pattern heuristic instead of full anaphora resolution), we document the simplification in RULES.md and schedule richer versions on the roadmap.
If you are a researcher and spot an error, an outdated citation or a misattribution, open an issue: we will fix it promptly and credit you.
See CONTRIBUTING.md for the full contribution guide.
[…] just check locally.