Introduction

Built for readers whose attention is stretched — ADHD, dyslexia, fatigue, a second language, or an accessibility-sensitive context.

lucid-lint reads your Markdown or plain text and flags the moments that make prose hard to process. It does not rewrite your voice. It hands you a short list and gets out of the way.

Before

The caching subsystem, which was introduced in an earlier milestone, turned out to interact poorly with the new request pipeline under sustained load, and the investigation that followed required multiple rounds of profiling.

After

The caching subsystem was introduced earlier. It interacts poorly with the new request pipeline under sustained load. The investigation required several rounds of profiling.

Three ideas, colour-matched left to right: the rewrite shortens the sentences without losing any of them. lucid-lint flagged sentence-too-long (33 words) and consecutive-long-sentences. It did not propose the rewrite; that's yours.

What makes it different

Most prose tools measure style (write-good), grammar (Antidote), or a surface readability score (Flesch). lucid-lint measures cognitive load: the mental effort a reader spends to understand a sentence. It flags the patterns singled out by the research of Sweller, Gibson, and Graesser, and by Coh-Metrix.

Bilingual EN/FR from day one, with equal quality.
Deterministic by default. Identical input produces identical output. LLM-based rules live in optional plugins.
CI-native. Plain-text and JSON outputs; exit codes that pre-commit and GitHub Actions understand without a wrapper.
Profile-based. Pick dev-doc, public, or falc (Easy-to-Read), then override per rule if you want.

Project status

lucid-lint is at v0.2 (released 2026-04-22). All 25 rules listed in RULES.md are shipped (17 from v0.1, 8 added during the v0.2 cycle), alongside the hybrid scoring model — a global X / max score plus five per-category sub-scores, computed on top of the diagnostics. Pre-1.0: breaking changes remain possible between minor versions. See the roadmap for what comes next.

Quick taste

A clean file earns the full 100/100 and a wordmark banner — the peak-end moment of a passing lint run:

Terminal capture: a clean lucid-lint run showing the three-part wordmark banner, the message “No issues found.”, and a score block reading 100/100 with every category bar full

~~~~~ ⟨ • ⟩ ─────  lucid-lint  v0.2.0
                   cognitive accessibility linter · prose · EN / FR
                   ────────────────────────────────────────────────

No issues found.

────────────────────────────────────────────────────────────
score: 100/100
       structure    █████  20/20
       rhythm       █████  20/20
       lexicon      █████  20/20
       syntax       █████  20/20
       readability  █████  20/20

cargo install lucid-lint

# Lint a file
lucid-lint check README.md

# Strictest profile (Easy-to-Read / FALC)
lucid-lint check --profile=falc docs/

# Stdin
echo "This is a test sentence." | lucid-lint check -

# JSON for CI
lucid-lint check --format=json docs/

# Fail the build if the aggregate score drops below 85/100
lucid-lint check --min-score=85 docs/

Where to next

Installation — how to install it.
Quick start — a five-minute walkthrough.
Profiles — pick the one that fits.
Rules reference — all twenty-five rules explained.
Accessibility — the WCAG AAA bar and how the site itself dogfoods the project.

Reading preferences

The whole site is built as a reading companion. Pick the font that reads best for you — it will stick across pages.

Line spacing and text size are on the way as sliders. Until then, pick a font and your browser's zoom is honoured.

License

Dual-licensed under MIT or Apache-2.0, at your option.

Installation

lucid-lint ships through four routes. Pick the one that matches your environment.

One-line installer (Linux, macOS, WSL)

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.sh | sh

The script is generated by cargo-dist for every tagged release. It detects your platform, downloads the matching prebuilt binary from the GitHub release, and places it on $PATH (default: $CARGO_HOME/bin if set, else ~/.cargo/bin).

Audit before running

curl … | sh is fast but opaque. To read the script before executing it:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.sh -o install.sh
less install.sh
sh install.sh

The script is short — under 200 lines of POSIX shell — so a quick read is realistic. It pins the release version it was generated for, verifies the downloaded archive’s expected size, and exits non-zero on any mismatch.

Pin a specific version

latest resolves to the most recent release. To pin a known-good version:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/download/v0.2.2/lucid-lint-installer.sh | sh

One-line installer (Windows PowerShell)

powershell -ExecutionPolicy Bypass -c "irm https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.ps1 | iex"

Same cargo-dist machinery, PowerShell flavour. Drops the binary into %CARGO_HOME%\bin when CARGO_HOME is set, else %USERPROFILE%\.cargo\bin.

To audit before running, save the script and inspect it:

irm https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.ps1 -OutFile install.ps1
notepad install.ps1
.\install.ps1

Via Cargo

cargo install lucid-lint

This compiles from source via crates.io and places the binary in your Cargo bin directory (default ~/.cargo/bin/). Slower than the prebuilt installer but useful when the prebuilt targets don’t match your platform.

From source

git clone https://github.com/bastien-gallay/lucid-lint
cd lucid-lint
cargo install --path .

Pre-built binaries

Each release ships pre-built binaries for:

Linux (x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl)
macOS (aarch64-apple-darwin, x86_64-apple-darwin)
Windows (x86_64-pc-windows-msvc)

The shell and PowerShell installers above pick the right archive automatically. To install manually, download from the GitHub releases page and put the extracted binary on $PATH.

Verify the installation

lucid-lint --version

System requirements

Rust 1.75 or newer (only needed if building from source or via cargo install).
No runtime dependencies.

Quick start

This page walks through linting your first document.

Lint a single file

lucid-lint check README.md

Output:

warning <path>/README.md:14:1 Sentence is 27 words long (maximum 22). Consider splitting it into shorter sentences. [structure.sentence-too-long]

summary: 1 warnings.
→ run 'lucid-lint explain <rule-id>' — seen here: structure.sentence-too-long
────────────────────────────────────────────────────────────
score: 88/100
       structure    ██▏░░  8/20
       rhythm       █████  20/20
       lexicon      █████  20/20
       syntax       █████  20/20
       readability  █████  20/20

The trailing block is the scoring summary — a global X / 100 score followed by the full per-category breakdown.

Lint several files

lucid-lint check docs/*.md CHANGELOG.md

Lint a directory

lucid-lint check docs/

All files with .md, .markdown, or .txt extensions will be processed.

Use stdin

echo "This is a test sentence." | lucid-lint check -

Pipe from Pandoc

For formats that lucid-lint does not parse natively yet:

pandoc report.docx -t markdown | lucid-lint check -

Choose a profile

# Strictest: Easy-to-Read
lucid-lint check --profile=falc docs/

# Looser: developer documentation
lucid-lint check --profile=dev-doc docs/

See Profiles for details.

Change the output format

# JSON for CI
lucid-lint check --format=json docs/

See CI integration for CI recipes.

Exit codes

Code	Meaning
0	No issues (or only `info`) and score at or above `--min-score` (if set)
1	Warnings found or score below `--min-score`
2	Runtime error (invalid args, unreadable file)

The two gates stack. See CI integration for combination recipes.

Profiles

A profile is a preset bundle of rule thresholds tuned for a specific audience.

Available profiles

`dev-doc`

For technical documentation, API references, ADRs, and developer-facing content.

Thresholds are loose: technical readers have higher tolerance for long sentences, nominalizations, and domain-specific jargon.

`public` (default)

For general-audience content: marketing pages, product descriptions, blog posts.

Thresholds are moderate. Plain-language guidelines apply.

`falc`

For content that follows the Facile À Lire et à Comprendre / Easy-to-Read European standard.

Thresholds are strict: short sentences, simple vocabulary, no passive voice, no undefined acronyms.

Choosing a profile

Start with the profile that matches the intent of the content. Override specific rules if needed via lucid-lint.toml.

Threshold comparison

See the rule reference for exact thresholds per rule and per profile.

The overall pattern is:

dev-doc: 30 words per sentence, 4 commas, 7 sentences per paragraph
public: 22 words per sentence, 3 commas, 5 sentences per paragraph
falc: 15 words per sentence, 2 commas, 3 sentences per paragraph

The same file linted three times under dev-doc, public, and falc in turn — the score drops as the profile tightens:

Terminal capture: three sequential lucid-lint runs against examples/sample.md under the dev-doc, public, and falc profiles. The dev-doc pass surfaces a handful of diagnostics and a mid-range score; public tightens and more issues appear; falc flags the most and the score drops the furthest

Overriding a profile

Any per-rule threshold set in lucid-lint.toml takes precedence over the profile preset.

[default]
profile = "public"

[rules.sentence-too-long]
max_words = 18   # stricter than public's 22

Conditions

A condition tag describes the cognitive condition a rule primarily targets. Conditions are orthogonal to profiles: a profile (dev-doc, public, falc) sets the strictness of the always-on rules; conditions enable additional rules tuned for a specific audience.

The fixed ontology

Tag	Targets
`general`	Always-on rules. The v0.2 baseline.
`a11y-markup`	Prose-adjacent markup signals (e.g. all-caps shouting).
`dyslexia`	Dyslexia-targeted signals. Source: BDA Dyslexia Style Guide.
`dyscalculia`	Numeric format and anchoring. Source: CDC Clear Communication Index.
`aphasia`	Aphasia-targeted signals. Source: FALC, plain-language guides.
`adhd`	Attention-fragility signals.
`non-native`	Non-native reader signals (vocabulary rarity, idioms).

The set is fixed. New tags are a deliberate, versioned change.

How filtering works

For every rule the engine evaluates:

A rule tagged general is always enabled.
A rule without general runs only when at least one of its tags appears in the user’s active condition list.

All 17 v0.1 rules carry general, so the default behavior is unchanged. Future tagged rules (e.g. lexicon.all-caps-shouting for a11y-markup, syntax.nested-negation for aphasia + adhd) opt in via this list.
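The two filtering rules above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's code: the helper name rule_is_enabled and the set-based representation of tags are assumptions made for the example.

```python
def rule_is_enabled(rule_tags: set[str], active_conditions: set[str]) -> bool:
    """Return True when a rule should run under the given condition list.

    Hypothetical helper: it mirrors the two documented rules only.
    """
    # A rule tagged `general` is always enabled.
    if "general" in rule_tags:
        return True
    # Otherwise at least one of its tags must be in the active list.
    return bool(rule_tags & active_conditions)


# Tag sets follow the examples given in the text.
print(rule_is_enabled({"general"}, set()))
print(rule_is_enabled({"aphasia", "adhd"}, {"dyslexia"}))
print(rule_is_enabled({"aphasia", "adhd"}, {"dyslexia", "aphasia"}))
```

Under this reading, a rule tagged only aphasia + adhd stays silent until one of those two conditions is activated.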

Configuring conditions

In lucid-lint.toml:

[default]
profile = "falc"
conditions = ["dyslexia", "aphasia"]

On the command line (comma-separated, repeatable):

lucid-lint check --profile falc --conditions dyslexia,aphasia docs/

FALC retains its regulatory meaning. Adding dyslexia does not relax or rename it — it layers dyslexia-specific signals on top.

Why tags, not parallel profiles

Three strictness levels × N conditions explodes combinatorially. Keeping the two axes orthogonal preserves the regulatory meaning of falc while letting users compose audience-specific overlays. See ROADMAP entries F71 and F72.

Configuration

lucid-lint is configured via an optional lucid-lint.toml file at the project root and via CLI flags, which override the file.

File layout

# lucid-lint.toml

[default]
profile = "public"

[rules.sentence-too-long]
max_words = 22

[rules.passive-voice]
max_per_paragraph = 2

Sections

`[default]`

Top-level defaults applied to the whole run.

Field	Type	Default	Description
`profile`	string	`"public"`	One of `dev-doc`, `public`, `falc`
`conditions`	array of strings	`[]`	Active condition tags. See Conditions.
`exclude`	array of glob strings	`[]`	Paths to skip during directory recursion. See Excluding paths.

`[rules.<rule-id>]`

Per-rule configuration. The fields available depend on the rule. See the rule pages in Rules reference.

`[scoring]`

Tunables for the hybrid scoring model. All fields are optional; missing fields fall back to the shipped defaults (category_max = 20, category_cap = 15).

[scoring]
category_max = 20
category_cap = 15

[scoring.weights]
sentence-too-long = 3
weasel-words      = 2

The [scoring.weights] sub-table is keyed by rule id. Unknown ids are ignored, so removing a rule in a future version does not break older configs.

Precedence

From lowest to highest:

Profile preset (e.g., public)
lucid-lint.toml overrides
CLI flags

An unset CLI flag defers to the TOML value; an unset TOML field defers to the profile preset.
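As a rough sketch of that fallback chain, the following Python resolves one threshold. The preset values come from the threshold comparison in the Profiles page; the function name resolve_max_words is hypothetical, and cli_value stands in for any CLI-supplied value (this page does not say which thresholds have a dedicated flag).

```python
# Per-profile sentence-length presets, as listed in the Profiles page.
PROFILE_MAX_WORDS = {"dev-doc": 30, "public": 22, "falc": 15}


def resolve_max_words(profile, toml_value=None, cli_value=None):
    """Pick the effective value: CLI first, then TOML, then the profile preset."""
    if cli_value is not None:
        return cli_value       # highest precedence
    if toml_value is not None:
        return toml_value      # lucid-lint.toml override
    return PROFILE_MAX_WORDS[profile]  # lowest precedence


print(resolve_max_words("public"))                  # preset only
print(resolve_max_words("public", toml_value=18))   # TOML overrides the preset
```

None is used here to model "unset", which is what lets each layer defer to the one below it.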

Discovery

lucid-lint walks up from the current working directory to the first lucid-lint.toml it finds, stopping at the nearest .git repo boundary. Passing --config <path> skips auto-discovery and loads the given file directly; a missing explicit path is an error, but a missing auto-discovered file is not.
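The upward walk can be pictured with a short Python sketch. It is an illustration under assumptions, not the tool's implementation: the function name discover_config is hypothetical, and the sketch assumes the directory that contains .git is itself still checked before the walk stops.

```python
from pathlib import Path
from typing import Optional


def discover_config(start: Path, name: str = "lucid-lint.toml") -> Optional[Path]:
    """Walk up from `start` and return the first config file found, or None."""
    current = start.resolve()
    for directory in (current, *current.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate
        # ASSUMPTION: the repository root is checked, then the walk stops.
        if (directory / ".git").exists():
            break
    return None
```

Returning None rather than raising matches the documented behaviour that a missing auto-discovered file is not an error.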

Excluding paths

Large documentation repositories routinely contain generated output, vendored text, and snapshots that would drown the linter in noise. Use the exclude field in [default] — or the --exclude <GLOB> CLI flag — to skip them at discovery time, before parsing.

[default]
exclude = [
    "vendor/**",
    "**/fixtures/**",
    "CHANGELOG.md",
]

Equivalently on the command line:

lucid-lint check --exclude 'vendor/**,**/fixtures/**,CHANGELOG.md' docs

Notes:

Matching. Globs are matched against the path relative to the walked root. Passing lucid-lint check docs with exclude = ["drafts/**"] skips docs/drafts/....
Prune, don’t visit. A matching directory is not descended into — large excluded trees cost nothing to walk.
Explicit files bypass. If you pass docs/CHANGELOG.md directly on the command line, it is linted even when CHANGELOG.md is in the exclude list. If you named the path, you meant it.
Additive. CLI --exclude and TOML exclude are unioned, not overridden. Comma-separate multiple patterns in a single flag, or repeat --exclude.
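A small Python sketch of the additive and explicit-file notes above. The helper names are hypothetical, and fnmatchcase is only an approximation of the tool's glob matching (its real glob semantics are not specified here); directory pruning is not modelled.

```python
from fnmatch import fnmatchcase


def merge_excludes(toml_excludes, cli_flags):
    """Union TOML patterns with CLI --exclude values (comma-separated, repeatable)."""
    patterns = list(toml_excludes)
    for flag in cli_flags:
        for pattern in flag.split(","):
            pattern = pattern.strip()
            if pattern and pattern not in patterns:
                patterns.append(pattern)
    return patterns


def is_linted(rel_path, patterns, named_explicitly=False):
    """Decide whether a path (relative to the walked root) gets linted."""
    if named_explicitly:
        return True  # explicit files bypass the exclude list
    # ASSUMPTION: fnmatch-style matching stands in for the real glob engine.
    return not any(fnmatchcase(rel_path, p) for p in patterns)


patterns = merge_excludes(["vendor/**"], ["**/fixtures/**,CHANGELOG.md", "vendor/**"])
print(patterns)
print(is_linted("vendor/lib/notes.md", patterns))
print(is_linted("CHANGELOG.md", patterns, named_explicitly=True))
```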

Silencing rules globally

Markdown documents support inline-disable directives for local silencing, but plain text and stdin have no such escape hatch. [[ignore]] fills that gap — and works uniformly across all input formats.

[[ignore]]
rule_id = "unexplained-abbreviation"

[[ignore]]
rule_id = "weasel-words"

Each [[ignore]] entry removes every diagnostic whose rule_id matches, across Markdown files, plain text, and stdin. The filter is applied after all rules have run but before scoring, so the score reflects the post-filter view too.

Notes:

Global scope. The filter is not per-file. Inline directives remain the recommended escape hatch for spot silencing in Markdown — reach for [[ignore]] only when a rule is genuinely noisy project-wide.
Unknown ids tolerated. Entries referencing rules that no longer exist are dropped silently, so removing a rule in a future release does not break older configs.
Future fields. A reason = "..." field on each entry is tracked as F-suppression-reason-field — when it lands it will be surfaced in reports and optionally required via config.

Per-rule overrides

TOML-driven config is wired rule-by-rule as each Config gains a dedicated accessor. Three rules honour it today:

`[rules.readability-score]`

[rules.readability-score]
formula = "kandel-moles"  # or "flesch-kincaid", "auto"

Pins the readability formula regardless of detected language. auto (default) preserves the F-readability-formulas-extra per-language selection.

`[rules.unexplained-abbreviation]`

[rules.unexplained-abbreviation]
whitelist = ["WCAG", "ARIA", "ADHD", "LLM"]

Entries are additive over the profile baseline (F31). Use this to restore project-specific acronyms — accessibility standards, domain initialisms, engineering-practice terms — that the v0.2 baseline no longer ships. Each entry is silenced globally across the document, same as if it had been defined inline via Expansion (ACRONYM).

`[rules."structure.excessive-commas"]`

[rules."structure.excessive-commas"]
max_commas = 2

Overrides the per-sentence comma ceiling (default: 4 / 3 / 2 for dev-doc / public / falc). Must be a positive integer — 0 or negative values are rejected at load time. The override replaces the profile preset; it is not additive.

Tables for other rules parse without error but have no runtime effect. Extending this list is a mechanical per-rule change and will continue through the v0.2.x cycle.

Scoring

v0.2 adds a hybrid scoring model on top of the existing diagnostics. Every run now answers two questions at once:

What specifically is wrong? — the diagnostics list, unchanged from v0.1.
How bad is this document overall? — a new global score plus five per-category sub-scores.

The two surfaces are complementary. Scores are summaries; diagnostics remain the actionable signal.

What the score means

The score takes the form X / max — an arbitrary maximum rather than a 0–100 normalized number. v0.2 ships with max = 100 (five categories × twenty points), but the number is treated as a test-and-learn calibration: the scale may shift in a future minor release as rule weights are tuned against real corpora.

The rules of thumb for today’s calibration:

Range	Reading
80 – 100	Score reads green in the terminal. Nothing blocking.
60 – 79	Score reads yellow. A handful of hits worth reviewing.
0 – 59	Score reads red. Dense issues or a runaway rule.

The colour bands are a reader aid, not a pass / fail contract. For CI gating, use --min-score with a concrete number you picked.

The five categories

Every rule belongs to exactly one category. v0.2 fixes the taxonomy at five buckets:

Category	Covers
`structure`	Length, nesting, punctuation, document skeleton
`rhythm`	Cadence and repetition across adjacent sentences
`lexicon`	Vocabulary, terminology, acronyms, lexical diversity
`syntax`	Sentence-level style and syntactic clarity
`readability`	Document-level readability metrics

See the rules reference for the rule-to-category mapping.

How a score is computed

For a single document:

per_rule_cost     = Σ (weight × severity_multiplier)        over hits
per_category_cost = min(Σ per_rule_cost / (words / 1000),   ← density
                        category_cap)                        ← cap
category_score    = category_max − per_category_cost         (clamped ≥ 0)
global_score      = Σ category_score

Three mechanics stack:

Weighted sum — each hit costs weight × severity_multiplier. The default weight table lives in scoring::default_weight_for and emphasises rules whose hits carry the most cognitive load (readability-score = 5, length / subordination / passive / unclear-antecedent = 2, everything else = 1).
Density normalization — costs are divided by words / 1000 so a 10 000-word handbook is not punished for having more hits than a 400-word README. Documents shorter than 200 words are treated as 200-word documents, so tiny fixtures are not artificially penalized.
Per-category cap — no single category can lose more than category_cap out of category_max. A single noisy rule eats at most 75 % of its own category (15 / 20 by default) and cannot leak into the others.

The severity multiplier is info = 1, warning = 3, error = 5.
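The formula and the three mechanics can be replayed in Python. This is a sketch under stated assumptions, not the shipped scoring code: the function name score_document and the tuple shape of a hit are invented for the example, and any rounding the tool applies for display is left out.

```python
SEVERITY_MULTIPLIER = {"info": 1, "warning": 3, "error": 5}
CATEGORIES = ("structure", "rhythm", "lexicon", "syntax", "readability")


def score_document(hits, words, category_max=20, category_cap=15, min_words=200):
    """Return (global_score, per_category_scores) for one document.

    `hits` is a list of (category, weight, severity) tuples (hypothetical shape).
    """
    effective_words = max(words, min_words)  # short documents count as 200 words
    raw = {category: 0 for category in CATEGORIES}
    for category, weight, severity in hits:
        raw[category] += weight * SEVERITY_MULTIPLIER[severity]  # weighted sum

    scores = {}
    for category in CATEGORIES:
        density_cost = raw[category] * 1000 / effective_words  # cost per 1000 words
        cost = min(density_cost, category_cap)                 # per-category cap
        scores[category] = max(category_max - cost, 0)         # clamp at zero
    return sum(scores.values()), scores


# Five hits modelled on the examples/sample.md run in this documentation.
# ASSUMPTIONS: the sample has fewer than 200 words, and line-length-wide
# uses the "everything else" weight of 1.
sample_hits = [
    ("structure", 2, "warning"),   # structure.sentence-too-long
    ("lexicon", 1, "warning"),     # lexicon.weasel-words
    ("readability", 5, "info"),    # readability.score
    ("syntax", 2, "info"),         # syntax.unclear-antecedent
    ("structure", 1, "warning"),   # structure.line-length-wide
]
total, per_category = score_document(sample_hits, words=150)
print(total, per_category)
```

Under those assumptions the sketch gives 45 out of 100, with structure, lexicon, and readability each pinned at 5 by the cap, syntax at 10, and rhythm untouched at 20.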

Reading the TTY output

The terminal formatter prints each diagnostic, a short summary line, then a score block: the global number followed by every category score with an eight-step sparkline bar.

lucid-lint run on examples/sample.md — five diagnostics, a summary counting 3 warnings and 2 info, an explain hint, and a score block that reads 45/100 with category bars for structure, rhythm, lexicon, syntax, and readability

The same run rendered as plain text, for screen readers and copy-paste:

warning examples/sample.md:7:1 Sentence is 35 words long (maximum 30). Consider splitting it into shorter sentences. [section: A paragraph with a long sentence] [structure.sentence-too-long]
warning examples/sample.md:7:11 Weasel phrase "rather" weakens the statement. Replace with concrete language or remove it. [section: A paragraph with a long sentence] [lexicon.weasel-words]
info    examples/sample.md:1:1 Flesch-Kincaid grade 6.8 (target ≤ 14.0). [readability.score]
info    examples/sample.md:7:1 Sentence starts with a bare demonstrative "this". Name the referent to avoid forcing the reader to guess. [section: A paragraph with a long sentence] [syntax.unclear-antecedent]
warning examples/sample.md:7:1 Line is 210 characters wide (maximum 120). [section: A paragraph with a long sentence] [structure.line-length-wide]

summary: 3 warnings, 2 info.
→ run 'lucid-lint explain <rule-id>' — seen here: structure.sentence-too-long, lexicon.weasel-words, readability.score + 2 more
────────────────────────────────────────────────────────────
score: 45/100
       structure    █▎░░░  5/20
       rhythm       █████  20/20
       lexicon      █▎░░░  5/20
       syntax       ██▌░░  10/20
       readability  █▎░░░  5/20

All five categories are always displayed so the breakdown stays structurally stable run-to-run. A perfect document reads score: 100/100 with every bar full (█████). When the same rule fires two or more times on one file, the hits cluster under a compact header and any shared message or section is hoisted up so it only appears once.

Reading the JSON output

The JSON schema is at version = 2 in v0.2. New fields:

{
  "version": 2,
  "diagnostics": [
    {
      "rule_id": "structure.sentence-too-long",
      "severity": "warning",
      "location": { "file": { "kind": "path", "path": "draft.md" }, "line": 12, "column": 1, "length": 42 },
      "section": "Introduction",
      "message": "Sentence is 27 words long (maximum 22).",
      "weight": 2
    }
  ],
  "summary": { "info": 0, "warning": 1, "error": 0, "total": 1 },
  "score": { "value": 88, "max": 100 },
  "category_scores": [
    { "category": "structure",   "value": 8,  "max": 20 },
    { "category": "rhythm",      "value": 20, "max": 20 },
    { "category": "lexicon",     "value": 20, "max": 20 },
    { "category": "syntax",      "value": 20, "max": 20 },
    { "category": "readability", "value": 20, "max": 20 }
  ]
}

Category values are lowercase strings in the fixed order listed above. Consumers that parsed the v0.1 schema should:

bump their expected version from 1 to 2;
replace the old category names (length → structure, lexical → lexicon, style → syntax, global → readability);
ignore unknown fields so future additive schema changes don’t break them.
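A consumer following those three points might look like the Python sketch below. The function name read_report is hypothetical, the sample payload is a trimmed version of the report above, and some_future_field is an invented key used only to show that unknown fields are skipped.

```python
import json

EXPECTED_VERSION = 2


def read_report(raw: str) -> dict:
    """Parse a lucid-lint JSON report, tolerating unknown fields."""
    report = json.loads(raw)
    if report.get("version") != EXPECTED_VERSION:
        raise ValueError(f"unsupported schema version: {report.get('version')!r}")
    # Read only the fields we need; anything else in the payload is ignored.
    return {
        "score": report["score"]["value"],
        "max": report["score"]["max"],
        "categories": {c["category"]: c["value"] for c in report["category_scores"]},
        "warnings": report["summary"]["warning"],
    }


sample = """{
  "version": 2,
  "diagnostics": [],
  "summary": {"info": 0, "warning": 1, "error": 0, "total": 1},
  "score": {"value": 88, "max": 100},
  "category_scores": [
    {"category": "structure", "value": 8, "max": 20},
    {"category": "rhythm", "value": 20, "max": 20}
  ],
  "some_future_field": true
}"""
print(read_report(sample))
```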

Gating CI with `--min-score`

The check subcommand takes an optional --min-score=N flag. The run exits 1 if the aggregate global score is below N, independently of the severity-based gate.

# Fail the build if overall quality drops below 85/100
lucid-lint check --min-score=85 docs/

Both gates stack: the run fails if either the severity gate trips or the score gate trips. Pick one or both depending on your workflow:

Severity gate only (v0.1 behaviour): catches newly introduced warnings, doesn’t react to a slow drift.
Score gate only (--fail-on-warning=false --min-score=85): tolerates individual warnings but fails when density drifts past your threshold.
Both (default + --min-score=85): both spikes and drifts fail the build.
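The way the two gates combine can be written as a small truth function. This Python sketch is an illustration only: the name exit_code is hypothetical, and runtime errors (exit code 2) are not modelled.

```python
def exit_code(warnings: int, score: float, fail_on_warning: bool = True,
              min_score=None) -> int:
    """Return 0 or 1 following the documented gates."""
    # Severity gate: trips on any warning unless --fail-on-warning=false.
    severity_gate = fail_on_warning and warnings > 0
    # Score gate: trips only when --min-score is set and the score is below it.
    score_gate = min_score is not None and score < min_score
    return 1 if (severity_gate or score_gate) else 0


print(exit_code(warnings=3, score=90))                                      # severity gate only
print(exit_code(warnings=3, score=90, fail_on_warning=False, min_score=85)) # score gate only
print(exit_code(warnings=0, score=80, min_score=85))                        # both gates active
```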

Tuning weights in `lucid-lint.toml`

Projects can override the calibration in their lucid-lint.toml:

[scoring]
category_max = 20
category_cap = 15

[scoring.weights]
sentence-too-long = 3
weasel-words      = 2

Missing fields fall back to the shipped defaults. The [scoring.weights] sub-table is keyed by rule id; unknown ids are ignored so removing a rule later doesn’t break older configs.

What’s deferred

The brainstorm that shaped F14 (see brainstorm/20260420-score-semantics.md) kept the model minimal. Decorations are promoted only when user feedback requires them:

Letter grades (A–F) — tracked as F-score-letter-grade. Promoted if the numbers feel noisy or hard to compare across documents.
Traffic-light + pass/fail margin display — tracked as F-score-traffic-light. Promoted if CI users ask for a stronger glance signal.
Reading-time-seconds as alternative unit — tracked as F-reading-time-score. Needs a validated heuristic plus companion metrics (comfort, fatigue) so it doesn’t monopolize the read.
Section-level sub-scores — tracked as F-section-scoring. Once document + project roll-ups are proven in the wild.
Project-level multi-file roll-up — tracked as F-project-scoring-rollup. The CLI in v0.2 treats all passed paths as a single document for scoring purposes.

Suppressing diagnostics

lucid-lint supports two inline directives for silencing diagnostics in Markdown input. They are intended for the rare cases where a rule fires on intentional prose (a quoted weasel word, a didactic heavy-nominalization example, a legitimate passive). Prefer rewriting the prose first; reach for a directive when the detection is a known false positive or when the author has considered the warning and chosen to keep the text.

Line form

<!-- lucid-lint disable-next-line structure.sentence-too-long -->

A long sentence that is intentional and should not be flagged.

Syntax. HTML comment, one rule id per directive. Multiple line directives may precede the same target line.
Scope. The next non-blank line in the source.
Optional reason. A directive may carry a reason, which is surfaced in JSON output; it will be required via config in a future release (tracked as F-suppression-reason-field in the roadmap).
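To make the scope rule concrete, here is a Python sketch that finds the target line of each line-form directive. It is a simplified reading, not the tool's parser: the function name suppressed_lines is hypothetical, the regular expression is modelled on the directive example above, and the optional reason is not handled.

```python
import re

# Directive shape copied from the example above; one rule id per comment.
DIRECTIVE = re.compile(r"^\s*<!--\s*lucid-lint disable-next-line\s+(\S+)\s*-->\s*$")


def suppressed_lines(source: str) -> dict[int, set[str]]:
    """Map 1-based target line numbers to the rule ids silenced on them."""
    result: dict[int, set[str]] = {}
    pending: set[str] = set()
    for number, line in enumerate(source.splitlines(), start=1):
        match = DIRECTIVE.match(line)
        if match:
            pending.add(match.group(1))  # stacked directives share one target
        elif line.strip() and pending:
            result[number] = pending     # the next non-blank line is the target
            pending = set()
    return result


doc = (
    "<!-- lucid-lint disable-next-line structure.sentence-too-long -->\n"
    "\n"
    "A long sentence that is intentional and should not be flagged.\n"
    "A following sentence that is linted as usual.\n"
)
print(suppressed_lines(doc))
```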

Block form (v0.2, F18)

<!-- lucid-lint-disable structure.sentence-too-long -->

A long sentence.

Another long sentence in the same scope.

<!-- lucid-lint-enable -->

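Opening. <!-- lucid-lint-disable <rule-id> --> opens a scope for one rule.
Closing. <!-- lucid-lint-enable --> closes every currently-open scope. Passing a rule id (<!-- lucid-lint-enable <rule-id> -->) closes only that rule’s scope, which lets overlapping disables for different rules nest cleanly.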
Scope. Every line between the two comments (inclusive).
Unterminated disable. Extends to end-of-document — useful for whole-file opt-outs, but prefer the planned disable-file directive (F-suppression-disable-file) once it lands.
One rule per comment. Multi-rule lists are tracked as F-suppression-disable-file.

Common properties

Applies to Markdown only. Plain text and stdin cannot carry HTML comments. Config-based ignores ([[ignore]] in lucid-lint.toml) covering .txt and stdin are tracked as F19.
Unknown rule ids are silently ignored. This keeps directives forward-compatible across lint versions.
Suppressed diagnostics cost zero score. The suppression and scoring models are consistent — silencing a diagnostic removes it from the weighted-sum cost. No hidden double-penalty.

Deferred

The following extensions are tracked on the roadmap:

ID	Item
F19	Config-based ignores (`[[ignore]]` in `lucid-lint.toml`) for `.txt` and stdin inputs
F-suppression-reason-field	Optional-then-required `reason="..."` field, surfaced in reports
F-suppression-disable-file	File-level directive (`disable-file`) and multi-rule comma lists

CI integration

lucid-lint is designed for CI. It returns:

0 when no issues (or only info) are found
1 when warnings are found, or when the score is below --min-score (if set)
2 on runtime error (invalid args, unreadable file)

GitHub Actions

name: Docs lint

on:
  pull_request:
    paths:
      - '**/*.md'
  push:
    branches: [main]
    paths:
      - '**/*.md'

jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install lucid-lint
        run: cargo install lucid-lint
      - name: Lint docs
        run: lucid-lint check --profile=public docs/ README.md

Pre-commit

Add to your .pre-commit-config.yaml:

repos:
  - repo: local
    hooks:
      - id: lucid-lint
        name: lucid-lint
        entry: lucid-lint check --profile=public
        language: system
        types: [markdown]

Reviewdog

To surface diagnostics as pull request review comments:

lucid-lint check --format=json docs/ | reviewdog -f=rdjson -reporter=github-pr-review

Note: lucid-lint does not ship an RDJSON adapter, so its JSON output has to be converted to RDJSON before reviewdog can consume it. For native code-review surfacing, prefer the GitHub Code Scanning workflow below.

GitHub Code Scanning (SARIF)

--format=sarif emits a SARIF v2.1.0 log that GitHub’s Code Scanning ingests directly: each diagnostic becomes a code-scanning alert annotated on the PR diff.

name: Lucid lint (code scanning)

on:
  pull_request:
    paths: ['**/*.md']
  push:
    branches: [main]
    paths: ['**/*.md']

permissions:
  security-events: write
  contents: read

jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install lucid-lint
      - name: Run lucid-lint and emit SARIF
        run: |
          lucid-lint check \
            --profile=public \
            --format=sarif \
            --fail-on-warning=false \
            docs/ README.md > lucid-lint.sarif
      - name: Upload SARIF to Code Scanning
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: lucid-lint.sarif
          category: lucid-lint

Notes:

--fail-on-warning=false lets the upload step always run; rely on Code Scanning’s own gating in the PR UI rather than the lint exit code.
Each rule appears once in runs[0].tool.driver.rules with its category, default severity, default scoring weight, and a helpUri pointing at the per-rule mdBook page.
Per-result properties.weight and properties.section carry the scoring weight and the heading the diagnostic was found under.

Exit code control

To avoid failing CI on warnings (e.g., during a gradual adoption phase), you can invert the default:

lucid-lint check --fail-on-warning=false docs/

This always returns 0 except on runtime error.

Gating on score

You can also gate the build on the aggregate scoring model. The run exits 1 if the global score is below the threshold, independently of the severity gate.

jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install lucid-lint
      - name: Lint and gate on score
        run: lucid-lint check --min-score=85 docs/ README.md

Both gates stack — the run fails if either trips. Pick the combination that fits your adoption curve:

Goal	Flags
Catch newly introduced warnings (default behaviour)	default
Tolerate individual warnings but fail on drift	`--fail-on-warning=false --min-score=85`
Fail on both spikes and drift	default + `--min-score=85`

A gated run that fails — lucid-lint prints its usual summary, then the shell surfaces the non-zero exit code:

Terminal capture: a lucid-lint run on examples/sample.md with --min-score=85, which produces three warnings, two info diagnostics, a score of 45/100, and an “exit: 1” line written by the trailing echo command

$ lucid-lint check --min-score=85 examples/sample.md
…
score: 45/100
       structure    █▎░░░  5/20
       rhythm       █████  20/20
       lexicon      █▎░░░  5/20
       syntax       ██▌░░  10/20
       readability  █▎░░░  5/20
$ echo "exit: $?"
exit: 1

Rules reference

lucid-lint ships 25 rules as of v0.2 (17 from v0.1 plus 8 added during the v0.2 cycle). Each rule has a dedicated page below with category, severity, default weight, thresholds per profile, examples, and suppression guidance.

The compact reference at RULES.md remains the single-file overview kept in the repository root. The academic and normative sources behind every rule are consolidated on the References page.

Category	Rules
`structure`	`structure.sentence-too-long` · `structure.paragraph-too-long` · `structure.heading-jump` · `structure.deeply-nested-lists` · `structure.excessive-commas` · `structure.long-enumeration` · `structure.deep-subordination` · `structure.line-length-wide` · `structure.mixed-numeric-format`
`rhythm`	`rhythm.consecutive-long-sentences` · `rhythm.repetitive-connectors`
`lexicon`	`lexicon.low-lexical-diversity` · `lexicon.excessive-nominalization` · `lexicon.unexplained-abbreviation` · `lexicon.weasel-words` · `lexicon.jargon-undefined` · `lexicon.all-caps-shouting` · `lexicon.redundant-intensifier` · `lexicon.consonant-cluster`
`syntax`	`syntax.passive-voice` · `syntax.unclear-antecedent` · `syntax.nested-negation` · `syntax.conditional-stacking` · `syntax.dense-punctuation-burst`
`readability`	`readability.score`

Severity levels

Level	Meaning	Effect
`info`	Signal worth knowing, not a defect	Reported; does not fail CI
`warning`	Quality issue worth fixing	Reported; may fail CI depending on `--fail-on-warning`
`error`	Reserved for v0.3+	Not emitted in v0.2

Reading a rule at the terminal

Each rule bundles its reference page into the binary. Run lucid-lint explain <rule-id> to print the same content this website serves — useful when CI only gives you a diagnostic id and no browser:

Terminal capture: lucid-lint explain structure.sentence-too-long, printing the rule’s “What it flags” paragraph, the at-a-glance table with category, default severity, default weight, supported languages, and the source link

lucid-lint explain structure.sentence-too-long

Contributing a rule

See Contributing for the rule-addition checklist — every new rule must land with a page in this section.

`structure.sentence-too-long`

What it flags

Sentences whose length exceeds a per-profile ceiling. The intrinsic cognitive load of a sentence grows non-linearly with its word count (Graesser et al. 2004, Coh-Metrix); FALC caps at 15 words, Plain English at 20. The longer the sentence, the more likely a reader under attentional load is to lose the thread partway through.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`2`
Languages	EN · FR (identical detection)
Source	`src/rules/sentence_too_long.rs`

Detection

Split text into sentences via strong punctuation (., !, ?, …, paragraph breaks). Count Unicode word tokens, excluding punctuation. Contractions (don't) and elisions (l'accessibilité) count as one word when the apostrophe sits between two letters. Code blocks are skipped.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_words`	`int`	30	22	15
`exclude_code_blocks`	`bool`	`true`	`true`	`true`
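The detection steps above can be approximated with two regular expressions. This is a minimal sketch, not the code in `src/rules/sentence_too_long.rs`: the names `split_sentences`, `count_words`, and `too_long` are illustrative, code blocks are not handled, and splitting hyphenated compounds into separate words is an assumption.

```python
import re

# Per-profile ceilings from the Parameters table.
MAX_WORDS = {"dev-doc": 30, "public": 22, "falc": 15}

# Strong punctuation followed by whitespace, or a blank line, ends a sentence.
SENTENCE_END = re.compile(r"(?<=[.!?…])\s+|\n\s*\n")
# A word is a run of word characters; an apostrophe between two of them
# keeps both halves in one token (don't, l'accessibilité).
WORD = re.compile(r"\w+(?:['’]\w+)*")


def split_sentences(text: str) -> list[str]:
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def count_words(sentence: str) -> int:
    return len(WORD.findall(sentence))


def too_long(text: str, profile: str = "public") -> list[tuple[int, str]]:
    """Return (word_count, sentence) for each sentence over the ceiling."""
    limit = MAX_WORDS[profile]
    counted = [(count_words(s), s) for s in split_sentences(text)]
    return [(n, s) for n, s in counted if n > limit]
```

Calling `too_long(text, "falc")` on the same input applies the 15-word ceiling instead of 22.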

Examples

Three ideas, colour-matched across the rewrite — position already pairs them, the tint just confirms the rewrite loses none. lucid-lint reports; the rewrite is always yours.

English

Before (flagged):

The caching subsystem, which was introduced in an earlier milestone, turned out to interact poorly with the new request pipeline under sustained load, and the investigation that followed required multiple rounds of profiling.

What lucid-lint check --profile public reports:

warning input.md:1:1 Sentence is 33 words long (maximum 22). Consider splitting it into shorter sentences. [structure.sentence-too-long]

After (your rewrite):

The caching subsystem was introduced earlier. It interacts poorly with the new request pipeline under sustained load. The investigation required several rounds of profiling.

French

Before (flagged):

Le sous-système de cache introduit lors d’un jalon précédent interagit mal avec le nouveau pipeline de requêtes sous charge soutenue, et l’enquête a nécessité plusieurs rondes de profilage.

What lucid-lint check --profile public reports:

warning input.md:1:1 Sentence is 29 words long (maximum 22). Consider splitting it into shorter sentences. [structure.sentence-too-long]

After (your rewrite):

Le cache a été introduit lors d’un jalon précédent. Il interagit mal avec le nouveau pipeline sous charge soutenue. L’enquête a nécessité plusieurs rondes de profilage.

Suppression

See Suppressing diagnostics for the inline and block forms.

See References for the full bibliography.

`structure.paragraph-too-long`

What it flags

Paragraphs that overrun either a sentence-count or a word-count threshold. A paragraph is a visual resumption unit: the longer it runs, the harder it is for a reader who was interrupted to find where to pick up again. Both metrics are checked so that a short-but-dense paragraph (one 80-word sentence) is still caught; structure.sentence-too-long covers the complementary case.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`2`
Languages	EN · FR (identical detection)
Source	`src/rules/paragraph_too_long.rs`

Detection

Split on blank lines (Markdown paragraph convention). Count sentences and words per paragraph. Flag paragraphs exceeding either threshold.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_sentences`	`int`	7	5	3
`max_words`	`int`	150	100	60
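A rough sketch of the two-threshold check, assuming simplified sentence and word splitting. The function name `paragraph_overruns` and its return shape are illustrative and not part of lucid-lint.

```python
import re

# (max_sentences, max_words) per profile, from the Parameters table.
LIMITS = {"dev-doc": (7, 150), "public": (5, 100), "falc": (3, 60)}


def paragraph_overruns(text: str, profile: str = "public") -> list[tuple[int, list[str]]]:
    """Return (paragraph_index, tripped_keys) for each paragraph past a limit."""
    max_sentences, max_words = LIMITS[profile]
    findings = []
    # Blank lines separate paragraphs (Markdown convention).
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    for index, paragraph in enumerate(paragraphs):
        # Rough sentence count: split after strong punctuation.
        sentences = [s for s in re.split(r"(?<=[.!?…])\s+", paragraph.strip()) if s]
        words = re.findall(r"\w+(?:['’]\w+)*", paragraph)
        tripped = []
        if len(sentences) > max_sentences:
            tripped.append("max_sentences")
        if len(words) > max_words:
            tripped.append("max_words")
        if tripped:
            findings.append((index, tripped))
    return findings
```

Either key alone is enough to report the paragraph, which is why a single very long sentence still trips `max_words`.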

Examples

A paragraph of eight medium sentences under the public profile will fire on max_sentences. A paragraph containing a single 120-word sentence will fire on max_words (and also on structure.sentence-too-long).

Suppression

See Suppressing diagnostics.

References

Sweller (1988)
Graesser et al. (2004)
CAN-ASC-3.1:2025

See References for the full bibliography.

`structure.heading-jump`

What it flags

Heading-level jumps that break the document’s mental map (e.g. H2 → H4). Each level must follow the previous by at most +1. Readers with attentional difficulties lean heavily on heading hierarchy to reposition after an interruption; a broken hierarchy destroys that cue. Also flags the first heading being deeper than H2 when allow_first_heading_any_level is false, and missing H1 when require_h1 is true.

References. WCAG 2.1 SC 1.3.1 (Info and Relationships) and 2.4.6 (Headings and Labels); RGAA 9.1.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Languages	language-agnostic
Source	`src/rules/heading_jump.rs`

Detection

Parse Markdown headings (#, ##, …). Walk them in source order; report each heading whose level exceeds the previous by more than one. Deterministic, no false positives.

Parameters

Key	Type	Default
`allow_first_heading_any_level`	`bool`	`true`
`require_h1`	`bool`	`false`

A binary rule — no per-profile thresholds.
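The core walk can be sketched as follows. This is a simplified illustration: it recognises only ATX headings (`#` style), does not skip fenced code blocks, and leaves out `allow_first_heading_any_level` and `require_h1`. The name `heading_jumps` is hypothetical.

```python
import re

# One to six "#" characters, whitespace, then the heading text.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def heading_jumps(markdown: str) -> list[tuple[int, int, int]]:
    """Return (line_number, previous_level, level) for each jump of more than +1."""
    jumps = []
    previous = None
    for line_number, line in enumerate(markdown.splitlines(), start=1):
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        # Going deeper by more than one level is a jump; going back up is fine.
        if previous is not None and level > previous + 1:
            jumps.append((line_number, previous, level))
        previous = level
    return jumps
```

Moving from a deep heading back to a shallow one is never reported, since only increases of more than one level count.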

Examples

Flagged:

# Overview
#### Details    ← jumps from H1 to H4

Clean:

# Overview
## Section
### Subsection

Suppression

See Suppressing diagnostics.

References

WCAG 2.1 — 1.3.1 & 2.4.6

See References for the full bibliography.

`structure.deeply-nested-lists`

What it flags

Bulleted list items nested beyond a reasonable depth. A deeply nested list forces the reader to reconstruct a complex mental hierarchy — horizontal indentation stops being a positional cue and becomes noise. Four levels of indent are too many for readers with attentional difficulties to track.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Languages	language-agnostic
Source	`src/rules/deeply_nested_lists.rs`

Detection

Parse Markdown via pulldown-cmark; extract list items with their indentation level; flag items deeper than max_depth. Deterministic, no false positives.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_depth`	`int`	4	3	2

Example

Under public (max depth 3):

- Level 1
  - Level 2
    - Level 3
      - Level 4    ← flagged
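The depth check can be illustrated without a Markdown parser. Note the swap: the rule itself parses with pulldown-cmark, whereas this sketch infers depth from leading whitespace on bullet lines, which is an assumption that holds only for consistently indented lists. The name `too_deep_items` is hypothetical.

```python
import re

# Per-profile ceilings from the Parameters table.
MAX_DEPTH = {"dev-doc": 4, "public": 3, "falc": 2}
BULLET = re.compile(r"^(\s*)[-*+]\s+")


def too_deep_items(markdown: str, profile: str = "public") -> list[tuple[int, int]]:
    """Return (line_number, depth) for each bullet nested past max_depth."""
    limit = MAX_DEPTH[profile]
    indents: list[int] = []  # indentation of each currently open list level
    findings = []
    for line_number, line in enumerate(markdown.splitlines(), start=1):
        match = BULLET.match(line)
        if not match:
            continue  # non-bullet lines are ignored in this sketch
        indent = len(match.group(1))
        # Close levels that are indented at least as far as this item.
        while indents and indents[-1] >= indent:
            indents.pop()
        indents.append(indent)
        depth = len(indents)
        if depth > limit:
            findings.append((line_number, depth))
    return findings
```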

Diagnostic message

Includes repair guidance: flatten the structure, split into multiple lists, or promote sub-items to subsections with headings.

Suppression

See Suppressing diagnostics.

References

WCAG 2.1

See References for the full bibliography.

`structure.line-length-wide`

What it flags

Author-chosen lines wider than the per-profile ceiling. WCAG 1.4.8 (AAA) caps rendered text at roughly 80 characters per line because longer lines force the eye to track further between saccades and increase re-reading on return sweep — a known difficulty for dyslexic readers (BDA Dyslexia Style Guide).

“Author-chosen” matters: in Markdown, soft-wrapped lines collapse to spaces at parse time because the renderer reflows them to fit the viewport. Their source length tells us nothing about what the reader sees. Only line breaks the author kept on purpose are checked here — Markdown hard breaks (<br> or two trailing spaces) and explicit newlines in plain-text input. A soft-wrapped Markdown paragraph is exempt no matter how long its joined text is. Use structure.paragraph-too-long to bound paragraph density.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Condition tags	`dyslexia`, `general`
Languages	EN · FR (script-agnostic)
Source	`src/rules/line_length_wide.rs`

Detection

For every paragraph that carries an authorial line break, scan each line’s width in grapheme clusters and report lines above max_line_length.

A Markdown paragraph with no hard break inside it (the common case for prose) is exempt: the parser collapses its soft breaks to spaces, so what remains is one logical line whose rendered width follows the viewport. Its source length says nothing about the rendered width WCAG 1.4.8 targets. Plain-text input is treated symmetrically: a paragraph with no inner \n is exempt; one with internal newlines is checked line by line.

Fenced and indented code blocks are excluded upstream by the Markdown parser. Headings, list items, and table cells are out of scope by construction — paragraph-too-long, sentence-too-long, and the heading rules cover the cognitive-load concerns that apply to those blocks.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_line_length`	`int`	120	100	80

FALC matches the WCAG 1.4.8 AAA recommendation of 80 characters.
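The plain-text branch of the detection can be sketched as below. Two simplifications are assumed: width is measured in code points via `len()` rather than grapheme clusters, and Markdown hard breaks are not handled. The name `wide_lines` is hypothetical.

```python
# Per-profile ceilings from the Parameters table.
MAX_LINE_LENGTH = {"dev-doc": 120, "public": 100, "falc": 80}


def wide_lines(plain_text: str, profile: str = "public") -> list[tuple[int, int]]:
    """Return (line_number, width) for author-broken lines over the ceiling."""
    limit = MAX_LINE_LENGTH[profile]
    findings: list[tuple[int, int]] = []
    block: list[tuple[int, str]] = []  # lines of the current paragraph

    def flush() -> None:
        # A paragraph with a single line has no author-chosen break: exempt.
        if len(block) > 1:
            findings.extend((n, len(t)) for n, t in block if len(t) > limit)
        block.clear()

    for number, line in enumerate(plain_text.splitlines(), start=1):
        if line.strip():
            block.append((number, line))
        else:
            flush()  # a blank line closes the paragraph
    flush()
    return findings
```

A 101-character line is reported under `public` only when it shares a paragraph with at least one other line.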

Known caveats

Long single-line prose paragraphs in Markdown source are intentionally exempt. The rule used to fire on them and produced large amounts of noise on real prose; v0.2.x narrows the rule to author-chosen breaks only. Pair this rule with structure.paragraph-too-long if you also want a ceiling on the joined paragraph length.

Headings and list items are not measured by this rule. Their wrap behavior depends on the renderer (display type, list indent), and the underlying cognitive concerns are covered by other rules.

Suppression

See Suppressing diagnostics.

References

WCAG 2.1 — 1.4.8 (AAA)
Legge & Bigelow (2011)

See References for the full bibliography.

`structure.mixed-numeric-format`

What it flags

Sentences that mix digit numerals (42, 3.14, 1,000, 1 000) with spelled-out numerals (two, trois, twenty, cent) within the same sentence. Presenting numbers inconsistently forces the reader to switch surface forms mid-clause and re-anchor the referent — a known load for readers with dyscalculia and a plain-language anti-pattern.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Condition tags	`dyscalculia`, `general`
Languages	EN · FR
Source	`src/rules/mixed_numeric_format.rs`

Detection

For each sentence emitted by the tokenizer, scan for digit-numeric tokens and for entries in the per-language spelled-numeral list. If at least one of each kind co-occurs, emit a single diagnostic for the sentence citing one representative token of each kind.

Digit tokens accept ASCII digits plus an optional decimal (.) or thousands separator (,, narrow no-break space U+202F) when flanked by digits on both sides. Spelled-out matches are case-insensitive ASCII compares against en::SPELLED_NUMERALS and fr::SPELLED_NUMERALS.

The ambiguous forms one (EN) and un / une (FR) are excluded from the spelled-numeral list because they double as indefinite pronouns and articles. This keeps the false-positive rate manageable at the cost of missing genuine mixed-format cases whose only spelled-out numeral is one. Metropolitan French and Swiss / Belgian regional forms (septante, huitante, octante, nonante) are all included.

Sentences are produced by the shared tokenizer (see src/parser/tokenizer.rs), so abbreviations, decimals, and ellipses do not spuriously split sentences. Fenced and indented code blocks are excluded upstream by the Markdown parser.

Parameters

None. The rule has no configurable threshold — a single co-occurrence of the two surface forms is sufficient.
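The co-occurrence test can be sketched as follows. The `SPELLED` set is a hypothetical excerpt, not the real en::SPELLED_NUMERALS and fr::SPELLED_NUMERALS lists. The digit pattern covers only `.` and `,` separators, and the function name `mixed_numeric` is illustrative.

```python
import re

# Hypothetical excerpt of the per-language lists; "one", "un" and "une"
# are deliberately absent, as described under Detection.
SPELLED = {
    "two", "three", "four", "five", "ten", "twenty", "hundred",
    "deux", "trois", "quatre", "cinq", "dix", "vingt", "cent", "septante",
}
# ASCII digits, optionally joined by "." or "," when flanked by digits.
DIGITS = re.compile(r"[0-9]+(?:[.,][0-9]+)*")
# Runs of letters only, so "2nd" yields the word "nd".
WORD = re.compile(r"[^\W\d_]+")


def mixed_numeric(sentence: str):
    """Return (digit_token, spelled_token) if both forms co-occur, else None."""
    digit = DIGITS.search(sentence)
    spelled = next(
        (w for w in WORD.findall(sentence) if w.lower() in SPELLED), None
    )
    if digit and spelled:
        return digit.group(0), spelled
    return None
```

Returning one representative token of each kind mirrors the single diagnostic per sentence described above.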

Known caveats

Sentences whose only spelled-out numeral is one / un / une are not flagged, by design (see Detection).
Ordinals (first, premier, 2nd, 3e) are out of scope. 2nd currently reads as a digit token (2) followed by a word (nd), which does not match the spelled-numeral list — no false positive.
Roman numerals (IV, XIV) are neither digits nor spelled-out numerals for this rule.

Suppression

See Suppressing diagnostics.

References

ISO 80000-1:2022
Chicago Manual of Style (17th ed.)

See References for the full bibliography.

`structure.excessive-commas`

What it flags

Sentences whose comma count exceeds a per-profile ceiling. The comma is the most frequent marker of syntactic complexity; rather than disentangle the cause (subordination, apposition, enumeration, parenthetical), the rule uses density as a leading indicator of overload.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (identical detection)
Source	`src/rules/excessive_commas.rs`

Detection

Count commas per sentence, report those above max_commas.

Interaction. When structure.long-enumeration fires on the same sentence, this rule is suppressed for that sentence to avoid double-reporting. The shared enumeration detector discounts two kinds of commas, both in a language-agnostic way. The first is Oxford-style enumeration commas: 3+ short items, plus a relaxed rhythmic pass for 1–4-word items, plus runs closed by plus as well as and / or (see “Known false positives” below). The second is commas inside (A, B, C, …) parenthesised token lists: 3+ short comma-separated segments inside balanced parens.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_commas`	`int`	4	3	2

Known false positives

Remaining false positives mostly come from bare lists with no terminal connector (e.g. Rules touched: A, B, C) and Oxford runs interrupted by an interleaved parenthetical; these are tracked as F22 in the roadmap for further v0.3 sub-slices.

Suppression

See Suppressing diagnostics.

References

Gibson (1998)

See References for the full bibliography.

`structure.long-enumeration`

What it flags

Inline prose enumerations that would be clearer as a bulleted list — 5+ comma-separated items closed by a coordinator (and, or, et, ou).

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (identical detection)
Source	`src/rules/long_enumeration.rs`, shared helper `src/rules/enumeration.rs`

Detection

Sequence of min_items or more short comma-separated segments ending with , and / , or / , plus / , et / , ou (Oxford comma optional). Shared detector also informs structure.excessive-commas.

Parameters

Key	Type	Default
`min_items`	`int`	`5`
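A rough heuristic for the detector described above, offered as a sketch rather than the logic in `src/rules/enumeration.rs`. The four-word cut-off for a "short" segment (`MAX_ITEM_WORDS`) and both function names are assumptions.

```python
COORDINATORS = ("and", "or", "plus", "et", "ou")
MAX_ITEM_WORDS = 4  # hypothetical cut-off for a "short" segment


def enumeration_length(sentence: str) -> int:
    """Return the item count of a closing inline enumeration, or 0."""
    segments = [s.strip() for s in sentence.rstrip(".!?… ").split(",")]
    if len(segments) < 2:
        return 0
    last = segments[-1].split()
    # The final segment must carry a coordinator: ", and thyme", or
    # "parsley and thyme" when the Oxford comma is omitted.
    hits = [i for i, w in enumerate(last) if w.lower() in COORDINATORS]
    if not hits:
        return 0
    extra = 1 if hits[0] > 0 else 0  # "parsley and thyme" holds two items
    count = 1
    # Walk backwards; a long segment holds the lead-in plus the first item.
    for segment in reversed(segments[:-1]):
        count += 1
        if len(segment.split()) > MAX_ITEM_WORDS:
            break
    return count + extra


def is_long_enumeration(sentence: str, min_items: int = 5) -> bool:
    return enumeration_length(sentence) >= min_items
```

A bare list with no closing coordinator returns 0 here, so it is never reported by this sketch.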

Diagnostic message

Suggests converting the enumeration to a bulleted list.

Examples

lucid-lint reports; the rewrite is always yours.

English

Six items, colour-matched across the rewrite — each inline term lines up with its bullet.

Before (flagged):

The dish contains tomato, onion, garlic, basil, parsley, and thyme.

What lucid-lint check --profile public reports:

warning input.md:1:1 Inline enumeration of 6 items. Consider converting it into a bulleted list so readers can scan the items. [structure.long-enumeration]

After (your rewrite):

The dish contains:

tomato

onion

garlic

basil

parsley

thyme

Suppression

See Suppressing diagnostics.

References

Plain Language US (2011)

See References for the full bibliography.

`structure.deep-subordination`

What it flags

Cascading subordinate clauses: multiple relative pronouns or subordinating conjunctions chained without a strong-punctuation break. Each open referent has to sit in working memory until it closes — Gibson’s Dependency Locality Theory (1998) ties processing cost directly to that distance.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`2`
Languages	EN · FR (separate lists)
Source	`src/rules/deep_subordination.rs`

Detection

Walk the sentence between strong-punctuation breaks; count consecutive subordinators. Flag when the count exceeds max_consecutive_subordinators. Pronoun enumerations (qui, que, dont, où) are skipped — the detector recognises the list form and does not treat it as cascading.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_consecutive_subordinators`	`int`	3	2	2

Language lists

🇫🇷 Relative pronouns: qui, que, dont, où, lequel, laquelle, lesquels, lesquelles
🇫🇷 Subordinators: parce que, afin que, bien que, quoique, puisque, pour que, tandis que
🇬🇧 Relative pronouns: which, that, who, whom, whose
🇬🇧 Subordinators: because, although, while, since, whereas, unless, until
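The counting step can be sketched with the single-word entries from these lists. Two assumptions are made for illustration: multi-word subordinators such as "parce que" are omitted, and the enumeration form is recognised by requiring a word (not a comma or full stop) after each subordinator. The function names are hypothetical.

```python
import re

# Per-profile ceilings from the Parameters table.
MAX_CONSECUTIVE = {"dev-doc": 3, "public": 2, "falc": 2}
# Single-word entries from the language lists above.
SUBORDINATORS = [
    "qui", "que", "dont", "où", "lequel", "laquelle", "lesquels", "lesquelles",
    "quoique", "puisque",
    "which", "that", "who", "whom", "whose",
    "because", "although", "while", "since", "whereas", "unless", "until",
]
# A subordinator only counts when a word follows it; "which, that, who."
# is the enumeration form and is skipped.
PATTERN = re.compile(
    r"\b(?:" + "|".join(SUBORDINATORS) + r")\b(?=\s+\w)", re.IGNORECASE
)
STRONG_BREAK = re.compile(r"[.!?…]")


def subordination_depth(sentence: str) -> int:
    """Return the largest subordinator count between strong-punctuation breaks."""
    return max(len(PATTERN.findall(part)) for part in STRONG_BREAK.split(sentence))


def is_flagged(sentence: str, profile: str = "public") -> bool:
    return subordination_depth(sentence) > MAX_CONSECUTIVE[profile]
```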

Examples

Each highlighted token is one subordinator counted by the rule. The rule fires when the count exceeds the threshold: four in a row triggers dev-doc (threshold 3); three in a row triggers public and falc (threshold 2).

Flagged (FR):

Le document qui a été rédigé par l’équipe que nous avons constituée et qui couvre les points que nous avions discutés…

Flagged (EN):

The report that was drafted by the team which we formed last month and which covers the topics that we had discussed…

Not flagged (enumeration form, recognised by the detector):

Les pronoms relatifs en français sont : qui, que, dont, où.

And the matching English form:

The English relative pronouns are: which, that, who, whom, whose.

Suppression

See Suppressing diagnostics.

References

Gibson (1998)

See References for the full bibliography.

`structure.italic-span-long`

Experimental in v0.2.x. Off by default; opt in via --experimental structure.italic-span-long or [experimental] enabled = ["structure.italic-span-long"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the dyslexia condition tag that gates this rule under user-active conditions.

What it flags

Italic spans (*…* / _…_) longer than a configurable word threshold. Slanted glyphs degrade letter-shape recognition for readers with dyslexia — a robust finding behind the British Dyslexia Association’s recommendation to keep italic emphasis to a short phrase rather than running a full sentence in italics. Long italic runs also harm scanability for readers whose attention is already taxed (fatigue, second-language reading, low-vision conditions).

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Status	`experimental` (v0.2.x) → `stable` at v0.3 cut
Condition tag	`dyslexia` (gated; runs only under matching `--conditions`)
Languages	EN · FR (identical detection — substrate is language-agnostic)
Source	`src/rules/structure/italic_span_long.rs`

Detection

Walks the typed inline tree captured on each Paragraph (F143 substrate) and flags every Inline::Emphasis span whose visible word count exceeds the per-profile threshold. Code blocks and inline code are excluded by the parser, so an italic span inside a code fence never fires. Strong (**bold**) does not trigger this rule — only emphasis (*italic* / _italic_).

The diagnostic location points at the opening delimiter, so the squiggle in your editor lands on the visible * or _ rather than an arbitrary column inside the paragraph.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_words`	`int`	12	8	5

Tune via lucid-lint.toml:

[rules."structure.italic-span-long"]
max_words = 6
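To make the threshold concrete, here is a regex-based approximation. The rule itself walks a typed inline tree, so the pattern below is a swapped-in stand-in that handles single-line `*…*` and `_…_` spans only. The name `long_italic_spans` is hypothetical.

```python
import re

# Per-profile ceilings from the Parameters table.
MAX_WORDS = {"dev-doc": 12, "public": 8, "falc": 5}
# A single * or _ that is not part of ** or __ and not inside a word,
# then the shortest run up to the matching single delimiter.
ITALIC = re.compile(r"(?<![*_\w])([*_])(?![*_\s])(.+?)(?<![*_\s])\1(?![*_\w])")


def long_italic_spans(line: str, profile: str = "public") -> list[tuple[int, int]]:
    """Return (column, word_count) per italic span over the ceiling."""
    limit = MAX_WORDS[profile]
    findings = []
    for match in ITALIC.finditer(line):
        words = len(match.group(2).split())
        if words > limit:
            # 1-based column of the opening delimiter.
            findings.append((match.start() + 1, words))
    return findings
```

Bold spans (`**…**`) and underscores inside identifiers such as `snake_case` do not match the pattern.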

Examples

English

Before (flagged):

The team eventually concluded that the proposed migration plan would require careful coordination across three regional offices and an extended freeze window before any deployment could begin.

What lucid-lint check --profile public --experimental structure.italic-span-long --conditions dyslexia reports:

warning input.md:1:36 Italic span is 17 words long (maximum 8). Long italic runs strain dyslexic readers; consider shortening the emphasized phrase or removing the italics. [structure.italic-span-long]

After (your rewrite):

The team eventually concluded that the proposed migration plan would require careful coordination. Three regional offices and an extended freeze window are prerequisites before any deployment.

The italics now mark a single load-bearing word — the kind of emphasis the BDA style guide endorses.

French

Before (flagged):

L’équipe a fini par conclure que le plan de migration proposé nécessiterait une coordination soignée entre trois bureaux régionaux et une fenêtre de gel prolongée avant tout déploiement.

What lucid-lint check --profile public --experimental structure.italic-span-long --conditions dyslexia reports:

warning input.md:1:35 Italic span is 18 words long (maximum 8). Long italic runs strain dyslexic readers; consider shortening the emphasized phrase or removing the italics. [structure.italic-span-long]

After (your rewrite):

L’équipe a fini par conclure que le plan de migration nécessiterait une coordination soignée. Trois bureaux régionaux et une fenêtre de gel prolongée sont indispensables avant tout déploiement.

Suppression

See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:

<!-- lucid-lint disable-next-line structure.italic-span-long -->
A *deliberately long italic span that the rule would normally flag* lives here.

References

British Dyslexia Association — Dyslexia Style Guide (2018). Recommends keeping italics to short phrases for letter-shape recognition.
Rello & Baeza-Yates (2013) — broader academic context on dyslexia-friendly typography.

See References for the full bibliography.

`structure.number-run`

Experimental in v0.2.x. Off by default; opt in via --experimental structure.number-run or [experimental] enabled = ["structure.number-run"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the dyscalculia condition tag that gates this rule under user-active conditions.

What it flags

Sentences that pack more than a configurable number of numeric tokens together. plainlanguage.gov is explicit on the framing — “Don’t put a lot of numbers together in one sentence” and “Avoid placing too many statistics close together” — and readers with dyscalculia carry the cost first: each numeric token forces a quantity-to-symbol re-anchoring that does not benefit from running prose context the way ordinary words do. Citation salads ((Smith 2020, Jones 2021, Wei 2022, Park 2023)), benchmark tables flattened into prose, and statistic-heavy paragraphs are the typical hits.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Status	`experimental` (v0.2.x) → `stable` at v0.3 cut
Condition tag	`dyscalculia` (gated; runs only under matching `--conditions`)
Languages	EN · FR (identical detection — digits are language-agnostic)
Source	`src/rules/structure/number_run.rs`

Detection

Walks each paragraph’s sentence stream (post-flattening, so fenced code blocks are already excluded by the parser) and counts numeric tokens per sentence. A numeric token is a contiguous run of ASCII digits, optionally containing one decimal separator (. or ,) followed by more digits. Hyphen, colon, slash, and whitespace split tokens.

Input	Tokens counted	Note
`42`	1	Bare integer
`3.14`	1	Decimal separator kept
`1,000`	1	Comma separator kept
`2026-05-04`	3	Hyphens split — a date is three numbers from a load standpoint
`$3.50`	1	Currency prefix is non-digit and ignored
`1st`	1	Trailing letters split; the digits still count

The diagnostic location points at the first numeric token in the offending sentence, so the squiggle in your editor lands on the visible cluster rather than the start of the sentence.
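The tokenisation rules in the table can be reproduced with one regular expression. This is a sketch under the stated definition (ASCII digits with at most one `.` or `,` separator). The names `numeric_tokens` and `packs_too_many` are illustrative.

```python
import re

# Per-profile ceilings from the Parameters table below.
MAX_NUMBERS = {"dev-doc": 6, "public": 4, "falc": 3}
# ASCII digits, optionally one "." or "," followed by more digits.
NUMERIC = re.compile(r"[0-9]+(?:[.,][0-9]+)?")


def numeric_tokens(sentence: str) -> list[str]:
    return NUMERIC.findall(sentence)


def packs_too_many(sentence: str, profile: str = "public") -> bool:
    return len(numeric_tokens(sentence)) > MAX_NUMBERS[profile]


# Reproduces the "Tokens counted" column: 1, 1, 1, 3, 1, 1.
for sample in ["42", "3.14", "1,000", "2026-05-04", "$3.50", "1st"]:
    print(sample, len(numeric_tokens(sample)))
```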

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_numbers`	`int`	6	4	3

Tune via lucid-lint.toml:

[rules."structure.number-run"]
max_numbers = 5

Examples

English

Before (flagged):

The 2024 cohort sat 1,200 students across 4 campuses, posted a 92.5% pass rate on the 3 reviewed papers, and improved 18 points over the prior year.

What lucid-lint check --profile public --experimental structure.number-run --conditions dyscalculia reports:

warning input.md:1:5 Sentence packs 6 numeric tokens (maximum 4). plain-language guidance recommends not placing many numbers or statistics together in one sentence; split the sentence or move some figures to a list or table. [structure.number-run]

After (your rewrite):

The 2024 cohort sat 1,200 students across 4 campuses. They posted a 92.5% pass rate on the reviewed papers and improved 18 points over the prior year.

The figures still travel together, but each sentence carries a load a dyscalculic reader can re-anchor without losing the running referent.

French

Before (flagged):

La promotion 2024 a réuni 1 200 étudiants sur 4 campus, affiché un taux de réussite de 92,5 % sur les 3 copies revues, et progressé de 18 points par rapport à l’année précédente.

After (your rewrite):

La promotion 2024 a réuni 1 200 étudiants sur 4 campus. Le taux de réussite atteint 92,5 % sur les copies revues et progresse de 18 points par rapport à l’année précédente.

Suppression

See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:

<!-- lucid-lint disable-next-line structure.number-run -->
The 2024 cohort sat 1,200 students across 4 campuses, posted a 92.5% pass rate on the 3 reviewed papers, and improved 18 points.

References

plainlanguage.gov — Use short, simple sentences. “Don’t put a lot of numbers together in one sentence.”
plainlanguage.gov — Use numerals. Companion guidance on consistent numeric form (the grounding for mixed-numeric-format).

See References for the full bibliography.

`rhythm.consecutive-long-sentences`

What it flags

Streaks of long sentences within the same paragraph. An isolated long sentence is manageable; several in a row fatigue attention even when each individual sentence is under the structure.sentence-too-long ceiling. This rule catches the rhythm.

At a glance


Category	`rhythm`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (identical detection)
Source	`src/rules/consecutive_long_sentences.rs`

Detection

Walk sentences sequentially inside each paragraph. Maintain a running count of consecutive sentences above word_threshold. Fire once per streak that reaches max_consecutive.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`word_threshold`	`int`	20	15	10
`max_consecutive`	`int`	3	2	2
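The streak logic can be sketched over a list of per-sentence word counts for one paragraph. The function name `long_streaks` and its return shape are assumptions. Sentence splitting and word counting are taken as given.

```python
# (word_threshold, max_consecutive) per profile, from the Parameters table.
LIMITS = {"dev-doc": (20, 3), "public": (15, 2), "falc": (10, 2)}


def long_streaks(word_counts: list[int], profile: str = "dev-doc") -> list[tuple[int, int]]:
    """Return (start_index, length) once per streak that reaches max_consecutive."""
    word_threshold, max_consecutive = LIMITS[profile]
    streaks = []
    run_start, run_length = 0, 0
    # A trailing 0 closes any streak still open at the end of the paragraph.
    for index, count in enumerate(word_counts + [0]):
        if count > word_threshold:
            if run_length == 0:
                run_start = index
            run_length += 1
            continue
        # A short sentence ends the run; report it if it was long enough.
        if run_length >= max_consecutive:
            streaks.append((run_start, run_length))
        run_length = 0
    return streaks
```

Each streak is reported once with its full length, however far past `max_consecutive` it runs.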

Relation to `structure.sentence-too-long`

Both rules look at sentence length but catch different problems:

Rule	Threshold (`dev-doc` / `public` / `falc`)	Fires on
`structure.sentence-too-long`	`max_words` 30 / 22 / 15	a single sentence past the ceiling
`rhythm.consecutive-long-sentences`	`word_threshold` 20 / 15 / 10	a streak of `max_consecutive` sentences each above the lower threshold

Because word_threshold sits below max_words, this rule catches the rhythm even when no individual sentence trips sentence-too-long. The invariant word_threshold < max_words (per profile) keeps the two from co-firing on the same sentence.

Examples

Five ideas, colour-matched across the rewrite — only the rhythm changes. lucid-lint reports; the rewrite is always yours.

English

Before (flagged):

The migration introduced a caching layer that sits in front of every read from the primary database. The team observed unexpected latency spikes whenever the cache invalidated under sustained write load. A subsequent investigation traced the regression to a thundering-herd pattern that fired on every cold key. The metrics dashboard misreported the issue as a generic timeout because the trace propagation was incomplete. The fix coalesced concurrent fills, added jittered TTLs, and instrumented the cache layer with a dedicated span emitter.

Five sentences, each over 20 words — the streak fatigues attention.

What lucid-lint check --profile dev-doc reports:

warning input.md:1:1 5 consecutive sentences exceed 20 words (max 3). Vary sentence length or split the streak. [rhythm.consecutive-long-sentences]

After (your rewrite):

The migration introduced a caching layer in front of the primary database. Latency spiked whenever the cache invalidated under heavy writes. The cause was a thundering-herd pattern on cold keys. Metrics misreported it as a generic timeout — trace propagation was broken. The fix coalesced concurrent fills, added jittered TTLs, and emitted a dedicated span.

French

Before (flagged):

La migration a introduit une couche de cache qui se place devant chaque lecture de la base primaire. L’équipe a observé des pics de latence inattendus chaque fois que le cache s’invalidait sous une charge d’écriture soutenue. Une enquête ultérieure a relié la régression à un motif de troupeau tonnant qui se déclenchait sur chaque clé froide. Le tableau de bord des métriques signalait à tort un délai d’attente générique parce que la propagation de la trace était incomplète. Le correctif a fusionné les remplissages concurrents, ajouté un TTL avec gigue, et instrumenté la couche de cache avec un émetteur de span dédié.

What lucid-lint check --profile dev-doc reports:

warning input.md:1:1 5 consecutive sentences exceed 20 words (max 3). Vary sentence length or split the streak. [rhythm.consecutive-long-sentences]

After (your rewrite):

La migration a introduit une couche de cache devant la base primaire. La latence montait dès que le cache s’invalidait sous écritures soutenues. Le coupable : un troupeau tonnant sur les clés froides. Les métriques signalaient un délai générique — la trace était cassée. Le correctif fusionne les remplissages, ajoute un TTL avec gigue et émet un span dédié.

Suppression

See Suppressing diagnostics for the inline and block forms.

References

Sweller (1988)
Sweller, Ayres & Kalyuga (2011)

See References for the full bibliography.

`rhythm.repetitive-connectors`

What it flags

Overuse of a single logical connector inside a short window of sentences. Connectors (opposition, cause, consequence, sequence, illustration, addition) are attentional anchors; repeated, they flatten the sense of progression. Sanders & Noordman (2000), Connectives as processing signals; Graesser et al. (2004), local cohesion.

At a glance


Category	`rhythm`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (separate lists)
Source	`src/rules/repetitive_connectors.rs`

Detection

Sliding window of window_size sentences. Per connector, count occurrences in the window. Fire once per cluster that crosses max_per_window.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_per_window`	`int`	4	3	2
`window_size`	`int`	5	5	5
`custom_connectors`	`list`	`[]`	`[]`	`[]`

Default connector lists

🇫🇷 Opposition: cependant, toutefois, en revanche, néanmoins, pourtant, mais
🇫🇷 Cause: parce que, car, puisque, en effet
🇫🇷 Consequence: donc, ainsi, par conséquent, c’est pourquoi
🇫🇷 Sequence: d’abord, ensuite, puis, enfin, premièrement
🇫🇷 Illustration: par exemple, notamment, en particulier
🇫🇷 Addition: de plus, en outre, également, par ailleurs
🇬🇧 Opposition: however, nevertheless, yet, although, but
🇬🇧 Cause: because, since, as, for
🇬🇧 Consequence: therefore, thus, consequently, hence, so
🇬🇧 Sequence: first, then, next, finally
🇬🇧 Illustration: for example, notably, in particular, such as
🇬🇧 Addition: moreover, furthermore, also, additionally
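The sliding-window count can be sketched as follows. `CONNECTORS` is a hypothetical single-word excerpt of the lists above, and the name `repeated_connectors` is illustrative. Multi-word connectors and `custom_connectors` are left out.

```python
import re
from collections import Counter

# Per-profile ceilings from the Parameters table; the window is 5 everywhere.
MAX_PER_WINDOW = {"dev-doc": 4, "public": 3, "falc": 2}
WINDOW_SIZE = 5
# Hypothetical excerpt of the default lists above.
CONNECTORS = ["then", "however", "therefore", "ensuite", "cependant", "donc"]


def repeated_connectors(sentences: list[str], profile: str = "public") -> dict[str, int]:
    """Return connector -> peak count for connectors that cross max_per_window."""
    limit = MAX_PER_WINDOW[profile]
    # Count connector occurrences sentence by sentence.
    per_sentence = [
        Counter(w for w in re.findall(r"\w+", s.lower()) if w in CONNECTORS)
        for s in sentences
    ]
    peaks: Counter = Counter()
    last_start = max(len(sentences) - WINDOW_SIZE, 0)
    for start in range(last_start + 1):
        # Merge the counts of the sentences inside this window.
        window = sum(per_sentence[start:start + WINDOW_SIZE], Counter())
        for connector, count in window.items():
            peaks[connector] = max(peaks[connector], count)
    return {c: n for c, n in peaks.items() if n > limit}
```

Reporting the peak count per connector gives one entry per repeated connector rather than one per overlapping window.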

Examples

lucid-lint reports; the rewrite is always yours.

English

Five actions, colour-matched across the rewrite — only the connectors change.

Before (flagged):

We analysed the data. Then we built the model. Then we validated the results. Then we published the report. Then we archived the raw data.

Four then in five sentences — no progression felt.

What lucid-lint check --profile public reports:

warning input.md:1:1 Connector "then" appears 4 times within 5 consecutive sentences (max 3). Vary the connector or restructure the passage. [rhythm.repetitive-connectors]

After (your rewrite):

We analysed the data. From it we built the model. Validation followed, and once the results held up we published the report. The raw data was archived last.

French

Five actions, colour-matched across the rewrite — only the connectors change.

Before (flagged):

Nous avons analysé les données. Ensuite nous avons construit le modèle. Ensuite nous avons validé les résultats. Ensuite nous avons publié le rapport. Ensuite nous avons archivé les données brutes.

Quatre ensuite en cinq phrases — aucune progression ressentie.

What lucid-lint check --profile public reports:

warning input.md:1:1 Connector "ensuite" appears 4 times within 5 consecutive sentences (max 3). Vary the connector or restructure the passage. [rhythm.repetitive-connectors]

After (your rewrite):

Nous avons analysé les données. À partir de là nous avons construit le modèle. La validation a suivi, et dès que les résultats ont tenu nous avons publié le rapport. Les données brutes ont été archivées en dernier.

Suppression

See Suppressing diagnostics for the inline and block forms.

References

Sanders & Noordman (2000)
Graesser et al. (2004)

See References for the full bibliography.

`lexicon.low-lexical-diversity`

What it flags

Passages with excessive repetition of content words. A monotonous text loses reader attention and often signals unstructured thinking. The rule is not an anti-jargon detector: technical terms (API, request, cache) are expected to repeat — the signal targets non-technical content words.

At a glance


Category	`lexicon`
Default severity	`info`
Default weight	`1`
Languages	EN · FR (separate stoplists)
Source	`src/rules/low_lexical_diversity.rs`

Detection

Sliding window of window_size words. Within the window, compute unique_words / total_words over non-stopword, non-code-block tokens. Fire when the ratio falls below min_ratio.
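
The ratio can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in src/rules/low_lexical_diversity.rs: the stoplist below is a tiny hypothetical stand-in for the real per-language stoplists, the function names are invented, and code-block exclusion is left out.

```python
import re

# Hypothetical mini stoplist; the real EN and FR stoplists are much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "is", "in", "it"}

def diversity_ratio(window_words):
    """unique_words / total_words over the non-stopword tokens of one window."""
    content = [w for w in window_words if w not in STOPWORDS]
    return len(set(content)) / len(content) if content else 1.0

def low_diversity_windows(text, window_size=100, min_ratio=0.50):
    """Return (start_index, ratio) for every window whose ratio is below min_ratio."""
    words = [w.lower() for w in re.findall(r"[^\W\d_]+", text)]
    last_start = max(len(words) - window_size, 0)
    flagged = []
    for start in range(last_start + 1):
        ratio = diversity_ratio(words[start:start + window_size])
        if ratio < min_ratio:
            flagged.append((start, round(ratio, 2)))
    return flagged

print(low_diversity_windows("good good good good plan", window_size=5))
# [(0, 0.4)]: two distinct words out of five
```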

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`window_size`	`int`	100	100	80
`min_ratio`	`float`	0.40	0.50	0.55
`use_stoplist`	`bool`	`true`	`true`	`true`

Suppression

See Suppressing diagnostics.

References

Herdan (1960)
McCarthy & Jarvis (2010)
Graesser et al. (2004)

See References for the full bibliography.

`lexicon.excessive-nominalization`

What it flags

Sentences densely packed with nominalizations — verbs turned into abstract nouns. Two problems compound: nominalized text is more abstract (costlier to process) and hides the agent (“who does what” is obscured). FALC and the US Plain Writing Act both recommend strong verbs over nominalizations.

At a glance


Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (overlapping suffix lists)
Source	`src/rules/excessive_nominalization.rs`

Detection

Walk the sentence. Flag words whose suffix matches the language’s nominalization list. Fire when the count per sentence crosses max_per_sentence.

🇫🇷 Suffixes: -tion, -sion, -ment, -ance, -ence, -age, -ité, -isme, -ure
🇬🇧 Suffixes: -tion, -sion, -ment, -ance, -ence, -ity, -ism, -ness, -al
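
A minimal Python sketch of the suffix walk, using the two suffix lists above. The function names are hypothetical, plural forms are not handled, and the threshold is read as "strictly more than max_per_sentence", which is an assumption carried over from the other counting rules.

```python
import re

SUFFIXES = {
    "fr": ("tion", "sion", "ment", "ance", "ence", "age", "ité", "isme", "ure"),
    "en": ("tion", "sion", "ment", "ance", "ence", "ity", "ism", "ness", "al"),
}

def nominalizations(sentence, lang):
    """Return the words of `sentence` that end in a listed suffix."""
    words = re.findall(r"[^\W\d_]+", sentence.lower())
    return [w for w in words if w.endswith(SUFFIXES[lang])]

def is_flagged(sentence, lang, max_per_sentence=3):
    # Assumption: the rule fires when the count is strictly above the maximum.
    return len(nominalizations(sentence, lang)) > max_per_sentence

heavy = ("La réalisation de l’analyse de la conformité permettra "
         "l’identification des axes d’amélioration.")
print(nominalizations(heavy, "fr"))
# ['réalisation', 'conformité', 'identification', 'amélioration']
print(is_flagged(heavy, "fr", max_per_sentence=3))  # True
```

Because the match is purely orthographic, the sketch shares the false positives described under Known false positives.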

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_per_sentence`	`int`	4	3	2
`suffixes`	`list`	language defaults	language defaults	language defaults

Known false positives

Technical vocabulary (function, implementation, configuration) contains many legitimate nominalizations, which is why dev-doc relaxes the threshold. The -al suffix in English is too broad (flags crucial, horizontal, positional despite these not being abstract nouns) and is tracked for review in F-excessive-nominalization-suffix-refine on the roadmap.

Example

Nominalizations colour-matched to their active-verb counterparts in the rewrite.

Before (heavy):

La réalisation de l’analyse de la conformité permettra l’identification des axes d’amélioration.

After (lighter):

Nous analyserons la conformité. Cela permettra d’identifier les axes à améliorer.

Suppression

See Suppressing diagnostics.

References

Plain Language US (2011)
CAN-ASC-3.1:2025

See References for the full bibliography.

`lexicon.unexplained-abbreviation`

What it flags

Acronyms used without a nearby definition. Each forced interruption to guess or look up an acronym breaks the flow and raises the risk of losing attention.

References. WCAG 2.1 SC 3.1.4 (Abbreviations); RGAA 9.4.

At a glance


Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (whitelists differ)
Source	`src/rules/unexplained_abbreviation.rs`

Detection (v0.2, two-pass — F9)

1. Pre-scan the whole document for acronyms defined in either canonical form:
   - Full Expansion (ACRONYM), for example World Wide Web (WWW)
   - ACRONYM (Full Expansion), for example WWW (World Wide Web)
   The “expansion” side must contain at least two alphabetic words, so short parenthetical notes like (TBD) or (check later) do not count as definitions.
2. Match sequences of 2+ consecutive uppercase letters (optionally with digits) in the main text.
3. Filter each candidate against three layers, in order:
   1. Defined in document (from the pre-scan): strongest.
   2. User whitelist from [rules.unexplained-abbreviation].whitelist.
   3. Baseline whitelist (profile-driven).
4. Flag each remaining occurrence.

A single definition anywhere in the document silences every occurrence of the same acronym — matching how readers actually use documentation (scroll back once to find the expansion, remember it thereafter).
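
The two passes can be sketched in Python as follows. This is a rough illustration, not the code in src/rules/unexplained_abbreviation.rs. The regular expressions, the function name, and the parameter names are hypothetical, and the sketch assumes that expansion words are capitalised as in the two canonical examples, a detail the description above does not spell out.

```python
import re

ACRONYM = r"[A-Z]{2,}[A-Z0-9]*"   # 2+ uppercase letters, optionally followed by digits
CAP_WORD = r"[A-Z][a-z]+"         # assumption: expansion words are capitalised

# "Full Expansion (ACRONYM)": at least two words before the parenthesis.
EXPANSION_FIRST = re.compile(CAP_WORD + r"\s+" + CAP_WORD + r"\s*\((" + ACRONYM + r")\)")
# "ACRONYM (Full Expansion)": at least two words inside the parenthesis.
ACRONYM_FIRST = re.compile(r"\b(" + ACRONYM + r")\s*\(" + CAP_WORD + r"\s+" + CAP_WORD + r"[^)]*\)")
CANDIDATE = re.compile(r"\b" + ACRONYM + r"\b")

def undefined_acronyms(text, user_whitelist=(), baseline=(), min_length=2):
    """Pass 1 collects definitions; pass 2 filters candidates against the three layers."""
    defined = set(EXPANSION_FIRST.findall(text)) | set(ACRONYM_FIRST.findall(text))
    allowed = defined | set(user_whitelist) | set(baseline)
    return [m.group() for m in CANDIDATE.finditer(text)
            if len(m.group()) >= min_length and m.group() not in allowed]

doc = "The World Wide Web (WWW) runs on HTTP. WWW traffic uses TLS."
print(undefined_acronyms(doc, baseline={"HTTP"}))  # ['TLS']
```

WWW is silenced everywhere by its single definition, HTTP by the baseline whitelist, and only TLS remains.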

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`min_length`	`int`	3	2	2
`whitelist`	`list`	extended	minimal	empty

Default whitelist (v0.2, narrowed by F31): the infrastructure stack — URL, HTML, CSS, JSON, XML, HTTP, HTTPS, UTF, IO, API, CLI, GUI, OS, CPU, RAM, SSD, USB, IDE, SDK, CI, CD — plus common FR/EN acronyms and RFC 2119 emphasis keywords (PDF, SMS, GPS, ID, OK, FAQ, MUST, SHALL, SHOULD, …).

[rules.unexplained-abbreviation]
whitelist = ["WCAG", "ARIA", "ADHD", "LLM"]

User-whitelist entries are additive over the baseline — they extend it, never replace it.

Suppression

See Suppressing diagnostics.

References

WCAG 2.1 — 3.1.4
CAN-ASC-3.1:2025

See References for the full bibliography.

`lexicon.weasel-words`

What it flags

Vague qualifiers that weaken a statement. A weasel word adds an invisible cognitive load: the reader has to decide whether the claim matters, is true, or is measurable. References: Wikipedia style guide (Avoid weasel words), Strunk & White, FALC.

At a glance


Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (separate lists)
Source	`src/rules/weasel_words.rs`

Detection

Word-boundary match against a per-language list. Case-insensitive. One diagnostic per occurrence.

Inline code spans. A hit inside `…` is skipped. Wrap a weasel term in backticks when you are discussing the word itself.
Directional pairings. rather than (EN) and plutôt que (FR) are conjunctions meaning “instead of” — not hedges — and are skipped.
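
A minimal Python sketch of the matching and the two skips, assuming a small subset of the English list. The function name and the masking approach are illustrative choices, not the implementation in src/rules/weasel_words.rs.

```python
import re

WEASELS_EN = ["sort of", "kind of", "rather", "many", "some", "often", "quite"]  # subset
SKIPS_EN = ["rather than"]  # directional pairing, not a hedge

def _blank(match):
    # Replace with spaces of equal length so later offsets stay valid.
    return " " * len(match.group())

def weasel_hits(text, weasels=WEASELS_EN, skips=SKIPS_EN):
    """Return sorted (offset, phrase) pairs, one per occurrence."""
    masked = re.sub(r"`[^`]*`", _blank, text)  # inline code spans never match
    for phrase in skips:
        masked = re.sub(r"\b" + re.escape(phrase) + r"\b", _blank, masked,
                        flags=re.IGNORECASE)
    hits = []
    for phrase in weasels:
        for m in re.finditer(r"\b" + re.escape(phrase) + r"\b", masked,
                             flags=re.IGNORECASE):
            hits.append((m.start(), phrase))
    return sorted(hits)

line = "Many users prefer tabs rather than spaces, but `some` flags are quite slow."
print([phrase for _, phrase in weasel_hits(line)])  # ['many', 'quite']
```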

Parameters

Key	Type	Default
`custom_weasels_fr`	`list`	`[]`
`custom_weasels_en`	`list`	`[]`
`disable_weasels`	`list`	`[]`

Default lists (v0.1)

🇫🇷 quelques, certains, parfois, plutôt, assez, globalement, généralement, souvent, en général, la plupart, il semble que, il semblerait que, on pourrait dire que, on dit souvent, beaucoup de, peu de, presque, quasiment, environ, à peu près
🇬🇧 some, many, often, just, simply, clearly, obviously, seemingly, arguably, basically, essentially, virtually, various, numerous, sort of, kind of, a bit, rather, quite, fairly, relatively, mostly, generally

Known false positives

Two patterns still fire in v0.2: straight-quoted terms ("many X" without backticks) and "many X" where X is a concrete noun. Both are queued under F23 on the roadmap. Wrap the quoted term in backticks, or use an inline disable comment, to opt out.

Suppression

Use an inline disable comment when the weasel is intentional (quotation, legitimate subset reference, meta-discussion). See Suppressing diagnostics.

References

Strunk & White (1999)
CAN-ASC-3.1:2025

See References for the full bibliography.

`lexicon.jargon-undefined`

What it flags

Domain-specific terms used without definition. Jargon is contextual: acceptable among specialists, exclusionary otherwise. Like acronyms, jargon creates reading interruptions for the non-specialist; unlike acronyms, these are content words, not uppercase sequences.

References. US Plain Language, FALC, WCAG 2.1 SC 3.1.3 (Unusual Words).

At a glance


Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (separate lists per language and domain)
Source	`src/rules/jargon_undefined.rs`

Detection

Maintain multiple jargon lists per domain (tech, legal, medical, admin).
User activates the relevant lists via profile.
Flag each occurrence of a listed term.

Profile activation

Profile	Lists active
`dev-doc`	none (developers understand their own jargon)
`public`	`tech`, `legal`, `medical`, `admin`
`falc`	`tech`, `legal`, `medical`, `admin`, strict mode

Configuration

In v0.2, the active lists are set by the profile and are not yet user-overridable from lucid-lint.toml. Per-rule TOML overrides — adding custom domain terms, silencing specific entries, or activating a non-default list combination — are tracked as F126 on the roadmap.

Default starter lists (community contributions welcome)

Tech: idempotent, orthogonal, deterministic, polymorphic, serialization, deserialization, synchronous, asynchronous, concurrency, thread-safe, side-effect, referential transparency, memoization, currying, hoisting, closure, monad, immutable, stateless, refactoring
Legal (mostly FR): apériteur, clause résolutoire, force majeure, cessation de paiement, préjudice subi, onéreux, nonobstant, préalablement, susmentionné, infra, supra, ad hoc, de facto, in fine, subséquemment
Medical: anamnèse, étiologie, pathognomonique, iatrogène, nosocomial, décompensation, récidive, rémission, syndromique
Admin (mostly FR): attributaire, solliciter, diligenter, instruction du dossier, pièces justificatives, circulaire, délibération, arrêté préfectoral, transmission des pièces, ayant droit

Suppression

See Suppressing diagnostics.

References

WCAG 2.1 — 3.1.3
Plain Language US (2011)
CAN-ASC-3.1:2025

See References for the full bibliography.

`lexicon.all-caps-shouting`

What it flags

Runs of consecutive ALL-CAPS words.

ALL-CAPS prose strips the shape cues that dyslexic readers rely on to disambiguate words:

Ascenders — the strokes that rise above the body of letters like b, d, h, k, l.
Descenders — the strokes that drop below the baseline in g, p, q, y.
X-height contrast — the height difference between short letters like a, e, o and tall ones like h, l.

In all-caps, every letter sits on the same baseline at the same height. The reader loses the silhouette of the word and has to decode letter by letter. ALL-CAPS also triggers many screen readers to spell out the run letter by letter unless the surrounding markup says otherwise.

WCAG 3.1.5 and the BDA Dyslexia Style Guide both recommend lowercase or sentence case for emphasis.

At a glance


Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Condition tags	`a11y-markup`, `dyslexia`, `general`
Languages	EN · FR (script-only detection — language-agnostic)
Source	`src/rules/all_caps_shouting.rs`

Detection

Per paragraph, scan for runs of consecutive ALL-CAPS words. Minor connectors (,, ;, :, -, whitespace) keep a run alive; a lowercase word, a period, or a paragraph break ends it.

A word is ALL-CAPS when it is at least 2 letters long and contains no lowercase letter. Single ALL-CAPS tokens are treated as abbreviations and are the responsibility of lexicon.unexplained-abbreviation.

Code blocks are excluded by the Markdown parser before the rule runs.
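
The run scan can be sketched in Python like this. It is an illustration only: the tokenizer and function names are hypothetical, and the sketch treats any punctuation outside the keep-alive set (and any digit) as ending a run, which the description above does not state explicitly.

```python
import re

TOKEN = re.compile(r"[^\W\d_]+|\S")  # a word, or any single non-space character
KEEP_ALIVE = set(",;:-")             # minor connectors that do not break a run

def is_all_caps(word):
    return len(word) >= 2 and word.isupper()

def shouting_runs(paragraph, min_run_length=2):
    """Return each run of consecutive ALL-CAPS words that reaches min_run_length."""
    runs, current = [], []

    def close():
        if len(current) >= min_run_length:
            runs.append(" ".join(current))
        current.clear()

    for token in TOKEN.findall(paragraph):
        if token[0].isalpha():
            if is_all_caps(token):
                current.append(token)
            else:
                close()          # a lowercase or single-letter word ends the run
        elif token not in KEEP_ALIVE:
            close()              # a period or other mark ends the run
    close()
    return runs

print(shouting_runs("Please DO NOT touch this.", min_run_length=2))  # ['DO NOT']
print(shouting_runs("Please DO NOT touch this.", min_run_length=3))  # []
```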

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`min_run_length`	`int`	3	2	2

dev-doc tolerates a 2-word emphasis run (DO NOT) common in technical docs.

Examples

lucid-lint reports; the rewrite is always yours.

English

One emphasis phrase, colour-matched across the rewrite — the shouting becomes typographic emphasis without losing the stress.

Before (flagged):

Please DO NOT touch this.

DO NOT reads as shouting.

What lucid-lint check --profile public reports:

warning input.md:1:8 2 consecutive ALL-CAPS words read as shouting and degrade legibility for dyslexic readers. Use sentence case and rely on emphasis (italics, bold) or a callout instead. [lexicon.all-caps-shouting]

After (your rewrite):

Please do not touch this.

Known false positives

A chain of acronyms in prose (API HTTP TLS) is structurally indistinguishable from shouting and fires once it reaches min_run_length: three acronyms under dev-doc, two under public and falc. Suppress on the line if the chain is intentional, or restructure the prose.

Suppression

See Suppressing diagnostics.

References

Arditi & Cho (2007)
Nielsen Norman Group
Bringhurst (2013)

See References for the full bibliography.

`lexicon.redundant-intensifier`

What it flags

Intensifiers — adverbs that try to upgrade the confidence of a statement without adding information. very important reduces to important, or better, to a quantified claim. plainlanguage.gov (Chapter 4) and the CDC Clear Communication Index flag intensifiers as a plain-language anti-pattern.

The rule is a deliberate sibling of lexicon.weasel-words: weasel words downgrade confidence (hedges, qualifiers); redundant intensifiers upgrade it. The two lists are disjoint by construction.

At a glance


Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Condition tags	`general`
Languages	EN · FR
Source	`src/rules/redundant_intensifier.rs`

Detection

Per paragraph, lowercase the text and look for each intensifier phrase in the per-language list (en::INTENSIFIERS, fr::INTENSIFIERS) using the shared word-bounded search. Hits inside fenced or inline code spans are ignored. Documents whose language is Unknown are skipped rather than guessed, matching lexicon.weasel-words.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`custom_intensifiers_en`	`list<string>`	`[]`	`[]`	`[]`
`custom_intensifiers_fr`	`list<string>`	`[]`	`[]`	`[]`
`disable`	`list<string>`	`[]`	`[]`	`[]`

custom_intensifiers_en / _fr add phrases to the defaults. disable removes phrases from them (exact lowercase match).

Known caveats

very in the fixed phrase very well (as acknowledgment) still triggers — plain-language guides flag it anyway, so the rule does not carve out an exception. Suppress via inline directive if the context genuinely calls for it.
Metalinguistic references (“the word ‘very’ is an intensifier”) trigger unless the target word is in backticks. Use inline code spans for such references.

Suppression

See Suppressing diagnostics.

References

Strunk & White (1999)
Quirk et al. (1985)
Zinsser (2006)

See References for the full bibliography.

`lexicon.consonant-cluster`

What it flags

Words whose longest run of consecutive consonants meets or exceeds a per-profile threshold. Dense consonant clusters are a known decoding barrier for dyslexic readers (BDA Dyslexia Style Guide): the reader must hold more phonemes in working memory before the next vowel “releases” the syllable.

Typical English offenders at the public threshold of 5 include strengths (n-g-t-h-s) and twelfths (l-f-t-h-s). sixths (x-t-h-s) has a run of 4, so it fires only at the falc threshold of 4. Typical French offenders at the falc threshold of 4 include constructions (n-s-t-r).

At a glance


Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Condition tags	`dyslexia`, `general`
Languages	EN · FR
Source	`src/rules/consonant_cluster.rs`

Detection

Per source line, walk the grapheme stream once. A word is a maximal run of alphabetic characters; hyphens, apostrophes, and whitespace close the word (so dys-lexic is scanned as two words, dys and lexic, not as one). Within a word, track the longest run of consecutive consonants. Emit one diagnostic per word whose longest run meets or exceeds min_run_length.

Vowels are language-aware — French accented forms (é, è, ê, à, â, î, ï, ô, ö, ù, û, ü, ÿ, œ, æ) count as vowels. The English fallback still accepts common latin-1 accented vowels so borrowed words (café, naïve) decode correctly. y is treated as a vowel in every language (lenient), which avoids awkward false positives on words like fly, rhythm.
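
A minimal Python sketch of the per-word scan, using the vowel set listed above. The function names are hypothetical and code points stand in for graphemes; it is not the code in src/rules/consonant_cluster.rs.

```python
import re

# Plain vowels, y, and the accented forms listed above.
VOWELS = set("aeiouyéèêàâîïôöùûüÿœæ")

def longest_consonant_run(word):
    """Length of the longest run of consecutive non-vowel letters in `word`."""
    longest = current = 0
    for ch in word.lower():
        if ch in VOWELS:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest

def clustered_words(line, min_run_length=5):
    # Hyphens, apostrophes, and whitespace are not letters, so they close a word.
    words = re.findall(r"[^\W\d_]+", line)
    return [w for w in words if longest_consonant_run(w) >= min_run_length]

line = "Her strengths were in the twelfths and sixths."
print(clustered_words(line, min_run_length=5))  # ['strengths', 'twelfths']
print(clustered_words(line, min_run_length=4))  # ['strengths', 'twelfths', 'sixths']
```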

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`min_run_length`	`int`	6	5	4

dev-doc is tolerant — technical prose regularly names things like strengths and benchmarks. falc (plain-language audience) catches any 4-consonant run.

Known caveats

The rule is blind to syllable structure: it counts raw consonant graphemes, not phonemes. A word like matchstick (t-c-h-s-t, a run of 5) reads fluently to most readers because tch spells a single sound in English. Suppress with an inline directive when a hit is unavoidable.
Script-agnostic for any alphabetic script, but the vowel lists are tuned for Latin scripts only. Words in Cyrillic, Greek, Arabic, etc., will likely trigger whenever the language flag is en or fr — in practice such content is out of scope for a bilingual EN/FR linter.

Suppression

See Suppressing diagnostics.

References

Seidenberg et al. (1984)
Treiman et al. (2006)

See References for the full bibliography.

`lexicon.homophone-density`

Experimental in v0.2.x. Off by default; opt in via --experimental lexicon.homophone-density or [experimental] enabled = ["lexicon.homophone-density"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the dyslexia and aphasia condition tags that gate this rule under user-active conditions.

What it flags

Paragraphs whose share of homophones — words that sound alike but spell differently (their / there / they're, to / too / two, cours / court, amande / amende) — exceeds a configurable percentage. Homophones force a phonological-then-orthographic disambiguation pass: the ear resolves the word, the eye must then pick the right spelling from context. That extra hop is cheap on its own and expensive in a cluster. The British Dyslexia Association style guide flags homophones as a known friction point for dyslexic readers, and the FALC orthographic-clarity guidelines recommend rephrasing dense homophone runs for aphasic and plain-language audiences.

At a glance


Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Status	`experimental` (v0.2.x) → `stable` at v0.3 cut
Condition tags	`dyslexia`, `aphasia` (gated; runs only under matching `--conditions`)
Languages	EN · FR (curated per-language homophone lists)
Source	`src/rules/lexicon/homophone_density.rs`

Detection

For each paragraph, walk the word stream once, count alphabetic words as the denominator, and count words that appear in the per-language homophone table as hits. If hits / total strictly exceeds the per-profile threshold, emit one diagnostic anchored at the paragraph’s start line. Paragraphs with fewer than 20 content words are skipped — below that floor, a single homophone produces a misleading double-digit percentage. The diagnostic message names up to two example homophones the rule actually saw, so the location is the paragraph but the fix candidates are concrete.

The homophone tables (HOMOPHONE_GROUPS_EN, HOMOPHONE_GROUPS_FR in src/language/) lean toward content-word pairs whose orthographic confusion genuinely distorts meaning. Ultra-frequent French function-word homophones (et / est, a / à, ou / où) are intentionally excluded: they appear in nearly every sentence and would push baseline density past every threshold, drowning out the signal the rule is meant to catch.

When the document’s detected language is Unknown the rule has no table to apply and skips silently rather than guessing.
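
The density computation can be sketched in Python as below. The seven-entry table is hypothetical: the contents of HOMOPHONE_GROUPS_EN are not shown in this section, so the set was chosen only so that the count matches the English diagnostic in the Examples. The function name and return shape are also assumptions.

```python
import re

# Hypothetical stand-in for the real English homophone table.
HOMOPHONES_EN = {"their", "there", "to", "too", "two", "affect", "lose"}

def homophone_density(paragraph, table=HOMOPHONES_EN,
                      max_density_percent=5.0, min_words=20):
    """Return (percent, hits, total) when the paragraph is flagged, else None."""
    words = [w.lower() for w in re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", paragraph)]
    if len(words) < min_words:
        return None  # below the floor, one hit would give a misleading percentage
    hits = sum(1 for w in words if w in table)
    percent = 100.0 * hits / len(words)
    if percent > max_density_percent:  # strictly exceeds the threshold
        return (round(percent, 1), hits, len(words))
    return None

paragraph = ("Their report shows there were too many decisions to make and two "
             "teams could not affect the launch nor lose the schedule despite "
             "careful planning across each region and product line every quarter.")
print(homophone_density(paragraph))  # (21.2, 7, 33)
```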

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_density_percent`	`float`	8.0	5.0	3.0

Tune via lucid-lint.toml:

[rules."lexicon.homophone-density"]
max_density_percent = 4.0

Examples

English

Before (flagged):

Their report shows there were too many decisions to make and two teams could not affect the launch nor lose the schedule despite careful planning across each region and product line every quarter.

What lucid-lint check --profile public --experimental lexicon.homophone-density --conditions dyslexia reports:

warning input.md:1:1 Paragraph density of homophones is 21.2% (7 of 33 content words (e.g. their, there)); maximum 5.0%. Dense homophone runs raise the phonological-decoding load for dyslexic and aphasic readers; rephrase to disambiguate. [lexicon.homophone-density]

After (your rewrite):

The report shows that the team made many decisions and that the two squads kept the launch on schedule despite careful planning across each region and product line every quarter.

The rephrase swaps their / there / to / too / two for context-anchored alternatives (the report, that, the team, kept, the two squads), bringing density well below the threshold.

French

Before (flagged):

Pendant le cours du matin la cuisinière prépare le foie de veau avant la pause de midi puis revient à sa tâche après avoir rangé les ustensiles sur la grande table en bois clair.

What lucid-lint check --profile public --experimental lexicon.homophone-density --conditions dyslexia reports:

warning input.md:1:1 Paragraph density of homophones is 11.8% (4 of 34 content words (e.g. cours, foie)); maximum 5.0%. Dense homophone runs raise the phonological-decoding load for dyslexic and aphasic readers; rephrase to disambiguate. [lexicon.homophone-density]

After (your rewrite):

Pendant la séance du matin la cuisinière prépare le foie de veau avant la coupure de midi puis reprend son travail après avoir rangé les ustensiles sur la grande table en bois clair.

cours becomes séance, pause becomes coupure, tâche becomes travail — three of the four homophone hits disappear without losing meaning.

Suppression

See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:

<!-- lucid-lint disable-next-line lexicon.homophone-density -->
Their report shows there were too many decisions to make and two teams could not lose the launch.

References

British Dyslexia Association — Dyslexia Style Guide (2018). Flags homophones as a friction point for dyslexic readers.
Falc — Information pour tous (2009). FALC orthographic-clarity guidelines for aphasic and plain-language audiences.

See References for the full bibliography.

`syntax.passive-voice`

What it flags

Passive-voice constructions. Passive hides the agent and lengthens the sentence without adding information. Legitimate exceptions exist (unknown agent, scientific style, intentional focus on the action) — the rule flags, the author decides.

References. US Plain Language; Strunk & White; FALC.

At a glance


Category	`syntax`
Default severity	`warning`
Default weight	`2`
Languages	EN · FR (separate heuristics)
Source	`src/rules/passive_voice.rs`

Detection (v0.1 heuristic)

🇬🇧 be (conjugated) + past participle [+ by …]. Handles regular -ed and the irregular-participle table.
🇫🇷 être (conjugated) + past participle [+ par …], plus se faire + infinitif. Harder than EN because of participle agreement (gender/number) and confusion with (a) subject attribute (il est content vs il est vu) and (b) compound-tense être auxiliary (elle est partie — passé composé, active).

Expect ~70–80% precision. A POS-parser-based replacement is planned for a future lucid-lint-nlp plugin.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_per_paragraph`	`int`	3	1	0
`ignore_scientific_style`	`bool`	`false`	`false`	`false`

Suppression

Use inline disables on intentional passives. See Suppressing diagnostics.

References

Strunk & White (1999)
Plain Language US (2011)
CAN-ASC-3.1:2025

See References for the full bibliography.

`syntax.unclear-antecedent`

What it flags

Pronouns whose antecedent is not obvious in the immediate context. Ambiguous pronominal reference is one of the costliest comprehension breaks for readers with attentional difficulties: each ambiguity forces a conscious return-and-search.

References. Strunk & White; FALC (“prefer name repetition over pronouns”); Graesser et al. Coh-Metrix (referential cohesion).

At a glance


Category	`syntax`
Default severity	`info`
Default weight	`2`
Languages	EN · FR (separate pronoun lists)
Source	`src/rules/unclear_antecedent.rs`

Detection (v0.1 heuristic)

Exact detection requires anaphora resolution (advanced NLP). v0.1 catches the two most frequent patterns:

Bare demonstrative pronouns at sentence start (This/That/These/Those, Ceci/Cela/Ce) not followed by a noun.
Personal pronouns at paragraph start (no antecedent in the preceding context).

Severity is info because the heuristic is approximate — the noise level warrants a soft signal.

Parameters

Key	Type	Default
`check_demonstratives`	`bool`	`true`
`check_paragraph_start_pronouns`	`bool`	`true`

Pronoun lists

🇫🇷 ce, cela, ceci, ça, celui-ci, celle-ci, il, elle, ils, elles
🇬🇧 this, that, these, those, it, they, them

Example

Les performances étaient médiocres avec le cache LRU. Cela a motivé le changement.

Does Cela refer to the performance or to the cache? The reference is ambiguous.

Suppression

See Suppressing diagnostics.

References

Strunk & White (1999)
Gibson (1998)
Graesser et al. (2004)

See References for the full bibliography.

`syntax.nested-negation`

What it flags

Sentences that stack multiple negations. Two or more negations in the same sentence force the reader to mentally toggle truth values — a known burden for readers with aphasia and attention-fragile readers (ADHD), and a load multiplier for everyone reading under cognitive pressure. Plain-language guidelines (FALC, CDC Clear Communication Index, plainlanguage.gov) recommend rewriting double negatives as positives.

At a glance


Category	`syntax`
Default severity	`warning`
Default weight	`2`
Condition tags	`aphasia`, `adhd`, `general`
Languages	EN · FR (language-specific counting)
Source	`src/rules/nested_negation.rs`

Detection

Count the negations per sentence; report sentences whose count exceeds max_negations.

English — sum of word-boundary matches against the language’s negation list (not, no, never, none, nothing, nobody, nowhere, neither, nor, cannot, without) plus occurrences of the contracted n't suffix (don't, won't, isn't, doesn't, …).
French — pair-based bipartite counting. Each ne / n' clitic contributes one negation and pairs with its nearest second-position particle (pas, rien, jamais, plus, personne, aucun, aucune, guère, nulle part) within a short window; the pairing just consumes the particle to avoid double-counting. Unpaired particles in a ne-sentence contribute one more — this catches forms like rien used as a nominal negative subject. Guards: pas / plus never count when unpaired (too ambiguous outside ne …); rien preceded by de is treated as the idiom de rien and skipped; particles in a sentence with no ne clitic are skipped too (plus de courage, personne d'autre). Standalones sans / non always count.
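
Both counting schemes can be sketched in Python. This is an illustration under stated assumptions, not the code in src/rules/nested_negation.rs: the French pairing searches forward only, the window is four tokens, and the two-word particle nulle part is left out. The function names are hypothetical.

```python
import re

NEGATIONS_EN = {"not", "no", "never", "none", "nothing", "nobody", "nowhere",
                "neither", "nor", "cannot", "without"}
PARTICLES_FR = {"pas", "rien", "jamais", "plus", "personne", "aucun", "aucune", "guère"}
AMBIGUOUS_FR = {"pas", "plus"}    # never counted when unpaired
STANDALONE_FR = {"sans", "non"}   # always counted

def count_negations_en(sentence):
    lowered = sentence.lower().replace("\u2019", "'")
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", lowered)
    return sum(1 for w in words if w in NEGATIONS_EN or w.endswith("n't"))

def count_negations_fr(sentence, window=4):
    lowered = sentence.lower().replace("\u2019", "'")
    tokens = re.findall(r"n'|[^\W\d_]+", lowered)  # keep the elided clitic n' visible
    clitics = [i for i, t in enumerate(tokens) if t in ("ne", "n'")]
    consumed, count = set(), 0
    for i in clitics:
        count += 1  # each ne / n' is one negation
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            if tokens[j] in PARTICLES_FR and j not in consumed:
                consumed.add(j)  # pairing consumes the particle
                break
    if clitics:  # unpaired particles only matter in a ne-sentence
        for j, t in enumerate(tokens):
            if t not in PARTICLES_FR or j in consumed or t in AMBIGUOUS_FR:
                continue
            if t == "rien" and j > 0 and tokens[j - 1] == "de":
                continue  # idiom "de rien"
            count += 1
    return count + sum(1 for t in tokens if t in STANDALONE_FR)

print(count_negations_en("We do not say nothing is never possible."))           # 3
print(count_negations_fr("Nous ne sommes pas prêts."))                          # 1
print(count_negations_fr("Nous ne disons pas que rien n’est jamais possible."))  # 3
```

A sentence is reported when its count exceeds max_negations, so both three-negation sentences are flagged under public (maximum 2).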

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_negations`	`int`	3	2	1

Examples

lucid-lint reports; the rewrite is always yours.

English

Three negations → three affirmatives, colour-matched across the rewrite. The not simply drops — the simplification shows.

Before (flagged):

We do not say nothing is never possible.

Three negations (not, nothing, never).

What lucid-lint check --profile public reports:

warning input.md:1:1 Sentence stacks 3 negations (maximum 2). Rewrite as a positive statement or split the negations across separate sentences. [syntax.nested-negation]

After (your rewrite):

We say something is possible.

French

Passes under public:

Nous ne sommes pas prêts.

Bipartite ne ... pas counts as one negation.

Before (flagged):

Nous ne disons pas que rien n’est jamais possible.

Three negations: ne…pas (one bipartite), rien (unpaired), n'…jamais (one bipartite).

What lucid-lint check --profile public reports:

warning input.md:1:1 Sentence stacks 3 negations (maximum 2). Rewrite as a positive statement or split the negations across separate sentences. [syntax.nested-negation]

After (your rewrite):

Nous disons que quelque chose est possible.

Suppression

See Suppressing diagnostics.

References

Clark & Chase (1972)
Carpenter & Just (1975)
Kaup et al. (2006)

See References for the full bibliography.

`syntax.conditional-stacking`

What it flags

Sentences that chain multiple conditional clauses. Each if / when / unless / quand / si opens a branch the reader must keep on a mental stack until the outer clause resolves; two or three of them stacked in one sentence is a known load multiplier for readers with aphasia, ADHD, and anyone reading under cognitive pressure. Plain-language guidelines (FALC, plainlanguage.gov) recommend splitting conditional chains into separate sentences or a bullet list.

At a glance


Category	`syntax`
Default severity	`warning`
Default weight	`2`
Condition tags	`aphasia`, `adhd`, `general`
Languages	EN · FR (language-specific lists)
Source	`src/rules/conditional_stacking.rs`

Detection

Per sentence, count the conditional connectors and report counts above max_conditionals.

English — sum of word-bounded matches against the language list (if, unless, when, whenever, while, until, provided, assuming, in case, as long as, as soon as, even if, only if).
French — sum of word-bounded matches against the language list (si, sauf si, à moins que, à moins de, quand, lorsque, lorsqu', dès que, tant que, pourvu que, à condition que, à condition de, au cas où, même si, en cas de) plus the elliptic clitics s'il / s'ils.
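
A minimal Python sketch of the per-sentence count, using the two lists above. Matching longest phrases first, so that even if or sauf si is counted once and not again as if or si, is an assumption of the sketch; the description does not say how overlapping entries are handled. The function name is hypothetical.

```python
import re

CONDITIONALS = {
    "en": ["if", "unless", "when", "whenever", "while", "until", "provided",
           "assuming", "in case", "as long as", "as soon as", "even if", "only if"],
    "fr": ["si", "sauf si", "à moins que", "à moins de", "quand", "lorsque",
           "lorsqu'", "dès que", "tant que", "pourvu que", "à condition que",
           "à condition de", "au cas où", "même si", "en cas de", "s'il", "s'ils"],
}

def count_conditionals(sentence, lang):
    """Count word-bounded conditional connectors in one sentence."""
    # Longest first, so a multi-word connector wins over its shorter suffix.
    ordered = sorted(CONDITIONALS[lang], key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(c) for c in ordered) + r")\b",
        re.IGNORECASE,
    )
    normalised = sentence.replace("\u2019", "'")  # typographic to straight apostrophe
    return len(pattern.findall(normalised))

en = "If we ship, when the build passes, unless the gate fails, we deploy."
fr = "Si nous expédions, quand le test passe, à moins que la barrière échoue, nous déployons."
print(count_conditionals(en, "en"))  # 3
print(count_conditionals(fr, "fr"))  # 3
```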

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_conditionals`	`int`	3	2	1

Examples

Three conditionals, colour-matched across the rewrite — position already pairs them, the tint just confirms each branch carries through. lucid-lint reports; the rewrite is always yours.

English

Before (flagged):

If we ship, when the build passes, unless the gate fails, we deploy.

What lucid-lint check --profile public reports:

warning input.md:1:1 Sentence stacks 3 conditional clauses (maximum 2). Split the conditions across separate sentences or convert them to a bullet list. [syntax.conditional-stacking]

After (your rewrite):

We deploy when all three checks hold:

- the ship command ran,
- the build passes,
- the gate does not fail.

French

Before (flagged):

Si nous expédions, quand le test passe, à moins que la barrière échoue, nous déployons.

Three conditional connectors (si, quand, à moins que). French rewrite to come with the FR translation pass.

Known false positives

The English list mixes pure conditionals with temporal conjunctions (when, while) that often introduce conditional-like sub-clauses. Pure-temporal usages may produce a false positive on long sentences. Use disable-next-line when the temporal reading is unambiguous.

Suppression

See Suppressing diagnostics.

References

Johnson-Laird & Byrne (1991)
Evans & Over (2004)
Gibson (1998)

See References for the full bibliography.

`syntax.dense-punctuation-burst`

What it flags

Local bursts of punctuation: a sliding window of grapheme clusters that contains too many qualifying marks (,, ;, :, —, –). Tight clusters of marks signal layered subordination, parenthetical interjections, or list-within-list constructions that are hard to parse for readers with cognitive or attentional difficulties (IFLA easy-to-read guidelines).

Distinct from structure.excessive-commas, which counts commas across an entire sentence. A sentence with 8 commas spread evenly across 200 characters does not trigger here, while a sentence with 3 commas inside a 30-character span does.

At a glance


Category	`syntax`
Default severity	`warning`
Default weight	`1`
Condition tags	`general`
Languages	EN · FR (script-agnostic)
Source	`src/rules/dense_punctuation_burst.rs`

Detection

Per source line, walk the grapheme stream once and collect the column of every qualifying mark. When a window of window_graphemes graphemes holds min_marks or more marks, emit a burst spanning the first to the last mark in the window, then advance past that last mark so overlapping windows do not double-fire on the same cluster.

Code blocks (fenced and indented) are excluded upstream by the Markdown parser. Sentence terminators (., !, ?) and brackets do not count toward the burst.
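
The window scan can be sketched in Python as follows. The function name is hypothetical, and code points stand in for grapheme clusters, which is a simplification compared with the rule described above.

```python
# Comma, semicolon, colon, em dash (U+2014), en dash (U+2013).
MARKS = set(",;:\u2014\u2013")

def punctuation_bursts(line, min_marks=3, window_graphemes=30):
    """Return (first_column, last_column) for each burst found on one source line."""
    columns = [i for i, ch in enumerate(line) if ch in MARKS]
    bursts = []
    i = 0
    while i < len(columns):
        # Extend j to the last mark that fits in a window starting at columns[i].
        j = i
        while j + 1 < len(columns) and columns[j + 1] - columns[i] < window_graphemes:
            j += 1
        if j - i + 1 >= min_marks:
            bursts.append((columns[i], columns[j]))
            i = j + 1  # advance past the last mark so the cluster fires once
        else:
            i += 1
    return bursts

tight = "First, second; third: done"
print(punctuation_bursts(tight, min_marks=3))  # [(5, 20)]
print(punctuation_bursts(tight, min_marks=4))  # []
```

Three marks spread over more than window_graphemes columns produce no burst, which is the contrast with structure.excessive-commas.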

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`min_marks`	`int`	4	3	3
`window_graphemes`	`int`	30	30	40

dev-doc tolerates a 3-mark cluster (often unavoidable in technical lists adjacent to prose). FALC keeps the same density floor as public but widens the window to catch slightly looser bursts.

Known caveats

The rule operates per source line. A burst that wraps across a hard line break in the source is not detected; in practice this is rare because dense punctuation is also dense in source bytes.
Em dash (—, U+2014) and en dash (–, U+2013) qualify; the ASCII double-hyphen surrogate (--) does not, on the assumption that authors who care about readability use the proper Unicode forms.

Suppression

See Suppressing diagnostics.

References

Sweller (1988)
Gibson (1998)

See References for the full bibliography.

`syntax.parenthetical-depth`

Experimental in v0.2.x. Off by default; opt in via --experimental syntax.parenthetical-depth or [experimental] enabled = ["syntax.parenthetical-depth"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the adhd and general condition tags.

What it flags

A sentence whose maximum balanced-bracket nesting depth across () and [] reaches the profile threshold. Stacked parentheticals force the reader to track multiple suspended frames at once — a recognised “hard sentence” signal in the plainlanguage.gov and Hemingway editing traditions, and a particular cost for ADHD readers, who carry the working-memory load first.

The rule complements structure.excessive-commas, which already discounts flat (A, B, C) enumerations at depth 1. This rule fires only at depth 2 or more, so the two rules are mechanically orthogonal: a flat parenthesised list never trips this rule.

At a glance


Category	`syntax`
Default severity	`warning`
Default weight	`1`
Status	`experimental` (v0.2.x) → `stable` at v0.3 cut
Condition tags	`adhd`, `general` (gated; runs only under matching `--conditions`)
Languages	EN · FR (language-agnostic — bracket families are the same in both)
Source	`src/rules/syntax/parenthetical_depth.rs`

Detection

For each sentence, the rule walks the post-flattening paragraph text (so fenced code blocks are already excluded by the parser) and tracks a single running depth counter.

Algorithm

Walk the sentence one character at a time.
Increment depth on ( or [; decrement on ) or ].
A close that would push depth below zero resets depth to zero — the rule fails open on unbalanced markup, mirroring the posture of the parenthesised_list_comma_count helper used by structure.excessive-commas.
Track the maximum depth reached and the position of the opener that achieved it.
Emit one diagnostic per sentence when max_depth ≥ the profile threshold, anchored at the deepest opener.
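The steps above can be sketched in a few lines of Rust. The function name and return shape are hypothetical; the sketch reports a character index rather than the line and column the real diagnostic carries.

```rust
/// Returns (max_depth, char index of the deepest opener) when the sentence
/// reaches `threshold`, or None when it stays below it.
fn parenthetical_depth(sentence: &str, threshold: usize) -> Option<(usize, usize)> {
    let mut depth: usize = 0;
    let mut max_depth: usize = 0;
    let mut deepest_opener: usize = 0;

    for (idx, c) in sentence.chars().enumerate() {
        match c {
            '(' | '[' => {
                depth += 1;
                // Strict comparison keeps the first opener that reached the maximum.
                if depth > max_depth {
                    max_depth = depth;
                    deepest_opener = idx;
                }
            }
            // saturating_sub floors at zero, so a stray close cannot underflow.
            ')' | ']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }

    if threshold > 0 && max_depth >= threshold {
        Some((max_depth, deepest_opener))
    } else {
        None
    }
}
```

Because the counter is shared by both bracket families, `(a [b])` reaches depth 2 just as `(a (b))` does.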

Skips (false-positive guards)

Code spans / fenced code blocks: already excluded upstream by the Markdown parser.
Unbalanced brackets: the depth-floor reset prevents stray closes from inflating later depths.

Deferred (not in MVP)

Em-dash pairs (— … —), curly braces ({}), and comma-flanked appositives are intentionally out of scope at v0.2.x. Em-dash pair detection is fragile (en/em-dash confusion, hyphen ambiguity) and would smuggle scope back in.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_depth`	`int`	4	3	2

max_depth is the inclusive nesting depth at which the rule fires. A sentence whose deepest bracket frame is one level shallower stays silent.

Tune via lucid-lint.toml:

[rules."syntax.parenthetical-depth"]
max_depth = 3

Examples

English

Before (flagged):

The migration tool (which now supports rollbacks (see --reverse, added in 0.4.2 [tracked in #312])) is opt-in.

What lucid-lint check --profile public --experimental syntax.parenthetical-depth --conditions adhd reports:

warning input.md:1:21 Nested parentheticals reach depth 3; readers must hold 3 suspended thoughts to reach the close. Split the sentence or unnest the inner bracket (plainlanguage.gov, Hemingway). [syntax.parenthetical-depth]

After (your rewrite):

The migration tool is opt-in. It now supports rollbacks via --reverse, added in 0.4.2 (tracked in #312).

The two outer parentheticals are gone; the remaining one sits flat at depth 1. A reader no longer has to push three suspended thoughts on the stack to reach the close.

French

Before (flagged):

Le module (qui dépend du noyau (chargé au démarrage [voir le manuel])) est facultatif.

What lucid-lint check --profile public --experimental syntax.parenthetical-depth --conditions adhd reports:

warning input.md:1:23 Nested parentheticals reach depth 3; readers must hold 3 suspended thoughts to reach the close. Split the sentence or unnest the inner bracket (plainlanguage.gov, Hemingway). [syntax.parenthetical-depth]

After (your rewrite):

Le module est facultatif. Il dépend du noyau, chargé au démarrage. Voir le manuel pour les détails.

Three sentences, no nested brackets. The dependency chain is now linear and the reader recovers each fact in the order it appears.

Suppression

See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:

<!-- lucid-lint disable-next-line syntax.parenthetical-depth -->
The migration tool (which now supports rollbacks (see `--reverse`, added in 0.4.2 [tracked in #312])) is opt-in.

References

plainlanguage.gov — Write short sentences. Plain-language guidance treats stacked qualifiers and nested parentheticals as the canonical “long sentence” symptom.
Hemingway editing tradition — surfaces “hard to read” sentences when they layer multiple suspended ideas; nested parentheticals are the cleanest mechanical reading of that signal.

See References for the full bibliography.

`readability.score`

What it flags

A document-level readability index. Readability formulas are the historical synthetic signal for text complexity — simple, reproducible, recognized by US/UK government guidelines and WCAG. Treat it like cyclomatic complexity: a metric first, a warning second.

At a glance


Category	`readability`
Default severity	`info` (always reported) · `warning` when above `max_grade_level`
Default weight	`5`
Languages	EN — Flesch-Kincaid · FR — Kandel-Moles (auto-selected per detected language; v0.2+)
Source	`src/rules/readability_score.rs`

Detection (v0.2 — per-language formula)

The formula is selected by the document’s detected language:

English — Flesch-Kincaid Grade Level:

0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59

The result is a US school grade, compared directly to max_grade_level.

French — Kandel & Moles (1958):

207 − 1.015 × (words / sentences) − 73.6 × (syllables / words)

The result is an ease score on roughly 0..100 (higher = easier), Flesch-style. To stay comparable across languages, the rule converts it to a grade-equivalent with the standard linear approximation (100 − score) / 10, and compares that against max_grade_level. The diagnostic message surfaces both the native ease score and the grade-equivalent.

Unknown language falls back to Flesch-Kincaid.
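As a worked illustration of the two formulas and the grade-equivalent conversion, here is a small Rust sketch. The type and function names are hypothetical, and the counts (words, sentences, syllables) are assumed to be non-zero and computed elsewhere.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Language {
    En,
    Fr,
    Unknown,
}

/// Flesch-Kincaid Grade Level (US school grade).
fn flesch_kincaid_grade(words: f64, sentences: f64, syllables: f64) -> f64 {
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
}

/// Kandel-Moles ease score (higher is easier).
fn kandel_moles_ease(words: f64, sentences: f64, syllables: f64) -> f64 {
    207.0 - 1.015 * (words / sentences) - 73.6 * (syllables / words)
}

/// Grade-equivalent value that is compared against max_grade_level.
fn grade_equivalent(language: Language, words: f64, sentences: f64, syllables: f64) -> f64 {
    match language {
        // French: convert the ease score with the linear approximation (100 - score) / 10.
        Language::Fr => (100.0 - kandel_moles_ease(words, sentences, syllables)) / 10.0,
        // English, and the fallback for an unknown language.
        Language::En | Language::Unknown => flesch_kincaid_grade(words, sentences, syllables),
    }
}
```

For 100 words, 5 sentences, and 150 syllables, the English path gives 7.8 + 17.7 − 15.59 = 9.91, and the French path gives an ease score of 207 − 20.3 − 110.4 = 76.3, hence a grade-equivalent of 2.37.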

Grade	US school equivalent
< 6	Elementary
6–9	Middle school
9–12	High school
12–16	College
> 16	Expert

Additional formulas (Gunning Fog, SMOG, Dale-Chall, Scolarius) and multi-formula --readability-verbose reports remain on the roadmap.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_grade_level`	`float`	14	9	6
`always_report`	`bool`	`true`	`true`	`true`
`formula`	`auto` \| `flesch-kincaid` \| `kandel-moles`	`auto`	`auto`	`auto`

Override formula via --readability-formula on the CLI; auto uses the detected language, other values pin the formula.

Output modes

Always reported as info (for observability, even when under the threshold).
Reported as warning when the grade level exceeds max_grade_level.
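The two output modes reduce to a small decision, sketched below with hypothetical names. The behaviour when `always_report` is false (stay silent under the threshold) is an assumption; the text above only describes the default `true` case.

```rust
#[derive(Debug, PartialEq)]
enum Severity {
    Info,
    Warning,
}

/// Picks the severity for a document-level grade.
/// Assumption: with `always_report = false`, a grade at or under the
/// threshold produces no diagnostic at all.
fn readability_severity(grade: f64, max_grade_level: f64, always_report: bool) -> Option<Severity> {
    if grade > max_grade_level {
        Some(Severity::Warning)
    } else if always_report {
        Some(Severity::Info)
    } else {
        None
    }
}
```

A grade of 9.91 is therefore a warning under public (threshold 9) and falc (threshold 6), and an info under dev-doc (threshold 14).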

Suppression

Suppressing a document-level metric is rarely the right answer; adjust max_grade_level in lucid-lint.toml instead. See Configuration.

References

Flesch (1948)
Kincaid et al. (1975)
CAN-ASC-3.1:2025

See References for the full bibliography.

`readability.large-number-unanchored`

Experimental in v0.2.x. Off by default; opt in via --experimental readability.large-number-unanchored or [experimental] enabled = ["readability.large-number-unanchored"] in lucid-lint.toml. Flips to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort flip. See Conditions for the dyscalculia and general condition tags.

What it flags

A large numeral or magnitude word that appears in a sentence with no nearby anchor: no unit, no percentage, no currency symbol, no ratio, no comparator phrase. The CDC Clear Communication Index asks whether numbers are clear and meaningful for the primary audience; plainlanguage.gov is more direct on the mechanism: “Use Numbers Effectively” recommends giving every large figure a comparison or denominator the reader can ground. Readers with dyscalculia carry the cost first: a context-free “4,8 milliards” forces an unaided magnitude estimate that ordinary prose context does not provide.

The rule complements structure.number-run, which fires on numeric clusters (≥ N tokens in one sentence). This rule fires on a single large or magnitude-word numeral that lacks anchoring context.

At a glance


Category	`readability`
Default severity	`warning`
Default weight	`1`
Status	`experimental` (v0.2.x) → `stable` at v0.3 cut
Condition tags	`dyscalculia`, `general` (gated; runs only under matching `--conditions`)
Languages	EN · FR (per-language comparator and figure-ref lexicons)
Source	`src/rules/readability/large_number_unanchored.rs`

Detection

For each sentence, the rule walks the post-flattening paragraph text (so fenced code blocks are already excluded by the parser) and searches for unanchored candidates.

Candidate definition

A sentence-level candidate is either:

A numeric token whose digit count is ≥ 4 and whose integer value is ≥ the profile threshold. The scanner collapses common thousands separators (,, ., ASCII space, NBSP, thin space, narrow NBSP) between digit groups so 1 000 (FR) and 1,000 (EN) both count as one 4-digit token with value 1000.
A magnitude word — million(s), billion(s), trillion(s) in EN; million(s), milliard(s), billion(s), trillion(s) in FR. Whole-word, case-insensitive.

Skips (false-positive guards)

Year-shaped: exactly 4 contiguous digits with no thousands or decimal separators and value in 1000..=2999. 2024 and 1789 are years, not magnitudes.
Ordinal: digit run immediately followed by a letter (1st, 12th).
Figure / page / section reference: candidate preceded (within 16 bytes, same sentence) by Figure, Fig., Page, Section, §, p., pp., #, or the FR equivalents (figure, page, section, tableau, chapitre, annexe).
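The numeric-candidate test and the year and ordinal skips can be sketched as below. The function names are hypothetical, the token is assumed to be already isolated by the scanner, and the "groups of three digits" check is an assumption added to keep decimals such as 3.14159 out; the figure-reference skip is not modelled.

```rust
const SEPARATORS: [char; 6] = [',', '.', ' ', '\u{00A0}', '\u{2009}', '\u{202F}'];

/// Collapses thousands separators. Returns (digit_count, value, had_separator),
/// or None when the token is not a plain or grouped integer.
fn collapse(token: &str) -> Option<(usize, u64, bool)> {
    let groups: Vec<&str> = token.split(|c: char| SEPARATORS.contains(&c)).collect();
    let all_digits = groups
        .iter()
        .all(|g| !g.is_empty() && g.chars().all(|c| c.is_ascii_digit()));
    if !all_digits {
        return None; // letters (ordinals such as "12th") end up here
    }
    let had_separator = groups.len() > 1;
    // Assumption: a grouped number is 1 to 3 digits followed by groups of exactly 3.
    if had_separator && (groups[0].len() > 3 || groups[1..].iter().any(|g| g.len() != 3)) {
        return None;
    }
    let digits: String = groups.concat();
    let value = digits.parse::<u64>().ok()?;
    Some((digits.len(), value, had_separator))
}

/// True when the token is a numeric candidate under the given `min_value`.
fn is_candidate(token: &str, min_value: u64) -> bool {
    match collapse(token) {
        Some((count, value, had_separator)) => {
            // Year-shaped: 4 contiguous digits, no separator, value in 1000..=2999.
            let year_shaped = count == 4 && !had_separator && (1000..=2999).contains(&value);
            count >= 4 && value >= min_value && !year_shaped
        }
        None => false,
    }
}
```

Under falc (`min_value = 1000`), both `1 000` and `1,000` are candidates while `2024` is skipped as a year; under public (`min_value = 10000`), neither form of 1000 qualifies.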

Anchor types (sentence-scoped)

Any of the following anywhere in the sentence anchors all candidates in that sentence:

Percent sign (%).
Currency symbol ($, €, £, ¥).
Unit token from a small curated list (km, kg, m², °C, L, Hz, MB, Mo, …).
Ratio pattern: X out of Y, X sur Y, or X / Y between digits.
Comparator phrase from the per-language lexicon (EN: roughly, approximately, more than, the size of, …; FR: soit environ, équivalent à, environ, plus de, par rapport à, …).
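A rough sketch of the sentence-scoped anchor check follows. The function names and the lexicon arguments are hypothetical, and the comparator match is a naive substring test (a real lexicon match would presumably respect word boundaries).

```rust
fn is_number(tok: &str) -> bool {
    !tok.is_empty() && tok.chars().all(|c| c.is_ascii_digit())
}

/// True when the sentence carries at least one anchor of any type.
fn has_anchor(sentence: &str, comparators: &[&str], units: &[&str]) -> bool {
    // Percent sign or currency symbol anywhere in the sentence.
    if sentence.chars().any(|c| matches!(c, '%' | '$' | '€' | '£' | '¥')) {
        return true;
    }
    // Whitespace tokens with surrounding ASCII punctuation trimmed ('/' is kept for ratios).
    let toks: Vec<&str> = sentence
        .split_whitespace()
        .map(|t| t.trim_matches(|c: char| c.is_ascii_punctuation() && c != '/'))
        .collect();
    // Unit token from the curated list (case-sensitive: "MB" is not "mb").
    if toks.iter().any(|t| units.contains(t)) {
        return true;
    }
    // Ratio patterns between digits: "X out of Y", "X sur Y", "X / Y", "X/Y".
    let word = |t: &str, w: &str| t.eq_ignore_ascii_case(w);
    let ratio = toks
        .windows(4)
        .any(|w| is_number(w[0]) && word(w[1], "out") && word(w[2], "of") && is_number(w[3]))
        || toks
            .windows(3)
            .any(|w| is_number(w[0]) && (word(w[1], "sur") || w[1] == "/") && is_number(w[2]))
        || toks
            .iter()
            .any(|t| matches!(t.split_once('/'), Some((a, b)) if is_number(a) && is_number(b)));
    if ratio {
        return true;
    }
    // Comparator phrase: naive case-insensitive substring match.
    let lower = sentence.to_lowercase();
    comparators.iter().any(|p| lower.contains(&p.to_lowercase()))
}
```

Requiring digits on both sides of `sur` keeps ordinary prepositional uses ("sur la table") from counting as a ratio.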

The diagnostic location points at the first surviving candidate in the offending sentence, so the squiggle in your editor lands on the visible numeral rather than the start of the sentence.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`min_value`	`int`	100000	10000	1000

min_value is the inclusive lower bound on the integer value of a numeric candidate. Tokens that meet the digit-count gate but fall below min_value are skipped — page-number-like quantities already get the figure-ref skip; this is a second safety net.

Tune via lucid-lint.toml:

[rules."readability.large-number-unanchored"]
min_value = 50000

Examples

English

Before (flagged):

The proposal mentions several billion in vague spending across regions.

What lucid-lint check --profile public --experimental readability.large-number-unanchored --conditions dyscalculia reports:

warning input.md:1:32 Magnitude word appears with no anchor in this sentence (no unit, percentage, ratio, or comparison phrase). plain-language guidance recommends pairing magnitude words with a unit or a comparison the reader can ground. [readability.large-number-unanchored]

After (your rewrite):

The proposal mentions several billion dollars in vague spending across regions, roughly the annual budget of a mid-sized state agency.

The figure now sits next to a unit (dollars) and a comparator phrase (roughly the annual budget); both anchor the magnitude for a reader who cannot ground it from raw scale.

French

Before (flagged):

Le budget atteint 4 800 000 000 selon le rapport final.

What lucid-lint check --profile public --experimental readability.large-number-unanchored --conditions dyscalculia reports:

warning input.md:1:19 Large numeral (10-digit, value ≈ 4800000000) appears with no anchor in this sentence (no unit, percentage, ratio, or comparison phrase). plain-language guidance recommends giving large numbers a comparison or denominator the reader can ground. [readability.large-number-unanchored]

After (your rewrite):

Le budget atteint 4,8 milliards d’euros, soit environ 6 % du PIB selon le rapport final.

The figure is now accompanied by a unit (euros), a percentage (6 %), and a comparator phrase (soit environ). A reader who cannot estimate “4,8 milliards” raw now has three independent anchors.

Suppression

See Suppressing diagnostics for the inline and block forms. Inline disable also works on this rule:

<!-- lucid-lint disable-next-line readability.large-number-unanchored -->
The proposal mentions several billion in vague spending across regions.

References

plainlanguage.gov — Use numbers effectively. “Help your reader visualize numbers… Compare numbers to something the reader is familiar with.”
CDC Clear Communication Index — Numbers. Item 6 asks whether numbers are clear and meaningful for the primary audience.

See References for the full bibliography.

Architecture overview

lucid-lint is a small Rust crate with a deliberately simple pipeline.

Pipeline

 input text
     │
     ▼
┌──────────────────────────┐
│ Language detection       │   stop-word ratio heuristic
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Parser                   │   pulldown-cmark or plain text
│ (Markdown | plain)       │
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Document model           │   Section > Paragraph > Sentence
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Rules                    │   Each rule gets the document + language
│ (sentence-too-long, ...) │
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Diagnostics              │   rule_id, severity, location, section,
│                          │   message, weight
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐     v0.2+
│ Scoring                  │   density-normalized, category-capped
│ (Scorecard)              │   5 fixed categories
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Output formatter         │   TTY (default) or JSON
│                          │   — carries diagnostics + scorecard
└──────────────────────────┘

Key types

Diagnostic — the output unit. Carries weight (seeded from scoring::default_weight_for) as of v0.2.
Rule (trait) — fn check(document, language) -> Vec<Diagnostic>.
Document — the parser’s output. Section-aware.
Scorecard — global: Score plus [CategoryScore; 5] in fixed Structure · Rhythm · Lexicon · Syntax · Readability order.
Report — diagnostics + scorecard + word_count, returned by Engine::lint_* since v0.2.
Engine — bundles a profile, rule set, and optional ScoringConfig; exposes lint_str, lint_file, lint_stdin.
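To make the shape of these types concrete, here is a simplified sketch. It is not the crate's actual definitions: the `&self` receiver, the flat `Document`, the `sentence_index` field (standing in for a real location), and the `max_words` parameter are all assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Language {
    En,
    Fr,
    Unknown,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    Info,
    Warning,
}

/// Output unit. Field set follows the list above; types are guesses.
#[derive(Debug, Clone, PartialEq)]
struct Diagnostic {
    rule_id: &'static str,
    severity: Severity,
    sentence_index: usize, // hypothetical stand-in for `location`
    section: Option<String>,
    message: String,
    weight: u32,
}

/// Simplified: the real model is Section > Paragraph > Sentence.
struct Document {
    sentences: Vec<String>,
}

trait Rule {
    fn check(&self, document: &Document, language: Language) -> Vec<Diagnostic>;
}

/// Toy rule: flags sentences with more than `max_words` words.
struct SentenceTooLong {
    max_words: usize,
}

impl Rule for SentenceTooLong {
    fn check(&self, document: &Document, _language: Language) -> Vec<Diagnostic> {
        document
            .sentences
            .iter()
            .enumerate()
            .filter_map(|(index, sentence)| {
                let words = sentence.split_whitespace().count();
                (words > self.max_words).then(|| Diagnostic {
                    rule_id: "sentence-too-long",
                    severity: Severity::Warning,
                    sentence_index: index,
                    section: None,
                    message: format!("Sentence has {} words (max {}).", words, self.max_words),
                    weight: 1,
                })
            })
            .collect()
    }
}
```

The rule is a pure function of the document and language, which is what keeps the core deterministic.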

Design principles

These principles are enforced in code review. See Design decisions for background.

Make impossible states impossible — newtypes, enums with data, NonZeroU32.
Functional style where it helps — iterator chains, pure rule functions.
Atomic rules — one rule, one signal.
Deterministic core — no network, no LLM, no env-dependent behavior.
YAGNI — no speculative abstractions.

Module layout

src/
├── lib.rs             — library root
├── main.rs            — binary entry point
├── cli.rs             — clap CLI
├── config.rs          — profile presets, config file parsing
├── engine.rs          — orchestration
├── language/          — detection + per-language data
├── parser/            — Markdown + plain + tokenizer + document model
├── rules/             — one file per rule
├── scoring.rs         — hybrid scoring model (v0.2+)
├── output/            — TTY + JSON formatters
└── types.rs           — domain types (Diagnostic, Severity, Location, ...)

Design decisions

This page records design decisions, starting with those made during v0.1. Revisit the rationale here before changing any of them.

Linter model vs scoring model

Decision: v0.1 shipped as a classic linter with info / warning severities. v0.2 added a hybrid scoring model (global score + per-category sub-scores + diagnostics) on top, without removing the linter form.

Rationale: shipping the linter form first let us validate detection quality on real corpora before adding the aggregation layer. The scoring layer is additive — consumers that only care about diagnostics ignore the scorecard.

Hybrid scoring model (v0.2)

Decision: global + 5 per-category sub-scores, all in X / max form. Composition stacks a weighted sum, density normalization (per 1 000 words, floored at 200), and a per-category cap. 5 fixed categories: Structure · Rhythm · Lexicon · Syntax · Readability. New Diagnostic.weight field, new --min-score=N CLI flag.

Rationale (full brainstorm at brainstorm/20260420-score-semantics.md):

X / max over 0–100: arbitrary max lets us re-tune without claiming the 80 we ship today is the same 80 next release. The /impeccable skill already uses this convention.
5 fixed categories: couples nothing to a rule rename; uses the category_of(rule_id) helper already decided in v0.1. Derive-from-prefix (plan B) was rejected because it would require renaming 17 rules for F14 alone.
Three composition mechanics stacked: no single one covers every failure mode. Density alone punishes short docs; weights alone lose to a runaway rule; caps alone can’t reflect cost magnitude.
Letter grades, traffic lights, pass/fail margin, reading-time-seconds were cut from the v0.2 design after a first-principles pass (F-score-letter-grade–F-reading-time-score in ROADMAP). They duplicate function-1 (at-a-glance) that the number already serves.
Actionability (function-2) is delivered by the diagnostics list, not the score. So sub-scores can afford to be minimal — F37 makes sure diagnostic messages hold up the actionability side of the contract.
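One way to read the three stacked mechanics is sketched below. The exact composition is not spelled out here, so treat this as an assumption: the floor is applied to the word count, the cap to the normalized cost, and the score is `max` minus that cost. Function names, the cap, and the max are hypothetical.

```rust
/// Cost for one category: weighted sum, density-normalized, then capped.
fn category_cost(weights: &[u32], word_count: usize, cap: f64) -> f64 {
    // 1. Weighted sum of the diagnostics in this category.
    let weighted_sum: f64 = weights.iter().map(|&w| f64::from(w)).sum();
    // 2. Density per 1 000 words; the word count is floored at 200 so that
    //    very short documents are not over-penalized.
    let effective_words = word_count.max(200) as f64;
    let per_thousand = weighted_sum * 1000.0 / effective_words;
    // 3. Per-category cap, so a runaway rule cannot sink the whole score.
    per_thousand.min(cap)
}

/// Sub-score in X / max form.
fn category_score(max: f64, weights: &[u32], word_count: usize, cap: f64) -> f64 {
    max - category_cost(weights, word_count, cap)
}
```

With weights 1, 1, and 5 in a 1 000-word document the cost is 7. The same diagnostics in a 50-word document are normalized against the 200-word floor (cost 35), then held at the cap.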

Diagnostic struct

Decision: a Diagnostic carries rule_id, severity, location, section, message, and (as of v0.2) weight.

What’s NOT stored and why:

category — derivable from rule_id via Category::for_rule. Storing it would duplicate information and risk drift.
suggestion — still deferred; current messages are actionable on their own.

What IS stored and why:

section — recomputing it after the fact would require re-parsing the document to walk headings and match locations. The storage cost is an Option<String> per diagnostic; the recompute cost is a second full parse.
weight (v0.2) — seeded at emission from scoring::default_weight_for so that user overrides (via config) and rule-level overrides (via with_weight) both flow through aggregation without a second lookup.

Deterministic core, plugins for the rest

Decision: the core ships only deterministic rules. LLM-based rules, network-backed rules, or ML-model-backed rules live in optional plugin crates (planned v0.3).

Rationale: a pre-commit hook that takes 5 seconds and varies between runs is worse than no hook. Determinism is non-negotiable in the happy path.

Bilingual EN/FR from day one

Decision: every language-dependent rule supports English and French from v0.1.

Rationale: most French-speaking OSS developers write docs in English. Targeting French only would miss the majority. Supporting both from day one is cheap and signals the ambition.

Single readability formula in v0.1

Decision: v0.1 uses Flesch-Kincaid Grade Level for all languages. Language-specific formulas (Kandel-Moles for French, SMOG, Coleman-Liau) are deferred to v0.2.

Rationale: Flesch-Kincaid is understood, reproducible, and well-behaved. Adding three more formulas before validating the basics would be premature optimization.

Markdown + plain text + stdin, Pandoc for the rest

Decision: native support for .md, .markdown, .txt, and stdin in v0.1. Other formats (AsciiDoc, HTML, docx, PDF) use Pandoc as a pre-processor.

Rationale: Markdown covers the overwhelming majority of open-source and technical writing. Pandoc is free, ubiquitous, and removes the burden of maintaining multiple parsers.

One file per rule

Decision: each rule lives in its own file under src/rules/ with a consistent structure (struct, config, Rule impl, tests).

Rationale: makes adding a rule a well-defined operation (new file from template), and makes reviewing easy (one rule, one PR, one file to read).

Stop-word heuristic for language detection

Decision: v0.1 detects language by stop-word ratio. No external dependency.

Rationale: short, deterministic, no runtime cost. For the cases where it fails (very short texts, code-heavy docs), the unknown fallback is safe.
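A stop-word ratio detector can be sketched in a few lines. The word lists, the `min_ratio` threshold, and the function names below are illustrative assumptions, not the crate's actual data.

```rust
#[derive(Debug, PartialEq)]
enum Language {
    En,
    Fr,
    Unknown,
}

// Tiny illustrative lists; a real detector would use larger ones.
const EN_STOP: [&str; 8] = ["the", "and", "of", "to", "is", "in", "that", "it"];
const FR_STOP: [&str; 8] = ["le", "la", "les", "et", "des", "est", "un", "une"];

/// Share of words that appear in the given stop-word list.
fn stop_word_ratio(words: &[String], stop_words: &[&str]) -> f64 {
    let hits = words.iter().filter(|w| stop_words.contains(&w.as_str())).count();
    hits as f64 / words.len() as f64
}

fn detect_language(text: &str, min_ratio: f64) -> Language {
    // Lowercased alphabetic runs; digits and punctuation act as separators.
    let words: Vec<String> = text
        .split(|c: char| !c.is_alphabetic())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect();
    if words.is_empty() {
        return Language::Unknown;
    }
    let en = stop_word_ratio(&words, &EN_STOP);
    let fr = stop_word_ratio(&words, &FR_STOP);
    // Too little signal, or a tie: fall back to Unknown.
    if en.max(fr) < min_ratio || en == fr {
        Language::Unknown
    } else if en > fr {
        Language::En
    } else {
        Language::Fr
    }
}
```

Code-heavy input such as `fn main() {}` contains no stop words at all, so it lands on the `Unknown` fallback rather than being misclassified.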

Profile presets as enum variants

Decision: profiles are Profile::DevDoc | Public | Falc. They cannot be defined in user config in v0.1.

Rationale: adding custom profiles is a speculative abstraction until someone asks for it. Per-rule overrides are enough to cover 95% of the “I want a slightly different preset” cases.
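The enum-plus-override pattern can be sketched as follows. The method and function names are hypothetical; the preset values are the max_depth values from the syntax.parenthetical-depth parameter table.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Profile {
    DevDoc,
    Public,
    Falc,
}

impl Profile {
    /// Preset `max_depth` for syntax.parenthetical-depth.
    fn max_depth(self) -> u32 {
        match self {
            Profile::DevDoc => 4,
            Profile::Public => 3,
            Profile::Falc => 2,
        }
    }
}

/// A per-rule override from the config file wins over the profile preset.
fn effective_max_depth(profile: Profile, override_value: Option<u32>) -> u32 {
    override_value.unwrap_or_else(|| profile.max_depth())
}
```

Because the `match` is exhaustive, adding a fourth profile variant forces every preset table to be updated at compile time, which is part of the appeal of keeping profiles as enum variants.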

ROADMAP source-of-truth pipeline (v0.2.x+)

Decision: ROADMAP.md is demoted from edited source to generated artifact. The source-of-truth becomes a structured set of files under .roadmap/ (gitignored), one markdown file per feature with TOML front-matter, plus narrative chunks. A small Rust workspace member (crates/roadmap-cli) provides add / generate / validate / rename subcommands. The generator is invoked locally during release prep; the regenerated ROADMAP.md is committed on the release-prep PR. CI does not regenerate. Scoped under F-roadmap-toml-source.

Rationale:

Branch protection on main (in place since 2026-05-03 via F-repo-config-hardening) forces every ROADMAP.md tweak through the worktree → branch → PR → CI → merge → cleanup cycle. Forecast steady-state was 10–30 ROADMAP-only edits per week. The PR review value on those edits is null (solo author), so the ceremony was pure overhead.
Path-scoped ruleset bypass on ROADMAP.md would weaken the branch-protection signals tracked by the OpenSSF Scorecard / Best Practices badges. Demoting the file from main source preserves those signals untouched.
Per-feature files give per-feature git diffs, kill schema lock-in (front-matter is per-file optional), and let narrative sections live as plain markdown rather than TOML strings.
Rust over Python for the generator: reuses pulldown-cmark already in dependencies, folds tests into cargo test, single-toolchain maintenance, and stays extractable as a standalone crate if the tool matures.
Local generator (not CI) avoids granting CI any access to .roadmap/ (gitignored and machine-local). Release cadence — not real-time — was an accepted trade-off; the public ROADMAP.md artifact updates per v* tag.
Day-1 blockers on landing: deterministic <a id="…"> anchor emission (so existing [F46](#f46)-style cross-links from PRs and commits keep resolving), an add templating subcommand (so creating a feature is one keystroke, not a regression), and a round-trip determinism test (regenerate the artifact, diff against committed, fail on drift).

Emergency fallback: if crates/roadmap-cli work overruns budget, the file moves instead to a roadmap orphan branch with direct push and the same .md shape — preserves Scorecard signals via a different mechanism, at the cost of a non-standard branch layout. Documented as the escape hatch but not the chosen path.

References to follow before changing these

RULES.md — the authoritative rule reference
ROADMAP.md — future work tracked
CODING_STANDARDS.md — day-to-day conventions

lucid-lint — Roadmap

Future rules, refinements, and platform extensions tracked from v0.1 onwards.

Status as of 2026-05-02: v0.1 shipped 2026-04-20 (17 rules). v0.2.0 shipped 2026-04-22 (25 rules + hybrid scoring + SARIF + condition tags), v0.2.1 + v0.2.2 shipped 2026-04-23, v0.2.3 shipped 2026-04-29 (structure.line-length-wide author-break-aware + encoding hygiene F110/F111/F112 + correctness wins). The v0.2.x patch cycle is active: F25 closed 2026-05-01 (FR pair-completeness 41/41); the FR content-staleness gate is --strict on main since 2026-05-01 and on PRs since 2026-05-02 (F92 sub-task fully closed); F35b/F35c, F104, F105, F107, F123 all shipped. v0.3 strategy locked 2026-05-02: the breaking change is the 5-rule cohort (F46 / F49 / F51 / F53 / F57) flipping from default-off to default-on. Each rule ships in v0.2.x as Experimental via the F-experimental-rule-status substrate — visible, opt-in for dogfooding, no score regression — then flips to Stable at the v0.3 cut. v0.4 is a horizon bet list.

Legend

Status	Meaning
✅	Done (merged on `main`)
🚧	In progress
☐	Not started

Priority	Meaning
🔴 Next	Actively queued for the next cut
🟡 Later	Likely someday, not scheduled
🟢 Speculative	Nice-to-have, exploratory
—	Shipped; priority meaningless once the item has landed

At a glance

Version-centric and topic-centric summary views. The sections below this one are the authoritative topic-centric tables; use them when you need origin, rationale, or full history. Use this section when you need to answer “what’s next?” or “what’s the 0.3 shape?” in a glance.

Version snapshot

Version	State	Breaking?	Headline content
v0.1	✅ Released 2026-04-20	—	17 rules across 5 phases, minimal inline-disable, mdBook site with FR stub
v0.2.0	✅ Released 2026-04-22	Yes (rule-id harmonisation)	Hybrid scoring (F14), SARIF (F32), condition tags (F71/F72), 8 new rules (25 total), F-readability-formulas-extra EN/FR auto-formula
v0.2.1	✅ Released 2026-04-23	No	Localhost 404.html fix, 3rd per-rule TOML override, fixtures pipeline, TTY GIFs, v0.1/v0.2 prose sweep
v0.2.2	✅ Released 2026-04-23	No	FR `syntax.nested-negation` pair-based counting
v0.2.3	✅ Released 2026-04-29	No	`structure.line-length-wide` author-break-aware (60+ FR FPs killed), encoding hygiene at the engine boundary (F110/F111/F112 — UTF-8 BOM strip + NFC normalisation), strict whitelist validation, library `expect()` removal, scoring clamp invariant
v0.2.x	🚧 In progress	No	FR translations (F25 ✅ closed 2026-05-01), responsive (F-docs-responsive), F-rule-mention-linking rule-mention linking, F-example-fixtures-part2 part 2, F-project-scoring-rollup project roll-up
v0.3	☐ Scoped	Yes	F22 v0.3 slice, F-readability-formulas-extra remainder, 5 condition-tag rules (F46/F49/F51/F53/F57)
v0.4	☐ Horizon	Varies	LLM plugin (F-llm-plugin), alternative formats (F-asciidoc-support–F-pandoc-companion), feedback-driven items

Feature catalog (active work)

Filtered to 🔴 Next + 🚧 In-progress. The narrative sections later in this file are the source of truth; this catalog is a derived index, hand-maintained alongside the narrative. If you spot drift, the narrative wins.

Sort: target version (current cycle first) → status (🚧 in-progress before ☐ next) → F-ID.

ID	Topic	Status	Target	Summary
F22	Rules refinement	🚧	v0.2.x → v0.3	Parenthesised-list (Oxford ✅; non-Oxford + interleaved deferred to v0.3 slice)
F-example-fixtures-part2	Example fixtures	🚧	v0.2.x	Part 2 — redistributable replacements (3/N closed 2026-05-01)
F-experimental-rule-status	Architecture	☐	v0.2.x	Experimental rule status substrate — gates the v0.3 cohort, opens dogfood window
F143	Architecture	☐	v0.2.x	Inline AST layer over pulldown-cmark — substrate for F49 (gates the cohort lead)
F-weasel-words-severity-tiering	Rules refinement	☐	v0.2.x	Severity tiering for `lexicon.weasel-words` (quantifier `info`, hedge `warning`) — unblocks the audit-and-PR play
F-redundant-intensifier-bullet-fix	Rules refinement	☐	v0.2.x	`lexicon.redundant-intensifier` parser miss in bullet / `strong` spans (verification slice for F-tight-list-paragraphs)
F-severity-floor-flag	Suppression / config	☐	v0.2.x	`--severity-floor=warning` CLI flag — narrow-audit shape for external PRs
F-roadmap-slug-ids	Architecture	☐	v0.2.x	New ROADMAP entries adopt `F-<slug>` form (legacy F1–F146 stay numeric); slug-uniqueness Rust test enforces, runs offline
F-repo-discoverability-polish	Adoption channels	☐	v0.2.x	README badge row (crates.io / docs / CI / license) + GitHub social preview image (1280×640) — first-impression surfaces for crates.io and link unfurls
F-report-quick-wins	Reporting / DX	☐	v0.2.x	TTY quick-wins block under the diagnostic list — acronym whitelist hint + single-rule hot-spot hint; non-breaking, additive
F-project-scoring-rollup	Architecture	☐	v0.2.x	Project-level scoring roll-up (per-file + summary)
F-rule-mention-linking	Docs — content	☐	v0.2.x	Rule-mention linking pass across guide-prose pages
F-docs-responsive	Docs — reading prefs	☐	v0.2.x	Responsive / mobile adaptation
F-github-action	Adoption channels	🚧	v0.3	GitHub Action — composite scaffold internal; v0.3 first cut emits `::warning::`
F-readability-formulas-extra	Rules refinement	☐	v0.3	SMOG / Dale-Chall / Scolarius / `--readability-verbose`
F-tight-list-paragraphs	Architecture	☐	v0.3	Markdown parser emits paragraphs for tight list items (correctness)
F46	New rules (v0.3)	✅	v0.3	`lexicon.homophone-density` — shipped 2026-05-03 (PR #41) as `Status::Experimental`; flips to `Stable` at v0.3 cut
F49	New rules (v0.3)	✅	v0.3	`structure.italic-span-long` — shipped 2026-05-02 (PR #26) as `Status::Experimental` (cohort lead); flips to `Stable` at v0.3 cut
F51	New rules (v0.3)	✅	v0.3	`structure.number-run` — shipped 2026-05-04 (PR #46) as `Status::Experimental`; flips to `Stable` at v0.3 cut
F53	New rules (v0.3)	✅	v0.3	`readability.large-number-unanchored` — shipped 2026-05-04 (PR #50) as `Status::Experimental`; flips to `Stable` at v0.3 cut
F57	New rules (v0.3)	✅	v0.3	`syntax.parenthetical-depth` — shipped 2026-05-04 (PR #55) as `Status::Experimental`; flips to `Stable` at v0.3 cut
F-npm-wrapper	Adoption channels	☐	v0.3	npm wrapper (`@lucid-lint/cli-{platform}` `optionalDependencies` pattern)
F-docsrs-metadata–F-public-api-audit	Docs.rs polish	☐	v0.3	`[package.metadata.docs.rs]`, logo + favicon, doctests, `cargo public-api` audit

Topic heatmap

Where the active energy is. Counts include 🔴 Next only; shipped items excluded.

Topic	v0.2.x 🔴	v0.3 🔴	v0.4 bets	Later 🟡	Speculative 🟢
Rules (refinement)	3 (F22 follow-up, F-weasel-words-severity-tiering, F-redundant-intensifier-bullet-fix)	2 (F-readability-formulas-extra, F22)	—	F-low-diversity-stoplist, F-missing-connectors, F-excessive-nominalization-suffix-refine	F-sentence-diversity-density, F-comma-density-relative
New rules	—	5 (F46, F49, F51, F53, F57)	F-rhythm-forward-reference-heavy–F-syntax-address-inconsistency, F-lexicon-vocabulary-rarity	F58, F-rhythm-pronoun-density, F-rhythm-topic-shift-cluster, F-lexicon-falc-idiom	F-paragraph-landmark-density, F-lede-buried
Architecture / scoring	4 (F-project-scoring-rollup, F-experimental-rule-status, F143, F-roadmap-slug-ids)	1 (F-tight-list-paragraphs)	F-section-scoring, F-reading-time-score	F-adversarial-review, F-per-family-subscores, F-section-scoring, F-score-letter-grade, F-score-traffic-light	F-reading-time-score
Docs site (bilingual / content / theming / reading)	2 (F-rule-mention-linking, F-docs-responsive)	—	—	F-docs-final-polish, F43, F-rule-mention-coverage-test, F73, F-lucid-stance-unify, F-summary-per-locale/F-multi-book-mdbook	—
Docs.rs polish	—	4 (F-docsrs-metadata–F-public-api-audit)	—	—	—
Example-text fixtures	1 (F-example-fixtures-part2 part 2)	—	F-rule-fixture-coverage-map, F-reference-auto-discovery	F-text-source-adapters, F-text-before-after-refine, F-texts-yaml-url-maintenance	F-reference-auto-discovery
Performance / hygiene	—	—	—	F-config-whitelist-normalize	—
Adoption channels	2 (F-vale-style-pack, F-repo-discoverability-polish)	2 (F-github-action, F-npm-wrapper)	—	F-falc-readiness-guide, F-mdbook-lint-coexistence, F-pre-commit-hook-listing, F-homebrew-tap	F-wasm-playground (WASM playground)
Suppression / config	1 (F-severity-floor-flag)	—	—	F-suppression-reason-field, F-suppression-disable-file, F-config-whitelist-normalize	—
Formats	—	—	F-asciidoc-support–F-pandoc-companion (single pick)	F-asciidoc-support–F-pandoc-companion	—
Ecosystem interop	—	F-interop-suppression	—	F-interop-suppression	—
Plugins / NLP / LLM	—	F-nlp-plugin (Should)	F-llm-plugin, F-nlp-plugin	F-nlp-plugin	F-llm-plugin
Developer experience	—	F-fix-mode (narrow `--fix`)	LSP server	F-diff-mode (`--compare`), F-explain-fancy-rendering	F-score-evolution-dashboard
Reporting / DX	1 (F-report-quick-wins)	—	—	—	—
Research track	—	—	F-rule-discovery-corpus, F-external-feedback-top3 (user feedback)	—	F-paragraph-landmark-density, F-lede-buried, F-rule-discovery-corpus

Cadence and gating

v0.2.x is a rolling patch cycle, not a single release target. Each Must or Should ships as soon as it’s green on just check + CI; any 🔴-tagged row is eligible to ride the next patch cut.
v0.3 opens only when the v0.2.x Must queue is empty and at least one breaking change is justified. Until then, breaking changes are held — non-breaking items that would otherwise fit 0.3 (e.g. F-score-letter-grade letter grade) can slide into 0.2.x if they mature first.
v0.4 items do not progress by tenure. Each carries an unlock signal: a concrete event that promotes it from horizon to scheduled. See “v0.4 — horizon (bets, not commitments)” below.

v0.4 — horizon (bets, not commitments)

Routed 2026-04-24 in .personal/brainstorm/20260424-next-cycles.md. Each bet lists the signal that unlocks it, so horizon items don’t drift into Must by tenure alone. No commitments; this is “what could be true in ~6 months if 0.2 and 0.3 land cleanly”.

Bet	Unlock signal
F-llm-plugin — `lucid-lint-llm` plugin	≥ 2 concrete LLM-as-Judge rules designed on paper; deterministic-core base stable enough that non-determinism is a clear opt-in
F-asciidoc-support / F-html-support / F-docx-support / F-pandoc-companion — alternative formats (AsciiDoc / HTML / .docx / pandoc bridge)	External user requests; pick the single format with most pull and ship it alone, not the set
F-rule-fixture-coverage-map + F-reference-auto-discovery — fixture coverage maps + auto-discovery	Referential has stabilised (F-example-fixtures-part2 part 2 done) and rule set stops churning
F-lexicon-vocabulary-rarity — vocabulary-rarity	Lexique.org + COCA frequency lexicons built and licence-cleared
F-rhythm-forward-reference-heavy – F-syntax-address-inconsistency — remaining condition-tag rules	F46 / F49 / F51 / F53 / F57 validated in the wild at 0.3
F-section-scoring — section-level scoring	Document + project level proven; users ask “which H2 is the problem?”
F-reading-time-score — reading-time unit	Validated heuristic exists; companion metrics (comfort, fatigue, understandability) defined
F-score-evolution-dashboard — score-evolution dashboard	CI users explicitly ask for trend view (not delta — delta is F-diff-mode / `--compare`)
F-interop-suppression — interop suppression (if not shipped in 0.3)	A second rule joins `deeply-nested-lists` as a markdownlint overlap
F-rule-discovery-corpus — rule-discovery corpus mining	Student / intern resource available; separate research track
LSP server	Editor demand visible (Cursor / VSCode issues); would change the deployment story
F-lede-buried / F-paragraph-landmark-density — research-track rules	Only if someone codes them for fun
F-external-feedback-top3 — top 3 items from first-10-external-users feedback (TBD)	0.2.0 ships and ≥ 10 non-maintainer users exist — placeholder reserved so the horizon isn’t 100 % maintainer bets (renumbered from F98 post-collision with stream-2 cargo-mutants)
F-figurative-language — metaphor / analogy / comparison detection (NLP or LLM plugin). Cognitive-load grounded: figurative language costs extra inference for tired readers, aphasia, L2 readers, and is a known axis for ASD (currently out of v0.2/v0.3 scope). Belongs in `lucid-lint-nlp` (F-nlp-plugin) or `lucid-lint-llm` (F-llm-plugin) — non-deterministic, so plugin-only per prime directive #4. Bilingual-viable concern: idiomatic FR vs EN metaphors don’t map; FR + EN paths need separate corpora at proposal time.	Either NLP or LLM plugin scaffolding lands AND a dogfood / external case surfaces a missed metaphor that confused a reader

Deliberately off the 0.4 list:

F-score-letter-grade / F-score-traffic-light letter grade + traffic light — routed to 0.3 Should; if they slip they go to 0.3.x, not 0.4.
Full F29 numeric codes — parked until a rename actually happens.
F-sentence-diversity-density, F-comma-density-relative speculative rule refinements — stay speculative until a concrete dogfood case surfaces.
F-per-family-subscores per-family sub-scores — category sub-scores (F14) already ship; unclear what “family” adds beyond that.

v0.3+ — Advanced plugins

LLM-enhanced detection

ID	Item	Priority	Origin
F-llm-plugin	`lucid-lint-llm` plugin (LLM-as-Judge rules)	🟢 Speculative	Research on existing tools

The plugin would add rules like unclear-antecedent-semantic that use an LLM to detect semantic ambiguities the pattern-based heuristics miss.

The plugin is disabled by default because of its non-determinism, API cost, and latency, all of which are incompatible with pre-commit hooks.

Advanced NLP

ID	Item	Priority	Origin
F-nlp-plugin	`lucid-lint-nlp` plugin specification and scaffolding (Python subprocess or WASM-based). Replaces heuristic rules with POS- / dependency-tree- / anaphora-backed precise versions. Ship only when the first plugin rule is concretely scheduled — scaffolding-without-consumer is the red flag from AGENTS.md directive #1 (2026-04-24 brainstorm-next-cycles).	🟡 Later	Rule-system-growth brainstorm (2026-04-20)

Candidate rules for the plugin:

POS-based syntax.passive-voice detection (replaces v0.1 heuristic)
Full anaphora resolution for syntax.unclear-antecedent
Dependency-tree-based structure.deep-subordination
Semantic similarity between adjacent sentences (discourse cohesion signal inspired by Coh-Metrix)

New rules (v0.3 candidates)

Deferred from v0.2 because they require corpus work, lexicon builds, or depend on earlier features (F9, F14). Naming uses the provisional category.rule-name prefix pending F29.

ID	Rule	Category	Tags	Grounding	Depends on
F46	`lexicon.homophone-density`	Lexicon	`dyslexia`	BDA (dyslexia)	FR corpus tuning; ships as `info`. Slip-flag (2026-04-24): if FR corpus tuning exceeds ~2 days, slides to 0.3.x. Ships as `Experimental` in v0.2.x via F-experimental-rule-status; flips to `Stable` at v0.3 cut.
F49	`structure.italic-span-long`	Structure	`dyslexia`	BDA	Cohort lead (2026-05-02) — first rule on the F-experimental-rule-status substrate, depends on F143 (inline AST layer). Ships as `Experimental` in v0.2.x; flips to `Stable` at v0.3 cut.
F51	`structure.number-run`	Structure	`dyscalculia`	plainlanguage.gov	Ships as `Experimental` in v0.2.x via F-experimental-rule-status; flips to `Stable` at v0.3 cut.
F53	`readability.large-number-unanchored`	Readability	`dyscalculia`, `general`	CDC CCI, plainlanguage.gov “Use Numbers Effectively”	Ships as `Experimental` in v0.2.x via F-experimental-rule-status; flips to `Stable` at v0.3 cut. Scoped 2026-05-04 (`.personal/feature-torture/reports/F53.md`): MVP fires when a numeric token (≥ 4 digits) or magnitude word (`million` / `milliard` / `billion` / `trillion`) appears with no same-sentence anchor — unit, percentage, ratio (`X out of Y` / `X sur Y`), or curated comparator phrase (≤ 30 entries per language under `src/language/{en,fr}/`). Excludes year-shaped numerals (1700–2100), ordinals, and figure / page / section refs. Profile thresholds: `dev-doc = 100 000`, `public = 10 000`, `falc = 1 000`. Severity at flip TBD (`Warning` cohort default vs `Suggestion`); pre-flip dogfood pass on `examples/public/` gates the call. Boundary with F51 `structure.number-run`: F51 fires on numeric clusters; F53 fires on isolated unanchored numerals.
F57	`syntax.parenthetical-depth`	Syntax	`adhd`, `general`	plainlanguage.gov, Hemingway	Ships as `Experimental` in v0.2.x via F-experimental-rule-status; flips to `Stable` at v0.3 cut. Scoped 2026-05-04 (`.personal/feature-torture/reports/F57.md`): MVP fires when a sentence’s maximum balanced-bracket nesting depth across `()` and `[]` reaches the profile threshold. Profile thresholds: `dev-doc = 4`, `public = 3`, `falc = 2`. Em-dash pairs and comma-flanked appositives deferred (filed as `F-syntax-appositive-depth` if dogfood demands). Code spans / blocks already excluded by parser helper; unbalanced brackets fail open (mirrors `parenthesised_list_comma_count` in `src/rules/enumeration.rs`). Severity at flip TBD (`Warning` cohort default vs `Suggestion`); pre-flip dogfood pass on `examples/public/` gates the threshold tuning. Boundary with F22 `structure.excessive-commas`: F22 already discounts flat `(A, B, C)` enumerations at depth 1 via `parenthesised_list_comma_count`; F57 fires only at depth ≥ 2, so the rules are mechanically orthogonal.
F58	`syntax.front-loaded-subject-delay`	Syntax	`adhd`, `general`	plainlanguage.gov	FR corpus validation (dislocation FP risk)
F-rhythm-pronoun-density	`rhythm.pronoun-density`	Rhythm	`aphasia`, `general`	FALC	—
F-rhythm-topic-shift-cluster	`rhythm.topic-shift-cluster`	Rhythm	`adhd`, `general`	Hemingway	May merge into F-missing-connectors after corpus review
F-lexicon-falc-idiom	`lexicon.falc-idiom`	Lexicon	`aphasia`, `non-native`	IFLA, FALC	Curated bilingual idiom lexicon
F-lexicon-vocabulary-rarity	`lexicon.vocabulary-rarity`	Lexicon	`non-native`, `general`	—	Frequency lexicon per language (Lexique.org for FR, COCA / Google-Books for EN). Tiered weights: `common` / `context-dependent` / `expert`. LLM-built fallback only.
F-rhythm-forward-reference-heavy	`rhythm.forward-reference-heavy`	Rhythm	`adhd`, `general`	Working-memory load	—
F-lexicon-acronym-distance	`lexicon.acronym-distance-from-definition`	Lexicon	`adhd`, `non-native`	Memory decay	F9 (definition-aware abbreviation)
F-syntax-complex-tense	`syntax.complex-tense`	Syntax	`non-native`, `aphasia`	FALC tense restrictions	FR morphology primary; EN lighter
F-syntax-impersonal-voice-heavy	`syntax.impersonal-voice-heavy`	Syntax	`aphasia`	FALC direct-address rule	—
F-syntax-address-inconsistency	`syntax.address-inconsistency`	Syntax	`non-native`, `general`	Register consistency	FR primary (tu / vous); EN weaker (you / one)
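The F57 row above describes the `syntax.parenthetical-depth` MVP in words; a minimal sketch of that check could look like the following. The function names (`max_bracket_depth`, `threshold_for`, `exceeds_depth`) are hypothetical and not taken from the codebase. The sketch assumes the sentence has already had code spans stripped, and it treats mismatched pairs such as `(]` as unbalanced, which is an assumption about what "fail open" covers.

```rust
/// Maximum balanced-bracket nesting depth across `()` and `[]` in one sentence.
/// Returns `None` when brackets are unbalanced or mismatched, so the caller
/// can fail open and emit no diagnostic.
fn max_bracket_depth(sentence: &str) -> Option<usize> {
    let mut stack: Vec<char> = Vec::new();
    let mut max_depth: usize = 0;
    for ch in sentence.chars() {
        match ch {
            '(' | '[' => {
                stack.push(ch);
                max_depth = max_depth.max(stack.len());
            }
            ')' | ']' => {
                let expected = if ch == ')' { '(' } else { '[' };
                // A closer with no opener, or the wrong opener, is unbalanced.
                if stack.pop() != Some(expected) {
                    return None;
                }
            }
            _ => {}
        }
    }
    if stack.is_empty() {
        Some(max_depth)
    } else {
        None
    }
}

/// Profile thresholds as quoted in the F57 row; unknown profiles yield `None`.
fn threshold_for(profile: &str) -> Option<usize> {
    match profile {
        "dev-doc" => Some(4),
        "public" => Some(3),
        "falc" => Some(2),
        _ => None,
    }
}

/// True when the nesting depth reaches the profile threshold.
/// Unbalanced input or an unknown profile never fires (fail open).
fn exceeds_depth(sentence: &str, profile: &str) -> bool {
    match (max_bracket_depth(sentence), threshold_for(profile)) {
        (Some(depth), Some(limit)) => depth >= limit,
        _ => false,
    }
}
```

With these thresholds a flat `(A, B, C)` enumeration sits at depth 1 and never fires, which matches the stated boundary with F22.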

Developer experience (v0.3)

ID	Item	Priority	Origin
F-diff-mode	Differential diagnostics — `--compare=<ref>` CLI mode. Runs against two revisions of the same text(s) and reports score-delta + diagnostic-delta. Pitch: CI/PR comment framing (“this PR adds 2 warnings, removes 5, net −3”), inverting alarm fatigue the way coverage tools do. CLI + JSON + SARIF-run-comparison. No dashboard (that is F-score-evolution-dashboard).	🟡 Later	Rule-system-growth brainstorm (2026-04-20). Depends on F14 stabilising.
F-explain-fancy-rendering	Fancy terminal rendering for `lucid-lint explain` — pipe the bundled markdown through `termimad` (or a custom `pulldown-cmark` + `owo-colors` walker) so headings, tables, code fences, bullets, and inline `code` render with proper typography instead of raw markdown. Ship a toned `Skin` that matches the existing warning-yellow / info-cyan palette rather than termimad’s magenta defaults — the brand direction is calm, typographic, not “rich CLI”. Defer past v0.2 so the `check` output polish (F?) lands first.	🟡 Later	TTY-output critique (2026-04-22)
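The diagnostic-delta half of F-diff-mode can be sketched as a multiset difference between two runs. This is only an illustration under stated assumptions: diagnostics are reduced to their rule IDs (line numbers shift between revisions, so they are ignored here), and `Delta`, `count_by_rule`, and `diagnostic_delta` are hypothetical names, not the planned API.

```rust
use std::collections::BTreeMap;

/// Diagnostics gained and lost between a base and a head revision.
#[derive(Debug, PartialEq)]
struct Delta {
    added: usize,
    removed: usize,
}

impl Delta {
    fn net(&self) -> i64 {
        self.added as i64 - self.removed as i64
    }

    /// One-line framing suitable for a CI or PR comment.
    fn summary(&self) -> String {
        format!(
            "adds {} warnings, removes {}, net {:+}",
            self.added,
            self.removed,
            self.net()
        )
    }
}

/// Count how many diagnostics each rule emitted.
fn count_by_rule<'a>(rule_ids: &[&'a str]) -> BTreeMap<&'a str, usize> {
    let mut counts = BTreeMap::new();
    for id in rule_ids {
        *counts.entry(*id).or_insert(0) += 1;
    }
    counts
}

/// Per-rule multiset difference: surplus in `head` is "added",
/// surplus in `base` is "removed".
fn diagnostic_delta(base: &[&str], head: &[&str]) -> Delta {
    let before = count_by_rule(base);
    let after = count_by_rule(head);
    let mut delta = Delta { added: 0, removed: 0 };
    for (rule, &n_after) in &after {
        let n_before = before.get(*rule).copied().unwrap_or(0);
        delta.added += n_after.saturating_sub(n_before);
    }
    for (rule, &n_before) in &before {
        let n_after = after.get(*rule).copied().unwrap_or(0);
        delta.removed += n_before.saturating_sub(n_after);
    }
    delta
}
```

Counting per rule rather than per span keeps the delta stable when an edit merely moves text around; a real implementation would likely need a finer key to notice a warning that was fixed in one place and introduced in another.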

Ecosystem interop

Motivation: lucid-lint and Markdown-syntax linters (markdownlint, Vale, proselint, textlint) can flag the same line from different angles. Cognitive-load rules that happen to share a substrate with a structural check should stay shipped in core — users without markdownlint, users who disabled the matching markdownlint rule, and users feeding non-Markdown input (plain text, .docx via F-docx-support, HTML via F-html-support) all rely on lucid-lint for that coverage. The pain point is editor LSP sessions where two servers report the same span with different severities and different wording, not CLI pipelines where tools run sequentially.

Scope audit at 2026-04-20: after the structure.heading-jump reframing (cognitive “comprehension cliff” at skip ≥ 2 levels, distinct from MD001’s strict +1 rule), structure.deeply-nested-lists is the only lucid-lint rule that remains functionally equivalent to a markdownlint rule (MD007). The mechanism below is designed to scale — Vale, proselint, textlint overlaps are likely as the rule set grows — rather than to solve a single-rule problem.

ID	Item	Priority	Origin
F77	✅ Shipped in v0.2 — `main.rs` now auto-discovers `lucid-lint.toml` walking up from the CWD (stopping at the nearest `.git` boundary) and applies `[default].profile`, `[default].conditions`, `[scoring]` via `ScoringFileConfig::into_scoring_config`, and `[rules.readability-score].formula`. New `--config <path>` flag overrides discovery. Precedence: built-in profile defaults → TOML → CLI flags. Per-rule TOML overrides beyond `readability.score` extend rule-by-rule as each `Config` gains `Deserialize`. See `docs/src/guide/configuration.md`.	—	F11 follow-up (2026-04-21)
F-interop-suppression	Interop suppression mechanism. Rules declare overlapping external linter rules in their metadata (e.g. `Rule::external_overlaps() -> &[(Linter, &'static str)]`, enum `Linter::Markdownlint \| Vale \| Proselint \| Textlint`). Users opt in via `[interop] suppress_when = ["markdownlint"]` in `lucid-lint.toml` (CLI equivalent: `--interop-suppress=markdownlint`); opt-out is default, so coverage never silently drops. When active, affected rules are skipped at emission time with an info-level trace in `--verbose`. Ships CLI + LSP (the LSP path is the real motivator: two servers squiggling the same span with different severities and wording erodes trust in both). Only `structure.deeply-nested-lists` qualifies at time of writing (MD007); framework is designed to scale to future overlaps. Non-goal: detecting whether the external linter is actually installed or configured — the config field is the signal.	🟡 Later	Markdownlint-overlap scan (2026-04-20)
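A rough sketch of the F-interop-suppression mechanism described above, assuming the metadata hook becomes a trait method that takes `&self` (the row writes it without a receiver) and that `suppress_when` has already been parsed into `Linter` values. `from_config_name`, `is_suppressed`, and `emitted_rule_ids` are hypothetical helpers introduced for illustration only.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Linter {
    Markdownlint,
    Vale,
    Proselint,
    Textlint,
}

impl Linter {
    /// Parse a lowercase name as it might appear in `suppress_when` (assumed spelling).
    fn from_config_name(name: &str) -> Option<Linter> {
        match name {
            "markdownlint" => Some(Linter::Markdownlint),
            "vale" => Some(Linter::Vale),
            "proselint" => Some(Linter::Proselint),
            "textlint" => Some(Linter::Textlint),
            _ => None,
        }
    }
}

trait Rule {
    fn id(&self) -> &'static str;

    /// External linter rules that cover the same ground; empty by default.
    fn external_overlaps(&self) -> &'static [(Linter, &'static str)] {
        &[]
    }
}

struct DeeplyNestedLists;
impl Rule for DeeplyNestedLists {
    fn id(&self) -> &'static str {
        "structure.deeply-nested-lists"
    }
    fn external_overlaps(&self) -> &'static [(Linter, &'static str)] {
        &[(Linter::Markdownlint, "MD007")]
    }
}

struct SentenceTooLong;
impl Rule for SentenceTooLong {
    fn id(&self) -> &'static str {
        "structure.sentence-too-long"
    }
}

/// A rule is skipped only when the user opted in for one of its overlapping linters.
fn is_suppressed(rule: &dyn Rule, suppress_when: &[Linter]) -> bool {
    rule.external_overlaps()
        .iter()
        .any(|(linter, _)| suppress_when.contains(linter))
}

/// IDs of the rules that would still emit diagnostics.
fn emitted_rule_ids(rules: &[&dyn Rule], suppress_when: &[Linter]) -> Vec<&'static str> {
    rules
        .iter()
        .filter(|rule| !is_suppressed(**rule, suppress_when))
        .map(|rule| rule.id())
        .collect()
}
```

An empty `suppress_when` list leaves every rule active, which mirrors the "coverage never silently drops" default.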

Adoption channels

Filed 2026-04-25 from the adoption-channels brainstorm (.personal/brainstorm/20260425-adoption-channels.md). This section tracks distribution and integration channels — work that lives in this repo (release artifacts, plugins, docs pages, IDE / CI integrations).

Pure promotion / outreach plays (DINUM submission, awesome-list PRs, audit-and-PR on famous OSS docs, W3C COGA submission, conference talks, social-media cadence, Hacker News, etc.) moved to .personal/promotion-channels.md on 2026-05-01. The freed F-IDs (F111, F112, F113, F117, F118, F119) are considered lost — not reused. F110 (Vale style pack — code) was renumbered to F137 (since renamed to F-vale-style-pack under the slug convention) to free F110 for the encoding-hygiene canonical entry already shipped in v0.2.3.

The regulatory tailwind (EAA enforceable since 2025-06-28; RGAA 5 ships end-2026 with DGCCRF / Arcom sanctions up to 50k€ + renewable) shapes the must-list — F-vale-style-pack (Vale pack) leans directly on it. Bilingual EN/FR is the differentiator that makes the FR-government channel viable.

ID	Item	Priority	Origin
F-vale-style-pack	Vale style pack — subset of rules → `vale-cli/packages` topic. Map only the rules that fit Vale’s `existence` / `substitution` / `occurrence` checks (target list: `lexicon.weasel-words`, `lexicon.redundant-intensifier`, `lexicon.jargon-undefined`, `lexicon.unexplained-abbreviation`, `lexicon.all-caps-shouting` — plus a couple thresholded `structure` rules if Vale’s `conditional` extends cleanly). The cognitive-load core (sentence-too-long thresholds, `structure.deep-subordination`, scoring engine, FALC profile) stays standalone-only. Pack is generated from the rule registry (~50 lines of Rust emitting Vale YAML) — zero hand-maintenance, regenerated per release. Each rule’s Vale `link:` field points to `docs/src/rules/<id>.md` so curiosity about gaps surfaces the standalone tool. Pack README opens with: “This is a subset of `lucid-lint` for Vale users. For sentence-shape, paragraph rhythm, scoring and the FALC profile, use `lucid-lint` standalone — see `[link]`.” The Vale pack is intentionally a “trailer.” Risks (discovery dilution, identity blur, maintenance drag) all fall on the README + per-rule link surfaces; not cannibalisation — Vale users are a new audience, not poached existing users.	🔴 Next	Adoption-channels brainstorm 2026-04-25
F-github-action	GitHub Action in Marketplace (promoted to 🔴 Next, targeted at v0.3 from 2026-04-27 Block E recon — early-adoption feedback channel). Verified peer shape: both `astral-sh/ruff-action` and `biomejs/setup-biome` are thin composite actions (yaml-only) that download the prebuilt binary from upstream Releases, add it to `PATH`, optionally run it. Composite > Docker container for sub-second cold start; pure JS action avoided (no Node runtime needed). Proposed contract: `uses: lucid-lint/lucid-lint-action@v1` with `with:` inputs `version` (default `latest`), `paths`, `profile` (`falc` / `dev-doc` / `public`), `format` (`tty` / `json` / `sarif`), `min-score`. v0.3 first cut emits `::warning file=…,line=…::` workflow commands for inline PR annotations; v0.4 swaps to SARIF upload via `github/codeql-action/upload-sarif` once the SARIF output stabilises, feeding GitHub Code Scanning natively. Risk: a composite action coupled to `cargo-dist` release-tarball naming — any rename breaks consumers, so pin the manifest contract. Internal scaffold landed 2026-04-28 — `action.yml` at the repo root implements the locked input contract (`version`, `paths`, `profile`, `format`, `min-score`, plus `working-directory` and passthrough `args`); a smoke workflow (`.github/workflows/action-smoke.yml`) exercises it on Linux / macOS / Windows runners against this repo’s own `docs/src/`. Not yet published, not yet `v1`-tagged, not yet listed in the Marketplace. Bake-in plan: dogfood the contract internally for 2–3 weeks, revise inputs that don’t survive contact with reality, then split out to a dedicated `bastien-gallay/lucid-lint-action` repo (the canonical ruff / biome pattern) and tag `v1` alongside the v0.3 release. v0.3 first cut still emits `::warning::`; SARIF upload deferred to v0.4 behind F32.	🔴 Next	Adoption-channels brainstorm 2026-04-25 + Block E recon 2026-04-27 + scaffold 2026-04-28
F-falc-readiness-guide	FALC-readiness guide page — new docs page `docs/src/guide/falc-readiness.md` (FR mirror at `docs/src/fr/guide/falc-readiness.md`) explaining how `lucid-lint --profile=falc` maps to the Inclusion Europe European Easy-to-Read standards. Cite the European Easy-to-Read logo program (logo use is free if conditions met: document follows the standards + at least one person with intellectual disability validated readability). Do not claim certification — claim readiness. The guide drives qualified traffic from disability-federation networks (UNAPEI, Inclusion Europe, etc.).	🟡 Later	Adoption-channels brainstorm 2026-04-25
F-mdbook-lint-coexistence	mdbook-lint coexistence guide. Short page in our docs (and a one-liner cross-PR to mdbook-lint’s README) explaining “use both”: mdbook-lint = markdown structure, `lucid-lint` = prose / cognitive load. Different niches, complementary. Free, opportunistic.	🟢 Could	Adoption-channels brainstorm 2026-04-25
F-pre-commit-hook-listing	Pre-commit hook listing in `pre-commit/pre-commit` registry. Fires once `--check` mode is stable across our CLI surface (currently most surfaces use `--format=json` and exit codes; hook-friendly summary + fast-fail mode is the prerequisite).	🟢 Could	Adoption-channels brainstorm 2026-04-25
F-wasm-playground	WASM playground for in-browser linting. Peer pattern (ruff `play.ruff.rs`, biome `biomejs.dev/playground`): single-page React/Preact + Vite app driving a Monaco editor, with a dedicated `_wasm` Rust crate built via `wasm-pack` (ruff publishes `ruff_wasm`; biome publishes `@biomejs/wasm-web` from `biome_wasm`). Source layout: a `playground/` workspace at repo root with `wasm/` and `web/` sub-trees. Hosting: Cloudflare Pages or GitHub Pages on a subdomain (e.g. `play.lucid-lint.dev`). Proposed shape for `lucid-lint`: `crates/lucid-lint-wasm` exposing `lint(text, lang, profile) -> Diagnostic[]` via `wasm-bindgen`; tiny Vite+Preact UI; estimated 300–600 kB gzipped given our deterministic core (no network, no LLM). Phase: v0.4+. The surface needs its own brainstorm before scoping (UX shape, share-link encoding, persistence, mobile experience, contribution channel) and is best framed as a traction / acquisition lever once v0.3 distribution is in place. Risks: (1) bundle-size cliff if `regex` + `unicode-segmentation` push past 1 MB; (2) ongoing maintenance of a JS surface that can drift from CLI behaviour.	🟢 Speculative	2026-04-27 Block E recon
F123	✅ Shipped 2026-04-28 — curl-pipe-sh + PowerShell one-liners are surfaced in `README.md` and `docs/src/guide/installation.md`. The cargo-dist installer flip itself was a no-op — `installers = ["shell", "powershell"]` has been in `Cargo.toml` `[workspace.metadata.dist]` since the initial scaffold (`d153ad8`), so v0.1.1 / v0.2.0 / v0.2.1 / v0.2.2 have all been attaching `lucid-lint-installer.sh` and `lucid-lint-installer.ps1` to their GitHub Releases. The 2026-04-27 Block E recon mis-filed F123 as a config flip; the 2026-04-28 reconnaissance confirmed the actual gap was discoverability. Documentation now covers both one-liners (Linux / macOS / WSL via `curl … \| sh`; Windows via PowerShell `irm \| iex`), the `--check` / audit-before-running pattern (download to a file, `less`/`notepad`, then execute), version pinning (`releases/download/v<version>/…` instead of `releases/latest/…`), and how each installer drops the binary on `$PATH`. The `cargo install` and source-build routes stay alongside as fallbacks. README’s stale “Once released to crates.io” lead-in was dropped. Vanity `sh.lucid-lint.dev` redirect remains a v0.5 concern.	—	2026-04-27 Block E recon
F-npm-wrapper	npm wrapper with platform `optionalDependencies` — promoted to 🔴 Next, targeted at v0.3 (early-adoption feedback channel for the JS-toolchain audience: Prettier / ESLint / Husky / package.json scripts users). Canonical pattern verified on the npm registry: biome (`@biomejs/biome` 2.4.13) and dprint (0.54.0) both publish a thin root package whose `optionalDependencies` lists one sub-package per target; npm resolves only the matching platform; root `bin` shim execs the binary; dprint additionally runs a `postinstall` `install.cjs` as fallback. Proposed shape: root `lucid-lint` (~10 kB) + five platform-specific `@lucid-lint/cli-{aarch64-apple-darwin, x86_64-apple-darwin, x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, x86_64-pc-windows-msvc}`. Version stays in lockstep with the Rust crate; release workflow gains an `npm publish --provenance` step using OIDC (biome already does this). Risks: (1) 5+ packages per release multiply publish-failure surface — release workflow needs all-or-nothing semantics; (2) npm registry outages would block JS users — document fallback to direct binary download (F123).	🔴 Next	2026-04-27 Block E recon
F-repo-discoverability-polish	Repo discoverability polish — README badges + GitHub social preview. Two adjacent first-impression surfaces, bundled because they share the audience (drive-by visitors on crates.io, link unfurls on Mastodon / LinkedIn / HN). Checklist: (1) README badge row at the top of `README.md`: `crates.io` version (`img.shields.io/crates/v/lucid-lint`), docs (mdBook URL), CI (`ci.yml` badge), license. Standard Rust crate signal — first thing crates.io users scan to gauge a project. (2) Social preview image uploaded under Settings → General → Social preview (1280×640, 40pt safe zone per GitHub template). Replaces the auto-generated avatar+name card that GitHub serves on Twitter / Mastodon / LinkedIn / Slack / HN unfurls. Two viable directions — pure brand card (wordmark + tagline), or terminal-screenshot card (real `lucid-lint` diagnostic on a real sentence, more “shows what it does”). Pick at design time; brand assets in `.impeccable.md`. Both items are non-breaking, no code touched — fits any v0.2.x patch slot or rides alongside an unrelated PR.	🔴 Next	Repo-config session 2026-05-03
F-homebrew-tap	Homebrew distribution (own tap → core). macOS-first audiences (writers, designers, docs teams) reach for `brew install` before `cargo`. Path is well-trodden: ship a tap on `<org>/homebrew-tap` immediately, graduate to `homebrew-core` once eligibility is met (current acceptable-formula policy needs a manual cross-check on `homebrew/brew docs/Acceptable-Formulae.md` — sandboxed during Block E recon; the old “75 stars” line was removed but maintainers still gate on adoption signal). Implementation: enable `cargo-dist`’s `homebrew` installer — it generates a Ruby formula referencing the same release tarballs we already build (`aarch64-apple-darwin`, `x86_64-apple-darwin`, plus Linux bottles) and opens a PR against our tap on each tag. Bottle building runs free on `macos-latest` runners. v0.4 launches the tap; `homebrew-core` submission deferred to v0.5+ behind real adoption signal. Risks: (1) tap fragmentation if we never graduate to core; (2) core review can take weeks.	🟡 Later	2026-04-27 Block E recon
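The F-github-action row plans a first cut that emits `::warning file=…,line=…::` workflow commands for inline PR annotations. A minimal sketch of formatting one such command follows; the helper names are hypothetical, and the escaping mirrors what GitHub's own toolkit applies to command data and property values. How lucid-lint's diagnostics map onto `file`, `line`, and the message is an assumption.

```rust
/// Escape the message part of a workflow command.
fn escape_data(s: &str) -> String {
    s.replace('%', "%25")
        .replace('\r', "%0D")
        .replace('\n', "%0A")
}

/// Escape a property value such as `file=…`: data escapes plus `:` and `,`,
/// which would otherwise be read as command delimiters.
fn escape_property(s: &str) -> String {
    escape_data(s).replace(':', "%3A").replace(',', "%2C")
}

/// Format one diagnostic as a `::warning` workflow command line.
fn warning_command(file: &str, line: usize, message: &str) -> String {
    format!(
        "::warning file={},line={}::{}",
        escape_property(file),
        line,
        escape_data(message)
    )
}
```

Printing the returned string to stdout inside a workflow step is what makes GitHub attach the annotation to the matching line of the diff.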

Infrastructure & Static Analysis

Hardened quality stack for the project’s internal hygiene, balancing CI speed with high-signal security and prose audits. Routed 2026-05-04.

ID	Item	Priority	Origin
F-python-hygiene	Python scripting hygiene — ruff + mypy. Enforce high quality on the `scripts/` layer (lang-sync, text-conversion). ruff for linting/formatting (replaces black/isort/flake8); mypy for type checking (leverages existing type hints). Pre-commit + CI integration.	🔴 Next	Static-analysis session 2026-05-04
F-prose-audits	Prose & Doc audits — typos + lychee. typos for fast source-code spell checking (aligned with `lucid-lint` mission); lychee for link integrity across `docs/` and `README.md`.	✅ Done	Static-analysis session 2026-05-04
F-cargo-deny	Supply-chain gate — cargo-deny (+ actionlint). Replace the `cargo-audit` step in `.github/workflows/ci.yml` with `cargo-deny` for unified CVE + license + crate-ban checks (e.g. ban `lazy_static` in favour of `OnceLock`). Pairs with actionlint in the same PR — both are CI-workflow edits, shared review surface. Adds `deny.toml` (advisories + MIT/Apache-2.0/BSD-3-Clause/Unicode-DFS-2016 license whitelist + bans). Split out of former F-security-hardening 2026-05-05 (see report).	🔴 Next	Static-analysis session 2026-05-04; split 2026-05-05
F-gitleaks-precommit	Secret-leak guard — gitleaks pre-commit hook. Add gitleaks to `.pre-commit-config.yaml` alongside ruff/mypy/typos (PR #58). Minimal `.gitleaks.toml`. Mirror as a non-blocking CI warning step for one cycle, then promote to required. Open question on first-run scope (history vs staged-only) and blocking-vs-warning posture for the first week. Split out of former F-security-hardening 2026-05-05.	🔴 Next	Static-analysis session 2026-05-04; split 2026-05-05
F-infra-audit	API-stability gate — cargo-semver-checks. Protect the `[lib]` API surface before major releases by failing CI on accidental breaking changes. Narrowed from the original actionlint + cargo-semver-checks pairing 2026-05-05 — actionlint moved into F-cargo-deny (shared `ci.yml` edit).	🔴 Next	Static-analysis session 2026-05-04; narrowed 2026-05-05
F-cargo-udeps	Unused dependency audit — cargo-udeps. Identify and prune unused crates to minimize binary size and compile times. Runs weekly or manually (requires nightly).	🟡 Later	Static-analysis session 2026-05-04

Research track

Bets that don’t commit to a ship date, tracked here so they aren’t forgotten.

ID	Item	Priority	Origin
F-paragraph-landmark-density	`structure.paragraph-landmark-density` — reprise-points for attention-fragile readers. Research needed to define “landmark” (bold / italic / headers / list-starts / code spans?).	🟢 Speculative	Rule-system-growth brainstorm (2026-04-20)
F-lede-buried	`structure.lede-buried` — journalistic inverted-pyramid check. Strong candidate for a future `lucid-lint-journalism` plugin rather than core.	🟢 Speculative	Rule-system-growth brainstorm (2026-04-20)
F-rule-discovery-corpus	Rule-discovery corpus project — mine writer-heavy git histories for patterns that authors repeatedly rewrite. Source of evidence-grounded rule proposals. Intern / student project scale.	🟢 Speculative	Rule-system-growth brainstorm (2026-04-20)

Additional research directions captured for posterity but not yet ID’d:

Reader-model scoring — tiny local model predicts processing time and accuracy per paragraph; output is a cognitive-load heatmap. Deterministic at inference, data-hungry at training.
TTS / screen-reader prosody rules — detect prosody breakdown (mid-sentence acronyms, awkward punctuation cadence). Needs a TTS corpus.
Cross-document terminology drift — same concept named three ways across a corpus (“user” / “customer” / “client”). Requires multi-file analysis infrastructure; performance implications.
Eye-tracking corpus collaboration — partnership with a reading lab to ground thresholds in behavioural data.
LSP server — live diagnostics in editors; same core, different frontend.
--fix / quickfix suggestions — safe rules only (e.g. structure.long-enumeration → concrete list skeleton). Controversial for prose; needs guardrails.
lucid-lint baseline — record per-project medians; rules flag regressions rather than absolutes (ESLint-style).
Profile composition (extends = "falc") — reduce duplication across projects.
Community rule-pack registry — cargo-style publication of domain packs (medical, legal, edu, journalism).
lucid-lint-style plugin — adverb overuse, show-don’t-tell, and other aesthetic rules excluded from core by design.
lucid-lint-a11y plugin — alternative home for a11y-markup-tagged rules if the tag proves insufficient to separate them from prose rules.

v0.2 / v0.2.x — Must-ship shipped, patch cycle in progress

Release cadence

The 2026-04-22 reprioritisation favoured a tight 0.2.0 cut over a fat one: anything non-blocking slides to 0.2.x patch releases, which exist precisely to absorb per-rule polish and per-surface slices. v0.2.0, v0.2.1, and v0.2.2 are shipped; v0.2.x remains open as a rolling patch cycle. 0.2.x routing was reviewed on 2026-04-24 in .personal/brainstorm/20260424-next-cycles.md (not tracked; .personal/ is gitignored).

v0.2.0 — Blocking items (all ✅ shipped 2026-04-22)

ID	Summary
F29-slim	Rule IDs moved to `category.rule-name` form (25 rules); `src/rules/<cat>/` subdirectories; `Category::for_rule` derives from prefix. Hard break — suppression directives, `[rules.<id>]` TOML keys, JSON/SARIF `ruleId` all use the new form.
F35a	`theme/index.hbs` forked from upstream mdBook; skip link + EN / FR switch server-rendered. WCAG 2.4.1 Bypass Blocks passes with JS disabled.
F35d	Accessibility statement page (`docs/src/accessibility.md` + FR counterpart).
F-fail-on-warning-bool	`--fail-on-warning` accepts optional boolean; hidden mirror `--no-fail-on-warning`. `--min-score` now testable in isolation on documents with warnings.
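The F29-slim row says `Category::for_rule` derives the category from the rule-ID prefix. A sketch of what that derivation might look like is below. The five variants are the category names that appear in the rule tables of this roadmap; the real enum, the return type, and the handling of malformed IDs are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    Lexicon,
    Readability,
    Rhythm,
    Structure,
    Syntax,
}

impl Category {
    /// Derive the category from the `category.rule-name` prefix.
    /// Returns `None` for IDs without a dot, with an empty rule name,
    /// or with an unknown prefix (assumed behaviour).
    fn for_rule(rule_id: &str) -> Option<Category> {
        let (prefix, name) = rule_id.split_once('.')?;
        if name.is_empty() {
            return None;
        }
        match prefix {
            "lexicon" => Some(Category::Lexicon),
            "readability" => Some(Category::Readability),
            "rhythm" => Some(Category::Rhythm),
            "structure" => Some(Category::Structure),
            "syntax" => Some(Category::Syntax),
            _ => None,
        }
    }
}
```

Deriving the category from the ID means a pre-F29 bare name such as `sentence-too-long` no longer resolves, which is consistent with the row calling the rename a hard break.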

v0.2.1 — ✅ Released 2026-04-23

Localhost 404.html rendering fix (F-example-fixtures-part2 part 1).
Per-rule TOML override for structure.excessive-commas (third rule wired after readability.score.formula and lexicon.unexplained-abbreviation.whitelist).
Scraped-prose fixtures pipeline (examples/texts.yaml + just texts).
TTY-capture GIFs via vhs tapes.
v0.1 / v0.2 staleness sweep.
Idea-highlight motif extended to the structure.sentence-too-long rule page.
First crates.io publish since v0.1.1: packaging switched from exclude to an explicit include list so docs/src/rules/*.md reach the tarball (needed by the include_str! calls in src/explain.rs).

v0.2.2 — ✅ Released 2026-04-23

F87 — FR syntax.nested-negation pair-based counting over ne / n' clitics and second-position particles (pas, rien, jamais, …).

v0.2.x — MoSCoW routing (patch cycle, post-release)

Routed 2026-04-24 from the active-work view. Each row here has a full entry under a topic section below; priority column reflects the routing decision.

Must — 🔴 Next

ID	Topic	Item
F25	Docs — bilingual	✅ Closed 2026-05-01 — per-rule 25/25 + guides 8/8 + architecture 2/2 + contributing
F-docs-responsive	Docs — reading prefs	Responsive / mobile adaptation
F35b	Docs — reading prefs	Drop `role="radiogroup"` on reading chips (P2 a11y)
F-example-fixtures-part2 part 2	Example-text fixtures	Redistributable replacements for load-bearing slots
F-vale-style-pack	Adoption channels	Vale style pack (subset of rules → `vale-cli/packages` topic)
F-experimental-rule-status	Architecture	Experimental rule status substrate — gates v0.3 cohort, opens dogfood window
F143	Architecture	Inline AST layer over pulldown-cmark — substrate for F49 (cohort lead)
F-weasel-words-severity-tiering	Rules refinement	Severity tiering for `lexicon.weasel-words` (quantifier `info`, hedge `warning`) — unblocks the audit-and-PR play
F-redundant-intensifier-bullet-fix	Rules refinement	Fix `lexicon.redundant-intensifier` parser miss in bullet / `strong` spans — unblocks the audit-and-PR play
F-severity-floor-flag	Suppression / config	`--severity-floor` CLI flag — unblocks the audit-and-PR play narrow-audit shape
F-report-quick-wins	Reporting / DX	TTY quick-wins block under the diagnostic list — acronym whitelist hint + single-rule hot-spot hint

Should — ships as the next patch absorbs it

ID	Topic	Item
F-project-scoring-rollup	Architecture	Project-level scoring roll-up (per-file + summary)
—	Suppression / config	Per-rule TOML plumbing, rule-by-rule as each `Config` gains `Deserialize`
F-suppression-reason-field	Suppression / config	`reason="..."` field on suppression directives
F-rule-mention-linking	Docs — content	Rule-mention linking audit + coverage test (F-rule-mention-coverage-test)
F-github-action	Adoption channels	GitHub Action published to Marketplace (depends on stable SARIF output)
F-falc-readiness-guide	Adoption channels	FALC-readiness guide page citing Inclusion Europe standards
F-roadmap-slug-ids	Architecture	ROADMAP feature IDs adopt `F-<kebab-slug>` for new entries; legacy F1–F146 stay numeric; slug-uniqueness CI test (offline-runnable)

Could — nice-to-have

F-excessive-nominalization-suffix-refine (nominalization suffix refine), F43 (RULES.md drift cleanup), F73 (font-leak CI gate), F-docs-final-polish (final polish pass), F-explain-fancy-rendering (fancy explain rendering), F-suppression-disable-file (disable-file), F-text-source-adapters / F-text-before-after-refine / F-texts-yaml-url-maintenance (fixture hygiene), F-mdbook-lint-coexistence (mdbook-lint coexistence guide), F-pre-commit-hook-listing (pre-commit hook listing once --check mode stabilises).

Won’t (pushed to 0.3)

F-score-letter-grade (letter grade), F-score-traffic-light (traffic light), F-per-family-subscores (per-family sub-scores), F-lucid-stance-unify (.lucid-stance unify), F-fix-mode (narrow --fix mode).

v0.3 and later (already scoped)

Detail under “New rules (v0.3 candidates)” and the “v0.4 — horizon” section below.

F22 v0.3 slice — 3–4-word Oxford items, non-Oxford / “plus”-closed lists, interleaved parentheticals (first slice shipped in 0.2.x).
F-readability-formulas-extra remainder — SMOG, Dale-Chall, Scolarius, --readability-verbose.
Five condition-tag rules — F46, F49, F51, F53, F57. F46 carries a slip-flag: if FR corpus tuning for homophone density exceeds ~2 days, it slides to 0.3.x.
Full F29 — demoted to 🟢 Speculative on 2026-04-24. F29-slim already fixed the category-drift problem by construction; numeric codes (STR-001) only earn their cost on a real rename, and there are zero scheduled renames. Revisit when one actually happens.

Architecture

ID	Item	Priority	Origin
F14	✅ Hybrid scoring model shipped in v0.2 (global score + per-category sub-scores + diagnostics). `X/max` arbitrary-max at both levels, 5 fixed categories (Structure · Rhythm · Lexicon · Syntax · Readability), composition = weighted sum × density-normalization × per-category cap, `weight` field added to `Diagnostic`, `--min-score=N` CLI flag. See `docs/src/guide/scoring.md`. Letter-grade / traffic-light / reading-time decorations deferred (F-score-letter-grade–F-reading-time-score).	—	Architecture decision discussion
F-project-scoring-rollup	🚧 Document-level scoring shipped in v0.2 (multi-path runs are aggregated as one document). Project-level roll-up (per-file breakdown + project summary) still open. Section-level deferred → F-section-scoring.	🔴 Next	Linked to F14
F-per-family-subscores	Per-family sub-scores	🟡 Later	Linked to F14
F32	✅ Shipped in v0.2 — `lucid-lint check --format=sarif` emits a SARIF v2.1.0 log compatible with GitHub Code Scanning. One rule descriptor per observed rule id (category, default severity, default weight, `helpUri` to the per-rule mdBook page); per-result properties carry weight + section. Workflow snippet in `docs/src/guide/ci-integration.md`.	—	v0.1 AGENTS.md audit
F37	✅ Rule-message clarity audit completed: all 17 rules reviewed against “what do I change?” bar. 15 rules already actionable; `structure.heading-jump` updated (first-heading-not-H1 and missing-H1 variants now include repair guidance). `readability.score` info variant left observational by design (fires only when `always_report` is set).	—	F14 `brainstorm/20260420-score-semantics.md`
F-section-scoring	Section-level granularity for scoring (deferred from F-project-scoring-rollup) — per-heading sub-scores once document + project are proven in the wild.	🟡 Later	F14 `brainstorm/20260420-score-semantics.md`
F-score-letter-grade	Letter-grade decoration (A–F) on the `X/max` score — promote when user feedback shows the numbers feel noisy or hard to compare across docs.	🟡 Later	F14 `brainstorm/20260420-score-semantics.md`
F-score-traffic-light	Traffic-light (🔴🟡🟢) + pass/fail margin in the TTY output — promote when CI users ask for a stronger glance signal than the number alone.	🟡 Later	F14 `brainstorm/20260420-score-semantics.md`
F-reading-time-score	Reading-time-seconds as an alternative score unit — ties score to concrete user outcome. Requires validated heuristic + companion metrics (comfort, fatigue, understandability) so the time unit doesn’t monopolize the read.	🟢 Speculative	F14 `brainstorm/20260420-score-semantics.md`
F71	✅ Shipped in v0.2 — `ConditionTag` enum (fixed 7-variant ontology: `a11y-markup`, `dyslexia`, `dyscalculia`, `aphasia`, `adhd`, `non-native`, `general`) plus `Rule::condition_tags()` trait method (default `&[General]`). All 17 v0.2 rules are `general`; future tagged rules (F48, F55, F56) opt in by overriding. See `docs/src/guide/conditions.md`.	—	Rule-system-growth brainstorm (2026-04-20)
F-tight-list-paragraphs	Markdown parser — emit paragraphs for tight list items (correctness fix). Discovered 2026-05-01 while verifying the F22 third-tranche dogfood metric: the same bullet content that triggers `excessive-commas` / `dense-punctuation-burst` / `readability.score` in a loose list (multiple items separated by blank lines) is silent in a tight list (single item, or items without separating blank lines). Root cause: `pulldown-cmark` only emits `Tag::Paragraph` events for items in loose lists; tight-list text events fire directly inside `Tag::Item`. The parser at `src/parser/markdown.rs` only buffers text inside heading or paragraph contexts, so tight-list content goes into the void and every paragraph-level rule (all 17 in v0.1) inherits the blindspot. Same pre-existing limitation flagged in F126 for `structure.line-length-wide` — F-tight-list-paragraphs resolves it once for every rule. Fix: synthesize a paragraph for each list-item span when no `Tag::Paragraph` event fires inside it. Expected dogfood impact: many CHANGELOG / release-note / README bullets become newly visible to rules — some genuine new diagnostics, some snapshot updates.	🔴 Next	F22 third-tranche verification (2026-05-01)
F72	✅ Shipped in v0.2 — `[default] conditions = [...]` config field and `--conditions` CLI flag (comma-separated). Filter semantics: rules tagged `general` always run; tagged-only rules run iff their tags intersect the active list. Profiles unchanged; FALC retains its regulatory meaning. See `docs/src/guide/conditions.md`.	—	Rule-system-growth brainstorm (2026-04-20)
F143	Inline AST layer over pulldown-cmark — substrate for inline-positional rules. Routed 2026-05-02 (`.personal/brainstorm/20260502-parser-substrate-choice.md`). The current Markdown parser at `src/parser/markdown.rs` flattens emphasis, strong, and link spans into the `Paragraph.text` string before rules see it — visible text preserved, structure lost. F49 (`structure.italic-span-long`, cohort lead) needs italic-span boundaries; future inline-positional rules (F-paragraph-landmark-density speculative, F-lexicon-acronym-distance conditional on F9) would hit the same wall. Decision: introduce a thin typed inline AST on top of pulldown-cmark, not swap the engine for comrak / markdown-rs. Reasons: pulldown stays the perf-leading parser; the AST is the domain model the rules walk (CUPID-aligned: composable, predictable, domain-based); the engine swap regresses bench by ≈ 2–3× and collides with the lightning-fast positioning pillar. Minimal viable substrate (YAGNI applied inside the layer): `enum Inline { Text(String), Emphasis(Vec<Inline>) }` plus a `Paragraph.inline: Vec<Inline>` field captured during the existing pulldown walk. Not modeled yet: `Strong`, `Link`, `Code`, footnotes, task-list markers, hard breaks inside emphasis. Each gets added when a second rule actually demands it; today only F49 does, and the steel-man check confirmed the cohort is non-uniform (F51 / F53 / F57 don’t need inline spans). Plain-text parser path: empty `inline` vec (no Markdown semantics). Estimated effort: half a day for the substrate, then F49 ships on top in a follow-up PR. Bench gate: PR-1 against the existing bench corpus; > 5 % regression triggers a profile pass before merge. Reversibility: the layer can be deleted and folded back into per-rule fields if a year passes with one consumer; the engine swap (comrak) was rejected partly because that reversal is much harder.	🔴 Next	2026-05-02 parser substrate brainstorm; F49 cohort lead unblocked
F-experimental-rule-status	Experimental rule status — registry substrate for the v0.3 cohort. Routed 2026-05-02 (`.personal/brainstorm/20260502-v03-breaking-change.md`). Soft-breaking changes (new default-active rules) are the SemVer-major signal for linters; lucid-lint has 5 such rules queued for v0.3 (F46 / F49 / F51 / F53 / F57). Rather than smear 5 score regressions across v0.2.x patches or hold all 5 until a single v0.3 cut, this entry adds a rule lifecycle status (`Stable` / `Experimental`) and ships the cohort in v0.2.x as `Experimental` (off by default). Users — including this repo’s own dogfood loop on adjacent projects — opt in via an `[experimental]` config section (`enabled = ["structure.italic-span-long", …]` or `enabled = "*"`) or the `--experimental <id>` CLI flag. v0.3’s breaking change is then a single-line change per rule (`Status::Experimental` → `Status::Stable`) plus a CHANGELOG cohort entry. Why this shape, not per-rule `default = false` knobs: `Status` is one concept that maps to a known industry pattern (clippy `nursery`, biome `nursery`, ESLint experimental rules, rust `#[unstable]`); per-rule booleans would add five toggles for the same concept and prefigure no lifecycle. Minimal viable substrate (resist gold-plating): `Status` enum on the `Rule` trait (default `Stable`); `default_rules()` filters `Experimental` unless config opts in; `[experimental]` TOML section parsing; `--experimental` CLI flag (multi-occur + `*`); experimental tagging visible in `--list-rules` output; one snapshot test for the experimental-off vs experimental-on diff. No rule-group / preset / category-toggle machinery yet — the biome-style `recommendedRules` preset is filed as a v0.4 question. Estimated effort: half a day for the substrate, then one line per rule once F49 / F51 / F53 / F57 ship on top of it. F46 keeps its original FR-corpus slip-flag (independent of the experimental status).	🔴 Next	2026-05-02 v0.3 breaking-change brainstorm; user-proposed dogfood window
F-repo-config-hardening	✅ Shipped 2026-05-03 — full pass closed. What landed today: tag ruleset on `v*` pattern (block deletion + force-push, Active; matches all 7 release tags v0.1.0 → v0.2.4). Pre-existing (verified via API audit before clicking): `.github/dependabot.yml` (cargo + github-actions, weekly, grouped); Actions pinned-SHA required (`sha_pinning_required: true`); secret scanning + push protection both enabled; private vulnerability reporting enabled; CodeQL configured via Advanced setup workflow (`.github/workflows/codeql.yml`, weekly Rust scan, last run 0 results / 29 rules); Scorecard workflow also running. Retro on the routed-vs-actual gap: the entry as routed listed 6 items as if all were unconfigured; the API audit revealed 5 of 6 already shipped via earlier hardening passes that didn’t surface to the ROADMAP. Net work today: 1 click (the tag ruleset) + 1 retro audit. Lesson for future “GH Settings hardening” entries — verify state via `gh api` before drafting the checklist; the repo’s actual posture had drifted past the assumed baseline. Heads-up resolved 2026-05-03: the legacy `main-protection` branch ruleset transiently showed `enforcement: disabled` during the hardening session; re-verified via `gh api repos/:owner/:repo/rulesets` later the same day — both `main-protection` (branch, 5 rules → `main`) and `v-tag-protection` (tag, 2 rules → 7 `v*` tags) report `enforcement: active`. Branch protection is enforced on `main` + tags `v*`. Branch-Protection scorecard ceiling (post-hoc note). OSSF Scorecard’s `Branch-Protection` check warns that `main` requires no approvers, no CODEOWNERS review, and no last-push approval. These three sub-controls assume ≥ 2 humans; a solo-maintainer repo cannot satisfy them without admin bypass (theatre) or auto-approve (circumvention of the control’s intent). Decision 2026-05-03: accept the 4/10 ceiling as structural while solo. Revisit on co-maintainer onboarding, or pair with F-adversarial-review to add bot-driven review without faking human gating. Original checklist (preserved for retro reference): (1) Tag ruleset on `v*`. (2) Dependabot version updates (cargo + github-actions). (3) Actions: Require pinned-SHA. (4) Secret scanning + push protection. (5) CodeQL default setup. (6) Private vulnerability reporting.	—	Repo-config session 2026-05-03
F-adversarial-review	Adversarial PR review — bot second pair of eyes. While the repo stays solo-maintainer, every PR is self-authored and self-reviewed; OSSF Scorecard `Branch-Protection` is capped at 4/10 (see F-repo-config-hardening). An adversarial bot reviewer on each PR adds real signal without faking human-review-as-gate. Two complementary tracks: (1) LLM review — Claude Code or Gemini Code Assist via GitHub integration, comment-only mode, included in existing subscriptions; catches semantic and context-aware issues. (2) Rule-engine review — Semgrep, custom CodeQL queries, `cargo-deny`, `cargo-audit`, danger.js, or similar, run as required PR checks; deterministic, complementary to LLM, catches structural issues (unsafe patterns, license drift, CHANGELOG gaps). Hard scope rule: review-only, no auto-approve — auto-approve crosses into Scorecard theatre and is explicitly out. Reassess if a co-maintainer joins or a `release-managers` group is added.	🟡 Later	Repo-config session 2026-05-03
F-roadmap-slug-ids	ROADMAP feature IDs adopt `F-<kebab-slug>` form for all new entries. Routed 2026-05-02 (`.personal/brainstorm/20260502-roadmap-id-attribution.md`). Numeric `F<n>` IDs collided when two branches independently picked the same free number; reservation-on-`main` was ruled out because new features are routinely discovered mid-implementation inside an existing feature branch. Decision: new ROADMAP entries use a slug-as-ID form (`F-inline-ast-substrate`); legacy F1–F146 stay numeric (no migration — Devil’s-Advocate verified no programmatic parser in `src/`, `tests/`, `scripts/`, `.github/`, `justfile` depends on the format; mixed taxonomy is cosmetic). Slugs are coined locally with no coordination. The cross-branch race that survives (two offline branches independently coining the same slug) is detected at PR time and resolved by a one-line slug rename in ROADMAP + CHANGELOG — no branch rename, no rebase, no commit-history rewrite, because the new convention drops the `F-` prefix from branch names and commit subjects. Minimal viable substrate: (a) `tests/roadmap_id_uniqueness.rs` parses `ROADMAP.md` + `CHANGELOG.md` and asserts every `F-<slug>` appears uniquely as a definition site, no slug shadows the legacy `F<number>` namespace, and every referenced `[F-foo](#f-foo)` resolves; runs offline via `cargo test`, re-runs in CI as a backstop. (b) Explicit `<a id="f-..."></a>` anchors on first definition (matches the existing convention for numeric IDs). (c) `F-` prefix becomes optional in branch names and commit subjects — branches use plain feature slugs (`feat/<slug>`), commits use scope syntax (`feat(parser): <subject>`). Surfaces touched: `tests/roadmap_id_uniqueness.rs` (new), `AGENTS.md` Conventions section, `CHANGELOG.md` `[Unreleased]`, this entry itself (first dogfood). Reversibility: if the mixed taxonomy ever bites (it shouldn’t — no parser depends on it), a one-shot rename script could fold slug entries into a numeric scheme at any future v0.x cut. Estimated effort: ~1 h total — uniqueness test (30 min), `AGENTS.md` update (15 min), this ROADMAP wiring (15 min).	🔴 Next	2026-05-02 ID-attribution brainstorm
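
Sketch for F143. The block below is an illustrative version of the inline AST described in the F143 row, assuming only the two variants named there. `visible_text` and `longest_emphasis_words` are hypothetical helpers, not functions from the codebase, and the word-count check only stands in for whatever F49 ends up measuring.

```rust
/// Minimal inline AST with the two variants named in the F143 entry.
#[derive(Debug, Clone, PartialEq)]
enum Inline {
    Text(String),
    Emphasis(Vec<Inline>),
}

/// Visible text of a node, with the emphasis structure flattened away.
/// (Hypothetical helper: mirrors what `Paragraph.text` keeps today.)
fn visible_text(node: &Inline) -> String {
    match node {
        Inline::Text(text) => text.clone(),
        Inline::Emphasis(children) => children.iter().map(visible_text).collect(),
    }
}

/// Word count of the longest top-level emphasis span (hypothetical helper).
/// Nested emphasis is counted as part of its outermost span.
fn longest_emphasis_words(nodes: &[Inline]) -> usize {
    nodes
        .iter()
        .map(|node| match node {
            Inline::Text(_) => 0,
            Inline::Emphasis(_) => visible_text(node).split_whitespace().count(),
        })
        .max()
        .unwrap_or(0)
}

fn main() {
    // Models the Markdown source: Read *this whole clause carefully* now.
    let paragraph = vec![
        Inline::Text("Read ".to_string()),
        Inline::Emphasis(vec![Inline::Text(
            "this whole clause carefully".to_string(),
        )]),
        Inline::Text(" now.".to_string()),
    ];

    let flat: String = paragraph.iter().map(visible_text).collect();
    println!("flattened text: {flat}");
    println!("longest emphasis span: {} words", longest_emphasis_words(&paragraph));
}
```

The flattened string is what rules see today. The `Vec<Inline>` keeps the span boundaries that a flattened string loses, which is the information an italic-span rule would need.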

Encoding / input handling

The linter is a UTF-8 → diagnostics function. Encoding conversion is the user’s responsibility, exactly once, before lint-time (iconv or “save as UTF-8”). Invalid UTF-8 fails at the read boundary (std::fs::read_to_string returns an io::Error). Other encodings (Windows-1252, Latin-1, Shift-JIS, …) are explicit non-goals: any in-process transcoder would violate the deterministic-core prime directive (charset detection is heuristic, “same input, same output” no longer holds). The entries below cover the valid-UTF-8 edge cases the test surface should pin.
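
A minimal sketch of the read boundary, assuming a thin wrapper around `std::fs::read_to_string`. `read_source` and the temp-file names are hypothetical and only serve the illustration. It shows a Latin-1 file being rejected before any rule runs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read a file as UTF-8 or fail; no transcoding, no charset detection.
/// (`read_source` is a hypothetical name, not a lucid-lint function.)
fn read_source(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir();

    // Valid UTF-8: "é" is encoded as the two bytes 0xC3 0xA9.
    let utf8 = dir.join("lucid_sketch_utf8.txt");
    fs::write(&utf8, "café\n")?;
    println!("utf-8 file: {:?}", read_source(&utf8));

    // Latin-1: "é" is the single byte 0xE9, which is not valid UTF-8 here.
    let latin1 = dir.join("lucid_sketch_latin1.txt");
    fs::write(&latin1, b"caf\xE9\n")?;
    match read_source(&latin1) {
        Ok(text) => println!("unexpectedly decoded: {text:?}"),
        Err(err) => println!("rejected at the read boundary: {:?}", err.kind()),
    }

    Ok(())
}
```

The second read fails with `io::ErrorKind::InvalidData`. Converting the file once with a tool such as iconv, before lint-time, is what makes the first path apply.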

ID	Item	Priority	Origin
F110	✅ Shipped 2026-04-28 — leading `\u{FEFF}` stripped once at the engine boundary (`Engine::lint_with_source`, via the `normalize_input` helper). Funnels every input path (string, stdin, file) through the same boundary so rules never see the BOM. Regression test in `src/engine.rs::tests::bom_prefix_does_not_shift_diagnostics` proves identical diagnostics + line/column locations with and without a leading BOM on a sentence-too-long fixture.	—	2026-04-25 encoding survey
F111	✅ Shipped 2026-04-28 — `unicode-normalization = "0.1"` added; `Engine::lint_with_source` NFC-normalizes input at the same boundary as F110, fast-pathing already-NFC text via `is_nfc_quick`. NFC `café` and NFD `cafe + U+0301` now hash identically in every HashMap-using rule. Regression test in `src/engine.rs::tests::nfd_input_yields_same_diagnostics_as_nfc` exercises a 4-sentence FR fixture and asserts diagnostic count + per-diagnostic rule id and line match across NFC and NFD inputs.	—	2026-04-25 encoding survey
F112	✅ Shipped 2026-04-28 — `src/engine.rs::tests::lone_cr_line_endings_are_normalized` pins parity between LF and lone-CR three-paragraph fixtures (word count + diagnostic count). `src/engine.rs::tests::zero_width_chars_inside_words_pin_behaviour` pins observed behaviour for U+200B / 200C / 200D inside words: the engine round-trips without panicking and produces a valid `Report`; exact word count is intentionally not asserted because `nfc()` does not strip them and tokenisation is owned by `unicode-segmentation`.	—	2026-04-25 encoding survey
F-mixed-script-fixtures	Mixed-script test fixtures. Pin behaviour on EN + CJK and LTR + RTL prose mixed within one paragraph. `unicode_words()` should handle the boundaries correctly (UAX-29), but no regression test exists. Filed as Speculative — no known bug, just a coverage gap. Open if a real-world bilingual corpus surfaces edge cases.	🟢 Speculative	2026-04-25 encoding survey
F126	✅ Shipped — Markdown parser maps `<br>` to `\n` in `paragraph.text`. Pulldown-cmark emits `<br>` as `Event::InlineHtml`, not `Event::HardBreak`, so the v0.2.x author-break-aware fix for `structure.line-length-wide` silently dropped `<br>` despite advertising it as a measured hard break. Helper `html_is_br_tag` recognises `<br>`, `<br/>`, `<br />` (any case, optional whitespace); HTML comments (suppression directives) flow through unchanged. Five new tests pin the contract: `br_tag_inside_paragraph_is_a_hard_break` and `html_comment_directives_do_not_inject_newlines` (parser); `markdown_br_tag_is_checked`, `list_item_text_is_out_of_scope`, `table_cell_text_is_out_of_scope` (rule). The two out-of-scope tests pin the parser-construction contract that list-item content and GFM table cells are not emitted as paragraphs today, so the rule is silent on over-length content inside them — a future parser change that starts emitting either as paragraphs would need to revisit this rule.	—	2026-04-30 audit follow-up to the `structure.line-length-wide` author-break-aware fix (`.personal/2026-04-30-today.md:125`)
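
Sketch for the `<br>` handling in F126. This is an independent reimplementation written for illustration, not the project's `html_is_br_tag`. It assumes the helper receives the raw inline-HTML string and only has to accept the three spellings listed in the row, in any case and with optional whitespace. `inline_html_to_text` is a hypothetical name.

```rust
/// True for `<br>`, `<br/>`, `<br />` in any case, with optional whitespace.
/// Tags with attributes (`<br class="x">`) are not matched by this sketch.
fn html_is_br_tag(html: &str) -> bool {
    let trimmed = html.trim();
    let Some(inner) = trimmed
        .strip_prefix('<')
        .and_then(|rest| rest.strip_suffix('>'))
    else {
        return false;
    };
    // Drop an optional self-closing slash, then whitespace around the name.
    let inner = inner.trim();
    let name = inner.strip_suffix('/').unwrap_or(inner).trim();
    name.eq_ignore_ascii_case("br")
}

/// Text contributed to the paragraph by an inline HTML fragment:
/// a newline for a break tag, nothing for any other fragment.
/// (Hypothetical helper for this sketch.)
fn inline_html_to_text(html: &str) -> Option<&'static str> {
    if html_is_br_tag(html) {
        Some("\n")
    } else {
        None
    }
}

fn main() {
    for html in ["<br>", "<BR/>", "<br />", "<!-- a comment -->", "<b>"] {
        println!(
            "{html:?} -> is_br={} text={:?}",
            html_is_br_tag(html),
            inline_html_to_text(html)
        );
    }
}
```

HTML comments fall through to `None`, so a comment never injects a newline into the paragraph text.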

ID	Item	Priority	Origin
F9	✅ Shipped in v0.2 — definition-aware `lexicon.unexplained-abbreviation` is now two-pass. A pre-scan collects acronyms defined anywhere in the document in either canonical form (`Expansion (ACRONYM)` or `ACRONYM (Expansion)`; expansion side ≥ 2 alphabetic words to reject `(TBD)`-shaped noise), and a single definition silences every occurrence of that token. Silencing precedence: defined-in-doc → user whitelist → baseline. See `docs/src/rules/unexplained-abbreviation.md`.	—	Rule 10 simplified in v0.1
F-readability-formulas-extra	🚧 Must-ship slice shipped in v0.2 — `readability.score` auto-selects the formula by detected language: Flesch-Kincaid for EN (kept), Kandel & Moles (1958) for FR. Kandel-Moles ease scores are converted to a grade-equivalent so per-profile `max_grade_level` stays comparable across languages. Unknown language → Flesch-Kincaid. See `docs/src/rules/readability-score.md`. Still open: Gunning Fog / SMOG / Dale-Chall (EN), Scolarius / Flesch-Kandel (FR), `--readability-verbose` multi-formula reports, per-file override (covered by F11).	🟡 Later	Rule 11 simplified in v0.1; scope expanded in rule-system-growth brainstorm (2026-04-20)
F11	✅ Shipped in v0.2 — `--readability-formula {auto,flesch-kincaid,kandel-moles}` CLI flag + `FormulaChoice` enum on `readability_score::Config` + `Engine::with_readability_formula(choice)`. `auto` (default) keeps F-readability-formulas-extra per-language selection; `flesch-kincaid` / `kandel-moles` pin a formula for cross-document comparison. TOML config wiring is tracked separately as F77.	🟡 Later	Rule 11
F-missing-connectors	`missing-connectors` rule (15b not shipped in v0.1)	🟡 Later	Rule 15 decomposition
F-low-diversity-stoplist	Custom stoplist parameter for `lexicon.low-lexical-diversity`	🟡 Later	Rule 5
F-sentence-diversity-density	Sentence-level low-lexical-diversity density	🟢 Speculative	Rule 5
F-comma-density-relative	Comma density metric (relative) for `structure.excessive-commas`	🟢 Speculative	Rule 3a
F22	🚧 First slice shipped in v0.2.x — `structure.excessive-commas` now discounts commas inside `(A, B, C, …)` parenthesised token lists (3+ short comma-separated segments inside balanced parens, language-agnostic). Sibling helper `parenthesised_list_comma_count` in `src/rules/enumeration.rs`. Dogfood drops from 25 → 15 hits (10 FPs killed, ~40% reduction). Deferred to v0.3: relaxing `MAX_SEGMENT_WORDS = 2` for 3–4-word Oxford items, non-Oxford / “plus”-closed lists, interleaved parentheticals inside Oxford runs. See research note in `.personal/research/F22.md`.	🔴 Next	v0.1 dogfood: 5 false-ish positives on technical docs
F23	✅ Shipped in v0.2 — false-positive cleanup complete for v0.2. Hits inside inline code spans, straight `"..."` quotes, paired curly `“...”` quotes, and directional `rather than` / `plutôt que` pairings are now skipped. Single quotes / apostrophes are deliberately not recognised (possessives, contractions, FR elisions). The “concrete noun” semantic check (`"many X"` where X is a concrete noun) stays unshipped — needs POS data and belongs in the `lucid-lint-nlp` plugin (F-nlp-plugin) rather than the deterministic core.	—	v0.1 dogfood: 11 false-ish positives on this repo’s own docs
F-excessive-nominalization-suffix-refine	Refine `lexicon.excessive-nominalization` suffix list (drop or gate `-al`; many adjectives — `crucial`, `horizontal`, `positional`, `attentional` — are flagged despite not being abstract nouns)	🟡 Later	v0.1 dogfood
F87	✅ Shipped in 0.2.x — FR `syntax.nested-negation` now uses pair-based counting over `ne` / `n'` clitics and the second-position particles `pas`, `rien`, `jamais`, `plus`, `personne`, `aucun`, `aucune`, `guère`, `nulle part`. Each clitic contributes one negation and consumes its nearest particle within a 6-token window; unpaired particles in a `ne`-sentence contribute one more — so `Nous ne disons pas que rien n'est jamais possible` now counts as 3 (was 2). Guards: `pas` / `plus` never count when unpaired, `de rien` idiom is skipped, particles in ne-less sentences are skipped. Fixture at `tests/corpus/fr/nested-negation.md` anchors the behaviour.	—	2026-04-23 docs clarity session — FR pedagogical example surfaced the detection gap
F31	✅ Shipped in v0.2 — `dev-doc` baseline narrowed to the infrastructure stack (`URL`, `HTML`, `CSS`, `JSON`, `XML`, `HTTP`, `HTTPS`, `UTF`, `IO`, `API`, `CLI`, `GUI`, `OS`, `CPU`, `RAM`, `SSD`, `USB`, `IDE`, `SDK`, `CI`, `CD`). Accessibility standards, engineering-practice initialisms, and AI/language-tech terms moved to project config via new `[rules.unexplained-abbreviation].whitelist` in `lucid-lint.toml` (additive over baseline). Breaking change for downstream users, flagged in CHANGELOG with the recovery snippet. Dogfooded in this repo’s own `lucid-lint.toml`.	—	v0.1 review feedback
F126	TOML overrides for `lexicon.jargon-undefined`. In v0.2 the active jargon lists are baked into the profile preset and there is no `[rules."lexicon.jargon-undefined"]` deserializer in `src/config.rs` — users can’t add custom domain terms, silence individual entries, or activate a non-default list combination from `lucid-lint.toml`. Wire the same shape `unexplained-abbreviation` already uses (validated `whitelist`, plus `custom_jargon` for additive terms and an explicit `active_lists` enum array). The rule’s underlying `Config` struct already exposes the fields (`active_lists`, `custom`, `whitelist`) — this is a config-layer wiring task, not a rule rewrite. Definition of done: TOML round-trip test, docs page (`docs/src/rules/jargon-undefined.md` + FR mirror) describing the schema, drop the F126 forward-link in those pages.	🟡 Later	2026-04-28 FR-translation review surfaced the gap
F-weasel-words-severity-tiering	Severity tiering for `lexicon.weasel-words`. Routed 2026-05-02 (`.personal/brainstorm/20260502-async-book-pr-timing.md`); blocks the async-book audit-and-PR play (tracked in `.personal/promotion-channels.md`). The current rule fires uniform `warning` on every entry in the EN/FR weasel-list, conflating two distinct linguistic functions: quantifiers (`some`, `many`, `often`, `most`, `several`) which are legitimate technical hedging in reference docs, and hedges (`a bit`, `just`, `quite`, `rather`, `pretty`, `kind of`) which signal under-confident prose. Stripping all of them in a Rust async-reference produces prose reviewers reject as artisan (“over-edited” — surfaced in `.personal/f113-async-book/READABILITY_REVIEW.md`). Fix: split `WEASEL_WORDS_EN` / `_FR` into two sub-lists, emit `Severity::Info` on quantifier hits and `Severity::Warning` on hedge hits. Per-rule TOML override stays available for users who want stricter / looser bands. Surfaces the pattern other lexical rules can adopt later (no architectural lift; a per-match severity decision inside the rule body). Definition of done: split lists in `src/language/{en,fr}/weasel.rs`, severity routing in `src/rules/lexicon/weasel_words.rs`, snapshot regen for both languages, docs page (`docs/src/rules/weasel-words.md` + FR mirror) describing the two bands and the rationale, CHANGELOG `## [Unreleased]` entry. Pairs with F-severity-floor-flag: once `--severity-floor=warning` exists, an external auditor running on a Rust-reference repo gets the “no contested edits” view in one flag.	🔴 Next	F113 audit-and-PR play (2026-05-02)
F-redundant-intensifier-bullet-fix	`lexicon.redundant-intensifier` parser miss inside bullet items / `strong` spans. Routed 2026-05-02 (`.personal/brainstorm/20260502-async-book-pr-timing.md`); blocks the async-book audit-and-PR play (tracked in `.personal/promotion-channels.md`). Surfaced while linting `rust-lang/async-book/src/why_async.md`: `very` inside `- OS threads are very …` (bullet + strong span) does not fire, while `highly` in a flat paragraph does. Same family of misses as F-tight-list-paragraphs — paragraph-level rules go silent when the surrounding event is `Tag::Item` with no enclosing `Tag::Paragraph`. F-tight-list-paragraphs is the right substrate to fix once for every paragraph-level rule; this entry is the verification slice that pins the regression for `redundant-intensifier` so the case cannot regress when F-tight-list-paragraphs lands. Definition of done: corpus fixture `tests/corpus/en/redundant-intensifier-bullet.md` + FR mirror, snapshot covering `very` / `highly` / `really` inside `- strong ...` and `* strong ...` shapes, comment in the test linking to F-tight-list-paragraphs so the slot stays after F129 lands, CHANGELOG entry.	🔴 Next	F113 audit-and-PR play (2026-05-02); same family as F-tight-list-paragraphs
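
Sketch for F-weasel-words-severity-tiering. This is a minimal illustration of the per-match severity decision, assuming the two sub-lists contain exactly the example terms quoted in the row. The constant names, `weasel_severity`, and `passes_floor` are hypothetical. The floor check only approximates what a `--severity-floor` flag could do, since that flag is not shipped yet.

```rust
/// Two bands, ordered so that `Info < Warning`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Info,
    Warning,
}

// Example terms taken from the roadmap row; the real lists are longer.
const QUANTIFIERS_EN: &[&str] = &["some", "many", "often", "most", "several"];
const HEDGES_EN: &[&str] = &["a bit", "just", "quite", "rather", "pretty", "kind of"];

/// Severity for one matched term, or `None` if the term is on neither list.
fn weasel_severity(term: &str) -> Option<Severity> {
    let term = term.to_lowercase();
    if QUANTIFIERS_EN.contains(&term.as_str()) {
        Some(Severity::Info)
    } else if HEDGES_EN.contains(&term.as_str()) {
        Some(Severity::Warning)
    } else {
        None
    }
}

/// Keep a diagnostic only if it is at or above the floor (assumed semantics).
fn passes_floor(severity: Severity, floor: Severity) -> bool {
    severity >= floor
}

fn main() {
    for term in ["many", "Just", "kind of", "parser"] {
        match weasel_severity(term) {
            Some(severity) => println!(
                "{term:?}: {severity:?}, shown with floor=warning: {}",
                passes_floor(severity, Severity::Warning)
            ),
            None => println!("{term:?}: not a weasel word"),
        }
    }
}
```

With a `warning` floor, quantifier hits drop out and only hedge hits remain, which matches the narrow-audit view the row describes.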

F22 context. The v0.1 rule is a flat comma-per-sentence threshold. In technical docs that routinely enumerate short items, this fires often even when the sentence is perfectly scannable. Candidate relaxations to evaluate (needs corpus research — don’t pick blindly):

Discount commas inside parenthesis-like elements ((...), [...], en/em-dash pairs). A parenthetical enumeration is already visually bracketed; its commas are not adding subordination load.
Discount commas after a colon : when what follows is a list of short items. Colon + short items is idiomatic prose-enumeration and reads well.
Short-item enumeration exemption: if all comma-separated segments are 1–2 words, treat the enumeration as a single “flattened list” token for counting purposes (a max_short_enum_items parameter, or implicit).
Interaction with structure.long-enumeration: the shared enumeration::detect_enumerations helper already discounts Oxford-style enumeration commas from structure.excessive-commas (3+ short items). F22 is specifically about the cases that helper still misses: parentheticals, post-colon lists, and non-Oxford enumerations (“A, B, C and D” without the final comma).

Research inputs to gather before deciding: FR/EN corpus samples of technical docs, a handful of real false positives from dogfooding and downstream projects, how textlint / Vale / write-good handle parentheticals. Decide between relaxation parameters vs. a smarter token-aware counter.
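
The first relaxation (commas inside a parenthesised list of short items) can be sketched as follows. This is an illustration under stated assumptions, not the shipped helper: `discounted_paren_commas` and `MIN_SEGMENTS` are hypothetical names, only round parentheses are handled, and nested parentheses are resolved innermost-first. `MAX_SEGMENT_WORDS = 2` is the value quoted in the F22 row.

```rust
const MAX_SEGMENT_WORDS: usize = 2; // value quoted in the F22 row
const MIN_SEGMENTS: usize = 3; // "3+ short segments"; the name is hypothetical

/// Number of commas to discount: those inside `( ... )` groups whose content
/// is a list of at least `MIN_SEGMENTS` segments of 1..=MAX_SEGMENT_WORDS words.
fn discounted_paren_commas(sentence: &str) -> usize {
    let mut discounted = 0;
    // Byte index just after the most recent unmatched '('.
    let mut open: Option<usize> = None;

    for (i, ch) in sentence.char_indices() {
        match ch {
            '(' => open = Some(i + 1),
            ')' => {
                if let Some(start) = open.take() {
                    let segments: Vec<&str> = sentence[start..i].split(',').collect();
                    let all_short = segments.iter().all(|segment| {
                        let words = segment.split_whitespace().count();
                        (1..=MAX_SEGMENT_WORDS).contains(&words)
                    });
                    if segments.len() >= MIN_SEGMENTS && all_short {
                        discounted += segments.len() - 1;
                    }
                }
            }
            _ => {}
        }
    }
    discounted
}

fn main() {
    let samples = [
        "Pick a profile (dev-doc, public, falc), then run the check.",
        "It failed (after the second retry, which surprised us), so we stopped.",
    ];
    for sentence in samples {
        let total = sentence.matches(',').count();
        let discount = discounted_paren_commas(sentence);
        println!("total={total} discounted={discount} effective={}", total - discount);
    }
}
```

The first sample keeps one effective comma out of three. The second keeps both of its commas, because its parenthetical is a clause, not a short-item list.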

Performance / hygiene (0.2.x)

Findings filed from the 2026-04-24 code-review stream-2 pass on src/. Each has a concrete source reference so it survives past the .personal/<date>-today.md scratchpad.

ID	Item	Priority	Origin
F93	Parser hot-path allocations. `src/parser/mod.rs:43` (`Paragraph::new(trimmed.to_string(), …)`) and `src/parser/tokenizer.rs:~88/109` (`current.trim().to_string()` per sentence) allocate in hot loops. ~~Confirm constructors accept `impl Into<String>`; pass the already-owned buffer where possible.~~ Refuted by samply profile 2026-04-25: `Paragraph::new` does not appear in the profile; `to_string()` in tokenizer = 3 samples / 0.03%. Real hot spots are F102 (`detect_language` 7.5%) and F103 (per-rule `split_sentences`).	✅ Done (refuted)	2026-04-24 code review (stream-2 #3); refuted 2026-04-25
F94	Tokenizer `Vec<char>` per sentence. `src/parser/tokenizer.rs:~60` collects a full `Vec<char>` for lookahead. ~~Swap to `Peekable<CharIndices>`.~~ Refuted by samply profile 2026-04-25: `Vec<char>` drop = 3 samples / 0.03% on the engine path. Yesterday’s “low ceiling” note (~5%) was generous; real ceiling is ~0.1%. Skip.	✅ Done (refuted)	2026-04-24 code review (stream-2 #5); refuted 2026-04-25
F102	`detect_language` cost. Single function showed 7.5% inclusive in samply profile 2026-04-25. Rewrote as single-pass, alloc-light: scalar counters, `to_lowercase()` only for words containing an uppercase character, no intermediate vectors. Bench delta on `engine_lint_str/en_long_devdoc` vs `stream2-noisy`: −0.56 % (p = 0.00, ~20 µs) — smaller than profile suggested because most of the inclusive cost is `unicode_words()` itself, which the rewrite cannot touch.	✅ Done	2026-04-25 samply profile; landed 2026-04-25
F103	Per-rule `split_sentences` re-parse. 8 rules called `split_sentences(&paragraph.text, …)` directly. Moved sentence splitting into `Paragraph::new`; rules now read `&paragraph.sentences`. Bench delta vs `stream2-noisy`: `engine_lint_str/en_long_devdoc` −11.58 % (~394 µs); `parse_markdown/en_long` +17.67 % (~38 µs, intentional — split cost moved into the parser phase, where it pays for itself across the eight consumers). Net user-facing win ~360 µs. New baseline saved as `stream2-after-f103`.	✅ Done	2026-04-25 samply profile; landed 2026-04-25
F95	✅ Shipped 2026-04-24 in commit `925ffb5`. Two non-literal expects fixed: `consecutive_long_sentences.rs` (`streak_start` unwrap when `streak_len > max`) and `all_caps_shouting.rs::flush_run` (`first()`/`last()` on a `Vec` already verified `len >= min_run`). The originally flagged `parser/tokenizer.rs:177` candidate is now an `if let Some(...)` pattern. Remaining `expect("non-zero literal")` sites are all `NonZeroU32::new(LITERAL)` — idiomatic compile-time invariants, explicitly out of audit scope.	✅ Done	2026-04-24 code review (stream-2 #2)
F96	✅ Shipped 2026-04-24 in commit `925ffb5`. `src/scoring.rs:199-209` now carries an explicit safety-contract comment naming the `[0, cap]` clamp dependency, plus a `debug_assert!(normalized.is_finite() && (0.0..=cap).contains(&normalized))` that trips in debug builds if a future edit loosens the clamp. The `#[allow(clippy::cast_possible_truncation, clippy::cast_sign_loss)]` stays — it masks a lint, not a real bug — but the invariant is now load-bearing in tests.	✅ Done	2026-04-24 code review (stream-2 #1)
F-config-whitelist-normalize	Config whitelist normalization at load time. `src/config.rs` — normalize (trim, case-fold per rule needs) on load instead of per invocation; catches user typos early. Small win; fits a v0.3 config-plumbing pass rather than a 0.2.x patch.	🟡 Later	2026-04-24 code review (stream-2 #6)
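
The F103 change amounts to splitting sentences once at construction and letting every rule borrow the result. A simplified sketch of that shape, assuming a one-argument constructor and a deliberately naive splitter (the real `Paragraph::new` and `split_sentences` take further arguments that the table elides, and the two "rules" here are hypothetical):

```rust
/// Paragraph that splits its sentences once, at construction time.
pub struct Paragraph {
    pub text: String,
    pub sentences: Vec<String>,
}

impl Paragraph {
    /// Accepts anything convertible into a `String` and caches the split.
    pub fn new(text: impl Into<String>) -> Self {
        let text = text.into();
        let sentences = naive_split_sentences(&text);
        Self { text, sentences }
    }
}

/// Deliberately naive splitter: a sentence ends at '.', '!' or '?'.
fn naive_split_sentences(text: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut current = String::new();
    for ch in text.chars() {
        current.push(ch);
        if matches!(ch, '.' | '!' | '?') {
            let trimmed = current.trim();
            if !trimmed.is_empty() {
                out.push(trimmed.to_string());
            }
            current.clear();
        }
    }
    // Keep any trailing text that lacks a terminator.
    let tail = current.trim();
    if !tail.is_empty() {
        out.push(tail.to_string());
    }
    out
}

/// Toy rule 1: reads the cached split instead of re-parsing the text.
pub fn longest_sentence_words(paragraph: &Paragraph) -> usize {
    paragraph
        .sentences
        .iter()
        .map(|sentence| sentence.split_whitespace().count())
        .max()
        .unwrap_or(0)
}

/// Toy rule 2: shares the same cached split.
pub fn sentence_count(paragraph: &Paragraph) -> usize {
    paragraph.sentences.len()
}
```

The parser pays the split cost once, which is why the parse benchmark regresses while the end-to-end lint benchmark improves.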

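The F96 pattern (clamp first, assert the invariant in debug builds, then cast) can be illustrated with a small sketch. The function name `to_points`, its parameters, and the NaN handling are assumptions for illustration; the actual code in `src/scoring.rs` is not reproduced here.

```rust
/// Convert a raw score into whole points in `[0, cap]`.
///
/// Safety contract (assumed for this sketch): `cap` is finite and
/// non-negative, and the cast below relies on the value having been
/// clamped into `[0, cap]` beforehand.
#[allow(clippy::cast_possible_truncation, clippy::cast_sign_loss)]
pub fn to_points(raw: f64, cap: f64) -> u32 {
    // `f64::clamp` passes NaN through, so map it to 0 explicitly.
    let normalized = if raw.is_nan() { 0.0 } else { raw.clamp(0.0, cap) };
    // Trips in debug builds if a future edit loosens the clamp above.
    debug_assert!(normalized.is_finite() && (0.0..=cap).contains(&normalized));
    normalized.round() as u32
}
```

Rust's float-to-integer `as` casts saturate rather than invoke undefined behaviour, so the lint suppression hides a warning rather than a hazard; the assertion documents the assumption the cast depends on.
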
New rules (v0.2)

New rule candidates raised in the rule-system-growth brainstorm (2026-04-20). Naming uses a provisional category.rule-name prefix pending F29 harmonisation. Grounding column points at the standard or research that justifies the rule.

Must-ship v0.2 (blocking release):

ID	Rule	Category	Tags	Grounding	Priority
F48	✅ `lexicon.all-caps-shouting` shipped in v0.2 — see `docs/src/rules/all-caps-shouting.md`	Lexicon	`a11y-markup`, `dyslexia`, `general`	WCAG 3.1.5, BDA Dyslexia Style Guide	—
F55	✅ `syntax.nested-negation` shipped in v0.2 — see `docs/src/rules/nested-negation.md`	Syntax	`aphasia`, `adhd`, `general`	FALC, CDC Clear Communication Index	—
F56	✅ `syntax.conditional-stacking` shipped in v0.2 — see `docs/src/rules/conditional-stacking.md`	Syntax	`aphasia`, `adhd`, `general`	FALC, plainlanguage.gov	—

Should-ship v0.2 (cuttable under time pressure, in suggested cut order):

ID	Rule	Category	Tags	Grounding	Priority
F62	✅ `lexicon.redundant-intensifier` shipped in v0.2 — see `docs/src/rules/redundant-intensifier.md`	Lexicon	`general`	Plain-language guides	🟡 Later
F52	✅ `structure.mixed-numeric-format` shipped in v0.2 — see `docs/src/rules/mixed-numeric-format.md`	Structure	`dyscalculia`, `general`	CDC Clear Communication Index	🟡 Later
F50	✅ `structure.line-length-wide` shipped in v0.2 — see `docs/src/rules/line-length-wide.md`	Structure	`dyslexia`, `general`	WCAG 1.4.8 (AAA)	🟡 Later
F47	✅ `lexicon.consonant-cluster` shipped in v0.2 — see `docs/src/rules/consonant-cluster.md`	Lexicon	`dyslexia`, `general`	BDA Dyslexia Style Guide	🟡 Later
F54	✅ `syntax.dense-punctuation-burst` shipped in v0.2 — see `docs/src/rules/dense-punctuation-burst.md`	Syntax	`general`	IFLA easy-to-read guidelines	🟡 Later

Cut order if schedule slips: F47 → F54 → F62 → F52 → F50 → F11. F55 and F56 are non-negotiable (trivial implementation cost, strong grounding).

Format support

ID	Item	Priority	Origin
F-asciidoc-support	Native AsciiDoc support	🟡 Later	Format scope v0.1
F-html-support	Native HTML support	🟡 Later	Relevant for EAA compliance
F-docx-support	`.docx` support via Pandoc integration	🟡 Later	FALC institutional target
F-pandoc-companion	Companion script `pandoc → lucid-lint`	🟡 Later	Documented in v0.1 README

Example-text fixtures

Scraper + cleaner + converter triplet under scripts/texts_*.py populates examples/public/ (committable public_ok sources) from examples/texts.yaml. First batch landed 21 fixtures. The follow-ups below close the remaining rough edges.

ID	Item	Priority	Origin
F-text-source-adapters	Per-source adapters for git-cloned upstreams. The generic `clean` / `convert` path doesn’t know how to extract text from shallow-cloned repos (proselint checks, Vale style packs, write-good / alex / retext / textlint-rule fixtures, ASSET / OneStopEnglish / EASSE / CLEAR-corpus datasets). Each needs a small extractor that walks the repo and emits one or more `.md` files per rule / excerpt.	🟡 Later	First scraper batch, 2026-04-22
F-text-before-after-refine	Refine `texts_convert._split_before_after`. The current heuristic looks for literal `## Before` / `## After` (EN/FR) headings; no upstream page in the current batch uses that shape, so every `before_after` source fell back to a single `content.md` with a warning. Replace with a per-source pair-extraction rule (plainlanguage.gov, EC How to write clearly, Canada.ca, OneStopEnglish, ASSET, Inclusion Europe) that emits `before.md` + `after.md`.	🟡 Later	First scraper batch, 2026-04-22
F-texts-yaml-url-maintenance	Maintenance pass on `examples/texts.yaml` URLs. 12 sources failed on the first batch — 404s from moved landing pages (canada.ca × 2, BDA Dyslexia, Center for Plain Language, Newsela, HuggingFace wiki_auto), UA-/bot-blocking (Légifrance 403, Orthodidacte 403, ADHD Foundation 400), and a DNS error for the specific 18F post. Audit and update entries; for sources that genuinely require a browser-flavoured UA, add a per-source override in the fetcher. Fold in the opportunistic hygiene tasks from the 2026-04-23 brainstorm: (a) dedupe overlapping canada.ca / plainlanguage.gov entries, (b) add a licence-drift guard that flags when a source’s `redistribution` changes between fetches.	🟡 Later	First scraper batch, 2026-04-22 + referential brainstorm, 2026-04-23
F-example-fixtures-part2	Desired-fixture-shapes coverage table + replacements for high-value local-only entries. Part 1 — coverage tables: ✅ Shipped (2026-04-23) — `scripts/texts_coverage.py` splits output by audience: the committed `examples/texts.md` shows `public_ok` counts only (no totals, no names that would leak local-only existence), spliced between `<!-- coverage:begin/end -->` markers; the gitignored `examples/local/COVERAGE.md` carries the full matrices plus the load-bearing local-only list. Wired as `just texts-coverage` / `just texts-coverage-check`. Part 2 — replacement hunting: 🟡 In progress. First addition (2026-04-25): a French government FALC source under Etalab Open Licence 2.0 — knock-on lifted `aphasia × FR` and `gov_guide × FR` out of `0 / N ⚠`. Second addition (2026-04-27): three US-federal public-domain ADHD sources — NIMH ADHD topic page (mixed shape, ~780 words), CDC About ADHD (good, ~920 words), CDC Treatment of ADHD (good, ~1040 words). All three covered by the explicit reproduction notices in NIMH and CDC reuse policies (17 USC § 105 + agency policy pages). Knock-on: `adhd × EN` lifted from the load-bearing list; public-coverage `gov_guide × EN` and `condition adhd × EN` rise to non-zero counts. Remaining load-bearing slots: `dyscalculia × EN` (one BDA `link_only`) and `aphasia × EN+FR` (three plain-language standards as `link_only`).	🟡 In progress	Referential brainstorm, 2026-04-23
F-rule-fixture-coverage-map	Bidirectional rule ↔ fixture coverage map. Generate `examples/COVERAGE.md` from each `content.md`’s `rules_relevant` frontmatter, rendered as two views: rule → fixtures that exercise it (surfaces under-fixtured rules) and fixture → rules it covers (surfaces untagged or mis-tagged fixtures). Once stable, embed or link the canonical fixture per rule from `docs/src/rules/<rule-id>.md`. Optional follow-up: calibrated snapshot tests that lock expected lint output per canonical fixture.	🟡 Later	Referential brainstorm, 2026-04-23
F-reference-auto-discovery	Auto-discovery of new references with triage queue. Crawler (sitemaps, RSS, GitHub search, ACL Anthology API) surfaces candidate sources against a relevance filter derived from `rules_relevant` keywords; a lightweight triage file lists candidates with accept / ignore / defer. Mini-product — revisit post-v0.3 once the referential has stabilised.	🟢 Speculative	Referential brainstorm, 2026-04-23
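
The two views described under F-rule-fixture-coverage-map are inverses of each other, so one can be derived from the other. A minimal sketch, assuming the `rules_relevant` frontmatter has already been parsed into a fixture-to-rules map (the function names and the map shape are hypothetical):

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Invert a fixture -> rules map into a rule -> fixtures map.
/// BTree collections keep the generated output in a stable order.
pub fn invert_coverage(
    fixture_to_rules: &BTreeMap<String, BTreeSet<String>>,
) -> BTreeMap<String, BTreeSet<String>> {
    let mut rule_to_fixtures: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for (fixture, rules) in fixture_to_rules {
        for rule in rules {
            rule_to_fixtures
                .entry(rule.clone())
                .or_default()
                .insert(fixture.clone());
        }
    }
    rule_to_fixtures
}

/// Rules from `all_rules` that no fixture exercises.
pub fn under_fixtured<'a>(
    all_rules: &[&'a str],
    rule_to_fixtures: &BTreeMap<String, BTreeSet<String>>,
) -> Vec<&'a str> {
    all_rules
        .iter()
        .copied()
        .filter(|rule| !rule_to_fixtures.contains_key(*rule))
        .collect()
}
```

Fixtures whose rule set is empty in the input map are the untagged candidates for the second view.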

Documentation rules plugin

ID	Item	Priority	Origin
F-code-block-without-lang	`code-block-without-lang` rule	🟡 Later	Rule 8 dropped from v0.1, candidate for `lucid-lint-docs` plugin

Docs.rs / API reference polish

Polish items for the auto-generated rustdoc surface at https://docs.rs/lucid-lint. The crate-level banner pointing readers to the mdBook + repo + RULES.md was added 2026-05-01 (src/lib.rs); module-level //! headers are already in place and #![warn(missing_docs)] is satisfied. Items below are deferred extras.

ID	Item	Priority	Origin
F-docsrs-metadata	`[package.metadata.docs.rs]` block in `Cargo.toml`. Pin the toolchain and feature set docs.rs builds with; add `rustdoc-args = ["--cfg", "docsrs"]` so any future feature-gated items can carry `#[cfg_attr(docsrs, doc(cfg(feature = "x")))]` and render the “available with feature X” badge. Cheap, lands the day a real feature flag is introduced. Renumbered from F-tight-list-paragraphs (collision with the parser tight-list fix that landed in parallel).	🟢 Speculative (0.2.x or 0.3)	2026-05-01 docs.rs polish discussion
F-docsrs-logo	Logo + favicon on docs.rs via `#![doc(html_logo_url = "…")]` and `#![doc(html_favicon_url = "…")]` at crate root. Reuses an asset hosted under the repo’s raw URL. Tiny visual identity win on the docs.rs landing page. Renumbered from F130.	🟢 Speculative (0.2.x)	2026-05-01 docs.rs polish discussion
F-doctest-entrypoints	One runnable doctest per major entry point (`Engine::with_profile`, `Engine::lint_str`, `Report` field access, key `Profile` variants). `///` blocks render as code samples on docs.rs and run under `cargo test --doc`, so they cannot rot. ~5 lines each. Lifts the API page from “list of names” to self-explanatory reference. Renumbered from F131.	🟡 Later (0.3)	2026-05-01 docs.rs polish discussion
F-public-api-audit	Public-API audit with `cargo public-api`: surface candidates that should carry `#[doc(hidden)]` (re-exports for macros, internal helpers leaked via `pub`) so the rustdoc index reflects the intended surface, not the current surface. Pair with a CI gate later if the surface becomes load-bearing for SemVer. Renumbered from F132.	🟡 Later (0.3)	2026-05-01 docs.rs polish discussion

Docs site — bilingual

ID	Item	Priority	Origin
F25	French mirror of the mdBook docs (`/fr/` tree). First slice shipped 2026-04-22: translated `introduction` + `rules-index`, short FR `accessibility` and `roadmap` pages pointing at EN, SUMMARY sidebar entry. Second slice shipped post-0.2.1 (2026-04-23): `fr/rules-index.md` renamed to `fr/rules/index.md` for EN-parity, first FR per-rule page landed (`structure.sentence-too-long`), parallel-version sidebar and EN↔FR deep-link toggle (F-summary-per-locale plan slot A, F92). Third slice shipped 2026-04-24: four more FR per-rule pages landed (`structure.excessive-commas`, `structure.long-enumeration`, `lexicon.weasel-words`, `lexicon.unexplained-abbreviation`), locked template honoured, `SUMMARY.md` + `fr/rules/index.md` rewired to point at the local FR versions. Fourth slice shipped 2026-04-25: six more FR per-rule pages landed (`structure.paragraph-too-long`, `structure.line-length-wide`, `structure.mixed-numeric-format`, `structure.deeply-nested-lists`, `structure.heading-jump`, `structure.deep-subordination`), closing out the `structure` category (9 / 9 rules FR-complete). Fifth slice shipped 2026-04-27: two more FR per-rule pages landed (`rhythm.consecutive-long-sentences`, `rhythm.repetitive-connectors`), closing out the `rhythm` category (2 / 2 rules FR-complete). Both EN pages were brought up to canonical template first (Examples + See also added). Sixth slice shipped 2026-04-28: six more FR per-rule pages landed (`lexicon.low-lexical-diversity`, `lexicon.excessive-nominalization`, `lexicon.jargon-undefined`, `lexicon.all-caps-shouting`, `lexicon.redundant-intensifier`, `lexicon.consonant-cluster`), closing out the `lexicon` category (8 / 8 rules FR-complete). Three of five categories now at 100 % (structure + rhythm + lexicon). 
Seventh slice shipped 2026-04-30: six more FR per-rule pages landed (`syntax.passive-voice`, `syntax.unclear-antecedent`, `syntax.dense-punctuation-burst`, `syntax.conditional-stacking`, `syntax.nested-negation`, `readability.score`), closing out the `syntax` (5 / 5) and `readability` (1 / 1) categories — all 5 categories now 100 % FR-complete (25 / 25 per-rule pages). `SUMMARY.md` was missing FR Syntaxe + Lisibilité subsections entirely; added in the same commit. Also fixed an EN/FR logic bug in `syntax.nested-negation` example (After clause now `something is possible` / `quelque chose est possible`, matching the predicate-logic-faithful inversion of the Before clause). Eighth slice shipped 2026-05-01 (Block C slice A): first two FR guide pages landed (`fr/guide/installation.md`, `fr/guide/quick-start.md`); new `Premiers pas` draft-chapter group in `SUMMARY.md`; both pages stamped with the F92 sub-task `en-source-sha` HTML comment. Ninth slice shipped 2026-05-01 (Block C slice B): two more FR guide pages landed (`fr/guide/profiles.md`, `fr/guide/suppression.md`) — Block C now half-done (4 / 8). 4 EN-only guide pages remain (`conditions`, `configuration`, `scoring`, `ci-integration`); FR pair-completeness now 35 / 42 (untranslated EN: 7, down from 11 at start of day). Tenth slice shipped 2026-05-01 (Block C slice C — closing slice): four FR guide pages landed (`fr/guide/conditions.md`, `fr/guide/configuration.md`, `fr/guide/scoring.md`, `fr/guide/ci-integration.md`); `SUMMARY.md` `Premiers pas` group now lists all 8 children. Block C complete (8 / 8). All 8 EN guide pages now have FR mirrors; FR pair-completeness 39 / 42 — only the architecture overview, design-decisions, and contributing pages remain untranslated (these are next-tier surfaces, not part of the user-facing guide). 
Eleventh slice shipped 2026-05-01 (next-tier close): three FR pages landed (`fr/architecture/overview.md`, `fr/architecture/design-decisions.md`, `fr/contributing.md`); `SUMMARY.md` gains an `Architecture` draft-chapter group + `Contribuer` entry under `Version française`. F25 closes — pair-completeness 41 / 41 (only `roadmap.md` remains intentionally asymmetric).	✅ Closed 2026-05-01	v0.1 docs `/shape` session, bilingual-equality prime directive
F-summary-per-locale	Split `SUMMARY.md` per locale (EN + FR) via a small preprocessor. v0.2.1 ships the single-`SUMMARY.md` + CSS `:has()` locale-hiding approach (1.A); both language trees coexist in the built HTML and each viewer only sees theirs. A clean separation would maintain `SUMMARY.en.md` + `SUMMARY.fr.md` and stitch them at build. Benefit: smaller per-page sidebar payload; clearer authoring story; no `:has()` browser-support floor. Cost: build-time stitcher, tooling to keep the two files in pair-sync. File when the FR tree outgrows the hide-via-CSS approach.	🟢 Speculative	2026-04-23 FR per-rule pages session
F-multi-book-mdbook	Multi-book mdBook layout (one book per locale). The truest “parallel version” — `/` redirects to `/en/`, `/fr/` is its own mdBook with its own theme inheritance. Benefit: each locale has its own table of contents, its own search index, its own navigation neighbour hints; no cross-locale bleed in any surface. Cost: biggest surgery — book.toml per locale, build orchestration, shared theme / asset de-duplication, sitemap updates, redirects. Revisit only if F-summary-per-locale isn’t enough.	🟢 Speculative	2026-04-23 FR per-rule pages session
F92	✅ Shipped post-0.2.1 (2026-04-23) — `scripts/sync_lang_counterparts.py` walks `docs/book/*/.html` after `mdbook build` and rewrites both `hreflang="en"` and `hreflang="fr"` anchors so the lang-switch deep-links to the matching page (e.g. `/fr/rules/sentence-too-long.html` ↔ `/rules/sentence-too-long.html`). Wired into `just docs-build`, the Deploy-docs workflow, and a new `just docs-lang-check` CI gate that runs with `--check` and fails on orphaned FR pages (FR without EN counterpart). The invariant is asymmetric by design: EN is canonical, FR is a translation layer — untranslated EN pages are informational and tracked as F25, not gated. No front-matter flag yet; add a `counterpart: none` flag only when a truly asymmetric page appears. Sub-task — FR content-staleness gate (shipped 2026-05-01): filename parity is gated; content drift was not. Every FR page now carries an `en-source-sha` HTML-comment stamp on its first line (`<!-- en-source-sha: 5e24f614… -->`), recording the EN counterpart’s last commit SHA at translation time. mdBook passes HTML comments through unchanged so the stamp is invisible in the rendered page; YAML front-matter was tried first but mdBook renders `---` as `<hr>` and the body as text. `scripts/check_lang_staleness.py` walks every FR page, compares the stored SHA to `git log -n1 --pretty=%H -- <EN counterpart>`, reports drift soft (PR `ci.yml` + main `docs-deploy.yml`) and fails on `main` with `STRICT=1` once the existing stale backlog clears. Wired as `just docs-lang-staleness`. `scripts/backfill_en_source_sha.py` (one-shot) stamped the 29 already-translated FR pages with the EN SHA at their introduction commit. 
Reconcile shipped 2026-05-01 (commit `438fa48b`, “F92 — reconcile stale FR backlog (13 → 0) + flip gate to strict”): of the 13 pages reported stale, 12 were cosmetic stamp drift only (the F105/F105b references-section sweep, the F35b/F35c a11y fix, and the `line-length-wide` author-break-aware fix all touched FR counterparts in the same commits — only the `en-source-sha` stamps lagged), 1 was substantive (`fr/index.md` had drifted on three sections — `État du projet` v0.2 numbers, `Aperçu` peak-end demo block, `Pour aller plus loin` guide-links update). Same commit flipped `docs-deploy.yml` from soft to `--strict`. PR-side `ci.yml` flipped to `--strict` on 2026-05-02 — both surfaces now strict-gated, sub-task fully closed. Optional further layers: an mdBook preprocessor banner above stale FR pages; a `needs-fr-translation` PR label automation for EN edits without FR counterparts.	— (sub-task: ✅ Closed 2026-05-02)	2026-04-23 FR per-rule pages session, option 2.B; 2026-05-01 Block C planning
F-docs-i18n-substrate	Docs i18n substrate evaluation (Starlight vs Sphinx). mdBook is a twin-tree at the file level with a post-build `hreflang` patcher (F92); it cannot deliver page-keyed translations or identical section numbering across languages by construction. A real i18n model needs either route-keyed translations (Astro Starlight: `defaultLocale` + `locales`, language dropdown built in, Markdown sources kept) or message-catalogue translations (Sphinx + `sphinx-intl` / gettext PO files: FR is the same file with strings substituted, headings and numbering identical by construction; weblate-style flow). Don’t migrate now — F92 + F25 + the F92 staleness sub-task carry through v0.2.x. Migration triggers (any one): (a) a third language is requested (Spanish or German via the EU disability-federation play, F-falc-readiness-guide); (b) docs surface crosses ~50 pages; (c) contributors complain about FR/EN drift after the staleness gate is in place. Default pick on trigger: Starlight (lightest migration, keeps Markdown). Sphinx only if RGAA-mandated structural parity becomes a contractual requirement. Placeholder entry — no work scheduled.	🟡 Later	2026-05-01 Block C planning, F25 follow-up
F107	✅ Shipped 2026-04-27 — Two-part fix without aliasing the rule ID. (1) Page subtitle: every shipped FR rule page opens with a short italic gloss directly under the H1 (e.g. `Phrase trop longue.`); 13 pages received the subtitle, the remaining 12 land alongside their translation. (2) Index gloss: `fr/rules/index.md` “Catégories” block reshaped into 5 per-category sub-tables (Structure / Rythme / Lexique / Syntaxe / Lisibilité), each `Règle \| Libellé` two-column. All 25 rules carry a FR label even when the page still points to the EN version (marked `(en)` inline). One-line note clarifies the `kebab-case` ID is the stable contract; the FR label is a reading aid. Sidebar TOC labels stay in EN — translating them would force a per-locale `SUMMARY.md` (F-summary-per-locale, parked Speculative).	—	2026-04-25 docs UX critique (Block E)
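
The staleness check described under F92 reduces to reading the stamp on the first line of an FR page and comparing it with the EN counterpart's latest commit SHA. The project does this in `scripts/check_lang_staleness.py`; the Rust sketch below only illustrates the logic, with hypothetical names and an assumed exact-match comparison. Fetching the SHA from `git log` is left out.

```rust
/// Extract the SHA from a first-line stamp of the form
/// `<!-- en-source-sha: <sha> -->`. Returns `None` when the line is not a stamp.
pub fn parse_en_source_sha(first_line: &str) -> Option<&str> {
    let inner = first_line
        .trim()
        .strip_prefix("<!--")?
        .strip_suffix("-->")?
        .trim();
    let sha = inner.strip_prefix("en-source-sha:")?.trim();
    // Accept only non-empty hexadecimal strings.
    let is_hex = !sha.is_empty() && sha.chars().all(|c| c.is_ascii_hexdigit());
    is_hex.then_some(sha)
}

/// Outcome of comparing an FR page's stamp with the EN page's latest SHA.
#[derive(Debug, PartialEq, Eq)]
pub enum Staleness {
    Fresh,
    Stale,
    Unstamped,
}

/// Classify an FR page given its first line and the EN counterpart's SHA.
pub fn staleness(fr_first_line: &str, en_latest_sha: &str) -> Staleness {
    match parse_en_source_sha(fr_first_line) {
        None => Staleness::Unstamped,
        Some(stamp) if stamp == en_latest_sha => Staleness::Fresh,
        Some(_) => Staleness::Stale,
    }
}
```

A soft gate would report `Stale` pages and exit successfully; a strict gate would turn any `Stale` or `Unstamped` result into a failure.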

Docs site — content

ID	Item	Priority	Origin
F27	✅ Shipped in v0.2 — `docs/src/roadmap.md` is auto-generated from the root `ROADMAP.md` by `scripts/sync-roadmap.py`. `just docs-build` / `just docs-serve` run the sync first, so the mdBook site always ships the current roadmap. Relative links are rewritten (targets under `docs/src/` become docs-relative; others become absolute GitHub URLs) so the `docs_links_stay_inside_docs` gate still passes.	—	v0.1 docs review
F28	✅ Shipped in v0.2 — one page per rule under `docs/src/rules/`, wired into `docs/src/SUMMARY.md`, enforced by `tests/rule_docs_coverage.rs`. Each page carries category, severity, default weight, parameters per profile, EN/FR examples where applicable, and suppression guidance.	—	v0.1 docs review
F29	Rule ID harmonisation. F29-slim ✅ shipped 2026-04-22 in v0.2.0: the 25 rule IDs now use `category.rule-name` form (`structure.excessive-commas`, `lexicon.weasel-words`, `readability.score`, …) and rule source files moved into category subdirectories under `src/rules/<cat>/`. `Category::for_rule` derives the category from the id prefix rather than a hand-maintained match arm (F43-style drift now impossible by construction). Hard break — suppression directives, `[rules.<id>]` TOML keys, JSON/SARIF `ruleId` fields all use the new form; no alias layer. mdBook filenames and docs URLs still use the flat kebab slug; docs-tree rearchitecture into category subdirs is a separate slice. F29-full (parked 2026-04-24) would add a stable category-numbered code (`STR-001`, `LEX-002`, `SYN-003`) that survives renames — slim already makes drift impossible by construction, and numeric codes only earn their cost on a real rename. Revisit only when a rename actually happens.	— (slim) / 🟢 Speculative (full)	v0.1 docs review; 2026-04-22 reprioritisation; 2026-04-24 brainstorm-next-cycles
F-rule-mention-linking	Audit every rule mention across the docs and link it to its reference page (F28). Requires F28 to land first. References-page surface (rule IDs in `→ Relevant to:` lines + rule → reference summary table) covered by F105b 2026-04-27; remaining surface is rule mentions in `docs/src/guide/*` prose pages, `RULES.md`, and the introduction.	🟡 Later	v0.1 docs review
F42	✅ Shipped in v0.2 — rule documentation coverage gate. `tests/rule_docs_coverage.rs` cross-checks every shipped rule id against its mdBook page, `Category::for_rule`, `scoring::WEIGHTED_RULE_IDS`, and (on CI, gated by `RULE_DOCS_GATE_GIT=1`) the `## [Unreleased]` section of `CHANGELOG.md`. Contract documented in `CONTRIBUTING.md`.	—	v0.2 interlude
F43	✅ Shipped in v0.2 — `RULES.md` category drift fixed. Per-rule `Category` lines and the Categories table now match `Category::for_rule`: `structure.excessive-commas` and `structure.deep-subordination` are `structure`, `rhythm.repetitive-connectors` is `rhythm`, `syntax.unclear-antecedent` is `syntax`. The drift banners on the four per-rule mdBook pages are removed.	🟡 Later	Surfaced by F42 interlude
F-rule-mention-coverage-test	Coverage test for F-rule-mention-linking: assert that each rule id mentioned in `docs/src/**/*.md` is linked on its first occurrence in each section. Follow-up from F-rule-mention-linking.	🟡 Later	F-rule-mention-linking follow-up
F104	✅ Shipped 2026-04-27 — `SUMMARY.md` reshaped into 5 collapsible sub-trees (Structure / Rhythm / Lexicon / Syntax / Readability) using mdBook draft chapters (`- [Title]()`) as non-clickable group headers; FR `Version française` block mirrors the same shape (Structure / Rythme / Lexique — Syntaxe and Lisibilité materialise as those FR translations land). `markdownlint` MD042 disabled globally to permit the empty-link draft-chapter syntax (matches the pre-existing MD025 carve-out for SUMMARY-required multiple H1s). Picked over (B) “one sub-page per category” — B doubles the page count without adding clarity the index table doesn’t already provide.	—	2026-04-25 docs UX critique (Block E)
F105	✅ Shipped 2026-04-27 — `docs/src/references.md` (EN, under Project) and `docs/src/fr/references.md` (FR, under Version française) consolidate every cited source into one informative surface, preserving the full taxonomy of `examples/REFERENCES.md` (legend, per-domain sections, rule → reference summary table) and the scholarly-honesty note. `examples/REFERENCES.md` becomes a thin redirect to the docs sources — kept because external citations may already point there. Both rule indexes (EN + FR) cross-link to the new page next to the existing `RULES.md` pointer. Per-citation anchors deferred — readers scan the page or use browser search; if a need surfaces, file a follow-up.	—	2026-04-25 docs UX critique (Block E)
F105b	✅ Shipped 2026-04-27 — Per-citation anchors (`<a id="author-year">`) on every entry of `references.md` + `fr/references.md`, plus a `## References` / `## Références` section on every rule page (25 EN + 13 FR) listing the relevant citations as anchored links. The references page now links rule IDs in `→ Relevant to:` lines and the rule → reference summary table to their per-rule mdBook pages — bidirectional rules ↔ references. Verified canonical URLs (DOI, publisher landing page, official archive — researched in 2026-04-27 lap, 26 of 34 academic citations carry one) added inline as raw HTML anchors with `rel="nofollow noopener noreferrer" target="_blank"`: `nofollow` so the docs site does not vouch for outside content, `noopener noreferrer` for new-tab safety. Sources without a verifiable canonical URL stay text-only — no guessed links. Subsumes the F-rule-mention-linking rule-mention linking pass for the references-page surface; wider F-rule-mention-linking audit (rule mentions in `docs/src/guide/*` prose pages) stays open.	—	F105 follow-up filed 2026-04-27
F-docs-codegen	Code → docs codegen for data-heavy surfaces. Several docs surfaces are hand-maintained today but derivable from the rule registry and config types: per-rule pages (defaults, weight, condition tags, category, severity), `docs/src/rules/index.md` table, `docs/src/guide/profiles.md` threshold tables, `docs/src/guide/conditions.md` tag list, `docs/src/guide/suppression.md` directive list, JSON output schema page. Proposed shape: a `lucid-lint manifest --format=json` subcommand emits one document with everything pulled from `default_rules`, `Category::for_rule`, `scoring::default_weight_for`, the `Condition` enum, profile presets, and `schemars::JsonSchema` derives. A `just docs-gen` script renders marked regions in existing prose pages (`<!-- BEGIN: lucid-gen rule-defaults id=structure.sentence-too-long lang=en -->` … `<!-- END: lucid-gen -->`) so prose around the data stays hand-authored. CI runs `just docs-gen` and fails on non-empty `git diff` (same shape as F27 for the roadmap sync). Translation surface shrinks to prose only — labels (`Default`, `Profile`, `Threshold`) come from a small `i18n.toml` keyed by `(lang, key)`, the data is identical across languages by construction. Block on F25 guide translations landing first so we don’t change the substrate mid-translation; open after Block C closes.	🟡 Later	2026-05-01 Block C planning, F25 / F28 / F42 follow-up
F-landing-page	Landing-page polish. `docs/src/introduction.md` already plays both roles today: lens-motif hero, before/after figure, “what makes it different”, quick-taste terminal capture, “where to next”. A real landing-page push only earns its cost when there’s a first consumer outside the maintainer (project gets adopted, traffic shows up). Until then, polishing is design work without a forcing function. Notes for when triggered: more positioning above the fold, demo grid for the rule families (one canonical example per category), CTA toward profiles + quick-start, lens-motif extension already validated for use across the page.	🟢 Speculative	2026-04-25 docs UX critique (Block E)
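
Deriving the category from the id prefix, as F29-slim describes, removes the hand-maintained mapping that allowed F43-style drift. A sketch of the idea follows; the enum and the `Option` return type are assumptions, and the real `Category::for_rule` signature may differ.

```rust
/// The five rule categories named in the roadmap.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Structure,
    Rhythm,
    Lexicon,
    Syntax,
    Readability,
}

impl Category {
    /// Derive the category from a `category.rule-name` id.
    /// Returns `None` for ids without a known prefix or without a rule name.
    pub fn for_rule(rule_id: &str) -> Option<Self> {
        let (prefix, name) = rule_id.split_once('.')?;
        if name.is_empty() {
            return None;
        }
        match prefix {
            "structure" => Some(Self::Structure),
            "rhythm" => Some(Self::Rhythm),
            "lexicon" => Some(Self::Lexicon),
            "syntax" => Some(Self::Syntax),
            "readability" => Some(Self::Readability),
            _ => None,
        }
    }
}
```

Because the category is read from the id itself, a rule cannot be listed under one category and scored under another.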

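The marked-region rendering proposed in F-docs-codegen can be sketched as a pure string transformation. `replace_region` is a hypothetical helper; the marker strings are passed in by the caller, and only the first matching region is rewritten.

```rust
/// Replace the text between `begin` and `end` markers with `generated`,
/// leaving the markers and all surrounding prose untouched.
/// Returns `None` when either marker is missing.
pub fn replace_region(page: &str, begin: &str, end: &str, generated: &str) -> Option<String> {
    // Byte offset just past the BEGIN marker.
    let body_start = page.find(begin)? + begin.len();
    // Byte offset of the END marker, searched only after BEGIN.
    let body_end = body_start + page[body_start..].find(end)?;
    let mut out = String::with_capacity(page.len() + generated.len() + 2);
    out.push_str(&page[..body_start]);
    out.push('\n');
    out.push_str(generated.trim());
    out.push('\n');
    out.push_str(&page[body_end..]);
    Some(out)
}
```

The transformation is idempotent: running it twice with the same generated text yields the same page, which is the property a CI check based on an empty `git diff` depends on.
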
Docs site — theming

ID	Item	Priority	Origin
F26	✅ MVP shipped in v0.2 via DOM-level trim in `lucid-navigation.js` — the picker now shows three honest items (`Auto · Lucid light · Lucid dark`); the stock Rust / Navy / Ayu `<li>`s are marked `hidden` so they’re inert for keyboard and screen-reader. CSS class mapping is unchanged (`.light` / `.rust` → lucid-light, `.coal` / `.navy` / `.ayu` → lucid-dark), so pre-existing localStorage selections still render correctly. Follow-up (optional): a full `index.hbs` override to drop the stock markup entirely rather than hide it; preferred once the mdBook upgrade cadence settles.	🟡 Later	v0.1 docs `/colorize` session; mdBook stock limitation
F73	✅ Pre-deploy font-leak gate shipped in v0.2 — `just docs-check-clean` rebuilds the book, runs `scripts/sanitize-stock-css.py`, and greps the output for active `font-family` / `--*-font` / `local()` references to `Open Sans` or `Source Code Pro`. Not wired into `just check` (mdbook build is too slow for the dev loop); wire it into the docs-publish CI workflow before any release-candidate goes live.	🟡 Later	v0.2 `/critique` polish pass follow-up
F-example-fixtures-part2	✅ Shipped in v0.2.1 — fixed localhost 404.html rendering under `mdbook serve`. `book.toml` sets `site-url = "/lucid-lint/"` for GitHub Pages, and mdBook emits `<base href="/lucid-lint/">` into 404.html (only there). On localhost that prefix doesn’t exist, so the browser’s preload scanner fired 18 stylesheet/script requests with the wrong prefix before the page recovered via a second fetch. The previous JS workaround in `docs/theme/head.hbs` rewrote `<base>` at parse time, but ran after the preload scanner. Fix: `just docs-serve` now sets `MDBOOK_OUTPUT__HTML__SITE_URL=/` for the serve process, so 404.html carries `<base href="/">` on localhost and the correct `<base href="/lucid-lint/">` in production builds; the JS workaround is removed.	—	2026-04-23 Block A

Docs site — reading preferences

ID	Item	Priority	Origin
F-reading-prefs-popover	Full reading-preferences popover UI — cog button in the header opens a popover with font radio (Atkinson / Standard / OpenDyslexic), line-spacing slider (1.4–2.0, 0.05 step) and text-size slider (90–130 %, 5 % step). v0.1 ships only the Introduction-page demonstrator; the CSS-variable plumbing (`--reading-scale`, `--reading-line-height`, `[data-font]`) is already in place, so this is UI work only.	🟡 Later	v0.1 docs `/shape` + `/typeset` sessions
F-docs-responsive	Responsive / mobile adaptation — right-rail page TOC and header controls collapse gracefully below 700 px; touch targets verified ≥ 44 × 44 px; sidebar drawer behaviour polished.	🔴 Next	v0.1 docs `/layout` session, deferred to `/adapt`
F-a11y-audit-sweep	Accessibility audit sweep — full AAA pass on both themes (contrast, focus order, `prefers-reduced-motion` coverage, keyboard-only walk-through, skip-link), plus a published accessibility statement page. First audit pass ran 2026-04-22 (17/20, 0 P0, 2 P1, 3 P2); findings filed as F35a–F35d below. F-a11y-audit-sweep stays open until the statement page ships and P1s are cleared.	🟡 In progress	v0.1 docs `/audit` plan
F35a	✅ Shipped 2026-04-22 — `theme/index.hbs` is now forked from mdBook v0.5.2’s upstream template (minimal-diff approach, documented so future mdBook upgrades stay a mechanical re-sync). The skip link and EN / FR language switch are emitted as server-rendered HTML inside `<body>` and inside `.right-buttons`; both language variants are rendered and CSS in `lucid-layout.css` hides the wrong-locale copy based on `html[lang]` (which `head.hbs` sets synchronously before first paint on `/fr/` pages). The previous `skipLink()` and `langSwitch()` IIFEs in `lucid-navigation.js` are gone; the only remaining JS on the skip-link path is a progressive-enhancement smooth-scroll handler. WCAG 2.4.1 Bypass Blocks now passes with JS disabled. Unblocks F26 (stock theme labels can be collapsed at the markup level).	—	F-a11y-audit-sweep audit 2026-04-22
F35b	Drop `role="radiogroup"`/`role="radio"` on reading-demo chips (P2 from F-a11y-audit-sweep audit). Current markup declares radiogroup semantics but the JS only binds `click` — arrow-key traversal is missing, so the ARIA contract is broken. Simpler fix is to switch to plain buttons with `aria-pressed` (the chips are preset toggles, not radios) rather than add a keyboard handler. Promoted to 🔴 Next on 2026-04-24 (brainstorm-next-cycles).	🔴 Next	F-a11y-audit-sweep audit 2026-04-22
F35c	✅ Closed 2026-05-01 as audit false-positive. The 2026-04-22 audit reported that `.lucid-stance__idea` lost its colour tint under `prefers-reduced-motion`. Re-audit on 2026-05-01 against `docs/theme/css/lucid-layout.css:567-622` and `docs/theme/css/lucid-typography.css:424-431`: no `@media (prefers-reduced-motion: reduce)` rule touches `.lucid-stance__idea`; the global reduced-motion reset zeroes `animation-duration` / `transition-duration` only and never overrides `background-color`. The only rule that strips the tint is `@media (forced-colors: active)` (line 620–622), which is intentional (Windows High Contrast users get the OS palette, position-based pairing carries the meaning). The original audit appears to have conflated `forced-colors: active` with `prefers-reduced-motion: reduce`. No code change needed; accessibility.md known-limitation bullet removed in the same commit.	—	F-a11y-audit-sweep audit 2026-04-22
F35d	Publish an accessibility statement page (`docs/src/accessibility.md`, FR counterpart at `docs/src/fr/accessibility.md`). EN page carries the stated bar (WCAG 2.2 AAA), first audit pass result (2026-04-22, 17/20), a “Known limitations” block listing F35a/b/c pending, report route, and audit cadence. FR stub mirrors the limitations block. Shipped 2026-04-22.	🟢 Shipped	F-a11y-audit-sweep audit 2026-04-22
F-docs-final-polish	Final polish pass — optical alignment, spacing rhythm, edge-state copy, favicon PNG fallback, social-card refinement, re-running `/critique` to verify the score moves above 30/40.	🟡 Later	v0.1 docs `/polish` plan
F-terminal-demo-a11y	Terminal-demo accessibility — keep VHS, add motion + transcript fallbacks. Audited VHS (charmbracelet/vhs, active 2026-04-27, headless+CI-reproducible) vs. terminalizer (~16k stars, last commit 2024-08-29, effectively unmaintained). Verdict: keep VHS — `.tape` files are text-diffable, the build is reproducible, and the motion-handling problem is the same on both tools, so it is not a recorder choice but a wrapping problem. AAA gap to close: every embedded GIF on the docs site (today: `docs/src/assets/tty/explain.gif` plus future captures) must (1) honour `prefers-reduced-motion` — browsers do not pause animated GIFs automatically, so a static `<picture>` source-set with a still PNG fallback served when `(prefers-reduced-motion: reduce)` is the right shape; (2) carry the per-step transcript inside the page so non-sighted, screen-reader, and reduced-motion readers reach the same content as motion viewers — a stepwise prose block (e.g. `<details><summary>Transcript</summary>…</details>` with each tape command + its visible output as a list) sitting next to the GIF, plus an `alt=` summary on the image itself. The `.tape` source already encodes the steps deterministically — a small generator can emit the transcript from the same file the GIF is built from, keeping motion view and transcript view pair-locked. Phase: v0.3 marketing.	🟡 Later	2026-04-27 Block E recon
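
The transcript generator described for F-terminal-demo-a11y could start from something like the sketch below. It assumes the tape uses plain VHS `Type "..."` lines and only extracts the typed commands; capturing each command's visible output is left out. The function names `typed_commands`, `transcript_html`, and `escape_html` are hypothetical, not part of lucid-lint or VHS.

```rust
/// Collect the text of every `Type "..."` line in a tape file.
/// Hypothetical helper: other tape commands (Enter, Sleep, Output) are skipped.
pub fn typed_commands(tape: &str) -> Vec<String> {
    tape.lines()
        .filter_map(|line| {
            let rest = line.trim().strip_prefix("Type ")?;
            // Keep only the text between the surrounding double quotes.
            let inner = rest.trim().strip_prefix('"')?.strip_suffix('"')?;
            Some(inner.to_string())
        })
        .collect()
}

/// Escape the three characters that would break the HTML list.
fn escape_html(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// Render the typed commands as a collapsible transcript block.
pub fn transcript_html(tape: &str) -> String {
    let mut out = String::from("<details><summary>Transcript</summary>\n<ol>\n");
    for cmd in typed_commands(tape) {
        out.push_str("<li><code>");
        out.push_str(&escape_html(&cmd));
        out.push_str("</code></li>\n");
    }
    out.push_str("</ol>\n</details>\n");
    out
}
```

Because the transcript is derived from the same `.tape` text as the GIF, the two views cannot drift apart as long as both are rebuilt together.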

Quality features

ID	Item	Priority	Origin
F-score-evolution-dashboard	Score evolution dashboard across runs	🟢 Speculative	Rule 11, inspired by coverage reports
F98	Mutation testing via `cargo-mutants`. ✅ Baseline shipped 2026-04-25 — dev-tool installed, `just mutants <file>` recipe added (timeout 60 s, no-shuffle for reproducibility), four-file probe run: `sentence_too_long.rs` 6 caught / 0 missed / 4 unviable (100 %), `scoring.rs` 18 / 0 / 2 (100 %), `engine.rs` 5 / 0 / 12 (100 %), `low_lexical_diversity.rs` 29 / 47 / 5 (36 %). Canonical reference rule + cross-cutting layer score perfectly; the lexical-diversity rule has two clear test gaps surfaced as F108 + F109. Triage methodology: cluster missed mutants by site → one ROADMAP entry per root cause, not per mutant.	✅ Done	Stream-2 testing brainstorm, 2026-04-24
F108	`low_lexical_diversity::ratio_at_anchor_min` — assert reported ratio in tests. ✅ Shipped 2026-04-25. Added `reported_ratio()` helper (parses the documented message format) and three new test fixtures: `reported_ratio_is_minimum_observed_in_cluster` (50 W + 100 cache + 50 V → cluster-exit path with min ratio 0.01 deep mid-slide, not at anchor), `flush_path_reports_final_ratio` (cache-only doc → flush path), and `exactly_window_size_tokens_runs_the_check` (boundary on the early-return guard). Ratio assertion uses `(ratio - 0.01).abs() < 1e-9` so floating-point shifts from arithmetic mutations are caught. Bonus refactor (typed-ratio field on `Diagnostic`) deferred — string parsing is fine for the test-only consumer.	✅ Done	F98 baseline 2026-04-25
F109	`low_lexical_diversity::check` — borderline-cluster fixtures. ✅ Shipped 2026-04-25 alongside F108. Added `cluster_starts_at_strict_inequality` and `ratio_exactly_at_threshold_does_not_trigger` — the latter uses 49 W + 51 cache so the only full window has unique=50 → ratio exactly 0.50 = `min_ratio`. With strict `<` the rule must not trigger; a `< → <=` flip would emit a diagnostic and fail the test. Combined effect: the rule’s mutation score moved from 36 % (29 / 47 / 5) at F98 baseline to 89 % (68 / 8 / 5). The remaining 8 missed mutants are equivalent under the current rule logic — defensive guards (`start_index + window > tokens.len()` is unreachable in normal flow because `anchor.index ≤ len − window`), or initial values the slide loop unconditionally overwrites (`let mut best = unique / window` is replaced as soon as a lower ratio appears, which it always does in a real cluster). Closing those would require rule refactoring (e.g. starting `best` at `f64::INFINITY` to prove the initial computation is dead) — diminishing returns; deferred.	✅ Done	F98 baseline 2026-04-25
F-proptest-invariants	Property-based tests via `proptest` (dep already in `[dev-dependencies]`, zero call sites today — paid for, unused). Four invariants in `tests/properties.rs`, deliberately small: (1) `split_sentences` never drops a non-whitespace character on round-trip, (2) re-linting an identical string yields identical diagnostics (engine idempotence), (3) for threshold-driven rules, `public`-profile diagnostics are a superset of `dev-doc`-profile diagnostics on the same input (profile monotonicity), (4) `Engine::lint_str` never panics on arbitrary valid UTF-8 ≤ 10KB. Goal: fortify tokenizer / engine seams, not rewrite the suite.	🟡 Later	Stream-2 testing brainstorm, 2026-04-24
F-llm-fp-miner	LLM false-positive miner via Claude Code. Dev-only audit script (not a test, not a CI gate) that runs lucid-lint across the CC corpus, asks Claude to flag diagnostics that look wrong, writes a triage report to `.personal/audits/`. Reframed from the original “LLM-as-Judge harness” after Devil’s Advocate surfaced three blockers on the gating form: non-determinism across Claude model versions, ambiguity about whether a disagreement indicts the rule or the judge, cost / wall-clock at 600×N scale. The miner form sheds all three — human triages, Claude suggests. Respects prime directive #4 (deterministic core, no LLM) because it lives entirely outside the library crate and never blocks `just check`. Wait until v0.3 `lucid-lint-nlp` plugin work surfaces the need for correctness review at scale.	🟢 Speculative	Stream-2 testing brainstorm, 2026-04-24
F93	Tokenizer `split_sentences` `Vec<char>` allocation. The helper collects the full input into a `Vec<char>` per call to support lookbehind (`chars[idx-1]`) and arbitrary lookahead (`chars[idx+1..].find(!ws)` for ellipsis-continuation). Nominal waste on real corpus is ~5% of the `split_sentences` budget (bench shows 35µs total, `Vec<char>` alloc ~1–2µs). Refactor to a small ring-buffer + `Peekable<CharIndices>` is feasible but high-churn for low ceiling. Revisit only if profiling pins the tokenizer as a bottleneck.	🟢 Speculative	Stream-2 code review 2026-04-24 (measured; deferred)
F-lucid-stance-unify	Unify rule-page example figures on the `.lucid-stance` component. Today the intro page uses a custom `.lucid-stance` figure (Before / After side-by-side, colour-matched ideas, diagnostic in the figcaption), while rule pages use plain H3 + blockquote + fenced `text` for the diagnostic (see `docs/src/rules/sentence-too-long.md`). The H3 form works and is cheap to roll out, but wide screens could show stronger Before↔After pairing with the side-by-side figure. Scope: extract `.lucid-stance` into a reusable component (mdBook include or raw HTML pattern), tune the styling for in-content width (rule pages sit inside the narrower content column, not the landing-page hero), one figure per language, drop the H3 subsections in favour of a `data-lang` attribute surfaced as a chip on the figure. Ship only after the H3-based rollout has landed across all example-bearing rule pages and the unified pairing is confirmed as the dominant reader complaint.	🟢 Speculative	2026-04-23 docs clarity session — H3 subsections landed as the lightweight option; F-lucid-stance-unify parks the heavier unify-the-components path
F-fix-mode	`--fix` mode for the mechanical subset of rules — promoted to 🟡 Later on 2026-04-24 (brainstorm-next-cycles, 0.3 Should). Narrow scope locked: `lexicon.all-caps-shouting` (lowercase the run), `lexicon.redundant-intensifier` (drop the intensifier), `structure.mixed-numeric-format` (normalise to the detected majority style), `structure.line-length-wide` (rewrap to `max_chars`). All other rules stay report-only — cognitive-load judgments need the author to choose the rewrite. Borderline `structure.heading-jump` stays out of the initial cut. Design: per-rule `fixable: bool` metadata on the `Rule` trait, `--fix` flag walks diagnostics in document order applying only those with concrete replacements, writes files in place (or emits a unified diff with `--fix=print`), exits with count of fixes applied. Conservative default: `--fix` only touches the explicitly-fixable set, never guesses.	🟡 Later	2026-04-23 docs clarity session — framing “lucid-lint reports, you rewrite” surfaced the question
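
The F-fix-mode design (walk diagnostics in document order, apply only concrete replacements, return the count) can be illustrated with a minimal sketch. The `Fix` struct and `apply_fixes` function are hypothetical; the roadmap only fixes the behaviour, not these names or the byte-offset representation.

```rust
/// A concrete replacement attached to a diagnostic (hypothetical shape).
#[derive(Debug, Clone)]
pub struct Fix {
    pub start: usize, // byte offset, inclusive
    pub end: usize,   // byte offset, exclusive
    pub replacement: String,
}

/// Apply fixes in document order and return the new text plus the number applied.
/// A fix that overlaps an earlier one, or points outside the text, is skipped.
pub fn apply_fixes(text: &str, mut fixes: Vec<Fix>) -> (String, usize) {
    fixes.sort_by_key(|f| f.start);
    let mut out = String::with_capacity(text.len());
    let mut cursor = 0;
    let mut applied = 0;
    for fix in fixes {
        let valid = fix.start >= cursor
            && fix.start <= fix.end
            && fix.end <= text.len()
            && text.is_char_boundary(fix.start)
            && text.is_char_boundary(fix.end);
        if !valid {
            continue; // conservative: never guess
        }
        out.push_str(&text[cursor..fix.start]);
        out.push_str(&fix.replacement);
        cursor = fix.end;
        applied += 1;
    }
    out.push_str(&text[cursor..]);
    (out, applied)
}
```

Skipping overlapping fixes rather than merging them matches the conservative default stated in the entry: a second pass can pick up whatever the first pass left untouched.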

Scope control

File/directory discovery. Distinct from suppression (below): scope control excludes inputs before they are scanned; suppression hides diagnostics after scanning.

ID	Item	Priority	Origin
F78	✅ Shipped in v0.2 — `exclude = [...]` glob list in `[default]` of `lucid-lint.toml` and `--exclude <GLOB>` CLI flag (comma-delimited, repeatable). Patterns match against paths relative to the walked root; matching directories are pruned, not descended. Explicit file args bypass exclusion. Backed by `globset`. See `docs/src/guide/configuration.md`. `.lucidignore` (gitignore-style file) deferred to F78b if user demand surfaces.	—	Dogfood feedback 2026-04-21
F78b	`.lucidignore` file (gitignore-style, with negations and nested files). Different crate (`ignore`) and a larger test matrix than the glob-list MVP. Ship only if users ask — the `exclude` list in `lucid-lint.toml` covers the dominant use case.	🟢 Speculative	F78 deferral, 2026-04-21

Suppression mechanism

v0.1 ships the minimal inline-disable directive (see the brainstorm notes in brainstorm/20260419-inline-disable-feature.md). Extensions deferred:

ID	Item	Priority	Origin
F18	✅ Block form shipped in v0.2: `<!-- lucid-lint-disable <rule-id> -->` … `<!-- lucid-lint-enable -->` silences one rule across every line in the scope. `enable` with no argument closes every open scope; with a rule id, closes only that rule’s scope (so overlapping disables for different rules can nest). Unterminated `disable` extends to end-of-document. See RULES.md → Suppressing diagnostics.	—	v0.1 inline-disable brainstorm
F19	✅ Shipped in v0.2 — top-level `[[ignore]]` array-of-tables in `lucid-lint.toml`, each entry with a required `rule_id` silences every diagnostic for that rule across Markdown, plain text, and stdin. Unknown ids tolerated. Applied post-engine, pre-scoring, so scoring / rendering / exit-code logic all see the filtered view. Scope broadened from the roadmap’s original “`.txt` and stdin” wording because a global filter is simpler and more useful; Markdown users can still prefer inline directives for local silencing. `reason` field tracked as F-suppression-reason-field. See `docs/src/guide/configuration.md`.	—	v0.1 inline-disable brainstorm
F-suppression-reason-field	`reason="..."` field, optional in v0.1, surfaced in reports and optionally required via config	🟡 Later	v0.1 inline-disable brainstorm
F-suppression-disable-file	File-level directive (`disable-file`) and multi-rule lists	🟡 Later	v0.1 inline-disable brainstorm
F-severity-floor-flag	`--severity-floor=warning` CLI flag. Routed 2026-05-02 (`.personal/brainstorm/20260502-async-book-pr-timing.md`); supports the async-book audit-and-PR play (tracked in `.personal/promotion-channels.md`). Need: external audit PRs (async-book and adjacent) want a “narrow audit” mode that drops `info` diagnostics from output and from score impact, so the PR demonstrates value on the unambiguous wins (sentence-too-long, redundant-intensifier, unclear-antecedent, paragraph-too-long) without the contested ones (`info`-tier weasel words after F-weasel-words-severity-tiering lands). Shape: `--severity-floor={info,warning,error}` with default `info` (current behavior). Pairs with F-weasel-words-severity-tiering: once weasel-words emits `info` on quantifiers, an auditor running `--severity-floor=warning` ships a PR where reviewers see only the prose changes the tool is most confident about. Implementation is a post-engine filter (mirrors F19 `[[ignore]]` post-engine pre-scoring shape) so JSON / SARIF / TTY all see the same filtered view; scoring excludes filtered diagnostics so `--min-score` interacts correctly. Definition of done: CLI flag in `src/cli.rs`, filter in `src/engine.rs` post-rule pre-score, two snapshot tests (info-included default vs warning-floor), docs in `docs/src/guide/configuration.md` + FR mirror with a “running a narrow audit on someone else’s repo” worked example, CHANGELOG entry.	🔴 Next	F113 audit-and-PR play (2026-05-02)
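
The post-engine filter described for F-severity-floor-flag can be sketched in a few lines. The `Severity` and `Diagnostic` types here are simplified stand-ins, and `apply_severity_floor` is a hypothetical name; the real filter would live in `src/engine.rs` per the definition of done.

```rust
/// Severity tiers in ascending order, so the derived `Ord` gives Info < Warning < Error.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Info,
    Warning,
    Error,
}

/// Simplified diagnostic: only the fields the filter needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub rule_id: String,
    pub severity: Severity,
}

/// Keep only diagnostics at or above the floor.
/// Run after the rules and before scoring, so every output format sees the same view.
pub fn apply_severity_floor(diags: Vec<Diagnostic>, floor: Severity) -> Vec<Diagnostic> {
    diags.into_iter().filter(|d| d.severity >= floor).collect()
}
```

With a floor of `Info` the filter is a no-op, which matches the stated default of keeping current behaviour.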

Reporting / DX

TTY-output decoration on the lucid-lint check summary surface. Distinct from the rule engine (no diagnostic semantics change), from suppression (which hides diagnostics), and from scoring (which weights them) — this section covers what the user reads after the diagnostic list. JSON / SARIF stay structural until a second consumer asks.

ID	Item	Priority	Origin
F-report-quick-wins	TTY report quick-wins block — actionable hint hooks under the diagnostic list. Routed 2026-05-03 from Block C of `.personal/2026-05-03-today.md` (originally surfaced in the 2026-05-02 deferred buffer). Dogfood loops on this repo and adjacent docs surfaced the gap: high-density runs already produce a complete summary (score line + per-category breakdown + diagnostic list), but the next action a user can take is buried inside the list. This entry adds a small “quick wins” block rendered after the diagnostic list, TTY only in v0.2.x, with two seed shapes both grounded in observed dogfood patterns: (1) Acronym whitelist hint — when ≥ 3 occurrences of `lexicon.unexplained-abbreviation` share one token, surface `→ add "X" to whitelist (N hits suppressed)` (one line per acronym, top 3); (2) Single-rule hot-spot hint — when one rule fires ≥ 10 times in one file, surface `→ <rule-id> dominates this file — see <docs URL>` (one line per rule per file). Why a single ROADMAP entry, not two: one shape (a quick-wins reporter), two seed heuristics that share the threshold-based fire rule, the TTY-only render path, and the test pattern; a third heuristic (e.g. category dominance) earns its own line later without new scaffolding. Threshold heuristics (sketch, finalised in PR): acronym hint requires ≥ 3 hits sharing a token; hot-spot hint requires ≥ 10 hits of the same rule in one file. Each hint is one line; the block caps at ≤ 5 lines so it never crowds the score banner. Output surfaces: TTY only in v0.2.x; JSON / SARIF stay structural (separate ROADMAP entry if a CI consumer asks; no current ask). Definition of done: new `report::quick_wins` module (or extension of the existing TTY renderer) with the two seed hints wired, snapshot tests pinning the fires-vs-silent path for each shape, threshold parameters live next to the hint definition (no central config knob until a second consumer asks), CHANGELOG `[Unreleased]` entry. 
Non-breaking: purely additive output below the diagnostic list, no JSON / SARIF schema change, no scoring change, no new CLI flag (existing `--quiet` already suppresses TTY decoration if a consumer wants the bare list).	🔴 Next	Block C 2026-05-03 (deferred from 2026-05-02 buffer)
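
As an illustration of the acronym whitelist hint in F-report-quick-wins, the sketch below counts tokens, keeps those at or above the threshold, and emits the top three. The function name `acronym_hints` and the constants are hypothetical, and the thresholds are the sketch values from the entry, which are to be finalised in the PR.

```rust
use std::collections::BTreeMap;

/// Sketch thresholds from the roadmap entry (not final).
const ACRONYM_MIN_HITS: usize = 3;
const ACRONYM_TOP_N: usize = 3;

/// `tokens` holds one entry per `lexicon.unexplained-abbreviation` diagnostic.
/// Returns at most `ACRONYM_TOP_N` hint lines, most frequent token first.
pub fn acronym_hints(tokens: &[&str]) -> Vec<String> {
    // BTreeMap keeps tokens sorted, so ties resolve alphabetically and output is deterministic.
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for &token in tokens {
        *counts.entry(token).or_insert(0) += 1;
    }
    let mut ranked: Vec<(&str, usize)> = counts
        .into_iter()
        .filter(|&(_, hits)| hits >= ACRONYM_MIN_HITS)
        .collect();
    // Stable sort: equal counts keep their alphabetical order.
    ranked.sort_by(|a, b| b.1.cmp(&a.1));
    ranked
        .into_iter()
        .take(ACRONYM_TOP_N)
        .map(|(token, hits)| format!("→ add \"{}\" to whitelist ({} hits suppressed)", token, hits))
        .collect()
}
```

The hot-spot hint would follow the same count, filter, and cap pattern, keyed on rule id and file instead of token.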

v0.1 — Released 2026-04-20

Shipped in the tag: all 17 rules across 5 phases, the minimal inline-disable directive, and the mdBook documentation site (Lucid light / Lucid dark themes, Atkinson Hyperlegible Next / Literata / Commit Mono / OpenDyslexic typography layer, reading-preferences demonstrator, accessibility page, EN/FR header switch with v0.2 FR-stub). See CHANGELOG.md for the full release notes.

Rules (17 / 17) ✅

Phase 1 — Deterministic structural rules

Status	Rule	Notes
✅	`structure.paragraph-too-long`	Sentence-count + word-count thresholds per profile (`src/rules/paragraph_too_long.rs`)
✅	`structure.deeply-nested-lists`	Flags list items nested beyond profile depth (`src/rules/deeply_nested_lists.rs`)
✅	`structure.heading-jump`	Walks section depths, flags jumps > +1 level (`src/rules/heading_jump.rs`)
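
The heading-jump check in the table above reduces to comparing each heading level with the one before it. A minimal sketch, assuming the heading levels have already been extracted from the document (the function name `heading_jumps` is hypothetical):

```rust
/// Return the index of every heading that is more than one level deeper than its predecessor.
/// `levels` is the sequence of heading depths in document order (1 for H1, 2 for H2, ...).
pub fn heading_jumps(levels: &[u8]) -> Vec<usize> {
    levels
        .windows(2)
        .enumerate()
        .filter(|(_, pair)| pair[1] > pair[0].saturating_add(1))
        .map(|(i, _)| i + 1) // index of the second heading in the pair
        .collect()
}
```

Moving back up by any number of levels is never flagged; only a descent of more than one level is.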

Phase 2 — Simple text rules

Status	Rule	Notes
✅	`structure.sentence-too-long`	Reference implementation and template for the 16 other rules (`src/rules/sentence_too_long.rs`)
✅	`structure.excessive-commas`	Per-profile comma-per-sentence threshold (`src/rules/excessive_commas.rs`)
✅	`rhythm.consecutive-long-sentences`	Intra-paragraph streak of long sentences (`src/rules/consecutive_long_sentences.rs`)
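
The intra-paragraph streak detection behind `rhythm.consecutive-long-sentences` can be pictured as a single pass over per-sentence word counts. This is a sketch only: the function name `long_streaks` and both parameters are hypothetical, and the real thresholds come from the active profile.

```rust
/// Find runs of consecutive long sentences inside one paragraph.
/// Returns inclusive (first, last) sentence indices for each run of at least `min_streak`.
pub fn long_streaks(word_counts: &[usize], max_words: usize, min_streak: usize) -> Vec<(usize, usize)> {
    let mut streaks = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &words) in word_counts.iter().enumerate() {
        if words > max_words {
            // Open a streak at the first long sentence.
            if start.is_none() {
                start = Some(i);
            }
        } else if let Some(s) = start.take() {
            // A short sentence closes the streak.
            if i - s >= min_streak {
                streaks.push((s, i - 1));
            }
        }
    }
    // A streak may run to the end of the paragraph.
    if let Some(s) = start {
        if word_counts.len() - s >= min_streak {
            streaks.push((s, word_counts.len() - 1));
        }
    }
    streaks
}
```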

Phase 3 — Lexical rules with word lists

Status	Rule	Notes
✅	`lexicon.weasel-words`	Per-language phrase list, word-boundary match (`src/rules/weasel_words.rs`)
✅	`lexicon.unexplained-abbreviation`	Pattern-based (v0.1); definition-awareness tracked as F9 (`src/rules/unexplained_abbreviation.rs`)
✅	`lexicon.jargon-undefined`	Pattern-based, profile-activated category lists (`src/rules/jargon_undefined.rs`)
✅	`lexicon.excessive-nominalization`	Per-sentence suffix-based density check (`src/rules/excessive_nominalization.rs`)
✅	`rhythm.repetitive-connectors`	Sliding-window connector frequency, one diagnostic per cluster (`src/rules/repetitive_connectors.rs`)

Phase 4 — Global metric

Status	Rule	Notes
✅	`readability.score`	Per-document Flesch-Kincaid grade; info under threshold, warning above (`src/rules/readability_score.rs`)
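
For reference, the Flesch-Kincaid grade is a linear combination of words per sentence and syllables per word. The sketch below uses the standard published coefficients and assumes the three counts are already available; syllable counting is language-dependent and not shown. The function name `flesch_kincaid_grade` is hypothetical.

```rust
/// Flesch-Kincaid grade level from raw counts.
/// Returns None when the text has no words or no sentences.
pub fn flesch_kincaid_grade(words: usize, sentences: usize, syllables: usize) -> Option<f64> {
    if words == 0 || sentences == 0 {
        return None;
    }
    let w = words as f64;
    let words_per_sentence = w / sentences as f64;
    let syllables_per_word = syllables as f64 / w;
    Some(0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59)
}
```

For example, 100 words in 5 sentences with 150 syllables gives 0.39 × 20 + 11.8 × 1.5 − 15.59 = 9.91.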

Phase 5 — Heuristic rules (hardest)

Status	Rule	Notes
✅	`structure.long-enumeration`	Shared enumeration detector with `structure.excessive-commas`; suggests list conversion (`src/rules/long_enumeration.rs`, `src/rules/enumeration.rs`)
✅	`structure.deep-subordination`	Counts subordinators between strong-punct breaks; skips pronoun enumerations (`src/rules/deep_subordination.rs`)
✅	`syntax.passive-voice`	Heuristic `be/être`+past-participle detector; POS-based detection remains a `lucid-lint-nlp` plugin candidate (`src/rules/passive_voice.rs`)
✅	`syntax.unclear-antecedent`	Info-level heuristic: bare demonstrative + verb, or paragraph-start personal pronoun (`src/rules/unclear_antecedent.rs`)
✅	`lexicon.low-lexical-diversity`	Sliding-window TTR over non-stopword content tokens (`src/rules/low_lexical_diversity.rs`)
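
The sliding-window type-token ratio behind `lexicon.low-lexical-diversity` can be sketched as follows. It assumes stopwords have already been removed, uses a naive recount per window, and applies the strict `<` comparison that the F109 fixtures pin down. The function names are hypothetical, and the 100-token window in the usage note is inferred from those fixtures.

```rust
use std::collections::HashSet;

/// Lowest unique/total ratio over all full windows.
/// Returns None when the input is shorter than one window.
pub fn min_window_ratio(tokens: &[&str], window: usize) -> Option<f64> {
    if window == 0 || tokens.len() < window {
        return None;
    }
    tokens
        .windows(window)
        .map(|w| w.iter().collect::<HashSet<_>>().len() as f64 / window as f64)
        .fold(None, |best: Option<f64>, ratio| {
            Some(best.map_or(ratio, |b| b.min(ratio)))
        })
}

/// The rule fires only when the ratio is strictly below `min_ratio`.
pub fn is_low_diversity(tokens: &[&str], window: usize, min_ratio: f64) -> bool {
    matches!(min_window_ratio(tokens, window), Some(r) if r < min_ratio)
}
```

With 49 distinct tokens followed by 51 repetitions of one token and a 100-token window, the single full window has 50 unique tokens, so the ratio is exactly 0.50 and a threshold of 0.50 does not trigger.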

Cross-cutting features

Status	Feature	Notes
✅	Minimal inline-disable	`<!-- lucid-lint disable-next-line <rule-id> -->` for Markdown inputs, single rule id, optional reason. See RULES.md → Suppressing diagnostics. Block form, config ignores, file-level scope and required `reason=` are tracked as F18–F-suppression-disable-file below.
✅	Accessibility page in the docs	`docs/src/accessibility.md` covers the WCAG 2.2 AAA bar, the reading-preferences control, typography credits (Atkinson Hyperlegible Next — Braille Institute; OpenDyslexic — Abelardo Gonzalez; Literata — TypeTogether), keyboard shortcuts, and how the site dogfoods the project’s mission. Linked from the sidebar and the footer.
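
To make the minimal inline-disable directive concrete, here is a sketch that recognises the `disable-next-line` comment and checks whether it covers a given diagnostic. It assumes the directive must sit on the line directly above the diagnostic and ignores any trailing reason; both function names are hypothetical.

```rust
/// Extract the rule id from `<!-- lucid-lint disable-next-line <rule-id> -->`.
/// Anything after the rule id (such as a reason) is ignored in this sketch.
pub fn parse_disable_next_line(line: &str) -> Option<&str> {
    let inner = line
        .trim()
        .strip_prefix("<!--")?
        .strip_suffix("-->")?
        .trim();
    let rest = inner.strip_prefix("lucid-lint")?;
    let mut parts = rest.split_whitespace();
    if parts.next()? != "disable-next-line" {
        return None;
    }
    parts.next()
}

/// `diag_line` is 1-based. Assumption: only the line immediately above is consulted.
pub fn is_suppressed(lines: &[&str], diag_line: usize, rule_id: &str) -> bool {
    diag_line >= 2
        && lines
            .get(diag_line - 2)
            .and_then(|l| parse_disable_next_line(l))
            == Some(rule_id)
}
```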

Design decisions from v0.1 session

Diagnostic structure

Decided: v0.1 diagnostics carry only what cannot be trivially recomputed.

pub struct Diagnostic {
    pub rule_id: String,
    pub severity: Severity,
    pub location: Location,
    pub section: Option<String>,  // H2 (or configured level) containing the diagnostic
    pub message: String,
}

Kept: `section` is stored at emission. Recomputing it after the fact would require re-parsing the Markdown to walk headings and match locations, which is expensive. Storing it is cheap.

Omitted: `category` is a pure function of `rule_id`. A `category_of(rule_id) -> Category` utility derives it in O(1), so diagnostics carry no duplicated data.

Omitted: `weight` and `suggestion` are not used in v0.1 and will be introduced when the hybrid scoring model (F14) lands.

This aligns with the “open to change, not abstracted for change” principle applied earlier to format handling: struct fields can be added later without breaking JSON serialization compatibility.
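
A `category_of` utility along these lines would be enough to derive the category on demand. This is a sketch under assumptions: the five variants are inferred from the rule-id prefixes used in the rule tables, and it returns `Option<Category>` so that unknown ids are handled explicitly, whereas the signature quoted above returns `Category` directly.

```rust
/// Categories inferred from rule-id prefixes (assumption, not the crate's actual enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Structure,
    Rhythm,
    Lexicon,
    Syntax,
    Readability,
}

/// Derive the category from the part of the rule id before the first dot.
pub fn category_of(rule_id: &str) -> Option<Category> {
    let (prefix, _) = rule_id.split_once('.')?;
    match prefix {
        "structure" => Some(Category::Structure),
        "rhythm" => Some(Category::Rhythm),
        "lexicon" => Some(Category::Lexicon),
        "syntax" => Some(Category::Syntax),
        "readability" => Some(Category::Readability),
        _ => None,
    }
}
```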

Points deferred from v0.1 session

A number of configuration and ergonomics questions were raised but postponed. They will be addressed before or during v0.2:

Configuration

Config file format decision: TOML (recommended), YAML, or JSON
Config filename convention
Profile name finalization (dev-doc, public, falc confirmed)
Naming convention for rules (kebab-case confirmed, flat vs. hierarchical namespace)
Rule codes (short codes like LL001 vs. name-only)
Suppression mechanism (# lucid-lint disable-next, block disable/enable, ignore file)

Output

TTY format (colors, snippets, condensed report)
Structured format: JSON schema, SARIF exactness, native format
Exit code granularity (0/1 vs. graduated)

Architecture

Language detection: simple heuristic (stop-words) vs. dedicated crate (whatlang)
Parallelism: rayon for multi-file processing
Glob patterns and .lucidignore (glob patterns shipped as F78; .lucidignore is tracked as F78b)
Core library exposed as lucid-lint-core for third-party integration

Project

Repo structure: single crate vs. Cargo workspace
Reference corpus for testing
README v0.1 content and positioning
Tagline and visual identity

Contribution invitation

Future rules and plugins can be proposed by the community. The default jargon lists and stoplists (lexicon.jargon-undefined, lexicon.weasel-words, lexicon.low-lexical-diversity) are especially welcome targets for community pull requests that expand coverage across domains and languages.

Accessibility

lucid-lint is a cognitive-accessibility tool. The docs site you are reading is its first proof of concept. If the site itself is not comfortable to read for the audiences the project claims to serve, the pitch does not hold.

This page lists the bar, the controls, and the credits.

The bar — WCAG 2.2 Level AAA

WCAG (Web Content Accessibility Guidelines) is the international standard for web accessibility. It defines three conformance levels; AAA (the strictest) is the ceiling, not the floor.

The stated bar for this site is WCAG 2.2 Level AAA. In practice:

Normal body text clears a contrast ratio of 7:1 against its background.
Large text and UI chrome clear 4.5:1.
Interactive targets are at least 44 × 44 px.
A skip-to-content link is the first focusable element on every page.
The focus ring is visible and does not rely on colour alone.
Motion respects prefers-reduced-motion: reduce absolutely — no decorative animation, no parallax, no auto-playing content.
Keyboard navigation reaches every interactive surface in a logical order.

Both themes (Lucid light and Lucid dark) clear AAA for body text (14:1 and above) and inline links (7.4:1 and above).
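
The contrast figures above follow the WCAG 2.x definition: compute the relative luminance of each colour, then divide the lighter by the darker after adding 0.05 to both. A minimal sketch for checking a colour pair against the 7:1 body-text bar (the function names are hypothetical; the formula and constants are the ones WCAG 2.x publishes):

```rust
/// Linearise one 8-bit sRGB channel as defined by WCAG 2.x.
fn channel(c: u8) -> f64 {
    let c = f64::from(c) / 255.0;
    if c <= 0.03928 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

/// Relative luminance of an sRGB colour, from 0.0 (black) to 1.0 (white).
pub fn relative_luminance(rgb: (u8, u8, u8)) -> f64 {
    0.2126 * channel(rgb.0) + 0.7152 * channel(rgb.1) + 0.0722 * channel(rgb.2)
}

/// Contrast ratio between two colours, from 1.0 to 21.0. Argument order does not matter.
pub fn contrast_ratio(a: (u8, u8, u8), b: (u8, u8, u8)) -> f64 {
    let (la, lb) = (relative_luminance(a), relative_luminance(b));
    let (lighter, darker) = if la >= lb { (la, lb) } else { (lb, la) };
    (lighter + 0.05) / (darker + 0.05)
}

/// True when the pair meets the 7:1 bar for normal body text.
pub fn clears_aaa_body(a: (u8, u8, u8), b: (u8, u8, u8)) -> bool {
    contrast_ratio(a, b) >= 7.0
}
```

Black on white yields the maximum ratio of 21:1, while a mid grey such as `#777777` on white falls well short of 7:1.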

Where AAA is impractical — for example contrast on a third-party embed — the exception is documented in .impeccable.md.

Known limitations

The first audit pass (2026-04-22) scored 17 / 20 against the AAA bar: 0 blockers, 2 P1 items, 3 P2, 2 P3. Each open item below has a roadmap ticket; fixes land in subsequent v0.2.x slices.

Skip link and language switch are JS-rendered. Both the “Skip to main content” link and the EN / FR header switch are injected by lucid-navigation.js at end-of-body. Users with JS disabled, or readers on the pre-paint frame, do not see them. WCAG 2.4.1 (Bypass Blocks) asks for the skip link without JS. A theme/index.hbs override that server-renders both is tracked as F35a.
The French page renders under <html lang="en"> at build time because mdBook supports a single book-wide language. A small script corrects lang="fr" on load; screen readers that respect dynamic changes pick it up. Proper per-locale builds land with the full French mirror in F25.
mdBook’s built-in theme picker still lists Light / Rust / Coal / Navy / Ayu as menu items. Each resolves to either Lucid light or Lucid dark via the theme CSS, but the picker labels themselves are an mdBook concern. Brand-owned labels are tracked as F26 in the roadmap.

Reading preferences

A small set of controls tunes the site to your own reading profile. Selections persist across visits via localStorage.

Font

Three choices, picked from the Introduction page demonstrator or from the reading-preferences popover (on the way — see the roadmap).

Option	When it helps
Atkinson Hyperlegible Next (default)	A humanist sans built by the Braille Institute for maximum character differentiation. Reads well for most readers and especially for readers with low vision or reading-speed fatigue. Every surface on the site uses it by default.
Standard	The same Atkinson for body prose, paired with Literata serif for headings — a traditional bookish pairing for readers who prefer serif display contrast.
OpenDyslexic	A typeface whose letters are weighted at the bottom to reduce swapping and rotating. Preferred by some dyslexic readers; not universally helpful.

Line spacing

Adjustable from 1.4 to 2.0 in 0.05 steps. The default is 1.7 — the research range for low-fatigue reading sits between 1.6 and 1.8.

Keyboard shortcuts

The site inherits mdBook’s keyboard map:

Key	Action
`/` or `s`	Focus the search box
`←`	Previous chapter
`→`	Next chapter
`Escape`	Close the search or theme popover
`Tab`	Follow the focus order. The first focusable element is always the Skip to main content link.

Typography credits

Every font on the site is self-hosted under docs/src/_fonts/.

All four ship under the SIL Open Font License 1.1, issued by the Summer Institute of Linguistics.

Atkinson Hyperlegible Next — Braille Institute of America. Commissioned for low-vision readers; designed to maximise the differentiation between characters that commonly get confused (rn vs m, I vs l vs 1).
Literata — TypeTogether, commissioned by Google for Google Play Books. A contemporary serif with generous x-height tuned for long-form reading.
Commit Mono — Eigil Nikolajsen. A monospaced face designed for code reading, with distinctive digits and unambiguous punctuation.
OpenDyslexic — Abelardo Gonzalez. A freely licensed typeface for readers who find weighted-bottom letterforms easier to track.

Dogfooding

The prose on this site is linted by lucid-lint itself at the public profile, via just dogfood. A page cannot regress below the bar the tool sets for its users without the build failing.

Reporting an accessibility issue

If something on this site is harder to use than it should be, open an issue on GitHub with the accessibility label. Reports are triaged against the v0.2 milestone unless they block a release. If an email route suits you better, write to the maintainer listed in CONTRIBUTING.md.

Audit cadence: a full AAA sweep runs at least once per minor release (v0.1, v0.2, …). The last pass was 2026-04-22. Findings and their status live in the roadmap under the F35 family.

References

Academic, normative, and practical sources that inform the design of lucid-lint.

This page lists the references that shaped lucid-lint’s rules, profiles, and design decisions. Each entry states where the reference matters in the project. The French mirror lives at fr/references.md.

External links open in a new tab; we mark them rel="nofollow noopener noreferrer" so the new tab is safe and the docs site does not vouch for outside content.

Legend

Status	Meaning
✅	Verified — canonical reference
⚠️	To verify — likely correct, confirm citation details
🔍	Opportunistic — sound rationale, citation may be looser
📖	Book / secondary source
🌐	Normative standard
🧪	Practical source (style guide, tool)

Cognitive Load Theory — the backbone

The theoretical core of lucid-lint: prose imposes a mental cost on the reader, and this cost can be measured and reduced.

✅ Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285. ↗

Foundational paper. Distinguishes intrinsic, extraneous, and germane load. Justifies the core premise that poor prose imposes extraneous load that can be reduced through better structure.

→ Relevant to: most rules, especially structure.*, rhythm.*, syntax.nested-negation, syntax.conditional-stacking.

📖 Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive Load Theory. Springer. ↗

Modern synthesis of 30 years of research.

Text cohesion and discourse processing

✅ Graesser, A. C., McNamara, D. S., Louwerse, M. M., & Cai, Z. (2004). Coh-Metrix: Analysis of text on cohesion and language. Behavior Research Methods, Instruments, & Computers, 36(2), 193–202. ↗

The reference paper for automated cohesion analysis. Over 200 linguistic indices measuring local and global cohesion. Our rules are simplified, deterministic versions of several Coh-Metrix metrics.

→ Relevant to: rhythm.repetitive-connectors, syntax.unclear-antecedent, lexicon.low-lexical-diversity.

📖 McNamara, D. S., Graesser, A. C., McCarthy, P. M., & Cai, Z. (2014). Automated evaluation of text and discourse with Coh-Metrix. Cambridge University Press. ↗

Syntactic complexity

✅ Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68(1), 1–76. ↗

Foundational paper for what later became Dependency Locality Theory. Formalizes the cost of holding incomplete syntactic dependencies in working memory and of integrating distant ones.

→ Relevant to: structure.deep-subordination, syntax.unclear-antecedent, syntax.conditional-stacking.

Discourse connectors

✅ Sanders, T. J. M., & Noordman, L. G. M. (2000). The role of coherence relations and their linguistic markers in text processing. Discourse Processes, 29(1), 37–60. ↗

Central reference on how logical connectors guide or confuse readers.

→ Relevant to: rhythm.repetitive-connectors.

Readability formulas

✅ Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221–233. ↗

Original paper for the Flesch Reading Ease formula.

✅ Kincaid, J. P., Fishburne, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas for Navy enlisted personnel. Technical Report, Naval Technical Training Command. ↗

Origin of the Flesch-Kincaid Grade Level formula used in v0.1.

→ Relevant to: readability.score.

📖 McLaughlin, G. H. (1969). SMOG grading: A new readability formula. Journal of Reading, 12(8), 639–646. ↗

Alternative readability formula. Candidate for v0.2.
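For orientation, the two Flesch-family formulas cited above can be written out directly. This is a minimal sketch using the published coefficients; the function names and the sample counts are illustrative, and it does not show how lucid-lint itself counts words, sentences, or syllables.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch (1948) Reading Ease: a higher value means easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Kincaid et al. (1975) grade level: an approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage: 100 words, 5 sentences, 150 syllables.
ease = flesch_reading_ease(100, 5, 150)
grade = flesch_kincaid_grade(100, 5, 150)
print(f"reading ease ~ {ease:.1f}, grade level ~ {grade:.1f}")
```

Both formulas use only average sentence length and average syllables per word, which is why the introduction calls them surface readability scores.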

Lexical diversity

📖 Herdan, G. (1960). Type-Token Mathematics: A Textbook of Mathematical Linguistics.

Origin of the Type-Token Ratio used in lexical diversity analysis.

→ Relevant to: lexicon.low-lexical-diversity.

✅ McCarthy, P. M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–392. ↗
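The Type-Token Ratio mentioned above is simply distinct words divided by total words. The sketch below assumes a naive letter-run tokenizer and a hypothetical function name; it is not lucid-lint's implementation of lexicon.low-lexical-diversity.

```python
import re


def type_token_ratio(text: str) -> float:
    """Distinct word forms divided by total word forms (0.0 for empty text)."""
    # Assumption: a token is any run of letters, lower-cased.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


varied = type_token_ratio("The cache failed under sustained load")
repetitive = type_token_ratio("the cache the cache the cache")
print(varied, repetitive)
```

Plain TTR falls as a text gets longer, which is the motivation for length-robust measures such as the MTLD studied by McCarthy & Jarvis.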

Negation processing

✅ Clark, H. H., & Chase, W. G. (1972). On the process of comparing sentences against pictures. Cognitive Psychology, 3(3), 472–517. ↗

Classic experimental work showing that negative sentences take longer to verify than affirmative ones. Foundational evidence that negation carries a comprehension cost.

→ Relevant to: syntax.nested-negation.

✅ Carpenter, P. A., & Just, M. A. (1975). Sentence comprehension: A psycholinguistic processing model of verification. Psychological Review, 82(1), 45–73. ↗

Extends Clark & Chase with a formal model of sentence processing. Stacked negations compound the verification cost.

→ Relevant to: syntax.nested-negation.

🔍 Kaup, B., Lüdtke, J., & Zwaan, R. A. (2006). Processing negated sentences with contradictory predicates: Is a door that is not open mentally closed? Journal of Pragmatics, 38(7), 1033–1050. ↗

Modern reference on negation processing. Useful if you want to go deeper.

Conditional reasoning

🔍 Johnson-Laird, P. N., & Byrne, R. M. J. (1991). Deduction. Psychology Press. ↗

Mental models theory of conditional reasoning. Stacked conditionals multiply the number of mental models the reader must maintain.

→ Relevant to: syntax.conditional-stacking.

🔍 Evans, J. St. B. T., & Over, D. E. (2004). If. Oxford University Press. ↗

Comprehensive review of the psychology of conditionals. More accessible than Johnson-Laird for non-specialists.

🔍 Caveat: the link between chained conditionals and reader cognitive load is intuitive and well-supported by the broader reasoning literature, but the specific rule “more than N conditionals per sentence is harmful” is a practitioner heuristic, not a directly tested threshold. Treat the threshold as configurable and empirically calibrated.

Typography and visual processing

🔍 Arditi, A., & Cho, J. (2007). Letter case and text legibility in normal and low vision. Vision Research, 47(19), 2499–2505. ↗

Empirical study of how letter case affects legibility and reading speed. Its results are mixed: at small print sizes, upper-case text was read at least as well as mixed case, so it supports the word-shape argument behind this rule only loosely.

→ Relevant to: lexicon.all-caps-shouting.

🧪 Nielsen, J. (Nielsen Norman Group). Multiple articles on all-caps readability in user interfaces.

Industry-standard reference on why ALL-CAPS text reduces reading speed.

→ Relevant to: lexicon.all-caps-shouting.

📖 Bringhurst, R. (2013). The Elements of Typographic Style (4th ed.). Hartley & Marks.

Canonical reference on typography. Supports the principle that uniform-height text (all-caps) slows reading compared to mixed-case.

✅ Legge, G. E., & Bigelow, C. A. (2011). Does print size matter for reading? A review of findings from vision science and typography. Journal of Vision, 11(5). ↗

Review of vision-science evidence on reading. Covers line-length effects among other factors.

→ Relevant to: structure.line-length-wide.

Phonological complexity and reading

🔍 Seidenberg, M. S., Waters, G. S., Barnes, M. A., & Tanenhaus, M. K. (1984). When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning and Verbal Behavior, 23(3), 383–404. ↗

Classic work showing that unusual letter patterns slow word recognition.

🔍 Treiman, R., Kessler, B., Zevin, J. D., Bick, S., & Davis, M. (2006). Influence of consonantal context on the reading of vowels: Evidence from children. Journal of Experimental Child Psychology, 93(1), 1–24. ↗

Work showing that consonant clusters and their context affect reading accuracy and speed.

🔍 Caveat: the lexicon.consonant-cluster rule is grounded in the broader literature on word-form complexity, but a specific validated threshold like “4+ consonants in a row is harmful” does not come from a single canonical paper. The rule is a practitioner heuristic informed by the literature, not a direct transposition of a published metric.
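To make the heuristic concrete, here is one way a "consonants in a row" check could be written. It is a sketch under stated assumptions: the letter "y" is treated as a vowel, only unaccented Latin letters are considered, the threshold of 4 is the illustrative figure from the caveat above, and the function names are hypothetical rather than lucid-lint's own.

```python
import re

# Assumption: every Latin letter except a, e, i, o, u, y counts as a consonant.
CONSONANT_RUN = re.compile(r"[bcdfghjklmnpqrstvwxz]+", re.IGNORECASE)


def longest_consonant_run(word: str) -> int:
    """Length of the longest unbroken run of consonant letters in `word`."""
    runs = CONSONANT_RUN.findall(word)
    return max((len(run) for run in runs), default=0)


def has_consonant_cluster(word: str, min_run: int = 4) -> bool:
    """True when the word contains at least `min_run` consonants in a row."""
    return longest_consonant_run(word) >= min_run


for word in ("strengths", "reading", "catchphrase"):
    print(word, longest_consonant_run(word), has_consonant_cluster(word))
```

Keeping min_run as a parameter reflects the point of the caveat: the threshold is a tunable heuristic, not a published constant.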

Intensifiers and hedges

🔍 Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A Comprehensive Grammar of the English Language. Longman.

Classical grammar reference classifying intensifiers as “amplifiers” whose semantic contribution is often marginal. Justifies flagging them as low-value words.

→ Relevant to: lexicon.redundant-intensifier.

🧪 Zinsser, W. (2006). On Writing Well (30th anniversary ed.). HarperCollins.

Practical guide that famously argues against adverb intensifiers (“very”, “really”, “quite”) as clutter. Not academic, but widely cited in writing pedagogy.

Style guides and plain language

📖🧪 Strunk, W., & White, E. B. (1999). The Elements of Style (4th ed.). Longman.

The canonical English writing guide. Codifies active voice, concision, clear pronouns, and warns against qualifiers like weasel words and intensifiers.

→ Relevant to: syntax.passive-voice, lexicon.weasel-words, lexicon.redundant-intensifier, syntax.unclear-antecedent.

🧪 US Plain Language Action and Information Network (2011). Federal Plain Language Guidelines. ↗

Grounds our guidance on short sentences, active voice, and avoiding nominalization and jargon.

→ Relevant to: structure.sentence-too-long, structure.paragraph-too-long, lexicon.excessive-nominalization, lexicon.jargon-undefined, syntax.passive-voice.

🧪 European Commission (2011). How to write clearly. Publications Office of the European Union. ↗

The European plain-language equivalent, available in all official EU languages.

Numeric formatting conventions

🌐 International Organization for Standardization (2022). ISO 80000-1:2022 — Quantities and units — Part 1: General. ↗

International standard on numeric formatting, including digit grouping and decimal separators. Grounds the idea that mixing formats within a single text impairs scanning.

→ Relevant to: structure.mixed-numeric-format.

🧪 The Chicago Manual of Style (17th ed., 2017). University of Chicago Press. ↗

Canonical style guide covering when to spell numbers out vs. use digits, and why consistency matters.

→ Relevant to: structure.mixed-numeric-format.

Working memory and attention

⚠️ Martinussen, R., Hayden, J., Hogg-Johnson, S., & Tannock, R. (2005). A meta-analysis of working memory impairments in children with attention-deficit/hyperactivity disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 44(4), 377–384. ↗

⚠️ Caveat: direct research on “text readability for ADHD readers” is dispersed and of variable quality. The cognitive accessibility angle is sound, but treat specific ADHD claims carefully.

📖 Barkley, R. A. (2012). Executive Functions: What They Are, How They Work, and Why They Evolved. The Guilford Press. ↗

Dyslexia and visual accessibility

✅ Rello, L., & Baeza-Yates, R. (2013). Good fonts for dyslexia. Proceedings of ASSETS ’13. ↗

Empirical research on font choice impact for dyslexic readers.

Concreteness norms

✅ Brysbaert, M., Warriner, A. B., & Kuperman, V. (2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3), 904–911. ↗

→ Relevant to: possible future rule “abstractness density” (not in v0.1).

Normative standards

🌐 W3C (2018). Web Content Accessibility Guidelines (WCAG) 2.1. ↗

Key criteria invoked:

1.3.1 (Info and Relationships) → structure.heading-jump
1.4.8 (Visual Presentation) — line width ≤ 80 characters → structure.line-length-wide
2.4.6 (Headings and Labels) → structure.heading-jump
3.1.3 (Unusual Words) → lexicon.jargon-undefined
3.1.4 (Abbreviations) → lexicon.unexplained-abbreviation
3.1.5 (Reading Level) → readability.score

⚠️ Verify exact criterion numbers against the WCAG version you want to cite (2.1 or 2.2).

🌐 Accessibility Standards Canada (2025). CAN-ASC-3.1:2025 — Plain Language (first edition). ↗

First-edition Canadian national standard on plain language, published bilingually by Accessibility Standards Canada under the Accessible Canada Act. Prescriptive (shall / should / may) requirements over five areas: audience identification, evaluation methods, structure, wording, design. Grounds many of our lexicon.*, structure.*, and readability.score defaults independently of the US / EU plain-language canons.

→ Relevant to: lexicon.jargon-undefined, lexicon.unexplained-abbreviation, lexicon.weasel-words, structure.sentence-too-long, structure.paragraph-too-long, syntax.passive-voice, readability.score.

European legal context

🌐 Directive (EU) 2019/882 of the European Parliament and of the Council of 17 April 2019 — European Accessibility Act (EAA). ↗

Legal framework extending accessibility requirements to private-sector services from 28 June 2025.

Practical tools that shaped our design

🧪 Coh-Metrix (Graesser & McNamara) — ↗
🧪 Vale (Chris Ward) — ↗
🧪 textlint — ↗
🧪 Hemingway Editor — ↗
🧪 Proselint — ↗

Rule → reference summary

Lexicon

Rule	Primary references
`lexicon.all-caps-shouting`	Arditi & Cho (2007); Nielsen Norman Group; Bringhurst (2013)
`lexicon.consonant-cluster`	Seidenberg et al. (1984); Treiman et al. (2006) — 🔍 practitioner heuristic
`lexicon.excessive-nominalization`	Plain Language US; FALC; CAN-ASC-3.1:2025
`lexicon.jargon-undefined`	WCAG 3.1.3; Plain Language US; FALC; CAN-ASC-3.1:2025
`lexicon.low-lexical-diversity`	Herdan (1960); McCarthy & Jarvis (2010); Graesser et al. (2004)
`lexicon.redundant-intensifier`	Strunk & White; Quirk et al. (1985); Zinsser (2006)
`lexicon.unexplained-abbreviation`	WCAG 3.1.4; RGAA 9.4; CAN-ASC-3.1:2025
`lexicon.weasel-words`	Strunk & White; Wikipedia style guide; CAN-ASC-3.1:2025

Readability

Rule	Primary references
`readability.score`	Flesch (1948); Kincaid et al. (1975); Henry (1975); Kandel & Moles (1958); CAN-ASC-3.1:2025

Rhythm

Rule	Primary references
`rhythm.consecutive-long-sentences`	Sweller (1988); Sweller et al. (2011)
`rhythm.repetitive-connectors`	Sanders & Noordman (2000); Graesser et al. (2004)

Structure

Rule	Primary references
`structure.deep-subordination`	Gibson (1998); FALC
`structure.deeply-nested-lists`	WCAG 2.1; cognitive load heuristics
`structure.excessive-commas`	Gibson (1998) — 🔍 practitioner heuristic
`structure.heading-jump`	WCAG 1.3.1 & 2.4.6; RGAA 9.1
`structure.line-length-wide`	WCAG 1.4.8 (AAA); Legge & Bigelow (2011)
`structure.long-enumeration`	FALC; Plain Language US
`structure.mixed-numeric-format`	ISO 80000-1; Chicago Manual of Style
`structure.paragraph-too-long`	Sweller (1988); Graesser et al. (2004); CAN-ASC-3.1:2025
`structure.sentence-too-long`	Sweller (1988); Plain Language US; FALC; CAN-ASC-3.1:2025

Syntax

Rule	Primary references
`syntax.conditional-stacking`	Johnson-Laird & Byrne (1991); Evans & Over (2004); Gibson (1998) — 🔍 threshold is practitioner heuristic
`syntax.dense-punctuation-burst`	Sweller (1988); Gibson (1998) — 🔍 purely heuristic
`syntax.nested-negation`	Clark & Chase (1972); Carpenter & Just (1975); Kaup et al. (2006)
`syntax.passive-voice`	Strunk & White; Plain Language US; FALC; CAN-ASC-3.1:2025
`syntax.unclear-antecedent`	Strunk & White; Gibson (1998); Graesser et al. (2004)

On scholarly honesty

lucid-lint is an engineering project informed by research, not a research project itself. The references above ground our design choices, but we do not claim to validate new findings. Several rules (lexicon.consonant-cluster, syntax.conditional-stacking, syntax.dense-punctuation-burst, structure.excessive-commas) are practitioner heuristics informed by the literature rather than direct transpositions of published metrics. We mark these with 🔍 in the summary tables.

Where we simplify an academic metric (e.g., syntax.unclear-antecedent as a pattern heuristic vs. full anaphora resolution), we document the simplification in RULES.md and plan richer versions in the roadmap.

If you are a researcher and spot an error, an outdated citation, or a misattribution, please open an issue — we will correct it promptly and credit you.

Contributing

See CONTRIBUTING.md for the full contribution guide.

TL;DR

Open an issue before large changes.
Run just check locally.
Add tests for new behavior.
Follow Conventional Commits.
Be kind. See Code of Conduct.

Especially welcome

Rule proposals with a clear cognitive load rationale
Language word lists for weasel words, connectors, jargon, abbreviations
Corpus contributions (real-world prose samples)
Documentation improvements

Introduction

Built for readers whose attention is stretched: ADHD, dyslexia, fatigue, a second language, or an accessibility-sensitive context.

lucid-lint reads your Markdown or plain text and flags the passages that weigh reading down. It does not rewrite your voice. It hands you a short list, then steps aside.

Before

The caching subsystem, which was introduced in an earlier milestone, turned out to interact poorly with the new request pipeline under sustained load, and the investigation that followed required multiple rounds of profiling.

After

The caching subsystem was introduced earlier. It interacts poorly with the new request pipeline under sustained load. The investigation required several rounds of profiling.

Three ideas, colour-matched left to right. The rewrite shortens the sentences without losing any of them. lucid-lint flagged sentence-too-long (43 words) and consecutive-long-sentences. It did not propose the rewrite; that part is yours.

What makes it different

Most tools measure style (write-good), grammar (Antidote), or a surface readability score (Flesch). lucid-lint measures cognitive load: the mental effort a reader spends to understand a sentence. It flags the patterns singled out by the research of Sweller, Gibson, Graesser, and Coh-Metrix.

Bilingual EN/FR from day one, with equal quality.
Deterministic by default. Identical input produces identical output. LLM-based rules live in optional plugins.
Built for continuous integration. Plain-text and JSON outputs; exit codes that pre-commit and GitHub Actions understand without a wrapper.
Profile-based. Pick dev-doc, public, or falc (Facile À Lire et à Comprendre, the French Easy-to-Read standard), then adjust each rule if needed.

Project status

lucid-lint is at v0.2 (released 2026-04-22). All 25 rules listed in RULES.md are shipped (17 in v0.1, 8 added during the v0.2 cycle), alongside the hybrid scoring model: a global X / max score and five per-category sub-scores, computed on top of the diagnostics. Pre-1.0: breaking changes remain possible between minor versions. The roadmap shows what comes next.

Overview

A file with no diagnostics gets the full score of 100/100 and the logo banner, the high point of a successful run:

Terminal capture: a lucid-lint run with no diagnostics, showing the three-part logo banner, the message "No issues found.", and a 100/100 score block with every category bar full

~~~~~ ⟨ • ⟩ ─────  lucid-lint  v0.2.0
                   cognitive accessibility linter · prose · EN / FR
                   ────────────────────────────────────────────────

No issues found.

────────────────────────────────────────────────────────────
score: 100/100
       structure    █████  20/20
       rhythm       █████  20/20
       lexicon      █████  20/20
       syntax       █████  20/20
       readability  █████  20/20

cargo install lucid-lint

# Check a file
lucid-lint check README.md

# Strictest profile (FALC)
lucid-lint check --profile=falc docs/

# Standard input
echo "This is a test sentence." | lucid-lint check -

# JSON for CI
lucid-lint check --format=json docs/

# Fail the build if the global score drops below 85/100 (v0.2+)
lucid-lint check --min-score=85 docs/

Going further

Installation: how to install it.
Quick start: a five-minute guided tour.
Profiles: choosing the one that fits.
Rules reference: the 25 rules explained.
Accessibility: the WCAG AAA requirement and how the site itself practises what it preaches.

Reading preferences

The whole site is designed as a reading companion. Pick the font that suits you best; it is remembered across pages.

Atkinson Hyperlegible Next

A dense paragraph can ask a lot of a stretched mind. Every comma, every clause, every parenthesis adds its cost. Good prose keeps that cost low.

Line spacing and text size will arrive soon as sliders. Until then, pick a font; browser zoom is respected.

English version

Return to the English version

License

Dual-licensed under MIT or Apache-2.0, at your option.

Installation

lucid-lint offers four installation routes. Pick the one that matches your environment.

One-line installer (Linux, macOS, WSL)

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.sh | sh

The script is generated by cargo-dist for every published release. It detects your platform, downloads the matching pre-built binary from the GitHub release, and places it on $PATH (by default $CARGO_HOME/bin if set, otherwise ~/.cargo/bin).

Audit before running

curl … | sh is fast but opaque. To read the script before running it:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.sh -o install.sh
less install.sh
sh install.sh

The script is short: fewer than 200 lines of POSIX shell, so a quick read is realistic. It pins the version it was generated for. It checks the expected size of the downloaded archive. It exits with an error if any value differs.

Pin a specific version

latest points to the most recent release. To pin a known, stable version:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/bastien-gallay/lucid-lint/releases/download/v0.2.2/lucid-lint-installer.sh | sh

One-line installer (Windows PowerShell)

powershell -ExecutionPolicy Bypass -c "irm https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.ps1 | iex"

Same cargo-dist mechanism, PowerShell flavour. The binary lands in %CARGO_HOME%\bin if CARGO_HOME is set, otherwise in %USERPROFILE%\.cargo\bin.

To audit before running, save the script and inspect it:

irm https://github.com/bastien-gallay/lucid-lint/releases/latest/download/lucid-lint-installer.ps1 -OutFile install.ps1
notepad install.ps1
.\install.ps1

Via Cargo

cargo install lucid-lint

This route compiles from the sources published on crates.io. It places the binary in your Cargo bin directory (by default ~/.cargo/bin/). It is slower than the pre-built installer, but useful when the pre-built targets do not cover your platform.

From source

git clone https://github.com/bastien-gallay/lucid-lint
cd lucid-lint
cargo install --path .

Pre-built binaries

Each release publishes pre-built binaries for:

Linux (x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl)
macOS (aarch64-apple-darwin, x86_64-apple-darwin)
Windows (x86_64-pc-windows-msvc)

The shell and PowerShell installers above pick the right archive automatically. To install by hand, download from the GitHub releases page and place the extracted binary on $PATH.

Verify the installation

lucid-lint --version

System requirements

Rust 1.75 or newer (needed only when compiling from source or via cargo install).
No runtime dependencies.

Quick start

This page walks through checking your first document.

Check a single file

lucid-lint check README.md

Output:

warning <path>/README.md:14:1 Sentence is 27 words long (maximum 22). Consider splitting it into shorter sentences. [structure.sentence-too-long]

summary: 1 warnings.
→ run 'lucid-lint explain <rule-id>' — seen here: structure.sentence-too-long
────────────────────────────────────────────────────────────
score: 88/100
       structure    ██▏░░  8/20
       rhythm       █████  20/20
       lexicon      █████  20/20
       syntax       █████  20/20
       readability  █████  20/20

The final block is the score summary. It shows a global X / 100 score, then the per-category breakdown.
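The numbers in the sample output above are consistent with the global score being the sum of the five category sub-scores. The sketch below works on that assumption; the dictionary layout and function name are hypothetical, and how each sub-score is derived from diagnostics is not shown.

```python
# Sub-scores copied from the sample output above.
SUB_SCORES = {
    "structure": 8,
    "rhythm": 20,
    "lexicon": 20,
    "syntax": 20,
    "readability": 20,
}
CATEGORY_MAX = 20  # each category is scored out of twenty points


def global_score(sub_scores: dict, category_max: int = CATEGORY_MAX) -> tuple:
    """Return (score, maximum), assuming the global score is a plain sum."""
    total = sum(sub_scores.values())
    maximum = category_max * len(sub_scores)
    return total, maximum


score, maximum = global_score(SUB_SCORES)
print(f"score: {score}/{maximum}")
```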

Check several files

lucid-lint check docs/*.md CHANGELOG.md

Check a directory

lucid-lint check docs/

Every file with a .md, .markdown, or .txt extension is processed.

Use standard input

echo "This is a test sentence." | lucid-lint check -

Pipe from Pandoc

For formats that lucid-lint cannot yet read natively:

pandoc report.docx -t markdown | lucid-lint check -

Choose a profile

# Strictest: Facile À Lire et à Comprendre (Easy-to-Read)
lucid-lint check --profile=falc docs/

# Most permissive: developer documentation
lucid-lint check --profile=dev-doc docs/

See Profiles for details.

Change the output format

# JSON for continuous integration
lucid-lint check --format=json docs/

See Continuous integration for CI recipes.

Exit codes

Code	Meaning
0	No issues (or only `info`) and score at or above `--min-score` (if set)
1	Warnings found or score below `--min-score`
2	Runtime error (invalid arguments, unreadable file)

The two gates (warnings and `--min-score`) combine. See Continuous integration for combined recipes.
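The exit-code table above can be read as a small decision function. This is a sketch of that reading, not lucid-lint's source: the function name and parameters are hypothetical, and it assumes a score equal to the threshold passes.

```python
from typing import Optional


def expected_exit_code(
    warnings: int,
    score: int,
    min_score: Optional[int] = None,
    runtime_error: bool = False,
) -> int:
    """Map a run's outcome to the exit code described in the table."""
    if runtime_error:
        return 2  # invalid arguments, unreadable file
    if warnings > 0:
        return 1  # warnings gate
    if min_score is not None and score < min_score:
        return 1  # score gate
    return 0


print(expected_exit_code(warnings=0, score=90, min_score=85))
print(expected_exit_code(warnings=1, score=88))
print(expected_exit_code(warnings=0, score=80, min_score=85))
```

Either gate alone is enough to fail the run, which is what lets a CI job combine a warnings check with a minimum score.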

Profiles

A profile is a pre-configured set of rule thresholds, tuned for a specific audience.

Available profiles

`dev-doc`

For technical documentation, API references, ADRs, and developer-facing content.

Thresholds are permissive. Technical readers tolerate long sentences, nominalizations, and domain jargon better.

`public` (default)

For general-audience content: marketing pages, product descriptions, blog posts.

Thresholds are moderate. Plain-language principles apply.

`falc`

For content that follows the Facile À Lire et à Comprendre / European Easy-to-Read standard.

Thresholds are strict: short sentences, simple vocabulary, no passive voice, no undefined acronyms.

Choose a profile

Start with the profile that matches the intent of the content. Override individual rules as needed in lucid-lint.toml.

Threshold comparison

See the rules reference for the exact thresholds per rule and per profile.

The general pattern:

dev-doc: 30 words per sentence, 4 commas, 7 sentences per paragraph
public: 22 words per sentence, 3 commas, 5 sentences per paragraph
falc: 15 words per sentence, 2 commas, 3 sentences per paragraph

The same file checked three times under dev-doc, public, then falc. The score drops as the profile tightens:

Terminal capture: three successive lucid-lint runs on examples/sample.md under the dev-doc, public, and falc profiles. The dev-doc pass reports a handful of diagnostics and a middling score; public tightens and more issues appear; falc flags the most and the score drops further
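The way the three presets tighten can be illustrated with the sentence-length threshold alone. In this sketch the threshold values come from the list above, while the key name max_sentences, the whitespace word count, and the function name are assumptions made for the example.

```python
PROFILE_THRESHOLDS = {
    "dev-doc": {"max_words": 30, "max_commas": 4, "max_sentences": 7},
    "public": {"max_words": 22, "max_commas": 3, "max_sentences": 5},
    "falc": {"max_words": 15, "max_commas": 2, "max_sentences": 3},
}


def profiles_flagging(sentence: str) -> list:
    """Names of the profiles whose word limit the sentence exceeds."""
    # Assumption: words are whitespace-separated tokens.
    word_count = len(sentence.split())
    return [
        name
        for name, limits in PROFILE_THRESHOLDS.items()
        if word_count > limits["max_words"]
    ]


sentence = " ".join(["word"] * 27)  # a 27-word placeholder sentence
print(profiles_flagging(sentence))
```

A 27-word sentence passes under dev-doc but is flagged under public and falc, which matches the "27 words long (maximum 22)" diagnostic in the quick start.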

Override a profile

Any per-rule threshold set in lucid-lint.toml takes precedence over the profile preset.

[default]
profile = "public"

[rules.sentence-too-long]
max_words = 18   # stricter than public's 22

Conditions

A condition tag describes the cognitive condition a rule primarily targets. Conditions are orthogonal to profiles: a profile (dev-doc, public, falc) sets the severity of the always-on rules; conditions add targeted rules for a specific audience.

The fixed ontology

Tag	Target
`general`	Always-on rules. The v0.2 baseline.
`a11y-markup`	Markup signals close to prose (for example, all-caps shouting).
`dyslexia`	Signals targeting dyslexia. Source: BDA Dyslexia Style Guide.
`dyscalculia`	Number formatting and anchor points. Source: CDC Clear Communication Index.
`aphasia`	Signals targeting aphasia. Source: FALC, plain-language guides.
`adhd`	Signals related to fragile attention.
`non-native`	Signals for non-native readers (rare words, figurative expressions).

The set is fixed. Adding a tag is a deliberate, versioned decision.

How filtering works

For each rule, the engine evaluates:

A rule tagged general is always active.
A rule without general runs only if one of its tags appears in the active-conditions list of whoever runs the tool.

All 17 rules in v0.2 carry general, so the default behaviour does not change. Future tagged rules (for example lexicon.all-caps-shouting for a11y-markup, syntax.nested-negation for aphasia + adhd) are activated through this list.
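The two filtering cases above reduce to a short predicate. This is a sketch only: the function name and the way tags are passed in are hypothetical, and the example tag sets reuse the illustrations given in the text.

```python
def is_rule_active(rule_tags, active_conditions) -> bool:
    """A rule runs if it is tagged general or shares a tag with the active list."""
    if "general" in rule_tags:
        return True
    return bool(set(rule_tags) & set(active_conditions))


active = ["dyslexia", "aphasia"]
print(is_rule_active(["general"], active))          # always-on rule
print(is_rule_active(["aphasia", "adhd"], active))  # shares the aphasia tag
print(is_rule_active(["a11y-markup"], active))      # no tag in common
```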

Configure conditions

In lucid-lint.toml:

[default]
profile = "falc"
conditions = ["dyslexia", "aphasia"]

On the command line (comma-separated, repeatable):

lucid-lint check --profile falc --conditions dyslexia,aphasia docs/

FALC keeps its regulatory meaning. Adding dyslexia neither relaxes nor renames it; the condition layers dyslexia signals on top.

Why tags, not parallel profiles

Three severity levels × N conditions explodes into combinations. Keeping the two axes orthogonal preserves the regulatory meaning of falc while letting audience-specific layers compose. See entries F71 and F72 on the roadmap.

Configuration

lucid-lint is configured through a lucid-lint.toml file at the project root (optional) and through command-line options (which take precedence over the file).

File layout

# lucid-lint.toml

[default]
profile = "public"

[rules.sentence-too-long]
max_words = 22

[rules.passive-voice]
max_per_paragraph = 2

Sections

`[default]`

Default settings applied to the whole run.

Field	Type	Default	Description
`profile`	string	`"public"`	One of `dev-doc`, `public`, `falc`
`conditions`	array of strings	`[]`	Active condition tags. See Conditions.
`exclude`	array of glob patterns	`[]`	Paths skipped during the recursive walk. See Excluding paths.

`[rules.<rule-id>]`

Per-rule configuration. The available fields depend on the rule. See the rule pages in the rules reference.

`[scoring]`

Tunable parameters of the hybrid scoring model. All fields are optional; a missing field falls back to the shipped default (category_max = 20, category_cap = 15).

[scoring]
category_max = 20
category_cap = 15

[scoring.weights]
sentence-too-long = 3
weasel-words      = 2

The [scoring.weights] sub-table is keyed by rule identifier. Unknown identifiers are ignored, so removing a rule in a future release does not break older files.

Order of precedence

From weakest to strongest:

Profile preset (for example public)
lucid-lint.toml overrides
Command-line options

An option not passed on the command line falls back to the TOML value; a missing TOML field falls back to the profile preset.
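The fallback chain above amounts to a layered merge in which each stronger layer overrides only the keys it actually provides. The sketch below assumes flat dictionaries with illustrative key names; the function name is hypothetical and the real configuration is richer than this.

```python
def resolve_settings(profile_preset: dict, toml_overrides: dict, cli_options: dict) -> dict:
    """Merge three layers, weakest first. A value of None means 'not provided'."""
    resolved = dict(profile_preset)
    for layer in (toml_overrides, cli_options):
        for key, value in layer.items():
            if value is not None:
                resolved[key] = value
    return resolved


preset = {"max_words": 22, "max_commas": 3}  # values of the public preset
toml = {"max_words": 18}                     # lucid-lint.toml override
cli = {"max_commas": None}                   # option not passed on the command line
print(resolve_settings(preset, toml, cli))
```

Here max_words comes from the TOML file and max_commas falls all the way back to the preset, because the command line did not supply it.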

Discovery

lucid-lint walks up from the current directory to the first lucid-lint.toml it finds, and stops at the boundary of the nearest .git repository. The --config <path> option skips discovery and loads the given file; a missing explicit path is an error, but a missing auto-discovered file is not.
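The upward walk described above can be sketched as follows. The function name is hypothetical, and the sketch assumes that a directory containing .git marks the repository boundary and that the boundary directory itself is still searched; the actual implementation may differ on those details.

```python
import tempfile
from pathlib import Path
from typing import Optional

CONFIG_NAME = "lucid-lint.toml"


def discover_config(start: Path) -> Optional[Path]:
    """Walk upward from `start`; return the first config found, or None."""
    current = start.resolve()
    for directory in (current, *current.parents):
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate
        # Assumption: a directory containing .git is the repository boundary.
        if (directory / ".git").exists():
            return None
    return None


# Demo on a throwaway tree: the config sits at the repository root.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp).resolve() / "repo"
    (repo / ".git").mkdir(parents=True)
    (repo / "docs" / "guide").mkdir(parents=True)
    (repo / CONFIG_NAME).write_text('[default]\nprofile = "public"\n')
    found_at_root = discover_config(repo / "docs" / "guide") == repo / CONFIG_NAME

    # A config above the repository boundary is not picked up.
    (repo / CONFIG_NAME).unlink()
    (Path(tmp) / CONFIG_NAME).write_text("[default]\n")
    blocked = discover_config(repo / "docs" / "guide") is None

print(found_at_root, blocked)
```

Returning None for a missing auto-discovered file mirrors the rule that only an explicit --config path is an error when absent.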

Excluding paths

Large documentation repositories often contain generated output, vendored text, and snapshots that would drown the linter in noise. Use the exclude field in [default], or the --exclude <GLOB> command-line option, to drop them at discovery time, before analysis.

[default]
exclude = [
    "vendor/**",
    "**/fixtures/**",
    "CHANGELOG.md",
]

The command-line equivalent:

lucid-lint check --exclude 'vendor/**,**/fixtures/**,CHANGELOG.md' docs

Notes:

Matching. Glob patterns apply to the path relative to the walked root. Running lucid-lint check docs with exclude = ["drafts/**"] skips docs/drafts/....
Prune, don't visit. A matching directory is not walked, so large excluded trees cost nothing to traverse.
Explicitly named files still pass. If you pass docs/CHANGELOG.md directly on the command line, it is checked even when CHANGELOG.md is in the exclusion list. If you name it, you want it.
Additive. The --exclude option and the TOML exclude field accumulate; they do not replace each other. Separate multiple patterns with commas within one option, or repeat --exclude.
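Three of the notes above (relative matching, explicit files passing, and additive sources) can be sketched together. The function name and parameters are hypothetical, and Python's fnmatchcase stands in for the real glob engine: its `*` also crosses `/`, so edge cases such as `**/fixtures/**` may behave differently from lucid-lint.

```python
from fnmatch import fnmatchcase


def is_excluded(relative_path, toml_patterns, cli_patterns=(), named_explicitly=False) -> bool:
    """Decide whether a path relative to the walked root is skipped."""
    if named_explicitly:
        return False  # a file named on the command line is always checked
    # TOML and --exclude patterns accumulate; neither replaces the other.
    patterns = [*toml_patterns, *cli_patterns]
    return any(fnmatchcase(relative_path, pattern) for pattern in patterns)


toml_exclude = ["vendor/**", "CHANGELOG.md"]
cli_exclude = ["drafts/**"]
print(is_excluded("vendor/lib/readme.md", toml_exclude, cli_exclude))
print(is_excluded("drafts/todo.md", toml_exclude, cli_exclude))
print(is_excluded("guide/intro.md", toml_exclude, cli_exclude))
print(is_excluded("CHANGELOG.md", toml_exclude, cli_exclude, named_explicitly=True))
```

The sketch filters file paths one by one; the pruning behaviour described above, where a matching directory is never entered, would apply the same test to directories during the walk.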

Silencing rules globally

Markdown documents accept inline disable directives for local silencing, but plain text and standard input have no such escape hatch. [[ignore]] fills the gap, and works the same across all input formats.

[[ignore]]
rule_id = "unexplained-abbreviation"

[[ignore]]
rule_id = "weasel-words"

Each [[ignore]] entry removes every diagnostic whose rule_id matches, across Markdown files, plain text, and standard input. The filter applies after all rules have run but before scoring, so the score reflects the post-filter view.

Notes:

Global scope. The filter is not per-file. Inline directives remain the recommended escape hatch for one-off silencing in Markdown; use [[ignore]] only when a rule is genuinely noisy across the whole project.
Unknown identifiers tolerated. Entries that target rules that no longer exist are dropped silently, so removing a rule in a future release does not break older files.
Future fields. A reason = "..." field on each entry is tracked by F-suppression-reason-field; when it lands, it will be shown in reports and can be made mandatory through configuration.
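The [[ignore]] filter described above is a post-processing step over the diagnostics list. In this sketch the diagnostic dictionaries and the function name are hypothetical; it only shows the filtering order (after the rules, before scoring) and the tolerance for unknown identifiers.

```python
def apply_ignores(diagnostics: list, ignored_rule_ids: list) -> list:
    """Drop every diagnostic whose rule_id appears in the ignore list."""
    ignored = set(ignored_rule_ids)  # unknown ids simply never match
    return [d for d in diagnostics if d["rule_id"] not in ignored]


diagnostics = [
    {"rule_id": "sentence-too-long", "line": 14},
    {"rule_id": "weasel-words", "line": 20},
    {"rule_id": "unexplained-abbreviation", "line": 31},
]
ignore_entries = ["unexplained-abbreviation", "weasel-words", "rule-removed-long-ago"]

remaining = apply_ignores(diagnostics, ignore_entries)
print(remaining)  # only these diagnostics would reach the scoring step
```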

Per-rule overrides

TOML configuration is wired in rule by rule, as each Config gains its dedicated accessor. Three rules honour it today:

`[rules.readability-score]`

[rules.readability-score]
formula = "kandel-moles"  # or "flesch-kincaid", "auto"

Pins the readability formula regardless of the detected language. auto (default) keeps the per-language selection from F-readability-formulas-extra.

`[rules.unexplained-abbreviation]`

[rules.unexplained-abbreviation]
whitelist = ["WCAG", "ARIA", "ADHD", "LLM"]

Entries are additive to the profile baseline (F31). Use this field to reintroduce project-specific acronyms (accessibility standards, domain acronyms, engineering-practice terms) that the v0.2 baseline no longer ships. Each entry silences the acronym throughout the document, as if you had defined it inline as Expansion (ACRONYM).

`[rules."structure.excessive-commas"]`

[rules."structure.excessive-commas"]
max_commas = 2

Overrides the per-sentence comma ceiling (default: 4 / 3 / 2 for dev-doc / public / falc). The value must be a positive integer; 0 or a negative value is rejected at load time. The override replaces the profile preset; it is not additive.

Tables for other rules parse without error but have no effect at run time. Extending this list is a mechanical, per-rule change that will continue through the v0.2.x cycle.

Scoring

v0.2 adds a hybrid scoring model on top of the existing diagnostics. Every run now answers two questions at once:

What exactly is wrong? The list of diagnostics, unchanged since v0.1.
How bad is this document overall? A new global score, plus five per-category sub-scores.

The two surfaces are complementary. Scores are summaries; diagnostics remain the signal to act on.

What the score means

The score takes the form X / max: an arbitrary maximum, not a number normalised to 0–100. v0.2 ships max = 100 (five categories × twenty points), but this number is treated as a test-and-learn calibration: the scale may move in a future minor version as rule weights are tuned against real corpora.

Rules of thumb for the current calibration:

Range	Reading
80 – 100	The score is shown in green in the terminal. Nothing blocking.
60 – 79	The score is shown in yellow. A few findings to review.
0 – 59	The score is shown in red. Dense problems, or one rule running away.

The colour bands are a reading aid; they are not a pass/fail contract. To gate CI, use --min-score with a concrete number you have chosen.

The five categories

Every rule belongs to exactly one category. v0.2 freezes the taxonomy into five buckets:

Category	Covers
`structure`	Length, nesting, punctuation, document skeleton
`rhythm`	Cadence and repetition across neighbouring sentences
`lexicon`	Vocabulary, terminology, acronyms, lexical diversity
`syntax`	Sentence-level style and clarity
`readability`	Document-level readability metrics

See the rules reference for the rule → category mapping.

How a score is computed

For a single document:

cost_per_rule      = Σ (weight × severity_multiplier)        over the hits
cost_per_category  = min(Σ cost_per_rule / (words / 1000),   ← density
                         category_cap)                       ← cap
category_score     = category_max − cost_per_category        (clamped ≥ 0)
global_score       = Σ category_score

Three mechanics stack:

Weighted sum: each hit costs weight × severity_multiplier. The default weight table lives in scoring::default_weight_for; it emphasises the rules whose hits carry the heaviest cognitive load (readability-score = 5, length / subordination / passive / unclear-antecedent = 2, everything else = 1).
Density normalisation: costs are divided by words / 1000, so that a 10,000-word manual is not punished for having more hits than a 400-word README. Documents under 200 words are treated as 200-word documents, so small fixtures are not penalised artificially.
Per-category cap: no category can lose more than category_cap out of category_max. A noisy rule eats at most 75 % of its own category (15 / 20 by default) and does not spill into the others.

The severity multiplier is info = 1, warning = 3, error = 5.
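As a worked example, here is a minimal Python sketch of the formula above. It is an illustration under stated assumptions, not the tool's code: the function name `score`, the `(category, weight, severity)` tuple layout, and the 150-word document length are all hypothetical. The hit list mirrors the five diagnostics of the examples/sample.md run shown in the TTY section, using the default weights quoted above. Whether the real implementation rounds intermediate values is not documented here, so the sketch does not round.

```python
SEVERITY_MULTIPLIER = {"info": 1, "warning": 3, "error": 5}
CATEGORIES = ["structure", "rhythm", "lexicon", "syntax", "readability"]


def score(hits, words, category_max=20, category_cap=15, min_words=200):
    """Return (global_score, per_category_scores) for one document.

    hits is a list of (category, weight, severity) tuples.
    """
    # Documents under 200 words are treated as 200-word documents.
    effective_words = max(words, min_words)
    per_category = {}
    for name in CATEGORIES:
        raw = sum(w * SEVERITY_MULTIPLIER[s] for c, w, s in hits if c == name)
        # Density normalisation (cost per 1000 words), then the category cap.
        cost = min(raw * 1000 / effective_words, category_cap)
        per_category[name] = max(category_max - cost, 0)
    return sum(per_category.values()), per_category


sample_hits = [
    ("structure", 2, "warning"),  # structure.sentence-too-long
    ("structure", 1, "warning"),  # structure.line-length-wide
    ("lexicon", 1, "warning"),    # lexicon.weasel-words
    ("syntax", 2, "info"),        # syntax.unclear-antecedent
    ("readability", 5, "info"),   # readability.score
]

# 150 words is an assumed length; anything under 200 gives the same result.
total, parts = score(sample_hits, words=150)
print(total, parts)
```

Under these assumptions, three of the five categories hit the cap, which is why a short document with a handful of hits can land in the red band.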

Reading the TTY output

The terminal formatter prints each diagnostic, a short summary line, then a score block: the global number, followed by each category score with an eight-step sparkline bar.

[Screenshot: lucid-lint run on examples/sample.md, showing five diagnostics, a summary counting 3 warnings and 2 info, a prompt to use explain, and a score block showing 45/100 with per-category bars for structure, rhythm, lexicon, syntax, and readability]

The same run rendered as plain text, for screen readers and copy-paste:

warning examples/sample.md:7:1 Sentence is 35 words long (maximum 30). Consider splitting it into shorter sentences. [section: A paragraph with a long sentence] [structure.sentence-too-long]
warning examples/sample.md:7:11 Weasel phrase "rather" weakens the statement. Replace with concrete language or remove it. [section: A paragraph with a long sentence] [lexicon.weasel-words]
info    examples/sample.md:1:1 Flesch-Kincaid grade 6.8 (target ≤ 14.0). [readability.score]
info    examples/sample.md:7:1 Sentence starts with a bare demonstrative "this". Name the referent to avoid forcing the reader to guess. [section: A paragraph with a long sentence] [syntax.unclear-antecedent]
warning examples/sample.md:7:1 Line is 210 characters wide (maximum 120). [section: A paragraph with a long sentence] [structure.line-length-wide]

summary: 3 warnings, 2 info.
→ run 'lucid-lint explain <rule-id>' — seen here: structure.sentence-too-long, lexicon.weasel-words, readability.score + 2 more
────────────────────────────────────────────────────────────
score: 45/100
       structure    █▎░░░  5/20
       rhythm       █████  20/20
       lexicon      █▎░░░  5/20
       syntax       ██▌░░  10/20
       readability  █▎░░░  5/20

All five categories are always shown, so the breakdown stays structurally stable from one run to the next. A perfect document shows score: 100/100 with every bar full (█████). When the same rule fires two or more times in a file, the hits are grouped under a compact header, and the shared message or section is hoisted so that it appears only once.
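The bars in the score block can be reproduced with a short Python sketch. This is a reading aid rather than the formatter's actual code: the `bar` function is a hypothetical name, and the sketch assumes that "eight-step" means each of the five cells can be filled in eighths using the Unicode block elements.

```python
# Index = number of eighths filled within one cell (index 0 is never used).
EIGHTHS = " ▏▎▍▌▋▊▉█"


def bar(value, maximum, cells=5):
    """Render value/maximum as a fixed-width bar of partially filled cells."""
    eighths = round(value / maximum * cells * 8)
    full, rest = divmod(eighths, 8)
    out = "█" * full
    if rest:
        out += EIGHTHS[rest]
    # Pad with the light-shade character up to the fixed width.
    return out.ljust(cells, "░")


for name, value in [("structure", 5), ("rhythm", 20), ("syntax", 10)]:
    print(f"{name:<12} {bar(value, 20)}  {value}/20")
```

With this reading, 5/20 fills one cell and a quarter of the next, which matches the sample output above.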

Reading the JSON output

The JSON schema is at version = 2 in v0.2. New fields:

{
  "version": 2,
  "diagnostics": [
    {
      "rule_id": "structure.sentence-too-long",
      "severity": "warning",
      "location": { "file": { "kind": "path", "path": "draft.md" }, "line": 12, "column": 1, "length": 42 },
      "section": "Introduction",
      "message": "Sentence is 27 words long (maximum 22).",
      "weight": 2
    }
  ],
  "summary": { "info": 0, "warning": 1, "error": 0, "total": 1 },
  "score": { "value": 88, "max": 100 },
  "category_scores": [
    { "category": "structure",   "value": 8,  "max": 20 },
    { "category": "rhythm",      "value": 20, "max": 20 },
    { "category": "lexicon",     "value": 20, "max": 20 },
    { "category": "syntax",      "value": 20, "max": 20 },
    { "category": "readability", "value": 20, "max": 20 }
  ]
}

Category values are lowercase strings, in the fixed order listed above. Tools that read the v0.1 schema must:

bump their expected version from 1 to 2;
replace the old category names (length → structure, lexical → lexicon, style → syntax, global → readability);
ignore unknown fields, so that a future schema addition does not break them.
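The three migration steps above might look like this in a consumer script. This is a hedged Python sketch: `read_report` and `migrate_category` are hypothetical helper names, the field names are taken from the sample document above, and `future_field` is an invented key that stands in for any later schema addition.

```python
import json

# Old v0.1 category names mapped to their v0.2 replacements.
LEGACY_CATEGORY_NAMES = {
    "length": "structure",
    "lexical": "lexicon",
    "style": "syntax",
    "global": "readability",
}


def migrate_category(name):
    """Translate a v0.1 category name; v0.2 names pass through unchanged."""
    return LEGACY_CATEGORY_NAMES.get(name, name)


def read_report(text):
    """Parse a version-2 report and return (score, max, per-category scores)."""
    report = json.loads(text)
    if report.get("version") != 2:
        raise ValueError(f"unsupported schema version: {report.get('version')!r}")
    # Read only the fields we need; any unknown field is simply ignored.
    score = report["score"]
    categories = {
        entry["category"]: (entry["value"], entry["max"])
        for entry in report["category_scores"]
    }
    return score["value"], score["max"], categories


sample = """
{
  "version": 2,
  "diagnostics": [],
  "summary": {"info": 0, "warning": 0, "error": 0, "total": 0},
  "score": {"value": 88, "max": 100},
  "category_scores": [{"category": "structure", "value": 8, "max": 20}],
  "future_field": true
}
"""
value, maximum, categories = read_report(sample)
print(value, maximum, categories)
```

Reading by key rather than validating the full document shape is what keeps the consumer tolerant of new fields.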

Gating CI with `--min-score`

The check subcommand accepts an optional --min-score=N flag. The run exits with 1 if the aggregate global score is below N, independently of the severity gate.

# Fail the build if overall quality drops below 85/100
lucid-lint check --min-score=85 docs/

The two gates stack: the run fails if either one trips. Pick one, the other, or both depending on your workflow:

Severity gate only (v0.1 behaviour): catches newly introduced warnings, but does not react to slow drift.
Score gate only (--fail-on-warning=false --min-score=85): tolerates isolated warnings, but fails when density exceeds your threshold.
Both (default + --min-score=85): spikes and drift both fail the build.
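The way the two gates combine can be summarised as a small decision function. This Python sketch is a model of the behaviour described above, not the CLI's source: `exit_code` and its parameter names are hypothetical, and runtime errors (which have their own exit code) are left out.

```python
def exit_code(warnings, score, fail_on_warning=True, min_score=None):
    """Model the exit status: 1 if either gate trips, 0 otherwise."""
    # Severity gate: on by default, trips as soon as one warning is present.
    severity_gate = fail_on_warning and warnings > 0
    # Score gate: only active when a threshold was given.
    score_gate = min_score is not None and score < min_score
    return 1 if (severity_gate or score_gate) else 0


# The three combinations listed above, applied to a run with
# 3 warnings and a score of 45.
print(exit_code(3, 45))                                        # severity gate only
print(exit_code(3, 45, fail_on_warning=False, min_score=85))   # score gate only
print(exit_code(3, 45, min_score=85))                          # both gates
```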

Tuning weights in `lucid-lint.toml`

Projects can override the calibration in their lucid-lint.toml:

[scoring]
category_max = 20
category_cap = 15

[scoring.weights]
sentence-too-long = 3
weasel-words      = 2

Missing fields fall back to the shipped defaults. The [scoring.weights] sub-table is keyed by rule identifier; unknown identifiers are ignored, so removing a rule later does not break old files.

What is deferred

The brainstorm that shaped F14 (see brainstorm/20260420-score-semantics.md) kept the model minimal. Decorations will be promoted only if user feedback calls for them:

Letter grades (A–F): tracked by F-score-letter-grade. Promoted if the numbers feel noisy or hard to compare across documents.
Traffic-light display + pass/fail margin: tracked by F-score-traffic-light. Promoted if CI users ask for a stronger at-a-glance signal.
Reading seconds as an alternative unit: tracked by F-reading-time-score. Needs a validated heuristic and companion metrics (comfort, fatigue), so that it does not monopolise the reading.
Per-section sub-scores: tracked by F-section-scoring. To be considered once the document and project aggregates are field-tested.
Project-level multi-file aggregate: tracked by F-project-scoring-rollup. In v0.2, the CLI treats all the paths it is given as a single document for scoring.

Suppressing diagnostics

lucid-lint offers two inline directives for silencing diagnostics in Markdown input. They are meant for the rare cases where a rule fires on intentional prose (a quoted weasel word, a teaching example of heavy nominalisation, a legitimate passive). Prefer rewriting the prose first. Reach for a directive when the detection is a known false positive, or when the author has seen the warning and chosen to keep the text.

Line form

<!-- lucid-lint disable-next-line structure.sentence-too-long -->

A long sentence that is intentional and must not be flagged.

Syntax. HTML comment, one rule identifier per directive. Several line directives can precede the same target line.
Scope. The next non-blank line in the source.
Optional reason. A reason="..." field on the directive is surfaced in the JSON output; it will become enforceable through configuration in a future version (tracked by F-suppression-reason-field on the roadmap).

Block form (v0.2, F18)

<!-- lucid-lint-disable structure.sentence-too-long -->

A long sentence.

Another long sentence in the same scope.

<!-- lucid-lint-enable -->

Opening. <!-- lucid-lint-disable <rule-id> --> opens a scope for one rule.
Closing. <!-- lucid-lint-enable --> closes every open scope. Passing a rule identifier (<!-- lucid-lint-enable <rule-id> -->) closes only that rule's scope, which lets overlapping disables for different rules nest cleanly.
Scope. All lines between the two comments (inclusive).
Unclosed disable. Extends to the end of the document. This is useful for a whole-file opt-out, but prefer the planned disable-file directive (F-suppression-disable-file) once it lands.
One rule per comment. Multi-rule lists are tracked by F-suppression-disable-file.

Common properties

Markdown only. Plain text and standard input cannot carry HTML comments. Config-based ignores ([[ignore]] in lucid-lint.toml) covering .txt and standard input are tracked by F19.
Unknown rule identifiers are silently ignored. This keeps directives compatible from one linter version to the next.
Suppressed diagnostics cost nothing in the score. The suppression and scoring models are consistent: silencing a diagnostic removes it from the weighted sum. There is no hidden double penalty.

Deferred

The following extensions are tracked on the roadmap:

ID	Item
F19	Config-based ignores (`[[ignore]]` in `lucid-lint.toml`) for `.txt` input and standard input
F-suppression-reason-field	`reason="..."` field, optional at first and later enforceable, surfaced in reports
F-suppression-disable-file	File-level directive (`disable-file`) and comma-separated multi-rule lists

See also

Configuration: TOML thresholds and profile overrides.
Scoring: how suppressed diagnostics affect the global and per-category scores.
Rule-specific notes on cases where suppression is idiomatic: see the ## Suppression section on each rule page in the rules reference.

CI integration

lucid-lint is designed for CI. It returns:

0 when no problems (or only info) are found
1 when warnings are found
2 on a runtime error (invalid arguments, unreadable file)

GitHub Actions

name: Docs lint

on:
  pull_request:
    paths:
      - '**/*.md'
  push:
    branches: [main]
    paths:
      - '**/*.md'

jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install lucid-lint
        run: cargo install lucid-lint
      - name: Lint docs
        run: lucid-lint check --profile=public docs/ README.md

Pre-commit

Add this to your .pre-commit-config.yaml:

repos:
  - repo: local
    hooks:
      - id: lucid-lint
        name: lucid-lint
        entry: lucid-lint check --profile=public
        language: system
        types: [markdown]

Reviewdog

To surface diagnostics as pull request review comments:

lucid-lint check --format=json docs/ | reviewdog -f=rdjson -reporter=github-pr-review

Note: the RDJSON adapter is not shipped, so the JSON that lucid-lint emits would first have to be converted to RDJSON for this pipeline to work. For native surfacing in code review, prefer the GitHub Code Scanning flow below.
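Until an adapter ships, a conversion step could be sketched along these lines. This Python example is an assumption-heavy illustration, not part of lucid-lint: `to_rdjson` is a hypothetical function, the input field names come from the version-2 JSON sample in the scoring section, the output keys follow reviewdog's published RDJSON layout as best understood here, and the fallback path for diagnostics that lack a file path is invented.

```python
import json

# lucid-lint severities mapped to RDJSON severity names.
SEVERITY = {"info": "INFO", "warning": "WARNING", "error": "ERROR"}


def to_rdjson(report):
    """Convert a lucid-lint version-2 report (as a dict) to an RDJSON-style dict."""
    diagnostics = []
    for d in report["diagnostics"]:
        loc = d["location"]
        diagnostics.append({
            "message": d["message"],
            "location": {
                # Fallback name is an assumption for inputs without a path.
                "path": loc["file"].get("path", "<stdin>"),
                "range": {"start": {"line": loc["line"], "column": loc["column"]}},
            },
            "severity": SEVERITY.get(d["severity"], "UNKNOWN_SEVERITY"),
            "code": {"value": d["rule_id"]},
        })
    return {"source": {"name": "lucid-lint"}, "diagnostics": diagnostics}


report = {
    "version": 2,
    "diagnostics": [{
        "rule_id": "structure.sentence-too-long",
        "severity": "warning",
        "location": {"file": {"kind": "path", "path": "draft.md"},
                     "line": 12, "column": 1, "length": 42},
        "section": "Introduction",
        "message": "Sentence is 27 words long (maximum 22).",
        "weight": 2,
    }],
}
converted = to_rdjson(report)
print(json.dumps(converted, indent=2))
```

Wrapped in a small script that reads standard input and writes standard output, a converter like this would sit between the two commands of the pipeline above.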

GitHub Code Scanning (SARIF)

--format=sarif emits a SARIF v2.1.0 log that GitHub Code Scanning reads directly: each diagnostic becomes a code-scanning alert, annotated on the pull request diff.

name: Lucid lint (code scanning)

on:
  pull_request:
    paths: ['**/*.md']
  push:
    branches: [main]
    paths: ['**/*.md']

permissions:
  security-events: write
  contents: read

jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install lucid-lint
      - name: Run lucid-lint and emit SARIF
        run: |
          lucid-lint check \
            --profile=public \
            --format=sarif \
            --fail-on-warning=false \
            docs/ README.md > lucid-lint.sarif
      - name: Upload SARIF to Code Scanning
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: lucid-lint.sarif
          category: lucid-lint

Notes:

--fail-on-warning=false lets the upload step always run; rely on the Code Scanning gates in the pull request UI rather than on the linter's exit code.
Each rule appears once in runs[0].tool.driver.rules, with its category, default severity, default scoring weight, and a helpUri pointing to the rule's mdBook page.
On each result, properties.weight and properties.section carry the scoring weight and the section heading under which the diagnostic was found.

Exit code control

To avoid failing CI on warnings (for example during a gradual adoption phase), you can flip the default:

lucid-lint check --fail-on-warning=false docs/

The run then always returns 0, except on a runtime error.

Gating on the score

You can also gate the build on the aggregate scoring model. The run exits with 1 if the global score is below the threshold, independently of the severity gate.

jobs:
  lucid-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install lucid-lint
      - name: Lint and gate on score
        run: lucid-lint check --min-score=85 docs/ README.md

The two gates stack: the run fails if either one trips. Choose the combination that fits your adoption curve:

Goal	Flags
Catch newly introduced warnings (default behaviour)	default
Tolerate isolated warnings but fail on drift	`--fail-on-warning=false --min-score=85`
Fail on both spikes and drift	default + `--min-score=85`

A gated run that fails: lucid-lint prints its usual summary, then the shell exposes the non-zero exit code:

[Terminal capture: a lucid-lint run on examples/sample.md with --min-score=85, producing three warnings, two info diagnostics, a score of 45/100, and an "exit: 1" line written by the echo command that follows]

$ lucid-lint check --min-score=85 examples/sample.md
…
score: 45/100
       structure    █▎░░░  5/20
       rhythm       █████  20/20
       lexicon      █▎░░░  5/20
       syntax       ██▌░░  10/20
       readability  █▎░░░  5/20
$ echo "exit: $?"
exit: 1

Rules reference

lucid-lint ships 25 rules in v0.2 (17 carried over from v0.1, 8 added in v0.2). Each rule has a dedicated page with its category, severity, default weight, per-profile thresholds, examples, and suppression guidance.

The compact RULES.md reference remains the single-file overview, kept at the repository root. The academic and normative sources behind each rule are consolidated on the References page.

FR translation: complete. All 25 rules have their own page in French (milestone F25 on the roadmap).

Categories

Every rule belongs to exactly one of the five fixed categories. The taxonomy is authoritative: the scoring model composes the per-category sub-scores into the global X / max score.

The kebab-case identifier (e.g. structure.sentence-too-long) is the stable contract used everywhere: CLI flag, JSON output, configuration key, citation in the docs. The label below is a human-readable pointer; it never aliases the identifier.

Structure

Rule	Label
`structure.sentence-too-long`	Sentence too long
`structure.paragraph-too-long`	Paragraph too long
`structure.heading-jump`	Heading level jump
`structure.deeply-nested-lists`	Deeply nested lists
`structure.excessive-commas`	Excessive commas
`structure.long-enumeration`	Long enumeration
`structure.deep-subordination`	Deep subordination
`structure.line-length-wide`	Lines too wide
`structure.mixed-numeric-format`	Mixed numeric formats

Rhythm

Rule	Label
`rhythm.consecutive-long-sentences`	Consecutive long sentences
`rhythm.repetitive-connectors`	Repetitive connectors

Lexicon

Rule	Label
`lexicon.low-lexical-diversity`	Low lexical diversity
`lexicon.excessive-nominalization`	Excessive nominalization
`lexicon.unexplained-abbreviation`	Unexplained abbreviations
`lexicon.weasel-words`	Weasel words
`lexicon.jargon-undefined`	Undefined jargon
`lexicon.all-caps-shouting`	All-caps shouting
`lexicon.redundant-intensifier`	Redundant intensifiers
`lexicon.consonant-cluster`	Consonant clusters

Syntax

Rule	Label
`syntax.passive-voice`	Passive voice
`syntax.unclear-antecedent`	Unclear antecedent
`syntax.nested-negation`	Nested negations
`syntax.conditional-stacking`	Stacked conditionals
`syntax.dense-punctuation-burst`	Dense punctuation burst

Readability

Rule	Label
`readability.score`	Readability score

Source of truth. Each rule's category is determined by Category::for_rule in src/types.rs. The tables above mirror that function. A coverage test (tests/rule_docs_coverage.rs) keeps the per-rule pages, the category helper, and the scoring weights in sync.

Severity levels

Level	Meaning	Effect
`info`	Signal worth knowing, not a defect	Reported; does not fail CI
`warning`	Quality problem to fix	Reported; may fail CI depending on `--fail-on-warning` and `--min-score`
`error`	Reserved for v0.3+	Not emitted in v0.2

Proposing a rule

See Contributing for the rule-addition checklist: every new rule must ship with a page in this section.

`structure.sentence-too-long`

Sentence too long.

What this rule flags

Sentences whose length exceeds a per-profile ceiling. The intrinsic cognitive load of a sentence grows non-linearly with its word count (Graesser et al. 2004, Coh-Metrix); FALC caps sentences at 15 words, Plain English at 20. Long sentences raise the odds that a reader with fragile attention loses the thread partway through.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`2`
Languages	EN · FR (identical detection)
Source	`src/rules/sentence_too_long.rs`

Detection

The text is split into sentences on strong punctuation (., !, ?, …, paragraph breaks). Unicode word tokens are counted, excluding punctuation. Contractions (don't) and elisions (l'accessibilité) count as a single word when the apostrophe sits between two letters. Code blocks are ignored.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_words`	`int`	30	22	15
`exclude_code_blocks`	`bool`	`true`	`true`	`true`
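The counting logic described under Detection can be approximated in a few lines of Python. This is a simplified sketch, not the rule's source: `WORD`, `SENTENCE_END`, and `long_sentences` are hypothetical names, the regular expressions treat digits and underscores as word characters, and the naive split on punctuation would wrongly cut at decimals and abbreviations. Code-block exclusion is not modelled.

```python
import re

# Per-profile ceilings from the table above.
MAX_WORDS = {"dev-doc": 30, "public": 22, "falc": 15}

# A word is a run of word characters; an apostrophe flanked by word
# characters on both sides does not split it (don't, l'accessibilité).
WORD = re.compile(r"\w+(?:['’]\w+)*")

# Strong punctuation or a blank line ends a sentence (naive approximation).
SENTENCE_END = re.compile(r"[.!?…]+|\n\s*\n")


def long_sentences(text, profile="dev-doc"):
    """Return (word_count, sentence) pairs for sentences over the ceiling."""
    limit = MAX_WORDS[profile]
    flagged = []
    for sentence in SENTENCE_END.split(text):
        count = len(WORD.findall(sentence))
        if count > limit:
            flagged.append((count, sentence.strip()))
    return flagged


text = "Short one. " + " ".join(["word"] * 31) + "."
print(long_sentences(text))
print(len(WORD.findall("l'accessibilité est clé")))
```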

Examples

Three ideas, colour-matched from one end of the rewrite to the other: position already paired them, and colour confirms that the rewrite loses none of them.

Before (FR, flagged):

Le sous-système de cache introduit lors d’un jalon précédent interagit mal avec le nouveau pipeline de requêtes sous charge soutenue, et l’enquête a nécessité plusieurs rondes de profilage.

After:

Le cache a été introduit lors d’un jalon précédent. Il interagit mal avec le nouveau pipeline sous charge soutenue. L’enquête a nécessité plusieurs rondes de profilage.

Before (EN, flagged):

The caching subsystem, which was introduced in an earlier milestone, turned out to interact poorly with the new request pipeline under sustained load, and the investigation that followed required multiple rounds of profiling.

After:

The caching subsystem was introduced earlier. It interacts poorly with the new request pipeline under sustained load. The investigation required several rounds of profiling.

Suppression

See Suppressing diagnostics (EN page for now) for the inline and block forms.

See also

rhythm.consecutive-long-sentences: captures rhythm; its threshold must stay below max_words here.
Scoring model: structure.sentence-too-long carries a weight of 2 because cognitive cost compounds with length.

References

Sweller (1988)
Plain Language US (2011)
CAN-ASC-3.1:2025

See References for the full bibliography.

`structure.paragraph-too-long`

Paragraph too long.

What this rule flags

Paragraphs that exceed a threshold in sentence count or in word count. The paragraph is the visual unit of resumption: an overlong paragraph dilutes that resumption point for readers who are interrupted often. Both measures are checked so that a short but dense paragraph (a single 80-word sentence) is caught too; structure.sentence-too-long covers the complementary case.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`2`
Languages	EN · FR (identical detection)
Source	`src/rules/paragraph_too_long.rs`

Detection

Split on blank lines (the Markdown paragraph convention). Count sentences and words per paragraph. Flag paragraphs that exceed either threshold.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_sentences`	`int`	7	5	3
`max_words`	`int`	150	100	60

Examples

Under the public profile, a paragraph of eight average-length sentences triggers on max_sentences. A paragraph containing a single 120-word sentence triggers on max_words (and also on structure.sentence-too-long).

Suppression

See Suppressing diagnostics (EN page for now).

See also

structure.sentence-too-long
rhythm.consecutive-long-sentences

References

Sweller (1988)
Graesser et al. (2004)
CAN-ASC-3.1:2025

See References for the full bibliography.

`structure.heading-jump`

Heading level jump.

What this rule flags

Heading level jumps that break the document's mental map (for example H2 → H4). Each level may exceed the previous one by at most +1. Readers with attention difficulties lean heavily on the heading hierarchy to reorient after an interruption; a broken hierarchy destroys that cue. The rule also flags the very first heading if it is deeper than H2 when allow_first_heading_any_level is false, and a missing H1 when require_h1 is true.

References. WCAG 2.1 SC 1.3.1 (Info and Relationships) and 2.4.6 (Headings and Labels); RGAA 9.1.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Languages	language-independent
Source	`src/rules/heading_jump.rs`

Detection

Parses Markdown headings (#, ##, …). Walks them in source order and flags every heading whose level exceeds the previous one by more than one. Deterministic, no false positives.

Parameters

Key	Type	Default
`allow_first_heading_any_level`	`bool`	`true`
`require_h1`	`bool`	`false`

Binary rule: no per-profile thresholds.
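The walk described under Detection can be sketched in Python as follows. This is a simplified illustration rather than the rule's source: `heading_jumps` is a hypothetical name, only ATX headings (# to ######) are recognised, headings inside code blocks are not filtered out, and the two parameters are left at their defaults, so the first heading and a missing H1 are never flagged.

```python
import re

# One to six '#' characters followed by whitespace and some text.
HEADING = re.compile(r"^(#{1,6})\s+\S")


def heading_jumps(markdown):
    """Return (line_number, previous_level, level) for each jump of more than +1."""
    jumps = []
    previous = None
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if previous is not None and level > previous + 1:
            jumps.append((number, previous, level))
        previous = level
    return jumps


print(heading_jumps("# Overview\n#### Details\n"))
print(heading_jumps("# Overview\n## Section\n### Subsection\n"))
```

Moving back up the hierarchy (for example H3 followed by H2) is never a jump in this model; only increases of more than one level are.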

Examples

Flagged:

# Overview
#### Details    ← jump from H1 to H4

Clean:

# Overview
## Section
### Subsection

Suppression

See Suppressing diagnostics (EN page for now).

See also

structure.deeply-nested-lists: the equivalent signal at list level.

References

WCAG 2.1 — 1.3.1 & 2.4.6

See References for the full bibliography.

`structure.deeply-nested-lists`

Deeply nested lists.

What this rule flags

Bullet-list items nested beyond a reasonable depth. A deeply nested list forces the reader to rebuild a complex mental hierarchy: horizontal indentation stops being a positional cue and becomes noise. Four levels of indentation is too much for readers with attention difficulties.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Languages	language-independent
Source	`src/rules/deeply_nested_lists.rs`

Detection

Parses Markdown with pulldown-cmark, extracts list items with their indentation level, and flags items beyond max_depth. Deterministic, no false positives.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_depth`	`int`	4	3	2

Example

Under the public profile (max depth 3):

- Level 1
  - Level 2
    - Level 3
      - Level 4    ← flagged

Diagnostic message

Includes repair guidance: flatten the structure, split it into several lists, or promote sub-items to subsections with headings.

Suppression

See Suppressing diagnostics (EN page for now).

References

WCAG 2.1

See References for the full bibliography.

`structure.line-length-wide`

Lines too wide.

What this rule flags

Author-chosen lines wider than the profile ceiling. WCAG 1.4.8 (AAA) caps rendered text at about 80 characters per line, because longer lines force the eye to travel further between saccades and increase re-reading at the line return, a known difficulty for dyslexic readers (BDA Dyslexia Style Guide).

"Author-chosen" matters. In Markdown, soft breaks are replaced with spaces at parse time, because the renderer reflows text to the screen width. The width of the source line therefore says nothing about what the reader sees. This rule measures only breaks that were kept deliberately: Markdown hard breaks (<br> or two trailing spaces) and explicit newlines in plain text. A soft-wrapped Markdown paragraph is exempt, however long its joined text. To bound paragraph density, see structure.paragraph-too-long.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Condition tags	`dyslexia`, `general`
Languages	EN · FR (script-independent)
Source	`src/rules/line_length_wide.rs`

Detection

For each paragraph that contains an author-intended line break, the width of each line is measured in grapheme clusters; lines beyond max_line_length are flagged.

A Markdown paragraph with no hard break (the common case in prose) is exempt. Soft breaks are replaced with spaces at parse time: what remains is one logical line whose source length follows the editor width, not the rendering targeted by WCAG 1.4.8. Plain text follows the same logic: a paragraph with no internal \n is exempt; a paragraph with internal newlines is measured line by line.

Code blocks (fenced or indented) are excluded upstream by the Markdown parser. Headings, list items, and table cells are out of scope by construction: paragraph-too-long, sentence-too-long, and the heading rules cover the cognitive loads that apply to those blocks.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_line_length`	`int`	120	100	80

The FALC profile aligns with the WCAG 1.4.8 AAA recommendation of 80 characters.

Known caveats

Prose paragraphs on a single source line are exempt on purpose. The rule used to fire on them and generated a lot of noise on real prose; v0.2.x restricts it to author-chosen breaks. Combine it with structure.paragraph-too-long if you also want a ceiling on the joined paragraph length.

Headings and list items are not measured by this rule. Their wrap width depends on the renderer (heading font size, list indentation), and the underlying cognitive loads are already covered by other rules.

Suppression

See Suppressing diagnostics (EN page for now).

See also

structure.paragraph-too-long
structure.sentence-too-long
Conditions (EN page for now)

References

WCAG 2.1 — 1.4.8 (AAA)
Legge & Bigelow (2011)

See References for the full bibliography.

`structure.mixed-numeric-format`

Mixed numeric formats.

What this rule flags

Sentences that mix numerals written as digits (42, 3.14, 1,000, 1 000) with numerals spelled out (two, trois, twenty, cent) within the same sentence. Presenting numbers inconsistently forces the reader to switch visual form mid-clause and re-anchor the referent: a known burden for dyscalculic readers and a plain-language anti-pattern.

At a glance


Category	`structure`
Default severity	`warning`
Default weight	`1`
Condition tags	`dyscalculia`, `general`
Languages	EN · FR
Source	`src/rules/mixed_numeric_format.rs`

Detection

For each sentence produced by the tokenizer, the rule scans for digit tokens and for entries in the language's spelled-numeral list. If at least one of each kind co-occurs, a single diagnostic is emitted for the sentence, citing one representative token of each kind.

Digit tokens accept ASCII digits plus an optional decimal separator (.) or thousands separator (a comma, or a space, U+0020) when it is flanked by digits on both sides. Spelled-out matches are case-insensitive ASCII comparisons against en::SPELLED_NUMERALS and fr::SPELLED_NUMERALS.

Les formes ambiguës one (EN) et un / une (FR) sont exclues de la liste des numéraux en lettres parce qu’elles servent aussi de pronoms indéfinis et d’articles. Cela maintient un taux de faux positifs gérable, au prix de manquer les vrais cas de format mixte dont le seul numéral en lettres est one / un / une. Les formes régionales (Suisse / Belgique : septante, huitante, octante, nonante) ainsi que les formes métropolitaines sont incluses.

Les phrases sont produites par le tokenizer partagé (voir src/parser/tokenizer.rs), afin que les abréviations, décimales et points de suspension ne fragmentent pas indûment les phrases. Les blocs de code (clôturés ou indentés) sont exclus en amont par le parseur Markdown.
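A minimal Python sketch of the co-occurrence scan may help. The tool itself is written in Rust, so this illustrates the idea and is not the code in src/rules/mixed_numeric_format.rs. The `SPELLED_NUMERALS_EN` set is a short hypothetical excerpt, and both regular expressions are assumptions about token shape.

```python
import re

# Hypothetical excerpt: the real en::SPELLED_NUMERALS list is not reproduced here.
# "one" is left out on purpose, as the rule description explains.
SPELLED_NUMERALS_EN = {"two", "three", "four", "five", "ten", "twenty", "hundred"}

# ASCII digits; '.', ',' or a space is accepted only with digits on both sides.
DIGIT_TOKEN = re.compile(r"[0-9]+(?:[., ][0-9]+)*")
WORD = re.compile(r"[A-Za-z]+")


def mixed_numeric_format(sentence, spelled=SPELLED_NUMERALS_EN):
    """Return (digit_token, spelled_token) when both forms co-occur, else None."""
    digit = DIGIT_TOKEN.search(sentence)
    word = next((w for w in WORD.findall(sentence) if w.lower() in spelled), None)
    if digit and word:
        return digit.group(0), word
    return None


print(mixed_numeric_format("We shipped 42 fixes in two weeks."))
print(mixed_numeric_format("We shipped one fix and 3 patches."))
```

The second call finds no pair because one is excluded from the list, which is the trade-off described in the Detection notes.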

Parameters

None. The rule has no configurable threshold: a single co-occurrence of the two forms is enough.

Known caveats

Sentences whose only spelled numeral is one / un / une are not flagged, by design (see Detection).
Ordinals (first, premier, 2nd, 3e) are out of scope. 2nd currently reads as a digit token (2) followed by a word (nd), which does not match the spelled-numeral list, so there is no false positive.
Roman numerals (IV, XIV) count as neither digits nor spelled numerals for this rule.

Suppression

See Suppressing diagnostics (English page for now).

See also

readability.score
Conditions (English page for now)

References

ISO 80000-1:2022
Chicago Manual of Style (17th ed.)

See References for the full bibliography.

`structure.excessive-commas`

Excessive commas.

What this rule flags

Sentences whose comma count exceeds a per-profile cap. The comma is the most frequent marker of syntactic complexity; rather than untangle the cause (subordination, apposition, enumeration, parenthetical), the rule uses density as a leading indicator of overload.

At a glance

Category	`structure`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (identical detection)
Source	`src/rules/excessive_commas.rs`

Detection

Count the commas in each sentence and flag those that exceed max_commas.

Interaction. When structure.long-enumeration fires on the same sentence, this rule is suppressed for that sentence to avoid double reporting. The shared enumeration detector discounts Oxford commas (3 or more short items, plus a relaxed rhythmic pass for items of 1 to 4 words, plus lists closed by plus on the same footing as and / or; see "Known false positives" below) and commas inside parenthesised token lists (A, B, C, …) (3 or more short comma-separated segments inside balanced parentheses). All discounts are language-agnostic.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_commas`	`int`	4	3	2
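The threshold check can be sketched in a few lines of Python. This is an illustration under assumptions, not the Rust implementation: the `enumeration_fired` flag is a hypothetical stand-in for structure.long-enumeration having already flagged the sentence, and the comma discounting done by the shared detector is not modelled.

```python
# Per-profile caps copied from the parameter table.
MAX_COMMAS = {"dev-doc": 4, "public": 3, "falc": 2}


def excessive_commas(sentence, profile, enumeration_fired=False):
    """Return the comma count when it exceeds the profile cap, else None."""
    if enumeration_fired:
        # Assumed behaviour: the enumeration rule takes over for this sentence.
        return None
    commas = sentence.count(",")
    return commas if commas > MAX_COMMAS[profile] else None


sentence = "After the review, which ran late, the team, tired but pleased, went home."
print(excessive_commas(sentence, "public"))
print(excessive_commas(sentence, "dev-doc"))
```

The same sentence carries four commas, so it is over the public cap and exactly at the dev-doc cap.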

Known false positives

The remaining false positives come mainly from lists with no closing connector (for example Rules touched: A, B, C) and from Oxford enumerations interrupted by an interleaved parenthesis; they are tracked under F22 in the roadmap for the next v0.3 sub-slices.

Suppression

See Suppressing diagnostics (English page for now).

See also

structure.long-enumeration
structure.deep-subordination

References

Gibson (1998)

See References for the full bibliography.

`structure.long-enumeration`

Enumeration too long.

What this rule flags

Inline prose enumerations that would be clearer as a bulleted list: 5 or more items separated by commas and closed by a coordinator (et, ou, and, or).

At a glance

Category	`structure`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (identical detection)
Source	`src/rules/long_enumeration.rs`, shared helper `src/rules/enumeration.rs`

Detection

A sequence of min_items or more short segments, separated by commas and closed by , and / , or / , plus / , et / , ou (the Oxford comma is optional). The shared detector also feeds structure.excessive-commas.

Parameters

Key	Type	Default
`min_items`	`int`	`5`
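A rough Python sketch of the item count follows. It is an assumption-laden illustration, not the shared helper in src/rules/enumeration.rs: the function names are hypothetical, "short" is taken to mean at most 4 words, and the first comma-separated part is assumed to end with the first item.

```python
import re

COORD = r"(?:and|or|plus|et|ou)"


def count_enumeration_items(sentence, max_item_words=4):
    """Count items in a comma-separated list closed by a coordinator (0 if none)."""
    parts = [p.strip() for p in sentence.rstrip(" .!?").split(",")]
    if len(parts) < 2:
        return 0
    last = parts[-1]
    oxford = re.match(rf"{COORD}\s+(.+)", last, re.IGNORECASE)
    if oxford:
        tail = [oxford.group(1)]  # ", and thyme": one closing item
    else:
        tail = re.split(rf"\s+{COORD}\s+", last, maxsplit=1, flags=re.IGNORECASE)
        if len(tail) != 2:
            return 0  # no closing coordinator, so not treated as an enumeration
    middle = parts[1:-1]
    if any(len(item.split()) > max_item_words for item in middle + tail):
        return 0
    return 1 + len(middle) + len(tail)  # the first part contributes one item


def is_long_enumeration(sentence, min_items=5):
    return count_enumeration_items(sentence) >= min_items


print(is_long_enumeration("We tested on Linux, macOS, and Windows."))
```

A list without a closing coordinator returns 0 here, which is consistent with the false positives that the excessive-commas page attributes to such lists.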

Diagnostic message

Suggests converting the enumeration into a bulleted list.

Examples

Six items, colour-matched across the rewrite: each inline term lines up with its bullet.

Before (FR, flagged):

Le plat contient tomate, oignon, ail, basilic, persil et thym.

After:

Le plat contient :

tomate

oignon

ail

basilic

persil

thym

Before (EN, flagged):

The dish contains tomato, onion, garlic, basil, parsley, and thyme.

After:

The dish contains:

tomato

onion

garlic

basil

parsley

thyme

Suppression

See Suppressing diagnostics (English page for now).

References

Plain Language US (2011)

See References for the full bibliography.

`structure.deep-subordination`

Deep subordination.

What this rule flags

Cascades of subordinate clauses: several relative pronouns or subordinating conjunctions chained without a strong punctuation break. Each open referent must stay in working memory until it is closed; Gibson's (1998) Dependency Locality Theory ties processing cost directly to that distance.

At a glance

Category	`structure`
Default severity	`warning`
Default weight	`2`
Languages	EN · FR (separate lists)
Source	`src/rules/deep_subordination.rs`

Detection

Walk the sentence between strong punctuation breaks and count consecutive subordinators. Flag when the count exceeds max_consecutive_subordinators. Enumerations of pronouns (qui, que, dont, où) are ignored: the detector recognises the listed form and does not treat it as a cascade.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_consecutive_subordinators`	`int`	3	2	2

Per-language lists

🇫🇷 Relative pronouns: qui, que, dont, où, lequel, laquelle, lesquels, lesquelles
🇫🇷 Subordinators: parce que, afin que, bien que, quoique, puisque, pour que, tandis que
🇬🇧 Relative pronouns: which, that, who, whom, whose
🇬🇧 Subordinators: because, although, while, since, whereas, unless, until
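The walk can be approximated in Python as below. This is a sketch under stated assumptions, not the Rust rule: "strong punctuation" is assumed to mean `. ; : ! ?`, only the single-word English lists are used, and the listed-form check is a hypothetical simplification of what the detector recognises.

```python
import re

# English lists from the page; multi-word French entries would need phrase matching.
EN_SUBORDINATORS = {
    "which", "that", "who", "whom", "whose",
    "because", "although", "while", "since", "whereas", "unless", "until",
}
MAX_CONSECUTIVE = {"dev-doc": 3, "public": 2, "falc": 2}


def longest_cascade(sentence, subordinators=EN_SUBORDINATORS):
    """Largest number of subordinators found between two strong punctuation breaks."""
    best = 0
    for segment in re.split(r"[.;:!?]", sentence):
        items = [item.strip().lower() for item in segment.split(",")]
        if len(items) >= 3 and all(item in subordinators for item in items):
            continue  # listed form ("which, that, who, ..."), not a cascade
        words = re.findall(r"[a-z']+", segment.lower())
        best = max(best, sum(word in subordinators for word in words))
    return best


text = ("The report that was drafted by the team which we formed last month "
        "and which covers the topics that we had discussed")
print(longest_cascade(text) > MAX_CONSECUTIVE["dev-doc"])
```

Counting every subordinator in a segment is cruder than tracking a true run, so this sketch over-counts words such as that when they are used as demonstratives.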

Examples

Each highlighted token is a subordinator counted by the rule. Four in a row exceed the threshold of every profile (3 for dev-doc, 2 for public and falc).

Flagged (FR):

Le document qui a été rédigé par l’équipe que nous avons constituée et qui couvre les points que nous avions discutés…

Flagged (EN):

The report that was drafted by the team which we formed last month and which covers the topics that we had discussed…

Not flagged (enumeration form, recognised by the detector):

Les pronoms relatifs en français sont : qui, que, dont, où.

And the equivalent form in English:

The English relative pronouns are: which, that, who, whom, whose.

Suppression

See Suppressing diagnostics (English page for now).

References

Gibson (1998)

See References for the full bibliography.

`structure.italic-span-long`

Italic span too long.

Experimental in v0.2.x. Disabled by default; enable it with --experimental structure.italic-span-long or [experimental] enabled = ["structure.italic-span-long"] in lucid-lint.toml. It moves to Stable at the v0.3 tag as part of the F-experimental-rule-status cohort. See Conditions for the dyslexia tag, which gates this rule on the active conditions.

What the rule detects

Italic spans (*…* / _…_) that exceed a configurable word threshold. Slanted glyphs hinder letter-shape recognition for dyslexic readers, a well-established finding behind the British Dyslexia Association's recommendation to keep italics for short phrases rather than whole passages. Long italic passages also hurt visual tracking for any reader whose attention is already taxed (fatigue, second-language reading, low vision).

At a glance

Category	`structure`
Default severity	`warning`
Default weight	`1`
Status	`experimental` (v0.2.x) → `stable` at the v0.3 tag
Condition tag	`dyslexia` (gated; runs only when `--conditions` matches)
Languages	EN · FR (identical detection; the substrate is language-agnostic)
Source	`src/rules/structure/italic_span_long.rs`

Detection

Walks the typed inline tree attached to each Paragraph (the F143 substrate) and flags every Inline::Emphasis span whose visible word count exceeds the profile threshold. Code blocks and inline code are excluded by the parser; italics inside a code block never trigger the rule. Bold (**bold**) does not trigger it either: only italics (*italic* / _italic_) do.

The diagnostic position points at the opening delimiter, so the highlight in your editor lands on the visible * or _ rather than on an arbitrary column in the paragraph.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_words`	`int`	12	8	5

To adjust it in lucid-lint.toml:

[rules."structure.italic-span-long"]
max_words = 6
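For intuition, the word count and the reported column can be approximated in Python. Note the swap: the rule walks a typed inline AST, whereas this sketch uses a regular expression as a stand-in, so it will disagree with a real Markdown parser on edge cases. The function name and the pattern are hypothetical.

```python
import re

# Single * or _ delimiters only; doubled delimiters (bold) fail the lookarounds.
ITALIC_SPAN = re.compile(r"(?<![*_\w])([*_])(?![*_\s])(.+?)(?<![*_\s])\1(?![*_\w])")


def long_italic_spans(line, max_words):
    """Return (column, word_count) for each italic span longer than max_words.

    The column is 1-based and points at the opening delimiter.
    """
    hits = []
    for match in ITALIC_SPAN.finditer(line):
        words = len(match.group(2).split())
        if words > max_words:
            hits.append((match.start() + 1, words))
    return hits


print(long_italic_spans("Note: *one two three four five six seven eight nine* end.", 8))
```

Returning the column of the opening delimiter mirrors the positioning choice described in the Detection section.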

Examples

English

Before (flagged):

The team eventually concluded that *the proposed migration plan would require careful coordination across three regional offices and an extended freeze window* before any deployment could begin.

What lucid-lint check --profile public --experimental structure.italic-span-long --conditions dyslexia reports:

warning input.md:1:36 Italic span is 17 words long (maximum 8). Long italic runs strain dyslexic readers; consider shortening the emphasized phrase or removing the italics. [structure.italic-span-long]

After (suggested rewrite):

The team eventually concluded that the proposed migration plan would require careful coordination. Three regional offices and an extended freeze window are prerequisites before any deployment.

The italics now mark a single load-bearing word, the usage the BDA guide recommends.

French

Before (flagged):

L’équipe a fini par conclure que le plan de migration proposé nécessiterait une coordination soignée entre trois bureaux régionaux et une fenêtre de gel prolongée avant tout déploiement.

What lucid-lint check --profile public --experimental structure.italic-span-long --conditions dyslexia reports:

warning input.md:1:35 Italic span is 18 words long (maximum 8). Long italic runs strain dyslexic readers; consider shortening the emphasized phrase or removing the italics. [structure.italic-span-long]

After (suggested rewrite):

L’équipe a fini par conclure que le plan de migration nécessiterait une coordination soignée. Trois bureaux régionaux et une fenêtre de gel prolongée sont indispensables avant tout déploiement.

Suppression

See Suppressing diagnostics for the inline and block forms. The inline directive works on this rule:

<!-- lucid-lint disable-next-line structure.italic-span-long -->
A *deliberately long italic phrase that the rule would normally flag* is here.

See also

Conditions: the dyslexia tag that gates this rule.
F-experimental-rule-status (experimental rule status): the substrate that lets this rule ship in v0.2.x without affecting default scores.
F143 (inline AST layer): the substrate that exposes emphasis-span boundaries to this rule.

References

British Dyslexia Association, Dyslexia Style Guide (2018). Recommends keeping italics for short phrases to preserve letter-shape recognition.
Rello & Baeza-Yates (2013): broader academic context on dyslexia-friendly typography.

See References for the full bibliography.

`structure.number-run`

Too many numbers in a single sentence.

Experimental in v0.2.x. Disabled by default; enable it with --experimental structure.number-run or [experimental] enabled = ["structure.number-run"] in lucid-lint.toml. It moves to Stable at the v0.3 tag as part of the F-experimental-rule-status cohort. See Conditions for the dyscalculia tag, which gates this rule on the active conditions.

What the rule detects

Sentences that pack in more numeric tokens than a configurable threshold. plainlanguage.gov is explicit on the framing ("Don’t put a lot of numbers together in one sentence" and "Avoid placing too many statistics close together"), and dyscalculic readers pay the cost first: each numeric token forces a fresh quantity-to-symbol anchoring that gets no help from the surrounding prose the way an ordinary word does. Citation runs such as (Smith 2020, Jones 2021, Wei 2022, Park 2023), measurement tables flattened into prose, and statistics-saturated paragraphs are the typical cases.

At a glance

Category	`structure`
Default severity	`warning`
Default weight	`1`
Status	`experimental` (v0.2.x) → `stable` at the v0.3 tag
Condition tag	`dyscalculia` (gated; runs only when `--conditions` matches)
Languages	EN · FR (identical detection; digits are language-agnostic)
Source	`src/rules/structure/number_run.rs`

Detection

Walks the sentence stream of each paragraph (after flattening; fenced code blocks are already excluded by the parser) and counts numeric tokens per sentence. A numeric token is a contiguous run of ASCII digits, optionally containing a decimal separator (. or ,) followed by digits. Hyphens, colons, slashes, and spaces separate tokens.

Input	Tokens counted	Note
`42`	1	Bare integer
`3.14`	1	Decimal separator kept
`1,000`	1	Comma kept
`2026-05-04`	3	Hyphens separate: a date counts as three numbers of cognitive load
`$3.50`	1	Non-digit currency prefix, ignored
`1st`	1	Trailing letters split off; the digits count

The diagnostic position points at the first numeric token of the offending sentence, so the editor highlight lands on the visible cluster rather than at the start of the sentence.
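The token definition and the table rows can be reproduced with a small Python sketch. The regular expression is an assumption that follows the prose definition literally (one optional separator); it is not taken from src/rules/structure/number_run.rs, and the function name is hypothetical.

```python
import re

# A run of ASCII digits, optionally followed by one '.' or ',' and more digits.
NUMERIC_TOKEN = re.compile(r"[0-9]+(?:[.,][0-9]+)?")


def numeric_tokens(sentence):
    """Return the numeric tokens of a sentence, in order of appearance."""
    return NUMERIC_TOKEN.findall(sentence)


for sample in ["42", "3.14", "1,000", "2026-05-04", "$3.50", "1st"]:
    print(sample, len(numeric_tokens(sample)))
```

Because a space is not a separator inside a token here, a French-style 1 000 counts as two tokens in this sketch.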

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_numbers`	`int`	6	4	3

To adjust it in lucid-lint.toml:

[rules."structure.number-run"]
max_numbers = 5

Examples

English

Before (flagged):

The 2024 cohort sat 1,200 students across 4 campuses, posted a 92.5% pass rate on the 3 reviewed papers, and improved 18 points over the prior year.

What lucid-lint check --profile public --experimental structure.number-run --conditions dyscalculia reports:

warning input.md:1:5 Sentence packs 8 numeric tokens (maximum 4). plain-language guidance recommends not placing many numbers or statistics together in one sentence; split the sentence or move some figures to a list or table. [structure.number-run]

After (your rewrite):

The 2024 cohort sat 1,200 students across 4 campuses. They posted a 92.5% pass rate on the reviewed papers and improved 18 points over the prior year.

The figures still travel together, but each sentence carries a load that a dyscalculic reader can re-anchor without losing the referent.

French

Before (flagged):

La promotion 2024 a réuni 1 200 étudiants sur 4 campus, affiché un taux de réussite de 92,5 % sur les 3 copies revues, et progressé de 18 points par rapport à l’année précédente.

After (your rewrite):

La promotion 2024 a réuni 1 200 étudiants sur 4 campus. Le taux de réussite atteint 92,5 % sur les copies revues et progresse de 18 points par rapport à l’année précédente.

Suppression

See Suppressing diagnostics for the inline and block forms. The inline directive works on this rule too:

<!-- lucid-lint disable-next-line structure.number-run -->
The 2024 cohort sat 1,200 students across 4 campuses, posted a 92.5% pass rate on the 3 reviewed papers, and improved 18 points.

See also

Conditions: the dyscalculia tag that gates this rule.
structure.mixed-numeric-format: sibling rule on the consistency of numeric form. The split is atomic: mixed-numeric-format checks whether digits and spelled-out numerals coexist; number-run checks how many numeric tokens cluster together, whatever their form.
F-experimental-rule-status (experimental status): the substrate that lets this rule ship in v0.2.x without affecting default scores.

References

plainlanguage.gov, Use short, simple sentences: "Don’t put a lot of numbers together in one sentence."
plainlanguage.gov, Use numerals. Companion advice on the consistency of numeric form (which motivates mixed-numeric-format).

See References for the full bibliography.

`rhythm.consecutive-long-sentences`

Consecutive long sentences.

What this rule flags

Runs of long sentences within a single paragraph. One long sentence on its own is manageable; several in a row tire the reader's attention even when each sentence stays under the structure.sentence-too-long ceiling. This rule captures rhythm.

At a glance

Category	`rhythm`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (identical detection)
Source	`src/rules/consecutive_long_sentences.rs`

Detection

Walk the sentences of each paragraph in order. Keep a counter of consecutive sentences above word_threshold. Emit a single diagnostic per streak that reaches max_consecutive.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`word_threshold`	`int`	20	15	10
`max_consecutive`	`int`	3	2	2
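The streak counter can be sketched in Python as follows. It works on pre-computed word counts for one paragraph and is only an illustration: the function name is hypothetical, and "reaches max_consecutive" is read as "streak length greater than or equal to max_consecutive", which is an assumption about the boundary.

```python
# (word_threshold, max_consecutive) per profile, from the parameter table.
PROFILES = {"dev-doc": (20, 3), "public": (15, 2), "falc": (10, 2)}


def long_streaks(word_counts, word_threshold, max_consecutive):
    """Return the length of each streak of long sentences worth one diagnostic."""
    streaks, run = [], 0
    for count in list(word_counts) + [0]:  # trailing 0 closes the last streak
        if count > word_threshold:
            run += 1
        else:
            if run >= max_consecutive:
                streaks.append(run)
            run = 0
    return streaks


print(long_streaks([25, 22, 21, 30, 24], *PROFILES["dev-doc"]))
```

Each streak yields one entry, which matches the "single diagnostic per streak" behaviour described in the Detection section.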

Relation to `structure.sentence-too-long`

Both rules look at sentence length but flag different problems:

Rule	Threshold (`dev-doc` / `public` / `falc`)	Fires on
`structure.sentence-too-long`	`max_words` 30 / 22 / 15	a single sentence over the ceiling
`rhythm.consecutive-long-sentences`	`word_threshold` 20 / 15 / 10	a streak of `max_consecutive` sentences, each above the lower threshold

Because word_threshold stays below max_words, this rule catches rhythm even when no single sentence crosses sentence-too-long. The invariant word_threshold < max_words (per profile) prevents the two rules from firing together on the same sentence.

Examples

Five ideas, colour-matched across the rewrite: only the rhythm changes. lucid-lint flags; the rewrite is yours.

French

Before (flagged):

La migration a introduit une couche de cache qui se place devant chaque lecture de la base de données primaire. L’équipe a observé des pics de latence inattendus chaque fois que le cache s’invalidait sous une charge d’écriture soutenue. Une enquête ultérieure a relié la régression à un effet thundering-herd qui se déclenchait sur chaque clé froide. Le tableau de bord des métriques signalait à tort un délai d’attente générique parce que la propagation de la trace était incomplète. Le correctif a fusionné les remplissages concurrents, randomisé les TTL, et instrumenté la couche de cache avec un émetteur de span dédié.

Five sentences, each over 20 words: the streak tires the reader's attention.

What lucid-lint check --profile dev-doc reports:

warning input.md:1:1 5 consecutive sentences exceed 20 words (max 3). Vary sentence length or split the streak. [rhythm.consecutive-long-sentences]

After (your rewrite):

La migration a introduit une couche de cache devant la base de données primaire. La latence montait dès que le cache s’invalidait sous écritures soutenues. Le coupable : un thundering-herd sur les clés froides. Les métriques signalaient un délai générique — la trace était cassée. Le correctif fusionne les remplissages, randomise les TTL et émet un span dédié.

English

Before (flagged):

The migration introduced a caching layer that sits in front of every read from the primary database. The team observed unexpected latency spikes whenever the cache invalidated under sustained write load. A subsequent investigation traced the regression to a thundering-herd pattern that fired on every cold key. The metrics dashboard misreported the issue as a generic timeout because the trace propagation was incomplete. The fix coalesced concurrent fills, added jittered TTLs, and instrumented the cache layer with a dedicated span emitter.

What lucid-lint check --profile dev-doc reports:

warning input.md:1:1 5 consecutive sentences exceed 20 words (max 3). Vary sentence length or split the streak. [rhythm.consecutive-long-sentences]

After (your rewrite):

The migration introduced a caching layer in front of the primary database. Latency spiked whenever the cache invalidated under heavy writes. The cause was a thundering-herd pattern on cold keys. Metrics misreported it as a generic timeout — trace propagation was broken. The fix coalesced concurrent fills, added jittered TTLs, and emitted a dedicated span.

Suppression

See Suppressing diagnostics (English page for now) for the inline and block forms.

See also

structure.sentence-too-long: catches isolated long sentences; this rule catches the streak even when each sentence stays under that ceiling.
Scoring model: rhythm.consecutive-long-sentences carries the default weight 1; the cognitive cost is cumulative, not per sentence.

References

Sweller (1988)
Sweller, Ayres & Kalyuga (2011)

See References for the full bibliography.

`rhythm.repetitive-connectors`

Repetitive connectors.

What this rule flags

Overuse of the same logical connector within a short window of sentences. Connectors (contrast, cause, consequence, sequence, illustration, addition) are attention points; repeated, they flatten the sense of progression. Sanders & Noordman (2000), Connectives as processing signals; Graesser et al. (2004), local cohesion.

At a glance

Category	`rhythm`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (separate lists)
Source	`src/rules/repetitive_connectors.rs`

Detection

Sliding window of window_size sentences. For each connector, count the occurrences within the window. Emit one diagnostic per cluster that exceeds max_per_window.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_per_window`	`int`	4	3	2
`window_size`	`int`	5	5	5
`custom_connectors`	`list`	`[]`	`[]`	`[]`

Default connector lists

🇫🇷 Contrast: cependant, toutefois, en revanche, néanmoins, pourtant, mais
🇫🇷 Cause: parce que, car, puisque, en effet
🇫🇷 Consequence: donc, ainsi, par conséquent, c’est pourquoi
🇫🇷 Sequence: d’abord, ensuite, puis, enfin, premièrement
🇫🇷 Illustration: par exemple, notamment, en particulier
🇫🇷 Addition: de plus, en outre, également, par ailleurs
🇬🇧 Contrast: however, nevertheless, yet, although, but
🇬🇧 Cause: because, since, as, for
🇬🇧 Consequence: therefore, thus, consequently, hence, so
🇬🇧 Sequence: first, then, next, finally
🇬🇧 Illustration: for example, notably, in particular, such as
🇬🇧 Addition: moreover, furthermore, also, additionally
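The sliding-window count might look like the Python sketch below. It is illustrative only: the function name is hypothetical, matching is a plain case-insensitive word-boundary search, and clusters are reported as the highest count seen per connector instead of as positioned diagnostics.

```python
import re


def repeated_connectors(sentences, connectors, window_size=5, max_per_window=3):
    """Return {connector: highest count in any window} for connectors over the cap."""
    flagged = {}
    last_start = max(1, len(sentences) - window_size + 1)
    for start in range(last_start):
        window = " ".join(sentences[start:start + window_size]).lower()
        for connector in connectors:
            count = len(re.findall(rf"\b{re.escape(connector)}\b", window))
            if count > max_per_window:
                flagged[connector] = max(flagged.get(connector, 0), count)
    return flagged


passage = [
    "We analysed the data.",
    "Then we built the model.",
    "Then we validated the results.",
    "Then we published the report.",
    "Then we archived the raw data.",
]
print(repeated_connectors(passage, ["first", "then", "next", "finally"]))
```

With the public cap of 3 the four occurrences of then are reported; with the dev-doc cap of 4 the same passage passes.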

Examples

lucid-lint flags; the rewrite is yours.

French

Five actions, colour-matched across the rewrite: only the connectors change.

Before (flagged):

Nous avons analysé les données. Ensuite nous avons construit le modèle. Ensuite nous avons validé les résultats. Ensuite nous avons publié le rapport. Ensuite nous avons archivé les données brutes.

Four ensuite in five sentences: no felt progression.

What lucid-lint check --profile public reports:

warning input.md:1:1 Connector "ensuite" appears 4 times within 5 consecutive sentences (max 3). Vary the connector or restructure the passage. [rhythm.repetitive-connectors]

After (your rewrite):

Nous avons analysé les données. À partir de là nous avons construit le modèle. La validation a suivi, et dès que les résultats ont tenu nous avons publié le rapport. Les données brutes ont été archivées en dernier.

English

Five actions, colour-matched across the rewrite: only the connectors change.

Before (flagged):

We analysed the data. Then we built the model. Then we validated the results. Then we published the report. Then we archived the raw data.

Four then in five sentences: no felt progression.

What lucid-lint check --profile public reports:

warning input.md:1:1 Connector "then" appears 4 times within 5 consecutive sentences (max 3). Vary the connector or restructure the passage. [rhythm.repetitive-connectors]

After (your rewrite):

We analysed the data. From it we built the model. Validation followed, and once the results held up we published the report. The raw data was archived last.

Suppression

See Suppressing diagnostics (English page for now) for the inline and block forms.

See also

structure.sentence-too-long: long sentences and connector overuse often co-occur; flagging both surfaces a richer rhythm signal.
Scoring model: rhythm.repetitive-connectors carries the default weight 1; the cost is local, not cumulative.

References

Sanders & Noordman (2000)
Graesser et al. (2004)

See References for the full bibliography.

`lexicon.weasel-words`

Weasel words.

What this rule flags

Vague qualifiers that weaken a claim. A weasel word adds invisible cognitive load: the reader must decide whether the assertion matters, is true, or is measurable. References: the Wikipedia style guide (Avoid weasel words), Strunk & White, FALC.

At a glance

Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (separate lists)
Source	`src/rules/weasel_words.rs`

Detection

Word-boundary match against a per-language list. Case-insensitive. One diagnostic per occurrence.

Inline code spans. An occurrence inside `…` is ignored. Wrap a weasel term in backticks when discussing the word itself.
Directional pairings. plutôt que (FR) and rather than (EN) are conjunctions meaning "instead of", not hedges, and are ignored.
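The matching rules can be illustrated with a short Python sketch. It is a simplification and not the code in src/rules/weasel_words.rs: `WEASELS_EN` is a small excerpt of the default English list, inline code is masked with a regular expression, and only the rather than pairing is handled.

```python
import re

WEASELS_EN = ["some", "many", "often", "rather", "quite", "sort of"]  # excerpt
CODE_SPAN = re.compile(r"`[^`]*`")


def find_weasels(text, weasels=WEASELS_EN):
    """Return (offset, word) for each weasel word outside inline code spans."""
    # Blank out inline code with spaces so offsets still match the original text.
    masked = CODE_SPAN.sub(lambda m: " " * len(m.group(0)), text)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in weasels) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(masked):
        word = match.group(0).lower()
        if word == "rather" and re.match(r"\s+than\b", masked[match.end():], re.IGNORECASE):
            continue  # "rather than" is a conjunction, not a hedge
        hits.append((match.start(), word))
    return hits


print(find_weasels("Many users prefer tabs rather than spaces, but `some` disagree quite often."))
```

Masking with spaces instead of deleting the code span keeps each reported offset aligned with the source line.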

Parameters

Key	Type	Default
`custom_weasels_fr`	`list`	`[]`
`custom_weasels_en`	`list`	`[]`
`disable_weasels`	`list`	`[]`

Default lists (v0.1)

🇫🇷 quelques, certains, parfois, plutôt, assez, globalement, généralement, souvent, en général, la plupart, il semble que, il semblerait que, on pourrait dire que, on dit souvent, beaucoup de, peu de, presque, quasiment, environ, à peu près
🇬🇧 some, many, often, just, simply, clearly, obviously, seemingly, arguably, basically, essentially, virtually, various, numerous, sort of, kind of, a bit, rather, quite, fairly, relatively, mostly, generally

Known false positives

Two patterns still fire in v0.2: terms inside straight quotes ("many X" without backticks) and "many X" where X is a concrete noun. Both are tracked under F23 in the roadmap. To opt out of the rule, wrap the quoted term in backticks or use an inline suppression comment.

Suppression

Use an inline suppression comment when the weasel word is intentional (quotation, legitimate reference to a subset, meta-discussion). See Suppressing diagnostics (English page for now).

References

Strunk & White (1999)
CAN-ASC-3.1:2025

See References for the full bibliography.

`lexicon.unexplained-abbreviation`

Unexplained abbreviations.

What this rule flags

Acronyms used without a nearby definition. Every forced interruption to guess or look up an acronym breaks the thread and raises the risk of losing the reader's attention.

References. WCAG 2.1 SC 3.1.4 (Abbreviations); RGAA 9.4.

At a glance

Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (separate whitelists)
Source	`src/rules/unexplained_abbreviation.rs`

Detection (v0.2, two passes, F9)

1. Pre-scan the whole document for acronyms defined in either canonical form:
   - Full expansion (ACRONYM), for example: World Wide Web (WWW)
   - ACRONYM (Full expansion), for example: WWW (World Wide Web)
   The "expansion" side must contain at least two alphabetic words, so that short parenthetical notes such as (TBD) or (to be checked) are not counted as definitions.
2. Match sequences of 2 or more consecutive capital letters (optionally with digits) in the main text.
3. Filter each candidate through three layers, in order:
   1. Defined in the document (from the pre-scan), the strongest layer.
   2. User whitelist [rules.unexplained-abbreviation].whitelist.
   3. Base whitelist (profile-driven).
4. Flag every remaining occurrence.

A single definition anywhere in the document silences every occurrence of the same acronym. This matches how readers actually use documentation: scroll back once to find the expansion, then remember it.
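The two passes can be sketched in Python as follows. The regular expressions and the function name are hypothetical and much looser than a real tokenizer; the sketch only aims to show the order of operations (pre-scan for definitions, then match, then filter through the three layers).

```python
import re

ACRONYM = r"[A-Z]{2,}[A-Z0-9]*"
EXPANSION = r"[A-Za-z]+(?:\s+[A-Za-z]+)+"  # at least two alphabetic words
DEFINITIONS = [
    re.compile(rf"\b{EXPANSION}\s+\(({ACRONYM})\)"),  # World Wide Web (WWW)
    re.compile(rf"\b({ACRONYM})\s+\({EXPANSION}\)"),  # WWW (World Wide Web)
]


def unexplained(text, user_whitelist=(), base_whitelist=(), min_length=2):
    """Return every acronym occurrence that no layer explains."""
    # Pass 1: acronyms defined anywhere in the document.
    defined = {name for pattern in DEFINITIONS for name in pattern.findall(text)}
    allowed = defined | set(user_whitelist) | set(base_whitelist)
    # Pass 2: candidates, filtered by the three layers at once.
    return [
        m.group(0)
        for m in re.finditer(rf"\b{ACRONYM}\b", text)
        if len(m.group(0)) >= min_length and m.group(0) not in allowed
    ]


doc = "The World Wide Web (WWW) runs on HTTP. WWW traffic uses TLS, status (TBD)."
print(unexplained(doc, base_whitelist={"HTTP"}))
```

Merging the three layers into one set loses their precedence order, which does not matter for a yes/no decision but would matter if the diagnostic reported which layer matched.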

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`min_length`	`int`	3	2	2
`whitelist`	`list`	extended	minimal	empty

Default whitelist (v0.2, tightened by F31): the infrastructure stack (URL, HTML, CSS, JSON, XML, HTTP, HTTPS, UTF, IO, API, CLI, GUI, OS, CPU, RAM, SSD, USB, IDE, SDK, CI, CD) plus common FR/EN acronyms and the RFC 2119 emphasis keywords (PDF, SMS, GPS, ID, OK, FAQ, MUST, SHALL, SHOULD, …).

[rules.unexplained-abbreviation]
whitelist = ["WCAG", "ARIA", "ADHD", "LLM"]

User whitelist entries are additive to the base list: they extend it and never replace it.

Suppression

See Suppressing diagnostics (English page for now).

See also

lexicon.jargon-undefined: the equivalent for content words.

References

WCAG 2.1 — 3.1.4
CAN-ASC-3.1:2025

See References for the full bibliography.

`lexicon.low-lexical-diversity`

Low lexical diversity.

What this rule flags

Passages that repeat their content words excessively. Monotonous text loses the reader's attention and often betrays poorly structured thinking. The rule is not an anti-jargon check: technical terms (API, request, cache) are expected to recur; the signal targets non-technical content words.

At a glance

Category	`lexicon`
Default severity	`info`
Default weight	`1`
Languages	EN · FR (separate stop-word lists)
Source	`src/rules/low_lexical_diversity.rs`

Detection

Sliding window of window_size words. Within the window, the rule computes unique_words / total_words over tokens that are neither stop words nor inside code blocks. The diagnostic fires when the ratio drops below min_ratio.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`window_size`	`int`	100	100	80
`min_ratio`	`float`	0.40	0.50	0.55
`use_stoplist`	`bool`	`true`	`true`	`true`
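A minimal Python sketch of the ratio follows, assuming the text has already been split into words and stripped of code blocks. The function name and the handling of short texts (a single partial window) are assumptions made for the illustration.

```python
def lowest_diversity(words, window_size, stop_words=frozenset()):
    """Lowest unique/total ratio of content words over all sliding windows.

    Returns None when no window contains a content word.
    """
    lowered = [w.lower() for w in words]
    ratios = []
    for start in range(max(1, len(lowered) - window_size + 1)):
        window = lowered[start:start + window_size]
        content = [w for w in window if w not in stop_words]
        if content:
            ratios.append(len(set(content)) / len(content))
    return min(ratios) if ratios else None


ratio = lowest_diversity(["cache", "cache", "cache", "hit"], window_size=4)
print(ratio, ratio < 0.55)  # compare against the falc min_ratio
```

A passage is flagged when the returned ratio falls below the profile's min_ratio; the exemption for technical terms mentioned above is not modelled here.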

Suppression

See Suppressing diagnostics (English page for now).

References

Herdan (1960)
McCarthy & Jarvis (2010)
Graesser et al. (2004)

See References for the full bibliography.

`lexicon.excessive-nominalization`

Excessive nominalization.

What this rule flags

Sentences densely packed with nominalizations: verbs turned into abstract nouns. Two problems compound: nominalized text is more abstract (costlier to process) and it hides the agent ("who does what" disappears). FALC and the US Plain Writing Act recommend strong verbs over nominalizations.

At a glance

Category	`lexicon`
Default severity	`warning`
Default weight	`1`
Languages	EN · FR (overlapping suffix lists)
Source	`src/rules/excessive_nominalization.rs`

Detection

Walk the sentence and count the words whose suffix appears in the language's list. The diagnostic fires when the per-sentence count exceeds max_per_sentence.

🇫🇷 Suffixes: -tion, -sion, -ment, -ance, -ence, -age, -ité, -isme, -ure
🇬🇧 Suffixes: -tion, -sion, -ment, -ance, -ence, -ity, -ism, -ness, -al

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_per_sentence`	`int`	4	3	2
`suffixes`	`list`	per-language defaults	per-language defaults	per-language defaults
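The suffix count can be sketched in Python as below. It is a bare illustration with a hypothetical function name: it matches singular forms only and applies no minimum stem length, so it is cruder than a production rule would need to be.

```python
import re

FR_SUFFIXES = ("tion", "sion", "ment", "ance", "ence", "age", "ité", "isme", "ure")
MAX_PER_SENTENCE = {"dev-doc": 4, "public": 3, "falc": 2}


def nominalizations(sentence, suffixes=FR_SUFFIXES):
    """Return the words of a sentence that end with one of the listed suffixes."""
    words = re.findall(r"[^\W\d_]+", sentence.lower())  # runs of letters only
    return [word for word in words if word.endswith(suffixes)]


sentence = ("La réalisation de l’analyse de la conformité permettra "
            "l’identification des axes d’amélioration.")
found = nominalizations(sentence)
print(found, len(found) > MAX_PER_SENTENCE["public"])
```

Pure suffix matching also catches ordinary words such as page or image, which is the kind of over-reach that the false-positive notes for this rule describe.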

Known false positives

Technical vocabulary (function, implementation, configuration) contains many legitimate nominalizations, which justifies the relaxed dev-doc threshold. The English -al suffix is too broad (it flags crucial, horizontal, positional, which are not abstract nouns) and remains tracked under F-excessive-nominalization-suffix-refine in the roadmap.

Example

Nominalizations are colour-coded and paired with the corresponding active verbs in the rewritten version.

Before (heavy):

La réalisation de l’analyse de la conformité permettra l’identification des axes d’amélioration.

After (lighter):

Nous analyserons la conformité. Cela permettra d’identifier les axes à améliorer.

Suppression

See Suppressing diagnostics (English page for now).

References

Plain Language US (2011)
CAN-ASC-3.1:2025

See References for the full bibliography.

`lexicon.jargon-undefined`

Jargon non défini.

Ce que cette règle signale

Les termes spécialisés employés sans définition. Le jargon est contextuel : acceptable entre spécialistes, exclusif autrement. Comme les acronymes, le jargon impose des interruptions de lecture au non-spécialiste ; à la différence des acronymes, ce sont des mots de contenu, pas des séquences en majuscules.

Références. Plain Language (US), FALC, WCAG 2.1 SC 3.1.3 (Mots inhabituels).

En bref


Catégorie	`lexicon`
Sévérité par défaut	`warning`
Poids par défaut	`1`
Langues	EN · FR (listes distinctes par langue et par domaine)
Source	`src/rules/jargon_undefined.rs`

Détection

Plusieurs listes de jargon par domaine sont maintenues (tech, legal, medical, admin).
L’utilisateur active les listes pertinentes via le profil.
Chaque occurrence d’un terme listé est signalée.

Activation par profil

Profil	Listes actives
`dev-doc`	aucune (les développeurs maîtrisent leur propre jargon)
`public`	`tech`, `legal`, `medical`, `admin`
`falc`	`tech`, `legal`, `medical`, `admin`, mode strict
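
Pour fixer les idées, voici une esquisse Python de l’activation par profil et du signalement des occurrences. Elle ne reproduit pas le code Rust : JARGON, ACTIVE_LISTS et find_jargon sont des noms hypothétiques, les listes ne contiennent que quelques termes tirés des listes de départ de cette page, et le « mode strict » de falc n’est pas modélisé.

```python
import re

# Extraits des listes de départ ; les listes réelles sont plus longues.
JARGON = {
    "tech": ["idempotent", "thread-safe", "memoization"],
    "legal": ["force majeure", "nonobstant"],
    "medical": ["nosocomial", "iatrogène"],
    "admin": ["attributaire", "pièces justificatives"],
}
ALL_DOMAINS = ["tech", "legal", "medical", "admin"]
# Listes actives par profil, d'après le tableau ci-dessus.
ACTIVE_LISTS = {"dev-doc": [], "public": ALL_DOMAINS, "falc": ALL_DOMAINS}

def find_jargon(text, profile):
    """Renvoie (position, domaine, terme) pour chaque occurrence d'un terme listé."""
    lowered = text.lower()
    hits = []
    for domain in ACTIVE_LISTS[profile]:
        for term in JARGON[domain]:
            # Frontières de mot des deux côtés, y compris pour les termes composés.
            pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
            for match in re.finditer(pattern, lowered):
                hits.append((match.start(), domain, term))
    return sorted(hits)

sample = "L'appel est idempotent, nonobstant le cache."
print(find_jargon(sample, "public"))
print(find_jargon(sample, "dev-doc"))
```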

Configuration

En v0.2, les listes actives sont fixées par le profil et ne sont pas encore surchargeables depuis lucid-lint.toml. Les surcharges TOML par règle (ajouter des termes de domaine, neutraliser des entrées précises ou activer une combinaison de listes différente de celle du profil) sont suivies sous F126 dans la feuille de route.

Listes de départ par défaut (contributions bienvenues)

Tech : idempotent, orthogonal, deterministic, polymorphic, serialization, deserialization, synchronous, asynchronous, concurrency, thread-safe, side-effect, referential transparency, memoization, currying, hoisting, closure, monad, immutable, stateless, refactoring
Juridique (surtout FR) : apériteur, clause résolutoire, force majeure, cessation de paiement, préjudice subi, onéreux, nonobstant, préalablement, susmentionné, infra, supra, ad hoc, de facto, in fine, subséquemment
Médical : anamnèse, étiologie, pathognomonique, iatrogène, nosocomial, décompensation, récidive, rémission, syndromique
Administratif (surtout FR) : attributaire, solliciter, diligenter, instruction du dossier, pièces justificatives, circulaire, délibération, arrêté préfectoral, transmission des pièces, ayant droit

Neutralisation

Voir Neutralisation des diagnostics (page EN pour l’instant).

Voir aussi

lexicon.unexplained-abbreviation

Références

WCAG 2.1 — 3.1.3
Plain Language US (2011)
CAN-ASC-3.1:2025

Voir Références pour la bibliographie complète.

`lexicon.all-caps-shouting`

Majuscules criardes.

Ce que cette règle signale

Les suites de mots consécutifs en MAJUSCULES.

Le texte tout en majuscules supprime les indices de forme sur lesquels les lecteurs dyslexiques s’appuient pour distinguer les mots :

Ascendantes — les hampes qui montent au-dessus du corps des lettres comme b, d, h, k, l.
Descendantes — les hampes qui descendent sous la ligne de base dans g, p, q, y.
Contraste de hauteur d’x — l’écart entre les lettres courtes comme a, e, o et les hautes comme h, l.

En tout-majuscules, chaque lettre repose sur la même ligne de base et atteint la même hauteur. Le lecteur perd la silhouette du mot et doit décoder lettre à lettre. Le tout-majuscules amène aussi de nombreux lecteurs d’écran à épeler la suite lettre à lettre, sauf indication contraire dans le balisage.

WCAG 3.1.5 et le BDA Dyslexia Style Guide recommandent la minuscule ou la casse de phrase pour l’emphase.

En bref


Catégorie	`lexicon`
Sévérité par défaut	`warning`
Poids par défaut	`1`
Étiquettes de condition	`a11y-markup`, `dyslexia`, `general`
Langues	EN · FR (détection sur le script — agnostique de la langue)
Source	`src/rules/all_caps_shouting.rs`

Détection

Par paragraphe, on cherche les suites de mots consécutifs en MAJUSCULES. Les connecteurs mineurs (,, ;, :, -, espaces) gardent la suite vivante ; un mot en minuscule, un point ou un saut de paragraphe la termine.

Un mot est en MAJUSCULES quand il fait au moins 2 lettres et ne contient aucune minuscule. Les jetons en MAJUSCULES isolés sont traités comme des abréviations et relèvent de lexicon.unexplained-abbreviation.

Les blocs de code sont exclus par le parseur Markdown avant que la règle ne s’exécute.
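
La recherche de suites peut s’esquisser comme suit en Python. Il s’agit d’une illustration sous hypothèses, pas du code de lucid-lint : les noms is_caps et caps_runs sont inventés, tout mot qui n’est pas en MAJUSCULES termine la suite, et les signes ! et ? sont traités comme le point.

```python
import re

# Un jeton est soit un mot (lettres uniquement), soit un terminateur de phrase.
# Virgules, points-virgules, deux-points, traits d'union et espaces ne produisent
# aucun jeton : ils laissent donc la suite en vie.
TOKEN = re.compile(r"[^\W\d_]+|[.!?]")

def is_caps(word):
    """Au moins 2 lettres et aucune minuscule."""
    letters = [c for c in word if c.isalpha()]
    return len(letters) >= 2 and not any(c.islower() for c in letters)

def caps_runs(paragraph, min_run_length=2):
    """Renvoie les suites de mots consécutifs en MAJUSCULES assez longues."""
    runs, current = [], []
    for token in TOKEN.findall(paragraph):
        if is_caps(token):
            current.append(token)
            continue
        # Mot non majuscule ou terminateur : la suite en cours s'arrête ici.
        if len(current) >= min_run_length:
            runs.append(current)
        current = []
    if len(current) >= min_run_length:
        runs.append(current)
    return runs

print(caps_runs("Please DO NOT touch this."))
print(caps_runs("Please DO NOT touch this.", min_run_length=3))
```

Avec min_run_length à 2 ou plus, un jeton isolé comme API ne forme jamais de suite, ce qui le laisse à lexicon.unexplained-abbreviation.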

Paramètres

Clé	Type	`dev-doc`	`public`	`falc`
`min_run_length`	`int`	3	2	2

dev-doc tolère une emphase à 2 mots (DO NOT) courante en documentation technique.

Exemple

lucid-lint signale ; la réécriture vous appartient toujours.

Une seule formule d’emphase, mise en couleur dans la version réécrite — le cri devient une emphase typographique sans perdre l’insistance.

Avant (signalé) :

Please DO NOT touch this.

DO NOT se lit comme un cri.

Ce que lucid-lint check --profile public rapporte :

warning input.md:1:8 2 consecutive ALL-CAPS words read as shouting and degrade legibility for dyslexic readers. Use sentence case and rely on emphasis (italics, bold) or a callout instead. [lexicon.all-caps-shouting]

Après (votre réécriture) :

Please do not touch this.

Faux positifs connus

Une chaîne de trois acronymes ou plus en prose (API HTTP TLS) est structurellement indiscernable d’un cri et déclenchera la règle. Neutraliser sur la ligne si la chaîne est intentionnelle, ou restructurer la phrase.

Neutralisation

Voir Neutralisation des diagnostics (page EN pour l’instant).

Voir aussi

lexicon.unexplained-abbreviation
Conditions (page EN pour l’instant)

Références

Arditi & Cho (2007)
Nielsen Norman Group
Bringhurst (2013)

Voir Références pour la bibliographie complète.

`lexicon.redundant-intensifier`

Intensificateurs redondants.

Ce que cette règle signale

Les intensificateurs — adverbes qui tentent de renforcer la confiance d’une affirmation sans rien y ajouter en information. très important se réduit à important, ou mieux, à une assertion chiffrée. plainlanguage.gov (chapitre 4) et le CDC Clear Communication Index signalent les intensificateurs comme un anti-motif de langue claire.

La règle est le pendant délibéré de lexicon.weasel-words : les mots évasifs affaiblissent la confiance (atténuations, qualifications) ; les intensificateurs redondants la renforcent. Les deux listes sont disjointes par construction.

En bref


Catégorie	`lexicon`
Sévérité par défaut	`warning`
Poids par défaut	`1`
Étiquettes de condition	`general`
Langues	EN · FR
Source	`src/rules/redundant_intensifier.rs`

Détection

Par paragraphe, le texte est mis en minuscules puis chaque intensificateur de la liste par langue (en::INTENSIFIERS, fr::INTENSIFIERS) est cherché via la recherche partagée à frontières de mot. Les hits à l’intérieur d’un span de code (clôturé ou inline) sont ignorés. Les documents dont la langue est Unknown sont ignorés plutôt que devinés, par parallèle avec lexicon.weasel-words.
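
Voici une esquisse Python de cette recherche, donnée à titre indicatif. La liste DEFAULT_FR est un extrait hypothétique (le contenu réel de fr::INTENSIFIERS n’est pas reproduit ici), find_intensifiers est un nom inventé, et seuls les spans de code inline sont masqués.

```python
import re

# Extrait hypothétique : la vraie liste fr::INTENSIFIERS n'est pas reproduite ici.
DEFAULT_FR = ["très", "vraiment", "extrêmement"]

def find_intensifiers(paragraph, custom=(), disable=()):
    """Renvoie (position, intensificateur) pour chaque occurrence hors span de code."""
    disabled = {d.lower() for d in disable}
    active = [w for w in DEFAULT_FR + [c.lower() for c in custom] if w not in disabled]
    # Les spans inline sont remplacés par des espaces pour conserver les positions.
    masked = re.sub(r"`[^`]*`", lambda m: " " * len(m.group()), paragraph.lower())
    hits = []
    for word in active:
        pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
        for match in re.finditer(pattern, masked):
            hits.append((match.start(), word))
    return sorted(hits)

print(find_intensifiers("C'est très important, vraiment."))
print(find_intensifiers("Le mot `très` est un intensificateur."))
```

Le second appel ne renvoie rien : le mot cité entre backticks est masqué avant la recherche.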

Paramètres

Clé	Type	`dev-doc`	`public`	`falc`
`custom_intensifiers_en`	`list<string>`	`[]`	`[]`	`[]`
`custom_intensifiers_fr`	`list<string>`	`[]`	`[]`	`[]`
`disable`	`list<string>`	`[]`	`[]`	`[]`

custom_intensifiers_en / _fr ajoutent des locutions aux défauts. disable retire des locutions de ces défauts (correspondance exacte en minuscules).

Cas connus

très dans la formule figée très bien (comme acquiescement) déclenche tout de même — les guides de langue claire le signalent quand même, et la règle ne taille pas d’exception. Neutraliser via une directive inline si le contexte l’impose vraiment.
Les références métalinguistiques (« le mot ‘très’ est un intensificateur ») déclenchent sauf si le mot cible est entre backticks. Utiliser un span de code inline pour ce genre de référence.

Neutralisation

Voir Neutralisation des diagnostics (page EN pour l’instant).

Voir aussi

lexicon.weasel-words
lexicon.jargon-undefined
Conditions (page EN pour l’instant)

Références

Strunk & White (1999)
Quirk et al. (1985)
Zinsser (2006)

Voir Références pour la bibliographie complète.

`lexicon.consonant-cluster`

Amas consonantiques.

Ce que cette règle signale

Les mots dont la plus longue suite de consonnes consécutives atteint ou dépasse un seuil par profil. Les amas consonantiques denses sont une barrière de décodage connue pour les lecteurs dyslexiques (BDA Dyslexia Style Guide) : le lecteur doit retenir plus de phonèmes en mémoire de travail avant que la voyelle suivante « libère » la syllabe.

Exemples typiques en anglais au seuil public de 5 : strengths (n-g-t-h-s), twelfths (l-f-t-h-s). En revanche, sixths (x-t-h-s) n’aligne que 4 consonnes et ne se déclenche qu’au seuil falc. Exemple typique en français au seuil falc de 4 : constructions (n-s-t-r).

En bref


Catégorie	`lexicon`
Sévérité par défaut	`warning`
Poids par défaut	`1`
Étiquettes de condition	`dyslexia`, `general`
Langues	EN · FR
Source	`src/rules/consonant_cluster.rs`

Détection

Par ligne source, on parcourt le flux de graphèmes une seule fois. Un mot est une suite maximale de caractères alphabétiques ; les traits d’union, apostrophes et espaces ferment le mot (ainsi dys-lexique compte pour deux mots, pas un amas de dix lettres). À l’intérieur d’un mot, on suit la plus longue suite de consonnes consécutives. Un diagnostic est émis par mot dont la plus longue suite atteint min_run_length.

Les voyelles sont sensibles à la langue — les formes accentuées françaises (é, è, ê, à, â, î, ï, ô, ö, ù, û, ü, ÿ, œ, æ) comptent comme des voyelles. Le repli anglais accepte les voyelles latin-1 accentuées courantes pour que les emprunts (café, naïve) soient décodés correctement. Le y est traité comme une voyelle dans toutes les langues (clémence), ce qui évite des faux positifs gênants sur des mots comme fly, rhythm.
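
Le comptage se prête à une courte esquisse Python. Elle n’engage pas l’implémentation Rust : longest_consonant_run et flag_words sont des noms hypothétiques, une seule liste de voyelles sert pour les deux langues, et l’on parcourt des points de code plutôt que de vrais graphèmes.

```python
import re

# Voyelles de base, y compris y, plus les formes accentuées citées ci-dessus.
VOWELS = set("aeiouyéèêàâîïôöùûüÿœæ")

def longest_consonant_run(word):
    """Longueur de la plus longue suite de consonnes consécutives du mot."""
    best = run = 0
    for ch in word.lower():
        run = 0 if ch in VOWELS else run + 1
        best = max(best, run)
    return best

def flag_words(line, min_run_length=5):
    """Mots de la ligne dont la plus longue suite atteint le seuil."""
    # Traits d'union, apostrophes et espaces ferment le mot.
    words = re.findall(r"[^\W\d_]+", line)
    return [w for w in words if longest_consonant_run(w) >= min_run_length]

for word in ("strengths", "twelfths", "sixths", "constructions", "rhythm"):
    print(word, longest_consonant_run(word))
```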

Paramètres

Clé	Type	`dev-doc`	`public`	`falc`
`min_run_length`	`int`	6	5	4

dev-doc est tolérant — la prose technique nomme régulièrement des choses comme strengths ou benchmarks. falc (audience grand public) attrape toute suite de 4 consonnes.

Cas connus

La règle est aveugle à la structure syllabique : elle compte les graphèmes consonantiques bruts, pas les phonèmes. Un mot comme catchphrase (suite de 6 lettres : t-c-h-p-h-r) se lit fluidement pour la plupart des lecteurs parce que tch et ph se lisent chacun comme une seule unité en anglais. Neutraliser via directive inline quand un hit est inévitable.
Agnostique pour tout script alphabétique, mais les listes de voyelles ne sont calibrées que pour les scripts latins. Les mots en cyrillique, grec, arabe, etc., déclencheront probablement dès que le drapeau de langue est en ou fr — en pratique ce contenu sort du périmètre d’un linter bilingue EN/FR.

Neutralisation

Voir Neutralisation des diagnostics (page EN pour l’instant).

Voir aussi

lexicon.all-caps-shouting
readability.score (page EN pour l’instant)
Conditions (page EN pour l’instant)

Références

Seidenberg et al. (1984)
Treiman et al. (2006)

Voir Références pour la bibliographie complète.

`lexicon.homophone-density`

Densité d’homophones trop élevée.

Expérimentale en v0.2.x. Désactivée par défaut ; activez-la via --experimental lexicon.homophone-density ou [experimental] enabled = ["lexicon.homophone-density"] dans lucid-lint.toml. Passe à Stable au moment du tag v0.3 dans le cadre de la cohorte F-experimental-rule-status. Voir Conditions pour les tags dyslexia et aphasia qui gouvernent cette règle selon les conditions actives.

Ce que la règle détecte

Les paragraphes dont la part d’homophones — des mots qui se prononcent pareil mais s’écrivent différemment (their / there / they're, to / too / two, cours / court, amande / amende) — dépasse un pourcentage configurable. Les homophones imposent une double passe : l’oreille reconnaît le mot, l’œil doit ensuite choisir la bonne orthographe via le contexte. Ce détour est anodin isolément, coûteux en grappe. Le guide de la British Dyslexia Association cite les homophones comme un point de friction connu pour la lecture dyslexique, et les recommandations FALC d’orthographe claire conseillent de reformuler les passages denses pour les lecteurs aphasiques et les publics « facile à lire ».

En bref


Catégorie	`lexicon`
Sévérité par défaut	`warning`
Poids par défaut	`1`
Statut	`experimental` (v0.2.x) → `stable` au tag v0.3
Tags de condition	`dyslexia`, `aphasia` (gouvernés ; ne s’exécute qu’avec `--conditions` correspondant)
Langues	EN · FR (listes d’homophones spécifiques à chaque langue)
Source	`src/rules/lexicon/homophone_density.rs`

Détection

Pour chaque paragraphe, parcourt le flux de mots une fois, compte les mots alphabétiques au dénominateur, et compte comme « occurrences » les mots qui apparaissent dans la table d’homophones de la langue. Si occurrences / total dépasse strictement le seuil du profil, émet un diagnostic ancré sur la première ligne du paragraphe. Les paragraphes de moins de 20 mots de contenu sont ignorés — sous ce plancher, un seul homophone produit un pourcentage à deux chiffres trompeur. Le message du diagnostic cite jusqu’à deux exemples d’homophones effectivement rencontrés, pour que la localisation reste le paragraphe mais que les pistes de réécriture soient concrètes.

Les tables d’homophones (HOMOPHONE_GROUPS_EN, HOMOPHONE_GROUPS_FR dans src/language/) privilégient des paires de mots-contenu dont la confusion orthographique altère vraiment le sens. Les homophones-outils français très fréquents (et / est, a / à, ou / où) sont volontairement exclus : ils apparaissent dans presque toutes les phrases et feraient grimper la densité de référence au-dessus de tous les seuils, noyant le signal que la règle veut capter.

Quand la langue détectée est Unknown, la règle n’a pas de table à appliquer et s’abstient silencieusement plutôt que de deviner.
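
Le calcul de densité peut s’esquisser ainsi en Python. La table HOMOPHONES_EN ci-dessous est un extrait hypothétique, choisi pour l’illustration (le contenu réel de HOMOPHONE_GROUPS_EN n’est pas reproduit ici), et les noms homophone_density et is_flagged sont inventés.

```python
import re

# Extrait hypothétique de table ; HOMOPHONE_GROUPS_EN n'est pas reproduite ici.
HOMOPHONES_EN = {"their", "there", "to", "too", "two", "affect", "lose"}

def homophone_density(paragraph, table=HOMOPHONES_EN, min_words=20):
    """Renvoie (densité en %, deux premiers exemples), ou None sous le plancher."""
    words = re.findall(r"[^\W\d_]+", paragraph.lower())
    if len(words) < min_words:
        return None  # paragraphe trop court : ignoré
    hits = [w for w in words if w in table]
    return 100.0 * len(hits) / len(words), hits[:2]

def is_flagged(paragraph, max_density_percent=5.0):
    """Le diagnostic part quand la densité dépasse strictement le seuil."""
    result = homophone_density(paragraph)
    return result is not None and result[0] > max_density_percent

example = ("Their report shows there were too many decisions to make and two teams "
           "could not affect the launch nor lose the schedule despite careful "
           "planning across each region and product line every quarter.")
density, examples = homophone_density(example)
print(round(density, 1), examples, is_flagged(example))
```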

Paramètres

Clé	Type	`dev-doc`	`public`	`falc`
`max_density_percent`	`float`	8.0	5.0	3.0

Pour ajuster via lucid-lint.toml :

[rules."lexicon.homophone-density"]
max_density_percent = 4.0

Exemples

Anglais

Avant (signalé) :

Their report shows there were too many decisions to make and two teams could not affect the launch nor lose the schedule despite careful planning across each region and product line every quarter.

Ce que lucid-lint check --profile public --experimental lexicon.homophone-density --conditions dyslexia rapporte :

warning input.md:1:1 Paragraph density of homophones is 21.2% (7 of 33 content words (e.g. their, there)); maximum 5.0%. Dense homophone runs raise the phonological-decoding load for dyslexic and aphasic readers; rephrase to disambiguate. [lexicon.homophone-density]

Après (réécriture proposée) :

The report shows that the team made many decisions and that the two squads kept the launch on schedule despite careful planning across each region and product line every quarter.

La réécriture remplace their / there / to / too / two par des tournures ancrées dans le contexte (the report, that, the team, kept, the two squads), faisant tomber la densité bien sous le seuil.

Français

Avant (signalé) :

Pendant le cours du matin la cuisinière prépare le foie de veau avant la pause de midi puis revient à sa tâche après avoir rangé les ustensiles sur la grande table en bois clair.

Ce que lucid-lint check --profile public --experimental lexicon.homophone-density --conditions dyslexia rapporte :

warning input.md:1:1 Paragraph density of homophones is 11.8% (4 of 34 content words (e.g. cours, foie)); maximum 5.0%. Dense homophone runs raise the phonological-decoding load for dyslexic and aphasic readers; rephrase to disambiguate. [lexicon.homophone-density]

Après (réécriture proposée) :

Pendant la séance du matin la cuisinière prépare le foie de veau avant la coupure de midi puis reprend son travail après avoir rangé les ustensiles sur la grande table en bois clair.

cours devient séance, pause devient coupure, tâche devient travail — trois des quatre occurrences disparaissent sans perte de sens.

Suppression

Voir Supprimer un diagnostic pour les formes inline et bloc. La directive inline fonctionne sur cette règle :

<!-- lucid-lint disable-next-line lexicon.homophone-density -->
Their report shows there were too many decisions to make and two teams could not lose the launch.

Voir aussi

Conditions — les tags dyslexia et aphasia qui gouvernent cette règle.
F-experimental-rule-status (statut expérimental des règles) : le mécanisme qui permet à cette règle d’arriver en v0.2.x sans affecter les scores par défaut.

Références

British Dyslexia Association — Dyslexia Style Guide (2018). Cite les homophones comme point de friction pour la lecture dyslexique.
FALC — Information pour tous (2009). Recommandations d’orthographe claire pour les publics aphasiques et « facile à lire ».

Voir Références pour la bibliographie complète.

`syntax.passive-voice`

Voix passive.

Ce que cette règle signale

Les constructions à la voix passive. La voix passive masque l’agent et allonge la phrase sans ajouter d’information. Des exceptions légitimes existent (agent inconnu, style scientifique, mise en relief volontaire de l’action) : la règle signale, l’auteur décide.

Références. US Plain Language ; Strunk & White ; FALC.

En bref


Catégorie	`syntax`
Sévérité par défaut	`warning`
Poids par défaut	`2`
Langues	EN · FR (heuristiques distinctes)
Source	`src/rules/passive_voice.rs`

Détection (heuristique v0.1)

🇬🇧 be (conjugué) + participe passé [+ by …]. Gère le -ed régulier et la table des participes irréguliers.
🇫🇷 être (conjugué) + participe passé [+ par …], plus se faire + infinitif. Plus difficile qu’en anglais à cause de l’accord du participe (genre/nombre) et de la confusion avec (a) l’attribut du sujet (il est content vs il est vu) et (b) l’auxiliaire être des temps composés (elle est partie — passé composé, actif).

Précision attendue ~70–80 %. Un remplaçant à base d’analyseur morphosyntaxique est prévu pour un futur greffon lucid-lint-nlp.
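
Pour donner une idée de l’heuristique anglaise, voici une esquisse Python à base d’expression régulière. Ce n’est pas le code de lucid-lint : la table IRREGULAR est un extrait hypothétique et find_passives est un nom inventé. Le dernier appel montre le type de faux positif qui explique la précision annoncée.

```python
import re

BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"
# Extrait hypothétique de la table des participes irréguliers.
IRREGULAR = ["written", "seen", "made", "done", "taken", "given", "built", "sent"]
PARTICIPLE = r"(?:\w+ed|" + "|".join(IRREGULAR) + r")"
# Forme conjuguée de be, puis participe passé régulier (-ed) ou irrégulier.
PASSIVE_EN = re.compile(r"\b" + BE_FORMS + r"\s+" + PARTICIPLE + r"\b", re.IGNORECASE)

def find_passives(sentence):
    """Renvoie les segments « be + participe » trouvés dans la phrase."""
    return [match.group(0) for match in PASSIVE_EN.finditer(sentence)]

print(find_passives("The report was written by the team."))
print(find_passives("The team wrote the report."))
print(find_passives("She is tired."))  # attribut du sujet pris pour un passif
```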

Paramètres

Clé	Type	`dev-doc`	`public`	`falc`
`max_per_paragraph`	`int`	3	1	0
`ignore_scientific_style`	`bool`	`false`	`false`	`false`

Neutralisation

Pour les passives volontaires, utiliser une directive inline. Voir Neutralisation des diagnostics (page EN pour l’instant).

Références

Strunk & White (1999)
Plain Language US (2011)
CAN-ASC-3.1:2025

Voir Références pour la bibliographie complète.

`syntax.unclear-antecedent`

Antécédent flou.

Ce que cette règle signale

Les pronoms dont l’antécédent n’est pas évident dans le contexte immédiat. La référence pronominale ambiguë est l’une des ruptures de compréhension les plus coûteuses pour les lecteurs souffrant de troubles attentionnels : chaque ambiguïté force un retour conscient pour chercher l’antécédent.

Références. Strunk & White ; FALC (« préférer la répétition du nom au pronom ») ; Graesser et al. Coh-Metrix (cohésion référentielle).

En bref


Catégorie	`syntax`
Sévérité par défaut	`info`
Poids par défaut	`2`
Langues	EN · FR (listes de pronoms distinctes)
Source	`src/rules/unclear_antecedent.rs`

Détection (heuristique v0.1)

La détection exacte demande une résolution d’anaphore (problème avancé de traitement automatique du langage). v0.1 attrape les deux motifs les plus fréquents :

Pronoms démonstratifs en début de phrase (This/That/These/Those, Ceci/Cela/Ce) non suivis d’un nom.
Pronoms personnels en début de paragraphe (aucun antécédent dans le contexte précédent).

La sévérité est info parce que l’heuristique est approximative — le niveau de bruit justifie une sévérité douce.
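
Le premier motif peut s’approcher par l’esquisse Python suivante. Elle repose sur une simplification forte, qui n’est pas forcément celle de lucid-lint : faute d’étiquetage grammatical, « non suivi d’un nom » est remplacé par « suivi d’un verbe d’une petite liste ». Les listes VERBS et le nom bare_demonstratives sont hypothétiques, et le second motif (pronom personnel en début de paragraphe) n’est pas couvert.

```python
import re

DEMONSTRATIVES = {
    "en": {"this", "that", "these", "those"},
    "fr": {"ceci", "cela", "ce"},
}
# Listes hypothétiques de verbes fréquents juste après un démonstratif employé seul.
VERBS = {
    "en": {"is", "are", "was", "were", "has", "have", "means"},
    "fr": {"a", "est", "sont", "était", "signifie"},
}

def bare_demonstratives(text, lang="en"):
    """Renvoie les phrases qui commencent par un démonstratif suivi d'un verbe."""
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[^\W\d_]+", sentence.lower())
        if len(words) >= 2 and words[0] in DEMONSTRATIVES[lang] and words[1] in VERBS[lang]:
            flagged.append(sentence)
    return flagged

text_fr = ("Les performances étaient médiocres avec le cache LRU. "
           "Cela a motivé le changement.")
print(bare_demonstratives(text_fr, lang="fr"))
print(bare_demonstratives("This approach works."))
```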

Paramètres

Clé	Type	Défaut
`check_demonstratives`	`bool`	`true`
`check_paragraph_start_pronouns`	`bool`	`true`

Listes de pronoms

🇫🇷 ce, cela, ceci, ça, celui-ci, celle-ci, il, elle, ils, elles
🇬🇧 this, that, these, those, it, they, them

Exemple

Les performances étaient médiocres avec le cache LRU. Cela a motivé le changement.

À quoi renvoie cela ? Aux performances ? Au cache ? Ambigu.

Neutralisation

Voir Neutralisation des diagnostics (page EN pour l’instant).

Références

Strunk & White (1999)
Gibson (1998)
Graesser et al. (2004)

Voir Références pour la bibliographie complète.

`syntax.nested-negation`

Négations imbriquées.

Ce que cette règle signale

Les phrases qui empilent plusieurs négations. Deux négations ou plus dans une même phrase forcent le lecteur à basculer mentalement les valeurs de vérité. La charge est connue pour les lecteurs aphasiques et ceux qui souffrent d’un trouble du déficit de l’attention (TDAH). Le coût se multiplie sous pression cognitive. Les guides de langage clair (FALC, CDC Clear Communication Index, plainlanguage.gov) recommandent de réécrire les doubles négations au positif.

En bref


Catégorie	`syntax`
Sévérité par défaut	`warning`
Poids par défaut	`2`
Étiquettes de condition	`aphasia`, `adhd`, `general`
Langues	EN · FR (comptage spécifique par langue)
Source	`src/rules/nested_negation.rs`

Détection

On compte les négations par phrase ; on signale les phrases dont le compte dépasse max_negations.

Anglais : somme des correspondances délimitées par mot contre la liste de négations de la langue (not, no, never, none, nothing, nobody, nowhere, neither, nor, cannot, without) plus les occurrences du suffixe contracté n't (don't, won't, isn't, doesn't, …).
Français : comptage bipartite par paires. Chaque clitique « ne » / « n' » contribue pour une négation et s’apparie à la particule de seconde position la plus proche (pas, rien, jamais, plus, personne, aucun, aucune, guère, nulle part) dans une fenêtre courte ; l’appariement consomme la particule pour éviter le double comptage. Dans une phrase qui contient « ne », chaque particule non appariée contribue pour une négation de plus, ce qui attrape les formes comme rien employé en sujet nominal négatif. Garde-fous : pas / plus ne comptent jamais sans appariement (trop ambigus en dehors de ne …) ; rien précédé de de est traité comme l’idiome de rien et ignoré ; dans une phrase sans clitique « ne », les particules sont également ignorées (plus de courage, personne d'autre). Les autonomes sans / non comptent toujours.
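
Les deux comptages peuvent s’esquisser ainsi en Python. C’est une lecture possible de la description, pas le code Rust : les noms de fonctions sont inventés, la taille de la fenêtre (4 jetons) est une hypothèse, et la particule n’est cherchée qu’après le clitique.

```python
import re

NEG_EN = {"not", "no", "never", "none", "nothing", "nobody", "nowhere",
          "neither", "nor", "cannot", "without"}
PARTICLES_FR = {"pas", "rien", "jamais", "plus", "personne", "aucun", "aucune",
                "guère", "nulle_part"}

def count_negations_en(sentence):
    tokens = re.findall(r"[a-z']+", sentence.lower().replace("\u2019", "'"))
    return sum(t in NEG_EN for t in tokens) + sum(t.endswith("n't") for t in tokens)

def count_negations_fr(sentence, window=4):
    text = sentence.lower().replace("\u2019", "'").replace("nulle part", "nulle_part")
    tokens = re.findall(r"\bn'|[^\W\d]+", text)
    count, used, has_ne = 0, set(), False
    for i, tok in enumerate(tokens):
        if tok in ("ne", "n'"):
            has_ne = True
            count += 1
            # On consomme la particule la plus proche dans la fenêtre.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                if tokens[j] in PARTICLES_FR and j not in used:
                    used.add(j)
                    break
    for j, tok in enumerate(tokens):
        if tok in ("sans", "non"):
            count += 1  # les autonomes comptent toujours
        elif has_ne and tok in PARTICLES_FR and j not in used:
            if tok in ("pas", "plus"):
                continue  # jamais comptés sans appariement
            if tok == "rien" and j > 0 and tokens[j - 1] == "de":
                continue  # idiome « de rien »
            count += 1
    return count

print(count_negations_en("We do not say nothing is never possible."))
print(count_negations_fr("Nous ne disons pas que rien n'est jamais possible."))
```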

Paramètres

Clé	Type	`dev-doc`	`public`	`falc`
`max_negations`	`int`	3	2	1

Exemples

lucid-lint signale ; la réécriture reste à l’auteur.

Français

Passe sous public :

Nous ne sommes pas prêts.

Le bipartite ne … pas compte pour une négation.

Avant (signalée) :

Nous ne disons pas que rien n’est jamais possible.

Trois négations : ne…pas (un bipartite), rien (non apparié), n'…jamais (un bipartite).

Ce que rapporte lucid-lint check --profile public :

warning input.md:1:1 Sentence stacks 3 negations (maximum 2). Rewrite as a positive statement or split the negations across separate sentences. [syntax.nested-negation]

Après (votre réécriture) :

Nous disons que quelque chose est possible.

Anglais

Trois négations → trois affirmations, teintes assorties d’un bout à l’autre de la réécriture. Le not disparaît simplement — la simplification se voit.

Avant (signalée) :

We do not say nothing is never possible.

Trois négations (not, nothing, never).

Après :

We say something is possible.

Neutralisation

Voir Neutralisation des diagnostics (page EN pour l’instant).

Voir aussi

syntax.passive-voice
structure.deep-subordination
Conditions (page EN pour l’instant)

Références

Clark & Chase (1972)
Carpenter & Just (1975)
Kaup et al. (2006)

Voir Références pour la bibliographie complète.

`syntax.conditional-stacking`

Empilement de conditions.

Ce que cette règle signale

Les phrases qui enchaînent plusieurs propositions conditionnelles. Chaque if / when / unless / quand / si ouvre une branche que le lecteur doit garder en pile mentale jusqu’à la résolution de la proposition englobante. Deux ou trois empilées dans une même phrase forment un multiplicateur de charge connu. L’effet touche les lecteurs avec aphasie, trouble du déficit de l’attention (TDAH) et toute personne sous pression cognitive. Les guides de langage clair (FALC, plainlanguage.gov) recommandent de scinder les chaînes conditionnelles en phrases distinctes ou en liste à puces.

En bref


Catégorie	`syntax`
Sévérité par défaut	`warning`
Poids par défaut	`2`
Étiquettes de condition	`aphasia`, `adhd`, `general`
Langues	EN · FR (listes spécifiques par langue)
Source	`src/rules/conditional_stacking.rs`

Détection

Par phrase, on compte les connecteurs conditionnels et on signale les comptes au-dessus de max_conditionals.

Anglais — somme des correspondances délimitées par mot contre la liste de langue (if, unless, when, whenever, while, until, provided, assuming, in case, as long as, as soon as, even if, only if).
Français — somme des correspondances délimitées par mot contre la liste de langue (si, sauf si, à moins que, à moins de, quand, lorsque, lorsqu', dès que, tant que, pourvu que, à condition que, à condition de, au cas où, même si, en cas de) plus les clitiques élidés s'il / s'ils.
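
Une esquisse Python du comptage français, à lire comme une illustration. Un point n’est pas précisé par la description et relève ici d’un choix : les locutions longues sont essayées en premier, de sorte que « même si » compte pour un seul connecteur et non deux. Le nom count_conditionals est hypothétique.

```python
import re

# Liste française reprise ci-dessus, clitiques élidés compris.
CONNECTORS_FR = ["si", "sauf si", "à moins que", "à moins de", "quand", "lorsque",
                 "lorsqu'", "dès que", "tant que", "pourvu que", "à condition que",
                 "à condition de", "au cas où", "même si", "en cas de", "s'il", "s'ils"]

def count_conditionals(sentence, connectors=CONNECTORS_FR):
    """Compte les connecteurs conditionnels, sans recouvrement."""
    text = sentence.lower().replace("\u2019", "'")
    # Les plus longs d'abord, pour qu'une locution l'emporte sur ses sous-parties.
    ordered = sorted(connectors, key=len, reverse=True)
    # Pas de frontière finale après une forme élidée : le mot suivant y est collé.
    parts = [re.escape(c) + ("" if c.endswith("'") else r"(?!\w)") for c in ordered]
    pattern = r"(?<!\w)(?:" + "|".join(parts) + ")"
    return len(re.findall(pattern, text))

sentence = ("Si nous expédions, quand le test passe, "
            "à moins que la barrière échoue, nous déployons.")
print(count_conditionals(sentence))
```

Cette esquisse compte aussi un « si » adverbial (si grand), ce qu’une simple recherche par liste ne permet pas d’éviter.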

Paramètres

Clé	Type	`dev-doc`	`public`	`falc`
`max_conditionals`	`int`	3	2	1

Exemples

Trois conditions, teintes assorties d’un bout à l’autre de la réécriture — la position les appariait déjà, la couleur confirme que la réécriture conserve chaque branche. lucid-lint signale ; la réécriture reste à l’auteur.

Français

Avant (signalée) :

Si nous expédions, quand le test passe, à moins que la barrière échoue, nous déployons.

Trois connecteurs conditionnels (si, quand, à moins que).

Ce que rapporte lucid-lint check --profile public :

warning input.md:1:1 Sentence stacks 3 conditional clauses (maximum 2). Split the conditions across separate sentences or convert them to a bullet list. [syntax.conditional-stacking]

Après (votre réécriture) :

Nous déployons quand les trois conditions tiennent :

la commande d’expédition a tourné,

le test passe,

la barrière n’échoue pas.

Anglais

Avant (signalée) :

If we ship, when the build passes, unless the gate fails, we deploy.

Après :

We deploy when all three checks hold:

the ship command ran,

the build passes,

the gate does not fail.

Faux positifs connus

La liste anglaise mêle des conditionnels purs avec des conjonctions temporelles (when, while) qui peuvent introduire des sous-propositions à valeur conditionnelle. Un usage purement temporel peut produire un faux positif sur des phrases longues. Utiliser disable-next-line (page EN pour l’instant) quand la lecture temporelle est sans ambiguïté.

Neutralisation

Voir Neutralisation des diagnostics (page EN pour l’instant).

Voir aussi

syntax.nested-negation
structure.deep-subordination
Conditions (page EN pour l’instant)

Références

Johnson-Laird & Byrne (1991)
Evans & Over (2004)
Gibson (1998)

Voir Références pour la bibliographie complète.

`syntax.dense-punctuation-burst`

Rafale de ponctuation.

Ce que cette règle signale

Des rafales locales de ponctuation : une fenêtre glissante de graphèmes qui contient trop de signes qualifiants (,, ;, :, —, –). Les amas serrés de signes indiquent une subordination empilée, des incises parenthétiques ou des listes dans des listes. Ce sont des constructions difficiles à analyser pour les lecteurs souffrant de troubles cognitifs ou attentionnels (lignes directrices IFLA pour les textes faciles à lire).

À distinguer de structure.excessive-commas, qui compte les virgules sur une phrase entière. Une phrase avec 8 virgules réparties sur 200 caractères ne déclenche pas ici, alors qu’une phrase avec 3 virgules dans 30 caractères déclenche.

En bref


Catégorie	`syntax`
Sévérité par défaut	`warning`
Poids par défaut	`1`
Étiquettes de condition	`general`
Langues	EN · FR (agnostique au script)
Source	`src/rules/dense_punctuation_burst.rs`

Détection

Par ligne source, on parcourt le flux de graphèmes une fois et on recense la colonne de chaque signe qualifiant. Quand une fenêtre de window_graphemes graphèmes contient min_marks signes ou plus, on émet une rafale qui couvre du premier au dernier signe de la fenêtre. Puis on avance au-delà de ce dernier signe pour éviter que les fenêtres recouvrantes ne tirent deux fois sur le même amas.

Les blocs de code (fencés et indentés) sont exclus en amont par l’analyseur Markdown. Les terminateurs de phrase (., !, ?) et les parenthèses ne comptent pas dans la rafale.

Paramètres

Clé	Type	`dev-doc`	`public`	`falc`
`min_marks`	`int`	4	3	3
`window_graphemes`	`int`	30	30	40

dev-doc tolère un amas de 3 signes — typique des listes techniques au contact de la prose. FALC garde le même seuil de densité que public mais élargit la fenêtre pour attraper des rafales un peu plus lâches.

Cas connus

La règle opère par ligne source. Une rafale qui chevauche un saut de ligne dur en source n’est pas détectée ; en pratique c’est rare, car la ponctuation dense est aussi dense en octets source.
Le tiret cadratin (—, U+2014) et le tiret demi-cadratin (–, U+2013) qualifient ; le succédané ASCII à double trait (--) non, sous l’hypothèse que les auteurs soucieux de lisibilité utilisent les bonnes formes Unicode.

Neutralisation

Voir Neutralisation des diagnostics (page EN pour l’instant).

Voir aussi

structure.excessive-commas
structure.sentence-too-long
Conditions (page EN pour l’instant)

Références

Sweller (1988)
Gibson (1998)

Voir Références pour la bibliographie complète.

`syntax.parenthetical-depth`

Expérimentale en v0.2.x. Désactivée par défaut ; activée via --experimental syntax.parenthetical-depth ou [experimental] enabled = ["syntax.parenthetical-depth"] dans lucid-lint.toml. Passe à Stable à la coupe v0.3 dans le cadre de la cohorte F-experimental-rule-status. Voir Conditions pour les étiquettes adhd et general.

Ce que la règle signale

Une phrase dont la profondeur d’imbrication maximale entre crochets équilibrés () et [] atteint le seuil du profil. Les parenthèses empilées obligent la lectrice à garder en mémoire plusieurs idées suspendues à la fois — un signal reconnu de « phrase difficile » dans la tradition plainlanguage.gov et Hemingway, et un coût particulier pour les lectrices avec TDAH, qui portent en premier la charge en mémoire de travail.

La règle complète structure.excessive-commas, qui ignore déjà les énumérations plates (A, B, C) à profondeur 1. Cette règle-ci ne se déclenche qu’à partir de la profondeur 2 ; les deux règles sont mécaniquement orthogonales.

En un coup d’œil


Catégorie	`syntax`
Sévérité par défaut	`warning`
Poids par défaut	`1`
Statut	`experimental` (v0.2.x) → `stable` à la coupe v0.3
Étiquettes de condition	`adhd`, `general` (filtrées : exécutée seulement si `--conditions` correspond)
Langues	EN · FR (indépendant de la langue — les familles de crochets sont identiques)
Source	`src/rules/syntax/parenthetical_depth.rs`

Détection

Pour chaque phrase, la règle parcourt le texte du paragraphe une fois aplati par le parseur (les blocs de code sont donc déjà exclus en amont) et tient un seul compteur de profondeur courante.

Algorithme

Parcourir la phrase un caractère à la fois.
Incrémenter la profondeur sur ( ou [ ; décrémenter sur ) ou ].
Une fermeture qui ferait passer la profondeur sous zéro la remet à zéro — la règle reste tolérante face à un balisage déséquilibré, comme le fait l’aide parenthesised_list_comma_count utilisée par structure.excessive-commas.
Suivre la profondeur maximale atteinte et la position du crochet ouvrant qui l’a atteinte.
Émettre un diagnostic par phrase quand max_depth ≥ le seuil du profil, ancré sur le crochet ouvrant le plus profond.
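
Les étapes ci-dessus tiennent en quelques lignes de Python. L’esquisse suit la description, sans prétendre reproduire le code Rust : max_bracket_depth et is_flagged sont des noms hypothétiques, et la position renvoyée est un indice à partir de zéro dans la phrase.

```python
def max_bracket_depth(sentence):
    """Renvoie (profondeur maximale, indice du crochet ouvrant qui l'a atteinte)."""
    depth = best = 0
    best_pos = None
    for i, ch in enumerate(sentence):
        if ch in "([":
            depth += 1
            if depth > best:
                best, best_pos = depth, i
        elif ch in ")]":
            # Une fermeture isolée ne fait pas passer la profondeur sous zéro.
            depth = max(0, depth - 1)
    return best, best_pos

def is_flagged(sentence, max_depth=3):
    """Le seuil est inclusif : la règle part dès que la profondeur l'atteint."""
    return max_bracket_depth(sentence)[0] >= max_depth

before = ("The migration tool (which now supports rollbacks "
          "(see --reverse, added in 0.4.2 [tracked in #312])) is opt-in.")
after = ("The migration tool is opt-in. It now supports rollbacks via --reverse, "
         "added in 0.4.2 (tracked in #312).")
print(max_bracket_depth(before)[0], is_flagged(before))
print(max_bracket_depth(after)[0], is_flagged(after))
```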

Exclusions (garde-fous contre les faux positifs)

Spans / blocs de code : déjà exclus en amont par le parseur Markdown.
Crochets déséquilibrés : la remise à zéro empêche les fermetures isolées de gonfler une profondeur ultérieure.

Reporté (hors MVP)

Les paires de tirets longs (— … —), les accolades ({}) et les appositions encadrées par des virgules sont volontairement hors scope en v0.2.x. Détecter une paire de tirets longs est fragile (confusion entre tirets demi-cadratin / cadratin, ambiguïté avec le trait d’union) et ramènerait du périmètre par la fenêtre.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_depth`	`int`	4	3	2

max_depth is the inclusive nesting depth at which the rule fires. A sentence whose deepest bracket stays one level below it remains silent.

Set it in lucid-lint.toml:

[rules."syntax.parenthetical-depth"]
max_depth = 3

Examples

English

Before (flagged):

The migration tool (which now supports rollbacks (see --reverse, added in 0.4.2 [tracked in #312])) is opt-in.

What lucid-lint check --profile public --experimental syntax.parenthetical-depth --conditions adhd reports:

warning input.md:1:21 Nested parentheticals reach depth 3; readers must hold 3 suspended thoughts to reach the close. Split the sentence or unnest the inner bracket (plainlanguage.gov, Hemingway). [syntax.parenthetical-depth]

After (suggested rewrite):

The migration tool is opt-in. It now supports rollbacks via --reverse, added in 0.4.2 (tracked in #312).

The two outer parentheticals are gone; only one flat parenthesis at depth 1 remains. The reader no longer has to stack three suspended thoughts to reach the full stop.

French

Before (flagged):

Le module (qui dépend du noyau (chargé au démarrage [voir le manuel])) est facultatif.

What lucid-lint check --profile public --experimental syntax.parenthetical-depth --conditions adhd reports:

warning input.md:1:23 Nested parentheticals reach depth 3; readers must hold 3 suspended thoughts to reach the close. Split the sentence or unnest the inner bracket (plainlanguage.gov, Hemingway). [syntax.parenthetical-depth]

After (suggested rewrite):

Le module est facultatif. Il dépend du noyau, chargé au démarrage. Voir le manuel pour les détails.

Three sentences, no nested brackets. The dependency chain is now linear, and the reader picks up each fact in the order it appears.

Suppression

See Suppressing diagnostics for the inline and block forms. Inline disabling also works on this rule:

<!-- lucid-lint disable-next-line syntax.parenthetical-depth -->
The migration tool (which now supports rollbacks (see `--reverse`, added in 0.4.2 [tracked in #312])) is opt-in.

See also

Conditions: the adhd and general tags that gate this rule.
structure.excessive-commas: sibling rule on flat parenthesised enumerations. Atomic split: excessive-commas ignores (A, B, C) lists at depth 1; this rule fires only from depth 2.
syntax.dense-punctuation-burst: sibling rule on local punctuation density. Both rules flag hard-to-parse sentences, from two different angles.
F-experimental-rule-status: experimental rule status, the substrate that lets this rule ship in v0.2.x without affecting default scores.

References

plainlanguage.gov, Write short sentences. Plain-language guidance treats stacked qualifiers and nested parentheses as the canonical symptom of the "too-long sentence".
Hemingway editing tradition: surfaces "hard to read" sentences when they layer several suspended ideas; nested parentheses are the cleanest mechanical reading of that signal.

See References for the full bibliography.

`readability.score`

Readability score.

What this rule flags

A document-level readability index. Readability formulas are the historical synthetic signal of text complexity: simple, reproducible, and recognised by US/UK government guidance and by WCAG. Treat it like cyclomatic complexity: a metric first, a warning second.

At a glance


Category	`readability`
Default severity	`info` (always reported) · `warning` when above `max_grade_level`
Default weight	`5`
Languages	EN: Flesch-Kincaid · FR: Kandel-Moles (auto-selected from the detected language; v0.2+)
Source	`src/rules/readability_score.rs`

Detection (v0.2: per-language formula)

The formula is selected from the document's detected language:

English, Flesch-Kincaid Grade Level:

0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59

The result is a US school grade level. It is compared directly with max_grade_level.

French, Kandel & Moles (1958):

207 − 1.015 × (words / sentences) − 73.6 × (syllables / words)

The result is a reading-ease score, typically in 0..100 (higher = easier), in the style of Flesch. To stay comparable across languages, the rule converts it to a grade-level equivalent with the standard linear approximation (100 − score) / 10, and compares that level with max_grade_level. The diagnostic message reports both the native ease score and the grade-level equivalent.

Unknown language: falls back to Flesch-Kincaid.
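
The formula selection above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `src/rules/readability_score.rs`: `TextCounts`, `Language` and the function names are hypothetical, and word, sentence and syllable counting is assumed to happen elsewhere and to yield non-zero counts.

```rust
/// Detected document language. (Hypothetical enum, for illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    En,
    Fr,
    Unknown,
}

/// Raw counts for one document. Assumed non-zero: this sketch does not guard
/// against division by zero.
#[derive(Debug, Clone, Copy)]
pub struct TextCounts {
    pub words: u32,
    pub sentences: u32,
    pub syllables: u32,
}

/// Flesch-Kincaid Grade Level: a US school grade.
pub fn flesch_kincaid_grade(c: TextCounts) -> f64 {
    let words_per_sentence = f64::from(c.words) / f64::from(c.sentences);
    let syllables_per_word = f64::from(c.syllables) / f64::from(c.words);
    0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
}

/// Kandel & Moles reading ease: higher means easier.
pub fn kandel_moles_ease(c: TextCounts) -> f64 {
    let words_per_sentence = f64::from(c.words) / f64::from(c.sentences);
    let syllables_per_word = f64::from(c.syllables) / f64::from(c.words);
    207.0 - 1.015 * words_per_sentence - 73.6 * syllables_per_word
}

/// Linear conversion from an ease score to a grade-level equivalent.
pub fn ease_to_grade(ease: f64) -> f64 {
    (100.0 - ease) / 10.0
}

/// Grade level compared with max_grade_level; unknown languages fall back
/// to Flesch-Kincaid.
pub fn grade_level(language: Language, c: TextCounts) -> f64 {
    match language {
        Language::Fr => ease_to_grade(kandel_moles_ease(c)),
        Language::En | Language::Unknown => flesch_kincaid_grade(c),
    }
}
```

As a worked example, 100 words in 5 sentences with 150 syllables give 20 words per sentence and 1.5 syllables per word. Flesch-Kincaid yields 7.8 + 17.7 − 15.59 = 9.91; Kandel-Moles yields 207 − 20.3 − 110.4 = 76.3, which converts to a grade equivalent of 2.37.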

Grade	School equivalent (FR)
< 6	Primaire (primary school)
6–9	Collège (lower secondary)
9–12	Lycée (upper secondary)
12–16	Higher education
> 16	Expert

Other formulas (Gunning Fog, SMOG, Dale-Chall, Scolarius) and a multi-formula --readability-verbose report remain on the roadmap.

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`max_grade_level`	`float`	14	9	6
`always_report`	`bool`	`true`	`true`	`true`
`formula`	`auto` \| `flesch-kincaid` \| `kandel-moles`	`auto`	`auto`	`auto`

formula can be overridden with --readability-formula on the CLI; auto follows the detected language, and the other values pin the formula.

Output modes

Always reported as info (for observability, even below the threshold).
Reported as warning when the grade level exceeds max_grade_level.

Suppression

Suppressing a document-level metric is rarely the right answer; adjust max_grade_level in lucid-lint.toml instead. See Configuration (English page only for now).

Références

Flesch (1948)
Kincaid et al. (1975)
CAN-ASC-3.1:2025

See References for the full bibliography.

`readability.large-number-unanchored`

Experimental in v0.2.x. Off by default; enable it with --experimental readability.large-number-unanchored or [experimental] enabled = ["readability.large-number-unanchored"] in lucid-lint.toml. Moves to Stable at the v0.3 cut as part of the F-experimental-rule-status cohort. See Conditions for the dyscalculia and general tags.

What the rule flags

A large number or magnitude word that appears in a sentence with no nearby anchor: no unit, no percentage, no currency symbol, no ratio, no comparison phrase. The CDC Clear Communication Index asks whether numbers are clear and useful for the intended audience. plainlanguage.gov is more direct about the mechanism: "Use Numbers Effectively" recommends pairing each large number with a comparison or denominator the reader can place. Readers with dyscalculia bear this cost first: an out-of-context "4,8 milliards" (4.8 billion) forces a blind order-of-magnitude estimate, where ordinary prose usually supplies footholds.

This rule complements structure.number-run, which fires on numeric clusters (≥ N tokens in a single sentence). This rule fires on a single unanchored large number or magnitude word.

At a glance


Category	`readability`
Default severity	`warning`
Default weight	`1`
Status	`experimental` (v0.2.x) → `stable` at the v0.3 cut
Condition tags	`dyscalculia`, `general` (filtered: runs only when `--conditions` matches)
Languages	EN · FR (per-language lexicons of comparators and figure/page references)
Source	`src/rules/readability/large_number_unanchored.rs`

Detection

For each sentence, the rule walks the paragraph text (post-flattening, so fenced code blocks are already excluded by the parser) and looks for unanchored candidates.

Candidate definition

A sentence-level candidate is one of:

A numeric token with ≥ 4 digits and an integer value ≥ the profile threshold. The scanner folds common thousands separators (,, ., ASCII space, NBSP, thin space, narrow NBSP) between digit groups, so 1 000 (FR) and 1,000 (EN) both count as a single 4-digit token with value 1000.
A magnitude word: million(s), milliard(s), billion(s), trillion(s) in FR; million(s), billion(s), trillion(s) in EN. Whole word, case-insensitive.

Filters (false-positive guards)

Year shape: exactly 4 contiguous digits, with no thousands or decimal separator, and a value in 1000..=2999. 2024 and 1789 are years, not magnitudes.
Ordinal: a digit run immediately followed by a letter (1st, 12th).
Figure / page / section reference: a candidate preceded (within 16 bytes, same sentence) by figure, page, section, tableau, chapitre, annexe, §, p., pp., n°, #, or the EN equivalents.
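
The numeric-candidate test and the first two filters can be sketched as follows. This is a simplified illustration, not the scanner in `src/rules/readability/large_number_unanchored.rs`: the names are hypothetical, the input is assumed to start at the first digit of a token, a separator is folded only when exactly three digits follow it, and decimal parts and the figure/page filter are left out.

```rust
/// Separators folded between digit groups: comma, dot, ASCII space, NBSP,
/// thin space, narrow NBSP.
const SEPARATORS: [char; 6] = [',', '.', ' ', '\u{00A0}', '\u{2009}', '\u{202F}'];

/// A folded numeric token. (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq, Eq)]
pub struct NumericToken {
    pub digits: usize,
    pub value: u64,
    pub had_separator: bool,
    pub consumed_chars: usize,
}

/// Reads a numeric token at the start of `text`, folding "<separator><3 digits>" groups.
pub fn fold_numeric_token(text: &str) -> Option<NumericToken> {
    let chars: Vec<char> = text.chars().collect();
    let mut digits = String::new();
    let mut i = 0;
    while i < chars.len() && chars[i].is_ascii_digit() {
        digits.push(chars[i]);
        i += 1;
    }
    if digits.is_empty() {
        return None;
    }
    let mut had_separator = false;
    // A group is folded only when exactly three digits follow the separator.
    while i + 3 < chars.len()
        && SEPARATORS.contains(&chars[i])
        && chars[i + 1..=i + 3].iter().all(|c| c.is_ascii_digit())
        && !chars.get(i + 4).map_or(false, |c| c.is_ascii_digit())
    {
        digits.extend(&chars[i + 1..=i + 3]);
        had_separator = true;
        i += 4;
    }
    let value = digits.parse::<u64>().ok()?; // None on overflow
    Some(NumericToken { digits: digits.len(), value, had_separator, consumed_chars: i })
}

/// Year shape: four contiguous digits, no separator, value in 1000..=2999.
pub fn is_year_shape(token: &NumericToken) -> bool {
    token.digits == 4 && !token.had_separator && (1000..=2999).contains(&token.value)
}

/// Ordinal: the digit run is immediately followed by a letter.
pub fn is_ordinal(text: &str, token: &NumericToken) -> bool {
    text.chars().nth(token.consumed_chars).map_or(false, |c| c.is_alphabetic())
}

/// Candidate: at least 4 digits, value at or above `min_value`, not filtered out.
pub fn is_candidate(text: &str, min_value: u64) -> bool {
    match fold_numeric_token(text) {
        Some(token) => {
            token.digits >= 4
                && token.value >= min_value
                && !is_year_shape(&token)
                && !is_ordinal(text, &token)
        }
        None => false,
    }
}
```

Under these assumptions, `1,000` and `1 000` both fold to a 4-digit token of value 1000, while `2024` is dropped as a year and `12000th` as an ordinal.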

Anchor types (sentence level)

Any one of the items below, anywhere in the sentence, anchors every candidate in the sentence:

Percent sign (%).
Currency symbol (€, $, £, ¥).
Unit token from a small curated list (km, kg, m², °C, L, Hz, Mo, …).
Ratio pattern: X sur Y, X out of Y, or X / Y between digits.
Comparator phrase from the per-language lexicon (FR: soit environ, équivalent à, environ, plus de, par rapport à, …; EN: roughly, approximately, more than, the size of, …).

The diagnostic position points at the first surviving candidate in the offending sentence, so the highlight lands on the visible number rather than on the start of the sentence.
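
The anchor check can be sketched as a sentence-level predicate. This is a minimal illustration, not the rule's implementation: the function names are hypothetical, the unit and comparator lists contain only the entries quoted above (the real lexicons are longer), and ratios are recognised only when the numbers and the separator are whitespace-delimited tokens.

```rust
const CURRENCY: [char; 4] = ['€', '$', '£', '¥'];
// Only the entries quoted in the text; the real lists are longer.
const UNITS: [&str; 7] = ["km", "kg", "m²", "°C", "L", "Hz", "Mo"];
const COMPARATORS_FR: [&str; 5] =
    ["soit environ", "équivalent à", "environ", "plus de", "par rapport à"];
const COMPARATORS_EN: [&str; 4] = ["roughly", "approximately", "more than", "the size of"];

fn is_number(token: &str) -> bool {
    !token.is_empty() && token.chars().all(|c| c.is_ascii_digit())
}

/// Ratio patterns over lowercased tokens: "X / Y", "X sur Y", "X out of Y".
fn has_ratio(tokens: &[String]) -> bool {
    let three = tokens.windows(3).any(|w| {
        is_number(&w[0]) && is_number(&w[2]) && matches!(w[1].as_str(), "/" | "sur")
    });
    let four = tokens
        .windows(4)
        .any(|w| is_number(&w[0]) && w[1] == "out" && w[2] == "of" && is_number(&w[3]));
    three || four
}

/// Whole-word phrase match, so "environ" does not match "environnement".
fn contains_phrase(tokens: &[String], phrase: &str) -> bool {
    let parts: Vec<&str> = phrase.split_whitespace().collect();
    !parts.is_empty()
        && tokens
            .windows(parts.len())
            .any(|w| w.iter().zip(&parts).all(|(t, p)| t.as_str() == *p))
}

/// True when any anchor type appears anywhere in the sentence.
pub fn has_anchor(sentence: &str, french: bool) -> bool {
    if sentence.contains('%') || sentence.chars().any(|c| CURRENCY.contains(&c)) {
        return true;
    }
    // Strip surrounding punctuation; keep case for unit tokens such as "L" or "Hz".
    let raw: Vec<&str> = sentence
        .split_whitespace()
        .map(|t| t.trim_matches(|c: char| ",.;:!?()".contains(c)))
        .filter(|t| !t.is_empty())
        .collect();
    if raw.iter().any(|t| UNITS.contains(t)) {
        return true;
    }
    let lowered: Vec<String> = raw.iter().map(|t| t.to_lowercase()).collect();
    let lexicon: &[&str] = if french { &COMPARATORS_FR } else { &COMPARATORS_EN };
    has_ratio(&lowered) || lexicon.iter().any(|p| contains_phrase(&lowered, p))
}
```

A sentence for which `has_anchor` returns false and that still holds a surviving candidate is the one the rule would report.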

Parameters

Key	Type	`dev-doc`	`public`	`falc`
`min_value`	`int`	100000	10000	1000

min_value is the inclusive lower bound on a numeric candidate's integer value. Tokens that pass the digit-count filter but whose value is below min_value are ignored. Page-number-like quantities are already caught by the figure/page reference filter; this parameter is a second safety net.

Set it in lucid-lint.toml:

[rules."readability.large-number-unanchored"]
min_value = 50000

Examples

French

Before (flagged):

Le budget atteint 4 800 000 000 selon le rapport final.

What lucid-lint check --profile public --experimental readability.large-number-unanchored --conditions dyscalculia reports:

warning input.md:1:19 Large numeral (10-digit, value ≈ 4800000000) appears with no anchor in this sentence (no unit, percentage, ratio, or comparison phrase). plain-language guidance recommends giving large numbers a comparison or denominator the reader can ground. [readability.large-number-unanchored]

After (your rewrite):

Le budget atteint 4,8 milliards d’euros, soit environ 6 % du PIB selon le rapport final.

The number now comes with a unit (euros), a percentage (6 %) and a comparator phrase (soit environ). A reader who cannot estimate "4,8 milliards" unaided now has three independent anchors.

English

Before (flagged):

The proposal mentions several billion in vague spending across regions.

After (your rewrite):

The proposal mentions several billion dollars in vague spending across regions, roughly the annual budget of a mid-sized state agency.

The magnitude word now comes with a unit (dollars) and a comparator phrase (roughly the annual budget).

Suppression

See Suppressing diagnostics for the inline and block forms. Inline disabling also works on this rule:

<!-- lucid-lint disable-next-line readability.large-number-unanchored -->
Le budget atteint 4 800 000 000 selon le rapport final.

See also

Conditions: the dyscalculia and general tags that gate this rule.
structure.number-run: sibling rule on numeric clusters. Atomic split: number-run fires on clusters of numeric tokens; this rule fires on a single unanchored large number.
structure.mixed-numeric-format: another sibling rule, on consistency of numeric form (digits vs words).
F-experimental-rule-status: experimental rule status, the substrate that lets this rule ship in v0.2.x without affecting default scores.

References

plainlanguage.gov, Use numbers effectively. "Help your reader visualize numbers… Compare numbers to something the reader is familiar with."
CDC Clear Communication Index, Numbers. Item 6 asks whether numbers are clear and useful for the intended audience.

See References for the full bibliography.

Architecture overview

lucid-lint is a small Rust crate with a deliberately simple pipeline.

Pipeline

 input text
     │
     ▼
┌──────────────────────────┐
│ Language detection       │   stop-word ratio heuristic
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Parser                   │   pulldown-cmark or plain text
│ (Markdown | plain)       │
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Document model           │   Section > Paragraph > Sentence
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Rules                    │   Each rule receives the document + the language
│ (sentence-too-long, ...) │
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Diagnostics              │   rule_id, severity, location, section,
│                          │   message, weight
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐     v0.2+
│ Score                    │   density-normalised, capped per category
│ (Scorecard)              │   5 fixed categories
└─────────────┬────────────┘
              │
              ▼
┌──────────────────────────┐
│ Output formatter         │   TTY (default) or JSON
│                          │   carries the diagnostics + the scorecard
└──────────────────────────┘

Key types

Diagnostic: the unit of output. Carries weight (initialised from scoring::default_weight_for) since v0.2.
Rule (trait): fn check(document, language) -> Vec<Diagnostic>.
Document: the parser's output. Section-aware.
Scorecard: holds global: Score plus [CategoryScore; 5], in the fixed order Structure · Rhythm · Lexicon · Syntax · Readability.
Report: diagnostics + scorecard + word_count, returned by Engine::lint_* since v0.2.
Engine: bundles a profile, a rule set and an optional ScoringConfig; exposes lint_str, lint_file, lint_stdin.

Design principles

These principles are enforced in code review. See Design decisions for context.

Make impossible states impossible: newtypes, data-carrying enums, NonZeroU32.
Functional style where it helps: iterator chains, pure rule functions.
Atomic rules: one rule, one signal.
Deterministic core: no network, no LLM, no environment-dependent behaviour.
YAGNI: no speculative abstractions.

Module layout

src/
├── lib.rs             - library root
├── main.rs            - binary entry point
├── cli.rs             - clap CLI
├── config.rs          - profile presets, configuration-file loading
├── engine.rs          - orchestration
├── language/          - detection + per-language data
├── parser/            - Markdown + plain text + tokenizer + document model
├── rules/             - one file per rule
├── scoring.rs         - hybrid scoring model (v0.2+)
├── output/            - TTY + JSON formatters
└── types.rs           - domain types (Diagnostic, Severity, Location, ...)

Design decisions

This page records the design decisions made during v0.1 that deserve a second look before any change.

Linter model versus score model

Decision: v0.1 shipped the classic linter shape, with info / warning severities. v0.2 added a hybrid scoring model (global score + per-category sub-scores + diagnostics) on top, without removing the linter shape.

Rationale: shipping the linter shape first let us validate detection quality on real corpora before adding the aggregation layer. The scoring layer is additive: tools that only care about diagnostics ignore the scorecard.

Hybrid scoring model (v0.2)

Decision: one global score + 5 per-category sub-scores, all in X / max form. Composition stacks a weighted sum, a density normalisation (per 1,000 words, floored at 200 words) and a per-category cap. 5 fixed categories: Structure · Rhythm · Lexicon · Syntax · Readability. New Diagnostic.weight field, new --min-score=N command-line option.

Rationale (full brainstorm in brainstorm/20260420-score-semantics.md):

X / max rather than 0–100: an arbitrary maximum lets us retune without pretending that today's 80 is the next version's 80. The /impeccable skill already uses this convention.
5 fixed categories: they couple nothing to a rule rename, and they use the category_of(rule_id) helper already decided in v0.1. Deriving the category from the prefix (plan B) was rejected: it would have meant renaming 17 rules just for F14.
Three stacked composition mechanics: no single one covers every failure mode. Density alone punishes short documents; weights alone lose to a runaway rule; caps alone do not reflect the size of the cost.
Letter grades, traffic lights, pass/fail margin and reading-time seconds were cut from the v0.2 design after a first-principles analysis (F-score-letter-grade–F-reading-time-score in ROADMAP). They duplicate function-1 (the at-a-glance view), which the number already fulfils.
Actionability (function-2) is carried by the diagnostic list, not by the score. Sub-scores can therefore afford to be minimal: F37 makes sure diagnostic messages hold up the actionable side of the contract.
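
One way to read the three stacked mechanics is sketched below. The exact composition in `scoring.rs` is not given here, so everything in this block is an assumption made for illustration: the function names, the per-category maximum passed as a parameter, and the choice to subtract the capped penalty from that maximum.

```rust
/// Number of fixed categories: Structure, Rhythm, Lexicon, Syntax, Readability.
pub const CATEGORY_COUNT: usize = 5;

/// Floor applied to the word count before density normalisation.
const MIN_WORDS: f64 = 200.0;

/// Score for one category, under the assumptions stated above:
/// 1. weighted sum of the diagnostics' weights,
/// 2. normalised to a density per 1,000 words (word count floored at 200),
/// 3. capped so that one category can lose at most `category_max`.
pub fn category_score(weights: &[u32], word_count: usize, category_max: f64) -> f64 {
    let weighted_sum: f64 = weights.iter().map(|&w| f64::from(w)).sum();
    let words = (word_count as f64).max(MIN_WORDS);
    let density = weighted_sum * 1000.0 / words;
    category_max - density.min(category_max)
}

/// Global score as (X, max): the sum of the five category scores.
pub fn global_score(
    per_category: &[Vec<u32>; CATEGORY_COUNT],
    word_count: usize,
    category_max: f64,
) -> (f64, f64) {
    let score: f64 = per_category
        .iter()
        .map(|weights| category_score(weights, word_count, category_max))
        .sum();
    (score, category_max * CATEGORY_COUNT as f64)
}
```

With a hypothetical per-category maximum of 20, three diagnostics weighted 1, 1 and 5 in a 1,000-word document cost 7 points, leaving 13 / 20 for that category. The same diagnostics in a 50-word note are normalised against the 200-word floor and hit the cap, so the category bottoms out at 0 without dragging the other four down.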

`Diagnostic` structure

Decision: a Diagnostic carries rule_id, severity, location, section, message and (since v0.2) weight.

What is NOT stored, and why:

category: derivable from rule_id via Category::for_rule. Storing it would duplicate information and create a risk of drift.
suggestion: still deferred; current messages are actionable on their own.

What IS stored, and why:

section: recomputing it afterwards would require re-parsing the document to walk the headings and match positions. The storage cost is one Option<String> per diagnostic; the recomputation cost is a second full parse.
weight (v0.2): initialised at emission from scoring::default_weight_for, so that user overrides (via configuration) and rule-level overrides (via with_weight) flow through aggregation without a second lookup.

Deterministic core, extensions for the rest

Decision: the core ships only deterministic rules. LLM-based rules, network-backed rules and rules based on machine-learning models live in optional extension crates (planned for v0.3).

Rationale: a pre-commit hook that takes 5 seconds and varies from run to run is worse than no hook at all. Determinism is non-negotiable on the happy path.

Bilingual EN/FR from day one

Decision: every language-dependent rule has handled English and French since v0.1.

Rationale: most French-speaking open-source developers write their documentation in English. Targeting French alone would miss the majority. Handling both from day one costs little and signals ambition.

A single readability formula in v0.1

Decision: v0.1 uses the Flesch-Kincaid grade for all languages. Per-language formulas (Kandel-Moles for French, SMOG, Coleman-Liau) are deferred to v0.2.

Rationale: Flesch-Kincaid is well known, reproducible and well understood. Adding three formulas before validating the basics would be premature optimisation.

Markdown + plain text + standard input, Pandoc for the rest

Decision: native support for .md, .markdown, .txt and standard input in v0.1. Other formats (AsciiDoc, HTML, docx, PDF) go through Pandoc as a pre-processing step.

Rationale: Markdown covers the vast majority of open-source and technical writing. Pandoc is free software, ubiquitous, and removes the burden of maintaining several parsers.

One file per rule

Decision: each rule lives in its own file under src/rules/, with a consistent structure (struct, config, impl Rule, tests).

Rationale: adding a rule becomes a well-defined operation (one new file from a template), and review is easy (one rule, one PR, one file to read).

Stop-word heuristic for language detection

Decision: v0.1 detects the language from the stop-word ratio. No external dependency.

Rationale: short, deterministic, with negligible runtime cost. For the cases where it fails (very short texts, code-heavy documents), the unknown fallback value is safe.
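
A stop-word ratio heuristic of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the detector in `src/language/`: the word lists are tiny samples, and the two thresholds (`MIN_WORDS`, `MIN_RATIO`) are hypothetical values chosen for the example.

```rust
/// Detected language. (Hypothetical enum, for illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    En,
    Fr,
    Unknown,
}

// Tiny sample lists; a real detector would carry longer ones.
const EN_STOP_WORDS: [&str; 8] = ["the", "and", "of", "to", "is", "in", "that", "it"];
const FR_STOP_WORDS: [&str; 8] = ["le", "la", "les", "et", "de", "des", "est", "un"];

/// Hypothetical thresholds: below these the answer is Unknown.
const MIN_WORDS: usize = 5;
const MIN_RATIO: f64 = 0.15;

fn count_hits(words: &[String], stop_words: &[&str]) -> usize {
    words.iter().filter(|w| stop_words.contains(&w.as_str())).count()
}

/// Picks the language whose stop words make up the larger share of the text.
pub fn detect_language(text: &str) -> Language {
    let words: Vec<String> = text
        .split(|c: char| !c.is_alphabetic())
        .filter(|w| !w.is_empty())
        .map(str::to_lowercase)
        .collect();
    if words.len() < MIN_WORDS {
        return Language::Unknown; // very short text: too little signal
    }
    let en = count_hits(&words, &EN_STOP_WORDS);
    let fr = count_hits(&words, &FR_STOP_WORDS);
    let best_ratio = en.max(fr) as f64 / words.len() as f64;
    if best_ratio < MIN_RATIO || en == fr {
        Language::Unknown // code-heavy or ambiguous input
    } else if en > fr {
        Language::En
    } else {
        Language::Fr
    }
}
```

The function is pure and allocation-light, which fits the deterministic-core principle: the same text always yields the same answer, and inputs with too little signal fall through to `Unknown`.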

Profile presets as enum variants

Decision: profiles are Profile::DevDoc | Public | Falc. They cannot be defined in user configuration in v0.1.

Rationale: custom profiles are a speculative abstraction as long as nobody asks for them. Per-rule overrides are enough to cover 95% of "I want a slightly different preset" cases.
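
The shape of this decision can be sketched as a closed enum with per-rule defaults and an optional override. The variant names come from the text and the threshold values from the parameter tables on the rule pages; the method names, the `FromStr` parsing and `effective_max_grade_level` are hypothetical and only illustrate the pattern.

```rust
use std::str::FromStr;

/// The three built-in presets. A closed enum means no profile can exist
/// that the code does not know about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    DevDoc,
    Public,
    Falc,
}

impl Profile {
    /// Default `max_grade_level` for readability.score.
    pub fn max_grade_level(self) -> f64 {
        match self {
            Self::DevDoc => 14.0,
            Self::Public => 9.0,
            Self::Falc => 6.0,
        }
    }

    /// Default `max_depth` for syntax.parenthetical-depth.
    pub fn max_parenthetical_depth(self) -> u32 {
        match self {
            Self::DevDoc => 4,
            Self::Public => 3,
            Self::Falc => 2,
        }
    }
}

impl FromStr for Profile {
    type Err = String;

    /// Parses the names used with `--profile`.
    fn from_str(name: &str) -> Result<Self, Self::Err> {
        match name {
            "dev-doc" => Ok(Self::DevDoc),
            "public" => Ok(Self::Public),
            "falc" => Ok(Self::Falc),
            other => Err(format!("unknown profile: {other}")),
        }
    }
}

/// A per-rule override wins over the preset; otherwise the preset applies.
pub fn effective_max_grade_level(profile: Profile, override_value: Option<f64>) -> f64 {
    override_value.unwrap_or_else(|| profile.max_grade_level())
}
```

Because each `match` is exhaustive, adding a fourth preset later would force every per-rule default to be revisited at compile time, which is the "make impossible states impossible" principle applied to configuration.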

ROADMAP source-of-truth pipeline (v0.2.x+)

Decision: ROADMAP.md is demoted from edited source to generated artifact. The source of truth becomes a structured set of files under .roadmap/ (git-ignored), one markdown file per feature with TOML front-matter, plus narrative fragments. A small Rust workspace member (crates/roadmap-cli) provides the add / generate / validate / rename subcommands. The generator is invoked locally during release preparation; the regenerated ROADMAP.md is committed on the release-prep PR. CI does not regenerate. Scoped under F-roadmap-toml-source.

Rationale:

Branch protection on main (in place since 2026-05-03 via F-repo-config-hardening) forces every ROADMAP.md change through the worktree → branch → PR → CI → merge → cleanup cycle. The expected steady-state throughput was 10 to 30 ROADMAP-only changes per week. PR review adds no value on those changes (single author), so the ceremony was pure overhead.
A per-path ruleset exemption on ROADMAP.md would weaken the branch-protection signals tracked by the OpenSSF Scorecard / Best Practices badges. Demoting the file to a generated artifact, edited off main, keeps those signals intact.
Per-feature files give per-feature git diffs, remove schema lock-in (front-matter is optional per file) and let narrative sections live in plain markdown rather than in TOML strings.
Rust rather than Python for the generator: it reuses pulldown-cmark, which is already in the dependencies, folds the tests into cargo test, keeps maintenance on a single toolchain, and stays extractable into a standalone crate if the tool matures.
Running the generator locally (not in CI) avoids granting CI any access to .roadmap/ (git-ignored and machine-local). Updating at release cadence rather than in real time was an accepted trade-off: the public ROADMAP.md artifact updates at each v* tag.
Day-1 blockers at delivery: deterministic emission of <a id="…"> anchors (so that existing cross-links of the form [F46](#f46) in PRs and commits continue to resolve), an add subcommand that acts as a template (so that creating a feature is a single command, not a regression), and a round-trip determinism test (regenerate the artifact, compare it with the committed version, fail on drift).

Emergency fallback: if the work on crates/roadmap-cli overruns its budget, the file migrates instead to an orphan roadmap branch with direct push and the same .md shape. This preserves the Scorecard signals through a different mechanism, at the cost of a non-standard branch layout. It is documented as an escape hatch, not as the chosen path.

References to consult before changing

RULES.md: the authoritative rule reference
ROADMAP.md: upcoming work
CODING_STANDARDS.md: day-to-day conventions

Roadmap

Translation in progress. The full roadmap is currently available in English only. Its FR translation is tracked in F25, the very task that drives the rollout of this French version.

Milestones

v0.1 shipped: 17 deterministic rules, bilingual EN/FR.
v0.2 released (2026-04-22): 25 rules in total, hybrid scoring model (global score + five per-category sub-scores), docs polish; the FR mirror is still in progress.
v0.3 planned: optional LLM/NLP extensions, SARIF for GitHub Code Scanning.

The English version remains the reference until the translation is complete.

Accessibility

Translation in progress. The detailed accessibility page is currently available in English only. Its FR translation is tracked in F25 on the roadmap.

In short: the site targets WCAG 2.2 level AAA. It dogfoods lucid-lint on its own prose. Contrast, text sizes, keyboard shortcuts and screen-reader compatibility are tested at every release.

Known gaps

First full audit on 2026-04-22: 17 / 20, 0 blocking.

The "Skip to main content" link and the EN / FR switcher are added by JavaScript at the end of the page. Server-side rendering via theme/index.hbs is planned (F35a).

Reporting an accessibility defect

Open an issue on GitHub with the accessibility label. Reports are handled on the v0.2 milestone, unless they block a release.

References

Academic, normative and practical sources that underpin lucid-lint's design.

This page lists the references that shaped lucid-lint's rules, profiles and design decisions. Each entry states where the reference comes into play in the project. The English mirror is at references.md.

External links open in a new tab; they carry rel="nofollow noopener noreferrer" so that the new tab stays safe and the documentation site does not endorse third-party content.

Legend

Status	Meaning
✅	Verified: canonical reference
⚠️	To verify: probably correct, details to confirm
🔍	Opportunistic: sound reasoning, looser citation
📖	Book / secondary source
🌐	Normative standard
🧪	Practical source (style guide, tool)

Cognitive load theory: the backbone

The theoretical bedrock of lucid-lint: a text imposes a mental cost on the reader, and that cost can be measured and reduced.

✅ Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285. ↗

Founding paper. Distinguishes intrinsic, extraneous and germane load.

→ Applies to: most rules, notably structure.*, rhythm.*, syntax.nested-negation, syntax.conditional-stacking.

📖 Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive Load Theory. Springer. ↗

Text cohesion and discourse processing

Reference paper for automated cohesion analysis.

→ Applies to: rhythm.repetitive-connectors, syntax.unclear-antecedent, lexicon.low-lexical-diversity.

📖 McNamara, D. S., Graesser, A. C., McCarthy, P. M., & Cai, Z. (2014). Automated evaluation of text and discourse with Coh-Metrix. Cambridge University Press. ↗

Syntactic complexity

✅ Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68(1), 1–76. ↗

Founding paper of Dependency Locality Theory.

→ Applies to: structure.deep-subordination, syntax.unclear-antecedent, syntax.conditional-stacking.

Discourse connectives

✅ Sanders, T. J. M., & Noordman, L. G. M. (2000). The role of coherence relations and their linguistic markers in text processing. Discourse Processes, 29(1), 37–60. ↗

→ Applies to: rhythm.repetitive-connectors.

Readability formulas

✅ Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221–233. ↗

→ Applies to: readability.score.

French-language formulas

⚠️ Kandel, L., & Moles, A. (1958). Application de l’indice de Flesch à la langue française. Cahiers Études de Radio-Télévision, 19, 253–274.

⚠️ To verify: pagination and exact title of the periodical. To be checked on Cairn or in a university library.

✅ Henry, G. (1975). Comment mesurer la lisibilité. Labor, Bruxelles. ↗ (review)

French-language reference work proposing Henry's formula. The Persée link points to De Landsheere's review (1976), for lack of an online publisher page for the book.

→ Applies to: candidate for the v0.2 readability.score.

✅ François, T., & Fairon, C. (2012). An “AI readability” formula for French as a foreign language. EMNLP-CoNLL 2012. ↗

⚠️ Correction: "Scolarius", mentioned in a design session, is a commercial tool from Quebec and not a published academic formula. Do not cite it as a scientific reference.

Lexical diversity

📖 Herdan, G. (1960). Type-Token Mathematics: A Textbook of Mathematical Linguistics.

→ Applies to: lexicon.low-lexical-diversity.

Negation processing

✅ Clark, H. H., & Chase, W. G. (1972). On the process of comparing sentences against pictures. Cognitive Psychology, 3(3), 472–517. ↗

Classic experimental work showing that negative sentences take longer to process than affirmative ones. Foundational evidence that negation carries a comprehension cost.

→ Applies to: syntax.nested-negation.

✅ Carpenter, P. A., & Just, M. A. (1975). Sentence comprehension: A psycholinguistic processing model of verification. Psychological Review, 82(1), 45–73. ↗

Extends Clark & Chase with a formal model of sentence processing. Stacked negations compound the verification cost.

Conditional reasoning

🔍 Johnson-Laird, P. N., & Byrne, R. M. J. (1991). Deduction. Psychology Press. ↗

Mental-model theory of conditional reasoning. Stacked conditionals multiply the number of models the reader must hold.

→ Applies to: syntax.conditional-stacking.

🔍 Evans, J. St. B. T., & Over, D. E. (2004). If. Oxford University Press. ↗

🔍 Caveat: the link between chained conditionals and reader cognitive load is intuitive and well supported by the broader reasoning literature, but the specific rule "more than N conditionals per sentence is harmful" is a practitioner heuristic, not a directly tested threshold. Treat the threshold as configurable and empirically calibrated.

Typography and visual processing

🔍 Arditi, A., & Cho, J. (2007). Letter case and text legibility in normal and low vision. Vision Research, 47(19), 2499–2505. ↗

Empirical evidence for the reading cost of all-caps text: the reader loses the word-shape cues that the ascenders and descenders of mixed case provide.

→ Applies to: lexicon.all-caps-shouting.

🧪 Nielsen, J. (Nielsen Norman Group). Multiple articles on the legibility of all-caps text in interfaces.

→ Applies to: lexicon.all-caps-shouting.

📖 Bringhurst, R. (2013). The Elements of Typographic Style (4th ed.). Hartley & Marks.

Canonical reference in typography.

✅ Legge, G. E., & Bigelow, C. A. (2011). Does print size matter for reading? A review of findings from vision science and typography. Journal of Vision, 11(5). ↗

Revue des preuves issues des sciences de la vision sur la lecture. Couvre les effets de longueur de ligne.

→ Concerne : structure.line-length-wide.

Complexité phonologique et lecture

Travail classique montrant que les patterns de lettres inhabituels ralentissent la reconnaissance des mots.

Travaux montrant que les clusters consonantiques et leur contexte affectent précision et vitesse de lecture.

🔍 Précaution : la règle lexicon.consonant-cluster est fondée sur la littérature globale sur la complexité des formes de mots, mais un seuil spécifique validé du type « 4+ consonnes d’affilée est néfaste » ne provient pas d’un papier canonique unique. C’est une heuristique de praticien informée par la littérature, non la transposition directe d’une métrique publiée.

Intensifiers and hedges

🔍 Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A Comprehensive Grammar of the English Language. Longman.

Classic grammar that classifies intensifiers as "amplifiers" whose semantic contribution is often marginal.

→ Applies to: lexicon.redundant-intensifier.

🧪 Zinsser, W. (2006). On Writing Well (30th anniversary ed.). HarperCollins.

Practical guide that argues against intensifying adverbs as clutter.

Style guides and plain language

📖🧪 Strunk, W., & White, E. B. (1999). The Elements of Style (4th ed.). Longman.

→ Applies to: syntax.passive-voice, lexicon.weasel-words, lexicon.redundant-intensifier, syntax.unclear-antecedent.

🧪 US Plain Language Action and Information Network (2011). Federal Plain Language Guidelines. ↗

→ Applies to: structure.sentence-too-long, structure.paragraph-too-long, lexicon.excessive-nominalization, lexicon.jargon-undefined, syntax.passive-voice.

🧪 European Commission (2011). Rédiger clairement. Publications Office of the European Union. ↗

Numeric formatting conventions

🌐 International Organization for Standardization (2022). ISO 80000-1:2022 — Quantities and units — Part 1: General. ↗

International standard on number formatting, including digit grouping and decimal separators.

→ Applies to: structure.mixed-numeric-format.

🧪 The Chicago Manual of Style (17th ed., 2017). University of Chicago Press. ↗

Canonical style guide covering when to write numbers as words or as digits, and why consistency matters.

Working memory and attention

⚠️ Caution: research specifically on "text readability for readers with ADHD" is scattered and of variable quality. The "cognitive accessibility" angle is sound, but treat ADHD-specific claims with caution.

📖 Barkley, R. A. (2012). Executive Functions: What They Are, How They Work, and Why They Evolved. The Guilford Press. ↗

Dyslexia and visual accessibility

✅ Rello, L., & Baeza-Yates, R. (2013). Good fonts for dyslexia. Proceedings of ASSETS ’13. ↗

International normative standards

🌐 W3C (2018). Web Content Accessibility Guidelines (WCAG) 2.1. ↗

Key criteria invoked:

1.3.1 (Info and Relationships) → structure.heading-jump
1.4.8 (Visual Presentation), line width ≤ 80 characters → structure.line-length-wide
2.4.6 (Headings and Labels) → structure.heading-jump
3.1.3 (Unusual Words) → lexicon.jargon-undefined
3.1.4 (Abbreviations) → lexicon.unexplained-abbreviation
3.1.5 (Reading Level) → readability.score

⚠️ Check the criterion numbers against the WCAG version you intend to cite (2.1 or 2.2).
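The 80-character width from criterion 1.4.8 is simple enough to sketch. The function below is an illustrative assumption, not the structure.line-length-wide implementation. It counts Unicode code points with `len`, which only approximates the "characters or glyphs" wording of the criterion and ignores its narrower 40-character limit for CJK text.

```python
from __future__ import annotations


def wide_lines(text: str, max_width: int = 80) -> list[tuple[int, int]]:
    """Return (line_number, width) pairs for every line wider than max_width.

    Line numbers start at 1. Width is measured in code points, which is an
    approximation of visible glyphs.
    """
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > max_width
    ]
```

A line of exactly 80 characters passes; only lines strictly wider than `max_width` are reported.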

French-language normative standards

🌐 DINUM (2023). Référentiel Général d’Amélioration de l’Accessibilité (RGAA) version 4.1. ↗

Criterion 9.1 (information structure) → structure.heading-jump
Criterion 9.4 (expansion of abbreviations) → lexicon.unexplained-abbreviation

🌐 Inclusion Europe (2009, updated 2014). Information pour tous : Règles européennes pour une information facile à lire et à comprendre.

Reference framework for FALC (Facile À Lire et à Comprendre, the French Easy-to-Read approach).

→ Applies to: the falc profile, which is directly inspired by these rules.

🌐 Accessibility Standards Canada (2025). CAN-ASC-3.1:2025, Plain Language (first edition). ↗

Canada's first national standard on plain language, published in a bilingual edition by Accessibility Standards Canada under the Accessible Canada Act. It sets prescriptive requirements (shall / should / may) along five axes: audience identification, evaluation methods, structure, wording, and design. It independently grounds several of our default thresholds for lexicon.*, structure.*, and readability.score.

→ Applies to: lexicon.jargon-undefined, lexicon.unexplained-abbreviation, lexicon.weasel-words, structure.sentence-too-long, structure.paragraph-too-long, syntax.passive-voice, readability.score.

European legal context

🌐 Directive (EU) 2019/882 of the European Parliament and of the Council of 17 April 2019: European Accessibility Act (EAA). ↗

Legal framework extending accessibility requirements to private-sector services from 28 June 2025.

Practical tools that shaped our design

🧪 Coh-Metrix (Graesser & McNamara) — ↗
🧪 Vale (Chris Ward) — ↗
🧪 textlint — ↗
🧪 Hemingway Editor — ↗
🧪 Proselint — ↗

Summary table: rule → reference

Lexicon

Rule	Main references
`lexicon.all-caps-shouting`	Arditi & Cho (2007); Nielsen Norman Group; Bringhurst (2013)
`lexicon.consonant-cluster`	Seidenberg et al. (1984); Treiman et al. (2006); 🔍 practitioner heuristic
`lexicon.excessive-nominalization`	Plain Language US; FALC; CAN-ASC-3.1:2025
`lexicon.jargon-undefined`	WCAG 3.1.3; Plain Language US; FALC; CAN-ASC-3.1:2025
`lexicon.low-lexical-diversity`	Herdan (1960); McCarthy & Jarvis (2010); Graesser et al. (2004)
`lexicon.redundant-intensifier`	Strunk & White; Quirk et al. (1985); Zinsser (2006)
`lexicon.unexplained-abbreviation`	WCAG 3.1.4; RGAA 9.4; CAN-ASC-3.1:2025
`lexicon.weasel-words`	Strunk & White; Wikipedia style guide; CAN-ASC-3.1:2025

Readability

Rule	Main references
`readability.score`	Flesch (1948); Kincaid et al. (1975); Henry (1975); Kandel & Moles (1958); CAN-ASC-3.1:2025

Rhythm

Rule	Main references
`rhythm.consecutive-long-sentences`	Sweller (1988); Sweller et al. (2011)
`rhythm.repetitive-connectors`	Sanders & Noordman (2000); Graesser et al. (2004)

Structure

Rule	Main references
`structure.deep-subordination`	Gibson (1998); FALC
`structure.deeply-nested-lists`	WCAG 2.1; cognitive-load heuristics
`structure.excessive-commas`	Gibson (1998); 🔍 practitioner heuristic
`structure.heading-jump`	WCAG 1.3.1 & 2.4.6; RGAA 9.1
`structure.line-length-wide`	WCAG 1.4.8 (AAA); Legge & Bigelow (2011)
`structure.long-enumeration`	FALC; Plain Language US
`structure.mixed-numeric-format`	ISO 80000-1; Chicago Manual of Style
`structure.paragraph-too-long`	Sweller (1988); Graesser et al. (2004); CAN-ASC-3.1:2025
`structure.sentence-too-long`	Sweller (1988); Plain Language US; FALC; CAN-ASC-3.1:2025

Syntax

Rule	Main references
`syntax.conditional-stacking`	Johnson-Laird & Byrne (1991); Evans & Over (2004); Gibson (1998); 🔍 practitioner heuristic threshold
`syntax.dense-punctuation-burst`	Sweller (1988); Gibson (1998); 🔍 purely heuristic
`syntax.nested-negation`	Clark & Chase (1972); Carpenter & Just (1975); Kaup et al. (2006)
`syntax.passive-voice`	Strunk & White; Plain Language US; FALC; CAN-ASC-3.1:2025
`syntax.unclear-antecedent`	Strunk & White; Gibson (1998); Graesser et al. (2004)

On academic honesty

lucid-lint is a research-informed engineering project, not a research project in itself. The references above ground our design choices, but we do not claim to validate new results. Several rules (lexicon.consonant-cluster, syntax.conditional-stacking, syntax.dense-punctuation-burst, structure.excessive-commas) are practitioner heuristics informed by the literature rather than direct transpositions of published metrics. We mark them 🔍 in the summary table.

When we simplify an academic metric (for example, syntax.unclear-antecedent as a pattern heuristic instead of full anaphora resolution), we document the simplification in RULES.md and plan richer versions in the roadmap.

If you are a researcher and spot an error, an outdated citation, or a misattribution, open an issue. We will fix it promptly and credit you.

Contributing

See CONTRIBUTING.md for the full contribution guide.

In short

Open an issue before any large change.
Run just check locally.
Add tests for any new behaviour.
Follow Conventional Commits.
Be kind. See the Code of Conduct.

Especially welcome

Rule proposals, with a clear cognitive-load rationale
Per-language word lists: weasel words, connectors, jargon, acronyms
Corpus contributions (samples of real prose)
Documentation improvements