odysseus/static/js/markdown.js at bc00a9fc7f90614336ffedd192c6490dff68e92e

Files

SurprisedDuck b70ae56ffa Sanitize preserved markdown HTML

`mdToHtml` deliberately stashes literal <details> blocks and <a> tags from
the source text *before* the global HTML-escape pass and restores them
verbatim into the string callers assign to `innerHTML` (e.g. chatRenderer's
`b.innerHTML = ...processWithThinking(text)`). Nothing scrubbed those
fragments, so message/agent content containing
`<details><img src=x onerror=...></details>` or
`<a href="javascript:..." onmouseover=...>` executed arbitrary script in
the authenticated page.

Route both stashed fragments through `sanitizeAllowedHtml()`, which parses
them in an inert <template> (no resource loads, no script execution),
removes script-capable elements, and strips event-handler attributes plus
javascript:/vbscript:/data: URL schemes. Hardening details:

- Compare tag names case-insensitively and drop the SVG/MathML foreign-
  content roots. An SVG-namespaced <script> has the lower-case tagName
  'script', so an HTML-only upper-case check would miss it — a real bypass.
- Sanitize to a fixpoint (re-parse + re-clean until stable) to blunt
  mutation-XSS, where re-serializing/re-parsing reshapes the tree.

Benign anchors and <details> blocks are preserved unchanged.

Verified under jsdom against the obvious vectors plus mutation-XSS probes
(svg/math-namespaced <script>, foreignObject, ns-confusion, comment
breakout, template smuggling): no script/iframe element, event handler, or
javascript:/data: URL survives, and benign markup is kept.

Co-authored-by: Claude <noreply@anthropic.com>

2026-06-02 05:58:38 +09:00

32 KiB

Raw Blame History

View Raw

32 KiB Raw Blame History

32 KiB

Raw Blame History