fix(documents): use strip_pdf_content_marker instead of lstrip for PDF auto-open (#1727)

lstrip("\n[PDF content]:") treats its argument as a set of characters,
not a prefix, so after removing the marker it keeps stripping any
leading character from that set and eats into the following
[Page N text]: marker, e.g. turning [Page 1 text]: into "age 1 text]:".
The correct helper, strip_pdf_content_marker (which uses removeprefix),
already exists in the same file and is used by other call sites.
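
A minimal sketch of the difference, for illustration only. The sample
string is an assumed shape for the _process_pdf output, and
strip_marker_sketch is a hypothetical stand-in for
strip_pdf_content_marker, whose real body is not shown in this commit.
str.removeprefix requires Python 3.9 or later.

```python
PDF_MARKER = "\n[PDF content]:"  # marker text as it appears in the diff


def strip_marker_sketch(text: str) -> str:
    """Hypothetical stand-in for strip_pdf_content_marker (assumed behavior)."""
    # removeprefix drops the marker only if the string starts with it exactly.
    return text.removeprefix(PDF_MARKER).strip()


# Assumed shape of the extracted PDF text, for illustration only.
raw = "\n[PDF content]:\n[Page 1 text]: Hello"

# lstrip treats PDF_MARKER as a set of characters, so it also strips
# the "\n", "[" and "P" that begin the next marker.
buggy = raw.lstrip(PDF_MARKER).strip()
fixed = strip_marker_sketch(raw)

print(repr(buggy))  # 'age 1 text]: Hello'
print(repr(fixed))  # '[Page 1 text]: Hello'
```

lstrip stops only at the first character outside the set (here the "a"
in "Page"), which is why the damage depends on what follows the marker.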

Fixes #1663

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wes Huber
2026-06-02 21:30:04 -07:00
committed by GitHub
parent 4907b16d9b
commit 49885ff9e7

@@ -394,9 +394,7 @@ def build_user_content(
 # Pull the PDF prose once — used as either intro_text
 # (form path) or the doc body (plain path).
 try:
-    pdf_body_text = _process_pdf(path).lstrip(
-        "\n[PDF content]:"
-    ).strip()
+    pdf_body_text = strip_pdf_content_marker(_process_pdf(path))
 except Exception:
     pdf_body_text = None