Email: recognize forwarded message dividers
`_ORIG_RE` (and its JS mirror `_TALON_ORIG_RE`) already recognised the
Japanese forward marker `転送` alongside the "Original Message" delimiters,
but not the English "Forwarded message" one. So Gmail-style forwards —
including the ones Odysseus itself emits (`---------- Forwarded message
----------`, static/js/emailInbox.js) — were not treated as a quote
boundary:
- with a following Outlook From:/Date: header block, the divider line
leaked into the level-0 reply bubble as noise;
- with only the divider marking the forward (no header block), the body
was not split into turns at all.
Add `Forwarded\s+message` to the same `[-_=]{3,}`-delimited alternation in
both the server-side parser and the JS mirror, so forward dividers are
consumed as an attribution boundary like "----- Original Message -----".
Locale variants of "Forwarded message" can follow the existing pattern.
Tests cover both manifestations plus a negative control (the bare words
"forwarded message" without `[-_=]{3,}` delimiters must not split).
Checks: python -m pytest tests/test_forwarded_message_divider.py (3 passed),
python -m py_compile src/email_thread_parser.py, node --check
static/js/emailLibrary/utils.js, git diff --check.
This commit is contained in:
@@ -15,7 +15,7 @@ export const _TALON_FROM = '(?:From|Från|Von|De|Da|От|Od|Van|差出人|发件
|
||||
export const _TALON_SENT = '(?:Sent|Skickat|Gesendet|Envoy[ée]|Inviato|Enviado|Verzonden|Отправлено|Wysłane|Date|送信日時|发送时间|寄件日期|Sendt|Lähetetty|Tarih|Datum|Data|Datum)';
|
||||
export const _TALON_SUBJ = '(?:Subject|Ämne|Betreff|Objet|Oggetto|Asunto|Onderwerp|Тема|Temat|件名|主题|主旨|Emne|Aihe|Onderwerp|Konu)';
|
||||
export const _TALON_TO = '(?:To|Till|An|À|A|Voor|Para|Naar|Кому|Do|宛先|收件人|Emri|Komu)';
|
||||
export const _TALON_ORIG_RE = /(?:^|\n)[\s>]*[-_=]{3,}\s*(?:Original\s+Message|Ursprüngliche\s+Nachricht|Mensaje\s+original|Messaggio\s+originale|Message\s+d['’]origine|Oorspronkelijk\s+bericht|Original\s+meddelande|Vor[ ]asal[a]\s+meddelande|原文|原始邮件|転送)\s*[-_=]{3,}/i;
|
||||
export const _TALON_ORIG_RE = /(?:^|\n)[\s>]*[-_=]{3,}\s*(?:Original\s+Message|Forwarded\s+message|Ursprüngliche\s+Nachricht|Mensaje\s+original|Messaggio\s+originale|Message\s+d['’]origine|Oorspronkelijk\s+bericht|Original\s+meddelande|Vor[ ]asal[a]\s+meddelande|原文|原始邮件|転送)\s*[-_=]{3,}/i;
|
||||
|
||||
// Minimum plain-text length of a "signature" before we bother folding it.
|
||||
// Short closings ("Cheers, John") stay inline — folding them would add
|
||||
|
||||
Reference in New Issue
Block a user