Fix HTML entity decoding and broaden OSINT dedup window

- Replace single &#039; handler with generic numeric/hex entity decoder
  so &#39; and other unpadded entities are properly converted
- Dedup urgent OSINT posts against all hot memory runs (last 3 sweeps)
  instead of only the previous sweep, preventing posts that drop out
  of one sweep from reappearing as "new" in the next

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Greg Scher
2026-03-23 13:01:32 -04:00
parent 31c305cbbb
commit b7322f1c7e
3 changed files with 12 additions and 4 deletions

@@ -169,7 +169,10 @@ function parseWebPreview(html, channelId) {
 .replace(/&lt;/g, '<')
 .replace(/&gt;/g, '>')
 .replace(/&quot;/g, '"')
-.replace(/&#039;/g, "'")
+.replace(/&#0*39;/g, "'")
+.replace(/&#x0*27;/gi, "'")
+.replace(/&#(\d+);/g, (_, n) => String.fromCharCode(Number(n)))
+.replace(/&#x([0-9a-f]+);/gi, (_, h) => String.fromCharCode(parseInt(h, 16)))
 .replace(/&nbsp;/g, ' ')
 .trim();
 }
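
The generic decoder in this hunk relies on String.fromCharCode, which truncates its argument to 16 bits, so entities above U+FFFF (most emoji, for example) would likely not decode to the intended character. A minimal sketch of one possible alternative follows. It assumes a standalone helper named decodeNumericEntities, which is hypothetical and not part of this commit. It uses String.fromCodePoint and handles decimal and hex forms in a single pass, so the output of one replacement is never re-scanned as a new entity.

```javascript
// Hypothetical helper (not in the commit): decode decimal and hex numeric
// HTML entities in one pass.
function decodeNumericEntities(text) {
  // Group 1 captures hex digits (&#x27;), group 2 captures decimal digits (&#39;).
  return text.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : Number(dec);
    // Leave the entity as-is when it is outside the Unicode range.
    if (!Number.isInteger(codePoint) || codePoint > 0x10ffff) {
      return match;
    }
    // fromCodePoint handles code points above U+FFFF; fromCharCode does not.
    return String.fromCodePoint(codePoint);
  });
}
```

With this shape, the dedicated apostrophe handlers for &#0*39; and &#x0*27; become redundant, because zero-padded and unpadded forms parse to the same code point.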