Fix HTML entity decoding and broaden OSINT dedup window
- Replace single &#039; handler with generic numeric/hex entity decoder so
  &#39; and other unpadded entities are properly converted
- Dedup urgent OSINT posts against all hot memory runs (last 3 sweeps)
  instead of only the previous sweep, preventing posts that drop out of
  one sweep from reappearing as "new" in the next

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
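The dedup change described in the message is not part of the hunk shown here, so the following is only a minimal sketch of the idea. It assumes each hot memory run exposes a `posts` array whose items carry a stable `id`, and that runs are ordered newest first; `filterNewUrgentPosts`, `hotRuns`, and `windowSize` are hypothetical names, not identifiers taken from the repository.

```javascript
// Hypothetical sketch: treat a post as "new" only if it is absent from
// every one of the last `windowSize` sweeps, not just the previous one.
// Assumed shape: hotRuns = [{ posts: [{ id, text }] }, ...], newest first.
function filterNewUrgentPosts(currentPosts, hotRuns, windowSize = 3) {
  const seenIds = new Set();
  for (const run of hotRuns.slice(0, windowSize)) {
    for (const post of run.posts) {
      seenIds.add(post.id);
    }
  }
  return currentPosts.filter((post) => !seenIds.has(post.id));
}
```

With a window of one sweep, a post that was reported two sweeps ago and skipped in the last one would pass the filter again; checking the whole window is what suppresses that case.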
@@ -169,7 +169,10 @@ function parseWebPreview(html, channelId) {
     .replace(/&lt;/g, '<')
     .replace(/&gt;/g, '>')
     .replace(/&quot;/g, '"')
-    .replace(/&#039;/g, "'")
+    .replace(/&#0*39;/g, "'")
+    .replace(/&#x0*27;/gi, "'")
+    .replace(/&#(\d+);/g, (_, n) => String.fromCharCode(Number(n)))
+    .replace(/&#x([0-9a-f]+);/gi, (_, h) => String.fromCharCode(parseInt(h, 16)))
     .replace(/&nbsp;/g, ' ')
     .trim();
 }
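
The decimal and hex replacements in the hunk above could also be folded into one pass. The sketch below is an illustration under assumptions, not the code in this commit: `decodeNumericEntities` is a hypothetical helper name, and it swaps `String.fromCharCode` for `String.fromCodePoint`, since `fromCharCode` truncates values above 0xFFFF and so cannot produce astral characters such as emoji.

```javascript
// Hypothetical helper, not part of the commit: decodes decimal (&#39;)
// and hex (&#x27;) numeric character references in a single pass.
function decodeNumericEntities(text) {
  return text.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : Number(dec);
    // Leave the reference untouched if it lies outside the Unicode range;
    // String.fromCodePoint would throw a RangeError for such values.
    if (codePoint > 0x10ffff) return match;
    // Leading zeros need no special casing: "039" and "39" both parse to 39.
    return String.fromCodePoint(codePoint);
  });
}
```

For example, `decodeNumericEntities("it&#39;s &#x1F600;")` decodes both the apostrophe and the emoji, whereas a `fromCharCode`-based decoder would return a wrong character for the second reference.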