Replace regex RSS parsing with robust feed parsing and entity decoding #18

Open
opened 2026-05-17 12:06:02 +00:00 by MrSphay · 0 comments
Owner

Created from a local project scan after reviewing existing issues #1-#13.

Current status: fetchRSS() parses feeds with regexes for <item>, <title>, <link>, and <pubDate>. This misses Atom feeds, namespace variants, escaped entities, nested markup, CDATA edge cases, and feeds whose metadata is not laid out exactly as expected. It also leaves HTML entities visible in dashboard headlines.

Code references:
- dashboard/inject.mjs: fetchRSS() and itemRegex around lines 152-168.
- dashboard/inject.mjs: buildNewsFeed() consumes those titles directly.

Acceptance criteria:
- Parse RSS and Atom feeds with a structured XML/feed parser or a small, well-tested local parser.
- Decode common HTML/XML entities before displaying headlines.
- Preserve only sanitized HTTP/HTTPS links.
- Mark malformed feeds as degraded without breaking the whole news aggregation.
- Add fixture tests for RSS, Atom, CDATA, entity-escaped titles, and malformed feed responses.
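
To make three of the acceptance criteria concrete (entity decoding, HTTP/HTTPS-only links, and degrading a single feed without failing the aggregation), here is a minimal sketch. It is not the existing code in dashboard/inject.mjs: the names decodeEntities, sanitizeLink, and parseFeedSafely are hypothetical, the entity table covers only a handful of common names, and the parse argument stands in for whichever structured parser is eventually chosen, assumed to return an array of { title, link } objects with still-escaped text.

```javascript
// Hypothetical helpers for illustration; none of these names exist in the project yet.

// Small table of common named entities. A real implementation may need a fuller list.
const NAMED_ENTITIES = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"],
  ["quot", '"'], ["apos", "'"], ["nbsp", "\u00a0"],
]);

// Decode named, decimal, and hexadecimal entities in a single pass,
// so "&amp;lt;" becomes "&lt;" rather than "<".
function decodeEntities(text) {
  return String(text ?? "").replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code > 0 && code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown names are left as-is.
    return NAMED_ENTITIES.has(body) ? NAMED_ENTITIES.get(body) : match;
  });
}

// Return a normalized absolute URL for http/https links, otherwise null.
function sanitizeLink(raw) {
  try {
    const url = new URL(String(raw).trim());
    return url.protocol === "http:" || url.protocol === "https:" ? url.href : null;
  } catch {
    return null; // relative, empty, or unparseable
  }
}

// Run one feed through `parse` and report failure as a degraded result,
// so one malformed feed cannot break the rest of the aggregation.
// Assumption: `parse` returns raw, still-escaped text. Skip decodeEntities
// if the chosen parser already decodes entities.
function parseFeedSafely(name, xml, parse) {
  try {
    const items = parse(xml)
      .map((item) => ({
        title: decodeEntities(item.title).trim(),
        link: sanitizeLink(item.link),
      }))
      .filter((item) => item.title && item.link);
    return { name, degraded: false, items };
  } catch (err) {
    return { name, degraded: true, items: [], error: String((err && err.message) || err) };
  }
}
```

A caller could map parseFeedSafely over every configured feed and pass only the items of non-degraded results to buildNewsFeed(), while surfacing the degraded flag in the dashboard. Items whose link is not HTTP/HTTPS are dropped here; keeping the headline without a link would be an equally reasonable choice.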
Reference: MrSphay/intelligence-terminal#18