* fix: support large proxy model endpoint refresh
Large OpenAI-compatible proxy endpoints can expose hundreds of models and make /v1/models slow. Treating those endpoints like local model servers caused model picker opens and background probes to repeatedly hit /models, producing timeouts and making otherwise usable endpoints appear offline.
Make model endpoint discovery cached-first for normal UI usage, add explicit proxy/API classification and refresh policy fields, exclude proxy/API endpoints from aggressive local probing, and preserve cached models when refresh fails.
Manual Test/Add/Refresh actions still fetch the full model list with longer timeouts so users can intentionally import large proxy model lists without blocking normal model picker usage.
* fix: preserve endpoint ping status semantics
KNOWN_CONTEXT_WINDOWS lists 'o1' (200k) before 'o1-mini' (128k), and
_lookup_known returned on the first substring hit — so "o1-mini" matched
'o1' and reported 200000 instead of 128000. Track the longest matching
key instead, so the most specific entry wins regardless of table order.