A newly disclosed flaw in SGLang means a malicious GGUF model file can become an execution path, not just a poisoned model artifact. CERT/CC says CVE-2026-5760 lets attackers achieve remote code execution when a crafted model is loaded and the /v1/rerank endpoint renders an unsafe Jinja2 chat_template. For defenders, the bigger lesson is that model ingestion now belongs in the same risk bucket as untrusted plugins, packages, and templates. If your AI serving stack can pull or load outside models, this is both a patching and incident response problem.
According to CERT/CC, the vulnerable path sits in SGLang's reranking feature. An attacker can prepare a malicious GGUF model file with a crafted tokenizer.chat_template field. When that model is loaded and a request hits /v1/rerank, SGLang renders the template with an unsandboxed jinja2.Environment(), allowing attacker-controlled Python code to execute in the context of the SGLang service.
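The rendering mistake can be illustrated in isolation. The sketch below is not SGLang's actual code: it renders attacker-controlled template text with a plain `jinja2.Environment` and then with jinja2's `SandboxedEnvironment`. The payload is a benign stand-in that only names a class, but the same dunder traversal technique is how real SSTI exploitation reaches `os.popen`.

```python
# Illustration of the bug class (NOT SGLang's actual code): rendering
# attacker-controlled template text without a sandbox.
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

# Stand-in for a malicious tokenizer.chat_template: this traversal only
# names a class, but the same technique can reach os.popen.
payload = "{{ ''.__class__.__mro__[-1].__name__ }}"

# Unsandboxed rendering lets the dunder traversal succeed.
print(Environment().from_string(payload).render())  # prints: object

# A sandboxed environment refuses the underscore attribute access.
try:
    SandboxedEnvironment().from_string(payload).render()
except SecurityError as exc:
    print("blocked:", exc)
```

The fix pattern for this bug class is exactly that second branch: never feed file-derived template strings into an unsandboxed Jinja2 environment.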
This matters because the exploit chain does not begin with a classic web payload alone. It begins with a model artifact that looks operationally normal in many AI workflows. In other words, a model download can become a code execution event if teams treat model provenance as an MLOps hygiene issue instead of a security boundary.
The immediate impact is remote code execution on infrastructure that may already have access to sensitive prompts, application secrets, internal datasets, and adjacent GPU workloads. CERT/CC explicitly warns that successful exploitation could lead to host compromise, lateral movement, command-and-control, data theft, or denial of service.
The more strategic problem is trust collapse around model ingestion. Many teams now pull community models, test quantized variants quickly, and move promising artifacts into internal serving environments. If those environments assume a model file is "just data," then controls around review, access control, and isolation will lag behind the real risk.
Public details currently center on:
- the /v1/rerank endpoint
- tokenizer.chat_template metadata in GGUF model files

Deployments are at highest risk when they load externally sourced models into environments where the reranking path is enabled.
At the time of disclosure, CERT/CC said no response or patch had been obtained from the project maintainers during coordination. That raises the priority of compensating controls right now.
Do not treat GGUF or similar model formats as passive content. For this issue, model metadata is part of the attack surface. Any workflow that pulls models from public repositories or third parties should be reviewed immediately.
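One way to start treating metadata as attack surface is a pre-load heuristic check. The sketch below is a raw byte scan, not a real GGUF parser: it looks for the tokenizer.chat_template key and flags SSTI-style patterns near it. The pattern list is illustrative and evadable; treat a hit as a reason for manual review, not a clean bill of health when empty.

```python
# Heuristic pre-load check (a sketch, not a real GGUF parser): flag
# SSTI-style patterns near the chat_template key in a model file's
# raw bytes. The pattern list is illustrative, not exhaustive.
import re
from pathlib import Path

SUSPICIOUS = [rb"__class__", rb"__globals__", rb"__subclasses__",
              rb"__import__", rb"popen", rb"subprocess"]

def flag_chat_template(path: str, window: int = 65536) -> list[bytes]:
    """Return suspicious patterns found after a tokenizer.chat_template key."""
    data = Path(path).read_bytes()
    idx = data.find(b"tokenizer.chat_template")
    if idx == -1:
        return []                       # key absent: nothing to check here
    region = data[idx:idx + window]     # inspect the bytes following the key
    return [p for p in SUSPICIOUS if re.search(p, region)]
```

A non-empty result should quarantine the artifact until a human has looked at the template.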
If SGLang is deployed in production, identify whether /v1/rerank is enabled and whether cross-encoder reranking workflows depend on externally sourced models. If the feature is not essential, disable or isolate it until safe rendering behavior is confirmed.
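If the endpoint cannot be removed outright, it can be fenced off in front of the application. This is a minimal ASGI middleware sketch (the 403 response and path prefix are assumptions; a deny rule in your reverse proxy achieves the same result):

```python
# Minimal ASGI middleware sketch: deny /v1/rerank until safe template
# rendering is confirmed. Works with any ASGI server.
BLOCKED_PREFIXES = ("/v1/rerank",)

def block_rerank(app):
    async def middleware(scope, receive, send):
        if scope["type"] == "http" and scope.get("path", "").startswith(BLOCKED_PREFIXES):
            await send({"type": "http.response.start", "status": 403,
                        "headers": [(b"content-type", b"text/plain")]})
            await send({"type": "http.response.body",
                        "body": b"rerank endpoint disabled"})
            return
        await app(scope, receive, send)  # everything else passes through
    return middleware
```

Wrapping the served app (`app = block_rerank(app)`) keeps the mitigation in one auditable place.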
Move model acquisition behind an approval flow. Require vetted sources, immutable hashes, and a manual review step before new model files reach shared inference infrastructure. This is the same logic defenders already apply to containers, packages, and CI/CD dependencies.
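The approval flow can be enforced mechanically at deploy time. A sketch, assuming an allowlist of vetted SHA-256 digests (function names and layout are illustrative):

```python
# Gate model files on an immutable-hash allowlist before they reach
# shared inference infrastructure. Names here are illustrative.
import hashlib

def sha256_file(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest()

def is_approved(path: str, allowlist: set[str]) -> bool:
    """True only if the file's digest was vetted ahead of time."""
    return sha256_file(path) in allowlist
```

A deploy script that refuses unapproved digests turns "someone reviewed this model" from a convention into a gate.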
Review SGLang service logs, process creation telemetry, shell history, outbound connections, and artifact download traces after any recent model onboarding. Pay special attention to systems that pulled new GGUF models shortly before suspicious execution or network activity.
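That review can be partially automated. The sketch below correlates model-download timestamps with suspicious events that follow shortly after; the event shapes and the 30-minute window are assumptions to adapt to your telemetry schema.

```python
# Correlate model downloads with suspicious events that follow within a
# time window. Event dict shapes and the window are assumptions.
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)

def correlate(downloads, events):
    """Return (download, event) pairs where the event follows a model pull."""
    hits = []
    for d in downloads:
        for e in events:
            if d["time"] <= e["time"] <= d["time"] + WINDOW:
                hits.append((d, e))
    return hits
```

Pairs surfaced here are leads for triage, not verdicts; widen the window if your telemetry is coarse.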
In particular, hunt for requests to /v1/rerank from unusual sources, and prioritize systems where /v1/rerank is exposed or used alongside externally sourced models.

This is the kind of flaw that forces defenders to update their mental model of AI infrastructure. The interesting part is not only that SGLang has an RCE. It is that a model file can carry the trigger for server-side code execution in a production inference path. That makes model onboarding a software supply chain trust problem, not just a performance or quality problem.
Security teams should use this disclosure to ask a blunt question: who is allowed to introduce new model artifacts into environments that can reach sensitive data or production services? If the answer is unclear, CVE-2026-5760 is a warning shot.
What is CVE-2026-5760?
It is an SGLang vulnerability that can lead to remote code execution when a malicious GGUF model file is loaded and the vulnerable reranking path renders attacker-controlled template content.

Can it be exploited without a malicious model file?
No. The disclosed path depends on both a crafted model artifact and use of the vulnerable reranking behavior, which is why model provenance matters so much here.

Is a patch available?
CERT/CC said no patch or maintainer response had been obtained during coordination at disclosure time, so compensating controls should come first.

What should defenders do now?
Audit SGLang usage, restrict /v1/rerank, stop loading untrusted models into affected environments, and review recent model onboarding for signs of compromise.