Policy-as-Code for LLM Inference: Cost & Security Guardrails - Sakalya Deshpande, SAP Labs

Name: Policy-as-Code for LLM Inference: Cost & Security Guardrails - Sakalya Deshpande, SAP Labs
Uploaded: 2026-05-28T16:36:19.000Z
Channel: CNCF

Kyverno, Policy as Code, LLM inference, Kubernetes, GPU management, Cloud Native, Cost optimization, MLOps, Admission control, Kubernetes security


AI summary

This talk demonstrates how Kyverno validating admission policies can enforce cost and security guardrails for LLM inference on Kubernetes before workloads reach the scheduler. Sakalya Deshpande shows practical policies that reject inference requests exceeding token budgets, enforce GPU limits on model serving deployments, and require cost-attribution labels for chargeback. Platform engineers building multi-tenant AI infrastructure will learn declarative policy patterns that require no sidecars or custom controllers.
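
As a rough illustration of the kind of guardrail the summary describes, the manifest below sketches a Kyverno `ClusterPolicy` that rejects Deployments missing a cost-attribution label and caps GPU limits per container. It is not taken from the talk: the policy name, the `cost-center` label key, and the limit of one GPU are assumptions chosen for the example, and the talk may use Kyverno's newer CEL-based policy types rather than the pattern syntax shown here.

```yaml
# Hypothetical sketch, not the speaker's policy. The names, the label key,
# and the GPU cap are illustrative assumptions.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: llm-inference-guardrails        # assumed name
spec:
  # Reject non-compliant resources at admission instead of only auditing.
  # Newer Kyverno releases also allow setting this per rule.
  validationFailureAction: Enforce
  background: true
  rules:
    - name: require-cost-attribution-label
      match:
        any:
          - resources:
              kinds:
                - Deployment
      validate:
        message: "Deployments must carry a non-empty 'cost-center' label for chargeback."
        pattern:
          metadata:
            labels:
              cost-center: "?*"         # assumed label key; "?*" means any non-empty value
    - name: cap-gpu-limit
      match:
        any:
          - resources:
              kinds:
                - Deployment
      validate:
        message: "Containers may request at most 1 GPU."
        pattern:
          spec:
            template:
              spec:
                containers:
                  # "=(...)" anchors apply the check only when the key is present,
                  # so containers that request no GPU are not rejected.
                  - =(resources):
                      =(limits):
                        "=(nvidia.com/gpu)": "<=1"   # assumed cap
```

Because both rules run at admission, a non-compliant Deployment is rejected before the scheduler sees it, which matches the "no sidecars or custom controllers" point in the summary. Token budgets are not a standard Kubernetes field, so a policy for them would depend on how the serving stack exposes that setting (for example an annotation or container argument), which the summary does not specify.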
