LiteLLM Kubernetes Security Hardening and Troubleshooting

2026.03.05

Part 4 of 4. In Part 3 we connected AI tools to LiteLLM. This final part covers production hardening, troubleshooting, and next steps.

LiteLLM is an open-source AI gateway that unifies 100+ provider APIs behind a single OpenAI-compatible endpoint. If you’re following from earlier in the series, revisit Part 1: Architecture.

Production Considerations

My current rig is a homelab deployment. If you’re taking this to production, these upgrades are non-negotiable.

Use Kubernetes Secrets

The Part 2 deployment already references secrets via secretKeyRef. Audit your ConfigMaps and logs for any leaked credentials.

env:
  - name: KIMI_API_KEY
    valueFrom:
      secretKeyRef:
        name: litellm-provider-keys
        key: KIMI_API_KEY
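For reference, the Secret behind that secretKeyRef could look like the sketch below. The name litellm-provider-keys and the key KIMI_API_KEY come from the snippet above, and the litellm namespace matches the other manifests in this part. The value is a placeholder you must replace, and the filled-in file should stay out of Git.

```yaml
# Sketch of the Secret that the secretKeyRef above points at.
apiVersion: v1
kind: Secret
metadata:
  name: litellm-provider-keys
  namespace: litellm
type: Opaque
stringData:
  # stringData takes plain text; the API server stores it base64-encoded.
  KIMI_API_KEY: "replace-with-your-real-key"  # placeholder, not a real key
```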

Add Resource Limits

resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "2000m"

Enable TLS

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: litellm-ingress
  namespace: litellm
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - litellm.yourdomain.com
      secretName: litellm-tls
  rules:
    - host: litellm.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: litellm-service
                port:
                  number: 4000
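The cluster-issuer annotation only works if a ClusterIssuer named letsencrypt-prod already exists in the cluster. A minimal sketch of one, assuming cert-manager is installed and HTTP-01 validation through your ingress controller is acceptable, might look like this. The email address, the account-key secret name, and the traefik ingress class are assumptions; adjust them to your setup.

```yaml
# Sketch of the ClusterIssuer referenced by the Ingress annotation.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@yourdomain.com            # assumption: your contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key   # assumption: any unused Secret name
    solvers:
      - http01:
          ingress:
            # Assumption: Traefik, the k3s default. Older cert-manager
            # releases use the "class" field here instead.
            ingressClassName: traefik
```

Testing against the Let's Encrypt staging server first is a common way to avoid hitting production rate limits while you debug DNS and routing.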

Set Up Monitoring

Monitor request volume, error rates, and latency.

LiteLLM exposes Prometheus metrics at /metrics:

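apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: litellm-metrics
  namespace: litellm
spec:
  selector:
    matchLabels:
      app: litellm
  endpoints:
    # "port" must be the name of a port on the Service, not its number.
    # "http" is an assumed name; use whatever litellm-service calls port 4000.
    - port: http
      path: /metrics
      interval: 30s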

Implement Rate Limiting

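Rate limiting keeps a single runaway client from exhausting a provider quota. The baseline config below sets the master key, the routing strategy, and the Prometheus callbacks; request and token limits are configured per model or per key on top of it.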

general_settings:
  master_key: "os.environ/LITELLM_MASTER_KEY"
router_settings:
  routing_strategy: "simple-shuffle"
litellm_settings:
  success_callback: ["prometheus"]
  failure_callback: ["prometheus"]
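To attach actual budgets, one option is to set rpm and tpm on a deployment in model_list. The sketch below is an assumption-laden illustration: the alias and the OpenRouter model string are placeholders, and the numbers are arbitrary. How strictly LiteLLM treats these values depends on the routing strategy and version (under simple-shuffle the docs describe them as routing weights), so check the documentation for your release. Per-key caps are set separately when you create virtual keys, through the rpm_limit and tpm_limit fields.

```yaml
# Sketch: per-deployment request and token budgets in proxy_config.yaml.
model_list:
  - model_name: my-openrouter-model            # hypothetical alias that clients request
    litellm_params:
      model: "openrouter/<provider>/<model>"   # placeholder, fill in a real model id
      api_key: os.environ/OPENROUTER_API_KEY
      rpm: 60         # requests per minute for this deployment (arbitrary example)
      tpm: 100000     # tokens per minute for this deployment (arbitrary example)
```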

Troubleshooting

| Error | Cause | Fix |
| --- | --- | --- |
| Connection refused | Pod not ready | kubectl get pods -n litellm |
| Authentication Error | Wrong API key | Verify LITELLM_MASTER_KEY |
| Model not found | Typo in model name | Check proxy_config.yaml entries |
| Kimi: 401 Unauthorized | Missing headers | Add User-Agent + X-Kimi-Client |
| OpenRouter: 400 | Missing HTTP-Referer | Add the header with your domain |
| DB connection failed | Wrong Postgres URL | Verify DATABASE_URL format |
| Pod stuck Pending | HostPath missing | Create /k3s_storage/litellm first |

Debug Mode

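Add --detailed_debug to the container args in the Deployment, then follow the logs with kubectl logs -n litellm -l app=litellm --tail=100 -f. Remove the flag once you are done, since debug output is verbose.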
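In the Deployment spec, the change is a one-line addition to the container args. The container name and the config mount path below are assumptions, so keep whatever your Part 2 manifest already uses and append only the last entry.

```yaml
# Sketch: container args with debug logging switched on.
containers:
  - name: litellm                # assumption: your container name may differ
    args:
      - "--config"
      - "/app/config.yaml"       # assumption: wherever proxy_config.yaml is mounted
      - "--port"
      - "4000"
      - "--detailed_debug"       # the only new line
```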

Conclusion

LiteLLM on Kubernetes eliminated my API key chaos. Provider changes pricing? Update one config. Want a new model? Toss three lines into proxy_config.yaml. Spinning up a new agent? One URL, one key. No more hunting through five different tools to rotate credentials.

Next steps: Add fallback routing for resilience, spend tracking to monitor costs across providers, and caching to reduce latency and save on repeated queries. Each feature takes minutes to configure and compounds the value of your gateway.
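As a rough starting point for two of those next steps, fallbacks and caching can both be expressed under litellm_settings. This is a sketch under stated assumptions: primary-model and backup-model are hypothetical aliases that must match model_name entries in your model_list, and the Redis host assumes a Redis Service named redis in the litellm namespace, which this series has not deployed. Merge these keys into your existing litellm_settings block instead of adding a second one.

```yaml
# Sketch: fallback routing plus a Redis-backed response cache.
litellm_settings:
  fallbacks:
    # If primary-model fails, retry the request on backup-model.
    - primary-model: ["backup-model"]   # hypothetical aliases
  cache: true
  cache_params:
    type: redis
    host: redis.litellm.svc.cluster.local   # assumption: your Redis Service DNS name
    port: 6379
```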

Frequently Asked Questions

What is LiteLLM?

LiteLLM is an open-source LLM gateway that exposes a unified OpenAI-compatible API for 100+ providers.

Why deploy on Kubernetes instead of locally?

Kubernetes gives you centralized management, persistent config, one endpoint shared by every client, and stateless proxy pods that scale horizontally.

Can I add more providers?

Yes. Over 100 providers are supported. Add an entry to model_list in proxy_config.yaml.

Is my API data secure?

Provider keys live in Kubernetes Secrets, which are only base64-encoded unless you enable encryption at rest. For production, also enable TLS, network policies, and namespace isolation.
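A network policy for the proxy could look like the sketch below, which admits traffic to port 4000 only from the namespace that runs the ingress controller. The kube-system namespace is an assumption based on the k3s default for Traefik, and the policy name is made up. Any in-cluster clients or a Prometheus instance in other namespaces would need their own from entries, and the policy only takes effect if your cluster's network plugin enforces NetworkPolicy.

```yaml
# Sketch: restrict inbound traffic to the LiteLLM pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: litellm-allow-ingress-only   # hypothetical name
  namespace: litellm
spec:
  podSelector:
    matchLabels:
      app: litellm
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Assumption: the ingress controller runs in kube-system.
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: TCP
          port: 4000
```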

How do I update config without restarting?

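The proxy reads proxy_config.yaml at startup, so a file change still needs new pods. Update proxy_config.yaml and run kubectl rollout restart deployment/litellm-deployment -n litellm. With the default RollingUpdate strategy, the replacement pod starts before the old one is removed.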

Proxy vs SDK?

The Proxy is a standalone server, which is what this series deploys. The SDK is a Python library you import into your own code. Use the Proxy when multiple clients need to share one gateway.
