LiteLLM Kubernetes Security Hardening and Troubleshooting
Part 4 of 4. In Part 3 we connected AI tools to LiteLLM. This final part covers production hardening, troubleshooting, and next steps.
LiteLLM is an open-source AI gateway that unifies 100+ provider APIs behind a single OpenAI-compatible endpoint. If you’re following from earlier in the series, revisit Part 1: Architecture.
Production Considerations
My current rig is a homelab deployment. If you’re taking this to production, these upgrades are non-negotiable.
Use Kubernetes Secrets
The Part 2 deployment already references secrets via secretKeyRef. Audit your ConfigMaps and logs for any leaked credentials.
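For reference, here is a minimal sketch of a Secret that such a secretKeyRef could resolve against. The Secret name and key are the ones this section references; the namespace is assumed to be litellm, as in the other manifests in this part, and the value is a placeholder rather than a real key.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: litellm-provider-keys
  namespace: litellm          # assumption: same namespace as the deployment
type: Opaque
stringData:
  # Placeholder only; never commit a real key to version control.
  KIMI_API_KEY: "replace-with-your-provider-key"
```

The stringData field accepts plain text, and Kubernetes stores it base64-encoded under data, so there is no need to encode the value by hand.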
    env:
      - name: KIMI_API_KEY
        valueFrom:
          secretKeyRef:
            name: litellm-provider-keys
            key: KIMI_API_KEY

Add Resource Limits
    resources:
      requests:
        memory: "512Mi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "2000m"

Enable TLS
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: litellm-ingress
      namespace: litellm
      annotations:
        cert-manager.io/cluster-issuer: "letsencrypt-prod"
    spec:
      tls:
        - hosts:
            - litellm.yourdomain.com
          secretName: litellm-tls
      rules:
        - host: litellm.yourdomain.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: litellm-service
                    port:
                      number: 4000

Set Up Monitoring
Monitor request volume, error rates, and latency.
LiteLLM exposes Prometheus metrics at /metrics:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: litellm-metrics
      namespace: litellm
    spec:
      selector:
        matchLabels:
          app: litellm
      endpoints:
        - port: "4000"      # matched against the Service port's name
          path: /metrics
          interval: 30s

Implement Rate Limiting
Rate limiting keeps a single client or runaway agent from exhausting a provider's quota.
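One way to express such limits, as a hedged sketch: LiteLLM's model_list accepts rpm and tpm values under litellm_params, which the router treats as per-deployment ceilings. The model alias, upstream model string, API base, and numbers here are hypothetical placeholders, and how strictly the limits are enforced can depend on the routing strategy and LiteLLM version, so check this against the docs for your release.

```yaml
model_list:
  - model_name: example-model                # hypothetical alias that clients request
    litellm_params:
      model: openai/example-model            # hypothetical upstream model string
      api_base: https://api.example.com/v1   # placeholder endpoint
      api_key: os.environ/KIMI_API_KEY       # same os.environ/ syntax the master key uses
      rpm: 60                                # assumed ceiling: requests per minute
      tpm: 100000                            # assumed ceiling: tokens per minute
```

Per-key limits are a separate mechanism: the proxy's key-generation endpoint takes rpm_limit and tpm_limit when virtual keys are issued, which relies on the database connection being configured.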
    # Proxy-wide settings; per-key and per-model rpm/tpm limits are configured separately.
    general_settings:
      master_key: "os.environ/LITELLM_MASTER_KEY"

    router_settings:
      routing_strategy: "simple-shuffle"

    litellm_settings:
      success_callback: ["prometheus"]
      failure_callback: ["prometheus"]

Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| Connection refused | Pod not ready | kubectl get pods -n litellm |
| Authentication Error | Wrong API key | Verify LITELLM_MASTER_KEY |
| Model not found | Typo in model name | Check proxy_config.yaml entries |
| Kimi: 401 Unauthorized | Missing headers | Add User-Agent + X-Kimi-Client |
| OpenRouter: 400 | Missing HTTP-Referer | Add the header with your domain |
| DB connection failed | Wrong Postgres URL | Verify DATABASE_URL format |
| Pod stuck Pending | HostPath missing | Create /k3s_storage/litellm first |
Debug Mode
Add --detailed_debug to the container args, then follow the logs:

    kubectl logs -n litellm -l app=litellm --tail=100 -f
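As a rough sketch, the flag would be appended to the container args in the Deployment. The container name, config path, and the other arguments here are assumptions standing in for whatever Part 2 defined; only --detailed_debug is the addition.

```yaml
spec:
  template:
    spec:
      containers:
        - name: litellm                  # assumed container name
          args:
            - "--config"
            - "/app/config.yaml"         # assumed mount path for proxy_config.yaml
            - "--port"
            - "4000"
            - "--detailed_debug"         # enables verbose debug logging
```

Debug output may include request contents, so it is worth removing the flag once the issue is diagnosed.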
Conclusion
LiteLLM on Kubernetes eliminated my API key chaos. Provider changes pricing? Update one config. Want a new model? Toss three lines into proxy_config.yaml. Spinning up a new agent? One URL, one key. No more hunting through five different tools to rotate credentials.
Next steps: Add fallback routing for resilience, spend tracking to monitor costs across providers, and caching to reduce latency and save on repeated queries. Each feature takes minutes to configure and compounds the value of your gateway.
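To give a sense of what two of those next steps involve, here is a hedged sketch of fallback routing and caching in proxy_config.yaml. The model aliases and the Redis host are hypothetical, and the key names should be verified against the LiteLLM docs for the version you run.

```yaml
litellm_settings:
  num_retries: 2                             # retries attempted per request
  fallbacks:
    - primary-model: ["backup-model"]        # hypothetical aliases from model_list
  cache: true
  cache_params:
    type: redis
    host: redis.litellm.svc.cluster.local    # hypothetical in-cluster Redis service
    port: 6379
```

These keys would be merged into the existing litellm_settings block rather than added as a second block with the same name.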
Frequently Asked Questions
What is LiteLLM?
Open-source LLM gateway providing a unified OpenAI-compatible API for 100+ providers.
Why deploy on Kubernetes instead of locally?
Centralized management, persistent config, shared access, and stateless scaling.
Can I add more providers?
Yes. Over 100 providers are supported. Add an entry to model_list in proxy_config.yaml.
Is my API data secure?
Keys are in Kubernetes Secrets. For production, enable TLS, network policies, and namespace isolation.
How do I update config without restarting?
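In this setup the config file is read when the proxy starts, so a restart is still required. Update proxy_config.yaml, then run kubectl rollout restart deployment/litellm-deployment -n litellm to apply it as a rolling update.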
Proxy vs SDK?
Proxy is a standalone server (this series). SDK is a Python library. Use Proxy for multiple clients.
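The network-policy recommendation in the "Is my API data secure?" answer could start from something like the following sketch. It assumes the LiteLLM pods carry the app: litellm label, that traffic arrives through an ingress controller in a namespace named ingress-nginx, and that Prometheus runs in a namespace named monitoring; both namespace names are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: litellm-allow-ingress
  namespace: litellm
spec:
  podSelector:
    matchLabels:
      app: litellm                                        # assumed pod label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx  # hypothetical ingress namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring     # hypothetical Prometheus namespace
      ports:
        - protocol: TCP
          port: 4000
```

A policy like this only takes effect if the cluster's network plugin enforces NetworkPolicy objects.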