BentoML is a framework for packaging machine learning models into production-grade services. It automates Docker image creation, dependency locking, API generation (REST/gRPC), and deployment orchestration. BentoML supports PyTorch, TensorFlow, Scikit-learn, Hugging Face Transformers, and ONNX models. Because it is framework-agnostic, it is well suited to teams that ship multiple model types.

- Framework-Agnostic: Works with PyTorch, TensorFlow, Scikit-learn, LLMs, and custom models