TorchServe is an open-source framework, developed jointly by AWS and Meta, for deploying PyTorch models as production APIs. You package your trained model weights, write a handler (the preprocessing, inference, and postprocessing code), and TorchServe exposes the model via REST and gRPC endpoints. TorchServe also handles operational concerns: dynamic batching (e.g., combining dozens of concurrent requests into a single forward pass), GPU management, model versioning, A/B testing, and metrics. This lets ML engineers focus on model quality rather than infrastructure.
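The handler contract mentioned above can be sketched as follows. A real handler subclasses `BaseHandler` from `ts.torch_handler.base_handler` and loads actual model weights in `initialize`; here the same preprocess → inference → postprocess lifecycle is mimicked with plain Python and a stand-in model so the flow runs standalone. The `EchoHandler` class and its toy model are hypothetical, for illustration only:

```python
import json


class EchoHandler:
    """Hypothetical handler illustrating TorchServe's three-stage pipeline.

    A production handler would subclass ts.torch_handler.base_handler.BaseHandler;
    this sketch only mirrors the method names and data flow.
    """

    def initialize(self, context=None):
        # Real handlers load weights here (e.g., torch.jit.load); we fake a
        # "model" that maps each input string to its length.
        self.model = lambda batch: [len(text) for text in batch]

    def preprocess(self, requests):
        # TorchServe passes a list of requests -- one per client call in the
        # dynamic batch; each request body typically arrives as raw bytes.
        return [req["body"].decode("utf-8") for req in requests]

    def inference(self, inputs):
        # One forward pass over the whole batch.
        return self.model(inputs)

    def postprocess(self, outputs):
        # Must return exactly one response per request in the batch.
        return [json.dumps({"length": o}) for o in outputs]

    def handle(self, requests, context=None):
        data = self.preprocess(requests)
        return self.postprocess(self.inference(data))


handler = EchoHandler()
handler.initialize()
responses = handler.handle([{"body": b"hello"}, {"body": b"torchserve"}])
print(responses)  # one JSON response per batched request
```

The batching benefit falls out of this shape: because `preprocess` receives a list, the server can hand the handler many concurrent requests at once and amortize them over a single `inference` call.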