Dec 8, 2022 • Written By Sean Sheng
The value of an ML model is not realized until it is deployed and served in production. Building an ML application is more challenging than building a traditional application because models and data add complexity on top of the application code. Using a web serving framework (e.g. FastAPI) can work for simple cases but falls short on performance and efficiency. Alternatively, a pre-packaged model server (e.g. Triton Inference Server) can be ideal for low-latency serving and resource utilization, but it lacks flexibility in defining custom logic and dependencies. BentoML abstracts away these complexities by creating separate runtimes for IO-intensive preprocessing logic and compute-intensive model inference logic. At the same time, BentoML offers an intuitive and flexible Python-first SDK for defining custom preprocessing logic, orchestrating multi-model inference, and integrating with other frameworks in the MLOps ecosystem.
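To make that separation concrete, here is a minimal sketch of a BentoML service in the 1.0-style Python SDK: preprocessing runs in the API server process, while inference is dispatched to a runner running in its own runtime. The model tag `iris_clf:latest` and the scikit-learn framework are assumptions for illustration only.

```python
import numpy as np
import bentoml
from bentoml.io import NumpyNdarray

# Load a previously saved model (hypothetical tag) and wrap it in a runner,
# the separate runtime BentoML uses for compute-intensive inference.
iris_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()

svc = bentoml.Service("iris_classifier", runners=[iris_runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
async def classify(features: np.ndarray) -> np.ndarray:
    # IO-intensive preprocessing stays in the API server process...
    features = features.astype("float64")
    # ...while inference runs in the runner's dedicated runtime.
    return await iris_runner.predict.async_run(features)
```

Because the API function is plain Python, custom preprocessing or calls to additional runners can be composed in the same place, which is how multi-model inference is orchestrated.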