Welcome to the documentation for NEANIAS Serving trained ML models with BentoML


BentoML is an open-source framework for high-performance machine learning model serving. It makes it easy to build production API endpoints for trained ML models and supports all major machine learning frameworks, including TensorFlow, Keras, PyTorch, XGBoost, scikit-learn, and fastai.

BentoML comes with a high-performance API model server with adaptive micro-batching support, bringing the advantage of batch processing to online serving workloads. It also provides batch serving, model registry and cloud deployment functionality, which gives ML teams an end-to-end model serving solution with baked-in DevOps best practices.
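
As a rough illustration of the micro-batching idea mentioned above, the sketch below groups queued requests so that the model is called once per batch rather than once per request. It is not BentoML's implementation: the function micro_batch, the stand-in fake_model, and the fixed max_batch_size are hypothetical names chosen for this example, and an adaptive server would tune such limits at runtime instead of fixing them.

```python
from typing import Callable, List, Sequence

Row = List[float]


def micro_batch(
    pending: Sequence[Row],
    predict_batch: Callable[[List[Row]], List[int]],
    max_batch_size: int = 4,
) -> List[int]:
    """Run predict_batch once per group of up to max_batch_size rows."""
    results: List[int] = []
    for start in range(0, len(pending), max_batch_size):
        batch = list(pending[start:start + max_batch_size])
        # One model call serves every request in this group.
        results.extend(predict_batch(batch))
    return results


calls = []


def fake_model(rows):
    # Hypothetical stand-in for a real model: records the batch size and
    # labels each row by whether its third feature exceeds 2.5.
    calls.append(len(rows))
    return [int(row[2] > 2.5) for row in rows]


requests = [
    [5.1, 3.5, 1.4, 0.2],
    [6.2, 2.9, 4.3, 1.3],
    [5.9, 3.0, 5.1, 1.8],
    [4.9, 3.0, 1.4, 0.2],
    [6.7, 3.1, 4.7, 1.5],
]
predictions = micro_batch(requests, fake_model, max_batch_size=4)
```

Here five requests result in two model calls (one with four rows, one with one row), which is where the throughput gain of batching comes from.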


The BentoML stack is available as a VM at


On the Jupyter endpoint you can find a notebook called bentoml-quick-start-guide. This notebook provides all the blocks needed to create a model and store it in BentoML.

The Yatai service is the core of BentoML; it handles model storage and deployment. The Yatai endpoint can be used from your local environment.

The BentoML UI provides a user interface that lists all models stored in Yatai.

MinIO is the storage backend where models are kept.


Set up the local environment

Install the BentoML dependencies:

pip3 install bentoml pandas scikit-learn

After installing the bentoml CLI, set the Yatai service endpoint so that the remote service is used:

bentoml config set yatai_service.url=

Create a classifier:

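%%writefile iris_classifier.py
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.artifact import SklearnModelArtifact

# The decorators below follow the BentoML 0.8 quick-start guide; newer
# releases may use different names (for example infer_pip_packages).
@env(auto_pip_dependencies=True)
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):

    @api(input=DataframeInput())
    def predict(self, df):
        # Optional pre-processing, post-processing code goes here
        return self.artifacts.model.predict(df)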

After that, you can train the classifier:

from sklearn import svm
from sklearn import datasets

# import the custom BentoService defined above
from iris_classifier import IrisClassifier

# Load training data
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Model Training
clf = svm.SVC(gamma='scale')
clf.fit(X, y)

# Create an iris classifier service instance
iris_classifier_service = IrisClassifier()

# Pack the newly trained model artifact
iris_classifier_service.pack('model', clf)

# Save the prediction service to disk for model serving
saved_path = iris_classifier_service.save()

A new model is created and stored in Yatai. Another user can then retrieve and serve it with the bentoml CLI:

bentoml serve IrisClassifier:latest

The model is now served at localhost:5000. Use the curl command to send a prediction request:

curl -i \
 --header "Content-Type: application/json" \
 --request POST \
 --data '[[5.1, 3.5, 1.4, 0.2]]' \
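
The same prediction request can also be prepared from Python. The sketch below uses only the standard library and builds the POST request without sending it. The /predict path in PREDICT_URL and the helper name build_prediction_request are assumptions made for this example; adjust the URL to match the endpoint your service actually exposes.

```python
import json
import urllib.request

# Assumption: the service started by `bentoml serve` exposes the API at
# /predict on localhost:5000; adjust the URL to match your deployment.
PREDICT_URL = "http://localhost:5000/predict"


def build_prediction_request(rows, url=PREDICT_URL):
    """Build a JSON POST request carrying one feature row per sample."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_prediction_request([[5.1, 3.5, 1.4, 0.2]])

# Sending requires a running server, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Each inner list is one sample with the four iris features in the same order as the training data.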

More examples are available in the documentation.


Please contact Francesco Caronte (francesco.caronte@altecspace.it) for any assistance.