Set Up MLflow on Kubernetes
MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. Integrating it with Kubernetes provides scalable deployment, artifact storage, and tracking capabilities, essential for managing production-level ML models. This guide walks through setting up MLflow on Kubernetes, with PostgreSQL as the backend store and MinIO as the artifact store.
Dependencies
Before diving into the setup, ensure you have the following dependencies ready:
- PostgreSQL as a Backend Store
  - Ensure the namespace of the PersistentVolumeClaim is set to mlops
  - Ensure the namespace of the Deployment is set to mlops
  - Ensure the namespace of the Service is set to mlops and the Service port is set to 5432
- MinIO for storing Artifacts
- A Private Container Registry to host your custom MLflow Docker image
These components will enable smooth data storage and access for MLflow tracking.
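For reference, the tracking server manifest later in this guide reaches PostgreSQL at postgres-service:5432 and MinIO at minio-service:9000 inside the mlops namespace. A minimal sketch of the PostgreSQL Service those settings assume (your actual manifest and selector labels may differ):
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  namespace: mlops
spec:
  selector:
    app: postgres   # assumes your PostgreSQL pods carry this label
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432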
Install MLflow Tracking Server
We will install MLflow on Kubernetes using a custom Docker image. Begin by creating a self-signed certificate for secure communication.
Create Self-Signed Certificate
Prerequisite: openssl must be installed (the commands below were run on macOS)
openssl req -x509 -nodes -days 365 \
-subj "/C=DE/ST=Berlin/L=Berlin/O=appdev24/OU=dev/CN=mlflow.local" \
-newkey rsa:4096 -keyout selfsigned.key \
-out selfsigned.crt
The -subj switch sets the certificate subject fields:
- Country Name (2-letter code): DE
- State or Province Name (full name): Berlin
- Locality Name (city): Berlin
- Organization Name (company): appdev24
- Organizational Unit Name (department): dev
- Common Name (server FQDN): mlflow-tracking.local
Update hosts file
To ensure your system recognizes the MLflow tracking server, update the /etc/hosts file by adding the domain name and pointing it to your localhost or cluster IP:
sudo vi /etc/hosts
# Append the server entry
127.0.0.1 mlflow-tracking.local
Create Namespace
Next, create a dedicated namespace for your MLflow resources in your Kubernetes cluster:
kubectl create namespace mlops
Create Kubernetes Secret
Then, create a Kubernetes TLS secret to store the certificate and key within the cluster:
kubectl create secret tls mlflow-tracking-tls --namespace mlops --cert=selfsigned.crt --key=selfsigned.key
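You can verify that the secret exists before it is referenced by the ingress later on:
kubectl get secret mlflow-tracking-tls --namespace mlops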
Create Custom Image
You'll need a Dockerfile to define your custom MLflow image, which includes essential dependencies such as psycopg2 for PostgreSQL integration:
FROM ghcr.io/mlflow/mlflow:v2.16.2
# Build tooling plus the PostgreSQL driver and S3 client used by the tracking server
RUN apt-get -y update && \
    apt-get -y install python3-dev build-essential pkg-config && \
    pip install --upgrade pip && \
    pip install psycopg2-binary boto3
CMD ["bash"]
After building the image, push it to your private container registry:
docker build -t registry.local/mlflow:v1 -f Dockerfile .
docker push registry.local/mlflow:v1
Create Deployment
With the image ready, define the Kubernetes deployment manifest to deploy MLflow Tracking Server in your cluster. Make sure to include environment variables for PostgreSQL and MinIO integration:
The cluster needs a Secret of type kubernetes.io/dockerconfigjson to authenticate with the container registry and pull the private image. Create it first:
kubectl create secret docker-registry regcred --namespace mlops \
--docker-server=registry.local \
--docker-username=admin \
--docker-password=Passw0rd1234
mlflow-tracking-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-tracking-deployment
  namespace: mlops
  labels:
    app: mlflow-tracking-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: mlflow-tracking
          image: registry.local/mlflow:v1
          ports:
            - containerPort: 5000
          env:
            - name: MLFLOW_S3_ENDPOINT_URL
              value: "http://minio-service:9000"
            - name: MLFLOW_S3_IGNORE_TLS
              value: "true"
            - name: AWS_ACCESS_KEY_ID
              value: admin
            - name: AWS_SECRET_ACCESS_KEY
              value: Password1234
          command: ["mlflow", "server", "--host", "0.0.0.0", "--port", "5000", "--backend-store-uri", "postgresql+psycopg2://postgres:Password1234@postgres-service:5432/postgres", "--default-artifact-root", "s3://mlflow-artifacts"]
kubectl apply -f mlflow-tracking-deploy.yaml
This deployment sets up your MLflow tracking server, listening on port 5000.
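The manifest above embeds the database password and MinIO credentials directly in the pod spec to keep the walkthrough short. In a real cluster you would typically keep them in a Kubernetes Secret instead; a minimal sketch, assuming a Secret named mlflow-credentials that you create yourself:
kubectl create secret generic mlflow-credentials --namespace mlops \
  --from-literal=AWS_ACCESS_KEY_ID=admin \
  --from-literal=AWS_SECRET_ACCESS_KEY=Password1234
The container spec can then load these values with envFrom in place of the inline env entries:
          envFrom:
            - secretRef:
                name: mlflow-credentials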
Create Service
To access your MLflow tracking server, you’ll need to expose it as a service in Kubernetes. Use a ClusterIP or NodePort service type, depending on your environment:
mlflow-tracking-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mlflow-tracking-service
  namespace: mlops
  labels:
    app: mlflow-tracking-service
spec:
  type: ClusterIP
  selector:
    app: mlflow
  ports:
    - name: tcp
      protocol: TCP
      port: 5000
      targetPort: 5000
kubectl apply -f mlflow-tracking-svc.yaml
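Before adding the ingress, you can sanity-check the service with a port-forward; the tracking server exposes a /health endpoint that should return OK:
kubectl port-forward --namespace mlops svc/mlflow-tracking-service 5000:5000
# In a second terminal
curl http://localhost:5000/health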
Create Ingress
To make MLflow accessible through an easy-to-remember URL, set up an NGINX ingress:
mlflow-tracking-ing.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mlflow-tracking-ingress
  namespace: mlops
  labels:
    app: mlflow-tracking-ingress
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: mlflow-tracking.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mlflow-tracking-service
                port:
                  number: 5000
  tls:
    - hosts:
        - mlflow-tracking.local
      secretName: mlflow-tracking-tls
kubectl apply -f mlflow-tracking-ing.yaml
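Once the ingress is reconciled, the server should be reachable through it (the -k flag skips verification of the self-signed certificate):
curl -k https://mlflow-tracking.local/health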
Test Remote Tracking Server
Finally, test your MLflow setup by configuring environment variables on your local machine and running a sample MLflow experiment.
Local Machine Setup
To test the remote tracking server, follow these steps:
- Open a terminal on your local machine
- Set the following environment variables:
# Remote Tracking Server Details
export MLFLOW_TRACKING_URI=https://mlflow-tracking.local
export MLFLOW_TRACKING_INSECURE_TLS=true
export MLFLOW_S3_ENDPOINT_URL=https://minio.local
export MLFLOW_S3_IGNORE_TLS=true
export MLFLOW_ARTIFACTS_DESTINATION=s3://mlflow-artifacts
export AWS_ACCESS_KEY_ID=LmxP6ZEYgJRxRE9tzbUM
export AWS_SECRET_ACCESS_KEY=bwygxJDYc9PmYqthNPhdNFVIeaBrpk4OsPMMVuSb
- Create a new directory for testing MLflow:
# Local Environment Setup
mkdir mlflow-testing
cd mlflow-testing
- Create and activate a virtual environment:
python3 -m venv env
source env/bin/activate
- Install MLflow and other dependencies:
pip3 install --upgrade pip
pip3 install mlflow
pip3 install boto3
pip3 install scikit-learn
- Run the sample model training script:
# Sample Model Training
git clone https://github.com/mlflow/mlflow
python3 mlflow/examples/sklearn_elasticnet_wine/train.py
This should train a machine learning model using MLflow and store the artifacts in your MinIO bucket.
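If you prefer a self-contained connectivity check over the example repository, the short script below logs a dummy run against the remote server; it relies on the same environment variables exported above, and the experiment name is arbitrary:
# sanity_check.py - log a dummy run to the remote tracking server
import mlflow

mlflow.set_tracking_uri("https://mlflow-tracking.local")
mlflow.set_experiment("connectivity-check")  # created on first use

with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.79)

print("Run logged - check the MLflow UI")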
You can now browse the MLflow UI at https://mlflow-tracking.local and inspect the run.
Install MLflow Inference Server
Create Self-Signed Certificate
openssl req -x509 -nodes -days 365 \
-subj "/C=DE/ST=Berlin/L=Berlin/O=appdev24/OU=dev/CN=mlflow-inference.local" \
-newkey rsa:4096 -keyout selfsigned-inference.key \
-out selfsigned-inference.crt
Update hosts file
To ensure your system recognizes the MLflow inference server, update the /etc/hosts file by adding the domain name and pointing it to your localhost or cluster IP:
sudo vi /etc/hosts
# Append the server entry
127.0.0.1 mlflow-inference.local
Create Kubernetes Secret
Then, create Kubernetes secrets to store the TLS certificate and key within the cluster:
kubectl create secret tls mlflow-inference-tls --namespace mlops --cert=selfsigned-inference.crt --key=selfsigned-inference.key
Build Docker Image
Build a ready-to-deploy Docker image containing the trained model with mlflow models build-docker:
export EXPERIMENT_ID=0
export RUN_ID=8d07ecfade2e400cb6ecd3710e5b601a
mlflow models build-docker \
-m $MLFLOW_ARTIFACTS_DESTINATION/$EXPERIMENT_ID/$RUN_ID/artifacts/model \
-n registry.local/mlflow-wine-classifier:v1
After building the image, push it to your private container registry:
docker push registry.local/mlflow-wine-classifier:v1
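Before deploying, you can optionally smoke-test the image locally; images produced by mlflow models build-docker serve the model on port 8080 inside the container by default:
docker run --rm -p 8080:8080 registry.local/mlflow-wine-classifier:v1
# In a second terminal, using the test-input.json payload shown in the testing section below
curl -H "Content-Type: application/json" -d @./test-input.json http://localhost:8080/invocations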
Create Deployment
With the image ready, define the Kubernetes deployment manifest to deploy MLflow Inference Server in your cluster.
mlflow-inference-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-inference-deployment
  namespace: mlops
  labels:
    app: mlflow-inference-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow-inference
  template:
    metadata:
      labels:
        app: mlflow-inference
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: mlflow-inference
          image: registry.local/mlflow-wine-classifier:v1
          ports:
            - containerPort: 8000
          env:
            - name: GUNICORN_CMD_ARGS
              value: "--bind=0.0.0.0"
kubectl apply -f mlflow-inference-deploy.yaml
This deployment sets up your MLflow inference server. With --bind=0.0.0.0 overriding only the address, gunicorn keeps its default port 8000, which matches the containerPort above.
Create Service
To access your MLflow inference server, you’ll need to expose it as a service in Kubernetes. Use a ClusterIP or NodePort service type, depending on your environment:
mlflow-inference-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mlflow-inference-service
  namespace: mlops
  labels:
    app: mlflow-inference-service
spec:
  type: ClusterIP
  selector:
    app: mlflow-inference
  ports:
    - name: tcp
      protocol: TCP
      port: 8000
      targetPort: 8000
kubectl apply -f mlflow-inference-svc.yaml
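As with the tracking server, you can sanity-check the service with a port-forward; the scoring server exposes a /ping endpoint that returns HTTP 200 once the model is loaded:
kubectl port-forward --namespace mlops svc/mlflow-inference-service 8000:8000
# In a second terminal
curl http://localhost:8000/ping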
Create Ingress
To make the inference server accessible through an easy-to-remember URL, set up an NGINX ingress:
mlflow-inference-ing.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mlflow-inference-ingress
  namespace: mlops
  labels:
    app: mlflow-inference-ingress
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: mlflow-inference.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mlflow-inference-service
                port:
                  number: 8000
  tls:
    - hosts:
        - mlflow-inference.local
      secretName: mlflow-inference-tls
kubectl apply -f mlflow-inference-ing.yaml
Test MLflow Inference Server
Once the service is ready, you can send a test request to the inference server.
vi test-input.json
{
  "inputs": [
    {
      "fixed acidity": 7.4,
      "volatile acidity": 0.7,
      "citric acid": 0.0,
      "residual sugar": 1.9,
      "chlorides": 0.076,
      "free sulfur dioxide": 11,
      "total sulfur dioxide": 34,
      "density": 0.9978,
      "pH": 3.51,
      "sulphates": 0.56,
      "alcohol": 9.4
    }
  ]
}
curl -k -H "Content-Type: application/json" \
-d @./test-input.json \
https://mlflow-inference.local/invocations
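If everything is wired up correctly, the server responds with a JSON body of the form {"predictions": [...]}; the exact value depends on the trained model.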
By following these steps, you've successfully set up MLflow in a Kubernetes environment, ensuring secure, scalable machine learning model management. This architecture allows you to handle model tracking, storage, and deployment efficiently.