Set Up MLflow on Kubernetes
MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. Integrating it with Kubernetes provides scalable deployment, artifact storage, and tracking capabilities, essential for managing production-level ML models. This guide walks through setting up MLflow on Kubernetes, with PostgreSQL as the backend store and MinIO as the artifact store.
Dependencies
Before diving into the setup, ensure you have the following dependencies ready:
- PostgreSQL as a Backend Store
  - Ensure the namespace of the PersistentVolumeClaim is set to mlops
  - Ensure the namespace of the Deployment is set to mlops
  - Ensure the namespace of the Service is set to mlops and the Service port is set to 5432
- MinIO for storing Artifacts
- A Private Container Registry to host your custom MLflow Docker image
These components will enable smooth data storage and access for MLflow tracking.
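For reference, the tracking server manifest later in this guide reaches PostgreSQL at postgres-service:5432 and MinIO at minio-service:9000 inside the mlops namespace. A minimal sketch of the PostgreSQL Service those settings assume (your actual manifest and selector labels may differ):
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  namespace: mlops
spec:
  selector:
    app: postgres   # assumes your PostgreSQL pods carry this label
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432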
Install MLflow Tracking Server
We will install MLflow on Kubernetes using a custom Docker image. Begin by creating a self-signed certificate for secure communication.
Create Self-Signed Certificate
Prerequisite: openssl must be installed (the commands below were run on macOS)
openssl req -x509 -nodes -days 365 \
-subj "/C=DE/ST=Berlin/L=Berlin/O=appdev24/OU=dev/CN=mlflow.local" \
-newkey rsa:4096 -keyout selfsigned.key \
-out selfsigned.crt
The -subj switch sets the certificate subject fields:
- Country Name (2-letter code): DE
- State or Province Name (full name): Berlin
- Locality Name (city): Berlin
- Organization Name (company): appdev24
- Organizational Unit Name (department): dev
- Common Name (server FQDN): mlflow-tracking.local
Update hosts file
To ensure your system recognizes the MLflow tracking server, update the /etc/hosts file by adding the domain name and pointing it to your localhost or cluster IP:
sudo vi /etc/hosts
# Append the server entry
127.0.0.1 mlflow-tracking.local
Create Namespace
Next, create a dedicated namespace for your MLflow resources in your Kubernetes cluster:
kubectl create namespace mlops
Create Kubernetes Secret
Then, create a Kubernetes TLS secret to store the certificate and key within the cluster:
kubectl create secret tls mlflow-tracking-tls --namespace mlops --cert=selfsigned.crt --key=selfsigned.key
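You can verify that the secret exists before it is referenced by the ingress later on:
kubectl get secret mlflow-tracking-tls --namespace mlops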
Create Custom Image
You'll need a Dockerfile to define your custom MLflow image, which includes essential dependencies such as psycopg2 for PostgreSQL integration:
FROM ghcr.io/mlflow/mlflow:v2.16.2
# Build tooling plus the PostgreSQL driver and S3 client used by the tracking server
RUN apt-get -y update && \
    apt-get -y install python3-dev build-essential pkg-config && \
    pip install --upgrade pip && \
    pip install psycopg2-binary boto3
CMD ["bash"]
After building the image, push it to your private container registry:
docker build -t registry.local/mlflow:v1 -f Dockerfile .
docker push registry.local/mlflow:v1
Create Deployment
With the image ready, define the Kubernetes deployment manifest to deploy MLflow Tracking Server in your cluster. Make sure to include environment variables for PostgreSQL and MinIO integration:
The cluster needs a Secret of type kubernetes.io/dockerconfigjson to authenticate with the container registry and pull the private image. Create it first:
kubectl create secret docker-registry regcred --namespace mlops \
--docker-server=registry.local \
--docker-username=admin \
--docker-password=Passw0rd1234
mlflow-tracking-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-tracking-deployment
  namespace: mlops
  labels:
    app: mlflow-tracking-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: mlflow-tracking
          image: registry.local/mlflow:v1
          ports:
            - containerPort: 5000
          env:
            - name: MLFLOW_S3_ENDPOINT_URL
              value: "http://minio-service:9000"
            - name: MLFLOW_S3_IGNORE_TLS
              value: "true"
            - name: AWS_ACCESS_KEY_ID
              value: admin
            - name: AWS_SECRET_ACCESS_KEY
              value: Password1234
          command: ["mlflow", "server", "--host", "0.0.0.0", "--port", "5000", "--backend-store-uri", "postgresql+psycopg2://postgres:Password1234@postgres-service:5432/postgres", "--default-artifact-root", "s3://mlflow-artifacts"]
kubectl apply -f mlflow-tracking-deploy.yaml
This deployment sets up your MLflow tracking server, listening on port 5000.
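The manifest above embeds the database password and MinIO credentials directly in the pod spec to keep the walkthrough short. In a real cluster you would typically keep them in a Kubernetes Secret instead; a minimal sketch, assuming a Secret named mlflow-credentials that you create yourself:
kubectl create secret generic mlflow-credentials --namespace mlops \
  --from-literal=AWS_ACCESS_KEY_ID=admin \
  --from-literal=AWS_SECRET_ACCESS_KEY=Password1234
The container spec can then load these values with envFrom in place of the inline env entries:
          envFrom:
            - secretRef:
                name: mlflow-credentials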
Create Service
To access your MLflow tracking server, you’ll need to expose it as a service in Kubernetes. Use a ClusterIP or NodePort service type, depending on your environment:
mlflow-tracking-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mlflow-tracking-service
  namespace: mlops
  labels:
    app: mlflow-tracking-service
spec:
  type: ClusterIP
  selector:
    app: mlflow
  ports:
    - name: tcp
      protocol: TCP
      port: 5000
      targetPort: 5000
kubectl apply -f mlflow-tracking-svc.yaml
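Before adding the ingress, you can sanity-check the service with a port-forward; the tracking server exposes a /health endpoint that should return OK:
kubectl port-forward --namespace mlops svc/mlflow-tracking-service 5000:5000
# In a second terminal
curl http://localhost:5000/health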
Create Ingress
To make MLflow accessible through an easy-to-remember URL, set up an NGINX ingress:
mlflow-tracking-ing.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mlflow-tracking-ingress
  namespace: mlops
  labels:
    app: mlflow-tracking-ingress
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: mlflow-tracking.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mlflow-tracking-service
                port:
                  number: 5000
  tls:
    - hosts:
        - mlflow-tracking.local
      secretName: mlflow-tracking-tls
kubectl apply -f mlflow-tracking-ing.yaml
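Once the ingress is reconciled, the server should be reachable through it (the -k flag skips verification of the self-signed certificate):
curl -k https://mlflow-tracking.local/health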
Test Remote Tracking Server
Finally, test your MLflow setup by configuring environment variables on your local machine and running a sample MLflow experiment.
Local Machine Setup
To test the remote tracking server, follow these steps:
- Open a terminal on your local machine
- Set the following environment variables:
# Remote Tracking Server Details
export MLFLOW_TRACKING_URI=https://mlflow-tracking.local
export MLFLOW_TRACKING_INSECURE_TLS=true
export MLFLOW_S3_ENDPOINT_URL=https://minio.local
export MLFLOW_S3_IGNORE_TLS=true
export MLFLOW_ARTIFACTS_DESTINATION=s3://mlflow-artifacts
export AWS_ACCESS_KEY_ID=LmxP6ZEYgJRxRE9tzbUM
export AWS_SECRET_ACCESS_KEY=bwygxJDYc9PmYqthNPhdNFVIeaBrpk4OsPMMVuSb
- Create a new directory for testing MLflow:
# Local Environment Setup
mkdir mlflow-testing
cd mlflow-testing
- Create and activate a virtual environment:
python3 -m venv env
source env/bin/activate
- Install MLflow and other dependencies:
pip3 install --upgrade pip
pip3 install mlflow
pip3 install boto3
pip3 install scikit-learn
- Run the sample model training script:
# Sample Model Training
git clone https://github.com/mlflow/mlflow
python3 mlflow/examples/sklearn_elasticnet_wine/train.py
This should train a machine learning model using MLflow and store the artifacts in your MinIO bucket.
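If you prefer a self-contained connectivity check over the example repository, the short script below logs a dummy run against the remote server; it relies on the same environment variables exported above, and the experiment name is arbitrary:
# sanity_check.py - log a dummy run to the remote tracking server
import mlflow

mlflow.set_tracking_uri("https://mlflow-tracking.local")
mlflow.set_experiment("connectivity-check")  # created on first use

with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.79)

print("Run logged - check the MLflow UI")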
You can now browse the MLflow UI at https://mlflow-tracking.local and inspect the run.
Install MLflow Inference Server
Create Self-Signed Certificate
openssl req -x509 -nodes -days 365 \
-subj "/C=DE/ST=Berlin/L=Berlin/O=appdev24/OU=dev/CN=mlflow-inference.local" \
-newkey rsa:4096 -keyout selfsigned-inference.key \
-out selfsigned-inference.crt
Update hosts file
To ensure your system recognizes the MLflow inference server, update the /etc/hosts file by adding the domain name and pointing it to your localhost or cluster IP:
sudo vi /etc/hosts
# Append the server entry
127.0.0.1 mlflow-inference.local
Create Kubernetes Secret
Then, create Kubernetes secrets to store the TLS certificate and key within the cluster:
kubectl create secret tls mlflow-inference-tls --namespace mlops --cert=selfsigned-inference.crt --key=selfsigned-inference.key
Build Docker Image
Build a ready-to-deploy Docker image containing the trained model with mlflow models build-docker:
export EXPERIMENT_ID=0
export RUN_ID=8d07ecfade2e400cb6ecd3710e5b601a
mlflow models build-docker \
-m $MLFLOW_ARTIFACTS_DESTINATION/$EXPERIMENT_ID/$RUN_ID/artifacts/model \
-n registry.local/mlflow-wine-classifier:v1
After building the image, push it to your private container registry:
docker push registry.local/mlflow-wine-classifier:v1
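Before deploying, you can optionally smoke-test the image locally; images produced by mlflow models build-docker serve the model on port 8080 inside the container by default:
docker run --rm -p 8080:8080 registry.local/mlflow-wine-classifier:v1
# In a second terminal, using the test-input.json payload shown in the testing section below
curl -H "Content-Type: application/json" -d @./test-input.json http://localhost:8080/invocations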
Create Deployment
With the image ready, define the Kubernetes deployment manifest to deploy MLflow Inference Server in your cluster.
mlflow-inference-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-inference-deployment
  namespace: mlops
  labels:
    app: mlflow-inference-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow-inference
  template:
    metadata:
      labels:
        app: mlflow-inference
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: mlflow-inference
          image: registry.local/mlflow-wine-classifier:v1
          ports:
            - containerPort: 8000
          env:
            - name: GUNICORN_CMD_ARGS
              value: "--bind=0.0.0.0"
kubectl apply -f mlflow-inference-deploy.yaml
This deployment sets up your MLflow inference server. With --bind=0.0.0.0 overriding only the address, gunicorn keeps its default port 8000, which matches the containerPort above.
Create Service
To access your MLflow inference server, you’ll need to expose it as a service in Kubernetes. Use a ClusterIP or NodePort service type, depending on your environment:
mlflow-inference-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mlflow-inference-service
  namespace: mlops
  labels:
    app: mlflow-inference-service
spec:
  type: ClusterIP
  selector:
    app: mlflow-inference
  ports:
    - name: tcp
      protocol: TCP
      port: 8000
      targetPort: 8000
kubectl apply -f mlflow-inference-svc.yaml
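As with the tracking server, you can sanity-check the service with a port-forward; the scoring server exposes a /ping endpoint that returns HTTP 200 once the model is loaded:
kubectl port-forward --namespace mlops svc/mlflow-inference-service 8000:8000
# In a second terminal
curl http://localhost:8000/ping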
Create Ingress
To make the inference server accessible through an easy-to-remember URL, set up an NGINX ingress:
mlflow-inference-ing.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mlflow-inference-ingress
  namespace: mlops
  labels:
    app: mlflow-inference-ingress
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: mlflow-inference.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mlflow-inference-service
                port:
                  number: 8000
  tls:
    - hosts:
        - mlflow-inference.local
      secretName: mlflow-inference-tls
kubectl apply -f mlflow-inference-ing.yaml
Test MLflow Inference Server
Once the service is ready, you can send a test request to the inference server.
vi test-input.json
{
  "inputs": [
    {
      "fixed acidity": 7.4,
      "volatile acidity": 0.7,
      "citric acid": 0.0,
      "residual sugar": 1.9,
      "chlorides": 0.076,
      "free sulfur dioxide": 11,
      "total sulfur dioxide": 34,
      "density": 0.9978,
      "pH": 3.51,
      "sulphates": 0.56,
      "alcohol": 9.4
    }
  ]
}
curl -k -H "Content-Type: application/json" \
-d @./test-input.json \
https://mlflow-inference.local/invocations
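If everything is wired up correctly, the server responds with a JSON body of the form {"predictions": [...]}; the exact value depends on the trained model.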
By following these steps, you've successfully set up MLflow in a Kubernetes environment, ensuring secure, scalable machine learning model management. This architecture allows you to handle model tracking, storage, and deployment efficiently.