DocumentDB Kubernetes Operator¶
The DocumentDB Kubernetes Operator is an open-source project to run and manage DocumentDB on Kubernetes. DocumentDB is the engine powering vCore-based Azure Cosmos DB for MongoDB. It is built on top of PostgreSQL and offers a native implementation of a document-oriented NoSQL database, enabling CRUD operations on BSON data types.
As part of a DocumentDB cluster installation, the operator deploys and manages one or more PostgreSQL instances, the DocumentDB Gateway, and other Kubernetes resources. While PostgreSQL is used as the underlying storage engine, the gateway lets you connect to the DocumentDB cluster using MongoDB-compatible drivers, APIs, and tools.
Note: This project is under active development but not yet recommended for production use. We welcome your feedback and contributions!
Quickstart¶
This quickstart guide will walk you through the steps to install the operator, deploy a DocumentDB cluster, access it using mongosh, and perform basic operations.
Prerequisites¶
- Helm installed.
- kubectl installed.
- A local Kubernetes cluster such as minikube or kind. You can use any other Kubernetes cluster, but a local one is sufficient for this quickstart.
- mongosh installed, to connect to the DocumentDB cluster.
Start a local Kubernetes cluster¶
If you are using minikube, use the following command:
minikube start
If you are using kind, use the following command:
kind create cluster
Install cert-manager¶
cert-manager is used to manage TLS certificates for the DocumentDB cluster.
If you already have cert-manager installed, you can skip this step.
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true
Verify that cert-manager is installed correctly:
kubectl get pods -n cert-manager
Output:
NAMESPACE NAME READY STATUS RESTARTS
cert-manager cert-manager-6794b8d569-d7lwd 1/1 Running 0
cert-manager cert-manager-cainjector-7f69cd69f7-pd9bc 1/1 Running 0
cert-manager cert-manager-webhook-6cc5dccc4b-7jmrh 1/1 Running 0
Install documentdb-operator using the Helm chart¶
The DocumentDB operator uses the CloudNativePG operator behind the scenes and installs it in the cnpg-system namespace. This step assumes that the CloudNativePG operator is not already installed in your cluster.
Use the following command to install the DocumentDB operator:
helm install documentdb-operator oci://ghcr.io/microsoft/documentdb-kubernetes-operator/documentdb-operator --version 0.0.1 --namespace documentdb-operator --create-namespace
This installs the operator in the documentdb-operator namespace. Verify that it is running:
kubectl get deployment -n documentdb-operator
Output:
NAME READY UP-TO-DATE AVAILABLE AGE
documentdb-operator 1/1 1 1 113s
You should also see the DocumentDB operator CRDs installed in the cluster:
kubectl get crd | grep documentdb
Output:
documentdbs.db.microsoft.com
Store DocumentDB credentials in K8s Secret¶
Before deploying the DocumentDB cluster, create a Kubernetes secret to store the DocumentDB credentials. The sidecar injector plugin will automatically inject these credentials as environment variables into the DocumentDB gateway container.
Create the secret with your desired username and password:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: documentdb-preview-ns
---
# DocumentDB Credentials Secret
#
# Login credentials:
#   Username: k8s_secret_user
#   Password: K8sSecret100
#
# Connect using mongosh:
#   mongosh 127.0.0.1:10260 -u k8s_secret_user -p K8sSecret100 --authenticationMechanism SCRAM-SHA-256 --tls --tlsAllowInvalidCertificates
#
apiVersion: v1
kind: Secret
metadata:
  name: documentdb-credentials
  namespace: documentdb-preview-ns
type: Opaque
stringData:
  username: k8s_secret_user
  password: K8sSecret100
EOF
Verify the secret is created:
kubectl get secret documentdb-credentials -n documentdb-preview-ns
Output:
NAME TYPE DATA AGE
documentdb-credentials Opaque 2 10s
Note: The sidecar injector plugin requires the secret to be named documentdb-credentials, and the secret must contain username and password keys. The plugin automatically injects these as USERNAME and PASSWORD environment variables into the DocumentDB gateway container.
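As a side note on how the secret is stored: Kubernetes accepts plain-text values under stringData and stores them base64-encoded under data. The following is a minimal Python sketch that decodes such a data mapping, for example one taken from `kubectl get secret documentdb-credentials -n documentdb-preview-ns -o json`. The function name `decode_secret_data` is illustrative and not part of the operator.

```python
import base64


def decode_secret_data(data: dict) -> dict:
    """Decode the base64-encoded values of a Kubernetes Secret's `data` field."""
    return {key: base64.b64decode(value).decode("utf-8") for key, value in data.items()}


# Example input: the `data` field as Kubernetes would store the secret above.
example_data = {
    "username": base64.b64encode(b"k8s_secret_user").decode("ascii"),
    "password": base64.b64encode(b"K8sSecret100").decode("ascii"),
}

decoded = decode_secret_data(example_data)
# The two keys the sidecar injector plugin expects to find.
print(sorted(decoded))
```

This can be handy for confirming that the username and password keys hold the values you intended before deploying the cluster.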
Deploy a DocumentDB cluster¶
Create a single-node DocumentDB cluster:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: documentdb-preview-ns
---
apiVersion: db.microsoft.com/preview
kind: DocumentDB
metadata:
  name: documentdb-preview
  namespace: documentdb-preview-ns
spec:
  nodeCount: 1
  instancesPerNode: 1
  documentDBImage: ghcr.io/microsoft/documentdb/documentdb-local:16
  resource:
    pvcSize: 10Gi
  exposeViaService:
    serviceType: ClusterIP
EOF
Wait for the DocumentDB cluster to be fully initialized. Verify that it is running:
kubectl get pods -n documentdb-preview-ns
Output:
NAME READY STATUS RESTARTS AGE
documentdb-preview-1 2/2 Running 0 26m
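The readiness check above can also be scripted. The sketch below assumes you feed it the parsed output of `kubectl get pods -n documentdb-preview-ns -o json`; it reports ready only when every container in every Pod is ready, which corresponds to the 2/2 shown in the READY column. The function name and the container names in the sample data are hypothetical.

```python
import json


def pods_ready(pod_list: dict) -> bool:
    """Return True if the list is non-empty and every container in every Pod is ready."""
    items = pod_list.get("items", [])
    if not items:
        return False
    for pod in items:
        statuses = pod.get("status", {}).get("containerStatuses", [])
        # A Pod with no reported container statuses is still starting up.
        if not statuses or not all(s.get("ready", False) for s in statuses):
            return False
    return True


# Trimmed-down example of the JSON kubectl returns (container names are made up).
sample_pods = json.loads("""
{"items": [{"metadata": {"name": "documentdb-preview-1"},
            "status": {"containerStatuses": [
                {"name": "database", "ready": true},
                {"name": "gateway", "ready": true}]}}]}
""")

print(pods_ready(sample_pods))
```

In a script you would typically call this in a polling loop with a timeout, or use `kubectl wait --for=condition=Ready` instead.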
You can also check the DocumentDB CRD instance:
kubectl get DocumentDB -n documentdb-preview-ns
Output:
NAME AGE
documentdb-preview 28m
Connect to the DocumentDB cluster¶
The DocumentDB Pod has the Gateway container running as a sidecar. To keep things simple, the quickstart does not use a public load balancer, so you connect to the DocumentDB instance directly through the Gateway port 10260. For both minikube and kind, this can be done using port forwarding:
kubectl port-forward pod/documentdb-preview-1 10260:10260 -n documentdb-preview-ns
Connect using mongosh:
mongosh 127.0.0.1:10260 -u k8s_secret_user -p K8sSecret100 --authenticationMechanism SCRAM-SHA-256 --tls --tlsAllowInvalidCertificates
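If you prefer a driver over the shell, the same connection options can be expressed as a MongoDB connection string. The sketch below builds one with the standard URI options `authMechanism`, `tls`, and `tlsAllowInvalidCertificates`, mirroring the mongosh flags; the helper name `build_uri` is illustrative, and the credentials shown are the ones from the documentdb-credentials secret created earlier.

```python
from urllib.parse import quote_plus, urlencode


def build_uri(user: str, password: str, host: str = "127.0.0.1", port: int = 10260) -> str:
    """Build a MongoDB connection URI mirroring the mongosh flags used in this quickstart."""
    options = urlencode({
        "authMechanism": "SCRAM-SHA-256",
        "tls": "true",
        # Acceptable for a local quickstart only; do not skip validation in production.
        "tlsAllowInvalidCertificates": "true",
    })
    # Percent-encode credentials so characters such as '@' or ':' do not break the URI.
    return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/?{options}"


uri = build_uri("k8s_secret_user", "K8sSecret100")
print(uri)
```

The resulting string can be passed to a client such as `pymongo.MongoClient(uri)` while the port-forward is running.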
Execute the following commands to create a database and a collection, and insert some documents:
use testdb
db.createCollection("test_collection")
db.test_collection.insertMany([
{ name: "Alice", age: 30 },
{ name: "Bob", age: 25 },
{ name: "Charlie", age: 35 }
])
db.test_collection.find()
Output:
[direct: mongos] test> use testdb
switched to db testdb
[direct: mongos] testdb> db.createCollection("test_collection")
{ ok: 1 }
[direct: mongos] testdb> db.test_collection.insertMany([
... { name: "Alice", age: 30 },
... { name: "Bob", age: 25 },
... { name: "Charlie", age: 35 }
... ])
{
acknowledged: true,
insertedIds: {
'0': ObjectId('682c3b06491dc99ae02b3fed'),
'1': ObjectId('682c3b06491dc99ae02b3fee'),
'2': ObjectId('682c3b06491dc99ae02b3fef')
}
}
[direct: mongos] testdb> db.test_collection.find()
[
{ _id: ObjectId('682c3b06491dc99ae02b3fed'), name: 'Alice', age: 30 },
{ _id: ObjectId('682c3b06491dc99ae02b3fee'), name: 'Bob', age: 25 },
{
_id: ObjectId('682c3b06491dc99ae02b3fef'),
name: 'Charlie',
age: 35
}
]
Other options: Try the sample Python app and LoadBalancer service¶
Connect to DocumentDB using a Python app¶
In addition to mongosh, you can also use the sample Python program (which uses the PyMongo client) in the GitHub repository to execute operations on the DocumentDB instance. It inserts a sample document into the movies collection in the sample_mflix database.
git clone https://github.com/microsoft/documentdb-kubernetes-operator
cd documentdb-kubernetes-operator/scripts/test-scripts
pip3 install pymongo
python3 mongo-python-data-pusher.py
Output:
Inserted document ID: 682c54f9505b85fba77ed154
{'_id': ObjectId('682c54f9505b85fba77ed154'),
'cast': ['Olivia Colman', 'Emma Stone', 'Rachel Weisz'],
'directors': ['Yorgos Lanthimos'],
'genres': ['Drama', 'History'],
'rated': 'R',
'runtime': 121,
'title': 'The Favourite MongoDB Movie',
'type': 'movie',
'year': 2018}
You can verify this using the mongosh shell:
use sample_mflix
db.movies.find()
Output:
[direct: mongos] testdb> use sample_mflix
switched to db sample_mflix
[direct: mongos] sample_mflix>
[direct: mongos] sample_mflix> db.movies.find()
[
{
_id: ObjectId('682c54f9505b85fba77ed154'),
title: 'The Favourite MongoDB Movie',
genres: [ 'Drama', 'History' ],
runtime: 121,
rated: 'R',
year: 2018,
directors: [ 'Yorgos Lanthimos' ],
cast: [ 'Olivia Colman', 'Emma Stone', 'Rachel Weisz' ],
type: 'movie'
}
]
Use a LoadBalancer service¶
For the quickstart, you connected to DocumentDB using port forwarding. If you are using a Kubernetes cluster in the cloud (for example, Azure Kubernetes Service) and want to use a LoadBalancer service instead, enable it in the DocumentDB spec as follows:
exposeViaService:
  serviceType: LoadBalancer
A LoadBalancer service is also supported in minikube and kind.
List the Services and verify:
kubectl get services -n documentdb-preview-ns
The operator creates a LoadBalancer service named documentdb-service-documentdb-preview for the DocumentDB cluster. You can then access the DocumentDB instance using the external IP of the service.
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
documentdb-preview-r ClusterIP 10.0.216.38 <none> 5432/TCP 26m
documentdb-preview-ro ClusterIP 10.0.31.103 <none> 5432/TCP 26m
documentdb-preview-rw ClusterIP 10.0.118.26 <none> 5432/TCP 26m
documentdb-service-documentdb-preview LoadBalancer 10.0.228.243 52.149.56.216 10260:30312/TCP 27m
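To pick up the external address programmatically, one option is to read it from the Service's `status.loadBalancer.ingress` field, which is where Kubernetes reports the assigned IP or hostname. The sketch below assumes the input is the parsed output of `kubectl get service documentdb-service-documentdb-preview -n documentdb-preview-ns -o json`; the function name is hypothetical.

```python
from typing import Optional


def external_address(service: dict) -> Optional[str]:
    """Return the first external IP or hostname of a LoadBalancer Service, if one is assigned."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        # Some providers report an IP, others a DNS hostname.
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None


# Minimal stand-in for the Service JSON, using the external IP from the listing above.
sample_service = {"status": {"loadBalancer": {"ingress": [{"ip": "52.149.56.216"}]}}}
print(external_address(sample_service))
```

A `None` result usually means the load balancer is still being provisioned, so a caller would retry after a short delay.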
If you are using the Python program to connect to DocumentDB, make sure to update the script's host variable with the external IP of your documentdb-service-documentdb-preview LoadBalancer service. Additionally, ensure that you update the default password in the script or, preferably, use environment variables to securely manage sensitive information like passwords.
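One way to follow the environment-variable advice is sketched below. The variable names (`DOCUMENTDB_HOST`, `DOCUMENTDB_PORT`, `DOCUMENTDB_USER`, `DOCUMENTDB_PASSWORD`) and the `load_settings` helper are assumptions made for this example; the sample script in the repository may use different names.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read connection settings from environment variables, failing fast if the password is missing."""
    password = env.get("DOCUMENTDB_PASSWORD")
    if not password:
        raise RuntimeError("Set DOCUMENTDB_PASSWORD before running the script")
    return {
        "host": env.get("DOCUMENTDB_HOST", "127.0.0.1"),
        "port": int(env.get("DOCUMENTDB_PORT", "10260")),
        # Defaults to the username from the quickstart secret.
        "username": env.get("DOCUMENTDB_USER", "k8s_secret_user"),
        "password": password,
    }


# Demonstration with an explicit mapping instead of the real environment.
settings = load_settings({"DOCUMENTDB_HOST": "52.149.56.216", "DOCUMENTDB_PASSWORD": "K8sSecret100"})
print(settings["host"], settings["port"])
```

Keeping the password out of the source file means it does not end up in version control or shell history when the script is shared.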
Delete the DocumentDB cluster and other resources¶
kubectl delete DocumentDB documentdb-preview -n documentdb-preview-ns
The Pod should now be terminated:
kubectl get pods -n documentdb-preview-ns
Uninstall the DocumentDB operator:
helm uninstall documentdb-operator --namespace documentdb-operator
Output:
These resources were kept due to the resource policy:
[CustomResourceDefinition] poolers.postgresql.cnpg.io
[CustomResourceDefinition] publications.postgresql.cnpg.io
[CustomResourceDefinition] scheduledbackups.postgresql.cnpg.io
[CustomResourceDefinition] subscriptions.postgresql.cnpg.io
[CustomResourceDefinition] backups.postgresql.cnpg.io
[CustomResourceDefinition] clusterimagecatalogs.postgresql.cnpg.io
[CustomResourceDefinition] clusters.postgresql.cnpg.io
[CustomResourceDefinition] databases.postgresql.cnpg.io
[CustomResourceDefinition] imagecatalogs.postgresql.cnpg.io
release "documentdb-operator" uninstalled
Verify that the Pod is removed:
kubectl get pods -n documentdb-preview-ns
Delete the namespace and CRDs:
kubectl delete namespace documentdb-operator
kubectl delete crd backups.postgresql.cnpg.io \
clusterimagecatalogs.postgresql.cnpg.io \
clusters.postgresql.cnpg.io \
databases.postgresql.cnpg.io \
imagecatalogs.postgresql.cnpg.io \
poolers.postgresql.cnpg.io \
publications.postgresql.cnpg.io \
scheduledbackups.postgresql.cnpg.io \
subscriptions.postgresql.cnpg.io \
documentdbs.db.microsoft.com
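The CloudNativePG CRDs in the command above are the same ones that `helm uninstall` reported as kept. As a rough sketch, assuming the output format shown in the uninstall step, the names could be collected from that output instead of being typed by hand; the `kept_crds` helper is hypothetical.

```python
import re


def kept_crds(helm_output: str) -> list:
    """Extract CRD names from Helm's 'kept due to the resource policy' lines."""
    pattern = re.compile(r"^\[CustomResourceDefinition\]\s+(\S+)$")
    names = []
    for line in helm_output.splitlines():
        match = pattern.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


# Shortened copy of the uninstall output shown earlier.
helm_out = """These resources were kept due to the resource policy:
[CustomResourceDefinition] poolers.postgresql.cnpg.io
[CustomResourceDefinition] clusters.postgresql.cnpg.io

release "documentdb-operator" uninstalled
"""

crds = kept_crds(helm_out)
print(crds)
```

The resulting list could then be passed to `kubectl delete crd`, together with documentdbs.db.microsoft.com, which Helm does not list in that output.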