From 69697620e921827d3e2131b18539195313b94b82 Mon Sep 17 00:00:00 2001
From: Fred Boniface
Date: Fri, 12 Dec 2025 12:04:33 +0000
Subject: [PATCH] Add RADOS Gateway Documentation

---
 docs/Admin/Virtualisation/rados-gw-in-k8s.md | 284 +++++++++++++++++++
 docs/Virtualisation/pve-k8s-ceph-config.md   | 129 ---------
 2 files changed, 284 insertions(+), 129 deletions(-)
 create mode 100644 docs/Admin/Virtualisation/rados-gw-in-k8s.md
 delete mode 100644 docs/Virtualisation/pve-k8s-ceph-config.md

diff --git a/docs/Admin/Virtualisation/rados-gw-in-k8s.md b/docs/Admin/Virtualisation/rados-gw-in-k8s.md
new file mode 100644
index 0000000..c463787
--- /dev/null
+++ b/docs/Admin/Virtualisation/rados-gw-in-k8s.md
@@ -0,0 +1,284 @@
+# Deploy Ceph RADOS Gateway in Kubernetes
+## Pointing to a separate PVE Ceph Cluster
+
+This guide outlines how to deploy a RADOS Gateway to provide an S3 API for a Ceph pool. I use this to provide S3 storage to my Kubernetes cluster, with the Ceph cluster itself hosted by Proxmox VE. Many concepts are similar to the previous guide, Enable Ceph CSI for PVE Ceph, and some steps refer back to it.
+
+This guide makes the following assumptions:
+* You are already running Ceph via PVE.
+* You are using the PVE UI for Ceph actions where possible.
+* You are deploying the RADOS Gateway to the `object-store` namespace in K8s.
+* Flux is used to deploy to K8s, with SOPS for secret encryption.
+
+### 1. Ceph Pool & User Creation
+
+These steps ensure that the required Ceph pools are created with appropriate replication.
+
+* Create the RGW realm, zonegroup and zone on a PVE host from the shell:
+  * Create Realm: `radosgw-admin realm create --rgw-realm=default --default`
+  * Create Zonegroup: `radosgw-admin zonegroup create --rgw-zonegroup=default --master --default --endpoints=http://ceph-rgw-svc.object-store.svc.cluster.local:8080`
+  * Create Zone: `radosgw-admin zone create --rgw-zone=default --master --default`
+  * Ensure the Zone is included in the Zonegroup: `radosgw-admin zonegroup add --rgw-zonegroup=default --rgw-zone=default`
+  * Update & commit the period: `radosgw-admin period update --commit`
+  * Set the default realm: `radosgw-admin realm default --rgw-realm=default`
+
+* The above commands will have created the following new pools.
+  **You do not need to create these manually.**
+
+  | Pool Name | Purpose |
+  | :----- | :----- |
+  | .rgw.root | Realm, zonegroup and period configuration |
+  | default.rgw.log | Gateway log data |
+  | default.rgw.control | Internal control objects (watch/notify) |
+  | default.rgw.meta | User and bucket metadata |
+
+* Create the two required pools for index and data in the PVE UI:
+
+  | Pool Name | PG Autoscaler | Size | Min Size | Crush Rule |
+  | :------- | :------------ | :--- | :------- | :--------- |
+  | default.rgw.buckets.index | On | 3 | 2 | replicated_rule |
+  | default.rgw.buckets.data | On | 3 | 2 | replicated_rule |
+
+* Enable the RGW application:
+  When a pool is created via PVE it is registered as an RBD pool by default,
+  so run these commands to change it to an RGW pool.
+  * Disable RBD: `ceph osd pool application disable default.rgw.buckets.data rbd --yes-i-really-mean-it`
+  * Enable RGW: `ceph osd pool application enable default.rgw.buckets.data rgw`
+  * Check with: `ceph osd pool application get default.rgw.buckets.data`
+  * Repeat for the index pool: `ceph osd pool application disable default.rgw.buckets.index rbd --yes-i-really-mean-it`
+  * Enable RGW: `ceph osd pool application enable default.rgw.buckets.index rgw`
+  * Check with: `ceph osd pool application get default.rgw.buckets.index`
+
+* Create a user for the RADOS Gateway:
+  ```
+  ceph auth get-or-create client.rgw.k8s.svc \
+    mon 'allow r' \
+    osd 'allow rwx pool=default.rgw.buckets.data, allow rwx pool=default.rgw.buckets.index, allow rwx pool=.rgw.root, allow rwx pool=default.rgw.meta, allow rwx pool=default.rgw.log, allow rwx pool=default.rgw.control' \
+    -o /etc/ceph/ceph.client.rgw.k8s.svc.keyring
+  ```
+
+### 2. Register Kubernetes Secrets
+
+* Retrieve the files required for the Kubernetes Secrets from the Ceph host.
+  Store them **temporarily** on your workstation.
+
+  | File | Path | Purpose |
+  | :------- | :--- | :------ |
+  | ceph.conf | /etc/ceph/ceph.conf | Location of the Ceph monitors |
+  | Keyring | /etc/ceph/ceph.client.rgw.k8s.svc.keyring | CephX credentials for the gateway user |
+
+* Create Secret manifests for deployment to K8s:
+
+  ```
+  kubectl create secret generic ceph-config \
+    --namespace=object-store \
+    --from-file=ceph.conf=./conf \
+    --dry-run=client -o yaml > ceph-config-secret.yaml
+  ```
+  ```
+  kubectl create secret generic ceph-keyring \
+    --namespace=object-store \
+    --from-file=keyring=./keyring \
+    --dry-run=client -o yaml > ceph-keyring-secret.yaml
+  ```
+
+* Encrypt the secret manifests using sops:
+  * `sops encrypt --in-place ./ceph-config-secret.yaml`
+  * `sops encrypt --in-place ./ceph-keyring-secret.yaml`
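+
+* For Flux to apply these encrypted Secrets, the Kustomization that reconciles the `object-store` namespace needs SOPS decryption enabled. A minimal sketch is shown below; the `GitRepository` name, repository path and `sops-age` secret name are assumptions, so substitute your own:
+
+  ```
+  apiVersion: kustomize.toolkit.fluxcd.io/v1
+  kind: Kustomization
+  metadata:
+    name: object-store
+    namespace: flux-system
+  spec:
+    interval: 10m
+    path: ./kubernetes/object-store  # Assumed repository layout
+    prune: true
+    sourceRef:
+      kind: GitRepository
+      name: flux-system
+    decryption:
+      provider: sops
+      secretRef:
+        name: sops-age  # Assumed Secret holding your age/PGP key
+  ```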
+
+### 3. Kubernetes Manifests
+
+**Treat these as examples: read through them and make sure they match your environment.**
+
+#### Namespace
+
+```
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: object-store
+```
+
+#### Service
+```
+apiVersion: v1
+kind: Service
+metadata:
+  name: ceph-rgw-svc
+  namespace: object-store
+  labels:
+    app.kubernetes.io/name: ceph-rgw
+    app.kubernetes.io/component: gateway
+spec:
+  # The ClusterIP DNS name used for the RGW initialization:
+  # http://ceph-rgw-svc.object-store.svc.cluster.local:8080
+  ports:
+  - port: 8080
+    targetPort: 8080
+    protocol: TCP
+    name: http-api
+  selector:
+    app: ceph-rgw
+  type: ClusterIP
+```
+
+#### Deployment
+```
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: ceph-rgw
+  namespace: object-store
+  labels:
+    app.kubernetes.io/name: ceph-rgw
+    app.kubernetes.io/component: gateway
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: ceph-rgw
+  strategy:
+    type: RollingUpdate
+  template:
+    metadata:
+      labels:
+        app: ceph-rgw
+    spec:
+      # CRUCIAL: Enforce Pods to be on separate nodes for HA
+      affinity:
+        podAntiAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+          - labelSelector:
+              matchLabels:
+                app: ceph-rgw
+            topologyKey: "kubernetes.io/hostname"
+      containers:
+      - name: rgw
+        # Use the same Major:Minor as your PVE Hosts
+        image: quay.io/ceph/ceph:v18.2
+        # Arguments to start the RGW process on port 8080
+        args: [
+          "radosgw",
+          "-f", # Run in foreground
+          "--conf=/etc/ceph/ceph.conf", # Explicitly use the mounted config
+          "--name=client.rgw.k8s.svc", # The exact CephX user name we created
+          "--rgw-frontends=beast port=8080" # REQUIRED: Beast frontend for Ceph 18+
+        ]
+        resources:
+          requests:
+            cpu: 500m
+            memory: 2Gi
+          limits:
+            cpu: 2000m
+            memory: 2Gi
+        ports:
+        - containerPort: 8080
+          name: rgw-http
+        # Ensure the Pod does not run as root unnecessarily
+        securityContext:
+          runAsUser: 167 # A common non-root user ID for Ceph containers
+          runAsGroup: 167
+          allowPrivilegeEscalation: false
+        volumeMounts:
+        - name: ceph-config-vol
+          mountPath: /etc/ceph/ceph.conf
+          subPath: ceph.conf
+        - name: ceph-keyring-vol
+          mountPath: /etc/ceph/ceph.client.rgw.k8s.svc.keyring
+          # subPath must match the `path` set in the volume items below
+          subPath: ceph.client.rgw.k8s.svc.keyring
+      volumes:
+      - name: ceph-config-vol
+        secret:
+          secretName: ceph-config
+          items:
+          - key: ceph.conf
+            path: ceph.conf
+      - name: ceph-keyring-vol
+        secret:
+          secretName: ceph-keyring
+          items:
+          - key: keyring
+            path: ceph.client.rgw.k8s.svc.keyring
+```
+
+**Deploy these manifests via Flux.**
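+
+Once Flux has reconciled the Deployment, a quick in-cluster smoke test can confirm the gateway answers on the Service DNS name used earlier. This is optional; the test Pod name and curl image below are arbitrary choices:
+
+```
+# Wait for both gateway replicas to become ready
+kubectl rollout status deployment/ceph-rgw -n object-store
+
+# An anonymous request to the S3 endpoint should return HTTP 200 with an XML
+# ListAllMyBucketsResult body; an S3 error document also proves the gateway is
+# up and talking to the Ceph cluster.
+kubectl run rgw-smoke-test --rm -it --restart=Never -n object-store \
+  --image=curlimages/curl -- \
+  curl -is http://ceph-rgw-svc.object-store.svc.cluster.local:8080
+```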
+
+### 4. RGW Admin Utility
+
+**Do not commit this to Flux; run it as and when required to manage RGW users and buckets.**
+
+#### Pod Manifest
+```
+apiVersion: v1
+kind: Pod
+metadata:
+  name: rgw-admin-utility
+  namespace: object-store
+spec:
+  restartPolicy: Never
+  containers:
+  - name: rgw-admin-cli
+    # Use the same image as your RGW deployment for consistency
+    image: quay.io/ceph/ceph:v18.2
+    # Use the /bin/bash entrypoint to allow manual command execution
+    command: ["/bin/bash", "-c", "sleep 3600"]
+    # Environment variable to explicitly define the CephX user for CLI tools
+    env:
+    - name: CEPH_ARGS
+      value: "--name client.rgw.k8s.svc --keyring /etc/ceph/ceph.client.rgw.k8s.svc.keyring"
+
+    volumeMounts:
+    # Mount the ceph.conf Secret
+    - name: ceph-config-vol
+      mountPath: /etc/ceph/ceph.conf
+      subPath: ceph.conf
+    # Mount the keyring Secret to the file name radosgw-admin expects;
+    # subPath must match the `path` set in the volume items below
+    - name: ceph-keyring-vol
+      mountPath: /etc/ceph/ceph.client.rgw.k8s.svc.keyring
+      subPath: ceph.client.rgw.k8s.svc.keyring
+
+  volumes:
+  - name: ceph-config-vol
+    secret:
+      secretName: ceph-config
+      items:
+      - key: ceph.conf
+        path: ceph.conf
+  - name: ceph-keyring-vol
+    secret:
+      secretName: ceph-keyring
+      items:
+      - key: keyring
+        path: ceph.client.rgw.k8s.svc.keyring # Use the explicit filename
+```
+
+#### Managing RGW
+
+* Deploy the Pod using `kubectl apply -f {filepath}`
+* Exec into the Pod: `kubectl exec -it rgw-admin-utility -n object-store -- bash`
+
+##### Create User
+
+* `radosgw-admin user create --uid={uid} --display-name={display-name} --gen-key --gen-secret`
+
+**CRITICAL:** *Copy the JSON output and save the `access_key` and `secret_key`.*
+
+##### Create Bucket
+
+* `radosgw-admin bucket create --bucket={bucket-name} --uid={owner-uid}`
+
+##### Exit & Cleanup
+
+* `exit`
+* `kubectl delete pod rgw-admin-utility -n object-store`
+
+### 5. Generate Secret for Client Access
+
+Deploy this in the namespace of the application requiring S3 API access. If you commit it to your Flux repository, encrypt it with sops first, as with the secrets above.
+
+```
+kubectl create secret generic s3-credentials \
+  --namespace={application-namespace} \
+  --from-literal=S3_ACCESS_KEY={access-key-from-user-creation} \
+  --from-literal=S3_SECRET_KEY={secret-key-from-user-creation} \
+  --dry-run=client -o yaml > s3-secret.yaml
+```
\ No newline at end of file
diff --git a/docs/Virtualisation/pve-k8s-ceph-config.md b/docs/Virtualisation/pve-k8s-ceph-config.md
deleted file mode 100644
index f0dcd12..0000000
--- a/docs/Virtualisation/pve-k8s-ceph-config.md
+++ /dev/null
@@ -1,129 +0,0 @@
-# 📚 Complete Ceph CSI Deployment & Troubleshooting Guide
-
-This guide details the preparation and configuration necessary for successful dynamic provisioning of Ceph RBD (RWO) and CephFS (RWX) volumes in a Kubernetes cluster running on **MicroK8s**, backed by a Proxmox VE (PVE) Ceph cluster.
-
----
-
-## 1. ⚙️ Ceph Cluster Preparation (Proxmox VE)
-
-These steps ensure the Ceph backend has the necessary pools and structure.
-
-* **Create Dedicated Pools:** Create OSD Pools for data, e.g., **`k8s_rbd`** (for RWO), and **`k8s_data`** and **`k8s_metadata`** (for CephFS).
-* **Create CephFS Metadata Servers (MDS):** Deploy **at least two** Metadata Server (MDS) instances.
-* **Create CephFS Filesystem:** Create the Ceph Filesystem (e.g., named **`k8s`**), linking the metadata and data pools.
-* **Create Subvolume Group (Mandatory Fix):** Create the dedicated Subvolume Group **`csi`** inside your CephFS. This is required by the CSI driver's default configuration and fixes the "No such file or directory" error during provisioning.
-    * **CLI Command:** `ceph fs subvolumegroup create k8s csi`
-
----
-
-## 2. 🔑 Ceph User and Authorization (The Permission Fix)
-
-This addresses the persistent "Permission denied" errors during provisioning.
-
-* **Create and Configure Ceph User:** Create the user (`client.kubernetes`) and set permissions for all services. The **wildcard MGR cap** (`mgr "allow *"`) is critical for volume creation.
-    * **Final Correct Caps Command:**
-    ```bash
-    sudo ceph auth caps client.kubernetes \
-      mon 'allow r' \
-      mgr "allow *" \
-      mds 'allow rw' \
-      osd 'allow class-read object_prefix rbd_children, allow pool k8s_rbd rwx, allow pool k8s_metadata rwx, allow pool k8s_data rwx'
-    ```
-* **Export Key to Kubernetes Secrets:** Create and place two Secrets with the user key in the correct CSI provisioner namespaces:
-    * **RBD Secret:** `csi-rbd-secret` (in the RBD Provisioner namespace).
-    * **CephFS Secret:** `csi-cephfs-secret` (in the CephFS Provisioner namespace).
-
-**The secrets should contain the keys: userID & userKey. userID should omit the 'client.' prefix from the ceph output.**
-
----
-
-## 3. 🌐 Network Configuration and Bi-Directional Routing
-
-These steps ensure stable, bidirectional communication for volume staging and mounting.
-
-### A. PVE Host Firewall Configuration
-
-The PVE firewall must explicitly **allow inbound traffic** from the entire Kubernetes Pod Network to the Ceph service ports.
-
-| Protocol | Port(s) | Source | Purpose |
-| :--- | :--- | :--- | :--- |
-| **TCP** | **6789** | K8s Pod Network CIDR (e.g., `10.1.0.0/16`) | Monitor connection. |
-| **TCP** | **6800-7300** | K8s Pod Network CIDR | OSD/MDS/MGR data transfer. |
-
-Alternatively, you may find a 'ceph' macro you can use in the PVE firewall; if so, use the macro instead of individual rules.
-
-### B. PVE Host Static Routing (Ceph $\rightarrow$ K8s)
-
-Add **persistent static routes** on **all PVE Ceph hosts** to allow Ceph to send responses back to the Pod Network.
-
-* **Action:** Edit `/etc/network/interfaces` on each PVE host:
-    ```ini
-    # Example:
-    post-up ip route add {pod-network-cidr} via {k8s-node-ip} dev {interface}
-    # e.g., post-up ip route add 10.1.0.0/16 via 172.35.100.40 dev vmbr0
-    ```
-
-### C. K8s Node IP Forwarding (Gateway Function)
-
-Enable IP forwarding on **all Kubernetes nodes** so they can route incoming Ceph traffic to the correct Pods.
-
-* **Action:** Run on all K8s nodes:
-    ```bash
-    sudo sysctl net.ipv4.ip_forward=1
-    sudo sh -c 'echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.d/99-sysctl.conf'
-    ```
-
-### D. K8s Static Routing (K8s $\rightarrow$ Ceph) - Conditional/Advanced ⚠️
-
-This routing is **only required** if the **Ceph Public Network** (the network Ceph Monitors/OSDs listen on) is **not reachable** via your Kubernetes Node's **default gateway**.
-
-* **Action:** This is implemented via a **Netplan configuration** on the Kubernetes nodes, using multiple routes with different metrics to provide load balancing and automatic failover.
-* **Example Netplan Configuration (`/etc/netplan/99-ceph-routes.yaml`):**
-    ```yaml
-    network:
-      version: 2
-      renderer: networkd
-      ethernets:
-        eth0: # Replace with your primary K8s network interface
-          routes:
-            # Route 1: Directs traffic destined for the first Ceph Monitor IP (10.11.12.1)
-            # through three different PVE hosts (172.35.100.x) as gateways.
-            # The lowest metric (10) is preferred.
-            - to: 10.11.12.1/32
-              via: 172.35.100.10
-              metric: 10
-            - to: 10.11.12.1/32
-              via: 172.35.100.20
-              metric: 100
-            - to: 10.11.12.1/32
-              via: 172.35.100.30
-              metric: 100
-
-            # Route 2: Directs traffic destined for the second Ceph Monitor IP (10.11.12.2)
-            # with a similar failover strategy.
-            - to: 10.11.12.2/32
-              via: 172.35.100.20
-              metric: 10
-            - to: 10.11.12.2/32
-              via: 172.35.100.10
-              metric: 100
-            - to: 10.11.12.2/32
-              via: 172.35.100.30
-              metric: 100
-    ```
-
-Use route metrics (a lower metric means a higher priority) to prefer the most direct path, while still offering alternative gateways into the Ceph network where needed.
-
----
-
-## 4. 🧩 MicroK8s CSI Driver Configuration (The Path Fix) - Conditional/Advanced ⚠️
-
-This adjustment is **only required** if your Kubernetes deployment runs on **MicroK8s**. **Alternative changes** may be needed for other Kubernetes distributions.
-
-This resolves the **`staging path does not exist on node`** error for the Node Plugin.
-
-* **Update `kubeletDir`:** When deploying the CSI driver (via Helm or YAML), the `kubeletDir` parameter must be set to the MicroK8s-specific path.
-    ```yaml
-    # Correct path for MicroK8s Kubelet root directory
-    kubeletDir: /var/snap/microk8s/common/var/lib/kubelet
-    ```