Add RADOS Gateway Documentation
284
docs/Admin/Virtualisation/rados-gw-in-k8s.md
Normal file
@@ -0,0 +1,284 @@
# Deploy Ceph RADOS Gateway in Kubernetes
|
||||||
|
## Pointing to a separate PVE Ceph Cluster
|
||||||
|
|
||||||
|
This guide outlines how to deploy a RADOS Gateway to expose an S3 API for a Ceph pool. I use this to provide S3 storage to my Kubernetes cluster, with the Ceph cluster hosted by Proxmox VE. Many concepts are similar to the previous guide, Enable Ceph CSI for PVE Ceph, and some steps refer to that guide.
|
||||||
|
|
||||||
|
This guide makes the following assumptions:
|
||||||
|
* You are already running Ceph via PVE.
|
||||||
|
* You are using the PVE UI for Ceph actions where possible.
|
||||||
|
* You are deploying the RADOS Gateway to the `object-store` namespace in K8s.
|
||||||
|
* Flux is used to deploy to K8s using SOPS for secret encryption.
|
||||||
|
|
||||||
|
### 1. Ceph Pool & User Creation
|
||||||
|
|
||||||
|
These steps create the RGW realm, the Ceph pools (with appropriate replication), and the CephX user that the gateway runs as.
|
||||||
|
|
||||||
|
* Create the RGW Realm on a PVE Host from the Shell
|
||||||
|
* Create Realm: `radosgw-admin realm create --rgw-realm=default --default`
|
||||||
|
* Create Zonegroup: `radosgw-admin zonegroup create --rgw-zonegroup=default --master --default --endpoints=http://ceph-rgw-svc.object-store.svc.cluster.local:8080`
|
||||||
|
* Create Zone: `radosgw-admin zone create --rgw-zone=default --master --default`
|
||||||
|
* Ensure Zone is included in Zonegroup: `radosgw-admin zonegroup add --rgw-zonegroup=default --rgw-zone=default`
|
||||||
|
* Update & Commit Period: `radosgw-admin period update --commit`
|
||||||
|
* Set the default realm: `radosgw-admin realm default --rgw-realm=default`
|
||||||
|
|
||||||
|
* The above commands will have created the following new pools
|
||||||
|
**You do not need to manually create these**
|
||||||
|
|
||||||
|
|Pool Name|Purpose|
| :----- | :----- |
|.rgw.root|Realm, zonegroup, and zone configuration|
|default.rgw.log|RGW log data|
|default.rgw.control|Control pool used for notifications between RGW instances|
|default.rgw.meta|User and bucket metadata|
|
||||||
|
|
||||||
|
* Create the two required Pools for index and data in the PVE UI:
|
||||||
|
|
||||||
|
| Pool Name | PG Autoscaler | Size | Min Size | Crush Rule |
| :------- | :------------ | :--- | :------- | :--------- |
| default.rgw.buckets.index | On | 3 | 2 | replicated_rule |
| default.rgw.buckets.data | On | 3 | 2 | replicated_rule |
|
||||||
|
|
||||||
|
* Enable RGW Application:
|
||||||
|
A pool created via PVE is registered as an RBD pool by default. Run these commands to change it to an RGW pool:
|
||||||
|
* Disable RBD: `ceph osd pool application disable default.rgw.buckets.data rbd --yes-i-really-mean-it`
|
||||||
|
* Enable RGW: `ceph osd pool application enable default.rgw.buckets.data rgw`
|
||||||
|
* Check with: `ceph osd pool application get default.rgw.buckets.data`
|
||||||
|
* Repeat for index pool: `ceph osd pool application disable default.rgw.buckets.index rbd --yes-i-really-mean-it`
|
||||||
|
* Enable RGW: `ceph osd pool application enable default.rgw.buckets.index rgw`
|
||||||
|
* Check with: `ceph osd pool application get default.rgw.buckets.index`
|
||||||
|
|
||||||
|
* Create a user for the RADOS Gateway:
|
||||||
|
```
ceph auth get-or-create client.rgw.k8s.svc \
  mon 'allow r' \
  osd 'allow rwx pool=default.rgw.buckets.data, allow rwx pool=default.rgw.buckets.index, allow rwx pool=.rgw.root, allow rwx pool=default.rgw.meta, allow rwx pool=default.rgw.log, allow rwx pool=default.rgw.control' \
  -o /etc/ceph/ceph.client.rgw.k8s.svc.keyring
```
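
The application re-tagging steps in this section are identical for both bucket pools, so they can be wrapped in a small helper. This is a minimal sketch, not part of the original procedure: the function name `retag_pool_for_rgw` is hypothetical, and it only runs the same three `ceph osd pool application` commands listed in the steps.

```bash
# Hypothetical helper: switch a PVE-created pool from the rbd application tag to rgw.
retag_pool_for_rgw() {
  local pool="$1"
  # Remove the rbd tag that PVE applies by default
  ceph osd pool application disable "$pool" rbd --yes-i-really-mean-it
  # Tag the pool for RGW use
  ceph osd pool application enable "$pool" rgw
  # Print the resulting application tags for a visual check
  ceph osd pool application get "$pool"
}
```

On a PVE host this could be called once per pool, for example `for p in default.rgw.buckets.data default.rgw.buckets.index; do retag_pool_for_rgw "$p"; done`.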
|
||||||
|
|
||||||
|
### 2. Register Kubernetes Secrets
|
||||||
|
|
||||||
|
* Retrieve the files required for the Kubernetes Secrets from the Ceph host:
Store them **temporarily** on your workstation.
|
||||||
|
|
||||||
|
| File | Path | Purpose |
| :------- | :--- | :------ |
| ceph.conf | /etc/ceph/ceph.conf | Location of Ceph Monitors |
| Keyring | /etc/ceph/ceph.client.rgw.k8s.svc.keyring | Auth token |
|
||||||
|
|
||||||
|
* Create Secret manifests for deployment to K8s:
|
||||||
|
|
||||||
|
```
# Assumes ceph.conf was copied into the current directory under its original name
kubectl create secret generic ceph-config \
  --namespace=object-store \
  --from-file=ceph.conf=./ceph.conf \
  --dry-run=client -o yaml > ceph-config-secret.yaml
```
|
||||||
|
```
# Assumes the keyring was copied into the current directory under its original name
kubectl create secret generic ceph-keyring \
  --namespace=object-store \
  --from-file=keyring=./ceph.client.rgw.k8s.svc.keyring \
  --dry-run=client -o yaml > ceph-keyring-secret.yaml
```
|
||||||
|
|
||||||
|
* Encrypt the secret manifests using sops:
|
||||||
|
* `sops encrypt --in-place ./ceph-config-secret.yaml`
|
||||||
|
* `sops encrypt --in-place ./ceph-keyring-secret.yaml`
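
Before committing, it can be worth confirming that both manifests were actually encrypted. A minimal sketch, assuming SOPS-encrypted YAML carries a top-level `sops:` metadata key; the function names `is_sops_encrypted` and `check_manifests` are hypothetical.

```bash
# Hypothetical check: succeed only if the file has a top-level `sops:` key.
is_sops_encrypted() {
  grep -q '^sops:' "$1"
}

# Report every given manifest that still looks like plaintext.
check_manifests() {
  local rc=0 f
  for f in "$@"; do
    if ! is_sops_encrypted "$f"; then
      echo "NOT ENCRYPTED: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `check_manifests ./ceph-config-secret.yaml ./ceph-keyring-secret.yaml` returns non-zero if either file lacks the SOPS metadata, which makes it usable as a guard in front of `git commit`.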
|
||||||
|
|
||||||
|
### 3. Kubernetes Manifests
|
||||||
|
|
||||||
|
**Treat these as examples: read through them and ensure they match your environment.**
|
||||||
|
|
||||||
|
#### Namespace
|
||||||
|
|
||||||
|
```
apiVersion: v1
kind: Namespace
metadata:
  name: object-store
```
|
||||||
|
|
||||||
|
#### Service
|
||||||
|
```
apiVersion: v1
kind: Service
metadata:
  name: ceph-rgw-svc
  namespace: object-store
  labels:
    app.kubernetes.io/name: ceph-rgw
    app.kubernetes.io/component: gateway
spec:
  # The ClusterIP DNS name used for the RGW initialization:
  # http://ceph-rgw-svc.object-store.svc.cluster.local:8080
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
      name: http-api
  selector:
    app: ceph-rgw
  type: ClusterIP
```
|
||||||
|
|
||||||
|
#### Deployment
|
||||||
|
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ceph-rgw
  namespace: object-store
  labels:
    app.kubernetes.io/name: ceph-rgw
    app.kubernetes.io/component: gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ceph-rgw
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: ceph-rgw
    spec:
      # CRUCIAL: Enforce Pods to be on separate nodes for HA
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: ceph-rgw
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: rgw
          # Use the same Major:Minor as your PVE Hosts
          image: quay.io/ceph/ceph:v18.2
          # Arguments to start the RGW process on port 8080
          args: [
            "radosgw",
            "-f",                              # Run in foreground
            "--conf=/etc/ceph/ceph.conf",      # Explicitly use the mounted config
            "--name=client.rgw.k8s.svc",       # The exact CephX user name we created
            "--rgw-frontends=beast port=8080"  # REQUIRED: Beast frontend for Ceph 18+
          ]
          resources:
            requests:
              cpu: 500m
              memory: 2Gi
            limits:
              cpu: 2000m
              memory: 2Gi
          ports:
            - containerPort: 8080
              name: rgw-http
          # Ensure the Pod does not run as root unnecessarily
          securityContext:
            runAsUser: 167 # A common non-root user ID for Ceph containers
            runAsGroup: 167
            allowPrivilegeEscalation: false
          volumeMounts:
            - name: ceph-config-vol
              mountPath: /etc/ceph/ceph.conf
              subPath: ceph.conf
            - name: ceph-keyring-vol
              mountPath: /etc/ceph/ceph.client.rgw.k8s.svc.keyring
              # subPath must match the item `path` in the volume below, not the Secret key
              subPath: ceph.client.rgw.k8s.svc.keyring
      volumes:
        - name: ceph-config-vol
          secret:
            secretName: ceph-config
            items:
              - key: ceph.conf
                path: ceph.conf
        - name: ceph-keyring-vol
          secret:
            secretName: ceph-keyring
            items:
              - key: keyring
                path: ceph.client.rgw.k8s.svc.keyring
```
|
||||||
|
|
||||||
|
**Deploy these manifests to Flux**
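
Once Flux has reconciled the manifests, the rollout can be checked from a workstation. A minimal sketch using standard `kubectl` commands; the function name `check_rgw_rollout` is hypothetical, and the namespace, Deployment name, Service name, and label are taken from the example manifests in this section.

```bash
# Hypothetical helper: confirm the RGW Deployment rolled out and is spread across nodes.
check_rgw_rollout() {
  local ns="object-store"
  # Wait (up to 2 minutes) for the replicas to become available
  kubectl -n "$ns" rollout status deployment/ceph-rgw --timeout=120s || return 1
  # -o wide shows the node per Pod; with the anti-affinity rule the nodes should differ
  kubectl -n "$ns" get pods -l app=ceph-rgw -o wide
  # The Service should list one endpoint per ready Pod
  kubectl -n "$ns" get endpoints ceph-rgw-svc
}
```

If the rollout stalls with one Pod `Pending`, the required anti-affinity rule is a likely cause on clusters with fewer schedulable nodes than replicas.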
|
||||||
|
|
||||||
|
### 4. RGW Admin Utility
|
||||||
|
|
||||||
|
**Do not commit this to Flux. Run it as and when required to manage RGW users and buckets.**
|
||||||
|
|
||||||
|
#### Pod Manifest
|
||||||
|
```
apiVersion: v1
kind: Pod
metadata:
  name: rgw-admin-utility
  namespace: object-store
spec:
  restartPolicy: Never
  containers:
    - name: rgw-admin-cli
      # Use the same image as your RGW deployment for consistency
      image: quay.io/ceph/ceph:v18.2
      # Use the /bin/bash entrypoint to allow manual command execution
      # (the Pod stays up for one hour, then exits)
      command: ["/bin/bash", "-c", "sleep 3600"]
      # Environment variable to explicitly define the CephX user for CLI tools
      env:
        - name: CEPH_ARGS
          value: "--name client.rgw.k8s.svc --keyring /etc/ceph/ceph.client.rgw.k8s.svc.keyring"

      volumeMounts:
        # Mount the ceph.conf Secret
        - name: ceph-config-vol
          mountPath: /etc/ceph/ceph.conf
          subPath: ceph.conf
        # Mount the keyring Secret to the file name radosgw-admin expects
        - name: ceph-keyring-vol
          mountPath: /etc/ceph/ceph.client.rgw.k8s.svc.keyring
          # subPath must match the item `path` in the volume below, not the Secret key
          subPath: ceph.client.rgw.k8s.svc.keyring

  volumes:
    - name: ceph-config-vol
      secret:
        secretName: ceph-config
        items:
          - key: ceph.conf
            path: ceph.conf
    - name: ceph-keyring-vol
      secret:
        secretName: ceph-keyring
        items:
          - key: keyring
            path: ceph.client.rgw.k8s.svc.keyring # Use the explicit filename
```
|
||||||
|
|
||||||
|
#### Managing RGW
|
||||||
|
|
||||||
|
* Deploy the Pod using `kubectl apply -f {filepath}`
|
||||||
|
* Exec into the pod `kubectl exec -it rgw-admin-utility -n object-store -- bash`
|
||||||
|
|
||||||
|
##### Create User
|
||||||
|
|
||||||
|
* `radosgw-admin user create --uid={uid} --display-name={display-name} --gen-key --gen-secret`
|
||||||
|
|
||||||
|
**CRITICAL:** *Copy the JSON output, save the access_key and secret_key*
|
||||||
|
|
||||||
|
##### Create Bucket
|
||||||
|
|
||||||
|
* `radosgw-admin bucket create --bucket={bucket-name} --uid={owner-uid}`
|
||||||
|
|
||||||
|
##### Exit & Cleanup
|
||||||
|
|
||||||
|
* `exit`
|
||||||
|
* `kubectl delete pod rgw-admin-utility -n object-store`
|
||||||
|
|
||||||
|
### 5. Generate Secret for Client Access
|
||||||
|
|
||||||
|
Deploy this in the namespace of the application requiring S3 API access.
|
||||||
|
|
||||||
|
```
kubectl create secret generic s3-credentials \
  --namespace={application-namespace} \
  --from-literal=S3_ACCESS_KEY={access-key-from-user-creation} \
  --from-literal=S3_SECRET_KEY={secret-key-from-user-creation} \
  --dry-run=client -o yaml > s3-secret.yaml
```
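
To confirm the credentials work, any S3 client can be pointed at the gateway. A minimal sketch, assuming the AWS CLI is available in a Pod that can reach the Service and that `S3_ACCESS_KEY` and `S3_SECRET_KEY` are exposed as environment variables from the Secret above. The function name `s3_smoke_test`, the bucket name, and the `us-east-1` region value are assumptions, not part of the original setup.

```bash
# Hypothetical smoke test: create a bucket through the S3 API, then list buckets.
s3_smoke_test() {
  local bucket="$1"
  local endpoint="http://ceph-rgw-svc.object-store.svc.cluster.local:8080"
  # The AWS CLI reads credentials from these standard variable names
  export AWS_ACCESS_KEY_ID="$S3_ACCESS_KEY"
  export AWS_SECRET_ACCESS_KEY="$S3_SECRET_KEY"
  # Assumption: the CLI wants some region set; us-east-1 is a placeholder
  export AWS_DEFAULT_REGION="us-east-1"
  aws --endpoint-url "$endpoint" s3 mb "s3://$bucket" || return 1
  aws --endpoint-url "$endpoint" s3 ls
}
```

Calling `s3_smoke_test my-test-bucket` should print the new bucket in the listing if the user, keys, and Service are all wired up correctly.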
|
||||||
@@ -1,129 +0,0 @@
|
|||||||
# 📚 Complete Ceph CSI Deployment & Troubleshooting Guide
|
|
||||||
|
|
||||||
This guide details the preparation and configuration necessary for successful dynamic provisioning of Ceph RBD (RWO) and CephFS (RWX) volumes in a Kubernetes cluster running on **MicroK8s**, backed by a Proxmox VE (PVE) Ceph cluster.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 1. ⚙️ Ceph Cluster Preparation (Proxmox VE)
|
|
||||||
|
|
||||||
These steps ensure the Ceph backend has the necessary pools and structure.
|
|
||||||
|
|
||||||
* **Create Dedicated Pools:** Create OSD Pools for data, e.g., **`k8s_rbd`** (for RWO), and **`k8s_data`** and **`k8s_metadata`** (for CephFS).
|
|
||||||
* **Create CephFS Metadata Servers (MDS):** Deploy **at least two** Metadata Server (MDS) instances.
|
|
||||||
* **Create CephFS Filesystem:** Create the Ceph Filesystem (e.g., named **`k8s`**), linking the metadata and data pools.
|
|
||||||
* **Create Subvolume Group (Mandatory Fix):** Create the dedicated Subvolume Group **`csi`** inside your CephFS. This is required by the CSI driver's default configuration and fixes the "No such file or directory" error during provisioning.
|
|
||||||
* **CLI Command:** `ceph fs subvolumegroup create k8s csi`
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 2. 🔑 Ceph User and Authorization (The Permission Fix)
|
|
||||||
|
|
||||||
This addresses the persistent "Permission denied" errors during provisioning.
|
|
||||||
|
|
||||||
* **Create and Configure Ceph User:** Create the user (`client.kubernetes`) and set permissions for all services. The **wildcard MGR cap** (`mgr "allow *"`) is critical for volume creation.
|
|
||||||
* **Final Correct Caps Command:**
|
|
||||||
```bash
sudo ceph auth caps client.kubernetes \
  mon 'allow r' \
  mgr "allow *" \
  mds 'allow rw' \
  osd 'allow class-read object_prefix rbd_children, allow pool k8s_rbd rwx, allow pool k8s_metadata rwx, allow pool k8s_data rwx'
```
|
|
||||||
* **Export Key to Kubernetes Secrets:** Create and place two Secrets with the user key in the correct CSI provisioner namespaces:
|
|
||||||
* **RBD Secret:** `csi-rbd-secret` (in the RBD Provisioner namespace).
|
|
||||||
* **CephFS Secret:** `csi-cephfs-secret` (in the CephFS Provisioner namespace).
|
|
||||||
|
|
||||||
**The secrets should contain the keys: userID & userKey. userID should omit the 'client.' from the ceph output.**
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 3. 🌐 Network Configuration and Bi-Directional Routing
|
|
||||||
|
|
||||||
These steps ensure stable, bidirectional communication for volume staging and mounting.
|
|
||||||
|
|
||||||
### A. PVE Host Firewall Configuration
|
|
||||||
|
|
||||||
The PVE firewall must explicitly **allow inbound traffic** from the entire Kubernetes Pod Network to the Ceph service ports.
|
|
||||||
|
|
||||||
| Protocol | Port(s) | Source | Purpose |
| :--- | :--- | :--- | :--- |
| **TCP** | **6789** | K8s Pod Network CIDR (e.g., `10.1.0.0/16`) | Monitor connection. |
| **TCP** | **6800-7300** | K8s Pod Network CIDR | OSD/MDS/MGR data transfer. |
|
|
||||||
|
|
||||||
Alternatively, you may find a 'ceph' macro you can use on the PVE Firewall, if so use the Macro instead of additional rules.
|
|
||||||
|
|
||||||
### B. PVE Host Static Routing (Ceph $\rightarrow$ K8s)
|
|
||||||
|
|
||||||
Add **persistent static routes** on **all PVE Ceph hosts** to allow Ceph to send responses back to the Pod Network.
|
|
||||||
|
|
||||||
* **Action:** Edit `/etc/network/interfaces` on each PVE host:
|
|
||||||
```ini
# Example:
post-up ip route add <POD_NETWORK_CIDR> via <K8S_NODE_IP> dev <PVE_INTERFACE>
# e.g., post-up ip route add 10.1.0.0/16 via 172.35.100.40 dev vmbr0
```
|
|
||||||
|
|
||||||
### C. K8s Node IP Forwarding (Gateway Function)
|
|
||||||
|
|
||||||
Enable IP forwarding on **all Kubernetes nodes** so they can route incoming Ceph traffic to the correct Pods.
|
|
||||||
|
|
||||||
* **Action:** Run on all K8s nodes:
|
|
||||||
```bash
sudo sysctl net.ipv4.ip_forward=1
sudo sh -c 'echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.d/99-sysctl.conf'
```
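
To confirm the setting took effect and will survive a reboot, both the live value and the persisted line can be checked. A minimal sketch; the function name `check_ip_forward` is hypothetical, and the default file path is the one written to in the step above.

```bash
# Hypothetical check: forwarding must be enabled now and persisted in the sysctl file.
check_ip_forward() {
  local conf="${1:-/etc/sysctl.d/99-sysctl.conf}"
  # Live kernel value: 1 means forwarding is enabled
  if [ "$(sysctl -n net.ipv4.ip_forward)" != "1" ]; then
    echo "forwarding is off" >&2
    return 1
  fi
  # Persisted setting: tolerate optional spaces around '='
  if ! grep -Eq '^net\.ipv4\.ip_forward[[:space:]]*=[[:space:]]*1' "$conf"; then
    echo "not persisted in $conf" >&2
    return 1
  fi
}
```

Running `check_ip_forward` on each K8s node returns zero only when both conditions hold.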
|
|
||||||
|
|
||||||
### D. K8s Static Routing (K8s $\rightarrow$ Ceph) - Conditional/Advanced ⚠️
|
|
||||||
|
|
||||||
This routing is **only required** if the **Ceph Public Network** (the network Ceph Monitors/OSDs listen on) is **not reachable** by your Kubernetes Node's **default gateway**.
|
|
||||||
|
|
||||||
* **Action:** This is implemented via a **Netplan configuration** on the Kubernetes nodes, using multiple routes with different metrics to provide load balancing and automatic failover.
|
|
||||||
* **Example Netplan Configuration (`/etc/netplan/99-ceph-routes.yaml`):**
|
|
||||||
```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0: # Replace with your primary K8s network interface
      routes:
        # Route 1: Directs traffic destined for the first Ceph Monitor IP (10.11.12.1)
        # through three different PVE hosts (172.35.100.x) as gateways.
        # The lowest metric (10) is preferred.
        - to: 10.11.12.1/32
          via: 172.35.100.10
          metric: 10
        - to: 10.11.12.1/32
          via: 172.35.100.20
          metric: 100
        - to: 10.11.12.1/32
          via: 172.35.100.30
          metric: 100

        # Route 2: Directs traffic destined for the second Ceph Monitor IP (10.11.12.2)
        # with a similar failover strategy.
        - to: 10.11.12.2/32
          via: 172.35.100.20
          metric: 10
        - to: 10.11.12.2/32
          via: 172.35.100.10
          metric: 100
        - to: 10.11.12.2/32
          via: 172.35.100.30
          metric: 100
```
|
|
||||||
|
|
||||||
Use route metrics (a lower metric means a higher priority) to prefer the most direct path, while still offering alternative gateways into the Ceph network where needed.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## 4. 🧩 MicroK8s CSI Driver Configuration (The Path Fix) - Conditional/Advanced ⚠️
|
|
||||||
|
|
||||||
This adjustment is **only required** if **MicroK8s** is running your Kubernetes deployment. **Alternative changes** may be needed for other Kubernetes distributions.
|
|
||||||
|
|
||||||
This resolves the **`staging path does not exist on node`** error for the Node Plugin.
|
|
||||||
|
|
||||||
* **Update `kubeletDir`:** When deploying the CSI driver (via Helm or YAML), the `kubeletDir` parameter must be set to the MicroK8s-specific path.
|
|
||||||
```yaml
# Correct path for MicroK8s Kubelet root directory
kubeletDir: /var/snap/microk8s/common/var/lib/kubelet
```
|
|
||||||