Rook - Ceph storage



Ceph is a powerful distributed storage system, while Rook is a cloud-native storage orchestrator for Kubernetes. Rook Ceph combines the capabilities of both projects, enabling users to deploy and manage Ceph storage as Kubernetes applications, simplifying storage management in Kubernetes environments.

To build intuition, we can draw an analogy: Ceph is to Rook roughly what Docker containers are to Kubernetes. The analogy is loose, so it's important to understand the differences:

  • Ceph is a distributed storage system designed to provide scalable and reliable storage for cloud computing environments. It is analogous to Docker containers in the sense that both are technologies used to manage and deploy infrastructure components (storage in the case of Ceph, and applications in the case of Docker).

  • Like Docker containers, Ceph can be deployed across multiple nodes to provide a distributed and scalable storage solution. Ceph spreads data across many OSDs (Object Storage Daemons, typically one per disk) running on multiple storage nodes, similar to how Docker containers can be distributed across multiple hosts.

  • Rook is a cloud-native storage orchestrator designed specifically for Kubernetes. It automates the deployment, management, and scaling of storage systems as native Kubernetes applications. Rook abstracts the complexity of deploying storage systems and provides Kubernetes-native APIs for managing storage resources.

  • Rook is analogous to Kubernetes in the sense that both are orchestration platforms used to manage infrastructure components (storage in the case of Rook, and applications in the case of Kubernetes). Rook extends Kubernetes to provide storage orchestration capabilities, similar to how Kubernetes manages containers.

Installation

Both Rook and Ceph are designed to be versatile and can be deployed in various types of environments, including on-premises data centers and bare-metal servers.

You can use tools like Helm or kubectl to deploy Rook components onto your Kubernetes cluster.

After installing Rook, you can configure it to deploy Ceph clusters on your Kubernetes cluster. Rook provides Custom Resource Definitions (CRDs) such as CephCluster for describing Ceph clusters, and you create custom resources of those kinds to suit your requirements. You'll specify parameters such as the number of monitors, which nodes and disks to use for storage, and (per pool) replication settings.

Once you apply these custom resources, the Rook operator handles tasks such as provisioning storage nodes, deploying Ceph OSDs (Object Storage Daemons), and configuring the Ceph cluster.
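As a rough illustration of the cluster definition described above, a minimal CephCluster resource might look like the sketch below. The resource name, the Ceph image tag, and the choice to use every node and every empty device are assumptions made for the example, not settings taken from this article; adjust them to your environment.

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph          # assumed cluster name
  namespace: rook-ceph     # namespace where the Rook operator runs
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v18   # assumed image tag; pin the release you actually run
  dataDirHostPath: /var/lib/rook   # host path where Ceph keeps its configuration data
  mon:
    count: 3                       # three monitors for quorum
    allowMultiplePerNode: false
  storage:
    useAllNodes: true              # assumption: let Rook place OSDs on every node
    useAllDevices: true            # assumption: consume every empty raw device it finds
```

Applying a manifest like this with kubectl apply -f is what prompts the operator to create the monitors and OSDs.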

Storage types offered by Ceph

Ceph offers support for various types of storage, including block storage, object storage, and file system storage. Each type of storage has its own characteristics and use cases:

Block Storage:

  • Block storage is a type of storage where data is stored in fixed-size blocks (chunks) and accessed through block-level protocols such as iSCSI.

  • In block storage, data is stored on physical storage devices such as hard disk drives (HDDs), solid-state drives (SSDs), or storage arrays.

  • The storage capacity of each device is divided into logical blocks, which are fixed-size units of storage. Block sizes typically range from 512 bytes to a few kilobytes at the device level, while distributed systems may group data into larger units of several megabytes (Ceph RBD stripes an image over 4 MiB objects by default). Each logical block is assigned a unique address, allowing it to be accessed independently.

  • Block storage is commonly used for applications that require direct access to raw storage devices, such as databases, virtual machines (VMs), and high-performance computing (HPC) environments.

  • Block storage provides direct access to storage blocks, allowing applications or operating systems to read from and write to specific blocks of data on storage devices. This direct access enables high-performance storage operations and low-latency data access.

  • Ceph provides block storage capabilities through its RBD (RADOS Block Device) feature, allowing you to create virtual block devices that can be attached to VMs or used as raw storage devices by applications.

Object Storage:

  • Object storage is a type of storage where data is stored as objects, each with a unique identifier (object key), metadata, and data payload. Objects are stored in a flat namespace and accessed via RESTful HTTP APIs.

  • Object storage is highly scalable and suitable for storing large volumes of unstructured data, such as media files, backups, logs, and archives.

  • Ceph provides object storage capabilities through its RADOS Gateway (RGW) feature, which exposes S3-compatible and Swift-compatible object storage APIs. RGW allows you to create buckets and store objects in Ceph clusters, providing scalable and durable storage for object data.
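With Rook, an RGW-backed object store is usually requested through a CephObjectStore resource. The sketch below is a minimal example under assumptions: the store name my-store, the replication size of 3, and the single gateway instance on port 80 are illustrative values, not taken from this article.

```yaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store           # hypothetical store name
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3              # copies of bucket index and metadata
  dataPool:
    replicated:
      size: 3              # copies of object data
  gateway:
    port: 80               # HTTP port served by the RGW pods
    instances: 1           # number of RGW pods
```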

File System:

  • File system storage is a type of storage where data is organized and accessed in a hierarchical structure of directories and files. File systems provide a familiar interface for accessing and managing data, similar to how data is organized on local disk drives.

  • File system storage is commonly used for shared file storage, home directories, application data, and other file-based workloads.

  • Ceph provides file system storage capabilities through its CephFS feature, which implements a distributed file system on top of the Ceph storage cluster. CephFS allows you to create file systems and mount them on multiple clients, providing shared access to files and directories stored in the Ceph cluster.
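In a Rook deployment, a CephFS file system is typically declared with a CephFilesystem resource. The following is a minimal sketch; the name myfs, the pool name, and the replication and metadata server counts are assumptions chosen for illustration.

```yaml
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs               # hypothetical file system name
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3              # copies of file system metadata
  dataPools:
    - name: replicated     # hypothetical data pool name
      replicated:
        size: 3            # copies of file data
  metadataServer:
    activeCount: 1         # one active MDS
    activeStandby: true    # keep a standby MDS ready for failover
```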

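Example: the manifests below define a replicated block pool, a StorageClass that uses it, and a PersistentVolumeClaim that requests an RBD volume. The provisioner line is an assumption inferred from the parameter names, which match Rook's legacy FlexVolume driver; CSI-based Rook releases use rook-ceph.rbd.csi.ceph.com with different parameters.

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: rbd-pool
  namespace: rook-ceph   # Replace with your namespace if different
spec:
  replicated:
    size: 3   # Replication factor: Number of copies of data
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block   # Assumed: legacy FlexVolume provisioner
parameters:
  blockPool: rbd-pool   # Name of the Ceph block pool to use
  clusterNamespace: rook-ceph   # Replace with your namespace if different
  fstype: ext4   # File system type for the RBD volume
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block   # Use the defined StorageClass
  resources:
    requests:
      storage: 1Gi   # Requested storage size for the RBD volume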
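To show how an application would consume the claim above, here is a sketch of a Pod that mounts my-rbd-pvc. The Pod name, the busybox image, and the /data mount path are hypothetical choices for the example; the Pod must live in the same namespace as the claim.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rbd-demo-pod       # hypothetical Pod name
spec:
  containers:
    - name: app
      image: busybox:1.36  # assumed image, any image with a shell works
      # Write a file to the RBD-backed volume, then idle so the Pod stays up
      command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data # where the RBD volume appears inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-rbd-pvc   # the claim defined above
```

Because the claim uses ReadWriteOnce, the volume can be mounted read-write by a single node at a time, which suits a single-replica workload like this one.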