This document is for a development version of Ceph.
Datacenter-Data-Delivery Network (D3N) uses high-speed storage such as NVMe flash or DRAM to cache datasets on the access side. Such caching allows big data jobs to use the compute and fast storage resources available on each Rados Gateway node at the edge.
Many datacenters include low-cost, centralized storage repositories, called data lakes, to store and share terabyte- and petabyte-scale datasets. By necessity, most distributed big-data analytic clusters, such as Hadoop and Spark, must depend on accessing a centrally located data lake that is relatively far away. Even with a well-designed datacenter network, cluster-to-data-lake bandwidth is typically much less than the bandwidth of solid-state storage located at an edge node.
D3N improves the performance of big-data jobs by speeding up repeated reads of datasets from the data lake. Cache servers are located in the datacenter on the access side of potential network and storage bottlenecks. D3N's two-layer logical cache forms a traditional caching hierarchy * where caches nearer the client have the lowest access latency and overhead, while caches in higher levels of the hierarchy are slower (requiring multiple hops to access). The layer 1 cache server nearest to the client handles object requests by breaking them into blocks, returning any blocks which are cached locally, and forwarding missed requests to the block home location (as determined by consistent hashing) in the next layer. Cache misses are forwarded to successive logical caching layers until a miss at the top layer is resolved by a request to the data lake (Rados).
* Currently, only the layer 1 cache has been upstreamed.
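The request flow described above can be illustrated with a minimal Python sketch. It is not the D3N implementation: the 4 MiB block size, the `HashRing` class, the `plan_read` helper, and the key format are all hypothetical names and values chosen for illustration. The sketch splits an object into blocks, serves blocks found in a local cache, and uses a simple consistent-hash ring to pick the home server for each missed block in the next layer.

```python
import bisect
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size, for illustration only


def split_into_blocks(object_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs covering an object of object_size bytes."""
    return [
        (offset, min(block_size, object_size - offset))
        for offset in range(0, object_size, block_size)
    ]


class HashRing:
    """Toy consistent-hash ring mapping block keys to cache servers."""

    def __init__(self, servers, vnodes=64):
        # Each server is placed on the ring at several virtual points.
        self._ring = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def home(self, block_key):
        """Return the first server clockwise from the key's hash."""
        idx = bisect.bisect_right(self._points, self._hash(block_key))
        return self._ring[idx % len(self._ring)][1]


def plan_read(object_name, object_size, local_cache, ring):
    """For each block, answer 'local' on a hit or the home server on a miss."""
    plan = []
    for offset, _length in split_into_blocks(object_size):
        key = f"{object_name}:{offset}"  # hypothetical block key format
        plan.append((key, "local" if key in local_cache else ring.home(key)))
    return plan
```

Because the ring depends only on the server names, every layer 1 server in this sketch would compute the same home location for a given block without any coordination.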
The D3N cache supports both the S3 and Swift object storage interfaces.
D3N currently caches only tail objects, because they are immutable (by default, tail objects are the parts of objects that are larger than 4MB). The NGINX RGW Data cache and CDN supports caching of all object sizes.
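As a rough illustration of the tail-object rule, the following Python sketch estimates how many bytes of an object are eligible for the D3N cache. It assumes that "4MB" means 4 MiB (4194304 bytes) and that everything beyond the first 4 MiB is stored in tail objects; `cacheable_tail_bytes` and `HEAD_SIZE` are hypothetical names, not part of Ceph.

```python
HEAD_SIZE = 4 * 1024 * 1024  # assumption: the default 4MB threshold is 4 MiB


def cacheable_tail_bytes(object_size, head_size=HEAD_SIZE):
    """Bytes beyond the head of the object, which are the bytes D3N may cache."""
    return max(0, object_size - head_size)
```

Under these assumptions, a 1 MiB object has no cacheable bytes, while a 10 MiB object has 6 MiB eligible for caching.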
An SSD (/dev/nvme, /dev/pmem, /dev/shm) or similar block storage device, formatted (filesystems other than XFS were not tested) and mounted, is required. It will be used as the cache backing store. Depending on device performance, multiple RGWs may share a single device, but each requires a discrete directory on the device filesystem.
To enable D3N on existing RGWs, the following configuration entries are required
in each Rados Gateway's ceph.conf client section, for example for [client.rgw.8000]:
[client.rgw.8000]
    rgw_d3n_l1_local_datacache_enabled = true
    rgw_d3n_l1_datacache_persistent_path = "/mnt/nvme0/rgw_datacache/client.rgw.8000/"
    rgw_d3n_l1_datacache_size = 10737418240
The above example assumes that the cache backing-store solid state device is mounted at /mnt/nvme0 and has 10 GB of free space available for the cache.
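The size value in the example is expressed in bytes: 10737418240 is 10 × 1024³. A small Python sketch can generate the same client section for any gateway and cache size. The helper name `d3n_section` and its parameters are hypothetical; only the three option names and the directory layout come from the example above.

```python
GIB = 1024 ** 3  # bytes per GiB


def d3n_section(client_id, mount_point, size_gib):
    """Render a ceph.conf client section that enables the D3N layer 1 cache."""
    cache_dir = f"{mount_point.rstrip('/')}/rgw_datacache/{client_id}/"
    lines = [
        f"[{client_id}]",
        "rgw_d3n_l1_local_datacache_enabled = true",
        f'rgw_d3n_l1_datacache_persistent_path = "{cache_dir}"',
        f"rgw_d3n_l1_datacache_size = {size_gib * GIB}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(d3n_section("client.rgw.8000", "/mnt/nvme0", 10))
```

The generated text is meant to be reviewed and pasted into ceph.conf by hand; the sketch does not modify any configuration file.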
The persistent path directory has to be created before starting the Gateway.
mkdir -p /mnt/nvme0/rgw_datacache/client.rgw.8000/
If another Gateway is co-located on the same machine, configure its persistent path to a discrete directory.
For example, in the case of [client.rgw.8001], configure
rgw_d3n_l1_datacache_persistent_path = "/mnt/nvme0/rgw_datacache/client.rgw.8001/"
in the [client.rgw.8001] ceph.conf client section.
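When several Gateways share one device, the per-gateway directories can be prepared in one step. The following Python sketch is one possible way to do this, assuming the directory layout used in the examples above; `prepare_cache_dirs` is a hypothetical helper, and the script must run with sufficient permissions on the mounted filesystem.

```python
from pathlib import Path


def prepare_cache_dirs(base_dir, client_ids):
    """Create one discrete cache directory per gateway under base_dir."""
    if len(set(client_ids)) != len(client_ids):
        raise ValueError("each gateway needs its own cache directory")
    created = []
    for client_id in client_ids:
        path = Path(base_dir) / client_id
        path.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
        created.append(path)
    return created
```

For the two gateways above, the call would be `prepare_cache_dirs("/mnt/nvme0/rgw_datacache", ["client.rgw.8000", "client.rgw.8001"])`, run before starting either Gateway.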
In a configuration with multiple co-located Gateways, consider assigning clients with different workloads to each Gateway without a balancer, in order to avoid duplicating cached data.
NOTE: Each time the Rados Gateway is restarted, the content of the cache directory is purged.
D3N-related log lines in radosgw.*.log contain the string
Low-level D3N logs can be enabled by the
debug_rgw_datacache subsystem (up to
The following D3N-related settings can be added to the Ceph configuration file
(i.e., usually ceph.conf) under the
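rgw_d3n_l1_local_datacache_enabled: Enable datacenter-scale dataset delivery local cache
rgw_d3n_l1_datacache_persistent_path: Path for the directory for storing the local cache objects data
rgw_d3n_l1_datacache_size: Datacache maximum size on disk in bytes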
Select the D3N cache eviction policy
- valid choices