Management Gateway
Deploying mgmt-gateway
In Ceph releases beginning with Squid, the mgmt-gateway service introduces a new design for Ceph applications based on a modular, service-based architecture. This service, managed by cephadm and built on top of nginx (an open-source, high-performance web server), acts as the new front-end and single entry point to the Ceph cluster. The mgmt-gateway provides unified access to all Ceph applications, including the Ceph dashboard and monitoring stack. Employing nginx enhances security and simplifies access management due to its robust community support and high-security standards. The mgmt-gateway service acts as a reverse proxy that routes requests to the appropriate Ceph application instances.
In order to deploy the mgmt-gateway service, use the following command:
ceph orch apply mgmt-gateway [--placement ...] ...
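For example, a minimal sketch that pins the gateway to two specific hosts (the host names ceph-node-01 and ceph-node-02 are placeholders) could look like this:

ceph orch apply mgmt-gateway --placement="ceph-node-01 ceph-node-02"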
Once applied, cephadm will reconfigure specific running daemons (such as monitoring) to run behind the newly created service. External access to those services will no longer be possible. Access will be consolidated behind the new service endpoint: https://<node-ip>:<port>.
Benefits of the mgmt-gateway service
Unified Access
: Consolidated access through nginx improves security and provides a single entry point to services.
Improved user experience
: Users no longer need to know where each application is running (IP/host).
High Availability for dashboard
: nginx HA mechanisms are used to provide high availability for the Ceph dashboard.
High Availability for monitoring
: nginx HA mechanisms are used to provide high availability for monitoring.
Security enhancements
Once the mgmt-gateway service is deployed, users cannot access the monitoring services without first authenticating through the Ceph dashboard.
High availability enhancements
nginx HA mechanisms are used to provide high availability for all the Ceph management applications, including the Ceph dashboard and monitoring stack. In the case of the Ceph dashboard, users no longer need to know where the active manager is running: mgmt-gateway handles manager failover transparently and redirects the user to the active manager. In the case of monitoring, mgmt-gateway takes care of HA when several instances of Prometheus, Alertmanager or Grafana are available. The reverse proxy automatically detects healthy instances and uses them to process user requests.
High Availability for mgmt-gateway service
In addition to providing high availability for the underlying backend services, the mgmt-gateway service itself can be configured for high availability, ensuring that the system remains resilient even if certain core components for the service fail.
Multiple mgmt-gateway instances can be deployed in an active/standby configuration using keepalived for seamless failover. The oauth2-proxy service can be deployed as multiple stateless instances, with nginx acting as a load balancer across them using a round-robin strategy. This setup removes single points of failure and enhances the resilience of the entire system.
In this setup, the underlying internal services follow the same high availability mechanism. Instead of directly accessing the mgmt-gateway internal endpoint, services use the virtual IP specified in the spec. This ensures that the high availability mechanism for mgmt-gateway is transparent to other services.
Example Configuration for High Availability
To deploy the mgmt-gateway in a high-availability setup, the following specification files are required:
mgmt-gateway Configuration:
service_type: mgmt-gateway
placement:
  label: mgmt
spec:
  enable_auth: true
  virtual_ip: 192.168.100.220
Ingress Configuration for Keepalived:
service_type: ingress
service_id: ingress-mgmt-gw
placement:
  label: mgmt
virtual_ip: 192.168.100.220
backend_service: mgmt-gateway
keepalive_only: true
The number of deployed instances is determined by the number of hosts with the mgmt label. The ingress is configured in keepalive_only mode, with labels ensuring that any changes to the mgmt-gateway daemons are replicated to the corresponding keepalived instances. Additionally, the virtual_ip parameter must be identical in both specifications.
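For reference, the mgmt label used by both placements above can be assigned to hosts through the orchestrator; the host names below are placeholders:

ceph orch host label add ceph-node-01 mgmt
ceph orch host label add ceph-node-02 mgmt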
Accessing services with mgmt-gateway
Once the mgmt-gateway service is deployed, direct access to the monitoring services is no longer allowed. Applications such as Prometheus, Grafana and Alertmanager are now accessible through links from Administration > Services.
Service Specification
A mgmt-gateway service can be applied using a specification. An example in YAML follows:
service_type: mgmt-gateway
service_id: gateway
placement:
  hosts:
    - ceph0
spec:
  port: 5000
  ssl_protocols:
    - TLSv1.2
    - TLSv1.3
    - ...
  ssl_ciphers:
    - AES128-SHA
    - AES256-SHA
    - ...
  ssl_certificate: |
    -----BEGIN CERTIFICATE-----
    MIIDtTCCAp2gAwIBAgIYMC4xNzc1NDQxNjEzMzc2MjMyXzxvQ7EcMA0GCSqGSIb3
    DQEBCwUAMG0xCzAJBgNVBAYTAlVTMQ0wCwYDVQQIDARVdGFoMRcwFQYDVQQHDA5T
    [...]
    -----END CERTIFICATE-----
  ssl_certificate_key: |
    -----BEGIN PRIVATE KEY-----
    MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC5jdYbjtNTAKW4
    /CwQr/7wOiLGzVxChn3mmCIF3DwbL/qvTFTX2d8bDf6LjGwLYloXHscRfxszX/4h
    [...]
    -----END PRIVATE KEY-----
Fields specific to the spec section of the mgmt-gateway service are described below.
- class ceph.deployment.service_spec.MgmtGatewaySpec(service_type='mgmt-gateway', service_id=None, config=None, networks=None, placement=None, disable_https=False, enable_auth=False, port=None, ssl_certificate=None, ssl_certificate_key=None, ssl_prefer_server_ciphers=None, ssl_session_tickets=None, ssl_session_timeout=None, ssl_session_cache=None, server_tokens=None, ssl_stapling=None, ssl_stapling_verify=None, ssl_protocols=None, ssl_ciphers=None, enable_health_check_endpoint=False, virtual_ip=None, preview_only=False, unmanaged=False, extra_container_args=None, extra_entrypoint_args=None, custom_configs=None)
- disable_https
A flag to disable HTTPS. If True, the server will use insecure HTTP.
- enable_auth
A flag to enable SSO authentication. Requires the oauth2-proxy service to be active for SSO authentication.
- networks: List[str]
A list of network identities instructing the daemons to only bind on the particular networks in that list. In case the cluster is distributed across multiple networks, you can add multiple networks. See Networks and Ports, Specifying Networks and Specifying Networks.
- placement: PlacementSpec
See Daemon Placement.
- port
The port number on which the server will listen
- server_tokens
Flag to control server tokens in responses: on | off | build | string
- ssl_certificate
A multi-line string that contains the SSL certificate
- ssl_certificate_key
A multi-line string that contains the SSL key
- ssl_ciphers
List of supported secure SSL ciphers. Changing this list may reduce system security.
- ssl_prefer_server_ciphers
Prefer server ciphers over client ciphers: on | off
- ssl_protocols
A list of supported SSL protocols (as supported by nginx)
- ssl_session_cache
Duration an SSL/TLS session is cached: off | none | [builtin[:size]] [shared:name:size]
- ssl_session_tickets
A multi-option flag to control session tickets: on | off
- ssl_session_timeout
The duration for SSL session timeout. Syntax: time (e.g.: 5m)
- ssl_stapling
Flag to enable or disable SSL stapling: on | off
- ssl_stapling_verify
Flag to control verification of SSL stapling: on | off
Warning
TLSv1.3 is considered safe at this moment and includes a set of secure ciphers by default. When configuring SSL/TLS ciphers for older versions, especially TLSv1.2, it is crucial to use only a subset of secure ciphers. Using weak or outdated ciphers can significantly compromise the security of your system.
Any alteration of the cipher list for SSL/TLS configurations is the responsibility of the system administrator. Avoid modifying these lists without a thorough understanding of the implications. Incorrect configurations can lead to vulnerabilities such as weak encryption, lack of forward secrecy, and susceptibility to various attacks. Always refer to up-to-date security guidelines and best practices when configuring SSL/TLS settings.
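As a minimal sketch following the guidance above, a specification that restricts the gateway to TLSv1.3 and relies on its default cipher suites (the host name and port are placeholders) might look like this:

service_type: mgmt-gateway
placement:
  hosts:
    - ceph0
spec:
  port: 5000
  ssl_protocols:
    - TLSv1.3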
The specification can then be applied by running the following command:
ceph orch apply -i mgmt-gateway.yaml
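To verify that the service and its daemons were deployed as expected, the standard orchestrator listing commands can be used, for example:

ceph orch ls mgmt-gateway
ceph orch ps --daemon-type mgmt-gateway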
Limitations
Services must bind to the appropriate ports based on the applications being proxied. Ensure that there are no port conflicts that might disrupt service availability.
Default images
The mgmt-gateway service internally makes use of the nginx reverse proxy. The following container image is used by default:
mgr/cephadm/container_image_nginx = 'quay.io/ceph/nginx:sclorg-nginx-126'
Admins can specify the image to be used by changing the container_image_nginx cephadm module option. If daemons are already running, you must redeploy them in order to have them actually use the new image.
For example:
ceph config set mgr mgr/cephadm/container_image_nginx <new-nginx-image>
ceph orch redeploy mgmt-gateway