Operating the Read (Primary) Balancer

You might be wondering: How can I improve performance in my Ceph cluster? One important data point you can check is the read_balance_score on each of your replicated pools.

This metric, available via ceph osd pool ls detail (see Pools for more details), indicates read performance, or how balanced the primaries are for each replicated pool. In most cases, a read_balance_score above 1 (for instance, 1.5) means that your pool has unbalanced primaries, and you may want to try improving read performance with the read balancer.
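To check several pools at once, the score can be filtered out of the pool listing. The following is a minimal sketch, not an official tool: the function name unbalanced_pools and the threshold argument are hypothetical, and it assumes each pool line begins with pool <id> '<name>' and contains the field read_balance_score <value>, with no spaces in the pool name.

```shell
# Print "<pool name> <score>" for every pool whose read_balance_score
# exceeds a threshold (default 1). Reads `ceph osd pool ls detail` on stdin.
# ASSUMPTION: pool lines look like
#   pool 1 '.mgr' replicated ... read_balance_score 3.00
unbalanced_pools() {
    threshold="${1:-1}"
    awk -v t="$threshold" -v q="'" '
        $1 == "pool" {
            for (i = 1; i < NF; i++) {
                # Compare numerically; the field after the key is the score.
                if ($i == "read_balance_score" && $(i + 1) + 0 > t + 0) {
                    name = $3
                    gsub(q, "", name)   # strip the quotes around the name
                    print name, $(i + 1)
                }
            }
        }'
}

# Usage:
#   ceph osd pool ls detail | unbalanced_pools 1.2
```

Pools that are not replicated carry no read_balance_score field in this assumed format, so they are skipped.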

Online Optimization

At present, there is no online option for the read balancer. However, we plan to add the read balancer as an option to the Balancer Module in the next Ceph version so it can be enabled to run automatically in the background like the upmap balancer.

Offline Optimization

Primaries are updated with an offline optimizer that is built into osdmaptool, the Ceph OSD cluster map manipulation tool.

  1. Grab the latest copy of your osdmap:

    ceph osd getmap -o om
    
  2. Run the optimizer:

    osdmaptool om --read out.txt --read-pool <pool name> [--vstart]
    

    It is highly recommended that you run the capacity balancer before running the read balancer to ensure optimal results. See Using pg-upmap for details on how to balance capacity in a cluster.

  3. Apply the changes:

    source out.txt
    

    In the above example, the proposed changes are written to the output file out.txt. The commands in this file are normal Ceph CLI commands that can be run in order to apply the changes to the cluster.

    If you are working in a vstart cluster, you may pass the --vstart parameter as shown above so the CLI commands are formatted with the ./bin/ prefix.

    Note that any time the number of PGs changes (for instance, if the PG autoscaler kicks in; see Autoscaling placement groups), you should consider rechecking the scores and rerunning the balancer if needed.

To see some details about what the tool is doing, you can pass --debug-osd 10 to osdmaptool. To see even more details, pass --debug-osd 20 to osdmaptool.
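Because out.txt is applied by sourcing it, it can be worth summarizing its contents before step 3. The sketch below is illustrative only: the function name summarize_plan is hypothetical, and it assumes that each proposed change is a single line invoking ceph osd pg-upmap-primary or ceph osd rm-pg-upmap-primary, optionally with the ./bin/ prefix added by --vstart.

```shell
# Summarize the commands in an optimizer output file before applying it.
# ASSUMPTION: one CLI command per line, in the forms
#   ceph osd pg-upmap-primary <pgid> <osd>
#   ceph osd rm-pg-upmap-primary <pgid>
summarize_plan() {
    plan="${1:-out.txt}"
    # grep -c prints 0 and exits non-zero when nothing matches; ignore the status.
    set_count=$(grep -c -E '^(\./bin/)?ceph osd pg-upmap-primary ' "$plan" || true)
    rm_count=$(grep -c -E '^(\./bin/)?ceph osd rm-pg-upmap-primary ' "$plan" || true)
    echo "primaries to set: $set_count"
    echo "mappings to remove: $rm_count"
}

# Usage:
#   summarize_plan out.txt
```

If both counts are zero, the optimizer found nothing to change for that pool and there is nothing to apply.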

Troubleshooting

Removing pg-upmap-primary mappings

For scenarios where you need to manually remove pg-upmap-primary mappings, Ceph provides the following developer-level commands. Use these commands with caution: they directly modify primary PG mappings and can affect read performance, although they do not cause any data movement.

Note

Users affected by #66867 or #61948 may find these commands useful when dealing with unexpected pg-upmap-primary behavior.

To remove a specific pg-upmap-primary mapping, use:

ceph osd rm-pg-upmap-primary <pgid>

If you need to clear all pg-upmap-primary mappings in your cluster, you may use:

ceph osd rm-pg-upmap-primary-all
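Between removing a single mapping and removing all of them, you may want to clear the mappings of one pool only. The following sketch generates the per-PG commands for review rather than running them. The function name rm_primary_cmds_for_pool is hypothetical, and the sketch assumes that ceph osd dump prints each mapping as a line of the form pg_upmap_primary <pgid> <osd>, where the PG ID is <pool id>.<pg>.

```shell
# Emit one `ceph osd rm-pg-upmap-primary` command per mapped PG in a pool.
# Reads `ceph osd dump` output on stdin; the argument is the numeric pool ID.
# ASSUMPTION: mappings appear as lines like
#   pg_upmap_primary 2.1f 4
rm_primary_cmds_for_pool() {
    pool_id="$1"
    awk -v p="$pool_id" '
        $1 == "pg_upmap_primary" {
            split($2, parts, ".")          # parts[1] is the pool ID
            if (parts[1] == p) print "ceph osd rm-pg-upmap-primary " $2
        }'
}

# Usage (inspect rm.txt before sourcing it):
#   ceph osd dump | rm_primary_cmds_for_pool 2 > rm.txt
```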

Unable to Use Kernel Client

If you are unable to use the kernel client to map RBD images or mount a filesystem while pg-upmap-primary mappings exist in your cluster, the cause is that the kernel client does not yet support pg-upmap-primary (as of 2025-09-08).

Follow these steps to confirm this scenario:

  1. Confirm that your cluster contains pg-upmap-primary mappings:

ceph osd dump | grep "pg_upmap_primary"

  2. Check for this error message in the kernel log:

$ dmesg | tail

[73393.901029] libceph: mon2 (1)10.64.24.186:6789 feature set mismatch, my 2f018fb87aa4aafe < server's 2f018fb8faa4aafe, missing 80000000
[73393.901037] libceph: mon2 (1)10.64.24.186:6789 missing required protocol features

Those details confirm that the cluster is using features that the kernel client doesn’t support. Until the kernel client supports pg-upmap-primary, you must remove the mappings to successfully perform mounts. You may do so with the following commands:

  1. If using the balancer module, change the mode back to one that does not use pg-upmap-primary. This prevents additional mappings from being made:

ceph balancer mode upmap

  2. Remove all pg-upmap-primary mappings:

ceph osd rm-pg-upmap-primary-all
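The missing value in the kernel log message shown earlier can be reproduced from the two feature masks: it is the set of bits present in the server's mask but absent from the client's. The sketch below works through that arithmetic with the values from the example log. The function name missing_features is hypothetical, and the sketch assumes a shell with 64-bit arithmetic.

```shell
# Print the feature bits (hex) that the server requires but the client lacks.
# Arguments are the hex masks from the libceph log line, without a 0x prefix.
missing_features() {
    mine="$1"
    server="$2"
    # Bits set in the server mask and cleared in the client mask.
    printf '%x\n' $(( 0x$server & ~0x$mine ))
}

missing_features 2f018fb87aa4aafe 2f018fb8faa4aafe   # prints 80000000
```

The result matches the missing 80000000 reported by libceph. After the mappings are removed, a successful mount or map confirms that the mismatch is gone.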