Memory Profiling

Ceph Monitor, OSD, and MDS daemons can report TCMalloc heap profiles. To generate them, install google-perftools. Your OS distribution might package it under a different name (for example, gperftools) and might use a different package manager. Run a command similar to the following to install google-perftools:

sudo apt-get install google-perftools

The profiler dumps output to your log file directory (/var/log/ceph). See Logging and Debugging for details.

To view the profiler logs with Google’s performance tools, run the following command:

google-pprof --text {path-to-daemon} {log-path/filename}

For example:

$ ceph tell osd.0 heap start_profiler
$ ceph tell osd.0 heap dump
osd.0 tcmalloc heap stats:------------------------------------------------
MALLOC:        2632288 (    2.5 MiB) Bytes in use by application
MALLOC: +       499712 (    0.5 MiB) Bytes in page heap freelist
MALLOC: +       543800 (    0.5 MiB) Bytes in central cache freelist
MALLOC: +       327680 (    0.3 MiB) Bytes in transfer cache freelist
MALLOC: +      1239400 (    1.2 MiB) Bytes in thread cache freelists
MALLOC: +      1142936 (    1.1 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =      6385816 (    6.1 MiB) Actual memory used (physical + swap)
MALLOC: +            0 (    0.0 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =      6385816 (    6.1 MiB) Virtual address space used
MALLOC:            231              Spans in use
MALLOC:             56              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
$ google-pprof --text \
               /usr/bin/ceph-osd  \
               {log-path/filename}
 Total: 3.7 MB
 1.9  51.1%  51.1%      1.9  51.1% ceph::log::Log::create_entry
 1.8  47.3%  98.4%      1.8  47.3% std::string::_Rep::_S_create
 0.0   0.4%  98.9%      0.0   0.6% SimpleMessenger::add_accept_pipe
 0.0   0.4%  99.2%      0.0   0.6% decode_message

Performing another heap dump on the same daemon creates another file. It is convenient to compare the new file to a file created by a previous heap dump to show what has grown in the interval. For example:

$ google-pprof --text --base out/osd.0.profile.0001.heap \
      ceph-osd out/osd.0.profile.0003.heap
 Total: 0.2 MB
 0.1  50.3%  50.3%      0.1  50.3% ceph::log::Log::create_entry
 0.1  46.6%  96.8%      0.1  46.6% std::string::_Rep::_S_create
 0.0   0.9%  97.7%      0.0  26.1% ReplicatedPG::do_op
 0.0   0.8%  98.5%      0.0   0.8% __gnu_cxx::new_allocator::allocate
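
The comparison above can be scripted. The following is a minimal sketch, not part of Ceph: pprof_diff_cmd is a hypothetical helper name, and it assumes that dump files are named {prefix}.profile.NNNN.heap as in the examples above, so that lexical order matches dump order. It only prints the google-pprof command for the oldest and newest dump so that you can review it before running it.

```shell
# Hypothetical helper: print a google-pprof command that compares the oldest
# and newest heap dumps of one daemon found in a directory.
# Assumes files are named <prefix>.profile.NNNN.heap, so that the shell's
# sorted glob expansion matches the order in which the dumps were taken.
pprof_diff_cmd() (
    binary=$1      # daemon binary, for example /usr/bin/ceph-osd
    dir=$2         # directory that holds the .heap files
    prefix=$3      # file name prefix, for example osd.0
    set -- "$dir"/"$prefix".profile.*.heap
    # With fewer than two matching files there is nothing to compare.
    if [ "$#" -lt 2 ]; then
        echo "need at least two heap dumps for $prefix in $dir" >&2
        exit 1
    fi
    base=$1
    # Drop all but the last positional parameter, which is the newest dump.
    shift $(( $# - 1 ))
    echo "google-pprof --text --base $base $binary $1"
)
```

For example, pprof_diff_cmd ceph-osd out osd.0 prints a command of the same form as the one shown above, using the first and last dump found in the out directory.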

See Google Heap Profiler for additional details.

After you have installed the heap profiler, start your cluster and begin using the heap profiler. You can enable or disable the heap profiler at runtime, or ensure that it runs continuously. When running commands based on the examples that follow, do the following:

  1. Replace {daemon-type} with mon, osd, or mds.

  2. Replace {daemon-id} with the OSD number, the MON ID, or the MDS ID.
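
The substitution described in the two steps above can be sketched as a small shell function. This is an illustration only: heap_cmd is a hypothetical name, not a Ceph tool. It checks the daemon type and prints the resulting command instead of running it.

```shell
# Hypothetical helper: build a "ceph tell ... heap ..." command line from
# a daemon type, a daemon ID, and a heap action.
# The command is printed, not executed, so it can be reviewed first.
heap_cmd() (
    daemon_type=$1
    daemon_id=$2
    action=$3
    case "$daemon_type" in
        mon|osd|mds) ;;
        *) echo "daemon type must be mon, osd, or mds" >&2; exit 1 ;;
    esac
    if [ -z "$daemon_id" ] || [ -z "$action" ]; then
        echo "usage: heap_cmd {daemon-type} {daemon-id} {action}" >&2
        exit 1
    fi
    # Type and ID are joined with a dot, as in osd.1 or mds.a.
    echo "ceph tell ${daemon_type}.${daemon_id} heap ${action}"
)
```

For example, heap_cmd osd 1 start_profiler prints ceph tell osd.1 heap start_profiler.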

Starting the Profiler

To start the heap profiler, run a command of the following form:

ceph tell {daemon-type}.{daemon-id} heap start_profiler

For example:

ceph tell osd.1 heap start_profiler

Alternatively, if CEPH_HEAP_PROFILER_INIT=true is set in the daemon's environment, the profiler is started when the daemon starts running.
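
To check whether a running daemon was started with this variable, you can inspect its initial environment. The following sketch assumes a Linux host, where /proc/{pid}/environ holds the environment as NUL-separated entries. The function name heap_profiler_init_set is hypothetical.

```shell
# Hypothetical check: does an environment file contain the entry
# CEPH_HEAP_PROFILER_INIT=true?
# Takes a NUL-separated environment file such as /proc/<pid>/environ (Linux).
# Returns 0 if the exact entry is present and non-zero otherwise.
heap_profiler_init_set() {
    tr '\0' '\n' < "$1" | grep -qx 'CEPH_HEAP_PROFILER_INIT=true'
}
```

Call it with the environ file of the daemon process, for example heap_profiler_init_set /proc/{pid}/environ. Reading the environment of a process that belongs to another user usually requires root privileges.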

Printing Stats

To print out statistics, run a command of the following form:

ceph tell {daemon-type}.{daemon-id} heap stats

For example:

ceph tell osd.0 heap stats


This command does not require the profiler to be running, and it does not dump heap allocation information to a file.
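
If you want to track a single figure over time, the statistics can be filtered with standard tools. The sketch below assumes the "MALLOC: ..." line layout shown in the sample output earlier on this page. heap_stat_bytes is a hypothetical helper name.

```shell
# Hypothetical helper: read "heap stats" output on stdin and print the byte
# count from the first line whose text contains the given description.
# Assumes lines of the form "MALLOC: [sign] <bytes> (<MiB>) <description>".
heap_stat_bytes() {
    awk -v label="$1" '
        index($0, label) {
            # The byte count is the first purely numeric field on the line.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9]+$/) { print $i; exit }
            }
        }'
}
```

For example, ceph tell osd.0 heap stats | heap_stat_bytes 'Actual memory used' prints only the byte count from that line. Depending on the Ceph release, the statistics may be written to standard error, in which case add 2>&1 before the pipe.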

Dumping Heap Information

To dump heap information, run a command of the following form:

ceph tell {daemon-type}.{daemon-id} heap dump

For example:

ceph tell mds.a heap dump


Dumping heap information works only when the profiler is running.
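
Because of this ordering requirement, a profiling session is typically start, wait, dump, stop. The following is a minimal sketch of that sequence. profile_window and the CEPH override variable are hypothetical names introduced for illustration; only the ceph tell commands come from this page.

```shell
# Hypothetical wrapper: start the profiler, wait, dump the heap, then stop.
# Set CEPH to another command (for example CEPH=echo) to preview the calls
# without contacting a cluster.
profile_window() (
    target=$1              # daemon to profile, for example osd.0
    seconds=${2:-60}       # how long to let the profiler collect data
    ceph_bin=${CEPH:-ceph}
    "$ceph_bin" tell "$target" heap start_profiler || exit 1
    sleep "$seconds"
    # The dump works only while the profiler is running, so dump before stopping.
    "$ceph_bin" tell "$target" heap dump
    status=$?
    # Always stop the profiler, even if the dump failed.
    "$ceph_bin" tell "$target" heap stop_profiler
    exit "$status"
)
```

For example, CEPH=echo profile_window osd.0 0 prints the three commands in order without running them, and profile_window osd.0 300 profiles osd.0 for five minutes.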

Releasing Memory

To release memory that tcmalloc has allocated but which is not being used by the Ceph daemon itself, run a command of the following form:

ceph tell {daemon-type}.{daemon-id} heap release

For example:

ceph tell osd.2 heap release

Stopping the Profiler

To stop the heap profiler, run a command of the following form:

ceph tell {daemon-type}.{daemon-id} heap stop_profiler

For example:

ceph tell osd.0 heap stop_profiler

Alternative Methods of Memory Profiling

Running Massif heap profiler with Valgrind

The Massif heap profiler tool can be used with Valgrind to measure how much heap memory is used. This method is well-suited to troubleshooting RadosGW.

See the Massif documentation for more information.

Install Valgrind from the package manager for your distribution, then start the Ceph daemon that you want to troubleshoot under Valgrind:

sudo -u ceph valgrind --max-threads=1024 --tool=massif /usr/bin/radosgw -f --cluster ceph --name NAME --setuser ceph --setgroup ceph

When this command has completed its run, a file with a name of the form massif.out.<pid> is saved in the current working directory. The user who runs the command must have write permission in that directory.

Run the ms_print command to get a graph and statistics from the collected data in the massif.out.<pid> file:

ms_print massif.out.12345

The output of this command is helpful when submitting a bug report.
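
For a quick summary before reading the full ms_print graph, the peak heap size can be pulled directly from the massif.out.<pid> file. This sketch assumes the mem_heap_B=<bytes> lines that Massif writes for each snapshot. massif_peak_heap_bytes is a hypothetical helper name.

```shell
# Hypothetical helper: print the largest mem_heap_B value (in bytes) found
# in a Massif output file. Prints 0 if the file has no mem_heap_B lines.
massif_peak_heap_bytes() {
    awk -F= '
        $1 == "mem_heap_B" && $2 + 0 > max { max = $2 + 0 }
        # Use printf so that large values are not shown in exponent notation.
        END { printf "%.0f\n", max }
    ' "$1"
}
```

For example, massif_peak_heap_bytes massif.out.12345 prints the largest heap size among the recorded snapshots. Massif samples at intervals, so the true peak may fall between two snapshots.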