Logging and Debugging
Ceph component debug log levels can be adjusted at runtime, while services are
running. In some circumstances you might want to adjust debug log levels in
ceph.conf or in the central config store. Increased debug logging can be
useful if you are encountering issues when operating your cluster. By default,
Ceph log files are in /var/log/ceph; containerized deployments often log
elsewhere under /var/log.
Tip
Remember that debug output can slow down your system, and that this latency sometimes hides race conditions.
Debug logging is resource intensive. If you encounter a problem in a specific component of your cluster, begin troubleshooting by enabling logging for only that component. For example, if your OSDs are running without errors, but your CephFS metadata servers (MDS) are not, enable logging for specific instances that are having problems. Continue by enabling logging for each subsystem only as needed.
Important
Verbose logging sometimes generates over 1 GB of data per hour. If the disk that your operating system runs on (your “OS disk”) reaches its capacity, the node associated with that disk will stop working.
Whenever you enable or increase the level of debug logging, ensure that you have ample capacity for log files, as this may dramatically increase their size. For details on rotating log files, see Accelerating Log Rotation. When your system is running well again, remove unnecessary debugging settings in order to ensure that your cluster runs optimally. Logging debug-output messages is a slow process and a potential waste of your cluster’s resources.
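As a rough worked example of the capacity concern above, the following sketch estimates how long a log filesystem would last at a given logging rate. The function name hours_until_full and the sample figure of 40 GB free are illustrative assumptions, not part of Ceph; the rate of 1 GB per hour is the figure quoted above.

```shell
# Hypothetical helper (not a Ceph tool): estimate how many whole hours of
# verbose logging fit into the remaining free space.
# $1 = free space in GB, $2 = log growth in GB per hour (both integers).
hours_until_full() {
    free_gb=$1
    rate_gb_per_hour=$2
    if [ "$rate_gb_per_hour" -le 0 ]; then
        echo "rate must be a positive integer" >&2
        return 1
    fi
    # Integer division rounds down, which errs on the cautious side.
    echo $(( free_gb / rate_gb_per_hour ))
}

# Example: 40 GB free, 1 GB of debug logs per hour.
hours_until_full 40 1   # prints 40
```

In practice the real growth rate depends on which subsystems are enabled and at what level, so treat the result as a planning estimate only.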
For details on available settings, see Subsystem, Log and Debug Settings.
Runtime
To see configuration settings at runtime, log in to a host that has a running daemon and run a command of the following form:
ceph daemon {daemon-name} config show | less
For example:
ceph daemon osd.0 config show | less
To activate Ceph’s debugging output (that is, the dout() logging function)
at runtime, inject arguments into the runtime configuration by running a
ceph tell command of the following form:
ceph tell {daemon-type}.{daemon id or *} config set {name} {value}
Here {daemon-type} is osd, mon, or mds. Apply the runtime
setting either to a specific daemon (by specifying its ID) or to all daemons of
a particular type (by using the * wildcard as the ID). For example, to increase
debug logging for a specific ceph-osd daemon named osd.0, run the
following command:
ceph tell osd.0 config set debug_osd 0/5
The ceph tell command goes through the monitors. If you are unable
to connect to the monitors, there is another way to activate
Ceph’s debugging output: log in to the host of the specific daemon and use
the ceph daemon command to change that daemon’s configuration. For example:
sudo ceph daemon osd.0 config set debug_osd 0/5
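The two command forms above can be placed side by side in a small dry-run sketch. The helper name debug_set_cmd and its "tell"/"daemon" argument are hypothetical conveniences introduced here for illustration; the function only prints the command and does not contact a cluster.

```shell
# Hypothetical helper: print (do not run) the command that sets a debug level.
# $1 = transport: "tell" (goes through the monitors) or
#                 "daemon" (run on the daemon's own host)
# $2 = daemon name such as osd.0, $3 = option name, $4 = value
debug_set_cmd() {
    case "$1" in
        tell)   echo "ceph tell $2 config set $3 $4" ;;
        daemon) echo "sudo ceph daemon $2 config set $3 $4" ;;
        *)      echo "unknown transport: $1" >&2; return 1 ;;
    esac
}

debug_set_cmd tell osd.0 debug_osd 0/5
debug_set_cmd daemon osd.0 debug_osd 0/5
```

Printing the command first makes it easy to review the daemon name and level before running it against a live cluster.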
For details on available settings, see Subsystem, Log and Debug Settings.
Boot Time
To activate Ceph’s debugging output (that is, the dout() logging function)
at boot time, you must add settings to your Ceph configuration file (or
set corresponding values in the central config store).
Subsystems that are common to all daemons are set under [global] in the
configuration file. Subsystems for a specific daemon are set under the relevant
daemon section in the configuration file (for example, [mon], [osd],
[mds]). Here is an example that shows possible debugging settings in a Ceph
configuration file:
[global]
debug_ms = 1/5
[mon]
debug_mon = 20
debug_paxos = 1/5
debug_auth = 2
[osd]
debug_osd = 1/5
debug_filestore = 1/5
debug_journal = 1
debug_monc = 5/20
[mds]
debug_mds = 1
debug_mds_balancer = 1
For details, see Subsystem, Log and Debug Settings.
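For clusters that keep these values in the central config store rather than in the configuration file, the same settings can be expressed as ceph config set commands. The sketch below is a dry run that only prints the commands; the function name conf_to_config_set is a hypothetical helper, and it assumes a simple file with [section] headers and key = value lines.

```shell
# Hypothetical helper: read a ceph.conf-style fragment on standard input and
# print (do not run) one "ceph config set" command per setting.
conf_to_config_set() {
    section=""
    while IFS= read -r line; do
        case "$line" in
            '#'*|';'*)
                # Skip comment lines.
                ;;
            \[*\])
                # Section header, for example [osd] becomes osd.
                section=${line#\[}
                section=${section%\]}
                ;;
            *=*)
                # Trim whitespace; option names may be written with spaces
                # ("debug mds balancer"), so join the words with underscores.
                key=$(printf '%s' "${line%%=*}" |
                    sed 's/^[[:space:]]*//; s/[[:space:]]*$//; s/[[:space:]]\{1,\}/_/g')
                value=$(printf '%s' "${line#*=}" |
                    sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
                echo "ceph config set $section $key $value"
                ;;
        esac
    done
}

conf_to_config_set <<'EOF'
[global]
debug_ms = 1/5
[osd]
debug_osd = 1/5
EOF
```

Review the printed commands before running any of them, and remember to remove the settings again once troubleshooting is finished.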
Accelerating Log Rotation
If a host’s log filesystem is nearly full, you can accelerate log rotation by
modifying the Ceph log rotation file at /etc/logrotate.d/ceph. To increase
the frequency of log rotation (which will guard against a filesystem reaching
capacity), add a size directive after the weekly frequency directive.
To smooth out volume spikes, consider changing weekly to daily and
consider changing rotate to 30. The procedure for adding the size
setting is shown immediately below.
Note the default settings of the /etc/logrotate.d/ceph file:
rotate 7
weekly
compress
sharedscripts
Modify them by adding a size setting:
rotate 7
weekly
size 500M
compress
sharedscripts
Start the crontab editor for your user space:
crontab -e
Add an entry to crontab that instructs cron to check the /etc/logrotate.d/ceph file:
30 * * * * /usr/sbin/logrotate /etc/logrotate.d/ceph >/dev/null 2>&1
In this example, the /etc/logrotate.d/ceph file will be checked and possibly
rotated once an hour, at 30 minutes past the hour.
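After editing the rotation file, a quick check can confirm that a size directive is actually present. This is a minimal sketch: the function name has_size_directive is hypothetical, and the pattern only recognizes the simple "size 500M" form used in this procedure.

```shell
# Hypothetical check: succeed if the given logrotate file contains a
# "size" directive on a line of its own, such as "size 500M".
has_size_directive() {
    grep -Eq '^[[:space:]]*size[[:space:]]+[0-9]+[kMG]?[[:space:]]*$' "$1"
}

conf=/etc/logrotate.d/ceph
if [ -r "$conf" ] && has_size_directive "$conf"; then
    echo "size directive found in $conf"
else
    echo "no size directive found in $conf (or file not readable)"
fi
```

For a fuller check, logrotate’s -d (debug) option reads a configuration file and reports what it would do without rotating anything.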
Valgrind
When you are debugging your cluster’s performance, you might find it necessary
to track down memory and threading issues. The Valgrind tool suite can be used
to detect problems in a specific daemon, in a particular type of daemon, or in
the entire cluster. Because Valgrind is computationally expensive, it should be
used only when developing or debugging Ceph, and it will slow down your system
if used at other times. Valgrind messages are logged to stderr.
Subsystem, Log and Debug Settings
Debug logging output is typically enabled via subsystems.
Ceph Subsystems
For each subsystem, there is a logging level for its output logs (a so-called
“log level”) and a logging level for its in-memory logs (a so-called “memory
level”). Different values may be set for these two logging levels in each
subsystem. Ceph’s logging levels operate on a scale of 1 to 20, where
1 is terse and 20 is verbose. In a certain few cases, there are logging
levels that can take a value greater than 20. The resulting logs are extremely
verbose.
The in-memory logs are not sent to the output log unless one or more of the following conditions are true:
- a fatal signal has been raised, or
- an assertion within Ceph code has been triggered, or
- sending in-memory logs to the output log has been manually triggered. For more detail, consult the portion of the Ceph Administration Tool documentation that provides an example of how to submit admin socket commands.
Log levels and memory levels can be set either together or separately. If a
subsystem is assigned a single value, then that value determines both the log
level and the memory level. For example, debug ms = 5 will give the ms
subsystem a log level of 5 and a memory level of 5. On the other hand,
if a subsystem is assigned two values that are separated by a forward slash
(/), then the first value determines the log level and the second value
determines the memory level. For example, debug ms = 1/5 will give the
ms subsystem a log level of 1 and a memory level of 5. See the
following:
debug {subsystem} = {log-level}/{memory-level}
#for example
debug mds balancer = 1/20
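The single-value and two-value forms can be illustrated with a short sketch. Both function names below are hypothetical helpers, not Ceph commands. The second function assumes that a message above both levels is not recorded at all, which is consistent with the description of debug_osd=1/5 elsewhere on this page but is stated here as an assumption.

```shell
# Hypothetical helper: print "<log-level> <memory-level>" for a value such
# as "5" (both levels are 5) or "1/5" (log level 1, memory level 5).
split_debug_level() {
    case "$1" in
        */*) echo "${1%%/*} ${1#*/}" ;;
        *)   echo "$1 $1" ;;
    esac
}

# Hypothetical helper: print where a message of level $1 ends up under the
# setting $2: "log" (written to the log file), "memory" (kept only in the
# in-memory buffer), or "dropped" (above both levels; an assumption).
message_destination() {
    msg_level=$1
    levels=$(split_debug_level "$2")
    log_level=${levels% *}
    mem_level=${levels#* }
    if [ "$msg_level" -le "$log_level" ]; then
        echo log
    elif [ "$msg_level" -le "$mem_level" ]; then
        echo memory
    else
        echo dropped
    fi
}

split_debug_level 1/5       # prints "1 5"
message_destination 3 1/5   # prints "memory"
```

With a setting of 1/5, a level-1 message goes to the log file, a level-3 message stays in memory until a crash or manual dump, and a level-10 message is not kept.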
The following table provides a list of Ceph subsystems and their default log and memory levels. Once you complete your logging efforts, restore each subsystem’s values to their defaults or to a level suitable for normal operations.
Subsystem |
Log Level |
Memory Level |
---|---|---|
|
0 |
5 |
|
0 |
1 |
|
0 |
1 |
|
1 |
1 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
0 |
1 |
|
0 |
1 |
|
0 |
1 |
|
0 |
1 |
|
0 |
1 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
1 |
5 |
|
0 |
5 |
|
0 |
5 |
|
1 |
3 |
|
1 |
3 |
|
0 |
5 |
|
1 |
5 |
|
0 |
10 |
|
1 |
5 |
|
0 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
1 |
|
1 |
1 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
1 |
|
0 |
0 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
3 |
|
1 |
5 |
|
4 |
5 |
|
1 |
5 |
|
2 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
0 |
5 |
|
1 |
5 |
|
0 |
5 |
|
1 |
5 |
|
1 |
5 |
|
1 |
5 |
Logging and Debugging Settings
It is not necessary to specify logging and debugging settings in the Ceph configuration file, but you may override default settings when needed. Ceph supports the following settings:
- log_file
  The location of the logging file for your cluster.
  - type: str
  - see also: log_to_file, log_to_stderr, err_to_stderr, log_to_syslog, err_to_syslog
- log_max_new
  The maximum number of new log files.
  - type: int
  - default: 1000
  - see also:
- log_max_recent
  The purpose of this option is to log at a higher debug level only to the in-memory buffer, and write out the detailed log messages only if there is a crash. Only log entries below the lower log level will be written unconditionally to the log. For example, debug_osd=1/5 will write everything <= 1 to the log unconditionally but keep entries at levels 2-5 in memory. If there is a seg fault or assertion failure, all entries will be dumped to the log.
  - type: int
  - default: 500
  - min: 1
- log_to_file
  Determines if logging messages should appear in a file.
  - type: bool
  - default: true
  - see also:
- log_to_stderr
  Determines if logging messages should appear in stderr.
  - type: bool
  - default: true
- err_to_stderr
  Determines if error messages should appear in stderr.
  - type: bool
  - default: false
- log_to_syslog
  Determines if logging messages should appear in syslog.
  - type: bool
  - default: false
- err_to_syslog
  Determines if error messages should appear in syslog.
  - type: bool
  - default: false
- log_flush_on_exit
  Determines if Ceph should flush the log files after exit.
  - type: bool
  - default: false
- clog_to_monitors
  Determines if clog messages should be sent to monitors.
  - type: str
  - default: default=true
- clog_to_syslog
  Determines if clog messages should be sent to syslog.
  - type: str
  - default: false
- mon_cluster_log_to_syslog
  Determines if the cluster log should be output to the syslog.
  - type: str
  - default: default=false
- mon_cluster_log_file
  The locations of the cluster’s log files. There are two channels in Ceph: cluster and audit. This option is a mapping from each channel to the log file to which that channel’s entries are sent. The default entry is a fallback mapping for channels that are not explicitly specified. With the following default setting, the cluster log is sent to $cluster.log and the audit log is sent to $cluster.audit.log, where $cluster is replaced with the actual cluster name.
  - type: str
  - default: default=/var/log/ceph/$cluster.$channel.log cluster=/var/log/ceph/$cluster.log
  - see also: mon_cluster_log_to_file
OSD
- osd_debug_drop_ping_probability
  N/A
  - type: float
  - default: 0.0
- osd_debug_drop_ping_duration
  N/A
  - type: int
  - default: 0
Filestore
- filestore_debug_omap_check
  Debugging check on synchronization. This is an expensive operation.
  - type: bool
  - default: false
MDS
RADOS Gateway