The Events page provides centralized access to Ceph cluster logs, audit trails, and Prometheus alerts. Monitor cluster activity, track security-relevant events, and respond to alerts from a single interface.

Key Concepts

Cluster Logs

Operational messages from Ceph daemons including health changes, OSD events, and monitor activity.

Audit Logs

Security-relevant events tracking who performed what actions on the cluster.

Alerts

Prometheus alerts triggered by threshold violations or abnormal conditions.

Severity Levels

Classification of events by importance: Critical, Error, Warning, Info, Debug.
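As a rough illustration, the five levels can be treated as an ordered scale so that events sort with the most important first. The sketch below assumes the order in which the levels are listed (Critical highest, Debug lowest) is the ranking; the `Severity` enum and both helper functions are hypothetical names, not part of the product.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Assumed ranking: a higher value means a more important event."""
    DEBUG = 0
    INFO = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4


def parse_severity(label: str) -> Severity:
    """Map a label such as 'Warning' or 'ERROR' to a Severity member."""
    return Severity[label.strip().upper()]


def most_severe_first(labels):
    """Return the labels sorted from most to least severe."""
    return sorted(labels, key=parse_severity, reverse=True)
```

For example, `most_severe_first(["info", "Critical", "warning"])` puts `"Critical"` first and `"info"` last.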

Required Permissions

Action | Permission
View Cluster Logs | iam:project:infrastructure:ceph:logs
View Audit Logs | iam:project:infrastructure:ceph:logs
View Alerts | iam:project:infrastructure:ceph:read
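The table can be read as a lookup from tab to permission string. The sketch below is only an illustration of that mapping: the permission strings come from the table, while the tab keys, `can_view`, `visible_tabs`, and the idea that a user's grants are available as a set of strings are assumptions. How the platform actually evaluates permissions is not described here.

```python
# Permission strings are copied from the table above; the rest is illustrative.
REQUIRED_PERMISSIONS = {
    "cluster_logs": "iam:project:infrastructure:ceph:logs",
    "audit_logs": "iam:project:infrastructure:ceph:logs",
    "alerts": "iam:project:infrastructure:ceph:read",
}


def can_view(tab: str, granted: set) -> bool:
    """Return True if `granted` contains the permission required for `tab`."""
    return REQUIRED_PERMISSIONS[tab] in granted


def visible_tabs(granted: set) -> list:
    """List the tabs a user with the given permissions could open."""
    return [tab for tab in REQUIRED_PERMISSIONS if can_view(tab, granted)]
```

Note that a user holding only the `read` permission would see alerts but neither log tab, since both log views require the `logs` permission.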

Event Types

Cluster Logs

Operational logs from Ceph components:
Level | Description
ERROR | Critical issues requiring immediate attention
WARNING | Potential problems or degraded conditions
INFO | Normal operational messages
DEBUG | Detailed diagnostic information

Audit Logs

Security and compliance logs tracking:
  • User authentication events
  • Configuration changes
  • Administrative operations
  • Access control modifications

Prometheus Alerts

Active alerts from the monitoring system:
Severity | Description
CRITICAL | Severe issues requiring immediate action
WARNING | Conditions that may become critical if not addressed
INFO | Informational alerts for awareness

State | Description
active | Alert is currently firing
suppressed | Alert is silenced or inhibited

How to View Events

1. Select Cluster

Choose a Ceph cluster from the cluster dropdown. Only ready (bootstrapped) clusters will show events.
2. Select Event Type

Choose a tab to view specific event types:
  • Cluster Logs: Operational messages from Ceph daemons
  • Audit Logs: Security and compliance events
  • Alerts: Active Prometheus alerts
3. Review Statistics

The summary cards show:
  • Total Events/Alerts: Count of all events in the current view
  • Critical/Errors: Critical alerts or error logs (pulsing if > 0)
  • Warnings: Warning-level events (pulsing if > 0)
  • Info: Informational events
  • Status: Overall status based on severity counts
4. Filter Events

Use filters to focus on specific events:
  • Level/Severity Filter: Show only specific severity levels
  • Search: Find events by message, source, or alert name

How to View Cluster Logs

1. Select Cluster Logs Tab

Click the Cluster Logs tab to view operational messages.
2. Review Log Entries

Each log entry shows:
  • Timestamp: When the event occurred
  • Level: Severity (ERROR, WARNING, INFO, DEBUG)
  • Source: Component that generated the log
  • Message: Detailed event description
3. Filter by Level

Click the level filter to show only specific severity levels. Multiple levels can be selected.
4. Search Logs

Use the search box to find logs by message content, source name, or level.
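The level filter and search box described above can be sketched as a single function. This is an approximation of the behavior, not the page's implementation: `LogEntry`, `filter_logs`, and the case-insensitive substring matching are assumptions.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    timestamp: str
    level: str      # ERROR, WARNING, INFO, or DEBUG
    source: str     # component that generated the log
    message: str


def filter_logs(entries, levels=None, query=""):
    """Keep entries whose level is selected and whose text matches the query.

    `levels` may hold several levels (multi-select); None means all levels.
    The query is matched against message, source, and level.
    """
    wanted = {lvl.upper() for lvl in levels} if levels else None
    needle = query.strip().lower()
    matches = []
    for entry in entries:
        if wanted is not None and entry.level.upper() not in wanted:
            continue
        searchable = (entry.message, entry.source, entry.level)
        if needle and not any(needle in text.lower() for text in searchable):
            continue
        matches.append(entry)
    return matches
```

Calling `filter_logs(entries, levels=["ERROR", "WARNING"], query="osd")` would, under these assumptions, return only error and warning entries that mention `osd` in one of the three searched fields.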

How to View Audit Logs

1. Select Audit Logs Tab

Click the Audit Logs tab to view security events.
2. Review Audit Entries

Audit logs capture:
  • Administrative commands executed
  • Configuration changes
  • User activities
  • Access control events
3. Filter and Search

Use the same filtering and search capabilities as cluster logs.

Audit logs are essential for compliance and security investigations. Review them regularly to detect unauthorized activities.

How to View Alerts

1. Select Alerts Tab

Click the Alerts tab to view active Prometheus alerts.
2. Review Alert Details

Each alert shows:
  • Started: When the alert began firing
  • Severity: Critical, Warning, or Info
  • Alert: Name of the alert rule
  • Summary: Brief description of the issue
  • Description: Detailed explanation
  • State: Active or Suppressed
3. Filter by Severity

Click the severity filter to show only specific alert levels.
4. Search Alerts

Use the search box to find alerts by name, summary, or description.
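Alert filtering follows the same pattern as log filtering, except that search covers the alert name, summary, and description. The sketch below is illustrative only: the `Alert` dataclass and `filter_alerts` are hypothetical names, and case-insensitive substring matching is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    started: str
    severity: str     # CRITICAL, WARNING, or INFO
    name: str         # name of the alert rule
    summary: str
    description: str
    state: str        # active or suppressed


def filter_alerts(alerts, severities=None, query=""):
    """Keep alerts with a selected severity whose name, summary,
    or description contains the query (case-insensitive)."""
    wanted = {s.upper() for s in severities} if severities else None
    needle = query.strip().lower()
    matches = []
    for alert in alerts:
        if wanted is not None and alert.severity.upper() not in wanted:
            continue
        searchable = (alert.name, alert.summary, alert.description)
        if needle and not any(needle in text.lower() for text in searchable):
            continue
        matches.append(alert)
    return matches
```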

Statistics Cards

Total Events/Alerts

The total count of events in the currently selected view (Cluster Logs, Audit Logs, or Alerts).

Critical/Errors

Count of critical alerts or error-level logs. A pulsing indicator appears when this count is greater than zero, indicating issues requiring attention.

Warnings

Count of warning-level events. A pulsing indicator appears when warnings exist.

Info

Count of informational events for general awareness.

Status

Overall status based on event counts:
  • Critical: Red indicator when critical/error events exist
  • Warning: Amber indicator when warnings exist (no criticals)
  • Healthy: Green indicator when no warnings or errors
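The card logic described in this section can be summarized in a few lines. This is a minimal sketch under the rules stated above (critical and error events share one card, the two pulsing cards pulse when their count is above zero, and status is decided by the highest severity present); `summarize` and the dictionary keys are hypothetical.

```python
from collections import Counter


def summarize(severities):
    """Compute the summary-card values from a sequence of severity labels."""
    labels = [s.upper() for s in severities]
    counts = Counter(labels)
    critical_errors = counts["CRITICAL"] + counts["ERROR"]
    warnings = counts["WARNING"]

    if critical_errors > 0:
        status = "Critical"   # red indicator
    elif warnings > 0:
        status = "Warning"    # amber indicator
    else:
        status = "Healthy"    # green indicator

    return {
        "total": len(labels),
        "critical_errors": critical_errors,
        "warnings": warnings,
        "info": counts["INFO"],
        "status": status,
        "pulse_critical": critical_errors > 0,
        "pulse_warnings": warnings > 0,
    }
```

With this logic, a view containing only warning and info events reports `Warning`, and an empty view reports `Healthy`.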

Log Entry Fields

Cluster and Audit Logs

Field | Description
Timestamp | Date and time the event occurred
Level | Severity: ERROR, WARNING, INFO, DEBUG
Source | Ceph component that generated the event
Message | Detailed description of the event

Alerts

Field | Description
Started | When the alert started firing
Severity | Alert importance: CRITICAL, WARNING, INFO
Alert | Name of the Prometheus alert rule
Summary | Brief description of the condition
Description | Detailed explanation and context
State | Current alert state: active, suppressed
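For readers who query alerts directly, the fields above line up with the alert objects returned by the Alertmanager v2 API (`GET /api/v2/alerts`). The sketch below assumes that source and the common convention of a `severity` label plus `summary` and `description` annotations; whether this page reads alerts the same way is not stated, and `to_row` is a hypothetical helper.

```python
def to_row(alert: dict) -> dict:
    """Map one Alertmanager v2 alert object to the fields listed above.

    Missing keys become empty strings rather than raising an error.
    """
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    status = alert.get("status", {})
    return {
        "Started": alert.get("startsAt", ""),            # RFC 3339 timestamp
        "Severity": labels.get("severity", "").upper(),   # assumed label name
        "Alert": labels.get("alertname", ""),
        "Summary": annotations.get("summary", ""),
        "Description": annotations.get("description", ""),
        "State": status.get("state", ""),                 # e.g. active, suppressed
    }
```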

Common Alert Types

Alert | Severity | Description
CephHealthError | Critical | Cluster health is HEALTH_ERR
CephHealthWarning | Warning | Cluster health is HEALTH_WARN
CephOSDDown | Warning | One or more OSDs are down
CephOSDNearFull | Warning | OSD approaching full capacity
CephOSDFull | Critical | OSD is full, writes blocked
CephPGNotScrubbed | Warning | PGs haven’t been scrubbed recently
CephMonQuorumAtRisk | Warning | Monitor quorum may be lost
CephMgrModuleCrash | Warning | Manager module has crashed

Troubleshooting

No events appear

  • Verify the cluster is bootstrapped and ready
  • Check that logging is enabled in the Ceph configuration
  • Ensure you have the required permissions
  • Try refreshing the page

Audit logs are empty

  • Confirm that audit logging is enabled in the Ceph configuration
  • Check if mgr/dashboard/AUDIT_API_ENABLED is true
  • Verify the cluster dashboard is accessible

Alerts are not showing

  • Confirm that Prometheus is deployed in the cluster
  • Confirm that Alertmanager is configured
  • Check Prometheus connectivity
  • Verify alert rules are defined

Too many events to review

  • Use level/severity filters to focus on important events
  • Search for specific keywords
  • Review critical and error events first
  • Consider setting up automated alerting

An alert does not clear

  • The underlying condition may still exist
  • Check cluster health on the Clusters page
  • Investigate the specific issue mentioned in the alert
  • Alerts resolve automatically when conditions clear

Timestamps or event order look wrong

  • Check time synchronization (NTP) on cluster nodes
  • Clock skew can affect event ordering
  • Verify system timezone settings
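Several of the checks above can also be made from a host with the `ceph` CLI and an admin keyring. The sketch below is one way to script them, assuming such a host is available; the commands are standard Ceph CLI commands, while `DIAGNOSTICS` and `run_diagnostics` are hypothetical names and the selection of commands is only a suggestion.

```python
import shutil
import subprocess

# Each entry maps a check from the list above to a Ceph CLI command.
DIAGNOSTICS = {
    "cluster health": ["ceph", "health", "detail"],
    "audit API setting": ["ceph", "dashboard", "get-audit-api-enabled"],
    "recent cluster log": ["ceph", "log", "last", "20"],
    "time sync": ["ceph", "time-sync-status"],
}


def run_diagnostics(timeout: int = 30) -> dict:
    """Run each command and return its output (or an error note) by name."""
    if shutil.which("ceph") is None:
        return {name: "ceph CLI not found on this host" for name in DIAGNOSTICS}
    results = {}
    for name, argv in DIAGNOSTICS.items():
        try:
            proc = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
            results[name] = proc.stdout.strip() or proc.stderr.strip()
        except (subprocess.TimeoutExpired, OSError) as exc:
            results[name] = f"failed: {exc}"
    return results
```

If the audit API setting is reported as disabled, `ceph dashboard set-audit-api-enabled true` turns it on.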

FAQ

What is the difference between cluster logs and audit logs?

Cluster Logs: Operational messages about cluster health, daemon status, and system events. Used for troubleshooting and monitoring.

Audit Logs: Security-focused logs tracking administrative actions, configuration changes, and user activities. Used for compliance and security audits.

How long are logs retained?

Log retention depends on your Ceph configuration:
  • Default retention varies by Ceph version
  • The dashboard shows recent logs from the cluster
  • For long-term retention, export logs to external systems

Can I export events?

Currently, events cannot be exported directly from this interface. For log aggregation:
  • Configure rsyslog or journald forwarding
  • Use Loki or similar log aggregation systems
  • Set up Prometheus remote write for alerts

What are suppressed alerts?

Suppressed alerts are alerts that have been:
  • Silenced by an administrator
  • Inhibited by another alert rule
  • Temporarily muted during maintenance
The underlying condition still exists, but notifications are suspended.

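When alerts are read from the Alertmanager v2 API, each alert's `status` object reports why it is suppressed through its `silencedBy` and `inhibitedBy` lists. The sketch below assumes that data source; `suppression_reason` is a hypothetical helper, and this page may determine the state differently.

```python
def suppression_reason(alert: dict) -> str:
    """Explain why an Alertmanager v2 alert object is suppressed."""
    status = alert.get("status", {})
    if status.get("state") != "suppressed":
        return "not suppressed"
    reasons = []
    if status.get("silencedBy"):    # IDs of matching silences
        reasons.append("silenced")
    if status.get("inhibitedBy"):   # fingerprints of inhibiting alerts
        reasons.append("inhibited")
    return " and ".join(reasons) or "suppressed (reason not reported)"
```

A maintenance mute is typically implemented as a silence, so it would show up here as `silenced`.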
How quickly do events appear?

  • Cluster logs appear in near real-time
  • Audit logs appear as actions are performed
  • Alerts appear when Prometheus evaluates rules (typically every 15-60 seconds)
Click refresh to get the latest events.

Why do some events have no source?

Some events may not have a source identified when:
  • The source daemon information is unavailable
  • The event came from a system-level process
  • The log format doesn’t include source details

How do I silence or acknowledge alerts?

Alert management (silencing, acknowledgment) is handled through:
  • Prometheus Alertmanager directly
  • Ceph Dashboard’s native alert management
  • Integration with incident management systems

What does the pulsing animation mean?

The pulsing animation indicates active issues:
  • Critical/Errors card: Pulses when count > 0
  • Warnings card: Pulses when count > 0
This draws attention to events requiring review.