Events - ShiftLabs

The Events page provides centralized access to Ceph cluster logs, audit trails, and Prometheus alerts. Monitor cluster activity, track security-relevant events, and respond to alerts from a single interface.

Key Concepts

Cluster Logs

Operational messages from Ceph daemons including health changes, OSD events, and monitor activity.

Audit Logs

Security-relevant events tracking who performed what actions on the cluster.

Alerts

Prometheus alerts triggered by threshold violations or abnormal conditions.

Severity Levels

Classification of events by importance: Critical, Error, Warning, Info, Debug.

Required Permissions

Action	Permission
View Cluster Logs	`iam:project:infrastructure:ceph:logs`
View Audit Logs	`iam:project:infrastructure:ceph:logs`
View Alerts	`iam:project:infrastructure:ceph:read`

Event Types

Cluster Logs

Operational logs from Ceph components:

Level	Description
ERROR	Critical issues requiring immediate attention
WARNING	Potential problems or degraded conditions
INFO	Normal operational messages
DEBUG	Detailed diagnostic information

Audit Logs

Security and compliance logs tracking:

User authentication events
Configuration changes
Administrative operations
Access control modifications

Prometheus Alerts

Active alerts from the monitoring system:

Severity	Description
CRITICAL	Severe issues requiring immediate action
WARNING	Conditions that may become critical if not addressed
INFO	Informational alerts for awareness

State	Description
active	Alert is currently firing
suppressed	Alert is silenced or inhibited

How to View Events

Select Cluster

Choose a Ceph cluster from the cluster dropdown. Only ready (bootstrapped) clusters will show events.

Select Event Type

Choose a tab to view specific event types:

Cluster Logs: Operational messages from Ceph daemons
Audit Logs: Security and compliance events
Alerts: Active Prometheus alerts

Review Statistics

The summary cards show:

Total Events/Alerts: Count of all events in the current view
Critical/Errors: Critical alerts or error logs (pulsing if > 0)
Warnings: Warning-level events (pulsing if > 0)
Info: Informational events
Status: Overall status based on severity counts

Filter Events

Use filters to focus on specific events:

Level/Severity Filter: Show only specific severity levels
Search: Find events by message, source, or alert name

How to View Cluster Logs

Select Cluster Logs Tab

Click the Cluster Logs tab to view operational messages.

Review Log Entries

Each log entry shows:

Timestamp: When the event occurred
Level: Severity (ERROR, WARNING, INFO, DEBUG)
Source: Component that generated the log
Message: Detailed event description

Filter by Level

Click the level filter to show only specific severity levels. Multiple levels can be selected.

Search Logs

Use the search box to find logs by message content, source name, or level.
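
The level filter and search box described above can be pictured as a two-stage match over log entries. The following is a minimal sketch, not the Events page's actual implementation: the `LogEntry` type, the `filter_logs` function, and the sample entries are hypothetical, and it assumes search is a case-insensitive substring match over message, source, and level.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical shape of one row: the four fields the page displays."""
    timestamp: datetime
    level: str      # ERROR, WARNING, INFO, or DEBUG
    source: str
    message: str


def filter_logs(entries, levels=None, query=""):
    """Keep entries whose level is selected and whose text contains the query."""
    # No selected levels means "show every level".
    selected = {lvl.upper() for lvl in levels} if levels else None
    needle = query.strip().lower()
    result = []
    for entry in entries:
        if selected is not None and entry.level.upper() not in selected:
            continue
        # Assumed search scope: message content, source name, and level.
        haystack = " ".join((entry.message, entry.source, entry.level)).lower()
        if needle and needle not in haystack:
            continue
        result.append(entry)
    return result


# Illustrative sample data, not real cluster output.
logs = [
    LogEntry(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), "ERROR", "osd.3", "OSD marked down"),
    LogEntry(datetime(2024, 1, 1, 12, 1, tzinfo=timezone.utc), "WARNING", "mon.a", "clock skew detected"),
    LogEntry(datetime(2024, 1, 1, 12, 2, tzinfo=timezone.utc), "INFO", "mgr.x", "pgmap updated"),
]

matches = filter_logs(logs, levels=["ERROR", "WARNING"], query="osd")
print([entry.source for entry in matches])
```

Selecting multiple levels narrows the set first, and the search term then narrows it further, which mirrors combining both controls on the page.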

How to View Audit Logs

Select Audit Logs Tab

Click the Audit Logs tab to view security events.

Review Audit Entries

Audit logs capture:

Administrative commands executed
Configuration changes
User activities
Access control events

Filter and Search

Use the same filtering and search capabilities as cluster logs.

Audit logs are essential for compliance and security investigations. Review them regularly to detect unauthorized activities.

How to View Alerts

Select Alerts Tab

Click the Alerts tab to view active Prometheus alerts.

Review Alert Details

Each alert shows:

Started: When the alert began firing
Severity: Critical, Warning, or Info
Alert: Name of the alert rule
Summary: Brief description of the issue
Description: Detailed explanation
State: Active or Suppressed

Filter by Severity

Click the severity filter to show only specific alert levels.

Search Alerts

Use the search box to find alerts by name, summary, or description.

Statistics Cards

Total Events/Alerts

The total count of events in the currently selected view (Cluster Logs, Audit Logs, or Alerts).

Critical/Errors

Count of critical alerts or error-level logs. A pulsing indicator appears when this count is greater than zero, indicating issues requiring attention.

Warnings

Count of warning-level events. A pulsing indicator appears when warnings exist.

Info

Count of informational events for general awareness.

Status

Overall status based on event counts:

Critical: Red indicator when critical/error events exist
Warning: Amber indicator when warnings exist and no critical/error events exist
Healthy: Green indicator when no warning, critical, or error events exist
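
The status rules above amount to a simple precedence check: critical/error counts win over warnings, and warnings win over healthy. A minimal sketch of that logic follows; the function names `overall_status` and `should_pulse` are hypothetical and chosen only for illustration.

```python
def overall_status(critical_or_errors: int, warnings: int) -> str:
    """Derive the Status card value from the two severity counts."""
    if critical_or_errors > 0:
        return "Critical"   # red indicator
    if warnings > 0:
        return "Warning"    # amber indicator
    return "Healthy"        # green indicator


def should_pulse(count: int) -> bool:
    """The Critical/Errors and Warnings cards pulse when their count is above zero."""
    return count > 0


print(overall_status(critical_or_errors=2, warnings=5))
print(overall_status(critical_or_errors=0, warnings=5))
print(overall_status(critical_or_errors=0, warnings=0))
```

Note that info-level counts do not affect the status in this sketch, consistent with the three rules listed above.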

Log Entry Fields

Cluster and Audit Logs

Field	Description
Timestamp	Date and time the event occurred
Level	Severity: ERROR, WARNING, INFO, DEBUG
Source	Ceph component that generated the event
Message	Detailed description of the event

Alerts

Field	Description
Started	When the alert started firing
Severity	Alert importance: CRITICAL, WARNING, INFO
Alert	Name of the Prometheus alert rule
Summary	Brief description of the condition
Description	Detailed explanation and context
State	Current alert state: active, suppressed
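
To show how these fields relate to an alert record, the sketch below maps one alert in the shape returned by the Alertmanager v2 API (`labels`, `annotations`, `startsAt`, `status.state`) onto the columns in the table. It is an assumption that the Events page reads alerts in this shape; the sample payload and the `to_alert_row` helper are hypothetical and for illustration only.

```python
import json
from datetime import datetime

# Illustrative payload in the Alertmanager v2 alert shape, not real cluster output.
SAMPLE_RESPONSE = """
[
  {
    "labels": {"alertname": "CephOSDDown", "severity": "warning"},
    "annotations": {
      "summary": "An OSD has been marked down",
      "description": "Illustrative description text."
    },
    "startsAt": "2024-01-01T12:00:00Z",
    "status": {"state": "active", "silencedBy": [], "inhibitedBy": []}
  }
]
"""


def to_alert_row(alert: dict) -> dict:
    """Map one alert record onto the columns shown in the Alerts table."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    status = alert.get("status", {})
    return {
        # Replace the trailing "Z" so older Python versions can parse the timestamp.
        "Started": datetime.fromisoformat(alert["startsAt"].replace("Z", "+00:00")),
        "Severity": labels.get("severity", "unknown").upper(),
        "Alert": labels.get("alertname", "unknown"),
        "Summary": annotations.get("summary", ""),
        "Description": annotations.get("description", ""),
        "State": status.get("state", "unknown"),
    }


rows = [to_alert_row(alert) for alert in json.loads(SAMPLE_RESPONSE)]
for row in rows:
    print(row["Started"].isoformat(), row["Severity"], row["Alert"], row["State"])
```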

Common Alert Types

Alert	Severity	Description
CephHealthError	Critical	Cluster health is HEALTH_ERR
CephHealthWarning	Warning	Cluster health is HEALTH_WARN
CephOSDDown	Warning	One or more OSDs are down
CephOSDNearFull	Warning	OSD approaching full capacity
CephOSDFull	Critical	OSD is full, writes blocked
CephPGNotScrubbed	Warning	PGs haven’t been scrubbed recently
CephMonQuorumAtRisk	Warning	Monitor quorum may be lost
CephMgrModuleCrash	Warning	Manager module has crashed

Troubleshooting

No events showing

Verify the cluster is bootstrapped and ready
Check that logging is enabled in Ceph configuration
Ensure you have the required permissions
Try refreshing the page

Missing audit logs

Audit logging must be enabled in Ceph configuration
Check that `mgr/dashboard/AUDIT_API_ENABLED` is set to true
Verify the cluster dashboard is accessible

Alerts not appearing

Prometheus must be deployed in the cluster
Alertmanager must be configured
Check Prometheus connectivity
Verify alert rules are defined

Too many events to review

Use level/severity filters to focus on important events
Search for specific keywords
Review critical and error events first
Consider setting up automated alerting

Alert stuck in active state

The underlying condition may still exist
Check cluster health on the Clusters page
Investigate the specific issue mentioned in the alert
Alerts resolve automatically when conditions clear

Events showing incorrect timestamps

Check time synchronization (NTP) on cluster nodes
Clock skew can affect event ordering
Verify system timezone settings
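
When comparing timestamps collected from nodes with different timezone settings, it can help to normalize them to UTC before ordering them. The sketch below illustrates the idea with two made-up ISO 8601 timestamps; it is not how the Events page processes timestamps, and the variable names are hypothetical.

```python
from datetime import datetime, timezone

# Two illustrative timestamps: the first is written with a +02:00 offset,
# the second in UTC. As plain strings, the second sorts first.
raw = ["2024-01-01T14:00:30+02:00", "2024-01-01T12:00:45+00:00"]

# Parsing keeps the offset, so converting to UTC gives comparable instants.
as_utc = [datetime.fromisoformat(text).astimezone(timezone.utc) for text in raw]

string_order = sorted(raw)
true_order = sorted(raw, key=lambda text: datetime.fromisoformat(text))

print([moment.isoformat() for moment in as_utc])
print(string_order[0])
print(true_order[0])
```

Normalizing display timezones does not correct clock skew between nodes; that still requires working time synchronization.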

FAQ

What's the difference between cluster logs and audit logs?

Cluster Logs: Operational messages about cluster health, daemon status, and system events. Used for troubleshooting and monitoring.
Audit Logs: Security-focused logs tracking administrative actions, configuration changes, and user activities. Used for compliance and security audits.

How long are logs retained?

Log retention depends on your Ceph configuration:

Default retention varies by Ceph version
The dashboard shows recent logs from the cluster
For long-term retention, export logs to external systems

Can I export events for analysis?

Currently, events cannot be directly exported from this interface. For log aggregation:

Configure rsyslog or journald forwarding
Use Loki or similar log aggregation systems
Set up Prometheus remote write for alerts

What does a suppressed alert mean?

Suppressed alerts are:

Silenced by an administrator
Inhibited by another alert rule
Temporarily muted during maintenance

The underlying condition still exists but notifications are suspended.
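
In the Alertmanager v2 API, an alert's `status` object carries `silencedBy` and `inhibitedBy` lists alongside its `state`, which is one way to tell why an alert is suppressed. The sketch below assumes alert records in that shape; the `suppression_reasons` helper and the sample values are hypothetical.

```python
def suppression_reasons(status: dict) -> list:
    """Return why an alert is suppressed, based on its Alertmanager-style status."""
    reasons = []
    if status.get("silencedBy"):
        # One or more silences match this alert.
        reasons.append("silenced")
    if status.get("inhibitedBy"):
        # Another firing alert inhibits this one.
        reasons.append("inhibited")
    return reasons


# Illustrative status objects; the silence ID is a placeholder, not a real value.
silenced = {"state": "suppressed", "silencedBy": ["example-silence-id"], "inhibitedBy": []}
active = {"state": "active", "silencedBy": [], "inhibitedBy": []}

print(suppression_reasons(silenced))
print(suppression_reasons(active))
```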

How quickly do events appear?

Cluster logs appear in near real-time
Audit logs appear as actions are performed
Alerts appear when Prometheus evaluates rules (typically every 15-60 seconds)

Click refresh to get the latest events.

Why are some log sources showing as 'unknown'?

Some events may not have a source identified when:

The source daemon information is unavailable
The event came from a system-level process
The log format doesn’t include source details

How do I acknowledge or silence alerts?

Alert management (silencing, acknowledgment) is handled through:

Prometheus Alertmanager directly
Ceph Dashboard’s native alert management
Integration with incident management systems

What triggers the pulsing indicator on cards?

The pulsing animation indicates active issues:

Critical/Errors card: Pulses when count > 0
Warnings card: Pulses when count > 0

This draws attention to events requiring review.