Key Concepts
Rolling Upgrade
Ceph upgrades one daemon at a time, maintaining cluster availability throughout the process.
Target Version
The Ceph version or container image you are upgrading to.
Daemon
A Ceph service process (mon, mgr, osd, mds, rgw) that gets upgraded individually.
Upgrade Blockers
Conditions that prevent an upgrade from starting, such as cluster health errors.
Required Permissions
| Action | Permission |
|---|---|
| View Upgrade Status | iam:project:infrastructure:ceph:read |
| Start Upgrade | iam:project:infrastructure:ceph:write |
| Pause Upgrade | iam:project:infrastructure:ceph:write |
| Resume Upgrade | iam:project:infrastructure:ceph:write |
| Stop Upgrade | iam:project:infrastructure:ceph:write |
| View History | iam:project:infrastructure:ceph:read |
Upgrade Status
| Status | Description |
|---|---|
| Up to Date | Cluster is running the latest available version |
| Available | Newer version is available for upgrade |
| In Progress | Upgrade is currently running |
| Paused | Upgrade is paused and can be resumed |
How to Check Available Upgrades
View Current Status
Review the statistics cards showing:
- Current Version: The version currently running
- Target: Available upgrade target or “Up to Date”
- Progress: Percentage complete (if upgrading)
- Daemons: Number of daemons in the cluster
- Health: Current cluster health status
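The same information is available from the command line. A minimal sketch, assuming a cephadm-managed cluster (these are standard Ceph commands, not specific to this UI):

```bash
# Current cluster version as reported by the monitors
ceph version

# Cluster health, daemon counts, and overall status at a glance
ceph -s
```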
How to Start an Upgrade
Pre-flight Check
The wizard checks cluster health and readiness:
- Cluster health status (HEALTH_OK, HEALTH_WARN, HEALTH_ERR)
- Critical blockers are highlighted
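You can run the same pre-flight checks by hand. A sketch, assuming cephadm manages the cluster; the image URL is a placeholder:

```bash
# Show any health warnings or errors that could block the upgrade
ceph health detail

# Ask cephadm to verify that every host can pull and run the target image
ceph orch upgrade check --image quay.io/ceph/ceph:v18.2.4
```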
Select Version
Choose your upgrade target:
- Available Versions: Pre-defined versions from the registry
- Custom Image: Specify a container image URL directly
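If you prefer the CLI on a cephadm-managed cluster, a registry-listed version can be selected by version number; the version shown is an example:

```bash
# Start a rolling upgrade to a specific Ceph release
ceph orch upgrade start --ceph-version 18.2.4
```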
How to Use Custom Image Upgrades
Custom image upgrades allow you to specify an exact container image URL, useful for:
- Testing pre-release versions
- Using custom-built images
- Air-gapped environments with private registries
Enter Image URL
Enter the full container image URL, for example:
- quay.io/ceph/ceph:v18.2.4
- registry.example.com/ceph/ceph:v18.2.4
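The cephadm CLI equivalent accepts the image URL directly; a sketch using the example registry above:

```bash
# Start a rolling upgrade from an explicit container image
ceph orch upgrade start --image registry.example.com/ceph/ceph:v18.2.4
```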
How to Pause an Upgrade
Pausing stops the upgrade at the current daemon while maintaining cluster stability.
Click Pause
Click Pause to stop the upgrade process. The current daemon will complete before pausing.
Pausing is useful when you need to investigate issues or perform other maintenance. The cluster remains operational during the pause.
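If you are working from a shell on a cephadm-managed cluster, the equivalent operation is:

```bash
# Pause the running upgrade; the daemon currently being upgraded finishes first
ceph orch upgrade pause
```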
How to Resume an Upgrade
Click Resume to continue a paused upgrade. The process picks up with the next daemon in the sequence.
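The cephadm CLI equivalent:

```bash
# Resume a paused upgrade from where it left off
ceph orch upgrade resume
```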
How to Stop an Upgrade
Stopping an upgrade cancels the process entirely. Daemons that have already been upgraded keep the new version; the rest remain on the old version.
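The cephadm CLI equivalent:

```bash
# Cancel the upgrade entirely; already-upgraded daemons keep the new version
ceph orch upgrade stop
```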
How to View Upgrade History
Review Past Upgrades
The history table shows:
- From/To versions
- Status (success, failed, cancelled)
- Start and completion times
- Duration
- Number of daemons upgraded
Statistics Cards
Current Version
The Ceph version currently running on the cluster (e.g., 18.2.2 reef).
Target/Available
Shows either:
- Up to Date: No newer version available
- Available: A newer version is available to upgrade to
- Target: The version being upgraded to (during upgrade)
Progress
Percentage of daemons that have been upgraded. Only shown during active upgrades.
Daemons
Total number of Ceph daemons that will be upgraded.
Health
Current cluster health status:
- HEALTH_OK: All healthy
- HEALTH_WARN: Warnings present (upgrade can proceed with caution)
- HEALTH_ERR: Critical issues (upgrade blocked)
Upgrade Order
Ceph upgrades daemons in a specific order to maintain cluster stability:
- Managers (mgr) - Updated first as they coordinate the upgrade
- Monitors (mon) - Quorum is maintained throughout
- OSDs - Updated one at a time to preserve data availability
- Metadata Servers (mds) - For CephFS clusters
- RADOS Gateways (rgw) - For object storage clusters
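On a cephadm-managed cluster you can watch this sequence as it runs; a sketch:

```bash
# Show the target image, progress percentage, and which services are done
ceph orch upgrade status

# Optionally follow the orchestrator's log as daemons are redeployed
ceph -W cephadm
```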
Troubleshooting
Upgrade shows 'Upgrade Blocked'
- Check cluster health with `ceph health detail`
- Resolve any HEALTH_ERR conditions
- Ensure all OSDs are up and in
- Verify no other operations are in progress
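A few standard commands cover the OSD checks; a sketch:

```bash
# Confirm every OSD is both up and in
ceph osd stat

# Locate any OSDs that are currently down
ceph osd tree | grep down
```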
Upgrade is very slow
- Each daemon restart takes time
- Large OSDs may take longer to restart
- Check for recovery/backfill operations
- Network bandwidth affects image download speed
Upgrade stuck on a daemon
- Check the daemon’s host is accessible
- Verify the container image is available
- Review daemon logs on the host
- Consider pausing and investigating
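A sketch of the investigation on a cephadm-managed cluster; `<daemon-name>` is a placeholder such as `osd.12`:

```bash
# Check which host runs the stuck daemon and its current state
ceph orch ps

# On that host, inspect the daemon's container logs
cephadm logs --name <daemon-name>
```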
Cluster unhealthy after partial upgrade
- Mixed versions are supported but not ideal
- Resume the upgrade to complete it
- If issues persist, check version compatibility
- Contact support for rollback procedures
Custom image not pulling
- Verify the image URL is correct
- Check registry authentication
- Ensure nodes can reach the registry
- Test with `podman pull <image>` on a node, as in the sketch below
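A quick way to test pulls by hand, assuming podman and a private registry that requires credentials; the registry and image names are placeholders:

```bash
# Authenticate against the private registry, if it requires credentials
podman login registry.example.com

# Try pulling the exact image the upgrade is configured to use
podman pull registry.example.com/ceph/ceph:v18.2.4
```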
Cannot find available versions
- Versions are managed by the system administrator
- Check if versions are marked as active in settings
- Verify your cluster’s current version name matches version groups
FAQ
How long does an upgrade take?
Upgrade duration depends on:
- Number of daemons in the cluster
- OSD sizes (larger OSDs take longer to restart)
- Network speed for image downloads
- Any recovery operations that occur
Is the cluster available during upgrade?
Yes. Ceph’s rolling upgrade process maintains cluster availability:
- Only one daemon is upgraded at a time
- Monitor quorum is preserved
- Data redundancy protects against OSD restarts
What if I need to roll back?
Ceph doesn’t support automatic rollback. Options include:
- Restore from backup (if available)
- Manually reinstall previous version (complex)
- Complete the upgrade and address issues
Can I skip versions?
Generally, you should upgrade to the next major version only. For example:
- Quincy → Reef (supported)
- Pacific → Reef (not recommended, upgrade to Quincy first)
What is the recommended version?
The recommended version is determined by:
- Latest stable release in your version series
- System administrator configuration
- Known compatibility with your environment
Should I upgrade if cluster shows HEALTH_WARN?
It depends on the warning:
- Minor warnings (clock skew, nearfull): Usually safe to proceed
- Data-related warnings (degraded PGs): Resolve first
- OSD warnings (down OSDs): Fix before upgrading
What happens if power fails during upgrade?
Ceph is resilient to interruptions:
- Completed daemons retain their new version
- In-progress daemon may need manual recovery
- The upgrade can be resumed once power is restored
- Data is protected by replication
How do I know the upgrade completed successfully?
Signs of successful upgrade:
- Progress shows 100%
- Current Version matches Target Version
- Cluster health returns to HEALTH_OK
- All daemons show the new version
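A sketch of verifying the last two points from the CLI:

```bash
# Every daemon should report the same, new version
ceph versions

# Health should be back to HEALTH_OK
ceph health
```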