Key Concepts
OSD
Object Storage Daemon - a service that stores data on a physical disk and handles replication.
Device Class
The type of storage device: HDD (rotational), SSD (solid-state), or NVMe (high-speed SSD).
Up/Down
Whether the OSD process is running (Up) or stopped (Down).
In/Out
Whether the OSD is participating in data placement (In) or excluded from it (Out).
Required Permissions
| Action | Permission |
|---|---|
| View OSDs | iam:project:infrastructure:ceph:read |
| Add OSD | iam:project:infrastructure:ceph:write |
| Mark In/Out | iam:project:infrastructure:ceph:write |
| Reweight OSD | iam:project:infrastructure:ceph:write |
| Scrub OSD | iam:project:infrastructure:ceph:execute |
| Remove OSD | iam:project:infrastructure:ceph:delete |
OSD Status
Up/Down Status
| Status | Description |
|---|---|
| Up | OSD daemon is running and responsive |
| Down | OSD daemon is stopped or not responding |
In/Out Status
| Status | Description |
|---|---|
| In | OSD participates in data placement and receives data |
| Out | OSD is excluded from data placement; data migrates away |
An OSD can be Up but Out - this means it’s running but not receiving new data. This is commonly used during maintenance or before removal.
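If you also have shell access to the cluster, the OSD tree shows these states together. A reference sketch; the exact columns vary by Ceph release:

```bash
# List all OSDs with their up/down status, reweight, host, and CRUSH weight
ceph osd tree

# Show only OSDs currently marked down
ceph osd tree down
```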
Device Classes
| Class | Description |
|---|---|
| HDD | Traditional rotational hard disk drive |
| SSD | Solid-state drive with faster random I/O |
| NVMe | High-performance NVMe solid-state drive |
How to View OSDs
Select Cluster
Choose a Ceph cluster from the cluster dropdown. The first ready cluster is selected by default.
View OSD List
The table shows all OSDs with their status, host, device class, placement groups, and utilization.
Filter and Search
Use the search box to find OSDs by ID, hostname, or device class. Filter by status (Up, Down, In, Out).
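The same information is available from a cluster shell if you prefer the CLI; a sketch, with output formatting depending on your Ceph version:

```bash
# Per-OSD capacity, utilization, and placement group counts
ceph osd df

# The same data grouped by the CRUSH hierarchy (host, rack, ...)
ceph osd df tree
```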
How to Add an OSD
Adding an OSD creates a new storage daemon on an available disk.
Select Disks
Check the disks you want to use as OSDs. Each disk shows:
- Device path (e.g., `/dev/sdb`)
- Host where the disk is located
- Disk size
- Device type (HDD, SSD, NVMe)
Only available disks that don’t already have OSDs are shown. If no disks appear, all disks may already be in use or there may be no OSD-role nodes in the cluster.
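If the UI shows no candidate disks, you can cross-check what the orchestrator sees from a cluster shell. This is a minimal sketch; `host1` and `/dev/sdb` are placeholder values, and the UI normally performs the equivalent of the add command for you.

```bash
# List devices the orchestrator considers available for new OSDs
ceph orch device ls --wide

# Manually create an OSD on a specific disk (roughly what selecting a disk in the UI does)
ceph orch daemon add osd host1:/dev/sdb
```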
How to Mark an OSD Out
Marking an OSD “out” removes it from data placement, causing data to migrate to other OSDs. Mark Out is the first step in safely removing an OSD. It triggers data migration without stopping the OSD daemon.
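If you also work from a cluster shell, the CLI equivalent looks like this (the OSD ID is an example):

```bash
# Exclude osd.5 from data placement; data begins migrating to the remaining OSDs
ceph osd out 5
```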
How to Mark an OSD In
Marking an OSD “in” adds it back to data placement.
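The CLI equivalent (again with an example OSD ID):

```bash
# Return osd.5 to data placement; the cluster rebalances data back onto it
ceph osd in 5
```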
How to Scrub an OSD
Scrubbing verifies data integrity by comparing object replicas across OSDs.
Select Scrub Type
Choose either:
- Scrub: Light verification of object metadata
- Deep Scrub: Full verification including data checksums
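From a cluster shell, the same two operations can be triggered per OSD (the ID is an example):

```bash
# Light scrub: verify object metadata on osd.3
ceph osd scrub 3

# Deep scrub: also read and checksum the object data (I/O intensive)
ceph osd deep-scrub 3
```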
How to Remove an OSD
Removing an OSD is a multi-step process to ensure data safety. A guided wizard walks you through the process.
Safety Check
The system checks if removal is safe:
- Safe to Destroy: All data has sufficient replicas elsewhere
- Not Safe: Removal would cause data loss (requires force removal)
Data Migration
Click Start Migration to mark OSDs out and begin data migration. Wait for all placement groups to reach the `active+clean` state.
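To confirm from a cluster shell that migration has finished, a quick sketch:

```bash
# Summary of placement group states; every PG should report active+clean
ceph pg stat

# Broader view, including recovery and backfill progress
ceph -s
```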
OSD Table Fields
| Field | Description |
|---|---|
| OSD | OSD identifier (e.g., osd.0, osd.1) |
| Host | Node where the OSD is running |
| Class | Device class (HDD, SSD, NVMe) |
| Status | Up/Down and In/Out status badges |
| PGs | Number of placement groups on this OSD |
| Size | Total capacity of the OSD |
| Usage | Current utilization percentage with visual bar |
Removal Wizard Steps
Step 1: Pre-flight Check
Shows selected OSDs with:
- OSD ID and hostname
- Device class
- Data to migrate
- Placement group count
Step 2: Safety Check
Verifies if OSDs can be safely destroyed:
- Checks replica counts for all affected placement groups
- Shows “Safe to Destroy” or “Not Safe to Destroy”
- Option to force removal if unsafe (not recommended)
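This step corresponds to Ceph's own safe-to-destroy test, which you can also run manually as a cross-check (the OSD IDs are examples):

```bash
# Succeeds only if destroying these OSDs would not reduce data durability
ceph osd safe-to-destroy 5 7
```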
Step 3: Data Migration
- Marks OSDs as “out”
- Monitors data migration progress
- Shows when all placement groups are active+clean
- Option to skip waiting (may cause data loss)
Step 4: Confirm Removal
- Requires typing REMOVE to confirm
- Option to force removal (skips safety checks)
- Executes OSD deletion
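For reference, the manual CLI counterpart of this deletion step is roughly the following; the wizard performs it for you, and the OSD ID is an example:

```bash
# Remove osd.5 from the CRUSH map, delete its auth key, and remove it from the cluster
ceph osd purge 5 --yes-i-really-mean-it
```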
Step 5: Cleanup
- Shows completion status
- Provides zap commands for disk cleanup
- Commands can be copied to clipboard
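The zap commands provided by the wizard look roughly like this (`host1` and `/dev/sdb` are placeholders):

```bash
# Wipe Ceph metadata, partition tables, and LVM state so the disk can be reused
ceph orch device zap host1 /dev/sdb --force
```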
Troubleshooting
OSD shows Down status
- Check if the OSD host is accessible
- Verify the OSD daemon is running: `systemctl status ceph-osd@<id>`
- Check for disk failures or hardware issues
- Review OSD logs for errors
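Some useful shell checks when an OSD is down; a sketch that assumes a non-containerized deployment (cephadm-managed clusters use per-cluster systemd unit names instead), with osd.5 as an example:

```bash
# Locate the OSD: which host is it on?
ceph osd find 5

# On that host: is the daemon running?
systemctl status ceph-osd@5

# Recent log output for the daemon
journalctl -u ceph-osd@5 --since "1 hour ago"
```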
No available disks for new OSD
- All disks may already have OSDs
- Ensure the node has the OSD role assigned
- Check if disks are properly detected by the system
- Verify disks aren’t mounted or in use by other services
OSD utilization is very high
- Consider adding more OSDs to the cluster
- Check if other OSDs are down or out
- Verify CRUSH rules are distributing data evenly
- Consider reweighting OSDs to balance load
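If a few OSDs are much fuller than the rest, Ceph can adjust reweights automatically based on utilization. A hedged sketch; run the dry-run variant first:

```bash
# Show what would change without applying anything
ceph osd test-reweight-by-utilization

# Reweight OSDs above 120% of average utilization (the default threshold)
ceph osd reweight-by-utilization
```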
Data migration is slow
- This is normal for large amounts of data
- Check network bandwidth between nodes
- Verify no backfill/recovery throttling is set too low
- Monitor `ceph status` for recovery progress
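Backfill and recovery throttles can be inspected and, cautiously, raised from a cluster shell. A sketch; note that on recent releases the mClock scheduler may override these values unless explicitly allowed:

```bash
# Current throttle values
ceph config get osd osd_max_backfills
ceph config get osd osd_recovery_max_active

# Allow more concurrent backfill operations per OSD (trades client I/O for faster migration)
ceph config set osd osd_max_backfills 2
```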
Cannot remove OSD - not safe to destroy
- Some placement groups don’t have enough replicas
- Wait for recovery to complete
- Check if other OSDs are down
- Use force removal only if you accept potential data loss
OSD removal stuck
- Check cluster health for blocking issues
- Verify network connectivity
- Check if the OSD daemon is still running
- Review operation logs for specific errors
Scrub taking too long
- Deep scrub is I/O intensive on large OSDs
- Check OSD performance and disk health
- Consider adjusting scrub scheduling options
- Large OSDs with many objects take longer
FAQ
What is the difference between Up/Down and In/Out?
Up/Down indicates whether the OSD daemon process is running. In/Out indicates whether the OSD participates in data placement. An OSD can be:
- Up + In: Normal operation, storing and serving data
- Up + Out: Running but not receiving data (draining)
- Down + In: Not running but expected to return (temporary failure)
- Down + Out: Not running and excluded from placement
When should I mark an OSD out vs remove it?
Mark Out when:
- Performing temporary maintenance
- The OSD will return to service
- You want to drain data without removing the OSD
Remove when:
- Decommissioning a disk permanently
- Replacing failed hardware
- The OSD will not return to service
What happens when I add a new OSD?
When you add an OSD:
- The Ceph orchestrator deploys an OSD daemon on the disk
- The OSD is added to the CRUSH map
- The cluster begins rebalancing data to include the new OSD
- Data gradually distributes across all OSDs
How long does OSD removal take?
Removal time depends on:
- Amount of data on the OSD
- Network speed between nodes
- Number of remaining OSDs
- Current cluster load
What is the 'safe to destroy' check?
This check verifies that removing the OSD won’t cause data loss:
- Checks all placement groups on the OSD
- Ensures each PG has sufficient replicas on other OSDs
- If any PG would lose its last copy, removal is blocked
What are the zap commands for?
After removing an OSD, the disk may still have Ceph metadata. The `ceph orch device zap` command:
- Removes all Ceph data from the disk
- Clears partition tables and LVM data
- Makes the disk available for reuse
Should I use force removal?
Avoid force removal unless you:
- Understand the risk of data loss
- Have backups of critical data
- Are removing already-failed OSDs
- Accept that some data may be lost
What is OSD reweight?
Reweight adjusts how much data an OSD receives (0.0 to 1.0):
- 1.0: Full weight, normal data distribution
- 0.5: Half weight, receives half the normal data
- 0.0: No data (equivalent to marking out)
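From a cluster shell, reweight is set per OSD with a value between 0.0 and 1.0 (the OSD ID is an example):

```bash
# Halve the amount of data placed on osd.5
ceph osd reweight 5 0.5

# Restore full weight
ceph osd reweight 5 1.0
```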
When should I run a deep scrub?
Deep scrub verifies data integrity at the bit level:
- Run periodically for data verification
- After suspected disk issues
- When data corruption is suspected