Key Concepts
- Node: A physical or virtual server that participates in the Ceph cluster.
- Maintenance Mode: A state in which a node's Ceph daemons are stopped for servicing without triggering data migration.
- OSD: Object Storage Daemon, the service that stores data on a node's disks.
- Labels: Role tags (mon, osd, mgr, etc.) that determine which Ceph services run on a node.
Required Permissions
| Action | Permission |
|---|---|
| View Nodes | iam:project:infrastructure:ceph:read |
| Add Node | iam:project:infrastructure:ceph:write |
| Remove Node | iam:project:infrastructure:ceph:write |
| Enter Maintenance | iam:project:infrastructure:ceph:execute |
| Exit Maintenance | iam:project:infrastructure:ceph:execute |
Node Status
| Status | Description |
|---|---|
| Online | Node is healthy and participating in the cluster |
| Offline | Node is not responding or disconnected from the cluster |
| Maintenance | Node is in maintenance mode (daemons stopped, no data migration) |
How to View Nodes
Select Cluster
View Node List
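If you also have CLI access, the same inventory can be checked from any admin node. This assumes the cluster is managed by the cephadm orchestrator; the commands below are standard `ceph orch` subcommands:

```shell
# List all hosts known to the orchestrator, with address, labels, and status
ceph orch host ls

# Show all daemons (mon, mgr, osd, ...) and which host each runs on
ceph orch ps
```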
How to View Node Details
Review Dashboard
- Node Status: Current state (online, offline, maintenance)
- OSDs: Number of Object Storage Daemons on this node
- IP Address: Network address of the node
- Services/Roles: Ceph services running on the node
View Hardware Information
- CPU: Model, cores, threads
- Memory: Total and available RAM
- Architecture: CPU architecture type
View System Information
- OS: Operating system and version
- Kernel: Linux kernel version
- Uptime: How long the node has been running
- Security: SELinux/security settings
View Storage
- HDD: Count and total capacity of hard disk drives
- Flash/SSD: Count and total capacity of solid-state drives
- Physical Disks: Detailed list of all attached disks
How to Add Nodes to a Cluster
Adding nodes to an existing Ceph cluster involves OS preparation, SSH trust setup, and registration with the Ceph orchestrator.
Register Node Settings
- Hostname
- IP address
- SSH credentials
- Node role
Start Add Operation
- Pre-addition validation and SSH connectivity check
- Operating system preparation
- Network configuration (/etc/hosts updates)
- SSH trust establishment from primary monitor
- Registration with Ceph orchestrator
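Assuming the cluster uses the cephadm orchestrator, the steps above correspond roughly to this CLI sequence (the hostname `node4` and address `192.168.1.104` are placeholders):

```shell
# Install the cluster's public SSH key on the new node (establishes SSH trust)
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node4

# Register the node with the orchestrator, optionally assigning labels/roles
ceph orch host add node4 192.168.1.104 --labels osd

# Verify the node now appears in the host list
ceph orch host ls
```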
How to Remove Nodes from a Cluster
Removing nodes safely drains services and removes the node from the cluster.
Click Remove Node(s)
Confirm Removal
- Pre-removal validation
- Removal from CRUSH map
- Draining the host (stopping all daemons)
- Removal from orchestrator
- Database record update
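With a cephadm-managed cluster, the same drain-then-remove sequence can be sketched on the CLI (`node4` is a placeholder hostname):

```shell
# Drain the host: marks its OSDs out and schedules removal of all daemons
ceph orch host drain node4

# Watch OSD removal and data migration progress until it completes
ceph orch osd rm status

# Once no daemons remain on the host, remove it from the cluster
ceph orch host rm node4
```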
How to Enter Maintenance Mode
Maintenance mode allows you to perform hardware or software maintenance on a node without affecting cluster health.
Select Enter Maintenance
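Assuming a cephadm-backed cluster, the UI action maps to the orchestrator's maintenance command (`node4` is a placeholder):

```shell
# Stop all Ceph daemons on the node and prevent data migration while it is down
ceph orch host maintenance enter node4

# If pre-checks block the operation and you must proceed anyway (Force Enter Maintenance)
ceph orch host maintenance enter node4 --force
```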
How to Exit Maintenance Mode
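On a cephadm-backed cluster, the equivalent CLI step is (`node4` is a placeholder):

```shell
# Restart the node's daemons and clear the maintenance flags
ceph orch host maintenance exit node4

# Confirm the node reports as online again
ceph orch host ls
```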
Node Configuration Fields
| Field | Description |
|---|---|
| Hostname | The node’s hostname as known to Ceph |
| Address | IP address for cluster communication |
| Status | Current operational state (online, offline, maintenance) |
| Roles | Ceph services/labels assigned to the node (mon, osd, mgr, etc.) |
| OSD Count | Number of OSDs running on this node |
| Ceph Version | Version of Ceph software on the node |
Detail Page Sections
Node Information Card
Shows basic node identity:
- Hostname
- IP address
- Ceph version
- Active services/daemons
Hardware Card
Displays hardware specifications:
- CPU model, core count, thread count
- Total and available memory
- System architecture
- Hardware vendor and model
System Card
Shows operating system details:
- OS distribution and version
- Kernel version
- System uptime
- Security configuration (SELinux)
- FQDN
Storage Card
Summarizes attached storage:
- HDD count and total capacity
- Flash/SSD count and total capacity
Network Interfaces Card
Lists all network interfaces with:
- Interface name
- Operational status
- IPv4 address
- MTU setting
- Interface type
- Network driver
Physical Disks Card
Shows detailed disk information:
- HDD List: All rotational drives with vendor, model, serial, size
- Flash List: All solid-state drives with vendor, model, serial, size
Troubleshooting
Node shows offline status
- Verify the node is powered on and network is connected
- Check SSH connectivity to the node
- Verify firewall rules allow Ceph ports (3300, 6789, 6800-7300)
- Check if Ceph daemons are running on the node
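These checks can be run from another cluster node (the hostname `node4` is a placeholder; the port list comes from this guide, and `nc` must be installed):

```shell
# Basic reachability and SSH connectivity
ping -c 3 node4
ssh root@node4 true

# Check that the main Ceph monitor ports (messenger v2/v1) are reachable
nc -zv node4 3300
nc -zv node4 6789

# On the node itself: check whether Ceph daemons are running
ssh root@node4 "systemctl list-units 'ceph*' --state=running"
```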
Cannot add node to cluster
- Verify SSH credentials are correct
- Ensure the node meets minimum requirements (disk space, memory, CPU)
- Check that the node’s hostname is resolvable
- Verify network connectivity between nodes
- Check operation logs for specific errors
Node removal fails
- Ensure the cluster has enough capacity to handle data redistribution
- Check if the node has the only copy of any data
- Verify the primary monitor is accessible
- Check operation logs for specific errors
Cannot enter maintenance mode
- Check if there are active operations on the node
- Verify cluster health allows maintenance
- Use force maintenance if regular maintenance fails
- Ensure sufficient cluster redundancy
Node stuck in maintenance mode
- Verify the Ceph orchestrator is running
- Check if there are pending operations on the node
- Try the exit maintenance operation again
- Check cluster logs for errors
Hardware information not showing
- The node may not have the facts/inventory data collected
- Verify the Ceph orchestrator has access to the node
- Check if the node has required packages installed
- Try refreshing the page after a few minutes
FAQ
What is the difference between maintenance mode and removing a node?
Maintenance mode temporarily stops the node's daemons without migrating any data, so the node can return to service unchanged. Removing a node permanently drains its services, redistributes its data to the remaining OSDs, and deletes the node from the CRUSH map.
When should I use Force Enter Maintenance?
Use Force Enter Maintenance only when:
- Regular maintenance mode fails
- The node is having issues preventing graceful shutdown
- You need to take the node offline urgently
How long does adding a node take?
The duration depends on:
- Network speed between nodes
- OS preparation requirements
- Number of existing nodes in the cluster
Can I remove multiple nodes at once?
What are node labels/roles used for?
Labels tell the Ceph orchestrator which services to deploy on a node:
- mon: Monitor daemon
- osd: Object Storage Daemon
- mgr: Manager daemon
- mds: Metadata Server (for CephFS)
- rgw: RADOS Gateway (for object storage)
What happens to data when a node is removed?
- OSDs on the node are marked out
- Data is redistributed to remaining OSDs
- The node is removed from CRUSH map
- The cluster rebalances to maintain redundancy
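Rebalancing progress after a removal can be followed with standard status commands:

```shell
# Overall cluster health; shows recovery/backfill progress while data moves
ceph status

# Per-OSD utilization: confirms data has spread across the remaining OSDs
ceph osd df
```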
Why does the node show 0 OSDs?
A node can show 0 OSDs because:
- The node only runs monitor or manager services
- Disks haven’t been added as OSDs yet
- The node was recently added and OSDs haven’t been deployed
Can I add a node that was previously removed?
Yes. Re-adding a previously removed node involves:
- Ensuring the node entry exists in the system
- Using the Add Node operation
- Optionally adding OSDs back if needed