It is tempting to ignore disk errors when they occur. Your Mac may seem to be working fine, and yet SoftRAID keeps reporting that one of your disks has an error. You should take all disk errors seriously. A disk that has been completely reliable can show one or two errors in the months before it fails catastrophically. These errors are your early warning sign.
This FAQ explains the different types of errors SoftRAID reports, what they mean, and how to determine whether the problem is your hardware or your software.
Types of Errors SoftRAID Reports
I/O Errors
An I/O (Input/Output) error means a communication failure occurred between your Mac and a disk — a read or write operation failed to complete. An I/O error is not automatically a disk failure. It means something went wrong during the operation, and the cause needs investigation.
Common causes:
- The disk hardware is failing or has failed
- A cable, enclosure bridge chip, or Thunderbolt/USB bus issue
- A power supply problem
- Filesystem or directory corruption (often after a kernel panic, forced shutdown, or unexpected ejection)
- The disk was ejected or disconnected during an active I/O operation
SoftRAID displays the I/O error count in both the disk tile and the volume tile. You can clear the counter via Disk menu → Clear I/O Errors to track whether new errors are occurring.
Disk Errors (SMART-Based)
Disk errors are hardware-level conditions reported by the drive itself via SMART (Self-Monitoring, Analysis, and Reporting Technology). Unlike I/O errors — which are communication failures — disk errors are evidence of physical problems on the drive. The most important SMART-based disk errors are:
- Reallocated Sectors: The drive found a bad sector and moved the data to a spare sector elsewhere on the disk. Any reallocated sector is grounds for immediate replacement — research shows a drive with even one reallocated sector is 20–60x more likely to fail within 60 days.
- Pending Reallocations: The drive has marked a sector for reallocation but has not yet moved the data. Certify the disk first; replace if certification fails or sectors get reallocated.
- Unreliable Sectors: The drive had to retry reading a sector. Often caused by an external event (bus ejection, power event). Certify the disk; if it passes, it can return to service.
- Failed Reallocations: The drive attempted to reallocate a bad sector but failed — it cannot move the data to a spare sector. This generally means the drive has days or weeks before complete failure. Replace immediately.
Predicted to Fail
A “Predicted to Fail” status means SoftRAID has detected one or more SMART parameters that research shows are strongly associated with imminent failure — even if the drive has not yet failed. The drive is still functioning, but the warning should be treated urgently.
What to do: Back up immediately. Order a replacement drive. Do not wait for the drive to actually fail.
SoftRAID can be set to ignore SMART predictions on a specific disk (Disk menu → Disable SMART) — for example, on an internal drive you cannot immediately replace. This silences the alert but does not reduce the risk.
SMART Failure
A SMART Failure means the drive has already failed its SMART self-test. This is not a prediction — the drive has failed. It may still be passing data, but it could stop completely at any moment.
What to do: Remove the disk from your enclosure as soon as possible. A failing drive can cause data corruption on other drives in the enclosure. Replace immediately.
Summary: Error Types at a Glance
- I/O Error: Communication failure (cause unknown). Action: Investigate; may be software or hardware.
- Reallocated Sectors: Bad sectors found on drive surface. Action: Replace immediately.
- Pending Reallocations: Sectors marked for reallocation. Action: Certify first; replace if fails.
- Unreliable Sectors: Sectors required retries. Action: Certify first; replace if fails.
- Failed Reallocations: Drive cannot reallocate bad sectors. Action: Replace immediately.
- Predicted to Fail: SMART pattern indicates imminent failure. Action: Back up and replace urgently.
- SMART Failure: Drive has already failed its self-test. Action: Remove and replace immediately.
Is the Problem Hardware or Software?
I/O errors in particular can be caused by either a hardware problem (failing disk, bad cable, enclosure fault) or a software/filesystem problem (directory corruption). Here is how to tell them apart:
Step 1: Check the pattern — but don’t jump to conclusions
A single disk showing I/O errors is more likely to point to that specific disk or its enclosure slot. But multiple disks showing I/O errors does not automatically mean a hardware problem — it can also result from:
- Directory corruption (a file pointer referencing a disk location that doesn’t exist)
- A macOS kernel panic
- A loose cable or momentary bus interruption
- An enclosure-wide issue
The key diagnostic step for multiple I/O errors is to check the SoftRAID log (Utilities menu → View Log). Look at the timestamps:
- Did all the errors happen at the same time? That points to a single event — a kernel panic, a bus interruption, a sudden ejection — rather than ongoing hardware failure
- Are errors repeating over time on the same disk(s)? That points to a hardware or persistent connection problem
You can also clear the I/O error counters (Disk menu → Clear I/O Errors) and resume normal use. Monitor whether new errors appear, and on which disks.
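The timestamp check above can be sketched as a small script. This is only an illustration, not a SoftRAID feature: it assumes you have copied the error timestamps and disk names out of the log by hand, and the function name, the labels it returns, and the one-minute window are all hypothetical choices.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: errors less than a minute apart belong to one event.
SINGLE_EVENT_WINDOW = timedelta(minutes=1)


def classify_io_errors(events, window=SINGLE_EVENT_WINDOW):
    """Return (verdict, disks) for a list of (timestamp, disk_name) pairs.

    The pairs are transcribed by hand from the log; this does not parse
    any real SoftRAID log format.
    """
    if not events:
        return ("no errors", [])

    # All errors close together in time: one event, such as a kernel
    # panic, a bus interruption, or a sudden ejection.
    times = [t for t, _ in events]
    if max(times) - min(times) <= window:
        return ("single event", [])

    # Otherwise, look for disks whose own errors are spread over time.
    by_disk = defaultdict(list)
    for t, disk in events:
        by_disk[disk].append(t)
    recurring = sorted(
        disk for disk, ts in by_disk.items() if max(ts) - min(ts) > window
    )
    if recurring:
        return ("recurring", recurring)

    # Errors at different times on different disks: keep monitoring.
    return ("inconclusive", [])


events = [
    (datetime(2024, 1, 5, 10, 0), "disk2"),
    (datetime(2024, 1, 9, 14, 30), "disk2"),
    (datetime(2024, 1, 9, 14, 30), "disk3"),
]
print(classify_io_errors(events))
```

A "single event" result suggests a one-time incident, while "recurring" names the disks that point to a hardware or persistent connection problem.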
Step 2: Check for volume directory damage with Disk Utility First Aid
If errors occurred after a kernel panic, forced shutdown, or Thunderbolt ejection, run Disk Utility → First Aid on the volume. This often resolves errors caused by filesystem metadata corruption. If First Aid clears the errors and they do not return, the cause was software, not hardware. Disk Utility can repair only a limited range of problems. If it finds damage on an HFS volume that it cannot repair, consider using the third-party application DiskWarrior.
Step 3: Check SMART data
Regardless of I/O errors, check the disk tile in SoftRAID for any SMART-based disk errors. If a disk showing I/O errors also has reallocated sectors or a predicted failure, the I/O errors are almost certainly caused by disk hardware failure. Replace the disk immediately.
When to Replace a Disk
Replace immediately:
- SMART test failure
- Failed reallocations
- Any reallocated sectors (even one)
- SSD/NVMe media wear indicator (remaining life) below 10%
- Repeated I/O errors on the same disk, especially with any SMART indicators present
- Unusual noises (clicking, grinding)
Certify first, replace if it fails:
- Pending reallocations
- Unreliable sectors
Plan proactive replacement:
- HDD power-on hours exceeding 25,000–30,000 (mission-critical environments)
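The replacement rules above can be summarized as a short decision function. This is a minimal sketch for illustration only: the field names are hypothetical, the values are assumed to be read by hand from the disk tile, the SSD check assumes the wear figure means remaining life, and the power-on threshold uses the lower end of the 25,000–30,000 hour range.

```python
from dataclasses import dataclass


@dataclass
class DiskStatus:
    """Values read by hand from the disk tile; field names are hypothetical."""
    smart_failed: bool = False
    failed_reallocations: int = 0
    reallocated_sectors: int = 0
    pending_reallocations: int = 0
    unreliable_sectors: int = 0
    repeated_io_errors: bool = False
    unusual_noises: bool = False
    is_ssd: bool = False
    ssd_life_remaining_pct: float = 100.0  # assumption: 100 means new
    power_on_hours: int = 0


def recommend(disk: DiskStatus) -> str:
    """Apply the three tiers in order of urgency."""
    replace_now = (
        disk.smart_failed
        or disk.failed_reallocations > 0
        or disk.reallocated_sectors > 0  # even one counts
        or (disk.is_ssd and disk.ssd_life_remaining_pct < 10)
        or disk.repeated_io_errors
        or disk.unusual_noises
    )
    if replace_now:
        return "replace immediately"
    if disk.pending_reallocations > 0 or disk.unreliable_sectors > 0:
        return "certify first"
    # The power-on-hours guideline applies to HDDs only.
    if not disk.is_ssd and disk.power_on_hours > 25_000:
        return "plan replacement"
    return "no action"


print(recommend(DiskStatus(reallocated_sectors=1)))
print(recommend(DiskStatus(pending_reallocations=3)))
```

The most urgent tier is checked first, so a disk with both pending reallocations and a reallocated sector is reported as needing immediate replacement rather than certification.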
Replacing the Disk
Once you have identified a disk that needs replacement, see our video on how to replace a faulty drive in a RAID array in SoftRAID. For step-by-step written instructions, see “How to hot swap a drive in a ThunderBay or Mercury Elite Pro Quad” or “RAID Failure and Missing Disks”.
