In 2015, as a high school student, I had a PC with an old 500GB hard drive that spun at 7200 RPM. One morning, I discovered my PC wasn’t booting; the hard drive had crashed. I had stored a lot of movies on it, and many of my friends used my PC as a backup device. Luckily, I didn’t lose any critical data because I had burned it to DVDs (remember when that was a thing?) and saved it to Google Drive.
This incident made me wonder: What if the drives used by Google Drive servers crashed someday? Would I lose my data? Don’t they crash? They certainly do. So how do they recover? These questions sparked my curiosity. Today’s topic is about that technology.
“Hard drives and SSDs are physical objects, and like any physical object, they can degrade over time. This applies to the solid-state drives used for cloud storage. One way to mitigate this is to monitor the drives’ health continuously. There are many online tools available for this purpose.
Even an SSD reporting 100% health can still fail without warning. Data recovery is an option in such cases, but consider the cost and complexity of managing it for thousands of SSDs in a single server room. And you would not want to see an error message like ‘Your document cannot be loaded because the drive we used has crashed and been sent for data recovery.’”
When problems arise, there are always solutions, and common problems often have multiple solutions. As I’ve just discussed, drives fail, so fault tolerance is a concern. RAID is a powerful and valuable way to improve the fault tolerance of storage systems.
What the heck is RAID?
RAID (Redundant Array of Independent Disks) is a storage technology that combines multiple drives into a single logical unit to store data. There are various levels of RAID, and each one arranges data across the drives differently.
Popular RAID levels
Each RAID level has its own set of advantages. The popular ones are:
RAID 0: RAID 0 is not meant for data safety; its purpose is to increase IOPS (input/output operations per second). It’s also known as striping, because data is split into chunks that are spread across all the drives. The more drives you add to the array, the more IOPS you get. But if any one drive fails, all the data in the array is lost.
RAID 1: RAID 1 is built for data safety. Also known as mirroring, it stores a full copy of the data on each drive in the array. The cost is capacity: with two drives, half of your total storage goes to the copy.
RAID 5: RAID 5 is my favorite. It offers both performance and safety, and it requires at least 3 drives. Data is striped across the drives along with distributed parity. Parity is extra information computed from the data (typically with XOR) that lets the array rebuild the contents of one failed drive. However, if a second drive fails before the first is replaced and rebuilt, your data is lost.
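RAID 10: RAID 10 is popular among professionals and businesses. It is a combination of RAID 1 and RAID 0: drives are mirrored in pairs, and data is striped across those pairs. It needs at least four drives and offers both increased IOPS and improved fault tolerance.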
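To make the parity idea from RAID 5 a bit more concrete, here is a minimal Python sketch. It assumes a toy three-drive array holding a single stripe, and the names (xor_blocks, drive_a, drive_b, parity) are made up for illustration. A real RAID 5 implementation works on fixed-size blocks and rotates the parity across all drives, so treat this as a picture of the XOR trick only, not of how any particular controller does it.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical three-drive array: two data blocks plus parity.
drive_a = b"RAID"
drive_b = b"5!!!"
parity = xor_blocks([drive_a, drive_b])

# Pretend drive_b has failed. XOR-ing the surviving blocks rebuilds it.
rebuilt_b = xor_blocks([drive_a, parity])
print(rebuilt_b == drive_b)
```

The same call rebuilds drive_a if that is the drive that fails. If two of the three blocks are gone, though, there is nothing left to XOR against, which is why RAID 5 only survives a single drive failure.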
Although RAID is used to improve performance and fault tolerance, it is not a data recovery solution. It’s designed to decrease the likelihood of data loss. Next time I need a drive for backup, I am going to buy four!!