What is RAID?
RAID (redundant array of independent disks) is a setup that combines multiple disks for data storage. The disks are linked together to prevent data loss, speed up performance, or both. Using multiple disks makes techniques such as disk striping, disk mirroring, and parity possible.
In this article, learn about RAID types, their pros and cons, and their use cases.
RAID Levels and Types
RAID levels are grouped into the following categories:
- Standard RAID levels
- Non-standard RAID levels
- Nested/hybrid RAID levels
Additionally, you can choose how to implement RAID on your system: as hardware RAID, software RAID, or firmware RAID.
The following list explains the standard RAID levels (0, 1, 2, 3, 4, 5, 6) and popular non-standard and hybrid options (RAID 10).
RAID 0: Striping
RAID 0, also known as a striped set or a striped volume, requires a minimum of two disks. The disks are merged into a single large volume where data is spread evenly across all disks in the array.
This process is called disk striping and involves splitting data into blocks and writing consecutive blocks to different disks in turn. Presenting the striped disks as a single volume increases performance since multiple disks handle read and write operations at the same time. Therefore, RAID 0 is generally implemented to improve speed and efficiency.
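The striping idea can be sketched in a few lines of Python. This is only an illustration, assuming a simple round-robin block placement; the function names `stripe` and `unstripe` and the tiny block size are hypothetical choices, and real controllers work on much larger blocks.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them round-robin onto disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading one block from each disk in turn."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJ", num_disks=2, block_size=2)
print(disks)            # [[b'AB', b'EF', b'IJ'], [b'CD', b'GH']]
print(unstripe(disks))  # b'ABCDEFGHIJ'
```

Because neighbouring blocks sit on different disks, a large read or write keeps every disk busy at once, which is where the speed-up comes from.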
It is important to note that if an array consists of disks of different sizes, each will be limited to the smallest disk size in the setup. This means that an array composed of two disks, where one is 320 GB, and the other is 120 GB, actually has the capacity of 2 x 120 GB (or 240 GB in total).
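The arithmetic for mismatched disks can be written out as a short sketch. The helper name `raid0_capacity_gb` is hypothetical and the calculation ignores filesystem and metadata overhead.

```python
def raid0_capacity_gb(disk_sizes_gb: list[int]) -> tuple[int, int]:
    """Return (usable, unused) capacity in GB for a RAID 0 array.

    Every member contributes only as much space as the smallest disk.
    """
    smallest = min(disk_sizes_gb)
    usable = smallest * len(disk_sizes_gb)
    unused = sum(disk_sizes_gb) - usable
    return usable, unused


print(raid0_capacity_gb([320, 120]))  # (240, 200)
```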
Certain implementations allow you to utilize the remaining 200 GB for different use. Additionally, developers can implement multiple controllers (or even one per disk) to improve performance.
RAID 0 is the most affordable RAID configuration and is relatively easy to set up. Still, it does not include any redundancy, fault tolerance, or parity. Hence, a failure of any disk in the array can result in complete data loss. This is why it should only be used for non-critical storage, such as temporary files that are backed up somewhere else.
Advantages of RAID 0
- Cost-efficient and straightforward to implement.
- Increased read and write performance.
- No overhead (total capacity use).
Disadvantages of RAID 0
- Doesn’t provide fault tolerance or redundancy.
When RAID 0 Should Be Used
RAID 0 is used when performance is a priority and reliability is not. If you want to utilize your drives to the fullest and don’t mind losing data, opt for RAID 0.
On the other hand, such a configuration does not necessarily have to be unreliable. You can set up disk striping on your system along with another RAID array that ensures data protection and redundancy.
RAID 1: Mirroring
RAID 1 is an array consisting of at least two disks where the same data is stored on each to ensure redundancy. The most common use of RAID 1 is setting up a mirrored pair consisting of two disks in which the contents of the first disk are mirrored on the second. This is why such a configuration is also called mirroring.
Unlike RAID 0, where the focus is solely on speed and performance, the primary goal of RAID 1 is to provide redundancy. It greatly reduces the risk of data loss and downtime, because a failed drive can be replaced with its replica.
In such a setup, the array volume is as big as the smallest disk and operates as long as one drive is operational. Apart from reliability, mirroring enhances read performance as a request can be handled by any of the drives in the array. On the other hand, the write performance remains the same as with one disk and is equal to the slowest disk in the configuration.
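The behaviour described above can be modelled with a toy class. This is a minimal sketch, assuming an in-memory dictionary per disk and a simple round-robin read policy; `MirroredArray` and its methods are hypothetical names, not part of any RAID tool.

```python
import itertools


class MirroredArray:
    """Toy RAID 1: every write goes to all healthy disks, reads rotate between them."""

    def __init__(self, num_disks: int = 2):
        self.disks = [dict() for _ in range(num_disks)]  # block number -> data
        self.failed = set()
        self._turn = itertools.count()

    def write(self, block: int, data: bytes) -> None:
        # A write is only complete once every healthy mirror has stored it,
        # which is why writes are no faster than on a single disk.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy mirror can serve a read, so reads can be spread out.
        healthy = [i for i in range(len(self.disks)) if i not in self.failed]
        if not healthy:
            raise OSError("all mirrors have failed")
        chosen = healthy[next(self._turn) % len(healthy)]
        return self.disks[chosen][block]

    def fail(self, disk_index: int) -> None:
        self.failed.add(disk_index)
        self.disks[disk_index] = {}


array = MirroredArray(2)
array.write(0, b"ledger")
array.fail(0)
print(array.read(0))  # b'ledger'
```

The array keeps serving data after the first disk fails because the second disk holds a full copy.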
Advantages of RAID 1
- Increased read performance.
- Provides redundancy and fault tolerance.
- Simple to configure and easy to use.
Disadvantages of RAID 1
- Uses only half of the storage capacity.
- More expensive (needs twice as many drives).
- May require powering down your computer to replace a failed drive (unless the hardware supports hot swapping).
When RAID 1 Should Be Used
RAID 1 is used for mission-critical storage that requires a minimal risk of data loss. Accounting systems often opt for RAID 1 as they deal with critical data and require high reliability.
It is also suitable for smaller servers with only two disks, as well as if you are searching for a simple configuration you can easily set up (even at home).
RAID 2: Bit-Level Striping with Dedicated Hamming-Code Parity
RAID 2 is rarely used in practice today. It combines bit-level striping with error checking and information correction. This RAID implementation requires two groups of disks – one for writing the data and another for writing error correction codes. RAID 2 also requires a special controller for the synchronized spinning of all disks.
Instead of data blocks, RAID 2 stripes data at the bit level across multiple disks. Additionally, it uses a Hamming error correction code (ECC) and stores this information on the redundancy disks.
The array calculates the error correction code on the fly. While writing the data, it stripes it across the data disks and writes the code to the redundancy disks. On the other hand, while reading data from the disks, it also reads from the redundancy disks to verify the data and make corrections if needed.
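To make the error correction step concrete, here is a small sketch of a Hamming code. It assumes the classic Hamming(7,4) variant purely for illustration (the article does not say which code size a given RAID 2 array uses), and the function names are hypothetical. In RAID 2 terms, each of the seven bits would live on its own disk: four data disks and three redundancy disks.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword: list[int]) -> tuple[list[int], int]:
    """Fix a single flipped bit; return (data bits, 1-based error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # the syndrome spells out the bad position
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], error_position


codeword = hamming74_encode([1, 0, 1, 1])
print(codeword)                     # [0, 1, 1, 0, 0, 1, 1]
damaged = list(codeword)
damaged[4] ^= 1                     # simulate one bad bit read from a disk
print(hamming74_correct(damaged))   # ([1, 0, 1, 1], 5)
```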
Advantages of RAID 2
- Reliability.
- The ability to correct stored information.
Disadvantages of RAID 2
- Expensive.
- Difficult to implement.
- Requires entire disks for ECC.
When RAID 2 Should Be Used
RAID 2 is not a common practice today as most of its features are now available on modern hard disks. Due to its cost and implementation requirements, this RAID level never became popular among developers.
RAID 3: Byte-Level Striping with Dedicated Parity
Like RAID 2, RAID 3 is rarely used in practice. This RAID implementation utilizes byte-level striping and a dedicated parity disk. Because of this, it requires at least three drives, where two are used for storing data stripes, and one is used for parity.
To allow synchronized spinning, RAID 3 also needs a special controller. Due to its configuration and synchronized disk spinning, it achieves better performance rates with sequential operations than random read/write operations.
Advantages of RAID 3
- Good throughput when transferring large amounts of data.
- High efficiency with sequential operations.
- Disk failure resiliency.
Disadvantages of RAID 3
- Not suitable for transferring small files.
- Complex to implement.
- Difficult to set up as software RAID.
When RAID 3 Should Be Used
RAID 3 is not commonly used today. Its features are beneficial to a limited number of use cases requiring high transfer rates for long sequential reads and writes (such as video editing and production).
RAID 4: Block-Level Striping with Dedicated Parity
RAID 4 is another unpopular standard RAID level. It consists of block-level data striping across two or more independent disks and a dedicated parity disk.
The implementation requires at least three disks – two for storing data stripes and one dedicated for storing parity and providing redundancy. As each disk is independent and there is no synchronized spinning, there is no need for a special controller.
A RAID 4 configuration is prone to bottlenecks because the parity for every data block is stored on a single drive, so every write to the array also has to update that drive. Such bottlenecks have a large impact on system performance.
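The parity bottleneck can be seen in a small simulation. This is a sketch under stated assumptions: parity is a plain XOR of the data blocks in a stripe, the array has three data disks and four stripes, and the names `write_block` and `writes` are hypothetical.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


NUM_DATA_DISKS = 3
NUM_STRIPES = 4
BLOCK_SIZE = 4

data_disks = [[bytes(BLOCK_SIZE)] * NUM_STRIPES for _ in range(NUM_DATA_DISKS)]
parity_disk = [bytes(BLOCK_SIZE)] * NUM_STRIPES
writes = [0] * (NUM_DATA_DISKS + 1)  # last counter belongs to the parity disk


def write_block(disk: int, stripe: int, new_data: bytes) -> None:
    old_data = data_disks[disk][stripe]
    # Read-modify-write: new parity = old parity XOR old data XOR new data.
    parity_disk[stripe] = xor_blocks(parity_disk[stripe], old_data, new_data)
    data_disks[disk][stripe] = new_data
    writes[disk] += 1
    writes[-1] += 1  # every data write also lands on the single parity disk


for n in range(6):
    write_block(disk=n % NUM_DATA_DISKS, stripe=n % NUM_STRIPES,
                new_data=bytes([n + 1]) * BLOCK_SIZE)

print(writes)  # [2, 2, 2, 6]
```

The data writes are spread evenly, but the parity disk absorbs as many writes as all data disks combined.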
Advantages of RAID 4
- Fast read operations.
- Low storage overhead.
- Simultaneous I/O requests.
Disadvantages of RAID 4
- Bottlenecks that have a big effect on overall performance.
- Slow write operations.
- Redundancy is lost if the parity disk fails.
When RAID 4 Should Be Used
Considering its configuration, RAID 4 works best for use cases that read and write huge files sequentially. Still, just like RAID 3, RAID 4 has been replaced with RAID 5 in most solutions.
RAID 5: Striping with Parity
RAID 5 is considered one of the most secure and most common RAID implementations. It combines striping and parity to provide a fast and reliable setup. Such a configuration gives the user redundancy similar to RAID 1 and read performance close to that of RAID 0.
This RAID level consists of at least three hard drives (some implementations cap the array at 16). Data is divided into data stripes and distributed across different disks in the array. This allows for high performance rates, because reads can be served simultaneously by different drives in the array.
Parity bits are distributed evenly on all disks after each sequence of data has been saved. This feature ensures that you still have access to the data from parity bits in case of a failed drive. Therefore, RAID 5 provides redundancy through parity bits instead of mirroring.
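The following sketch shows how distributed parity lets an array rebuild a lost drive. It assumes XOR parity and one particular rotation order for the parity block; real implementations differ in layout details, and `build_raid5` and `rebuild_disk` are hypothetical helper names.

```python
def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def build_raid5(data_blocks: list[bytes], num_disks: int) -> list[list[bytes]]:
    """Return a list of stripes; stripe[i] is the block stored on disk i."""
    per_stripe = num_disks - 1
    block_size = len(data_blocks[0])
    stripes = []
    for start in range(0, len(data_blocks), per_stripe):
        chunk = list(data_blocks[start:start + per_stripe])
        chunk += [bytes(block_size)] * (per_stripe - len(chunk))  # pad last stripe
        # Rotate the parity position so no single disk holds all parity.
        parity_disk = (num_disks - 1 - len(stripes)) % num_disks
        row = list(chunk)
        row.insert(parity_disk, xor_blocks(chunk))
        stripes.append(row)
    return stripes


def rebuild_disk(stripes: list[list[bytes]], failed_disk: int) -> list[bytes]:
    """Recompute every block of the failed disk from the surviving disks."""
    return [
        xor_blocks([block for i, block in enumerate(row) if i != failed_disk])
        for row in stripes
    ]


blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
stripes = build_raid5(blocks, num_disks=3)
lost = [row[0] for row in stripes]  # what disk 0 held before it failed
print(rebuild_disk(stripes, failed_disk=0) == lost)  # True
```

Since the XOR of all blocks in a stripe is zero, any single missing block equals the XOR of the others, whether it held data or parity.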
Advantages of RAID 5
- High performance and capacity.
- Fast and reliable read speed.
- Tolerates single drive failure.
Disadvantages of RAID 5
- Longer rebuild time.
- Uses one drive's worth of storage capacity for parity.
- If more than one disk fails, data is lost.
- More complex to implement.
When RAID 5 Should Be Used
RAID 5 is often used for file and application servers because of its high efficiency and optimized storage. Additionally, it is the best, cost-effective solution if continuous data access is a priority and/or you require installing an operating system on the array.
RAID 6: Striping with Double Parity
RAID 6 is an array similar to RAID 5 with the addition of a second parity block. For this reason, it is also referred to as double-parity RAID.
This setup requires a minimum of four drives. It resembles RAID 5: block-level striping distributes the data across the array, but each stripe carries two parity blocks instead of one, spread across the disks.
Block-level striping with two parity blocks allows two disk failures before any data is lost. This means that if two disks fail, the array can still reconstruct the required data.
Its performance depends on how the array is implemented, as well as the total number of drives. Write operations are slower compared to other configurations due to its double parity feature.
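Double parity also costs capacity: two drives' worth of space goes to parity regardless of array size. The sketch below compares that overhead with single parity, assuming equal-size drives; `usable_fraction` is a hypothetical helper.

```python
def usable_fraction(num_drives: int, parity_drives: int) -> float:
    """Fraction of raw capacity left for data when parity takes whole drives' worth."""
    if num_drives <= parity_drives:
        raise ValueError("need more drives than parity blocks per stripe")
    return (num_drives - parity_drives) / num_drives


for drives in (4, 6, 8, 12):
    single = usable_fraction(drives, parity_drives=1)  # RAID 5 style
    double = usable_fraction(drives, parity_drives=2)  # RAID 6 style
    print(f"{drives} drives: single parity {single:.3f}, double parity {double:.3f}")
```

With four drives, double parity leaves only half of the raw capacity for data; the share improves as more drives are added.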
Advantages of RAID 6
- High fault and drive-failure tolerance.
- Storage efficiency (when more than four drives are used).
- Fast read operations.
Disadvantages of RAID 6
- Rebuilds can take a long time (up to 24 hours, depending on drive size and load).
- Slow write performance.
- Complex to implement.
- More expensive.
When RAID 6 Should Be Used
RAID 6 is a good solution for mission-critical applications where data loss cannot be tolerated. Therefore, it is often used for data management in defense sectors, healthcare, and banking.
RAID 10: Mirroring with Striping
RAID 10 is part of a group called nested or hybrid RAID, which means it is a combination of two different RAID levels. In the case of RAID 10, the array combines level 1 mirroring and level 0 striping. This RAID array is also known as RAID 1+0.
RAID 10 uses logical mirroring to write the same data on two or more drives to provide redundancy. If one disk fails, there is a mirrored image of the data stored on another disk. Additionally, the array uses block-level striping to distribute chunks of data across different drives. This improves performance and read and write speed as the data is simultaneously accessed from multiple disks.
To implement such a configuration, the array requires at least four drives, as well as a disk controller.
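Which drive failures a RAID 10 array survives depends on which mirror the failed drives belong to. The sketch below assumes two-way mirrors with drives paired as (0, 1), (2, 3), and so on; the pairing and the name `raid10_survives` are illustrative assumptions.

```python
from itertools import combinations


def raid10_survives(failed: set[int], num_drives: int) -> bool:
    """True if no mirrored pair has lost both of its drives."""
    if num_drives < 4 or num_drives % 2:
        raise ValueError("this sketch expects an even number of drives, at least 4")
    return all(not {first, first + 1} <= failed
               for first in range(0, num_drives, 2))


two_drive_failures = list(combinations(range(4), 2))
survivable = [f for f in two_drive_failures if raid10_survives(set(f), 4)]
print(len(survivable), "of", len(two_drive_failures))  # 4 of 6
```

Any single failure is safe, and a second failure is only fatal when it hits the partner of the drive that already failed.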
Advantages of RAID 10
- High performance.
- High fault-tolerance.
- Fast read and write operations.
- Fast rebuild time.
Disadvantages of RAID 10
- Limited scalability.
- Costly (compared to other RAID levels).
- Uses half of the disk space capacity.
- More complicated to set up.
When RAID 10 Should Be Used
RAID 10 is often used in use cases that require storing high volumes of data, fast read and write times, and high fault tolerance. Accordingly, this RAID level is often implemented for email servers, web hosting servers, and databases.
Non-Standard RAID
The RAID levels mentioned above are considered standard or commonly used RAID implementations. However, there is a myriad of ways you can set up redundant arrays of independent disks.
Accordingly, many open-source projects and companies have created their own configurations to adhere to their needs. As a result, there are many non-standard RAID implementations, such as:
- RAID-DP
- Linux MD RAID 10
- RAID-Z
- Drive Extender
- Declustered RAID
Nested (Hybrid) RAID
You can combine two or more standard RAID levels to ensure better performance and redundancy. Such combinations are called nested (or hybrid) RAID levels.
Hybrid RAID implementations are named after the RAID levels they incorporate. In most cases, they include two numbers where their order represents the layering scheme.
Popular hybrid RAID levels include:
- RAID 01 (striping and mirroring; also known as “mirror of stripes”)
- RAID 03 (byte-level striping and dedicated parity)
- RAID 10 (disk mirroring and straight block-level striping)
- RAID 50 (distributed parity and straight block-level striping)
- RAID 60 (dual parity and straight block-level striping)
- RAID 100 (a stripe of RAID 10s)
RAID Implementation Types
There are three ways of utilizing RAID, differing by where the processing takes place.
Hardware-based RAID
When installing the hardware setup, you insert a RAID controller card in a fast PCI-Express slot on the motherboard and connect it to the drives. External RAID drive enclosures with a built-in controller card are also available.
Software-based RAID
For the software setup, you connect the drives directly to the computer, without using a RAID controller. In that case, you manage the disks through utility software on the operating system.
Firmware/Driver-based RAID
Firmware-based RAID (also known as driver-based RAID) is a RAID system often built directly into the motherboard. All its operations are performed by the computer’s CPU, not by a dedicated processor.
Note: If you are setting up hardware RAID on a MegaRAID controller, consider installing MegaCLI for managing and communicating with the controller.
Conclusion
RAID is a useful and practical way to speed up server performance and protect data against drive failures. Deciding what kind of setup is best for your business greatly depends on your priorities. Weigh the options above to get the most out of this technique.
Original article by 奋斗. If you reproduce it, please credit the source: https://blog.ytso.com/223470.html