Getting to the right capacity with RAID, ZFS, Replica, or Erasure Coding

Storage capacities of disk pools: RAID, ZFS, Erasure Coding, or Replica

Here we show you how to quickly estimate the net capacity of your system and how each technique works.

We have compiled a list of the common protection mechanisms used in our storage systems, servers and clusters.
First, we define the variables used in the formulas, which you can find at the end of each entry. The variable letters stay the same throughout.

Used variables

Drive count		n
Drive capacity		K
Net capacity of a drive array		N

With Replica and Erasure Coding, further variables are needed.
(For these systems, the number of drives refers to a single server.)

Number of servers		Sn
Raw data		R
Parities		P

RAID types

A RAID controller is essentially a specialized computer that stores data as space-efficiently as possible at a predefined level of redundancy. It does this either by calculating parities (XOR operations) or by simply copying data to multiple drives to mirror it. Either way, the controller can rebuild the original data set after a drive failure.

We, as veterans in the RAID market, supply both small and large systems with built-in controllers, external RAID controllers (so-called RAID heads) in separate enclosures as well as PCIe plug-in cards for installation in a server.

Depending on how important the data is and how much protection it needs, the admin selects the desired RAID level and creates the necessary RAID sets via the WebGUI, the controller's user interface.

RAID0

We start right away with a contradiction, because RAID0 does not protect data at all. On the contrary: this procedure, also called striping, actually increases the probability that the array fails. If even one of the disks fails, all data is lost. This mode only brings performance gains, since the controller combines the drives into one logical volume and drives them in parallel.

Capacities: Drive sizes add up.

Formula: N = n x K

RAID1

In the simplest and also most expensive variant, the RAID controller writes the data simultaneously to two or more drives of the same size. The drives involved are simply mirrored, so an identical copy of the data is available on every drive.

Capacities: Only the capacity of a single drive remains usable.

Formula: N = K = (n x K) / n

RAID10

RAID10 is a combination of the two previous modes. The controller first mirrors the hard disks in pairs, so that each pair holds a one-to-one copy. The mirrored pairs are then striped at RAID level 0 into a single virtual drive for the sake of performance. Due to mirroring, an even number of disks is always required.

Capacities: Half of the total drive capacity remains usable.

Formula: N = (n x K) / 2

RAID1E

RAID1E represents a special form, since it can also be implemented with an odd number of drives. The data blocks are distributed evenly among the drives. Example: Block 1 is written to HDD1 and HDD2, Block 2 to HDD2 and HDD3, Block 3 to HDD1 and HDD3 and so on.
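The placement in the example above can be sketched in a few lines of Python. This is only an illustration of the pattern described in the text (each block goes to one drive and its neighbor, wrapping around at the end); the function name `raid1e_drives_for_block` is hypothetical, and real controllers may arrange the blocks differently.

```python
def raid1e_drives_for_block(block: int, n: int) -> tuple:
    """Return the two 1-based drive numbers that hold a 1-based block number.

    Assumption: each block is written to one drive and to the next drive,
    wrapping around after the last drive, as in the example in the text.
    """
    if n < 3:
        raise ValueError("this sketch assumes at least three drives")
    if block < 1:
        raise ValueError("block numbers start at 1")
    first = (block - 1) % n      # 0-based index of the first copy
    second = (first + 1) % n     # neighbor drive, wrapping around
    return tuple(sorted((first + 1, second + 1)))


for block in (1, 2, 3):
    print(f"Block {block} -> HDD{raid1e_drives_for_block(block, 3)}")
```

Because every block is stored exactly twice, half of the raw capacity remains usable regardless of whether the drive count is even or odd.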

Capacities: Here, too, half of the total drive capacity remains usable.

Formula: N = (n x K) / 2
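The formulas for the mirroring and striping levels so far can be bundled into a small Python sketch. The function names and the example values (four drives of 8 TB each) are hypothetical; the minimum drive counts checked here are the commonly cited ones and are an assumption, not something a particular controller guarantees.

```python
def raid0_capacity(n: int, k: float) -> float:
    """RAID0: N = n x K (striping, no redundancy)."""
    if n < 1:
        raise ValueError("at least one drive is required")
    return n * k


def raid1_capacity(n: int, k: float) -> float:
    """RAID1: N = K (every drive holds the same data)."""
    if n < 2:
        raise ValueError("mirroring needs at least two drives")
    return k


def raid10_capacity(n: int, k: float) -> float:
    """RAID10: N = (n x K) / 2, with an even number of drives."""
    if n < 4 or n % 2 != 0:
        raise ValueError("RAID10 needs an even number of at least four drives")
    return n * k / 2


def raid1e_capacity(n: int, k: float) -> float:
    """RAID1E: N = (n x K) / 2, odd drive counts allowed."""
    if n < 3:
        raise ValueError("this sketch assumes at least three drives")
    return n * k / 2


# Hypothetical example: drives of 8 TB each
print(raid0_capacity(4, 8))    # 32
print(raid1_capacity(2, 8))    # 8
print(raid10_capacity(4, 8))   # 16.0
print(raid1e_capacity(3, 8))   # 12.0
```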

RAID5 (RAID3, RAID4)

RAID3 and RAID4 are hardly used nowadays, as RAID5 does the job more securely and effectively. In principle, however, they work similarly: with four hard disks involved, for example, the data is distributed evenly across three of them, and the controller stores calculated parities on the fourth. If one of the drives fails, the controller can reconstruct the original data from these parities and the remaining fragments. With RAID3 and RAID4, the controller always stores the parities on one and the same drive (byte-wise or block-wise). As a result, this drive is under particularly high load and more likely to fail. This is why RAID5 is usually preferred: it distributes the parities across all drives, so the disks are used evenly.

Capacities: Capacity is reduced by the size of one drive in the array.

Formula: N = (n – 1) x K
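How a single parity allows the controller to rebuild a lost drive can be shown with a minimal XOR sketch in Python. The block contents and the helper name `xor_blocks` are made up for illustration; a real controller works on stripes across whole drives and, with RAID5, rotates the parity position.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks on three drives, parity on a fourth (hypothetical content)
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The second drive fails: XOR the survivors with the parity to rebuild it
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same XOR operation serves for both computing the parity and rebuilding a missing block, which is why one drive's worth of capacity is enough to survive one failure.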

RAID6

RAID6 works similarly to RAID5, but reserves the capacity of two disks in the array for parities. Advantage: the array survives a second drive failure, so the data stays protected even while a failed drive is being replaced and rebuilt. (A rebuild is the reconstruction of the lost data.) After all, depending on drive size and the amount of data, a rebuild can sometimes take several days.

Capacities: Capacity is reduced by the size of two drives in the array.

Formula: N = (n – 2) x K
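Since the RAID5 and RAID6 formulas differ only in the number of drives reserved for parity, both fit into one small Python sketch. The function name, the minimum drive counts (three for RAID5, four for RAID6) and the example array of eight 10 TB drives are assumptions for illustration.

```python
def parity_raid_capacity(n: int, k: float, parity_drives: int) -> float:
    """N = (n - p) x K, with p = 1 for RAID5 and p = 2 for RAID6."""
    if parity_drives not in (1, 2):
        raise ValueError("use 1 parity drive for RAID5 or 2 for RAID6")
    if n < parity_drives + 2:
        raise ValueError("too few drives for this RAID level")
    return (n - parity_drives) * k


# Hypothetical example: eight drives of 10 TB each
print(parity_raid_capacity(8, 10, 1))  # RAID5: 70
print(parity_raid_capacity(8, 10, 2))  # RAID6: 60
```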

RAID50 (RAID30, RAID40)

RAID50 is again a combination of two modes: two RAID5 sets are striped via RAID0 into one virtual drive. RAID50 therefore requires at least six disks, and one disk in each RAID5 set may fail without jeopardizing the data.

Capacities: The capacity is reduced by two drives (one per RAID5 set).

Formula: N = (n – 2) x K
(Where n must be even)

RAID60

RAID60, like RAID50, is a combination of RAID levels: here, two RAID6 sets are striped via RAID0. The only difference is that two disks in each RAID6 set can fail.

Capacities: The capacity of four disks (two per RAID6) must be subtracted from the total volume.

Formula: N = (n – 4) x K
(Where n must be even)
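The RAID50 and RAID60 formulas both assume exactly two sub-arrays. A minimal Python sketch can make that assumption explicit; the `sets` parameter, the function name and the example values are hypothetical, and using more than two sets goes beyond what the formulas above cover.

```python
def nested_raid_capacity(n: int, k: float, parity_per_set: int, sets: int = 2) -> float:
    """N = (n - sets x p) x K for RAID50 (p = 1) or RAID60 (p = 2).

    With the default of two sets this gives (n - 2) x K and (n - 4) x K.
    """
    if parity_per_set not in (1, 2):
        raise ValueError("use 1 for RAID50 or 2 for RAID60")
    if sets < 2:
        raise ValueError("a nested array needs at least two sets")
    if n % sets != 0:
        raise ValueError("drives must divide evenly among the sets")
    if n // sets < parity_per_set + 2:
        raise ValueError("too few drives per set")
    return (n - sets * parity_per_set) * k


# Hypothetical example: twelve drives of 4 TB each, two sets
print(nested_raid_capacity(12, 4, 1))  # RAID50: 40
print(nested_raid_capacity(12, 4, 2))  # RAID60: 32
```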

ZFS

Under ZFS, the capacity calculation is a bit trickier. This ingenious (because transactional) file system needs a certain overhead (3.2 percent) for its metadata in order to work reliably in servers or in the data center. In addition, experts recommend always keeping 20 percent of the usable pool capacity free. The reason: ZFS needs some free space in reserve to play to its strengths, such as the integrated RAID function, snapshots (via copy-on-write), automatic data error correction, and deduplication.

ZFS also offers several redundancy levels, so the admin can choose the number of redundant drives, similar to the classic RAID levels.

By the way, we have two architectures in our portfolio that are based on ZFS: our installations with the storage operating system JovianDSS from Open-E have ZFS integrated as a core feature. The same applies to our servers with TrueNAS/FreeNAS; there, however, additional factors come into play that would go beyond the scope of this article.

RAID-Z1

RAID-Z1 essentially corresponds to classic RAID5. This means that only one drive may fail during operation.

Capacities: The sum of all drives is reduced by the overhead, the reserve, and the size of one drive.

Formula: N = (1 – 0.032) x 0.8 x (n – 1) x K

RAID-Z2

RAID-Z2 also corresponds to a RAID mode: RAID6. Here, the capacity of two drives is reserved for parities, so the data remains protected during a rebuild.

Capacities: The sum of all drives is reduced by the overhead, the reserve, and the size of two drives.

Formula: N = (1 – 0.032) x 0.8 x (n – 2) x K
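Both ZFS formulas can be expressed as one Python sketch. The 3.2 percent overhead and the 20 percent reserve are the values from the text; the function name, the parameter names and the example pool of six 10 TB drives are assumptions for illustration.

```python
def zfs_capacity(n: int, k: float, parity_drives: int,
                 overhead: float = 0.032, reserve: float = 0.20) -> float:
    """N = (1 - overhead) x (1 - reserve) x (n - p) x K.

    p = 1 for RAID-Z1, p = 2 for RAID-Z2.
    """
    if parity_drives not in (1, 2):
        raise ValueError("use 1 for RAID-Z1 or 2 for RAID-Z2")
    if n <= parity_drives:
        raise ValueError("more drives than parity drives are required")
    return (1 - overhead) * (1 - reserve) * (n - parity_drives) * k


# Hypothetical example: six drives of 10 TB each
print(f"RAID-Z1: {zfs_capacity(6, 10, 1):.2f} TB")  # 38.72 TB
print(f"RAID-Z2: {zfs_capacity(6, 10, 2):.2f} TB")  # 30.98 TB
```

Keeping overhead and reserve as parameters makes it easy to see how much of the loss comes from redundancy and how much from the recommended free space.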

Defined replicas

Our modern Ceph systems based on PetaSAN or on the Ambedded Mars 400, by contrast, also work with duplication. That is, they scatter replicas of the data objects across all servers in the cluster. The recommendation here is to keep two replicas in addition to the original object. (The term is not used consistently everywhere: "three replicas" often means one original and two copies.)

As with ZFS, there is a threshold that must be respected and that is included in the calculation of the total capacity: the near-fill ratio. Experts recommend not filling these systems beyond 85% of their storage. So once the cluster reaches 85% of its capacity, it should be expanded in the near future. Furthermore, the number of integrated servers also plays a major role in this distributed storage system.

Capacities: The sum of all drives on all servers (Sn) is reduced by the near-fill ratio and divided by the number of desired replicas.

Formula: N = (K x n x Sn x 0.85) / 3
(For three replicas: one original and two copies)
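The replica formula translates directly into a short Python sketch. The default of three copies and the 85 percent near-fill ratio come from the text; the function name and the example cluster (four servers with twelve 8 TB drives each) are hypothetical.

```python
def replica_capacity(n: int, k: float, servers: int,
                     replicas: int = 3, fill_ratio: float = 0.85) -> float:
    """N = (K x n x Sn x fill_ratio) / replicas.

    n is the number of drives per server, servers is Sn,
    replicas counts all copies including the original.
    """
    if replicas < 1:
        raise ValueError("at least one copy of the data is required")
    if n < 1 or servers < 1:
        raise ValueError("drive and server counts must be positive")
    return k * n * servers * fill_ratio / replicas


# Hypothetical example: 4 servers with 12 drives of 8 TB each
print(f"{replica_capacity(12, 8, 4):.1f} TB")  # 108.8 TB
```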

Erasure Coding

Erasure Coding is a proven forward error correction (FEC) technique used in object stores such as our PetaSAN platform. Parities are added and distributed across the entire cluster, so that lost objects can be restored from them. What sounds quite simple in itself actually involves a comparatively high computational effort.

With Erasure Coding, the rule is: each drive may store only a single block of any given object, and a node may not hold more blocks of an object than that object can afford to lose.

This relationship is of course also reflected in the capacity calculation. First, a near-fill ratio of 85 percent also applies to Erasure Coding. Above this limit, further capacity should be added. The second factor is the ratio of raw data (R) to the sum of raw data and parities (R + P).

Capacities: The sum of all drives on all servers is reduced by the near-fill ratio and by the aforementioned ratio of raw data to raw data plus parities.

Formula: N = K x n x Sn x 0.85 x (R / (R + P))
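As a minimal Python sketch of this formula: the 85 percent near-fill ratio comes from the text, while the function name, the example cluster (four servers with twelve 8 TB drives each) and the choice of R = 4 and P = 2 are assumptions for illustration only.

```python
def erasure_coding_capacity(n: int, k: float, servers: int,
                            r: int, p: int, fill_ratio: float = 0.85) -> float:
    """N = K x n x Sn x fill_ratio x (R / (R + P)).

    n is the number of drives per server, servers is Sn,
    r is the raw data share (R) and p the parity share (P).
    """
    if r < 1 or p < 0:
        raise ValueError("R must be at least 1 and P must not be negative")
    if n < 1 or servers < 1:
        raise ValueError("drive and server counts must be positive")
    return k * n * servers * fill_ratio * r / (r + p)


# Hypothetical example: 4 servers with 12 drives of 8 TB each, R = 4, P = 2
print(f"{erasure_coding_capacity(12, 8, 4, r=4, p=2):.1f} TB")  # 217.6 TB
```

For the same hypothetical cluster, the replica formula with three copies yields 108.8 TB, so this profile leaves twice as much net capacity at the price of the higher computational effort mentioned above.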

Any questions?

We are happy to advise you!

Konrad Beyer

Technical Support

Our technical manager has a comprehensive knowledge of all storage and server topics.