*Article* **Compression-Assisted Adaptive ECC and RAID Scattering for NAND Flash Storage Devices**

**Seung-Ho Lim <sup>1</sup> and Ki-Woong Park 2,\***


Received: 27 March 2020; Accepted: 20 May 2020; Published: 22 May 2020

**Abstract:** NAND flash memory-based storage devices are vulnerable to errors induced by NAND flash memory cells. Error-correction codes (ECCs) are integrated into the flash memory controller to correct errors in flash memory. However, since ECCs show inherent limits in checking the excessive increase in errors, a complementary method should be considered for the reliability of flash storage devices. In this paper, we propose a scheme based on lossless data compression that enhances the error recovery ability of flash storage devices, which applies to improve recovery capability both of inside and outside the page. Within a page, ECC encoding is realized on compressed data by the adaptive ECC module, which results in a reduced code rate. From the perspective of outside the page, the compressed data are not placed at the beginning of the page, but rather is placed at a specific location within the page, which makes it possible to skip certain pages during the recovery phase. As a result, the proposed scheme improves the uncorrectable bit error rate (UBER) of the legacy system.

**Keywords:** NAND flash memory; P/E cycle; compression; adaptive ECC; RAID scattering; stripe log; parity

#### **1. Introduction**

With advances in process geometry scale-down of flash memory, a rapid increase in the capacity of NAND flash memory chips has been achieved. The main reason for this rapid increase in capacity is the increase in the number of bits stored per cell. However, as this number increases, the inter-electron interference degree increases, resulting in an increase in the error rate of the flash memory [1,2]. In addition, as the program/erase cycle (P/E cycle) increases; that is, as the usage time increases, the physical characteristics of the cell deteriorate, and the error rate increases rapidly. This is a serious drawback of computer systems based on NAND flash memory storage devices.

The typical approach to rectify this short endurance and increase in errors is using error-correction codes (ECCs) [3,4]. NAND flash memory devices use the ECC module to create additional parity for data-error correction. The parity generated by the ECC module is stored in the out-of-bound (OOB) region within a page; a page is the unit of read and write operations. Therefore, ECC is used for error correction in page units. Although ECC parities can achieve error-correction gain, if there are more errors than an ECC can correct, they may have no choice but to return decoding failures, if used without any additional error-recovery scheme for the failed data. However, creating more ECC parity is a complex and difficult process because the ECC module is a type of hardware block within the flash memory controller. One useful approach for creating more ECC parities for specific data is to use lossless compression [5,6], where additional ECC parity is created and stored in the free area after compression of the data.

In contrast, as a complement to in-page ECC decoding failures, an error management method using another parity technique between outer-pages, such as RAID5, has often been applied. Many previous works show that parity with the RAID5 technique can reduce the page error rate [7–10]; however, it has also been shown to result in increased additional write operations and low space utilization owing to RAID parity management. That is, the RAID parity operation creates an additional write operation in the parity update process. From a space management point of view, the smaller the size of a cluster that has RAID parity, the larger the number of parities, resulting in a space overhead.

In this paper, we propose an enhancement scheme for an in-page ECC parity ability and outer-page RAID parity management with the help of lossless data compression. In the proposed system, incoming data are compressed through a compression module, and then, so-called adaptive ECC and RAID-scattering schemes are applied to this compressed data step by step. For in-page ECC improvement, ECC parity is created by applying ECC encoding only to the compressed data. Compression reduces the source length applied to ECC encoding, which results in lowering of the code rate.

The code rate is defined as a rate occupied by data in a codeword, so the lower the code rate, the higher the parity rate; thus, the error correction capability is improved by compressing data and applying ECC only to these compressed data.

After compression, data are then placed in each offset position within the page, in which the placement position is determined by its position in the RAID stripe.

We refer this placement scheme as RAID-scattering scheme. Since the data are compressed and occupy a smaller area than the page, each datum occupies only a specific part of the page, and the rest becomes a non-use area, and the value of the non-use can be set a known value. As a result, the specific positioning of each compressed data creates a region where non-use regions overlap each other in a stripe. Owing to the overlapping of the non-use area, the overhead of restoring data in RAID is reduced, because when data are restored with RAID parity algorithm for the error-occurred page in the RAID stripe, the non-use area of the error page does not need to be restored.

As a result, some pages can be skipped during the restoration process.

In other words, it results in more pages being recovered by each RAID parity, and parity overhead is reduced at the same level as the RAID reliability. In addition, this paper describes a FTL architecture, a FTL management scheme and the associated metadata for parity logging that can efficiently update and manage RAID parity on NAND flash devices. The scheme proposed in this paper is an extension of our previous work [11] that only considered in-page ECC management. This paper extends and highlights a combination of in-page ECC management and outer-page parity management, as well as detailed metadata structure of RAID parity management.

The organization of this paper is as follows. In Section 2, background and related work are presented. The proposed adaptive ECC and RAID-scattering scheme is described in Section 3, and experimental results are shown in Section 4. Finally, Section 5 concludes this paper.

#### **2. Background and Related Work**

In this section, the background of this research area is first described, which includes basics of flash memory and flash memory-based systems. Then, the related work is described.

### *2.1. NAND Flash Memory-Based Storage Devices*

Figure 1 describes typical NAND flash-based storage devices. The basic hardware architecture of NAND flash memory-based storage devices consists of a high-performance controller, DRAM main memory, and an array of NAND flash memory chips. NAND flash memory is a nonvolatile semiconductor storage device that is formed by integrating a floating gate-based semiconductor device. The main components include a page, which is a read and write unit, and a block, which is an erase unit. The page sizes are mainly about 4 KB, 8 KB, and 16 KB, and dozens of pages are collected to form a block. There are three internal commands used for NAND flash memory: read, program, and erase. The read and program commands transfer data to and from the flash chip, and a data transfer unit is a page. Actually, the erase command does not transfer data and works in block

units. A client device can make only two types of requests, namely, a read and a write request. The read request is related to the read command and the write request is related to the program command.

**Figure 1.** Basic architecture for NAND flash-based storage devices.

Currently, triple-level cells (TLC) are the most commonly used NAND flash memory. It stores three bits per cell, where each of the three bits of data in a TLC flash cell are either programmed or erased, which means the cell has a total of eight different states from 000 to 111. In contrast, a single-level cell (SLC) and a multi-level cell (MLC) store one bit and two bits per cell, respectively. With several states, the quantized voltage level of memory cells decides the value of the cell. The growth in the number of states per cell raises interference between states as the quantized decision levels of the cell start moving close between adjacent states, which results in detection errors. Recent advances in 3D lamination technology make it easier to secure cell-to-cell spacing compared to conventional 2D methods, resulting in a slight increase in P/E cycles compared with those of conventional 2D.

There are two crucial points regarding NAND flash memory. One, if the data are written to a flash memory, write operations should be preceded by erase operations internally. Second, the basic units of erase and write operations are different from each other. To overcome the unit mismatch between write and erase operations and to efficiently use flash memory, special software layers called flash translation layers (FTLs) have been developed [12–17]. Beyond single internal operational issues, other works have focused on the parallelism issue within embedded storage devices to speed them up [18–23].
