Extending Flash Lifetime

The characteristics of flash memory are key factors to which the controller must adapt in order to achieve these goals. The process of writing data, the limited number of program/erase cycles of flash memory cells, and error handling are just a few examples of these characteristics, which are described in more detail below.

Programming data: pages and blocks

One key characteristic of NAND flash is that data is programmed (written) in pages. Pages are typically between 4 kB and 16 kB in size and cannot be erased individually: erasing is only possible for entire blocks, each of which consists of multiple pages. The fact that the cells within a block are erased together, ‘in a flash’, is why it is called flash memory. Any cell that has been programmed to 0 can only be reset to 1 by erasing the entire block. Consequently, a page that already contains data cannot be overwritten in place. Instead, the updated data is programmed into a different, empty page. If no empty pages are available, a block must first be erased, after any data in it that is still needed has been copied elsewhere. The old page is then marked as invalid and becomes available for erase and reuse.
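
The page and block rules above can be illustrated with a minimal sketch. The class name `Block` and the tiny sizes are assumptions chosen for readability, not values from any real device.

```python
PAGES_PER_BLOCK = 4   # assumed; real blocks hold far more pages
PAGE_SIZE = 8         # bytes; assumed tiny so the example is easy to read


class Block:
    """Hypothetical model of one NAND block (illustration only)."""

    def __init__(self):
        # In the erased state every bit is 1, so every byte is 0xFF.
        self.pages = [bytes([0xFF] * PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]
        self.erase_count = 0

    def program(self, page_index, data):
        """Program one page. Bits may only change from 1 to 0."""
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page long")
        old = self.pages[page_index]
        # A bit that is 0 in the page but 1 in the new data would need a
        # 0 -> 1 transition, which only a block erase can provide.
        if any((~o & d) & 0xFF for o, d in zip(old, data)):
            raise ValueError("page must be erased before it can hold this data")
        self.pages[page_index] = bytes(data)

    def erase(self):
        """Erase the whole block: all pages return to all-ones."""
        self.pages = [bytes([0xFF] * PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]
        self.erase_count += 1
```

In this sketch, programming a page twice with conflicting data raises an error until `erase()` has been called on the whole block, which is the constraint the controller has to work around.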

The controller manages this process, deciding which pages to use and keeping track of invalid pages that need to be erased. It also performs a function called ‘garbage collection’: it consolidates pages of valid data into blocks, so that blocks holding only invalid pages can be erased and reused. Throughout, the controller maintains the mapping from the host's logical addresses to the physical locations in the memory. Another key task of the controller is ensuring data integrity if power fails while data is being moved. This is especially critical in fields such as medical technology.
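
The following is a simplified sketch of logical-to-physical mapping with garbage collection, not a description of any particular controller. The class `SimpleFTL`, its method names and the "most invalid pages" victim policy are assumptions for illustration. Power-fail protection is deliberately left out.

```python
class SimpleFTL:
    """Hypothetical flash translation layer (illustration only)."""

    def __init__(self, num_blocks=4, pages_per_block=4):
        self.pages_per_block = pages_per_block
        # Each physical page is "free", "valid" or "invalid".
        self.state = [["free"] * pages_per_block for _ in range(num_blocks)]
        self.data = [[None] * pages_per_block for _ in range(num_blocks)]
        self.l2p = {}  # logical page number -> (block, page)

    def _find_free(self, exclude=None):
        for b, pages in enumerate(self.state):
            if b == exclude:
                continue
            for p, s in enumerate(pages):
                if s == "free":
                    return b, p
        return None

    def _place(self, lpn, payload, loc):
        old = self.l2p.get(lpn)
        if old is not None:
            # Out-of-place update: the previous copy becomes invalid.
            self.state[old[0]][old[1]] = "invalid"
        b, p = loc
        self.state[b][p] = "valid"
        self.data[b][p] = (lpn, payload)
        self.l2p[lpn] = loc

    def write(self, lpn, payload):
        loc = self._find_free()
        if loc is None:
            self.garbage_collect()
            loc = self._find_free()
            if loc is None:
                raise RuntimeError("no free pages available")
        self._place(lpn, payload, loc)

    def read(self, lpn):
        b, p = self.l2p[lpn]
        return self.data[b][p][1]

    def garbage_collect(self):
        """Reclaim the block with the most invalid pages (assumed policy)."""
        victim = max(range(len(self.state)),
                     key=lambda b: self.state[b].count("invalid"))
        if self.state[victim].count("invalid") == 0:
            return False  # nothing to reclaim
        for p, s in enumerate(self.state[victim]):
            if s == "valid":
                # Move data that is still needed before erasing the block.
                lpn, payload = self.data[victim][p]
                loc = self._find_free(exclude=victim)
                if loc is None:
                    raise RuntimeError("no room to relocate valid data")
                self._place(lpn, payload, loc)
        # Erase: the whole block becomes free again.
        self.state[victim] = ["free"] * self.pages_per_block
        self.data[victim] = [None] * self.pages_per_block
        return True
```

The host only ever sees logical page numbers. Rewriting the same logical page repeatedly lands on a new physical page each time, and the stale copies are reclaimed one block at a time.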

Limited program and erase cycles

The program and erase operations described above involve complex physical processes and relatively high voltages, which gradually wear out the cells. The choice of which blocks and pages to use is therefore further complicated by the limited number of program and erase cycles that flash cells can endure. To prevent early failure of individual blocks, the controller performs a process called ‘wear levelling’ to ensure all flash blocks are used equally.
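
As a rough sketch of the idea, one simple wear-levelling policy is to always pick the candidate block with the lowest erase count. The function names and the policy itself are assumptions for illustration. Real controllers combine several strategies.

```python
def pick_block(erase_counts, candidates):
    """Return the candidate block that has been erased the fewest times."""
    return min(candidates, key=lambda b: erase_counts[b])


def simulate(num_blocks, erase_operations, wear_level):
    """Count erases per block with and without the policy above."""
    erase_counts = [0] * num_blocks
    for _ in range(erase_operations):
        if wear_level:
            block = pick_block(erase_counts, range(num_blocks))
        else:
            block = 0  # naive: always reuse the same block
        erase_counts[block] += 1
    return erase_counts


print(simulate(4, 100, wear_level=False))  # [100, 0, 0, 0]
print(simulate(4, 100, wear_level=True))   # [25, 25, 25, 25]
```

Without levelling, one block absorbs every erase and would reach its cycle limit long before the others. With levelling, the same workload is spread evenly.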

Error Correction and Bad Block Management

Another important task of the controller is detecting and correcting errors when reading data, using Error Correction Coding (ECC) as efficiently as possible. Recent ECCs can correct over 120 bit errors within 1 kByte of user data. In terms of the Raw Bit Error Rate (RBER), this means the controller can tolerate up to roughly 1 in 70 of the bits read being incorrect.
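
The "1 in 70" figure follows from simple arithmetic. The short check below assumes that 1 kByte means 1024 bytes and ignores the extra parity bits an ECC codeword carries. The function name is hypothetical.

```python
def tolerable_error_fraction(correctable_bits, user_bytes):
    """Fraction of user-data bits that may be wrong and still be corrected."""
    return correctable_bits / (user_bytes * 8)


fraction = tolerable_error_fraction(120, 1024)
print(f"{fraction:.4f}")      # 0.0146
print(f"{1 / fraction:.1f}")  # 68.3, i.e. roughly 1 bit in 70
```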

If repeated failures occur in a block, or a block fails to erase, it is marked as bad and is not used again. For this purpose, flash memory devices are built with additional spare blocks: when a block's error rate exceeds the tolerance limit, a spare block is used in its place.
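
A minimal sketch of this bookkeeping might look as follows. The class `BadBlockManager`, the failure threshold of three and the remapping table are assumptions for illustration, not a documented controller interface.

```python
class BadBlockManager:
    """Hypothetical bad block table with a pool of spare blocks."""

    def __init__(self, num_user_blocks, num_spare_blocks, max_failures=3):
        # Spare blocks are numbered directly after the user blocks.
        self.spares = list(range(num_user_blocks,
                                 num_user_blocks + num_spare_blocks))
        self.remap = {}      # user block -> spare block now standing in for it
        self.bad = set()     # physical blocks that must never be used again
        self.failures = {}   # physical block -> number of reported failures
        self.max_failures = max_failures  # assumed tolerance limit

    def physical(self, block):
        """Return the physical block currently backing a user block."""
        return self.remap.get(block, block)

    def report_failure(self, block, erase_failed=False):
        phys = self.physical(block)
        self.failures[phys] = self.failures.get(phys, 0) + 1
        # An erase failure retires the block at once; otherwise the block
        # is retired when repeated failures reach the limit.
        if erase_failed or self.failures[phys] >= self.max_failures:
            self._retire(block)

    def _retire(self, block):
        self.bad.add(self.physical(block))
        if not self.spares:
            raise RuntimeError("no spare blocks left")
        self.remap[block] = self.spares.pop(0)
```

Once the spare pool is exhausted, the sketch raises an error. In a real device this is the point at which usable capacity or reliability can no longer be maintained.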

Health monitoring

Finally, let us look at tracking the current status and expected lifetime of the flash memory, which helps to avoid unexpected failures and data loss. As with hard disk drives, the standard Self-Monitoring, Analysis and Reporting Technology (SMART) allows the controller to report the health of the flash memory.
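
To give a feel for what such a report could contain, the sketch below derives two health indicators from counters the controller already keeps. The field names, the rated cycle count and the warning thresholds are all hypothetical. They are not actual SMART attribute definitions.

```python
def health_report(erase_counts, rated_pe_cycles, spare_total, spare_used):
    """Summarise wear and spare-block usage (illustrative fields only)."""
    average_erases = sum(erase_counts) / len(erase_counts)
    life_used = min(100.0, 100.0 * average_erases / rated_pe_cycles)
    spare_left = 100.0 * (spare_total - spare_used) / spare_total
    return {
        "life_used_pct": life_used,
        "max_erase_count": max(erase_counts),
        "spare_left_pct": spare_left,
        # Assumed thresholds: warn at 90% wear or 10% spares remaining.
        "warning": life_used >= 90.0 or spare_left <= 10.0,
    }


report = health_report([1500, 1500, 1500, 1500], rated_pe_cycles=3000,
                       spare_total=20, spare_used=5)
print(report["life_used_pct"], report["spare_left_pct"], report["warning"])
# 50.0 75.0 False
```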

State-of-the-art flash controller technology is therefore key to enabling flash-based storage products that offer high levels of endurance, reliability and operating life compared to hard disk storage.