In April of last year the LTO consortium (IBM, HP, and Quantum) introduced LTO-5, the latest generation of LTO. At the time I was myopically focused on data tiering technologies, which caused me to miss what was probably the most significant advance in tape technology in many years. Not appropriate for a StorageTek alumnus!
Although the capacity and throughput improvements are impressive, the real significance of this announcement is a tape file system that enables disk-like (random-like) search, which dramatically improves data-on-tape access times, simplifies on-tape file updates, and adds drag-and-drop functionality that simplifies on-tape data management. Combine this with a 3TB cartridge capacity and a 280MB/s transfer rate (both assuming 2:1 compression), media reliability that exceeds that of disk, and the natural greenness of a removable media architecture, and it is not surprising that the latest generation of LTO tape is being viewed as a legitimate storage option for unstructured big data, including rich digital media.
The perennial negatives of tape have been its slow sequential search, which bloats data access times, and the cumbersome need to scan the whole tape simply to know what it contains. LTFS (Linear Tape File System) is an innovative tape file system with the notable characteristics of being open source and application independent. The file system partitions the tape into two distinct, individually addressable, unequal segments. The smaller, quick-read segment contains descriptive metadata that enables quick (random-like) searching of the data held in the second, much larger segment. LTFS minimizes the traditional search performance negatives associated with tape, and while its performance will never approximate that of disk, LTO-5 tape with LTFS will deliver a data access response time that should satisfy the demands of many, if not most, applications in the big data world.
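To make the two-segment idea concrete, here is a toy sketch in Python. It is only a model of the concept described above, not the LTFS on-tape format: the class and field names (`ToyTape`, `Extent`, `start_block`, `block_count`) are invented for illustration, and a Python list stands in for the tape. The point it shows is that listing or locating a file needs only the small index, never a scan of the data segment.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    start_block: int   # first block of the file in the data segment
    block_count: int   # number of consecutive blocks it occupies


class ToyTape:
    """Toy model of a two-segment tape: a small index plus a large data area."""

    def __init__(self):
        self.index = {}        # small segment: file name -> Extent
        self.data_blocks = []  # large segment: append-only list of blocks

    def write_file(self, name, blocks):
        # Data is appended sequentially; only the index records where it went.
        self.index[name] = Extent(len(self.data_blocks), len(blocks))
        self.data_blocks.extend(blocks)

    def list_files(self):
        # Answered from the index alone; the data segment is never touched.
        return sorted(self.index)

    def read_file(self, name):
        # One index lookup gives the position, so there is no need to
        # read through everything written before the file.
        ext = self.index[name]
        end = ext.start_block + ext.block_count
        return self.data_blocks[ext.start_block:end]


tape = ToyTape()
tape.write_file("clip_a.mov", [b"a1", b"a2", b"a3"])
tape.write_file("clip_b.mov", [b"b1", b"b2"])

print(tape.list_files())
print(tape.read_file("clip_b.mov"))
```

On a real drive the seek to the right position still takes time, which is why the access is "random-like" rather than truly random.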
The key factors that separate LTFS from past attempts to solve these data-on-tape accessibility and management issues are as follows:
1. It is not a proprietary solution; it is open source, with interoperable versions available from multiple vendors. This flexibility will simplify and enhance data mobility.
2. The LTO-5 cartridge is self-contained and potentially self-describing, hosting both the file system and the data. This separation of the file system from the application addresses the concern, “After 30 years, will the application still be available to access the data?” It is a reasonable question, but because the file system is open source and separate from the application, the data on the cartridge should remain accessible for far longer.
With a native capacity of 1.5TB, a native transfer rate of 140MB/s (3TB and 280MB/s with 2:1 compression), a reliability (BER) that exceeds that of spinning disk media, and an energy efficiency profile that only removable media can deliver, LTO-5/LTFS-based tape solutions have a functional profile that uniquely positions them as long-term repositories for data-intensive, rich media applications.
Bottom line: The assault on tape by disk-based solutions that integrate data de-duplication loses its impact in non-dedupe data environments such as commercial video, surveillance video, digital photography, PDFs, images, and seismic/scientific data. These data types contain little if any duplicate data, other than where whole files have been duplicated. In fact, using a de-duplication solution in these data environments is likely to have the unintended and undesirable consequence of driving storage costs in the wrong direction: up.
Tape has a $/GB, kWh/GB, and sustained GB/sec advantage over tier 3 disk, and now, with its random-like accessibility and drag-and-drop manageability, tape has an answer to disk-based solutions, at least within the use case boundaries defined by rich media.
With the introduction of LTO-5 and LTFS, tape has an opportunity to regain its position as a legitimate participant in the enterprise storage hierarchy. However, the challenge for the LTO consortium is to build a comprehensive ecosystem of backup, archival, and rich media management applications compatible with LTFS. Not a showstopper, but a significant bump in the road to broad-based adoption of this technology.

3 comments:
We may be hopping over the bump sooner than some expect. Companies like Cache-A, StorageDNA, For-A, SGL and several others have already announced or delivered solutions with LTO-5/LTFS tape support.
Because LTFS IS a file system, files may be copied to and from tape simply by scripts. Any application can read and write files directly on tape. If API integration is desired, then think of LTO/LTFS as if you were writing files to disk. Use the same POSIX interface, with fopen(), fseek(), fread(), fclose(), etc. No need for any SCSI or special tape commands. In fact, your application would not know whether it is writing to tape or to disk (other than some occasional long seek times). Hence it is relatively easy to convert an existing non-LTFS tape archive system to work with LTFS tapes.
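As a rough illustration of that point, here is a minimal Python sketch that archives a file and reads part of it back using nothing but ordinary file calls (Python's open(), seek(), and read() are the counterparts of fopen(), fseek(), and fread()). The mount point is an assumption: a temporary directory stands in for wherever an LTFS volume would be mounted, and the helper names `archive_file` and `read_range` are made up for this example.

```python
import os
import shutil
import tempfile


def archive_file(src_path, mount_point):
    """Copy src_path into mount_point using ordinary file I/O."""
    dst_path = os.path.join(mount_point, os.path.basename(src_path))
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Large sequential writes are the friendliest pattern for tape.
        shutil.copyfileobj(src, dst, 1024 * 1024)
    return dst_path


def read_range(path, offset, length):
    """Read `length` bytes starting at `offset`, exactly as on disk."""
    with open(path, "rb") as f:
        f.seek(offset)          # on tape this may be a long seek
        return f.read(length)


# Demo: a temporary directory stands in for the LTFS mount point
# (hypothetical; a real mount path depends on the system).
work_dir = tempfile.mkdtemp()
mount_point = os.path.join(work_dir, "ltfs_mount")
os.mkdir(mount_point)

src_path = os.path.join(work_dir, "footage.bin")
with open(src_path, "wb") as f:
    f.write(b"0123456789" * 100)

archived = archive_file(src_path, mount_point)
archived_size = os.path.getsize(archived)
chunk = read_range(archived, 5, 10)
print(archived_size, chunk)

shutil.rmtree(work_dir)  # clean up the stand-in directories
```

Nothing in the code knows about tape. Pointing `mount_point` at a mounted LTFS volume instead of the temporary directory would be the only change, under the assumption that the volume is mounted like any other file system.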
Anonymous .. many thanks. It is indeed encouraging to see the ecosystem growing. Looking forward to seeing some of the more familiar industry names lining up in support.
Bill, to your point about the growing ecosystem around LTFS, I thought I'd point you to http://oss.oracle.com/projects/ltfs/
Oracle now claims they are supporting LTFS with a single software stack for HP LTO-5, IBM LTO-5 and the T10K-C.