(From ACM TechNews)
Computerworld (07/25/05) P. 39; Mearian, Lucas
To address the costs of storing an ever-growing body of data, and to comply with federal regulations demanding that more content be stored for a longer time, companies are in search of new methods for long-term digital storage.
Currently, transferring the information from one medium to another is the only way to extend data’s lifespan, though alternatives are in the works. Some turn to plain-text formats, such as ASCII and Unicode, which offer compatibility but do not support enhanced features such as graphics. Alternatively, Adobe’s PDF/A offers long-term data storage without the backward-compatibility issues.
The technologies with the most potential are XML-based data storage designs. For media, the Storage Networking Industry Association (SNIA) is addressing the challenge of the 100-year archive in its search for a format that will always be readable. Others are not betting on one particular format, instead opting for expansive disk arrays that ensure the data are readable and available, but even they do not solve the long-term problem.
Backing up data is central to any archival project, including the open-systems management center the state of Washington recently created. Washington paid considerable attention to metadata to aid future searches. Each document is tagged with information specific to its creation, such as its author, the location and time of its creation, and even the computer used to produce it. The system also standardized its formats: All Word documents are converted into PDF files, and all images are turned into TIFF files.
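As a rough illustration of the tagging and format standardization described above, the Python sketch below builds one metadata record per document and picks the archival file name. It is not Washington's actual system: the field names, the file-extension mapping, and the helpers archive_name_for and make_record are all hypothetical, and the sketch only computes the target name rather than performing any real conversion.

```python
from dataclasses import dataclass, asdict
from pathlib import PurePath

# Assumed extension mapping. The article only states that Word documents
# become PDF files and images become TIFF files; these suffixes are guesses.
TARGET_SUFFIX = {
    ".doc": ".pdf",
    ".jpg": ".tif",
    ".jpeg": ".tif",
    ".png": ".tif",
    ".gif": ".tif",
}


@dataclass(frozen=True)
class ArchiveRecord:
    """Creation metadata kept alongside each archived document (hypothetical schema)."""
    source_name: str
    archive_name: str
    author: str
    location: str
    created_at: str   # ISO 8601 timestamp of creation
    workstation: str  # computer used to produce the document


def archive_name_for(source_name: str) -> str:
    """Return the file name in the standardized archival format.

    Files whose extension is not in TARGET_SUFFIX keep their original name.
    """
    path = PurePath(source_name)
    target = TARGET_SUFFIX.get(path.suffix.lower())
    return str(path.with_suffix(target)) if target else source_name


def make_record(source_name, author, location, created_at, workstation):
    """Tag a document with its creation metadata and archival name."""
    return ArchiveRecord(
        source_name=source_name,
        archive_name=archive_name_for(source_name),
        author=author,
        location=location,
        created_at=created_at,
        workstation=workstation,
    )


if __name__ == "__main__":
    record = make_record(
        "budget.doc", "J. Doe", "Olympia", "2005-07-25T09:00:00", "ws-17"
    )
    print(asdict(record))
```

Keeping the original file name next to the archival name in the same record is one way to let a later search trace a PDF or TIFF back to the document it came from.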
Long-term data storage is still evolving, though, and there is minimal continuity at this point. “There aren’t what we’d call standards for long-term archiving, only best practices,” said Strategic Research’s Michael Peterson, who also serves as a program director for the SNIA Data Management Forum.