As part of the IT Architecture Initiative, the Office of Information Technologies
(OIT) is producing a series of papers outlining directions in information
technology architecture.
In the spirit of RFCs, the papers are written to help understand and to open
dialogue about information technology trends at Cornell, with the ultimate
goal of improving the use and interoperability of information technology
services throughout Cornell.
NOTE: The genesis of this paper was a joint effort of the Cornell University Library and the Office of Information Technology.
Digital Asset Management: An Introduction to Key Issues
Prepared by R. David Vernon and Oya Y. Rieger
Revised 2/02
REV 2
SYNOPSIS
The goal of this paper is to highlight the critical challenges the Cornell community is facing in managing its rapidly growing "digital assets." The paper aims to foster a common understanding of the basic principles of digital asset management and to suggest closer collaboration among several stakeholders to devise scalable, common solutions.
This paper includes:
The Storage Challenge at Cornell
Primary Digital Asset Management System Issues
Action Items
Closing Thoughts
What is Digital Asset Management?
Digital asset management is the systematic management of digital data, such as text, image, audio, and video files, so that they can be reused and re-purposed. It aims to maximize the value of these assets by facilitating easy storage and retrieval while protecting and, at times, enhancing their utility. A digital asset is any form of salient information that plays a role in an institution's efficiency and effectiveness. Some examples of digital assets are:
- Reports
- Scientific data
- Databases
- Image repositories
- Web sites
- Distance Learning Courses
Some goals of digital asset management are:
- Enable ownership control (rights management) and security.
- Ensure the authenticity and integrity of documents.
- Create reusable content that can support both short- and long-term use.
- Ensure effective management of assets to maximize efficiency, productivity and profitability.
- Protect the integrity of data (storage and transmission requirements).
- Ensure the longevity of data (archiving).
With the
increasing dependency on digital information, institutions are recognizing
that effective management of these assets is critical. As the definition
of digital asset management continues to evolve, some use the phrase
interchangeably with "persistent archives," "digital
preservation," or "digital archiving." [The main goals
of digital preservation are to ensure content (bit stream) integrity,
protect technical context, maintain provenance data, and enable usability
(locate, retrieve, use). Although "digital preservation" and
"digital archiving" are sometimes used interchangeably, "archiving"
historically has a slightly different meaning. Digital archiving refers
to initiatives assumed by institutions with a mandated responsibility
to maintain digital information for legal, fiscal, evidential, or historical
purposes. In addition, the ultimate goal of a persistent archive is
to preserve not only the bits associated with the original data, but
also the context that permits the data to be interpreted.] However,
for the purposes of this paper, the authors interpret digital asset
management as the broadly scoped challenge of deploying highly valued
repositories for digital information.
The Storage Challenge at Cornell
With the explosive growth in the use of data-intensive applications and Internet resources, there is an unprecedented demand for reliable digital archives to store and distribute Cornell's "digital assets." To address this growth within Cornell, a growing number of departments are developing services that could be placed under the general rubric "digital libraries." Some worthy of note include:
- Cornell Lab of Ornithology's archive of natural sounds.
- Cornell Computer Science department's digital library research initiatives.
- Cornell University Library's proposed central digital asset management services.
- Courses stored at Cornell Computing Services.
- Site for Science.
All of these projects
focus on a common need: the secure and reliable storage of diverse
forms of digital information that allows "best" use of and
access to those assets over time. (It should be noted that the term
"reliable" also implies "persistent" storage of
data. One approach for persistent storage is being developed jointly
by the U.S. National Archives and Records Administration and the San
Diego Supercomputer Center. The approach focuses on storing the information
objects that make up a collection and identifying their metadata attributes
and behaviors that can be used to recreate the collection. See also:
R. Moore et al. Collection-Based Persistent Digital Archives, D-Lib
Magazine, March 2000, Volume 6 Number 3 [Part 1]; Number 4 [Part 2]).
Primary Digital Asset Management System Issues
Implementers of digital collections often struggle with common challenges and questions. These include but are not limited to:
- Selection of hardware and hardware limitations.
- Managing the persistence of assets within the data repository.
- Cataloging (metadata) processes.
- Control and security of digital assets.
- Rights management.
- Content creation.
- Interoperability among digital libraries.
- Long-term funding.
Hardware and Related Server Applications
A broad
range of hardware and software tools are used to structure and store
digital assets.
Software
strategies range from simple "flat file" copies of digital
assets structured by the directory tools of a given computer operating
system to advanced and specialized databases. These digital library
or data-storage applications not only efficiently record data on media,
but also often provide advanced cataloging and digital asset management
tools. In addition, commercial applications vary in focus: for example,
some are developed primarily
for the management of digital video assets, others for digital images,
and others for traditional text repositories. At Cornell, departments
currently leverage a mix of these commercial and locally developed applications.
The types
of storage hardware most commonly used include high-speed disk drives,
magnetic tape, and optical media. These storage devices have varying life
expectancies. Digital assets stored on them must be refreshed periodically
(refreshing involves copying content from one storage medium to another)
to assure the information's integrity over time. Though there are
no universally applied standards on media life, most institutions
assume a less than 10-year life span for magnetic media, and a less than
100-year life span for high quality optical media. Pragmatically, the
life of any storage device is more likely to be influenced by the
environment in which it is used. Frequently used optical media stored in harsh
environments may fail before archived magnetic tapes stored in optimal
conditions. Simply stated, no mainstream system in use at Cornell today
is a "write once and forget about it forever" storage architecture.
All users who are concerned about persistence of their digital assets
should be cognizant of their respective storage system's limitations
and requirements for data refreshing.
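As an illustration of what a single refresh step might involve, the following is a minimal sketch, assuming assets are ordinary files and that a SHA-256 checksum is an acceptable integrity test. The function names and the choice of checksum are assumptions made for this example; they do not describe any system in use at Cornell.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, target: Path) -> str:
    """Copy source to target and confirm the copy is bit-for-bit identical.

    Returns the checksum on success and raises IOError on a mismatch.
    """
    before = sha256_of(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise IOError(f"refresh of {source} failed verification")
    return after
```

Recording the returned checksum alongside the asset would allow the next refresh to detect silent corruption that occurred between refreshes, not only errors introduced by the copy itself.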
A myriad
of architectures based on the above hardware can be used to create digital
libraries. Primary among these are:
- RAID (Redundant Arrays of Inexpensive Disks) hard disk systems.
- Robotic magnetic tape-based storage systems.
- Robotic optical disk storage systems (CD / DVD).
- Storage Area Networks (SANs).
- Combinations of the above hardware systems forming hierarchical data storage systems.
RAID arrays
are by far the most commonly used solution. These systems are simple,
cost effective, and allow very fast storage and access to digital information.
However, the life expectancy of the hard disk drives that comprise RAID
arrays is fairly short. Most system administrators expect a functional
life of less than 5 years and many plan on less than 3! The ultimate
cost of long-term storage of data on only RAID disk systems can be higher
than initially assumed.
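The arithmetic behind this point can be sketched as follows. This is a rough illustration only: the purchase price is a made-up figure, and the calculation ignores staff time, power, and falling hardware prices.

```python
import math


def hardware_generations(retention_years: float, device_life_years: float) -> int:
    """Number of hardware purchases needed to cover the retention period."""
    return math.ceil(retention_years / device_life_years)


def lifetime_hardware_cost(retention_years: float,
                           device_life_years: float,
                           cost_per_purchase: float) -> float:
    """Total hardware spend if every generation costs the same (an assumption)."""
    return hardware_generations(retention_years, device_life_years) * cost_per_purchase


# Hypothetical example: keep data for 20 years on arrays replaced every 4 years,
# at an assumed 10,000 per array. Five purchases are needed, not one.
total = lifetime_hardware_cost(20, 4, 10_000)
```

With a 3-year replacement cycle the same 20-year retention period requires seven purchases, which is why the planned device life matters as much as the initial price.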
Tape-based
systems are an order of magnitude less expensive (per unit of data stored)
than spinning disk systems. In addition, in optimal environments tape
life can easily exceed 10 years (however, if tapes are accessed often
and the environment is harsh, tape life can be dramatically reduced).
The disadvantage of tape-based systems is increased length of time required
to access data from the tape when compared to magnetic and optical disk
alternatives. To date, the most common deployments of tape-based storage
at Cornell are used to back up disk storage or to archive static data.
An example of this type of system is the CIT EZ-Backup service.
Optical
media fills a niche between magnetic disk and tape. While it is not
clear that optical storage systems have a lower total cost of ownership
than disk systems, they offer excellent access speeds when compared
to tape storage architectures and far superior media life compared to
tape-based or disk-based systems.
Storage
area networks are most often groups of disk arrays linked via a dedicated
high-speed fiber network. Theoretically, this storage area "network"
acts as a common backplane to integrate and present multiple storage
devices to servers and clients attached to the network as a single
storage resource. Pragmatically, the complexities of metadata associated
with the positioning of bitstreams on multiple disk platters and issues
of security have kept true SANs from gaining large market shares. Many
companies, such as IBM and EMC, have products that are marketed under the
general rubric of a SAN resource, but deviate from a "pure"
SAN and are more closely related to large integrated RAID disk systems.
Storage
Area Networks should not be confused with Common File Systems used by
many institutions to share data. Common File Systems are applications
running on servers that allow multiple hosts to access a file via an
open data network. Examples of Common File Systems include: Distributed
File System (DFS), Andrew File System (AFS), Network File System (NFS),
Novell Netware, and Microsoft's file serving suite.
Technology
organizations often combine disk, tape, and optical storage systems
in order to get the best results for large online and archival storage
projects. Hierarchical data storage systems automatically migrate data
from one storage medium to another to take advantage of a given technology's
benefits. For example, highly active data remains on relatively high
cost disk, while static data is migrated off to low cost tape. While
these systems at first seem to offer the best of all worlds, their complexity
and management overhead should not be underestimated! Although there
is no clear rule for when it makes sense to deploy a hierarchical storage
system, those with storage requirements of greater than 5 TB might want
to consider this solution (See also: R. Moore et al. Collection-Based
Persistent Digital Archives, D-Lib Magazine, March 2000, Volume 6 Number 3 [Part 1]; Number 4 [Part 2]).
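The migration rule described above (active data on disk, static data on tape) can be sketched in a few lines. The 180-day idle limit, the tier names, and the Asset structure are assumptions chosen for illustration; real hierarchical storage products apply far more elaborate policies.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Asset:
    name: str
    last_accessed: date
    tier: str = "disk"


def migrate(assets, today, idle_limit=timedelta(days=180)):
    """Move idle assets to tape and recently used assets back to disk.

    Returns the names of assets whose tier changed.
    """
    moved = []
    for asset in assets:
        wanted = "tape" if today - asset.last_accessed > idle_limit else "disk"
        if asset.tier != wanted:
            asset.tier = wanted
            moved.append(asset.name)
    return moved
```

Even this toy version hints at the management overhead: someone must choose the idle limit, track access dates reliably, and handle the delay when a user requests an asset that has already moved to tape.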
Data Persistence
Ensuring short- and long-term persistence of digital assets requires attention
to the applications used to structure the data as well as the storage
media. If the original application or utility does not exist in
the future, even well preserved assets may be useless. National standards
bodies are currently promulgating data type standards, and projects exist
that intend to structure assets into archives
that assure future interpretation.
Pragmatically,
given the pervasive nature of many data file types, fear of lost interpretation
tools may not be as valid an issue as was once believed. It is unlikely
that future file interpretation will be problematic for digital archives
containing assets formatted in standard formats such as ASCII, .gif,
.tif, .mpeg, .jpeg, .ps, and .html.
A more
likely and insidious problem would be loss of access to the database
structure used to organize and "file" the digital assets.
Many digital asset storage "databases" used to file assets
on tape, disk, and optical media do so in a proprietary format.
In short, without the vendor's software that placed the given file
on a tape or disk, you cannot retrieve the information. This fact bears
consideration by anyone trying to design a long-term digital asset
management architecture. Unfortunately, most major corporations, such
as IBM and Microsoft to name but two, are unwilling to assure open access
to source code (the human-readable form of a given computer application
from which the machine-readable form has been generated), and often
will not even escrow the application(s) to assure users access in the
case of the company's demise. This control over the storage application
is one of the reasons the largest data archives in the world use a
consortium-derived storage application that assures access to needed software and
stores data on media in
a manner that open source tools can extract.
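One modest way to reduce dependence on a proprietary filing database is to keep a plain-text index next to the assets themselves. The sketch below assumes the assets are ordinary files in a directory tree and writes a JSON manifest; the manifest name and its fields are hypothetical and are not the format used by any archive mentioned in this paper.

```python
import hashlib
import json
from pathlib import Path


def write_manifest(archive_dir: Path, manifest_name: str = "MANIFEST.json") -> Path:
    """Record every file's relative path, size, and SHA-256 in plain JSON."""
    entries = []
    for path in sorted(archive_dir.rglob("*")):
        if path.is_file() and path.name != manifest_name:
            data = path.read_bytes()  # fine for a sketch; large files need chunked reads
            entries.append({
                "path": path.relative_to(archive_dir).as_posix(),
                "bytes": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    manifest = archive_dir / manifest_name
    manifest.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return manifest
```

Because the manifest is plain text in a widely documented format, any future tool can read it and locate the assets without the software that originally wrote them.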
Metadata
Metadata has become a buzzword in various communities that design, create, and preserve information systems. Expanding the boundaries of educational and cultural institutions, metadata is now seen as one of the essential components of success for any organization that owns a digital resource.
What is Metadata?
"Without metadata, content is just bits."
Fortune, November 27, 2000
Metadata
is a structured description of an information object or a collection
of information objects. Some examples of information objects are:
- Books
- Journal articles
- MS Word files
- Student registration databases
- Image files
- Video clips
- Course web sites
Metadata
describes a range of characteristics that are pertinent to text, audio,
video, image, and multimedia.
The history
of metadata goes back to card catalogs familiar from libraries, museums,
and archives. In the traditional context, each entry in the card catalog
(a 3x5 card) enabled users to locate material by title, author, classification
number, or subject heading, providing both intellectual and
physical access to a collection. The advent of the Web and the proliferation
of multimedia content have significantly broadened the scope and functionality
of metadata. Here are some examples of how people are using metadata
to facilitate their work and research:
- Professor Jacob has developed a digital image collection of wheat to support his research project. For each image, he carefully records metadata, including an image ID, wheat name and type, size of crop, time the image was created, and location, so that he can organize and manage his collection by these information elements.
- Professor Nickel is using the Family Welfare statistical database created by one of her Cornell colleagues to support her child poverty case study. She is relying on the technical documentation to interpret raw data found in the database. The technical documentation includes various metadata elements, such as information on how the data were collected and tallied, and a description of each variable included in the study.
- The copyright status of images used in CyberTower study rooms is managed through a rights management database, which documents the copyright clearance process. (Cornell Adult University's CyberTower Web site offers an array of "study rooms," each representing a short course taught by a Cornell faculty member. This subscription-based system targets Cornell alumni.) This metadata is essential for protecting Cornell Adult University from any copyright violation claims, in addition to securing the faculty's right to the images created by them.
- The Bureau of National Affairs Web site uses HTML Meta Tags so that its Web pages can easily be located. Using metatags, the staff assigns keywords to describe the contents of the Web site. These keywords, which serve as metadata, are used by Internet search engines to increase precision. (HTML Meta Tags are used to specify to Web search engines how to index a document. Metadata for the Web is still in its infancy and continues to evolve. There are not yet any commonly accepted standards. Consequently, the vast majority of Web tools (such as search engines) lack any common infrastructure for specifying or using the properties of Web content.)
- BJTV, a local television channel, uses content rating metadata to disclose the nature of its programs. This practice allows families to filter content and block inappropriate material from their children.
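To make the Meta Tag example above concrete, the sketch below reads keyword and description metadata from a page the way a simple indexing tool might. The page content is invented for illustration and is not taken from any real site; only the Python standard library is used.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        fields = dict(attrs)
        name, content = fields.get("name"), fields.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# A hypothetical page; the keywords are assumptions made for this example.
PAGE = """<html><head>
<title>Sample page</title>
<meta name="keywords" content="labor law, employment, regulation">
<meta name="description" content="A hypothetical page used for illustration.">
</head><body></body></html>"""

reader = MetaTagReader()
reader.feed(PAGE)
keywords = [k.strip() for k in reader.meta["keywords"].split(",")]
```

Nothing in the page body is consulted here, which illustrates both the appeal of metatags (cheap, explicit description) and their weakness (the description is only as accurate as whoever wrote it).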
Metadata Types
Metadata relays information about content, context, quality, condition, and structure of information objects. The most common categories of metadata are:
- Descriptive
- Structural
- Administrative (technical or preservation)
It is important to note that there is a significant level of overlap among these classes, and that the types continue to evolve in response to changes in the information environment. In addition to these classifications, we see the emergence of specialized metadata sets to support certain purposes. For example, the advent of technology-mediated instruction introduced a new category of metadata called "instructional metadata."
Descriptive metadata facilitates resource discovery by providing intellectual access points such as title, author or creator, subject terms, and location. This category of metadata greatly contributes to the interoperability of systems by providing common frameworks and data dictionaries. For example, one can search the Cornell Online Library Catalog using a Library of Congress Subject Heading, which is a controlled vocabulary for descriptive metadata.
Examples of Descriptive Metadata
(These descriptive metadata examples are based on the Dublin
Core Metadata Initiative, which is used to supplement existing methods
for searching and indexing Web-based metadata.)
Journal Article from a Web Site
Title: Autobiography of W. J. Stillman
Creator: Stillman, William James
Subject:
Identifier: Atlantic Monthly, v. 85, issue 507 (Jan. 1900), pp. 1-16
Identifier (URI): http://cdl.library.cornell.edu/cgi-bin/moa/moa-cgi?notisid=ABK2934-0085-3
Publisher: Atlantic Monthly
Date (W3C-DTF): 2000-01
Type: text
Format: image/tiff
Student Activities Database
Title: Student Activities Database
Creator: Cornell University, Office of the Dean of Students
Subject (LCSH): Student activities--New York (State)--Ithaca--Databases.
Description: A database locally maintained by the student affairs staff of the Office of the Dean of Students, and networked in that office (Day Hall, B32)
Type: dataset; service
Format: html, xml, pdf
Date: 1998-01
Economics 101 Web site
Title: Economics 101
Creator: Drake, William E. (Professor)
Date (Valid): Spring 2001
Description: Prerequisites: Completion of Economics 100.
Description (TOC): Syllabus; Course readings; Quizzes; Chat room; Library resources
Identifier (URI): http://www.nyed.edu/courseinfo/Ec101
Subject: math; algebra; calculus
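Records like those above are often exchanged as XML. The following sketch serializes a few of the journal article's elements using the Dublin Core element namespace. The enclosing "record" element and the helper function are assumptions made for this example, not part of the Dublin Core specification.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element holding one dc:* child per (element, value) pair."""
    record = ET.Element("record")  # wrapper name is hypothetical
    for element, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
        child.text = value
    return record


article = dublin_core_record([
    ("title", "Autobiography of W. J. Stillman"),
    ("creator", "Stillman, William James"),
    ("publisher", "Atlantic Monthly"),
    ("type", "text"),
    ("format", "image/tiff"),
])
xml_text = ET.tostring(article, encoding="unicode")
```

A list of pairs is used rather than a dictionary because Dublin Core elements are repeatable: a record may carry several identifiers or subjects.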
Structural metadata is used to navigate an information object. It assists in identification and retrieval of structural sections, such as chapters, sections, illustrations, video segments, etc. Structural metadata also identifies relationships among collections. For example, one can note that there are three versions of a certain file, in .GIF, MS Word, and .PDF formats. The role of structural metadata is growing as the current computing capabilities can use this information in an automated way to search, manipulate, and interrelate information objects.
Examples of Structural Metadata
Journal Article from a Web Site
Pages: pp. 34-56
Footnotes: pp. 55-56
Table 1: p. 36
Illustration 1: p. 37
Student Summer Activities Database
Database Fields: student activity title, place, number of students enrolled, counselor, budget for the activity, starting and ending dates.
Relational Fields:
Related Databases:
Economics 101 Web site
Section Titles: syllabus, course readings, quizzes, chat room, library resources
Prerequisites: Economics 100
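Because structural metadata is machine-readable, software can use it to navigate an object. As a small sketch, the journal article example above can be held as page ranges and queried; the dictionary layout and function name are assumptions made for illustration.

```python
# Page ranges taken from the journal article example above.
ARTICLE_STRUCTURE = {
    "Pages": (34, 56),
    "Footnotes": (55, 56),
    "Table 1": (36, 36),
    "Illustration 1": (37, 37),
}


def parts_on_page(structure, page):
    """Return the names of the structural parts that include the given page."""
    return [name for name, (first, last) in structure.items() if first <= page <= last]
```

A viewer could use such a lookup to jump directly to "Table 1" or to tell a reader which parts of the article appear on the page being displayed.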
Administrative metadata
supports short- and long-term management of information. A related but
broader category, preservation metadata, aims to document an information
object's lifecycle to facilitate its maintenance and development. (The Open
Archival Information System (OAIS) Reference Model initiative aims to
articulate the system functionality and components needed to preserve any
type of information over any length of time. This
model presents a preservation metadata model with a high level framework
for entities, functions, and associated administrative activities.)
Examples of data elements in the administrative
metadata category include media specifications, file format, compression
ratio, authentication data, security, and maintenance information. Administrative
metadata helps to document an information object's use, function,
and history. Information on intellectual property rights, such as contractual
terms related to the document's use and distribution, is also included
in this category.
Examples of Administrative Metadata
Journal Article from a Web Site
File Format: GIF
Compression: LZW
Storage Media: Sun Microsystems CIT Storage Facility
System Administrator:
Change History: Moved from UNIX to Sun Microsystems in June 2000.
Copyright: Journal is public domain. Rights on digital version belong to Cornell University Library.
Student Activities Database
Access Restrictions: Use of the database is limited to the Student Activities staff.
Database Software: MS Excel Version 2.1 Windows
Location: Day Hall, Equipment Room, G1
System Requirements: Windows NT
System Administrator: Joan Break
Backup Cycle and Method: Daily through EZ-Backup
Economics 101 Web site
System Requirements: CourseInfo Version 1.0
System Administrator: CIT/ATS
Access Limitations: Requires a password and a user ID
Copyright Clearance: Two images required copyright limitation (with a link to the Copyright Database).
Use Rights: Content owned by Professor Drake; permission needs to be granted for the site or part of it to be incorporated.
Mime Types: HTML files, style sheets, .GIF images, MPEG3 video
Content Updates: 1/5/2001, 3/6/2001, 6/6/2001
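Administrative metadata becomes most useful when maintenance decisions can be derived from it. The sketch below flags assets whose storage media have reached a planned life, using media type and write date. The planning figures are assumptions loosely based on the life spans quoted earlier in this paper, and the record fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Planning figures are assumptions for illustration, not measured values.
PLANNED_LIFE_YEARS = {"disk": 3, "magnetic tape": 10, "optical": 100}


@dataclass
class AdminRecord:
    asset: str
    file_format: str
    storage_media: str
    written: date


def due_for_refresh(records, today):
    """Return asset names whose media age has reached its planned life."""
    due = []
    for record in records:
        age_years = (today - record.written).days / 365.25
        if age_years >= PLANNED_LIFE_YEARS[record.storage_media]:
            due.append(record.asset)
    return due
```

A report of this kind is only possible if the storage medium and write date were recorded when the asset was stored, which is the practical argument for capturing administrative metadata at creation time.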
Role of Metadata in Digital Asset Management
"Digital assets are only as good as the metadata describing them."
Web Techniques, April 2001
Digital asset management relies on the existence of descriptive, structural, and administrative metadata to support the following processes:
- Enable usability and efficient distribution of digital assets to staff and users.
- Facilitate the reliable storage and retrieval of digital assets.
- Ensure authentication of digital content through digital signatures, public key encryption, etc.
- Control unauthorized access to digital content.
- Protect intellectual property rights.
- Ensure longevity of digital assets in the face of rapidly changing technical and organizational infrastructures.
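As a simplified illustration of the authentication item above, the sketch below stores a keyed digest with an asset and later checks the content against it. Note the substitution: this uses a shared-secret message authentication code (HMAC) from the Python standard library, not a public key digital signature, which would require an additional cryptographic library. The function names are assumptions.

```python
import hashlib
import hmac


def sign(content: bytes, key: bytes) -> str:
    """Return a keyed SHA-256 tag to store in the asset's metadata."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()


def is_authentic(content: bytes, key: bytes, stored_tag: str) -> bool:
    """Check content against the stored tag using a constant-time compare."""
    return hmac.compare_digest(sign(content, key), stored_tag)
```

The design difference matters in practice: with a shared secret, anyone able to verify an asset can also forge a tag, whereas a public key signature lets a global community verify content that only the key holder could have signed.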
It is too risky to rely on institutional memory for tracking the development and use of information systems. There needs to be a common and reliable location for project teams to record information such as the technical specifications of a system, maintenance procedures, access rights, etc. Digital asset management metadata serves this purpose and assists in keeping track of legal requirements, privacy concerns, or other proprietary interests. This metadata also ensures that potential users can make an informed decision about whether data is appropriate for their intended use. With the advent of e-commerce, usage-tracking data (e.g., number of logins) is becoming an integral part of digital asset management metadata.
Metadata Standards and Structures
Metadata
standards offer common principles such as data dictionaries (controlled
vocabulary / subject headings for a certain discipline) and allow information
to be easily and consistently located and managed. For example, they
enable interoperability by allowing users to search across a group of
image collections using certain access points. Different communities
propose, design, and maintain metadata standards to support their specific
needs. The main goal of adhering to a national or local standard is
to enable interoperability. For example, if a common metadata standard
is implemented for Cornell course Web sites, users can search across
these sites to locate courses that are related to a certain topic or
to find a site that includes audiovisual illustrations. The full potential
of metadata is achieved only when an institution adheres to a standard
and its information and retrieval systems take advantage of this
framework. A successful example of this vision is demonstrated by the
Multimedia Educational Resource for Learning and Online Teaching (MERLOT).
Designed primarily for faculty and students in higher education, MERLOT
facilitates the sharing of teaching-learning materials by organizing
them in a searchable database, which is based on the Instructional Management
Systems metadata standard. Equally important to standards is the structure
used to input metadata. XML (Extensible Markup Language) is emerging
as a universal format. Like HTML, XML uses tags and attributes to record
metadata embedded in information resources.
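The course Web site scenario above can be sketched briefly: once separate collections share element names, a single query can span them. The collections, the second course title, and the function below are hypothetical and serve only to illustrate the idea.

```python
def search(collections, element, term):
    """Find records in any collection whose given element mentions the term."""
    term = term.lower()
    hits = []
    for collection_name, records in collections.items():
        for record in records:
            values = record.get(element, [])
            if any(term in value.lower() for value in values):
                hits.append((collection_name, record["title"][0]))
    return hits


# Two hypothetical course collections that happen to share element names.
COURSE_SITES = {
    "economics": [{"title": ["Economics 101"],
                   "subject": ["math", "algebra", "calculus"]}],
    "physics": [{"title": ["Hypothetical Physics 101"],
                 "subject": ["mechanics", "calculus"]}],
}

calculus_courses = search(COURSE_SITES, "subject", "calculus")
```

If one department had called the element "topic" and another "keywords", the same query would silently miss records, which is the practical cost of not agreeing on a standard.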
Several groups are engaged in metadata standards development, including:
- Metadata for Education Group
The Metadata for Education Group (MEG) serves as an open forum for debating the description and provision of educational resources at all educational levels across the United Kingdom.
- Dublin Core Metadata Initiative (DCMI)
Identifies a small, simple set of metadata elements that can be used by any community to describe and search a wide range of digital resources. A joint DC, IEEE, and IMS effort is developing instructional metadata.
- IMS Learning Resource Metadata Specification
Aims to define the necessary metadata elements to support widespread reuse, discovery and sharing of learning materials via the Internet. The goal is to simplify the process of locating learning materials on the Web.
- The Digital Object Identifier (DOI®)
Provides a framework for managing intellectual content, for linking customers with content suppliers, for facilitating electronic commerce, and for enabling copyright management for all types of media.
Control and Security of Digital Assets
For many digital library deployments it is important to consider access and control issues. While limiting local access to a known and limited group can be trivial, providing only authorized access to a global community, and then maintaining some form of control over the assets once accessed, can be a daunting challenge.
Along with local
authentication and authorization strategies, national initiatives such
as evolving Public Key Infrastructure services must be understood by any department
interested in providing regulated access to a global community. Without
such tools to authenticate and in turn authorize access, stored digital
assets may easily become accessible to unintended parties. For sensitive
or copyrighted information this may constitute a considerable liability
for the archive's owner.
In addition to the need to control initial access to the digital asset resource, many owners of the assets will require control or at least future identification of their assets once extracted. To accomplish this, many organizations mark the digital assets with digital "watermarks" or digital "fingerprints." For example, owners of an image can digitally fingerprint the image to provide irrefutable proof that the image is theirs, a valuable utility if another party is trying to take credit for the image. These fingerprints are often "invisible" bit streams embedded into the larger bit streams of the asset.
Though it is beyond the scope of this paper to provide detail on evolving text, sound, image and digital video fingerprinting and watermarking technologies, it is important to note that the systems are not trivial and require significant command of the evolving technology.
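Without describing any production watermarking scheme, the basic idea of an "invisible" bit stream can be shown with a toy example that hides a short mark in the least significant bit of each sample of an asset. This is an assumption-laden sketch for intuition only: a mark embedded this way is destroyed by recompression or resizing, and real systems use far more robust techniques.

```python
def embed(samples: bytes, mark: bytes) -> bytes:
    """Hide mark in the least significant bit of successive samples."""
    bits = [(byte >> shift) & 1 for byte in mark for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("asset is too small to carry this mark")
    marked = bytearray(samples)
    for index, bit in enumerate(bits):
        marked[index] = (marked[index] & 0xFE) | bit
    return bytes(marked)


def extract(samples: bytes, mark_length: int) -> bytes:
    """Read mark_length bytes back out of the least significant bits."""
    out = bytearray()
    for start in range(0, mark_length * 8, 8):
        value = 0
        for sample in samples[start:start + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)
```

Each sample changes by at most one unit of intensity, which is why such a mark is imperceptible in an image or sound file while remaining recoverable by anyone who knows where to look.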
Interoperability of Digital Libraries
Campus organizations that create systems to store digital assets do so in parallel with a larger Cornell, national and global community. In an ideal world tools used to extract information from one of these digital stores would work with another. This can only be accomplished if there is a universally agreed upon open standard and process for the retrieval, aggregation and presentation of assets stored. However, no such universally accepted digital library standard exists, and at Cornell, and throughout the world, there remains a mix of digital storage strategies.
This does not mean that departments at Cornell should ignore the challenge of creating extensible and interoperable architectures that will allow broad access. Minimally, departments should:
- Examine metadata suites that are consistent with peer group schemas.
- Explore resources available from the Cornell University Library.
- Understand research being performed within the Cornell Digital Library Research group.
The key is to understand the added value "standard and open" asset management architectures might offer a given department. If the value is there, the department should work diligently to implement them.
Funding and Institutional Accountability
One final, key element in the deployment and long-term maintenance of a digital asset library is often left unsaid: funding. All too often institutions create repositories for digital assets without addressing the need for long-term, immutable funding streams to cover ongoing operating costs. The best designed storage systems will fail as soon as the funding to replace aging hardware or maintain staff runs out. All departments worried about persistent storage must create base funding streams that are assured for the expected life of the archive.
Action Items
There are several areas on which CIT, Cornell University Library, Cornell Digital Library Research Group and other Cornell departments could collaborate that offer value to the Cornell community at large. These are:
- Foster a common understanding of the challenges.
- Foster willingness to work cognizant of, and when appropriate, in concert with other campus digital asset management initiatives.
- Sponsor digital asset management forums to facilitate open exchange of information.
- Outline a "best practice" storage process.
- Form a metadata registry where Cornell metadata strategies could be openly shared to assure best interoperability between Cornell digital repositories.
Best Practice Storage Process
Currently
there are no well-articulated guidelines for expected media life, recommended
magnetic and optical media storage procedures, data refresh cycles,
or minimum cataloging/metadata strategies. While the Cornell Policy
office does have a Records Retention Policy this policy ONLY notes the
class of information to be stored and the length of time for the respective
classes. (The current records retention policy was created in 1977 and
created prior to the massive growth in and decentralized nature of digital
information storage at Cornell. As the current language is ambiguous
in regards to many issues outlined, its review may be warranted. The
University Archivist is responsible for 1) designating which official
university records are archival; and 2) affecting the transfer of archival
records from the office in which they originated or were received to
the University Archives at such times and in the manner and form prescribed
by the Archives and subject to the appropriate retention and disposition
schedules that are outlined in University Policy 4.7, Retention of University
Records. The University Archives is a component of the Library's Division
of Rare and Manuscript Collections. University policy does not make
a distinction between paper-based and electronic records.) The principal
thrust of the current records retention policy is to preserve the record of University
activities based on content rather than storage media. Regardless
of format (print, digital, microfilm, etc.), preservation must be ensured
if the information is judged to be "permanent" for legal,
historical, or financial reasons. Not all permanent materials need
to be maintained centrally. The office of origin has the responsibility
for retaining records as a part of the existing recording system. It
is interesting to note that in some cases university policy currently
requires "permanent" data retention! Without complementary
policy recommendations for the structure and care of the equipment on
which this data is stored, it is unlikely such ambitious data retention
goals will be achieved. To this end, it is strongly recommended that
Cornell expand its data retention policy to include technical and metadata
requirements as well. A good example of a broader data retention policy
can be found in the current National Archives and Records Administration draft recommendations.
Metadata Registry
While there is little chance that a single campus-wide metadata "standard"
could ever be achieved, there is value in recording the different schemas
departments devise. Creating a central location where these targeted
strategies are noted provides an opportunity to recommend changes that
may improve interoperable access to the distributed digital asset stores.
Even in those cases where the nature of the digital library requires
esoteric extensions to a core metadata set, information on the availability
of the extended classification system could more readily be shared with
the community at large. The Library may well be the logical owner of
this registry/education process. The Cornell University Library has
a newly formed Metadata Working Group that provides a forum for discussing
metadata issues facing the Library. The key goal of the group is to exchange
information about metadata and its application to all library functions.
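As a rough illustration of what such a registry might record, the following minimal sketch registers each department's schema and reports which elements it shares with a core set and which are local extensions. Everything in it is an assumption made for illustration: the class names, the department names, and the choice of core elements (a handful of names borrowed from the Dublin Core element set) are hypothetical and are not prescribed by this paper or by the Metadata Working Group.

```python
from dataclasses import dataclass

# Assumption: an illustrative core set using a few Dublin Core element names.
# The paper does not define which elements would form Cornell's core set.
CORE_ELEMENTS = frozenset(
    {"title", "creator", "date", "format", "identifier", "rights"}
)


@dataclass(frozen=True)
class SchemaEntry:
    """One department's registered metadata schema (hypothetical structure)."""

    department: str
    elements: frozenset

    @property
    def extensions(self):
        # Elements the department uses beyond the shared core set.
        return self.elements - CORE_ELEMENTS

    @property
    def missing_core(self):
        # Core elements the department's schema does not yet carry.
        return CORE_ELEMENTS - self.elements


class MetadataRegistry:
    """Central place where departmental schemas are noted and compared."""

    def __init__(self):
        self._entries = {}

    def register(self, department, elements):
        # Normalize element names to lowercase so comparisons are consistent.
        entry = SchemaEntry(department, frozenset(e.lower() for e in elements))
        self._entries[department] = entry
        return entry

    def shared_elements(self):
        # Elements that every registered schema has in common.
        sets = [entry.elements for entry in self._entries.values()]
        return frozenset.intersection(*sets) if sets else frozenset()

    def report(self):
        # Per-department summary of local extensions and missing core elements.
        return {
            dept: {
                "extensions": sorted(entry.extensions),
                "missing_core": sorted(entry.missing_core),
            }
            for dept, entry in sorted(self._entries.items())
        }


if __name__ == "__main__":
    registry = MetadataRegistry()
    # Hypothetical departments and element names, for illustration only.
    registry.register(
        "Dept A", ["title", "creator", "date", "identifier", "specimen_locality"]
    )
    registry.register(
        "Dept B",
        ["title", "creator", "date", "format", "identifier", "rights", "instrumentation"],
    )
    print(sorted(registry.shared_elements()))
    print(registry.report())
```

The "missing_core" list is the kind of information a registry owner could use to recommend changes that improve interoperability, while the "extensions" list makes a department's specialized classification visible to the rest of the community.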
Campus-Wide Digital Asset Management Service
Clearly there is a core set of digital assets that have great value to the University. It is also clear that campus-wide, systematic thinking has not been applied to the long-term storage of these assets. This is not to imply that all data is at risk, only that the number of digital assets being created, and the number of departments and individuals responsible for maintaining them, are growing rapidly. For a certain class of digital assets, a central, structured, broad-based digital asset management service may well be of great value. Such a service would be held to the highest standards of persistent storage practice and would assure everyone on campus access to a world-class service. While not all departments may require such a system, many are likely to see its advantage when faced with the issues outlined earlier. Currently both the Library and CIT have large, structured digital asset management systems. As has been an ongoing theme in past partnership discussions, it may well make sense for these groups to share expertise, compromise when appropriate, and create a digital asset management system capable of providing a much-needed secure and persistent storage service for the campus at large.
Closing Thoughts and Recommendations
Plainly there is a growing demand for, and deepening concern over, the preservation and control of the growing store of digital assets at Cornell. The old saying "An ounce of prevention is worth a pound of cure" is certainly applicable to Cornell's digital asset management and storage needs. Cornell does not currently have a universal plan for digital asset management. The first and most important step toward rectifying this shortcoming is open communication and friendly collaboration across all Cornell departments that are currently engaged in developing digital asset management resources. Fortunately, there is great expertise in this area at Cornell. The Computer Science department is world-renowned for its work in digital library application development. The Cornell University Library not only maintains some of the finest online digital image collections in the world, but also has an active preservation research agenda. A unified effort, combined with the world-class expertise available within the Cornell family, will place us well ahead of many of our peer institutions.
While there are significant administrative and financial challenges
to face today, failure to address these needs now, whether through a central
or a distributed strategy, will inevitably cost the University far more
in the future. Simply stated, this is a problem that will not go away.
It requires prudent University investment for the foreseeable future.
To this end, the Cornell University Library and OIT will continue to
work in partnership to provide leadership throughout Cornell and champion
the development of required services.
Original date of publication: December 2001
Last modified: February 2002