Understanding Metadata
What Is Metadata?
What Does Metadata Do?
Structuring Metadata
Metadata Schemes and Element Sets
  Dublin Core
  TEI and METS
  MODS
  EAD and LOM
  <indecs>, ONIX, CDWA, and VRA
  MPEG
  FGDC and DDI
Creating Metadata
Interoperability and Exchange of Metadata
Future Directions
More Information on Metadata
Glossary
Acknowledgements
Understanding Metadata is a revision and expansion of Metadata Made
Simpler: A guide for libraries published by NISO Press in 2001.
NISO Press extends its thanks and appreciation to Rebecca Guenther
and Jacqueline Radebaugh, staff members in the Library of Congress
Network Development and MARC Standards Office, for sharing their
expertise and contributing to this publication.
About NISO
NISO, a non-profit association accredited by the American National
Standards Institute (ANSI), identifies, develops, maintains, and publishes
technical standards to manage information in our changing and ever-more
digital environment. NISO standards apply both traditional and new
technologies to the full range of information-related needs, including
retrieval, re-purposing, storage, metadata, and preservation. NISO
standards and information about NISO’s activities and membership are
featured on the NISO website <http://www.niso.org>.
This booklet is available for free on the NISO website
(www.niso.org) and in hardcopy from NISO Press.
Published by:
NISO Press
National Information Standards Organization
4733 Bethesda Avenue, Suite 300
Bethesda, MD 20814 USA
Email: nisohq@niso.org
Tel: 301-654-2512
Fax: 301-654-1721
URL: www.niso.org
Copyright © 2004 National Information Standards Organization
ISBN: 1-880124-62-9
What Is Metadata?

Metadata is structured information that describes, explains, locates,
or otherwise makes it easier to retrieve, use, or manage an information
resource. Metadata is often called data about data or information about
information.

The term metadata is used differently in different communities. Some
use it to refer to machine understandable information, while others use
it only for records that describe electronic resources. In the library
environment, metadata is commonly used for any formal scheme of
resource description, applying to any type of object, digital or
non-digital. Traditional library cataloging is a form of metadata;
MARC 21 and the rule sets used with it, such as AACR2, are metadata
standards. Other metadata schemes have been developed to describe
various types of textual and non-textual objects including published
books, electronic documents, archival finding aids, art objects,
educational and training materials, and scientific datasets.

There are three main types of metadata:

• Descriptive metadata describes a resource for purposes such as
  discovery and identification. It can include elements such as title,
  abstract, author, and keywords.

• Structural metadata indicates how compound objects are put together,
  for example, how pages are ordered to form chapters.

• Administrative metadata provides information to help manage a
  resource, such as when and how it was created, file type and other
  technical information, and who can access it. There are several
  subsets of administrative data; two that sometimes are listed as
  separate metadata types are:

  − Rights management metadata, which deals with intellectual property
    rights, and

  − Preservation metadata, which contains information needed to archive
    and preserve a resource.

Metadata can describe resources at any level of aggregation. It can
describe a collection, a single resource, or a component part of a
larger resource (for example, a photograph in an article). Just as
catalogers make decisions about whether a catalog record should be
created for a whole set of volumes or for each particular volume in the
set, so the metadata creator makes similar decisions. Metadata can also
be used for description at any level of the information model laid out
in the IFLA (International Federation of Library Associations and
Institutions) Functional Requirements for Bibliographic Records: work,
expression, manifestation, or item. For example, a metadata record
could describe a report, a particular edition of the report, or a
specific copy of that edition of the report.

Metadata can be embedded in a digital object or it can be stored
separately. Metadata is often embedded in HTML documents and in the
headers of image files. Storing metadata with the object it describes
ensures the metadata will not be lost, obviates problems of linking
between data and metadata, and helps ensure that the metadata and
object will be updated together. However, it is impossible to embed
metadata in some types of objects (for example, artifacts). Also,
storing metadata separately can simplify the management of the metadata
itself and facilitate search and retrieval. Therefore, metadata is
commonly stored in a database system and linked to the objects
described.

What Does Metadata Do?

An important reason for creating descriptive metadata is to facilitate
discovery of relevant information. In addition to resource discovery,
metadata can help organize electronic resources, facilitate
interoperability and legacy resource integration, provide digital
identification, and support archiving and preservation.

Resource Discovery

Metadata serves the same functions in resource discovery as good
cataloging does by:

• allowing resources to be found by relevant criteria;
• identifying resources;
• bringing similar resources together;
• distinguishing dissimilar resources; and
• giving location information.

Organizing Electronic Resources

As the number of Web-based resources grows exponentially, aggregate
sites or portals are increasingly useful in organizing
links to resources based on audience or topic. Such lists can be built
as static webpages, with the names and locations of the resources
“hardcoded” in the HTML. However, it is more efficient and increasingly
common to build these pages dynamically from metadata stored in
databases. Various software tools can be used to automatically extract
and reformat the information for Web applications.

Interoperability

Describing a resource with metadata allows it to be understood by both
humans and machines in ways that promote interoperability.
Interoperability is the ability of multiple systems with different
hardware and software platforms, data structures, and interfaces to
exchange data with minimal loss of content and functionality. Using
defined metadata schemes, shared transfer protocols, and crosswalks
between schemes, resources across the network can be searched more
seamlessly.

Two approaches to interoperability are cross-system search and metadata
harvesting. The Z39.50 protocol is commonly used for cross-system
search. Z39.50 implementers do not share metadata but map their own
search capabilities to a common set of search attributes. A contrasting
approach taken by the Open Archives Initiative is for all data
providers to translate their native metadata to a common core set of
elements and expose this for harvesting. A search service provider then
gathers the metadata into a consistent central index to allow
cross-repository searching regardless of the metadata formats used by
participating repositories.

Digital Identification

Most metadata schemes include elements such as standard numbers to
uniquely identify the work or object to which the metadata refers. The
location of a digital object may also be given using a file name, URL
(Uniform Resource Locator), or some more persistent identifier such as
a PURL (Persistent URL) or DOI (Digital Object Identifier). Persistent
identifiers are preferred because object locations often change, making
the standard URL (and therefore the metadata record) invalid. In
addition to the actual elements that point to the object, the metadata
can be combined to act as a set of identifying data, differentiating
one object from another for validation purposes.

Archiving and Preservation

Most current metadata efforts center on the discovery of recently
created resources. However, there is a growing concern that digital
resources will not survive in usable form into the future. Digital
information is fragile; it can be corrupted or altered, intentionally
or unintentionally. It may become unusable as storage media and
hardware and software technologies change. Format migration and perhaps
emulation of current hardware and software behavior in future hardware
and software platforms are strategies for overcoming these challenges.

Metadata is key to ensuring that resources will survive and continue to
be accessible into the future. Archiving and preservation require
special elements to track the lineage of a digital object (where it
came from and how it has changed over time), to detail its physical
characteristics, and to document its behavior in order to emulate it on
future technologies.

Many organizations internationally have worked on defining metadata
schemes for digital preservation, including the National Library of
Australia, the British Cedars Project (CURL Exemplars in Digital
Archives), and a joint Working Group of OCLC and the Research Libraries
Group (RLG). The latter group developed a framework outlining types of
preservation metadata. A follow-up group, PREMIS (PREservation
Metadata: Implementation Strategies), also sponsored by OCLC and RLG,
is developing a set of core elements and strategies for the encoding,
storage, and management of preservation metadata within a digital
preservation system. Many of these initiatives are based on or
compatible with the ISO Reference Model for an Open Archival
Information System (OAIS).

Structuring Metadata

Metadata schemes (also called schema) are sets of metadata elements
designed for a specific purpose, such as describing a particular type
of information resource. The definition or meaning of the elements
themselves is known as the semantics of the scheme. The values given to
metadata elements are the content. Metadata schemes generally specify
names of elements and their semantics. Optionally, they may specify
content rules for how content must be formulated (for example, how to
identify the main title), representation rules for content (for
example, capitalization rules), and allowable content values (for
example, terms must be used from a specified controlled vocabulary).

There may also be syntax rules for how the elements and their content
should be encoded. A metadata scheme with no prescribed syntax rules is
called syntax independent. Metadata can be encoded in any definable
syntax. Many current metadata schemes use SGML (Standard Generalized
Mark-up Language) or XML (Extensible Mark-up Language). XML, developed
by the World Wide Web Consortium (W3C), is a simplified subset of SGML
that, unlike HTML with its fixed tag set, allows for locally defined
tag sets and the easy exchange of structured
information. SGML is the parent standard from which both HTML and XML
derive and allows for the richest mark-up of a document. Useful XML
tools are becoming widely available as XML plays an increasingly
crucial role in the exchange of a variety of data on the Web.

Metadata Schemes and Element Sets

Many different metadata schemes are being developed in a variety of
user environments and disciplines. Some of the most common ones are
discussed in this section.

Dublin Core

The Dublin Core Metadata Element Set arose from discussions at a 1995
workshop sponsored by OCLC and the National Center for Supercomputing
Applications (NCSA). As the workshop was held in Dublin, Ohio, the
element set was named the Dublin Core. The continuing development of
the Dublin Core and related specifications is managed by the Dublin
Core Metadata Initiative (DCMI).

The original objective of the Dublin Core was to define a set of
elements that could be used by authors to describe their own Web
resources. Faced with a proliferation of electronic resources and the
inability of the library profession to catalog all these resources, the
goal was to define a few elements and some simple rules that could be
applied by noncatalogers. The original 13 core elements were later
increased to 15: Title, Creator, Subject, Description, Publisher,
Contributor, Date, Type, Format, Identifier, Source, Language,
Relation, Coverage, and Rights.

The Dublin Core was developed to be simple and concise, and to describe
Web-based documents. However, Dublin Core has been used with other
types of materials and in applications demanding some complexity. There
has historically been some tension between supporters of a minimalist
view, who emphasize the need to keep the elements to a minimum and the
semantics and syntax simple, and supporters of a structuralist view,
who argue for finer semantic distinctions and more extensibility for
particular communities.

These discussions have led to a distinction between qualified and
unqualified (or simple) Dublin Core. Qualifiers can be used to refine
(narrow the scope of) an element, or to identify the encoding scheme
used in representing an element value. The element Date, for example,
can be used with the refinement qualifier created to narrow the meaning
of the element to the date the object was created. Date can also be
used with an encoding scheme qualifier to identify the format in which
the date is recorded, for example, following the ISO 8601 standard for
representing date and time.

All Dublin Core elements are optional and all are repeatable. The
elements may be presented in any order. While the Dublin Core
description recommends the use of controlled values for fields where
they are appropriate (for example, controlled vocabularies for the
Subject field), this is not required. However, working groups have been
established to discuss authoritative lists for certain elements such as
Resource Type. While Dublin Core leaves content rules to the particular
implementation, the DCMI encourages the adoption of application
profiles (domain-specific rules) for particular domains such as
education and government. An application profile for libraries is being
developed by the Libraries Working Group.

Dublin Core Example

    Title="Metadata Demystified"
    Creator="Brand, Amy"
    Creator="Daly, Frank"
    Creator="Meyers, Barbara"
    Subject="metadata"
    Description="Presents an overview of metadata conventions in
                 publishing."
    Publisher="NISO Press"
    Publisher="The Sheridan Press"
    Date="2003-07"
    Type="Text"
    Format="application/pdf"
    Identifier="http://www.niso.org/standards/resources/Metadata_Demystified.pdf"
    Language="en"

Because of its simplicity, the Dublin Core element set is now used by
many outside the library community: researchers, museum curators, and
music collectors, to name only a few. There are hundreds of projects
worldwide that use the Dublin Core either for cataloging or to collect
data from the Internet; more than 50 of these have links on the DCMI
website. The subjects range from cultural heritage and art to math and
physics. Meanwhile the Dublin Core Metadata Initiative has expanded
beyond simply maintaining the Dublin Core Metadata Element Set into an
organization that describes itself as “dedicated to promoting the
widespread adoption of interoperable metadata standards and developing
specialized metadata vocabularies for discovery systems.”
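
As a rough illustration of how a simple (unqualified) Dublin Core
record might be handled in software, the sketch below builds the record
from the Dublin Core Example in Python and serializes it as XML. It is
a minimal sketch under stated assumptions, not a DCMI-endorsed
encoding: the function name build_dc_record and the <metadata> wrapper
element are hypothetical, chosen only for this example, while the
namespace URI is the one DCMI publishes for the 15-element set. The
record is held as a list of (element, value) pairs rather than a
dictionary because every element is optional and repeatable and the
order is not significant.

```python
import xml.etree.ElementTree as ET

# Namespace URI published by DCMI for the 15-element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements of simple Dublin Core, as listed in the text.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def build_dc_record(pairs):
    """Return an XML element with one dc:* child per (element, value) pair.

    Hypothetical helper for illustration: it rejects names outside the
    15-element set but applies no content rules, since Dublin Core
    leaves those to the implementation.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in pairs:
        key = name.lower()
        if key not in DC_ELEMENTS:
            raise ValueError(f"not a simple Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{key}")
        child.text = value
    return root


# The record from the Dublin Core Example; Creator and Publisher repeat.
record = build_dc_record([
    ("Title", "Metadata Demystified"),
    ("Creator", "Brand, Amy"),
    ("Creator", "Daly, Frank"),
    ("Creator", "Meyers, Barbara"),
    ("Subject", "metadata"),
    ("Publisher", "NISO Press"),
    ("Publisher", "The Sheridan Press"),
    ("Date", "2003-07"),
    ("Type", "Text"),
    ("Format", "application/pdf"),
    ("Language", "en"),
])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A qualified record would additionally carry refinements such as created
or an encoding scheme for the date value; those are left out here to
keep the sketch to the simple element set.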
The Text Encoding Initiative (TEI)

The Text Encoding Initiative is an international project to develop
guidelines for marking up electronic texts such as novels, plays, and
poetry, primarily to support research in the humanities. In addition to
specifying how to encode the text of a work, the TEI Guidelines for
Electronic Text Encoding and Interchange also specify a header portion,
embedded in the resource, that consists of metadata about the work. The
TEI header, like the rest of the TEI, is defined as an SGML DTD
(Document Type Definition), a set of tags and rules defined in SGML
syntax that describe the structure and elements of a document. This
SGML mark-up becomes part of the electronic resource itself. Since the
TEI DTD is rather large and complicated in order to apply to a vast
range of texts and uses, a simpler subset of the DTD, known as TEI
Lite, is commonly used in libraries.

It is assumed that TEI-encoded texts are electronic versions of printed
texts. Therefore the TEI Header can be used to record bibliographic
information about both the electronic version of the text and the
non-electronic source version. The basic bibliographic information is
similar to that recorded in library cataloging and can be mapped to and
from MARC. However, there are also elements defined to record details
about how the text was transcribed and edited, how mark-up was
performed, what revisions were made, and other non-bibliographic facts.
Libraries tend to use TEI headers when they have collections of
SGML-encoded full text. Some libraries use TEI headers to derive MARC
records for their catalogs, while others use MARC records as the basis
for creating TEI header descriptions for the source texts.

Metadata Encoding and Transmission Standard (METS)

The Metadata Encoding and Transmission Standard (METS) was developed to
fill the need for a standard data structure for describing complex
digital library objects. METS is an XML Schema for creating XML
document instances that express the structure of digital library
objects, the associated descriptive and administrative metadata, and
the names and locations of the files that comprise the digital object.

The metadata necessary for successful management and use of digital
objects is both more extensive than and different from the metadata
used for managing collections of printed works and other physical
materials. Structural metadata is needed to ensure that separately
digitized files (for example, different pages of a digitized book) are
structured appropriately. Technical metadata is needed for information
about the digitization process so that scholars may determine how
accurate a reflection of the original the digital version provides.
Other technical metadata is required for internal purposes in order to
periodically refresh and migrate the data, ensuring the durability of
valuable resources.

METS was originally an outgrowth of the Making of America II project, a
digitization project of major research libraries that attempted to
address these metadata issues, in part by providing an encoding format
for metadata for textual and image-based works. The Digital Library
Federation (DLF) built on that earlier work to create METS, a standard
schema for providing a method for expressing and packaging together
descriptive, administrative, and structural metadata for objects within
a digital library. Expressed using the XML schema language, METS
provides a document format for encoding the metadata necessary for
management of digital library objects within a repository and for
exchange between repositories.

A METS document contains seven major sections:

• METS Header – Contains metadata describing the METS document itself,
  including such information as creator, editor, etc.

• Descriptive Metadata – Points to descriptive metadata external to the
  METS document (for example, a MARC record in an OPAC or an Encoded
  Archival Description finding aid maintained on a webserver), or to
  internally embedded descriptive metadata, or both.

• Administrative Metadata – Provides information regarding how the
  files are created and stored, intellectual property rights, the
  original source object from which the digital library object derives,
  and the provenance of the files comprising the digital library
  object.

• File Section – Lists all files containing content that comprise the
  electronic versions of the digital object.

• Structural Map – Outlines a hierarchical structure for the digital
  library object and links the elements of that structure to content
  files and metadata that pertain to each element.

• Structural Links – Allows METS creators to record the existence of
  hyperlinks between nodes in the hierarchy outlined in the Structural
  Map.

• Behavior – Associates executable behaviors with content in the METS
  object.

The METS header, file section, structural map, structural links, and
behavior sections are defined within the METS schema. METS is less
prescriptive about descriptive and administrative metadata, relying on
extension schemas (externally developed metadata schemes) to provide
specific elements. The METS Editorial Board has endorsed three
descriptive metadata schemes: simple Dublin Core, MARCXML, and MODS
(discussed below). For technical metadata the METS website makes
available schemas for text and digital still images. The latter
standard is called MIX, Metadata for Images in XML Schema, and is based
on a proposed NISO standard, Z39.87, Data Dictionary: Technical
Metadata for Digital Still Images. Further work is in process on
extension schemas for audio, video, and websites. Another current area
of concentration for the METS development community is the creation of
METS application profiles to give guidance regarding the creation of
METS documents for particular object types.

Use of the METS schema is widespread. A list of implementation
registries using METS, a tutorial, and other important information can
be found on the METS website.

Metadata in Action

An oral historian makes tape recordings of interviews with members of a
particular ethnic group. Interviewees sign a paper release form giving
intellectual property rights to the historian. Most interviewees grant
permission to disseminate the interviews in print and electronically,
but several restrict publication and dissemination until 25 years after
death.

Information about each interview is kept in a database: Interviewer,
Interviewee, Date, Place, etc. Each interview follows a questionnaire
format. The questionnaire exists as a text file. The tapes, release
forms, database, and text file are donated to a library that has a
special collection focusing on the particular ethnic group.

The tapes are digitized. Since each interview runs over several tapes,
technicians record structural metadata to keep component parts of each
interview together. Technicians record administrative metadata such as
file names, location of each interview in the files, equipment used,
the methods of digitizing and assuring quality and completeness, file
formats, etc. Different segments of this metadata allow the audio files
to be automatically tracked, accessed, stored, refreshed, and migrated.

An archivist expands the database to include the persistent identifier
of each interview, thereby linking the audio file to the descriptive
metadata. The names of the data elements are revised to match Dublin
Core terminology, including qualifiers used specifically for audio
materials. Information on rights and permissions is entered.

An archivist creates an EAD finding aid for the audio collection using
the database as the core. Portions of the questionnaire text file are
incorporated as a rich source of subject keywords. A MARC record is
derived from the EAD finding aid and added to OCLC and RLIN.

A webpage is created where researchers can access the finding aid,
search the database, and listen to the audio files. Interviews coded as
restricted are invisible to the search program until the date when they
become open to the public. Administrative, structural, and descriptive
metadata is created for the webpage to hold all the pieces together,
allow them to be managed, and allow them to be accessed.

The library participates in a metadata harvesting protocol to provide
extracts of local metadata in a common format to a service provider so
that information about the collection is automatically included in a
number of relevant tools such as catalogs and portals.

The webpage is linked to the library’s website dedicated to resources
about the ethnic group, where it is available to researchers in context
with archival and visual materials, digitized secondary sources, etc.
Administrative, structural, and descriptive metadata at the website
level has also been created.

Metadata Object Description Schema (MODS)

The Metadata Object Description Schema (MODS) is a descriptive metadata
schema that is a derivative of MARC 21 and intended to either carry
selected data from existing MARC 21 records or enable the creation of
original resource description records. It includes a subset of MARC
fields and uses language-based tags rather than the numeric ones used
in MARC 21 records. In some cases, it regroups elements from the
MARC 21 bibliographic format. Like METS, MODS is expressed using the
XML schema language.

Although the MODS standard can stand on its own, it may also complement
other metadata formats. Because of its flexibility and use of XML, MODS
may potentially be used as a Z39.50 Next Generation specified format,
an extension schema to METS, a metadata set for harvesting, and for
creating original resource metadata records in an XML syntax.

Rich description of electronic resources is a particular focus of MODS,
which provides some advantages over other metadata
schemes. MODS elements are richer than those of the Dublin Core; they
are more compatible with library data than the ONIX or Dublin Core
standards; and MODS is simpler to apply than the full MARC 21
bibliographic format. With its use of XML Schema language, MODS offers
enhancements over MARC 21, such as the use of an optional ID attribute
to facilitate linking at the element level; the ability to specify
language, script, and transliteration scheme at the element level; and
the ability to embed a rich description of components in the
relatedItem element.

The ability in MODS to give granular descriptions of constituent parts
of an object works particularly well with the METS structural map for
complex digital library objects.

A MODS Record Example

    <mods>
      <titleInfo>
        <title>Metadata demystified</title>
      </titleInfo>
      <name type="personal">
        <namePart type="family">Brand</namePart>
        <namePart type="given">Amy</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
      </name>
      <typeOfResource>text</typeOfResource>
      <originInfo>
        <dateIssued>2003</dateIssued>
        <place>
          <placeTerm type="text">Bethesda, MD</placeTerm>
        </place>
        <publisher>NISO Press</publisher>
      </originInfo>
      <identifier type="isbn">1-880124-59-9</identifier>
    </mods>

The Encoded Archival Description (EAD)

The Encoded Archival Description (EAD) was developed as a way of
marking up the data contained in finding aids so that they can be
searched and displayed online.

In archives and special collections, the finding aid is an important
tool for resource description. Finding aids differ from catalog records
by being much longer, more narrative and explanatory, and highly
structured in a hierarchical fashion. They generally start with a
description of the collection as a whole, indicating what types of
materials it contains and why they are important. If the collection
consists of the personal papers of an individual, there can be a
lengthy biography of that person. The finding aid describes the series
into which the collection is organized (such as correspondence,
business records, personal papers, and campaign speeches) and ends with
an itemization of the contents of the physical boxes and folders
comprising the collection.

Like the TEI Header, the EAD is defined as an SGML DTD. It begins with
a header section that describes the finding aid itself (for example,
who wrote it) and then goes on to the description of the collection as
a whole and successively more detailed information about the records or
series within the collection. If individual items being described exist
in digital form, the EAD can include pointers to the digital objects.
The 2002 version of the EAD DTD provides support for both SGML and XML
through the use of defined “switches” for turning off features used
only in SGML and turning on features used only in XML. The EAD standard
is maintained jointly by the Library of Congress and the Society of
American Archivists.

The EAD is particularly popular in academic libraries, historical
societies, and museums with large special collections. Many of these
collections contain unique materials unavailable elsewhere, and often
the materials in the collections are not individually cataloged like
traditional library materials. By creating searchable EAD finding aids,
libraries and archives can increase awareness of their unique
collections among the Internet community.

Learning Object Metadata

The IEEE Learning Technology Standards Committee (LTSC) developed the
Learning Object Metadata (LOM) standard (IEEE 1484.12.1-2002) to enable
the use and re-use of technology-supported learning resources such as
computer-based training and distance learning. The LOM defines the
minimal set of attributes to manage, locate, and evaluate learning
objects. The attributes are grouped into eight categories:

• General, containing information about the object as a whole;

• Lifecycle, containing metadata about the object’s evolution;

• Technical, with descriptions of the technical characteristics and
  requirements;

• Educational, containing the educational / pedagogical attributes;
• Rights, describing the intellectual property rights and use conditions;
• Relation, identifying related objects;
• Annotation, containing comments and the date and author of the comments; and
• Classification, which identifies other classification system identifiers for the object.

Within each category is a hierarchy of data elements to which the metadata values are assigned. Examples of learning-related metadata elements found in the Education category are Typical Age Range (of the intended user), Difficulty, Typical Learning Time, and Interactivity Level.

The IMS Global Learning Consortium has developed a suite of specifications to enable interoperability in a learning environment. Their Meta-Data Information Model specification is based on the IEEE LOM scheme with only minor modifications.

E-Commerce – <indecs> and ONIX

Metadata schemas are increasingly being developed to support electronic commerce applications. The <indecs> Framework (Interoperability of Data in ECommerce Systems) was an international collaborative effort supported by the European Commission’s Info 2000 Programme. The collaborators were major rights owners, such as publishers and members of the recording industry, who wanted to develop a framework for metadata standards to support network commerce in intellectual property. The foundation of the <indecs> work is a data model for intellectual property and its transfer. Rather than developing a new metadata scheme, <indecs> sought to develop a common framework to allow various schemes for transactions related to different genres such as music, journal articles, and books to be able to interchange information, particularly that related to intellectual property rights. In order to support this common framework, <indecs> has developed a minimal kernel of required metadata.

Several organizations have built on the <indecs> Framework to develop specific metadata schemas. Among them is the ONIX (Online Information Exchange) International standard. ONIX is an XML-based metadata scheme developed by publishers under the auspices of a number of book industry trade groups in the United States and Europe. The original ONIX specification was a direct response to the enormous growth in online book sales and the realization that books described with images, cover blurbs, reviews, and similar information significantly outsold books without this information. Therefore ONIX for Books has elements to record a wide range of evaluative and promotional information as well as basic bibliographic and trade data. ONIX for Serials is in development to define serials product metadata at the title, item, and subscription package levels.

While ONIX information was designed for use in the commerce cycle of a publication, it may also provide a source for enrichment of library-created catalog records; the Bibliographic Enrichment Advisory Team (BEAT) project at the Library of Congress is experimenting with this use. ONIX metadata may also be used by libraries in the future for the creation of a beginning bibliographic record. Mappings between ONIX for Books and both MARC 21 and UNIMARC have already been created.

Visual Objects – CDWA and VRA

Metadata used to describe visual objects such as a painting or sculpture has its own special requirements. The Art Information Task Force (AITF) developed a conceptual framework for describing and accessing information about objects and images called Categories for the Description of Works of Art (CDWA). Some 30 categories were defined, most with multiple subcategories. Some examples of the specialized descriptive elements relevant to artworks included are: Orientation, Dimensions, Condition, Inscriptions, Conservation Treatment, and Exhibition / Loan History.

Typically, visual resources collections used in teaching art history and similar subjects do not contain original art works but rather slides or photographs of the original art. Metadata for these materials therefore has to accommodate the description of multiple levels of related resources, such as an original painting, a slide of the painting, and a digitized image of the slide. The VRA Core Categories build on and expand the CDWA work to define a single metadata element set that can be used to describe the work (the actual painting, photograph, sculpture, building, etc.) as well as the images (visual representations) of them. Version 3.0 of the VRA Core Categories consists of 17 metadata elements which can be used as applicable to describe each of these versions and relate them to each other: Record Type, Type, Title, Measurements, Material, Technique, Creator, Date, Location, ID Number, Style/Period, Culture, Subject, Relation, Description, Source, and Rights. Like the Dublin Core, the VRA Core scheme does not specify any particular syntax or rules for representing content.

Both CDWA and VRA emphasize the use of controlled vocabularies for specified elements. A number of existing vocabularies are suggested and communities are encouraged to develop additional vocabularies as needed.
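
The idea of describing a work and its images with one element set, and relating them to each other, can be sketched roughly as follows. This is only an illustration under stated assumptions: VRA Core does not prescribe a syntax, so the dictionary layout, the identifiers, the example values, and the helper names `make_record` and `chain_to_work` are all hypothetical. Only the 17 element names come from the text above.

```python
# The 17 element names listed in the text for VRA Core 3.0.
VRA_CORE_ELEMENTS = [
    "Record Type", "Type", "Title", "Measurements", "Material", "Technique",
    "Creator", "Date", "Location", "ID Number", "Style/Period", "Culture",
    "Subject", "Relation", "Description", "Source", "Rights",
]


def make_record(values):
    """Return a copy of `values`, rejecting names outside the element set."""
    unknown = sorted(set(values) - set(VRA_CORE_ELEMENTS))
    if unknown:
        raise ValueError(f"not VRA Core elements: {unknown}")
    return dict(values)


# Hypothetical data: a painting, a slide of the painting, and a digitized
# image of the slide. "Relation" holds the ID Number of the related record.
records = {
    "w1": make_record({"Record Type": "work", "Type": "painting",
                       "Title": "Example Landscape", "ID Number": "w1"}),
    "i1": make_record({"Record Type": "image", "Type": "slide",
                       "ID Number": "i1", "Relation": "w1"}),
    "i2": make_record({"Record Type": "image", "Type": "digital image",
                       "ID Number": "i2", "Relation": "i1"}),
}


def chain_to_work(record_id, records):
    """Follow Relation links from an image record back to the work record."""
    chain = [record_id]
    while records[chain[-1]]["Record Type"] != "work":
        chain.append(records[chain[-1]]["Relation"])
    return chain


print(chain_to_work("i2", records))  # ['i2', 'i1', 'w1']
```

Because every level uses the same element names, a single routine can walk from the digitized image to the slide to the original work.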
MPEG Multimedia Metadata

The ISO/IEC Moving Picture Experts Group (MPEG) has developed a suite of standards for coded representation of digital audio and video. Two of the standards address metadata: MPEG-7, Multimedia Content Description Interface (ISO/IEC 15938), and MPEG-21, Multimedia Framework (ISO/IEC 21000).

MPEG-7 defines the metadata elements, structure, and relationships that are used to describe audiovisual objects including still pictures, graphics, 3D models, music, audio, speech, video, or multimedia collections. It is a multi-part standard that addresses:

• Description Tools including Descriptors that define the syntax and the semantics of each metadata element and Description Schemes that specify the structure and semantics of the relationships between the elements.
• A Description Definition Language to define the syntax of the Description Tools, allow the creation of new Description Schemes, and allow the extension and modification of existing Description Schemes.
• System tools, to support storage and transmission, synchronization of descriptions with content, and management and protection of intellectual property.

Descriptors for visual and audio are defined separately using a hierarchy of elements and sub-elements. For visual objects there are descriptors for Basic Structure, Color, Texture, Shape, Motion, Localization, and Face Recognition. Audio descriptors are divided into two categories: low-level descriptors that are common to audio objects across most applications, and high-level descriptors that are specific to particular applications of audio. The cross-application low-level descriptors cover Structures and Features (temporal and spectral). The domain-specific high-level descriptors include such elements as Musical Instrument Timbre, Melody Description, and Spoken Content Description.

The Description Schemes are based on XML, and can be expressed in textual form suitable for editing, searching, filtering, and human readability; or in a binary form for storage, transmission, and streaming delivery. Since the full description of a multimedia object can be quite complex, the standard provides for a Summary Description Scheme geared to browsing and navigation.

The standard envisions that search engines could use MPEG-7 metadata descriptions to identify audiovisual objects in entirely new ways, such as digitizing a musical phrase played on a keyboard and then retrieving a list of musical pieces that contain the sequence of notes; drawing some lines on an electronic drawing tablet and retrieving images with similar graphics; or using a voice excerpt to retrieve related speech files, photographs, video clips, and biographical information of the speaker. These retrieval mechanisms are outside the scope of MPEG-7, but the standards developers wanted to accommodate these futuristic capabilities and have included many interoperability requirements beyond the typical metadata elements.

MPEG-21 was developed to address the need for an overarching framework to ensure interoperability of digital multimedia objects. The multi-part standard is not yet fully completed but is intended to include the following:

• Part 1: Vision, Technologies and Strategy provides the overview of the complete vision and plan for the framework. It was issued as an ISO technical report (ISO/IEC TR 21000-1:2001) and is available as a free download from ISO’s publicly available standards website. A second edition of the vision document is underway to address comments and suggestions received from other organizations following the initial publication.
• Part 2: Digital Item Declaration, issued in 2003, describes a model for defining Digital Items. It includes a description of the syntax and semantics of each of the Digital Item Declaration elements and a corresponding XML schema.
• Part 3: Digital Item Identification, also issued in 2003, describes how to uniquely identify Digital Items and how to link Digital Items with related information such as descriptive metadata.
• Part 4: Intellectual Property Management and Protection is still in development. It is intended to define the framework for ensuring interoperability of intellectual property management tools, including authentication, and to accommodate the Rights information defined in the following two parts.
• Part 5: Rights Expression Language, issued in 2004, is a machine-readable language that can declare rights and permissions.
• Part 6: Rights Data Dictionary is still in development. It will define a standard set of terms to be used with the Rights Expression Language. It is also expected to include specifications for mapping and transforming rights metadata terminology. The Rights Data Dictionary and Expression Language are being viewed as models for the handling of intellectual property metadata for applications beyond audiovisual.
• Part 7: Digital Item Adaptation, also in development, is intended to standardize networking and interoperability description tools. Included in this part will be User Characteristic description tools that specify user preferences.

There are some seven additional parts identified and in various stages of development that deal with technical interoperability issues of less specific relevance to metadata. All of the published parts are available from ISO as ISO/IEC 21000-[part#].

Metadata for Datasets

Metadata schemes for datasets are enabling original data in the science and social science fields to be shared in a way that was never possible before the Internet. One of the most well developed element sets is the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM), officially known as FGDC-STD-001-1998. Geospatial datasets include topographic and demographic data, GIS (geographic information systems), and computer-aided cartography base files. They are used in a wide variety of areas, including soil and land use studies, biodiversity counts, climatology and global change tracking, remote sensing, and satellite imagery. The FGDC Content Standard is required for use with resources created and funded by the U.S. Government and is also being used by many state governments.

An international standard, ISO 19115, Geographic Information – Metadata, was issued in 2003. A technical amendment that will allow datasets to be both ISO and FGDC compliant is underway along with an implementation model that can be used in conjunction with an XML schema.

A metadata scheme becoming well established in the social and behavioral sciences is the Data Documentation Initiative (DDI) standard for describing social science datasets. The DDI is defined as an XML DTD, and allows for top down hierarchical description of a social science study, the data files resulting from that study, and the variables used in the data files. There is also a header area that uses Dublin Core elements for a high-level description of the DDI document itself.

Extensions and Profiles

Despite the recent development of many of these metadata schemes, most have already been subject to the changes brought about by implementing them in real world situations. These modifications are of two types: extensions and profiles.

An extension is the addition of elements to an already developed scheme to support the description of an information resource of a particular type or subject or to meet the needs of a particular interest group. Extensions increase the number of elements.

Profiles are subsets of a scheme that are implemented by a particular interest group. Profiles can constrain the number of elements that will be used, refine element definitions to describe the specific types of resources more accurately, and specify values that an element can take.

In practice, many applications use both extensions and profiles of base metadata schemes. For example, the National Biological Information Infrastructure (NBII) has developed a Biological Data Profile of the FGDC Content Standard for use with biological information resources. The profile defines an extended set of data for describing biological data, such as the taxonomic name of the organism and its classification in the taxonomic hierarchy.

Metadata in Action

A county land planner is studying the impact of new zoning laws on a particular bird species. The study team is composed of an ecologist, hydrologist, civil engineer, and environmental protection specialist. Remote sensing data for the last 20 years provides a trend analysis of the decrease in wetlands, the bird’s habitat. These datasets have FGDC metadata. The biologists on the study team need to document the results of a field inventory. Using a biological profile to extend the FGDC element set, the biologists add the genus-species name and taxonomic hierarchy. The ecologists are concerned with collection methods and modeling tools. The data related to the changes in human population are documented using a metadata set developed by the Census Bureau.

This study results in a technical report which is assigned Dublin Core metadata by the author. When the technical report is cataloged into the organization’s repository, the Dublin Core elements are used as the basis for automatic generation of a MARC cataloging record. This record is enhanced by the cataloger and included in the library’s online public access catalog.

The U.S. Department of Education’s Gateway to Educational Materials (GEM) project has based its own metadata scheme on the Dublin Core. The GEM profile limits the Dublin Core elements that can be used (for example, Contributor is not allowed) and makes some elements mandatory. GEM also defines additional elements such as Audience, Grade, Quality, and Standards, extending the base Dublin Core set for educational use.
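
How a profile and an extension combine can be sketched in a few lines of Python. The sketch loosely follows the GEM description in this section: Contributor is excluded, and Audience, Grade, Quality, and Standards are added. The choice of mandatory elements is an assumption made only for illustration, since the text does not say which elements GEM requires. The names `check_record`, `PROFILE_MANDATORY`, and the sample record are hypothetical.

```python
# The fifteen elements of the base Dublin Core element set.
DUBLIN_CORE = {
    "Title", "Creator", "Subject", "Description", "Publisher", "Contributor",
    "Date", "Type", "Format", "Identifier", "Source", "Language", "Relation",
    "Coverage", "Rights",
}

# Profile: constrain the base scheme.
PROFILE_EXCLUDED = {"Contributor"}            # stated in the text
PROFILE_MANDATORY = {"Title", "Description"}  # ASSUMPTION, for illustration

# Extension: add elements to the base scheme (names taken from the text).
EXTENSION_ELEMENTS = {"Audience", "Grade", "Quality", "Standards"}

ALLOWED = (DUBLIN_CORE - PROFILE_EXCLUDED) | EXTENSION_ELEMENTS


def check_record(record):
    """Return a list of problems found in an {element: value} record."""
    problems = []
    for name in sorted(set(record) - ALLOWED):
        problems.append(f"element not allowed: {name}")
    for name in sorted(PROFILE_MANDATORY - set(record)):
        problems.append(f"mandatory element missing: {name}")
    return problems


# Hypothetical record with one disallowed element and one missing element.
record = {"Title": "Wetland Birds Lesson Plan",
          "Contributor": "A. Teacher",
          "Grade": "5"}
print(check_record(record))
# ['element not allowed: Contributor', 'mandatory element missing: Description']
```

The profile narrows what may appear, and the extension widens it. A real application profile would typically also constrain the values each element can take.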
Creating Metadata

Who creates metadata? The answer to this varies by discipline, the resource being described, the tools available, and the expected outcome, but it is almost always a cooperative effort.

Much basic structural and administrative metadata is supplied by the technical staff who initially digitize or otherwise create the digital object, or is generated through an automated process. For descriptive metadata, it is best in some situations if the originator of the resource provides the information. This is particularly true in the documentation of scientific datasets where the originator has significant understanding of the rationale for the dataset and the uses to which it could be put, and for which there is little if any textual information from which an indexer could work.

However, many projects have found that it is more efficient to have indexers or other information professionals create the descriptive metadata, because the authors or creators of the data do not have the time or the skills. In other cases, a combination of researcher and information professional is used. The researcher may create a skeleton, completing the elements that can be supplied most readily. The results may then be supplemented or reviewed by the information specialist for consistency and compliance with the schema syntax and local guidelines.

Creation Tools

Many metadata project initiatives have developed tools and made them available to others, sometimes for free. A growing number of commercial software tools are also becoming available. Creation tools fall into several categories:

• Templates allow a user to enter the metadata values into pre-set fields that match the element set being used. The template will then generate a formatted set of the element attributes and their corresponding values.
• Mark-up tools will structure the metadata attributes and values into the specified schema language. Most of these tools generate XML or SGML Document Type Definitions (DTD). Some templates include such a mark-up as part of their final translation of the metadata.
• Extraction tools will automatically create metadata from an analysis of the digital resource. These tools are generally limited to textual resources. The quality of the metadata extracted can vary significantly based on the tool’s algorithms as well as the content and structure of the source text. These tools should be considered as an aid to creating metadata. The resulting metadata should always be manually reviewed and edited.
• Conversion tools will translate one metadata format to another. The similarity of elements in the source and target formats will affect how much additional editing and manual input of metadata may be required.

Metadata tools are generally developed to support specific metadata schemas or element sets. The websites for the particular schema will frequently have links to relevant toolsets.

Metadata Quality Control

The creation of metadata automatically or by information originators who are not familiar with cataloging, indexing, or vocabulary control can create quality problems. Mandatory elements may be missing or used incorrectly. Schema syntax may have errors that prevent the metadata from being processed correctly. Metadata content terminology may be inconsistent, making it difficult to locate relevant information.

The Framework of Guidance for Building Good Digital Collections, available on the NISO website, articulates six principles applying to good metadata:

• Good metadata should be appropriate to the materials in the collection, users of the collection, and intended, current and likely use of the digital object.
• Good metadata supports interoperability.
• Good metadata uses standard controlled vocabularies to reflect the what, where, when and who of the content.
• Good metadata includes a clear statement on the conditions and terms of use for the digital object.
• Good metadata records are objects themselves and therefore should have the qualities of archivability, persistence, unique identification, etc. Good metadata should be authoritative and verifiable.
• Good metadata supports the long-term management of objects in collections.

There are a number of ongoing efforts for dealing with the metadata quality challenge:

• Metadata creation tools are being improved with such features as templates, pick lists that limit the selection in a particular field, and improved validation rules.
• Software interoperability programs that can automate the “crosswalk” between different schemas are continuously being developed and refined.
• Content originators are being formally trained in understanding metadata and controlled vocabulary concepts and in the
use of metadata-related software tools.
• Existing controlled vocabularies that may have initially been designed for a specific use or a narrow audience are getting broader use and awareness. For example, the Content Types and Subtypes originally defined for MIME email exchange are commonly used as the controlled list for the Dublin Core Format element.
• Communities of users are developing and refining audience-specific metadata schemas, application profiles, controlled vocabularies, and user guidelines. The MODS User Guidelines are a good example of the latter.

Interoperability and Exchange of Metadata

Some people ask: Do we need so many metadata standards? With all the metadata standards, initiatives, extensions, and profiles, how can interoperability be ensured?

It is important to remember that different schemes serve distinct needs and audiences. Complementary schemes can be used to describe the same resource for multiple purposes and to serve a number of user groups. For example, a technical report could have a MARC metadata set in a library’s online catalog, an FGDC description as part of the National Spatial Data Infrastructure Clearinghouse Mechanism, and an embedded set of Dublin Core elements.

The Resource Description Framework (RDF), developed by the World Wide Web Consortium (W3C), is a data model for the description of resources on the Web that provides a mechanism for integrating multiple metadata schemes. In RDF a namespace is defined by a URL pointing to a Web resource that describes the metadata scheme that is used in the description. Multiple namespaces can be defined, allowing elements from different schemes to be combined in a single resource description. Multiple descriptions, created at different times for different purposes, can also be linked to each other. RDF is generally expressed in XML.

A Dublin Core description represented in RDF

    <?xml version="1.0"?>
    <!DOCTYPE rdf:RDF SYSTEM "http://purl.org/dc/schemas/dcmes-xml-20000714.dtd">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description about="http://www.niso.org/standards/resources/Metadata_Demystified.pdf">
        <dc:title>Metadata Demystified</dc:title>
        <dc:creator>Brand, Amy</dc:creator>
        <dc:creator>Daly, Frank</dc:creator>
        <dc:creator>Meyers, Barbara</dc:creator>
        <dc:subject>metadata</dc:subject>
        <dc:description>Presents an overview of metadata conventions in publishing.</dc:description>
        <dc:publisher>NISO Press</dc:publisher>
        <dc:publisher>The Sheridan Press</dc:publisher>
        <dc:date>2003-07</dc:date>
        <dc:format>application/pdf</dc:format>
      </rdf:Description>
    </rdf:RDF>

Metadata Crosswalks

The interoperability and exchange of metadata is further facilitated by metadata crosswalks. A crosswalk is a mapping of the elements, semantics, and syntax from one metadata scheme to those of another.

A crosswalk allows metadata created by one community to be used by another group that employs a different metadata standard. The degree to which these crosswalks are successful at the individual record level depends on the similarity of the two schemes, the granularity of the elements in the target scheme compared to that of the source, and the compatibility of the content rules used to fill the elements of each scheme.

Crosswalks are important for virtual collections where resources are drawn from a variety of sources and are expected to act as a whole, perhaps with a single search engine applied. While these crosswalks are key, they are also labor intensive to develop and maintain. The mapping of schemes with fewer elements (less granularity) to those with more elements (more granularity) is problematic.

Table 1 shows a crosswalk between Dublin Core, EAD, and MARC 21 for selected elements. In this case, there is no attempt to map at the content level.

Metadata Registries

Registries are an important tool for managing metadata. Metadata registries can provide information on the definition, origin, source, and location of data. Registration can apply at many levels, including schemes, usage profiles, metadata elements, and code lists for element values. The metadata registry provides an integrating resource for
legacy data, acts as a lookup tool for designers of new databases, and documents each data element.

Registries can also document multiple schemes or element sets, particularly within a specific field of interest. A good example is the U.S. Environmental Protection Agency’s Environmental Data Registry that provides information about thousands of data elements used in current and legacy EPA databases.

Standards relevant to metadata registries include ISO/IEC 11179, Specification and Standardization of Data Elements, and ANSI X3.285, Metamodel for the Management of Shareable Data.

Future Directions

Most early metadata standards have focused on the descriptive elements needed for discovery, identification, and retrieval. As metadata initiatives developed, administrative metadata, especially in the rights and preservation areas, was further emphasized. Technical metadata is one area that still does not get much attention in metadata schemas. The effective exchange and use of the digital objects described by the metadata often requires knowledge of specific technical aspects of the objects beyond their filenames and types. Newer standards are beginning to address this need. The NISO/AIIM standard, Z39.87, Data Dictionary – Technical Metadata for Digital Still Images, focuses solely on the technical data needed to facilitate interoperability between systems of digital image files. The metadata elements defined in the standard cover basic image parameters such as compression and color profile, information about the equipment and settings used to create the image, and performance assessment data such as sampling frequency and color maps.

Metadata work is ongoing across a number of standards development organizations. In the International Organization for Standardization (ISO), a subcommittee of Technical Committee (TC) 46 (Information and documentation), is addressing metadata development for bibliographic applications. ISO TC 211 (Geographic information / Geomatics) is developing metadata standards for applications in geographic information systems. The Data management and interchange subcommittee of ISO-IEC JTC1 (Information technology) is developing standards for the specification and management of metadata and has recently issued a technical report on Procedures for achieving metadata registry content consistency (ISO/IEC 20943).

Many organizations that developed metadata specifications outside the formal standards community are seeking to have their specifications turned into international standards. The Dublin Core is an example of this approach. It was originally developed in 1995 at a workshop sponsored by OCLC and the National Center for Supercomputing Applications. In 2001, it became an official ANSI/NISO standard (Z39.85) and in 2003 Dublin Core was issued as an international standard (ISO 15836).

The World Wide Web Consortium’s (W3C) metadata activity has been incorporated into the Semantic Web, their initiative to “provide a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.” The RDF framework is one of the key enabling standards. The Semantic Web efforts are directed to standards that increase the interoperability of metadata, rather than specific metadata schemas.

The World Wide Web has created a revolution in the accessibility of information. The development and application of metadata represents a major improvement in the way information can be discovered and used. New technologies, standards, and best practices are continually advancing the applications for metadata. The resources in the following section will give you a head start in tracking developments and contain links to more information on the projects discussed throughout this document.

Table 1. Example of Metadata Crosswalk Mapping

                       Dublin Core    EAD              MARC 21
Title Element          Title          <titleproper>    245 00$a (Title Statement/Title proper)
Author Element         Creator        <author>         700 1#$a (Added Entry--Personal Name) (with $e=author)
                                                       720$a (Added Entry–Uncontrolled Name/Name) (with $e=author)
Date Created Element   Date.Created   <unitdate>       260 ##$c (Date of publication, distribution, etc.)
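
As a rough illustration of how the element-level mapping in Table 1 might be applied by a conversion tool, the sketch below reads an abridged copy of the Dublin Core RDF description shown earlier and relabels each value with the MARC 21 field that Table 1 pairs with it. Several things here are assumptions rather than part of any standard: the function name `crosswalk`, the plain-string field labels, the use of 720$a rather than 700 1#$a for every creator, and the treatment of the unqualified dc:date as a stand-in for Date.Created. As the text notes, nothing is mapped at the content level.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Abridged from the RDF example in this document (DOCTYPE omitted).
RECORD = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description about="http://www.niso.org/standards/resources/Metadata_Demystified.pdf">
    <dc:title>Metadata Demystified</dc:title>
    <dc:creator>Brand, Amy</dc:creator>
    <dc:creator>Daly, Frank</dc:creator>
    <dc:date>2003-07</dc:date>
    <dc:format>application/pdf</dc:format>
  </rdf:Description>
</rdf:RDF>"""

# Dublin Core element -> MARC 21 field label, following Table 1.
CROSSWALK = {
    "title": "245 00$a",
    "creator": "720$a",   # Table 1 also lists 700 1#$a for personal names
    "date": "260 ##$c",   # ASSUMPTION: dc:date stands in for Date.Created
}


def crosswalk(rdf_xml):
    """Return (mapped, unmapped) lists of (label, value) pairs."""
    root = ET.fromstring(rdf_xml)
    mapped, unmapped = [], []
    for element in root.iter():
        # Only elements in the Dublin Core namespace are considered.
        if not element.tag.startswith("{" + DC_NS + "}"):
            continue
        name = element.tag.split("}", 1)[1]
        value = (element.text or "").strip()
        if name in CROSSWALK:
            mapped.append((CROSSWALK[name], value))
        else:
            unmapped.append((name, value))
    return mapped, unmapped


mapped, unmapped = crosswalk(RECORD)
for field, value in mapped:
    print(field, value)
print("unmapped:", unmapped)
```

The `unmapped` list makes the limits of a crosswalk visible: elements with no counterpart in the mapping (here, Format) are left for manual editing, as the discussion of conversion tools anticipates.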
More Information on Metadata

General Resources

Digital Libraries: Metadata Resources (IFLA)
http://www.ifla.org/II/metadata.htm

A Framework of Guidance for Building Good Digital Collections
http://www.niso.org/framework/forumframework.html

Introduction to Metadata: Pathways to Digital Information
by Murtha Baca
http://www.getty.edu/research/conducting_research/standards/intrometadata/index.html

Metadata: Cataloging by Any Other Name
by Jessica Milstead and Susan Feldman
ONLINE, January 1999
http://www.onlinemag.net/OL1999/milstead1.html

Metadata and Its Application
by Brad Eden
Library Technology Reports (September-October 2002)

Metadata Demystified: A Guide for Publishers
by Amy Brand, Frank Daly, Barbara Meyers
NISO Press & The Sheridan Press, 2003, ISBN 1-880125-49-9
http://www.niso.org/standards/resources/Metadata_Demystified.pdf

Metadata Fundamentals for All Librarians
by Priscilla Caplan
ALA, 2003, ISBN: 0-8389-0847-0

Metadata Information Clearinghouse Interactive (MICI)
http://www.metadatainformation.org

Metadata Portals and Multi-standard Projects
by Candy Schwartz
http://web.simmons.edu/~schwartz/meta.html

Metadata Primer – A “How To” Guide on Metadata Implementation [for digital spatial data]
by David Hart and Hugh Phillips
http://www.lic.wisc.edu/metadata/metaprim.htm

Metadata Principles and Practicalities
Duval, Erik, Wayne Hodgins, Stuart Sutton, and Stuart L. Weibel
D-Lib Magazine 8(4) (April 2002)
http://www.dlib.org/dlib/april02/weibel/04weibel.html

Metadata Resources (UKOLN)
http://www.ukoln.ac.uk/metadata/resources

Metadata Standards
http://www.chin.gc.ca/English/Standards/metadata_intro.html

Metadata Standards, Crosswalks, and Standards Organizations
http://staff.library.mun.ca/staff/toolbox/standards.htm

Metadata.net – Projects, Tools & Services, and Schema Registry (Australia)
http://metadata.net/

Preservation Metadata for Digital Objects: A Review of the State of the Art
A White Paper by the OCLC/RLG Working Group on Preservation Metadata, January 31, 2001
www.oclc.org/research/projects/pmwg/presmeta_wp.pdf

Schemes, Initiatives, and Related Sites

Application profiles: mixing and matching metadata schemas
Rachel Heery and Manjula Patel, Ariadne, Issue 25, September 2000.
http://www.ariadne.ac.uk/issue25/app-profiles/intro.html

The Cedars Project (CURL exemplars in digital archives)
http://www.leeds.ac.uk/cedars/metadata.html

CDWA (Categories for the Description of Works of Art)
http://www.getty.edu/research/conducting_research/standards/cdwa/

DDI (Data Documentation Initiative)
http://www.icpsr.umich.edu/DDI/

DOI (Digital Object Identifier)
http://www.doi.org/

Dublin Core Metadata Initiative (DCMI)
http://dublincore.org

EAD (Encoded Archival Description)
http://www.loc.gov/ead/

Environmental Data Registry (EPA)
http://www.epa.gov/edr/

FGDC Content Standard for Digital Geospatial Metadata (CSDGM)
http://www.fgdc.gov/metadata/

Gateway to Educational Materials (GEM)
http://www.geminfo.org/
IFLA Functional Requirements for Bibliographic Records
http://www.ifla.org/VII/s13/frbr/frbr.htm

IMS Global Learning Consortium
http://www.imsglobal.org

<indecs> interoperability of data in ecommerce systems
http://www.indecs.org/

LOM (Learning Object Metadata)
http://ltsc.ieee.org/wg12/

MARC 21 (Machine-Readable Cataloging)
http://www.loc.gov/marc

MetaWeb Project
http://www.dstc.edu.au/Research/Projects/metaweb/

METS (Metadata Encoding and Transmission Standard)
http://www.loc.gov/standards/mets/

MIX (Metadata for Images in XML Schema)
http://www.loc.gov/standards/mix/

MODS (Metadata Object Description Schema)
http://www.loc.gov/standards/mods/

MPEG (Moving Picture Experts Group)
http://www.chiariglione.org/mpeg/

NBII (National Biological Information Infrastructure)
http://www.nbii.gov/

Nordic Metadata Projects
http://www.lib.helsinki.fi/meta/

NSDI (National Spatial Data Infrastructure)
http://www.fgdc.gov/nsdi/

OAI (Open Archives Initiative)
http://www.openarchives.org/

OAIS (Open Archival Information System)
http://www.ccsds.org/documents/650x0b1.pdf

ONIX (Online Information Exchange)
http://www.editeur.org/onix.html

Open GIS Consortium
http://www.opengis.org/

PADI (Preserving Access to Digital Information)
http://www.nla.gov.au/padi/topics/32.html

PREMIS (PREservation Metadata: Implementation Strategies)
http://www.oclc.org/research/projects/pmwg

PURL (Persistent Uniform Resource Locator)
http://purl.org

RDF (Resource Description Framework)
http://www.w3.org/RDF/

SCHEMAS: Forum for Metadata Schema Implementors (UKOLN)
http://www.ukoln.ac.uk/metadata/schemas/

TEI (Text Encoding Initiative)
http://www.tei-c.org/

VRA (Visual Resources Association) Core Categories
http://www.vraweb.org/vracore3.htm

XML (Extensible Markup Language)
http://www.w3.org/XML/

Z39.50
http://www.loc.gov/z3950/agency/

ZING (Z39.50 Next Generation)
http://www.loc.gov/z3950/agency/zing/zing-home.html

Crosswalks and Lists of Crosswalks

All about Crosswalks
http://www.oclc.org/research/projects/mswitch/1_crosswalks.htm

Dublin Core / MARC / GILS Crosswalk
http://www.loc.gov/marc/dccross.html

FGDC to MARC
http://www.alexandria.ucsb.edu/public-documents/metadata/fgdc2marc.html

Issues in Crosswalking Content Metadata Standards
by Margaret St. Pierre and William P. LaPlant, Jr.
http://www.niso.org/press/whitepapers/crsswalk.html

MARC 21 to Dublin Core
http://www.loc.gov/marc/marc2dc.html

Metadata: Mapping between Metadata Formats (UKOLN)
http://www.ukoln.ac.uk/metadata/interoperability/

Metadata Mappings (Crosswalks)
http://libraries.mit.edu/guides/subjects/metadata/mappings.html

Metadata Standards Crosswalk (Getty)
http://www.getty.edu/research/conducting_research/standards/intrometadata/3_crosswalks/crosswalk1.html

Metadata Standards Crosswalks (Canadian Heritage Information Network)
http://www.chin.gc.ca/English/Standards/metadata_crosswalks.html
Metadata Registries & Clearinghouses

DCMI Registry Working Group
http://dublincore.org/groups/registry/

DESIRE Metadata Registry
http://desire.ukoln.ac.uk/registry/

Environmental Data Registry
http://www.epa.gov/edr/

FGDC Clearinghouse Registry
http://registry.gsdi.org/

MICI (Metadata Information Clearinghouse Interactive)
http://www.metadatainformation.org/

NBII Metadata Clearinghouse
http://metadata.nbii.gov/

The SCHEMAS Registry
http://www.schemas-forum.org/registry/

Tools for Metadata Creation

DDI Tools
http://www.icpsr.umich.edu/DDI/users/tools.html#a01

Dublin Core tools
http://dublincore.org/tools/

FGDC Metadata Tools
http://www.nbii.gov/datainfo/metadata/tools/

Metadata Software Tools
http://ukoln.bath.ac.uk/metadata/software-tools/

OAI-Specific Tools
http://www.openarchives.org/tools/tools.html

RDF Editors and Tools
http://www.ilrt.bris.ac.uk/discovery/rdf/resources/#sec-tools

TEI Software
http://www.tei-c.org/Software/index.html
Glossary
AACR2 (Anglo-American Cataloging Rules) – a standard set of rules for cataloging library materials. The “2” refers to the second edition.

administrative metadata – metadata related to the use, management, and encoding processes of digital objects over a period of time. Includes the subsets of technical metadata, rights management metadata, and preservation metadata.

ANSI (American National Standards Institute) – administers and coordinates the U.S. voluntary standardization and conformity assessment system.

CDWA (Categories for the Description of Works of Art) – a metadata element set for describing artworks.

crosswalk – a mapping of the elements, semantics, and syntax from one metadata scheme to another.

CSDGM (Content Standard for Digital Geospatial Metadata) – a metadata standard developed by the FGDC. Officially known as FGDC-STD-001.

dataset – a collection of computer-readable data records.

DC (Dublin Core) – a general metadata element set for describing all types of resources.

DDI (Data Documentation Initiative) – a specification for describing social science datasets.

descriptive metadata – metadata that describes a work for purposes of discovery and identification, such as creator, title, and subject.

DLF (Digital Library Federation) – a membership organization dedicated to making digital information widely accessible.

DOI (Digital Object Identifier) – a unique identifier assigned to electronic objects of intellectual property which can be resolved to the object’s location on the Internet.

DTD (Document Type Definition) – a formal description in SGML or XML syntax of the structure (elements, attributes, and entities) to be used for describing the specified document type.

EAD (Encoded Archival Description) – a metadata scheme for collection finding aids.

element set – information segments of the metadata record, often called semantics or content.

encoding rules – the syntax or prescribed order for the elements contained in the metadata description.

extension – an element that is not officially part of a metadata scheme, which is defined for use with that scheme for a particular application.

FGDC (Federal Geographic Data Committee) – a U.S. Federal government interagency committee responsible for developing the National Spatial Data Infrastructure.

GEM (Gateway to Educational Materials) – a U.S. Department of Education initiative that has defined an extension to the Dublin Core element set to accommodate educational resources.

GIS (Geographic Information System) – a computer system for capturing, managing, and displaying data related to positions on the Earth’s surface.

HTML (Hypertext Mark-up Language) – a set of tags and rules derived from SGML used to create hypertext documents for the World Wide Web. Officially, a W3C Recommendation.

<indecs> (Interoperability of Data in ECommerce Systems) – a framework for metadata to support commerce in intellectual property.

interoperability – the ability of multiple systems, using different hardware and software platforms, data structures, and interfaces, to exchange and share data.
ISO (International Organization for Standardization) – the primary international standards development organization.

IEC (International Electrotechnical Commission) – an international standards development organization for all electrical, electronic and related technologies. Co-sponsors with ISO the Joint Technical Committee 1 on Information Technology.

LOM (Learning Object Metadata) – a metadata scheme for technology-supported learning resources.

MARC 21 (MAchine Readable Cataloging) – a formatting, record structure, and encoding standard for electronic bibliographic cataloging records developed by the Library of Congress. The “21” refers to the version of MARC issued in 1998 that integrated the U.S. and Canadian versions of MARC.

MARCXML – a metadata scheme for working with MARC data in an XML environment.

metadata – structured information that describes, explains, locates, and otherwise makes it easier to retrieve and use an information resource.

metadata harvesting – a technique for extracting metadata from individual repositories and collecting it in a central catalog.

METS (Metadata Encoding and Transmission Standard) – a metadata scheme for complex digital library objects.

MODS (Metadata Object Description Schema) – a metadata scheme for rich description of electronic resources.

MPEG (Moving Picture Experts Group) – Standards Committee 29, Working Group 11 of ISO/IEC JTC1, which develops standards for digital audio and video. Also refers to a suite of standards developed by the group.

namespace – in RDF, a way to tie a specific use of a metadata element to the scheme where the intended definition is to be found.

NISO (National Information Standards Organization) – a standards development organization, accredited by the American National Standards Institute, that develops library and information-related standards.

ONIX (Online Information Exchange) – a metadata scheme for book bibliographic, trade, and promotional data.

preservation metadata – a form of administrative metadata dealing with the provenance of a resource and its archival management.

profile – a subset of a scheme defined and used by a particular interest group to customize the scheme for its purposes.

PURL (Persistent URL) – a naming and resolution system developed by OCLC utilizing an intermediate redirection service to locate a resource’s URL.

qualifier – an optional sub-element to a Dublin Core element that is used to further refine the element or support a specific encoding scheme.

RDF (Resource Description Framework) – a language for representing metadata about Web resources so it can be exchanged between applications without loss of meaning. Officially, a suite of W3C specifications.

registry – a formal system for the documentation of the element sets, descriptions, semantics, and syntax of one or more metadata schemes.

rights management metadata – a form of administrative metadata dealing with the intellectual property rights of a resource.

scheme (schema) – a metadata element set and rules for using it.

semantics – the names and meanings of metadata elements.

SGML (Standard Generalized Markup Language) – a language used to mark up electronic documents with tags that define the relationship between the content and the structure. Officially, international standard ISO 8879, Information processing—Text and office systems—Standard Generalized Markup Language (SGML).

structural metadata – metadata that indicates how compound objects are structured, provided to support use of the objects.

syntax – rules for how metadata elements and their content are encoded.

technical metadata – a form of administrative metadata dealing with the creation or storage encoding processes or formats of the resource.

TEI (Text Encoding Initiative) – a metadata scheme for electronic text.

URL (Uniform Resource Locator) – a unique address for identifying and locating a resource on the Internet.

VRA (Visual Resources Association) Core – a metadata scheme for describing a visual work and its representations.

W3C (World Wide Web Consortium) – an international consortium that develops consensus protocols and specifications to ensure the interoperability of the World Wide Web.

XML (Extensible Mark-up Language) – an application profile of SGML designed for use in Web applications. Officially, a W3C Recommendation.

Z39.50 – a NISO and ISO standard protocol for cross-system search and retrieval. Officially, international standard, ISO 23950, Information Retrieval (Z39.50): Application Service Definition and Protocol Specification, and ANSI/NISO standard Z39.50.
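To make the glossary's definition of a crosswalk concrete, the sketch below treats a crosswalk as a lookup table from elements of one scheme to elements of another. It is a minimal illustration, not a published mapping: the function name `crosswalk`, the table `DC_TO_MARC`, the sample record, and the specific MARC tags and subfields chosen are all assumptions for the example and should be checked against an authoritative crosswalk before use.

```python
# Hypothetical crosswalk table: unqualified Dublin Core element -> (MARC tag, subfield).
# The tag choices are illustrative assumptions, not an official mapping.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
}


def crosswalk(dc_record):
    """Map a Dublin Core record (dict of element -> value) to a list of
    (tag, subfield, value) triples, and report elements with no mapping."""
    mapped, unmapped = [], []
    for element, value in dc_record.items():
        target = DC_TO_MARC.get(element.lower())
        if target is None:
            # Elements without an equivalent are lost unless handled separately.
            unmapped.append(element)
        else:
            tag, subfield = target
            mapped.append((tag, subfield, value))
    return mapped, unmapped


# Sample record with made-up values; "Coverage" has no entry in the table above.
record = {
    "Title": "Understanding Metadata",
    "Publisher": "NISO Press",
    "Coverage": "example value",
}
mapped, unmapped = crosswalk(record)
```

Returning the unmapped elements separately reflects a common difficulty with crosswalks: schemes rarely match one to one, so some information may have no home in the target scheme.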
Support the leaders in our community who support NISO as Voting Members:
3M
American Association of Law Libraries
American Chemical Society
American Library Association
American Society for Information Science and Technology
American Society of Indexers
American Theological Library Association
ARMA International
Armed Forces Medical Library
Art Libraries Society of North America
AIIM International
Association of Information and Dissemination Centers
Association of Jewish Libraries
Association of Research Libraries
Auto-Graphics, Inc.
Barnes & Noble, Inc.
Book Industry Communication
California Digital Library
Cambridge Information Group
Checkpoint Systems, Inc.
College Center for Library Automation
Colorado State Library
CrossRef
Davandy, L.L.C.
Docutek Information Systems
Dynix Corporation
EBSCO Information Services
Elsevier Science Inc.
Endeavor Information Systems, Inc.
Entopia, Inc.
ExLibris USA
Fretwell-Downing Informatics
Gale Group
Geac Library Solutions
GIS Information Systems, Inc.
H.W. Wilson Company
Helsinki University Library
Index Data
Infotrieve
Innovative Interfaces, Inc.
Institute for Scientific Information
The International DOI Foundation
Ithaka/JSTOR/ARTstor
John Wiley & Sons, Inc.
KINS, Inc.
Library Binding Institute
Library of Congress
The Library Corporation
Los Alamos National Laboratory
Lucent Technologies
Medical Library Association
MINITEX
Modern Language Association
Motion Picture Association of America
MuseGlobal, Inc.
Music Library Association
National Agricultural Library
National Archives and Records Administration
National Federation of Abstracting and Information Services
National Library of Medicine
National Security Agency
Nylink
OCLC, Inc.
Openly Informatics, Inc.
ProQuest Information and Learning
Random House, Inc.
Recording Industry Association of America
The Research Libraries Group
SAGE Publications
Serials Solutions, Inc.
SIRSI Corporation
Society for Technical Communication
Society of American Archivists
Special Libraries Association
Synapse Corporation
TAGSYS, Inc.
Talis Information Ltd.
Triangle Research Libraries Network
U.S. Department of Commerce, NIST, Office of Information Services
U.S. Department of Defense, DTIC (Defense Technical Information Center)
U.S. Department of Energy, Office of Scientific & Technical Information
U.S. Government Printing Office
U.S. National Commission on Libraries and Information Science
VTLS, Inc.
WebFeat
ISBN 1-880124-62-9