Guided Exploration

Metadata

Metadata is data about data, or information about information. Examples are the name of an author or the title of a document. Metadata describes characteristics of information such as its content, condition or quality. Libraries have long used metadata, both to categorize books and articles and as an instrument for exploration and search. In Aduna software, metadata is used for finding and exploring information sources.

Metadata and findability

Explicit metadata

The use of (explicit) metadata improves the findability of information. Suppose you have to give a presentation on a subject and want to find information about it. Common search engines return millions of hits, and with so many results it is hard to find the right document. If metadata about the subject is available, you learn that it is related to subjectA and subjectB. This information helps you focus your search and so increases the findability of information.
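The idea can be sketched in a few lines of Python. This is only an illustration, not how Aduna software works: the document collection, the field names and both helper functions are hypothetical, and a subject counts as "related" here simply when it appears on the same document.

```python
# Hypothetical document collection; each entry carries explicit metadata.
documents = [
    {"title": "Intro to Solar Power", "author": "J. Doe", "subjects": {"energy", "solar"}},
    {"title": "Wind Turbine Design", "author": "A. Smith", "subjects": {"energy", "wind"}},
    {"title": "Medieval Painting", "author": "J. Doe", "subjects": {"art"}},
]


def find_by_subject(docs, subject):
    """Return the titles of documents whose subject metadata contains `subject`."""
    return [d["title"] for d in docs if subject in d["subjects"]]


def related_subjects(docs, subject):
    """Collect subjects that co-occur with `subject`, as hints for narrowing a search."""
    related = set()
    for d in docs:
        if subject in d["subjects"]:
            related |= d["subjects"] - {subject}
    return sorted(related)


hits = find_by_subject(documents, "energy")
hints = related_subjects(documents, "energy")
print(hits)
print(hints)
```

The second function plays the role of the "related to subjectA and subjectB" hint: it does not answer the query itself but suggests where to look next.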

Ontologies, thesauri and taxonomies

An ontology is metadata with a (hierarchical) structure that describes relations between concepts. Some ontologies use predefined relations. A thesaurus, for example, uses the relations 'is broader than', 'is narrower than' and 'is related to'. A well-known thesaurus is the Art & Architecture Thesaurus (AAT) of the Getty Research Institute. It is based on 34,000 concepts and describes fine art, architecture, decorative arts, archival materials, and material culture. A taxonomy is usually based on hierarchical relations.
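A minimal sketch of the three thesaurus relations as a Python data structure may make this concrete. The class, its method names and the sample concepts are assumptions made for illustration; the concepts are not taken from the AAT.

```python
from collections import defaultdict


class Thesaurus:
    """Toy thesaurus holding broader, narrower and related links between concepts."""

    def __init__(self):
        self.broader = defaultdict(set)
        self.narrower = defaultdict(set)
        self.related = defaultdict(set)

    def add_broader(self, concept, broader_concept):
        # 'is broader than' and 'is narrower than' are each other's inverse,
        # so both directions are stored together.
        self.broader[concept].add(broader_concept)
        self.narrower[broader_concept].add(concept)

    def add_related(self, a, b):
        # 'is related to' is symmetric.
        self.related[a].add(b)
        self.related[b].add(a)

    def all_narrower(self, concept):
        """Return every concept below `concept` in the hierarchy."""
        found = set()
        stack = [concept]
        while stack:
            current = stack.pop()
            for child in self.narrower.get(current, ()):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found


# Invented sample concepts.
t = Thesaurus()
t.add_broader("oil painting", "painting")
t.add_broader("painting", "visual works")
t.add_related("painting", "drawing")
print(t.all_narrower("visual works"))
```

A taxonomy in the sense described above would use only the hierarchical part of this structure, that is, the broader and narrower links.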

Standards for metadata

Dublin Core and standard metadata schemes

Dublin Core is a standard metadata scheme: a collection of metadata terms that can be used to annotate or classify information. It has terms like Contributor, Creator, Title and Description. The Dublin Core standard is generic, that is, not specific to a domain. There are also domain-specific metadata standards, such as CIDOC CRM, which is used in the museum world.
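As a small illustration, a document can be annotated with Dublin Core terms in a plain Python dictionary and checked against the fifteen elements of the Dublin Core Metadata Element Set. The record contents and the `unknown_terms` helper are hypothetical; 'Keywords' is included on purpose as a term that is not a Dublin Core element.

```python
# The fifteen elements of the Dublin Core Metadata Element Set, in lower case.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def unknown_terms(record):
    """Return the keys of `record` that are not Dublin Core elements."""
    return sorted(term for term in record if term.lower() not in DC_ELEMENTS)


# Hypothetical annotation of a document.
record = {
    "Title": "Guided Exploration",
    "Creator": "J. Doe",
    "Description": "An introduction to metadata.",
    "Keywords": "metadata",  # not a Dublin Core element; 'Subject' would be
}

print(unknown_terms(record))
```

Checking term names against a fixed set is what makes a scheme "standard" in practice: two parties using the same terms can exchange records without agreeing on a vocabulary first.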

RDF and OWL

Just as XML is a mechanism for annotating data (for example with tags like Author=J.Doe), RDF and OWL are languages for describing metadata. RDF (Resource Description Framework) is a language for making simple, machine-interpretable statements about information. In RDF you can express that 'J. Doe' is an 'Author', that 'Author' is_a 'Person' and that 'Person' lives_in 'Town'. From these statements a computer can infer that 'J. Doe' possibly lives in some 'Town'. That is obvious to humans but new for computers. OWL (Web Ontology Language) can express more complex statements than RDF.
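The 'J. Doe' example can be sketched in plain Python, with statements stored as (subject, predicate, object) triples in the way RDF models them. This is an illustration under simplifying assumptions, not an RDF implementation: it uses no RDF library, the function names are invented, and it treats every link as 'is_a', whereas real RDF distinguishes between an instance having a type and a class being a subclass of another.

```python
# Statements from the example, written as (subject, predicate, object) triples.
triples = {
    ("J. Doe", "is_a", "Author"),
    ("Author", "is_a", "Person"),
    ("Person", "lives_in", "Town"),
}


def classes_of(subject, statements):
    """Follow 'is_a' links transitively, starting from `subject`."""
    found = set()
    stack = [subject]
    while stack:
        current = stack.pop()
        for s, p, o in statements:
            if s == current and p == "is_a" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def inherited_statements(subject, statements):
    """Return (predicate, object) pairs stated about any class of `subject`."""
    classes = classes_of(subject, statements)
    return sorted((p, o) for s, p, o in statements if s in classes and p != "is_a")


print(classes_of("J. Doe", triples))
print(inherited_statements("J. Doe", triples))
```

The second result is the inference from the text: because 'J. Doe' is an 'Author' and therefore a 'Person', the statement about 'Person' and 'Town' possibly applies to 'J. Doe' as well.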