Controlled Vocabulary

What's new?
Looking for the Controlled Vocabulary Keyword Catalog?
Want to subscribe to the Controlled Vocabulary forum?
Looking for info on the new IPTC Core Schema for XMP?

What is a Controlled Vocabulary, and how is it useful?

Takes the Guesswork out of Searching
A controlled vocabulary makes a database easier to search. People describe the same concept in many different ways, so gathering all of those terms under a single word or phrase makes searching more efficient and eliminates guesswork. Achieving this efficiency, however, requires that the person indexing the database apply the predetermined terms consistently.

A Familiar Concept
You are probably already familiar with the concept of controlled vocabulary. Yellow Pages listings in a phone book are arranged using one. For example, looking up "Car Dealers" leads you to a note that says "see Automobile Dealers." At a basic level, this is how a controlled vocabulary system works.
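The "see" reference described above can be sketched in a few lines of Python. This is only an illustration, not how any particular directory or database implements it: the SEE_REFERENCES table, its entries other than the "Car Dealers" example, and the preferred_term function are all hypothetical.

```python
# Hypothetical "see" references: every variant points to one preferred term.
SEE_REFERENCES = {
    "car dealers": "Automobile Dealers",
    "auto dealers": "Automobile Dealers",
    "automobile dealers": "Automobile Dealers",
}

def preferred_term(term):
    """Return the preferred term for a variant, or None if it is not in the vocabulary."""
    key = " ".join(term.lower().split())  # normalize case and extra whitespace
    return SEE_REFERENCES.get(key)

print(preferred_term("Car Dealers"))  # Automobile Dealers
```

Returning None for an unknown term, rather than guessing, reflects the idea that only predetermined terms belong in the vocabulary.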

One Search is All it Takes
Searching a database that uses a controlled vocabulary, or indexing terms, is efficient and precise. The biggest advantage of a controlled vocabulary is that once you find the correct term, most of the information you need is grouped in one place, so you do not have to search under every synonym for that term.
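As a rough sketch of why one search is enough, the Python below files every record under its preferred term at indexing time. The synonym table, record IDs, and keywords are made up for illustration, and a real image database would store far more than a list of IDs.

```python
from collections import defaultdict

# Hypothetical synonym table mapping each variant to its preferred term.
SYNONYMS = {"car": "automobile", "auto": "automobile", "automobile": "automobile"}

def build_index(records):
    """Group record IDs under the preferred term for each keyword."""
    index = defaultdict(list)
    for record_id, keywords in records:
        for keyword in keywords:
            # Terms outside the synonym table are filed under their lowercase form.
            term = SYNONYMS.get(keyword.lower(), keyword.lower())
            if record_id not in index[term]:
                index[term].append(record_id)
    return index

records = [
    ("img001", ["car"]),
    ("img002", ["Auto"]),
    ("img003", ["automobile", "car"]),
    ("img004", ["boat"]),
]
index = build_index(records)
print(index["automobile"])  # ['img001', 'img002', 'img003']
```

A single lookup under "automobile" returns the records that were originally tagged "car", "Auto", and "automobile", which is the grouping the paragraph above describes.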

Finding a Balance
It is difficult to say whether controlled vocabulary or natural language systems give better retrieval performance. Free text (natural language) systems often return more results in less time because they search every field of a database; the Google search engine is a form of free text search. Such searches work well for very specific queries. When a topic is older or broader in scope, however, you will likely retrieve irrelevant hits, and you may miss relevant records because you did not choose the proper search term. As with a web search, searching a database requires striking a balance between precision and generating enough hits to make the search successful.
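The trade-off can be seen in a toy comparison. The Python sketch below uses three invented image records and two deliberately simple search functions; the captions, keywords, and function names are assumptions made for the example, not part of any real system.

```python
# Hypothetical image records with a free-text caption and controlled keywords.
RECORDS = {
    "img001": {"caption": "A jaguar prowling through the rainforest",
               "keywords": ["jaguar (animal)"]},
    "img002": {"caption": "Vintage Jaguar parked outside a dealership",
               "keywords": ["automobile"]},
    "img003": {"caption": "Big cat resting on a branch",
               "keywords": ["jaguar (animal)"]},
}

def free_text_search(query):
    """Match the query anywhere in the caption, ignoring case."""
    q = query.lower()
    return sorted(rid for rid, rec in RECORDS.items() if q in rec["caption"].lower())

def controlled_search(term):
    """Match only records indexed with exactly this controlled term."""
    return sorted(rid for rid, rec in RECORDS.items() if term in rec["keywords"])

print(free_text_search("jaguar"))            # ['img001', 'img002']
print(controlled_search("jaguar (animal)"))  # ['img001', 'img003']
```

Here the free text search picks up an irrelevant hit (the car in img002) and misses img003, whose caption never uses the word. The controlled search finds both animal images, but only if the searcher knows the exact term.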

Stop Words
Keep in mind that many online databases ignore certain words, called "stop words." Common stop words include 'the', 'a', 'an', 'this', and 'that'. Stop words can carry useful information in natural language processing, but most keyword-based approaches do not use a grammar to parse user input, so that information goes unused.
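A minimal Python sketch of stop word removal, assuming a database that simply drops the five words listed above before matching. Real systems use longer lists that vary from one database to another, and the strip_stop_words name is hypothetical.

```python
# Stop word list taken from the examples above; real lists are longer.
STOP_WORDS = {"the", "a", "an", "this", "that"}

def strip_stop_words(query):
    """Lowercase the query, split on whitespace, and drop stop words."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

print(strip_stop_words("The car that won this race"))  # ['car', 'won', 'race']
```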

Much of the information on this site concerns how to apply a controlled vocabulary to describe the images in an image database. Within the "metalogging" section you will find resources and suggestions on how to caption and keyword your images efficiently. Use the links at the top or bottom of the page to read about these other items of interest. If you would like to join others in discussing this topic further, just enter your email address in the sign-up box below.

examples  |  books  |  products  |  image databases  |  links  |  what's new
imagedatabases  |  programs  |  IPTC standard  |  downsampling  | filenaming 
metalogging  |  captioning  |  keywording  |  guidelines  | metalog resources
home  |  contact  | sitemap

 
