Takes the Guesswork out of Searching
A controlled vocabulary makes a database easier to search. Because there are
many different ways of describing the same concept, drawing all of these terms
together under a single word or phrase makes searching the database more
efficient by eliminating guesswork. Achieving this efficiency, however,
requires consistency on the part of the person indexing the database and
the use of predetermined terms.
A Familiar Concept
It's likely you are already familiar with the concept of controlled
vocabulary. The Yellow Pages listings in a phone book are arranged using a
controlled vocabulary. For example, a search for "Car Dealers" leads you to a
note to "see Automobile Dealers." At a basic level, this is how a controlled
vocabulary system works.
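The "see" reference described above can be sketched as a simple lookup table. This is a minimal illustration, not a description of any particular product: the `SEE_REFERENCES` table, its entries other than the "Car Dealers" example, and the `preferred_term` function are all hypothetical names chosen for this sketch.

```python
# Hypothetical "see" references: each variant points to one preferred term.
# Only the "Car Dealers" entry comes from the Yellow Pages example; the
# others are assumed for illustration.
SEE_REFERENCES = {
    "car dealers": "Automobile Dealers",
    "auto dealers": "Automobile Dealers",
    "automobile dealers": "Automobile Dealers",
}


def preferred_term(query):
    """Return the preferred term for a query, or None if it is not listed."""
    key = " ".join(query.lower().split())  # normalize case and spacing
    return SEE_REFERENCES.get(key)


print(preferred_term("Car Dealers"))
```

A query that is not in the table returns `None`, which mirrors the real-world case where the searcher has to guess again or consult the vocabulary list.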
One Search is All it Takes
Conducting a search in a database that uses controlled vocabulary or indexing
terms is efficient and precise. The biggest advantage to controlled vocabulary
is that once you do find the correct term, most of the information you need
is grouped together in one place, saving you the time of having to search
under all of the other synonyms for that term.
Finding a Balance
It's difficult to say whether controlled vocabulary or natural language systems
give the best retrieval performance. Free text or natural language systems often
provide more results in a shorter time span because you are searching all the
fields of a given database (the Google search engine is a form of free text
search). Such systems work well for very specific searches. However, when a
topic is older or broader in scope, you will likely retrieve irrelevant hits.
You may also miss some records relevant to your search because you didn't choose
the proper search term. As with a web search, searching a database requires
striking a balance between precision and generating enough hits to make the
search successful.
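The trade-off can be seen in a small sketch. The records, captions, and function names below are invented for illustration, and the matching is deliberately naive (whole-word matching on captions versus exact matching on assigned terms); real systems are considerably more sophisticated.

```python
# Hypothetical records: a free-text caption plus indexer-assigned terms.
RECORDS = [
    {"id": 1, "caption": "Used car lot at sunset", "terms": {"Automobile Dealers"}},
    {"id": 2, "caption": "Auto showroom interior", "terms": {"Automobile Dealers"}},
    {"id": 3, "caption": "Child playing with a toy car", "terms": {"Toys"}},
]


def free_text_search(word):
    """Match the word anywhere in the caption, as a free-text system might."""
    word = word.lower()
    return [r["id"] for r in RECORDS if word in r["caption"].lower().split()]


def controlled_search(term):
    """Match only records an indexer tagged with the exact controlled term."""
    return [r["id"] for r in RECORDS if term in r["terms"]]


print(free_text_search("car"))                  # [1, 3]
print(controlled_search("Automobile Dealers"))  # [1, 2]
```

In this toy data the free-text search for "car" picks up an irrelevant record (the toy) and misses the showroom, whose caption says "Auto" instead. The controlled search returns both dealer records, but only because someone tagged them consistently and the searcher knew the preferred term.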
Stop Words
Keep in mind that many online databases ignore certain words. These are
called "stop words." Common stop words include 'the', 'a', 'an', 'this', and
'that'. Stop words can carry useful meaning in natural language processing,
but most keyword-based approaches do not use grammars to parse user input, so
that meaning goes unused.
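As a rough sketch of what a keyword-based system does with a query, the snippet below drops stop words before searching. The stop list is just the five words named above, and the `keywords` function is a hypothetical helper; actual databases each use their own, usually longer, lists.

```python
# Stop list taken from the examples in the text; real lists vary by database.
STOP_WORDS = {"the", "a", "an", "this", "that"}


def keywords(query):
    """Lowercase the query, split on whitespace, and drop stop words."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]


print(keywords("The history of an automobile"))  # ['history', 'of', 'automobile']
```

Note that "of" survives here only because it is not on this short list; many databases would drop it as well.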
Much of the information on this site concerns how to apply a controlled vocabulary to describe the images in an image database. Within the "metalogging" section you will find resources and suggestions on how to efficiently caption and keyword your images.