Developing a digital image archive can become a huge undertaking if you don't break it down into small, discrete steps. These steps can proceed at the same time, and can even be handled by different people in your organization, since some tasks require different skill sets. While there are a number of technical challenges with regard to capturing, processing, and storing images, how you organize your images for retrieval can be even more important.
Organizing Your Photographs
There are many research projects that are examining the indexing of
images by automatic content analysis, but they cannot achieve the level of
detail and accuracy needed to replace a truly professional manual indexing
system.
A number of decisions need to be made before you even begin the process (unless, of course, you like to redo work for the thrill of it). The biggest task to tackle is how to physically organize and file the scans (in folders on a hard drive, on a local area network, or even on a series of CD-Rs) so you can find the materials you need later. You also need to consider what the images will be used for: thumbnails to locate the "real" photos, web and multimedia, or final art for publication. You will probably need one or more "off-the-shelf" software applications to help you catalog the images or create a searchable database of them.
Organize, Organize, Organize
You may already have an organizational scheme that you use for your
existing "physical" images. See if it's possible to modify or transfer this
system for your "virtual" image storage. My own file-naming
system differs for slides, negatives, and digital files, and aids me in locating
the physical film. If you don't have some organizational "hierarchy" in place,
my first suggestion would be to create your own "Dewey Decimal System" for
images. See if you can locate a copy of Ernst Robl's "Organizing your
Photos" described on the books page. For
a summary of some of the more important parts of that book, see his article,
"Image Numbering, Filing, and Retrieval," which was originally prepared for the
American Society of Picture Professionals site.
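To make the idea of a "Dewey Decimal System" for images a little more concrete, here is a minimal Python sketch that builds and parses a structured image ID. The field layout (a media code, the year, a roll or shoot number, and a frame number) and the names build_id and parse_id are hypothetical, chosen only for illustration; they are not the scheme from Robl's book, so adapt the fields to whatever your own filing system needs.

```python
import re
from typing import NamedTuple

# Hypothetical media codes; adapt to the film and file types you archive.
MEDIA_CODES = {"slide": "S", "negative": "N", "digital": "D"}


class ImageId(NamedTuple):
    media: str   # one-letter media code
    year: int    # four-digit year the image was made
    roll: int    # roll, sheet, or shoot number within that year
    frame: int   # frame number within the roll


_PATTERN = re.compile(r"^([SND])(\d{4})-(\d{4})-(\d{3})$")


def build_id(media: str, year: int, roll: int, frame: int) -> str:
    """Return an ID such as 'S1998-0042-017'.

    Zero-padding keeps IDs of the same media type in year, roll,
    frame order when a folder is sorted by name.
    """
    return f"{MEDIA_CODES[media]}{year:04d}-{roll:04d}-{frame:03d}"


def parse_id(image_id: str) -> ImageId:
    """Split an ID back into its parts, rejecting malformed names."""
    match = _PATTERN.match(image_id)
    if match is None:
        raise ValueError(f"not a valid image ID: {image_id!r}")
    media, year, roll, frame = match.groups()
    return ImageId(media, int(year), int(roll), int(frame))


print(build_id("slide", 1998, 42, 17))  # S1998-0042-017
print(parse_id("N2001-0007-003"))
```

Because the ID says which kind of media it refers to and where on the roll the frame sits, the same string can name the digital file and point you back to the physical film.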
The next big step is to begin embedding "metadata" (literally, data about data, or in this case information about your image) in your image files. There are many kinds of metadata, so if you are unfamiliar with the term and want a brief overview, take a look at the Metalogging section of this site. The kind we will exploit for our purposes goes by various names, but it was first conceived by the International Press Telecommunications Council, so it is usually just referred to as IPTC. If you use Adobe Photoshop, you'll know this as the File Info feature found under the File menu. To increase productivity you might want to consider something like the Image Info Toolkit or PhotoMechanic.
Caption and Keyword to Aid in Retrieval
The two most frequently searched fields in the IPTC schema are the Caption field
(now called "Description" in the latest versions of Photoshop), and
the Keyword field. Writing good captions and determining good keywords to aid
you and others in finding your images is part of what I call "metalogging"
and is covered in detail on this site. See the separate pages on writing
good captions and determining good
keywords, as well as a comprehensive list of caption
and keyword guidelines.
If you don't have some organizational "hierarchy" in
place for describing your subjects, my first suggestion would be to create
your own "keyword thesaurus" using your own "controlled
vocabulary" that works as part of your own "Dewey Decimal System"
for images. The Library of Congress Classification Outline is a good start.
See some of the other examples listed on this
site. If you don't want to go to the trouble of creating your own controlled
vocabulary, and you don't find any that match up with your specialties on
the example page, you might want to consider the use of the Image
Info Toolkit and its integrated Keyword
Catalog.
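As a rough sketch of how a "keyword thesaurus" with a controlled vocabulary can work, the Python below maps whatever a cataloger types to a preferred term and adds the broader terms above it in the hierarchy. The three-entry thesaurus and the function name normalize_keywords are made up for illustration; a real vocabulary would be far larger and drawn from sources like those mentioned here.

```python
# Hypothetical thesaurus: preferred term -> (broader term, accepted synonyms).
THESAURUS = {
    "animals": (None, set()),
    "dogs": ("animals", {"dog", "canine", "puppy"}),
    "cats": ("animals", {"cat", "kitten", "feline"}),
}

# Reverse lookup from every accepted spelling to its preferred term.
LOOKUP = {}
for preferred, (_broader, synonyms) in THESAURUS.items():
    LOOKUP[preferred] = preferred
    for synonym in synonyms:
        LOOKUP[synonym] = preferred


def normalize_keywords(raw_keywords):
    """Map free-text keywords to preferred terms plus their broader terms.

    Returns a tuple: (sorted list of accepted keywords,
    list of terms that are not in the vocabulary).
    """
    accepted, unknown = set(), []
    for raw in raw_keywords:
        term = LOOKUP.get(raw.strip().lower())
        if term is None:
            unknown.append(raw)
            continue
        # Walk up the hierarchy so a search for "animals" also finds dogs.
        while term is not None:
            accepted.add(term)
            term = THESAURUS[term][0]
    return sorted(accepted), unknown


print(normalize_keywords(["Puppy", "feline", "sunset"]))
# (['animals', 'cats', 'dogs'], ['sunset'])
```

Returning the unknown terms separately, rather than silently dropping them, gives you a list of candidates to either correct or add to the vocabulary.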
I started with the Library of Congress's Thesaurus for Graphic Materials combined
with the hierarchy from the
International Press Telecommunications Council (IPTC), and several other
sources found on this site (picking and choosing from each). I also "picked"
some ideas from Jim Pickerell's http://www.pickphoto.com/
site, which has lots of useful information on image archiving in his "selling
stock" newsletter (plus lots of other good stuff if you are interested in
selling your images for commercial purposes).
Why are you doing this?
What do you plan to do with the images? If you plan on using the digital
images only for thumbnails to locate the original slide or negative, then
your requirements for scanning or "acquiring" are much more reasonable. If
you intend to use the images for final art you will need much larger files,
or a way of finding the physical film quickly, so that you can have the darkroom
work handled.
Most desktop scanners today can easily give you a 27 MB to 55 MB file (when stored as an uncompressed RGB TIFF), and cost only $2,000 to $4,000. I've used the Polaroid Sprintscan 35+, the Sprintscan 4000, and the Microtek 4000t, and with any of them it takes only a minute or two to make a scan of this size from 35mm film.
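If you want to estimate file sizes for your own scanner, the arithmetic is simple: pixel width times pixel height times bytes per pixel. The sketch below assumes a full 36 x 24 mm frame, 8 bits per channel, and resolutions of 2700 and 4000 pixels per inch; those inputs are my assumptions for illustration, and the function name scan_size_bytes is hypothetical.

```python
MM_PER_INCH = 25.4


def scan_size_bytes(width_mm, height_mm, ppi, channels=3, bits_per_channel=8):
    """Estimate the uncompressed size of a scan in bytes."""
    width_px = round(width_mm / MM_PER_INCH * ppi)
    height_px = round(height_mm / MM_PER_INCH * ppi)
    return width_px * height_px * channels * bits_per_channel // 8


for ppi in (2700, 4000):
    size = scan_size_bytes(36, 24, ppi)
    # Photoshop-style megabytes: 1 MB = 1024 * 1024 bytes.
    print(f"{ppi} ppi: about {size / 1024**2:.0f} MB")
# 2700 ppi: about 28 MB
# 4000 ppi: about 61 MB
```

The larger figure comes out a little above the 55 MB quoted above, most likely because a real scan covers slightly less than the full frame; treat the numbers as estimates for planning storage rather than exact file sizes.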
Many photographers are now using pro and prosumer digital cameras for covering assignments and creating stock images for sale. Having an organized system is even more important if you are shooting digital, because it's possible to easily lose track of where that file is located if you don't have a good system in place.
The Research Libraries Group of the OCLC gave an overview of the workflow being used at Corbis in this article on their site. If you are scanning your images with the idea of being able to license them as stock photographs, then this is a good reference.
Where is that scan?
After you've determined what you are going to do with your images, and
have scanned them or downloaded the images from the digital camera, what do
you do next? One thing you might want to consider is creating a version that's
easy to access but large enough to show relevant details. I do this by downsampling
the high resolution file using a technique I developed over a period of time.
This image, because of its small file size, is a good one to annotate and
catalog in your image database. In addition to taking less time to generate
thumbnails from, these smaller files are easy to open when you need to append
data in the "File Info/IPTC" dialog in Photoshop, or re-insert this
info back into the "header" of the image file.
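The downsampling technique itself isn't spelled out on this page, so the sketch below covers only one small piece of the job: working out the pixel dimensions of a browsing copy so that it fits a chosen long edge without distorting the image. The 1024-pixel target and the function name fit_within are assumptions for illustration; the actual resampling would be done in Photoshop or another imaging tool.

```python
def fit_within(width_px, height_px, max_long_edge):
    """Return (width, height) scaled so the longer edge is max_long_edge.

    The aspect ratio is preserved, and images that are already small
    enough are returned unchanged (no upsampling).
    """
    long_edge = max(width_px, height_px)
    if long_edge <= max_long_edge:
        return width_px, height_px
    scale = max_long_edge / long_edge
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


# A 5669 x 3780 pixel scan reduced to a 1024-pixel long edge.
print(fit_within(5669, 3780, 1024))  # (1024, 683)
```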
Classify/Categorize, Identify, and Catalog
Figure out where the image belongs in your classification system. Identify
the WHO, WHAT, WHY, WHEN, WHERE, and HOW of the image and either place that
information in the "File Info/IPTC"
part of your image file (in Photoshop, look for File Info
under the File menu) or create a way to link a text file with the
image (or just keep reading).
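One simple way to "link the text file with the image" is a sidecar file: a small text file that shares the image's name and sits next to it in the same folder. The Python below is a minimal sketch of that idea, assuming a JSON file with one field for each of the six questions; the ImageRecord class, the .json extension, and the function names are hypothetical choices for illustration, not a standard.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class ImageRecord:
    who: str
    what: str
    why: str
    when: str
    where: str
    how: str


def sidecar_path(image_path):
    """The text file shares the image's name, with a .json extension."""
    return Path(image_path).with_suffix(".json")


def write_sidecar(image_path, record):
    """Save the record next to the image and return the sidecar's path."""
    path = sidecar_path(image_path)
    path.write_text(json.dumps(asdict(record), indent=2), encoding="utf-8")
    return path


def read_sidecar(image_path):
    """Load the record that belongs to the given image."""
    data = json.loads(sidecar_path(image_path).read_text(encoding="utf-8"))
    return ImageRecord(**data)
```

Calling write_sidecar("scans/S1998-0042-017.tif", record) would create scans/S1998-0042-017.json, so the link between the two files survives as long as they are moved and renamed together. That fragility is the main argument for embedding the same information in the image file itself when your software allows it.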
If these are your own images, or your employer's, you may want to provide a copyright notice within the IPTC header. If you are familiar with Adobe Photoshop, you can apply this information as a batch process using "actions." On Riecks.com you can see how to insert a copyright notice into the File Info/IPTC section as a Photoshop action. If you are storing your images as JPEG files, however, you don't want to use Photoshop for this, as you will be "recompressing" the image each time you save. In that case (or if you are shooting JPEGs with your digital camera), you may want to use one of a handful of utility programs that let you change the info in the file "header" without affecting the actual image portion of the file.
If you are creating an image database with the intention of putting it on the web, you may want to include visible watermarks on the face of the image. There are various image security options including visible watermarks discussed on Riecks.com as well.
Finally you are ready to create "catalogs" of your images. There
are numerous applications that can handle the
task of creating a thumbnail, and automatically gathering information about
the file (size, resolution, filename, color space, etc). One easy way is to
use a program that can automatically grab the IPTC/file info from the
Photoshop file when creating the thumbnail. It's best to test several programs
you are considering, as you may find there are differences in how this information
is transferred to your program of choice. In order to make it really useful
you may want to consider image databases that can take any information you
have added within the image database, and push that back into the IPTC file
info of the actual images in the catalog/database.
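To show roughly what a cataloging program does behind the scenes, here is a minimal Python sketch that walks a folder, records basic facts about each image file in a small SQLite database, and lets you search by caption or keyword. It is only an illustration under stated assumptions: the table layout, the list of file extensions, and the function names are hypothetical, and it does not read or write IPTC data, which would need a third-party library or one of the applications discussed on this site.

```python
import sqlite3
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg"}  # assumed file types


def build_catalog(root, db_path=":memory:"):
    """Scan root for image files and record basic facts about each one."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "path TEXT PRIMARY KEY, filename TEXT, size_bytes INTEGER, "
        "caption TEXT DEFAULT '', keywords TEXT DEFAULT '')"
    )
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            # INSERT OR IGNORE keeps existing captions when you rescan.
            db.execute(
                "INSERT OR IGNORE INTO images (path, filename, size_bytes) "
                "VALUES (?, ?, ?)",
                (str(path), path.name, path.stat().st_size),
            )
    db.commit()
    return db


def annotate(db, filename, caption, keywords):
    """Store a caption and comma-separated keywords for one image."""
    db.execute(
        "UPDATE images SET caption = ?, keywords = ? WHERE filename = ?",
        (caption, ", ".join(keywords), filename),
    )
    db.commit()


def search(db, term):
    """Return filenames whose caption or keywords contain the term."""
    like = f"%{term}%"
    rows = db.execute(
        "SELECT filename FROM images "
        "WHERE caption LIKE ? OR keywords LIKE ? ORDER BY filename",
        (like, like),
    )
    return [row[0] for row in rows]
```

A typical session would call build_catalog on your scans folder, annotate a few images, and then use search(db, "dogs") to get back the matching filenames. Note that in this sketch the captions live only in the database; pushing them back into the image files is exactly the feature worth looking for in a commercial program.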
Benefits of an Image Database
In addition to allowing you to search for your images by description or keyword,
some of these programs support drag-and-drop or drag-and-place
for linking the original file to your page layout, or word processing applications.
Most have "slide show viewers" or integrated browsers (some tie into QuickTime). You
can often use them to create "static" HTML pages that can be viewed on Windows
or Mac machines. Several can import/export to proprietary databases or standard
database formats that will allow you to edit the database info or create custom
applications. Some of the more progressive even have a way to move the resulting
catalog onto your website while keeping it searchable.