Controlled Vocabulary

The First International Photo Metadata Conference

"Working towards a seamless photo workflow"

These are my notes from the First International Photo Metadata Conference, which was held on Thursday, June 7, 2007, in Florence, Italy. This report is split into three sections corresponding to the conference viewpoints of photo metadata creators and users, standards bodies, and implementers.

Part I: Photo Metadata Creators and Users
Part II: Photo Metadata Standardization Bodies
Part III: Photo Metadata Implementers

Part I: This first section deals with photo metadata creators and users, such as independent and commercial photographers, small and large picture agencies and libraries, and trade associations from the photo business.

The Picture Tide is Rising

Andreas Trampe of Germany’s Stern magazine opened the First International Photo Metadata Conference to a capacity crowd of over 125 people by illustrating the challenges of managing a flood of images at a busy picture desk.

Trampe stressed that "we have an excess of images. The search process has become much too time consuming." Stern receives 25,000 images every weekend and 12,000 each weekday from 12 press agencies. In addition, the picture desk has online access to 300 image databases holding a further 60 million images. From all of these options, they select just 250 photos for each issue, or about 0.3 percent of all the images received.
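The 0.3 percent figure can be checked with a quick calculation. The sketch below assumes one issue per week, that the weekend figure is a total for the whole weekend, and that there are five weekdays. None of these assumptions were stated explicitly in the talk.

```python
# Rough check of the selection rate quoted above.
# Assumptions (not stated in the talk): one issue per week, the weekend
# figure covers the whole weekend, and there are five weekdays.
WEEKEND_IMAGES = 25_000
WEEKDAY_IMAGES = 12_000
WEEKDAYS = 5
SELECTED_PER_ISSUE = 250

received_per_week = WEEKEND_IMAGES + WEEKDAY_IMAGES * WEEKDAYS
selection_rate = SELECTED_PER_ISSUE / received_per_week

print(f"Received per week: {received_per_week:,}")  # Received per week: 85,000
print(f"Selection rate: {selection_rate:.2%}")      # Selection rate: 0.29%
```

Under these assumptions the result rounds to the 0.3 percent that Trampe quoted. The 60 million images in the online databases are not included.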

He then proved the point by showing the flood of images that had come in the day before the conference, which fell during the G8 summit. A search using only the keyword "G8" returned an overwhelming 9,000 images. Limiting the search to the last three days only reduced the yield to over 4,000 hits. Narrowing the search further to those with the keyword "demonstration" still left him with nearly 800 images to sift through.

Trampe complained that photographers and agencies "misuse the IPTC fields, unleashing an orgy of keywords on us." He offered some amusing examples of the inappropriate images that appear in a search when photographers and image editors enter erroneous information simply to increase the chance of someone finding their images.

This keyword and caption "spamming" frustrates image buyers trying to find an image on a specific topic. Some suppliers will even "jigger" the IPTC Date Created field so that images show up in searches for more recent material. Part of this is a training problem, as most of those entering this data are self-taught or learn such techniques from their colleagues. The other part is a lack of standards and guidelines.

He then demonstrated effective search filters on the Grazia Neri website by searching for Florence and then limiting the results to the "Travel/stock" category, so that they showed mostly images of the city. This reduced the field from 3,000 images to 600 that would be useful for most magazine editors.

Trampe concluded that it may be more worthwhile to put effort into getting better quality image metadata than into buying better database software.


Meta-Education Needed

presented by David Riecks


My presentation focused on issues that affect stock photographers who embed metadata in their images.

Clients often change filenames when they download comps of images. This makes it difficult for the client to identify the supplier, or for the supplier to find the correct image. Always embedding the unique filename in the Document Title field is one solution that would be a great help for all parties in the imaging chain.

I reported that the “Save for Web” feature in Photoshop still discards all metadata by default. This default should be reversed so that metadata is preserved unless the user chooses otherwise.

Photographers need ways to insert metadata as early as possible in the process – preferably at the capture stage. They also need to validate their own workflows so they know that their metadata is still in the file before sending it on to the distributor or client. Finally, they need to check whether their distributor is changing or retaining metadata, both in the images sent to clients and in the preview images displayed on websites.
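One way to picture this kind of workflow validation is a simple before-and-after comparison. The sketch below is only an illustration. The function name, the field names, and the sample values are hypothetical, and plain dictionaries stand in for whatever a real metadata reader would return.

```python
# Sketch of a metadata "round-trip" check. The field names and the plain
# dictionaries are hypothetical stand-ins for the output of a real metadata
# reader; no actual image file is read here.

def audit_metadata(before, after):
    """Compare metadata before and after a workflow step.

    Returns a dict listing the fields that were lost and the fields
    whose values were changed.
    """
    lost = sorted(field for field in before if field not in after)
    changed = sorted(
        field for field in before
        if field in after and after[field] != before[field]
    )
    return {"lost": lost, "changed": changed}


# Metadata as entered by the photographer (hypothetical values).
original = {
    "Document Title": "photographer_20070607_0123.jpg",
    "Creator": "Jane Photographer",
    "Copyright Notice": "(c) 2007 Jane Photographer",
    "Keywords": ["Florence", "Italy", "conference"],
}

# Metadata read back after a hypothetical export or distribution step.
exported = {
    "Document Title": "photographer_20070607_0123.jpg",
    "Creator": "Stock Agency Ltd",
    "Keywords": ["Florence", "Italy", "conference"],
}

report = audit_metadata(original, exported)
print(report)  # {'lost': ['Copyright Notice'], 'changed': ['Creator']}
```

The same comparison could be run on the preview images downloaded from a distributor's website, since those are often processed separately from the files delivered to clients.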

Further along in the stock image workflow, more issues arise. In some cases, stock archives and distributors may modify metadata without knowing what they are doing. With some, metadata may be lost simply because their workflows haven't been tested or reviewed. Others are intentionally stripping or altering it to conform to their own workflow.

There is also a widely held misperception among clients that they can't view embedded metadata without opening the image in Photoshop. There are alternatives, but meta-education is needed.

Training and education for everyone in the stock imaging chain are the key to better photo metadata. For additional details, download my presentation from the conference website at www.phmdc.org.


Identifying “Pain Points”

presented by Peter Krogh

Krogh, the author of “The DAM Book,” identified a number of “pain points” based on his own experience and that of others he has worked with. He discussed the problems that occur when moving images from applications that only support older IPTC schemas to those that use the newer XMP variety. He made a number of suggestions, some of which will likely resonate:

1. Adopt the Photoshop namespace "Copyright Status" tag as an IPTC Standard. Currently it is not, and in many applications there is no support, requiring photographers to re-enter this information.

2. Figure out a way to manage images from a collections management standpoint. For example, the Document Title field could be expanded to allow the easy recording of all sources of a file by making it a "bag-type" field, similar in sense to the current Keywords field. This would be useful for photographers doing montages, HDR (layering multiple exposures to expand the dynamic range), or stitching images into larger panoramic images.

3. We need a way to express "Parent-Child" relationships with keywords (note: this might more properly be called "hierarchical" relationships). He showed how this could be accomplished in Lightroom simply with the addition of some XMP coding.

<lr:hierarchicalSubject>
<rdf:Bag>
<rdf:li>Sample Keyword|Son of Sample</rdf:li>
</rdf:Bag>
</lr:hierarchicalSubject>

Pipe symbols would be used to separate the levels of the hierarchy, with each set of hierarchical terms stored as a separate list item. Krogh mentioned that he is not sure how to denote synonyms in such a schema, but he assumes “it would not be too hard to design.”

4. Photographers (and others) need a way to express rankings and ratings of photographs within a set. To ensure that this information is available across many applications, one of the few methods currently in use is to enter notes on rankings and ratings in the keywords field. There really needs to be a better place to store this kind of process and handling information. Krogh suggested that a pipe-separated set of terms within a "Collections" field could accomplish this and tag the image in a durable way. This way, you could have ratings as indicated by the photographer, as well as by an editor, distributor, etc.

5. Expanding this concept further, it would be useful to store "Alternate Metadata sets" within a future IPTC schema. These might include information from other users, allow the storage of alternate color/tonal renderings for an image, or deal with color management issues.

6. By saving metadata changes as "layers," there could be both front and back sets that could be grouped and tagged. Similar to the keyword sets above, this could be stored within an XMP schema. There is also the need to protect some of the metadata sets from tampering. This could be handled with some forms of encryption, but there would be a need to establish standards.
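The pipe-separated hierarchy in suggestion 3 is easy to process once the list items have been read out of the XMP. The sketch below assumes the items are already available as plain strings. The function names and the second sample entry ("Daughter of Sample") are hypothetical and were not part of Krogh's talk.

```python
# Sketch: working with pipe-separated hierarchical keywords such as
# "Sample Keyword|Son of Sample". Reading the strings out of the XMP
# packet is not shown; they are assumed to be plain Python strings.

def expand_hierarchy(entry, separator="|"):
    """Split 'Parent|Child' into its levels, top level first."""
    return [level.strip() for level in entry.split(separator) if level.strip()]


def build_tree(entries):
    """Merge several hierarchical entries into one nested dictionary."""
    tree = {}
    for entry in entries:
        node = tree
        for level in expand_hierarchy(entry):
            node = node.setdefault(level, {})
    return tree


def flat_keywords(entries):
    """Collect every level as a flat keyword list without duplicates."""
    seen = []
    for entry in entries:
        for level in expand_hierarchy(entry):
            if level not in seen:
                seen.append(level)
    return seen


entries = [
    "Sample Keyword|Son of Sample",
    "Sample Keyword|Daughter of Sample",  # hypothetical second entry
]

print(build_tree(entries))
# {'Sample Keyword': {'Son of Sample': {}, 'Daughter of Sample': {}}}
print(flat_keywords(entries))
# ['Sample Keyword', 'Son of Sample', 'Daughter of Sample']
```

Flattening the hierarchy this way would let an application that only understands the ordinary Keywords field still find the image by any level of the hierarchy.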


Automated News Image Processing

presented by Simon Span: The Mirror

Span discussed the advantages of using EXIF info within a newspaper workflow, as the Trinity Mirror staff handle image processing for 240 newspapers (500+ media brands in total).

Today, only 10-15 percent of the images they receive at Trinity Mirror have EXIF info. Span stressed that they need to talk to the photographers and agencies that submit images to make sure that they retain EXIF info, and to encourage photographers to enter as much information as possible shortly after the shoot using standard IPTC metadata. At present, there is no single software package that handles this situation well.

They use a typical Picture Desk workflow and deal with approximately 10,000 to 15,000 images per day. They use IPTC and EXIF, but in a proprietary way. They end up having to alter the IPTC metadata as submitted, in order to have consistency in their internal systems.

Today each image must go through several adjustments and conversions, with each conversion causing a loss of quality. Newspaper workflows require automatic conversion. Currently they are converting all images to ColorMatch RGB, but are considering a move to Adobe RGB. Span mentioned that his goal is that they “should make as much use of existing data as possible,” and with that in mind they are looking at what it would take to require only a single color space conversion in the workflow.


Expanding the IPTC Standard for Stock

presented by Jan Leidicke: BVPA / Keystone

Leidicke reported that most stock agencies still use only the older IIM standard for IPTC metadata. Very few have systems that are set up to properly handle XMP-based metadata such as IPTC Core.

For historical images the exact date of creation is typically not known. However, the current Date Created field requires you to specify year, month and day. If the editor enters a bogus month and day just so they can enter a year, this can create problems later in knowing if the day and month are really accurate or just a guess. The way that this field is defined needs to change to accommodate less precise dates.
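One way to picture a less precise date is a value that records only the parts that are actually known. The sketch below is an illustration of that idea, not the IPTC definition of the field. The class name and the truncated year-month-day text form are assumptions made for the example.

```python
# Sketch: a date value that carries its own precision, so that an unknown
# month or day is left out rather than filled in with a guess.
# The class and its text form are illustrative assumptions only.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartialDate:
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def __post_init__(self):
        # A day without a month would be ambiguous, so reject it.
        if self.day is not None and self.month is None:
            raise ValueError("a day requires a month")

    @property
    def precision(self):
        """Return 'year', 'month', or 'day' depending on what is known."""
        if self.month is None:
            return "year"
        if self.day is None:
            return "month"
        return "day"

    def to_text(self):
        """Write only the known parts, e.g. '1923' or '1923-05'."""
        parts = [f"{self.year:04d}"]
        if self.month is not None:
            parts.append(f"{self.month:02d}")
        if self.day is not None:
            parts.append(f"{self.day:02d}")
        return "-".join(parts)


print(PartialDate(1923).to_text(), PartialDate(1923).precision)        # 1923 year
print(PartialDate(1923, 5).to_text(), PartialDate(1923, 5).precision)  # 1923-05 month
print(PartialDate(2007, 6, 7).to_text())                               # 2007-06-07
```

With a representation like this, a later user can tell a date known to the day from one where only the year was known, which is exactly the distinction that a made-up month and day destroys.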

Leidicke also mentioned how the use of the “named people” fields (one of the fields recommended by the Photo Metadata Working Group) could make it easier to find images in which a particular person actually appears.

The present IPTC standard lacks fields for properly expressing model release information, rights and permissions granted, and more. Metadata entered using controlled vocabularies can be easier to translate. It’s also important to enforce standards and always enter information into the proper field. Image buyers expect to find certain kinds of information within specified fields, regardless of image source.


Does “Meta Matter?”

presented by Roger Bacon: Reuters

While Reuters has been a familiar name in the news community for about 150 years, it only began adding photography coverage in the mid-1980s. Bacon asked “Does Meta Matter?” and began his talk by mentioning that Reuters currently has about 600 photographers and image editors working in the photo area, and that they receive about 1,500 images per day. The photographers are required to enter the following fields: headline, caption, category code, urgency, supplemental category code, byline, credit, object name, date created, city, state (USA only), country, and original transmission reference.
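A required-field list like this lends itself to an automated check when images arrive. The sketch below is a hypothetical illustration, not a description of Reuters' actual system. The lowercase dictionary keys, the "USA" country value, and the function name are all assumptions.

```python
# Sketch: checking an incoming record against the required fields listed
# above. Key names, the "USA" country value, and the function itself are
# assumptions for illustration; this is not Reuters' actual system.
REQUIRED_FIELDS = [
    "headline", "caption", "category code", "urgency",
    "supplemental category code", "byline", "credit", "object name",
    "date created", "city", "country", "original transmission reference",
]
USA_ONLY_FIELDS = ["state"]


def missing_fields(record):
    """Return the required fields that are absent or blank in the record."""
    required = list(REQUIRED_FIELDS)
    if record.get("country") == "USA":
        required += USA_ONLY_FIELDS
    return [name for name in required if not str(record.get(name, "")).strip()]


# A record with every required field filled in (placeholder values).
complete = {name: "x" for name in REQUIRED_FIELDS}
complete["country"] = "Italy"

print(missing_fields(complete))                      # []
print(missing_fields(dict(complete, country="USA"))) # ['state']
print(missing_fields(dict(complete, caption="  ")))  # ['caption']
```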

The photographer is asked NOT to add keywords; this is something that they do at the management level. As you can see by the field names indicated above, Reuters is primarily using the older IPTC Information Interchange Module (IIM) standard.

Bacon emphasized the importance of adding metadata early and making sure that your internal systems don’t throw it away. He mentioned that it would be great if the time and date could be automatically updated (as a cell phone does) regardless of where the photographer was in the world. It would also be helpful if photographers could upload other information to the camera. For example, if you could easily enter routing information before image capture, the images could be automatically distributed via selected channels.

Bacon did admit that, at present, Reuters strips EXIF metadata. This is due to one major client reporting that their workflow was “broken” when sent images containing this information.

Reuters adds rights information to the caption and to the special instructions field. But rights management is one important area that needs to be simplified. Bacon referred to what they enter as “Polyhierarchical” data, using something he referred to as Paneikon, or RRPE. He admitted that the information which is entered for news photos doesn’t work well for stock use.

They also have several outstanding issues, such as clients who are asking for RAW files (their photographers only supply JPEGs). MSN, for example, has been asking them to provide square thumbnails that could be formatted automatically for its online news packages.

In closing, Bacon asked, “What are we discarding today that will be invaluable tomorrow?”

Notes

Download The IPTC Photo Metadata White Paper and transcripts of many of the presentations given at the Conference from http://phmdc.org/

You’re also invited to discuss further developments of the requirements and open issues raised at the Conference on an IPTC-moderated forum:
http://groups.yahoo.com/group/iptc-photometadata

This report was prepared June 23, 2007 by David Riecks with assistance from Betsy Reid and James Mulford.
Words and Pictures are ©2007 David Riecks, all rights reserved.