These are my notes from the First International Photo
Metadata Conference, which was held on Thursday, June 7, 2007, in Florence,
Italy. This report is split into three sections corresponding to the
conference viewpoints of photo metadata creators and users, standards
bodies, and implementers.
Part I: This first section deals with photo
metadata creators and users, such as independent and commercial photographers,
small and large picture agencies and libraries, and trade associations
from the photo business.
The Picture Tide is Rising
Andreas Trampe of Germany’s Stern magazine opened the First International
Photo Metadata Conference to a capacity crowd (over 125 people) by illustrating
the challenges of managing a flood of images at a busy picture desk.
Trampe stressed that "we have an excess of images. The search
process has become much too time consuming." Consider that they
receive 25,000 images every weekend and 12,000 each weekday from 12
press agencies. In addition, they have online access to 300 image databases
of an additional 60 million images. From all of these options, they
select just 250 photos for each issue, or about 0.3 percent of all of
the images received.
He then proved the point by sharing the flood of images that had arrived
the day before the conference, during the G8 summit. Trampe
did a search using just the keyword "G8", which returned
an overwhelming 9,000 images. Limiting the search by date to only
the last three days still yielded over 4,000 hits.
Narrowing the search further to images with the keyword "demonstration"
still left him with nearly 800 images to sift through.
Trampe complained about the fact that photographers and agencies "misuse
the IPTC fields, unleashing an orgy of keywords on us." He offered
some amusing examples of the inappropriate images that appear in a search
when photographers and image editors enter erroneous information simply
to increase the chance of someone finding their images in a search.
This keyword and caption "spamming" frustrates image buyers
trying to find an image on a specific topic. Some suppliers will even
"jigger" the IPTC date created field, so that images will
show up in more recent searches. Part of this is a training problem,
as most of those entering this data are self-taught or learn such techniques
from colleagues. The other part of the issue is a lack of standards
and guidelines.
He then demonstrated effective search filters on the Grazia Neri website
by searching for Florence and then limiting the results to the
"Travel/stock" category, showing mostly images of the city itself.
This reduced the field from 3,000 to 600 images
that would be useful to most magazine editors.
Trampe concluded that it may be more worthwhile to expend efforts on
getting better quality image metadata content than on buying better
database software.
Meta-Education Needed
presented by David Riecks
My presentation focused on issues that affect stock photographers who
are embedding metadata into their images. In some cases distributors
may modify metadata without knowing what they are doing. With some distributors
metadata may be lost simply because their workflows haven’t been
tested or reviewed.
Clients often change filenames when they download comps of images.
This makes it difficult for the client to identify the supplier, or
the supplier to find the correct image. Always embedding the unique
filename into the Document Title field is one solution that would be
a great help for all parties in the imaging chain.
There is also a perception among many clients who feel that they can’t
view embedded metadata without opening the image in Photoshop. There
are alternatives, but education is needed.
I reported that the “Save for Web” feature in Photoshop,
by default, still discards all metadata. The default for this option
should be reversed so that metadata is always preserved.
Photographers need ways to insert metadata as early as possible in
the process – preferably at the capture stage. They also need
to validate their own workflows so they know that their metadata is
still in the file before sending it on to the distributor or client. They
also need to check to see if their distributor is changing or retaining
metadata, both in the images sent to clients as well as the preview
images that are displayed on websites.
Further along in the stock image workflow, more issues arise. Beyond the
accidental losses described above, some stock archives and distributors are
intentionally stripping or altering metadata to conform to their own workflows.
Training and education for everyone in the stock imaging chain is the
key to better photo metadata. For additional details, download my presentation
from the conference website at www.phmdc.org.
Identifying “Pain Points”
presented by Peter Krogh
Peter tried to identify a number of “pain points” based
on his own experience and that of others he has worked with (Krogh is
the author of “The DAM Book”). He discussed the problems that occur
when moving images from applications that only support the older IPTC schema
to those that use the newer XMP variety. He made a number of suggestions,
some of which will likely resonate:
1. Adopt the Photoshop namespace "Copyright Status" tag as
an IPTC standard. Currently it is not part of the standard, and many
applications do not support it, requiring photographers to re-enter this information.
2. Figure out a way to manage images from a collections management
standpoint. For example, the Document Title field could be expanded
to allow the easy recording of all sources of a file by making it a
"bag-type" field, similar in sense to the current Keywords
field. This would be useful for photographers doing montages, HDR (layering
multiple exposures to expand the dynamic range), or stitching images
into larger panoramic images.
3. We need a way to express "Parent-Child" relationships
with keywords (note: this might more properly be called "hierarchical"
relationships). He showed how this could be accomplished in Lightroom
simply with the addition of some XMP coding.
<lr:hierarchicalSubject>
  <rdf:Bag>
    <rdf:li>Sample Keyword|Son of Sample</rdf:li>
  </rdf:Bag>
</lr:hierarchicalSubject>
Pipe symbols separate the hierarchically arranged terms, with each set
of hierarchical terms entered as a separate line item. Krogh mentioned
that he is not sure how to denote synonyms
in such a schema, but he assumes “it would not be too hard to
design.”
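As a rough sketch (my own, not from the presentation), here is how an application might expand such pipe-separated keyword values into a keyword tree; the sample values below are hypothetical:

```python
def parse_hierarchical_subjects(items):
    """Expand pipe-separated keyword paths into a nested dict tree.

    "Sample Keyword|Son of Sample" becomes
    {"Sample Keyword": {"Son of Sample": {}}}.
    """
    tree = {}
    for item in items:
        node = tree
        for term in item.split("|"):
            # Walk down one level per term, creating nodes as needed.
            node = node.setdefault(term.strip(), {})
    return tree

subjects = [
    "Sample Keyword|Son of Sample",
    "Sample Keyword|Daughter of Sample",
    "Places|Italy|Florence",
]
tree = parse_hierarchical_subjects(subjects)
# Both children end up under the shared "Sample Keyword" parent.
```

Paths that share a prefix merge under the same parent node, which is exactly the behavior a hierarchical keyword browser would need.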
4. Photographers (and others) need a way to express rankings and ratings
of photographs within a set. To ensure that this information is available
across many applications, one of the few methods currently used is to
enter notes on rankings/ratings in the Keywords field. There really
needs to be a better place to store this kind of process and
handling information. Krogh suggested that a pipe-separated set
of terms within a "Collections" field could accomplish
this and tag the image in a durable way. This way, you could have ratings
as indicated by the photographer, as well as by an editor, distributor,
etc.
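To illustrate the idea, a pipe-separated "role|rating" convention (a hypothetical format, not a published IPTC specification) could be parsed like this:

```python
def parse_ratings(entries):
    """Parse hypothetical pipe-separated "role|rating" pairs, e.g.
    "photographer|4", into a {role: rating} mapping."""
    ratings = {}
    for entry in entries:
        role, value = entry.split("|", 1)
        ratings[role.strip()] = int(value)
    return ratings

# Ratings from different parties in the imaging chain coexist durably.
image_ratings = parse_ratings(["photographer|4", "editor|2"])
```

Because each party writes under its own label, a distributor's rating never overwrites the photographer's.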
5. Expanding this concept further, it would be useful to store "Alternate
Metadata Sets" within a future IPTC schema. These might include
information from other users, allow the storage of alternate color/tonal
renderings of an image, or address color management issues.
6. By saving metadata changes as "layers," there could be
both front and back sets that could be grouped and tagged. Similar to
the keyword sets above, this could be stored within an XMP schema. There
is also the need to protect some of the metadata sets from tampering.
This could be handled with some forms of encryption, but there would
be a need to establish standards.
Automated News Image Processing
presented by Simon Span: The Mirror
Span discussed the advantages of using EXIF info within a newspaper
workflow, as the Trinity Mirror staff handle image processing for 240
newspapers (500+ media brands in total).
Today, only 10-15 percent of the images they receive at Trinity Mirror
have EXIF info. Span stressed that they need to talk to the photographers
and agencies that submit images to make sure that they retain EXIF info,
as well as encouraging photographers to enter as much information as
possible shortly after the shoot using standard IPTC metadata. At present,
no single software package handles this situation well.
They use a typical Picture Desk workflow and deal with approximately
10,000 to 15,000 images per day. They use IPTC and EXIF, but in a proprietary
way. They end up having to alter the IPTC metadata as submitted, in
order to have consistency in their internal systems.
Today each image must go through several adjustments and conversions,
with each conversion causing a loss of quality. Newspaper workflows
require automatic conversion. Currently they are converting all images
to ColorMatch RGB, but are considering a move to Adobe RGB. Span mentioned
that his goal is that they “should make as much use of existing
data as possible” and with that in mind, they are looking at what
it will take to only require a single color space conversion in the
workflow.
Expanding the IPTC Standard for Stock
presented by Jan Leidicke: BVPA / Keystone
Leidicke reported that most stock agencies still only use the older
IIM standard for IPTC metadata. Very few have systems that are set up
to properly handle XMP based metadata such as IPTC Core.
For historical images the exact date of creation is typically not known.
However, the current Date Created field requires you to specify year,
month and day. If the editor enters a bogus month and day just so they
can enter a year, this can create problems later in knowing if the day
and month are really accurate or just a guess. The way that this field
is defined needs to change to accommodate less precise dates.
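One way to accommodate this (a sketch of my own, assuming ISO 8601-style reduced-precision strings such as "1923" or "1923-05") is to record the date at whatever precision is actually known, and let software detect that precision:

```python
import re

# ISO 8601 reduced-precision date: year, optional month, optional day.
PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")

def date_precision(value):
    """Return "year", "month", or "day" for a partial date string,
    or None if the value is not a valid partial date at all."""
    match = PARTIAL_DATE.match(value)
    if not match:
        return None
    _, month, day = match.groups()
    if day:
        return "day"
    if month:
        return "month"
    return "year"
```

An editor could then enter "1923" for a historical image without inventing a bogus month and day, and downstream systems could distinguish a genuine date from a year-only guess.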
Leidicke also mentioned how the use of the “named people”
fields (one of the fields recommended by the Photo Metadata Working
Group) could make it easier to find images in which that person actually
appears.
The present IPTC standard lacks fields for properly expressing model
release information, rights and permissions granted, and more. Metadata
entered using controlled vocabularies can be easier to translate. It’s
also important to enforce standards and always enter information into
the proper field. Image buyers expect to find certain kinds of information
within specified fields, regardless of image source.
Does “Meta Matter?”
presented by Roger Bacon: Reuters
While Reuters is a long-standing name in the news community (about
150 years), it only began adding photography coverage in the mid-1980s.
Bacon questioned “Does Meta Matter?” and began
his talk by mentioning that Reuters’ currently has about 600 photographers
and image editors involved in the photo area, and they are receiving
about 1,500 images per day. The photographers are required to enter
the following fields: headline, caption, category code, urgency, supplemental
category code, byline, credit, object name, date created, city,
state (USA only), country, and original transmission reference.
The photographer is asked NOT to add keywords; this is something that
they do at the management level. As you can see by the field names indicated
above, Reuters is primarily using the older IPTC Information Interchange
Module (IIM) standard.
Bacon emphasized the importance of adding metadata early and making sure
that your internal systems don’t throw it away. He mentioned that
it would be great if the time and date could be automatically updated
(like a cell phone does) regardless of where the photographer was in
the world. It would also be helpful if photographers could upload other
information to the camera. For example, if you could easily enter routing
information before image capture, the images could be automatically
distributed via selected channels.
Bacon did admit that, at present, Reuters strips EXIF metadata. This
is due to one major client reporting that their workflow was “broken”
when sent images containing this information.
Reuters adds rights information to the caption and to the special instructions
field. But rights management is one important area that needs to be
simplified. Bacon referred to what they enter as “Polyhierarchical”
data, using something he referred to as Paneikon, or RRPE. He admitted
that the information which is entered for news photos doesn’t
work well for stock use.
In addition, they have several outstanding issues, such as clients
who are asking for RAW files (their photographers supply only JPEGs).
MSN has also been asking them to provide square thumbnails so
they could automatically format them for their online news packages.
In closing Bacon asked the question, “what are we discarding
today, that will be invaluable tomorrow?”
Notes
Download The IPTC Photo Metadata White Paper and transcripts of many
of the presentations given at the Conference from http://phmdc.org/
You’re also invited to discuss further developments of the requirements
and open issues which have been raised at the Conference on an IPTC
moderated forum:
http://groups.yahoo.com/group/iptc-photometadata