There are a number of myths and misconceptions surrounding the practice of embedding information such as IPTC, IPTC-IIM, XMP or even Exif into a digital image file (such as JPEG, TIFF, Photoshop, DNG and other Raw files). There are a number of applications and utilities which can do this easily and safely, but first, let's take a look at the list.
Let me stress, if it wasn't clear before, that the statements in the list are common misunderstandings about photo metadata, meaning that they are not correct. If you want to understand why, read on for a short summary of each (numbered 1 to 12 below), as well as links to other resources that should help you understand.
An illustration of the "Metadata" overhead discussed in item 3 below:
Using the All Metadata setting: "on disk" size is 39.2 KB. View metadata list for drp2091169-sfw-q60-all-wicc.jpg
Using the Copyright Metadata setting: "on disk" size is 29.4 KB. View metadata list for drp2091169-sfw-q60-copyright.jpg
By purging nearly all metadata with the exception of the copyright notice, you can save 9.8 KB in this specific instance. Check out the lists above to see just how much data is being stored in each image, and what additional information can be stored in that extra 9.8 KB. The same image saved using the None setting in Save for Web takes up 27.7 KB on disk, so adding the Copyright Notice metadata alone only adds 1.7 KB of data above that used by the image and the ICC profile in this instance. Keep in mind that Photoshop's "Save for Web & Devices" only stores the metadata in the XMP format, and includes more information than the type stored using the legacy IPTC-IIM format.
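The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: the three sizes come from the illustration above, while the function and variable names are made up for this example.

```python
# On-disk sizes in KB, as reported for the illustration above.
SIZE_ALL_METADATA = 39.2    # Save for Web with the "All" metadata setting
SIZE_COPYRIGHT_ONLY = 29.4  # Save for Web with the "Copyright" setting
SIZE_NO_METADATA = 27.7     # "None" setting: pixels plus ICC profile only


def overhead_kb(with_metadata, baseline):
    """Extra KB attributable to metadata, rounded to one decimal place."""
    return round(with_metadata - baseline, 1)


def overhead_percent(with_metadata, baseline):
    """Metadata overhead as a percentage of the baseline file size."""
    return round(100 * (with_metadata - baseline) / baseline, 1)


saved_by_purging = overhead_kb(SIZE_ALL_METADATA, SIZE_COPYRIGHT_ONLY)
copyright_cost = overhead_kb(SIZE_COPYRIGHT_ONLY, SIZE_NO_METADATA)
copyright_cost_pct = overhead_percent(SIZE_COPYRIGHT_ONLY, SIZE_NO_METADATA)

print(saved_by_purging)    # 9.8
print(copyright_cost)      # 1.7
print(copyright_cost_pct)  # 6.1
```

In other words, keeping just the copyright notice costs roughly 6 percent on this small web image. The percentage would be far lower on a high-resolution original.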
1. Embedded metadata is hard to read
There are many imaging applications and utilities, such as Adobe Photoshop, Bridge, Lightroom, Expression Media, Photo Mechanic and others, that make it easy to both read and enter various forms of photo metadata into your images. Reading metadata that is contained in images is much easier than embedding it, and can be done with a number of free utilities. On Mac OS X, common choices are Apple Preview and Spotlight. On the Windows platform you can use IrfanView or Microsoft Pro Photo Tools, and those using Windows 7 will find a basic set of image data available directly in Windows Explorer. It's even possible to read the metadata from images on the web using an online service, like the one built by Jeffrey Friedl that leverages Phil Harvey's ExifTool, which can show you all sorts of information in your image files, including GPS. This particular tool can even be installed in the toolbar of many popular internet browsers, so that revealing this information is only a one-click operation.
2. Embedded metadata will always be there
Unfortunately, there is no way to "lock" your embedded photo metadata into your images. The closest you will find to a locking mechanism is in the METAmachine application, which prevents the user from changing the Creator and Copyright Notice fields if there is a previous entry. The embedded metadata in a digital image is fragile, and some applications either don't know about, or don't respect, the work that was done to store this information along with the image pixels. In some cases, simply uploading an image to a website, or having it processed online to a different size, might result in a partial or total loss of metadata (see item 5 below).
3. Embedded metadata adds a lot of disk space overhead
A reasonable amount of information about an image takes up surprisingly little disk space, as it's mostly plain text. Various outfits that sell or give away software to "strip" your images of metadata, so they will be "leaner and meaner" on the Internet, like to perpetuate this meme, claiming that the addition of embedded photo metadata adds a large amount of "overhead" to a file. In reality, unless you are filling in every single metadata field in Photoshop CS5 or Lightroom 3 (which include the IPTC Extension), the additional disk space required to store that data in your original high-resolution files will be a very tiny fraction of the space used by the pixels in your image. In most cases, adding basic copyright, creator, contact info and a three or four sentence caption will only add about 2 to 4 KB to your file size. For a 20 or 30 MB TIFF file that's infinitesimal by comparison. See the illustration above for one example of what to expect for much smaller images being saved for use on the web.
4. Embedded metadata can be read by the search engines
There is no evidence, from the tests I've conducted or from other reports I've seen, that would lead me to believe that the various embedded metadata schemas (IPTC, XMP, or Exif) are being read or used by the major search engines (i.e. Google, Bing, and Yahoo). It is possible that they may be reading this information, but so far there is no reason to assume that it is being used as part of their ranking algorithms. That doesn't mean that embedding metadata is not a good practice. It's just that, at this point in time, it has little value for those wishing to enhance their Search Engine Optimization (SEO). If you are interested in more details on this issue, see the article Why Embedded Photo Metadata Won't Help Your SEO (at least without some help).
5. Images uploaded to social media/photo sharing sites will retain my embedded metadata
As of late 2010, over half of the various social media or photo sharing sites either remove all embedded metadata on upload, or remove it from images that are processed to intermediate preview and thumbnail images. For details on that issue, see the Controlled Vocabulary Survey regarding the Preservation of Photo Metadata by Social Media Websites. Users need to test and verify their online services to ensure that metadata is preserved. If it is not, they need to ask their services why they are not preserving photo metadata. As stressed in the Metadata Manifesto, "systems need to preserve ownership metadata by default and discourage removal of other metadata by warning users about the legal implications of removal."
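One rough way to run that kind of test yourself is to upload an image, download the copy the service hands back, and see whether your name or copyright notice still appears in the file. The Python sketch below does this with a plain byte search. It is deliberately crude: it assumes the notice is stored uncompressed in an ASCII-compatible encoding, so it is best used with a plain ASCII string, and the function names are hypothetical. A dedicated reader such as ExifTool gives a far more reliable answer.

```python
from pathlib import Path


def notice_survives(image_path, notice):
    """Return True if the notice text appears in the file's raw bytes."""
    data = Path(image_path).read_bytes()
    return notice.encode("utf-8") in data


def check_round_trip(original_path, downloaded_path, notice):
    """Compare an original file with the copy downloaded from a service."""
    if not notice_survives(original_path, notice):
        return "not in original"  # nothing to lose: embed the notice first
    if notice_survives(downloaded_path, notice):
        return "preserved"
    return "lost"
```

For example, `check_round_trip("original.jpg", "downloaded.jpg", "Jane Doe")` returns "lost" when the name is present in the original but absent from the downloaded copy (both file names and the name are placeholders).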
6. Removing embedded metadata is against the law
This is a tricky subject. There are lots of different "fields" within the various embedded metadata types or schemas mentioned above. If you are the owner of the image, it's up to you what to include or edit. If the image is one you are managing for someone else, or simply using, then you should be careful about what you edit or remove. Some of the fields, such as the Copyright Notice, Source, Creator, and Contact Info, comprise what is referred to as "Copyright Management Information", and removal of these is against the law in the United States under the Digital Millennium Copyright Act (DMCA). Other jurisdictions may have similar laws, so you might want to check with an Intellectual Property attorney before making changes to embedded metadata in images that don't belong to you. Removal of other fields, such as Caption/Description, Title, Headline, or Keywords, isn't necessarily going to land you in legal hot water, but it will make it harder for those who may be legitimately using a digital file to find it, or to know what is going on in the image.
7. All embedded metadata is the same
There are actually many different types of photo metadata which peacefully co-exist (for the most part) in your digital images. They make up an alphabet soup of acronyms such as IPTC, IPTC-IIM, XMP or even Exif. Some of these, like Exif, are auto-generated, while the rest are mostly "user-entered" (though that process can be done in batch-mode operations on hundreds or thousands of images at a time). Some metadata, like IPTC-IIM, is stored in a binary form, while other types, like XMP, are written in a form more similar to the HTML of this web page. Thus it may be possible to have the name of the photographer stored three times in the same image: in Exif, in IPTC-IIM and in XMP (IPTC Core), and it may appear the same to you regardless of where the data was entered. There are a few fields that are "shared" between the different schemas, so you could say that those are the same, but this is only the case for a few (the IPTC Core schema shares a few fields with Dublin Core, and the IPTC Extension schema shares a couple of fields with PLUS; see the Metadata Field Guide if you want to know which). Member companies of the Metadata Working Group are working to make sure that the information in these various schemas can easily interoperate regardless of where and how it is stored.
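To see how these types sit side by side, it can help to look at how a JPEG file stores them: each lives in its own "application segment" near the start of the file, tagged with an identifying prefix. The Python sketch below walks those segments and reports which containers are present and how many bytes each occupies. It covers JPEG only and stops at the start of the compressed pixel data. The function name is made up for this example, and the APP13 label is a simplification, since that Photoshop block usually holds the IPTC-IIM data but can carry other resources too.

```python
import struct

# (marker byte, identifying prefix) -> label for each metadata container.
SIGNATURES = {
    (0xE1, b"Exif\x00\x00"): "Exif",
    (0xE1, b"http://ns.adobe.com/xap/1.0/\x00"): "XMP",
    (0xED, b"Photoshop 3.0\x00"): "Photoshop block (usually IPTC-IIM)",
}


def find_metadata_blocks(jpeg_bytes):
    """Return a list of (label, size_in_bytes) for metadata segments."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    found = []
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # lost sync with the segment structure: stop, don't guess
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            break  # start of scan: compressed pixel data follows
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        for (sig_marker, prefix), label in SIGNATURES.items():
            if marker == sig_marker and payload.startswith(prefix):
                found.append((label, length + 2))
        pos += 2 + length
    return found
```

Calling `find_metadata_blocks(open("photo.jpg", "rb").read())` on a file of your own (the file name is a placeholder) shows at a glance whether the photographer's name could be sitting in three separate places.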
8. Picasa (or iPhoto) writes all my captions and keywords into my embedded metadata as soon as I enter them
The simple truth is that whether or not this happens depends a lot on the file format and the program used. While Picasa is a very useful program (and one I recommend to most of my family members), it is designed to work primarily with JPEG images. Version 3.8 of Picasa can read a number of the various metadata fields in JPEG images, and even a few fields in TIFF files, but it can only write your captions and keywords to JPEGs at present. All the versions of iPhoto I've looked at up till now do not write your caption info into the image when you enter the caption; that is only done at the time when you "export" the image, and only to specific file formats. If you aren't sure what is being done and care to verify, you can check the files after you add info (and before you export) by using the online tool built by Jeffrey Friedl that leverages Phil Harvey's ExifTool.
9. Adding copyright and contact information to images on the Internet makes websites load slowly
Metadata does take up some storage space in a digital file. However, when compared to the image pixels, it's generally quite small for most high-resolution images from digital SLR cameras. When you make the image sizes smaller, the space occupied by the pixels may shrink dramatically, while the space to store the metadata does not change at all (unless you opt to remove some of the information). If you are displaying a number of small thumbnails (say, less than 150 pixels on the long dimension), and each of those thumbnails has the full set of metadata you had embedded in your original high-resolution image, then it could be that the metadata takes up more space than the pixels. So if you had a page of, say, 500 to 1000 thumbnails, the addition of that embedded photo metadata could increase the overall load time of the page.
However, that same amount of metadata in a 600 pixel wide preview might only increase that file size by 1 or 2 percent, which really isn't nearly as big a deal as the lean and mean purists would have you believe. So if you are concerned about the speed of your website, it may be worth testing before removing all metadata from your thumbnails or preview images. The difficulty at present with that idea is that the tools used to manage and remove metadata are very simple and typically don't offer fine-grained solutions that would allow you to easily remove specific metadata fields while leaving others. In addition, removal of critical fields (like the Copyright Notice, Creator, or Contact Info fields) will make it difficult, if not impossible, for others to know where that image came from once it's removed from its original location or downloaded from the Internet. If some version of an Orphan Works bill passes in the near future, you may wish you hadn't pared your images down by removing all metadata, especially if you see others using your images without your permission.
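A back-of-the-envelope model makes the thumbnail-versus-preview contrast concrete. In the Python sketch below, the 2 KB of metadata is the low end of the 2 to 4 KB range quoted in item 3. The 4 KB thumbnail and 120 KB preview pixel sizes are purely hypothetical values chosen for illustration, so substitute measurements from your own site.

```python
def metadata_share(pixel_kb, metadata_kb):
    """Fraction of a file's total size taken up by embedded metadata."""
    return metadata_kb / (pixel_kb + metadata_kb)


def extra_page_weight_kb(image_count, metadata_kb):
    """Extra KB a page transfers when every image keeps its metadata."""
    return image_count * metadata_kb


METADATA_KB = 2     # basic copyright, creator, contact info and caption
THUMBNAIL_KB = 4    # hypothetical pixel data for a small thumbnail
PREVIEW_KB = 120    # hypothetical pixel data for a 600 pixel wide preview

thumbnail_pct = round(100 * metadata_share(THUMBNAIL_KB, METADATA_KB), 1)
preview_pct = round(100 * metadata_share(PREVIEW_KB, METADATA_KB), 1)

print(thumbnail_pct)                           # 33.3
print(preview_pct)                             # 1.6
print(extra_page_weight_kb(500, METADATA_KB))  # 1000
```

Under these assumptions the metadata is a third of each thumbnail file but under 2 percent of each preview, and a page of 500 thumbnails carries about 1000 KB of extra data.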
10. Adding embedded photo metadata, like copyright and contact information, is difficult to do and time consuming
Whether or not adding embedded photo metadata is painful has everything to do with the application you use. If you are using a free application or utility that doesn't allow for storing values, or sets of values, in a template, and instead requires you to type in each entry one character at a time, then it will be time consuming. However, there are a number of professional applications that make it easy to save repetitive information, like your name, copyright notice, contact info, etc., into metadata templates, and even to apply this information in batch operations. See the various "Meta-tutorials" on the PhotoMetadata site to see how easily this can be done. Some of the tutorials even have video versions if you want to take a break from reading. If you are comfortable with "Command-Line" applications, there are free utilities such as ExifTool that can add or modify the metadata in a batch of files in a very short time. The International Press Telecommunications Council (IPTC) has posted a list of various Software Applications that support the IPTC-IIM, IPTC Core and IPTC Extension metadata schemas that is worth investigating as well.
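As a rough idea of what the command-line route looks like, the Python sketch below assembles an ExifTool command that writes a creator name and copyright notice into every JPEG in a folder, in both the IPTC-IIM and XMP forms. It assumes ExifTool is installed and on your PATH. The function names, folder name and sample values are placeholders, and the choice of these four tags is just one reasonable minimum rather than a recommended template.

```python
import shutil
import subprocess


def build_exiftool_command(folder, creator, copyright_notice):
    """Build an ExifTool command that stamps every JPEG in a folder."""
    return [
        "exiftool",
        f"-IPTC:By-line={creator}",                   # IPTC-IIM creator
        f"-IPTC:CopyrightNotice={copyright_notice}",  # IPTC-IIM copyright
        f"-XMP-dc:Creator={creator}",                 # XMP (Dublin Core)
        f"-XMP-dc:Rights={copyright_notice}",         # XMP (Dublin Core)
        "-ext", "jpg",                                # only touch JPEG files
        folder,
    ]


def stamp_folder(folder, creator, copyright_notice):
    """Run the command, failing early if ExifTool is not installed."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool was not found on PATH")
    command = build_exiftool_command(folder, creator, copyright_notice)
    return subprocess.run(command, check=True)
```

Calling `stamp_folder("shoot", "Jane Doe", "(c) 2010 Jane Doe")` would process the whole folder in one pass. By default ExifTool keeps a backup of each file it changes, with `_original` added to the file name, so a trial run on a copy of your folder is low risk.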
The real reason to take the time to embed photo metadata — especially copyright and contact info — is that it provides a "trail of breadcrumbs" for tracking down the source of an image. Placing a credit line or other type of ownership information below an image on a web page is all well and good. However, as soon as someone "right-clicks & downloads the image" that contextual information surrounding the image on the page is gone and lost forever. Metadata that is embedded in the image can travel along with the image, regardless of where it goes.
11. Metadata is always stored inside the image file (OR Metadata is always stored outside the image file)
There is no set answer to this question, so it's important to understand what the application you are using does with your information and where it's stored.
First of all, it's not always possible for the information to be stored inside the image, as not all image file formats support the embedding of metadata. JPEG, TIFF, Photoshop (PSD), and Digital Negative (DNG) files do allow for the embedding of metadata and are widely supported. Proprietary RAW file formats may allow you to embed metadata, but not all applications can or will do this. A number, such as those in the Adobe Creative Suite and Lightroom, will instead create a small text file that has the same name as the image file, but with a .XMP extension. Other applications, when instructed, will write the data out to some other kind of text file. This is what Apple's Final Cut Server does: it saves information from its database in XML form, in a companion text file stored with the image or video.
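Because that companion file (often called a "sidecar") simply shares the image's name, it is easy to check whether one exists for a given Raw file. The Python sketch below shows the naming rule. The function names are hypothetical, and the lowercase `.xmp` spelling is an assumption that matters on case-sensitive file systems.

```python
from pathlib import Path


def sidecar_path(image_path):
    """Return where the XMP companion file for an image would be."""
    return Path(image_path).with_suffix(".xmp")


def has_sidecar(image_path):
    """Return True if a companion .xmp file sits next to the image."""
    return sidecar_path(image_path).is_file()
```

For a Raw file at `shoot/IMG_0001.CR2` (a made-up path), `sidecar_path` gives `shoot/IMG_0001.xmp`. If you move or rename a Raw file by hand, the matching sidecar has to travel with it, or the metadata is left behind.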
Software applications may reference the information in the image, or they might store it in their own internal database. Image browsers such as Adobe Bridge, Photo Mechanic, Breeze Browser and FotoStation immediately write your metadata to the image and have to read it from the image when you search (though some browsers, like Bridge, can "cache" the information locally). Browsers show you the images that are in a specific folder at that time; but have limited use once the drive or media on which they are located is no longer connected or accessible.
Contrast that with image cataloging programs, which know where your images are located when they are fed into the application. Many cataloging applications, such as Apple Aperture, Adobe Lightroom, Phase One Expression Media, Canto Cumulus, or Extensis Portfolio, will read in existing metadata from a digital image file and store it in their own local database. Any information (metadata) you enter will be stored in that local database, along with a note on the path to where the original file was first encountered. That is why all of these applications recommend that you do not move the files by any means outside the program itself. Some of these applications will allow you to synchronize the information in the internal database with the original file (and some, like Lightroom, can be set to do this automatically). Some will only add the metadata at the time that you export a version of that file to share (and only if that file format supports embedded metadata). In some instances (such as with video, or other formats which don't support embedded metadata), it is appropriate that the metadata is only stored in the database, as there is no way to embed it in the file. However, storing information in an internal database is by no means the only way.
12. The topic of metadata is beyond the understanding of the everyday user
If you've made it this far, then you already know and understand a lot more about metadata than many photographers or image users. If you make images with a digital camera, then you have probably learned that a better understanding of embedded photo metadata will make it easier for you to store, find and share your images, now and in the future. If you'd like to know more, please take a look at some of the other sections of this website, and be sure to visit Photometadata.org.
Many thanks to Richard Wagner, Bob Stromberg, and others from the Controlled Vocabulary forum who contributed ideas for this article.
Initial posting: November 18, 2010