Ask LH: How Can I Deal With All My Duplicate Photos?

Hey Lifehacker, My archive of digital photos stretches back a dozen years and is 70GB in size. I recently combined all my photo files onto my new laptop and have quickly realised that in some cases I have three or more copies of the same photos in different folders. Are there any programs that can analyse and match photos that appear more than once? I have spoken with all my friends and they all suffer from this common problem. Thanks, Picture Perfectionist

Dear PP

As luck would have it, Ask LH answered this exact same query last year. I’ve included the original post below for your convenience:

Sorting out your digital photo archive can be a real nightmare. The first step is to sweep away all the clutter by minimising the duplicates. In addition to making things less messy and more streamlined, this will also free up hard drive space, which is never a bad thing.
DupliFinder is an old workhorse of a de-dupe application that lets you compare and delete duplicate images on Windows machines. It sifts through image files pixel by pixel and tells you if they’re a match, which makes it perfect for cleaning up your photo library.
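If your duplicates are straight copies of the same file sitting in different folders, you can get a rough idea of the scale of the problem with a few lines of Python. To be clear, this is not how DupliFinder works: instead of comparing pixels, the sketch below swaps in a simpler technique and groups files by a SHA-256 hash of their contents. The function names and the list of extensions are assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict

# Assumed set of photo extensions; adjust to match your own archive.
PHOTO_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".bmp"}


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group photo files under `root` whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            if os.path.splitext(name)[1].lower() in PHOTO_EXTENSIONS:
                path = os.path.join(folder, name)
                by_digest[file_digest(path)].append(path)
    # Only digests shared by two or more files are duplicates.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

The function only reports groups of matching files and never deletes anything. It will also miss copies that have been resized or re-saved, since those no longer share the same bytes.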
VisiPics is another option that’s worth checking out. It scans the photo content of each image file and then groups matches together. You can also adjust the match intensity via a sliding scale which is handy if you want to delete very similar photos that were taken in the same location. VisiPics can take its sweet time to complete a scan, but it’s capable of chugging away in the background without slowing down your computer.
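To get a feel for what a "match intensity" slider might be doing, here is a minimal sketch of one common approach to finding similar (rather than identical) images, usually called an average hash. VisiPics' actual algorithm isn't described here, so treat this purely as an illustration. It assumes each photo has already been shrunk to a tiny grayscale grid of brightness values (0 to 255) by an image library of your choice; the function names and the `max_distance` parameter are hypothetical.

```python
def average_hash(pixels):
    """Turn a small grayscale grid (rows of 0-255 values) into a tuple of bits.

    Each bit records whether a pixel is brighter than the grid's mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must be the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_similar(pixels_a, pixels_b, max_distance=5):
    """Return True when the two thumbnails differ in at most `max_distance` bits.

    `max_distance` plays the role of a strictness slider: 0 demands identical
    hashes, larger values accept looser matches.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance
```

Raising `max_distance` catches more near-identical shots but also increases the chance of flagging photos you would rather keep, so review the matches before deleting anything.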
Windows users can also employ Duplicate Commander to remove the extra copies and replace them with hard links. Just be mindful that the tool may remove different images that have the same file name, so use with caution. Windows, Mac and Linux users can also check out Duplicate File Searcher.
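The file name caveat is worth taking seriously: two cameras can both produce a file with the same name that holds completely different pictures. The sketch below shows one cautious way to replace a duplicate with a hard link, checking the contents byte by byte first. It is not a description of how Duplicate Commander behaves; the function name and the `.linktmp` suffix are assumptions for illustration, and hard links only work when both files live on the same volume.

```python
import filecmp
import os


def replace_with_hard_link(original, duplicate):
    """Replace `duplicate` with a hard link to `original` if contents match.

    Returns True when the link was made, False when the files differ
    (for example, two different photos that share a file name).
    """
    if os.path.samefile(original, duplicate):
        return False  # Already the same file on disk; nothing to do.
    # shallow=False forces a byte-by-byte comparison instead of trusting metadata.
    if not filecmp.cmp(original, duplicate, shallow=False):
        return False
    temporary = duplicate + ".linktmp"  # Hypothetical temporary name.
    os.link(original, temporary)        # Create the new link first...
    os.replace(temporary, duplicate)    # ...then swap it over the duplicate.
    return True
```

Creating the link under a temporary name before swapping it into place means the duplicate is never deleted unless its replacement already exists.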
Once you’ve banished your unwanted duplicates, it will be time to rename what’s left in a bid to make photo management more, er, manageable. There are plenty of batch rename apps that can help in this department. Examples we’ve looked at in the past include Rapid Streams (Windows), Name Changer (Mac) and Bulk Rename Utility (Windows). OS X also comes with a built-in Automator tool that can accomplish the same results.
You can then use a file management application like TeraCopy or the cross-platform Ultracopier to quickly move all your photos into the desired directories.
Now that things are slightly less terrifying, you may want to invest your time in a digital photo organiser.

And here are some additional reader tips from the same post:

Grayda: If you have Adobe Bridge, Picasa or a similar photo management app, you can sort them into various categories, such as by date (useful for finding holiday photos), by keyword (I started to do this, but with 300GB of photos, it was going to take a while 😐 ), or even location if your camera records GPS info.
My advice would be to start small — do one or two sets of photos at a time and keep going until you reach the end. It’s not going to be easy, it’s not going to be fun, and it’s certainly not going to be quick, but you’ll feel relieved when it’s done.
And for future photos, cut them off the card (not copy) so that you don’t get those kinds of duplicates, and de-dupe the similar looking photos before you move them to your computer.

pb12in: I file all my photos by Year > Month > Event. I don’t bother renaming the files, I just make sure that the ‘event’ folder’s name is descriptive enough so I can later search for something like ‘birthday’ and the results will quickly allow me to find what I’m after. It’s not a perfect solution, but is quick, easy and doesn’t tie me to a third party program or muck with the directory.
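For anyone who wants to script this layout, here is a small sketch of the Year > Month > Event idea, plus a search over the event folder names. It makes a couple of assumptions that pb12in doesn't specify: months are written as two-digit numbers, and the function names are invented for illustration.

```python
from datetime import date
from pathlib import Path


def event_folder(root, taken_on, event):
    """Return the Year/Month/Event folder for a photo taken on `taken_on`."""
    return Path(root) / f"{taken_on.year:04d}" / f"{taken_on.month:02d}" / event


def find_events(root, keyword):
    """List event folders (third level down) whose name contains `keyword`."""
    keyword = keyword.lower()
    return sorted(
        folder for folder in Path(root).glob("*/*/*")
        if folder.is_dir() and keyword in folder.name.lower()
    )
```

Because the search only looks at folder names, it keeps working no matter which photo viewer you end up using.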

Thomas The Tanked Engine: I’m in the middle of doing this now. So far:

  • Used VisiPics to find duplicates. Seems like a pretty cool program, like the fact you can flag files and folders to ignore and it remembers.
  • Discovered that I was creating my own dupes by manually backing up my mobile’s photos as well as using the Dropbox auto camera upload feature. Stopped doing the manual backup.
  • Stop caring about individual filenames, even if some photos have identical filenames because they are from different devices. It doesn’t matter. VisiPics does pixel compare, filename is irrelevant (as long as I’m not merging folders I guess?)
  • Start caring about very useful folder names for groups of photos (e.g. holidays, weddings, any sort of event when multiple shots taken). Start with date (e.g. YYYY-MM-DD), basic description (e.g. Christmas in Brisbane) and even perhaps a notable highlight (e.g. Tom Got Bit By Shark), to remind me what these shots were – as I get older one Christmas may be like another.
    So I start getting folders like:
    2008-12-26 Christmas In Brisbane (Tom Got Bit By Shark)
    2008-12-31 New Years At Home (Recycling Bin Set On Fire)
    I also made a separate folder for “great photos”. When you’re taking hundreds if not thousands of digital photos, you want a highlights reel! These are just copies of the original which still sits in the source folder (I exclude the ‘Great Photos’ from visipics dupe search).
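The folder naming scheme above is easy to generate consistently. This is only a sketch of the pattern in the examples, and the function name `event_folder_name` is made up for illustration.

```python
from datetime import date


def event_folder_name(taken_on, description, highlight=None):
    """Build a name like '2008-12-26 Christmas In Brisbane (Tom Got Bit By Shark)'."""
    name = f"{taken_on.isoformat()} {description}"
    if highlight:
        name += f" ({highlight})"
    return name
```

Leading with a YYYY-MM-DD date has a handy side effect: sorting the folders alphabetically also sorts them chronologically.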

BETLOG: I use a date-time-lens-tag based directory structure, and have my cameras set to produce unique filenames, but for various reasons additional fallbacks are often required.
I suspect the simplest way to ensure that all filenames are unique is to pull the EXIF DateTimeOriginal value, then adjust the syntax to be filesystem friendly for your OS, maybe add a numeric value to indicate several shots taken in that same second… and rename the file as that. Something that could be done automatically/programmatically, but which I don’t think is available in anything right now.
If you use a time format ending with a UTC reference you can even ensure that international travel and timezone hopping doesn’t make a logic mess.
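BETLOG's idea can be sketched in a few lines of Python. The sketch assumes the DateTimeOriginal text has already been read from each photo by an EXIF-capable tool (that step isn't shown). EXIF stores this value as `YYYY:MM:DD HH:MM:SS` with no time zone attached, so the UTC offset has to be supplied separately; the `utc_offset_hours` value, the function name and the output layout are all assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"  # Layout of the EXIF DateTimeOriginal string.


def unique_names(shots, extension=".jpg"):
    """Map (DateTimeOriginal, utc_offset_hours) pairs to unique file names.

    Times are converted to UTC; shots taken within the same second get
    increasing sequence numbers in the order they were supplied.
    """
    seen = Counter()
    names = []
    for exif_text, utc_offset_hours in shots:
        local = datetime.strptime(exif_text, EXIF_FORMAT)
        local = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
        # Colons are avoided because some filesystems reject them in names.
        stamp = local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")
        seen[stamp] += 1
        names.append(f"{stamp}_{seen[stamp]:02d}{extension}")
    return names
```

For example, two shots stamped `2008:12:26 14:30:05` at UTC+10 would come out as `2008-12-26T04-30-05Z_01.jpg` and `2008-12-26T04-30-05Z_02.jpg`, and a photo taken at the same instant in another time zone would simply get the next sequence number.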

grantguest: Just. Get. Lightroom. It’s cheap as chips and worlds better than anything else.

swimwiz: I have thousands of photos and keep adding to them with scans of shots going back to the turn of the 1900s. I have all my digital photos named by year, month, day, hour, minute, second, then sequence number, as: yyyy-mm-dd-hh-mm-sec-seq#. This is done straight from the camera download, or with a file renaming program for older digital photos, using data from the original file.
With scans of old photos they are given the same treatment from the day they are scanned.
After that, I sort them into folders by event or by the person or era I am doing them for. It is a never-ending job, but with photo viewers it is easy to pick out what you need and place it in a new folder appropriate for the sorting.
You should see the eyes light up when I show old friends photos they were in years and years ago.
Makes your collection of photos and sorting them worth the effort.

As always, if any readers have their own photo organisation tips and suggestions, please let PP know in the comments section below.


Got your own question you want to put to Lifehacker? Send it using our contact form.

