Addie's place on the web...

Just some stuff I want to share with you

Photo tagging and resizing

Sunday 9 August 2020
Jekyll

I ended my photo cleanup post with the statement that I should review how I resize images and handle the Exif tags in them.

That process took over 30 hours to resize the more than 5,000 photos on the site and still resulted in some issues. Over the last few weeks, I switched to another toolset to do the resizing. It now takes less than 1 hour to resize the images and, so far, I have not found any issues with the results.
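To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It uses the rounded figures from this post (5,000 photos, 30 hours, 1 hour), so the results are rough bounds and not measurements; the constant names are made up for this sketch.

```ruby
# Rounded figures from this post; the real values are "more than" 5,000
# photos, "over" 30 hours and "less than" 1 hour.
PHOTOS = 5_000
OLD_HOURS = 30.0
NEW_HOURS = 1.0

old_seconds_per_photo = OLD_HOURS * 3600 / PHOTOS
new_seconds_per_photo = NEW_HOURS * 3600 / PHOTOS
speedup = OLD_HOURS / NEW_HOURS

puts format("old: about %.1f s per photo", old_seconds_per_photo)
puts format("new: about %.2f s per photo", new_seconds_per_photo)
puts format("speedup: roughly %.0fx", speedup)
```

So the old toolset needed somewhere around 20 seconds per photo, and the new one needs well under a second.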

The “old” way of processing

Up till now, I used MiniMagick, MiniExiftool and MultiExiftool to do the resizing and Exif processing. All 3 are Ruby wrappers for ImageMagick and ExifTool, and it is those underlying tools that do the actual work.

ImageMagick and ExifTool are powerful tools that provide lots of functionality that is not used for this website. The wrappers do not add anything to the functionality of these tools. Since the wrappers spawn a new process every time they need to use ImageMagick or ExifTool, the performance is low. Another downside of this approach was that for each photo I had to use all 3 of them in a particular order. Each of them would open the file written by the previous step, do its own little bit of magic, and then update the file or write it to another file. This resulted in lots of IO and overhead. Overall, the performance of this approach was poor.
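To get a feeling for what spawning a process per operation costs, here is a rough sketch that uses only the Ruby standard library. It does not touch ImageMagick or ExifTool at all; it merely compares a trivial computation done in-process with the same computation done in a freshly spawned Ruby interpreter. The constant `RUNS` and the method `tiny_task` are made up for this illustration, and the absolute numbers will differ per machine.

```ruby
require "benchmark"
require "rbconfig"

RUNS = 5

# A deliberately tiny piece of work, so that the measured difference is
# dominated by process startup and not by the work itself.
def tiny_task
  (1..1_000).sum
end

# The work is done inside the current Ruby process.
in_process = Benchmark.realtime { RUNS.times { tiny_task } }

# The same work, but each run starts a new Ruby interpreter. This stands
# in for a wrapper that shells out to an external tool for every call.
spawned = Benchmark.realtime do
  RUNS.times { system(RbConfig.ruby, "-e", "(1..1_000).sum") }
end

puts format("in-process: %.4f s", in_process)
puts format("spawned:    %.4f s", spawned)
```

Multiply that startup cost by several tool invocations per photo and by thousands of photos, and it adds up quickly, even before the extra file reads and writes are counted.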

The “new” way of processing

I started to search for other ways of processing the photos and eventually found libvips.

According to the benchmarks and the speed and memory use information in the libvips wiki pages, it should be much faster than ImageMagick. On top of that, libvips uses libexif to handle the Exif tag processing. As a result, there is no need for multiple tools; this one does it all and should be much faster as well.

The next step was to get it to work. Fortunately, there is a nice Ruby gem to work with libvips: ruby-vips. And there is lots of documentation as well (although you will need to switch between the ruby-vips and libvips documentation every now and then to fully grasp how to get going).

I created an application to process some individual photos and was amazed by the speed increase. Based on that, I updated my plugin. Here’s a snippet of the code that does the actual work:


    # By now, we know what the size of the image should be: @width and @height

    buffer = IO.binread(path)
    vips_image = Vips::Image.thumbnail_buffer buffer, @width, height: @height, size: 'down', linear: false

    # Clean up the metadata/tags.
    #
    # We have a list of tags that we do not want to remove.
    # Start by getting all current tags, then build a list of tags that we don't need anymore

    fields = vips_image.get_fields

    to_remove = fields.difference([
        "exif-data",
        "exif-ifd0-Artist",
        "exif-ifd0-DateTime",
        "exif-ifd2-DateTimeDigitized",
        "exif-ifd2-DateTimeOriginal",
        "exif-ifd2-TimeZoneOffset",
        "exif-ifd3-GPSAltitude",
        "exif-ifd3-GPSAltitudeRef",
        "exif-ifd3-GPSDateStamp",
        "exif-ifd3-GPSLatitude",
        "exif-ifd3-GPSLatitudeRef",
        "exif-ifd3-GPSLongitude",
        "exif-ifd3-GPSLongitudeRef",
        "exif-ifd3-GPSMapDatum",
        "exif-ifd3-GPSSatellites",
        "exif-ifd3-GPSSpeed",
        "exif-ifd3-GPSSpeedRef",
        "exif-ifd3-GPSTimeStamp"
    ])

    # Remove all tags that we don't need.

    to_remove.each do |field_name|
        vips_image.remove field_name
    end

    # Add a copyright and artist Exif tag to the image.
    #
    # We already have an @owner that holds the name of the person that took the picture,
    # but we need to find out when the picture was taken.
    # 
    # There are several Exif tags that we can use for that.
    # Once we have found one, we can construct the actual copyright and artist tag
    # and add them to the image.

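    # Note: GPSDateStamp is used here rather than GPSTimeStamp, because
    # GPSTimeStamp only holds the time of day and does not start with a year.
    date_fields = [
        "exif-ifd2-DateTimeOriginal",
        "exif-ifd2-DateTimeDigitized",
        "exif-ifd3-GPSDateStamp",
        "exif-ifd0-DateTime"
    ]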

    date_fields.each do |field_name|
        if vips_image.get_typeof(field_name) > 0
            field = vips_image.get field_name
            year = field[0,4]
            copyright = "Copyright #{year} by #{@owner}"

            vips_image.set_type Vips::REFSTR_TYPE, "exif-ifd0-Artist", @owner
            vips_image.set_type Vips::REFSTR_TYPE, "exif-ifd0-Copyright", copyright
            break
        end
    end

    # Write the final result to the destination path

    vips_image.write_to_file dest_path, strip: false, Q: 95
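The tag handling in the snippet above can be tried out without libvips. The sketch below is a minimal stand-in that uses a plain Hash instead of a `Vips::Image`. The sample field names and values, the shortened keep list, and the helper names `fields_to_remove` and `copyright_for` are assumptions made up for this illustration. It also assumes that the date values start with a four-digit year, as in the `YYYY:MM:DD HH:MM:SS` Exif date format.

```ruby
# Stand-in for the metadata of an image: field name => value as a string.
# These names and values are invented for the example.
SAMPLE_FIELDS = {
  "exif-ifd0-Make" => "ExampleCam",
  "exif-ifd0-Software" => "ExampleEditor 1.0",
  "exif-ifd0-DateTime" => "2020:08:09 10:15:00",
  "exif-ifd2-DateTimeOriginal" => "2019:07:21 16:42:10"
}.freeze

# Shortened keep list; anything not in here gets removed.
KEEP = [
  "exif-ifd0-Artist",
  "exif-ifd0-DateTime",
  "exif-ifd2-DateTimeOriginal"
].freeze

# Checked in this order; the first field that is present wins.
DATE_FIELDS = [
  "exif-ifd2-DateTimeOriginal",
  "exif-ifd0-DateTime"
].freeze

# Everything that is present but not on the keep list.
def fields_to_remove(field_names, keep)
  field_names - keep
end

# Build the copyright text from the first date field that is present,
# or return nil when the image carries no usable date.
def copyright_for(fields, date_fields, owner)
  name = date_fields.find { |field_name| fields.key?(field_name) }
  return nil unless name

  year = fields[name][0, 4]
  "Copyright #{year} by #{owner}"
end

to_remove = fields_to_remove(SAMPLE_FIELDS.keys, KEEP)
copyright = copyright_for(SAMPLE_FIELDS, DATE_FIELDS, "Example Owner")

puts to_remove.inspect
puts copyright
```

The `-` operator used here gives the same result as `Array#difference`, which is only available from Ruby 2.6 onwards.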

Then it was time to go for it: resize all photos. It took less than 1 hour to do all of them. Initially I did not trust what I saw; this was much too quick. After lots of checking, everything turned out to be okay.

So, here we are. All photos have been resized and all Exif tags have been cleaned up.

I'm happy with yet another improvement. Time to move on to the next item on the to-do list.


Want to respond to this post?
Look me up on Twitter, Facebook or LinkedIn.