diff --git a/content/post/20230303_transcribe/index.md b/content/post/20230303_transcribe/index.md
index 64b47ed..8c2d2ca 100644
--- a/content/post/20230303_transcribe/index.md
+++ b/content/post/20230303_transcribe/index.md
@@ -140,4 +140,21 @@ ffmpeg -y -ss 01:23:45 -i input.webm -frames:v 1 -q:v 2 output.jpg
 A markdown document is generated by inserting each paragraph in turn.
 A screenshot is inserted as well, *unless* it is too similar to the last inserted screenshot.
 This happens when the speaker lingers on a slide for a while, generating a lot of text without changing the video much.
-Finally, I use [Pandoc](https://github.com/jgm/pandoc) to convert that markdown file into a PDF.
\ No newline at end of file
+Finally, I use [Pandoc](https://github.com/jgm/pandoc) to convert that markdown file into a PDF.
+
+## Image Similarity
+
+How do I decide whether a frame is "too similar" to a previous frame?
+I experimented with a few options and settled on the `dhash` function in the `imagehash` library.
+A description of dhash is provided [here](https://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html).
+
+In short, the difference hash works like this (sketched in code below):
+* The image is reduced to 9x8 (72 pixels) and converted to grayscale.
+* Pixel differences are computed within each row, yielding an 8x8 grid of differences.
+* Each of the 64 bits in the hash is set if the left pixel is brighter than the right pixel.
+* The distance between two hashes is the Hamming distance: the number of bits that differ.
+
+For my purposes, I want to treat *some* small variation between frames as "the same", since many videos of talks have a small overlay of the presenter speaking.
+However, I don't want to be too liberal, since it's also common for slides to change only incrementally as a concept is explained.
+I settled on a maximum difference of 1 bit as a reasonable threshold.
+If the overlay of the speaker is too large, this doesn't work quite as well, but I'd rather include too many images in the output than too few.
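+
+To make the list above concrete, here is a rough sketch of the difference hash in Python.
+It is for illustration only (the pipeline itself calls the real `dhash` function from `imagehash`), and it assumes Pillow and NumPy are available.
+
+```python
+from PIL import Image
+import numpy as np
+
+
+def dhash_sketch(path):
+    # Reduce the image to 9x8 pixels and convert it to grayscale.
+    img = Image.open(path).convert("L").resize((9, 8))
+    pixels = np.asarray(img)  # shape (8, 9): 8 rows of 9 pixels
+
+    # Compare each pixel to its right-hand neighbour within the same row,
+    # yielding an 8x8 grid of booleans (True where the left pixel is brighter).
+    diff = pixels[:, :-1] > pixels[:, 1:]
+
+    # Pack the 64 booleans into a 64-bit integer hash.
+    bits = 0
+    for bit in diff.flatten():
+        bits = (bits << 1) | int(bit)
+    return bits
+
+
+def hamming_distance(a, b):
+    # The distance between two hashes is the number of bits that differ.
+    return bin(a ^ b).count("1")
+```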
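+
+And here, roughly, is what the "too similar" test looks like with the library itself.
+This is a simplified sketch rather than the pipeline's actual code; `frame_paths` and `insert_screenshot` are hypothetical stand-ins.
+
+```python
+from PIL import Image
+import imagehash
+
+MAX_DIFFERENCE = 1  # hashes differing by at most 1 bit count as "the same" slide
+
+last_hash = None
+for frame_path in frame_paths:  # hypothetical list of extracted screenshots
+    current = imagehash.dhash(Image.open(frame_path))
+    # Subtracting two ImageHash objects gives the Hamming distance in bits.
+    if last_hash is None or (current - last_hash) > MAX_DIFFERENCE:
+        insert_screenshot(frame_path)  # hypothetical: add the image to the markdown
+        last_hash = current
+```
+
+Only frames that differ from the last *inserted* screenshot by more than one bit are added, which is why `last_hash` is updated only inside the `if` branch.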