image difference

Carl Pearson
2023-03-07 06:58:11 -07:00
parent 98cd5fdb58
commit fd4e282fea

@@ -141,3 +141,20 @@ A markdown document is generated by inserting each paragraph in turn.
A screenshot is inserted as well, *unless* it is too similar to the last inserted screenshot.
This happens when the speaker lingers on a slide for a while, generating a lot of text without changing the video much.
Finally, I use [Pandoc](https://github.com/jgm/pandoc) to convert that markdown file into a PDF.
## Image Similarity
How do I decide whether a frame is "too similar" to a previous frame?
I experimented with a few options and settled on the `dhash` function in `imagehash`.
A description of dhash is provided [here](https://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html).
In short, the difference hash works like this:
* The image is reduced to 9x8 (72 pixels) and converted to grayscale.
* Differences between horizontally adjacent pixels are computed within each row, yielding an 8x8 grid of differences.
* Each of the 64 bits in the hash is set if the left pixel is brighter than the right pixel.
* The distance between two hashes is the Hamming distance: the number of bits that differ.
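
As a rough sketch of the steps above (not the `imagehash` implementation), here is the hash and the distance in plain Python. It assumes the frame has already been reduced to a 9x8 grayscale grid, represented as a list of rows; the names `dhash_bits` and `hamming` are made up for illustration.

```python
def dhash_bits(gray):
    """Difference hash of a 9-wide by 8-tall grid of grayscale values.

    `gray` is a list of 8 rows, each a list of 9 brightness values.
    Resizing and grayscale conversion are assumed to have happened already.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            # One bit per adjacent pair: 1 if the left pixel is brighter.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Two toy "frames": a ramp that gets darker left to right,
# and the same ramp with the top-left pixel darkened.
frame_a = [[9 - x for x in range(9)] for _ in range(8)]
frame_b = [row[:] for row in frame_a]
frame_b[0][0] = 0  # no longer brighter than its right-hand neighbor

hash_a = dhash_bits(frame_a)
hash_b = dhash_bits(frame_b)
print(hamming(hash_a, hash_b))  # prints 1
```

With `imagehash` itself, `imagehash.dhash(image)` returns a hash object and subtracting two such objects gives their Hamming distance. The bit order and comparison direction in the library may differ from this sketch, so the raw hash values are not expected to match.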
For my purposes, I want to call *some* small variation between frames "the same", since many videos of talks have a small overlay of the presenter speaking.
However, I don't want to be too liberal, since it's also common for slides to change only incrementally as a concept is explained.
I settled on a threshold of 1 bit of difference, which provides a reasonable test.
If the overlay of the speaker is too large, this doesn't work quite as well, but I'd rather include extra images in the output than too few.
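
To make the thresholding concrete, here is a minimal sketch of how frames could be filtered against the last inserted screenshot. It assumes each candidate frame has already been hashed to an integer, and that a distance of at most `max_same_bits` counts as "the same"; whether the cutoff is inclusive is my assumption, and `select_frames` is a hypothetical name.

```python
def select_frames(hashes, max_same_bits=1):
    """Return indices of frames that differ enough from the last kept frame.

    `hashes` is a sequence of integer hashes, one per candidate frame.
    `max_same_bits` is an assumed cutoff: a Hamming distance at or below it
    is treated as "the same" frame.
    """
    kept = []
    last = None
    for i, h in enumerate(hashes):
        # Compare against the last *kept* frame, not the previous candidate.
        if last is None or bin(h ^ last).count("1") > max_same_bits:
            kept.append(i)
            last = h
    return kept


# Hypothetical 8-bit hashes for five candidate frames.
hashes = [0b00000000, 0b00000001, 0b00000111, 0b00000110, 0b11110110]
print(select_frames(hashes))  # prints [0, 2, 4]
```

Raising `max_same_bits` tolerates a larger speaker overlay but risks dropping slides that change only incrementally, which is the tradeoff described above.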