image difference
@@ -141,3 +141,20 @@ A markdown document is generated by inserting each paragraph in turn.
A screenshot is inserted as well, *unless* it is too similar to the last inserted screenshot.
This happens when the speaker lingers on a slide for a while, generating a lot of text without changing the video much.
Finally, I use [Pandoc](https://github.com/jgm/pandoc) to convert that markdown file into a PDF.

## Image Similarity

How do I decide whether a frame is "too similar" to a previous frame?
I experimented with a few options and settled on the `dhash` function in `imagehash`.
A description of dhash is provided [here](https://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html).

In short, the difference hash works like this:
* The image is reduced to 9x8 (72 pixels) and converted to grayscale.
* Pixel differences are computed within each row, yielding an 8x8 grid of differences.
* Each of the 64 bits in the hash is set if the left pixel is brighter than the right pixel.
* The distance between two hashes is the Hamming distance: the number of bits that differ.
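
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the `imagehash` implementation: it assumes the frame has already been resized to 9x8 and converted to grayscale (for example with Pillow), and the names `dhash_bits` and `hamming_distance` are made up for this example. The library may order or orient its bits differently, which doesn't matter as long as both hashes come from the same function.

```python
def dhash_bits(pixels):
    """Compute a 64-bit difference hash.

    `pixels` is assumed to be 8 rows of 9 grayscale values each,
    i.e. a frame already shrunk to 9x8 (hypothetical input format).
    """
    if len(pixels) != 8 or any(len(row) != 9 for row in pixels):
        raise ValueError("expected 8 rows of 9 grayscale values")

    value = 0
    for row in pixels:
        # 9 pixels per row give 8 left/right comparisons.
        for left, right in zip(row, row[1:]):
            bit = 1 if left > right else 0  # set when the left pixel is brighter
            value = (value << 1) | bit
    return value


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


# A frame that brightens from left to right: no left pixel is brighter.
gradient = [[col * 10 for col in range(9)] for _ in range(8)]

# The same frame with one very bright pixel in the top-left corner.
changed = [row[:] for row in gradient]
changed[0][0] = 255

# The mirror image: every left pixel is brighter than its neighbour.
reversed_gradient = [row[::-1] for row in gradient]

print(hamming_distance(dhash_bits(gradient), dhash_bits(changed)))            # 1
print(hamming_distance(dhash_bits(gradient), dhash_bits(reversed_gradient)))  # 64
```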

For my purposes, *some* small variation between frames should still count as "the same", since many videos of talks have a small overlay of the presenter speaking.
However, I don't want to be too liberal, since it's also common for slides to change only incrementally as a concept is explained.
I settled on a difference of 1 bit as a reasonable test.
If the overlay of the speaker is too large, this doesn't work quite as well, but I'd rather include too many images in the output than too few.
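
As a rough sketch of how such a threshold could be applied, the function below walks through a sequence of frame hashes and keeps a frame only when it differs enough from the last frame that was kept. It makes two assumptions that the text above does not settle: the hashes are plain integers, and a distance of at most 1 bit counts as "the same". The names `select_frames` and `max_same_distance` are hypothetical.

```python
def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def select_frames(hashes, max_same_distance=1):
    """Return the indices of frames worth inserting as screenshots.

    A frame is skipped when its hash is within `max_same_distance` bits
    of the last *kept* frame (assumed reading of the 1-bit threshold).
    """
    kept = []
    last_kept_hash = None
    for index, frame_hash in enumerate(hashes):
        is_new = (
            last_kept_hash is None
            or hamming_distance(frame_hash, last_kept_hash) > max_same_distance
        )
        if is_new:
            kept.append(index)
            last_kept_hash = frame_hash
    return kept


# Toy 4-bit "hashes" standing in for real 64-bit ones.
frame_hashes = [0b0000, 0b0001, 0b0111, 0b0111, 0b0110]
print(select_frames(frame_hashes))  # [0, 2]
```

With `imagehash`, subtracting one hash object from another gives the same Hamming distance, so the comparison could be written as `hash_a - hash_b` instead of calling a helper.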