I had an idea for how to implement a "duplicate file finder" feature on Hirobooru that I think could work reasonably well.

I could resize each image to 10x10px, convert it to grayscale and store the gray values in an array. Then, when comparing images, I could see which ones have the closest gray values... :blobcatthonking:

I'm not sure if that's how it's supposed to be done, but I just thought of it.
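A rough sketch of the idea in plain Python, assuming the image is already decoded into rows of 0-255 gray values (a real setup would use an image library for the decoding and resizing). The names `signature` and `distance` are made up for illustration, and the mean absolute difference is just one possible way to measure "closeness":

```python
def signature(pixels, size=10):
    """Shrink a grayscale image (list of rows of 0-255 ints) to size x size
    by averaging blocks, and return the averages as a flat list."""
    height, width = len(pixels), len(pixels[0])
    sig = []
    for gy in range(size):
        # Rows covered by this grid cell (always at least one row).
        y0 = gy * height // size
        y1 = max((gy + 1) * height // size, y0 + 1)
        for gx in range(size):
            # Columns covered by this grid cell (always at least one column).
            x0 = gx * width // size
            x1 = max((gx + 1) * width // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            sig.append(sum(block) // len(block))
    return sig


def distance(sig_a, sig_b):
    """Mean absolute difference between two signatures (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Two images would then count as similar when `distance` falls under some threshold that has to be tuned by hand.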


I've already implemented this algorithm and it works pretty well!

For now I can compare one image against many to find similar images. Now I need to think of a way to compare a group of images against each other to find the duplicates...
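One simple way to do the group comparison would be to check every pair once. This is only a sketch: `find_duplicate_pairs`, the dict-of-signatures input and the threshold of 5.0 are assumptions for illustration, not what Hirobooru does.

```python
from itertools import combinations


def mean_abs_diff(sig_a, sig_b):
    """Mean absolute difference between two equal-length gray-value lists."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b)) / len(sig_a)


def find_duplicate_pairs(signatures, threshold=5.0):
    """signatures maps an image name to its list of gray values.
    Returns (name_a, name_b, distance) tuples, closest pairs first."""
    pairs = []
    # combinations() yields each unordered pair exactly once.
    for (name_a, sig_a), (name_b, sig_b) in combinations(signatures.items(), 2):
        d = mean_abs_diff(sig_a, sig_b)
        if d <= threshold:
            pairs.append((name_a, name_b, d))
    return sorted(pairs, key=lambda pair: pair[2])
```

This does n*(n-1)/2 comparisons, which is fine for a few thousand images but grows quickly after that.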


The algorithm works, and it makes it SO MUCH EASIER to seek and destroy repeated images without having to export them and use other software.

Refined the algorithm a bit so it only considers images similar if they have roughly the same aspect ratio.
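The ratio check could look something like this. The function name `similar_ratio` and the 10% tolerance are assumptions, since the post doesn't say how "roughly" is measured:

```python
def similar_ratio(size_a, size_b, tolerance=0.1):
    """size_a and size_b are (width, height) tuples.
    True when the aspect ratios differ by at most `tolerance` (relative)."""
    ratio_a = size_a[0] / size_a[1]
    ratio_b = size_b[0] / size_b[1]
    return abs(ratio_a - ratio_b) <= tolerance * max(ratio_a, ratio_b)
```

Running this before the gray-value comparison also skips the more expensive check for pairs that can't match anyway.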

@hideki
Sort the images into 100 lists by pixel value (one list per pixel of the 10x10 thumbnail), then compare images by checking how close they are to one another across all the sorted lists?

Or use Levenshtein distance on the resulting image when interpreted as a string. You could probably alter the algo to favor low bits over high ones.
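For reference, a minimal Levenshtein sketch that works on strings or on lists of gray values. The `quantize` helper is an extra assumption (coarse buckets so nearly equal grays count as equal), and the bit-weighting tweak is not attempted here:

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (free on a match)
            ))
        previous = current
    return previous[-1]


def quantize(gray_values, levels=16):
    """Hypothetical helper: map 0-255 grays into `levels` coarse buckets."""
    return [v * levels // 256 for v in gray_values]
```

Without some kind of bucketing, two grays that differ by 1 count as a full mismatch, so raw values would probably be too strict for this.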

I think image searchers use a specialized hash to compare.

@NickolasGir thanks, I'll research the Levenshtein distance. Yeah, I've heard some calculate signatures and then compare the distance between signatures too; I need to see how that is done.
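One common version of the signature approach is an "average hash": one bit per pixel, set when the pixel is brighter than the image's mean, with signatures compared by Hamming distance. A sketch under that assumption (this is not necessarily what any particular image search uses, and the function names are made up):

```python
def average_hash(gray_values):
    """Pack one bit per gray value into an int: 1 if above the mean, else 0."""
    mean = sum(gray_values) / len(gray_values)
    bits = 0
    for value in gray_values:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(hash_a, hash_b):
    """Number of bit positions where the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")
```

With the 10x10 grayscale values this gives a 100-bit signature per image, and a small Hamming distance suggests a likely duplicate.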

Game Liberty Mastodon
