"@derpy727":/meta/feature-suggestions-and-discussion/post/2932808#post_2932808
[bq]There's literally no other way to find the original derpibooru page by a given image downloaded from Derpibooru except by heavyweight reverse image search (which doesn’t have a JSON API even).[/bq]Technically, it does have a JSON API: POST the image form-encoded in the @image@ param and set the Accept header to @application/json@ (or append @.json@ to the URL), or use it bookmarklet-style with the GET query param @scraper_url@. However, based on what you wrote, this probably wouldn't be very helpful to you.
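To make the @scraper_url@ variant concrete, here's a minimal sketch that just builds the request (no network call); the endpoint path @/search/reverse@ is an assumption based on the site's reverse-search page, and the image URL is a placeholder:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Sketch: reverse-search by URL via the scraper_url query param,
# asking for JSON with the Accept header. The endpoint path is an
# assumption; the image URL is a stand-in.
BASE = "https://derpibooru.org/search/reverse"
params = urlencode({"scraper_url": "https://example.com/some_image.png"})
req = Request(f"{BASE}?{params}", headers={"Accept": "application/json"})

print(req.full_url)
print(req.get_header("Accept"))
```

Sending the same request as a POST with the image bytes form-encoded in the @image@ param works the same way, just with a multipart body.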
[bq]I'm not even sure what is the use-case for hashes of original submissions alone. [...] Maybe you're using it to detect duplicate submissions… no, scratch that.[/bq]Actually, that's correct. SHA-512 prevents exact, byte-for-byte duplicate copies of images from being uploaded. This catches about 50% of our duplicate uploads; the rest are caught by a heavier-weight perceptual deduplication pass.
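The exact-duplicate check reduces to a digest lookup; here's a minimal sketch where an in-memory set stands in for the site's database:

```python
import hashlib

# Sketch of exact-duplicate detection: hash the raw uploaded bytes
# with SHA-512 and reject an upload whose digest has been seen before.
# The set is an in-memory stand-in for a database lookup.
seen = set()

def is_exact_duplicate(data: bytes) -> bool:
    digest = hashlib.sha512(data).hexdigest()
    if digest in seen:
        return True
    seen.add(digest)
    return False

print(is_exact_duplicate(b"fake image bytes"))   # first upload: False
print(is_exact_duplicate(b"fake image bytes"))   # byte-identical re-upload: True
print(is_exact_duplicate(b"fake image bytes."))  # one byte differs: False
```

The third call is exactly why this only catches about half of duplicates: any re-encode, resize, or metadata change produces different bytes and a different digest.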
[bq]But if you did, SHA512 of post-optimisation images would be even BETTER for detecting duplicates, because [...][/bq]This assertion is faulty. Optimized image data has no single "normal form":https://en.wikipedia.org/wiki/Normal_form_(mathematics) that every set of input pixels naturally and obviously reduces to; the output bytes, and therefore the hash, will generally differ between runs on differently-encoded inputs.
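You can see the "no normal form" point with DEFLATE (the compressor inside PNG) as a stand-in for a full image optimizer: the same input compressed at different settings decompresses to identical data, yet the compressed streams, and so their SHA-512 hashes, differ.

```python
import hashlib
import zlib

# The same "pixels" compressed at two settings: identical decoded
# data, different compressed bytes, different hashes.
pixels = b"the quick brown fox jumps over the lazy dog " * 500

fast = zlib.compress(pixels, 1)  # fastest setting
best = zlib.compress(pixels, 9)  # maximum compression

assert zlib.decompress(fast) == zlib.decompress(best) == pixels
print(fast == best)  # the streams differ
print(hashlib.sha512(fast).hexdigest()[:16])
print(hashlib.sha512(best).hexdigest()[:16])
```

An optimizer run on two differently-encoded copies of the same picture is in an even worse position, since it isn't even guaranteed to see the same decoded input (color profiles, chroma subsampling, and so on).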
[bq]And since I'm not quite confident in Derpibooru's reverse search algorithm (which is it, btw?)[/bq]Homegrown. See "here":https://gist.github.com/liamwhite/b023cdba4738e911293a8c610b98f987.
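For readers unfamiliar with the general idea (this is the textbook "average hash", NOT the algorithm from the linked gist): shrink the image to a tiny grid, threshold each cell at the mean, and compare the resulting bit fingerprints by Hamming distance, so small brightness or encoding changes leave the fingerprint nearly unchanged.

```python
# Generic average-hash sketch; "images" are 2D lists of 0-255 values.
def average_hash(gray, size=8):
    """Block-average down to size x size, threshold at the mean,
    and pack the comparisons into a 64-bit fingerprint."""
    h, w = len(gray), len(gray[0])
    cells = [
        [
            sum(
                gray[y][x]
                for y in range(r * h // size, (r + 1) * h // size)
                for x in range(c * w // size, (c + 1) * w // size)
            ) / ((h // size) * (w // size))
            for c in range(size)
        ]
        for r in range(size)
    ]
    mean = sum(sum(row) for row in cells) / (size * size)
    bits = 0
    for row in cells:
        for v in row:
            bits = (bits << 1) | (v > mean)
    return bits

def hamming(a, b):
    return bin(a ^ b).count("1")

# A 16x16 gradient vs. the same gradient slightly brightened:
# the fingerprints come out identical (Hamming distance 0).
img = [[(x + y) * 8 for x in range(16)] for y in range(16)]
brighter = [[v + 5 for v in row] for row in img]
print(hamming(average_hash(img), average_hash(brighter)))  # → 0
```

This is only meant to convey the flavor of perceptual hashing; the gist above describes what the site actually runs.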
As far as I can tell, nothing I can personally do on my side would be helpful for you.