
    > Hash collisions would fail human review.
This (pervasive, over the past couple of days) idea that Apple (of all major tech companies, lol!) will be capable of manually reviewing tens of thousands of automated detections per day is... nuts.

The "system as described by Apple" doesn't comport to reality, because it relies on human review. If you remove the human review, the system is fucked.

But no company on the planet has the capability to sanely and ethically (to say nothing of competently or effectively) conduct such review, at the scale of iOS.



Can they even, legally, review anything at all? I mean, it's highly likely there will be actual CP among the matches, viewing of which is - AFAIK - a crime in the US.


That is somewhat unclear at the moment. They don't get to see the actual image in your library; they see a derived image that's part of the encrypted data uploaded by your phone as it analyses the images.

I don't believe any of the information they've released thus far gives any real detail about what that derived image actually is.

One might guess it's a significantly detail-reduced version of the original image, which they would compare against the detail-reduced image that can be generated from the matching hash in the CSAM database.
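
To make that guess concrete, here's a toy sketch (Python/Pillow) of what a "detail-reduced derivative" could look like. Purely illustrative: Apple hasn't published what the visual derivative actually contains, and the sizes here are made up.

    # Purely illustrative guess at a "visual derivative": Apple has not
    # documented the actual format, so this is just a heavily reduced thumbnail.
    from PIL import Image

    def make_visual_derivative(path, size=64):
        """Downscale and greyscale an image so only coarse detail survives."""
        img = Image.open(path)
        img = img.convert("L")           # drop colour
        return img.resize((size, size))  # drop fine detail

    # make_visual_derivative("photo.jpg").save("derivative.png")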


Tens of thousands of automated detections per day? Unlikely. More likely tens per year. Remember, this isn't a porn detector combined with a child detector. It hashes images in your cloud-enabled photo library and compares those hashes against hashes of images already known to child abuse authorities.
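
To be clear about what "comparing hashes" means, a rough sketch of the matching step. Illustrative only: the real system uses a perceptual hash (NeuralHash) and a blinded/private set intersection protocol on the server, not a plain local SHA-256 lookup, which here just stands in for "some fixed hash of the image".

    import hashlib

    # Hashes supplied by NCMEC and similar child-safety organisations.
    KNOWN_HASHES = set()

    def image_hash(image_bytes):
        # Stand-in for a perceptual hash; the real system is not SHA-256.
        return hashlib.sha256(image_bytes).hexdigest()

    def is_known_image(image_bytes):
        """Flag an image only if its hash already appears in the known database."""
        return image_hash(image_bytes) in KNOWN_HASHES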

In addition, consider how monumentally unlikely it is for any CSAM enthusiast to copy these illicit photos into their phone's general camera roll alongside pictures of their family and dog. This is only going to catch the stupidest and sloppiest CSAM enthusiast.


For comparison to your "likely tens per year" number, Facebook is running the same kind of detectors and reports ~20 million instances a year: https://twitter.com/durumcrustulum/status/142377627884745113...


That doesn't seem to be the same kind of detectors at all.

"21.4 million of these reports were from Electronic Service Providers that report instances of apparent child sexual abuse material that they become aware of on their systems."

So those 20M seem to be images that Facebook looked at and determined to be CP. Apple's system is about comparing hashes against already known CP.

For the record: I don't support Apple's system here, but it's not the same kind of detection at all. Let's try not to make up random facts.


From the same thread: https://twitter.com/alexstamos/status/1424017125736280074

> The vast majority of Facebook NCMEC reports are hits for known CSAM using a couple of different perceptual fingerprints using both NCMEC's and FB's own hash banks.


Ah, I see. My apologies.


Facebook looked at them after they hash-matched known CP. That is how all these providers do it.

If you think that this is 20 million people mashing the report button, that is almost certainly wrong.


That's a summary number of many kinds of reports, of which CSAM hash matches would be one part.

That summary number also includes accusations of child sex trafficking and online enticement. I wouldn't be surprised if reported allegations of trafficking and enticement were in excess of 99.9% of Facebook's reporting. But since they don't break it out, I can only guess.

Given that guesses aren't useful to anyone, it would be interesting if you know of any statistics from any of the major tech vendors on the reporting frequency of just CSAM hash matches.


> of which CSAM hash matches would be one part.

The majority part:

https://twitter.com/alexstamos/status/1424017125736280074

> The vast majority of Facebook NCMEC reports are hits for known CSAM using a couple of different perceptual fingerprints using both NCMEC's and FB's own hash banks.


Fascinating. Thank you for providing the clarification. I still find that number perplexingly huge. If it's indeed correct, one hopes that Apple knows what they're getting themselves in for.


> If it's indeed correct

Just admit you are wrong and leave it at that without continuing to try to put a false light on this.


Thanks for the kind suggestion, but I'm not going to concede anything on the basis of an assertion made by one person in one tweet, with zero supporting evidence, zero specificity, zero context.

Assuming that number is correct, it means there are orders of magnitude more reports than there are entries in the CSAM database. So even if I conceded that Facebook were reporting over 10 million CSAM images, how many distinct images does this represent? More than four? We have no idea.

How many of those four were actually illegal? Remember, there's a Venn diagram of CSAM and illegal. A non-sexual, non-nude photograph of a child about to be abused is CSAM but not illegal.

This is a serious topic; you don't seem to be taking it seriously.


Google is probably a better comparison. I can't find the source atm, but IIRC it was ~500k/year.


That wouldn't surprise me as Google's reporting would include everything seen by GoogleBot as it crawls the internet.


Ten thousand iOS users doing something stupid or sloppy per day (noting they don’t have to be stupid or sloppy in general for that to happen) would not meet the "monumentally unlikely" criterion for me. Also, this is not counting the false positives, which are the premise of this thread.


Yes, being sloppy is common.

I don't know about anyone else, but I've never had any issue with regular porn sloppily falling into my camera roll. And that's just regular legal porn. Maybe I'm more diligent than others, but regardless, it's just not something that happens to me.

Being sloppy with material which you know is illegal? Material which, if stumbled upon by a loved one, could utterly ruin your life whether or not authorities are notified? Material which (I optimistically assume) is difficult to acquire and you'd know to guard with the most extreme trepidation? We're seriously expecting tens of thousands of CSAM enthusiasts to be sloppy with their deepest personal secret and have this stuff casually fall into their camera roll?

I'm not buying that.


A false positive on its own will not have any effect. The threshold system they have means that they won’t be able to decrypt the results unless there are many separate matches.
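
For anyone curious what "threshold" means mechanically: the scheme Apple describes is threshold secret sharing, where each match contributes one share of a per-account key, and the server can only reconstruct that key (and so decrypt any of the vouchers) once it holds at least T shares. A toy Shamir-style sketch with made-up parameters; Apple's actual field, encodings and threshold are not these.

    # Toy threshold secret sharing (Shamir-style). Only the idea matches what
    # Apple describes: below the threshold, the key cannot be recovered.
    import random

    PRIME = 2**127 - 1  # field size for the toy example

    def make_shares(secret, threshold, count):
        """Split `secret` into `count` shares; any `threshold` of them recover it."""
        coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
        def poly(x):
            return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
        return [(x, poly(x)) for x in range(1, count + 1)]

    def recover(shares):
        """Lagrange interpolation at x = 0 over the prime field."""
        secret = 0
        for xi, yi in shares:
            num, den = 1, 1
            for xj, _ in shares:
                if xj != xi:
                    num = num * -xj % PRIME
                    den = den * (xi - xj) % PRIME
            secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
        return secret

    key = 123456789                                   # made-up per-account key
    shares = make_shares(key, threshold=30, count=100)
    assert recover(shares[:30]) == key                # enough matches: recoverable
    assert recover(shares[:29]) != key                # below threshold: (almost surely) not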



