Would you guys use the following service if it existed?

Screenshots of posts from social media (think a screenshot of a tweet from Donald Trump or whatever) can be faked super easily, as our /b/rothers know... We were thinking of building a service that would verify such a screenshot to see whether Trump actually wrote that exact content or whether it was doctored.

I wouldn't use it but I can already see a limitation.

You'll have to have a scanning system in place to extract the relevant text portions from the shot and pass them over to a crawler that will have to sift through the posts still available. If said crawler isn't able to see the post, or if the original was taken down, it's totally fucked.
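
Roughly what that first step could look like, just as a sketch: pull the text out of the screenshot with OCR before handing it to a crawler. This assumes the Tesseract binary plus the pytesseract and Pillow packages are installed, and the filename is made up.

    # Minimal OCR sketch: extract the text from a screenshot so a crawler can search for it.
    from PIL import Image
    import pytesseract

    def extract_text(screenshot_path):
        """Return the raw text Tesseract finds in the screenshot."""
        image = Image.open(screenshot_path)
        return pytesseract.image_to_string(image)

    if __name__ == "__main__":
        # "tweet_screenshot.png" is a hypothetical filename.
        print(extract_text("tweet_screenshot.png"))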

So literally just archive.is?

That's a fair consideration and part of the technical hurdle to overcome, but FWIW I believe it to be doable.

My question simply pertains to whether people care to verify content that is so easily faked.

I don't think that service can parse text from screenshots, which is exactly the problem. I can't find an easy way to verify such content (e.g., screenshots of tweets, Facebook posts, YouTube comments, news articles, etc.)

If it's going to be related to imageboard culture or politics in any way, it might be an interesting conceptual project and a website people use for fun, but worthless in monetary terms.
Professionally, if you could make the system work without flaws (very hard and potentially very costly), there may be demand from law enforcement and the legal world.

Right, so you want to upload a picture to your service and check whether that happened. It's a hell of a challenging thing to do. And if the content gets stolen, unless you're constantly crawling the web, you will not be able to verify things that were deleted. Which means you're going to end up with a search-engine-tier database.

>And if the content gets stolen,
I don't know why I said stolen, I obviously meant deleted.

I definitely agree that it would be challenging, especially the concern you're bringing up: having to constantly crawl each of the services we claim to support. But this is an engineering problem (that may or may not be hard), not a research question, right?

Wasn't this already a share blue thing and somebody cracked the shitty fact checker code?

Wouldn't it be better to reverse the problem? Instead of trying to verify screenshots after the fact, which is most likely impossible, why not provide a screenshot certification service? You could attach a hash to a screenshot, and then people could verify whether that hash corresponds to a valid capture.

It wouldn't provide actual verification, but if it becomes popular enough people may not believe unverified screenshots.

I'll offer one way we thought of potentially monetizing this: imagine you are someone who puts out content... like Trump... you may be interested in ensuring that no one is putting out content under your name. Perhaps you'd even want to be notified when faked content is identified somewhere.

Also, there may be advertising interest in knowing that you, as an individual, cared to verify whether the post of a particular celebrity was faked or not.

I am not familiar with the thing you are describing. Can you give more info please?

You do know that if even one pixel is different between two pics, the hashes will be completely different, right?
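
To make that concrete, here's a quick illustration using Python's standard hashlib (the byte strings are made up): flip a single byte and the SHA-256 digest changes completely, which is why exact hashing can't recognise a re-encoded or slightly edited screenshot.

    import hashlib

    # Two "images" that differ by exactly one byte.
    original = b"fake pixel data" + b"\x00"
    tweaked  = b"fake pixel data" + b"\x01"

    print(hashlib.sha256(original).hexdigest())
    print(hashlib.sha256(tweaked).hexdigest())
    # The two digests share no visible relationship (the avalanche effect).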

That's interesting, and certainly simpler on the front-end (i.e., it would save us from developing the software to extract the text from the screenshot).

I think the point is that each screenshot would have an association with a true link (i.e., a reference) and in this way, be verified

I don't mean hashing the pixels. I mean an algorithm that associates the same picture (even if it differs in format, compression, etc.) with a sequence that can be checked.
Pictures can be matched by similarity; it's how all reverse image searches work.
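
In other words, a perceptual hash rather than a cryptographic one. A minimal sketch of that idea, using a simple difference hash (dHash) built with Pillow only; the 8x8 size and the filenames are just illustrative:

    from PIL import Image

    def dhash(image_path, hash_size=8):
        """Difference hash: tolerant of re-compression and resizing, unlike SHA-256."""
        img = Image.open(image_path).convert("L").resize((hash_size + 1, hash_size))
        pixels = list(img.getdata())
        bits = []
        for row in range(hash_size):
            for col in range(hash_size):
                left = pixels[row * (hash_size + 1) + col]
                right = pixels[row * (hash_size + 1) + col + 1]
                bits.append(1 if left > right else 0)
        return sum(bit << i for i, bit in enumerate(bits))

    def hamming(a, b):
        """Number of differing bits; a small distance means 'probably the same picture'."""
        return bin(a ^ b).count("1")

    # Two saves of the same screenshot should land within a few bits of each other:
    # hamming(dhash("shot_original.png"), dhash("shot_recompressed.jpg"))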

This would be a disaster, and it's also the reason why we are not considering verifying image content that people put out (or video/audio, for that matter).

We just want to verify the source of textual information

Oh I see. That makes more sense. Yeah, that would definitely do the trick. Are there algorithms available for reverse image searching?

Well the implementation details would be left to you. I'm just saying that providing fact verification is doable if you check whether the information is true as it is happening.
Say somebody wants to screenshot a tweet: they would send the picture to your website along with the source of that tweet. Then your backend would generate a code that could be shared. And you would archive the source of that information, so that whoever wants to fact-check it will have access to it.
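
A rough sketch of that flow (the field names, the in-memory "archive", and the short-code scheme are all assumptions, not a spec): the backend archives the claimed source and hands back a shareable code.

    import hashlib, json, time

    ARCHIVE = {}  # stand-in for a real database of archived sources

    def register_screenshot(source_url, archived_html):
        """Archive the claimed source and return a short shareable code."""
        record = {
            "source_url": source_url,
            "archived_html": archived_html,
            "captured_at": int(time.time()),
        }
        digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
        code = digest[:12]  # short code to share alongside the screenshot
        ARCHIVE[code] = record
        return code

    def verify(code):
        """Anyone holding the code can pull up the archived source."""
        return ARCHIVE.get(code)

    # Usage: code = register_screenshot("https://twitter.com/...", "<html>...</html>")
    #        verify(code) -> the archived record, or None if it was never registered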

Yes. Well, it's still an active area of research but thanks to Deep Learning (TM) performance on this problem has improved dramatically.

I like this... thanks for proposing the idea!

Moving the onus onto the person who is creating the content to ensure that it would still be trusted despite being a screenshot makes a lot of sense

Ah, that's interesting. Google reverse image search and TinEye have been around for a few years though; are they using machine learning to do it? Even as early as 2011?

Not OP btw, just interested.

There's a very small demographic of people smart enough to question validity but not able to do their own legwork. By the time a screenshot goes viral, there's already a Snopes or PolitiFact entry for it.

If your system grabbed all posts as they happen and stored them in a database for later checks, it'd beat those to the punch.
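
A toy version of that "store posts as they happen" idea (the normalisation and field names are just assumptions): key each post by a hash of its normalised text, so a later screenshot check becomes a dictionary lookup.

    import hashlib
    import re

    POSTS = {}  # hash of normalised text -> metadata about the original post

    def normalise(text):
        """Lower-case and collapse whitespace so minor rendering differences still match."""
        return re.sub(r"\s+", " ", text).strip().lower()

    def store_post(author, text, url):
        key = hashlib.sha256(normalise(text).encode()).hexdigest()
        POSTS[key] = {"author": author, "text": text, "url": url}

    def check_screenshot_text(extracted_text):
        """Return the archived post if the OCR'd text was ever actually posted, else None."""
        key = hashlib.sha256(normalise(extracted_text).encode()).hexdigest()
        return POSTS.get(key)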

Machine learning (and computer vision) have been around for a REALLY long time, and people have attacked these problems for a long time. It's only recently that performance has gotten so good that it was worth anyone's while to pay attention to it and pay (a lot of) money for these services.

If you are interested in reading more about this stuff, you can look into keypoint/feature matching (e.g., HOG, SIFT) and Deformable Parts Models. But all of this has mostly been displaced by deep learning.
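
If you want to poke at the pre-deep-learning approach yourself, here's a hedged sketch of keypoint matching with OpenCV's ORB detector (ORB stands in for SIFT here purely because it ships freely with OpenCV; the filenames are placeholders).

    import cv2

    def good_match_count(path_a, path_b, ratio=0.75):
        """Count 'good' ORB keypoint matches between two images using Lowe's ratio test."""
        img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
        img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
        orb = cv2.ORB_create()
        _, des_a = orb.detectAndCompute(img_a, None)
        _, des_b = orb.detectAndCompute(img_b, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
        matches = matcher.knnMatch(des_a, des_b, k=2)
        good = [pair[0] for pair in matches
                if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance]
        return len(good)

    # More good matches -> more likely the two screenshots show the same content.
    # print(good_match_count("shot_a.png", "shot_b.png"))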

Some liberal fact-checkers thought they found a way to verify facts online, but the code was too simple and was cracked. In order to ridicule them, someone created a website to auto-generate codes that could be applied to whatever ridiculous shit you wanted. It's a really stupid idea, if you ask me.

You are hitting at the core of the proposal: is there a demographic of people who need content verified? My team and I believe so.

In some cases, it may be a matter of convenience: if it is the most recent post of that account, checking is easy. But what if it's a post from a week ago? Or if it's not Twitter but, say, a CNN article that's not as easily cataloged? For content producers, it may lend some peace of mind that their account can't be misrepresented.

Of course both pieces need to be in play (i.e., people caring that content is verified and producers caring that their content can't be hijacked)

Lol. I think that verifying stuff in general will be pretty difficult, but I don't know if that reflects poorly on the concept of verifying online content. Thanks for mentioning this though, very interesting.