It should come as little surprise that any content you offer to the web for public consumption has the potential to be scraped and misused by anyone clever enough to do it. And while that doesn’t make this weekend’s report from The New York Times any less damning, it’s a great reminder about how important it is to really go through the settings for your various social networks and limit how your content is, or can be, accessed by anyone.
I won’t get too deep into the Times’ report; it’s worth reading on its own, since it involves a company (Clearview AI) scraping more than three billion images from millions of websites, including Facebook, and creating a facial-recognition app that does a pretty solid job of identifying people using images from this massive database.
Even though Clearview’s scraping techniques technically violate the terms of service on a number of websites, that hasn’t stopped the company from acquiring images en masse. And it keeps whatever it finds, which means that turning all your online data private isn’t going to help if Clearview has already scanned and grabbed your photos.
Still, something is better than nothing. On Facebook, likely the largest stash of your images, you’re going to want to visit Settings > Privacy and look for the option described: “Do you want search engines outside of Facebook to link to your profile?”
Turn that off, and Clearview won’t be able to grab your images. That’s not the setting I would have expected to use, I confess, which makes me want to go through all of my social networks and rethink how the information I share with them flows out to the greater web.
Lock down your Facebook even more with these settings
Since we’re already here, it’s worth spending a few minutes wading through Facebook’s settings and making sure as much of your content is set to friends-only as possible. That includes changing “Who can see your future posts” to “friends,” using the “Limit Past Posts” option to change everything you’ve previously posted to friends-only, and making sure that only you can see your friends list—to prevent any potential scraping and linking that some third-party might attempt. Similarly, make sure only your friends (or friends of friends) can look you up via your email address or phone number. (You never know!)
You should then visit the “Timeline and Tagging” settings page and make a few more changes. That includes only allowing friends to see what other people post on your timeline, as well as posts you’re tagged in. And because I’m a bit sensitive about all the crap people tag me in on Facebook, I’d turn on the “Review” options, too. That won’t help your account from being scraped, but it’s a great way to exert more control over your timeline.
Finally, even though it also doesn’t prevent companies from scraping your account, pull up the “Public posts” section of Facebook’s settings page and limit who is allowed to follow you (if you desire). You should also restrict who can comment or like your public information, like posts or other details about your life you share openly on the service.
Once I fix Facebook, then what?
Here’s the annoying part. Were I you, I’d take an afternoon or evening and write out all the different places I typically share snippets of my life online. For most, maybe that’s probably a handful of social services: Facebook, Instagram, Twitter, YouTube, Flickr, et cetera.
Once you’ve created your list, I’d dig deep into the settings of each service and see what options you have, if any, for limiting the availability of your content. This might run contrary to how you use the service—if you’re trying to gain lots of Instagram followers, for example, locking your profile to “private” and requiring potential followers to request access might slow your attempts to become the next big Insta-star. However, it should also prevent anyone with a crafty scraping utility to mass-download your photos (and associate them with you, either through some fancy facial-recognition tech, or by linking them to your account).
Will you be able to fully prevent someone from finding a way to build the next big database of searchable images? Probably not. But the most important lesson we can take from situations like these—Clearview AI’s…well…existence—is that you should understand the nuances of the services you share your information to, and you should always use your settings to restrict access to your data in whatever way you feel most comfortable. Were I you, I’d err on the side of “let trusted people see what I do instead of anyone on the internet,” but that’s just me.