Let’s Talk About Google’s Data Feast
Google is at it again. Remember when we thought our public data was somewhat safe from the claws of this search engine behemoth? It seems that Google is now officially admitting it’s dipping into the open web buffet and feasting on whatever public data it pleases, courtesy of a nifty update to their privacy policy. Oh, and the kicker? This publicly available data is going to train their AI systems. Sweet, isn’t it?
The Privacy Policy Rewrite
And so, on the fine day of July 1st, 2023, Google decided to share a fresh slice of their privacy policy: “Google uses information to improve our services and to develop new products, features, and technologies that benefit our users and the public.” Oh, it sounds so altruistic, doesn’t it? But let’s peel back the layers and decode what that means. Basically, any information you post publicly on the web could be fair game for Google’s AI to sharpen its teeth on.
The Devil is in the Details
Here’s where it gets interesting. The updated policy gives us a bit more clarity on which services will be having a field day with our data. The term “AI models” has replaced the somewhat narrower “language models”. The net is wider now, my friends, and Google has more room to swing its AI arms and gulp down more than just your casual text data. All of this is nicely tucked away in a section called “Your Local Information”, and you’ve got to click through to find the gold.
A Sticky Wicket
Now, Google’s updated policy is clear that they’ll use “publicly available information” to train their AI machinery. But it’s not so clear how they plan to stay on the right side of copyright law. I mean, a good chunk of the web has terms of use that explicitly forbid data collection for AI training. And let’s not even start on the labyrinth that is GDPR. How Google plans to navigate these tricky waters will be quite the spectacle to watch.
The Big Picture
It’s not just Google that’s causing a stir in the AI world. The secrecy surrounding training data for AI systems like OpenAI’s GPT-4 is raising eyebrows. From social media posts to copyrighted content, the line between what is and isn’t fair game for these AI titans is blurry. It’s not surprising that we’re seeing a surge in lawsuits and a tightening of data regulations. Plus, there are ethical questions about the human labor involved in sorting and handling this vast data ocean.
The Web Reacts
Meanwhile, Google is getting heat from all sides. Gannett, the largest newspaper publisher in the US, is suing Google, claiming that advances in AI have helped it monopolize the digital ad market. Social platforms like Twitter and Reddit are clamping down on free data harvesting by restricting access to their content, causing a bit of a ruckus among users.
So, there we are. Google’s updated privacy policy is a fresh reminder that the web is still the Wild West when it comes to data. As for where this all leads, well, that’s a story for another day.