Your website can now opt out of training Google’s Bard and future AIs

Large language models are trained on all kinds of data, most of which, it seems, was collected without anyone’s knowledge or consent. Now you have a choice whether to allow your web content to be used by Google as material to feed its Bard AI and any future models it decides to make.

It’s as simple as adding a rule for “User-Agent: Google-Extended” with a “Disallow” directive to your site’s robots.txt, the document that tells automated web crawlers what content they’re able to access.
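For the curious, here is a minimal sketch of what such a rule might look like and how you could sanity-check it locally. The robots.txt contents, the `is_allowed` helper, and the example.com URL are illustrative assumptions, not anything Google published; the check uses Python's standard-library parser, which only approximates how a real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block Google-Extended site-wide and
# leave every other crawler (including Googlebot) unrestricted.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration; it parses the text in memory
    rather than downloading a live robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some-article"  # placeholder URL
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, page))
```

With a rule like this, the AI-training token is refused while ordinary search indexing is left alone, which is the distinction the new setting is meant to offer.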

Though Google claims to develop its AI in an ethical, inclusive way, the use case of AI training is meaningfully different from indexing the web.

“We’ve also heard from web publishers that they want greater choice and control over how their content is used for emerging generative AI use cases,” the company’s VP of Trust, Danielle Romain, writes in a blog post, as if this came as a surprise.

Interestingly, the word “train” does not appear in the post, although that is very clearly what this data is used for: as raw material to train machine learning models.

Instead, the VP of Trust asks you whether you really don’t want to “help improve Bard and Vertex AI generative APIs” — “to help these AI models become more accurate and capable over time.”

See, it’s not about Google taking something from you. It’s about whether you’re willing to help.

On one hand that is perhaps the best way to present this question, since consent is an important part of this equation and a positive choice to contribute is exactly what Google should be asking for. On the other, the fact that Bard and its other models have already been trained on truly enormous amounts of data culled from users without their consent robs this framing of any authenticity.

The inescapable truth borne out by Google’s actions is that it exploited unfettered access to the web’s data, got what it needed, and is now asking permission after the fact in order to look like consent and ethical data collection are a priority for it. If they were, we would have had this setting years ago.

Coincidentally, Medium just announced today that it would be blocking crawlers like this universally until there’s a better, more granular solution. And they aren’t the only ones by a long shot.

Rinsu Ann Easo

