
Google on Friday invited members of
the web and artificial intelligence (AI) communities to help develop complementary protocols and standards governing AI-related content.
The move reflects increasing concerns that AI can
leverage web data in new ways, raising ethical challenges regarding fair use, privacy, and bias, as well as other potential unintended consequences.
Data from across the web is used to train the large
language models behind generative AI tools, raising questions about how much and what kind of a publisher's data and content can be used before that use infringes on copyright.
Google is calling on publishers, academics, civil society groups, and AI developers to join a public discussion on developing new protocols and ethical guidelines. Google
will initially organize the discussion, serving as the hub and disseminating information.
"We believe it’s time for the web and AI communities to
explore additional machine-readable means for web publisher choice and control for emerging AI and research use cases," Danielle Romain, vice president of trust at Google, wrote in a Google blog post.
This is not the first time Google has spearheaded such an effort. Nearly 30 years ago, it
helped organize the community-developed web standard robots.txt, which has proven to be a "transparent" way for web publishers to control how search engines crawl the content they
own.
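For readers unfamiliar with how robots.txt works in practice, the sketch below shows a crawler checking a publisher's rules before fetching a page. It uses Python's standard-library `urllib.robotparser`. The rules, the crawler names `ExampleSearchBot` and `OtherBot`, and the `example.com` URLs are hypothetical and chosen only for illustration; they do not describe any real site or any protocol Google has proposed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve (illustrative only).
ROBOTS_TXT = """\
User-agent: ExampleSearchBot
Disallow: /drafts/

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    checks = [
        ("ExampleSearchBot", "https://example.com/drafts/post.html"),
        ("ExampleSearchBot", "https://example.com/blog/post.html"),
        ("OtherBot", "https://example.com/private/notes.html"),
    ]
    for agent, url in checks:
        verdict = "allowed" if is_allowed(rules, agent, url) else "blocked"
        print(f"{agent} -> {url}: {verdict}")
```

In this sketch a crawler with its own named section follows only that section, while every other crawler falls back to the `*` rules. Compliance is voluntary: robots.txt states the publisher's preference, and it is up to each crawler to honor it.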
Google said it's important that web publishers -- even personal websites -- have choice and control over the content written and posted on their sites, but that recent changes in search
and advancements in technology require new discussions on the flow and use of content.
There have already been numerous lawsuits and accusations alleging the theft of intellectual
property and content to train large language models for AI, specifically generative AI platforms.
The AI Web Publishers' Mailing List is intended for members of the web and AI communities wanting to participate in and receive information on
the process being developed for machine-readable means of giving web publishers choice and control.
The invitation follows announcements earlier this year of new AI products and
principles that aim to ensure platforms are ethical, fair, transparent, and accountable.