Artificial intelligence developers rely heavily on illegally scraping copyrighted material from news publications and journalists to train their models, a news industry group has claimed, Cointelegraph reported.
On October 30, 2023, the News Media Alliance (NMA) published a 77-page white paper, which is expected to accompany a submission to the United States Copyright Office. The white paper claims that the data sets used to train AI models contain more news publisher content than content from other sources.
“Many generative AI developers have chosen to scrape publisher content without permission and use it for model training and in real-time to create competing products,” NMA explained.
Furthermore, the NMA recommends that the Copyright Office adopt measures to scrap protected content from third-party websites, Cointelegraph concluded.
(With insights from Cointelegraph)