Meta Accused of Using 81.7TB of Pirated Books to Train AI Models

Agencies Ghacks
Feb 7, 2025

Meta is facing allegations of downloading over 81.7 terabytes of pirated books to train its artificial intelligence models, according to newly unsealed court documents. These revelations have emerged in a copyright infringement lawsuit filed by authors, including Sarah Silverman, Richard Kadrey, and Christopher Golden, who claim that Meta utilized their works without permission to develop its AI technologies.

The unsealed emails suggest that Meta's executives, including CEO Mark Zuckerberg, were aware of and approved the use of data from Library Genesis (LibGen), a well-known repository of pirated books, for training their AI models. Internal communications indicate that Meta employees discussed strategies to obscure the origins of the data, such as removing explicit copyright markings and altering metadata, to mitigate legal risks.

Meta has defended its actions by asserting that training AI models on publicly available datasets constitutes "fair use" under copyright law. The company has filed motions to dismiss the lawsuit, arguing that its use of the data is transformative and does not infringe upon the authors' rights.

This case is part of a broader wave of legal challenges against tech companies accused of using copyrighted materials without authorization to train AI systems. The outcomes of these lawsuits could have significant implications for the development of AI technologies and the protection of intellectual property rights in the digital age.

Comments

  1. Carl said on February 8, 2025 at 1:42 pm

    Makes
    Everything
    Their
    Asset

  2. Shiva said on February 7, 2025 at 10:24 pm

    Well, if it were a real AI you could say that it is getting closer and closer to being ‘human’… I have to say that most of the researchers I know are aware of LibGen or Sci-Hub, but as funny as this news seems, the sadness and hypocrisy of Meta remains.
    The last video of that multimillionaire kid justifying bullshit and sucking up to the new president was enough for me. I apologize for the language, but when it takes, it takes.

    1. Shiva said on February 8, 2025 at 10:17 pm

      @Boris
      That our personal data represent the new “gold dust” is not even up for discussion except to find solutions. Well, one can ask the Meta AI since it has probably memorized all the available university books that talk about it. :-)
      Personally, I have read The Age of Surveillance Capitalism by Shoshana Zuboff. I suppose it can be contextualized that, granted that we are all weak-minded, Meta is certainly not on the same level as a vacuum cleaner that sends home floor plan data (see Cambridge Analytica).

      @Allwynd
      I do not participate in American politics because I am European, and I try not to make the mistake of comparing Democrats and Republicans to our parties of the Right or Left and consequently cheering in the stadium. Whether it is this President or another President the sadness is that I see a billionaire kid who could afford at least some consistency other than just making money.
      And I close, which is not the place, I am instead very concerned when another billionaire social owner throws slogans like MEGA back home at me. Sadly, I am still fairly young to say ‘I hope I die before the future that lies ahead’.

      1. boris said on February 10, 2025 at 8:29 pm

        @Shiva

        “Whether it is this President or another President, the sadness is that I see a billionaire kid who could afford at least some consistency other than just making money.”

        There is a reason for this. Unlike Microsoft/Google/Nvidia/Apple/Amazon, Meta does not really have a monopoly or any kind of staying power in anything. Theoretically, within a year, a couple of new social networks could blow up and suck up every second user Meta has right now. Even right now, X is picking up most of the conservatives that were hesitant to sign up for social networks before. And Bluesky is picking up all the liberals that used Twitter before.

        Meta has to offer something besides new AI bots and a failed metaverse. And they have to show that they are abandoning censorship, because that is what most people want right now, at least in the US. When people want censorship back, Meta will implement it again. Unlike other big tech companies, Meta is always one step from irrelevancy, so they fight it with populism. The billionaire kid could live really well for the rest of his life, but as soon as he grows a backbone, his company will start disappearing.

        “And I close, which is not the place, I am instead very concerned when another billionaire social owner throws slogans like MEGA back home at me.”

        While I agree with most of what he’s saying or doing domestically, it’s not his place to go to other countries to tell them what to do. Business might be global, but politics is local. Just make sure that X is open to all opinions worldwide, but leave foreign politics alone.

    2. boris said on February 8, 2025 at 11:01 am

      Meta got caught, but they are all doing the same stuff. I recently saw a video where the “Diaspora” network described that when they analyzed their traffic, 75% of it was from various AI bots scraping the whole network. And that did not include regular bots from search engines, just bots from AI companies. All of them are mining every bit of the Internet. This is not a Meta problem, this is an industry problem. If you blame just one company, you are missing the point.

      And if you believe this behavior is just an AI industry problem, you are missing the point again. All car manufacturers and mobile providers sell your GPS data to the highest bidder. And the most important buyers are the car insurance industry, repo companies, and governments.

      There are plenty of other examples, but the point is that Meta is just one of thousands of companies that do this. This kind of behavior on the Internet, or any other computer network, is very normalized by now.

    3. Allwynd said on February 8, 2025 at 10:01 am

      Were you OK with the billionaire kid sucking up to the previous president instead? That he is greedy and spineless, he is, but I’m curious if you don’t like the current president, but liked the previous one instead.

      1. boris said on February 9, 2025 at 6:27 am

        Martin asked me not to get too political before, so I am not going to go deep. “The billionaire kid” will suck up to anybody who is in power (or whoever he feels has the real power). He sucked up to the previous president, and now he sucks up to the new one. Like any other big corporation founder/owner, he will do anything to be in favor with the party in power. This guy is not special. He is just the most eager to please.

  3. Anonymous said on February 7, 2025 at 4:10 pm

    In contrast to their platforms' ToS, it's hypocrisy at its best.
