• setVeryLoud(true);@lemmy.ca
    8 days ago

    Gist:

    What’s new: The Northern District of California has granted summary judgment for Anthropic, holding that both the use of the copyrighted books for training and the print-to-digital format change were “fair use” (full order below box). However, the court also found that the pirated library copies that Anthropic collected could not be treated as training copies, and therefore the use of this material was not “fair”. The court also announced that it will hold a trial on the pirated copies and any resulting damages, adding:

    “That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft but it may affect the extent of statutory damages.”

      • setVeryLoud(true);@lemmy.ca
        8 days ago

        My interpretation was that AI companies can train on material they are licensed to use, but the courts have deemed that Anthropic pirated this material as they were not licensed to use it.

        In other words, if Anthropic bought the physical or digital books, it would be fine so long as their AI couldn’t spit it out verbatim, but they didn’t even do that, i.e. the AI crawler pirated the book.

        • devils_advocate@sh.itjust.works
          8 days ago

          Does buying the book give you license to digitise it?

          Does owning a digital copy of the book give you license to convert it into another format and copy it into a database?

          Definitions of “Ownership” can be very different.

          • booly@sh.itjust.works
            7 days ago

            > Does buying the book give you license to digitise it?
            >
            > Does owning a digital copy of the book give you license to convert it into another format and copy it into a database?

            Yes. That’s what the court ruled here. If you legally obtain a printed copy of a book you are free to digitize it or archive it for yourself. And you’re allowed to keep that digital copy, analyze and index it and search it, in your personal library.

            Anthropic’s practice of buying physical books, removing the bindings, scanning the pages, and digitizing the content while destroying the physical book was found to be legal, so long as Anthropic didn’t distribute that library outside of its own company.

          • VoterFrog@lemmy.world
            7 days ago

            It seems like a lot of people misunderstand copyright so let’s be clear: the answer is yes. You can absolutely digitize your books. You can rip your movies and store them on a home server and run them through compression algorithms.

            Copyright exists to prevent others from redistributing your work so as long as you’re doing all of that for personal use, the copyright owner has no say over what you do with it.

            You even have some degree of latitude to create and distribute transformative works, with a violation only occurring when you distribute something pretty damn close to a copy of the original. Some perfectly legal examples: create a word cloud of a book, analyze the tone of a news article to help you trade stocks, produce an image containing the most prominent color in every frame of a movie, or create a search index of the words found on all websites on the internet.

            You can absolutely do the same kinds of things an AI does with a work as a human.
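To make the word cloud example concrete, here is a minimal Python sketch of the word counting that sits behind one. The function name `word_frequencies` and the sample sentence are made up for illustration; the point is only that the output is a table of counts, not a copy of the text.

```python
from collections import Counter
import re


def word_frequencies(text: str, top_n: int = 3) -> list[tuple[str, int]]:
    """Return the top_n most common words in text with their counts."""
    # Lowercase the text and keep runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


# Hypothetical sample text, standing in for a book you own.
sample = "the cat sat on the mat and the dog sat too"
print(word_frequencies(sample, top_n=2))  # [('the', 3), ('sat', 2)]
```

The original sentence cannot be rebuilt from those counts, which is the sense in which this kind of analysis differs from redistributing the work.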

          • Enkimaru@lemmy.world
            8 days ago

            You can digitize the books you own. You do not need a license for that. And of course you could put that digital copy into a database, as databases are explicit exceptions from copyright law. If you want to go to the extreme: delete the first copy, so that you only have it in the database. However, AIs/LLMs are not based on databases but on neural networks. The original data gets lost when “learned”.

    • DerisionConsulting@lemmy.ca
      8 days ago

      Formatting thing: if you start a line in a new paragraph with four spaces, it assumes that you want to display the text as code and won’t line break.

      This means that the last part of your comment is a long line that people need to scroll to see. If you remove one of the spaces, or you remove the empty line between it and the previous paragraph, it’ll look like a normal comment.

      With an empty line of space:

      1 space - and a little bit of writing just to see how the text will wrap. I don’t really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

      2 spaces - and a little bit of writing just to see how the text will wrap. I don’t really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

      3 spaces - and a little bit of writing just to see how the text will wrap. I don’t really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

      4 spaces -  and a little bit of writing just to see how the text will wrap. I don't really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.
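The four-space rule described in this comment can be sketched in a few lines of Python. This is a simplified reading of the CommonMark indented-code-block rule and assumes the site's renderer follows it; the function name `indented_code_lines` is made up for illustration, and lists, tabs, and fenced blocks are deliberately ignored.

```python
def indented_code_lines(markdown: str) -> list[int]:
    """Return 0-based indices of lines that would render as indented code.

    Simplified rule: a non-blank line with 4+ leading spaces is code if it
    follows a blank line (or starts the text) or continues a code block.
    """
    code = []
    prev_blank = True   # the start of the text behaves like "after a blank line"
    prev_code = False
    for i, line in enumerate(markdown.split("\n")):
        if not line.strip():
            prev_blank = True   # a blank line alone does not end a code block
            continue
        indent = len(line) - len(line.lstrip(" "))
        is_code = indent >= 4 and (prev_blank or prev_code)
        if is_code:
            code.append(i)
        prev_blank = False
        prev_code = is_code
    return code


text = (
    "Intro paragraph.\n"
    "\n"
    "   three spaces\n"
    "\n"
    "    four spaces\n"
    "Intro again.\n"
    "    four spaces, no blank line"
)
print(indented_code_lines(text))  # [4]
```

Only the line with four spaces after an empty line is flagged. The same indentation directly under a paragraph is treated as a continuation of that paragraph, which matches the two fixes suggested above: remove a space, or remove the empty line.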