As large language models (LLMs) grow more powerful, they raise a growing number of concerns over their potential implications, including copyright infringement, now that Llama 3.1 has been shown to recall nearly half of a Harry Potter book.
Specifically, a recent study estimated that Meta’s artificial intelligence (AI) model Llama 3.1 70B had memorized as much as 42% of the first Harry Potter book, J.K. Rowling’s Harry Potter and the Sorcerer’s Stone (also known as The Philosopher’s Stone), well enough to reproduce 50-token excerpts, per a report on June 12.
Harry Potter book is just the start
By comparison, Llama 1 65B, an earlier model of similar size that the technology behemoth launched in February 2023, had memorized only 4.4% of the same book. The jump raises concerns over the potential legal liability that further increases in memorization could bring.
At the same time, the researchers found that Llama 3.1 70B was more likely to reproduce popular books, like The Sorcerer’s Stone, J.R.R. Tolkien’s The Hobbit, and George Orwell’s 1984, than less well-known ones. By contrast, it memorized only 0.13% of Sandman Slim, a 2009 novel by author Richard Kadrey.
Still, Llama 3.1 70B memorized more than any of the other models examined. According to James Grimmelmann, a Cornell law professor who has collaborated with the paper’s authors:
“There are really striking differences among models in terms of how much verbatim text they have memorized.”
Meanwhile, Meta has launched a newer open-source model, Llama 4, which researchers have yet to test for potential copyright concerns. The company is also reportedly betting on a $15 billion stake in Scale AI to supercharge its “superintelligence” ambitions.
Earlier this year, Meta also announced a major breakthrough with ‘mind-reading’ AI. According to the company, the AI can decode and reconstruct sentences solely from brain signals, a step toward realizing advanced machine intelligence (AMI).