{"id":9515,"date":"2025-03-03T14:02:46","date_gmt":"2025-03-03T14:02:46","guid":{"rendered":"https:\/\/journals.law.unc.edu\/ncjolt\/?p=9515"},"modified":"2025-03-03T14:02:46","modified_gmt":"2025-03-03T14:02:46","slug":"to-torrent-or-not-to-torrent-that-is-the-question","status":"publish","type":"post","link":"https:\/\/journals.law.unc.edu\/ncjolt\/blogs\/to-torrent-or-not-to-torrent-that-is-the-question\/","title":{"rendered":"To Torrent or Not to Torrent?: That is The Question"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"612\" height=\"423\" src=\"https:\/\/journals.law.unc.edu\/ncjolt\/wp-content\/uploads\/sites\/4\/2025\/03\/unnamed-1.jpg\" alt=\"\" class=\"wp-image-9516\" srcset=\"https:\/\/journals.law.unc.edu\/ncjolt\/wp-content\/uploads\/sites\/4\/2025\/03\/unnamed-1.jpg 612w, https:\/\/journals.law.unc.edu\/ncjolt\/wp-content\/uploads\/sites\/4\/2025\/03\/unnamed-1-300x207.jpg 300w\" sizes=\"(max-width: 612px) 100vw, 612px\" \/><\/figure>\n\n\n\n<p>The controversy around A.I. continues with the publication of <a href=\"https:\/\/arstechnica.com\/tech-policy\/2025\/02\/meta-torrented-over-81-7tb-of-pirated-books-to-train-ai-authors-say\/\">recently unsealed emails from Meta<\/a> in a copyright case accusing the company of illegally training its A.I. models on pirated books. The emails show that Meta torrented \u201c<a href=\"https:\/\/arstechnica.com\/tech-policy\/2025\/02\/meta-torrented-over-81-7tb-of-pirated-books-to-train-ai-authors-say\/\">at least 81.7 terabytes of data across different shadow libraries<\/a> through the site <a href=\"https:\/\/annas-archive.org\/activity\">Anna\u2019s Archives<\/a>.\u201d<\/p>\n\n\n\n<p>The complaint in said case, <a href=\"https:\/\/dockets.justia.com\/docket\/california\/candce\/3:2023cv03417\/415175\"><em>Kadrey v. Meta Platforms, Inc.<\/em><\/a>, is a reminder that companies training A.I. 
models need vast amounts of data, but acquiring that data risks massive copyright liability. \u201cVastly smaller acts of data piracy\u2014Just .008 percent of the amount of copyrighted works Meta pirated\u2013have resulted in Judges referring the conduct to the US Attorney\u2019s office for criminal investigation.\u201d<\/p>\n\n\n\n<p>As a Meta employee stated, \u201c<a href=\"https:\/\/www.tomshardware.com\/tech-industry\/artificial-intelligence\/meta-staff-torrented-nearly-82tb-of-pirated-books-for-ai-training-court-records-reveal-copyright-violations\">[t]orrenting from a corporate laptop doesn\u2019t feel right<\/a>.\u201d Yet the word \u201ctorrent\u201d may seem alien to those unfamiliar with downloading terabytes of data or without a background in computer science.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>The plaintiffs allege that Meta employees were aware of the illegal, copyright-infringing behavior they were engaging in, and continued anyway<\/p><\/blockquote>\n\n\n\n<p>As explained by <a href=\"https:\/\/www.techradar.com\/vpn\/what-is-a-torrent\">TechRadar<\/a>, \u201c[T]orrenting uses a peer-to-peer model for file transfer. Instead of downloading a file from a single source in one continuous stream, torrenting breaks the file into smaller pieces that can be rapidly distributed between peers in the \u2018swarm.\u2019 Each peer in a torrent swarm is responsible for uploading and downloading parts of the file simultaneously. Instead of file transfer being limited to the upload speed of the server as in the client-server model, each peer in the swarm can use their full bandwidth to distribute the file.\u201d Further, without the use of a virtual private network (VPN), the torrenter\u2019s IP address is <a href=\"https:\/\/arstechnica.com\/tech-policy\/2025\/02\/meta-torrented-over-81-7tb-of-pirated-books-to-train-ai-authors-say\/\">visible<\/a>. 
Some of Meta\u2019s emails suggest that its employees may have known this and made an effort to hide the IP address of Meta\u2019s network.<\/p>\n\n\n\n<p>These emails show, the plaintiffs allege, that Meta employees were aware of the illegal, copyright-infringing behavior they were engaging in, and continued anyway. <a href=\"https:\/\/arstechnica.com\/tech-policy\/2025\/02\/meta-torrented-over-81-7tb-of-pirated-books-to-train-ai-authors-say\/\">Meta maintains<\/a> that its use of the data constitutes \u201cfair use\u201d under copyright law.<\/p>\n\n\n\n<p>What this complaint, and the litigation more generally, highlights is a \u201cdo now, ask for forgiveness later\u201d mentality among companies training <a href=\"https:\/\/www.digitalmusicnews.com\/2024\/08\/15\/eric-schmidt-ai-companies-commentary\/\">large language models<\/a>. Said models, put very simply, need access to vast amounts of data in order <a href=\"https:\/\/arstechnica.com\/science\/2023\/07\/a-jargon-free-explanation-of-how-ai-large-language-models-work\/\">to write comprehensive responses<\/a> to inquiries. The more data the models can pull from, the reasoning goes, the more sophisticated their answers can be.<\/p>\n\n\n\n<p>The new frontier of A.I. and the possibility of substantial financial returns for the firms that develop the best models both encourage a \u201ctake now and pay settlements for copyright violations later\u201d philosophy. This is to say nothing of Meta using its own customers\u2019 data \u201c<a href=\"https:\/\/web.swipeinsight.app\/posts\/meta-s-new-notification-on-ai-training-content-designed-to-minimize-user-objections\">to train models by default, without explicit consent<\/a>.\u201d<\/p>\n\n\n\n<p>As the Meta case unfolds, there are questions as to the precedent a case like this could set for future A.I. data use and training. 
Some commentators have argued that using large data sets in this way may fall under the \u201c<a href=\"https:\/\/www.eff.org\/deeplinks\/2025\/02\/copyright-and-ai-cases-and-consequences\">fair use<\/a>\u201d doctrine of copyright law. Some courts have suggested that this could be true; a <a href=\"https:\/\/www.eff.org\/deeplinks\/2025\/02\/copyright-and-ai-cases-and-consequences\">Delaware court held<\/a>, before altering its opinion, that \u201c[An A.I.] developer used copyrighted works only \u2018as a step in the process of trying to develop a \u201cwholly new,\u201d albeit competing, product \u2026 that\u2019s \u2026 transformative intermediate copying, [or fair use].\u2019\u201d<\/p>\n\n\n\n<p>Regardless of how much weight the fair use doctrine will carry in the training of A.I. models, the facts alleged <a href=\"https:\/\/redact.dev\/blog\/meta-ai-zuckerberg-pirated-books-training-controversy\">against Meta likely do not point<\/a> in the direction of Meta winning the case. Indeed, the case may end in a settlement, as have other cases dealing with similar <a href=\"https:\/\/www.eff.org\/deeplinks\/2025\/02\/copyright-and-ai-cases-and-consequences\">concerns<\/a>. A.I. proponents, it\u2019s fair to say, fear a copyright infringement case being appealed to higher courts on the chance that they lose and the entire A.I. 
development complex suddenly stares <a href=\"https:\/\/www.businessinsider.com\/generative-ai-copyright-meta-google-openai-a16z-microsoft?op=1\">down the barrel of huge payments<\/a> necessitated by potentially illegal data piracy and torrenting.<\/p>\n\n\n\n<p>Regardless of whether Meta or others like <a href=\"https:\/\/www.eff.org\/deeplinks\/2025\/02\/copyright-and-ai-cases-and-consequences\">Google<\/a> settle their cases out of court, the Meta controversy shows that, sooner or later, a copyright owner will likely challenge these companies to the bitter end, and commentators, or counsel for firms like Meta, can only speculate as to who will win and who will pay.<\/p>\n\n\n\n<p><strong>Brett R. Goble<\/strong><\/p>\n\n\n\n<p>Brett Rodney Goble is a 2L at the University of North Carolina School of Law. He graduated from Centre College with a Bachelor of Arts in Political Science in 2022. He enjoys chess, cooking, and distance running.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The controversy around A.I. continues with the publication of recently unsealed emails from Meta in a copyright case accusing the company of illegally training its A.I. models on pirated books. 
The emails show that Meta torrented \u201cat least 81.7 terabytes of data across different shadow libraries through the site Anna\u2019s Archives.\u201d The complaint in said case, <a href=\"https:\/\/journals.law.unc.edu\/ncjolt\/blogs\/to-torrent-or-not-to-torrent-that-is-the-question\/\" class=\"more-link\">&#8230;<\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[51],"tags":[416,179,425,163],"_links":{"self":[{"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/posts\/9515"}],"collection":[{"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/comments?post=9515"}],"version-history":[{"count":1,"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/posts\/9515\/revisions"}],"predecessor-version":[{"id":9517,"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/posts\/9515\/revisions\/9517"}],"wp:attachment":[{"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/media?parent=9515"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/categories?post=9515"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/journals.law.unc.edu\/ncjolt\/wp-json\/wp\/v2\/tags?post=9515"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}