This October, the Internet Archive’s Wayback Machine is projected to hit a once-in-a-generation milestone: 1 trillion web pages archived. That’s one trillion memories, moments, and movements—preserved for the public, forever.
We’ll be commemorating this historic achievement on October 22, 2025, with a global event: a party at our San Francisco headquarters and a livestream for friends and supporters around the world. More than a celebration, it’s a tribute to what we’ve built together: a free and open digital library of the web.
Join us in marking this incredible milestone. Together, we’ve built the largest archive of web history ever assembled. Let’s celebrate this achievement—in San Francisco and around the world—on October 22.
Here’s how you can take part:
1. RSVP: Sign up now to be the first to know when registration opens for our in-person event and livestream. RSVP now
2. Support the Internet Archive: Help us continue preserving the web for generations to come. Donate today!
3. Share Your Story: What does the web mean to you? How has the Wayback Machine helped you remember, research, or recover something important? Submit your story
Let’s work together toward October 22—a day to look back, share stories, and celebrate the web we’ve built and preserved together.
A recent legal decision has reaffirmed the power of fair use in the digital age, and it’s a big win for libraries and the future of public access to knowledge.
On June 24, 2025, Judge William Alsup of the United States District Court for the Northern District of California ruled in favor of Anthropic, finding that the company’s use of purchased copyrighted books to train its AI model qualified as fair use. While the case centered on emerging AI technologies, the implications of the ruling reach much further—especially for institutions like libraries that depend on fair use to preserve and provide access to information.
What the Decision Says
In the case, authors claimed that Anthropic infringed copyright by including copyrighted books in its AI training dataset. Some of those books were acquired in physical form and then digitized by Anthropic to make them usable for machine learning.
The court sided with Anthropic on this point, holding that the company’s “format-change from print library copies to digital library copies was transformative under fair use factor one” and therefore constituted fair use. It also ruled that using those digitized copies to train an AI model was a transformative use, again qualifying as fair use under U.S. law.
This part of the ruling strongly echoes previous landmark decisions, especially Authors Guild v. Google, which upheld the legality of digitizing books for search and analysis. The court explicitly cited the Google Books case as supporting precedent.
While we believe the ruling is headed in the right direction—recognizing both format shifting and transformative use—the court factored in destruction of the original physical books as part of the digitization process, a limitation we believe could be harmful if broadly applied to libraries and archives.
What It Means for Libraries
Libraries rely on fair use every day. Whether it’s digitizing books, archiving websites, or preserving at-risk digital content, fair use enables libraries to fulfill our public service missions in the digital age: making knowledge available, searchable, and accessible for current and future generations.
This decision reinforces the idea that copying for non-commercial, transformative purposes—like making a book searchable, training an AI, or preserving web pages—can be lawful under fair use. That legal protection is essential to modern librarianship.
In fact, the court’s analysis strengthens the legal groundwork that libraries have relied on for years. As with the Google Books decision, it affirms that digitization for research, discovery, and technological advancement can align with copyright law, not violate it.
Looking Ahead
This ruling is an important step forward for libraries. It reaffirms that fair use continues to adapt alongside new technologies, and that the law can recognize public interest in access, preservation, and innovation.
As we navigate a rapidly changing technological landscape, it’s more important than ever to defend fair use and support the institutions that bring knowledge to the public. Libraries are essential infrastructure for an informed society, and legal precedents like this help ensure they can continue their vital work in the digital age.
Here’s a resource sharing tip for our community of librarians:
RapidILL members have an option to include the Internet Archive as a potential supplier for their borrowing requests. If you are interested in providing your users with access to the Internet Archive’s collections through your RapidILL workflows, please complete this form.
How is knowledge created, shared, and preserved in the digital age—and what forces are shaping its future?
We’re thrilled to announce the launch of Future Knowledge, a new podcast from the Internet Archive and Authors Alliance. Hosted by Chris Freeland, librarian at the Internet Archive, and Dave Hansen, executive director of Authors Alliance, the series brings together authors, librarians, policymakers, technologists, and artists to explore how knowledge, creativity, and policy intersect in today’s fast-changing world.
In each episode, an author discusses their book or publication and the big ideas behind it—paired with a thought-provoking conversation partner who brings a fresh perspective from the realms of policy, technology, libraries, or the arts.
We’re kicking off the podcast with a double feature—two episodes tackling copyright history and AI’s global impact:
Episode 1: The Copyright Wars
Historian Peter Baldwin joins copyright scholar Pamela Samuelson to unpack The Copyright Wars—a sweeping look at 300 years of trans-Atlantic copyright battles. From 18th-century publishing monopolies to today’s clashes between Big Tech, libraries, and the entertainment industry, this conversation reveals how history can illuminate the future of intellectual property in a digital world.
Episode 2: Copyright, AI, and Great Power Competition
Authors Joshua Levine and Tim Hwang sit down with Lila Bailey to discuss Copyright, AI, and Great Power Competition. Together they explore how artificial intelligence is transforming copyright law—and how global powers are using IP policy as a strategic tool in the race for technological dominance.
Whether you’re an author thinking about how to share your work, a librarian navigating digital access, or a curious listener exploring how knowledge shapes our world, Future Knowledge is for you.
Ever wonder how government documents, once locked away on tiny sheets of microfiche, become searchable and accessible online? Now you can see it happen in real time.
Today, the Internet Archive has launched a livestream from our microfiche scanning center (https://www.youtube.com/live/aPg2V5RVh7U), offering a behind-the-scenes look at the meticulous work powering Democracy’s Library—a global initiative to make government publications freely available to the public.
“This livestream shines a light on the unsung work of preserving the public record, and the critical infrastructure that makes democracy searchable,” said Brewster Kahle, founder of the Internet Archive. “Transparency can’t be passive—it must be built, maintained, and seen. That’s what this livestream is all about.”
Watch the livestream now: https://www.youtube.com/live/aPg2V5RVh7U
What You’ll See
The livestream features five active microfiche digitization stations, with a close-up view of one in action. Operators feed microfiche cards beneath a high-resolution camera, which captures multiple detailed images of each sheet. Software stitches these images together, after which other team members use automated tools to identify and crop up to 100 individual pages per card.
Each page is then processed, made fully text-searchable, and added to the Internet Archive’s public collections, complete with metadata, so that researchers, journalists, and the general public can explore and download them freely through Democracy’s Library.
📅 Live activity occurs Monday–Friday, 7:30am–3:30pm U.S. Pacific Time (UTC−8, or UTC−7 during daylight saving time), except U.S. holidays, with a second shift coming soon.
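For the technically curious, here is a minimal sketch of how the page-cropping step described above might work once a card has been stitched into a single image. It assumes the pages sit in a uniform grid (for example 7 rows by 14 columns, a common 98-frame microfiche layout). The function name `grid_crop_boxes`, the margin parameter, and the image dimensions are hypothetical, and the actual tools used at the scanning center may detect page boundaries quite differently.

```python
from typing import List, Tuple

# (left, upper, right, lower) in pixels
Box = Tuple[int, int, int, int]


def grid_crop_boxes(width: int, height: int, rows: int, cols: int,
                    margin: int = 0) -> List[Box]:
    """Return one crop box per page for a stitched card image.

    Assumption: pages are laid out in a uniform rows x cols grid.
    `margin` trims that many pixels from every side of each cell.
    """
    if width <= 0 or height <= 0 or rows <= 0 or cols <= 0:
        raise ValueError("width, height, rows, and cols must be positive")
    if margin < 0:
        raise ValueError("margin must not be negative")

    boxes: List[Box] = []
    for row in range(rows):
        for col in range(cols):
            # Integer division keeps cell edges aligned even when the
            # image size is not an exact multiple of the grid size.
            left = col * width // cols + margin
            upper = row * height // rows + margin
            right = (col + 1) * width // cols - margin
            lower = (row + 1) * height // rows - margin
            if right <= left or lower <= upper:
                raise ValueError("margin is too large for the cell size")
            boxes.append((left, upper, right, lower))
    return boxes


if __name__ == "__main__":
    # Hypothetical stitched image of 14000 x 7000 pixels.
    page_boxes = grid_crop_boxes(14000, 7000, rows=7, cols=14, margin=20)
    print(len(page_boxes), "crop boxes; first:", page_boxes[0])
```

Each box uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so the result could be passed to an imaging library directly. A 7-by-14 grid yields 98 boxes, within the limit of up to 100 pages per card.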
What Is Microfiche?
Microfiche is a flat sheet of film that holds dozens—sometimes hundreds—of miniaturized document images. It’s been a common format for archiving newspapers, court documents, government records, and more since the 20th century.
Why Is Microfiche Digitization Important?
“Materials on microfiche are an important part of our country’s history, but right now they are often only available online from expensive databases. We are excited that this project will digitize court documents from our collection and make them freely available to everyone,” said Leslie Street, Director of the Wolf Law Library at William & Mary Law School.
“Thousands of documents and reports from across the federal government were distributed in microfiche to Federal Depository Library Program (FDLP) libraries around the country from 1970 – 2022. While important for space-saving and preservation, microfiche has long been problematic for public access. So this digitization work of Democracy’s Library is incredibly important and will unlock free access to this essential historic public domain corpus to readers and researchers around the world!” noted James R. Jacobs, US government information librarian and co-author of the recently published book, Preserving Government Information: Past, Present, and Future.
Democracy’s Library is the Internet Archive’s ambitious project to collect, digitize, and provide free public access to the world’s government publications. From environmental impact reports to court decisions, these materials are essential for accountability, scholarship, and civic engagement.
The microfiche collections that will be digitized in this process include US GPO documents, Canadian government documents, US court documents, and UN publications. We are always looking for more collections to be donated.
Meet the People Behind the Work
From left: Internet Archive’s digital librarian, Brewster Kahle, with microfiche scanning operators Dylan, Louis, Elijah, Avery, and Fernando.
This digitization livestream was brought to life by Sophia Tung, appmaker & designer behind the viral robotaxi depot livestream on YouTube.
The digitization is overseen by scanning operators who are trained to handle physical library materials and digitization equipment.
Thanks also to Internet Archive staff who assisted this project, including CR Saikley, Merlijn Wajer, Brewster Kahle, Derek Fukumori, Jude Coelho, Anastasiya Smith, Jonathan Bloom, Bas Kloosterman, Andrea Mills, Richard Greydanus, Louis Brizuela, Carla Igot Bordador, and Ria Gargoles.
Thanks to Our Partners
Thank you to Wolf Law Library at the William & Mary Law School, University of Alberta, and Free Law Project for donating microfiche and helping advise this project.
If your library has microfiche or other materials to donate to the Internet Archive, please learn more about donating materials for preservation and digitization.
Support the Work
Preserving and digitizing these fragile, analog records is resource-intensive—and deeply worthwhile. Donate today to support the Internet Archive and Democracy’s Library.
Enjoy the livestream! Thank you for helping us preserve history and protect access to knowledge.
Inside the Internet Archive’s San Francisco headquarters, you’ll find racks of servers preserving humanity’s digital memory — from old websites to disappearing government data, books to historic videotapes.
“We are a digital library for our times — and hopefully, for all times,” says Mark Graham, director of the Wayback Machine.
But preserving access to information isn’t always easy. From political pressure to digital vanishing acts, the work of saving knowledge requires both care and courage.
In a time when websites can be taken down overnight — from climate change pages to stories celebrating diversity — the Wayback Machine ensures they’re not lost forever.
Former Air Force engineer Jessica Peterson, whose achievements were erased from the live web:
“I didn’t know [the Wayback Machine] existed… It gave me some relief.”
Whether you’re a researcher, student, journalist, or citizen — our goal is the same: Universal access to all knowledge.
If you value a free and open internet, watch this video. Then explore the Wayback Machine: https://web.archive.org/
A coalition of major record labels has filed a lawsuit against the Internet Archive—demanding $700 million for our work preserving and providing access to historical 78rpm records. These fragile, obsolete discs hold some of the earliest recordings of a vanishing American culture. But this lawsuit goes far beyond old records. It’s an attack on the Internet Archive itself.
This lawsuit is an existential threat to the Internet Archive and everything we preserve—including the Wayback Machine, a cornerstone of memory and preservation on the internet.
At a time when digital information is disappearing, being rewritten, or erased entirely, the tools to preserve history must be defended—not dismantled.
This isn’t just about music. It’s about whether future generations will have access to knowledge, history, and culture.
The Internet Archive is proud to join in celebrating a major milestone in the preservation of global cultural heritage: documents related to the history of slavery in Aruba have been officially added to UNESCO’s Memory of the World (MoW) International Register. The digitized documents have been preserved and are accessible online through the Coleccion Aruba and the Internet Archive.
These newly recognized documents are held by the National Archives of Aruba (ANA) and the National Library of Aruba (BNA). They offer crucial insight into the lives of enslaved people and their descendants in Aruba, helping to illuminate a shared painful past and its continuing impact on the present.
The nomination was prepared collaboratively by the Aruba National Committee for UNESCO’s Memory of the World Program (MoW-AW), UNESCO Aruba, ANA, and BNA. With the registration now official, these documents are not only globally recognized as having international significance—they are also more accessible than ever before.
The historical materials are available online through the Coleccion Aruba digital heritage site, as well as on the Internet Archive, supporting the goals of open access for schools, researchers, and the general public. This achievement underscores the importance of digitization and long-term preservation to ensure that future generations can continue to learn from these vital records.
The Internet Archive congratulates MoW-AW, UNESCO Aruba, the National Archives and National Library of Aruba, and their partners in Curaçao, Sint Maarten, Suriname, and the Netherlands on this historic achievement.
The internet is a living, breathing space—constantly growing, changing, and, unfortunately, disappearing. Important articles get taken down. Research papers become inaccessible. Historical records vanish. When content disappears, we lose pieces of our shared knowledge.
That’s where the Wayback Machine comes in. With the Wayback Machine’s Save Page Now tool, you have the power to help preserve the web in real time.
Why It Matters:
Prevent Link Rot: Keep references intact for future research.
Preserve Digital History: Ensure cultural moments remain accessible.
Save What Matters to You: You choose what to archive and preserve.
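If you want to script this, the sketch below shows one way to build the relevant URLs in Python. It relies on two public endpoints: `https://web.archive.org/save/` followed by a page address, which is how Save Page Now is commonly reached from a browser, and the Wayback Machine availability API at `https://archive.org/wayback/available`. The helper names and the sample response are illustrative assumptions, and the code only builds and parses strings; it does not contact the Archive.

```python
import json
from typing import Optional
from urllib.parse import urlencode

SAVE_ENDPOINT = "https://web.archive.org/save/"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def save_page_now_url(page_url: str) -> str:
    """Build the address that asks Save Page Now to capture a page."""
    page_url = page_url.strip()
    if not page_url:
        raise ValueError("page_url must not be empty")
    return SAVE_ENDPOINT + page_url


def availability_query_url(page_url: str,
                           timestamp: Optional[str] = None) -> str:
    """Build an availability API query for the capture closest to
    `timestamp` (YYYYMMDD or longer), or the most recent one if omitted."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the archived URL from an availability API response,
    or None if no capture is reported as available."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


if __name__ == "__main__":
    print(save_page_now_url("https://example.com/article"))
    print(availability_query_url("example.com", "20250101"))
```

Opening the first printed address in a browser (or requesting it with an HTTP client) asks the Wayback Machine to capture the page. Checking the availability API first can tell you whether a reference already has a capture before link rot sets in.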
Senator Ron Wyden (D-Oregon) is urging the Federal Trade Commission to crack down on digital platforms that mislead consumers into believing they own purchased content when, in reality, they are only granted temporary access. In his statement, Wyden highlights how companies selling digital TV shows, e-books, music, and video games often retain the right to revoke access, leaving consumers without the content they paid for. He calls on the FTC to enforce transparency and prevent these deceptive sales practices. Read the full letter.
This push for fairness and transparency in digital media sales is important for libraries as well as consumers. Over the last decade, publishers have fundamentally changed the relationship between libraries and their collections, phasing out digital sales and even “perpetual access” license models in favor of subscription-only access models. While the companies behind these changes claim they will improve library services through enhanced discovery and integration of research content, librarians and scholars argue that renting rather than owning materials ultimately harms the libraries and their patrons.
“[T]he transition to subscription-only access represents more than a change in purchasing models – it fundamentally undermines the ability of academic libraries to build collections that serve their specific institutional needs. It is likely to impede our ability to maintain comprehensive research — let alone teaching — collections.”
Siobhan Haimé, Birkbeck, University of London
The shift to a streaming-only model doesn’t just harm libraries and consumers; it’s also devastating for artists, authors, and independent publishers. Without the ability to sell their work outright, creators are forced into licensing arrangements that give platforms control over distribution, pricing, and even availability. Independent publishers have pushed back, so far unsuccessfully: their lawsuit against Amazon, which alleged that the company’s dominance in digital books forces unfair terms on publishers and authors alike, failed. Musicians, too, are speaking out. Max Collins, lead singer of alt-rock band Eve 6, explains that his band’s songs average a million streams each month on Spotify, which pays out about $3,000 per month on average, or roughly three-tenths of a cent per stream. As Collins writes in his op-ed, “It’s a pretty sick deal…for the corporations.”
Senator Wyden’s letter isn’t a sudden development—it’s the culmination of years of warnings about the risks of a “streaming-only” model and its impact on libraries and the communities they support. The shift away from ownership to perpetual leasing threatens long-term access to knowledge and culture. To explore what’s at stake, check out these additional resources:
The End of Ownership: Personal Property in the Digital Economy
By Aaron Perzanowski and Jason Schultz
From the publisher, MIT Press: If you buy a book at the bookstore, you own it. You can take it home, scribble in the margins, put it on the shelf, lend it to a friend, sell it at a garage sale. But is the same thing true for the ebooks or other digital goods you buy? Retailers and copyright holders argue that you don’t own those purchases, you merely license them. That means your ebook vendor can delete the book from your device without warning or explanation—as Amazon deleted Orwell’s 1984 from the Kindles of surprised readers several years ago. These readers thought they owned their copies of 1984. Until, it turned out, they didn’t. In The End of Ownership, Aaron Perzanowski and Jason Schultz explore how notions of ownership have shifted in the digital marketplace, and make an argument for the benefits of personal property.
Data Cartels: The Companies that Control and Monopolize Our Information
By Sarah Lamdan
From the publisher, Stanford University Press: In our digital world, data is power. Information hoarding businesses reign supreme, using intimidation, aggression, and force to maintain influence and control. Sarah Lamdan brings us into the unregulated underworld of these “data cartels”, demonstrating how the entities mining, commodifying, and selling our data and informational resources perpetuate social inequalities and threaten the democratic sharing of knowledge.
Four Digital Rights For Protecting Memory Institutions Online
By Lila Bailey and Michael Lind Menna
The rights and responsibilities that memory institutions have always enjoyed offline must also be protected online. To accomplish this goal, libraries, archives and museums must have the legal rights and practical ability to:
Collect materials in digital form, whether through digitization of physical collections, or through purchase on the open market or by other legal means;
Preserve digital materials, and where necessary repair, back up, or reformat them, to ensure their long-term existence and availability;
Provide controlled access to digital materials for advanced research techniques and to patrons where they are—online;
Cooperate with other memory institutions, by sharing or transferring digital collections, so as to aid preservation and access.
The Publisher Playbook: A Brief History of the Publishing Industry’s Obstruction of the Library Mission
By Kyle K. Courtney and Juliya Ziskina
Abstract: Libraries have continuously evolved their ability to provide access to collections in innovative ways. Many of these advancements in access, however, were not achieved without overcoming serious resistance and obstruction from the rightsholder and publishing industry. The struggle to maintain the library’s access-based mission and serve the public interest began as early as the late 1800s and continues through today. We call these tactics the “publishers’ playbook.” Libraries and their readers have routinely engaged in lengthy battles to defend the ability for libraries to fulfill their mission and serve the public good. The following is a brief review of the times and methods that publishers and rightsholder interests have attempted to hinder the library mission. This pattern of conduct, as reflected in ongoing controlled digital lending litigation, is not unexpected and belies a historical playbook on the part of publishers and rightsholders to maximize their own profits and control over the public’s informational needs. Thankfully, as outlined in this paper, Congress and the courts have historically upheld libraries’ attempts to expand access to information for the public’s benefit.
Vanishing Culture: A Report on Our Fragile Cultural Record
By Luca Messarra, Chris Freeland, and Juliya Ziskina
In today’s digital landscape, corporate interests, shifting distribution models, and malicious cyber attacks are threatening public access to our shared cultural history. Vanishing Culture: A Report on Our Fragile Cultural Record aims to raise awareness of these growing issues. The report details recent instances of cultural loss, highlights the underlying causes, and emphasizes the critical role that public-serving libraries and archives must play in preserving these materials for future generations. By empowering libraries and archives legally, culturally, and financially, we can safeguard the public’s ability to maintain access to our cultural history and our digital future.
Chokepoint Capitalism: How Big Tech and Big Content Captured Creative Labor Markets and How We’ll Win Them Back
By Cory Doctorow and Rebecca Giblin
This book examines how monopolistic corporations have structured markets, especially in digital media, to extract wealth while limiting access and competition. It also explores ways creators and the public can push back against these restrictive systems.