Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

The history, achievements and future work of Internet Wayback Machine

“We may be years away from the invention of the first functional time machine, but thanks to this awesome San Francisco-based non-profit digital library we can have “universal access to all knowledge.” Sounds great, right? The Internet Archive describes its mission this way: “We are building a public library that can serve anyone in the world with access to the Internet.” And did we mention that this is free!? Founded in May of 1996, the Archive started collecting and saving data when the internet was still in its infancy. Those behind the scenes recognized that while the internet produced a steady stream of short-lived content for consumers to peruse, no one was in the business of saving what was no longer current. Thanks to what is now known as the “Wayback Machine,” over twenty years of web history is organized and available with the click of a button. And there is more: the Archive has since expanded its library to include millions of digital versions of public works, and it allows people to save copies of their own work into special collections for free. The “Wayback Machine” gets its name from the “WABAC machine,” a time machine used by Mr. Peabody and Sherman to visit (and often accidentally affect) famous historical events in the popular cartoon The Rocky and Bullwinkle Show. Thanks to some fancy software, the Internet Archive currently holds the following:

  • 279 billion web pages
  • 11 million books and texts
  • 4 million audio recordings (including 160,000 live concerts)
  • 3 million videos (including 1 million television news programs)
  • 1 million images
  • 100,000 software programs

Wow, those are some pretty impressive numbers. The wizards behind the Archive scan up to 1,000 books per day in 28 locations around the world. In their words, “Because we are a library, we pay special attention to books. Not everyone has access to a public or academic library with a good collection, so to provide universal access we need to provide digital versions of books.” But the archival treatment doesn’t end there. Starting with the television news coverage that followed the events of September 11th, 2001, the Archive has been collecting selected U.S. television news broadcasts, offering direct access to history unfolding in real time. The Archive is also in the process of digitizing half a million sound recordings (songs, poetry readings, sound effects, and comedy shows, for example) at the clip of a thousand records a day for a new venture called “The Great 78 Project.” All recordings are being pulled from 78rpm records circa 1880-1960, and will be available for free download on a rolling basis in a variety of digital forms. If you would like to read more about this fascinating project, see here. The Internet Archive is funded through donations, grants, and payments for archiving and digitization services. In 2007 the site was officially designated as a library by the State of California, and it is currently one of the top 300 websites in the world. Their current focus is the building of an official “Internet Archive of Canada,” which will be a veritable copy of the current archive stowed safely away from any attempts at censorship or removal. The Archive’s founder, Brewster Kahle, describes this move as precautionary: “For us, it means keeping our cultural materials safe, private and perpetually accessible. It means preparing for a Web that may face greater restrictions. It means serving patrons in a world in which government surveillance is not going away; indeed it looks like it will increase.
Throughout history, libraries have fought against terrible violations of privacy—where people have been rounded up simply for what they read. At the Internet Archive, we are fighting to protect our readers’ privacy in the digital world.” You can visit the Internet Archive blog here and journey through the site’s vast content here. Free access to information is a right that deserves to be exercised, so hop on that Wayback Machine and enjoy the ride.”
