
Most Downloaded Items Last Week

  1. Crawldata from Alexa Internet from 2005-04-01T01:18:02PDT to 2005-03-05T10:41:36PDT
    273,872 downloads
  2. Crawldata from Alexa Internet from 2004-09-10T19:36:46PDT to 2004-09-11T02:21:43PDT
    180,246 downloads
  3. Liveweb Capture 2011-03-27T22:10:09PDT to 2011-03-28T05:27:05PDT
    137,953 downloads
  4. Crawldata from Alexa Internet from 2004-11-07T09:46:46PDT to 2004-11-07T19:19:57PDT
    134,956 downloads
  5. Crawldata from Alexa Internet from 2004-10-12T10:11:24PDT to 2004-10-12T20:16:40PDT
    134,422 downloads

Most Downloaded Items

  1. Liveweb Capture 2011-03-27T22:10:09PDT to 2011-03-28T05:27:05PDT
    756,248 downloads
  2. Webwide Crawldata 2010-09-24T20:27:19UTC to 2010-09-25T04:26:09UTC
    727,109 downloads
  3. Crawldata from Internet Archive from 2007-08-01T19:54:22PDT to 2007-08-01T22:21:06PDT
    676,394 downloads
  4. Crawldata from Internet Archive from 2007-07-09T02:04:40PDT to 2007-07-09T04:14:42PDT
    540,852 downloads
  5. Liveweb Capture 2013-03-29T09:35:58 UTC to 2013-03-29T15:03:34 UTC
    518,724 downloads

Spotlight Item

Liveweb Capture 2011-03-27T22:10:09PDT to 2011-03-28T05:27:05PDT
Internet Archive Liveweb Capture from WaybackMachine, captured by wwwb-proxy0.us.archive.org:wbm from Sun Mar 27 22:10:09 PDT 2011 to Mon Mar 28 05:27:05 PDT 2011.

Welcome to Web Crawls

1,698,183 items

The Web Archive of the Internet Archive, started in late 1996, is made available through the Wayback Machine, and some collections are available in bulk to researchers.

Other than the pages collected by the Internet Archive, major contributors include Alexa Internet, Cuil, and those listed below.


Sub-Collections

Accelovation Crawl
Web crawl snapshots generously donated from Accelovation. This data is currently not publicly accessible. From the site: Accelovation is pioneering the delivery of Insight Discovery™ software...
1,324 items
Alexa Crawls
Crawl data donated by Alexa Internet. This data is currently not publicly accessible. Decryption Keys are kept in an item. Alexa is the leading provider of free, global web metrics. Search Alexa to...
91,143 items
Archive Team
Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the...
13,566 items
Archive-It Digital Collection
The Archive-It Digital Collection
51,392 items
Away From Keyboard
Away From Keyboard is a memorial collection dedicated to preserving pieces of lives lived online from being scattered and lost. While no collection of data can ever replace a person, these archives...
293 items
collections-aaron-swartz
from Wikipedia: Aaron Hillel Swartz (November 8, 1986 – January 11, 2013) was an American computer programmer, writer, political organizer and Internet activist. Swartz was involved in the...
2 items
Common Crawl
Web crawl data from Common Crawl.
441 items
Cuil Crawl Data
Web crawl snapshot generously donated from cuil.com. This collection of pages, mostly from 2007 and some from 2008, is about 310 terabytes of compressed data and almost 60 billion URLs (mostly text)....
26,494 items
Custom Crawl Services
National library harvesting.
24,722 items
Focused Crawls
Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
9,544 items
4 items
httparchive
Successful societies and institutions recognize the need to record their history - this provides a way to review the past, find explanations for current behavior, and spot emerging trends. In...
26 items
Institut national de l’audiovisuel
Crawl data from Institut national de l’audiovisuel in France. This data is currently not publicly accessible. from Wikipedia: The Institut national de l'audiovisuel (or INA, French for National...
50 items
Internet Archive Web Crawls
Crawl data collected by the Internet Archive. This data is currently not publicly accessible in this format. To view archived web pages, please visit the Wayback Machine.
346,068 items
Internet Memory Foundation
Data crawled on behalf of Internet Memory Foundation. This data is currently not publicly accessible. from Wikipedia: The Internet Memory Foundation (formerly the European Archive Foundation) is a...
50 items
Mercator Crawl
Crawl done with the DEC/HP-labs 'Mercator' crawler and converted to ARC format. This data is currently not publicly accessible.
1 item
Rescue Crawls
Rescue crawls conducted by the public for sites that have announced that they are closing.
2 items
Thumper Transfer
Web crawl data transferred from thumpers in Santa Clara data center.
urlteam Web Crawls
Crawl data collected by the urlteam. The URLTeam is the ArchiveTeam subcommittee on URL shorteners. We believe that they pose a serious threat to the internet's integrity. If one of them dies, gets...
4 items
Web Collections
Web Collections organized by year. Some of this data is currently not publicly accessible.
19 items
web-group-internal
miscellaneous data
28,760 items
Wiki Collections
Collections of Wiki data
30,631 items
Wikileaks.org Archive
A collection of web pages from the wikileaks websites as well as news coverage and commentary surrounding the Wikileaks releases. It includes coverage of the Afghan war diaries, the Iraq war logs,...
3 items

Recently Reviewed Items

ArchiveTeam JSON Download of Twitter Stream: 2012-12
Average rating: 4.00 out of 5 stars

AIT-1216 Crawldata 2009-03-30T18:30:53PDT to 2008-11-08T20:37:09PST
Average rating: 5.00 out of 5 stars

IMSLP Petrucci Music Library Data Dump - 20121202
Average rating: 5.00 out of 5 stars

Communication issues musings of a dinosaur
Average rating: 5.00 out of 5 stars

www.theregister.co.uk/2010 201210 panic download
Average rating:

This Just In

Webwide Crawldata 2013-05-23T13:38:37PDT to 2013-05-23T08:07:36PDT
1 hour ago

Wikimedia incremental dump files for the Belarusian Wikipedia on May 25, 2013
1 hour ago

Wikimedia incremental dump files for Wikiversity Beta on May 25, 2013
1 hour ago

Wikimedia incremental dump files for the Belarusian Classical Wikipedia on May 25, 2013
1 hour ago

Wikimedia incremental dump files for the Bikol Central Wikipedia on May 25, 2013
1 hour ago


 

Wayback Machine Forum

Subject Poster Replies Date
Dead Link on Archive.Org Page Disabled Community dot Org 1 May 24, 2013 07:50:24am
   Re: Dead Link on Archive.Org Page Jeff Kaplan 0 May 24, 2013 12:42:24pm
hi, please add my website. Xoltz 1 May 23, 2013 03:18:24pm
   Re: hi, please add my website. tophold 0 May 23, 2013 08:02:39pm
please fix http://momspaghetti.ytmnd.com/ Irockz 0 May 22, 2013 05:28:17pm
PDF Document for Archival CyberBob 0 May 22, 2013 03:56:55pm
Please add www.tophold.com tophold 1 May 20, 2013 10:35:32pm
   Re: Please add www.tophold.com Giperion 0 May 21, 2013 01:04:05am
insertion site Pawstudio 0 May 20, 2013 09:33:50am
Yoga Website dondevischarlie 0 May 20, 2013 03:33:34am
cara menghilangkan jerawat archieves hadingrh 0 May 19, 2013 12:46:53am
Add my academic website please. bohemiotx 0 May 17, 2013 04:09:36pm
Wayback Machine and 'robots.txt' Aaron1a12 0 May 17, 2013 08:17:41am
Neumu.net has been down due to an internal server error angeldeb82 1 May 16, 2013 08:39:53am
   Re: Neumu.net has been down due to an internal server error antheras55 1 May 17, 2013 05:26:01am
     Re: Neumu.net has been down due to an internal server error angeldeb82 1 May 17, 2013 07:43:49am
       Re: Neumu.net has been down due to an internal server error antheras55 0 May 17, 2013 08:59:25am
Please add my website trungkien8686 1 May 15, 2013 07:43:55pm
   Re: Please add my website Azhir 0 May 15, 2013 11:28:22pm
size of crawl Tom Phelps 0 May 15, 2013 10:10:03am
Please Add me Web Site antheras55 0 May 15, 2013 08:18:57am
Please Add My websites Anime 21 2 May 15, 2013 06:41:36am
   Re: Please Add My websites webarchiver354 0 May 15, 2013 07:13:44am
   Please Add My Site antheras55 0 May 15, 2013 08:18:01am
Pls add recent web archieve sdavidpaul 1 May 15, 2013 04:40:51am
   Re: Pls add recent web archieve webarchiver354 0 May 15, 2013 07:29:50am
Please add this site www.ondano.com chuco19 0 May 14, 2013 12:05:48pm
Please add my website Tipografialeone 0 May 14, 2013 09:52:45am
Please Add my website r3naturals 0 May 13, 2013 11:53:24am
please ad my site gino48 0 May 13, 2013 11:33:29am
please add my website adelrmdn 1 May 13, 2013 07:20:59am
   Re: please add my website Marianos 0 May 13, 2013 08:23:14am
The old Wayback Machine and missing content PS75 1 May 12, 2013 03:10:43pm
   Re: The old Wayback Machine and missing content Jeff Kaplan 2 May 12, 2013 03:33:01pm
     Re: The old Wayback Machine and missing content CEAK 1 May 13, 2013 05:09:32am
       Re: The old Wayback Machine and missing content Jeff Kaplan 1 May 13, 2013 07:49:03am
         Re: The old Wayback Machine and missing content CEAK 0 May 13, 2013 08:47:49am
     Re: The old Wayback Machine and missing content PS75 1 May 14, 2013 09:23:25am
       Re: The old Wayback Machine and missing content PS75 1 May 14, 2013 09:41:43am
         Re: The old Wayback Machine and missing content Jeff Kaplan 1 May 14, 2013 09:57:41am
           Re: The old Wayback Machine and missing content PS75 1 May 14, 2013 10:21:29am
             Re: The old Wayback Machine and missing content Jeff Kaplan 1 May 14, 2013 10:52:57am
               Re: The old Wayback Machine and missing content PS75 1 May 14, 2013 12:07:39pm
                 Re: The old Wayback Machine and missing content webarchiver354 0 May 15, 2013 06:58:28am
Concerned that my blog has been added to your archive AlbertinaMcNeill 3 May 08, 2013 03:19:48pm
   Re: Concerned that my blog has been added to your archive Peter Frouman 2 April 26, 2013 05:46:19am
     Re: Concerned that my blog has been added to your archive jezta 3 April 26, 2013 09:16:05am
       Re: Concerned that my blog has been added to your archive jory2 1 April 26, 2013 09:52:34am
         Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 0 April 26, 2013 05:26:20pm
       Re: Concerned that my blog has been added to your archive jory2 1 April 26, 2013 09:52:34am
         Re: Concerned that my blog has been added to your archive jezta 1 April 26, 2013 10:10:05am
           Re: Concerned that my blog has been added to your archive jory2 1 April 26, 2013 10:19:18am
             Re: Concerned that my blog has been added to your archive jezta 1 April 26, 2013 10:35:29am
               Re: Concerned that my blog has been added to your archive PDpolice 1 April 26, 2013 05:29:31pm
                 Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 1 April 29, 2013 04:31:41pm
                   Re: Concerned that my blog has been added to your archive micah6vs8 1 April 29, 2013 05:20:25pm
                     Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 1 April 30, 2013 02:26:01am
                       Re: Concerned that my blog has been added to your archive micah6vs8 1 April 30, 2013 07:42:48am
                         Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 3 May 01, 2013 01:08:16am
                           Re: Concerned that my blog has been added to your archive PDpolice 0 May 01, 2013 01:40:34am
                           Re: Concerned that my blog has been added to your archive micah6vs8 0 May 01, 2013 02:40:14am
                           Re: Concerned that my blog has been added to your archive jory2 2 May 01, 2013 04:40:02am
                             Re: Concerned that my blog has been added to your archive PDpolice 0 May 02, 2013 01:37:40am
                             Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 1 May 06, 2013 03:45:51am
                               Re: Concerned that my blog has been added to your archive jory2 0 May 06, 2013 04:22:28am
       Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 0 April 26, 2013 04:19:33pm
     Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 1 April 26, 2013 04:11:28pm
       Re: Concerned that my blog has been added to your archive Peter Frouman 3 April 26, 2013 05:30:48pm
         Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 0 April 26, 2013 05:45:08pm
         Re: Concerned that my blog has been added to your archive jory2 1 April 30, 2013 08:56:23am
           Re: Concerned that my blog has been added to your archive Peter Frouman 2 April 30, 2013 09:13:06pm
             Re: Concerned that my blog has been added to your archive JaneSmith01 0 May 01, 2013 02:02:41am
             Re: Concerned that my blog has been added to your archive jory2 0 May 01, 2013 03:15:12am
         Re: Concerned that my blog has been added to your archive jory2 1 May 01, 2013 05:33:33am
           Re: Concerned that my blog has been added to your archive Peter Frouman 1 May 01, 2013 07:36:22am
             Re: Concerned that my blog has been added to your archive jory2 0 May 01, 2013 10:40:51am
   Re: Concerned that my blog has been added to your archive peterthenovice 1 May 01, 2013 02:34:30pm
     Re: Concerned that my blog has been added to your archive AlbertinaMcNeill 1 May 01, 2013 03:35:25pm
       Re: Concerned that my blog has been added to your archive GregoriV 1 May 07, 2013 11:56:11pm
         Re: Concerned that my blog has been added to your archive jory2 1 May 08, 2013 03:10:20am
           Re: Concerned that my blog has been added to your archive PDpolice 1 May 08, 2013 12:22:53pm
             Re: Concerned that my blog has been added to your archive GregoriV 0 May 08, 2013 10:55:29pm
Please add this website jaiveb 1 May 07, 2013 06:19:49pm
   Re: Please add this website jaiveb 0 May 07, 2013 06:56:56pm
THANKS NATALI0648 0 May 07, 2013 12:53:45pm

 

Terms of Use (10 Mar 2001)