|
Poster:
|
luuk84 |
Date:
|
November 03, 2009 06:14:08am |
|
Forum:
|
texts
|
Subject:
|
Re: Google rejects? |
I am really ONLY talking about Google scans uploaded here. I am one of those amateurs who scans vast numbers of books and I hugely appreciate the work done by others. My complaint is exclusively a Google one and concerns stuff uploaded by user tpb. The most recent scan I had a look at was Lovelich's Merlin (http://www.archive.org/details/merlinamiddleen00lovegoog), which has one very odd-looking page somewhere in the middle. Others that I rejected recently were the multiple volumes of Hector Boece's Chronicle. These are all genuine Google scans (books from the university libraries they are known to be scanning) and since they are quite often faulty, I wondered whether they might be rejects.
|
Poster:
|
stbalbach |
Date:
|
November 03, 2009 07:12:09am |
|
Forum:
|
texts
|
Subject:
|
Re: Google rejects? |
Google is/was notorious for poor quality scans, in particular in the first few years. There are even entire blogs set up just to show examples of things like people's fingers in the pages. Internet Archive scans have always been much better quality. Google has improved somewhat recently and has even begun rescanning some of its previous ones. The Google scans on IA are just copies that users have copied over; they are not rejects.
|
Poster:
|
Time Traveller |
Date:
|
November 03, 2009 08:23:22pm |
|
Forum:
|
texts
|
Subject:
|
Re: Google rejects? |
So I was correct about us moving stuff over from Google.
And could the bad scans by Google date from before it began pushing Google Books as a major
WWW service (money maker)?
|
Poster:
|
garthus |
Date:
|
November 03, 2009 08:56:28pm |
|
Forum:
|
texts
|
Subject:
|
Re: Google rejects? |
Peter,
See:
http://www.archive.org/details/Order_To_Show_Cause_With_TRO_New_York_State__763 for one of the many cases which I have been involved in; this one is returnable November 12 of this year. I will be posting the others when I get the time. (Seems like engineers also make good lawyers.) Did this without the help or input of any lawyers. Wonderful how, as the system gets more complicated, those with the knowledge can use it against itself.
Gerry
|
Poster:
|
garthus |
Date:
|
November 03, 2009 08:39:40pm |
|
Forum:
|
texts
|
Subject:
|
Re: Google rejects? |
Luke,
I have probably looked at thousands of Google scans and a similar number of archive scans. Also put up nearly 900 myself. Statistically, Google and Microsoft scans have a very low defect rate considering what materials the scanners were working with, and their quality has gotten better over time. My objection concerns their bastardization of the images with watermarks, but that is another story. In any case a bad scan is better than no scan, and this can all be worked out over time. If someone wants perfection they have to do what many of us have done, that is, scan the items themselves. The point here is that it is getting better, and with more caring volunteers it will continue to get better.
Gerry
|
Poster:
|
luuk84 |
Date:
|
November 04, 2009 08:25:43am |
|
Forum:
|
texts
|
Subject:
|
Re: Google rejects? |
Thanks for the replies. It is reassuring to find so many dedicated people and to have the Google business cleared up a little. I am still astonished that they originally did such a shoddy job (I find the Microsoft scans to be much better). I work with scanned texts almost exclusively nowadays and a prerequisite is that these texts are scanned well (which is why I do most of my own scanning) because only then can they be OCRed successfully. For me this last stage is the crucial one, because I use a search engine that also indexes all this material and allows for sophisticated (proximity) searches on the basis of which I do my work. It is a great relief to find a huge work scanned by Google but an even greater frustration to then discover the end result is useless for OCRing, so that I have to do it myself anyway. As I said before, this happens again and again. In my case a shoddy job is worse than nothing at all, because I lose a lot of time checking it out only to find it is not as good as it should be.