Comments on: Google Print — the debate
Blog, news, books
Mon, 06 Feb 2017 12:39:00 +0000

By: Imre Simon Sun, 13 Nov 2005 10:51:52 +0000 In a 1965 book coordinated by Internet pioneer J.C.R. Licklider and called “Libraries of the Future”, the authors predicted, based on statistical projections, that around the year 2000 a computer system would be able to store the text of all the books ever written.

The Google Print program seems to be a realization of this insight.

Curiously, as far as I know, not even Licklider and his co-authors were able to foresee that such a landmark realization would open the doors to cross-pollinating the contents of all these books, for instance through the building and instantaneous use of global indices, which are an important part of the Google Print project.

Equally astonishing to me is the fact that they did not foresee either, as far as I know, that such a technological breakthrough might eventually be pre-empted by upholding and enforcing copyright arguments. Note that all the copyright principles were already in place at that time and that the ones we have now are not that much different from the ones of 1965. There would have been no need whatsoever for extrapolations!

A curiosity, anyway.

By: Joseph Pietro Riolo Wed, 09 Nov 2005 07:25:47 +0000 To nate:

There is no difference between creating an index and
distributing that index. An index is considered a separate
work in U.S. copyright law (see the definition of
“supplementary work” within the definition of “work made
for hire” in Section 101). I don’t know if it has
been tested in any U.S. court. The U.S. copyright law
is pretty clear on this because it also defines
“derivative work” which is very different from
“supplementary work”.

However, you can’t recreate from an index an identical
copy of a work that is still under copyright. It is similar
to trying to recreate an identical copy from snippets
from different sources as allowed by fair use. For
example, 100,000 people quote snippets from the latest
Harry Potter book as allowed by fair use (for school
assignments/homework, research, criticism, analysis,
and so on). Then, I collect snippets from their works
to create a work that is identical or substantially
similar to the Harry Potter book. The U.S. copyright
law does not permit this.

There is nothing wrong with creating an index of the
color at each position of an image. After all, it is
just an index. But, if someone tries to recreate
an image from the index, he crosses the line into
the zone of infringement.

Joseph Pietro Riolo

Public domain notice: I put all of my expressions in this
comment in the public domain.

By: nate Tue, 08 Nov 2005 23:51:25 +0000 Joseph Pietro Riolo writes:
> Creating an index of words in a book is permissible by the U.S. copyright law.

Has this been tested anywhere? And is there a legal difference between creating that index and distributing that index? If the full text index includes the word order (as any index capable of phrase searching does), it’s trivial to re-invert that index to produce the original text. If it is legal both to create and to distribute that index, that’s tantamount to saying that it’s legal to make a copy of and distribute the work itself.

It’s tempting to draw a parallel between digital text and digital images. In ‘Kelly v. Arriba Soft’ (according to my layman’s understanding) it was decided that thumbnails may be created and distributed but high resolution copies may not. Obviously (red flag word) if you take a copyrighted GIF and convert it into an (approximately) equal resolution TIFF, it does not suddenly lose its copyright. But essentially, apart from the lack of a widespread tool to automatically rebuild the document from the inverted index, building an index is just such a conversion to another format, in that no information is lost. There is (in my mind at least) a nifty parallel between an index of the color at each position of an image and the location of each word within a document.
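
The re-inversion described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of how Google Print or any real search engine stores its index: the function names `build_positional_index` and `reinvert` are hypothetical, tokens are split on whitespace, and nothing is normalized. A real index that drops stop words, folds case, or stems words would lose information, and the round trip would no longer be exact.

```python
from collections import defaultdict


def build_positional_index(text):
    """Map each token to the list of positions where it occurs (hypothetical helper)."""
    index = defaultdict(list)
    for position, token in enumerate(text.split()):
        index[token].append(position)
    return dict(index)


def reinvert(index):
    """Rebuild the token sequence from a positional index (hypothetical helper)."""
    # The total number of postings equals the number of tokens in the text.
    length = sum(len(positions) for positions in index.values())
    tokens = [None] * length
    for token, positions in index.items():
        for position in positions:
            tokens[position] = token
    return " ".join(tokens)


sample = "the index of the book is not the book"
index = build_positional_index(sample)
rebuilt = reinvert(index)
print(rebuilt == sample)
```

Because the sample uses single spaces and no normalization, the rebuilt string matches the input exactly; original line breaks and spacing would not survive whitespace tokenization.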

Does this mean that one is not actually allowed to build such an index? Although people seem to be offering the ‘well, if building the index is illegal then web search itself is illegal’ reasoning as an apparent reductio ad absurdum, I’m not so sure that it is absurd. I think there is going to need to be a significant clarification of the law here (either by the courts or by Congress), and probably one that eventually makes a much more explicit link between the medium in which an object is expressed, the degree of ‘fair use’ that one is allowed, and the purpose to which that ‘use’ is being put.


By: Peter Rock Tue, 08 Nov 2005 22:09:44 +0000 Thanks JPR. That was informative.

Yes, it is important to distinguish between ownership and control. It would seem as though there is no copyright available on such a database as it – as far as I understand – won’t contain any arguable “creativity”.

But as you point out, in European law there is an opportunity for control through sui generis. The control available under this regime is worrisome. That is, Google can claim infringement if one were to either make substantial copies of portions of the database for their own personal use (extraction) or distribute copies to others (re-utilization).

So let’s say this project goes forth.

How do the copyrights of the authors relate to Google’s sui generis database rights? Can Google simply license the database to individuals (effectively offering copies of the scanned material) yet exercise their “re-utilization” rights to prevent anyone else from competing in the searchable-database arena? Or do the copyrights affecting the data contained within the database trump the rights of sui generis thus forcing Google to basically enforce their rights whether they want to or not?

I can’t help but believe that much of this headache is a result of trying to continue to apply the archaic notion of ALL RIGHTS RESERVED to a digital world. If only all works were under a creative commons license!

It is becoming clearer and clearer to me that the framers of Article I, Section 8, Clause 8 of the U.S. Constitution never intended perfect control over distribution. I suppose exclusive rights weren’t much of an issue back then given the cost of reproduction with no opportunity to sell. If only they had imagined a digital world with the Internet. Perhaps they would have made this point clear and stated that 100% “exclusive” rights are not a fair bargain for the public – the supposed beneficiaries of copyright. The non-commercial right of creative commons licenses is, I believe, the key to striking a balance between free distribution and $$$ for authors. Sharing will not turn authors into beggars.

Where’d I put that Tylenol?

By: Joseph Pietro Riolo Tue, 08 Nov 2005 20:58:31 +0000 To Peter Rock:

Google obviously will have control over the database.
After all, it is the one that has the database. The amount
of control that Google has over the database greatly
depends on how much control it wants through security
measures and/or license.

It is important to make a distinction between control
and ownership. Google can’t claim ownership in the
database because the U.S. copyright law does not grant
copyright to any non-creative work. The database is
simply a list of all occurrences of words in books.
There is no creativity in selecting which words to
keep and which words to discard. It is like a telephone
book that attempts to list all phone numbers in an
area. So, if Google accidentally releases the database
to members of the public, they can copy the database as
much as they want to. In Europe, however, Google
can claim sui generis right in the database.

Google is not the only company that does it.
is another company that controls its databases through
license even though much information in the database is
in the public domain.

It is not an issue of “fair use”. Creating an index
of words in a book is permissible by the U.S. copyright
law. A database containing the index is permissible as well.
What is at issue is the snippet. Plaintiffs in the
lawsuits claim that snippets are not within the boundary
of fair use.

Joseph Pietro Riolo

Public domain notice: I put all of my expressions in this
comment in the public domain.

By: Peter Rock Tue, 08 Nov 2005 18:04:31 +0000 The more I think about it, the aspect of Google Print that concerns me is – who will have control over the database created by this project? I REALLY don’t feel it is right for Google to exclusively own it even if they are the ones creating it. Exclusive rights to such a database seems wrong to me.

Is the focus going to simply be on “fair use” and whether or not Google gets the green light?

I understand this is an important “fair use” issue, but the rights to the database seem – to me at least – to be a much more vital question that needs to be considered before Google gets the green light.

What am I missing? This is a very complex issue…my head hurts.

By: David Tue, 08 Nov 2005 13:05:39 +0000 Correct me if I’m wrong, but women read as men do, as far as I know. By this I mean that the process is structurally equivalent, they look at the page, decode the glyphs or higher-order primitives (such as words) and convert this information into symbolic representations of the writing. Same can be said for search. Is there some peculiarity in women’s reading that would make it imperative to have them represented? Do they do something differently that needs addressing by itself? Because if they do not, as I would hold, it is a disservice to claim they should be represented. Women are first and foremost people, and just as 53-year-olds need not be represented in a debate, because their is no functional difference that requires it, neither do, in most debates, women.

So, whether there are women in the debate or not is entirely irrelevant, and subtracts no legitimacy whatsoever. It could be argued that women could participate just as the men who did, and this is true. Maybe there was a bias against women in the choice of debaters. Whether that was the case or not, though, doesn’t make it imperative to purposefully choose women as debaters on this and most other topics.

By: Ann Bartow Tue, 08 Nov 2005 08:02:54 +0000 With little effort I can think of 50 or more women who could have been part of this debate without diminishing the quality of the discourse in the least; in fact quite the contrary. The majority of librarians, and library patrons, in this country are female, as are the majority of book purchasers. Yet not a single woman gets a voice in this debate.

By: Peter Rock Tue, 08 Nov 2005 07:51:40 +0000 From the graphic:

Google explained: “Our ultimate goal is to work with publishers and libraries to create a comprehensive, searchable, virtual card catalog of all books in all languages that helps users discover new books and publishers find new readers.”

Umm, no.

Google’s “ultimate goal” is to increase market-share thus maximizing stock value for its shareholders. What they’ve described above is simply a means to an end disguised as the end. Falling for this romanticized propaganda is what can lead to poor decision making.

I’m not necessarily against the Google Print project. In fact, I believe Google should be allowed to do what they are doing but under regulation. Unfortunately, I’m seeing this project evolve into a future of DRM-encrusted digital books being sold online that require a Trusted Computing platform and proprietary software to download and view. In turn, Google’s “system” will be protected by software patents and an ALL RIGHTS RESERVED database to help keep other interests at bay. Am I overreacting? Is this paranoia? Is this an implausible scenario?

I would like to know what Google’s definition of “discover” is in the above statement. If I’m not mistaken, they are conflating the term “discover” with “advertise”.

I understand that at this stage this is simply about a digital searchable database. But I have no doubt that if this stage manifests itself fully, it will become much more than that. I’m excited about that possibility and believe this is where we should be headed. But, how should we proceed?

Ultimately, I’m convinced that even if every single book in existence is freely available on a global network, physical books will still be sold and there will be profit to be had. As human beings, this is what we should strive for. Unfortunately, ALL RIGHTS RESERVED does not fit this paradigm.

