October 29, 2008  ·  Lessig

As many have, I’ve been eager to understand the terms of the settlement in the AAP/Authors Guild v. Google case (Google Summary, Actual Settlement). After spending some time studying it, here are my thoughts. (4TR: I was not part of any of these settlement negotiations so all this was news to me).

IMHO, this is a good deal that could be the basis for something really fantastic. The Authors Guild and the American Association of Publishers have settled for terms that will assure greater access to these materials than would have been the case had Google prevailed. Under the agreement, 20% of any work not opting out will be available freely; full access can be purchased for a fee. That secures more access for this class of out-of-print but presumptively-under-copyright works than Google was initially proposing. And as this constitutes up to 75% of the books in the libraries to be scanned, that is hugely important and good. That’s good news for Google, and the AAP/Authors Guild, and the public. (My favorable views about the AAP at least are not, of course, reciprocated.)

It is also good news that the settlement does not presume to answer the question about what “fair use” would have allowed. The AAP/AG are clear that they still don’t agree with Google’s views about “fair use.” But this agreement gives the public (and authors) more than what “fair use” would have permitted. That leaves “fair use” as it is, and gives the spread of knowledge more than it would have had.

The hard issue here will be in the details (surprise, surprise). The agreement calls for the creation of a registry to be operated by a nonprofit corporation. That corporation will be governed by a board comprised of publishers and “authors” (meaning authors participating in the lawsuit). That corporation will administer the payments to authors and publishers that flow from the agreement. It will also administer a registry that will make it easier for works to be identified, and owners located.

The hard question for the registry is how far they will go to support the range of business models that authors and publishers might have. E.g., Yale Press “Books Unbound” and Bloomsbury Academic both have Creative Commons licensed authors. Will the registry enable that fact to be recognized? Indeed, though the comment was made by someone from the plaintiffs’ side that it would be “perverse” for authors to choose free licensing, it is perfectly plausible that an author would choose to make his or her work available freely electronically, but contract with one commercial publisher to deal with selling the physical book, or licensing rights commercially. That, again, is the Bloomsbury Academic business model. Ideally, this non-profit should encourage the widest range of rights-respecting business models. One clear signal about what kind of organization this is will come from this.

But key to the good in the agreement is that we don’t have to trust the nonprofit to do good here. Google has committed both to making the data it can control (not private data about telephone numbers and contact info, but public data about copyright registration, terms, etc.) nonexclusively available, and more importantly, downloadable by anyone who wants to build a competing and complementary database. It has also reserved important safe-harbors for its incredibly valuable public domain collection (which includes books people get free access to, and can download for free).

Here, too, however, there is an important challenge for Google. It has provided important value by making available works that have no rights attached to them. But it should do more to make available works that have some rights attached to them. Critical for evaluating whether the long term interest of Google is GOOd or GOOey, Google needs to build into its architecture assets that are licensed freely, or under noncommercial terms, to complement the assets that it claims are free for “noncommercial” download (namely, the public domain works it has). Acting to clearly support the non-proprietary movement as well as the proprietary is an important way for it to show that it stands in the middle, and that it, with the AAP/Authors Guild, has now done some real good.

The biggest loser in this whole battle is the Orphan Works legislation. If anyone needed evidence to demonstrate that it is WAY TOO EARLY for Congress to be passing massive new bureaucratic overlays to copyright to deal with the important problem of “orphan works,” this is the evidence. Let’s let this private alternative develop, while Congress puts away its billion-factor balancing tests for regulating access to “orphan works.” For earlier rants against the Orphan Works bill, see:

Copyright Policy: Orphan Works Reform

Internet Law: 2.5 done (round II on Orphans)

And here’s a video I did years ago against the original Orphan Works proposals.

And a video I did long ago about whether Google’s use was “fair use.”

  • Adrian Lopez

    Yesterday I sent you the following inquiry on this very subject:

    One thing I don’t understand is whether this settlement applies to future claims of infringement by persons who are not currently members of the class. Although I don’t think Google should need a copyright holder’s permission to scan books for search purposes, there’s something about this settlement that bugs me, if I haven’t misunderstood it. As I said on Slashdot:

    “Insofar as the plaintiffs raise legitimate points concerning the use of scanned material, this settlement should not grant Google an implied license to the works of those who don’t explicitly agree to its terms, but the class action settlement is such that you have to opt out. This is bad. No third party should ever have the power to license my works to another party without my explicit say so. That’s an exclusive right granted to me as an author.”

    What do you think?

  • ok

    Can you expand a bit on the idea that …

    “Google needs to build into its architecture assets that are licensed freely, or under noncommercial terms, to complement the assets that it claims are free for “noncommercial” download (namely, the public domain works it has). “

    What do you mean by this?

  • Jim Carlile

    This new agreement could be a win-win for everybody, but there’s already been a problem brewing with Google and their formal scanning agreements, at least with U.C.

    A few years ago some of us noticed that Google was holding back many pre-1922 full-view scans of PD works. Not all, but the common denominator of these books was that they had all been re-published after 1922. An odd coincidence– it seemed a little fishy.

    But when the terms of the Google/ U.C agreement were finally released a few years ago, the reason for this discrepancy became obvious. A specific clause tipped-off the idea that Google’s business model was to make OOP works available for a fee. Not a bad idea. But if there were also full-view PD versions of these same works already on their site, it would kill the market for the later ‘copyrighted’ re-releases. Who in their right mind would buy the one when the other was free?

    The problem was, Google’s U.C. agreement required that they make all PD books available at their site in full-view. But they were not doing so. They have improved in this area the last few months, probably in fear of claims that they were hoarding PD works, but they are still holding back a large number of full-view, pre-1922 scans. Why is this, especially if they are required to release them by formal agreement with these host libraries?

    Another problem for Google is that they have an arbitrary 1922 cutoff date for PD works, despite the fact that many works published after that date are in PD because their copyrights were never renewed. Because the library agreements require full release of all PD books, why aren’t they checking the individual status of these works, and then making them available, like the Internet Archive does?

    Again, Google has a problem– they have to do this by agreement, at least with their U.C. scans. The new settlement also allows them to enter into a JSTOR-like subscription plan with libraries, where patrons can download copyrighted OOP works at will, for free. But the question is, is Google going to insist on this arbitrary 1922 date for the determination of public domain, and then charge for downloading all post 1922 works, even those that are in PD status and that are required to be made available for free?

    I suspect they will, until they are forced to do otherwise.

  • Jim Carlile

    In case anyone’s interested, here’s an interesting clause in the Google/UC scanning agreement:

    4.3 Google use of Google Digital Copy. Subject to the restrictions set forth herein, Google may use the Google Digital Copy, in whole or in part at Google’s sole discretion, subject to copyright law, as part of the Google Services. Google agrees that to the extent that it or its successors use any Digitized Selected Content in connection with any Google Services, it shall provide a service at no cost to End Users (1) for both search and display of search results and (2) for access to the display of the full text of public domain works contained in the Digitized Selected Content. To the extent portions of the Google Digital Copy are either In the public domain or where Google has otherwise obtained authorization, Google shall have the right, in its sole discretion, among other things, to (a) index the full text or content, (b) serve and display full-sized digital images corresponding to those portions, (c) make available full text of content for printing and/or download, and (d) make copies of such portions of the Google Digital Copy and provide, license, or sell such copies (including, without limitation, to its syndication partners). For all other portions of the Google Digital Copy, Google may index the full text or content but may not serve or display the full-sized digital image or make available for printing, streaming and/or download the full content unless Google has permission or license from the copyright owner to do so; Google instead may serve and display (1) an excerpt that Google reasonably determines would constitute fair use under copyright law and (2) bibliographic (e.g., title, author, date, etc) and other non-copyrighted information. In the event that Google has received a license or other permission from the applicable copyright holder to use in-copyright works in the Google Digital Copy, Google may use those works in any manner permitted under the terms of such license.

    I think that the third and fourth sentences are the tip-off as to what they’ve been wanting to do from the beginning– license OOP material and sell it. That means this new settlement really just formalizes what they’ve already had in mind all along– this whole book scanning program was never about altruism and “knowledge.”

    But the second sentence is the interesting one, I think, especially (2), where all public domain works are to be made available on their site in full-view, and for free. So far, they haven’t always been doing this, and they’ve been holding to a strict pre-1922 cut-off date as well, when determining a book’s PD status.

    In my view, in order to adhere to the agreement with U.C. at least, Google is going to have to research the copyright status of all 1922-1964 books, and immediately place those that were not properly renewed into their PD list. This could account for a HUGE number of new PD scans, post 1922. Like, millions.

    Otherwise, if they don’t do this, libraries like UC that subscribe to a future institutional copy scheme with Google Books could be put into the position of having to buy back post-1922 PD books that should be free per their own agreement!

    It’ll be fun to watch how this pans out.

  • Andrew Katz

    I know of at least one pre-1922 work on Google Books that’s still restricted: Wired Love by Ella Cheever Thayer (1880). See: http://books.google.com/books?id=BjAOAAAAYAAJ&q=wired+love&dq=wired+love&pgis=1

    To me, what’s incredibly galling about this is that I spent several months of 2007 tracking down a copy of this book, trying to get a library in the UK to release it to me in a format I could OCR, at the same time acknowledging that it is a PD work so I could upload the text to project Gutenberg. So after spending $100 of my own cash to get the scans, and countless hours (much longer than I had anticipated) proofreading, I uploaded the work to Gutenberg, only to discover that Google had scanned the book themselves a few days previously, albeit without making it freely available.

    The upshot of this is that I no longer have any desire to set any more books free, if I know that Google are working behind the scenes scanning stuff in, if there is no mechanism to determine that my efforts are being duplicated by them. Needless to say, unlike Google, Gutenberg keeps a public list of “works in progress” which you can check to ensure there is no reinventing the wheel going on, and add your name to against a specific work to warn others likewise. Discovering that there is duplication of your efforts is very demotivating. That’s why I am interested to see that there is a register of works proposed in the settlement, but I assume this will have nothing to do with reducing duplication of effort.

    So Google have, by their policies, had the practical effect of dissuading at least one volunteer from providing services to Project Gutenberg. How convenient it is that this also reduces competition, if it is indeed Google’s plan to charge for access.

    It is also interesting that the work is available from lulu.com, via an organisation called “publicdomainreprints.org”. They seem to have acquired their text from google, although since the whole text is not available from google, it’s not clear how they got it. Some sort of licensing deal perhaps?

    It’s also interesting that within hours of my efforts being uploaded to Gutenberg, various sites made e-book versions of it available (generally on a commercial basis). That’s fine, and exactly what I expected to happen – although I was surprised at the speed of it. What disappointed me a little, was that in every case but one (and I made a point of emailing the site owner and thanking him – and this one was, needless to say non-commercial) they had stripped out the line crediting my uploading efforts.

    It’s difficult to understand why anyone would want to do that, other than to alienate someone who was providing something for free and therefore disincentivising anyone from doing similar work in the future. Pretty self-defeating behaviour, I think. I’d love to know if anyone can shed any light on this rather, to my mind, bizarre (and frankly rude) behaviour.

    - Andrew

  • http://dslprime.com Dave Burstein

    Larry
    You may not have been part of these negotiations, but I have enough perspective from outside to realize they would have been very different, possibly impossible, without your work.

    Which is a welcome reward for what you’ve put into this, I would expect. One reason you have that impact is your logic and earned respect, but I thought to mention that your sense of clarity and political theater adds a great deal to your effectiveness.

    I can report from the middle of a different battle some progress. Verizon has decided (a while back, incidentally) that fighting the open Internet is a mistaken battle. At least some at AT&T agree, and have held them back significantly. I know the coming Comcast system is intended to be neutral, and its obtrusiveness in practice minor. It’s rare for anyone to be even slowed for more than 15 minutes per day.

    As always, ask freely if anything I’m working on is helpful.

    db

  • Jim Carlile

    I think the working rule is that Google is scanning absolutely everything. I haven’t found one work on any humanities bibliography– from any era– that isn’t indexed at Google Books as a formal scan.

    That news is frustrating I’m sure for other ebook sites, and will probably make them redundant– which is the danger from this Google monoculture, as Kahle points out.

    What happens also when these library admins. decide to toss all their hard copies, or Google institutes a fee model someday, after they’ve soaked up all the books? As Nicholson Baker and others have pointed out recently, we are currently living in an age of real barbarism when it comes to library “weeding” policies– these guys are tossing everything, and with glee, too– it’s pretty sick.

    And are the host libraries like UC or Michigan going to be required to actually pay for their subscription to Google Books? Why should they, when they’re providing the fodder? Shouldn’t UC get it for free?

  • Jim Carlile

    Has anyone examined the actual terms of the agreement when it comes to accessing in-copyright works? It’s really pretty crappy, of little use to serious students and researchers.

    1) Even if you pay for a book, you can’t download it, all you can do is “purchase online access.” This makes it useless for serious purposes, unless Google comes up with some kind of note-taking or bookmarking ability. And how is that going to work– you have to read it all at once, or pay for access each time, or you get a time limit?

    2) Public libraries will be provided a “designated terminal” to view in-copyright works for free. Big whoop– who the hell wants to sit in front of a public computer in order to read a book?– this allowance is only suitable for browsing purposes.

    3) What requirements are there that Google will continue to provide a download ability for PD works? There don’t appear to be any in writing–it’s entirely discretionary on their part to allow this.

    The bottom line is– unless you can download books for your own use, or print them out– even portions– this settlement is just a gimmick. It really can’t be used as a substitute for the real thing. And,

    4) When is Google going to release post-1922 PD works for full-view access?

    Here’s what their own blog says about newer PD works:

    “For U.S. books published between 1923 and 1963, the rights holder needed to submit a form to the U.S. Copyright Office renewing the copyright 28 years after publication. In most cases, books that were never renewed are now in the public domain. Estimates of how many books were renewed vary, but everyone agrees that most books weren’t renewed. If true, that means that the majority of U.S. books published between 1923 and 1963 are freely usable.”

    My question is: so Google, where are they then?

    Sorry to sound cynical, but there’s really less to this new scheme than meets the eye. It’s pretty Mickey Mouse. More limited than most people know.

  • Jim Carlile

    “Libraries that permit their collections to be scanned should insist that the scanned data be made freely available so that competitors can create alternate and competitive delivery systems.”

    Well, the UC agreement specifically states that Google has to provide free full-view access to public domain works, so that’s a start.

    But there’s a hitch– a review of their Agreement shows that nowhere is Google required to allow people to download PD materials. Google is doing so– apparently– only through their good graces.

    I agree– the host libraries should require more for the in-copyright works, and in a way, they’re getting the shaft here, because they’re providing all of the goodies and getting none of the profits. I suspect that UC at least will not quite so readily agree to these fishy terms– there will be amendments to the deal.

    Also– I’d love to have someone get to the judge and point out that being forced to pay good money for mere online access is really pretty unconscionable, when downloading and printing is the only thing that is useful to scholars and students. In other words, it’s a mickey mouse, useless ripoff, when the whole point is that most of these books will never be commercially reprinted.

    Bottom line– unless you can download and print out scanned books, the whole thing is useless.

  • http://www.google.com Brian

    Association of American Publishers, not American Association of Publishers

  • River

    A couple things:

    Jim Carlile said:
    “And are the host libraries like UC or Michigan going to be required to actually pay for their subscription to Google Books? Why should they, when they’re providing the fodder? Shouldn’t UC get it for free?”

    UC & Michigan, and I believe most other libraries participating in this project do get original copies of the scans. If they were going to pay Google for a subscription to Google Books, it would be for the indexing and search capabilities, not the actual copies of the scans. Most libraries have not been providing access to their scanned copies, largely because they haven’t built systems to host them, but it looks like that will be changing for PD works with the launch of HathiTrust: http://www.hathitrust.org/

    Another class of Public Domain item, which is largely overlooked in many of these discussions, is U.S. Government Documents. Google has been using the 1922 cut off for these items as well, even though these works are, by their very nature, in the public domain. Even more frustrating are pre-1922 documents that are similarly restricted. Michigan has been making some of these documents available through their catalog, and hopefully more will be available soon as part of HathiTrust.

  • Mary

    Larry writes something about the agreement that I can not find evidence of. Is this in fact the case that “Google has committed both to making the data it can control (not private data about telephone numbers and contact info, but public data about copyright registration, terms, etc.) nonexclusively available, and more importantly, downloadable by anyone who wants to build a competing and complementary database.” ??? I don’t find this in the agreement, but it is loooong …

  • http://ipswhatsup.blogspot.com goldenrail

    “The bottom line is– unless you can download books for your own use, or print them out– even portions– this settlement is just a gimmick. “
    I disagree. Yes, it would be best if we all had the ability to download and print, but even without that, the settlement is giving us something good. It’s allowing the project to go forward. Just the search function alone, even without downloading and printing, is extremely valuable. We can literally google the library. Say I’m doing some research and put in my search terms at Google Book Search. Google produces a selection of books matching my search. I am then able to browse the areas directly around my search terms in those books to see which resources may be particularly helpful. Although I may not be able to download or print from here, I now have all the information about the book: title, author, etc. I can use this information to locate a physical copy of the book. Yes, this doesn’t help in the case of some hard-to-find books, but with on-line library card catalogs, inter-library loans and online book sellers, our chances of finding most books are pretty decent.
    The current settlement may not be ideal, but it’s certainly a step in the right direction towards more productive research and easier access.

  • Frances Grimble

    Speaking as a copyright holder, I’m appalled that Google is being allowed to get away with violating the law by scanning books with unexpired copyrights, then forcing the holders of the rights to agree to a settlement the vast majority of them had no hand in negotiating. The “class” Google and the Authors’ Guild claim this settlement applies to, includes not only every single author holding a US copyright, but every author in every country that signed the Berne agreement. If these rights holders—and the ultimate rights holder is usually the author rather than the publisher—do not hear about the settlement before the opt-out deadline, they are legally considered to have accepted it, according to the terms of the settlement. There’s no way a great many of them will hear about it, so Google will have seized control of a large amount of copyrighted material and make large profits from it.

    According to the settlement, Google is only paying $60.00 for each book whose copyright it has violated, which is ridiculous given the amount of time and money required to write and produce a book. However, anyone who wants better compensation has to take legal action, probably to opt out of the settlement entirely and file a separate lawsuit.

    A book being merely out of print does not mean it’s unavailable used, let alone that it’s rare. Many out-of-print books are later reprinted or published in revised editions, and sell well. The fine print also says in the future Google gets to declare a book “commercially unavailable” and seize use of it if it is out of print for only a year. Once Google is marketing the book, it may well become impossible for the rights holder to sell it anywhere else. Google is likely to make books very widely available, and what publisher wants to lose all those sales they would have made without Google’s competition? Google also reserves the right to declare print-on-demand books “commercially unavailable” via criteria unspecified in the settlement.

    There’s no altruism here. Google could have set up a publishing arrangement where rights holders voluntarily submitted their copyrighted work, and many of them would have. But that would not have allowed Google the quick domination of the publishing industry they have positioned themselves for. Nor would it have allowed Google to make money from a vast number of works not voluntarily submitted, as they can now do according to the settlement.

    After spending 25 years as a professional writer and editor and usually, not receiving anything like decent remuneration for my work, I am also appalled at the increasing cultural attitude that I should labor unpaid, while others who earn comfortable incomes themselves are somehow entitled to use my work for free. I call this “one-way socialism.” I need an income to pay for my housing and groceries—and my business expenses—just like non-writers. Now Google also wants to make money from my work without my permission.

    I agree that there is nothing illegal or unethical in scanning public domain works. However, I have downloaded a number of them from Google Books and I am very disappointed with the quality. If all those blurred, crooked, missing, and finger-photo’d pages are going to be our culture’s best record of public domain works, it’s not much of a service.
