Google Books and the Book Industry


I wrote this for my Journalism class at college, but figured I might as well share it here too.

The New York Times ran a story Monday about a new lawsuit filed against HathiTrust, a partnership of universities and research libraries that maintains a digital book collection on its website.

Plaintiffs in the suit include three major authors’ groups: the Authors Guild, the Australian Society of Authors, and the Québec Union of Writers. Eight individual authors are also party to the filing, among them Pat Cummings, Roxana Robinson, and T.J. Stiles.

The objections raised in the suit center on the HathiTrust collection itself. “[S]even million copyright-protected books” (according to Paul Aiken, executive director of the Authors Guild, as quoted by the NYT) are available without any consent from the authors. The Authors Guild and its fellow plaintiffs say that the collection violates copyright law.

HathiTrust’s collection consists of books digitized by Google, Inc. as part of the Google Books project, which has been steadily scanning books from participating university libraries across the United States.

The Google Books project has been the subject of many lawsuits in the years since work on it began in 2002. A few examples will help provide context:

  • 2005: The Authors Guild sues Google for “plain and brazen violation of copyright law” (archived press release from AG via
  • 2009: French court halts Google Books in France: the ruling applies only to books published in France under copyright (Los Angeles Times article)
  • 2010: Several professional photographers’ organizations bring a class-action suit regarding the reproduction of copyrighted images within the books scanned by Google ( article)

The Authors Guild has been involved with this issue before. This time, the fight has been brought to an organization with a bit less might than Google.

But never mind who sued whom, for what, and when. The issue is really quite simple, and most of the lawsuits against Google Books have had little to no merit.

United States copyright law (the law under which most Google Books lawsuits have been filed) contains a doctrine known as Fair Use. It was originally intended to protect commentary, critique, and parody of copyrighted works. However, the four factors that define Fair Use (Cornell University Law School Legal Information Institute) reach further than that:

  1. “the purpose and character of the use” — e.g. for commentary, critique, parody, scholarship, etc.
  2. “the nature of the copyrighted work” — published/unpublished, fact/fiction
  3. “the amount and substantiality of the portion used” — how much of the work was used, and how significant the used portion is to the work as a whole
  4. “the effect of the use upon the potential market” — if the use of that portion will negatively affect demand for or the value of the original work

(Thanks to Stanford University’s Copyright & Fair Use information center for helping me refresh my own memory of these concepts.)

The way Google Books works is carefully designed to fit within existing copyright laws. Books in the public domain are fully accessible, with no restrictions. Copyrighted, in-print books allow whatever access the publisher has specified. For in-copyright books that do not have a publisher, Google restricts access to “snippets”, which show just a few words surrounding the user’s search term.
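
To make “snippets” concrete, here is a rough Python sketch of the idea: find the search term and return only a few words on either side of it. This is my own illustration, not Google’s code; the function name `snippet` and the `context_words` parameter are invented for the example, and the real system is surely more sophisticated.

```python
def snippet(text, term, context_words=5):
    """Return short excerpts: each match of `term` plus a few words around it.

    Hypothetical helper for illustration only; not how Google Books
    is actually implemented.
    """
    words = text.split()
    needle = term.lower()
    results = []
    for i, word in enumerate(words):
        # Strip surrounding punctuation so "copyright," still matches "copyright".
        if word.strip(".,;:!?\"'()").lower() == needle:
            start = max(0, i - context_words)
            end = min(len(words), i + context_words + 1)
            results.append(" ".join(words[start:end]))
    return results
```

Calling `snippet(book_text, "copyright", context_words=5)` on a full book would hand back at most eleven words per match, which is the sense in which a snippet exposes only a sliver of the work.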

So: Whenever Google Books shows a significant portion of a book, it has permission from the publisher to do so. Without permission, Google Books displays tiny fractions of the full work in an immensely transformative manner.

Google Books falls well within Fair Use doctrine, at the very least. Displaying card catalog–type information about the book plus at most a sentence or so for each search result (I’ll go down the Fair Use list):

  1. Is for scholarly reasons
  2. Uses published works
  3. Displays at most a few percent of the whole book
  4. May actually increase demand for the books featured in the results

(Parts of Lawrence Lessig’s 2006 video discussion of Google Book Search came in handy for an overview of how Google Books works.)

So why are publishers and authors suing Google and HathiTrust?

As far as I can tell,[original research?] HathiTrust follows the same rules as Google Books. This makes sense, as the content is from the Google Books program.

HathiTrust’s entire archive is intended for academic use. It’s unclear why the various plaintiffs in this new lawsuit are suing for the removal of their books from the archive, rather than suing for better access controls. If the concern is that anyone can access the books (which they can), then restricting access to verified researchers would clear up the problem.

It’s like big music, film, and television. The music industry figured out that it could simply adapt to the Internet and start offering content over the new medium, giving people an alternative to pirated copies shared through services like Napster, LimeWire, and BitTorrent. Film and television haven’t yet figured that out, and I guess the book industry is still working on it too.


I am an avid technology and software user, in addition to being reasonably well-versed in CSS, JavaScript, HTML, PHP, Python, and (though it still scares me) Perl. Aside from my technological tendencies, I am also a theatre technician, sound designer, violinist, singer, and actor.
