As a member of both the New York Library and Creative Commons, I received a lot of advance notice about this week’s discussion entitled “The Battle Over Books: Authors and Publishers Take on the Google Print Library Project”. And, thanks to Larry Lessig, I got a chance to be in the audience during this match-up which forced me to reshape my thinking about Google, about Web 2.0, and about copyright regimes.
The discussion centered largely around the Google Print Library Project and Google’s decision to scan books without first asking for authorization from the copyright holders. They do contend, however, that they will remove books from their index if the copyright holder asks them to do so. In the last few months, the Authors Guild and the Association of American Publishers have sued Google, alleging violations of copyright law.
Meanwhile, a separate effort set up by some of Google’s competitors (notably Yahoo! and Microsoft) and called the Open Content Alliance has taken an opt-in approach to scanning copyright holdings, including only content that is no longer under copyright protection or content that has been expressly authorized by the copyright holder. This effort has not been sued by the two groups.
What is interesting here is that much of the debate really centers around an issue of public vs. private. Google is really creating a private holding out of content initially created by other people. While I was on the side of Google when I first heard about this debate, I am starting to wonder whether their position is correct. While it is a good thing that Google gives access to a way to search content which was not previously searchable, why is it OK for Google to not share that content with others? Why is it that they are not joining the Open Content Alliance and sharing access to content they have created? Why is it that they are creating a walled garden around content they did not create and only allow interaction with that content through Google? Those are questions that Google has not answered and that need to be answered if we are to trust the company’s unofficial “Don’t be evil” motto.
However, this is an issue that goes far beyond books when you start thinking about it. Google has largely been building a reputation based on its ability to search various types of data, assuming that the copyright holders were allowing them to do so. I first looked into that issue about 5 years ago, when Deja News put out the “For Sale” sign, which was eventually picked up by Google. What is interesting is that Google needs data. Without it, Google is useless: the value of a search engine is related to how many assets it holds and how well it can organize them. This is why size does matter even though some now try to claim it no longer does.
I would go as far as to extrapolate that this is the biggest dilemma for most Web 2.0 companies: as more and more of them rely on systems where the data is almost as important as how one interacts with it, they find themselves starving for data. However, they have to balance that with the ideal of being more transparent and sharing that data with other entities. The dilemma then becomes how to keep a private set of data in the public eye while keeping the public from stealing and/or misusing your private data.
I asked the panelists whether the issue was that Google was turning the authors’ data into private Google property and whether Google joining the Open Content Alliance would solve the problem. David Drummond, who was there representing Google, did not answer the question. Allan Adler, from the Association of American Publishers, stated that they would drop their objections (and thus potentially their lawsuit) if Google were to follow the established principles of the Open Content Alliance. In order to decipher that statement, I went back to the OCA’s website and looked for what those principles were. They are as follows:
- The OCA will encourage the greatest possible degree of access to and reuse of collections in the archive, while respecting the rights of content owners and contributors.
- Contributors will determine the terms and conditions under which their collections are distributed and how attribution should be made.
- The OCA need not be obligated to accept all content that is offered to it and may give preference to that which can be made widely accessible.
- The OCA will offer collection and item-level metadata of its hosted collections in a variety of formats.
- The OCA welcomes efforts to create and offer tools (including finding aids, catalogs, and indexes) that will enhance the usability of the materials in the archive.
- Copies of the OCA collections will reside in multiple archives internationally to ensure their long-term preservation and accessibility to all.
The last few words (“and accessibility to all”) are particularly interesting. These, I believe, may be a large part of the reason Google is not going to join the OCA.
In a way, Google is appropriating other people’s work (the actual content of the books) and creating a private property around it. Had Google created the content or provided the tools to do so, they might have a claim to being part of the creation. However, it seems that, in scanning the content, they are appropriating content which is not rightfully theirs without first asking for authority to do so. That can’t be right.
It is interesting that Google has wrapped its argument around the Fair Use doctrine, as the Copyright Office seems to clearly state that one of the factors to consider is:
“the amount and substantiality of the portion used in relation to the copyrighted work as a whole”
The reason that is interesting is that it points to an issue in terms of whether they are infringing or not. Considering the fact that they do have to copy the works in full in order to be successful in their undertaking, it seems that they would indeed be infringing under a strict reading of that section of copyright law.
One item that was overlooked by most of the media coverage is the question of price for the rights. Larry Lessig, during an exchange with Nick Taylor, of the Authors Guild, stated that he feared that the Authors Guild and the Association of American Publishers would eventually settle their lawsuit with Google. This fear is well grounded when one realizes that the majority of lawsuits are settled out of court, but it gains extra weight if there is a potential that Google will lose. To understand Lessig’s fears, however, one has to go one step further and start looking into the effect of such a settlement. First of all, Google is rich (as of this writing, Google had a market capitalization sitting north of $100 billion). There is nothing wrong with that, except for the fact that they can pay a lot more than other companies could. If they were to settle with the authors and publishers for a lot of money (which is what the receiving parties will be pushing for), they will create a precedent whereby rights that previously were available for free will now have a fairly hefty price tag.
This is not only bad for people trying to develop new businesses to compete with Google but has the potential to be bad for democracy in general, as it might create two different groups in a society: those who can pay for access to certain content and those who can’t. In the long run, that sounds like a pretty evil thing to me and this is, once again, where the need for a system that is accessible to all, with collections that reside in multiple archives, is an important prerequisite. If the authors and publishers are serious about being remunerated for their work, they are going to have to play this one for the long run. What it means is that settling is not an option! They must see this case all the way through to the Supreme Court of the United States. The reason this is necessary is that, if they settle, they change the negotiation from one where they are of equal weight to an asymmetric one where Google has all the power (because it keeps the access locked down). In the future, Google could decide whether and when those authors and publishers have a say in that relationship. This is very dangerous. In a way, the relationship is one that fits a prisoner’s dilemma scheme nicely, showing that the only solution is to keep fighting:
| | Authors settle | Authors don’t settle |
|---|---|---|
| Publishers settle | Google wins complete control | Google asks publishers to lean on authors. |
| Publishers don’t settle | Google asks authors to avoid non-settling publishers. Offers way around them. | Decision is eventually made in the Supreme Court |
It is interesting to see that there is really no choice but to fight. In a weird way, Google has become its own antithesis, being evil as a direct result of its own actions. Because, in order to protect its own economic interest, it must keep a walled garden, Google is stuck in a position where it will have to negotiate rights or lose the right to go after print. From the Google standpoint, the decision is to get one party to settle and leverage that into a position of strength to force the other party to settle. Once a settlement has been accomplished with both parties, however, Google will have established a price tag on rights. Because of that price tag, many parties (whether individuals or companies) will no longer be able to play in that space. Many could debate whether this is intentionally evil or not but few can deny that it creates an evil state of affairs.
A lot of this discussion, of course, cannot happen without taking Creative Commons into account. I was surprised that Lessig was not making more of a case for the CC license to the publishers and authors. However, it was interesting to see him grilling Allan Adler on what constituted fair rights. Adler took a very evasive approach to dodge the question, leaving it absolutely unanswered. It is, however, an important question that needs to be dealt with if any resolution is to come.
One possible compromise would be for Google to agree that it will no longer force an opt-out model in exchange for a blanket endorsement of CC by the publishers and authors. Because CC licenses have a number of variables, this might speed up the process in terms of willingness to grant rights. This would also greatly benefit the Open Content Alliance project and thus ensure that content is widely shared and distributed while allowing content authors and publishers some level of control over what rights they would give away. The funny thing is that this may, in the end, be the only way out of the mess Google has created, and that no one else seems to have suggested it.
© Tristan Louis 1994-present Some rights reserved.