I try not to let my emotions get the best of me, but when it comes to the topics of informatics and data preservation, my passion sometimes gets in the way of making a cogent statement.
I just had one of those little fits that, in retrospect, looks like a childish temper tantrum. After ranting on Facebook about my disdain for research paywalls (while also mentioning I understand someone has to pay for information curation and distribution), I received some predictable and perhaps deserved comments, including (paraphrased):
• The library is free; you should use it.
• You’re another example of the “I want everything for free” crowd.
I feel compelled to address these valid points of view while also explaining my primary frustrations with the current state of information access and preservation. I keep in mind (and ask you to do the same) that I likely represent a small percentage of people with an interest in the subject of information curation and preservation, and I’m obviously biased by my own perceptions and experience.
A little background
I’m conducting historical and, to a lesser degree, scientific research for a free-to-use online wiki about laboratory informatics. It includes information about data acquisition and management as it relates to science and industry, incorporating articles about informatics, informatics tools, vendors, and open-source projects. When writing about these topics on the wiki, I’m expected to follow standards similar to Wikipedia’s: leave out the marketing and weasel words, and incorporate facts backed up by references. Using tools like search engines, Google Scholar, Google Books and the Internet Archive, I’ve typically been able to find the references I need to support claims like “Company X was bought by Company Y in 1997” or “modern laboratory information management systems (LIMS) now permit users to import and manage raw assay data.”
But sometimes the information can’t be found via those avenues, occasionally because the information is presumably not in a searchable digital format, even when searching The Invisible Web, but more often because a paywall prevents further research from being performed. It typically goes like this:
1. Enter search term in search engine or other query tool.
2. Eventually an intriguing prospect appears as a blurb of text taken from a source via the search engine or query tool.
3. Discover, more often than not, that the full text of the potentially useful research item (as either a stepping stone to finding more information or as a citable source) is locked behind a paywall like JSTOR or HighBeam Research.
4. Make decision: find a suitable citation or information source elsewhere; leave the information in the article uncited and hope for the best; or decide not to include the information, effectively making the article worse than it could be.
I’m mostly a fan of Wikipedia despite one of its major shortcomings: a lack of sufficient inline citations. The Atlantic’s associate editor Rebecca J. Rosen recently touched on this topic, stating:
[Wikipedia editor Richard] Jensen believes that there is a way out of this [problem of losing volunteers]: “Wikipedia is now a mature reference work with a stable organizational structure and a well-established reputation. The problem is that it is not mature in a scholarly sense.” Wikipedia should devote more resources toward getting editors access to higher-quality scholarship (in private databases like JSTOR), admission to military-history conferences, and maybe even training in the field of historiography, so that they could bring the articles up to a more polished, professional standard.
Whether it’s a major entity like Wikipedia or a tiny drop-in-the-bucket laboratory informatics wiki, the issue is clear and more vocalized than ever: for any free-to-use online body of information to have any sense of authority and relevance to an ever-growing digital user base, it needs to be backed by other authoritative references on the subject. This demands wider public access to otherwise private information.
Grappling with the expectation of “free”
The clamor for wider access to academic and scholarly information is not new. But recently the clamor has turned into a roar, as people are starting to fight back against the rising costs academic publishers are imposing and to question the restrictions on the flow of knowledge. Examples:
• 24 April 2012: Harvard University says it can’t afford journal publishers’ prices
• 19 June 2012: Open access is the future of academic publishing, says Finch report
• 23 October 2012: Paywalls – to build or tear down?
• 01 November 2012: Measured Innovation in Peer Review
Where does all this lead, however? It leads to more questions. What and how much scholarly and historical information should be made available? Should costs be at least reduced while access is expanded beyond scientists, academics, and students? How much knowledge should be free to the public?
Let me at least try to tackle that last question about free information. Strangely enough, in two different online locations today, I encountered enlightening comments about today’s technologically connected denizens expecting more for less via the Internet:
Quote 1: Mike Scialom at Cambridge News:
People working in the technology sector are spoilt: years of openness have produced awesome products, and the free-share ethos of the information age appears to have worked, for consumers at least. However, in the academic publishing pond, the current moves a tad slower…
Quote 2: throwersmcthrows at reddit:
I don’t pirate music because I have a problem with the pricing, or the labels the bands are on (I don’t listen to much, if any, RIAA-affiliated stuff), and I can afford to buy everything I download. I pirate because it’s free. I’ve developed an expectation of free music based on precedent, ease of access, a low risk of consequences, and the fact that these musicians are outside of my Monkeysphere. I can’t care about them, I don’t know them.
Both of these individuals rightly touch upon how the “free-share ethos of the information age” and its “ease of access” have brought with them expectations of free information, whether it’s news, music, or video games. This type of thought has inevitably led to tougher measures by media producers in the form of digital rights management (DRM) (in software, music, and e-books, for example) and more paywalls and advertising (for news providers).
As for academic publishing, the exclusivity (and thus the great cost of access) has always been there, as Sarah Kendzior, a Washington University anthropology PhD, recently wrote for Al Jazeera:
Academic publishing is structured on exclusivity. Originally, this exclusivity had to do with competition within journals. Acceptance rates at top journals are low, in some disciplines under 5 per cent, and publishing in prestigious venues was once an indication of one’s value as a scholar. Today, it all but ensures that your writing will go unread.
She goes on to talk about how this exclusivity and lack of openness has created the exact problem I am currently encountering as an independent scholar and writer:
Discussions of open access publishing have centred on whether research should be made free to the public. But this question sets up a false dichotomy between ‘the public’ and ‘the scholar’. Many people fall into a grey zone, the boundaries of which are determined by institutional affiliation and personal wealth. This category includes independent scholars, journalists, public officials, writers, scientists and others who are experts in their fields yet are unwilling or unable to pay for academic work.
This denial of resources is a loss to those who value scholarly inquiry. But it is also a loss for the academics themselves, whose ability to stay employed rests on their willingness to limit the circulation of knowledge. In academia, the ability to prohibit scholarship is considered more meaningful than the ability to produce it.
So where do all of these observations lead me in the debate on free access? I agree today’s technological paradigm has created higher expectations of free information and media.
I also recognize that, at least in some instances, creators and curators of scholarly information (as well as more common media like news and magazine articles) shouldn’t be expected to distribute it to the masses at a monetary loss. An information distribution model needs to be sustainable, which is why someone like me who falls in “the grey zone” is willing to pay a reasonable rate for tiered access to scholarly and historical research. However, that must include some sort of flexibility in the pay rate.
For example: access to X number of articles a month or year for the low cost of Y dollars, with the ability to pay for more access when I need it. As it is, a HighBeam subscription is $200 to $300 a year or $30 per month, and JSTOR is… well, JSTOR, limited to mainly academic institutions and publishers. There’s no inherent flexibility in those models for someone like me.
Related is the idea of going to a library instead (which may or may not have JSTOR or HighBeam access) and utilizing its online and physical resources. I have to answer this suggestion with “I agree, but…” It’s true the United States has some amazing physical repositories of information, though not all of them are entirely open to the public. University libraries, for example, are presumably paid for by and limited to the student body and school staff, though a few allow non-students to pay for reading access, so I begrudgingly give them a pass. Others are open to the public and often have quality physical reference material.
However, library funding is evaporating, meaning cutbacks in online access to journals and other digital repositories. There’s also the problem that libraries are still thought of as physical repositories; we desperately need people knowledgeable in library informatics to sort through all these repositories and start making curated digital collections that can be accessed online. A tightly connected collection of online knowledge brings greater relevancy to cited material and the ability to verify sources without having to commute to a library.
Despite being quite willing to pay a respectable sum for flexible online access to information or to rummage through a library’s physical stock, I believe there are certain circumstances where the free open-access model should be allowed to reign. Many (but not all) types of scientific and historical research should be made more open to encourage further research and innovation. Additionally, I believe texts of a certain age should fall into the public domain, but this gets me on the topic of why we need copyright reform, which would probably expand the scope of this article beyond what it should be. Finally, I believe the majority of news articles (I imagine I’m in the minority here; after all, how do you qualify what a “news article” is?) should either fall into the public domain after a certain period of time or be archived by an archiving service, even if a site owner has put a robot exclusion in place.
There’s a reason for that last bit about archiving. In April of this year I wrote a small essay about data archiving and the difficulties and necessities involved. In it I note the difficult question of “what do we preserve” when we live in a time when petabytes of information are being generated via the Internet. While difficult to answer, one of the major considerations is this: we have a certain duty to continue humanity’s development by preserving and disseminating our history and knowledge.
And that ties directly into my issue with information paywalls with respect to documenting and referencing our culture. While not everyone shares my enthusiasm for preserving our history and knowledge in the digital realm, I surely hope even your average citizen can see how putting information behind walls is detrimental to society as a whole, which leads me to…
“The social cost of restricting the sharing of knowledge”
This past October David Parry, assistant professor of Emerging Media and Communications at the University of Texas at Dallas, published an intriguing essay titled “Knowledge Cartels versus Knowledge Rights” that in its entirety would tie directly into this post. In his essay he discusses not only the economic issues associated with academic publishing but also the ethical issues, “the social cost of restricting the sharing of knowledge” as he puts it. Interestingly, both he and a good friend of mine — we’ll call her “D” — make the same point in this ethical consideration: capitalism isn’t helping. “[C]apitalism is the enemy of knowledge” D said after relating her and her friend’s extensive experiences in academic institutions to me. And then this from Mr. Parry:
Indeed, I would argue that controlling knowledge and the protocols of information flow is one of the primary organizing logics of post-industrial capitalist economy, the means by which those of privilege will be able to reproduce and concentrate power.
Here’s where it gets a little murky for me again. I don’t have an extensive background in economics, so hurling a barrage of vitriol at capitalism seems counterproductive. I do know enough about capitalism, economics, and history to understand the concept “information is power” and, by extension, that quality information is tough to make free when demand is high in a capitalist or “fend-for-yourself” economy. However, I also believe we don’t have to strictly follow capitalism’s rules when it comes to managing and preserving information that benefits society as a whole, or at least the broadest public possible.
I want to close this already long post with another section of Parry’s passionate text on the subject. I strongly believe his words say what I want to say (but even better) about breaking down information paywalls, though hopefully we open that access in a sustainable way:
We ought not to be complicit in an immoral and unjust system. The exchange of information is fundamental for knowledge creation and social progress. Society depends on this transfer of knowledge. Indeed, a rather crude way to look at a social space is the degree to which knowledge and information can be frictionlessly shared between parties and recognized as a public good. Even more so, I think we need to follow the work of Drahos and Braithwaite and recognize that sharing is crucial, a founding principle of having a rich democracy. Without knowledge transfer, inequalities quickly form, and political and economic power is rapidly concentrated in the few at the expense of the public. A dynamic public sphere or a just society requires a common free culture on which conversations can be built.
If we believe ourselves to be public caretakers of knowledge and culture, to have important voices that both ought to be considered and built upon, than it behooves us to take steps to guarantee that our contributions remain open to the widest possible audience, not transferred to cartels who would profit from artificially establishing scarcity.
• Academic paywalls mean publish and perish – Sarah Kendzior, Washington University
• Knowledge Cartels versus Knowledge Rights – David Parry, University of Texas at Dallas
• The problems with free – Peter Wayner, tech author