Peer review and open science

Michael Jennings (London) · Science & Technology

Traditional journal-based scientific peer review works as follows. A researcher does his research and writes his paper. He then submits the paper to the editor of a journal. The editor of the journal then sends the paper to a number of other researchers (usually two or three) in the same field. These researchers then write short reports on the paper outlining what is good or bad about it and usually suggesting improvements, along with a recommendation as to whether the paper should be accepted by the journal. The reports are then forwarded to the author of the paper, who responds to suggested changes and then sends a revised version of the paper to the journal. After possibly several repetitions of this, an accepted paper will eventually be published in the journal.

Referees are supposedly anonymous. However, the author, the editor, and the referees often work in small fields where everybody knows one another, and people’s beliefs, foibles and writing styles are often well known, so this anonymity is often more theoretical than real. The theoretical reason for anonymity – that the referee can say what he pleases without consequences – does not always hold in practice. The anonymity is one sided: the referee receives a paper with the name of the author at the top. The name of a famous and influential scientist at the top has an impact. The editor is very powerful, as he gets to select the referees, and by choosing referees carefully he clearly has influence over whether a paper will be published or not. A good editor will choose referees of mixed levels of seniority (referees include everybody from graduate students to senior professors), and (in areas of some dispute) of mixed positions in any argument.

There are various ways in which this process can be corrupted, but (certainly in the field I worked in) this generally did not happen. Publishers of journals made a point of appointing people of integrity as editors. It was in their self-interest to do this, because the long term consequences of not doing so would be a loss of credibility for the journal. The danger, always, is that authors, editors, and referees all end up coming from the same clique, in which such a process can be corrupted.

Another danger is that fields become isolated from each other, and workers in one field do not properly absorb knowledge and techniques from other fields. Many scientists (and non-scientists, for that matter) use a great deal of statistics in their work, and do a great deal of computer programming. Often, they will not be experts in either statistics or computer programming. Sometimes they will do good work from a statistical perspective, and write good computer code. On the other hand, if their work is to be published in peer reviewed journals, and the referees selected by those journals are not experts in statistics or computer science, and use similarly sloppy methods themselves, then poorer quality work can at times be gotten away with. (Similarly, you should beware of anyone in business or finance who tells you that his “proprietary black box model” says this, and that he cannot show it to you because it is “proprietary”. Sloppy code and statistics are endemic there, too.)

The obvious point is that, when relevant, the peers who do the peer review should include statisticians and computer scientists as well as other workers in the same precise field as the author of the paper. Science has become very specialised, and specialists in the same field do not talk to experts in other fields nearly often enough. However, the techniques different scientists use are not nearly as specialised as many proponents think they are. With some effort, experts in one field can understand the work of experts in another.

Traditional peer review does not encourage this.

Which is why in many of the most rigorous, competitive fields, in which really good, high quality science is done, traditional peer review has lost much of its relevance.

Some history… It is easy to see the internet as something that was invented and came along in the mid 1990s on the back of the PC revolution of the 1970s and 1980s. Old fogeys sometimes talk about how the internet was really invented in 1969, but it is easy to dismiss the internet as being something experimental and insignificant before the mid 1990s.

To do this would be wrong. There was an entire world of computers before the PC, and what really happened in the mid 1990s was that some of the technology and culture from that world crossed over into the PC world and mainstream consciousness.

The important thing to understand is that if you were using a Unix computer in the 1980s, it was connected to the internet. People using Unix computers in the 1980s included most university computer scientists, most physicists, many engineers, many mathematicians. By 1985 these people were talking to one another online, which allowed them to exchange work and gossip more easily than had been the case before.

In highly mathematical fields, another extremely influential (related) development was the invention of the computer typesetting language TeX by Donald Knuth and its macro package LaTeX by Leslie Lamport, which facilitated typesetting of mathematical papers. These also became useful and widely available in the early 1980s. Prior to their invention, scientific journals were responsible for typesetting. After, researchers were expected to submit papers already typeset. Everybody preferred this, as you do not want someone who does not understand your mathematical formulas being responsible for typesetting them. Publishing scientific journals became cheaper, as one cost had been outsourced to the authors.

Cheap air travel changed things too. Scientists have attended conferences to talk to one another since the time of Newton and Leibniz, but few non-scientists would realise to what extent a modern scientific career involves flying constantly around the world to attend conferences, visit labs, and talk to your peers.

Imagine you are doing this. You present a summary of a paper you have just submitted to a conference. After you have done this, somebody in the audience asks you for more information. You have the full paper, fully typeset and printed. The journal hasn’t accepted it yet, but the full peer review and publishing process can take years and science moves faster than that. The whole purpose of coming to the conference was to get feedback and ideas from people such as the person who asked you questions, and you want feedback from them as well as the official referees allocated by the journal. If there are other people working on the same thing, and the work you are doing is repeatable and correct, you want credit for doing it first, and circulating the work to other researchers is a great way of ensuring this.

Thus evolved the concept of a preprint of a scientific paper. Typically, you write a paper, typeset it, send it to the journal, and print off lots of copies of the paper to give to other researchers. (The less common expression “postprint” refers to a paper that has been accepted by a scientific journal but not published yet. “Pre” in “preprint” refers specifically to “before peer review”.)

Except, of course, there is one other thing you do with it. You put it on the internet. You don’t just want feedback from people you meet at conferences. You want feedback from anyone who can give you useful feedback.

In 1991, a chap named Paul Ginsparg, of Los Alamos National Laboratory, decided that all these preprints of physics papers flying around the internet needed a standard repository where they could be stored and easily found. He thus created a system called ArXiv (pronounced “archive”, the spelling being a weak pun on the name of the Greek letter χ), which allowed physicists to submit preprints of their papers in a location where they could all be found.

Use of ArXiv has been ubiquitous amongst physicists for about 15 years. In the years since, it has also expanded to include papers in astronomy, mathematics, computer science, nonlinear science, quantitative biology and statistics. Its ubiquity varies a little from field to field. One also should not draw any inferences about a specific field based on whether it uses arXiv specifically. There certainly are fields full of scientists I respect enormously (particularly much biomedical science) that do not use it but which have other similar conventions and systems. The question of “Do you have some process like this?” is hugely relevant, however.

When you write a paper, you submit it to a journal, and you also upload it to ArXiv. At that point your priority on the work is established. There are some checks to make sure that uploaded papers have relevance to the field in which they are categorised (and there has inevitably been some controversy as a consequence), but the test is relevance, not correctness. Papers that fail it tend to get reclassified rather than rejected outright, and it is much easier to upload a paper to ArXiv than it is to get it published in a peer reviewed journal.

Researchers in these fields do not read peer reviewed journals, because cutting edge papers take too long to reach them. They find things out at conferences and read papers on ArXiv. They find the good papers by paying attention to researchers’ reputations and following recommendations from other researchers. Really interesting or important work will be looked at by researchers outside the field more often than is the case in peer reviewed journals, but this is possibly still a weakness of the process. Papers will be revised in response to this, and revised versions will be uploaded. A great deal more of the scientific process is occurring in public view. If you do science this way, there is relatively little to hide. This is obviously good, if you indeed have nothing to hide.

Papers are still published in peer reviewed journals. It is not unheard of for important papers to only ever be published in preprint form, but it is unusual (that said, the most usually quoted example – Grigori Perelman’s proof of the Poincaré Conjecture – might well be the most famous mathematical paper of the last decade).

Peer review matters professionally. If you are submitting a Ph.D. thesis and the work in it has already been published in reputable, peer reviewed journals, then your examiners have little work to do. If you are applying for an academic job, or for promotion or tenure, then your publication record in peer reviewed journals is central to the process. However, the peer reviewed journals are a way of keeping score. Amongst physicists at least, they are not where the work is done or how it is communicated.

We have in recent weeks heard calls from various people for science to adopt a model more resembling open source software – one aspect of which is opening access to the evolution of work to more people than a small number of officially appointed referees. The “Many eyes make all bugs shallow” philosophy surely has wider application than just to software, although when a good portion of the work is software, it’s probably even more relevant.

However, what has been less reported is that in many fields, particularly the most quantitative fields, this model already exists. The physicists got there first, partly because they got the internet a decade before most other fields. However, many others have followed. The question should be, “If not, why not?”

Or, perhaps “Show me the preprints”.

Alice

December 7, 2009 at 3:09 pm

Excellent post, Michael.

One minor addition – there are hierarchies within hierarchies. Some conferences have a combination of paper sessions (where the authors make a formal presentation) and poster sessions (where the author hangs around a poster & discusses the work with anyone who passes by). Interestingly, in some disciplines, the paper sessions tend to be dominated by the big names, while the really groundbreaking work is at the less prestigious poster sessions.

Peer review may be yesterday’s solution to the issue of ensuring scientific integrity. Tomorrow’s solution is still bubbling in the pot.

Michael Jennings

December 7, 2009 at 3:17 pm

Ah yes. The memory of being given three minutes on the floor to describe why people might want to come and talk to me about my poster is all coming back to me now. Science can be brutal.

And of course, much of the important stuff at any conference occurs in conversations in coffee sessions, conference dinners, and in the bar afterwards, but that’s certainly not unique to scientific conferences.

William H Stoddard

December 7, 2009 at 3:48 pm

I used to work directly for one of the world’s largest scientific publishers, and back in the 1990s I attended a TeX conference with representatives from several other publishers. I still work as a copy editor for scholarly journals, including several journals that get papers in TeX. And I can tell you that when it comes to actual print publication, TeX does not play the role you suppose it does, at least not in all cases.

Journal publishers want their journals to have a uniformity of typographic style that most authors are not capable of achieving. Both journal and book publishers want what they publish to look good by classic typographic standards, and most authors are not professional typographers. High-quality publications are still copy edited, and most publishers are not willing to pay the very high hourly rate that a copy editor who can work in TeX charges (several times that for a copy editor who works with WYSIWYG text).

For all these reasons, when a TeX file arrives at the publisher, one of the following commonly happens: it gets converted to a proprietary typesetting language, printed out, and turned over to the copy editor, who indicates changes that are then input into the typesetter’s file; it gets printed out straight from the TeX file and turned over to the copy editor, whose changes are then input by the typesetter into their converted file; or it gets printed out from the TeX file, copy edited, and then manually typeset from the hard copy, without the TeX code being used at all. We used to get authors asking if we could just send them the edited printout so that they could personally input the changes into the TeX file; what they did not understand was that the TeX file had already been discarded as not relevant.

This sometimes produced quirky results. I remember one case where an author had a mathematical expression that was typeset correctly when run into text, but not when displayed on a separate line. It turned out that the author had not liked the spacing in the standard TeX version of that expression in display, and had coded a macro that defined a new variant of the expression. Unfortunately, when the typesetter converted the .tex file, the conversion process did not recognize the macro, picked up part of the code that called it as text, and skipped over the rest!

I have been told that there are exceptions to this pattern; in particular, the Springer-Verlag “Lecture Notes” series are often published straight from the TeX file, with exactly the typography the author defined. But this is unusual, and represents a tradeoff of giving up typographic sophistication and elegance to allow more author control . . . which the audience of those books is unusually willing to tolerate, seemingly. But at least in a lot of cases, the exact typographic control granted to the author by TeX is all an illusion, because the publisher won’t use the .tex file directly.

What Dr. Knuth was really after was to eliminate the publisher. But that doesn’t seem to have worked out yet. Perhaps the further growth of the Web will render publishers obsolete; but it may also render TeX obsolete in favor of more advanced Web markup languages that are optimized for today’s computer environment rather than for the more restrictive one of the 1970s (remember when a megabyte was an inconceivably huge amount of memory?).

Laird

December 7, 2009 at 4:30 pm

Fascinating article, Michael. As a non-scientist I was completely unaware of the intricacies of the publication process. This takes much of the mystery out of “peer-review”; more people should be aware of it. Thanks.

The obvious questions today are whether the “scientists” at the heart of the AGW issue and “Climategate” followed a similar “preprinting” and broad public comment process and, if not (which certainly seems to be the case) why not?

Brian Micklethwait

December 7, 2009 at 5:00 pm

Yes fascinating.

I tried looking at ArXiv, and immediately found my way to a randomly selected paper, full of mathematical Sanskrit.

And the thought immediately occurred to me that all the maths acts as a force field to keep out non-members of the club, like me. In other words, for stuff like maths, total public accessibility would not create problems.

But do the more accessible subjects have anything like this, history, literary criticism, and such like? I can imagine that making lots of difficulties for the specialists involved, precisely because what they say is either easy to understand, and hence, er, borrowable by the rest of us, or not easy to understand and therefore mockable by the rest of us.

I still think total public access would be a good idea, though. Well, I would wouldn’t I? (See previous para.)

Michael Jennings

December 7, 2009 at 5:14 pm

William Stoddard: Okay, that’s fair enough. I have no experience of what goes on from the publisher’s side. I had assumed that the transition to submission in TeX form had saved publishers work and that house style was achieved by using journal specific macros and config files within TeX. If not, that’s interesting. TeX was designed by and for scientists’ minds, and I think they find it an easy way to work. The same may not be true of non-scientists.

The field I worked in was a branch of mathematics, and papers had an unusually high ratio of formula to text. Whether that influences the publisher’s technology in any way, I do not know.

Anyway, the main point I was making was that near universal use of TeX makes it easy to circulate readable and coherent preprints. That’s certainly true. Use of TeX is built into the workings of ArXiv at a very fundamental level.

steve

December 7, 2009 at 5:26 pm

“the techniques different scientists use are not nearly as specialised as many proponents think they are.”

In my experience, the tough part of dealing with another field of specialty is not the techniques used; it’s the jargon. Often, the same terms mean different things in different fields. The worst case is when the difference in meaning is subtle.

permanentexpat

December 7, 2009 at 5:31 pm

Thanks so much, Michael Jennings… as a layman I had not the foggiest idea of the complexities of peer review, although aware that, as in all such processes, it had to contain hidden minefields. Not enough that ‘to err is human’ but rather that to manipulate is human.

Johnathan Pearce

December 7, 2009 at 6:29 pm

I remember that in a posting of several months ago, lapsed Samizdata commenter IanB (what’s happened to the fella?) denounced the whole process of peer review as a form of herd mentality. But I think that was too sweeping a judgement and I still do. As Michael says, a journal with a reputation for rigour presumably wants to protect it; the reputation of the UEA’s climate centre is now royally screwed. Not clever.

Bod

December 7, 2009 at 7:22 pm

Johnathan,

I think that IanB’s right in the case of studies such as palaeoclimate research.

I was fortunate that in my former academic existence, I was in a corner of the economic geology world where the penalties and hazards of unconventional research findings were considerably lower. Journals were (and seemingly still are) more open in their editorial policies, and most of the actionable research predictions made in the field were testable, provided that a copper producer was prepared to send out a prospecting team to test the predictions that our primitive models spewed out.

In the case of climate research, there’s so much more to gain based on very little ‘science’. Furthermore, it always seemed to me that in the early (80’s/90’s) development of this scam, there was already a small, motivated group of research bodies, who were in a prime position to gain ‘first opinion advantage’. And the Feynman example of Millikan’s Oil-Drop experiment is indisputable – once the graybeards establish an orthodoxy, getting your PhD depends more on your acceptability to the pre-existing herd.

My understanding is that the CRU’s (and GISS’ for that matter) numbers – I hesitate to call it data – are used for calibration purposes, thereby tainting any other research that uses it. This double-damns CRU’s whole research department, right the way back to HADCRU1 and beyond, and should render these people persona non grata across the entire scientific community if what’s been uncovered is true. This is the equivalent of cutting 25.4 mm off the end of the standard metre. (Yeah, I know they don’t use the metre now)

My personal interest now focuses on just how much of the primary research out there is tainted because of ‘honest’ confirmation bias by early-career academics who are subject to the ‘Millikan Effect’, and how many of them were actively complicit in a bald-faced effort to affect policy.

As with all these kinds of situations, I suspect that we will see a continuum where there are very few participants at either extreme, and most in a gray area, whose research will be scrutinized and considered suspect (quite rightly) for the rest of their careers.

Gareth

December 7, 2009 at 7:47 pm

The peer review process is never ending. Getting a paper published is (or should be) little more than the first few steps. Then anyone with an interest and expertise can review it and write a competing paper.

Gaming the referees and journals and then saying ‘it’s peer reviewed so it is correct’ is not peer review, but that is what the small cabal of climate ‘scientists’ turned it into. They didn’t have to. Nor did they have to prevent true replication of their methods by not disclosing code and data in a timely fashion – that is a matter for the journals though – they could have (and one eventually did) insisted upon proper disclosure.

vulgar moralist

December 7, 2009 at 9:59 pm

According to Henry Bauer, the bureaucratization and commercialization of research – often requiring whole teams of specialists – has made it almost impossible to find disinterested reviewers in many fields. Bauer’s paper, written in 2004, could have been a detailed description of the Climategate shenanigans.

See “The war on the weather,”

http://vulgarmorality.wordpress.com/2009/11/29/the-war-on-the-weather/

Jeff

December 8, 2009 at 12:55 am

arXiv.org is certainly a step in the right direction, but I think that ESR wants to go a step further… Datasets (maybe even raw datasets) and algorithms should be made available, in addition to the papers themselves. I suppose that this is particularly important with nontrivial/nonintuitive results (and when people make multi-trillion dollar decisions based on them). From now on, I’ll certainly question time-histories of just about everything, and wonder how many “tricks” were played between the raw measurements and the results.

Ivan

December 8, 2009 at 3:27 am

I can also speak from the position of someone who has seen the sausage factory of academic research from the inside, and that was in an area with immediate practical engineering applications and strong ties to private industry, which is in itself a very strong sanity check compared to most fields.

My opinion on the institution of peer review is, on the whole, very negative. For a start, it’s hard, tedious, and unpaid work where there are no incentives whatsoever to do a quality job. A paper whose conclusions rest on complex and subtle logic, mathematics, statistics, or computer models takes an immense amount of time and effort to thoroughly check for errors, fallacies, and spin – and how many people will actually take the effort to do this without any compensation and with virtually no consequences for doing a bad job? Moreover, professors regularly offload this work to their overworked and underpaid graduate students, who are also often rookies in the field without much competence to judge how the paper really fits into the big picture and what fallacies one should watch for. Unsurprisingly, most reviews I ever got were hack jobs providing almost no worthwhile feedback at all, and pretty much all the exceptions came from those lucky occasions when the job was assigned to someone who was working on pretty much the exact same thing and thus capable of fully understanding what’s going on with just a quick casual read.

All this is further exacerbated by the fact that in modern science, as long as there is no outright plagiarism or falsification of data, there is no penalty whatsoever for carefully crafted biased presentations and arguments that are blatantly aimed at strengthening one’s case rather than an impartial search for truth – such practices are in fact necessary to get published in many fields. Engineering and medicine are especially bad in this regard; in most of these fields, you’ll virtually never get published for an honest discussion of how you tried something and failed to achieve any spectacular results, even if a lot can be learned from it, but rubbish work regularly gets published if just enough spin is put on it.

Furthermore, Michael is, if anything, too optimistic when he discusses the possibility of whole fields degenerating into incestuous cliques where the authors, editors, and referees all come from the same small group, and then use the formal process as a mere masquerade for their internal power dynamics. This happens very often, and the recent email scandal provides a glimpse into just one of the countless such cliques that monopolize particular fields of science. Also, this inbreeding process isn’t always negative, since the group in question may well consist of honest enthusiasts that do a very good job, assuming the field on the whole is fundamentally sound and not politicized.

That said, science on the whole still goes forward. As with many other human enterprises, imperfections and abuses abound, but some productive work is done nevertheless. Still, when academic research on the whole is taken into consideration, the amount of junk science published is far higher, and the ratio of signal to noise is much lower than the average person probably imagines — and the fabled “peer review” is definitely nothing like the magnificent engine of truth and expertise that it’s commonly made out to be.

In fact, peer review isn’t even a very old tradition. It’s an institution that came into prominence with the bureaucratization of science and government research funding in the post-WW2 period. Before that, the system was much more informal. Nobody peer-reviewed Einstein’s famous 1905 papers; they were merely read and approved by Max Planck, who was one of the journal’s editors. (The first time Einstein had a paper rejected by a peer-reviewed journal, sometime in the 1930s, he was outraged at the idea that some anonymous critic would be able to block his paper from being published!)

December 8, 2009 at 9:48 am

Wasn’t it Planck who said “as I grow older, I realise that ideas become accepted not by convincing their opponents but by outliving them. Funeral by funeral, science advances”?

Great post Michael and everyone.

mandrill

December 8, 2009 at 9:56 am

Fascinating piece, Michael, thank you.

Could it not be feasible to attach something like Digg to ArXiv? Then it would be peer reviewed, albeit in a limited way.

Samizdata
