Annex II: Operationalizing the model
To operationalize the model we need to establish values for the parameters. This section outlines the sources and rationale for the choice of base case values. It also suggests where further data collection is required.
Returns to R&D: Sources and rationale for the range to be modeled
There have been many studies exploring the economic impacts of R&D at the firm, industry and national levels. A characteristic finding is that the returns to R&D are high – often in the region of 20% to 60%, and higher in some cases (Bernstein and Nadiri 1991; Griliches 1995; Industry Commission 1995; Salter and Martin 2001; Scott et al. 2002; Dowrick 2003; Shanks and Zheng 2006; Martin and Tang 2007; Sveikauskas 2007; Hall et al. 2009). While there is considerable variation in the rates of return reported, those presented in Table AII.1 are indicative. Coe and Helpman (1993), Jones and Williams (1998) and others have shown that similar rates of return arise from endogenous growth models, and champions of the evolutionary approach suggest that, limited to seeing new knowledge as the output of research, simple growth models do not include other forms of economic benefit (e.g. skills development, development of instrumentation, development of networks, etc.) (Salter and Martin 2001; Scott et al. 2002; Martin and Tang 2007). Hence, if anything, the approach used herein may understate the returns.
Table AII.1: Estimates of private and social rates of return to private R&D
Source: Salter, A.J. and Martin, B.R. (2001) ‘The economic benefits of publicly funded basic research: a critical review,’ Research Policy 30(3), p514.
There have been a number of studies showing the industry impacts of publicly funded science in general, and of scientific and scholarly publications in particular. Mansfield (1991; 1998) attempted to measure the returns to R&D for those innovations that are directly related to academic research. From a survey of R&D executives in US firms, he found that around 10% of new products and processes would not have occurred (without substantial delay) in the absence of recent academic research. In a follow-up study, Mansfield (1998) found that academic research was increasingly important for industrial innovation. Similarly, the PACE Survey of large European firms showed that firms rely heavily on scientific publications as a source of information about publicly funded research (Arundel et al. 1995). More recently, Ware (2009, p27) found that small and medium sized firms rated original research articles and review papers in journals as their most important sources of information (as did university and college based researchers). For large firms, technical information and standards were more important, but journal articles were still among the most important sources. Sveikauskas (2007, p26) concluded his review of the literature saying: “[t]hese articles show that, beyond the firm to firm transfers that comprise the core of the R&D literature, substantial technology is also transferred from government or universities to private firms.”
In establishing what is a plausible range of rates of return to use, we take a lead from the literature. Arundel and Geuna (2003, p3) surveyed the literature, and reported that estimates of the rate of return to publicly funded research ranged between 20% and 60%. Martin and Tang (2007, pp6-7) noted that:
…there have been numerous attempts to measure the economic impact of publicly funded research and development (R&D), all of which show a large positive contribution to economic growth. For instance, the studies cited in OTA (1986) and Griliches (1995) spanning over 30 years of work find a rate of return to public R&D of between 20 and 50%…
Mansfield (1991)… estimated the rate of [private] return for academic research to be 28%.
Toole (1999) has shown… that firms appropriate a [private] return on public science investment of between 12% and 41%.
Exploring the impact of research articles in the Netherlands, Verspagen (2004, pp10-11) concluded that:
In the optimistic scenario the rate of return is 81% and in the cautious scenario it is 2%. When applying the weights for domestic and foreign sources… we arrive at a point estimate of a 59% rate of return on academic research by universities in the Netherlands.
In one of the most recent reviews of the literature, Hall et al. (2009; 2010) summarized the results of almost 100 studies, showing that the returns reported in the US studies ranged from 18% to 76%. Hall et al. (2009, p23) concluded that:
On the whole, although the studies are not fully comparable, it may be concluded that R&D rates of return in developed economies during the past half century have been strongly positive and may be as high as 75% or so, although they are more likely to be in the 20% to 30% range.
In light of debates over difficulties attributing spillover and downstream returns to R&D or complementary investments, prices and measurement error (Sveikauskas 2007), we adopt the lower bound of 20% as a plausible conservative average rate of return to publicly funded research (including both private and spillover or social returns). In view of the large share of federally funded R&D in the US going to life sciences, wherein returns are often reported to be higher, this is likely to be erring on the conservative side.
Estimate for the model (lower bound of reported average returns): 20%.
How local are these returns?
There are various ways to explore the likely localization of returns within a country. Here we mention three.
1. Economic studies on the localization of returns: A number of studies have looked at the issue of the relative impact of local research on local returns and/or the international spillover of R&D. Jaffe (1989) suggested that domestic knowledge is twice as important as foreign knowledge (i.e. 66% was local). Coe and Helpman (1993; 1995) adopted a trade weighting approach, and concluded that approximately a quarter of the benefits from R&D in G-7 countries accrued to their trading partners, and 75% locally (Hall et al. 2009). Verspagen (2004, p10), citing Arundel and Geuna (2004), suggested weights for domestic versus foreign sources of 73% for domestic and 27% for foreign sources.
2. Article and patent citation patterns: Article citations reflect just one specialized area of use, namely use in further published research, and do not reflect wider economic and social application. Nevertheless, citation patterns could be seen as an indicator of the local use of local research articles. The National Science Board (2010, pO-12) reported that 60% of the articles cited in US-authored articles are themselves US-authored articles. Looking at patent citation patterns, Sveikauskas (2007, p42) noted that between 30% and 53% of US patent citations were to non-US sources.
3. Repository statistics: Repository statistics are another possible source of information on the localization of use of scholarly work, especially that which is open access. Unfortunately they cannot be applied in the US and international data present a very mixed picture, with national downloads (i.e. those to the archive’s country-code top level domain – ccTLD) varying from highs of 95% and more to lows of 20% and less. In the small international sample explored, however, the mean across repositories (N=12) was around 45%. Such download percentages will tend to understate the share of local use as there is likely to be a further share of local global top level domains (gTLDs) that remain unidentified in the data as well as a substantial number of unresolved domains. Indicatively, perhaps, one could add 45% of the gTLD and unresolved traffic to the ccTLD traffic. As such, the evidence of local use from repository download statistics is broadly in accord with the reported shares of local in total returns to R&D from the economic studies noted above.
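The indicative adjustment described for repository downloads can be sketched as follows. This is a minimal illustration, not part of the model: the 45% ccTLD share and the 45% fraction applied to other traffic come from the text, while the 40% share of gTLD and unresolved traffic is a purely hypothetical placeholder, since the text does not report that figure. The function name is also an assumption.

```python
def adjusted_local_share(cctld_share, other_share, local_fraction_of_other):
    """Local share of downloads: identified ccTLD traffic plus an assumed
    local fraction of the gTLD and unresolved traffic."""
    return cctld_share + local_fraction_of_other * other_share


# 0.45 (mean ccTLD share) and 0.45 (fraction of other traffic treated as local)
# are from the text; 0.40 is a HYPOTHETICAL gTLD/unresolved share for illustration.
share = adjusted_local_share(cctld_share=0.45,
                             other_share=0.40,
                             local_fraction_of_other=0.45)
print(f"Adjusted local share of downloads: {share:.1%}")
```

With a different gTLD and unresolved share the adjusted figure moves accordingly, so the result should be read as indicative only.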
Estimate for the model (average of reported national returns, citations and downloads): 66%.
Box AII.1: Diffusion of knowledge and returns to R&D
Source: Sveikauskas, L. (2007) R&D and Productivity Growth: A Review of the Literature, US Bureau of Labor Statistics Working Paper 408, BLS, Washington, DC. pp4-6.
Accessibility: Sources and rationale for the range to be modeled
Accessibility is defined as the proportion of the stock of knowledge generated by R&D that is accessible to those who could use it productively.
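The key question is what impact might the open archiving of articles from federally funded research, as proposed under the FRPAA, have on accessibility? This can be unpacked into the following questions:
1. What proportion of the stock of R&D knowledge produced by federally funded research is in journal articles?
2. What proportion of the stock of R&D knowledge is potentially available to archiving and alternative access?
3. What measures are there of the potential impacts of the FRPAA archiving mandate on accessibility?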
We deal with each of these in turn.
What proportion of the stock of R&D knowledge produced by federally funded research is in journal articles?
Under the assumptions of the standard approach the stock of R&D knowledge is an output of the stream of expenditure on R&D, and whatever researchers do with the R&D funding they employ can contribute to the stock of R&D knowledge. Hence, a possible proxy for the proportion of the stock of R&D knowledge that is in journal articles is the proportion of researchers’ time spent reading and writing articles (and, perhaps, peer reviewing and acting in journal editorial capacities).
Tenopir and King (2000), and the subsequent tracking studies, report the average time spent by researchers in industry and in universities on a number of tasks, including the time spent reading and writing journal articles. These studies suggest that researchers spend an average of around 90 to 100 hours per year writing journal articles. Both reading and writing habits vary between industry and university based researchers, but reading times for journal articles range from around 75 to 150 hours per year, suggesting that active researchers spend around 20% to 25% of their time reading and writing journal articles. Hence, on the basis of time spent, we could say that journal articles constitute some 20% of the stock of R&D knowledge.
What proportion of the stock of R&D knowledge is potentially available to archiving and alternative access?
Noting again that the stock of R&D knowledge is the output of the stream of expenditure on R&D, part of the answer to this question lies in the sectoral shares of R&D expenditure (i.e. the level of federal research funding).
In addition, some share of the article output from federally funded research will already be available open access, so we adjust according to the estimated share of articles produced in 2008 that are available open access (20%) (Björk et al. 2010) and for the level of compliance with the existing NIH mandate (56%) (NIH 2008, p26). These adjustments suggest that around 37% of federally funded article output is already openly accessible.
Hence, in order to focus on the incremental impacts of the proposed open archiving mandate, we limit analysis to FRPAA agency-related federally funded research and adjust for the share that is already openly accessible. To estimate the overall impacts we simply include the share of articles that are already openly accessible.
What measures are there of the potential impacts of the FRPAA archiving mandate on accessibility?
The proposed FRPAA archiving mandate relates to published research articles, so the crucial issue is the additional access that might be achieved through archiving. In the US and other developed countries, many researchers already have access to published articles through the journals concerned (e.g. via institutional library subscriptions). Outside the major institutions, however, such access can be more limited (e.g. for small firms, professionals, practitioners, educators and the general public). Such access can also be more limited for potential users in developing countries. While it is very difficult to estimate the potential increase in access, there are a number of possible proxy indicators.
Reports of access limitations, access gaps and difficulties
There have been a number of studies exploring access issues for researchers in various fields of research, institutional and sectoral settings. In a brief review of such studies, focusing very largely on research authors' access to research in developed countries, Davis (2009) found that most indicated reasonably good and improving levels of access for research authors employed in developed countries, although a significant number reported access difficulties and/or gaps, as did those from developing countries. Among the studies noted, Rowlands and Olivieri (2006) found that 67% of respondents in immunology and microbiology reported having good or excellent access (33% did not); and Ware (2007) found that among an international sample, 69% of respondents reported having good or excellent access (31% did not), and outside the US and Canada 53% of respondents reported having good or excellent access (47% did not).
Ware (2009) looked at Access by UK small and medium-sized enterprises to professional and academic literature, although his study also included researchers and users in universities and colleges, hospitals and public health facilities, public research institutions and government departments, and other practitioners, professionals and individuals (Table AII.2). Ware found that 73% of small to medium sized firm (SME) respondents, 53% of large firm respondents and 27% of university or college respondents reported having difficulties accessing articles. Just 2% of SMEs, 7% of large firms and 17% of higher education-based researchers reported having access to all the articles they needed for their work. Amongst those experiencing access difficulties, those difficulties affected 6% to 10% of articles read. Of the entire sample, however, Ware (2009, p13) concluded that the percentage of articles with access difficulties ranged between 10% and 20%, of which between 21% and 55% related to the toll access barrier. It should be noted that in Ware’s survey, 71% of SMEs reported using open access journals and 42% reported using institutional repositories, so the reported access difficulties included current levels of open access availability (accessibility).
Table AII.2: Access to research articles, June 2009 (per cent)
Source: Ware, M. (2009) Access by UK small and medium-sized enterprises to professional and academic literature, Publishing Research Consortium, Bristol.
RIN (2009, pp8-9) reported that more than 80% of survey respondents said that the difficulties they encountered in gaining access to content had an impact on their research, and nearly a fifth (16%) said that the impact was ‘significant’. The most common impacts reported were delays in research, and inconvenient and disruptive interruptions to workflow. Lack of access is also a hindrance to collaborative working, and can lead to delays in the submission of papers and of bids for funding. Peer reviewers are also hindered when they cannot access sources cited by an author, and scientists worry that lack of access to the latest findings and methodologies may lead them to undertake redundant work.
Estimates of the Open Access citation and download advantages
There are many studies of, and active discussion about, a possible open access (OA) citation advantage, with general agreement that there does seem to be an observable advantage and argument focusing mainly on why (EPS et al. 2006). The observed advantages vary considerably. In his brief review, Davis (2009) noted that: Davis et al. (2008) reported that freely-accessible articles received no more citations than subscription access articles, but they did receive significantly more downloads (i.e. an 89% increase in full text downloads, suggesting wider access and use); and Evans and Reimer (2009) found that freely accessible articles received about 8% more citations on average, and twice that for the poorer countries. In one of the more widely cited studies, Hajjem et al. (2005) concluded that:
In 2001, Lawrence found that articles in computer science that were openly accessible (OA) on the Web were cited substantially more than those that were not. We have since replicated this effect in physics. To further test its cross-disciplinary generality, we used 1,307,038 articles published across 12 years (1992-2003) in 10 disciplines (Biology, Psychology, Sociology, Health, Political Science, Economics, Education, Law, Business, Management). The overall percentage of OA (relative to total OA + NOA) articles varies from 5%-16% (depending on discipline, year and country) and is slowly climbing annually. Comparing OA and NOA articles in the same journal/year, OA articles have consistently more citations, the advantage varying from 25%-250% by discipline and year.
More recently, and most directly relevant to the proposed FRPAA archiving mandate, Gargouri et al. (2010) found that articles whose authors have supplemented subscription-based access to the publisher’s version by self-archiving their own final draft to make it accessible free for all on the web (archived) average twice as many citations as articles in the same journal and year that have not been made Open Access (not archived).
Figure AII.1: Average citation ratios for articles in the same journal and year that were and were not made OA by author self-archiving (1992-2003)
Source: Harnad, S. et al. (2004) ‘The Access/Impact Problem and the Green and Gold Roads to Open Access: An Update,’ Serials Review 34, pp36-40.
In the most recent review of the literature, Swan (2010b) summarized the findings of 36 studies noting that 27 reported finding an open access citation advantage and 4 found no advantage, and that the citation advantage found ranged from 31% to 400%. Four of the studies focused on one of the most established archiving forums, finding that, on average, articles on arXiv received twice as many citations.
Few studies have looked at the sources of citations for subscription and open access materials (i.e. identifying the users of the content and the nature of use). However, looking at the evidence from two journals, Zhang (2006) found that: on average open access articles received twice as many citations as non-open access articles; the largest increase in open access article citations came from non-scholarly documents, such as academic essays, encyclopedias, online discussions and research reports, and from course and teaching materials; and a major source of the observed citation boost came from developing countries. Zhang (2006, p155) concluded:
The Web citation advantage of OA journal JCMC was demonstrated. Published online, OA articles are freely accessible to any user having Internet access so that they may potentially have a much larger size of readership than traditional access journal articles, and consequently receive far more citations. The classification of Web citation sources shows that traditional access journal articles have a significantly smaller proportion of citations from non-formally published academic materials than OA articles. This indicates that the OA articles’ impact advantage over traditional access counterparts in informal academic communication is even more distinct than in formal communication, which is represented by formal publication and school education. The classification of Web citations by countries shows that JCMC articles receive a higher proportion of Web citations from developing countries and from a wider international scope. A convincing interpretation is that open access could effectively improve the articles’ impact in developing countries and contribute to decreasing the academic gap between developing countries and developed countries.
Of most immediate relevance are the studies relating to archiving, which show that articles that are openly archived receive around twice as many citations. Some adjustment of the citation and download advantages is necessary to take account of what might be the temporary and what a permanent advantage (Harnad 2005) and of what is already openly accessible. Björk et al. (2010) found that some 20% of articles published in 2008 were available open access, and NIH (2008, p26) reported a 56% compliance rate for their archiving mandate. Hence, we adjust the citation and download advantages to take account of an estimated average of 37% of articles from FRPAA-related federally funded research that are already openly accessible.
Box AII.2: Model parameter: Percentage change in accessibility
Source: Authors’ analysis.
For preliminary estimates we take 4.68% as a conservative estimate of the potential incremental increase in accessibility – based on the lower bound reported impacts (Box AII.2).
Efficiency: Sources and rationale for the range to be modeled
Efficiency is defined as the proportion of R&D spending that generates useful knowledge, and can have a number of dimensions relating to wasteful, inefficient and/or poorly directed research expenditure. The key question is what impact might the open archiving of research articles from federally funded research, as proposed under the FRPAA, have on efficiency?
Drawing on a previous analysis of the literature, Houghton and Oppenheim et al. (2009) suggested that key dimensions of impact might include:
With many possible impacts on efficiency (RIN 2009), but few immediately available metrics, the best we can do is to explore plausible scenarios as a way to get a sense of the potential scope and scale of possible impacts (for illustrative purposes only).
Scenario 1: Less risk of duplicative research being done and of pursuing blind alleys through greater access and more complete dissemination
If just 1% of total federally funded research time were spent performing duplicative research and pursuing blind alleys that could have been avoided if researchers had had more complete access to the findings of others, then the annual ‘saving’ would have been around $1.4 billion – equivalent to around 12 million researcher hours. With returns to publicly funded R&D of 20%, the implied lost returns (i.e. from the same amount of research expenditure that was not duplicative) would have been around $280 million annually.
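The arithmetic behind Scenario 1 can be retraced in a few lines. This is an illustrative sketch only: the $1.4 billion saving, the 12 million hours and the 20% rate of return are taken from the text, whereas the total expenditure base and the cost per researcher hour are not stated in the text and are simply what those figures imply.

```python
# Figures quoted in Scenario 1
saving = 1.4e9            # annual value of 1% of federally funded research time (USD)
researcher_hours = 12e6   # equivalent researcher hours
rate_of_return = 0.20     # base case rate of return to publicly funded R&D
wasted_share = 0.01       # share of research time assumed to be avoidable

# Implied quantities (derived here, NOT stated in the text)
implied_base = saving / wasted_share             # total expenditure base
implied_hourly_cost = saving / researcher_hours  # cost per researcher hour

# Returns forgone on the expenditure that was duplicative
lost_returns = saving * rate_of_return

print(f"Implied expenditure base: ${implied_base / 1e9:.0f} billion")
print(f"Implied cost per hour:    ${implied_hourly_cost:.0f}")
print(f"Lost annual returns:      ${lost_returns / 1e6:.0f} million")
```

Changing `wasted_share` or `rate_of_return` shows how sensitive the scenario is to its two key assumptions.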
Scenario 2: Collaborative research and new research opportunities made possible by greater access to research publications brings higher returns to R&D
It is widely held that there are advantages to collaborative research and greater use of the findings from collaborative work (Katz and Hicks 1997; Katz and Martin 1997; Walsh and Maloney 2001). Enhanced access through centralized archives can offer greater support for collaboration, on the basis of a shared common base of materials. Enhanced access through centralized archives can also increase opportunities for new research approaches (e.g. text mining). If greater and easier collaboration and new research opportunities increased the returns to federally funded R&D by just 1%, then it too would be worth around $1.4 billion per annum.
Scenario 3: Enhanced accessibility saves research time, allowing more research to be done for the same R&D expenditure
Houghton and Oppenheim et al. (2009) explored the potential research activity time and cost savings relating to: (i) reduced search and discovery time through enhanced discoverability and greater access, and less use of proprietary silo access systems; (ii) less time spent on seeking and obtaining permissions to use (copyright and licensing); (iii) less time spent on checking during peer review through greater access, in turn making for better quality review; and (iv) less time spent on writing and preparation through greater access making reference checking etc. easier. They reported potential annual research activity savings from open access of GBP 73 million in UK higher education – equivalent to around 1.2% of higher education research expenditure. Scaling these scenarios to the scope of the FRPAA article archiving mandate and translating to US research activity and expenditure levels would suggest potential US federally funded research activity savings of $43 million per annum – equivalent to around 380,000 research hours per annum. Of course, these sorts of savings might be available to all research, not just that funded federally in the US.
Box AII.3: Model parameter: Percentage change in efficiency
Source: Authors’ analysis.
However, given the lack of a grounded metric we have not included any increase in efficiency in our preliminary estimates (Box AII.3).
Other parameters: Sources and rationale for the range to be modeled
There are a number of other parameters required in the modeling of impacts, for which we have adopted conservative values so as not to risk overstating the potential benefits.
Rate of growth of R&D spending
Various subsets of federal R&D spending are examined and there are differences in spending trends between sectors and agencies. However, the National Science Board (2010) reported 5.8% per annum growth in US R&D spending over the last 10 years in current values (3.3% per annum real), and that federal spending on R&D had increased by 3.2% per annum over the last 10 years.
Estimate for the model: 3.2% per annum.
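To show what a 3.2% per annum growth rate means when compounded, the sketch below projects a spending index forward. It is an illustration under stated assumptions: the base value of 100 is an arbitrary index rather than a dollar figure from the text, constant compound growth is assumed, and the function name is hypothetical.

```python
def project_spending(base, growth_rate, years):
    """Return spending for year 0..years under constant compound growth."""
    return [base * (1 + growth_rate) ** t for t in range(years + 1)]


# 3.2% per annum is the model estimate; base=100 is an arbitrary index value.
path = project_spending(base=100.0, growth_rate=0.032, years=10)
for year, value in enumerate(path):
    print(f"Year {year:2d}: {value:6.1f}")
```

Over 10 years the index rises by a little over a third, which gives a sense of how much the assumed growth rate matters to longer-run estimates.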
Lag between R&D spending and impacts
Lags between research spending and impacts being felt can be very long in some fields, perhaps 20 to 30 years, and short in others, perhaps 1 to 2 years or less. Mansfield (1991; 1998) reported that for US firms the average lag between the publication of academic research and the timing of subsequent commercial innovation relying on it was around seven years (falling to 6.2 in the later study). One might expect some further speeding up of the research and commercialization process since that time, but we model an average lag of 10 years for the base case to take account of the seven years reported by Mansfield (1991; 1998) and allowing a further three years for the lag between project funding/expenditure and publication.
Estimate for the model: lag 10 years.
Distribution of impacts over time
As well as being lagged, impacts occur over time. Mansfield (1991; 1998) reported that for US firms the lag between the publication of academic research and the timing of subsequent commercial innovation relying on it ranged from a minimum of 4.2 years to a maximum of 9.8 years, falling to 5.2 years to 8.5 years in the later study. However, these are private returns. Sveikauskas (2007, p6) noted that: “as knowledge gradually leaks out, private benefits decline and spillover effects increase. Consequently, private and spillover returns follow different time paths… spillover effects are considerably more long lived than private effects.” Hence we distribute the impacts over approximately 10 years.
Estimate for the model: normal distribution over 10 years.
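One way to spread impacts over a 10-year window with a normal profile is sketched below. This is only an illustration of the idea and is not taken from the model itself: the discretization into annual weights, the centring of the curve on the middle of the window and the choice of standard deviation (window length divided by six) are all assumptions made here.

```python
import math


def normal_weights(n_years=10, sigma=None):
    """Annual weights following a discretized normal curve, summing to 1.

    Assumptions (not from the text): the curve is centred on the middle of
    the window and, by default, the window spans about +/- 3 standard deviations.
    """
    centre = (n_years - 1) / 2
    if sigma is None:
        sigma = n_years / 6
    raw = [math.exp(-0.5 * ((t - centre) / sigma) ** 2) for t in range(n_years)]
    total = sum(raw)
    return [w / total for w in raw]


weights = normal_weights()
for year, w in enumerate(weights, start=1):
    print(f"Year {year:2d} of window: {w:.3f} of total impact")
```

Multiplying a total impact by these weights gives the amount attributed to each year of the window; a larger `sigma` flattens the profile.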
Rate of inflation (cost increase)
Costs change differently in different areas, but overall inflation (Consumer Price Index) gives an approximate guide, and reported CPI over the last 10 years has averaged around 3% per annum.
Estimate for the model: 3% per annum.
Discount rate
There is active discussion of the appropriate discount rate to use in cost-benefit calculation, with some suggesting very low rates and others much more conservative rates (Evans and Sezer 2002; Harrison 2007). Again, we adopt the more conservative approach.
Estimate for the model: 10% per annum.
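The effect of a 10% discount rate can be seen with the standard present value formula, PV = FV / (1 + r)^t. The sketch below is illustrative: the benefit of 100 and the 10-year horizon are arbitrary values chosen here (the horizon echoes the base case lag), and the function name is an assumption.

```python
def present_value(amount, years, discount_rate=0.10):
    """Discount a future amount back to today at a constant annual rate."""
    return amount / (1 + discount_rate) ** years


# A benefit of 100 received 10 years from now, discounted at 10% per annum.
pv = present_value(100.0, years=10)
print(f"Present value: {pv:.2f}")
```

At this rate a benefit arriving after 10 years is worth well under half its nominal value today, which is why the choice of a high discount rate is conservative for benefits that are lagged.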
Box AII.4: Model parameter: Rate of return to R&D and other parameters
Source: Authors’ analysis.
Rate of depreciation of the underlying knowledge stock
Looking at the most appropriate rate of depreciation to apply, Hall et al. (2009, p16) noted that most researchers use the 15% that Griliches had settled on in his early work. However, this may be more suitable for private returns than publicly funded research. Sveikauskas (2007, p6) noted that:
“Okubo et al. (2006) calculate R&D asset stocks assuming a 15 percent (or greater) annual depreciation rate. In contrast, the Bureau of Labor Statistics (1989), measuring the longer lasting spillover effects, assumes 10 percent depreciation for applied research and development and zero depreciation for basic research, which implies an overall depreciation rate of less than 9 percent.”
If we apply these BLS rates to the balance of federally funded R&D in 2008, which was approximately 20% basic research, it implies an average depreciation rate of 8%.
Estimate for the model: 8% per annum.
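The 8% figure follows from weighting the BLS depreciation rates by the composition of federally funded R&D. A minimal sketch of that weighted average, using only the figures quoted above (variable names are illustrative):

```python
# Shares and rates quoted in the text
basic_share = 0.20           # basic research share of federally funded R&D, 2008
dep_basic = 0.00             # BLS depreciation rate for basic research
dep_applied_and_dev = 0.10   # BLS rate for applied research and development

# Weighted average depreciation rate of the knowledge stock
average_depreciation = (basic_share * dep_basic
                        + (1 - basic_share) * dep_applied_and_dev)
print(f"Average depreciation rate: {average_depreciation:.1%}")
```

A higher basic research share would lower the average rate, since basic research is assumed not to depreciate.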
Data: Sources and rationale for the base case values
The third piece of the puzzle is the input data required for the modeling. The main requirements include the implied archiving costs, the volume of federally funded research outputs (i.e. journal articles), and the levels of FRPAA-related federal research funding and expenditure trends. For the purposes of preliminary analysis we have used publicly available sources and published estimates (Box AII.5).
Box AII.5: Model parameters: Base case data sources and values
Source: Authors’ analysis.
Data relating to federal research funding, activities and outputs are taken from the most recent National Science Board Science and Engineering Indicators 2010 (NSB 2010). We explore three sources for archiving costs:
For the purposes of producing preliminary estimates, we explore this range of costs – noting that the mid-range NIH reported costing might be the best guide.
Box AII.6: A brief description of the model
Note: See Annexes I and II for details.
We have created a simplified model in MS Excel format, in order to enable anyone to examine a range of values for the various parameters, test sensitivities and explore the issues for themselves. It is available at http://www.cfses.com/FRPAA/. We encourage people to experiment with it and we would welcome any feedback.
 Useful reviews of this literature include Griliches 1995; Salter and Martin 2001; Scott et al. 2002; Shanks and Zheng 2006; Martin and Tang 2007; Sveikauskas 2007; and Hall et al. 2009; 2010.
 As Sveikauskas (2007, p6) noted: “measured benefits are limited to those that have a market evaluation. Some benefits, such as clean air or some types of medical advances, are not evaluated through market prices, and are typically not included in economic statistics.”
 Differences in rates of return to R&D across the developed countries are not large (Hall et al. 2009; Cutler et al. 2008).
 It should be noted that Ware’s sample is reported to have been based on author, subscriber and pay-per-view transaction lists supplied by publishers, even though one might expect access gaps to be less prevalent amongst such groups than more generally. As such, the study may understate access gaps.