The model and its operationalization

The task of fully quantifying the costs and benefits of the proposed FRPAA archiving mandate is daunting, but it is possible to gain some sense of the potential scale of impacts (Houghton et al. 2006; Houghton and Sheehan 2009; Houghton and Oppenheim et al. 2009).

The standard Solow-Swan model makes some key simplifying assumptions, including (i) that all R&D generates knowledge that is useful in economic or social terms (the efficiency of R&D), and (ii) that all knowledge is equally accessible to all entities that could make productive use of it (the accessibility of knowledge). Obviously, these assumptions are not realistic. In the real world, there are limits and barriers to access and limits to the usefulness of knowledge. So, we introduce accessibility and efficiency into the standard model as negative or friction variables, and then look at the impact on returns to R&D of reducing the friction by increasing accessibility and efficiency. Annex I presents details of the basis and development of the model.
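
In stylized terms (our notation, abstracting from the growth dynamics developed in Annex I), if accessibility and efficiency enter as proportional adjustments to the effective returns on R&D, the change in returns from relaxing the friction can be written as:

    \Delta B \approx \left[ (1 + \Delta A)(1 + \Delta E) - 1 \right] \rho R

where \Delta A and \Delta E are the proportional increases in accessibility and efficiency, \rho is the social rate of return to R&D, and R is the relevant R&D expenditure. This is a simplification for exposition only; the full treatment is in Annex I.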

Table 1: Summary of the base case parameter sources and values

Parameter | Basis | Value
ACCESSIBILITY
Percentage change in accessibility (reported access gaps) | Ware (2009): 10% to 20% of articles read presented access difficulties | Adjusting for the share of difficulties due to toll access barriers
Percentage change in accessibility (OA citation advantage) | Hajjem et al. (2005): 25% to 250% more citations; Gargouri et al. (2010) and Zhang (2006): average 100% | Adjusting for what is already OA and for articles as a share of the research stock of knowledge
Percentage change in accessibility (OA download advantage) | Davis et al. (2008): 42% to 89% more PDF and full-text downloads (average 66%) | Adjusting for what is already OA and for articles as a share of the research stock of knowledge
Combined estimate | | Taking the lower bound of the ranges above: estimate for model 4.68%
EFFICIENCY
Percentage change in efficiency (wasteful expenditure: duplicative research and blind alleys) | Scenario 1, for illustrative purposes | 1%
Percentage change in efficiency (new opportunities: collaborative opportunities and new methods) | Scenario 2, for illustrative purposes | 1%
Percentage change in efficiency (research time savings) | Scenario 3, for illustrative purposes | ..
Combined estimate | | In the absence of a grounded metric: 0%
OTHER PARAMETERS
Returns to R&D (per cent) | Conservative consensus from the literature (Arundel and Geuna 2003; Hall et al. 2009; etc.) | 20% to 60% (estimate 20%)
Local share of returns to R&D (per cent) | Consensus from the literature (Jaffe 1989; Coe and Helpman 1995; Verspagen 2004; etc.) and national citation patterns (NSB 2010) | 66%
Lag between R&D spending and impacts (years) | Mansfield (1991, 1998); Matsumoto (2008) | 3 years to publication plus 7 years to impact: 10 years
Distribution of impacts (years) | Mansfield (1991, 1998); Sveikauskas (2007); Matsumoto (2008) | Normal distribution over 10 years
Depreciation of stock of research knowledge (per cent) | BLS method (Griliches 1995; Hall 2009; Sveikauskas 2007) | Applying the BLS method, estimate 8%
Discount rate / risk premium (per cent) | Conservative consensus from the literature | 10% per annum
Source: Authors’ analysis (See Annex II).

To operationalize the model it is necessary to establish values for the accessibility and efficiency parameters, as well as the rates of return to R&D and of depreciation of the underlying stock of research knowledge. Annex II presents details of the model’s operationalization and explains the sources and rationale for the choice of base case values (summarized in Table 1).
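
To illustrate how the timing parameters in Table 1 work together, the following sketch (ours, in Python) spreads the impact of a year's R&D over a ten-year window beginning after the lag, depreciates the underlying knowledge at 8% per annum and discounts future benefits at 10%. It is a simplified reading of the approach described in the Annexes, not the model's actual implementation, and the weighting details are our assumptions.

    import math

    LAG = 10           # years from R&D spending to the onset of impacts (3 + 7, Table 1)
    WINDOW = 10        # impacts spread over 10 years (Table 1)
    DEPRECIATION = 0.08
    DISCOUNT = 0.10

    def normal_weights(n):
        """Discrete weights from a normal curve over n years, normalized to sum to 1.
        Centering and spread are our assumptions, for illustration only."""
        mu, sigma = (n - 1) / 2, n / 6
        w = [math.exp(-0.5 * ((t - mu) / sigma) ** 2) for t in range(n)]
        total = sum(w)
        return [x / total for x in w]

    def present_value_factor():
        """Present value today of $1 of eventual benefit generated by this year's R&D,
        allowing for the lag, the spread of impacts, depreciation and discounting."""
        pv = 0.0
        for i, w in enumerate(normal_weights(WINDOW)):
            year = LAG + i
            pv += w * ((1 - DEPRECIATION) ** year) / ((1 + DISCOUNT) ** year)
        return pv

    print(f"Present value per dollar of eventual benefit: {present_value_factor():.3f}")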

For the purposes of preliminary analysis we take 4.68% as a conservative estimate of the potential increase in accessibility. To put that into perspective, Ware (2009) reported that the equivalent of 10% to 20% of articles read by his survey respondents presented access difficulties, and, on average across the studies reviewed, citations doubled and downloads increased by around 66% when articles were made openly accessible (See Annex II).
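
As a rough, purely illustrative check on scale (not an output of the model, which also applies the lags, depreciation and discounting sketched above), one can scale the locally captured returns on the FRPAA agencies' R&D spending by the assumed accessibility gain:

    # Back-of-envelope arithmetic only; the figures are the base case values in Tables 1 and 2.
    DELTA_ACCESS = 0.0468      # combined lower-bound accessibility estimate
    DELTA_EFFICIENCY = 0.0     # base case assumes no grounded efficiency gain
    RATE_OF_RETURN = 0.20      # conservative social return to R&D
    LOCAL_SHARE = 0.66         # share of returns captured locally
    FEDERAL_RD = 61e9          # FRPAA agencies' R&D spending in 2008, USD

    current_local_returns = RATE_OF_RETURN * LOCAL_SHARE * FEDERAL_RD
    uplift = (1 + DELTA_ACCESS) * (1 + DELTA_EFFICIENCY) - 1
    print(f"Illustrative increase in annual local returns: ${uplift * current_local_returns / 1e6:.0f} million")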

Table 2: Summary of the base case data sources and values

Parameter | Basis | Value
Federal R&D spending (USD billions) | NSB (2010) indicators: R&D expenditure by the 11 FRPAA departments in 2008 | $61
Annual growth in federal R&D spending (per cent) | NSB (2010) indicators: reported growth over the last 10 years | 3.2%
Average annual salary of researchers (USD) | NSB (2010) indicators: reported average salaries in 2008 | 74,070
Number of articles published from federal R&D (2008) | NIH (2008): estimate based on the ratio of NIH expenditure to article output | 170,000
Number of articles published from NIH-funded research circa 2008 | NIH (2008, p. 22) | 80,000
Average annual growth in article output (per cent) | NSB (2010) indicators: over the last 10 years | 1.8%
Per-article submission-based cost (USD) | arXiv (2010) | $7
Per-article submission-based cost (USD) | NIH (2008, p. 22) | $59
Per-article life-cycle archiving cost in the first year (USD) | LIFE2 Project: year 1 life-cycle costs | $34
Per-article life-cycle cost per year in subsequent years (USD) | LIFE2 Project: subsequent-year annual life-cycle costs | $12
Time for author deposit (minutes per article) | Reported average use of the NIHMS submission system (NIH 2008, p. 14) | 10
Annual growth in archiving costs (per cent) | BLS: average US CPI over the last 10 years | 3%
Average level of compliance with mandate over 30 years (per cent) | Assumed full compliance for the base case | 100%
Embargo period (months) | Assumed six-month embargo for the base case | 6
Source: Authors’ analysis (See Annex II).

The third piece of the puzzle is the input data required for the modeling. The main requirements are the implied archiving costs, the volume of federally funded research outputs (i.e. articles), and the levels of and trends in federal research funding. For the purposes of preliminary analysis we have used publicly available sources and published estimates, and where necessary have derived estimates of our own from them. Annex II presents details of the data sources (summarized in Table 2).

Data relating to federal research funding, activities and outputs are taken from the most recent National Science Board Science and Engineering Indicators 2010 (NSB 2010). It should be noted that the FRPAA agencies’ funded article output is an estimate based on the ratio of NIH funding to articles produced; it may overstate article output and thereby inflate archiving costs, leading to an underestimate of net benefits. We explore three sources of archiving costs:

  • The LIFE2 Project (Ayris et al. 2008), which reported life-cycle costs for articles and other items held on institutional archives in the UK, and found costs equivalent to up to $34 per article in the first year, and $12 per article held per annum in subsequent years;
  • Reporting costs on a submissions equivalent basis, NIH (2008) estimated that it would cost $4.5 million per annum to host the estimated 80,000 articles from NIH funding circa 2008 and noted that they had spent a further $250,000 on policy-related staff costs, implying a per article cost of around $59 per submission; and
  • Also reporting approximate costs on a submissions equivalent basis, arXiv (2010) noted that their annual budget was $400,000 rising to $500,000 by 2012 and that 64,047 articles had been submitted in 2009, implying a per article cost of around $7 per submission.

For the purposes of producing preliminary estimates, we explore this range of costs, noting that the mid-range NIH reported costing might be the best guide. A rough comparison of the three cost bases over 30 years is sketched below.
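
The sketch below (ours, in Python) applies the base case assumptions from Table 2: 100% compliance, article output growing at 1.8% a year, unit costs growing with CPI at 3% a year, and a 10% discount rate. It is illustrative only; the model's own cost treatment may differ.

    ARTICLES_2008 = 170_000    # articles from federal R&D, 2008 (Table 2)
    ARTICLE_GROWTH = 0.018     # annual growth in article output
    COST_GROWTH = 0.03         # annual growth in archiving costs (CPI)
    DISCOUNT = 0.10            # discount rate
    YEARS = 30

    def npv_submission_cost(cost_per_article):
        """Present value of a per-submission cost applied to each year's new articles."""
        return sum(
            cost_per_article * (1 + COST_GROWTH) ** t
            * ARTICLES_2008 * (1 + ARTICLE_GROWTH) ** t
            / (1 + DISCOUNT) ** t
            for t in range(YEARS)
        )

    def npv_lifecycle_cost(first_year_cost, holding_cost):
        """Present value of LIFE2-style costs: a first-year cost per new article plus an
        annual holding cost for every article already in the archive."""
        pv, stock = 0.0, 0.0
        for t in range(YEARS):
            new = ARTICLES_2008 * (1 + ARTICLE_GROWTH) ** t
            inflator = (1 + COST_GROWTH) ** t
            pv += (new * first_year_cost + stock * holding_cost) * inflator / (1 + DISCOUNT) ** t
            stock += new
        return pv

    for label, pv in [("arXiv, $7 per submission", npv_submission_cost(7)),
                      ("NIH, $59 per submission", npv_submission_cost(59)),
                      ("LIFE2, $34 first year + $12 per year", npv_lifecycle_cost(34, 12))]:
        print(f"{label}: ${pv / 1e6:.0f} million (30-year present value)")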