Instead of relying on performance measures, the DBF concept builds on a combination of content analysis, data mining and scientometric methods. It presents a new approach in that it explicitly utilizes information present in research proposals submitted to a grant agency and relates it to the bulk of information drawn from the activities of the larger research community in a specific field. To this end, textual information (e.g., keywords and longer strings extracted from proposals) as well as references (e.g., papers or patents) are combined using, for example, human expert systems, clustering techniques, relational mapping, networks, bibliometric functions, various indices and data-filtering techniques. This approach makes consistent use of one initial core source of information, namely data from the ERC (e.g., on panels and proposals). Subsequently, additional sources are employed to enrich the core set:

1) a first supplementary data basis, added to the ERC panel data through the construction of a scientific landscape; and 2) a second supplementary data basis, added to the ERC proposal data.

The identification of frontier research through a combination of scientometric and text-mining methods is ambitious. The concept introduced here builds upon the High Level Expert Group’s notion of frontier research. The relation between the quantitative model sought here and the above definition is made transparent through the correspondence between each identified key attribute and its indicator, together with a selection function (see Figure: Core review process and corresponding indicators and the selection function).

Although each indicator has a clear measurement function and interpretation, on its own it is insufficient to represent frontier research. A faithful representation is achieved only in combination, which is implemented in the form of a selection function. Clearly, the notion of a ‘revolutionary breakthrough’ is practically inaccessible by scientometric and textual methods alone. Here two indicators capture different albeit related aspects of the research activity in question: the “timeliness” (one aspect of novelty) of the knowledge base explicitly used by the author, and the “proximity to emerging research topics” (another aspect of novelty) of the proposed research project, inferred through the dynamic change of the scientific research landscape pertinent to the discipline.

In computing indicators, an initial step identifies, from a corpus of grant applications, scientometric data (e.g., publications, citations, patents) and content data (e.g., text strings, keywords) bearing relevance to frontier research, extracts them and subjects them to data mining. This is essentially a filtering step that pre-processes raw data of high quantity into input data of lower quantity but higher quality. In a subsequent step, the actual indicators are computed automatically and robustly and passed to a selection function for comparison between empirical and model parameters. Finally, cross-validation and iterative variation of thresholds, classification criteria and metrics, as well as expert feedback from panel members and chairs, refine the performance of the model to sufficiently high usability. The following sections describe the indicators and the selection function in more detail. For various reasons, the comparison of proposals is only meaningful within one scientific discipline; thus indicator values are obtained and the selection function is applied for each discipline (panel) individually.
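To make the combination step concrete, the following minimal sketch (in Python) shows how a per-panel selection function might combine normalized indicator scores into one value per proposal. The indicator names, the z-normalization and the linear weighting are illustrative assumptions for exposition, not the implemented DBF model.

    from statistics import mean, stdev

    INDICATORS = ["timeliness", "proximity", "pasteuresqueness",
                  "risk", "interdisciplinarity"]

    def normalize(values):
        # z-normalize raw indicator values within a single panel, since
        # proposals are only compared inside one scientific discipline
        m = mean(values)
        s = stdev(values) if len(values) > 1 else 0.0
        return [(v - m) / s if s else 0.0 for v in values]

    def selection_scores(panel_proposals, weights):
        # panel_proposals: list of dicts mapping indicator name -> raw value
        # weights: dict mapping indicator name -> weight (assumed linear form)
        cols = {name: normalize([p[name] for p in panel_proposals])
                for name in INDICATORS}
        return [sum(weights[n] * cols[n][i] for n in INDICATORS)
                for i in range(len(panel_proposals))]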

DBF Scheme
Each definition below completes the phrase “Frontier research …”:

NOVELTY: “… stands at the forefront of creating new knowledge and developing new understanding. Those involved are responsible for fundamental discoveries and advances in theoretical and empirical understanding, and even achieving the occasional revolutionary breakthrough that completely changes our knowledge of the world.”

RISK: “… is an intrinsically risky endeavour. In the new and most exciting research areas, the approach or trajectory that may prove most fruitful for developing the field is often not clear. Researchers must be bold and take risks. Indeed, only researchers are generally in a position to identify the opportunities of greatest promise. The task of funding agencies is confined to supporting the best researchers with the most exciting ideas, rather than trying to identify priorities.”

PASTEURESQUENESS: “… may well be concerned with both new knowledge about the world and with generating potentially useful knowledge at the same time. Therefore, there is a much closer and more intimate connection between the resulting science and technology, with few of the barriers that arise when basic research and applied research are carried out separately.”

INTERDISCIPLINARITY: “… pursues questions irrespective of established disciplinary boundaries. It may well involve multi-, inter- or trans-disciplinary research that brings together researchers from different disciplinary backgrounds, with different theoretical and conceptual approaches, techniques, methodologies and instrumentation, perhaps even different goals and motivations.”
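The correspondence between these key attributes and the indicators developed below (novelty being captured by two indicators, timeliness and proximity, as noted above) can be restated as a simple lookup; the sketch below merely summarizes the scheme.

    # Correspondence between the key attributes of frontier research and
    # the DBF indicators, restating the scheme above (novelty is captured
    # by two indicators, as described in the text).
    ATTRIBUTE_TO_INDICATORS = {
        "NOVELTY":             ["timeliness", "proximity"],
        "RISK":                ["risk"],
        "PASTEURESQUENESS":    ["pasteuresqueness"],
        "INTERDISCIPLINARITY": ["interdisciplinarity"],
    }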

Indicator Development

Indicators are developed to relate to the decision probability of a grant application, which has in principle three possible outcomes: Type A) above threshold and funded, Type B) above threshold and not funded, and Type C) below threshold. Yet as a number of other factors may influence the peer review process, a statistical analysis (a discrete choice model) determining the actual association between indicators and funding decisions is carried out separately for Starting Grant (SG) and Advanced Grant (AG) applications.
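As an illustration of such an analysis, the sketch below fits a binary logit model on synthetic data, assuming indicator scores as regressors and a funded/not-funded outcome; the data, coefficients and the use of statsmodels are assumptions for exposition. In practice the model would be estimated separately on the SG and AG samples.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 500
    X = rng.normal(size=(n, 5))               # five indicator scores per proposal
    beta = np.array([0.8, 0.5, 0.3, 0.2, 0.4])  # illustrative coefficients
    p = 1 / (1 + np.exp(-(X @ beta - 0.5)))   # synthetic acceptance probabilities
    funded = rng.binomial(1, p)               # 1 = funded, 0 = not funded

    # estimate the association between indicators and the funding decision
    model = sm.Logit(funded, sm.add_constant(X)).fit(disp=False)
    print(model.summary())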

A discrete choice model is used to estimate how various exogenous factors influence the probability that a project proposal is accepted. The concept outlined above aims at developing quantitative methods for determining and examining the relationship between peer review and decisions on grant allocation in terms of attributes of frontier research: "Can attributes of frontier research be faithfully represented and validly quantified to evaluate the grant allocation decision by bibliometric approaches?"

The detailed development has focused on the ERC grant scheme (with data from 2007-2009), but the concept might be applicable more generally, depending on the mission, review process and guidelines of a grant scheme and on the correspondence between its attributes and the indicators. The implemented concept is intended to yield a bibliometric model in which the indicators are expected to have a positive effect on the decision probability for ERC grant applications. Thus a follow-up, more specific question is: "How well do bibliometric indicators and the decision probability discriminate between grant applicants accepted and rejected for funding?"
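As a hedged illustration of how this discrimination question might be checked numerically, the sketch below computes the area under the ROC curve for model decision probabilities against peer-review outcomes; the AUC is one standard discrimination measure, chosen here as an assumption rather than as the DBF metric itself.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    def discrimination(decision_probs, accepted):
        # decision_probs: model decision probabilities, one per application
        # accepted: 1 if the application was accepted by peer review, else 0
        return roc_auc_score(accepted, decision_probs)

    # toy usage with made-up numbers; perfect separation gives AUC = 1.0
    print(discrimination(np.array([0.9, 0.7, 0.4, 0.2]),
                         np.array([1, 1, 0, 0])))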

The described indicators and the selection function are currently being implemented. First ex post analyses comparing model and review process can be expected to show a mixture of similarity with the peer reviewers' selection (i.e., Types A/B) and dissimilarity (i.e., Type C). Depending on the correlation found between the discrimination of Types A/B vs. C obtained from the bibliometric model and the selection by peer review, the numerical algorithms for the computation of indicators might need refinement, resulting in a modification of the model, or improvements to the future operation of the peer review process can be envisioned. In any case, this requires careful investigation (statistical independence, positive vs. negative correlation, outliers), explicit differentiation between measurement concept and interpretation, and careful feedback on parameters on a discipline-specific basis (Haindl 2010). In order to serve its purpose, the development and refinement will proceed hand in hand with experts involved in the review process to determine what the metrics could be used for and how they affect the review process.

Ultimately the concept shall result in a methodology that allows the grant agency to monitor the operation of the peer review process from a bibliometric perspective and thereby provide a basis for its further refinement, including the ex ante bibliometric evaluation of future grant applications to support reviewers with orientation knowledge for their assessment.

Indicator Timeliness
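As characterized above, timeliness concerns the recency of the knowledge base explicitly used by the author. One conceivable proxy, sketched below, is the share of recent references among those cited in the proposal; the function and the five-year window are assumptions for exposition, not calibrated DBF parameters.

    def timeliness(ref_years, submission_year, window=5):
        # ref_years: publication years of the references cited in the proposal
        # returns the share of references published within `window` years
        if not ref_years:
            return 0.0
        recent = sum(1 for y in ref_years if submission_year - y <= window)
        return recent / len(ref_years)

    print(timeliness([2008, 2007, 2001, 1998], submission_year=2009))  # 0.5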


Indicator Proximity
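As characterized above, proximity concerns how close the proposed project is to emerging research topics in the dynamically changing scientific landscape. One conceivable proxy, sketched below, scores a proposal by its best keyword overlap with the term sets of emerging topics; the use of Jaccard similarity is an assumption for exposition.

    def proximity(proposal_terms, emerging_topic_terms):
        # proposal_terms: set of keywords/strings extracted from the proposal
        # emerging_topic_terms: list of term sets, one per emerging topic
        def jaccard(a, b):
            return len(a & b) / len(a | b) if (a | b) else 0.0
        return max((jaccard(proposal_terms, t) for t in emerging_topic_terms),
                   default=0.0)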

Indicator Pasteuresqueness
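Following the scheme's definition, pasteuresqueness concerns the closeness of the resulting science and technology. One conceivable proxy, sketched below, reads this off the mix of scholarly papers and patents among a proposal's references; both the use of the paper/patent mix and the functional form are assumptions for exposition.

    def pasteuresqueness(n_paper_refs, n_patent_refs):
        # maximal when scientific and technological references are balanced,
        # zero when only one kind of knowledge base is drawn upon (assumed form)
        total = n_paper_refs + n_patent_refs
        if total == 0:
            return 0.0
        share = n_patent_refs / total
        return 4 * share * (1 - share)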

Indicator Risk

Indicator Interdisciplinarity
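Following the scheme's definition, interdisciplinarity concerns the combination of different disciplinary backgrounds, approaches and methods. One conceivable proxy, sketched below, measures the diversity of subject categories among a proposal's references using Shannon entropy, a standard diversity measure; its use here is an assumption for exposition.

    from collections import Counter
    from math import log

    def interdisciplinarity(ref_disciplines):
        # ref_disciplines: one subject category per cited reference
        counts = Counter(ref_disciplines)
        total = sum(counts.values())
        if total == 0:
            return 0.0
        # Shannon entropy over the discipline distribution of the references
        return -sum((c / total) * log(c / total) for c in counts.values())

    print(interdisciplinarity(["physics", "biology", "physics", "cs"]))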