From the Authors
Diversity in scientific research, along many different axes, is important and powerful. We believe that diversity in the race, gender and background of the scientists carrying out research is critical and foundational. It ensures broad-based scientific results, and it ensures an ever-increasing circle of role models and mentors who encourage anyone passionate about science to bring their unique talents and insights to bear on the exciting and pressing problems that the world has to offer, for the benefit of all of us.
This paper is about diversity in scientific approach. In financial theory, a portfolio of investments has low risk if those investments (e.g. debt (bonds), equity (stocks) and fixed assets (real estate)) are uncorrelated with each other: on average, if some investments perform poorly, others will perform well. We believe that it is similarly important in science to build a system that backs both a diversity of scientists and a diversity of approaches.
As a famous example, for some 20 years Alzheimer’s disease was explained by the amyloid hypothesis, predicated on the idea that Alzheimer’s is caused by the accumulation of fibrillar amyloid β (Aβ) peptide[1]. It was very difficult to get Alzheimer’s grants based on alternative mechanisms of action, and billions of dollars were spent developing Aβ-targeting therapeutics. The net result: zero efficacious medicines. Thankfully, there are now hundreds of therapeutic research programs based on alternative mechanisms of action for Alzheimer’s, but several decades were lost by not funding alternative approaches.
In this paper we took a two-step approach towards outlining a quantitative method for allocating finite resources to a diverse set of scientific approaches. The first step was the development of a model (which we have termed DELPHI) to predict research likely to have high impact in the future, where impact is defined as a time-debiased version of PageRank[2], similar to the metric used to rank webpages. This model learns the pattern of features typical of past papers that ended up being high impact. A particular example is the pattern of second- and third-order citations, which indicates how the scientific community not only recognizes but also builds upon high-impact results[3]. The second step is to create a correlation matrix that identifies clusters of research in a particular field which are relatively uncorrelated along some dimension (e.g. the citation graph for each cluster) but each of which is predicted to be of high impact. The lowest-risk path to achieving success in a given field is to back multiple uncorrelated clusters, each with a high expectation of impact[4].
In summary, we believe that a system which backs both an increasing diversity of people and an increasing diversity in the approaches they take to solving scientific problems is the best way to ensure a maximally beneficial future for all of us.
Sincerely,
James Weis and Joseph Jacobson
[1] Kametani, F. and Hasegawa, M., 2018. Reconsideration of amyloid hypothesis and tau hypothesis in Alzheimer's disease. Frontiers in Neuroscience, 12, p.25.
[2] See: Xu, S., Mariani, M.S., Lü, L. and Medo, M., 2020. Unbiased evaluation of ranking metrics reveals consistent performance in science and technology citation data. Journal of Informetrics, 14(1), p.101005. This metric has been shown to outperform more common metrics in identifying milestone research.
[3] We note that once a publication accumulates several citations, the citation graph becomes the dominant feature in predicting future impact. Before a publication has received any citations, only reputational features (e.g. author, network, journal) exist.
[4] We note that this is not the path of highest return. That is realized by picking one or a small number of clusters which, if successful, have high return but also high risk. In practice, an optimal risk-return operating point is one in which N clusters of uncorrelated approaches are funded, where N is the largest number for which the marginal utility of the Nth approach continues to exceed the marginal cost of that approach.
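The stopping rule in note [4] can be illustrated with a minimal sketch. It is not the method used in the paper. It assumes, purely for illustration, that every cluster succeeds independently with the same probability, that the field-level payoff is realized if at least one cluster succeeds, and that every cluster costs the same to fund. The names `marginal_utility`, `optimal_cluster_count`, `p_success`, `payoff` and `cost_per_cluster` are hypothetical.

```python
def marginal_utility(n, p_success, payoff):
    """Expected gain from funding the n-th cluster (n >= 1).

    Illustrative assumption: clusters succeed independently with probability
    p_success, and the payoff is earned if at least one succeeds. The n-th
    cluster only adds value when the first n - 1 have all failed.
    """
    return payoff * p_success * (1.0 - p_success) ** (n - 1)


def optimal_cluster_count(p_success, payoff, cost_per_cluster, max_clusters=1000):
    """Largest N for which the N-th cluster's marginal utility exceeds its cost.

    Under the assumptions above, marginal utility shrinks as n grows, so we
    can stop at the first cluster that no longer pays for itself.
    """
    n = 0
    while (n < max_clusters
           and marginal_utility(n + 1, p_success, payoff) > cost_per_cluster):
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical numbers: 20% success chance, payoff of 100, cost of 5 each.
    print(optimal_cluster_count(p_success=0.2, payoff=100.0, cost_per_cluster=5.0))
```

Under these assumptions, lowering the cost per cluster or raising the payoff increases N, while a cost above the first cluster's expected gain yields N = 0, i.e. no cluster is worth funding.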