BRIEF COMMUNICATION

A Thermodynamic Explanation for the Glänzel–Schubert Model for the h-Index

Gangan Prathap
National Institute of Science Communication and Information Resources, New Delhi 110 012, India. E-mail: [email protected]

Received November 11, 2010; revised January 25, 2011; accepted January 25, 2011
© 2011 ASIS&T • Published online 14 March 2011 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asi.21508

Recently, it was shown that among existing theoretical models for the h-index, the Glänzel–Schubert model provides the best fit for a chosen example involving the research evaluation of universities. In this brief communication, we propose a thermodynamic explanation for the success of the Glänzel–Schubert model of the h-index.

Ye (2011) recently showed that, of the three main contenders among existing theoretical models for the h-index (namely, Hirsch’s original approach, the Egghe–Rousseau model, and the Glänzel–Schubert model), the Glänzel–Schubert model fits best at the level of universities. Ye also provided a unified theoretical explanation for these three models. In this brief note, we propose a simple thermodynamic explanation for the success of the Glänzel–Schubert model.

The h-index (Hirsch, 2005) and many of its variants purport to measure research performance as a composite of a quality measure (usually taken as the impact $i$, expressed as the ratio of citations $C$ to papers published $P$) and a quantity measure (i.e., $P$ itself, or $C$). The Glänzel–Schubert model

$$h \sim P^{1/3} (C/P)^{2/3} \tag{1}$$

can be rewritten as

$$h \sim C^{2/3}/P^{1/3} = (C^2/P)^{1/3}. \tag{2}$$

The composite term $C^2/P$ combines both the size or quantity ($C$) and the quality ($C/P$) of scientific effort and can be interpreted as an energy-like term $X = iC$. A thermodynamic explanation or justification can then be constructed from this identification.

Consider a person who has published a single paper in a publication window (i.e., the period over which papers are published).
If the total number of papers is represented by $P$, then in this case $P = 1$. Assume this paper has collected $c$ citations over a fixed citation window (i.e., the period over which citations are counted). The total number of citations during this citation window is designated by $C$, and in this case $C = c$. We now define the energy $e$ of the single paper as $e = C^2/P = c^2/1 = c^2$. The basic or elementary unit of effort or energy is defined as the energy possessed by a single paper that gathers a single citation over the citation window. The term $e = c^2$ then amounts to $c^2$ times this elementary unit. It is the knowledge energy in a paper as measured over the citation window. We now present a few structured exercises to expand on this idea.

Exercise 1

Let us now assume that a second person has two papers, which have collected exactly $c$ citations each. The total energy (i.e., the sum of the energies) is $E = c^2 + c^2 = 2c^2$. But what, then, is the energy of the sum of the papers? We designate this by $X$ for reasons that will become clear as we proceed. Invoking the $C^2/P$ definition with $C = 2c$ and $P = 2$ gives $X = (2c)^2/2 = 2c^2$. Thus, in this thermodynamically “perfect” case, $E = X$. The h-index for this case is 1 if $c = 1$ and 2 if $c = 2$. If $c > 2$, the h-index remains at 2!

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 62(5):992–994, 2011

Exercise 2

We now introduce a thermodynamically “imperfect” case where the two papers ($P = 2$) have collected $c_1$ and $c_2$ citations, respectively. The individual energies of the papers are then $e_1 = c_1^2$ and $e_2 = c_2^2$. The total energy (defined as the sum of the individual energies) is $E = c_1^2 + c_2^2$. Let us now look at what we shall term the total exergy $X$. To compute this, we need the sum of the citations, $C = c_1 + c_2$, and the sum of the papers ($P = 2$). The “energy” of the sums (i.e., the total exergy) is $X = (c_1 + c_2)^2/2$.
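The bookkeeping in Exercises 1 and 2 can be checked numerically. The following is a minimal sketch, not part of the original note: the function names `total_energy` and `exergy` and the sample citation counts are illustrative assumptions, while the formulas are the definitions given above (the energy as the sum of squared citation counts, and the exergy as $C^2/P$).

```python
def total_energy(citations):
    """Total energy E: sum of the individual paper energies c_k**2."""
    return sum(c * c for c in citations)


def exergy(citations):
    """Exergy X = C**2 / P, with C total citations and P number of papers."""
    if not citations:
        raise ValueError("at least one paper is required")
    total_citations = sum(citations)   # C
    papers = len(citations)            # P
    return total_citations ** 2 / papers


# Hypothetical sample portfolios (not data from the paper).
perfect = [3, 3]      # Exercise 1: two papers with c = 3 citations each
imperfect = [5, 1]    # Exercise 2: same C = 6, but unevenly distributed

print(total_energy(perfect), exergy(perfect))       # 18 18.0
print(total_energy(imperfect), exergy(imperfect))   # 26 18.0
```

Both sample portfolios hold six citations and therefore have the same exergy, but only the evenly distributed one has an energy equal to its exergy.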
We see at once that, except in the case where $c_1 = c_2$, the total energy is always greater than the total exergy, since $E - X = (c_1 - c_2)^2/2 \geq 0$. Let us define this discrepancy as the entropy, which we designate by $S$. Then $S = E - X$.

Exercise 3

We now consider yet another “perfect” case. An author has $n$ papers during the publication window, all of which have collected $c$ citations each during the citation window. Then the total number of papers published during the publication window is $P = n$. The energy of each paper is $e = c^2$. The total energy (i.e., the sum of the energies) is simply $E = nc^2$. The “energy” of the sum of the papers (which we have agreed to designate by $X$), called exergy, follows from the $C^2/P$ definition with $C = nc$ and $P = n$: $X = (nc)^2/n = nc^2$. Again, this is a thermodynamically “perfect” case with $E = X$, so the entropy $S$ is zero. This is not unexpected: knowing the number of citations of any one paper, that of every other paper is predictable, being exactly the same.

However, it is interesting to observe what happens to the evaluation of the h-index and the p-index (Prathap, 2010). Whatever the value of $n$, the p-index, defined as $p = X^{1/3}$, yields $p = n^{1/3} c^{2/3}$. This formula is first hinted at in Glänzel (2006) and developed further in Glänzel (2008) and in Schubert and Glänzel (2007). The h-index, however, depends on whether $n$ is greater than or less than $c$. For all $n \leq c$, $h = n$, and for all $n \geq c$, $h = c$. The h-index is initially limited by the number of papers published, and once $n$ crosses the threshold $n = h = c$, $h$ remains at the limiting value $h = c$. The p-index does not suffer from such a limitation.

Note the significance of the curious conjunction at $n = h = p = c$. Here, we have the thermodynamically “perfect” portfolio of $p$ papers having $p$ citations per paper, so that $E = X = p^3$. The total energy is evenly distributed among all the states; that is, $p$ papers having $p^2$ units of bibliometric energy each.
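The contrast between the two indices in Exercise 3 can be reproduced with a short sketch. This is an illustration under stated assumptions rather than code from the original note: `h_index` uses the standard definition of Hirsch (2005), the largest $h$ such that $h$ papers have at least $h$ citations each; `p_index` is $X^{1/3}$ with $X = C^2/P$ as defined above; the function names and sample values are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(ranked, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h


def p_index(citations):
    """p = X**(1/3), where X = C**2 / P is the exergy."""
    if not citations:
        raise ValueError("at least one paper is required")
    total_citations = sum(citations)   # C
    papers = len(citations)            # P
    return (total_citations ** 2 / papers) ** (1 / 3)


c = 4  # hypothetical number of citations per paper
for n in (2, 4, 8, 16):
    portfolio = [c] * n
    # h saturates at c once n >= c; p keeps growing as n**(1/3) * c**(2/3)
    print(n, h_index(portfolio), round(p_index(portfolio), 3))
```

For $c = 4$, the h-index takes the values 2, 4, 4, 4 as $n$ runs through 2, 4, 8, 16, while $p$ rises from about 3.17 to about 6.35. At $n = c = 4$ the two indices coincide at $h = p = 4$, which is the conjunction noted above.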
This also leads us to a geometric analogy of the whole process. Let $i$ be the impact given by $C/P$. The exergy is defined as $C^2/P$ ($= iC = i^2 P$). Think of exergy as being represented by the volume of a rectangular prism of sides $i$, $i$, and $P$. For a given exergy $X$, $p$ is the side of a cube that has the same volume (energy) as the rectangular prism and corresponds to zero entropy, that is, no dispersion of citations over papers. Note that for a single paper, the exergy is the same as the energy (i.e., $x = e$).

Exercise 4

It is interesting now to consider a more general, thermodynamically “perfect” case where

$$X = C^{\alpha}/P^{\beta}. \tag{3}$$

An author has $n$ papers during the publication window, all of which have collected $c$ citations each during the citation window. Then the total number of papers published during the publication window is $P = n$. The energy of each paper is $e = c^{\alpha}$. The total energy (i.e., the sum of the energies) is simply $E = P c^{\alpha}$. The “energy” of the sum of the papers (which we have agreed to designate by $X$), called exergy, follows from the $C^{\alpha}/P^{\beta}$ definition with $C = nc$ and $P = n$: $X = P^{\alpha} c^{\alpha}/P^{\beta}$. As this is a thermodynamically “perfect” case with $E = X$, the entropy $S$ is zero, leading to the equality $P c^{\alpha} = P^{\alpha} c^{\alpha}/P^{\beta}$, which holds for every $P$ only if $\alpha = \beta + 1$. The simplest thermodynamically consistent relationship that can be proposed (Occam’s razor) is the case where $\alpha = 2$ and $\beta = 1$. This, indeed, is the Glänzel–Schubert model for the h-index.

This thermodynamic interpretation adds to the perspective provided by the unification of the three theoretical models for the study of h-indices (Ye, 2011). The Glänzel–Schubert model, being based on the composite term $C^2/P$, which combines both size or quantity ($C$) and quality ($C/P$) of scientific effort, leads easily to the energy-like term $X = iC$. This makes it possible to complete a trinity of thermodynamic-like terms: energy, exergy, and entropy.
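The two-paper result of Exercise 2 extends to a portfolio of any size under the $C^2/P$ definition. The short derivation below is an added worked step, not taken from the original note; it assumes the notation $c_k$ for the citations of the $k$-th paper and uses only the definitions given above: the energy as the sum of the $c_k^2$, the exergy $X = C^2/P$, and the impact $i = C/P$.

```latex
\begin{align*}
S = E - X
  &= \sum_{k=1}^{P} c_k^2 - \frac{C^2}{P}
   = \sum_{k=1}^{P} c_k^2 - P\,i^2 \\
  &= \sum_{k=1}^{P} \left( c_k - i \right)^2 \;\geq\; 0 .
\end{align*}
```

The last step follows by expanding the square and using $\sum_k c_k = C = iP$. Under these assumptions the entropy is the sum of squared deviations of the citation counts from the mean impact $i$, so it vanishes exactly when every paper has the same number of citations, and for $P = 2$ it recovers the two-paper case of Exercise 2.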
Unlike the entropy of classical statistical thermodynamics, which must be multiplied by a temperature (a separately defined quantity) to bring it into units of energy, here we have a seemingly simple relationship, $S = E - X$, in which the entropy directly has the units of energy and can be interpreted as the energy made unusable by disorder in the system. When there is perfect order (i.e., the thermodynamically “perfect” cases in Exercises 1, 3, and 4), the entropy becomes zero. $E$ can be thought of as the total internal energy of the system (disorder and all), while $X$ is the usable energy (akin to the free energy) that is available to do work externally. Neither the Hirsch-type estimation (based only on $C$) nor the Egghe–Rousseau estimation (based only on $P$) can lead to such interpretations.

There is promise that introducing a thermodynamic analogy to bibliometric research assessment will make it possible to come up with more meaningful performance indicators. Energy, exergy, and entropy are scalar quantities that can be displayed as time series (variation over time), in event terms (variation as papers are published), or in the form of phase diagrams (energy–exergy–entropy representations). Exergy (related to $h^3$) is possibly the most meaningful single-number scalar indicator of an entity’s performance, while entropy becomes a measure of the unevenness (disorder) of the research portfolio. The Glänzel–Schubert model links exergy $X$ to $h$, and this is why it appears to be most successful (Ye, 2011).

References

Glänzel, W. (2006). On the h-index – A mathematical approach to a new measure of publication activity and citation impact. Scientometrics, 67, 315–321.
Glänzel, W. (2008). On some new bibliometric applications of statistics related to the h-index (OR 0801). Leuven, Belgium: Katholieke Universiteit Leuven.
Hirsch, J.E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences, USA, 102, 16569–16572.
Prathap, G. (2010). The 100 most prolific economists using the p-index. Scientometrics, 84, 167–172.
Schubert, A., & Glänzel, W. (2007). A systematic analysis of Hirsch-type indices for journals. Journal of Informetrics, 1, 179–184.
Ye, F.Y. (2011). A unification of three models for the h-index. Journal of the American Society for Information Science and Technology, 62(1), 205–207.