TY - JOUR
T1 - A comparison of methods for detecting hot topics
AU - Tseng, Yuen Hsien
AU - Lin, Yu I.
AU - Lee, Yi Yang
AU - Hung, Wen Chi
AU - Lee, Chun Hsiang
N1 - Funding Information:
In a project funded by the Science & Technology Policy Research and Information Center in Taiwan to analyze a large set of scientific publications in agriculture, we had the opportunity to detect upward trends for a group of experts and to obtain their feedback on trend type labelling. We then analyzed the effectiveness of our detection and prediction methods based on this feedback. To make better use of this valuable expert feedback, we propose a computerized approach that evaluates different methods quantitatively, using evaluation measures from information retrieval, to determine which one is better.
PY - 2009/10
Y1 - 2009/10
N2 - In scientometric trend analysis, past studies have often chosen the parameters for observing trends ad hoc. For example, different year spans might be used to create the time sequence, and different indices chosen for trend observation. However, the effectiveness of these choices has rarely been assessed quantitatively or comparatively. This work provides clues for better interpreting the results when a certain choice is made. Specifically, by sorting research topics in decreasing order of interest predicted by a trend index and then evaluating this ordering with information retrieval measures, we compare a number of trend indices (percentage of increase vs. regression slope), trend formulations (simple trend vs. eigen-trend), and options (various year spans and durations for prediction) in different domains (safety agriculture and information retrieval) with different collection scales (72500 papers vs. 853 papers) to determine which leads to better trend observation. Our results show that the slope of linear regression on the time series consistently performs better than the others. More interestingly, this index is robust under different conditions and is hardly affected even when the collection is split into arbitrary (e.g., only two) periods. Implications of these results are discussed. Our work not only provides a method to evaluate trend prediction performance for scientometrics, but also offers insights and reflections for past and future trend observation studies.
AB - In scientometric trend analysis, past studies have often chosen the parameters for observing trends ad hoc. For example, different year spans might be used to create the time sequence, and different indices chosen for trend observation. However, the effectiveness of these choices has rarely been assessed quantitatively or comparatively. This work provides clues for better interpreting the results when a certain choice is made. Specifically, by sorting research topics in decreasing order of interest predicted by a trend index and then evaluating this ordering with information retrieval measures, we compare a number of trend indices (percentage of increase vs. regression slope), trend formulations (simple trend vs. eigen-trend), and options (various year spans and durations for prediction) in different domains (safety agriculture and information retrieval) with different collection scales (72500 papers vs. 853 papers) to determine which leads to better trend observation. Our results show that the slope of linear regression on the time series consistently performs better than the others. More interestingly, this index is robust under different conditions and is hardly affected even when the collection is split into arbitrary (e.g., only two) periods. Implications of these results are discussed. Our work not only provides a method to evaluate trend prediction performance for scientometrics, but also offers insights and reflections for past and future trend observation studies.
UR - http://www.scopus.com/inward/record.url?scp=70350383492&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70350383492&partnerID=8YFLogxK
U2 - 10.1007/s11192-009-1885-x
DO - 10.1007/s11192-009-1885-x
M3 - Article
AN - SCOPUS:70350383492
SN - 0138-9130
VL - 81
SP - 73
EP - 90
JO - Scientometrics
JF - Scientometrics
IS - 1
ER -