What Do Scholars Propose for Future COVID-19 Research in Academic Publications? A Topic Analysis Based on Autoencoder

Lihuan Guo, Wei Wang, Yenchun Jim Wu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


To analyze the directions suggested for future research and to project future research plans, we extract the relevant text from 54,136 academic journal articles on COVID-19 published between the initial outbreak in January 2020 and December 2020. First, we extract and preprocess the corpus and use the Elbow method to determine that the optimal number of clusters is 7. Then, we construct a text clustering model based on an autoencoder, supported by an artificial neural network. Distance measures such as correlation, cosine, Bray-Curtis, and Jaccard are compared, and the clustering results are evaluated with normalized mutual information. The results show that cosine similarity performs best for clustering COVID-19-related documents. A topic model analysis shows that the directions of future research can mainly be grouped into the following seven categories: infectivity testing, genome analysis, vaccine testing, diagnosis and infection characteristics, pandemic management, nursing care, and clinical testing. Among them, pandemic management, diagnosis and infection characteristics, and clinical testing trended upward as a proportion of future directions. Vaccine testing remained steady over the observation window, whereas the other topics (infectivity testing, genome analysis, and nursing care) slowly trended downward. Among all the topics, medical research comprises 80%, and about 20% are related to public management, government functions, and economic development. This study enriches our scientific understanding of COVID-19 and helps us to effectively predict future scientific research output on COVID-19.
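The comparison of distance measures described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy term-count vectors, the nearest-seed assignment used in place of the autoencoder-based clustering, the arithmetic-mean normalization of mutual information, and all function names (`assign`, `nmi`, and so on) are assumptions made purely for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine distance: 1 minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def correlation(u, v):
    """Correlation distance: cosine distance of the mean-centered vectors."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


def braycurtis(u, v):
    """Bray-Curtis distance: sum |u_i - v_i| / sum |u_i + v_i|."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    return num / sum(abs(a + b) for a, b in zip(u, v))


def jaccard(u, v):
    """Jaccard distance on term presence (any nonzero count is 'present')."""
    set_u = {i for i, x in enumerate(u) if x}
    set_v = {i for i, x in enumerate(v) if x}
    return 1.0 - len(set_u & set_v) / len(set_u | set_v)


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def nmi(truth, pred):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(truth)
    ct, cp = Counter(truth), Counter(pred)
    joint = Counter(zip(truth, pred))
    mi = sum(c / n * math.log(c * n / (ct[t] * cp[p]))
             for (t, p), c in joint.items())
    denom = (entropy(truth) + entropy(pred)) / 2
    return mi / denom if denom > 0 else 1.0


def assign(docs, seeds, dist):
    """Toy clustering: label each document with its nearest seed document."""
    return [min(range(len(seeds)), key=lambda k: dist(d, docs[seeds[k]]))
            for d in docs]


# Hypothetical term-count vectors (6 documents, 4 terms) with known groups.
docs = [[3, 1, 0, 0], [2, 2, 0, 0], [4, 1, 0, 1],
        [0, 0, 3, 1], [0, 1, 2, 2], [0, 0, 4, 1]]
truth = [0, 0, 0, 1, 1, 1]
seeds = [0, 3]  # one seed document per cluster (an illustrative choice)

scores = {f.__name__: nmi(truth, assign(docs, seeds, f))
          for f in (correlation, cosine, braycurtis, jaccard)}
for name, score in scores.items():
    print(f"{name:12s} NMI = {score:.3f}")
```

On real data the four measures generally produce different cluster assignments, and the normalized mutual information score against reference labels is what allows them to be ranked.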

Original language: English
Journal: SAGE Open
Issue number: 2
Publication status: Published - 2023 Apr 1


Keywords

  • COVID-19
  • autoencoder
  • deep learning
  • text mining
  • topic model

ASJC Scopus subject areas

  • General Arts and Humanities
  • General Social Sciences

