Recurrent Learning on PM_{2.5} Prediction Based on Clustered Airbox Dataset

Chia Yu Lo, Wen Hsing Huang, Ming Feng Ho, Min Te Sun, Ling Jyh Chen, Kazuya Sakai, Wei Shinn Ku

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


Reliance on thermal power plants and increased vehicle emissions are the primary causes of serious air pollution. Inhaling too much particulate matter, especially PM2.5, may lead to respiratory diseases and even death. By predicting air pollutant concentrations, people can take precautions to avoid overexposure. Accurate PM2.5 prediction is therefore increasingly important. In this work, we propose a PM2.5 prediction system that uses datasets from EdiGreen Airbox and the Taiwan EPA. An autoencoder and linear interpolation are adopted to handle missing values. Spearman's correlation coefficient is used to identify the features most relevant to PM2.5. Two prediction models (i.e., LSTM and LSTM based on K-means) are implemented, each predicting the PM2.5 value for every Airbox device. To assess prediction performance, the daily average error and the hourly average accuracy over a one-week period are calculated. The experimental results show that LSTM based on K-means performs best among all methods.
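Two of the preprocessing steps named in the abstract, linear interpolation of missing readings and Spearman-based feature ranking, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`interpolate_missing`, `average_ranks`, `spearman`), the use of `None` to mark a missing reading, the nearest-value fill for gaps at the edges, and the sample values are all assumptions made for illustration. The autoencoder, K-means, and LSTM stages are not shown.

```python
from statistics import mean


def interpolate_missing(series):
    """Fill None gaps by linear interpolation between the nearest known readings.

    Assumption: gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's coefficient: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical hourly readings from one device; None marks a missing value.
pm25 = [12.0, None, 18.0, 21.0, None, None, 30.0]
humidity = [80, 75, 70, 66, 60, 55, 50]

filled = interpolate_missing(pm25)
rho = spearman(filled, humidity)  # strongly negative for this made-up sample
```

In a feature-selection setting, one would compute the coefficient between PM2.5 and each candidate feature and keep those with the largest absolute values; the cut-off used in the paper is not stated in the abstract.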

Original language: English
Journal: IEEE Transactions on Knowledge and Data Engineering
Publication status: Accepted/In press - 2020
Externally published: Yes


Keywords

  • Air pollution
  • Air quality
  • Atmospheric modeling
  • Computational modeling
  • Data models
  • Neural networks
  • Predictive models
  • Real-time systems
  • clustering
  • prediction
  • recurrent neural network

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

