• ISSN: 2010-3646
• Abbreviated Title: Int. J. Social. Scienc. Humanit.
• Frequency: Bimonthly (2011-2014); Monthly (2015-2018); Quarterly (Since 2019)
• DOI: 10.18178/IJSSH
• Editor-in-Chief: Prof. Aurica Briscaru
• Executive Editor: Mr. Ron C. Wu
• Abstracting/Indexing: Google Scholar, Index Copernicus, Crossref, Electronic Journals Library
• E-mail: ijssh@ejournal.net
IJSSH 2018 Vol.8(7): 220-224 ISSN: 2010-3646
doi: 10.18178/ijssh.2018.V8.964

Estimation of Beijing Air Quality Index Using Baidu Search Entries

Fengyuan Pan
Abstract—Protecting the environment while sustaining economic growth is a difficult task for every country in the world, especially for China. China has required major cities to publicise their Air Pollution Index since 2000 (changed to Air Quality Index, AQI, in 2012). Since then, the AQI has become one of the critical indicators by which the central government assesses the performance of local governments. A comparison of AQI data from the US Embassy and from 35 Beijing air quality monitoring stations reveals significant manipulation of the reported AQI values (to just below the Blue Sky threshold of 100). This research aims to find a way to predict the true AQI values from search entries in Baidu, the largest search engine in China. This would remove the need to rely on the data reported by the air quality monitoring stations, which seem to be unreliable. In total, 73 search entries relating to air pollution and haze were collected from Baidu and used in a LASSO (least absolute shrinkage and selection operator) analysis. Cross-validation was used to validate the LASSO analysis and to select the shrinkage factor. After LASSO selection and cross-validation, 33 predictors remained, predicting AQI from search entries with an R² of 0.69. These results indicate that search entries offer an alternative way to predict AQI, explaining about 69% of its variance. Owing to limited time, the dataset includes only 73 search entries; future research could expect considerably higher prediction accuracy if more than 500 search entries were included.
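
The workflow described in the abstract (fit a LASSO model, choose the shrinkage factor by cross-validation, count the predictors that keep non-zero coefficients, and report R²) can be illustrated with a minimal sketch. This is not the author's implementation: the data are synthetic stand-ins for search-entry volumes, the function names (`fit_lasso`, `cv_choose_lambda`, and so on) are hypothetical, the candidate shrinkage values are arbitrary, and coordinate descent is assumed as the solver because the abstract does not name one. The R² here is computed in-sample, purely for illustration.

```python
import random

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; anything within [-lam, lam] becomes 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def fit_lasso(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes j's own fit.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
        # The intercept is not penalised: re-centre the residuals.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta

def predict(X, b0, beta):
    return [b0 + sum(x * b for x, b in zip(row, beta)) for row in X]

def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def cv_choose_lambda(X, y, lambdas, k=5):
    """Return the candidate lam with the lowest k-fold held-out squared error."""
    n = len(X)
    folds = [list(range(i, n, k)) for i in range(k)]
    best_lam, best_mse = None, float("inf")
    for lam in lambdas:
        sq_err = 0.0
        for fold in folds:
            held = set(fold)
            X_tr = [X[i] for i in range(n) if i not in held]
            y_tr = [y[i] for i in range(n) if i not in held]
            b0, beta = fit_lasso(X_tr, y_tr, lam)
            preds = predict([X[i] for i in fold], b0, beta)
            sq_err += sum((y[i] - q) ** 2 for i, q in zip(fold, preds))
        mse = sq_err / n
        if mse < best_mse:
            best_lam, best_mse = lam, mse
    return best_lam

# Synthetic data (an assumption): 10 candidate predictors, only 3 truly matter.
random.seed(0)
n, p = 120, 10
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_beta = [3.0, -2.0, 1.5] + [0.0] * (p - 3)
y = [100 + sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0, 1)
     for row in X]

lam = cv_choose_lambda(X, y, [0.01, 0.05, 0.1, 0.5])
b0, beta = fit_lasso(X, y, lam)
selected = [j for j, b in enumerate(beta) if b != 0.0]
r2 = r_squared(y, predict(X, b0, beta))
print("chosen lam:", lam, "| predictors kept:", selected, "| R^2:", round(r2, 3))
```

In this sketch, the predictors whose coefficients are driven exactly to zero are the ones LASSO discards, which is the sense in which 33 of the 73 search entries "remained" in the abstract. A held-out R² would be a stricter measure of predictive performance than the in-sample value printed here.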

Index Terms—Air quality index prediction, search entries, justification of air quality index, LASSO, cross-validation.

Fengyuan Pan is with University College London, UK (email: fengyuan.pan.15@ucl.ac.uk).


Cite: Fengyuan Pan, "Estimation of Beijing Air Quality Index Using Baidu Search Entries," International Journal of Social Science and Humanity, vol. 8, no. 7, pp. 220-224, 2018.
