TY - JOUR
T1 - A data driven methodology for social science research with left-behind children as a case study
AU - Wu, Chao
AU - Wang, Guolong
AU - Hu, Simon
AU - Liu, Yue
AU - Mi, Hong
AU - Zhou, Ye
AU - Guo, Yi-Ke
AU - Song, Tongtong
N1 - Funding Information:
This work was supported by Fundamental Research Funds for the Central Universities, Zhejiang Natural Science Foundation (LY19F020051), Program of ZJU and Tongdun Joint Research Lab, CAS Earth Science Research Project (XDA19020104), National Natural Science Foundation of China (U19B2042). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
PY - 2020/11
Y1 - 2020/11
AB - For decades, social science research has relied on traditional correlation analysis and regression models. Advances in machine learning algorithms now make it possible to apply machine learning techniques to social science research and social issues, and these techniques may outperform standard regression methods in some cases. Against this background, this article proposes a methodological workflow for data analysis with machine learning techniques that has the potential to be widely applied to social issues. Specifically, the workflow tries to uncover the natural mechanisms behind social issues from a data-driven perspective, covering every step from feature selection to model building. The advantage of data-driven techniques in feature selection is that the workflow can be built without relying heavily on related knowledge and theory from social science. The advantage of machine learning techniques in modelling is that they can uncover non-linear and complex relationships behind social issues. The main purpose of the workflow is to find important fields relevant to the target and to provide appropriate predictions. Explaining the results, however, still requires theory and knowledge from social science. In this paper, we applied the methodological workflow to left-behind children as the social issue case, and all steps and full results are included.
UR - http://www.scopus.com/inward/record.url?scp=85096814159&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0242483
DO - 10.1371/journal.pone.0242483
M3 - Journal article
C2 - 33216786
AN - SCOPUS:85096814159
SN - 1932-6203
VL - 15
JO - PLoS ONE
JF - PLoS ONE
IS - 11
M1 - e0242483
ER -