The feasibility and flexibility of selecting quasars by variability using ensemble machine learning algorithms
【Abstract】 In this work, we train three decision-tree-based ensemble machine learning algorithms (Random Forest Classifier, Adaptive Boosting, and Gradient Boosting Decision Tree) to study quasar selection in the variable source catalog of SDSS Stripe 82. We build training and test samples (both containing a 1:1 ratio of quasars to stars) using the spectroscopically confirmed sources in SDSS DR14 (including 8330 quasars and 3966 stars). We find that when trained with variation parameters alone, all three models select quasars with similarly and remarkably high precision and completeness (~98.5% and 97.5%), even better than when trained with SDSS colors alone (~97.2% and 96.5%), consistent with previous studies. By applying the trained models to the variable sources without spectroscopic identifications, we estimate that the spectroscopically confirmed quasar sample in the Stripe 82 variable source catalog is ~93% complete (95% for m_i < 19.0). Using the Random Forest Classifier, we derive the relative importance of the observational features utilized for classification. We further show that even with only one or two years of time-domain observations, variability-based quasar selection could still be highly efficient.
【Key words】 quasars: general; catalogs; methods: data analysis
- 【Source】 Research in Astronomy and Astrophysics, 2021, Issue 04
- 【Classification No.】 P158
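The abstract quotes precision and completeness for the quasar class without defining them. As a minimal sketch, assuming the usual definitions (precision is the fraction of selected objects that are true quasars; completeness, also called recall, is the fraction of true quasars that are selected), the two quantities can be computed as below. The function name `precision_and_completeness` and the toy labels are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
def precision_and_completeness(true_labels, predicted_labels, positive="quasar"):
    """Return (precision, completeness) for the positive class.

    Assumed definitions (not spelled out in the abstract):
      precision    = TP / (TP + FP)
      completeness = TP / (TP + FN)
    """
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)

    selected = true_pos + false_pos   # objects the classifier called quasars
    actual = true_pos + false_neg     # objects that really are quasars
    precision = true_pos / selected if selected else 0.0
    completeness = true_pos / actual if actual else 0.0
    return precision, completeness


# Toy labels invented for illustration only (a 1:1 quasar/star sample).
truth = ["quasar", "quasar", "quasar", "quasar", "star", "star", "star", "star"]
guess = ["quasar", "quasar", "quasar", "star", "quasar", "star", "star", "star"]

precision, completeness = precision_and_completeness(truth, guess)
print(f"precision={precision:.2f}, completeness={completeness:.2f}")
```

With a balanced 1:1 sample such as the one described in the abstract, both numbers are directly comparable across feature sets (variation parameters versus colors), which is how the quoted percentages are contrasted.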