EAS 2021 poster contributions

Three poster contributions during EAS 2021 with the following … statistics: all of them on massive stars, two within the framework of the ASSESS project, and two on machine-learning applications.

1. Applying machine-learning methods to build a photometric classifier for massive stars in nearby galaxies

Grigoris Maravelias, Alceste Bonanos, Frank Tramper, Stephan de Wit, Ming Yang, Paolo Bonfini

Mass loss is a key parameter in the evolution of massive stars. Despite recent progress in the theoretical understanding of how stars lose mass, discrepancies between theory and observations persist. Even worse, episodic mass loss in evolved massive stars is not included in the models, while the importance of its role in the evolution of massive stars is currently undetermined. A major hindrance to determining the role of episodic mass loss is the lack of large samples of classified stars. Given the recent availability of extensive photometric catalogs from various surveys spanning a range of metallicity environments, we aim to remedy the situation by applying machine-learning techniques to these catalogs.
We compiled a large catalog of known massive stars in M31 and M33, using IR (Spitzer) and optical (Pan-STARRS) photometry, as well as Gaia astrometric information. We grouped them into 7 classes (Blue, Red, Yellow, and B[e] supergiants, Luminous Blue Variables, Wolf-Rayet stars, and outliers, e.g. QSOs and background galaxies). Using this catalog as a training set, we built an ensemble classifier utilizing color indices as features. The probabilities from three machine-learning algorithms (Support Vector Classification, Random Forests, Neural Networks) are combined to obtain the final classifications. The overall performance of the classifier is ~87%. Highly populated (Red/Blue/Yellow Supergiants) and well-defined classes (B[e] Supergiants) have high recovery rates, between ~74% and ~98%. In contrast, Wolf-Rayet sources are recovered at ~20%, while Luminous Blue Variables are almost never recovered. This is mainly due to the small sample sizes of these classes, even though M31 and M33 have spectral classifications for about 2500 massive stars. In addition, the mixing of spectral types complicates the classification, as there are no strict boundaries between those classes in the feature space (color indices).
In an independent application of the classifier to other galaxies (IC 1613, WLM, Sextans A) we obtained an overall accuracy of ~71%, despite missing values in their features (which we replace with averaged values from the training sample). This approach results in only a few percent difference, with the remaining discrepancy attributed to the different metallicity environments of the host galaxies. The classifier’s prediction capability is limited only by the available number of sources per class, reflecting the rarity of these objects and the possible physical links between these massive star phases. Our methodology is also efficient in correctly classifying sources with missing data and at lower metallicities, making it an excellent tool for spotting interesting objects and prioritizing targets for observations. Future spectroscopic observations will offer a test-bed of its actual performance along with opportunities for improvement.
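
As a rough illustration of two steps described above (combining class probabilities from several algorithms, and replacing missing features with averages from the training sample), here is a minimal pure-Python sketch. The abstract does not say how the probabilities are combined, so the unweighted mean used here is an assumption. The function names, class labels, and numbers are hypothetical and are not taken from the actual classifier.

```python
# Illustrative sketch only: names, labels, and numbers are hypothetical.

def impute_missing(features, training_means):
    """Replace missing color indices (None) with the training-sample mean."""
    return [mean if value is None else value
            for value, mean in zip(features, training_means)]


def combine_probabilities(per_model_probs):
    """Average the class probabilities of several models (assumed unweighted)."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [sum(probs[k] for probs in per_model_probs) / n_models
            for k in range(n_classes)]


def classify(per_model_probs, class_names):
    """Return the class with the highest combined probability, and that probability."""
    combined = combine_probabilities(per_model_probs)
    best = max(range(len(combined)), key=combined.__getitem__)
    return class_names[best], combined[best]


classes = ["BSG", "RSG", "YSG"]
# Made-up outputs of three models (e.g. SVC, RF, NN) for a single source.
probs = [
    [0.2, 0.7, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.6, 0.3],
]
label, confidence = classify(probs, classes)

# One source with a missing second color index, filled from training means.
filled = impute_missing([0.4, None, 1.1], training_means=[0.5, 0.8, 1.0])
```

Averaging probabilities rather than counting votes lets a confident model outweigh two hesitant ones, which is one plausible reason to combine probabilities, although the abstract does not state the motivation.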

2. A new automated tool for the spectral classification of OB stars

E. Kyritsis, G. Maravelias, A. Zezas, P. Bonfini, K. Kovlakas, P. Reig

As more and more large spectroscopic surveys become available, an automated approach to spectral classification becomes necessary. Given the importance of massive stars, it is essential to identify their phenomenological parameters (e.g., the spectral type), which can be used as proxies for their physical parameters (e.g., mass, temperature).
In this work, we use the Random Forest (RF) algorithm to develop a tool for automated spectral classification of OB-type stars into their sub-types. We use the regular RF algorithm, the Probabilistic RF (PRF), which is an extension of RF that incorporates uncertainties, and we introduce the KDE – RF method, which combines Kernel Density Estimation with the RF algorithm. We train the algorithms on the Equivalent Widths (EWs) of characteristic absorption lines measured in spectra from large Galactic (LAMOST, GOSSS) and extragalactic (2dF, VFTS) surveys with available spectral-type classifications. Following an adaptive binning approach, we group the labels of these data into 11 sub-types within the range O3-B9. We examined which of the characteristic spectral lines (features) are most important, based on a number of feature selection methods, and we searched for the optimal hyper-parameters of the classifiers to achieve the best performance.
From the feature screening process, we find that 13 spectral lines is the optimal number of features. We find that the overall accuracy score is ~ 76 % with similar results across all approaches, with our KDE – RF being slightly lower at ~ 73 %. In addition, we show that our optimized RF model can reach an overall accuracy score of ~ 85 % in the ideal case of robust measurement of the weakest characteristic spectral lines. We apply our model to other observational data sets, providing examples of potential applications of our classifier to real science cases. We find that it performs well both for single massive stars and for the massive companion stars in Be X-ray Binaries, especially for data with S/N in the range 50-300. Furthermore, we present an alternative model for lower-quality data (S/N < 25), based on a reduced feature-set classification scheme that includes only the strongest spectral lines.
The similar performance of our models indicates the robustness and reliability of the RF algorithm when used for spectral classification of early-type stars. This is further supported by the fact that we are working with real-world data and not with simulations. In addition, the approach presented in this work is very fast and applicable to products from surveys that differ in quality (e.g., different resolutions) and in format (e.g., absolute or normalized flux).
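
The features used above are equivalent widths. As a reminder of what that quantity is, the sketch below integrates 1 − F/Fc over a line window with the trapezoidal rule, following the standard textbook definition. It assumes a spectrum that is already continuum-normalized; the function name and the toy spectrum are hypothetical, and this is not the measurement code used for the poster.

```python
def equivalent_width(wavelengths, normalized_flux, start, end):
    """Trapezoidal integral of (1 - F/Fc) between start and end.

    Assumes the flux is already continuum-normalized (Fc = 1) and that the
    wavelengths are sorted in increasing order. Positive values mean absorption.
    """
    points = [(w, 1.0 - f) for w, f in zip(wavelengths, normalized_flux)
              if start <= w <= end]
    ew = 0.0
    for (w0, d0), (w1, d1) in zip(points, points[1:]):
        ew += 0.5 * (d0 + d1) * (w1 - w0)
    return ew


# Toy triangular absorption line (made-up numbers, wavelengths in Angstrom).
wave = [4469.0, 4470.0, 4471.0, 4472.0, 4473.0]
flux = [1.0, 0.8, 0.6, 0.8, 1.0]
ew = equivalent_width(wave, flux, 4469.0, 4473.0)
```

Because the integrand is a flux ratio, the result has the units of the wavelength axis (here Angstrom) and is the same whether it is measured on a normalized spectrum or on an absolute-flux spectrum divided by its continuum.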

3. Evolved massive stars in the Magellanic Clouds

Ming Yang, Alceste Bonanos, Biwei Jiang, Jian Gao, Panagiotis Gavras, Grigoris Maravelias, Man I Lam, Shu Wang, Xiaodian Chen, Yi Ren, Frank Tramper, Zoi Spetsieri

We present two clean, magnitude-limited (IRAC1 or WISE1≤15.0 mag) multiwavelength source catalogs for the Large and Small Magellanic Clouds (LMC and SMC). The catalogs were built by crossmatching (1″) and deblending (3″) between the source list of the Spitzer Enhanced Imaging Products (SEIP) and Gaia Data Release 2 (DR2), with strict constraints on the Gaia astrometric solution in order to remove foreground contamination. It is estimated that about 99.5% of the targets in our catalogs are most likely genuine members of the LMC and SMC. The LMC catalog contains 197,004 targets in 52 different bands, while the SMC catalog contains 45,466 targets in 50 different bands, ranging from the ultraviolet to the far-infrared. Additional information about radial velocities and spectral and photometric classifications was collected from the literature. For the LMC, a comparison of our sample with that of Gaia Collaboration et al. (2018) indicates that the bright end of our sample is mostly composed of blue helium-burning stars (BHeBs) and red HeBs, with inevitable contamination by main sequence stars at the blue end. For the SMC, using the evolutionary tracks and synthetic photometry from MESA Isochrones & Stellar Tracks and the theoretical J-Ks color cuts, we identified and ranked 1,405 red supergiant (RSG), 217 yellow supergiant (YSG), and 1,369 blue supergiant (BSG) candidates in five different color-magnitude diagrams (CMDs), where attention should also be paid to the incompleteness of our sample. For the LMC, due to problems with the models, we applied modified magnitude and color cuts based on previous studies, and identified and ranked 2,974 RSG, 508 YSG, and 4,786 BSG candidates in six CMDs.
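
To make the crossmatching step concrete, here is a minimal sketch of a nearest-neighbour match within a 1″ radius. It is a brute-force illustration with hypothetical function names and made-up coordinates. The actual catalogs would have needed indexed matching over far more sources, plus the deblending and astrometric cuts described above, none of which is reproduced here.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


def crossmatch(catalog_a, catalog_b, radius_arcsec=1.0):
    """For each source in catalog_a, return the index of the nearest source in
    catalog_b within radius_arcsec, or None if there is no such source."""
    matches = []
    for ra_a, dec_a in catalog_a:
        best_index, best_sep = None, radius_arcsec
        for j, (ra_b, dec_b) in enumerate(catalog_b):
            sep = angular_separation_arcsec(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_index, best_sep = j, sep
        matches.append(best_index)
    return matches


# Made-up (RA, Dec) positions in degrees.
infrared = [(80.0000, -69.0000), (80.1000, -69.1000)]
optical = [(80.0000, -69.0002), (81.0000, -70.0000)]
matches = crossmatch(infrared, optical)
```

In this toy case the first infrared source has an optical counterpart 0.72″ away and is matched, while the second has none within 1″ and is left unmatched.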
The comparison between the CMDs from the two catalogs of the LMC and SMC indicates that the most distinct difference appears at the bright red end of the optical and near-infrared CMDs, where the cool evolved stars (e.g., RSGs, asymptotic giant branch stars, and red giant stars) are located, which is likely due to the effect of metallicity and star formation history. A further quantitative comparison of the colors of massive star candidates in equal absolute magnitude bins suggests that there is essentially no difference for the BSG candidates, but a large discrepancy for the RSG candidates, since LMC targets are redder than the SMC ones, which may be due to the combined effect of metallicity on both spectral type and mass-loss rate, as well as the age effect. The effective temperatures (Teff) of the massive star populations are also derived from the reddening-free (J-Ks) color. The Teff ranges are 3500≤Teff≤5000 K for the RSG population, 5000≤Teff≤8000 K for the YSG population, and Teff≥8000 K for the BSG population, with larger uncertainties toward the hotter stars.
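
The temperature ranges quoted above can be written as a small lookup, sketched below. The abstract gives them as the ranges spanned by each population rather than as a classification rule, so using them to label individual stars is an assumption made here for illustration. The function name is hypothetical, and because the quoted ranges share their endpoints (5000 K and 8000 K), assigning each endpoint to the cooler class is an arbitrary choice.

```python
def teff_population(teff_kelvin):
    """Map an effective temperature (K) to the population ranges quoted above.

    Shared endpoints (5000 K, 8000 K) are assigned to the cooler class,
    which is an arbitrary choice made for this sketch.
    """
    if teff_kelvin < 3500:
        return None  # cooler than the quoted RSG range
    if teff_kelvin <= 5000:
        return "RSG"
    if teff_kelvin <= 8000:
        return "YSG"
    return "BSG"


# Made-up temperatures in kelvin.
labels = [teff_population(t) for t in (3000, 4200, 5000, 6500, 12000)]
```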

