We probe the application of generative adversarial networks (GANs) to the star-galaxy classification problem in a semi-supervised setting. Using data from the Sloan Digital Sky Survey (SDSS), we demonstrate that semi-supervised GANs are able to produce accurate and well-calibrated classifications using only a small number of labeled examples.
The ever-growing datasets in observational astronomy have challenged scientists in many aspects, including efficient and interactive data exploration and visualization. Many tools have been developed to confront this challenge. However, they usually focus either on displaying the actual images or on visualizing patterns within catalogs in a predefined way. In this paper we introduce Vizic, a Python visualization library that builds the connection between images and catalogs through an interactive map of the sky region.
We apply a quadratic estimator with KL-compression to calculate the angular power spectrum of a volume-limited Sloan Digital Sky Survey (SDSS) Data Release 7 (DR7) galaxy sample out to ℓ = 200. We also determine the angular power spectrum of subsamples selected by photometric redshift, z < 0.3 and 0.3 < z < 0.4, to examine the possible evolution of the angular power spectrum, as well as of early-type and late-type galaxy subsamples to examine the relative linear bias. In addition, we calculate the angular power spectrum of the SDSS DR7 main galaxy sample in a ∼53.7 square degree area out to ℓ = 1600 to determine the SDSS DR7 angular power spectrum to high multipoles. We perform a χ² fit to compare the resulting angular power spectra to theoretical nonlinear angular power spectra to extract cosmological parameters and the linear bias. We find best-fit cosmological parameters of Ω_m = 0.267 ± 0.038 and Ω_b = 0.045 ± 0.012, an overall linear bias of b = 1.075 ± 0.056, an early-type bias of b_e = 1.727 ± 0.065, and a late-type bias of b_l = 1.256 ± 0.051. Finally, we present evidence of a selective misclassification of late-type galaxies as stars by the SDSS photometric data reduction pipeline in areas of high stellar density (e.g., at low Galactic latitudes).
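As a rough illustration of the χ² fitting step, the sketch below fits only the linear bias, assuming the cosmology is held fixed, the band-power errors are independent (diagonal covariance), and the galaxy spectrum scales as b² times a theoretical spectrum. The function name `fit_linear_bias` and the toy numbers are hypothetical; this is not the pipeline used in the paper, which also varies cosmological parameters.

```python
import math

def fit_linear_bias(c_obs, c_theory, sigma):
    """Fit c_obs ~ b**2 * c_theory by minimizing chi^2 with diagonal errors.

    Assumption: independent band-power errors `sigma` and a fixed theory
    spectrum, so chi^2 is quadratic in the amplitude b**2 and has a
    closed-form minimum.
    """
    num = sum(o * t / s**2 for o, t, s in zip(c_obs, c_theory, sigma))
    den = sum(t * t / s**2 for t, s in zip(c_theory, sigma))
    amp = num / den                  # best-fit b^2
    if amp <= 0:
        raise ValueError("best-fit amplitude is not positive")
    amp_err = den ** -0.5            # 1-sigma error on b^2
    bias = math.sqrt(amp)
    bias_err = amp_err / (2 * bias)  # first-order error propagation to b
    chi2 = sum(((o - amp * t) / s) ** 2
               for o, t, s in zip(c_obs, c_theory, sigma))
    return bias, bias_err, chi2

# Toy data: a made-up theory spectrum scaled by a known bias of 1.2.
c_theory = [4.0, 2.0, 1.0]
c_obs = [1.2**2 * t for t in c_theory]
sigma = [0.1, 0.1, 0.1]
bias, bias_err, chi2 = fit_linear_bias(c_obs, c_theory, sigma)
```

With correlated band powers, the same fit would use the full inverse covariance matrix in place of the 1/σ² weights.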
Most existing star-galaxy classifiers use the reduced summary information from catalogs, requiring careful feature extraction and selection. The latest advances in machine learning that use deep convolutional neural networks allow a machine to automatically learn the features directly from data, minimizing the need for input from human experts. We present a star-galaxy classification framework that uses deep convolutional neural networks (ConvNets) directly on the reduced, calibrated pixel values. Using data from the Sloan Digital Sky Survey (SDSS) and the Canada-France-Hawaii Telescope Lensing Survey (CFHTLenS), we demonstrate that ConvNets are able to produce accurate and well-calibrated probabilistic classifications that are competitive with conventional machine learning techniques. Future advances in deep learning may bring more success with current and forthcoming photometric surveys, such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope (LSST), because deep neural networks require very little manual feature engineering.
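One common way to check whether probabilistic classifications are well calibrated is to compare predicted probabilities with observed class fractions. The sketch below is a generic illustration, not the evaluation code from the paper; the function names `reliability_table` and `brier_score` and the toy inputs are hypothetical.

```python
def reliability_table(probs, labels, n_bins=5):
    """Bin predicted P(galaxy) and compare with the observed galaxy fraction.

    Returns one (mean predicted probability, observed fraction, count)
    tuple per non-empty bin. For a well-calibrated classifier the first
    two entries are close in every bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        frac = sum(y for _, y in members) / len(members)
        table.append((mean_p, frac, len(members)))
    return table

def brier_score(probs, labels):
    """Mean squared difference between probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)

# Toy example: labels use 1 for galaxy and 0 for star.
probs = [0.1, 0.1, 0.9, 0.9]
labels = [0, 0, 1, 1]
table = reliability_table(probs, labels)
score = brier_score(probs, labels)
```

A lower Brier score is better, and the reliability table shows where in probability space any miscalibration occurs.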
There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template fitting method. Using data from the CFHTLenS survey, we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2, SDSS, VIPERS, and VVDS, and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.
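To make the idea of combining classifier outputs concrete, the sketch below merges per-classifier star probabilities with a simple product rule in log-odds space. This is a naive-Bayes style combination that assumes the classifiers are conditionally independent given the true class and share the same prior; it is a stand-in for illustration, not the Bayesian combination technique of the paper. The function name `combine_star_probabilities` is hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def combine_star_probabilities(probs, prior=0.5, eps=1e-6):
    """Combine P(star) values from several classifiers for one source.

    Assumption: each classifier's evidence is independent given the class,
    so the log-odds contributions relative to the shared prior add up.
    Probabilities are clipped to [eps, 1 - eps] to keep the logs finite.
    """
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    log_odds = logit(prior) + sum(logit(p) - logit(prior) for p in clipped)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Two classifiers that agree reinforce each other (odds 4 * 4 = 16),
# while two that disagree symmetrically cancel out.
agree = combine_star_probabilities([0.8, 0.8])
disagree = combine_star_probabilities([0.9, 0.1])
```

Because real classifiers trained on the same photometry are correlated, this rule tends to be overconfident, which is one motivation for more careful combination schemes.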
We analyze the clustering of photometrically selected galaxy pairs by using the halo-occupation distribution (HOD) model. We measure the two-point auto-correlation functions, ω(θ), for galaxies and galaxy pairs and develop an HOD to model their clustering. Our results are successfully fit by these HOD models, and we see the separation of “1-halo” and “2-halo” clustering terms for both single galaxies and galaxy pairs. Our clustering measurements and HOD model fits for the single galaxy samples are consistent with previous results. We find that the galaxy pairs generally have larger clustering amplitudes than single galaxies, and the quantities computed during the HOD fitting, e.g., effective mass, M_eff, and linear bias, b_g, are also larger for galaxy pairs. We find that the central fractions for galaxy pairs are significantly higher than for single galaxies, which confirms that galaxy pairs are formed at the center of more massive dark matter haloes. We also model the clustering dependence of the galaxy pair correlation function on redshift, galaxy type, and luminosity. We find early-early pairs cluster more strongly than late-late pairs, that bright galaxy pairs cluster more strongly than dim galaxy pairs, and that the clustering does not depend on the luminosity contrast between the two galaxies in the compact group.
The estimation and utilization of photometric redshift probability density functions (photo-z PDFs) has become increasingly important over the last few years. Primarily this is because of the prominent role photo-z PDFs play in enabling photometric survey data to be used to make cosmological constraints, especially when compared to single photo-z estimates. Currently there exist a wide variety of algorithms to compute photo-z's, each with their own strengths and weaknesses. In this paper, we present a novel and efficient Bayesian framework that combines the results from different photo-z techniques into a more powerful and robust estimate by maximizing the information from the photometric data. To demonstrate this we use a supervised machine learning technique based on prediction trees and a random forest, an unsupervised method based on self-organizing maps and a random atlas, and a standard template fitting method, but the framework can be easily extended to other existing techniques. We use data from the DEEP2 survey and more than 10^6 galaxies from the SDSS survey to explore different methods for combining the photo-z predictions from these three techniques. In addition, by using different performance metrics, we demonstrate that we can improve the accuracy of our final photo-z estimate over the best input technique, that the fraction of outliers is reduced, and that the identification of outliers is significantly improved when we apply a Naïve Bayes Classifier to this combined photo-z information. Furthermore, we introduce a new approach to explore how different techniques perform across the different areas within the information space supported by the photometric data. Our more robust and accurate photo-z PDFs will allow even more precise cosmological constraints to be made by using current and future photometric surveys. These improvements are crucial as we move to analyze photometric data that push to or even past the limits of the available training data, which will be the case with the Large Synoptic Survey Telescope.
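The simplest way to combine photo-z PDFs from several techniques is a weighted average on a common redshift grid, sketched below. This is only one of many possible combination schemes and is not claimed to be the method the paper adopts; the names `combine_pdfs` and `trapezoid`, the weights, and the toy PDFs are all hypothetical.

```python
def trapezoid(y, x):
    """Trapezoidal-rule integral of samples y over grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

def combine_pdfs(z_grid, pdfs, weights):
    """Weighted average of photo-z PDFs sampled on the same redshift grid.

    Assumption: every PDF in `pdfs` is evaluated at the points of `z_grid`,
    and `weights` reflect how much each technique is trusted (for example,
    from its performance on a validation set). The result is renormalized
    to integrate to one.
    """
    total = sum(weights)
    mixed = [sum(w * pdf[i] for w, pdf in zip(weights, pdfs)) / total
             for i in range(len(z_grid))]
    norm = trapezoid(mixed, z_grid)
    return [m / norm for m in mixed]

# Toy example: a peaked PDF and a flat PDF, equally weighted.
z_grid = [0.0, 0.5, 1.0]
pdf_peaked = [0.0, 2.0, 0.0]
pdf_flat = [1.0, 1.0, 1.0]
combined = combine_pdfs(z_grid, [pdf_peaked, pdf_flat], [1.0, 1.0])
```

The weights could also vary from source to source, which is one way to exploit the fact that different techniques perform better in different regions of the photometric space.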