The emptiness inside: Finding gaps, valleys, and lacunae with geometric data analysis
Authors: Gabriella Contardo, David W. Hogg, Jason A. S. Hunt, Joshua E. G. Peek, Yen-Chi Chen
Abstract: Discoveries of gaps in data have been important in astrophysics. For example, there are kinematic gaps opened by resonances in dynamical systems, and exoplanets of a certain radius that are empirically rare. A gap in a data set is a kind of anomaly, but in an unusual sense: instead of being a single outlier data point situated far from other data points, it is a region of the space, or a set of points, that is anomalous compared to its surroundings. Gaps are interesting but hard to find and characterize, especially when they have non-trivial shapes. We present methods to address this problem. First, we present an approach to identify critical points and a criterion to select the most relevant ones; we then use these to trace the 'valleys' in the density field. Second, we build on the observed properties of critical points to propose a novel gappiness criterion that can be computed at any point in the data space. This allows us to identify a broader variety of gaps, either by highlighting regions of the data space that are 'gappy' or by selecting data points that lie in local underdensities. We also explore ways to make the detected gaps robust to changes in the density estimation and to noise in the data. We illustrate our methods on the velocity distribution of nearby stars in the Milky Way disk plane, which exhibits gaps that could originate from different processes. Identifying and characterizing those gaps could help determine their origins.
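
To make the idea of a score "computed at any point in the data space" concrete, here is a minimal sketch of one possible pointwise proxy. The abstract does not define the gappiness criterion, so everything below is an assumption made for illustration: the density is an isotropic Gaussian kernel density estimate with a hand-picked bandwidth `h`, and the proxy is the largest eigenvalue of the Hessian of that estimate, which is positive where the density curves upward in at least one direction (as it does across a valley). The names `kde_hessian` and `gappiness_proxy` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def kde_hessian(x, data, h):
    """Density and Hessian of an isotropic Gaussian KDE at one point x.

    Assumption: a single shared bandwidth h for all dimensions.
    """
    n, d = data.shape
    diff = x - data                                   # shape (n, d)
    norm = n * (2.0 * np.pi) ** (d / 2.0) * h ** d    # KDE normalisation
    w = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / h ** 2) / norm
    density = w.sum()
    # Hessian of sum_i w_i is sum_i w_i * (diff_i diff_i^T / h^4 - I / h^2).
    outer = np.einsum("i,ij,ik->jk", w, diff, diff) / h ** 4
    hessian = outer - density * np.eye(d) / h ** 2
    return density, hessian

def gappiness_proxy(x, data, h):
    """Largest Hessian eigenvalue of the KDE at x (illustrative proxy only)."""
    _, hessian = kde_hessian(x, data, h)
    # eigvalsh returns eigenvalues in ascending order for symmetric matrices.
    return np.linalg.eigvalsh(hessian)[-1]

# Toy data (hypothetical): two clumps separated by an empty strip around x = 0.
rng = np.random.default_rng(0)
left = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(500, 2))
right = rng.normal(loc=[2.0, 0.0], scale=0.5, size=(500, 2))
data = np.vstack([left, right])

in_gap = gappiness_proxy(np.array([0.0, 0.0]), data, h=0.5)
in_clump = gappiness_proxy(np.array([-2.0, 0.0]), data, h=0.5)
print("proxy in the gap:  ", in_gap)
print("proxy in the clump:", in_clump)
```

In this toy setup the proxy should be positive between the clumps and negative at a clump centre. The result depends on the bandwidth, which is one reason robustness to the choice of density estimate matters for any such score.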
