In my last post, I presented a high-level analysis looking at some of the factors leading to high observation counts in the AAVSO database. That analysis made use of information from the General Catalog of Variable Stars (GCVS) as well as spreadsheets provided by the AAVSO containing observation counts and estimates of citations for all long-period variables (LPVs) in the AAVSO database. We found, for example, that observation counts are highly skewed towards a relatively few LPVs, that these popular stars occur almost exclusively above zero degrees declination, and that the magnitude of the star at maximum (brightest) strongly influences the popularity (observation counts) – LPVs whose maximum falls below 10m (an approximate binocular limit – at least here in the light-polluted skies of Boston suburbia) rarely have large numbers of observations.
There are, of course, many other factors that observers take into account when selecting stars of interest. Variables that are close together in the sky seem more likely to be observed because it is easier to knock them both off at once. Stars with more interesting light curves that offer “surprises” are, I suspect, going to garner greater attention, all else being equal. A star may be difficult to observe because charts are lacking or because there are no convenient comparison stars. And we should not forget the important influence of the AAVSO in making special requests of the user community via special bulletins, or regular articles such as “Variable of the Season.” Indeed, one of the most important functions of the AAVSO, in my view, is to help align amateurs with the scientific goals of the professional astronomical community.
With the recent announcement that the AAVSO was forming a special section dedicated to long-period variables, an important task before the section is to choose LPVs of scientific and community interest. Citation counts estimated by the AAVSO provide some measure of the overall scientific importance of an individual star. We use these citation counts together with the observation counts to propose a methodology for selecting candidate LPVs for the new section.
The most obvious thing to do is to consider references per 1000 observations (RefPerKObs). For example, Y Cas with 6,604 observations has accumulated 100 citations, giving it a RefPerKObs of 15.1. By comparison, RV Her (Obs=10219, Refs=34) has a RefPerKObs of only 3.3. The problem with this statistic is that it is undefined for the nearly three-quarters of the LPV stars in the AAVSO database that have 0 observations, and it tends to be highly skewed when the observation counts are very low. (The difference between having 1 observation and 2 observations is a factor of two difference in RefPerKObs!)
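As a quick check on the arithmetic above, here is a minimal Python sketch of RefPerKObs. The function name ref_per_k_obs is my own label for illustration, and returning None for a star with no observations is just one way to represent "undefined".

```python
from typing import Optional


def ref_per_k_obs(refs: int, obs: int) -> Optional[float]:
    """Citations per 1000 observations; None when there are no observations."""
    if obs == 0:
        return None  # undefined: we would be dividing by zero
    return refs * 1000.0 / obs


# The two stars quoted in the text.
y_cas = ref_per_k_obs(refs=100, obs=6604)
rv_her = ref_per_k_obs(refs=34, obs=10219)
print(round(y_cas, 1), round(rv_her, 1))  # 15.1 3.3

# With very few observations the statistic swings wildly:
# one extra observation halves the score.
print(ref_per_k_obs(1, 1), ref_per_k_obs(1, 2))  # 1000.0 500.0
```

The second print shows the skew problem directly: a single additional observation changes the score by a factor of two.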
So instead we consider computing two percentiles for each star – one based on its ranking (1, 2, 3…) with respect to observations, the other with respect to number of citations. Then we define a score:
(1) RankPercentileDiff = CitationRankPercentile – ObservationRankPercentile
The resulting score ranges from -1 to +1, and a high (> 0.5) RankPercentileDiff suggests an under-observed star with relatively high scientific interest, worthy perhaps of being added to the LPV program. It turns out this score is actually quite bad because of what I noted in the last post: observation counts are highly skewed towards a relatively few popular stars. As a result the M-type star, LX Cyg, ranks in the 50th percentile in observations with a paltry 1700+ observations. But over 92% of the 6.7 million LPV observations in the AAVSO database are associated with higher-ranking stars. The RankPercentileDiff defined above tends to underestimate the importance of a star. LX Cyg achieves a maximum magnitude of only 11.5, so as I noted above, its low observation counts are to be expected. We address these observability issues below. Nor have we taken into account how its 23 citations stack up.
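To make score (1) concrete, here is a small Python sketch. I am assuming that a star's rank percentile is the fraction of stars with strictly fewer counts; other tie and endpoint conventions are possible. The function names and the four-star sample are made up for illustration and are not AAVSO data.

```python
from bisect import bisect_left


def rank_percentiles(counts):
    """Fraction (0..1) of stars with strictly fewer counts than each star."""
    ordered = sorted(counts)
    n = len(ordered)
    # bisect_left gives the number of entries strictly below c.
    return [bisect_left(ordered, c) / n for c in counts]


def rank_percentile_diff(obs, refs):
    """Score (1): citation rank percentile minus observation rank percentile."""
    obs_pct = rank_percentiles(obs)
    ref_pct = rank_percentiles(refs)
    return [r - o for r, o in zip(ref_pct, obs_pct)]


# Invented counts for four hypothetical stars.
obs = [0, 10, 200, 50000]
refs = [0, 40, 5, 60]
print(rank_percentiles(obs))            # [0.0, 0.25, 0.5, 0.75]
print(rank_percentile_diff(obs, refs))  # [0.0, 0.25, -0.25, 0.0]
```

Notice that the star with 200 observations and the star with 50,000 sit only one rank step apart. The ranking ignores how lopsided the actual counts are, which is the weakness described above.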
To address the above concern, we modify the Observation and Citation percentiles based on actual counts rather than a simple ranking. Thus for each LPV:
AAVSO_Obs_Percentile = percent of observations occurring in stars with fewer observations.
NumRefs_Percentile = percent of citations associated with stars having fewer citations.
Note that stars having the same number of observations or references have the same observational or reference percentiles, respectively.
Finally:
(2) PercentileDiff = AAVSO_Obs_Percentile – NumRefs_Percentile
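The count-based percentiles can be sketched in Python as follows. This is my own illustration under stated assumptions: the function names and the five-star sample are invented, values are returned as fractions (multiply by 100 for percent), and the difference is taken in the order equation (2) is written. With that order, an under-observed but well-cited star comes out with a negative value; negating the result gives the orientation of score (1).

```python
from collections import Counter


def count_weighted_percentiles(counts):
    """For each star, the share (0..1) of the grand total that belongs to
    stars with strictly fewer counts. Stars with equal counts share a value."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    tally = Counter(counts)      # how many stars have each distinct count
    share_below = {}
    running = 0                  # total counts held by stars seen so far
    for value in sorted(tally):
        share_below[value] = running / total
        running += value * tally[value]
    return [share_below[c] for c in counts]


def percentile_diff(obs, refs):
    """Equation (2) as written: observation percentile minus citation percentile."""
    obs_pct = count_weighted_percentiles(obs)
    ref_pct = count_weighted_percentiles(refs)
    return [o - r for o, r in zip(obs_pct, ref_pct)]


# Invented counts for five hypothetical stars.
obs = [0, 10, 10, 80, 900]
refs = [0, 5, 1, 4, 10]
print(count_weighted_percentiles(obs))  # [0.0, 0.0, 0.0, 0.02, 0.1]
print([round(d, 2) for d in percentile_diff(obs, refs)])
```

In the sample, the star with 80 observations outranks three of the five stars, yet only 2% of all observations belong to stars below it. That is the LX Cyg situation in miniature.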
Figure 1 below plots the observational percentile vs. the reference percentile, with redder stars having a higher percentile difference. (Click on each image to see it full size.)
As noted in my last post, sometimes stars like CW Leo, a proto-planetary nebula surrounded by water-containing comets, generate high scientific interest outside of their variability. But by way of example, compare RS Virginis (marked, upper left) to AF Cyg (right). RS Vir has garnered far fewer observations (6469 vs. 57799) but has far more citations (164 vs. 68); it is perhaps a victim of the declination effect, where more southerly stars tend to be neglected. Am I suggesting that AF Cyg should be excluded from the LPV section? Absolutely not! But if I’m trying to get members to focus their attention, I’m focussing them more towards RS Vir – a star that seems to have high scientific interest, but has been relatively neglected by observers.
To account for observability, we filter out stars having a declination south of -20 degrees or a maximum magnitude below (fainter than) 10.0m. We further exclude LPVs having amplitudes less than 1.0 magnitude. We also limit ourselves to visual (V) and photographic (p) bands. One might choose different cutoffs. The point is simply to demonstrate that reasonable cutoffs still lead to a goodly number of focus candidates. The above constraints still leave about 600 candidates to choose from (about 5%).
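The cutoffs above amount to a simple filter, sketched here in Python. The Star record, its field names, and the sample entries are hypothetical placeholders rather than the layout of the GCVS or the AAVSO spreadsheets. I am also assuming that amplitude means minimum magnitude minus maximum magnitude, and that stars sitting exactly on a cutoff are kept.

```python
from dataclasses import dataclass


@dataclass
class Star:
    name: str
    dec_deg: float   # declination in degrees
    max_mag: float   # magnitude at maximum (brightest)
    min_mag: float   # magnitude at minimum (faintest)
    band: str        # photometric band of the catalog magnitudes


def is_observable(star, dec_limit=-20.0, max_mag_limit=10.0,
                  min_amplitude=1.0, bands=("V", "p")):
    """True when the star passes all four observability cutoffs."""
    amplitude = star.min_mag - star.max_mag  # larger magnitude = fainter
    return (star.dec_deg >= dec_limit            # not south of -20 degrees
            and star.max_mag <= max_mag_limit    # reaches 10.0m or brighter
            and amplitude >= min_amplitude       # varies by at least 1.0 mag
            and star.band in bands)              # visual or photographic


# Invented entries, one failing each cutoff in turn.
catalog = [
    Star("Star A", 35.0, 7.5, 13.0, "V"),   # passes everything
    Star("Star B", -45.0, 6.0, 12.0, "V"),  # too far south
    Star("Star C", 10.0, 11.5, 15.0, "V"),  # too faint at maximum
    Star("Star D", 20.0, 8.0, 8.6, "V"),    # amplitude too small
    Star("Star E", 5.0, 8.0, 12.0, "I"),    # band not V or p
]
candidates = [s.name for s in catalog if is_observable(s)]
print(candidates)  # ['Star A']
```

Keeping the cutoffs as keyword arguments makes it easy to try different limits, in keeping with the point that these particular values are only one reasonable choice.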
Figure 2 is a revised scatter plot with the above constraints in place. These are stars that are more accessible to the observing community.
Figure 3 below shows the distribution of PercentileDiff for the 600 or so remaining candidates. I would argue that the 154 stars in the marked region should all be included in the LPV section.
The table below lists these 154 stars and their basic properties. I emphasize that this list isn’t intended to be exclusive. Furthermore, the final selection process should ensure that different LPV types (irregulars, semi-regulars, Miras) are well represented. Finally, it should be noted that the appearance of U Herculis at the top of the list seems to be the result of an observation count error in the AAVSO spreadsheets, and that there may be a systemic error in the counts involving variable designations with Greek letters, or names that can be confused with Greek letters (U = u = mu?). This table isn’t intended to be the final word, but rather a demonstration of a methodology for narrowing down candidates.
rachlin_154_lpv_candidates.pdf
Addendum: Another thought occurred to me this afternoon. You might ask – why consider low observation counts as a factor? Why not just look at the stars with the highest scientific interest? It would certainly be reasonable to include such stars if they are overlooked in the table above. What I’m focussing on here is finding stars where increases in the observation counts could potentially have the greatest scientific impact. Adding five or ten thousand observations to stars with a few hundred in the books might provide novel scientific insights – a new dimension of understanding to a star whose scientific importance has already been established. Adding more observations to a star already teeming with tens of thousands of observations may have less potential for impact, in my opinion. For this reason, I consider a low current observation count to be an important selection criterion.
AAVSO/RJB


