Detection Rates of Geckos in Visual Surveys: Turning Confounding Variables into Useful Knowledge
Abstract
Transect surveys without some means of estimating detection probabilities generate population size indices prone to bias because survey conditions differ in time and space. Knowing what causes such bias can help guide the collection of relevant survey covariates, correct the survey data, anticipate situations where bias might be unacceptably large, and elucidate the ecology of target species. We used negative binomial regression to evaluate confounding variables for gecko (primarily Hemidactylus frenatus and Lepidodactylus lugubris) counts on 220-m-long transects surveyed at night, primarily for snakes, on 9,475 occasions. Searchers differed in gecko detection rates by up to a factor of six. The worst and best headlamps differed by a factor of at least two. Strong winds had a negative effect potentially as large as those of searchers or headlamps. More geckos were seen during wet weather conditions, but the effect size was small. Compared with a detection nadir during waxing gibbous (nearly full) moons above the horizon, we saw 28% more geckos during waning crescent moons below the horizon. A sine function suggested that we saw 24% more geckos at the end of the wet season than at the end of the dry season. Fluctuations on a longer timescale also were verified. Disturbingly, corrected data exhibited strong short-term fluctuations that covariates apparently failed to capture. Although some biases can be addressed with measured covariates, others will be difficult to eliminate as a significant source of error in long-term monitoring programs.

The 11-step model-building procedure used to evaluate 20 variables (defined in Table 1) in addition to an intercept, and the two models at which we arrived. Dots represent variables (listed across the top) included in the focal model (row); gray cells indicate the variable(s) evaluated in a particular step. For each step, various models are compared with each other (model evaluation for suitable phase shifts of cyclic variables not shown); the penalization factor K and ΔAIC values are listed to the right. Apart from the initial fitting of cyclic variables in steps 1 and 2, each step is founded on the best model from the previous step. Variables TIMEPERIOD and SEASON can correct for true population fluctuations over time and thereby help increase the explanatory value of a model; other variables are considered to be confounding effects for which a population index should be corrected. Some factorial states for variables SEARCHER, HEADLAMP, and TIMEPERIOD were collinear (see text and Appendix 2), yet those variables had strong effects. Other variables were not collinear, and considering the large sample size and overall strong effects, the final models should be insensitive to the order (steps 3–9) in which the variables were evaluated.
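
As a minimal sketch of how the ΔAIC column in such a comparison can be computed, the snippet below assumes the standard definition AIC = 2K − 2 ln(L), with K taken as the number of estimated parameters. The model names, log-likelihoods, and K values are hypothetical placeholders, not values from this study.

```python
def aic(log_likelihood, k):
    """Akaike information criterion, assuming AIC = 2K - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


def delta_aic(models):
    """Map each model name to its AIC difference from the lowest-AIC model.

    models: dict of name -> (log_likelihood, k).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical log-likelihoods and parameter counts, for illustration only.
candidates = {
    "intercept_only": (-13250.0, 2),
    "plus_searcher": (-12980.0, 14),
    "plus_searcher_headlamp": (-12975.0, 20),
}

deltas = delta_aic(candidates)
best_model = min(deltas, key=deltas.get)  # the model with delta AIC = 0
for name, d in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name}: delta AIC = {d:.1f}")
```

In this made-up example the extra headlamp parameters improve the likelihood too little to offset the penalty, so the simpler model is carried forward to the next step.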

Model selection to find a seasonal cyclicity in gecko detections (panel A; based on model 10.x) showed that a sinusoidal function with a 365-d wavelength fitted the data best (i.e., ΔAIC = 0; notice the logarithmic y-axis scale for ΔAIC of models with poorer fit) when the function's maximum was shifted to either late June, corresponding to the shift from the dry to the wet season, or late December, corresponding to the shift from the wet to the dry season. As expected, the coefficient for variable SEASON (panel B; estimate and associated 95% CI for all phase shifts) changed sign depending on the phase shift: there was a positive correlation between the function's positive (+1) maximum and the gecko detection rate from September through March (empty symbols in panel A), then a negative correlation between the positive maximum and the detection rate during the remainder of the year. Hence, the model suggests that gecko detections peaked in late December and were lowest in late June.
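
The equal support for two phase shifts half a year apart follows from the symmetry of a sinusoid, which the sketch below illustrates. It assumes the SEASON covariate can be written as a cosine that equals +1 on a chosen peak day; the function name and the day numbers used for late June and late December are assumptions for illustration, not the coding used in the study.

```python
import math

PERIOD_DAYS = 365.0


def season_covariate(day_of_year, peak_day):
    """Sinusoid with a 365-d period that equals +1 on peak_day."""
    return math.cos(2.0 * math.pi * (day_of_year - peak_day) / PERIOD_DAYS)


# Assumed day numbers: about 25 June, and half a period later (late December).
late_june = 176.0
late_december = late_june + PERIOD_DAYS / 2.0

for day in (1, 90, 176, 270):
    a = season_covariate(day, late_june)
    b = season_covariate(day, late_december)
    print(f"day {day:3d}: peak late June {a:+.3f}, peak late December {b:+.3f}")
```

Because the two covariates are exact mirror images, a model using one fits exactly as well as a model using the other, with the SEASON coefficient reversed in sign.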

(A) Model support as the MOONPHASE variable is phase shifted by 1-d intervals throughout an entire moon cycle (evaluated in step 11); the large circle indicates the 25-d shift rendering the most plausible model (i.e., ΔAIC = 0; notice the logarithmic y-axis scale for ΔAIC of models with poorer fit). (B) The coefficient for MOONPHASE (and its 95% CI) in each of the 30 models with different phase shifts. The most plausible model does not render the strongest effect, but it is associated with a narrower CI for the MOONPHASE effect than is the phase shift with the most extreme coefficient estimate. (C) The negative effect of a moon above the horizon (simultaneously estimated); it was least obvious when MOONPHASE was phase shifted only a small amount (i.e., when the variable code +1 roughly coincided with the full moon) or when phase shifted about 180° (so that +1 coincided with the new moon). Conversely, the negative effect of a moon above the horizon was strongest during waxing and waning moons. The large circle is the estimated effect in the model using the most plausible 25-d MOONPHASE shift.

Comparing relative effect sizes of the variables affecting gecko detection rate in the time-adjusted model. (For an illustration of the corresponding index model, see the supplementary figure online.) The left y-axis shows the linear estimates. For categorical variables, these are simply the negative binomial model coefficients ($\beta_i$) for the different states $i$ (filled symbols) relative to the variable's reference state (empty symbol); vertical bars indicate the 95% CI. The searcher and headlamp (lamp) states are listed in order of the amount of data they contributed (many transect searches to few). For continuous variables modeled linearly (time since sunset, moon phase, season), the linear estimate is the variable value times the coefficient ($x\beta$); for the variable wind (modeled as a third-degree polynomial), the linear estimate equals $x\beta_{\text{linear}} + x^{2}\beta_{\text{quadratic}} + x^{3}\beta_{\text{cubic}}$. Shown are the predicted effect functions (continuous lines) and their 95% CIs (dotted lines) for the relevant range of values each variable takes. The right y-axis equals $e^{\text{linear estimate}} \times 100$ and is logarithmic; it renders an index value of 100 for each variable's reference state or reference value (= 0) and makes it easier to appreciate the percentage change in gecko detections that will occur when changing from a reference state to another state (e.g., another searcher or headlamp) or when comparing different values of continuous variables. The model intercept (star symbol) has no reference group and should be interpreted only on the left y-axis. Exponentiated, it gives the number of geckos detected on a transect search where all variables are at their reference state or value (0) (time-adjusted model: predicted gecko count = $e^{-0.325} = 0.72$; index model: predicted gecko count = $e^{-0.201} = 0.82$). In reality, an average transect rendered 9,776/9,475 = 1.03 gecko detections. This discrepancy is explained by dummy coding, an unbalanced experimental design, and the fact that we did not standardize continuous variables; hence no variable had a mean of zero.
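
The conversion between the two y-axes can be checked with a few lines of arithmetic. The sketch below reproduces the intercept-based counts and the mean count quoted in the caption; the function names are illustrative, and the wind function takes its coefficients as arguments because their fitted values are not given here.

```python
import math


def index_value(linear_estimate):
    """Right-axis index: 100 at the reference state (linear estimate = 0)."""
    return 100.0 * math.exp(linear_estimate)


def wind_linear_estimate(x, b_linear, b_quadratic, b_cubic):
    """Third-degree polynomial on the log (linear-estimate) scale."""
    return x * b_linear + x**2 * b_quadratic + x**3 * b_cubic


# Intercepts quoted in the caption, exponentiated to predicted counts.
time_adjusted_count = math.exp(-0.325)
index_model_count = math.exp(-0.201)

# Observed mean detections per transect search, from the quoted totals.
observed_mean = 9776 / 9475

print(round(time_adjusted_count, 2))  # 0.72
print(round(index_model_count, 2))    # 0.82
print(round(observed_mean, 2))        # 1.03
```

On this scale, a 28% increase in detections corresponds to a linear estimate of ln(1.28), which is about 0.247, and hence an index value of about 128.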

Unexplained variation in gecko detections in three negative binomial models of different complexities. Smoothed nightly mean residual gecko detections are shown to avoid excessive clutter (because N = 9,475 person-transect searches) and to better illustrate temporal trends. The null model (containing only an intercept; upper panel) renders data similar to smoothed average gecko counts, but on a different scale. Adjusting for confounding variables (but not seasonal or more long-term temporal patterns) results in a somewhat dampened pattern (middle panel; y-axis scales of all panels are identical). Adding a seasonal effect and a more long-term temporal effect (five discrete time states) most markedly aligns the five time slots at the same level; the adjustment for season has a less striking effect. None of the three models explains more than 21% of the variance in the data. The x-axis of all panels has 1 January 2004 as day 1. The effect of correcting for confounding variables can have a more dramatic impact on single data points than appreciated from this illustration; averaging and smoothing the patterns tend to obscure outliers. (We usually smoothed the nightly mean residuals over 11 search occasions: five prior search nights, the focal search night, and the five search nights to follow; in case of a gap exceeding 25 nights between consecutive search nights, we truncated the window to smooth over as few as six search nights.)
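
One way to read the smoothing rule in the parenthetical note is sketched below: a centered moving average over up to five search nights on either side of the focal night, with the window prevented from extending across any gap of more than 25 nights. This is an interpretation under stated assumptions, not the authors' code; the function and parameter names are hypothetical, and the window is also shortened at the start and end of the series.

```python
def smooth_residuals(nights, residuals, half_window=5, max_gap=25):
    """Centered moving average over search nights that does not cross long gaps.

    nights: sorted day numbers of the search nights.
    residuals: nightly mean residuals, one per search night.
    """
    n = len(nights)
    smoothed = []
    for i in range(n):
        # Extend backwards by up to half_window nights, stopping at a long gap.
        lo = i
        while lo > 0 and i - lo < half_window and nights[lo] - nights[lo - 1] <= max_gap:
            lo -= 1
        # Extend forwards in the same way.
        hi = i
        while hi < n - 1 and hi - i < half_window and nights[hi + 1] - nights[hi] <= max_gap:
            hi += 1
        window = residuals[lo:hi + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Eleven consecutive search nights: the middle night uses the full window.
full = smooth_residuals(list(range(1, 12)), [float(v) for v in range(11)])

# A 37-night gap splits the series, so neither side borrows from the other.
split = smooth_residuals([1, 2, 3, 40, 41, 42], [1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
```

With an uninterrupted series the focal night is averaged with ten neighbors; a night directly next to a long gap is averaged with at most five neighbors on the other side, giving the six-night minimum the note describes.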