Chas, thanks for your very detailed reply concerning my question about
error terms for multiple population estimates. Al Zale of the Coop Unit
here mentioned that he thought it was "a very clear and elegant response"!
I am familiar with the two-stage sampling work by Hankin, but had not
thought of it in terms of my question, so that was helpful to have you
point that out. I also was not familiar with the terms "process" vs.
"enumeration" variance. In the recent Aquatic Habitat Msmts chapter I
wrote with Zale and Orth for the new Techniques book, I referred to these
variance terms from a habitat msmt perspective as "measurement" error and
"extrapolation" error for want of better terminology; again, to get
students/readers to think that both are important, and as you stated, the
latter is often more important than the former.
I didn't get much in the way of other responses to my inquiry. A
statistician from Australia sent me a variance formula which I am still
trying to decipher, so your answer by far was the most helpful.
P.S. I have finally rec'd all three reviewer comments on the low flow ms.
you reviewed recently. I will send you my summary and the editor's
(Hubert's) when completed. Thanks for your good comments; all in all,
reviewers found the results worth reporting, but substantial rethinking,
reanalysis, and redirection of the paper were needed.
At 10:10 AM 5/30/97 -0400, you wrote:
>Tom McMahon wrote:
>>How does one calculate an error term for multiple population estimates? As
>>an example, you want to test the hypothesis that fish abundance in two
>>different habitat types was statistically different. You picked five
>>samples of each habitat type and performed a depletion/removal estimate in
>>each habitat. For each habitat sampled, you obtain a point population
>>estimate with an associated variance. To compare population abundance
>>among the two habitat types, it is straightforward to take the mean of the
>>five population estimates, but my question is, how do you calculate the
>>error term used in the statistical comparison? Using the variance of the
>>five estimates would seemingly greatly underestimate the true variance around
>>each point estimate of abundance, but I'm unsure of the correct way to
>>essentially take the 'mean' variance of the five individual estimates.
>It is OK to use the variance of the five estimates as the error term. Call
>this total variance. This total variance is actually the sum of two variance
>components:
>Var(total) = Var(process) + Var(enumeration).
>The first component is due to process variance, which is true variation
>among the sampled habitat units (i.e., each habitat unit, in truth,
>contained a different number of animals). The second component of the
>variance is enumeration variation, which is variation arising because you
>did not count every animal in the sampled habitat units; instead, you made
>an estimate (i.e., you are not sure what the true population is in even one
>sampled habitat unit).
>Think of it this way. If the habitat units you sampled all, in truth, had
>exactly the same number of animals (process variation = 0), your five
>estimates would probably still be different because each is an estimate with
>associated enumeration variation. Thus, the total variance observed among
>the five units would be due all to enumeration variation. On the other
>hand, if the habitat units sampled did differ in the number of animals, but
>you were able to do a complete count in each unit (enumeration variation =
>0), then the total variance would all be due to process variation. In
>truth, the situation is somewhere in between: you have both process
>variation and enumeration variation included in the total variance you
>observe among the five estimates.
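The three cases just described can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true counts, the normally distributed estimation error, and the helper name simulate_estimates are all hypothetical and are not taken from the original exchange.

```python
import random
import statistics


def simulate_estimates(true_counts, enumeration_sd, rng):
    """Return one abundance estimate per habitat unit: truth plus estimation noise."""
    return [n + rng.gauss(0.0, enumeration_sd) for n in true_counts]


rng = random.Random(1997)  # fixed seed so the run is repeatable

# Case 1: every unit truly holds 50 fish (process variation = 0),
# but each unit is only estimated (hypothetical enumeration SD of 5 fish).
same_truth = [50.0] * 5
est_case1 = simulate_estimates(same_truth, 5.0, rng)

# Case 2: units truly differ, and each is counted completely
# (enumeration variation = 0), so the estimates equal the truth.
varied_truth = [42.0, 55.0, 38.0, 61.0, 49.0]
est_case2 = simulate_estimates(varied_truth, 0.0, rng)

# Case 3: the realistic situation, with both sources present.
est_case3 = simulate_estimates(varied_truth, 5.0, rng)

for label, est in [("process = 0", est_case1),
                   ("enumeration = 0", est_case2),
                   ("both present", est_case3)]:
    print(label, round(statistics.variance(est), 1))
```

In case 1 the spread among the five estimates is pure enumeration variation; in case 2 it is pure process variation; in case 3 the two are mixed together and cannot be told apart from the estimates alone.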
>A conservative statistical test (i.e., one that would tend not to reject the
>null hypothesis of no difference between habitat-types) would use the total
>variance as the error term. If you do this test and it rejects, you can
>make a good case that the two habitat types really do support different
>numbers of fish. Try a t-test.
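As a rough sketch of the suggested t-test, the following treats the five population estimates from each habitat type as ordinary observations, so that the variance among them (the total variance) is the error term. The abundance values and the names pools, riffles, and pooled_t are invented for illustration, and an equal-variance (pooled) two-sample test is assumed; 2.306 is the two-sided 5% critical value of t with 8 degrees of freedom.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with a pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # total variance among estimates, type A
    var_b = statistics.variance(sample_b)  # total variance among estimates, type B
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se_diff
    return t, df


# Hypothetical removal estimates of abundance, five units per habitat type.
pools = [42.0, 55.0, 38.0, 61.0, 49.0]
riffles = [18.0, 25.0, 12.0, 30.0, 20.0]

t_stat, df = pooled_t(pools, riffles)
T_CRIT_DF8 = 2.306  # two-sided, alpha = 0.05, 8 degrees of freedom
reject = abs(t_stat) > T_CRIT_DF8
print(f"t = {t_stat:.2f} on {df} df; reject null: {reject}")
```

With these made-up numbers the test rejects; with real data the outcome depends on how much the units differ within each habitat type.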
>If the test doesn't reject, it may be due to low power, and the low power
>may be due to high enumeration variation. In this case you would like to
>subtract out the enumeration variation because it is inflating your error
>term. You could do this because you have estimates of enumeration variation
>(the variances of each of the 5 population estimates; these variances are
>based on the statistical model [the removal model] that you used to generate
>the estimates). Details can be found in Skalski and Robson (1992.
>Techniques for wildlife investigations - design and analysis of capture
>data. Academic Press). However, in almost all cases, it turns out that
>most of the total variance is due to process variation, not enumeration
>variation, so you don't get much more power in your test anyway. If your
>simple t-test using the total variance doesn't reject, the problem is
>probably that your sample size of habitat units is too small, not that you
>did a poor job estimating abundance within each sampled unit.
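One simple way to picture "subtracting out" the enumeration variation is a method-of-moments split: take the variance among the five estimates as the total, take the average of the five model-based variances as the enumeration part, and treat the remainder as process variance. This is a sketch of that idea only, not necessarily the procedure given by Skalski and Robson; the numbers and the name split_variance are hypothetical.

```python
import statistics


def split_variance(estimates, enumeration_variances):
    """Return (total, enumeration, process) variance for one habitat type."""
    total = statistics.variance(estimates)                # variance among unit estimates
    enumeration = statistics.mean(enumeration_variances)  # average within-unit variance
    process = max(total - enumeration, 0.0)               # remainder, floored at zero
    return total, enumeration, process


# Hypothetical removal estimates and their model-based variances.
pool_estimates = [42.0, 55.0, 38.0, 61.0, 49.0]
pool_removal_variances = [9.0, 16.0, 6.0, 20.0, 12.0]

total, enumeration, process = split_variance(pool_estimates, pool_removal_variances)
print(f"total = {total:.1f}, enumeration = {enumeration:.1f}, process = {process:.1f}")
print(f"enumeration share of total = {enumeration / total:.0%}")
```

In this invented example the enumeration part is a small share of the total, which is the situation described above: removing it would change the error term, and hence the power of the test, only slightly.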
>In truth, what you have is a multi-stage sampling design. At the first
>stage, you *randomly* selected habitat units from some larger population of
>habitat units. At the second stage, you made an estimate of abundance in
>each sampled habitat unit. Two stages, two sources of variation. Hankin
>(1984. Multistage sampling designs in fisheries research: applications in
>small streams. CJFAS 41:1575-1591) does a nice job of explaining all this.
>These ideas are most important when designing a sampling program. The basic
>question is, given limited time and money, should you put your effort into
>sampling more habitat units with less efficiency in each (i.e., shrinking
>the process-variance contribution to your error term at the expense of
>increased enumeration variation), or should you put your effort into
>sampling fewer habitat units with greater efficiency in each (i.e., a larger
>process-variance contribution but reduced enumeration variation)? Almost
>always, the best choice is more units with
>less effort in each (see Hankin and Reeves 1988, CJFAS 45:834-844).
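The trade-off can be put in rough numbers. Assuming habitat units are drawn at random from a large population of units, the variance of the mean of n unit estimates is approximately (process variance + average enumeration variance) / n. The sketch below compares two hypothetical designs under that assumption; all values and the name variance_of_mean are made up for illustration, including the guess that halving effort per unit quadruples the enumeration variance.

```python
def variance_of_mean(process_var, enumeration_var, n_units):
    """Approximate variance of mean abundance across n randomly chosen units."""
    return (process_var + enumeration_var) / n_units


process_var = 74.9  # hypothetical true unit-to-unit variance

# Design A: 5 units, careful estimate in each (low enumeration variance).
few_careful = variance_of_mean(process_var, 12.6, 5)

# Design B: 10 units, quicker estimate in each (assumed 4x enumeration variance).
many_quick = variance_of_mean(process_var, 50.4, 10)

print(f"5 careful units: {few_careful:.2f}")
print(f"10 quick units:  {many_quick:.2f}")
```

Even with much noisier estimates in each unit, the design with more units gives the smaller variance here, because process variance dominates and only more units can reduce its contribution.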
>So, take your two sets of five estimates and do a t-test. If it rejects,
>rejoice. If not, you probably will need to sample more habitat units.
>Department of Biology and Environmental Studies Program
>Ashland, VA 23005