How to Avoid a Common Mistake When Comparing Two Inventories

This article first appeared in the March issue of The Forestry Source.

Comparing two inventories is challenging. There are many factors that need to be considered when comparing two inventory estimates and the math can get pretty deep pretty fast. So it's not surprising that many foresters use shortcuts and rules of thumb to evaluate inventory results. In this article, we'll take a closer look at a common shortcut we've seen in the real world and see why it can be misleading.

Consider the common situation where two cruises, each with a 90% confidence level, come back with different estimates of the average basal area (BA) in a stand. Let's say that the seller's inventory estimates BA at 186 ± 16 ft2/acre, so the low end of the confidence interval is 170 and the high end is 202. Then a potential buyer conducts an independent inventory that estimates BA at 167 ± 15 ft2/acre. How should the seller feel about the buyer's inventory?

Here's a flawed rule of thumb that we've seen used. The buyer's average BA (167 ft2/acre) falls outside the 90% confidence interval of the seller's BA estimate (170 - 202 ft2/acre). Because the buyer's mean estimate is outside the confidence interval, the seller might claim that the buyer's inventory was "bad" and can't be trusted. However, this is not a good way to compare two inventories.
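The arithmetic behind this rule of thumb is short enough to sketch in a few lines of Python. The function and variable names below are ours, chosen for illustration; the numbers come straight from the example above. The overlap check is included only for contrast with the rule of thumb.

```python
def confidence_interval(mean, half_width):
    """Return (low, high) for an estimate reported as mean ± half_width."""
    return mean - half_width, mean + half_width


def mean_inside(mean, interval):
    """The flawed rule of thumb: is the other cruise's mean inside this interval?"""
    low, high = interval
    return low <= mean <= high


def intervals_overlap(a, b):
    """Do two (low, high) intervals share any values?"""
    return a[0] <= b[1] and b[0] <= a[1]


seller_ci = confidence_interval(186, 16)  # (170, 202) ft2/acre
buyer_ci = confidence_interval(167, 15)   # (152, 182) ft2/acre

print("Seller 90% CI:", seller_ci)
print("Buyer 90% CI:", buyer_ci)
print("Buyer mean inside seller CI?", mean_inside(167, seller_ci))   # False
print("Intervals overlap?", intervals_overlap(seller_ci, buyer_ci))  # True
```

The buyer's mean fails the rule of thumb, yet the two intervals share every value from 170 to 182 ft2/acre.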

Remember that any timber cruise is just one particular sample of the stand. A different cruise is going to have plots in different places and will pick up different trees, so there will be some difference. The real question is, how much difference should we expect?

To get a handle on this question, we can simulate cruising a stand many times. Without getting too into the weeds, we constructed a population of 100,000 potential sample plots with a mean BA of 175 and a standard deviation of 37.7 ft2/acre. Then we simulated 20 cruises by randomly picking 15 plots for each cruise. Each one of these cruises is an unbiased, representative sample of our stand. The code for this simulation is available if you'd like to get into the details.
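For readers who want to experiment, here is a minimal sketch of a simulation along these lines, using only Python's standard library. It is not the original simulation code. We assume plot BA is normally distributed (the description above gives only a mean and a standard deviation), the seed is arbitrary, and the 90% critical value for 14 degrees of freedom (1.761) is taken from a standard t table.

```python
import random
import statistics

random.seed(42)  # arbitrary seed so the run is repeatable

POP_MEAN = 175.0   # ft2/acre, from the article
POP_SD = 37.7      # ft2/acre, from the article
N_POP = 100_000    # potential sample plots
N_CRUISES = 20
N_PLOTS = 15
T_90_DF14 = 1.761  # two-sided 90% t critical value, 14 degrees of freedom

# Assumption: plot BA follows a normal distribution.
population = [random.gauss(POP_MEAN, POP_SD) for _ in range(N_POP)]


def cruise(plot_pool, n_plots):
    """Pick n_plots at random without replacement; return (mean, CI half-width)."""
    plots = random.sample(plot_pool, n_plots)
    mean = statistics.mean(plots)
    std_err = statistics.stdev(plots) / n_plots ** 0.5
    return mean, T_90_DF14 * std_err


cruises = [cruise(population, N_PLOTS) for _ in range(N_CRUISES)]

for i, (mean, half_width) in enumerate(cruises, start=1):
    print(f"Cruise {i:2d}: {mean:6.1f} ± {half_width:4.1f} ft2/acre")
```

With these inputs a typical half-width works out to roughly 1.761 × 37.7 / √15 ≈ 17 ft2/acre, which is in line with the ± 16 and ± 15 in the example above.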

We've graphed the 90% confidence interval for each simulated cruise below. The circle in the middle of each confidence interval is the cruise mean BA.

Figure 1: 20 simulated cruises of the same stand


There are a few things to notice about this figure. First, note that 18 of the 20 confidence intervals contain the population mean BA (the solid gray line at 175 ft2/acre). The other 2 intervals (10%) don't contain the population mean - this is exactly what we would expect from a 90% confidence interval.

Secondly, not all of the confidence intervals overlap with each other. Our original seller's inventory is highlighted blue at the top of this figure (simulation 1). The light gray shading covers the 90% confidence interval of simulation 1, making it easy to see which cruises have means that fall outside it (shown in red). Note that 6 of the 20 simulated cruises (including #2, our buyer's inventory) don't have means that fall within the seller's confidence interval. Remember, each one of these simulated cruises is an unbiased, representative sample of our stand. Even though these are all valid cruises, 30% of them don't pass our "rule of thumb" test. This is clearly not a robust test.
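Twenty cruises is a small sample, so it can help to repeat the experiment many more times and look at long-run rates. The sketch below is our own illustration, not the original simulation. It assumes normally distributed plot BA, draws plots directly from that distribution rather than from a finite population, and uses an arbitrary seed and an arbitrary 5,000 pairs of cruises. For each pair it records whether the first cruise's interval contains the true mean, and whether the second cruise's mean falls outside the first cruise's interval (a "failure" under the rule of thumb).

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

POP_MEAN, POP_SD = 175.0, 37.7  # ft2/acre, from the article
N_PLOTS = 15
T_90_DF14 = 1.761  # two-sided 90% t critical value, 14 degrees of freedom
N_PAIRS = 5_000    # arbitrary number of seller/buyer cruise pairs


def cruise():
    """Simulate one cruise; return (mean, 90% CI half-width)."""
    # Assumption: plot BA is normally distributed.
    plots = [random.gauss(POP_MEAN, POP_SD) for _ in range(N_PLOTS)]
    mean = statistics.mean(plots)
    half_width = T_90_DF14 * statistics.stdev(plots) / N_PLOTS ** 0.5
    return mean, half_width


covers_truth = 0
fails_rule = 0
for _ in range(N_PAIRS):
    seller_mean, seller_hw = cruise()
    buyer_mean = cruise()[0]
    if abs(seller_mean - POP_MEAN) <= seller_hw:
        covers_truth += 1  # seller's interval contains the true mean
    if abs(buyer_mean - seller_mean) > seller_hw:
        fails_rule += 1    # buyer's mean is outside the seller's interval

coverage_rate = covers_truth / N_PAIRS
rule_failure_rate = fails_rule / N_PAIRS
print(f"Intervals containing the true mean: {coverage_rate:.1%}")
print(f"Valid cruises failing the rule of thumb: {rule_failure_rate:.1%}")
```

Under these assumptions the coverage rate should land near 90%, while the rule of thumb should reject roughly one valid cruise in four or five. The reason is that the buyer's mean carries its own sampling error, so the gap between two cruise means varies more than the gap between one cruise mean and the true mean.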

What did our rule of thumb get wrong? One key mistake was that it only considered the mean of the buyer's inventory, rather than the full confidence interval. Recall that we set a 90% confidence level for our confidence intervals. This means that if we cruised this stand 20 times, we would expect 18 (90%) of the cruise confidence intervals to include the true population mean. The confidence interval conveys important information about how "good" an inventory is - in general, better inventories will have tighter confidence intervals and worse inventories will have wider ones. Confidence intervals also indicate the underlying variation in the sample. If a stand has patches of especially high or low stocking that are measured, the confidence interval will be wider. It’s inadvisable to discard this critical information when comparing two estimates.

A simple comparison of confidence intervals still may not be adequate to determine if an inventory is problematic. Note that simulation 15's confidence interval doesn't intersect with simulation 1's, even though they're both totally valid cruises of the same stand. This highlights an important fact - at any confidence level less than 100%, there is a possibility that the confidence intervals won't overlap. It may be unlikely, but it's going to happen sometimes. A single stand may pass or fail this comparison without telling you much about the quality of the inventories you're comparing.

A more robust statistical comparison could include t-tests (which ask the question “Are these two samples statistically different?”), or a technique called equivalence testing, which asks the opposite question - “Are these two samples statistically similar?”. Comparing cruises in multiple stands to get a more complete picture of a stratum or property is also valuable, because it reduces the impact of variability at the stand level.
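As a rough illustration of the t-test idea, the sketch below runs a Welch-style two-sample t-test on the seller's and buyer's numbers using only their reported means and 90% half-widths. Two things here are our assumptions: that both cruises used 15 plots (as the simulated cruises did), and that hard-coded critical values from a standard t table are precise enough. It is a sketch of the technique, not a recommended procedure for any particular sale.

```python
import math

N_PLOTS = 15       # assumption: both cruises used 15 plots, as in the simulation
T_90_DF14 = 1.761  # two-sided 90% t critical value, 14 degrees of freedom
T_90_DF27 = 1.703  # two-sided 90% t critical value, 27 degrees of freedom

seller_mean, seller_half_width = 186.0, 16.0  # ft2/acre
buyer_mean, buyer_half_width = 167.0, 15.0    # ft2/acre

# Recover each cruise's standard error from its reported 90% half-width.
seller_se = seller_half_width / T_90_DF14
buyer_se = buyer_half_width / T_90_DF14

# Standard error and t statistic for the difference between the two means.
diff = seller_mean - buyer_mean
diff_se = math.sqrt(seller_se ** 2 + buyer_se ** 2)
t_stat = diff / diff_se

# Welch-Satterthwaite degrees of freedom (just under 28 here; we round
# down to 27 when choosing the critical value, which is slightly conservative).
df = diff_se ** 4 / (
    seller_se ** 4 / (N_PLOTS - 1) + buyer_se ** 4 / (N_PLOTS - 1)
)

diff_half_width = T_90_DF27 * diff_se
significant = abs(t_stat) > T_90_DF27

print(f"Difference: {diff:.1f} ± {diff_half_width:.1f} ft2/acre")
print(f"t = {t_stat:.2f}, df = {df:.1f}")
print("Statistically different at the 90% level?", significant)
```

Under these assumptions the 90% confidence interval for the difference includes zero, so this test does not flag the two inventories as statistically different, even though the buyer's mean sits outside the seller's interval. If SciPy is available, scipy.stats.t.ppf can supply the critical values instead of a table.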

This simulation also illustrates the importance of understanding confidence intervals, as well as the limits of samples: the estimates they produce can never provide absolute certainty.

Furthermore, our comparison of simulated cruises is intentionally simplified. In the real world, there are usually other significant differences between the inventories, such as the sample design, the cruise protocol, and the collection date, that further complicate a fair comparison. We'll explore these issues in more depth in the future.

At one time or another, nearly all foresters find themselves having to decide between two conflicting inventories, whether on a timber sale or when evaluating new inventory methods. We hope this article helps you avoid a common pitfall when comparing inventories - be careful out there!