Lesson 2: Confidence Intervals and Sample Size
In this paper we derive methods for determining sample sizes for cross-sectional surveys to estimate incidence with sufficient precision. We further show how to specify sample sizes for two successive cross-sectional surveys to detect changes in incidence with adequate power.
In these surveys biomarkers such as CD4 cell count, viral load, and recently developed serological assays are used to determine which individuals are in an early disease stage of infection. The total number of individuals in this stage, divided by the number of people who are uninfected, is used to approximate the incidence rate.
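The estimator described above can be sketched in a few lines. The counts and the mean stage duration below are hypothetical, and this "snapshot" estimator (early-stage count divided by the number at risk, scaled by the mean time spent in the stage) is a simplified version of what such surveys use, not the paper's exact method.

```python
# Hypothetical counts from a single cross-sectional survey.
n_uninfected = 8000        # participants who tested HIV-negative
n_early_stage = 40         # participants in the biomarker-defined early stage
mean_duration_years = 0.5  # assumed mean time spent in the early stage

# Snapshot estimator: the early-stage count, scaled by the mean time an
# infected person spends in that stage, divided by the number at risk.
incidence_per_py = n_early_stage / (mean_duration_years * n_uninfected)
print(f"Estimated incidence: {incidence_per_py:.4f} per person-year")
```

Note that the assumed mean duration enters the denominator directly, which is why uncertainty in that duration propagates straight into the incidence estimate.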
Our methods account for uncertainty in the durations of time spent in the biomarker-defined early disease stage. We find that failing to account for this uncertainty when designing surveys can lead to imprecise estimates of incidence and underpowered studies.
We evaluated our sample size methods in simulations and found that they performed well in a variety of underlying epidemics.
Sample Size Methods for Estimating HIV Incidence from Cross-Sectional Surveys
Code for implementing our methods in R is available with this paper at the Biometrics website on Wiley Online Library.
Accurate estimation of the rate at which new infections occur is crucial for tracking and surveillance of the epidemic, yet considerable uncertainty surrounds existing estimates. Knowledge of this hazard rate, known as the incidence, is also critical for effectively designing, targeting, and evaluating prevention efforts. Cohort studies have traditionally been used to estimate hazard rates.
HIV incidence has been measured through longitudinal follow-up of cohorts in various subpopulations (Karon et al.). In these studies, ethical considerations may necessitate counseling against risky sexual behavior. This counseling can change the incidence, the very quantity the study is designed to estimate.
Additional complications arise from selection bias and loss to follow-up (Brookmeyer). These challenges, combined with cost, have led public health agencies to look for alternative methodologies for estimating HIV incidence. Cross-sectional surveys offer an increasingly promising alternative for estimating HIV incidence (Laeyendecker et al.).
These cross-sectional surveys use biomarkers, such as HIV viral load and CD4 cell count, to identify people in an early disease stage.

Consider a small trial comparing two drugs, with only four participants per treatment group, in which the observed difference is not statistically significant. We would say that we have failed to reject the null hypothesis, i.e., we cannot conclude that the responses to the two drugs differ. Does that mean that there is actually no difference in the response to the two drugs? This is a terrible drug trial precisely because it has so few participants; no real drug trial would have only four participants in a treatment group.
It is quite possible that the response to the two drugs is actually different (i.e., the alternative hypothesis is true). On the other hand, it is also possible that the new drug does nothing at all (the null hypothesis is true). We may simply have had bad luck in sampling, and our few participants were not very representative of the population in general. The only way we could have known which of these two possibilities was true would have been to sample more patients.
We will now examine how the situation could change under each of these two scenarios if we had more patients. Let us imagine what would have happened under the first scenario, where the alternative hypothesis is true, if we had measured a total of 16 patients per group instead of 4. Because we have so many more measurements, the histogram looks a lot more like a bell-shaped curve.
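The comparison described above can be sketched with a two-sample t-test. The mm Hg responses below are fabricated to mirror the first scenario (a real difference of about 5 mm Hg between drugs); they are not data from the lesson.

```python
from scipy import stats

# Hypothetical mm Hg responses under the first scenario: the new drug
# really does produce about a 5 mm Hg larger response than the old one.
old_drug = [10, 18, 15, 21, 14, 17, 16, 19, 13, 18, 15, 17, 12, 20, 16, 15]
new_drug = [16, 24, 19, 25, 20, 23, 18, 22, 21, 24, 19, 23, 20, 22, 17, 23]

# Compare the first four patients per group, then all sixteen.
for n in (4, 16):
    t, p = stats.ttest_ind(old_drug[:n], new_drug[:n])
    print(f"n = {n:2d} per group: t = {t:6.2f}, p = {p:.4f}")
```

With these values, the four-patient comparison is not significant at the 0.05 level, while the sixteen-patient comparison is: the same true difference, detected only once the sample is large enough.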
In this scenario, the first four measurements were actually pretty representative of their populations, so the sample means did not change that much with the addition of 12 more patients.
The increased sample size has allowed us to conclude that the difference between the sample means is significant.

Now consider the second scenario, in which the null hypothesis is true. Here we can see that we really did have bad luck with our first four measurements. By chance, one of the first four participants we sampled in the old-drug treatment had an uncharacteristically small response to the drug, and the other three had smaller responses than the average of all sixteen patients.
By chance, three of the four participants who received the new drug had higher than average responses to the drug. But now that we have sampled a greater number of participants, the difference between the two treatment groups has evaporated. The uncharacteristic measurements have been canceled out by other measurements that vary in the opposite direction.
Although the confidence interval is narrower with 16 participants than it was with four, that narrowing did not produce a significant difference because, at the same time, the estimates of the means for the two treatments improved as well.
Since under this scenario there was no difference between the two treatments, the better estimates of the sample means converged on the single population mean (about 16 mm Hg). In discussing these two scenarios, we supposed that we somehow knew which of the two hypotheses was true. In reality, we never know this.
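The narrowing of the confidence interval with sample size can be sketched directly. The responses below are hypothetical values scattered around the 16 mm Hg population mean of the second scenario.

```python
import numpy as np
from scipy import stats

# Hypothetical mm Hg responses from one treatment group under the second
# scenario, scattered around the single population mean of ~16 mm Hg.
responses = np.array([11, 19, 14, 22, 15, 18, 13, 17,
                      16, 20, 12, 18, 15, 17, 14, 15])

for n in (4, 16):
    sample = responses[:n]
    mean = sample.mean()
    sem = sample.std(ddof=1) / np.sqrt(n)      # standard error of the mean
    half = stats.t.ppf(0.975, df=n - 1) * sem  # 95% t-based half-width
    print(f"n = {n:2d}: mean = {mean:4.1f}, "
          f"95% CI = ({mean - half:.1f}, {mean + half:.1f})")
```

The interval shrinks both because the standard error falls with the square root of n and because the t critical value itself drops as the degrees of freedom increase.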
But if we have a sufficiently large sample size, we are in a better position to judge which hypothesis is best supported by the statistics.
You have also seen that to a large extent our ability to detect real differences depends on the number of trials or measurements we make. Of course, if the differences we observe are not real, then no increase in sample size will be big enough to make the difference significant.
Our ability to detect real differences is called statistical power. Obviously we would like to have the most statistical power possible. There are several ways we can obtain greater statistical power.
One way is to increase the size of the effect by increasing the size of the experimental factor. An example would be to try to produce a larger effect in a drug trial by increasing the dosage of the drug. Another way is to reduce the amount of uncontrolled variation in the results.
For example, standardizing your method of data collection, reducing the number of different observers conducting the experiment, using less variable experimental subjects, and controlling the environment of the experiment as much as possible are all ways to reduce uncontrolled variability.
A third way of increasing statistical power is to change the design of the experiment in a way that allows you to conduct a more powerful test. For example, having equal numbers of replicates in all of your treatments usually increases the power of the test.
Simplifying the design of the experiment may increase the power of the test. Using a more appropriate test can also increase statistical power.
Finally, increasing the sample size or number of replicates nearly always increases the statistical power.
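The effect of sample size on power can be sketched numerically. The function below computes the power of a two-sided, equal-n two-sample t-test from the noncentral t distribution, for an assumed standardized effect size (Cohen's d); the effect size of 0.5 is an illustrative choice, not a value from the lesson.

```python
import numpy as np
from scipy import stats

def t_test_power(effect_size, n_per_group, alpha=0.05):
    """Power of a two-sided, equal-n two-sample t-test."""
    df = 2 * n_per_group - 2
    nc = effect_size * np.sqrt(n_per_group / 2)  # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # Probability that |T| exceeds the critical value under the alternative.
    return stats.nct.sf(t_crit, df, nc) + stats.nct.cdf(-t_crit, df, nc)

# Power to detect a medium effect (d = 0.5) grows with sample size.
for n in (4, 16, 64):
    print(f"n = {n:2d} per group: power = {t_test_power(0.5, n):.2f}")
```

For d = 0.5, power climbs steeply with n and reaches the conventional 80% threshold at roughly 64 participants per group, which is why increasing the sample size is usually the first lever reached for.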
Obviously, the practical economics of time and money place a limit on the number of replicates you can have. Theoretically, the outcome of an experiment should be equally interesting whether or not it shows a factor to have a significant effect. As a practical matter, however, far more published experiments show significant differences than show factors to be non-significant. There is an important practical reason for this.
If an experiment shows differences that are significant, then we assume that is because the factor has a real effect. However, if an experiment fails to show significant differences, this could be because the factor doesn't really have any effect. But it could also be that the factor has an effect but the experiment just didn't have enough statistical power to detect it.
The latter possibility has less to do with the biology involved and more to do with the experimenter's possible failure at planning and experimental design - not something that a scientific journal is going to want to publish a paper about! Generally, in order to publish experiments that do not have significant differences it is necessary to conduct a power test.
A power test is used to show that a test would have been capable of detecting differences of a certain size if those differences had existed.
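Such a power test can be sketched with the same noncentral-t calculation that standard software uses; the numbers here are hypothetical. Suppose the completed experiment had 16 patients per group with a pooled standard deviation of 4 mm Hg, and we want to show it could have detected a 5 mm Hg difference.

```python
import numpy as np
from scipy import stats

# Hypothetical post-hoc power check: could a trial with 16 patients per
# group and pooled SD of 4 mm Hg have detected a true 5 mm Hg difference?
n, sd, diff, alpha = 16, 4.0, 5.0, 0.05
d = diff / sd                    # standardized effect size (Cohen's d)
df = 2 * n - 2
nc = d * np.sqrt(n / 2)          # noncentrality parameter
t_crit = stats.t.ppf(1 - alpha / 2, df)
power = stats.nct.sf(t_crit, df, nc) + stats.nct.cdf(-t_crit, df, nc)
print(f"Power to detect a {diff} mm Hg difference: {power:.2f}")
```

A power comfortably above 0.8 here supports reporting the non-significant result as meaningful evidence against a difference of that size, rather than as a mere failure of the experiment.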