Module IX·Article II·~6 min read
Confidence Intervals and t-Tests
Quantitative Data Analysis
Confidence Intervals: Concept and Interpretation
A confidence interval (CI) is a range of values, computed from sample data, that is expected to contain the true value of a population parameter at a stated confidence level. It reflects the uncertainty that comes from using a sample instead of the entire population.
Key Components of a Confidence Interval
- Point Estimate — the value of a statistic computed from the sample (for example, the arithmetic mean X̄)
- Confidence Level — the probability that the interval contains the true parameter (usually 95% or 99%)
- Margin of Error — the amount added to and subtracted from the point estimate
Formula for the CI of the mean: CI = X̄ ± t(α/2) × (s / √n), where X̄ is the sample mean, t(α/2) is the critical value of the t-distribution, s is the sample standard deviation, n is the sample size.
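The formula above can be sketched in Python with scipy; the sample values below are hypothetical:

```python
import math
from scipy import stats

def mean_ci(data, confidence=0.95):
    """CI = X̄ ± t(α/2) × (s / √n), using the t-distribution."""
    n = len(data)
    xbar = sum(data) / n                                          # point estimate X̄
    s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))   # sample SD s
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)      # t(α/2)
    margin = t_crit * s / math.sqrt(n)                            # margin of error
    return xbar - margin, xbar + margin

scores = [68, 74, 71, 77, 70, 73, 69, 75]  # hypothetical sample
low, high = mean_ci(scores)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```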
Proper Interpretation
Correct: "We are 95% confident that the population mean lies in the interval from 4.2 to 5.8."
Incorrect: "There is a 95% probability that the mean is in this interval." The true mean is a fixed value, not a random variable; the 95% describes the procedure: across many repeated samples, about 95% of intervals constructed this way would contain the true mean.
Link Between Confidence Intervals and Hypothesis Testing
Confidence intervals and hypothesis testing are closely related. If the 95% confidence interval for the difference of means does not include zero, this is equivalent to rejecting the null hypothesis (H₀: μ₁ = μ₂) at the significance level α = 0.05. Confidence intervals provide more information than a simple p-value, as they show the range of plausible parameter values and the direction of the effect.
One-Sample t-Test
A one-sample t-test checks whether the mean value of a sample differs from a known (hypothetical) population value.
When to use: when you need to compare the mean of one group to a specified benchmark value (for example, students’ average score to a standard value of 70).
Hypotheses: H₀: μ = μ₀ (the mean equals the benchmark value); H₁: μ ≠ μ₀ (the mean differs from the benchmark).
Steps in SPSS: Analyze → Compare Means → One-Sample T Test → move the variable to the Test Variable(s) list → specify Test Value → click OK.
SPSS output interpretation: the table includes the t-statistic, degrees of freedom (df = n − 1), and the value Sig. (2-tailed) — this is the p-value. If p < 0.05, the null hypothesis is rejected. The table also shows the mean difference and the 95% confidence interval for this difference.
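The same test can be reproduced outside SPSS, for example with scipy's `ttest_1samp`; the exam scores below are hypothetical:

```python
from scipy import stats

scores = [75, 68, 82, 71, 77, 65, 80, 73, 69, 78]  # hypothetical exam scores
t_stat, p_value = stats.ttest_1samp(scores, popmean=70)  # H0: mu = 70
df = len(scores) - 1                                     # df = n - 1
print(f"t({df}) = {t_stat:.3f}, p = {p_value:.3f}")      # reject H0 if p < 0.05
```

The value `p_value` here corresponds to Sig. (2-tailed) in the SPSS output table.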
Independent Samples t-Test
This test compares the means of two independent (unrelated) groups. For example, comparing exam results between males and females, or between experimental and control groups.
Hypotheses: H₀: μ₁ = μ₂ (means of the two groups are equal); H₁: μ₁ ≠ μ₂ (means differ).
Levene's Test
Before interpreting the t-test, SPSS automatically performs Levene's test for equality of variances. If the Sig. value of Levene's test > 0.05, use the row "Equal variances assumed." If Sig. < 0.05, variances are not equal — use the row "Equal variances not assumed" (Welch's correction is applied).
Steps in SPSS: Analyze → Compare Means → Independent-Samples T Test → move the dependent variable to Test Variable(s) → move the grouping variable to Grouping Variable → click Define Groups and specify group codes → OK.
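The same workflow, including the Levene check, can be sketched in Python (the group data are hypothetical):

```python
from scipy import stats

# hypothetical scores for two independent groups
experimental = [78, 85, 80, 74, 88, 76, 82, 79]
control      = [70, 72, 68, 75, 71, 66, 74, 69]

# Levene's test decides which row of the SPSS table to mimic
_, p_levene = stats.levene(experimental, control)
equal_var = p_levene > 0.05  # True: "Equal variances assumed"; False: Welch
t_stat, p_value = stats.ttest_ind(experimental, control, equal_var=equal_var)
print(f"Levene p = {p_levene:.3f}, t = {t_stat:.3f}, p = {p_value:.4f}")
```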
Paired Samples t-Test
The paired t-test is used when measurements are taken on the same subjects in two conditions or at two points in time ("before and after" design, pre-test / post-test).
Example: evaluating the effectiveness of a training by comparing test results before and after the course among the same participants.
Hypotheses: H₀: μ_d = 0 (the mean difference equals zero); H₁: μ_d ≠ 0 (the mean difference differs from zero).
Steps in SPSS: Analyze → Compare Means → Paired-Samples T Test → move the pair of variables (before/after) to the Paired Variables list → OK.
Interpretation: SPSS outputs the mean difference, the standard deviation of differences, the t-statistic, and the p-value. If p < 0.05, there is a statistically significant difference between the measurements.
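A paired test can be sketched with scipy's `ttest_rel`; the before/after scores below are made up for illustration:

```python
from scipy import stats

before = [45, 52, 48, 50, 44, 47, 53, 49]  # hypothetical pre-test scores
after  = [40, 47, 45, 44, 41, 42, 48, 46]  # same participants after the course
t_stat, p_value = stats.ttest_rel(before, after)
mean_diff = sum(b - a for b, a in zip(before, after)) / len(before)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.3f}, p = {p_value:.4f}")
```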
Effect Size: Cohen's d
Statistical significance (the p-value) indicates only that an effect is present, not its practical importance. Cohen's d is used to assess the magnitude of the effect.
Formula: d = (M₁ − M₂) / SD_pooled, where SD_pooled is the pooled standard deviation.
Interpretation according to Cohen:
| Value of d | Interpretation |
|---|---|
| 0.2 | Small effect |
| 0.5 | Medium effect |
| 0.8 | Large effect |
SPSS does not compute Cohen's d automatically — it can be calculated manually or using online calculators from the mean and standard deviation values in the SPSS output.
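For example, the calculation can be done manually in Python. The function below uses the df-weighted pooled SD, which reduces to √((SD₁² + SD₂²)/2) when the two groups are the same size:

```python
import math

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d = (M1 - M2) / SD_pooled, with a df-weighted pooled SD."""
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return (m1 - m2) / sd_pooled

# values from Assignment 2 below: two groups of n = 25
d = cohens_d(78, 10, 25, 72, 12, 25)
print(f"d = {d:.2f}")  # ≈ 0.54, a medium effect
```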
Checking Assumptions
Normality of Distribution
t-tests assume the data are approximately normally distributed. To check this in SPSS, use the Shapiro-Wilk test: Analyze → Descriptive Statistics → Explore → Plots → check Normality plots with tests. If p > 0.05, the distribution does not differ significantly from normal. For n > 30, the t-test is robust to moderate violations of normality (central limit theorem).
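The same check can be sketched with scipy's `shapiro` (the data are hypothetical):

```python
from scipy import stats

data = [4.2, 5.1, 4.8, 5.5, 4.9, 5.2, 4.6, 5.0, 4.7, 5.3]  # hypothetical sample
w_stat, p_value = stats.shapiro(data)
normal_enough = p_value > 0.05  # no significant departure from normality
print(f"W = {w_stat:.3f}, p = {p_value:.3f}, normal: {normal_enough}")
```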
Homogeneity of Variances
For the independent t-test, equal variances in the two groups are assumed. This is checked by Levene's test (automatically output in SPSS). When variances are unequal, Welch's correction is used.
Introduction to Analysis of Variance (ANOVA)
When comparing the means of more than two groups, the t-test is not suitable (multiple comparisons increase the probability of a Type I error). In this case, one-way analysis of variance (One-Way ANOVA) is used.
Logic of ANOVA: variance between groups (between-group variance) is compared with variance within groups (within-group variance). If the between-group variance significantly exceeds the within-group variance, a conclusion about significant differences is drawn.
Steps in SPSS: Analyze → Compare Means → One-Way ANOVA → specify the dependent and factor variables → Post Hoc → choose tests (Tukey, Bonferroni) → OK.
Post-hoc tests (Tukey HSD, Bonferroni) are conducted after a significant ANOVA result to determine between which specific groups differences exist, with correction for multiple comparisons.
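The omnibus ANOVA itself can be sketched with scipy's `f_oneway` (three hypothetical groups; note that scipy's function does not run post-hoc tests):

```python
from scipy import stats

# hypothetical scores for three groups taught by different methods
group_a = [72, 78, 75, 70, 74]
group_b = [80, 85, 82, 79, 84]
group_c = [68, 65, 70, 66, 69]

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # p < 0.05: at least one mean differs
```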
Practical Assignments
Assignment 1. The average score of students in research methods was 72.5 (SD = 8.3, n = 40). Test whether this result differs from the benchmark value of 70 points. Calculate the t-statistic: t = (72.5 − 70) / (8.3 / √40) = 2.5 / 1.313 = 1.904. For df = 39, the critical value t(0.025) ≈ 2.023. Since 1.904 < 2.023, the null hypothesis is not rejected (p > 0.05). The average score does not differ statistically significantly from 70.
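The arithmetic in Assignment 1 can be verified in Python:

```python
import math
from scipy import stats

m, mu0, sd, n = 72.5, 70.0, 8.3, 40
t_stat = (m - mu0) / (sd / math.sqrt(n))  # ≈ 1.905
t_crit = stats.t.ppf(0.975, df=n - 1)     # ≈ 2.023 for df = 39
reject = abs(t_stat) > t_crit             # False: H0 is retained
print(f"t = {t_stat:.3f}, critical t = {t_crit:.3f}, reject H0: {reject}")
```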
Assignment 2. Experimental group (n = 25, M = 78, SD = 10) and control group (n = 25, M = 72, SD = 12). Calculate Cohen's d: SD_pooled = √((10² + 12²) / 2) = √(122) ≈ 11.05. d = (78 − 72) / 11.05 ≈ 0.54. This is a medium effect size according to Cohen's classification.
Assignment 3. An instructor measured students’ anxiety levels before (M = 45.2, SD = 9.1) and after (M = 38.6, SD = 8.7) a relaxation course (n = 30). The mean difference: 6.6 points. Determine the type of t-test (paired) and explain why it is appropriate in this situation. What assumptions need to be checked?