1 Introduction
1.2 Probability Distributions for Categorical Data
1.2.1 Binomial Distribution
\[\begin{equation} P(y) = \frac{n!}{y!(n-y)!}\pi^y(1-\pi)^{n-y},\ y = 0, 1, 2, ..., n. \tag{1} \end{equation}\]
\[P(0) = \frac{10!}{0!10!}(0.20)^0(0.80)^{10} = (0.80)^{10} = 0.107\] \[P(1) = \frac{10!}{1!9!}(0.20)^1(0.80)^{9} = 10(0.20)(0.80)^{9} = 0.268\]
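The two hand calculations above can be checked against R's built-in binomial mass function. This is a minimal sketch using only base R's `dbinom`; the variable names `p0` and `p1` are illustrative.

```r
# Binomial probabilities for n = 10 trials with success probability 0.20
p0 <- dbinom(0, size = 10, prob = 0.20)  # P(0) = (0.80)^10
p1 <- dbinom(1, size = 10, prob = 0.20)  # P(1) = 10 * (0.20) * (0.80)^9

round(c(p0, p1), 3)
# [1] 0.107 0.268
```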
library(kableExtra)
# Binomial probability mass function, equation (1)
bino <- function(pi, n, y) {
  factorial(n) / (factorial(y) * factorial(n - y)) * pi ^ y * (1 - pi) ^ (n - y)
}
y <- 0:10
table1_1 <- data.frame(y, x2 = bino(.2, 10, y), x3 = bino(.5, 10, y), x4 = bino(.8, 10, y))
kable(table1_1,
digits = 3,
align='c',
col.names = c("$y$",
"$P(y)$ when $\\pi = 0.20$ $(\\mu = 2.0, \\sigma = 1.26)$",
"$P(y)$ when $\\pi = 0.50$ $(\\mu = 5.0, \\sigma = 1.58)$",
"$P(y)$ when $\\pi = 0.80$ $(\\mu = 8.0, \\sigma = 1.26)$"))
\(y\) | \(P(y)\) when \(\pi = 0.20\) \((\mu = 2.0, \sigma = 1.26)\) | \(P(y)\) when \(\pi = 0.50\) \((\mu = 5.0, \sigma = 1.58)\) | \(P(y)\) when \(\pi = 0.80\) \((\mu = 8.0, \sigma = 1.26)\) |
---|---|---|---|
0 | 0.107 | 0.001 | 0.000 |
1 | 0.268 | 0.010 | 0.000 |
2 | 0.302 | 0.044 | 0.000 |
3 | 0.201 | 0.117 | 0.001 |
4 | 0.088 | 0.205 | 0.006 |
5 | 0.026 | 0.246 | 0.026 |
6 | 0.006 | 0.205 | 0.088 |
7 | 0.001 | 0.117 | 0.201 |
8 | 0.000 | 0.044 | 0.302 |
9 | 0.000 | 0.010 | 0.268 |
10 | 0.000 | 0.001 | 0.107 |
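The means and standard deviations quoted in the column headers follow from \(\mu = n\pi\) and \(\sigma = \sqrt{n\pi(1-\pi)}\). The sketch below recomputes them in base R and, as a cross-check, also obtains the mean directly from the probability mass function. The names `pi_values` and `mu_from_pmf` are illustrative.

```r
n <- 10
pi_values <- c(0.20, 0.50, 0.80)

# Closed-form mean and standard deviation of a binomial count
mu <- n * pi_values
sigma <- sqrt(n * pi_values * (1 - pi_values))

# Cross-check: the mean as the probability-weighted sum of the outcomes 0..n
y <- 0:n
mu_from_pmf <- sapply(pi_values, function(p) sum(y * dbinom(y, n, p)))

data.frame(pi = pi_values, mu = mu, sigma = round(sigma, 2),
           mu_from_pmf = mu_from_pmf)
```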
\[E(Y)=\mu=n\pi,\ \sigma = \sqrt{n\pi(1-\pi)}\]
1.2.2 Multinomial Distribution
1.3 Statistical Inference for a Proportion
1.3.2 Significance Test about a Binomial Parameter
\[E(\hat{\pi})=\pi,\ \sigma(\hat{\pi}) = \sqrt{\frac{\pi(1-\pi)}{n}}\]
\[z = \frac{\hat{\pi}-\pi_0}{SE_0} = \frac{\hat{\pi}-\pi_0}{ \sqrt{\frac{\pi_0(1-\pi_0)}{n}}}\]
1.3.3 Example: Surveyed Opinions About Legalized Abortion
round(837/1810, 4)
[1] 0.4624
\[z = \frac{\hat{\pi}-\pi_0}{ \sqrt{\frac{\pi_0(1-\pi_0)}{n}}} = \frac{.4624-.5}{ \sqrt{\frac{0.50(0.50)}{1810}}} = -3.2\]
z <- round((.4624-.5)/sqrt(0.50*0.50/1810), 2)
round(2 * pnorm(z, lower.tail=TRUE), 4) # two-sided P-value; z is negative, so the lower tail is doubled
[1] 0.0014
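The same test can be run without rounding the intermediate values. The sketch below, which uses only base R, computes the unrounded \(z\) statistic and compares \(z^2\) with the `X-squared` statistic that `prop.test()` reports when the continuity correction is switched off. The variable names are illustrative.

```r
y <- 837; n <- 1810; pi0 <- 0.50

pi_hat <- y / n
se0 <- sqrt(pi0 * (1 - pi0) / n)   # standard error evaluated under H0
z <- (pi_hat - pi0) / se0
p_value <- 2 * pnorm(-abs(z))      # two-sided, valid for either sign of z

# Score test from base R; its statistic should equal z^2
score <- prop.test(y, n, p = pi0, correct = FALSE)
c(z = z, z_squared = z^2, X_squared = unname(score$statistic))
```

Using `-abs(z)` rather than `z` keeps the two-sided P-value correct if the sample proportion happens to fall above the null value.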
1.4 Statistical Inference for Discrete Data
1.4.1 Wald, Likelihood-Ratio, and Score Tests
\[z = (\hat{\beta}-\beta_0)/SE\]
The two-tailed standard normal probability of 0.05 (the area below -1.96 plus the area above 1.96) equals the right-tail chi-squared probability above \((1.96)^2 = 3.84\) when df = 1.
2 * pnorm(-1.96) # 2 * standard normal cumulative prob below -1.96
[1] 0.04999579
pchisq(1.96^2, 1) # chi-square cumulative probability
[1] 0.9500042
1 - pchisq(1.96^2, 1) # right tailed prob above 1.96 * 1.96 when df = 1
[1] 0.04999579
pchisq(1.96^2, 1, lower.tail = FALSE) # same
[1] 0.04999579
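The relationship also holds at the level of critical values: squaring the two-sided normal critical value gives the chi-squared critical value with df = 1. A brief check in base R, with illustrative variable names:

```r
z_crit <- qnorm(0.975)              # two-sided 5% critical value, about 1.96
chisq_crit <- qchisq(0.95, df = 1)  # right-tail 5% critical value, about 3.84

c(z_crit = z_crit, z_crit_squared = z_crit^2, chisq_crit = chisq_crit)
```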
\[2\ \mathrm{log}(\ell_1 / \ell_0),\quad \ell_1 = \text{maximized likelihood (observed)},\ \ell_0 = \text{maximized likelihood under } H_0\]
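As an illustration, the likelihood-ratio statistic can be computed for the opinion data of section 1.3.3 (837 of 1810) with \(H_0: \pi = 0.50\). This is a sketch in base R, assuming the usual large-sample chi-squared reference distribution with df = 1; the variable names are illustrative.

```r
y <- 837; n <- 1810; pi0 <- 0.50
pi_hat <- y / n

# Log-likelihoods at the ML estimate (l1) and at the null value (l0)
log_l1 <- dbinom(y, n, pi_hat, log = TRUE)
log_l0 <- dbinom(y, n, pi0, log = TRUE)

lr_stat <- 2 * (log_l1 - log_l0)                      # 2 log(l1 / l0)
lr_p <- pchisq(lr_stat, df = 1, lower.tail = FALSE)   # right-tail probability
c(lr_stat = lr_stat, p_value = lr_p)
```

Working on the log scale avoids forming the ratio of two very small likelihood values.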
1.5 Bayesian Inference for Proportions
1.5.1 The Bayesian Approach to Statistical Inference
\[ g(\beta \mid y) \mathrm{\ is\ proportional\ to}\ \overbrace{p(y \mid \beta)}^{\ell(\beta)}\ \overbrace{f(\beta)}^{\mathrm{prior}} \]
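The proportionality can be made concrete with a grid approximation for a binomial proportion. The sketch below assumes a Beta(0.5, 0.5) prior, which is an assumption chosen to be consistent with the Beta(837.5, 973.5) posterior used in the R session of section 1.6.2; the grid spacing and variable names are likewise illustrative.

```r
y <- 837; n <- 1810

# Interior grid of candidate values; avoids the prior's poles at 0 and 1
grid <- seq(0.0005, 0.9995, by = 0.001)

prior <- dbeta(grid, 0.5, 0.5)     # assumed Beta(0.5, 0.5) prior
likelihood <- dbinom(y, n, grid)   # binomial likelihood at each grid value

# Posterior is proportional to likelihood * prior; normalize to sum to 1
unnorm <- likelihood * prior
posterior <- unnorm / sum(unnorm)

# Posterior probability that the proportion is below 0.50
p_below_grid <- sum(posterior[grid < 0.5])
p_below_exact <- pbeta(0.50, 837.5, 973.5)
c(grid = p_below_grid, exact = p_below_exact)
```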
1.6 Using R Software for Statistical Inference about Proportions
1.6.1 Reading Data Files and Installing Packages
Clinical <- read.table("http://users.stat.ufl.edu/~aa/cat/data/Clinical.dat",
                       header = TRUE)
Clinical
subject response
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1
6 6 1
7 7 1
8 8 1
9 9 1
10 10 0
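The inference functions used later need only the number of successes and the number of trials. The sketch below shows how those counts come out of the data frame; `clinical_local` is a hypothetical in-memory copy of the ten rows printed above, built so the example runs without network access.

```r
# Hypothetical local copy of the printed data: 9 successes, then 1 failure
clinical_local <- data.frame(subject = 1:10,
                             response = c(rep(1, 9), 0))

successes <- sum(clinical_local$response)  # number of 1s in the response column
n_trials <- nrow(clinical_local)           # number of subjects
successes / n_trials                       # sample proportion
# [1] 0.9
```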
1.6.2 Using R for Statistical Inference about Proportions
library(binom)
prop.test(837, 1810, p = 0.50, alternative = "two.sided", correct = FALSE)
1-sample proportions test without continuity correction
data: 837 out of 1810, null probability 0.5
X-squared = 10.219, df = 1, p-value = 0.00139
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.4395653 0.4854557
sample estimates:
p
0.4624309
prop.test(837, 1810, p = 0.50, alternative = "less", correct = FALSE)
1-sample proportions test without continuity correction
data: 837 out of 1810, null probability 0.5
X-squared = 10.219, df = 1, p-value = 0.0006951
alternative hypothesis: true p is less than 0.5
95 percent confidence interval:
0.0000000 0.4817492
sample estimates:
p
0.4624309
prop.test(sum(Clinical$response), n = 10, conf.level = 0.95, correct = FALSE)
1-sample proportions test without continuity correction
data: sum(Clinical$response) out of 10, null probability 0.5
X-squared = 6.4, df = 1, p-value = 0.01141
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.5958500 0.9821238
sample estimates:
p
0.9
with(Clinical, prop.test(sum(response), n = 10, conf.level = 0.95, correct = FALSE))
1-sample proportions test without continuity correction
data: sum(response) out of 10, null probability 0.5
X-squared = 6.4, df = 1, p-value = 0.01141
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.5958500 0.9821238
sample estimates:
p
0.9
binom.confint(9, 10, conf.level = 0.95,
method = c("asymptotic", "wilson","agresti-coull"))
method x n mean lower upper
1 agresti-coull 9 10 0.9 0.5740323 1.0039415
2 asymptotic 9 10 0.9 0.7140615 1.0859385
3 wilson 9 10 0.9 0.5958500 0.9821238
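The three intervals can be reproduced in base R from their textbook formulas, which makes clear why the Wald ("asymptotic") and Agresti-Coull limits can exceed 1 while the Wilson limits cannot. This is a sketch with illustrative variable names, not the `binom` package's own code.

```r
y <- 9; n <- 10
z <- qnorm(0.975)
p_hat <- y / n

# Wald ("asymptotic"): p_hat +/- z * estimated standard error
wald <- p_hat + c(-1, 1) * z * sqrt(p_hat * (1 - p_hat) / n)

# Wilson (score): inverts the z test that uses the null standard error
wilson_center <- (p_hat + z^2 / (2 * n)) / (1 + z^2 / n)
wilson_half <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
wilson <- wilson_center + c(-1, 1) * wilson_half

# Agresti-Coull: Wald formula after adding z^2/2 successes and z^2/2 failures
n_tilde <- n + z^2
p_tilde <- (y + z^2 / 2) / n_tilde
agresti_coull <- p_tilde + c(-1, 1) * z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)

round(rbind(wald, wilson, agresti_coull), 7)
```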
binom.test(9, 10, 0.5, alternative = "two.sided")
Exact binomial test
data: 9 and 10
number of successes = 9, number of trials = 10, p-value = 0.02148
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.5549839 0.9974714
sample estimates:
probability of success
0.9
binom.test(9, 10, 0.5, alternative = "greater")
Exact binomial test
data: 9 and 10
number of successes = 9, number of trials = 10, p-value = 0.01074
alternative hypothesis: true probability of success is greater than 0.5
95 percent confidence interval:
0.6058367 1.0000000
sample estimates:
probability of success
0.9
library(exactci)
exactci::binom.exact(9, 10, 0.50, alternative = "greater", midp = TRUE)
Exact one-sided binomial test, mid-p version
data: 9 and 10
number of successes = 9, number of trials = 10, p-value = 0.005859
alternative hypothesis: true probability of success is greater than 0.5
95 percent confidence interval:
0.6504873 1.0000000
sample estimates:
probability of success
0.9
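The exact and mid-p P-values above can be recovered directly from binomial probabilities under \(H_0: \pi = 0.50\). The sketch below uses base R only; the variable names are illustrative, and the doubling used for the two-sided value relies on the symmetry of the Binomial(10, 0.5) distribution.

```r
n <- 10; y_obs <- 9; pi0 <- 0.5

# Exact one-sided P-value: P(Y >= 9)
p_exact_greater <- sum(dbinom(y_obs:n, n, pi0))

# Mid-p version: half of P(Y = 9) plus P(Y > 9)
p_midp_greater <- 0.5 * dbinom(y_obs, n, pi0) +
  sum(dbinom((y_obs + 1):n, n, pi0))

# Two-sided exact P-value; doubling is valid here because of symmetry
p_exact_two_sided <- 2 * p_exact_greater

round(c(greater = p_exact_greater, midp = p_midp_greater,
        two_sided = p_exact_two_sided), 6)
```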
library(PropCIs)
midPci(9, 10, 0.95)
data:
95 percent confidence interval:
0.5966 0.9946
qbeta(c(0.025, 0.975), 837.5, 973.5)
[1] 0.4395369 0.4854450
pbeta(0.50, 837.5, 973.5)
[1] 0.9993082
1 - pbeta(0.50, 837.5, 973.5)
[1] 0.0006918185
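The shape parameters 837.5 and 973.5 are what a conjugate beta update gives if the prior is Beta(0.5, 0.5): the number of successes is added to the first parameter and the number of failures to the second. The prior is an assumption inferred from those numbers, and the variable names below are illustrative.

```r
y <- 837; n <- 1810

# Assumed Beta(0.5, 0.5) prior
a_prior <- 0.5; b_prior <- 0.5

# Conjugate update: add successes and failures to the prior parameters
a_post <- a_prior + y          # 837.5
b_post <- b_prior + (n - y)    # 973.5

ci <- qbeta(c(0.025, 0.975), a_post, b_post)   # 95% equal-tail posterior interval
p_below_half <- pbeta(0.50, a_post, b_post)    # posterior P(pi < 0.50)
list(interval = ci, p_below_half = p_below_half)
```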