July 7, 2017 8:11 Handbook of Medical Statistics 9.61in x 6.69in b2736-ch01

CHAPTER 1

PROBABILITY AND PROBABILITY DISTRIBUTIONS

Jian Shi*

Handbook of Medical Statistics Downloaded from www.worldscientific.com by 80.82.77.83 on 10/25/17. For personal use only.

1.1. The Axiomatic Definition of Probability [1,2]

At the early stage of the development of probability theory there were various definitions and methods of calculating probability, such as classical probability, geometric probability, frequency-based probability and so on. In 1933, Kolmogorov established the axiomatic system of probability theory based on measure theory, which laid the foundation of modern probability theory.

The axiomatic system of probability theory: Let $\Omega$ be a set of points $\omega$, and let $\mathcal{F}$ be a collection of subsets $A$ of $\Omega$. $\mathcal{F}$ is called a $\sigma$-algebra of $\Omega$ if it satisfies the conditions:

(i) $\Omega \in \mathcal{F}$;
(ii) if $A \in \mathcal{F}$, then its complement $A^c \in \mathcal{F}$;
(iii) if $A_n \in \mathcal{F}$ for $n = 1, 2, \ldots$, then $\bigcup_{n=1}^{\infty} A_n \in \mathcal{F}$.

Let $P(A)$ ($A \in \mathcal{F}$) be a real-valued function on the $\sigma$-algebra $\mathcal{F}$. Suppose $P(\cdot)$ satisfies:

(1) $0 \le P(A) \le 1$ for every $A \in \mathcal{F}$;
(2) $P(\Omega) = 1$;
(3) $P(\bigcup_{n=1}^{\infty} A_n) = \sum_{n=1}^{\infty} P(A_n)$ for $A_n \in \mathcal{F}$, $n = 1, 2, \ldots$, where $A_i \cap A_j = \emptyset$ for $i \ne j$ and $\emptyset$ is the empty set.

Then $P$ is a probability measure on $\mathcal{F}$, or probability for short. In addition, a set in $\mathcal{F}$ is called an event, and $(\Omega, \mathcal{F}, P)$ is called a probability space.

Some basic properties of probability are as follows:

1. $P(\emptyset) = 0$;
2. For events $A$ and $B$, if $B \subset A$, then $P(A - B) = P(A) - P(B)$ and $P(A) \ge P(B)$; in particular, $P(A^c) = 1 - P(A)$;
3. For any events $A_1, \ldots, A_n$ and $n \ge 1$, there holds $P(\bigcup_{i=1}^{n} A_i) \le \sum_{i=1}^{n} P(A_i)$;
4. For any events $A$ and $B$, there holds $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Suppose a variable $X$ may take different values under different conditions due to accidental, uncontrolled factors of uncertainty and randomness, but the probability that the value of $X$ falls in a given range is fixed; then $X$ is a random variable.

The random variable $X$ is called a discrete random variable if it takes only a finite or countable number of values with fixed probabilities. Suppose $X$ takes values $x_1, x_2, \ldots$ with probabilities $p_i = P\{X = x_i\}$, $i = 1, 2, \ldots$, respectively. Then it holds that:

(1) $p_i \ge 0$, $i = 1, 2, \ldots$; and
(2) $\sum_{i=1}^{\infty} p_i = 1$.

The random variable $X$ is called a continuous random variable if it can take any value in an interval and the probability of $X$ falling into any sub-interval is fixed. For a continuous random variable $X$, if there exists a non-negative integrable function $f(x)$ such that
$$P\{a \le X \le b\} = \int_a^b f(x)\,dx$$
holds for any $-\infty < a < b < \infty$, and
$$\int_{-\infty}^{\infty} f(x)\,dx = 1,$$
then $f(x)$ is called the density function of $X$.

For a random variable $X$, if $F(x) = P\{X \le x\}$ for $-\infty < x < \infty$, then $F(x)$ is called the distribution function of $X$. When $X$ is a discrete random variable, its distribution function is $F(x) = \sum_{i: x_i \le x} p_i$; similarly, when $X$ is a continuous random variable, its distribution function is $F(x) = \int_{-\infty}^{x} f(t)\,dt$.

1.2. Uniform Distribution [2,3,4]

If the random variable $X$ takes values in the interval $[a, b]$ and sub-intervals of $[a, b]$ of equal length carry equal probability, then we say $X$ follows the uniform distribution over $[a, b]$ and denote it as $X \sim U(a, b)$. In particular, when $a = 0$ and $b = 1$, we say $X$ follows the standard uniform distribution $U(0, 1)$. The uniform distribution is the simplest continuous distribution.
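The defining property just stated, namely that the probability of landing in a sub-interval depends only on its length, is easy to check empirically. A minimal Python sketch using only the standard library (the interval $[2, 5]$, the sub-interval $[3, 4]$ and the sample size are arbitrary choices for illustration):

```python
import random

def uniform_interval_prob(a, b, c, d, n=100_000, seed=0):
    """Estimate P{c <= X <= d} for X ~ U(a, b) by simulation."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if c <= rng.uniform(a, b) <= d)
    return hits / n

# For U(2, 5), P{3 <= X <= 4} should be close to (4 - 3)/(5 - 2) = 1/3.
est = uniform_interval_prob(2, 5, 3, 4)
```

With 100,000 draws the estimate typically agrees with the theoretical value $1/3$ to about two decimal places.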
If $X \sim U(a, b)$, then the density function of $X$ is
$$f(x; a, b) = \begin{cases} \frac{1}{b-a}, & a \le x \le b, \\ 0, & \text{otherwise,} \end{cases}$$
and the distribution function of $X$ is
$$F(x; a, b) = \begin{cases} 0, & x < a, \\ \frac{x-a}{b-a}, & a \le x \le b, \\ 1, & x > b. \end{cases}$$

A uniform distribution has the following properties:

1. If $X \sim U(a, b)$, then the $k$-th moment of $X$ is
$$E(X^k) = \frac{b^{k+1} - a^{k+1}}{(k+1)(b-a)}, \quad k = 1, 2, \ldots$$
2. If $X \sim U(a, b)$, then the $k$-th central moment of $X$ is
$$E((X - E(X))^k) = \begin{cases} 0, & \text{when } k \text{ is odd}, \\ \frac{(b-a)^k}{2^k (k+1)}, & \text{when } k \text{ is even}. \end{cases}$$
3. If $X \sim U(a, b)$, then the skewness of $X$ is $s = 0$ and the kurtosis of $X$ is $\kappa = -6/5$.
4. If $X \sim U(a, b)$, then its moment-generating function and characteristic function are
$$M(t) = E(e^{tX}) = \frac{e^{bt} - e^{at}}{(b-a)t} \quad \text{and} \quad \varphi(t) = E(e^{itX}) = \frac{e^{ibt} - e^{iat}}{i(b-a)t},$$
respectively.
5. If $X_1$ and $X_2$ are independent and identically distributed random variables, both with the distribution $U(-\frac{1}{2}, \frac{1}{2})$, then the density function of $X = X_1 + X_2$ is
$$f(x) = \begin{cases} 1 + x, & -1 \le x \le 0, \\ 1 - x, & 0 < x \le 1. \end{cases}$$
This is the so-called "triangular distribution".
6. If $X_1$, $X_2$ and $X_3$ are independent and identically distributed random variables with common distribution $U(-\frac{1}{2}, \frac{1}{2})$, then the density function of $X = X_1 + X_2 + X_3$ is
$$f(x) = \begin{cases} \frac{1}{2}\left(x + \frac{3}{2}\right)^2, & -\frac{3}{2} \le x \le -\frac{1}{2}, \\ \frac{3}{4} - x^2, & -\frac{1}{2} < x \le \frac{1}{2}, \\ \frac{1}{2}\left(x - \frac{3}{2}\right)^2, & \frac{1}{2} < x \le \frac{3}{2}, \\ 0, & \text{otherwise.} \end{cases}$$
The shape of this density resembles that of a normal density, which we will discuss next.
7. If $X \sim U(0, 1)$, then $1 - X \sim U(0, 1)$.
8. Assume that a distribution function $F$ is strictly increasing and continuous, $F^{-1}$ is the inverse function of $F$, and $X \sim U(0, 1)$. In this case, the distribution function of the random variable $Y = F^{-1}(X)$ is $F$.
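Property 8 is the basis of inverse-transform sampling. A minimal Python sketch: the exponential distribution is used here because its inverse distribution function has the closed form $F^{-1}(u) = -\ln(1-u)/\lambda$, and the rate `lam = 2.0` is an arbitrary choice:

```python
import math
import random

def sample_exponential(lam, n, seed=0):
    """Inverse-transform sampling (property 8): if U ~ U(0, 1), then
    F^{-1}(U) = -ln(1 - U)/lam follows the exponential distribution E(lam)."""
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

# The mean of E(lam) is 1/lam, so for lam = 2 the sample mean should be near 0.5.
xs = sample_exponential(2.0, 100_000)
mean = sum(xs) / len(xs)
```

The same recipe works for any distribution whose inverse distribution function can be written down explicitly.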
In stochastic simulations, since it is easy to generate pseudo-random numbers from the standard uniform distribution (e.g. by the congruential method), pseudo-random numbers of many common distributions can be generated using property 8, especially when the inverse of the distribution function has an explicit form.

1.3. Normal Distribution [2,3,4]

If the density function of the random variable $X$ is
$$\frac{1}{\sigma}\varphi\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\},$$
where $-\infty < x, \mu < \infty$ and $\sigma > 0$, then we say $X$ follows the normal distribution and denote it as $X \sim N(\mu, \sigma^2)$. In particular, when $\mu = 0$ and $\sigma = 1$, we say that $X$ follows the standard normal distribution $N(0, 1)$.

If $X \sim N(\mu, \sigma^2)$, then the distribution function of $X$ is
$$\Phi\left(\frac{x-\mu}{\sigma}\right) = \int_{-\infty}^{x} \frac{1}{\sigma}\varphi\left(\frac{t-\mu}{\sigma}\right)dt.$$
If $X$ follows the standard normal distribution $N(0, 1)$, then the density and distribution functions of $X$ are $\varphi(x)$ and $\Phi(x)$, respectively.

The normal distribution is the most common continuous distribution and has the following properties:

1. If $X \sim N(\mu, \sigma^2)$, then $Y = \frac{X-\mu}{\sigma} \sim N(0, 1)$; and if $X \sim N(0, 1)$, then $Y = a + \sigma X \sim N(a, \sigma^2)$. Hence, a general normal distribution can be converted to the standard normal distribution by a linear transformation.
2. If $X \sim N(\mu, \sigma^2)$, then the expectation of $X$ is $E(X) = \mu$ and the variance of $X$ is $\mathrm{Var}(X) = \sigma^2$.
3. If $X \sim N(\mu, \sigma^2)$, then the $k$-th central moment of $X$ is
$$E((X - \mu)^k) = \begin{cases} 0, & k \text{ is odd}, \\ \frac{k!}{2^{k/2}(k/2)!}\,\sigma^k, & k \text{ is even}. \end{cases}$$
4. If $X \sim N(\mu, \sigma^2)$, then the moments of $X$ are
$$E(X^{2k-1}) = \sum_{i=1}^{k} \frac{(2k-1)!\,\mu^{2i-1}\sigma^{2(k-i)}}{(2i-1)!\,(k-i)!\,2^{k-i}} \quad \text{and} \quad E(X^{2k}) = \sum_{i=0}^{k} \frac{(2k)!\,\mu^{2i}\sigma^{2(k-i)}}{(2i)!\,(k-i)!\,2^{k-i}}$$
for $k = 1, 2, \ldots$.
5. If $X \sim N(\mu, \sigma^2)$, then the skewness and the kurtosis of $X$ are both 0, i.e. $s = \kappa = 0$.
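The zero skewness and zero (excess) kurtosis of the normal distribution can be seen in simulated data. A stdlib-only Python sketch (the seed and the sample size of 100,000 are arbitrary choices):

```python
import random

def sample_skew_kurt(xs):
    """Sample skewness and excess kurtosis from central moments;
    both should be close to 0 for normally distributed data."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
s, k = sample_skew_kurt(xs)
```

For a sample of this size, both statistics are typically within a few hundredths of 0.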
This property can be used to check whether a distribution is normal.

6. If $X \sim N(\mu, \sigma^2)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = \exp\{t\mu + \frac{1}{2}t^2\sigma^2\}$ and $\varphi(t) = \exp\{it\mu - \frac{1}{2}t^2\sigma^2\}$, respectively.
7. If $X \sim N(\mu, \sigma^2)$, then $a + bX \sim N(a + b\mu, b^2\sigma^2)$.
8. If $X_i \sim N(\mu_i, \sigma_i^2)$ for $1 \le i \le n$, and $X_1, X_2, \ldots, X_n$ are mutually independent, then
$$\sum_{i=1}^{n} X_i \sim N\left(\sum_{i=1}^{n} \mu_i,\ \sum_{i=1}^{n} \sigma_i^2\right).$$
9. If $X_1, X_2, \ldots, X_n$ represent a random sample from the population $N(\mu, \sigma^2)$, then the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ satisfies $\bar{X}_n \sim N(\mu, \frac{\sigma^2}{n})$.

The central limit theorem: Suppose that $X_1, \ldots, X_n$ are independent and identically distributed random variables, and that $\mu = E(X_1)$ and $0 < \sigma^2 = \mathrm{Var}(X_1) < \infty$. Then the distribution of $T_n = \sqrt{n}(\bar{X}_n - \mu)/\sigma$ is asymptotically standard normal when $n$ is large enough. The central limit theorem reveals that the limiting distributions of statistics are in many cases (asymptotically) normal. Therefore, the normal distribution is the most widely used distribution in statistics.

The support of the normal distribution is the whole real axis, i.e. from negative infinity to positive infinity. However, many variables in real problems take only positive values, for example height, voltage and so on. In these cases, the logarithm of the variable can often be regarded as normally distributed.

Log-normal distribution: Suppose $X > 0$. If $\ln X \sim N(\mu, \sigma^2)$, then we say $X$ follows the log-normal distribution and denote it as $X \sim LN(\mu, \sigma^2)$.

1.4. Exponential Distribution [2,3,4]

If the density function of the random variable $X$ is
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0, \\ 0, & x < 0, \end{cases}$$
where $\lambda > 0$, then we say $X$ follows the exponential distribution and denote it as $X \sim E(\lambda)$. In particular, when $\lambda = 1$, we say $X$ follows the standard exponential distribution $E(1)$.

If $X \sim E(\lambda)$, then its distribution function is
$$F(x; \lambda) = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0, \\ 0, & x < 0. \end{cases}$$

The exponential distribution is an important distribution in reliability. The life of an electronic product generally follows an exponential distribution. When the life of a product follows the exponential distribution $E(\lambda)$, $\lambda$ is called the failure rate of the product. The exponential distribution has the following properties:

1. If $X \sim E(\lambda)$, then the $k$-th moment of $X$ is $E(X^k) = k!\,\lambda^{-k}$, $k = 1, 2, \ldots$.
2. If $X \sim E(\lambda)$, then $E(X) = \lambda^{-1}$ and $\mathrm{Var}(X) = \lambda^{-2}$.
3. If $X \sim E(\lambda)$, then its skewness is $s = 2$ and its kurtosis is $\kappa = 6$.
4. If $X \sim E(\lambda)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = \frac{\lambda}{\lambda - t}$ for $t < \lambda$ and $\varphi(t) = \frac{\lambda}{\lambda - it}$, respectively.
5. If $X \sim E(1)$, then $\lambda^{-1}X \sim E(\lambda)$ for $\lambda > 0$.
6. If $X \sim E(\lambda)$, then for any $x > 0$ and $y > 0$, there holds
$$P\{X > x + y \mid X > y\} = P\{X > x\}.$$
This is the so-called "memoryless property" of the exponential distribution. If the life distribution of a product is exponential, then no matter how long it has been used, the remaining life of the product follows the same distribution as that of a new product, provided it has not failed at the present time.
7. If $X \sim E(\lambda)$, then for any $a > 0$, there hold $E(X \mid X > a) = a + \lambda^{-1}$ and $\mathrm{Var}(X \mid X > a) = \lambda^{-2}$.
8. If $X$ and $Y$ are independent and identically distributed as $E(\lambda)$, then $\min(X, Y)$ is independent of $X - Y$, and $\{X \mid X + Y = z\} \sim U(0, z)$.
9. If $X_1, X_2, \ldots, X_n$ are a random sample from the population $E(\lambda)$, let $X_{(1,n)} \le X_{(2,n)} \le \cdots \le X_{(n,n)}$ be the order statistics of $X_1, X_2, \ldots, X_n$, and write $Y_k = (n - k + 1)(X_{(k,n)} - X_{(k-1,n)})$, $1 \le k \le n$, where $X_{(0,n)} = 0$.
Then $Y_1, Y_2, \ldots, Y_n$ are independent and identically distributed as $E(\lambda)$.

10. If $X_1, X_2, \ldots, X_n$ are a random sample from the population $E(\lambda)$, then $\sum_{i=1}^{n} X_i \sim \Gamma(n, \lambda)$, where $\Gamma(n, \lambda)$ is the Gamma distribution in Sec. 1.12.
11. If $Y \sim U(0, 1)$, then $X = -\ln(Y) \sim E(1)$. Therefore, it is easy to generate exponentially distributed random numbers from uniform random numbers.

1.5. Weibull Distribution [2,3,4]

If the density function of the random variable $X$ is
$$f(x; \beta, \eta, \gamma) = \begin{cases} \frac{\beta}{\eta}(x - \gamma)^{\beta-1}\exp\left\{-\frac{(x-\gamma)^{\beta}}{\eta}\right\}, & x \ge \gamma, \\ 0, & x < \gamma, \end{cases}$$
then we say $X$ follows the Weibull distribution and denote it as $X \sim W(\beta, \eta, \gamma)$, where $\gamma$ is the location parameter, $\beta > 0$ is the shape parameter and $\eta > 0$ is the scale parameter. For simplicity, we denote $W(\beta, \eta, 0)$ as $W(\beta, \eta)$. In particular, when $\gamma = 0$ and $\beta = 1$, the Weibull distribution $W(1, \eta)$ reduces to the exponential distribution $E(1/\eta)$.

If $X \sim W(\beta, \eta, \gamma)$, then its distribution function is
$$F(x; \beta, \eta, \gamma) = \begin{cases} 1 - \exp\left\{-\frac{(x-\gamma)^{\beta}}{\eta}\right\}, & x \ge \gamma, \\ 0, & x < \gamma. \end{cases}$$

The Weibull distribution is an important distribution in reliability theory. It is often used to describe the life distribution of products, such as electronic products and wear products. The Weibull distribution has the following properties:

1. If $X \sim E(1)$, then $Y = (X\eta)^{1/\beta} + \gamma \sim W(\beta, \eta, \gamma)$. Hence, the Weibull and exponential distributions can be converted into each other by transformation.
2. If $X \sim W(\beta, \eta)$, then the $k$-th moment of $X$ is
$$E(X^k) = \Gamma\left(1 + \frac{k}{\beta}\right)\eta^{k/\beta},$$
where $\Gamma(\cdot)$ is the Gamma function.
3. If $X \sim W(\beta, \eta, \gamma)$, then
$$E(X) = \Gamma\left(1 + \frac{1}{\beta}\right)\eta^{1/\beta} + \gamma, \qquad \mathrm{Var}(X) = \left[\Gamma\left(1 + \frac{2}{\beta}\right) - \Gamma^2\left(1 + \frac{1}{\beta}\right)\right]\eta^{2/\beta}.$$
4. Suppose $X_1, X_2, \ldots, X_n$ are mutually independent and identically distributed random variables with common distribution $W(\beta, \eta, \gamma)$. Then $X_{(1,n)} = \min(X_1, X_2, \ldots, X_n) \sim W(\beta, \eta/n, \gamma)$; conversely, if $X_{(1,n)} \sim W(\beta, \eta/n, \gamma)$, then $X_1 \sim W(\beta, \eta, \gamma)$.

1.5.1. The application of Weibull distribution in reliability

The shape parameter $\beta$ usually describes the failure mechanism of a product. Weibull distributions with $\beta < 1$ are called "early failure" life distributions, those with $\beta = 1$ are called "occasional failure" life distributions, and those with $\beta > 1$ are called "wear-out (aging) failure" life distributions.

If $X \sim W(\beta, \eta, \gamma)$, then its reliability function is
$$R(x) = 1 - F(x; \beta, \eta, \gamma) = \begin{cases} \exp\left\{-\frac{(x-\gamma)^{\beta}}{\eta}\right\}, & x \ge \gamma, \\ 1, & x < \gamma. \end{cases}$$
When the reliability $R$ of a product is given, $x_R = \gamma + \eta^{1/\beta}(-\ln R)^{1/\beta}$ is the $R$-percentile life of the product. If $R = 0.5$, then $x_{0.5} = \gamma + \eta^{1/\beta}(\ln 2)^{1/\beta}$ is the median life; if $R = e^{-1}$, then $x_{e^{-1}} = \gamma + \eta^{1/\beta}$ is the characteristic life; if $R = \exp\{-\Gamma^{\beta}(1 + \beta^{-1})\}$, then $x_R = E(X)$, that is, the mean life.

The failure rate of the Weibull distribution $W(\beta, \eta, \gamma)$ is
$$\lambda(x) = \frac{f(x; \beta, \eta, \gamma)}{R(x)} = \begin{cases} \frac{\beta}{\eta}(x - \gamma)^{\beta-1}, & x \ge \gamma, \\ 0, & x < \gamma. \end{cases}$$
The mean failure rate is
$$\bar{\lambda}(x) = \frac{1}{x - \gamma}\int_{\gamma}^{x} \lambda(t)\,dt = \begin{cases} \frac{(x-\gamma)^{\beta-1}}{\eta}, & x \ge \gamma, \\ 0, & x < \gamma. \end{cases}$$
In particular, the failure rate of the exponential distribution $E(\lambda) = W(1, 1/\lambda)$ is the constant $\lambda$.

1.6. Binomial Distribution [2,3,4]

We say the random variable $X$ follows the binomial distribution if it takes discrete values with
$$P\{X = k\} = C_n^k p^k (1 - p)^{n-k}, \quad k = 0, 1, \ldots, n,$$
where $n$ is a positive integer, $C_n^k$ is the binomial coefficient and $0 \le p \le 1$. We denote it as $X \sim B(n, p)$.

Consider $n$ independent trials, each with exactly two possible outcomes, "success" and "failure". The probability of success is $p$.
Let $X$ be the total number of successes in these $n$ trials; then $X \sim B(n, p)$. In particular, if $n = 1$, $B(1, p)$ is called the Bernoulli distribution or two-point distribution. It is the simplest discrete distribution. The binomial distribution is a common discrete distribution.

If $X \sim B(n, p)$, then its distribution function is
$$B(x; n, p) = \begin{cases} \sum_{k=0}^{\min([x], n)} C_n^k p^k q^{n-k}, & x \ge 0, \\ 0, & x < 0, \end{cases}$$
where $[x]$ is the integer part of $x$ and $q = 1 - p$.

Let $B_x(a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt$ be the incomplete Beta function, where $0 < x < 1$, $a > 0$, $b > 0$; then $B(a, b) = B_1(a, b)$ is the Beta function. Let $I_x(a, b) = B_x(a, b)/B(a, b)$ be the incomplete Beta function ratio. Then the binomial distribution function can be represented as
$$B(x; n, p) = 1 - I_p([x] + 1, n - [x]), \quad 0 \le x \le n.$$

The binomial distribution has the following properties:

1. Let $b(k; n, p) = C_n^k p^k q^{n-k}$ for $0 \le k \le n$. If $k \le [(n+1)p]$, then $b(k; n, p) \ge b(k-1; n, p)$; if $k > [(n+1)p]$, then $b(k; n, p) < b(k-1; n, p)$.
2. When $p = 0.5$, the binomial distribution $B(n, 0.5)$ is symmetric; when $p \ne 0.5$, the binomial distribution $B(n, p)$ is asymmetric.
3. Suppose $X_1, X_2, \ldots, X_n$ are mutually independent and identically distributed Bernoulli random variables with parameter $p$; then $Y = \sum_{i=1}^{n} X_i \sim B(n, p)$.
4. If $X \sim B(n, p)$, then $E(X) = np$ and $\mathrm{Var}(X) = npq$.
5. If $X \sim B(n, p)$, then the $k$-th moment of $X$ is
$$E(X^k) = \sum_{i=1}^{k} S_2(k, i)\,P_n^i\,p^i,$$
where $S_2(k, i)$ is the Stirling number of the second kind and $P_n^i$ is the number of permutations.
6. If $X \sim B(n, p)$, then its skewness is $s = (1 - 2p)/(npq)^{1/2}$ and its kurtosis is $\kappa = (1 - 6pq)/(npq)$.
7. If $X \sim B(n, p)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = (q + pe^t)^n$ and $\varphi(t) = (q + pe^{it})^n$, respectively.
8. When $n$ and $x$ are fixed, the binomial distribution function $B(x; n, p)$ is a monotonically decreasing function of $p$ ($0 < p < 1$).
9. If $X_i \sim B(n_i, p)$ for $1 \le i \le k$, and $X_1, X_2, \ldots, X_k$ are mutually independent, then $X = \sum_{i=1}^{k} X_i \sim B(\sum_{i=1}^{k} n_i, p)$.

1.7. Multinomial Distribution [2,3,4]

If an $n$-dimensional ($n \ge 2$) random vector $X = (X_1, \ldots, X_n)$ satisfies the following conditions:

(1) $X_i \ge 0$, $1 \le i \le n$, and $\sum_{i=1}^{n} X_i = N$;
(2) for any non-negative integers $m_1, m_2, \ldots, m_n$ with $\sum_{i=1}^{n} m_i = N$, the probability of the following event is
$$P\{X_1 = m_1, \ldots, X_n = m_n\} = \frac{N!}{m_1! \cdots m_n!}\prod_{i=1}^{n} p_i^{m_i},$$
where $p_i \ge 0$, $1 \le i \le n$, $\sum_{i=1}^{n} p_i = 1$,

then we say $X$ follows the multinomial distribution and denote it as $X \sim PN(N; p_1, \ldots, p_n)$. In particular, when $n = 2$, the multinomial distribution degenerates to the binomial distribution.

Suppose a jar contains balls of $n$ colors. Each time, a ball is drawn at random from the jar and then put back. The probability of drawing a ball of the $i$-th color is $p_i$, $1 \le i \le n$, $\sum_{i=1}^{n} p_i = 1$. Assume that balls are drawn with replacement $N$ times and $X_i$ denotes the number of draws of the $i$-th color; then the random vector $X = (X_1, \ldots, X_n)$ follows the multinomial distribution $PN(N; p_1, \ldots, p_n)$. The multinomial distribution is a common multivariate discrete distribution.

The multinomial distribution has the following properties:

1. If $(X_1, \ldots, X_n) \sim PN(N; p_1, \ldots, p_n)$, let $X_{i+1}^* = \sum_{j=i+1}^{n} X_j$ and $p_{i+1}^* = \sum_{j=i+1}^{n} p_j$ for $1 \le i < n$; then
(i) $(X_1, \ldots, X_i, X_{i+1}^*) \sim PN(N; p_1, \ldots, p_i, p_{i+1}^*)$;
(ii) $X_i \sim B(N, p_i)$, $1 \le i \le n$.
More generally, let $0 = j_0 < j_1 < \cdots < j_m = n$, and let $\bar{X}_k = \sum_{i=j_{k-1}+1}^{j_k} X_i$ and $\bar{p}_k = \sum_{i=j_{k-1}+1}^{j_k} p_i$ for $1 \le k \le m$; then $(\bar{X}_1, \ldots, \bar{X}_m) \sim PN(N; \bar{p}_1, \ldots, \bar{p}_m)$.

2. If $(X_1, \ldots, X_n) \sim PN(N; p_1, \ldots, p_n)$, then its moment-generating function and characteristic function are
$$M(t_1, \ldots, t_n) = \left(\sum_{j=1}^{n} p_j e^{t_j}\right)^N \quad \text{and} \quad \varphi(t_1, \ldots, t_n) = \left(\sum_{j=1}^{n} p_j e^{it_j}\right)^N,$$
respectively.
3. If $(X_1, \ldots, X_n) \sim PN(N; p_1, \ldots, p_n)$, then for $n > 1$ and $1 \le k < n$,
$$(X_1, \ldots, X_k \mid X_{k+1} = m_{k+1}, \ldots, X_n = m_n) \sim PN(N - M; p_1^*, \ldots, p_k^*),$$
where $M = \sum_{i=k+1}^{n} m_i$, $0 < M < N$, and $p_j^* = p_j / \sum_{i=1}^{k} p_i$, $1 \le j \le k$.
4. If $X_i$ follows the Poisson distribution $P(\lambda_i)$, $1 \le i \le n$, and $X_1, \ldots, X_n$ are mutually independent, then for any given positive integer $N$, there holds
$$\left(X_1, \ldots, X_n \,\Big|\, \sum_{i=1}^{n} X_i = N\right) \sim PN(N; p_1, \ldots, p_n),$$
where $p_i = \lambda_i / \sum_{j=1}^{n} \lambda_j$, $1 \le i \le n$.

1.8. Poisson Distribution [2,3,4]

If the random variable $X$ takes non-negative integer values, and the probability is
$$P\{X = k\} = \frac{\lambda^k}{k!}e^{-\lambda}, \quad \lambda > 0, \quad k = 0, 1, \ldots,$$
then we say $X$ follows the Poisson distribution and denote it as $X \sim P(\lambda)$.

If $X \sim P(\lambda)$, then its distribution function is
$$P\{X \le x\} = P(x; \lambda) = \sum_{k=0}^{[x]} p(k; \lambda),$$
where $p(k; \lambda) = e^{-\lambda}\lambda^k/k!$, $k = 0, 1, \ldots$.

The Poisson distribution is an important distribution in queuing theory. For example, the number of ticket purchasers arriving at a ticket window in a fixed interval of time approximately follows a Poisson distribution. The Poisson distribution has a wide range of applications in physics, finance, insurance and other fields.
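The Poisson probabilities $p(k; \lambda) = e^{-\lambda}\lambda^k/k!$ are easy to evaluate directly. A small Python sketch ($\lambda = 3$ is an arbitrary choice; truncating the sum at $k = 99$ is harmless here because the tail mass is negligible):

```python
import math

def poisson_pmf(k, lam):
    """p(k; lam) = e^{-lam} * lam^k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 3.0
# The probabilities sum to 1, and the mean recovers lam (see property 4 below-style facts).
total = sum(poisson_pmf(k, lam) for k in range(100))
mean = sum(k * poisson_pmf(k, lam) for k in range(100))
```

The same truncated-sum approach also lets one verify numerically that the variance equals the mean.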
The Poisson distribution has the following properties:

1. If $k < \lambda$, then $p(k; \lambda) > p(k-1; \lambda)$; if $k > \lambda$, then $p(k; \lambda) < p(k-1; \lambda)$. If $\lambda$ is not an integer, then $p(k; \lambda)$ attains its maximum at $k = [\lambda]$; if $\lambda$ is an integer, then $p(k; \lambda)$ attains its maximum at both $k = \lambda$ and $k = \lambda - 1$.
2. When $x$ is fixed, $P(x; \lambda)$ is a non-increasing function of $\lambda$, that is, $P(x; \lambda_1) \ge P(x; \lambda_2)$ if $\lambda_1 < \lambda_2$. When $\lambda$ and $x$ change at the same time,
$$P(x; \lambda) \ge P(x-1; \lambda-1) \ \text{if } x \le \lambda - 1, \qquad P(x; \lambda) \le P(x-1; \lambda-1) \ \text{if } x \ge \lambda.$$
3. If $X \sim P(\lambda)$, then the $k$-th moment of $X$ is $E(X^k) = \sum_{i=1}^{k} S_2(k, i)\lambda^i$, where $S_2(k, i)$ is the Stirling number of the second kind.
4. If $X \sim P(\lambda)$, then $E(X) = \lambda$ and $\mathrm{Var}(X) = \lambda$. The equality of expectation and variance is an important feature of the Poisson distribution.
5. If $X \sim P(\lambda)$, then its skewness is $s = \lambda^{-1/2}$ and its kurtosis is $\kappa = \lambda^{-1}$.
6. If $X \sim P(\lambda)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = \exp\{\lambda(e^t - 1)\}$ and $\varphi(t) = \exp\{\lambda(e^{it} - 1)\}$, respectively.
7. If $X_1, X_2, \ldots, X_n$ are mutually independent and identically distributed, then $X_1 \sim P(\lambda)$ is equivalent to $\sum_{i=1}^{n} X_i \sim P(n\lambda)$.
8. If $X_i \sim P(\lambda_i)$ for $1 \le i \le n$, and $X_1, X_2, \ldots, X_n$ are mutually independent, then
$$\sum_{i=1}^{n} X_i \sim P\left(\sum_{i=1}^{n} \lambda_i\right).$$
9. If $X_1 \sim P(\lambda_1)$ and $X_2 \sim P(\lambda_2)$ are mutually independent, then the conditional distribution of $X_1$ given $X_1 + X_2$ is binomial, that is, $(X_1 \mid X_1 + X_2 = x) \sim B(x, p)$, where $p = \lambda_1/(\lambda_1 + \lambda_2)$.

1.9. Negative Binomial Distribution [2,3,4]

For a positive integer $m$, if the random variable $X$ takes non-negative integer values, and the probability is
$$P\{X = k\} = C_{k+m-1}^{k}\,p^m q^k, \quad k = 0, 1, \ldots,$$
where $0 < p < 1$ and $q = 1 - p$, then we say $X$ follows the negative binomial distribution and denote it as $X \sim NB(m, p)$.

If $X \sim NB(m, p)$, then its distribution function is
$$NB(x; m, p) = \begin{cases} \sum_{k=0}^{[x]} C_{k+m-1}^{k}\,p^m q^k, & x \ge 0, \\ 0, & x < 0. \end{cases}$$

The negative binomial distribution is also called the Pascal distribution. It is a direct generalization of the binomial distribution. Consider a success–failure (Bernoulli) trial with success probability $p$, repeated independently, and let $X$ be the total number of trials until the $m$-th success occurs. Then $X - m$ follows the negative binomial distribution $NB(m, p)$; that is, the total number of failures follows $NB(m, p)$.

The negative binomial distribution has the following properties:

1. Let $nb(k; m, p) = C_{k+m-1}^{k}\,p^m q^k$, where $0 < p < 1$, $k = 0, 1, \ldots$; then
$$nb(k+1; m, p) = \frac{m+k}{k+1}\,q\cdot nb(k; m, p).$$
Therefore, $nb(k; m, p)$ increases monotonically in $k$ for $k < \frac{m-1}{p} - m$ and decreases monotonically for $k > \frac{m-1}{p} - m$.
2. The binomial distribution $B(m, p)$ and the negative binomial distribution $NB(r, p)$ have the following relationship:
$$NB(x; r, p) = 1 - B(r - 1; r + [x], p).$$
3. $NB(x; m, p) = I_p(m, [x] + 1)$, where $I_p(\cdot, \cdot)$ is the incomplete Beta function ratio.
4. If $X \sim NB(m, p)$, then the $k$-th moment of $X$ is
$$E(X^k) = \sum_{i=1}^{k} S_2(k, i)\,m^{[i]}(q/p)^i,$$
where $m^{[i]} = m(m+1)\cdots(m+i-1)$, $1 \le i \le k$, and $S_2(k, i)$ is the Stirling number of the second kind.
5. If $X \sim NB(m, p)$, then $E(X) = mq/p$ and $\mathrm{Var}(X) = mq/p^2$.
6. If $X \sim NB(m, p)$, then its skewness and kurtosis are $s = (1+q)/(mq)^{1/2}$ and $\kappa = (6q + p^2)/(mq)$, respectively.
7. If $X \sim NB(m, p)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = p^m(1 - qe^t)^{-m}$ and $\varphi(t) = p^m(1 - qe^{it})^{-m}$, respectively.
8. If $X_i \sim NB(m_i, p)$ for $1 \le i \le n$, and $X_1, X_2, \ldots, X_n$ are mutually independent, then
$$\sum_{i=1}^{n} X_i \sim NB\left(\sum_{i=1}^{n} m_i,\ p\right).$$
9. If $X \sim NB(m, p)$, then there exists a sequence of random variables $X_1, \ldots, X_m$, independent and identically distributed as $G(p)$, such that $X = X_1 + \cdots + X_m - m$, where $G(p)$ is the geometric distribution in Sec. 1.11.

1.10. Hypergeometric Distribution [2,3,4]

Let $N, M, n$ be positive integers satisfying $M \le N$ and $n \le N$. If the random variable $X$ takes integer values in the interval $[\max(0, M+n-N), \min(M, n)]$, and the probability of $X = k$ is
$$P\{X = k\} = \frac{C_M^k C_{N-M}^{n-k}}{C_N^n}, \quad \max(0, M+n-N) \le k \le \min(M, n),$$
then we say $X$ follows the hypergeometric distribution and denote it as $X \sim H(M, N, n)$.

If $X \sim H(M, N, n)$, then the distribution function of $X$ is
$$H(x; n, N, M) = \begin{cases} \sum_{k=K_1}^{\min([x], K_2)} \frac{C_M^k C_{N-M}^{n-k}}{C_N^n}, & x \ge K_1, \\ 0, & x < K_1, \end{cases}$$
where $K_1 = \max(0, M+n-N)$ and $K_2 = \min(M, n)$.

The hypergeometric distribution is often used in the sampling inspection of products and holds an important position in the theory of sampling inspection. Assume that there are $N$ products, of which $M$ are non-conforming. We randomly draw $n$ products from the $N$ products without replacement. Let $X$ be the number of non-conforming products among these $n$ products; then $X$ follows the hypergeometric distribution $H(M, N, n)$.

Some properties of the hypergeometric distribution are as follows:

1. Denote $h(k; n, N, M) = C_M^k C_{N-M}^{n-k}/C_N^n$; then
$$h(k; n, N, M) = h(k; M, N, n), \qquad h(k; n, N, M) = h(N - n - M + k; N - n, N, N - M),$$
where $K_1 \le k \le K_2$.
2. The distribution function of the hypergeometric distribution has the following expressions:
$$H(x; n, N, M) = H(N - n - M + x; N - n, N, N - M) = 1 - H(n - x - 1; n, N, N - M) = 1 - H(M - x - 1; N - n, N, M)$$
and
$$1 - H(n - 1; x + n, N, N - M) = H(x; n + x, N, M),$$
where $x \ge K_1$.
3. If $X \sim H(M, N, n)$, then its expectation and variance are
$$E(X) = \frac{nM}{N}, \qquad \mathrm{Var}(X) = \frac{nM(N-n)(N-M)}{N^2(N-1)}.$$
For integers $n$ and $k$, denote
$$n^{(k)} = \begin{cases} n(n-1)\cdots(n-k+1), & k < n, \\ n!, & k \ge n. \end{cases}$$
4. If $X \sim H(M, N, n)$, the $k$-th moment of $X$ is
$$E(X^k) = \sum_{i=1}^{k} S_2(k, i)\,\frac{n^{(i)} M^{(i)}}{N^{(i)}}.$$
5. If $X \sim H(M, N, n)$, the skewness of $X$ is
$$s = \frac{(N - 2M)(N-1)^{1/2}(N - 2n)}{(nM(N-M)(N-n))^{1/2}(N-2)}.$$
6. If $X \sim H(M, N, n)$, the moment-generating function and the characteristic function of $X$ are
$$M(t) = \frac{(N-n)!\,(N-M)!}{N!\,(N-M-n)!}\,F(-n, -M; N-M-n+1; e^t)$$
and
$$\varphi(t) = \frac{(N-n)!\,(N-M)!}{N!\,(N-M-n)!}\,F(-n, -M; N-M-n+1; e^{it}),$$
respectively, where $F(a, b; c; x)$ is the hypergeometric function, defined by
$$F(a, b; c; x) = 1 + \frac{ab}{c}\frac{x}{1!} + \frac{a(a+1)b(b+1)}{c(c+1)}\frac{x^2}{2!} + \cdots$$
with $c > 0$.

A typical application of the hypergeometric distribution is estimating the number of fish in a lake. To estimate how many fish are in a lake, one can catch $M$ fish, tag them, and put them back into the lake. After a period of time, one re-catches $n$ ($n > M$) fish from the lake, among which there are $s$ tagged fish. $M$ and $n$ are given in advance. Let $X$ be the number of tagged fish among the $n$ re-caught fish. If the total number of fish in the lake is assumed to be $N$, then $X$ follows the hypergeometric distribution $H(M, N, n)$.
According to the above property 3, E(X) = nM/N , which can be estimated by the number of ?sh re-caught with the mark, i.e., s ? E(X) = nM/N . Therefore, the estimated total number of ?sh in the lake is N? = nM/s. 1.11. Geometric Distribution2,3,4 If values of the random variable X are positive integers, and the probabilities are P {X = k} = q k?1 p, k = 1, 2, . . . , where 0 < p ? 1, q = 1 ? p, then we say X follows the geometric distribution d d and denote it as X ? G(p). If X ? G(p), then the distribution function of X is 1 ? q [x] , x ? 0, G(x; p) = 0, x < 0. Geometric distribution is named according to what the sum of distribution probabilities is a geometric series. page 17 July 7, 2017 8:11 Handbook of Medical Statistics 9.61in x 6.69in b2736-ch01 J. Shi 18 In a trial (Bernoulli distribution), whose outcome can be classi?ed as either a ?success? or a ?failure?, and p is the probability that the trial is a ?success?. Suppose that the trials can be performed repeatedly and independently. Let X be the number of trials required until the ?rst success occurs, then X follows the geometric distribution G(p). Some properties of geometric distribution are as follows: Handbook of Medical Statistics Downloaded from www.worldscientific.com by 80.82.77.83 on 10/25/17. For personal use only. 1. Denote g(k; p) = pq k?1 , k = 1, 2, . . . , 0 < p < 1, then g(k; p) is a monotonically decreasing function of k, that is, g(1; p) > g(2; p) > g(3; p) > и и и . d 2. If X ? G(p), then the expectation and variance of X are E(X) = 1/p and Var(X) = q/p2 , respectively. d 3. If X ? G(p), then the k-th moment of X is k K S2 (k, i)i!q i?1 /pi , E(X ) = i=1 where S2 (k, i) is the second order Stirling number. d 4. If X ? G(p), the skewness of X is s = q 1/2 + q ?1/2 . d 5. If X ? G(p), the moment-generating function and the characteristic function of X are M (t) = pet (1 ? et q)?1 and ?(t) = peit (1 ? eit q)?1 , respectively. d 6. If X ? 
G(p), then P {X > n + m|X > n} = P {X > m}, for any nature number n and m. Property 6 is also known as ?memoryless property? of geometric distribution. This indicates that, in a success-failure test, when we have done n trials with no ?success? outcome, the probability of the even that we continue to perform m trials still with no ?success? outcome has nothing to do with the information of the ?rst n trials. The ?memoryless property? is a feature of geometric distribution. It can be proved that a discrete random variable taking natural numbers must follow geometric distribution if it satis?es the ?memoryless property?. d 7. If X ? G(p), then E(X|X > n) = n + E(X). 8. Suppose X and Y are independent discrete random variables, then min(X, Y ) is independent of X ? Y if and only if both X and Y follow the same geometric distribution. page 18 July 7, 2017 8:11 Handbook of Medical Statistics 9.61in x 6.69in Probability and Probability Distributions b2736-ch01 19 1.12. Gamma Distribution2,3,4 If the density function of the random variable X is ? ??1 ??x ? x e , x ? 0, ?(?) g(x; ?, ?) = 0, x < 0, where ? > 0, ? > 0, ?(и) is the Gamma function, then we say X follows the Gamma distribution with shape parameter ? and scale parameter ?, and d denote it as X ? ?(?, ?). Handbook of Medical Statistics Downloaded from www.worldscientific.com by 80.82.77.83 on 10/25/17. For personal use only. d If X ? ?(?, ?), then the distribution function of X is ? ??1 ??x ? t e dt, x ? 0, ?(?) ?(x; ?, ?) = 0, x < 0. Gamma distribution is named because the form of its density is similar to Gamma function. Gamma distribution is commonly used in reliability theory to describe the life of a product. When ? = 1, ?(?, 1), is called the standard Gamma distribution and its density function is ??1 ?x x e , x ? 0, ?(?) g(x; ?, 1) = 0, x < 0. When ? = 1(1, ?) is called the single parameter Gamma distribution, and it is also the exponential distribution E(?) with density function ?e??x , x ? 0, g(x; 1, ?) 
= 0 for x < 0.

More generally, a Gamma distribution with three parameters can be obtained by a translation, with density function

g(x; α, λ, δ) = λ^α (x − δ)^{α−1} e^{−λ(x−δ)}/Γ(α) for x ≥ δ, and 0 for x < δ.

Some properties of the Gamma distribution are as follows:

1. If X ~ Γ(α, λ), then λX ~ Γ(α, 1). That is, a general Gamma distribution can be transformed into the standard Gamma distribution by a scale transformation.

2. For x ≥ 0, let I_α(x) = (1/Γ(α)) ∫₀ˣ t^{α−1} e^{−t} dt denote the incomplete Gamma function; then Γ(x; α, λ) = I_α(λx). In particular, Γ(x; 1, λ) = 1 − e^{−λx}.

3. Several relationships between Gamma distributions are as follows:

(1) Γ(x; α, 1) − Γ(x; α + 1, 1) = g(x; α + 1, 1).
(2) Γ(x; 1/2, 1) = 2Φ(√(2x)) − 1, where Φ(x) is the standard normal distribution function.

4. If X ~ Γ(α, λ), then the expectation of X is E(X) = α/λ and the variance of X is Var(X) = α/λ².

5. If X ~ Γ(α, λ), then the k-th moment of X is E(X^k) = λ^{−k} Γ(k + α)/Γ(α).

6. If X ~ Γ(α, λ), the skewness of X is s = 2α^{−1/2} and the kurtosis of X is κ = 6/α.

7. If X ~ Γ(α, λ), the moment-generating function of X is M(t) = (λ/(λ − t))^α for t < λ, and the characteristic function of X is φ(t) = (λ/(λ − it))^α.

8. If Xi ~ Γ(αi, λ) for 1 ≤ i ≤ n, and X1, X2, ..., Xn are independent, then

Σ_{i=1}^n Xi ~ Γ(Σ_{i=1}^n αi, λ).

9. If X ~ Γ(α1, 1), Y ~ Γ(α2, 1), and X is independent of Y, then X + Y is independent of X/Y. Conversely, if X and Y are mutually independent, non-negative and non-degenerate random variables, and moreover X + Y is independent of X/Y, then both X and Y follow standard Gamma distributions.

1.13. Beta Distribution2,3,4

If the density function of the random variable X is

f(x; a, b) = x^{a−1}(1 − x)^{b−1}/B(a, b) for 0 ≤ x ≤
1, and f(x; a, b) = 0 otherwise, where a > 0, b > 0 and B(·, ·) is the Beta function, then we say X follows the Beta distribution with parameters a and b, denoted X ~ BE(a, b).

If X ~ BE(a, b), then the distribution function of X is

BE(x; a, b) = 1 for x > 1; Ix(a, b) for 0 < x ≤ 1; and 0 for x ≤ 0,

where Ix(a, b) is the incomplete Beta function ratio.

Like the Gamma distribution, the Beta distribution is named for the resemblance of its density to the Beta function. In particular, when a = b = 1, BE(1, 1) is the standard uniform distribution U(0, 1).

Some properties of the Beta distribution are as follows:

1. If X ~ BE(a, b), then 1 − X ~ BE(b, a).

2. The density function of the Beta distribution has the following properties:

(1) when a < 1, b ≥ 1, the density function is monotonically decreasing;
(2) when a ≥ 1, b < 1, the density function is monotonically increasing;
(3) when a < 1, b < 1, the density curve is U-shaped;
(4) when a > 1, b > 1, the density curve has a single peak;
(5) when a = b, the density curve is symmetric about x = 1/2.

3. If X ~ BE(a, b), then the k-th moment of X is E(X^k) = B(a + k, b)/B(a, b).

4. If X ~ BE(a, b), then the expectation and variance of X are E(X) = a/(a + b) and Var(X) = ab/((a + b + 1)(a + b)²), respectively.

5. If X ~ BE(a, b), the skewness of X is s = 2(b − a)(a + b + 1)^{1/2}/((a + b + 2)(ab)^{1/2}) and the kurtosis of X is κ = 6((a − b)²(a + b + 1) − ab(a + b + 2))/(ab(a + b + 2)(a + b + 3)).

6. If X ~ BE(a, b), the moment-generating function and the characteristic function of X are

M(t) = (Γ(a + b)/Γ(a)) Σ_{k=0}^∞ [Γ(a + k)/Γ(a + b + k)] t^k/Γ(k + 1) and
φ(t) = (Γ(a + b)/Γ(a)) Σ_{k=0}^∞ [Γ(a + k)/Γ(a + b + k)] (it)^k/Γ(k + 1), respectively.

7. Suppose X1, X2, . . .
, Xn are mutually independent, Xi ~ BE(ai, bi), 1 ≤ i ≤ n, and a_{i+1} = ai + bi for 1 ≤ i ≤ n − 1; then

Π_{i=1}^n Xi ~ BE(a1, Σ_{i=1}^n bi).

8. Suppose X1, X2, ..., Xn are independent and identically distributed random variables with common distribution U(0, 1); then min(X1, ..., Xn) ~ BE(1, n). Conversely, if X1, X2, ..., Xn are independent and identically distributed random variables and min(X1, ..., Xn) ~ U(0, 1), then X1 ~ BE(1, 1/n).

9. Suppose X1, X2, ..., Xn are independent and identically distributed random variables with common distribution U(0, 1), and denote by X(1,n) ≤ X(2,n) ≤ ··· ≤ X(n,n) the corresponding order statistics; then

X(k,n) ~ BE(k, n − k + 1), 1 ≤ k ≤ n,
X(k,n) − X(i,n) ~ BE(k − i, n − k + i + 1), 1 ≤ i < k ≤ n.

10. Suppose X1, X2, ..., Xn are independent and identically distributed random variables with common distribution BE(a, 1). Let Y = min(X1, ..., Xn); then Y^a ~ BE(1, n).

11. If X ~ BE(a, b), where a and b are positive integers, then

BE(x; a, b) = Σ_{i=a}^{a+b−1} C_{a+b−1}^i x^i (1 − x)^{a+b−1−i}.

1.14. Chi-square Distribution2,3,4

If Y1, Y2, ..., Yn are mutually independent and identically distributed random variables with common distribution N(0, 1), then we say the random variable X = Σ_{i=1}^n Yi² follows the Chi-square distribution (χ² distribution) with n degrees of freedom, denoted X ~ χ²_n.

If X ~ χ²_n, then the density function of X is

f(x; n) = x^{n/2−1} e^{−x/2}/(2^{n/2} Γ(n/2)) for x > 0, and f(x; n) = 0 for x ≤ 0,

where Γ(n/2) is the Gamma function.

The Chi-square distribution is derived from the normal distribution, and it plays an important role in statistical inference for normal distribution.
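The construction above can be checked numerically; a minimal sketch using Python's standard library (the degrees of freedom, seed and sample size are arbitrary choices, not from the text):

```python
import random
import statistics

# Sketch: build chi-square(n) samples as sums of n squared N(0,1) draws
# and check E(X) = n and Var(X) = 2n empirically.
rng = random.Random(0)
n, reps = 5, 200_000

samples = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(reps)]

print(abs(statistics.fmean(samples) - n) < 0.1)         # sample mean near n
print(abs(statistics.variance(samples) - 2 * n) < 0.5)  # sample variance near 2n
```

The same sums-of-squares construction underlies the additivity and non-centrality results discussed in this section.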
When the degrees of freedom n are large, the Chi-square distribution χ²_n is approximately normal.

Some properties of the Chi-square distribution are as follows:

1. If X1 ~ χ²_n, X2 ~ χ²_m, and X1 and X2 are independent, then X1 + X2 ~ χ²_{n+m}. This is the "additive property" of the Chi-square distribution.

2. Let f(x; n) be the density function of the Chi-square distribution χ²_n. Then f(x; n) is monotonically decreasing when n ≤ 2, and f(x; n) is a single-peak function with maximum point n − 2 when n ≥ 3.

3. If X ~ χ²_n, then the k-th moment of X is

E(X^k) = 2^k Γ(n/2 + k)/Γ(n/2) = Π_{i=0}^{k−1} (n + 2i).

4. If X ~ χ²_n, then E(X) = n and Var(X) = 2n.

5. If X ~ χ²_n, then the skewness of X is s = 2√2 n^{−1/2}, and the kurtosis of X is κ = 12/n.

6. If X ~ χ²_n, the moment-generating function of X is M(t) = (1 − 2t)^{−n/2} for t < 1/2, and the characteristic function of X is φ(t) = (1 − 2it)^{−n/2}.

7. Let K(x; n) be the distribution function of the Chi-square distribution χ²_n; then

(1) K(x; 2n) = 1 − 2Σ_{i=1}^n f(x; 2i);
(2) K(x; 2n + 1) = 2Φ(√x) − 1 − 2Σ_{i=1}^n f(x; 2i + 1);
(3) K(x; n) − K(x; n + 2) = (x/2)^{n/2} e^{−x/2}/Γ((n + 2)/2),

where Φ(x) is the standard normal distribution function.

8. If X ~ χ²_m, Y ~ χ²_n, and X and Y are independent, then X/(X + Y) ~ BE(m/2, n/2), and X/(X + Y) is independent of X + Y.

Let X1, X2, ..., Xn be a random sample of the normal population N(μ, σ²). Denote

X̄ = (1/n) Σ_{i=1}^n Xi, S² = Σ_{i=1}^n (Xi − X̄)²;

then S²/σ² ~ χ²_{n−1}, and S² is independent of X̄.

1.14.1. Non-central Chi-square distribution

Suppose the random variables Y1, ..., Yn are mutually independent, Yi ~ N(μi, 1), 1 ≤ i ≤
n; then the distribution of the random variable X = Σ_{i=1}^n Yi² is the non-central Chi-square distribution with n degrees of freedom and non-centrality parameter δ = Σ_{i=1}^n μi², denoted χ²_{n,δ}. In particular, χ²_{n,0} = χ²_n.

9. Suppose Y1, ..., Ym are mutually independent and Yi ~ χ²_{ni,δi} for 1 ≤ i ≤ m; then Σ_{i=1}^m Yi ~ χ²_{n,δ}, where n = Σ_{i=1}^m ni and δ = Σ_{i=1}^m δi.

10. If X ~ χ²_{n,δ}, then E(X) = n + δ, Var(X) = 2(n + 2δ), the skewness of X is s = √8 (n + 3δ)/(n + 2δ)^{3/2}, and the kurtosis of X is κ = 12(n + 4δ)/(n + 2δ)².

11. If X ~ χ²_{n,δ}, then the moment-generating function and the characteristic function of X are M(t) = (1 − 2t)^{−n/2} exp{tδ/(1 − 2t)} and φ(t) = (1 − 2it)^{−n/2} exp{itδ/(1 − 2it)}, respectively.

1.15. t Distribution2,3,4

Assume X ~ N(0, 1), Y ~ χ²_n, and X is independent of Y. We say the random variable T = √n X/√Y follows the t distribution with n degrees of freedom, denoted T ~ tn.

If X ~ tn, then the density function of X is

t(x; n) = [Γ((n + 1)/2)/((nπ)^{1/2} Γ(n/2))] (1 + x²/n)^{−(n+1)/2} for −∞ < x < ∞.

Define T(x; n) as the distribution function of the t distribution tn; then

T(x; n) = 1 − (1/2) I_{n/(n+x²)}(n/2, 1/2) for x ≥ 0, and T(x; n) = (1/2) I_{n/(n+x²)}(n/2, 1/2) for x < 0,

where I_{n/(n+x²)}(n/2, 1/2) is the incomplete Beta function ratio.

Like the Chi-square distribution, the t distribution can be derived from the normal and Chi-square distributions. It has a wide range of applications in statistical inference on the normal distribution. When n is large, the t distribution tn with n degrees of freedom can be approximated by the standard normal distribution.

The t distribution has the following properties:

1.
The density function of the t distribution, t(x; n), is symmetric about x = 0, and reaches its maximum at x = 0.

2. lim_{n→∞} t(x; n) = (1/√(2π)) e^{−x²/2} = φ(x); that is, the limiting distribution of the t distribution is the standard normal distribution as the degrees of freedom n go to infinity.

3. Assume X ~ tn. If k < n, then E(X^k) exists; otherwise E(X^k) does not exist. The k-th moment of X is

E(X^k) = 0 if 0 < k < n and k is odd;
E(X^k) = n^{k/2} Γ((k + 1)/2)Γ((n − k)/2)/(π^{1/2} Γ(n/2)) if 0 < k < n and k is even;
E(X^k) does not exist if k ≥ n and k is odd;
E(X^k) = ∞ if k ≥ n and k is even.

4. If X ~ tn, then E(X) = 0; when n > 2, Var(X) = n/(n − 2).

5. If X ~ tn, then the skewness of X is 0. If n ≥ 5, the kurtosis of X is κ = 6/(n − 4).

6. Assume X1 and X2 are independent and identically distributed random variables with common distribution χ²_n; then the random variable

Y = (n^{1/2}/2)(X2 − X1)/(X1X2)^{1/2} ~ tn.

Suppose X1, X2, ..., Xn are random samples of the normal population N(μ, σ²), and define X̄ = (1/n) Σ_{i=1}^n Xi and S² = Σ_{i=1}^n (Xi − X̄)²; then

T = (n(n − 1))^{1/2} (X̄ − μ)/S ~ t_{n−1}.

1.15.1. Non-central t distribution

Suppose X ~ N(δ, 1), Y ~ χ²_n, and X and Y are independent; then T = √n X/√Y is a non-central t distributed random variable with n degrees of freedom and non-centrality parameter δ, denoted T ~ t_{n,δ}. In particular, t_{n,0} = tn.

7. Let T(x; n, δ) be the distribution function of the non-central t distribution t_{n,δ}; then

T(x; n, δ) = 1 − T(−x; n, −δ), T(0; n, δ) = Φ(−δ), T(1; 1, δ) = 1 − Φ²(δ/√2).

8. If X ~ t_{n,δ}, then E(X) = δ(n/2)^{1/2} Γ((n − 1)/2)/Γ(n/2)
for n > 1, and Var(X) = n(1 + δ²)/(n − 2) − (E(X))² for n > 2.

1.16. F Distribution2,3,4

Let X and Y be independent random variables such that X ~ χ²_m and Y ~ χ²_n. Define a new random variable F as F = (X/m)/(Y/n). Then the distribution of F is called the F distribution with degrees of freedom m and n, denoted F ~ F_{m,n}.

If X ~ F_{m,n}, then the density function of X is

f(x; m, n) = [(m/n)^{m/2} x^{m/2−1}/B(m/2, n/2)] (1 + mx/n)^{−(m+n)/2} for x > 0, and f(x; m, n) = 0 for x ≤ 0.

Let F(x; m, n) be the distribution function of the F distribution F_{m,n}; then F(x; m, n) = I_a(m/2, n/2), where a = mx/(n + mx) and I_a(·, ·) is the incomplete Beta function ratio.

The F distribution is often used in hypothesis testing problems on two or more normal populations. It can also be used to approximate complicated distributions. The F distribution plays an important role in statistical inference.

The F distribution has the following properties:

1. F distributions are generally skewed; the smaller n is, the more skewed the distribution.

2. When m = 1 or 2, f(x; m, n) decreases monotonically; when m > 2, f(x; m, n) is unimodal, with mode n(m − 2)/(m(n + 2)).

3. If X ~ F_{m,n}, then Y = 1/X ~ F_{n,m}.

4. If X ~ tn, then X² ~ F_{1,n}.

5. If X ~ F_{m,n}, then the k-th moment of X is

E(X^k) = (n/m)^k Γ(m/2 + k)Γ(n/2 − k)/(Γ(m/2)Γ(n/2)) for 0 < k < n/2, and E(X^k) = ∞ for k ≥ n/2.

6. Assume X ~ F_{m,n}. If n > 2, then E(X) = n/(n − 2); if n > 4, then Var(X) = 2n²(m + n − 2)/(m(n − 2)²(n − 4)).

7. Assume X ~ F_{m,n}. If n > 6, then the skewness of X is

s = (2m + n − 2)(8(n − 4))^{1/2}/((n − 6)(m(m + n − 2))^{1/2});

if n > 8, then the kurtosis of X is

κ = 12((n − 2)²(n − 4) + m(m + n − 2)(5n − 22))/(m(n − 6)(n − 8)(m + n − 2)).

8.
When m is large enough and n > 4, the normal distribution function Φ(y) can be used to approximate the F distribution function F(x; m, n), where

y = (x − n/(n − 2))/[(n/(n − 2))(2(n + m − 2)/(m(n − 4)))^{1/2}],

that is, F(x; m, n) ≈ Φ(y).

9. Suppose X ~ F_{m,n}, and let Z_{m,n} = (1/2) ln X. When both m and n are large enough, the distribution of Z_{m,n} can be approximated by the normal distribution N((1/2)(1/n − 1/m), (1/2)(1/m + 1/n)).

Assume that X1, ..., Xm are random samples of the normal population N(μ1, σ1²), and Y1, ..., Yn are random samples of the normal population N(μ2, σ2²). The testing problem we are interested in is whether σ1 and σ2 are equal. Define

σ̂1² = (m − 1)⁻¹ Σ_{i=1}^m (Xi − X̄)² and σ̂2² = (n − 1)⁻¹ Σ_{i=1}^n (Yi − Ȳ)²

as the estimators of σ1² and σ2², respectively. Then we have

(m − 1)σ̂1²/σ1² ~ χ²_{m−1}, (n − 1)σ̂2²/σ2² ~ χ²_{n−1},

where σ̂1² and σ̂2² are independent. If σ1² = σ2², by the definition of the F distribution, the test statistic

F = (σ̂1²/σ1²)/(σ̂2²/σ2²) = [(m − 1)⁻¹ Σ_{i=1}^m (Xi − X̄)²]/[(n − 1)⁻¹ Σ_{i=1}^n (Yi − Ȳ)²] ~ F_{m−1,n−1}.

1.16.1. Non-central F distribution

If X ~ χ²_{m,λ}, Y ~ χ²_n, and X and Y are independent, then F = (X/m)/(Y/n) follows a non-central F distribution with degrees of freedom m and n and non-centrality parameter λ, denoted F ~ F_{m,n,λ}. In particular, F_{m,n,0} = F_{m,n}.

10. If X ~ t_{n,δ}, then X² ~ F_{1,n,δ²}.

11. Assume X ~ F_{m,n,λ}. If n > 2, then E(X) = (m + λ)n/((n − 2)m); if n > 4, then

Var(X) = 2(n/m)²((m + λ)² + (m + 2λ)(n − 2))/((n − 2)²(n − 4)).

1.17. Multivariate Hypergeometric Distribution2,3,4

Suppose X = (X1, . . .
, Xn) is an n-dimensional random vector with n ≥ 2 which satisfies:

(1) 0 ≤ Xi ≤ Ni, 1 ≤ i ≤ n, and Σ_{i=1}^n Ni = N;
(2) for non-negative integers m1, ..., mn with Σ_{i=1}^n mi = m, the probability of the event {X1 = m1, ..., Xn = mn} is

P{X1 = m1, ..., Xn = mn} = Π_{i=1}^n C_{Ni}^{mi}/C_N^m;

then we say X follows the multivariate hypergeometric distribution, denoted X ~ MH(N1, ..., Nn; m).

Suppose a jar contains balls of n colors, the number of balls of the i-th color being Ni, 1 ≤ i ≤ n. We draw m balls randomly from the jar without replacement, and denote by Xi the number of balls of the i-th color drawn, 1 ≤ i ≤ n. Then the random vector (X1, ..., Xn) follows the multivariate hypergeometric distribution MH(N1, ..., Nn; m).

The multivariate hypergeometric distribution has the following properties:

1. Suppose (X1, ..., Xn) ~ MH(N1, ..., Nn; m). For 0 = j0 < j1 < ··· < js = n, let X*k = Σ_{i=j_{k−1}+1}^{j_k} Xi and N*k = Σ_{i=j_{k−1}+1}^{j_k} Ni, 1 ≤ k ≤ s; then (X*1, ..., X*s) ~ MH(N*1, ..., N*s; m). That is, combining the components of a multivariate hypergeometric random vector yields a random vector that still follows a multivariate hypergeometric distribution.

2. Suppose (X1, ..., Xn) ~ MH(N1, ..., Nn; m); then for any 1 ≤ k < n,

P{X1 = m1, ..., Xk = mk} = C_{N1}^{m1} C_{N2}^{m2} ··· C_{Nk}^{mk} C_{N*_{k+1}}^{m*_{k+1}}/C_N^m,

where N = Σ_{i=1}^n Ni, N*_{k+1} = Σ_{i=k+1}^n Ni, and m*_{k+1} = m − Σ_{i=1}^k mi. Especially, when k = 1, we have P{X1 = m1} = C_{N1}^{m1} C_{N−N1}^{m−m1}/C_N^m, that is, X1 ~ H(N1, N, m). The multivariate hypergeometric distribution is thus an extension of the hypergeometric distribution.

3. Suppose (X1, ..., Xn) ~ MH(N1, ..., Nn; m), 0 < k < n; then

P{X1 = m1, ..., Xk = mk | Xk+1 = mk+1, . . .
, Xn = mn} = C_{N1}^{m1} ··· C_{Nk}^{mk}/C_{N*}^{m*},

where N* = Σ_{i=1}^k Ni and m* = m − Σ_{i=k+1}^n mi. This indicates that, under the condition Xk+1 = mk+1, ..., Xn = mn, the conditional distribution of (X1, ..., Xk) is MH(N1, ..., Nk; m*).

4. Suppose Xi ~ B(Ni, p), 1 ≤ i ≤ n, 0 < p < 1, and X1, ..., Xn are mutually independent; then the conditional distribution of (X1, ..., Xn) given Σ_{i=1}^n Xi = m is MH(N1, ..., Nn; m). This indicates that, when the sum of independent binomial random variables is given, the conditional joint distribution of these random variables is a multivariate hypergeometric distribution.

5. Suppose (X1, ..., Xn) ~ MH(N1, ..., Nn; m). If Ni/N → pi as N → ∞ for 1 ≤ i ≤ n, then the distribution of (X1, ..., Xn) converges to the multinomial distribution PN(m; p1, ..., pn).

As an example, suppose that in order to control the number of cars the government implements a random license-plate lottery policy: each participant has the same probability of obtaining a new license plate, and 10 quotas are allowed each issue. Suppose 100 people participate in the lottery, among whom 10 are civil servants, 50 are individual households, 30 are workers of state-owned enterprises, and the remaining 10 are university professors. Denote by X1, X2, X3, X4 the numbers of people who obtain a license among civil servants, individual households, workers of state-owned enterprises and university professors, respectively. Then the random vector (X1, X2, X3, X4) follows the multivariate hypergeometric distribution MH(10, 50, 30, 10; 10). Therefore, in the next issue, the probability of the outcome X1 = 7, X2 = 1, X3 = 1, X4 = 1 is

P{X1 = 7, X2 = 1, X3 = 1, X4 = 1} = C_{10}^7 C_{50}^1 C_{30}^1 C_{10}^1/C_{100}^{10}.

1.18.
Multivariate Negative Binomial Distribution2,3,4

Suppose X = (X1, ..., Xn) is a random vector of dimension n (n ≥ 2) which satisfies:

(1) Xi takes non-negative integer values, 1 ≤ i ≤ n;
(2) the probability of the event {X1 = x1, ..., Xn = xn} is

P{X1 = x1, ..., Xn = xn} = [(x1 + ··· + xn + k − 1)!/(x1! ··· xn!(k − 1)!)] p0^k p1^{x1} ··· pn^{xn},

where 0 < pi < 1, 0 ≤ i ≤ n, Σ_{i=0}^n pi = 1, and k is a positive integer; then we say X follows the multivariate negative binomial distribution, denoted X ~ MNB(k; p1, ..., pn).

Suppose some sort of test has (n + 1) different results, exactly one of which occurs in each trial, the i-th with probability pi, 1 ≤ i ≤ n + 1 (so that p0 = p_{n+1}). The sequence of trials continues until the (n + 1)-th result has occurred k times. At this moment, denote by Xi the total number of occurrences of the i-th result, 1 ≤ i ≤ n; then the random vector (X1, ..., Xn) follows the multivariate negative binomial distribution MNB(k; p1, ..., pn).

The multivariate negative binomial distribution has the following properties:

1. Suppose (X1, ..., Xn) ~ MNB(k; p1, ..., pn). For 0 = j0 < j1 < ··· < js = n, let X*k = Σ_{i=j_{k−1}+1}^{j_k} Xi and p*k = Σ_{i=j_{k−1}+1}^{j_k} pi, 1 ≤ k ≤ s; then (X*1, ..., X*s) ~ MNB(k; p*1, ..., p*s). That is, combining the components of a multivariate negative binomial random vector yields a random vector that still follows a multivariate negative binomial distribution.

2. If (X1, ..., Xn) ~ MNB(k; p1, ..., pn), then the factorial moments of X are

E(X1^{(r1)} ··· Xn^{(rn)}) = (k + Σ_{i=1}^n ri − 1)(k + Σ_{i=1}^n ri − 2) ··· k · Π_{i=1}^n (pi/p0)^{ri},

where X^{(r)} = X(X − 1) ··· (X − r + 1) and p0 = 1 − Σ_{i=1}^n pi.

3. If (X1, ..., Xn) ~ MNB(k; p1, ..., pn) and 1 ≤ s < n, then (X1, ..., Xs) ~ MNB(k; p̃1, ..., p̃s), where p̃i = pi/(p0 + p1 + ··· + ps), 1 ≤ i ≤ s, and p0 = 1 − Σ_{i=1}^n pi.
Especially, when s = 1, X1 ~ NB(k, p0/(p0 + p1)).

1.19. Multivariate Normal Distribution5,2

A random vector X = (X1, ..., Xp)′ follows the multivariate normal distribution, denoted X ~ Np(μ, Σ), if it has the density function

f(x) = (2π)^{−p/2} |Σ|^{−1/2} exp{−(1/2)(x − μ)′Σ⁻¹(x − μ)},

where x = (x1, ..., xp)′ ∈ R^p, μ ∈ R^p, Σ is a p × p positive definite matrix, | · | denotes the matrix determinant, and ′ denotes matrix transposition.

The multivariate normal distribution is the extension of the normal distribution. It is the foundation of multivariate statistical analysis and thus plays an important role in statistics.

Let X1, ..., Xp be independent and identically distributed standard normal random variables; then the random vector X = (X1, ..., Xp)′ follows the standard multivariate normal distribution, denoted X ~ Np(0, Ip), where Ip is the identity matrix of order p.

Some properties of the multivariate normal distribution are as follows:

1. The necessary and sufficient condition for X = (X1, ..., Xp)′ to follow a multivariate normal distribution is that a′X follows a normal distribution for every a = (a1, ..., ap)′ ∈ R^p.

2. If X ~ Np(μ, Σ), we have E(X) = μ and Cov(X) = Σ.

3. If X ~ Np(μ, Σ), its moment-generating function and characteristic function are M(t) = exp{μ′t + (1/2)t′Σt} and φ(t) = exp{iμ′t − (1/2)t′Σt} for t ∈ R^p, respectively.

4. Any marginal distribution of a multivariate normal distribution is still a multivariate normal distribution. Let X = (X1, ..., Xp)′ ~ Np(μ, Σ), where μ = (μ1, ..., μp)′ and Σ = (σij)_{p×p}. For any 1 ≤ q < p, set X(1) = (X1, ..., Xq)′, μ(1) = (μ1, ..., μq)′, Σ11 = (σij)_{1≤i,j≤q}; then X(1) ~ Nq(μ(1), Σ11).
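As a small numerical sketch of the density formula (the 2 × 2 example values are arbitrary, not from the text): when Σ is diagonal, the multivariate normal density should factor into a product of univariate normal densities.

```python
import math

# Bivariate normal density, computed directly from
# f(x) = (2*pi)^(-p/2) |Sigma|^(-1/2) exp(-0.5 (x-mu)' Sigma^{-1} (x-mu)).
def mvn_density_2d(x, mu, cov):
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # 2x2 inverse
    dx = [x[0] - mu[0], x[1] - mu[1]]
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return (2 * math.pi) ** -1 * det ** -0.5 * math.exp(-0.5 * q)

def norm_density(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

x, mu = (0.7, -0.2), (0.5, 0.1)
f_joint = mvn_density_2d(x, mu, [[2.0, 0.0], [0.0, 0.5]])   # diagonal Sigma
f_prod = norm_density(x[0], mu[0], 2.0) * norm_density(x[1], mu[1], 0.5)
print(abs(f_joint - f_prod) < 1e-12)  # True: components are independent
```

With a diagonal Σ the quadratic form splits into one term per coordinate, which is exactly the statement that a zero Σ12 block makes the components independent.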
Especially, Xi ~ N(μi, σii), 1 ≤ i ≤ p.

5. If X ~ Np(μ, Σ), B denotes a q × p constant matrix and a denotes a q × 1 constant vector, then

a + BX ~ Nq(a + Bμ, BΣB′),

which implies that a linear transformation of a multivariate normal random vector still follows a normal distribution.

6. If Xi ~ Np(μi, Σi), 1 ≤ i ≤ n, and X1, ..., Xn are mutually independent, then Σ_{i=1}^n Xi ~ Np(Σ_{i=1}^n μi, Σ_{i=1}^n Σi).

7. If X ~ Np(μ, Σ), then (X − μ)′Σ⁻¹(X − μ) ~ χ²_p.

8. Let X = (X1, ..., Xp)′ ~ Np(μ, Σ), and divide X, μ and Σ as follows:

X = (X(1)′, X(2)′)′, μ = (μ(1)′, μ(2)′)′, Σ = [Σ11, Σ12; Σ21, Σ22],

where X(1) and μ(1) are q × 1 vectors and Σ11 is a q × q matrix, q < p; then X(1) and X(2) are mutually independent if and only if Σ12 = 0.

9. Let X = (X1, ..., Xp)′ ~ Np(μ, Σ), and divide X, μ and Σ in the same manner as in property 8; then the conditional distribution of X(1) given X(2) is

Nq(μ(1) + Σ12Σ22⁻¹(X(2) − μ(2)), Σ11 − Σ12Σ22⁻¹Σ21).

10. Let X = (X1, ..., Xp)′ ~ Np(μ, Σ), and divide X, μ and Σ in the same manner as in property 8; then X(1) and X(2) − Σ21Σ11⁻¹X(1) are independent, with X(1) ~ Nq(μ(1), Σ11) and

X(2) − Σ21Σ11⁻¹X(1) ~ N_{p−q}(μ(2) − Σ21Σ11⁻¹μ(1), Σ22 − Σ21Σ11⁻¹Σ12).

Similarly, X(2) and X(1) − Σ12Σ22⁻¹X(2) are independent, with X(2) ~ N_{p−q}(μ(2), Σ22) and

X(1) − Σ12Σ22⁻¹X(2) ~ Nq(μ(1) − Σ12Σ22⁻¹μ(2), Σ11 − Σ12Σ22⁻¹Σ21).

1.20. Wishart Distribution5,6

Let X1, ..., Xn be independent and identically distributed p-dimensional random vectors with common distribution Np(0, Σ), and let X = (X1, ..., Xn) be a p × n random matrix.
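The regression-residual property (property 10 above) can be illustrated by simulation; a sketch for p = 2, q = 1, where the correlation ρ, seed and sample size are arbitrary choices:

```python
import math
import random
import statistics

# With Sigma = [[1, rho], [rho, 1]], the residual X2 - rho*X1 should be
# uncorrelated with X1 and have variance 1 - rho^2
# (= Sigma22 - Sigma21 Sigma11^{-1} Sigma12).
rng = random.Random(1)
rho, reps = 0.8, 200_000

x1s, resids = [], []
for _ in range(reps):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x1 = z1
    x2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2   # Corr(X1, X2) = rho
    x1s.append(x1)
    resids.append(x2 - rho * x1)                     # X(2) - Sigma21 Sigma11^{-1} X(1)

mx, mr = statistics.fmean(x1s), statistics.fmean(resids)
cov = statistics.fmean((a - mx) * (b - mr) for a, b in zip(x1s, resids))
print(abs(cov) < 0.01)                                             # empirically uncorrelated
print(abs(statistics.variance(resids) - (1.0 - rho ** 2)) < 0.01)  # conditional variance
```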
Then we say the p-th order random matrix W = XX′ = Σ_{i=1}^n XiXi′ follows the p-th order (central) Wishart distribution with n degrees of freedom, denoted W ~ Wp(n, Σ). Here the distribution of a random matrix means the distribution of the random vector generated by matrix vectorization.

Particularly, if p = 1 (so that Σ = σ²), we have W = Σ_{i=1}^n Xi² ~ σ²χ²_n, which implies that the Wishart distribution is an extension of the Chi-square distribution.

If W ~ Wp(n, Σ), Σ > 0 and n ≥ p, then the density function of W is

fp(W) = |W|^{(n−p−1)/2} exp{−(1/2)tr(Σ⁻¹W)}/(2^{np/2} |Σ|^{n/2} π^{p(p−1)/4} Π_{i=1}^p Γ((n − i + 1)/2)),

where W > 0 and "tr" denotes the trace of a matrix.

The Wishart distribution is a useful distribution in multivariate statistical analysis and plays an important role in statistical inference for the multivariate normal distribution.

Some properties of the Wishart distribution are as follows:

1. If W ~ Wp(n, Σ), then E(W) = nΣ.

2. If W ~ Wp(n, Σ) and C denotes a k × p matrix, then CWC′ ~ Wk(n, CΣC′).

3. If W ~ Wp(n, Σ), its characteristic function is E(e^{i tr(TW)}) = |Ip − 2iΣT|^{−n/2}, where T denotes a real symmetric matrix of order p.

4. If Wi ~ Wp(ni, Σ), 1 ≤ i ≤ k, and W1, ..., Wk are mutually independent, then Σ_{i=1}^k Wi ~ Wp(Σ_{i=1}^k ni, Σ).

5. Let X1, ..., Xn be independent and identically distributed p-dimensional random vectors with common distribution Np(0, Σ), Σ > 0, and X = (X1, ..., Xn).

(1) If A is an idempotent matrix of order n, then the quadratic-form matrix Q = XAX′ ~ Wp(m, Σ), where m = r(A) and r(·) denotes the rank of a matrix.
(2) Let Q = XAX′ and Q1 = XBX′, where both A and B are idempotent matrices. If Q2 = Q − Q1 = X(A − B)X′ ≥ 0, then Q2 ~ Wp(m − k, Σ), where m = r(A), k = r(B).
Moreover, Q1 and Q2 are independent.

6. If W ~ Wp(n, Σ), Σ > 0, n ≥ p, divide W and Σ into q-th order and (p − q)-th order parts as follows:

W = [W11, W12; W21, W22], Σ = [Σ11, Σ12; Σ21, Σ22];

then

(1) W11 ~ Wq(n, Σ11);
(2) W22 − W21W11⁻¹W12 and (W11, W21) are independent;
(3) W22 − W21W11⁻¹W12 ~ W_{p−q}(n − q, Σ_{2|1}), where Σ_{2|1} = Σ22 − Σ21Σ11⁻¹Σ12.

7. If W ~ Wp(n, Σ), Σ > 0, n > p + 1, then E(W⁻¹) = Σ⁻¹/(n − p − 1).

8. If W ~ Wp(n, Σ), Σ > 0, n ≥ p, then |W| = |Σ| Π_{i=1}^p ξi, where ξ1, ..., ξp are mutually independent and ξi ~ χ²_{n−i+1}, 1 ≤ i ≤ p.

9. If W ~ Wp(n, Σ), Σ > 0, n ≥ p, then for any p-dimensional non-zero vector a,

(a′Σ⁻¹a)/(a′W⁻¹a) ~ χ²_{n−p+1}.

1.20.1. Non-central Wishart distribution

Let X1, ..., Xn be independent and identically distributed p-dimensional random vectors with common distribution Np(μ, Σ), and let X = (X1, ..., Xn) be a p × n random matrix. Then we say the random matrix W = XX′ follows the non-central Wishart distribution with n degrees of freedom. When μ = 0, the non-central Wishart distribution becomes the (central) Wishart distribution Wp(n, Σ).

1.21. Hotelling T² Distribution5,6

Suppose X ~ Np(0, Σ), W ~ Wp(n, Σ), and X and W are independent. Let T² = nX′W⁻¹X; then we say the random variable T² follows the (central) Hotelling T² distribution with n degrees of freedom, denoted T² ~ T²p(n).

If p = 1, the Hotelling T² distribution is the square of the univariate t distribution; thus the Hotelling T² distribution is an extension of the t distribution.

The density function of the Hotelling T² distribution is

f(t) = [Γ((n + 1)/2)/(nΓ(p/2)Γ((n − p + 1)/2))] (t/n)^{(p−2)/2}/(1 + t/n)^{(n+1)/2} for t > 0.

Some properties of the Hotelling T² distribution are as follows:

1. If X and W are independent, and X ~ Np(0, Σ), W ~
Wp(n, Σ), then

X′W⁻¹X ~ χ²_p/χ²_{n−p+1},

where the numerator and denominator are two independent Chi-square variables.

2. If T² ~ T²p(n), then

((n − p + 1)/(np)) T² ~ (χ²_p/p)/(χ²_{n−p+1}/(n − p + 1)) ~ F_{p,n−p+1}.

Hence, the Hotelling T² distribution can be transformed to the F distribution.

1.21.1. Non-central T² distribution

Assume X and W are independent, X ~ Np(μ, Σ), W ~ Wp(n, Σ); then the random variable T² = nX′W⁻¹X follows the non-central Hotelling T² distribution with n degrees of freedom. When μ = 0, the non-central Hotelling T² distribution becomes the central Hotelling T² distribution.

3. Suppose X and W are independent, X ~ Np(μ, Σ), W ~ Wp(n, Σ), and let T² = nX′W⁻¹X; then

((n − p + 1)/(np)) T² ~ (χ²_{p,a}/p)/(χ²_{n−p+1}/(n − p + 1)) ~ F_{p,n−p+1,a},

where a = μ′Σ⁻¹μ.

The Hotelling T² distribution can be used in testing the mean of a multivariate normal distribution. Let X1, ..., Xn be random samples of the multivariate normal population Np(μ, Σ), where Σ > 0 is unknown and n > p. We want to test the hypothesis

H0: μ = μ0 vs H1: μ ≠ μ0.

Let X̄n = (1/n) Σ_{i=1}^n Xi be the sample mean and Vn = Σ_{i=1}^n (Xi − X̄n)(Xi − X̄n)′ the sample dispersion matrix. The likelihood ratio test statistic is

T² = n(n − 1)(X̄n − μ0)′Vn⁻¹(X̄n − μ0).

Under the null hypothesis H0, we have T² ~ T²p(n − 1); moreover, from property 2, ((n − p)/((n − 1)p)) T² ~ F_{p,n−p}. Hence, the p-value of this Hotelling T² test is p = P{F_{p,n−p} ≥ ((n − p)/((n − 1)p)) T²}.

1.22. Wilks Distribution5,6

Assume that W1 and W2 are independent, W1 ~ Wp(n, Σ), W2 ~ Wp(m, Σ), where Σ > 0 and n ≥ p. Let

Λ = |W1|/|W1 + W2|;

then the random variable Λ follows the Wilks distribution with degrees of freedom n and m, denoted Λ ~ Λ_{p,n,m}.
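The Wilks statistic is classically representable as a product of independent Beta variables (this representation is restated among the properties that follow); a simulation sketch, with p, n, m and the seed chosen arbitrarily:

```python
import math
import random
import statistics

# Lambda_{p,n,m} is distributed as B_1 * ... * B_p with independent
# B_i ~ BE((n-i+1)/2, m/2), so its mean should be prod_i (n-i+1)/(n-i+1+m).
rng = random.Random(3)
p, n, m, reps = 3, 12, 4, 100_000

samples = [
    math.prod(rng.betavariate((n - i + 1) / 2, m / 2) for i in range(1, p + 1))
    for _ in range(reps)
]
expected = math.prod((n - i + 1) / (n - i + 1 + m) for i in range(1, p + 1))
print(abs(statistics.fmean(samples) - expected) < 0.01)  # sample mean near product of Beta means
```

This product form is what makes the exact F-based transformations of Λ for small p or m possible.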
Some properties of the Wilks distribution are as follows:

1. Λ_{p,n,m} = B1B2 ··· Bp in distribution, where Bi ~ BE((n − i + 1)/2, m/2), 1 ≤ i ≤ p, and B1, ..., Bp are mutually independent.

2. Λ_{p,n,m} = Λ_{m,n+m−p,p} in distribution.

3. Some relationships between the Wilks distribution and the F distribution are:

(1) (n/m)(1 − Λ_{1,n,m})/Λ_{1,n,m} ~ F_{m,n};
(2) ((n + 1 − p)/p)(1 − Λ_{p,n,1})/Λ_{p,n,1} ~ F_{p,n+1−p};
(3) ((n − 1)/m)(1 − √Λ_{2,n,m})/√Λ_{2,n,m} ~ F_{2m,2(n−1)};
(4) ((n + 1 − p)/p)(1 − √Λ_{p,n,2})/√Λ_{p,n,2} ~ F_{2p,2(n+1−p)}.

The Wilks distribution often arises as the distribution of likelihood ratio statistics in multivariate analysis. Suppose we have k mutually independent populations Xj ~ Np(μj, Σ), 1 ≤ j ≤ k, where Σ > 0 is unknown. Let xj1, ..., xjnj be the random samples of population Xj, 1 ≤ j ≤ k. Set n = Σ_{j=1}^k nj, and suppose n ≥ p + k. We want to test the hypothesis

H0: μ1 = ··· = μk vs H1: μ1, ..., μk are not all identical.

Set

x̄j = nj⁻¹ Σ_{i=1}^{nj} xji, Vj = Σ_{i=1}^{nj} (xji − x̄j)(xji − x̄j)′, 1 ≤ j ≤ k, x̄ = Σ_{j=1}^k nj x̄j/n,

and let SSB = Σ_{j=1}^k nj(x̄j − x̄)(x̄j − x̄)′ be the between-group variance and SSW = Σ_{j=1}^k Vj the within-group variance. The likelihood ratio test statistic is

Λ = |SSW|/|SSW + SSB|.

Under the null hypothesis H0, we have Λ ~ Λ_{p,n−k,k−1}. Following the relationships between the Wilks distribution and the F distribution, we have the following conclusions:

(1) If k = 2, let

F = ((n − p − 1)/p)(1 − Λ)/Λ ~ F_{p,n−p−1};

then the p-value of the test is p = P{F_{p,n−p−1} ≥ F}.

(2) If p = 2, let

F = ((n − k − 1)/(k − 1))(1 − √Λ)/√Λ ~ F_{2(k−1),2(n−k−1)};

then the p-value of the test is p = P{F_{2(k−1),2(n−k−1)} ≥ F}.

(3) If k = 3, let

F = ((n − p − 2)/p)(1 − √Λ)/√Λ ~ F_{2p,2(n−p−2)};
then the p-value of the test is $p = P\{F_{2p,2(n-p-2)} \geq F\}$.

References
1. Chow, YS, Teicher, H. Probability Theory: Independence, Interchangeability, Martingales. New York: Springer, 1988.
2. Fang, K, Xu, J. Statistical Distributions. Beijing: Science Press, 1987.
3. Krishnamoorthy, K. Handbook of Statistical Distributions with Applications. Boca Raton: Chapman and Hall/CRC, 2006.
4. Patel, JK, Kapadia, CH, Owen, DB. Handbook of Statistical Distributions. New York: Marcel Dekker, 1976.
5. Anderson, TW. An Introduction to Multivariate Statistical Analysis. New York: Wiley, 2003.
6. Wang, J. Multivariate Statistical Analysis. Beijing: Science Press, 2008.

About the Author
Dr. Jian Shi, a graduate of Peking University, is Professor at the Academy of Mathematics and Systems Science, Chinese Academy of Sciences. His research interests include statistical inference, biomedical statistics, industrial statistics and statistics in sports. He has led and participated in several projects of the National Natural Science Foundation of China as well as applied projects.

$\mu$ and $\sigma > 0$, then we say $X$ follows the normal distribution and denote it as $X \sim N(\mu, \sigma^2)$. In particular, when $\mu = 0$ and $\sigma = 1$, we say that $X$ follows the standard normal distribution $N(0, 1)$.
If $X \sim N(\mu, \sigma^2)$, then the distribution function of $X$ is
$F(x) = \Phi\Big(\frac{x-\mu}{\sigma}\Big) = \int_{-\infty}^{x}\frac{1}{\sigma}\,\phi\Big(\frac{t-\mu}{\sigma}\Big)\,dt.$
If $X$ follows the standard normal distribution $N(0, 1)$, then the density and distribution functions of $X$ are $\phi(x)$ and $\Phi(x)$, respectively. The normal distribution is the most common continuous distribution and has the following properties:
1.
If $X \sim N(\mu, \sigma^2)$, then $Y = \frac{X-\mu}{\sigma} \sim N(0, 1)$; and if $X \sim N(0, 1)$, then $Y = a + \sigma X \sim N(a, \sigma^2)$. Hence, a general normal distribution can be converted to the standard normal distribution by a linear transformation.
2. If $X \sim N(\mu, \sigma^2)$, then the expectation of $X$ is $E(X) = \mu$ and the variance of $X$ is $\mathrm{Var}(X) = \sigma^2$.
3. If $X \sim N(\mu, \sigma^2)$, then the $k$-th central moment of $X$ is
$E((X-\mu)^k) = 0$ if $k$ is odd, and $E((X-\mu)^k) = \frac{k!}{2^{k/2}(k/2)!}\sigma^k$ if $k$ is even.
4. If $X \sim N(\mu, \sigma^2)$, then the moments of $X$ are
$E(X^{2k-1}) = (2k-1)!\sum_{i=1}^{k}\frac{\mu^{2i-1}\sigma^{2(k-i)}}{(2i-1)!\,(k-i)!\,2^{k-i}}$ and $E(X^{2k}) = (2k)!\sum_{i=0}^{k}\frac{\mu^{2i}\sigma^{2(k-i)}}{(2i)!\,(k-i)!\,2^{k-i}}$
for $k = 1, 2, \ldots$.
5. If $X \sim N(\mu, \sigma^2)$, then the skewness and the kurtosis of $X$ are both 0, i.e. $s = \kappa = 0$. This property can be used to check whether a distribution is normal.
6. If $X \sim N(\mu, \sigma^2)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = \exp\{t\mu + \frac{1}{2}t^2\sigma^2\}$ and $\varphi(t) = \exp\{it\mu - \frac{1}{2}t^2\sigma^2\}$, respectively.
7. If $X \sim N(\mu, \sigma^2)$, then $a + bX \sim N(a + b\mu,\, b^2\sigma^2)$.
8. If $X_i \sim N(\mu_i, \sigma_i^2)$ for $1 \leq i \leq n$, and $X_1, X_2, \ldots, X_n$ are mutually independent, then
$\sum_{i=1}^n X_i \sim N\Big(\sum_{i=1}^n \mu_i,\ \sum_{i=1}^n \sigma_i^2\Big).$
9. If $X_1, X_2, \ldots, X_n$ form a random sample from the population $N(\mu, \sigma^2)$, then the sample mean $\bar X_n = \frac{1}{n}\sum_{i=1}^n X_i$ satisfies $\bar X_n \sim N(\mu, \frac{\sigma^2}{n})$.

The central limit theorem: Suppose that $X_1, \ldots, X_n$ are independent and identically distributed random variables with $\mu = E(X_1)$ and $0 < \sigma^2 = \mathrm{Var}(X_1) < \infty$; then the distribution of $T_n = \sqrt{n}(\bar X_n - \mu)/\sigma$ is asymptotically standard normal when $n$ is large enough. The central limit theorem reveals that limit distributions of statistics in many cases are (asymptotically) normal.
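The theorem can be illustrated by simulation (a minimal sketch; the exponential population and the sample sizes below are arbitrary choices):

```python
# Illustrate the central limit theorem: standardized means of an
# exponential population (mu = sigma = 1) are approximately N(0, 1).
import numpy as np

rng = np.random.default_rng(42)
n, reps = 200, 5000
samples = rng.exponential(scale=1.0, size=(reps, n))

# T_n = sqrt(n) * (sample mean - mu) / sigma, with mu = sigma = 1 here
t = np.sqrt(n) * (samples.mean(axis=1) - 1.0)

print(abs(t.mean()) < 0.1)       # mean of T_n is close to 0
print(abs(t.std() - 1.0) < 0.1)  # std of T_n is close to 1
```

A histogram of `t` would show the familiar bell shape even though the underlying population is strongly skewed.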
Therefore, the normal distribution is the most widely used distribution in statistics. The normal distribution takes values on the whole real axis, i.e. from negative infinity to positive infinity. However, many variables in real problems take only positive values, for example, height, voltage and so on. In these cases, the logarithm of such a variable can often be regarded as normally distributed.

Log-normal distribution: Suppose $X > 0$. If $\ln X \sim N(\mu, \sigma^2)$, then we say $X$ follows the log-normal distribution and denote it as $X \sim LN(\mu, \sigma^2)$.

1.4. Exponential Distribution [2,3,4]

If the density function of the random variable $X$ is
$f(x) = \lambda e^{-\lambda x}$ for $x \geq 0$, and $0$ for $x < 0$,
where $\lambda > 0$, then we say $X$ follows the exponential distribution and denote it as $X \sim E(\lambda)$. In particular, when $\lambda = 1$, we say $X$ follows the standard exponential distribution $E(1)$.
If $X \sim E(\lambda)$, then its distribution function is
$F(x; \lambda) = 1 - e^{-\lambda x}$ for $x \geq 0$, and $0$ for $x < 0$.
The exponential distribution is an important distribution in reliability. The life of an electronic product generally follows an exponential distribution. When the life of a product follows the exponential distribution $E(\lambda)$, $\lambda$ is called the failure rate of the product. The exponential distribution has the following properties:
1. If $X \sim E(\lambda)$, then the $k$-th moment of $X$ is $E(X^k) = k!\,\lambda^{-k}$, $k = 1, 2, \ldots$.
2. If $X \sim E(\lambda)$, then $E(X) = \lambda^{-1}$ and $\mathrm{Var}(X) = \lambda^{-2}$.
3. If $X \sim E(\lambda)$, then its skewness is $s = 2$ and its kurtosis is $\kappa = 6$.
4. If $X \sim E(\lambda)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = \frac{\lambda}{\lambda - t}$ for $t < \lambda$ and $\varphi(t) = \frac{\lambda}{\lambda - it}$, respectively.
5. If $X \sim E(1)$, then $\lambda^{-1}X \sim E(\lambda)$ for $\lambda > 0$.
6.
If $X \sim E(\lambda)$, then for any $x > 0$ and $y > 0$,
$P\{X > x + y \mid X > y\} = P\{X > x\}.$
This is the so-called "memoryless property" of the exponential distribution: if the life distribution of a product is exponential, then no matter how long the product has been in use, its remaining life follows the same distribution as that of a new product, provided it has not failed by the present time.
7. If $X \sim E(\lambda)$, then for any $a > 0$, $E(X \mid X > a) = a + \lambda^{-1}$ and $\mathrm{Var}(X \mid X > a) = \lambda^{-2}$.
8. If $X$ and $Y$ are independent and identically distributed as $E(\lambda)$, then $\min(X, Y)$ is independent of $X - Y$, and $\{X \mid X + Y = z\} \sim U(0, z)$.
9. If $X_1, X_2, \ldots, X_n$ form a random sample from the population $E(\lambda)$, let $X_{(1,n)} \leq X_{(2,n)} \leq \cdots \leq X_{(n,n)}$ be the order statistics of $X_1, X_2, \ldots, X_n$. Write $Y_k = (n-k+1)(X_{(k,n)} - X_{(k-1,n)})$, $1 \leq k \leq n$, where $X_{(0,n)} = 0$. Then $Y_1, Y_2, \ldots, Y_n$ are independent and identically distributed as $E(\lambda)$.
10. If $X_1, X_2, \ldots, X_n$ form a random sample from the population $E(\lambda)$, then $\sum_{i=1}^n X_i \sim \Gamma(n, \lambda)$, where $\Gamma(n, \lambda)$ is the Gamma distribution in Sec. 1.12.
11. If $Y \sim U(0, 1)$, then $X = -\ln Y \sim E(1)$. Therefore, it is easy to generate exponentially distributed random numbers from uniform random numbers.

1.5. Weibull Distribution [2,3,4]

If the density function of the random variable $X$ is
$f(x; m, \eta, \gamma) = \frac{m}{\eta}(x-\gamma)^{m-1}\exp\{-(x-\gamma)^m/\eta\}$ for $x \geq \gamma$, and $0$ for $x < \gamma$,
then we say $X$ follows the Weibull distribution and denote it as $X \sim W(m, \eta, \gamma)$, where $\gamma$ is the location parameter, $m > 0$ is the shape parameter, and $\eta > 0$ is the scale parameter. For simplicity, we denote $W(m, \eta, 0)$ as $W(m, \eta)$. In particular, when $\gamma = 0$ and $m = 1$, the Weibull distribution $W(1, \eta)$ reduces to the exponential distribution $E(1/\eta)$.
If $X \sim$
$W(m, \eta, \gamma)$, then its distribution function is
$F(x; m, \eta, \gamma) = 1 - \exp\{-(x-\gamma)^m/\eta\}$ for $x \geq \gamma$, and $0$ for $x < \gamma$.
The Weibull distribution is an important distribution in reliability theory. It is often used to describe the life distribution of a product, such as electronic products and wear products. The Weibull distribution has the following properties:
1. If $X \sim E(1)$, then $Y = (\eta X)^{1/m} + \gamma \sim W(m, \eta, \gamma)$. Hence, the Weibull and exponential distributions can be converted into each other by transformation.
2. If $X \sim W(m, \eta)$, then the $k$-th moment of $X$ is
$E(X^k) = \Gamma\Big(1 + \frac{k}{m}\Big)\eta^{k/m},$
where $\Gamma(\cdot)$ is the Gamma function.
3. If $X \sim W(m, \eta, \gamma)$, then
$E(X) = \Gamma\Big(1 + \frac{1}{m}\Big)\eta^{1/m} + \gamma, \quad \mathrm{Var}(X) = \Big[\Gamma\Big(1 + \frac{2}{m}\Big) - \Gamma^2\Big(1 + \frac{1}{m}\Big)\Big]\eta^{2/m}.$
4. Suppose $X_1, X_2, \ldots, X_n$ are mutually independent and identically distributed random variables with common distribution $W(m, \eta, \gamma)$; then
$X_{1,n} = \min(X_1, X_2, \ldots, X_n) \sim W(m, \eta/n, \gamma).$
Conversely, if $X_{1,n} \sim W(m, \eta/n, \gamma)$, then $X_1 \sim W(m, \eta, \gamma)$.

1.5.1. The application of Weibull distribution in reliability

The shape parameter $m$ usually describes the failure mechanism of a product. Weibull distributions with $m < 1$ are called "early failure" life distributions, those with $m = 1$ "occasional failure" life distributions, and those with $m > 1$ "wear-out (aging) failure" life distributions.
If $X \sim W(m, \eta, \gamma)$, then its reliability function is
$R(x) = 1 - F(x; m, \eta, \gamma) = \exp\{-(x-\gamma)^m/\eta\}$ for $x \geq \gamma$, and $1$ for $x < \gamma$.
When the reliability $R$ of a product is given, $x_R = \gamma + \eta^{1/m}(-\ln R)^{1/m}$ is the Q-percentile life of the product. If $R = 0.5$, then $x_{0.5} = \gamma + \eta^{1/m}(\ln 2)^{1/m}$ is the median life; if $R = e^{-1}$, then $x_{e^{-1}} = \gamma + \eta^{1/m}$ is the characteristic life; if $R = \exp\{-\Gamma^m$
$(1 + m^{-1})\}$, then $x_R = E(X)$, that is, the mean life.
The failure rate of the Weibull distribution $W(m, \eta, \gamma)$ is
$\lambda(x) = \frac{f(x; m, \eta, \gamma)}{R(x)} = \frac{m}{\eta}(x-\gamma)^{m-1}$ for $x \geq \gamma$, and $0$ for $x < \gamma$.
The mean failure rate is
$\bar\lambda(x) = \frac{1}{x-\gamma}\int_\gamma^x \lambda(t)\,dt = \frac{(x-\gamma)^{m-1}}{\eta}$ for $x \geq \gamma$, and $0$ for $x < \gamma$.
In particular, the failure rate of the exponential distribution $E(\lambda) = W(1, 1/\lambda)$ is the constant $\lambda$.

1.6. Binomial Distribution [2,3,4]

We say the random variable $X$ follows the binomial distribution if it takes the discrete values $k = 0, 1, \ldots, n$ with
$P\{X = k\} = C_n^k p^k(1-p)^{n-k},$
where $n$ is a positive integer, $C_n^k$ is the binomial coefficient, and $0 \leq p \leq 1$. We denote it as $X \sim B(n, p)$.
Consider $n$ independent trials, each with exactly two possible outcomes, "success" and "failure", where the probability of success is $p$. Let $X$ be the total number of successes in these $n$ trials; then $X \sim B(n, p)$. In particular, when $n = 1$, $B(1, p)$ is called the Bernoulli distribution or two-point distribution; it is the simplest discrete distribution. The binomial distribution is a common discrete distribution.
If $X \sim B(n, p)$, then its distribution function is
$B(x; n, p) = \sum_{k=0}^{\min([x], n)} C_n^k p^k q^{n-k}$ for $x \geq 0$, and $0$ for $x < 0$,
where $[x]$ is the integer part of $x$ and $q = 1 - p$.
Let $B_x(a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt$ be the incomplete Beta function, where $0 < x < 1$, $a > 0$, $b > 0$; then $B(a, b) = B_1(a, b)$ is the Beta function. Let $I_x(a, b) = B_x(a, b)/B(a, b)$ be the ratio of incomplete Beta functions. Then the binomial distribution function can be represented as
$B(x; n, p) = 1 - I_p([x] + 1,\, n - [x]), \quad 0 \leq x \leq n.$
The binomial distribution has the following properties:
1. Let $b(k; n, p) = C_n^k p^k q^{n-k}$ for $0 \leq k \leq n$. If $k \leq [(n+1)p]$, then $b(k; n, p) \geq b(k-1; n, p)$; if $k > [(n+1)p]$, then $b(k; n, p) < b(k-1; n, p)$.
2.
When $p = 0.5$, the binomial distribution $B(n, 0.5)$ is symmetric; when $p \neq 0.5$, the binomial distribution $B(n, p)$ is asymmetric.
3. Suppose $X_1, X_2, \ldots, X_n$ are mutually independent and identically distributed Bernoulli random variables with parameter $p$; then $Y = \sum_{i=1}^n X_i \sim B(n, p)$.
4. If $X \sim B(n, p)$, then $E(X) = np$ and $\mathrm{Var}(X) = npq$.
5. If $X \sim B(n, p)$, then the $k$-th moment of $X$ is
$E(X^k) = \sum_{i=1}^k S_2(k, i)\,P_n^i\,p^i,$
where $S_2(k, i)$ is the Stirling number of the second kind and $P_n^i = n(n-1)\cdots(n-i+1)$ is the number of permutations.
6. If $X \sim B(n, p)$, then its skewness is $s = (1-2p)/(npq)^{1/2}$ and its kurtosis is $\kappa = (1-6pq)/(npq)$.
7. If $X \sim B(n, p)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = (q + pe^t)^n$ and $\varphi(t) = (q + pe^{it})^n$, respectively.
8. When $n$ and $x$ are fixed, the binomial distribution function $B(x; n, p)$ is monotonically decreasing in $p$ ($0 < p < 1$).
9. If $X_i \sim B(n_i, p)$ for $1 \leq i \leq k$, and $X_1, X_2, \ldots, X_k$ are mutually independent, then $X = \sum_{i=1}^k X_i \sim B(\sum_{i=1}^k n_i,\, p)$.

1.7. Multinomial Distribution [2,3,4]

If an $n$-dimensional ($n \geq 2$) random vector $X = (X_1, \ldots, X_n)$ satisfies the following conditions:
(1) $X_i \geq 0$, $1 \leq i \leq n$, and $\sum_{i=1}^n X_i = N$;
(2) for any non-negative integers $m_1, m_2, \ldots, m_n$ with $\sum_{i=1}^n m_i = N$,
$P\{X_1 = m_1, \ldots, X_n = m_n\} = \frac{N!}{m_1!\cdots m_n!}\prod_{i=1}^n p_i^{m_i},$
where $p_i \geq 0$, $1 \leq i \leq n$, and $\sum_{i=1}^n p_i = 1$,
then we say $X$ follows the multinomial distribution and denote it as $X \sim PN(N; p_1, \ldots, p_n)$. In particular, when $n = 2$, the multinomial distribution degenerates to the binomial distribution.
Suppose a jar contains balls of $n$ different colors.
Each time, a ball is drawn at random from the jar and then put back. The probability of drawing a ball of the $i$-th color is $p_i$, $1 \leq i \leq n$, with $\sum_{i=1}^n p_i = 1$. Assume that balls are drawn and replaced $N$ times, and let $X_i$ denote the number of times a ball of the $i$-th color is drawn; then the random vector $X = (X_1, \ldots, X_n)$ follows the multinomial distribution $PN(N; p_1, \ldots, p_n)$. The multinomial distribution is a common multivariate discrete distribution. It has the following properties:
1. If $(X_1, \ldots, X_n) \sim PN(N; p_1, \ldots, p_n)$, let $X_{i+1}^* = \sum_{j=i+1}^n X_j$ and $p_{i+1}^* = \sum_{j=i+1}^n p_j$, $1 \leq i < n$; then
(i) $(X_1, \ldots, X_i, X_{i+1}^*) \sim PN(N; p_1, \ldots, p_i, p_{i+1}^*)$;
(ii) $X_i \sim B(N, p_i)$, $1 \leq i \leq n$.
More generally, let $0 = j_0 < j_1 < \cdots < j_m = n$, and set $\tilde X_k = \sum_{i=j_{k-1}+1}^{j_k} X_i$ and $\tilde p_k = \sum_{i=j_{k-1}+1}^{j_k} p_i$, $1 \leq k \leq m$; then $(\tilde X_1, \ldots, \tilde X_m) \sim PN(N; \tilde p_1, \ldots, \tilde p_m)$.
2. If $(X_1, \ldots, X_n) \sim PN(N; p_1, \ldots, p_n)$, then its moment-generating function and characteristic function are
$M(t_1, \ldots, t_n) = \Big(\sum_{j=1}^n p_j e^{t_j}\Big)^N$ and $\varphi(t_1, \ldots, t_n) = \Big(\sum_{j=1}^n p_j e^{it_j}\Big)^N,$
respectively.
3. If $(X_1, \ldots, X_n) \sim PN(N; p_1, \ldots, p_n)$, then for $n > 1$ and $1 \leq k < n$,
$(X_1, \ldots, X_k \mid X_{k+1} = m_{k+1}, \ldots, X_n = m_n) \sim PN(N - M; p_1^*, \ldots, p_k^*),$
where $M = \sum_{i=k+1}^n m_i$, $0 < M < N$, and $p_j^* = p_j\big/\sum_{i=1}^k p_i$, $1 \leq j \leq k$.
4. If $X_i$ follows the Poisson distribution $P(\lambda_i)$, $1 \leq i \leq n$, and $X_1, \ldots, X_n$ are mutually independent, then for any given positive integer $N$,
$\Big(X_1, \ldots, X_n \,\Big|\, \sum_{i=1}^n X_i = N\Big) \sim PN(N; p_1, \ldots, p_n),$
where $p_i = \lambda_i\big/\sum_{j=1}^n \lambda_j$, $1 \leq i \leq n$.

1.8.
Poisson Distribution [2,3,4]

If the random variable $X$ takes non-negative integer values with
$P\{X = k\} = \frac{\lambda^k}{k!}e^{-\lambda}, \quad \lambda > 0,\ k = 0, 1, \ldots,$
then we say $X$ follows the Poisson distribution and denote it as $X \sim P(\lambda)$.
If $X \sim P(\lambda)$, then its distribution function is
$P\{X \leq x\} = P(x; \lambda) = \sum_{k=0}^{[x]} p(k; \lambda),$
where $p(k; \lambda) = e^{-\lambda}\lambda^k/k!$, $k = 0, 1, \ldots$.
The Poisson distribution is an important distribution in queuing theory. For example, the number of customers arriving at a ticket window in a fixed interval of time approximately follows a Poisson distribution. The Poisson distribution has a wide range of applications in physics, finance, insurance and other fields. It has the following properties:
1. If $k < \lambda$, then $p(k; \lambda) > p(k-1; \lambda)$; if $k > \lambda$, then $p(k; \lambda) < p(k-1; \lambda)$. If $\lambda$ is not an integer, then $p(k; \lambda)$ attains its maximum at $k = [\lambda]$; if $\lambda$ is an integer, then $p(k; \lambda)$ attains its maximum at $k = \lambda$ and $k = \lambda - 1$.
2. When $x$ is fixed, $P(x; \lambda)$ is a non-increasing function of $\lambda$, that is, $P(x; \lambda_1) \geq P(x; \lambda_2)$ if $\lambda_1 < \lambda_2$. When $\lambda$ and $x$ change at the same time,
$P(x; \lambda) \geq P(x-1; \lambda-1)$ if $x \leq \lambda - 1$, and $P(x; \lambda) \leq P(x-1; \lambda-1)$ if $x \geq \lambda$.
3. If $X \sim P(\lambda)$, then the $k$-th moment of $X$ is $E(X^k) = \sum_{i=1}^k S_2(k, i)\lambda^i$, where $S_2(k, i)$ is the Stirling number of the second kind.
4. If $X \sim P(\lambda)$, then $E(X) = \lambda$ and $\mathrm{Var}(X) = \lambda$. The equality of expectation and variance is an important feature of the Poisson distribution.
5. If $X \sim P(\lambda)$, then its skewness is $s = \lambda^{-1/2}$ and its kurtosis is $\kappa = \lambda^{-1}$.
6. If $X \sim P(\lambda)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = \exp\{\lambda(e^t - 1)\}$ and $\varphi(t) = \exp\{\lambda(e^{it} - 1)\}$, respectively.
7. If $X_1$, $X_2$, …
, $X_n$ are mutually independent and identically distributed, then $X_1 \sim P(\lambda)$ is equivalent to $\sum_{i=1}^n X_i \sim P(n\lambda)$.
8. If $X_i \sim P(\lambda_i)$ for $1 \leq i \leq n$, and $X_1, X_2, \ldots, X_n$ are mutually independent, then
$\sum_{i=1}^n X_i \sim P\Big(\sum_{i=1}^n \lambda_i\Big).$
9. If $X_1 \sim P(\lambda_1)$ and $X_2 \sim P(\lambda_2)$ are mutually independent, then the conditional distribution of $X_1$ given $X_1 + X_2$ is binomial, that is, $(X_1 \mid X_1 + X_2 = x) \sim B(x, p)$, where $p = \lambda_1/(\lambda_1 + \lambda_2)$.

1.9. Negative Binomial Distribution [2,3,4]

For a positive integer $m$, if the random variable $X$ takes non-negative integer values with
$P\{X = k\} = C_{k+m-1}^k\,p^m q^k, \quad k = 0, 1, \ldots,$
where $0 < p < 1$ and $q = 1 - p$, then we say $X$ follows the negative binomial distribution and denote it as $X \sim NB(m, p)$.
If $X \sim NB(m, p)$, then its distribution function is
$NB(x; m, p) = \sum_{k=0}^{[x]} C_{k+m-1}^k\,p^m q^k$ for $x \geq 0$, and $0$ for $x < 0$.
The negative binomial distribution is also called the Pascal distribution. It is a direct generalization of the binomial distribution. Consider a success-failure (Bernoulli) trial with success probability $p$, performed repeatedly and independently. Let $X$ be the total number of trials until the $m$-th success occurs; then $X - m$ follows the negative binomial distribution $NB(m, p)$, that is, the total number of failures follows $NB(m, p)$.
The negative binomial distribution has the following properties:
1. Let $nb(k; m, p) = C_{k+m-1}^k\,p^m q^k$, where $0 < p < 1$, $k = 0, 1, \ldots$; then
$nb(k+1; m, p) = \frac{(m+k)q}{k+1}\,nb(k; m, p).$
Therefore, $nb(k; m, p)$ increases monotonically in $k$ for $k < \frac{m-1}{p} - m$ and decreases monotonically for $k > \frac{m-1}{p} - m$.
2. The binomial distribution and the negative binomial distribution are related by $NB(x; r, p) = 1 - B(r -$
$1;\ r + [x],\ p)$.
3. $NB(x; m, p) = I_p(m, [x] + 1)$, where $I_p(\cdot, \cdot)$ is the ratio of incomplete Beta functions.
4. If $X \sim NB(m, p)$, then the $k$-th moment of $X$ is
$E(X^k) = \sum_{i=1}^k S_2(k, i)\,m^{[i]}(q/p)^i,$
where $m^{[i]} = m(m+1)\cdots(m+i-1)$, $1 \leq i \leq k$, and $S_2(k, i)$ is the Stirling number of the second kind.
5. If $X \sim NB(m, p)$, then $E(X) = mq/p$ and $\mathrm{Var}(X) = mq/p^2$.
6. If $X \sim NB(m, p)$, then its skewness and kurtosis are $s = (1+q)/(mq)^{1/2}$ and $\kappa = (6q + p^2)/(mq)$, respectively.
7. If $X \sim NB(m, p)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = p^m(1 - qe^t)^{-m}$ and $\varphi(t) = p^m(1 - qe^{it})^{-m}$, respectively.
8. If $X_i \sim NB(m_i, p)$ for $1 \leq i \leq n$, and $X_1, X_2, \ldots, X_n$ are mutually independent, then
$\sum_{i=1}^n X_i \sim NB\Big(\sum_{i=1}^n m_i,\ p\Big).$
9. If $X \sim NB(m, p)$, then there exists a sequence of random variables $X_1, \ldots, X_m$, independent and identically distributed as $G(p)$, such that $X \stackrel{d}{=} X_1 + \cdots + X_m - m$, where $G(p)$ is the geometric distribution in Sec. 1.11.

1.10. Hypergeometric Distribution [2,3,4]

Let $N, M, n$ be positive integers with $M \leq N$ and $n \leq N$. If the random variable $X$ takes integer values in the interval $[\max(0, M+n-N),\ \min(M, n)]$ with
$P\{X = k\} = \frac{C_M^k\,C_{N-M}^{n-k}}{C_N^n}, \quad \max(0, M+n-N) \leq k \leq \min(M, n),$
then we say $X$ follows the hypergeometric distribution and denote it as $X \sim H(M, N, n)$.
If $X \sim H(M, N, n)$, then the distribution function of $X$ is
$H(x; n, N, M) = \sum_{k=K_1}^{\min([x], K_2)} \frac{C_M^k\,C_{N-M}^{n-k}}{C_N^n}$ for $x \geq K_1$, and $0$ for $x < K_1$,
where $K_1 = \max(0, M+n-N)$ and $K_2 = \min(M, n)$.
The hypergeometric distribution is often used in the sampling inspection of products and occupies an important position in the theory of sampling inspection.
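As a concrete illustration, the probability that a lot is accepted under a simple single-sampling plan can be computed with scipy.stats.hypergeom (a sketch; the lot size, number of non-conforming items, sample size and acceptance number below are made-up values):

```python
# P(accept) = P{X <= c} for X ~ H(M, N, n): the number of non-conforming
# items in a sample of n drawn without replacement from a lot of N items
# containing M non-conforming ones.
from math import comb
from scipy.stats import hypergeom

N, M, n = 100, 8, 20   # lot size, non-conforming items, sample size
c = 1                  # acceptance number

# scipy's convention: hypergeom(M=population size, n=successes, N=draws)
accept_prob = hypergeom(M=N, n=M, N=n).cdf(c)

# cross-check against the pmf written out with binomial coefficients
direct = sum(comb(M, k) * comb(N - M, n - k) for k in range(c + 1)) / comb(N, n)
print(abs(accept_prob - direct) < 1e-9)  # True
```

Note that scipy's argument names differ from the $H(M, N, n)$ notation used here, so the keyword mapping in the comment matters.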
Assume that there are $N$ products, of which $M$ are non-conforming. We randomly draw $n$ products from the $N$ products without replacement. Let $X$ be the number of non-conforming products among the $n$ drawn; then $X$ follows the hypergeometric distribution $H(M, N, n)$.
Some properties of the hypergeometric distribution are as follows:
1. Denote $h(k; n, N, M) = C_M^k\,C_{N-M}^{n-k}/C_N^n$; then
$h(k; n, N, M) = h(k; M, N, n),$
$h(k; n, N, M) = h(N - n - M + k;\ N - n, N, N - M),$
where $K_1 \leq k \leq K_2$.
2. The distribution function of the hypergeometric distribution satisfies
$H(x; n, N, M) = H(N - n - M + x;\ N - n, N, N - M) = 1 - H(n - x - 1;\ n, N, N - M) = 1 - H(M - x - 1;\ N - n, N, M)$
and
$1 - H(n - 1;\ x + n, N, N - M) = H(x;\ n + x, N, M),$
where $x \geq K_1$.
3. If $X \sim H(M, N, n)$, then its expectation and variance are
$E(X) = \frac{nM}{N}, \quad \mathrm{Var}(X) = \frac{nM(N-n)(N-M)}{N^2(N-1)}.$
For integers $n$ and $k$, denote $n^{(k)} = n(n-1)\cdots(n-k+1)$ for $k < n$, and $n^{(k)} = n!$ for $k \geq n$.
4. If $X \sim H(M, N, n)$, then the $k$-th moment of $X$ is
$E(X^k) = \sum_{i=1}^k S_2(k, i)\,\frac{n^{(i)}M^{(i)}}{N^{(i)}}.$
5. If $X \sim H(M, N, n)$, then the skewness of $X$ is
$s = \frac{(N-2M)(N-1)^{1/2}(N-2n)}{(NM(N-M)(N-n))^{1/2}(N-2)}.$
6. If $X \sim H(M, N, n)$, then the moment-generating function and the characteristic function of $X$ are
$M(t) = \frac{(N-n)!\,(N-M)!}{N!\,(N-M-n)!}\,F(-n, -M;\ N-M-n+1;\ e^t)$
and
$\varphi(t) = \frac{(N-n)!\,(N-M)!}{N!\,(N-M-n)!}\,F(-n, -M;\ N-M-n+1;\ e^{it}),$
respectively, where $F(a, b; c; x)$ is the hypergeometric function defined by
$F(a, b; c; x) = 1 + \frac{ab}{c}\frac{x}{1!} + \frac{a(a+1)b(b+1)}{c(c+1)}\frac{x^2}{2!} + \cdots$
with $c > 0$.
A typical application of the hypergeometric distribution is estimating the number of fish in a lake. One catches $M$ fish, tags them, and puts them back into the lake. After a period of time, one re-catches $n$ ($n > M$) fish from the lake, among which $s$ carry the tag; $M$ and $n$ are fixed in advance. Let $X$ be the number of tagged fish among the $n$ re-caught fish. If the total number of fish in the lake is assumed to be $N$, then $X$ follows the hypergeometric distribution $H(M, N, n)$. By property 3 above, $E(X) = nM/N$, and estimating $E(X)$ by the observed count of tagged fish gives $s \approx E(X) = nM/N$. Therefore, the estimated total number of fish in the lake is $\hat N = nM/s$.

1.11. Geometric Distribution [2,3,4]

If the random variable $X$ takes positive integer values with
$P\{X = k\} = q^{k-1}p, \quad k = 1, 2, \ldots,$
where $0 < p \leq 1$ and $q = 1 - p$, then we say $X$ follows the geometric distribution and denote it as $X \sim G(p)$. If $X \sim G(p)$, then the distribution function of $X$ is
$G(x; p) = 1 - q^{[x]}$ for $x \geq 0$, and $0$ for $x < 0$.
The geometric distribution is so named because its probabilities sum as a geometric series.
In a (Bernoulli) trial whose outcome is either a "success" or a "failure", let $p$ be the probability of success, and suppose the trials can be performed repeatedly and independently. Let $X$ be the number of trials required until the first success occurs; then $X$ follows the geometric distribution $G(p)$.
Some properties of the geometric distribution are as follows:
1.
Denote $g(k; p) = pq^{k-1}$, $k = 1, 2, \ldots$, $0 < p < 1$; then $g(k; p)$ is a monotonically decreasing function of $k$, that is, $g(1; p) > g(2; p) > g(3; p) > \cdots$.
2. If $X \sim G(p)$, then the expectation and variance of $X$ are $E(X) = 1/p$ and $\mathrm{Var}(X) = q/p^2$, respectively.
3. If $X \sim G(p)$, then the $k$-th moment of $X$ is
$E(X^k) = \sum_{i=1}^k S_2(k, i)\,i!\,q^{i-1}/p^i,$
where $S_2(k, i)$ is the Stirling number of the second kind.
4. If $X \sim G(p)$, then the skewness of $X$ is $s = q^{1/2} + q^{-1/2}$.
5. If $X \sim G(p)$, then the moment-generating function and the characteristic function of $X$ are $M(t) = pe^t(1 - qe^t)^{-1}$ and $\varphi(t) = pe^{it}(1 - qe^{it})^{-1}$, respectively.
6. If $X \sim G(p)$, then $P\{X > n + m \mid X > n\} = P\{X > m\}$ for any natural numbers $n$ and $m$.
Property 6 is known as the "memoryless property" of the geometric distribution. It says that, in a success-failure experiment, if $n$ trials have produced no success, the probability that a further $m$ trials still produce no success does not depend on the first $n$ trials. The memoryless property characterizes the geometric distribution: it can be proved that a discrete random variable taking natural-number values and satisfying the memoryless property must follow a geometric distribution.
7. If $X \sim G(p)$, then $E(X \mid X > n) = n + E(X)$.
8. Suppose $X$ and $Y$ are independent discrete random variables; then $\min(X, Y)$ is independent of $X - Y$ if and only if both $X$ and $Y$ follow the same geometric distribution.

1.12. Gamma Distribution [2,3,4]

If the density function of the random variable $X$ is
$g(x; \alpha, \lambda) = \frac{\lambda^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x}$ for $x \geq 0$, and $0$ for $x < 0$,
where $\alpha > 0$, $\lambda > 0$, and $\Gamma(\cdot)$ is the Gamma function, then we say $X$ follows the Gamma distribution with shape parameter $\alpha$ and scale parameter $\lambda$, and denote it as $X \sim \Gamma(\alpha, \lambda)$.
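As an aside, this parameterization maps onto scipy.stats.gamma with a = α and scale = 1/λ (scipy parameterizes by scale rather than by the rate λ); a quick numerical check with illustration values:

```python
# The Gamma(alpha, lambda) distribution above corresponds to
# scipy.stats.gamma with a = alpha and scale = 1/lambda.
from scipy.stats import gamma

alpha, lam = 3.0, 2.0          # illustration values
dist = gamma(a=alpha, scale=1.0 / lam)

print(abs(dist.mean() - alpha / lam) < 1e-12)    # E(X) = alpha/lambda
print(abs(dist.var() - alpha / lam**2) < 1e-12)  # Var(X) = alpha/lambda^2
```

Keeping the rate-to-scale conversion explicit avoids a common off-by-a-factor mistake when moving between textbook and library conventions.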
If $X \sim \Gamma(\alpha, \lambda)$, then the distribution function of $X$ is
$\Gamma(x; \alpha, \lambda) = \int_0^x \frac{\lambda^\alpha}{\Gamma(\alpha)}t^{\alpha-1}e^{-\lambda t}\,dt$ for $x \geq 0$, and $0$ for $x < 0$.
The Gamma distribution is named for the resemblance of its density to the Gamma function. It is commonly used in reliability theory to describe the life of a product. When $\lambda = 1$, $\Gamma(\alpha, 1)$ is called the standard Gamma distribution, with density
$g(x; \alpha, 1) = \frac{x^{\alpha-1}e^{-x}}{\Gamma(\alpha)}$ for $x \geq 0$, and $0$ for $x < 0$.
When $\alpha = 1$, $\Gamma(1, \lambda)$ is called the single-parameter Gamma distribution; it is the exponential distribution $E(\lambda)$, with density
$g(x; 1, \lambda) = \lambda e^{-\lambda x}$ for $x \geq 0$, and $0$ for $x < 0$.
More generally, a Gamma distribution with three parameters can be obtained by a translation, with density
$g(x; \alpha, \lambda, \delta) = \frac{\lambda^\alpha}{\Gamma(\alpha)}(x-\delta)^{\alpha-1}e^{-\lambda(x-\delta)}$ for $x \geq \delta$, and $0$ for $x < \delta$.
Some properties of the Gamma distribution are as follows:
1. If $X \sim \Gamma(\alpha, \lambda)$, then $\lambda X \sim \Gamma(\alpha, 1)$. That is, a general Gamma distribution can be transformed into the standard Gamma distribution by a scale transformation.
2. For $x \geq 0$, denote
$I_\alpha(x) = \frac{1}{\Gamma(\alpha)}\int_0^x t^{\alpha-1}e^{-t}\,dt$
as the incomplete Gamma function; then $\Gamma(x; \alpha, \lambda) = I_\alpha(\lambda x)$. In particular, $\Gamma(x; 1, \lambda) = 1 - e^{-\lambda x}$.
3. Several relationships between Gamma distributions are as follows:
(1) $\Gamma(x; \alpha, 1) - \Gamma(x; \alpha+1, 1) = g(x; \alpha+1, 1)$;
(2) $\Gamma(x; \frac{1}{2}, 1) = 2\Phi(\sqrt{2x}) - 1$, where $\Phi(x)$ is the standard normal distribution function.
4. If $X \sim \Gamma(\alpha, \lambda)$, then the expectation of $X$ is $E(X) = \alpha/\lambda$ and the variance of $X$ is $\mathrm{Var}(X) = \alpha/\lambda^2$.
5. If $X \sim \Gamma(\alpha, \lambda)$, then the $k$-th moment of $X$ is $E(X^k) = \lambda^{-k}\,\Gamma(k+\alpha)/\Gamma(\alpha)$.
6. If $X \sim$
$\Gamma(\alpha, \lambda)$, then the skewness of $X$ is $s = 2\alpha^{-1/2}$ and the kurtosis of $X$ is $\kappa = 6/\alpha$.
7. If $X \sim \Gamma(\alpha, \lambda)$, then the moment-generating function of $X$ is $M(t) = \big(\frac{\lambda}{\lambda - t}\big)^\alpha$ for $t < \lambda$, and the characteristic function of $X$ is $\varphi(t) = \big(\frac{\lambda}{\lambda - it}\big)^\alpha$.
8. If $X_i \sim \Gamma(\alpha_i, \lambda)$ for $1 \leq i \leq n$, and $X_1, X_2, \ldots, X_n$ are mutually independent, then
$\sum_{i=1}^n X_i \sim \Gamma\Big(\sum_{i=1}^n \alpha_i,\ \lambda\Big).$
9. If $X \sim \Gamma(\alpha_1, 1)$ and $Y \sim \Gamma(\alpha_2, 1)$ are independent, then $X + Y$ is independent of $X/Y$. Conversely, if $X$ and $Y$ are mutually independent, non-negative and non-degenerate random variables, and moreover $X + Y$ is independent of $X/Y$, then both $X$ and $Y$ follow standard Gamma distributions.

1.13. Beta Distribution [2,3,4]

If the density function of the random variable $X$ is
$f(x; a, b) = \frac{x^{a-1}(1-x)^{b-1}}{B(a, b)}$ for $0 \leq x \leq 1$, and $0$ otherwise,
where $a > 0$, $b > 0$, and $B(\cdot, \cdot)$ is the Beta function, then we say $X$ follows the Beta distribution with parameters $a$ and $b$, and denote it as $X \sim BE(a, b)$.
If $X \sim BE(a, b)$, then the distribution function of $X$ is
$BE(x; a, b) = 1$ for $x > 1$, $I_x(a, b)$ for $0 < x \leq 1$, and $0$ for $x \leq 0$,
where $I_x(a, b)$ is the ratio of incomplete Beta functions.
Similar to the Gamma distribution, the Beta distribution is named for the resemblance of its density to the Beta function. In particular, when $a = b = 1$, $BE(1, 1)$ is the standard uniform distribution $U(0, 1)$.
Some properties of the Beta distribution are as follows:
1. If $X \sim BE(a, b)$, then $1 - X \sim BE(b, a)$.
2. The density function of the Beta distribution behaves as follows:
(1) when $a < 1$, $b \geq 1$, the density function is monotonically decreasing;
(2) when $a \geq$
$1$, $b < 1$, the density function is monotonically increasing;
(3) when $a < 1$, $b < 1$, the density function curve is U-shaped;
(4) when $a > 1$, $b > 1$, the density function curve has a single peak;
(5) when $a = b$, the density function curve is symmetric about $x = 1/2$.
3. If $X \sim BE(a, b)$, then the $k$-th moment of $X$ is $E(X^k) = \frac{B(a+k,\, b)}{B(a,\, b)}$.
4. If $X \sim BE(a, b)$, then the expectation and variance of $X$ are $E(X) = a/(a+b)$ and $\mathrm{Var}(X) = ab/((a+b+1)(a+b)^2)$, respectively.
5. If $X \sim BE(a, b)$, then the skewness of $X$ is
$s = \frac{2(b-a)(a+b+1)^{1/2}}{(a+b+2)(ab)^{1/2}}$
and the kurtosis of $X$ is
$\kappa = \frac{6[(a-b)^2(a+b+1) - ab(a+b+2)]}{ab(a+b+2)(a+b+3)}.$
6. If $X \sim BE(a, b)$, then the moment-generating function and the characteristic function of $X$ are
$M(t) = \frac{\Gamma(a+b)}{\Gamma(a)}\sum_{k=0}^{\infty}\frac{\Gamma(a+k)}{\Gamma(a+b+k)}\,\frac{t^k}{\Gamma(k+1)}$ and $\varphi(t) = \frac{\Gamma(a+b)}{\Gamma(a)}\sum_{k=0}^{\infty}\frac{\Gamma(a+k)}{\Gamma(a+b+k)}\,\frac{(it)^k}{\Gamma(k+1)},$
respectively.
7. Suppose $X_1, X_2, \ldots, X_n$ are mutually independent, $X_i \sim BE(a_i, b_i)$, $1 \leq i \leq n$, and $a_{i+1} = a_i + b_i$, $1 \leq i \leq n-1$; then
$\prod_{i=1}^n X_i \sim BE\Big(a_1,\ \sum_{i=1}^n b_i\Big).$
8. Suppose $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables with common distribution $U(0, 1)$; then $\min(X_1, \ldots, X_n) \sim BE(1, n)$. Conversely, if $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables and $\min(X_1, \ldots, X_n) \sim U(0, 1)$, then $X_1 \sim BE(1, 1/n)$.
9. Suppose $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables with common distribution $U(0, 1)$, and denote $X_{(1,n)} \leq X_{(2,n)} \leq \cdots \leq X_{(n,n)}$ as the corresponding order statistics; then
$X_{(k,n)} \sim BE(k,\ n-k+1)$, $1 \leq k \leq n$, and $X_{(k,n)} - X_{(i,n)} \sim BE(k-i,\ n-k+i+1)$, $1 \leq i < k \leq n$.
10. Suppose $X_1$, $X_2$, …
, X_n are independent and identically distributed random variables with common distribution BE(a, 1). Let Y = min(X_1, . . . , X_n); then Y^a ∼ BE(1, n).

11. If X ∼ BE(a, b), where a and b are positive integers, then

BE(x; a, b) = Σ_{i=a}^{a+b−1} C_{a+b−1}^i x^i (1−x)^{a+b−1−i}.

1.14. Chi-square Distribution^{2,3,4}

If Y_1, Y_2, . . . , Y_n are mutually independent and identically distributed random variables with common distribution N(0, 1), then we say the random variable

X = Σ_{i=1}^n Y_i²

follows the Chi-square distribution (χ² distribution) with n degrees of freedom, and denote it as X ∼ χ²_n.

If X ∼ χ²_n, then the density function of X is

f(x; n) = e^{−x/2} x^{n/2−1}/(2^{n/2} Γ(n/2)) for x > 0, and f(x; n) = 0 for x ≤ 0,

where Γ(n/2) is the Gamma function.

The Chi-square distribution is derived from the normal distribution and plays an important role in statistical inference for normal distributions. When the degree of freedom n is large, the Chi-square distribution χ²_n is approximately normal.

Some properties of the Chi-square distribution are as follows:

1. If X_1 ∼ χ²_n, X_2 ∼ χ²_m, and X_1 and X_2 are independent, then X_1 + X_2 ∼ χ²_{n+m}. This is the "additive property" of the Chi-square distribution.

2. Let f(x; n) be the density function of the Chi-square distribution χ²_n. Then f(x; n) is monotonically decreasing when n ≤ 2, and f(x; n) is a single-peak function with maximum point n − 2 when n ≥ 3.

3. If X ∼ χ²_n, then the k-th moment of X is

E(X^k) = 2^k Γ(n/2 + k)/Γ(n/2) = Π_{i=0}^{k−1} (n + 2i).

4. If X ∼ χ²_n, then E(X) = n and Var(X) = 2n.

5. If X ∼ χ²_n, then the skewness of X is s = 2√2 n^{−1/2}, and the kurtosis of X is κ = 12/n.

6. If X ∼
χ²_n, the moment-generating function of X is M(t) = (1 − 2t)^{−n/2} for t < 1/2, and the characteristic function of X is φ(t) = (1 − 2it)^{−n/2}.

7. Let K(x; n) be the distribution function of the Chi-square distribution χ²_n. Then we have

(1) K(x; 2n) = 1 − 2 Σ_{i=1}^n f(x; 2i);
(2) K(x; 2n+1) = 2Φ(√x) − 1 − 2 Σ_{i=1}^n f(x; 2i+1);
(3) K(x; n) − K(x; n+2) = (x/2)^{n/2} e^{−x/2}/Γ((n+2)/2),

where Φ(x) is the standard normal distribution function.

8. If X ∼ χ²_m, Y ∼ χ²_n, and X and Y are independent, then X/(X+Y) ∼ BE(m/2, n/2), and X/(X+Y) is independent of X + Y.

Let X_1, X_2, . . . , X_n be a random sample from the normal population N(μ, σ²). Denote

X̄ = (1/n) Σ_{i=1}^n X_i, S² = Σ_{i=1}^n (X_i − X̄)².

Then S²/σ² ∼ χ²_{n−1}, and S² is independent of X̄.

1.14.1. Non-central Chi-square distribution

Suppose the random variables Y_1, . . . , Y_n are mutually independent with Y_i ∼ N(μ_i, 1), 1 ≤ i ≤ n. Then the distribution of the random variable X = Σ_{i=1}^n Y_i² is the non-central Chi-square distribution with n degrees of freedom and non-centrality parameter λ = Σ_{i=1}^n μ_i², denoted as χ²_{n,λ}. In particular, χ²_{n,0} = χ²_n.

9. Suppose Y_1, . . . , Y_m are mutually independent, and Y_i ∼ χ²_{n_i,λ_i} for 1 ≤ i ≤ m. Then Σ_{i=1}^m Y_i ∼ χ²_{n,λ}, where n = Σ_{i=1}^m n_i and λ = Σ_{i=1}^m λ_i.

10. If X ∼ χ²_{n,λ}, then E(X) = n + λ, Var(X) = 2(n + 2λ), the skewness of X is s = √8 (n+3λ)/(n+2λ)^{3/2}, and the kurtosis of X is κ = 12(n+4λ)/(n+2λ)².

11. If X ∼ χ²_{n,λ}, then the moment-generating function and the characteristic function of X are M(t) = (1−2t)^{−n/2} exp{tλ/(1−2t)} and φ(t) = (1−2it)^{−n/2} exp{itλ/(1−2it)}, respectively.

1.15. t Distribution^{2,3,4}

Assume X ∼ N(0, 1), Y ∼ χ²_n, and X is independent of Y. We say the
random variable T = √n X/√Y follows the t distribution with n degrees of freedom, and denote it as T ∼ t_n.

If X ∼ t_n, then the density function of X is

t(x; n) = [Γ((n+1)/2)/((nπ)^{1/2} Γ(n/2))] (1 + x²/n)^{−(n+1)/2}, −∞ < x < ∞.

Define T(x; n) as the distribution function of the t distribution t_n. Then

T(x; n) = 1 − (1/2) I_{n/(n+x²)}(n/2, 1/2) for x ≥ 0, and T(x; n) = (1/2) I_{n/(n+x²)}(n/2, 1/2) for x < 0,

where I_{n/(n+x²)}(n/2, 1/2) is the incomplete Beta function ratio.

Like the Chi-square distribution, the t distribution can be derived from the normal and Chi-square distributions. It has a wide range of applications in statistical inference on normal distributions. When n is large, the t distribution t_n with n degrees of freedom can be approximated by the standard normal distribution.

The t distribution has the following properties:

1. The density function t(x; n) is symmetric about x = 0 and reaches its maximum at x = 0.

2. lim_{n→∞} t(x; n) = (1/√(2π)) e^{−x²/2} = φ(x); that is, the limiting distribution of the t distribution is the standard normal distribution as the degree of freedom n goes to infinity.

3. Assume X ∼ t_n. If k < n, then E(X^k) exists; otherwise E(X^k) does not exist. The k-th moment of X is

E(X^k) = 0, if 0 < k < n and k is odd;
E(X^k) = n^{k/2} Γ((k+1)/2) Γ((n−k)/2)/(√π Γ(n/2)), if 0 < k < n and k is even;
E(X^k) does not exist, if k ≥ n and k is odd;
E(X^k) = ∞, if k ≥ n and k is even.

4. If X ∼ t_n, then E(X) = 0 when n > 1, and Var(X) = n/(n−2) when n > 2.

5. If X ∼ t_n, then the skewness of X is 0. If n ≥ 5, the kurtosis of X is κ = 6/(n−4).

6. Assume that X_1 and X_2 are independent and identically distributed random variables with common distribution χ²_n. Then the random variable

Y = (1/2) n^{1/2} (X_2 − X_1)/(X_1 X_2)^{1/2} ∼ t_n.

Suppose that X_1, X_2, . . . , X_n are a random sample from the normal population N(μ, σ²), and define X̄ = (1/n) Σ_{i=1}^n X_i, S² = Σ_{i=1}^n (X_i − X̄)². Then

T = √(n(n−1)) (X̄ − μ)/S ∼ t_{n−1}.

1.15.1. Non-central t distribution

Suppose that X ∼ N(δ, 1), Y ∼ χ²_n, and X and Y are independent. Then T = √n X/√Y is a non-central t distributed random variable with n degrees of freedom and non-centrality parameter δ, denoted as T ∼ t_{n,δ}. In particular, t_{n,0} = t_n.

7. Let T(x; n, δ) be the distribution function of the non-central t distribution t_{n,δ}. Then we have

T(x; n, δ) = 1 − T(−x; n, −δ), T(0; n, δ) = Φ(−δ), T(1; 1, δ) = 1 − Φ²(δ/√2).

8. If X ∼ t_{n,δ}, then E(X) = δ (n/2)^{1/2} Γ((n−1)/2)/Γ(n/2) for n > 1, and Var(X) = n(1 + δ²)/(n−2) − (E(X))² for n > 2.

1.16. F Distribution^{2,3,4}

Let X and Y be independent random variables such that X ∼ χ²_m and Y ∼ χ²_n. Define a new random variable F as F = (X/m)/(Y/n). Then the distribution of F is called the F distribution with degrees of freedom m and n, denoted as F ∼ F_{m,n}.

If X ∼ F_{m,n}, then the density function of X is

f(x; m, n) = [(m/n)^{m/2} x^{m/2−1}/B(m/2, n/2)] (1 + mx/n)^{−(m+n)/2} for x > 0, and f(x; m, n) = 0 for x ≤ 0.

Let F(x; m, n) be the distribution function of the F distribution F_{m,n}. Then

F(x; m, n) = I_a(m/2, n/2), where a = mx/(n + mx),

and I_a(·, ·) is the incomplete Beta function ratio.

The F distribution is often used in hypothesis testing problems on two or more normal populations. It can also be used to approximate complicated distributions. The F distribution plays an important role in statistical inference.

The F distribution has the following properties:

1. F distributions are generally skewed; the smaller n is, the more skewed the distribution.

2.
When m = 1 or 2, f(x; m, n) decreases monotonically; when m > 2, f(x; m, n) is unimodal, with mode n(m−2)/(m(n+2)).

3. If X ∼ F_{m,n}, then Y = 1/X ∼ F_{n,m}.

4. If X ∼ t_n, then X² ∼ F_{1,n}.

5. If X ∼ F_{m,n}, then the k-th moment of X is

E(X^k) = (n/m)^k Γ(m/2 + k) Γ(n/2 − k)/(Γ(m/2) Γ(n/2)) for 0 < k < n/2, and E(X^k) = ∞ for k ≥ n/2.

6. Assume that X ∼ F_{m,n}. If n > 2, then E(X) = n/(n−2); if n > 4, then Var(X) = 2n²(m+n−2)/(m(n−2)²(n−4)).

7. Assume that X ∼ F_{m,n}. If n > 6, then the skewness of X is s = (2m+n−2)(8(n−4))^{1/2}/((n−6)(m(m+n−2))^{1/2}); if n > 8, then the kurtosis of X is κ = 12((n−2)²(n−4) + m(m+n−2)(5n−22))/(m(n−6)(n−8)(m+n−2)).

8. When m is large enough and n > 4, the normal distribution function Φ(y) can be used to approximate the F distribution function F(x; m, n), where

y = (x − n/(n−2)) / [(n/(n−2)) (2(n+m−2)/(m(n−4)))^{1/2}],

that is, F(x; m, n) ≈ Φ(y).

Suppose X ∼ F_{m,n} and let Z_{m,n} = (1/2) ln X. When both m and n are large enough, the distribution of Z_{m,n} can be approximated by the normal distribution N((1/2)(1/n − 1/m), (1/2)(1/m + 1/n)).

Assume that X_1, . . . , X_m is a random sample from the normal population N(μ_1, σ_1²) and Y_1, . . . , Y_n is a random sample from the normal population N(μ_2, σ_2²). The testing problem we are interested in is whether σ_1 and σ_2 are equal.

Define σ̂_1² = (m−1)^{−1} Σ_{i=1}^m (X_i − X̄)² and σ̂_2² = (n−1)^{−1} Σ_{i=1}^n (Y_i − Ȳ)² as the estimators of σ_1² and σ_2², respectively. Then we have

(m−1)σ̂_1²/σ_1² ∼ χ²_{m−1}, (n−1)σ̂_2²/σ_2² ∼ χ²_{n−1},

where σ̂_1² and σ̂_2² are independent. If σ_1² = σ_2², then by the definition of the F distribution, the test statistic

F = (σ̂_1²/σ_1²)/(σ̂_2²/σ_2²) = [(m−1)^{−1} Σ_{i=1}^m (X_i − X̄)²] / [(n−1)^{−1} Σ_{i=1}^n (Y_i − Ȳ)²] ∼ F_{m−1,n−1}.

1.16.1. Non-central F distribution

If X ∼ χ²_{m,λ}, Y ∼ χ²_n, and X and Y are independent, then F = (X/m)/(Y/n) follows a non-central F distribution with degrees of freedom m and n and non-centrality parameter λ, denoted as F ∼ F_{m,n,λ}. In particular, F_{m,n,0} = F_{m,n}.

10. If X ∼ t_{n,δ}, then X² ∼ F_{1,n,δ²}.

11. Assume that X ∼ F_{m,n,λ}. If n > 2, then E(X) = (m+λ)n/((n−2)m); if n > 4, then Var(X) = 2(n/m)²((m+λ)² + (m+2λ)(n−2))/((n−2)²(n−4)).

1.17. Multivariate Hypergeometric Distribution^{2,3,4}

Suppose X = (X_1, . . . , X_n) is an n-dimensional random vector with n ≥ 2, which satisfies:

(1) 0 ≤ X_i ≤ N_i, 1 ≤ i ≤ n, Σ_{i=1}^n N_i = N;
(2) for non-negative integers m_1, . . . , m_n with Σ_{i=1}^n m_i = m, the probability of the event {X_1 = m_1, . . . , X_n = m_n} is

P{X_1 = m_1, . . . , X_n = m_n} = Π_{i=1}^n C_{N_i}^{m_i} / C_N^m.

Then we say X follows the multivariate hypergeometric distribution, and denote it as X ∼ MH(N_1, . . . , N_n; m).

Suppose a jar contains balls of n colors, the number of balls of the i-th color being N_i, 1 ≤ i ≤ n. We draw m balls randomly from the jar without replacement, and denote by X_i the number of balls of the i-th color drawn, 1 ≤ i ≤ n. Then the random vector (X_1, . . . , X_n) follows the multivariate hypergeometric distribution MH(N_1, . . . , N_n; m).

The multivariate hypergeometric distribution has the following properties:

1. Suppose (X_1, . . . , X_n) ∼ MH(N_1, . . . , N_n; m). For 0 = j_0 < j_1 < · · · < j_s = n, let X*_k = Σ_{i=j_{k−1}+1}^{j_k} X_i and N*_k = Σ_{i=j_{k−1}+1}^{j_k} N_i, 1 ≤ k ≤ s. Then (X*_1, . . . , X*_s) ∼ MH(N*_1, . . . , N*_s; m).
That is, merging components of a random vector that follows a multivariate hypergeometric distribution yields a new random vector that still follows a multivariate hypergeometric distribution.

2. Suppose (X_1, . . . , X_n) ∼ MH(N_1, . . . , N_n; m). Then for any 1 ≤ k < n, we have

P{X_1 = m_1, . . . , X_k = m_k} = C_{N_1}^{m_1} C_{N_2}^{m_2} · · · C_{N_k}^{m_k} C_{N*_{k+1}}^{m*} / C_N^m,

where N = Σ_{i=1}^n N_i, N*_{k+1} = Σ_{i=k+1}^n N_i, and m* = m − Σ_{i=1}^k m_i. In particular, when k = 1, we have P{X_1 = m_1} = C_{N_1}^{m_1} C_{N−N_1}^{m−m_1} / C_N^m, that is, X_1 ∼ H(N_1, N, m). The multivariate hypergeometric distribution is thus an extension of the hypergeometric distribution.

3. Suppose (X_1, . . . , X_n) ∼ MH(N_1, . . . , N_n; m), 0 < k < n. Then

P{X_1 = m_1, . . . , X_k = m_k | X_{k+1} = m_{k+1}, . . . , X_n = m_n} = C_{N_1}^{m_1} · · · C_{N_k}^{m_k} / C_{N*}^{m*},

where N* = Σ_{i=1}^k N_i and m* = m − Σ_{i=k+1}^n m_i. This indicates that, conditional on X_{k+1} = m_{k+1}, . . . , X_n = m_n, the conditional distribution of (X_1, . . . , X_k) is MH(N_1, . . . , N_k; m*).

4. Suppose X_i ∼ B(N_i, p), 1 ≤ i ≤ n, 0 < p < 1, and X_1, . . . , X_n are mutually independent. Then

(X_1, . . . , X_n) | Σ_{i=1}^n X_i = m ∼ MH(N_1, . . . , N_n; m).

This indicates that, when the sum of independent binomial random variables with a common success probability is given, the conditional joint distribution of these random variables is a multivariate hypergeometric distribution.

5. Suppose (X_1, . . . , X_n) ∼ MH(N_1, . . . , N_n; m). If N_i/N → p_i as N → ∞ for 1 ≤ i ≤ n, then the distribution of (X_1, . . . , X_n) converges to the multinomial distribution PN(m; p_1, . . . , p_n).
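The defining pmf and the marginal property (property 2) are easy to verify numerically. The following Python sketch computes the pmf directly from the combinatorial formula and checks that the marginal of X_1 is the hypergeometric H(N_1, N, m); the helper name `mh_pmf` is ours, not from the text:

```python
from math import comb

def mh_pmf(ms, Ns):
    """P{X1=m1,...,Xn=mn} for MH(N1,...,Nn; m) with m = sum(ms).

    math.comb(N, k) returns 0 when k > N, so infeasible counts
    (mi > Ni) automatically get probability 0.
    """
    m, N = sum(ms), sum(Ns)
    num = 1
    for mi, Ni in zip(ms, Ns):
        num *= comb(Ni, mi)
    return num / comb(N, m)

# Property 2 with k = 1: summing out X2, X3 must give H(N1, N, m),
# i.e. P{X1=m1} = C(N1,m1) C(N-N1,m-m1) / C(N,m)  (Vandermonde identity).
Ns, m = [4, 3, 5], 6
N = sum(Ns)
for m1 in range(m + 1):
    marginal = sum(mh_pmf([m1, m2, m - m1 - m2], Ns)
                   for m2 in range(m - m1 + 1))
    hyper = comb(Ns[0], m1) * comb(N - Ns[0], m - m1) / comb(N, m)
    assert abs(marginal - hyper) < 1e-12
```

The same direct-enumeration approach also verifies the conditional distribution in property 3, since only the normalizing constant changes.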
As an example, suppose that in order to control the number of cars, the government implements a random license-plate lottery policy: each participant has the same probability of obtaining a new license plate, and 10 quotas are granted in each issue. Suppose 100 people participate in the lottery, among whom 10 are civil servants, 50 are self-employed individuals, 30 are workers of state-owned enterprises, and the remaining 10 are university professors. Denote by X_1, X_2, X_3, X_4 the numbers of people who obtain a license plate as civil servants, self-employed individuals, workers of state-owned enterprises, and university professors, respectively. Then the random vector (X_1, X_2, X_3, X_4) follows the multivariate hypergeometric distribution MH(10, 50, 30, 10; 10). Therefore, in the next issue, the probability of the outcome X_1 = 7, X_2 = 1, X_3 = 1, X_4 = 1 is

P{X_1 = 7, X_2 = 1, X_3 = 1, X_4 = 1} = C_{10}^7 C_{50}^1 C_{30}^1 C_{10}^1 / C_{100}^{10}.

1.18. Multivariate Negative Binomial Distribution^{2,3,4}

Suppose X = (X_1, . . . , X_n) is a random vector of dimension n (n ≥ 2) which satisfies:

(1) X_i takes non-negative integer values, 1 ≤ i ≤ n;
(2) the probability of the event {X_1 = x_1, . . . , X_n = x_n} is

P{X_1 = x_1, . . . , X_n = x_n} = [(x_1 + · · · + x_n + k − 1)! / (x_1! · · · x_n! (k−1)!)] p_0^k p_1^{x_1} · · · p_n^{x_n},

where 0 < p_i < 1, 0 ≤ i ≤ n, Σ_{i=0}^n p_i = 1, and k is a positive integer. Then we say X follows the multivariate negative binomial distribution, denoted as X ∼ MNB(k; p_1, . . . , p_n).

Suppose that a certain trial has (n+1) different possible results, exactly one of which occurs in each trial, the i-th with probability p_i, 1 ≤ i ≤ n+1 (so that p_{n+1} corresponds to p_0 above). The sequence of trials continues until the (n+1)-th result has occurred k times.
At this moment, denote by X_i the total number of occurrences of the i-th result, 1 ≤ i ≤ n. Then the random vector (X_1, . . . , X_n) follows the multivariate negative binomial distribution MNB(k; p_1, . . . , p_n).

The multivariate negative binomial distribution has the following properties:

1. Suppose (X_1, . . . , X_n) ∼ MNB(k; p_1, . . . , p_n). For 0 = j_0 < j_1 < · · · < j_s = n, let X*_l = Σ_{i=j_{l−1}+1}^{j_l} X_i and p*_l = Σ_{i=j_{l−1}+1}^{j_l} p_i, 1 ≤ l ≤ s. Then (X*_1, . . . , X*_s) ∼ MNB(k; p*_1, . . . , p*_s). That is, merging components of a random vector that follows a multivariate negative binomial distribution yields a new random vector that still follows a multivariate negative binomial distribution.

2. If (X_1, . . . , X_n) ∼ MNB(k; p_1, . . . , p_n), then the mixed factorial moments are

E(X_1^{(r_1)} · · · X_n^{(r_n)}) = [(k + Σ_{i=1}^n r_i − 1)! / (k−1)!] Π_{i=1}^n (p_i/p_0)^{r_i},

where p_0 = 1 − Σ_{i=1}^n p_i and X^{(r)} = X(X−1) · · · (X−r+1).

3. If (X_1, . . . , X_n) ∼ MNB(k; p_1, . . . , p_n) and 1 ≤ s < n, then (X_1, . . . , X_s) ∼ MNB(k; p*_1, . . . , p*_s), where p*_i = p_i/(p_0 + p_1 + · · · + p_s), 1 ≤ i ≤ s, and p_0 = 1 − Σ_{i=1}^n p_i. In particular, when s = 1, X_1 ∼ NB(k, p_0/(p_0 + p_1)).

1.19. Multivariate Normal Distribution^{5,2}

A random vector X = (X_1, . . . , X_p)′ follows the multivariate normal distribution, denoted as X ∼ N_p(μ, Σ), if it has the density function

f(x) = (2π)^{−p/2} |Σ|^{−1/2} exp{−(1/2)(x − μ)′ Σ^{−1} (x − μ)},

where x = (x_1, . . . , x_p)′ ∈ R^p, μ ∈ R^p, Σ is a p × p positive definite matrix, | · | denotes the matrix determinant, and ′ denotes matrix transposition.

The multivariate normal distribution is the extension of the normal distribution. It is the foundation of multivariate statistical analysis and thus plays an important role in statistics.

Let X_1, . . . , X_p be independent and identically distributed standard normal random variables. Then the random vector X = (X_1, . . .
, X_p)′ follows the standard multivariate normal distribution, denoted as X ∼ N_p(0, I_p), where I_p is the identity matrix of order p.

Some properties of the multivariate normal distribution are as follows:

1. A necessary and sufficient condition for X = (X_1, . . . , X_p)′ to follow a multivariate normal distribution is that a′X follows a normal distribution for every a = (a_1, . . . , a_p)′ ∈ R^p.

2. If X ∼ N_p(μ, Σ), we have E(X) = μ and Cov(X) = Σ.

3. If X ∼ N_p(μ, Σ), its moment-generating function and characteristic function are M(t) = exp{μ′t + (1/2)t′Σt} and φ(t) = exp{iμ′t − (1/2)t′Σt} for t ∈ R^p, respectively.

4. Any marginal distribution of a multivariate normal distribution is still a multivariate normal distribution. Let X = (X_1, . . . , X_p)′ ∼ N_p(μ, Σ), where μ = (μ_1, . . . , μ_p)′ and Σ = (σ_ij)_{p×p}. For any 1 ≤ q < p, set X^(1) = (X_1, . . . , X_q)′, μ^(1) = (μ_1, . . . , μ_q)′, Σ_11 = (σ_ij)_{1≤i,j≤q}; then X^(1) ∼ N_q(μ^(1), Σ_11). In particular, X_i ∼ N(μ_i, σ_ii), 1 ≤ i ≤ p.

5. If X ∼ N_p(μ, Σ), B is a q × p constant matrix, and a is a q × 1 constant vector, then

a + BX ∼ N_q(a + Bμ, BΣB′),

which implies that a linear transformation of a multivariate normal random vector still follows a normal distribution.

6. If X_i ∼ N_p(μ_i, Σ_i), 1 ≤ i ≤ n, and X_1, . . . , X_n are mutually independent, then Σ_{i=1}^n X_i ∼ N_p(Σ_{i=1}^n μ_i, Σ_{i=1}^n Σ_i).

7. If X ∼ N_p(μ, Σ), then (X − μ)′ Σ^{−1} (X − μ) ∼ χ²_p.

8. Let X = (X_1, . . . , X_p)′ ∼ N_p(μ, Σ), and partition X, μ and Σ as

X = ((X^(1))′, (X^(2))′)′, μ = ((μ^(1))′, (μ^(2))′)′, Σ = [Σ_11, Σ_12; Σ_21, Σ_22],

where X^(1) and μ^(1) are q × 1 vectors and Σ_11 is a q × q matrix, q < p. Then X^(1) and X^(2) are mutually independent if and only if Σ_12 = 0.

9. Let X = (X_1, . . . , X_p)′ ∼ N_p(μ, Σ), and partition X, μ and Σ in the same manner as in property 8. Then the conditional distribution of X^(1) given X^(2) = x^(2) is N_q(μ^(1) + Σ_12 Σ_22^{−1}(x^(2) − μ^(2)), Σ_11 − Σ_12 Σ_22^{−1} Σ_21).

10. Let X = (X_1, . . . , X_p)′ ∼ N_p(μ, Σ), and partition X, μ and Σ in the same manner as in property 8. Then X^(1) and X^(2) − Σ_21 Σ_11^{−1} X^(1) are independent, with X^(1) ∼ N_q(μ^(1), Σ_11) and

X^(2) − Σ_21 Σ_11^{−1} X^(1) ∼ N_{p−q}(μ^(2) − Σ_21 Σ_11^{−1} μ^(1), Σ_22 − Σ_21 Σ_11^{−1} Σ_12).

Similarly, X^(2) and X^(1) − Σ_12 Σ_22^{−1} X^(2) are independent, with X^(2) ∼ N_{p−q}(μ^(2), Σ_22) and

X^(1) − Σ_12 Σ_22^{−1} X^(2) ∼ N_q(μ^(1) − Σ_12 Σ_22^{−1} μ^(2), Σ_11 − Σ_12 Σ_22^{−1} Σ_21).

1.20. Wishart Distribution^{5,6}

Let X_1, . . . , X_n be independent and identically distributed p-dimensional random vectors with common distribution N_p(0, Σ), and let X = (X_1, . . . , X_n) be a p × n random matrix. Then we say the p-th order random matrix W = XX′ = Σ_{i=1}^n X_i X_i′ follows the p-th order (central) Wishart distribution with n degrees of freedom, denoted as W ∼ W_p(n, Σ). Here the distribution of a random matrix means the distribution of the random vector obtained by vectorizing the matrix.

In particular, if p = 1, we have W = Σ_{i=1}^n X_i² ∼ σ²χ²_n, which shows that the Wishart distribution is an extension of the Chi-square distribution.

If W ∼ W_p(n, Σ), Σ > 0 and n ≥ p, then the density function of W is

f_p(W) = |W|^{(n−p−1)/2} exp{−(1/2)tr(Σ^{−1}W)} / [2^{np/2} |Σ|^{n/2} π^{p(p−1)/4} Π_{i=1}^p Γ((n−i+1)/2)],

where W > 0 and "tr" denotes the trace of a matrix.

The Wishart distribution is a useful distribution in multivariate statistical analysis and plays an important role in statistical inference for the multivariate normal distribution. Some properties of the Wishart distribution are as follows:

1. If W ∼ W_p(n, Σ), then E(W) = nΣ.
2. If W ∼ W_p(n, Σ) and C is a k × p constant matrix, then CWC′ ∼ W_k(n, CΣC′).

3. If W ∼ W_p(n, Σ), its characteristic function is E(e^{i tr(TW)}) = |I_p − 2iΣT|^{−n/2}, where T is a real symmetric matrix of order p.

4. If W_i ∼ W_p(n_i, Σ), 1 ≤ i ≤ k, and W_1, . . . , W_k are mutually independent, then Σ_{i=1}^k W_i ∼ W_p(Σ_{i=1}^k n_i, Σ).

5. Let X_1, . . . , X_n be independent and identically distributed p-dimensional random vectors with common distribution N_p(0, Σ), Σ > 0, and let X = (X_1, . . . , X_n).

(1) If A is an idempotent matrix of order n, then the quadratic-form matrix Q = XAX′ ∼ W_p(m, Σ), where m = r(A) and r(·) denotes the rank of a matrix.
(2) Let Q = XAX′ and Q_1 = XBX′, where both A and B are idempotent matrices. If Q_2 = Q − Q_1 = X(A − B)X′ ≥ 0, then Q_2 ∼ W_p(m − k, Σ), where m = r(A) and k = r(B). Moreover, Q_1 and Q_2 are independent.

6. If W ∼ W_p(n, Σ), Σ > 0, n ≥ p, partition W and Σ into q-th order and (p−q)-th order blocks as

W = [W_11, W_12; W_21, W_22], Σ = [Σ_11, Σ_12; Σ_21, Σ_22].

Then

(1) W_11 ∼ W_q(n, Σ_11);
(2) W_22 − W_21 W_11^{−1} W_12 and (W_11, W_21) are independent;
(3) W_22 − W_21 W_11^{−1} W_12 ∼ W_{p−q}(n − q, Σ_{2|1}), where Σ_{2|1} = Σ_22 − Σ_21 Σ_11^{−1} Σ_12.

7. If W ∼ W_p(n, Σ), Σ > 0, n > p + 1, then E(W^{−1}) = Σ^{−1}/(n − p − 1).

8. If W ∼ W_p(n, Σ), Σ > 0, n ≥ p, then |W| = |Σ| Π_{i=1}^p ξ_i, where ξ_1, . . . , ξ_p are mutually independent and ξ_i ∼ χ²_{n−i+1}, 1 ≤ i ≤ p.

9. If W ∼ W_p(n, Σ), Σ > 0, n ≥ p, then for any p-dimensional non-zero vector a, we have

(a′Σ^{−1}a)/(a′W^{−1}a) ∼ χ²_{n−p+1}.

1.20.1. Non-central Wishart distribution
