De la Garza Phenomenon Bikas K Sinha

Author: 淋权 陆

1 De la Garza Phenomenon
Bikas K Sinha, Indian Statistical Institute, Kolkata
ASU Statistical Sciences : Feb. 19, 2016
Collaborators : N K Mandal & M Pal, Calcutta University

2 Quote of the Day
There is more to see (on the ground) than what we see (from the sky).

3 Nomenclature
Liski-Mandal-Shah-Sinha (2002) : Topics in Optimal Design : Springer-Verlag Monograph
Pukelsheim (1993/2006) : Optimal Design of Experiments : refers to it as the ‘Property of Admissibility’
Khuri-Mukherjee-Sinha-Ghosh (2006) : Statistical Science : ‘de la Garza Phenomenon’
Min Yang (2010) : Annals of Statistics : title of the paper ‘On the de la Garza Phenomenon’

4 Reference Journals / Books : 1993, 2002, 2006, 2009/10

5 Motivating Example : First Course in Linear Regression
Y : … … … … … … …
Fit a linear regression equation of Y on X under the usual model assumptions.
Mean Model : Yx = α + βx with Homoscedastic Errors
Computations yield : n = 7, m’1(x) = xbar = … and m’2(x) = …
Properties / Precisions of α^ and β^ depend on these two non-stochastic quantities [x-dependent], apart from the random σ²^ [y-dependent] with 5 df.

6 Transformation : X transformed to U, with min at ‘-1’ and max at ‘+1’
U = [2(x – x min)/(x max – x min)] – 1
U : -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
Unified approach, covering all sorts of x-types and measurement variations : scale-free
Motivating Question : If we believe in the linear regression model, what good are so many X-values ? Why can’t we work with exactly two X-values and, at that, possibly with the two extremes +/- 1 when converted to U ?
Yes, n = 7 can be brought into the picture by referring to replication numbers at +/- 1 to make up for 7 obs.
Options : {U : -1(f) & +1(n-f); f = 1, 2, …, 6; n = 7}
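The rescaling to U can be sketched in Python as follows. This is only an illustration: the x-values are hypothetical placeholders (not the data from the talk) and the function name `to_unit_interval` is an assumption made here.

```python
def to_unit_interval(xs):
    """Map x-values linearly onto [-1, 1]: u = 2(x - x_min)/(x_max - x_min) - 1."""
    lo, hi = min(xs), max(xs)
    return [2 * (x - lo) / (hi - lo) - 1 for x in xs]

# Hypothetical regressor values, used only to show the scaling.
x = [10.0, 12.0, 15.0, 20.0, 30.0]
u = to_unit_interval(x)
print(u)  # the smallest x maps to -1 and the largest to +1
```

Because the map is linear, a linear regression in x stays linear in u, which is why the design question can be posed on [-1, 1] without loss of generality.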

7 Linear Regression Model
Mean Model : Yx = α + βx with Homoscedastic Errors
Given DN = [(x1, n1); (x2, n2); …; (xk, nk)]; N = ∑ni
χ = Space of the Regressor ‘X’ = [a, b], a < b
WLOG : a ≤ x1 < x2 < … < xk ≤ b; x’s all distinct
For each i, ni ≥ 1 such that ∑ni = N [given]
Estimability of α and β is ensured iff k ≥ 2 distinct x’s are used
Fitting of Linear Regression Model : β^ = byx = SPyx / SSxx; α^ = ybar – byx xbar; σ²^
Inference rests on normality of errors etc.

8 Motivating Theory : Undergraduate Level
X : a ≤ x1 < x2 < … < xk ≤ b [k > 1, all x’s distinct], each ni = 1
Y : y1, y2, y3, …, yk : responses on Y, all singletons
Assume Linear Regression of Y on X : E[Yx] = α + βx, with the usual conditions on the errors.
Find the BLUE of the regression coefficient ‘β’.
Smart Student’s thought : pairwise unbiased estimators
β^(i, j) = b(i, j) = (yi – yj) / (xi – xj), 1 ≤ i < j ≤ k
So the BLUE can be based on the {b(i, j)’s} : kC2 = k(k-1)/2 pairs
All Distinct ? Correlated / Uncorrelated ?
Basis : {b(1, 2), b(1, 3), …, b(1, k)} : each unbiased, but jointly correlated estimates, since y1 is involved everywhere

9 Formation of BLUE
Work out means, variances / covariances of the estimators and start from there to arrive at the BLUE.
Define ‘η’ as the (k-1) x 1 column vector of the ‘difference estimators’, i.e., η = (b(1, 2), b(1, 3), …, b(1, k))’, so that E[η] = β1 and Disp.(η) = σ²W, W being a pd matrix [1 = vector of ones].
Then BLUE of β = η’W⁻¹1 / 1’W⁻¹1
Show that indeed the above simplifies to β^ = b = ∑(yi – ybar)(xi – xbar) / ∑(xi – xbar)².
Good problem for Stat Major Exam.
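A small exact-arithmetic check of this exercise, as a sketch only. It assumes uncorrelated homoscedastic errors, so that W has entries (1 + [i = j]) / ((x1 – xi)(x1 – xj)); the data and the helper names `solve`, `blue_from_pairs` and `ols_slope` are made up for illustration.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A z = b exactly by Gauss-Jordan elimination (A assumed nonsingular)."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

def blue_from_pairs(x, y):
    """BLUE of the slope built from eta = (b(1,2), ..., b(1,k))."""
    k = len(x)
    d = [x[0] - x[i] for i in range(1, k)]                  # x1 - xi
    eta = [(y[0] - y[i]) / d[i - 1] for i in range(1, k)]   # b(1, i)
    # Disp(eta) = sigma^2 W, with W[i][j] = (1 + [i == j]) / (d_i d_j)
    W = [[F(2 if i == j else 1) / (d[i] * d[j]) for j in range(k - 1)]
         for i in range(k - 1)]
    z = solve(W, [F(1)] * (k - 1))                          # z = W^{-1} 1
    return sum(e * zi for e, zi in zip(eta, z)) / sum(z)

def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data with distinct x's.
x = [F(1), F(2), F(4), F(7)]
y = [F(3), F(5), F(4), F(9)]
print(blue_from_pairs(x, y), ols_slope(x, y))  # 37/42 37/42
```

Exact fractions are used so that the two slopes can be compared with `==` rather than up to rounding.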

10 Smarter move
V1 = [y1 – y2] / [x1 – x2]
V2 = [y1 + y2 – 2y3] / [x1 + x2 – 2x3]
…
Vk-1 = [y1 + y2 + … + yk-1 – (k-1)yk] / [x1 + x2 + … + xk-1 – (k-1)xk]
Then these V’s are uncorrelated. Hence W(V) is a diagonal matrix, and the derivation of β^ is much easier.
Claim : Same result, by a novel derivation using Helmert’s Orthogonal Transformation.
Again, a good problem for Stat Majors.
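The Helmert route can be sketched the same way. Since the V’s are uncorrelated with Var(Vj) = σ² j(j+1) / (denominator)², the BLUE is their inverse-variance weighted mean. The function name and the data below are assumptions for illustration.

```python
from fractions import Fraction as F

def helmert_slope(x, y):
    """Inverse-variance weighted mean of the Helmert-type estimators V_1, ..., V_{k-1}."""
    num = den = F(0)
    for j in range(1, len(x)):
        cy = sum(y[:j]) - j * y[j]        # y1 + ... + yj - j*y(j+1)
        cx = sum(x[:j]) - j * x[j]        # x1 + ... + xj - j*x(j+1)
        v = cy / cx                       # V_j, unbiased for beta
        w = cx ** 2 / F(j * (j + 1))      # 1 / Var(V_j), up to the factor sigma^2
        num += w * v
        den += w
    return num / den

# Hypothetical data with increasing, distinct x's (so every denominator is nonzero).
x = [F(1), F(2), F(4), F(7)]
y = [F(3), F(5), F(4), F(9)]

xbar, ybar = sum(x) / len(x), sum(y) / len(y)
ols = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
       / sum((xi - xbar) ** 2 for xi in x))
print(helmert_slope(x, y), ols)  # 37/42 37/42
```

No matrix inversion is needed here, which is the simplification the slide points to.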

11 Motivating Theory : Master Level
Regression Design on X : (x1, n1); (x2, n2); …; (xk, nk) [k > 1]; all x’s distinct
Y : {(y1j); (y2j); …; (ykj)} : altogether N = ∑ni obs.
Assume Linear Regression of Y on X : E[Yxi] = α + βxi, with the usual conditions on the errors.
Find the BLUE of the regression coefficient ‘β’.
Smart Student’s thought : pairwise unbiased estimators
β^(i, j) = b(i, j) = (ybari – ybarj) / (xi – xj), 1 ≤ i < j ≤ k
So the BLUE can be based on the {b(i, j)’s}. How many ? Correlated / Uncorrelated ?
Basis : [b(1, 2), b(1, 3), …, b(1, k)] : each unbiased, but jointly correlated estimates, since ybar1 is involved everywhere

12 Motivating Theory : Master Level & Beyond
Work out means, variances / covariances of the estimators and start from there to arrive at the BLUE.
Define ‘η’ as the vector of these ‘difference estimators’, so that E[η] = β1 and Disp.(η) = σ²W. Complicated ?
Then BLUE of β = η’W⁻¹1 / 1’W⁻¹1
Show that indeed the above simplifies to β^ = b = ∑ni (ybari – ybarbar)(xi – xbar) / ∑ni (xi – xbar)².

13 Smarter move
Define V1 = [√n1 ybar1 – √n2 ybar2] / [√n1 x1 – √n2 x2]
V2 = [√n1 ybar1 + √n2 ybar2 – 2√n3 ybar3] / [...]
etc.
This time the W-matrix becomes a diagonal matrix : tremendous simplification in the formation of β^.
Claim : same result.
Good problem for Qualifying Exam.

14 Turn back to the basic question
U : -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
Motivating Question : If you believe in the linear regression model E[Yx] = α + βx = δ + γu = E[Yu], what good are so many u-values ? Why can’t you work with exactly two u-values and, at that, possibly with +/- 1 ?
Total No. of Observations : n, split between +/- 1
Q. How to decide ? Through the concept of the Information Matrix of the vector parameter.

15 Fisher Information Matrix
I(θ; DN) = X’X = 2 x 2 matrix with elements [(N, T1); (T1, T2)], where T1 = ∑ni xi and T2 = ∑ni xi²
X (N x 2) = [1 (N x 1), column vector of xi’s with ni repeats]
Averaged Information Matrix per Observation : IBAR = (1/N) I(θ) = [(1, μ’1); (μ’1, μ’2)], where μ’1 = ∑ni xi / N and μ’2 = ∑ni xi² / N
These are the first and second degree moments of the x-distribution.
I(θ) is a pd matrix iff k ≥ 2 distinct x’s are considered.
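A minimal sketch of the averaged information matrix for a design given as (point, replication) pairs. The function names and the two example designs are assumptions for illustration; σ² is taken as 1.

```python
from fractions import Fraction as F

def averaged_information(design):
    """Return IBAR = (1/N) X'X = [[1, mu1], [mu1, mu2]] for a design [(x_i, n_i), ...]."""
    N = sum(n for _, n in design)
    mu1 = sum(F(n) * x for x, n in design) / N
    mu2 = sum(F(n) * x * x for x, n in design) / N
    return [[F(1), mu1], [mu1, mu2]]

def is_positive_definite(ibar):
    # For this 2 x 2 matrix, pd is equivalent to det = mu2 - mu1^2 > 0.
    return ibar[0][0] * ibar[1][1] - ibar[0][1] * ibar[1][0] > 0

three_point = [(F(-1), 1), (F(0), 1), (F(1), 1)]   # k = 3 distinct points
one_point = [(F(1, 2), 3)]                         # k = 1 : slope not estimable

print(averaged_information(three_point))
print(is_positive_definite(averaged_information(three_point)))  # True
print(is_positive_definite(averaged_information(one_point)))    # False
```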

16 de la Garza Phenomenon [de la Garza, A. (1954) : AMS]
Research Paper [Annals of Statistics]
Springer-Verlag Monograph on Optimal Designs : 2002
Wiley Book on Optimal Designs : 2006
Continuous flow of papers involving Linear & Non-Linear Models, with both qualitative and quantitative responses : enormous impact of the de la Garza Phenomenon in optimality studies

17 Continuous Design Theory
Context : Linear Regression Model
Space of Regressor : χ = [a, b], a < b
k ≥ 2 distinct x-values in χ with positive weights (mass) w1, w2, …, wk such that ∑wi = 1
In applications, we think in terms of ‘N’ observations, with Nwi = ni observations taken at x = xi, i = 1, 2, …, k. [Choice of ‘N’ ensures integral values of the ni’s]
Continuous Version of IBAR = [(1, μ’1); (μ’1, μ’2)], where μ’1 = ∑wi xi AND μ’2 = ∑wi xi²
Known as the Information Matrix arising out of a Continuous Design, in terms of {(xi, wi); i = 1, 2, …, k}

18 De la Garza Phenomenon : Continuous Design Theory
Context : Linear Regression Model with Homoscedastic Errors
Claim 1 : Given any continuous regression design ‘D_(k, x, w)’ with ‘k’ support points in χ = [a, b] : a ≤ x1 < x2 < … < xk ≤ b, x’s all distinct, and with positive weights w1, w2, …, wk [such that ∑wi = 1], whenever k > 2, we can find exactly 2 points ‘x*’ and ‘x**’ with suitable weights ‘p*’ and ‘p**’ such that
(i) x1 ≤ x* < x** ≤ xk;
(ii) p* + p** = 1; and
(iii) IBAR based on ‘D*_[(x*, p*); (x**, p**)]’ is identical to IBAR based on D_(k, x, w). [Info. Equivalence]

19 Proof of Claim 1
Recall μ’1 = ∑wi xi [1st moment] AND μ’2 = ∑wi xi² [2nd moment]
Start with IBAR = [(1, μ’1); (μ’1, μ’2)]
Set IBAR = I*BAR and derive the defining equations
p* x* + p** x** = μ’1 …… (1)
p* x*² + p** x**² = μ’2 …… (2)
Claim : There is an acceptable solution for [(x*, p*); (x**, p**)] satisfying (1) and (2).

20 Proof (contd.)
WLOG : x1 = -1 AND xk = +1
Solution set : Define μ2 = μ’2 – μ’1² > 0
x* = μ’1 – √[p** μ2 / p*]
x** = μ’1 + √[p* μ2 / p**]
Further, since x* < x**, we readily verify -1 < x* AND x** < 1 whenever
μ2 / [μ2 + (1 + μ’1)²] < p* < (1 – μ’1)² / [μ2 + (1 – μ’1)²]
NOTE : Verified LHS < RHS
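The construction in this proof can be tried numerically. The sketch below assumes χ = [-1, 1]; the three-point design and the choice p* = 0.4 are hypothetical, and `two_point_equivalent` is a name introduced here.

```python
import math

def two_point_equivalent(mu1, mu2, p_star):
    """Two-point design with the same first two moments (regressor space [-1, 1])."""
    var = mu2 - mu1 ** 2                         # central second moment, must be > 0
    lo = var / (var + (1 + mu1) ** 2)            # lower bound on p*
    hi = (1 - mu1) ** 2 / (var + (1 - mu1) ** 2) # upper bound on p*
    if not lo < p_star < hi:
        raise ValueError(f"p* must lie strictly between {lo:.4f} and {hi:.4f}")
    p2 = 1 - p_star
    x1 = mu1 - math.sqrt(p2 * var / p_star)
    x2 = mu1 + math.sqrt(p_star * var / p2)
    return (x1, p_star), (x2, p2)

# Hypothetical 3-point continuous design on [-1, 1].
design = [(-1.0, 0.2), (0.2, 0.5), (1.0, 0.3)]
mu1 = sum(w * x for x, w in design)
mu2 = sum(w * x * x for x, w in design)

(xa, pa), (xb, pb) = two_point_equivalent(mu1, mu2, 0.4)
print(xa, xb)                                    # both strictly inside (-1, 1)
print(pa * xa + pb * xb, pa * xa**2 + pb * xb**2)  # reproduces mu1 and mu2
```

Any p* inside the stated interval works, so the information-equivalent two-point design is not unique.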

21 Statement of Information Equivalence : Polynomial Regression
Therefore : Guaranteed existence of [(x*, p*); (x**, p**)]; -1 < x* < x** < 1; 0 < p* < 1 such that IBAR = IBAR*.
The de la Garza Phenomenon applies to the pth degree polynomial regression model in terms of Information Equivalence of any k [> p+1]-point supported continuous design with that of a suitably chosen exactly (p+1)-point supported continuous design !

22 Caratheodory’s Theorem
If ‘p+1’ is the number of parameters in a model, one can restrict attention to designs with at most (p+1)(p+2)/2 support points.
Strength : the model specification is most general.
Weakness : for the pth degree polynomial regression model, de la Garza provides a much better result [p+1 << (p+1)(p+2)/2, in general terms].

23 Higher Degree Polynomial Regression
Yes, the de la Garza Phenomenon holds for higher degree polynomial regressions as well. The proof is a marvellous exercise in matrix theory !!!
Equate the given pd matrix I(D) to I(D*), where I(D*) = X*’W*X*, with X* being a square matrix and W* being a diagonal matrix.
The claim is that such X* and W* matrices exist with the minimum number of support points. This is the spirit of the de la Garza Phenomenon in terms of Information Equivalence.
Information Dominance came much later.

24 Back to de la Garza Phenomenon : Exact Design Theory [EDT]
This aspect has somehow been bypassed in the literature. It is difficult to provide a general theory as to the exact sample size for Info. Equi. to work !
Motivating Example : Linear Regression with 3 points to start with : [-1, 0, 1], so that k = 3 > 2.
According to the de la Garza Phenomenon, under continuous design theory, there are weights 0 < w-1, w0, w+1 < 1, sum = 1, assigned to these points. AND then we can find

25 De la Garza Phenomenon : EDT
one 2-point design, say [(a, p); (b, q)], such that -1 ≤ a < b ≤ 1, 0 < p < 1, q = 1 – p, and there is Information Equivalence between the two designs.
What if we are in an exact design scenario with a given total number of observations ‘N’ and its decomposition into n(-), n(o) and n(+), assigned to -1, 0 and 1 respectively ?
Can we now find a solution to [(a, na); (b, nb)] satisfying

26 EDT
(i) -1 ≤ a < b ≤ 1;
(ii) na + nb = N, both being integers;
(iii) Information Equivalence ?
Do we need a condition on ‘N’ at all ?
Crucial Observation : NOT ALL VALUES OF ‘N’ ARE AMENABLE TO SUPPORTING THE EQUIVALENCE THEOREM OF THE INFORMATION MATRICES. A MINIMUM VALUE IS NEEDED. ONLY THEN DOES IT WORK IN THE EXACT SENSE !

27 EDT : Choice of ‘N’
Examples [design : N : remark]
(i) -1(1), 0(1), +1(1) : N = 3 : Not Possible
(ii) -1(2), 0(2), +1(2) : N = 6 : Possible
(iii) -1(1), 0(2), +1(1) : N = 4 : Possible
(iv) -1(2), 0(1), +1(1) : N = 4 : Not Possible
(v) -1(4), 0(2), +1(2) : N = 8 : Possible
(vi) -1(1), 0(3), +1(1) : N = 5 : Possible
(vii) -1(1), 0(2), +1(4) : N = 7 : Possible
(viii) -1(1), t(1), +1(1) : N = 3 : Not Possible
(ix) -1(2), t(2), +1(2) : N = 6 : Possible iff 3 – 2√3 < t < 2√3 – 3

28 EDT : General Theory for 3 points with point symmetry
Consider a general allocation design : -1(n-), 0(no) and 1(n+), where each of n-, no and n+ is a positive integer and (n-) + (no) + (n+) = N ≥ 3.
Once more, we want to replace the above 3-point point-symmetric design by a two-point design of the form (a, na) and (b, nb), so that na + nb = N and, moreover, Information Equivalence holds. That suggests

29 EDT
a na + b nb = (n+) – (n-) …… (3)
a² na + b² nb = (n+) + (n-) …… (4)
Set A = na, B = nb, T1 = (n+) – (n-) and T2 = (n+) + (n-) …… (5)
From (3) and (4), in terms of (5), we obtain
a = [T1 / (A+B)] ± √[{B[(A+B)T2 – T1²]} / {A(A+B)²}]
b = [T1 / (A+B)] ∓ √[{A[(A+B)T2 – T1²]} / {B(A+B)²}]
It can be readily verified that (A+B)T2 > T1².

30 EDT
Let us choose
a = [T1 / (A+B)] + √[{B[(A+B)T2 – T1²]} / {A(A+B)²}] and
b = [T1 / (A+B)] – √[{A[(A+B)T2 – T1²]} / {B(A+B)²}]
so that b < a. Note that T1 and T2 are both known.
We will now sort out values of A and B subject to A + B = N so as to satisfy the requirement that -1 ≤ b < a ≤ 1.

31 EDT
First, note that
(i) A + B = N;
(ii) the expressions for a and b depend on A and B only through A/B or B/A.
Set n(-)/N = P-, n(0)/N = Po, n(+)/N = P+
Conditions : -1 ≤ b AND a ≤ 1
Equivalent to :
1 + T1/(A+B) ≥ √[{A[(A+B)T2 – T1²]} / {B(A+B)²}] AND
1 – T1/(A+B) ≥ √[{B[(A+B)T2 – T1²]} / {A(A+B)²}]

32 EDT
Equivalent to :
[Po(1-Po) + 4(P+)(P-)] / [2(P-) + Po]² ≤ A/B ≤ [2(P+) + Po]² / [Po(1-Po) + 4(P+)(P-)]
That is, L ≤ A/N ≤ U, where
L = [Po(1-Po) + 4(P+)(P-)] / [Po(1-Po) + 4(P+)(P-) + [2(P-) + Po]²]
U = [2(P+) + Po]² / [Po(1-Po) + 4(P+)(P-) + [2(P+) + Po]²]
Written alternatively as : N.L ≤ A ≤ N.U.

33 EDT
Implication : The choice of ‘N’ must be such that the interval [N.L, N.U] includes at least one integer which can serve as the value of ‘A’.
A sufficient condition for this to happen is, of course, that the length of the interval, viz. N(U – L), is ≥ 1. Even otherwise, a choice of A could be ensured.
Note : So far, this [length less than unity] has been eluding us !!!
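The integer search for ‘A’ can be sketched as below, using exact fractions for L and U. The function names `feasible_A` and `two_point_design` are assumptions introduced for illustration; the formulas are the ones for L, U, a and b stated in this EDT derivation.

```python
import math
from fractions import Fraction as F

def feasible_A(n_minus, n_zero, n_plus):
    """Integers A with N*L <= A <= N*U for the design -1(n-), 0(no), +1(n+)."""
    N = n_minus + n_zero + n_plus
    Pm, Po, Pp = F(n_minus, N), F(n_zero, N), F(n_plus, N)
    v = Po * (1 - Po) + 4 * Pp * Pm
    L = v / (v + (2 * Pm + Po) ** 2)
    U = (2 * Pp + Po) ** 2 / (v + (2 * Pp + Po) ** 2)
    return list(range(math.ceil(N * L), math.floor(N * U) + 1)), L, U

def two_point_design(n_minus, n_zero, n_plus, A):
    """Points (a, A) and (b, N - A) with b < a, matching the first two moments."""
    N = n_minus + n_zero + n_plus
    B = N - A
    m = (n_plus - n_minus) / N                 # T1 / N
    v = (n_plus + n_minus) / N - m * m         # [N*T2 - T1^2] / N^2
    return (m + math.sqrt(B * v / A), A), (m - math.sqrt(A * v / B), B)

print(feasible_A(1, 1, 1)[0])   # [] : N = 3 admits no exact 2-point equivalent
print(feasible_A(2, 2, 2)[0])   # [3]
print(feasible_A(3, 3, 3)[0])   # [4, 5]
print(two_point_design(2, 2, 2, 3))
```

Scaling all three replication numbers by a common factor leaves L and U unchanged and only widens [N.L, N.U], which is why larger N eventually admits a solution.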

34 EDT
(i) Po = P+ = P- = 1/3 [point and mass symmetric design]
Here we find L = 2/5 and U = 3/5.
So, for N = 3, N.L = 6/5 and N.U = 9/5, and the interval does not include any integer. So the 3-point design with point and mass symmetry cannot be replaced by a 2-point design whenever N = 3.
Again, for N = 6, we have N.L = 12/5, N.U = 18/5, and the interval includes the integer ‘3’. So there is a solution and we have : ±√(2/3), each with 3 observations, as was mentioned before.

35 EDT
For N = 9, we have N.L = 18/5 and N.U = 27/5. The interval includes 2 integers : 4 and 5. So we have two solutions :
[(-5/√30, 4); (4/√30, 5)] AND [(-4/√30, 5); (5/√30, 4)].

36 EDT
(ii) Po = 2/7, P+ = 4/7 and P- = 1/7, i.e., the initial design has a size which is a multiple of 7, say N = 7k.
This design is pt-sym but mass-asymmetric. Explicitly it is : [(-1, k); (0, 2k); (1, 4k)], where k is an integer.
Note that L and U are independent of k. Computations yield : L = 13/21 [= 39/63] and U = 50/63.
(a) k = 1 : N = 7; N.L = 13/3 < N.U = 50/9 : one sol.
A = 5, a = 3/7 + √1040/70; B = 2, b = 3/7 – 5√1040/140

37 EDT
(b) k = 2 : N = 14 : three solutions
A = 9, a = 3/7 + √520/42; B = 5, b = 3/7 – 3√2080/140
A = 10, a = 3/7 + √1040/70; B = 4, b = 3/7 – √260/14
A = 11, a = 3/7 + √3432/154; B = 3, b = 3/7 – √3432/42

38 EDT
(iii) Po = 3/5, P+ = P- = 1/5, i.e., the initial design has a size which is a multiple of 5, say N = 5k, and explicitly it is : [(-1, k); (0, 3k); (1, k)], where k is an integer.
This is point and mass-symmetric.
Note that L and U are independent of k. Computations yield : L = 2/7 and U = 5/7.
k = 1 : N = 5, 10/7 ≤ A ≤ 25/7 : (A, B) = (2, 3) OR (3, 2).
Solutions :
a = 2/√15 and b = -3/√15 with A = 3 and B = 2;
a = 3/√15 and b = -2/√15 with A = 2 and B = 3.

39 EDT
k = 2 : N = 10, 20/7 ≤ A ≤ 50/7 : A = 3, 4, 5, 6, 7.
Solutions :
a = 6/√210 and b = -14/√210 for (A, B) = (7, 3)
a = 14/√210 and b = -6/√210 for (A, B) = (3, 7)
a = 4/√60 and b = -6/√60 for (A, B) = (6, 4)
a = 6/√60 and b = -4/√60 for (A, B) = (4, 6)
a = 2/√10 and b = -2/√10 for (A, B) = (5, 5).

40 EDT : EXAMPLE of 3-point asymmetric design : N = 3
Consider an asymmetric design [(-1, 1), (t, 1), (1, 1)] with t ≠ 0. WLOG, we take t > 0.
Consider Information Equivalence with [(a, 2), (b, 1)]. Then
t = 2a + b …… (6)
2 + t² = 2a² + b² …… (7)
This yields : b = t/3 ± (2/3)√(t² + 3) [with a = t/3 ∓ (1/3)√(t² + 3)], and for 0 < t < 1 it turns out that t/3 – (2/3)√(t² + 3) < -1 and 1 < t/3 + (2/3)√(t² + 3).
Hence, N = 3 does not work !

41 EDT
For N = 6, naturally, equal allocation of 2 at each of the 3 points will yield the same negative result when we opt for [(a, 4), (b, 2)]. It follows that [(a, 5), (b, 1)] also fails to yield any affirmative result.
For [(a, 3), (b, 3)], we require
2t = 3(a + b)
4 + 2t² = 3(a² + b²).
We obtain : a, b = t/3 ± (1/3)√(6 + 2t²)

42 EDT
Note : For t = 0, this leads to : a, b = ±√(2/3). This was discussed earlier.
Condition : -1 < a, b < 1 leads to : 0 < t < 2√3 – 3, if t > 0. This was stated earlier.
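A quick numerical check of the asymmetric case, as a sketch only. The function names are assumptions; the formulas are the roots of the moment equations for [(-1, 1), (t, 1), (1, 1)] and [(-1, 2), (t, 2), (1, 2)].

```python
import math

def n6_equal_split(t):
    """Points (a, b), 3 obs. each, matching the moments of [(-1,2), (t,2), (1,2)]."""
    r = math.sqrt(6 + 2 * t * t) / 3
    return t / 3 + r, t / 3 - r

def n6_feasible(t):
    a, b = n6_equal_split(t)
    return -1 < b and a < 1

def n3_single_point(t):
    """Both roots for the singly replicated point when matching [(-1,1), (t,1), (1,1)]."""
    r = 2 * math.sqrt(t * t + 3) / 3
    return t / 3 - r, t / 3 + r

t_bound = 2 * math.sqrt(3) - 3
print(n6_feasible(0.4), n6_feasible(0.5), round(t_bound, 3))  # True False 0.464
print(n3_single_point(0.5))   # both values lie outside [-1, 1]
```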

43 EDT
More examples :
[(-1, 1); (0, 2); (1, 1)] is equivalent to [(-1/√2, 2); (1/√2, 2)]
[(-1, 2); (0, 1); (1, 1)] : Impossible
[(-1, 4); (0, 2); (1, 2)] is equivalent to [(-1/4 – √165/20, 5); (-1/4 + √165/12, 3)]

44 Turning back to the example
Under Linear Regression : Does there exist a 2-point Information Equivalent Design ?
Computations yield : n = 7, μ’1(u) = -1/7, μ’2(u) = …/7
Equi. Choice : -1 < a(4) < 0 < b(3) < 1 for 7 obs.
4a + 3b = -1 and 4a² + 3b² = …
a = … (4 obs.) AND b = … (3 obs.) : this is the reqd. solution
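The numbers for this slide can be recomputed from the seven u-values listed earlier in the talk. The sketch below assumes the split of 4 observations at a and 3 at b stated on the slide; variable names are illustrative.

```python
import math

# The seven transformed design points from the motivating example.
u = [-1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00]
N = len(u)
T1 = sum(u)                    # equals N * mu'1
T2 = sum(v * v for v in u)     # equals N * mu'2

m = T1 / N
var = T2 / N - m * m           # central second moment of the u-distribution
A, B = 4, 3                    # replications at a and at b

a = m - math.sqrt(B * var / A)
b = m + math.sqrt(A * var / B)

print(round(T1, 4), round(T2, 4))
print(round(a, 4), round(b, 4))
```

With these inputs a comes out near -0.80 and b near 0.73, so both points fall inside (-1, 1) with a < 0 < b.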

45 Quadratic Regression : Info Equi.
Context : Quadratic Regression Model with Homoscedastic Errors [Mean Model : Yx = α + βx + γx²]
Claim : Given any continuous regression design ‘D_(k, x, w)’ with ‘k’ support points in χ = [a, b] : a ≤ x1 < x2 < … < xk ≤ b, x’s all distinct, and with positive weights w1, w2, …, wk [such that ∑wi = 1], whenever k > 3, we can find exactly 3 points ‘x*’, ‘x**’ and ‘x***’ with suitable weights ‘p*’, ‘p**’ and ‘p***’ such that
(i) x1 ≤ x* < x** < x*** ≤ xk;
(ii) p* + p** + p*** = 1; and
(iii) IBAR based on ‘D*_[(x*, p*); (x**, p**); (x***, p***)]’ is identical to IBAR based on D_(k, x, w). [Info. Equivalence]

46 Quadratic Regression : EDT
Problem # 1 : Given D_4 : [(-1, 1); (-a, 1); (a, 1); (1, 1)]
Can we find [(x, 2); (y, 1); (z, 1)] for Information Equivalence with -1 ≤ x ≠ y ≠ z ≤ 1 ?
Answer : Impossible !
Problem # 2 : Given D_6 : [(-1, 1); (-0.5, 2); (0.5, 2); (1, 1)]
Can we find [(-x, f); (0, 6-2f); (x, f)] for Information Equivalence with 0 < x < 1 ?
Yes : Unique sol. x = √3/2 and f = 2.

47 More on Quadratic Regression : EDT
Problem # 3. What about D_(2k+2) : [(-1, 1); (-0.5, k); (0.5, k); (1, 1)] ?
Sol. [(-x, f); (0, 2k+2-2f); (x, f)] for some x & f ?
‘No’ for k = 3 to 7
For k = 8 (n = 18) : f = 6 and x = 1/√2 !
More Affirmative Cases :
(i) D_36 : [(-1, 2); (-0.5, 16); (0.5, 16); (1, 2)] ≡ D_36 : [(-1/√2, 12); (0, 12); (1/√2, 12)]
(ii) D_68 : [(-1, 2); (-0.5, 32); (0.5, 32); (1, 2)] ≡ D_68 : [(-√(2/5), 25); (0, 18); (√(2/5), 25)]
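The search over k can be sketched as follows. Both designs are symmetric about 0 with the same N, so the odd moments vanish and only ∑ni xi² and ∑ni xi⁴ need to be matched. The function name and the parameter `e` (replication at each end point) are assumptions for illustration.

```python
from fractions import Fraction as F

def symmetric_three_point(e, k):
    """Try to match [(-1,e), (-1/2,k), (1/2,k), (1,e)] by [(-x,f), (0,N-2f), (x,f)].

    Returns (f, x^2) when f is an integer with a positive weight left at 0, else None.
    """
    N = 2 * e + 2 * k
    s2 = 2 * e + 2 * k * F(1, 4)     # sum of n_i x_i^2
    s4 = 2 * e + 2 * k * F(1, 16)    # sum of n_i x_i^4
    x_sq = s4 / s2                   # from 2 f x^2 = s2 and 2 f x^4 = s4
    f = s2 / (2 * x_sq)
    if f.denominator == 1 and 0 < 2 * f < N:
        return int(f), x_sq
    return None

for k in range(1, 9):
    print(k, symmetric_three_point(1, k))   # solutions only at k = 2 and k = 8
print(symmetric_three_point(2, 16))         # (12, Fraction(1, 2))
print(symmetric_three_point(2, 32))         # (25, Fraction(2, 5))
```

For end replication 1 the candidate f reduces to (k+4)²/(k+16), which is why integer solutions are sparse.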

48 Information Domination
De la Garza Phenomenon : Info Equivalence. There is more to it, in terms of Information Domination.
WLOG : χ = [-1, 1]
Claim 2 : Given D* = [(x*, p*); (x**, p**)] with (x*, x**) ≠ (-1, 1), there exists 0 < c < 1 so that Dc = [(-1, c); (+1, 1-c)] produces an Information Matrix I(Dc) which ‘dominates’ I(D*) in the sense of ‘matrix domination’. That is, I(Dc) – I(D*) is nnd.
In a way, I(Dc) dominates I(D*) in every sense ! This is the best result one can think of in terms of ‘improving’ over I(D*) !!

49 Information Domination
Proof of Claim 2 : Set 1 – 2c = μ’1 and solve for c = [1 – μ’1]/2.
Note that (x*, x**) ≠ (-1, 1), so that -1 < μ’1 < 1 and so 0 < c < 1.
Next note that μ’2 < 1. Therefore, I(Dc) – I(D*) = [(0, 0); (0, 1 – μ’2)], which is nnd.
Message : Push the points to the boundaries !
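The proof of Claim 2 is easy to replay on a concrete case. In the sketch below the interior two-point design D* is hypothetical and the function name is an assumption; exact fractions make the difference matrix visible without rounding.

```python
from fractions import Fraction as F

def dominating_design(x1, p1, x2, p2):
    """Return c and I(Dc) - I(D*) for Dc = [(-1, c); (+1, 1-c)] on [-1, 1]."""
    mu1 = p1 * x1 + p2 * x2
    mu2 = p1 * x1 ** 2 + p2 * x2 ** 2
    c = (1 - mu1) / 2                     # makes the first moment of Dc equal mu1
    i_star = [[F(1), mu1], [mu1, mu2]]
    i_c = [[F(1), 1 - 2 * c], [1 - 2 * c, F(1)]]   # second moment of Dc is 1
    diff = [[i_c[r][s] - i_star[r][s] for s in range(2)] for r in range(2)]
    return c, diff

# Hypothetical interior two-point design D*.
c, diff = dominating_design(F(-1, 2), F(2, 5), F(4, 5), F(3, 5))
print(c)       # 9/25
print(diff)    # only the (2, 2) entry is nonzero, and it is positive
```

A matrix whose only nonzero entry is a positive diagonal element is nnd, which is the domination claimed.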

50 Quadratic Regression : Information Dominance
Context : Quadratic Regression Model with Homoscedastic Errors [Mean Model : Yx = α + βx + γx²]
Set χ = [-1, 1] WLOG.
Claim : Given any continuous regression design ‘D*_[(x*, p*); (x**, p**); (x***, p***)]’ with -1 < x* < x** < x*** < 1, there exist proportions ‘p’, ‘q’ and ‘r’ and a constant c, -1 < c < 1, such that the design D_[(-1, p); (c, r); (+1, q)] provides Information Dominance over the design D*.

51 Sketch of the Proof
I = [(1, μ’1, μ’2); (μ’1, μ’2, μ’3); (μ’2, μ’3, μ’4)]; I* is formed likewise from μ*’1, …, μ*’4.
Equate μ’1, μ’2 and μ’3 to those of I* and solve for p, q, r and c. Then show that μ’4 < μ*’4.
For details : Pukelsheim’s Book; also Liski et al Monograph [2002] : Topics in Optimal Design

52 Binary Response Models
Impressive Literature on Optimality Issues
de la Garza Phenomenon & Information Dominance : recent advances
Optimal designs for binary data under logistic regression. Mathew-Sinha (2001) Jour. Stat Plan. & Inf., 93,

53 Binary Response Model
P[Yx = 1] = 1/[1 + exp{-(α + βx)}]
{(xi, ni)}; i = 1, 2, …, k : given data
Binomial model : log likelihood, differentiation etc. lead to the Information Matrix
Approximate Theory : {(xi, pi)}, ∑pi = 1
Set ai = α + βxi for each i
I(α, β) = [(∑pi exp(-ai)/[1+exp(-ai)]², ∑pi xi exp(-ai)/[1+exp(-ai)]²); (ditto, ∑pi xi² exp(-ai)/[1+exp(-ai)]²)]
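A minimal sketch of this information matrix for an approximate design. The function name, the parameter values α = 0, β = 1 and the two-point design are assumptions chosen only to illustrate the formula.

```python
import math

def logistic_information(design, alpha, beta):
    """Information matrix for (alpha, beta) under a design [(x_i, p_i), ...], sum p_i = 1."""
    i00 = i01 = i11 = 0.0
    for x, p in design:
        a = alpha + beta * x
        w = math.exp(-a) / (1 + math.exp(-a)) ** 2   # logistic weight at x
        i00 += p * w
        i01 += p * w * x
        i11 += p * w * x * x
    return [[i00, i01], [i01, i11]]

# Hypothetical symmetric two-point design, evaluated at alpha = 0, beta = 1.
info = logistic_information([(-1.5, 0.5), (1.5, 0.5)], alpha=0.0, beta=1.0)
print(info)   # off-diagonal entries vanish because the weight is symmetric in a
```

Unlike the linear model, the matrix depends on the unknown (α, β) through the weights, so design comparisons are made at assumed parameter values.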

54 Domination in Logistic Regression Model
Given {(xi, pi)}, subject to ∑pi = 1, and a set of distinct real numbers ai’s, there exists a real number ‘c’ satisfying
(i) ∑pi xi exp(-ai)/[1+exp(-ai)]² = c exp(-c)/[1+exp(-c)]²;
(ii) ∑pi xi² exp(-ai)/[1+exp(-ai)]² ≤ c² exp(-c)/[1+exp(-c)]²
Remark : +/- ‘c’ does better than the ai’s, k > 2

55 Non-Linear Models ?
For most non-linear models, the de la Garza phenomenon holds, and it goes beyond in the sense of Matrix Domination, known as ‘Loewner Domination’.
Stufken & Yang [Annals of Stat., 2009]
Min Yang [Annals of Stat., 2010]
Non-linear Models with 3 parameters :
theta_o + {theta_1 x / [x + theta_2]} : E_max
theta_o + {theta_1 exp(x/theta_2)} : Expon.
theta_o + {theta_1 log(x + theta_2)} : Loglinear
There are designs supported by exactly 3 points (including the two extreme points) which are as good as those supported by more than 3 points in the sense of Matrix Domination !

56 Non-Linear Models : More Ref.
UIC School : strong research group
Fang & Hedayat, 2008, Annals
Li & Majumdar, 2009, JSPI
Stufken & Yang, 2009, Annals
Others in UIC group : 2010 / 2011 / 2012
German School : Augsburg
Finland : Tampere University

57 Here I stop…… B.K.Sinha ASU/Tempe Feb. 19, 2016