Predicted probability after multilevel logit model

ashish123 · Post by **ashish123** » Thu Aug 28, 2014 8:58 pm

Hi all,
I am running a three-level multilevel model in runmlwin (state-household-individual) to predict the probability of death for each state. I cannot get Stata's predict command to work. I would like to know the exact code for obtaining the predicted probability by state (v024) in this case.
My code is given below.

Code:

runmlwin censor cons i.moth_age i.moth_edu i.v190 i.toi_fac lit_wom full_anc, level3(v024: cons) level2(hhid: cons) level1(id2:) discrete(distribution(binomial) link(logit) denominator(cons) mql1)
(pql2 does not seem to work?)

where censor is coded as death (1) or survived (0). I would like to obtain the probability of death for each state (v024) after adjusting for the other variables.
How can I obtain predicted probabilities in this scenario? I hope for a positive response.

GeorgeLeckie · Post by **GeorgeLeckie** » Fri Aug 29, 2014 12:45 pm

Hi Ashish,

The runmlwin command pulls back the model parameter estimates and associated standard errors (and optionally the predicted random effects and their standard errors). This is all you need to perform inference and prediction.

So to obtain state predicted probabilities, you should do something like the following:

Code:

* Fit model by MQL1 to obtain starting values for PQL2 estimation
. runmlwin censor cons i.moth_age i.moth_edu i.v190 i.toi_fac lit_wom full_anc, ///
    level3(v024: cons) ///
    level2(hhid: cons) ///
    level1(id2:) ///
    discrete(distribution(binomial) link(logit) denominator(cons) mql1)

* Re-fit model by PQL2 using the MQL1 estimates as starting values
. runmlwin censor cons i.moth_age i.moth_edu i.v190 i.toi_fac lit_wom full_anc, ///
    level3(v024: cons, residuals(v)) ///
    level2(hhid: cons) ///
    level1(id2:) ///
    discrete(distribution(binomial) link(logit) denominator(cons) pql2) ///
    initsprevious

* Predict the linear predictor
. predict xb, xb

* Predict the probabilities
* (residuals(v) stores the state-level residual as v0 and its standard error as v0se)
. generate p = invlogit(xb + v0)

Note that it is best to provide starting values when you wish to fit models by PQL2 and I have illustrated this above.
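
Note also that p is an individual-level prediction: it combines each person's own covariate values with the residual for their state. If a single figure per state is wanted, one option is to average p within each state. The lines below are a minimal sketch of that, not the only reasonable summary. They assume p has already been generated as above, and the names p_state and state_tag are made up for illustration.

```stata
* Mean of the individual predicted probabilities within each state
egen double p_state = mean(p), by(v024)

* Flag one observation per state so that each state is listed once
egen byte state_tag = tag(v024)
list v024 p_state if state_tag, noobs
```

Averaging over the observed individuals gives a probability that reflects each state's own covariate mix. Fixing the covariates at common values before applying invlogit() would instead isolate the state effect.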

More generally, you might like to take a look at Chapter 9 of the MLwiN User Manual and Module 7 of our free online course together with their associated runmlwin do-files which go into more detail about generating predicted probabilities after multilevel logistic regression.

http://www.bristol.ac.uk/cmm/software/r ... /examples/

Best wishes

George

ashish123 · Post by **ashish123** » Sun Aug 31, 2014 11:54 am

Hi George,
Thanks for the excellent and prompt response. The predict command worked very well. However, after fitting the MQL1 model, I could not get the PQL2 model to run, even when using the MQL1 estimates as starting values. I then dropped level 2 from the model, and that worked. The inclusion of the second level was probably making it difficult for the model to converge. I would love to hear your comments on this.
runmlwin has helped my research immensely and has made the work very enjoyable. I also have two questions; please suggest solutions if possible.

1) My data set has 283,000 observations, and runmlwin takes 30-40 minutes to read the whole data set. How can I speed this up?
2) Initially I was running the model with string individual and household identifiers (e.g. "1 1 2 21"). When I converted them to numeric values with no spaces (11221), a smaller data set (56,000 observations) was scanned in a few seconds. Should identifiers always be numeric for smooth and quick scanning of the data?
Your suggestions are eagerly awaited.

GeorgeLeckie · Post by **GeorgeLeckie** » Mon Sep 01, 2014 12:48 pm

Hi Ashish,

> The inclusion of the second level was probably making it difficult for the model to converge. I would love to hear your comments on this.

Yes, it might do if the level-2 variance is very close to zero. Or if your dataset is excessively unbalanced, for example lots of level-2 units with only 1 level-1 unit.
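
A quick way to check the second possibility is to tabulate household sizes. This is a rough sketch. It assumes hhid uniquely identifies households across the whole dataset (if household codes repeat across states, build a combined identifier first), and n_indiv and hh_tag are illustrative names.

```stata
* Number of individuals (level-1 units) in each household (level-2 unit)
bysort hhid: generate long n_indiv = _N

* Count each household once and tabulate the household sizes
egen byte hh_tag = tag(hhid)
tabulate n_indiv if hh_tag
```

If most households contribute only one individual, there is very little information with which to separate the household-level variance from the individual-level variation.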

> Should identifiers always be numeric for smooth and quick scanning of the data?

Yes, if your dataset is large, you would be better off making all variables numeric (especially the level IDs) in Stata before attempting to fit your model. This way you convert from string to numeric once, rather than repeatedly on each call of MLwiN.
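
One way to do the conversion is with egen's group() function, which assigns consecutive integer codes and works on string variables. This is only a sketch: hh_num and id_num are hypothetical names, and it assumes households are nested within states and individuals within households. Simply deleting the spaces from a string ID is riskier, because distinct IDs such as "1 1 2 21" and "11 2 21" would both become 11221.

```stata
* Integer household ID, unique within and across states
egen long hh_num = group(v024 hhid)

* Integer individual ID, unique across the whole dataset
egen long id_num = group(hh_num id2)

* Sort by the model hierarchy before fitting the model
sort v024 hh_num id_num
```

The new variables would then replace hhid and id2 in the level2() and level1() options.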

Best wishes

George

www.cmm.bristol.ac.uk/forum