R Binomial Regression









I am trying to develop a binomial model in R.



I want to use a formula that looks like this: VAL = X0 + b1 * X1 + b2 * X2



Where X0, X1, and X2 are variables in my data frame and b1 and b2 are the coefficients I want to develop. I want the target value Y to be TRUE/1 if this formula produces a VAL > 0 and FALSE/0 if it produces a VAL < 0.



Sample data with b1 and b2 set to 1:

Target   X0   X1   X2  VAL  Result
     1   86  -54   17   49       1
     0    0  -54   17  -37       0
     1   40  -15   23   48       1
     0   50  -20  -25    5       1
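
The VAL and Result columns follow directly from the formula. A small sketch that recomputes them from the four sample rows (the data frame name `dat` is just an illustrative choice, not something from my real code):

```r
# Sample rows from the question; b1 and b2 are both set to 1 for illustration.
dat <- data.frame(
  Target = c(1, 0, 1, 0),
  X0     = c(86, 0, 40, 50),
  X1     = c(-54, -54, -15, -20),
  X2     = c(17, 17, 23, -25)
)
b1 <- 1
b2 <- 1

# X0 enters with a fixed coefficient of 1; only b1 and b2 are free to change.
dat$VAL    <- dat$X0 + b1 * dat$X1 + b2 * dat$X2
dat$Result <- as.integer(dat$VAL > 0)

print(dat)
```

With these values the last row has VAL = 5, so Result is 1 even though Target is 0.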



I want the value of X0 to be incorporated in the prediction, but I do not want this variable to have a coefficient (as this is a predefined formula that I can't change).



The reason I need X0 in the model is that if X1 and X2 are equal for two observations that have different X0 values (as in the first two observations), I want to reflect that in my formula. One observation's X0 could cause VAL to be negative and the other observation's X0 could cause VAL to be positive, but this would not be reflected if X0 were left out of the model completely. Also note the last observation, for which I would need to increase b1 or b2 so that VAL is negative and the result is 0 (which the model could not learn without seeing X0).



I am currently using a call that looks like glm("Y~X0+X1+X2", family = binomial(link = "logit")), but this model estimates a coefficient for X0. How would I fit a model that forces X0 to have no estimated coefficient?










  • 1
    What do you mean you want X0 incorporated in the prediction but not have a coefficient? Having a coefficient is what allows you to use X0 in the prediction.
    – mickey
    Nov 9 at 18:03

  • Are you looking for a model without an intercept?
    – Harro Cyranka
    Nov 9 at 18:05

  • 1
    I don't understand how you can force something to be in the model without a coefficient. If you don't want it to have a coefficient, don't put it in the model. Maybe it would be easier to help with a proper reproducible example with sample input and the desired output.
    – MrFlick
    Nov 9 at 18:11

  • You can fit a no-intercept model like this: glm(status==1~0+age, data =lung, family = binomial). However, I highly recommend having sufficient justification for this; see this post: stats.stackexchange.com/questions/260209/…
    – Mike
    Nov 9 at 19:05

  • @mickey I edited my question to be more specific. To be clear, I don't mind having an intercept on top of X0 (if that is possible). I just don't want the model to develop a coefficient for X0 (making the model too dependent on X0), since I will not be able to use that coefficient in my predictions due to external restrictions.
    – Sarah Langford
    Nov 9 at 19:19














Tagged: r

asked Nov 9 at 18:02 by Sarah Langford, edited Nov 9 at 19:06







1 Answer
It looks like what you want is to have the coefficient for X0 be zero. If you can't change the formula (to omit X0), you could change the data. Here's an example:



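set.seed(1)  # added so the simulated data are reproducible
n <- 1000
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df0 <- df
df0$x2 <- 0  # zero out x2 so its coefficient cannot be estimated

y <- 0.5 + 1.5 * df$x1 - 1.0 * df$x2 + rnorm(n, 0, 0.1)

mod1 <- lm(y ~ x1, data = df)        # x2 left out of the formula
mod2 <- lm(y ~ x1 + x2, data = df)   # x2 gets an estimated coefficient
mod3 <- lm(y ~ x1 + x2, data = df0)  # x2 is constant, so its coefficient is NA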


It sounds like mod1 is what you want, but since you can't change the formula, you're stuck with mod2 or mod3. mod2 won't work since it will give an estimate for x2. mod3 is the same as mod1 except that the coefficient for x2 will be NA; the intercept and x1 will have the same coefficients.
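
A quick way to check this is to compare the two fits directly. The sketch below is self-contained and re-creates the simulated data; the seed is arbitrary and only there for reproducibility:

```r
set.seed(42)  # arbitrary seed, an assumption added for reproducibility
n   <- 1000
df  <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df0 <- transform(df, x2 = 0)  # same data with x2 zeroed out
y   <- 0.5 + 1.5 * df$x1 - 1.0 * df$x2 + rnorm(n, 0, 0.1)

mod1 <- lm(y ~ x1, data = df)
mod3 <- lm(y ~ x1 + x2, data = df0)

coef(mod3)  # the x2 entry is reported as NA

# The intercept and x1 estimates agree between the two fits.
same_coefs  <- all.equal(coef(mod1), coef(mod3)[c("(Intercept)", "x1")])
same_fitted <- all.equal(fitted(mod1), fitted(mod3))
```

Both `same_coefs` and `same_fitted` should come back TRUE, since the zeroed column carries no information.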



Having the coefficient for x2 be NA is comparable to having it be zero. The predictions from mod1 and mod3 will be the same, but mod3 does throw a warning.
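
Note that a zero or NA coefficient removes X0 from the prediction altogether. If the goal is instead the question's VAL = X0 + b1 * X1 + b2 * X2, where X0 contributes with its coefficient fixed at 1, then R's offset() in the model formula may be closer to what is wanted. A minimal sketch on simulated data; the seed, the sample size, and the "true" values 0.8 and -0.5 are assumptions for illustration only:

```r
set.seed(1)
n  <- 500
X0 <- rnorm(n)
X1 <- rnorm(n)
X2 <- rnorm(n)

# Simulate Y from a logistic model in which X0 has coefficient exactly 1.
eta <- X0 + 0.8 * X1 - 0.5 * X2
Y   <- rbinom(n, size = 1, prob = plogis(eta))
dat <- data.frame(Y, X0, X1, X2)

# offset(X0) adds X0 to the linear predictor without estimating a coefficient.
# "0 +" drops the intercept to match VAL = X0 + b1*X1 + b2*X2;
# remove it if an intercept is acceptable.
fit <- glm(Y ~ 0 + X1 + X2 + offset(X0),
           family = binomial(link = "logit"), data = dat)

coef(fit)  # only X1 and X2 appear; X0 has no estimated coefficient

# newdata must contain X0 because the offset is part of the formula.
p    <- predict(fit, newdata = dat, type = "response")
pred <- as.integer(p > 0.5)  # p > 0.5 corresponds to VAL > 0 on the link scale
```

The fitted linear predictor is then X0 plus the estimated b1 * X1 + b2 * X2, which is the shape of the predefined formula. Because the coefficient on X0 is pinned at 1, the scale of X0 matters: it should already be on the log-odds scale for this to be sensible.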






answered Nov 9 at 19:40 by mickey



























             
