Ensure that predictions are > 0 when using GridSearchCV


























I am using GridSearchCV to tune the hyperparameters of my regressor.
I score with mean_squared_log_error, and I would like to keep using it.



from sklearn.model_selection import GridSearchCV
import xgboost as xgb

gs = GridSearchCV(xgb.XGBRegressor(),
                  param_grid={'max_depth': range(5, 10)},
                  scoring='neg_mean_squared_log_error', cv=5, return_train_score=True)

gs.fit(X, y)


y is always positive, but some predictions made within the 5-fold grid search turn out to be negative (which I would not expect, since my target variable is always positive), and therefore I get the error message



ValueError: Mean Squared Logarithmic Error cannot be used when targets contain negative values.


because the scorer is trying to calculate the log of a negative number (the unfortunate prediction).



Is there a way to control the predictions inside the GridSearchCV? How would you tackle this issue?
































  • you could use the mean absolute error (MAE)

    – makaros
    Nov 15 '18 at 23:27











  • Or you could use a custom wrapper that wraps neg_mean_squared_log_error and clips the negative predictions to 0 before passing them to the scorer, if that makes sense.

    – Vivek Kumar
    Nov 16 '18 at 4:39
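
The wrapper suggested in the comment above could look roughly like the sketch below. The names clipped_msle and clipped_msle_scorer are hypothetical, and the sketch assumes that treating a negative prediction as 0 is acceptable for model selection.

```python
import numpy as np
from sklearn.metrics import make_scorer, mean_squared_log_error


def clipped_msle(y_true, y_pred):
    """MSLE computed after clipping negative predictions up to 0."""
    # Assumption: a negative prediction may be treated as 0 when scoring.
    y_pred = np.clip(y_pred, 0, None)
    return mean_squared_log_error(y_true, y_pred)


# greater_is_better=False makes the scorer return the negated MSLE,
# which matches the sign convention of 'neg_mean_squared_log_error'.
clipped_msle_scorer = make_scorer(clipped_msle, greater_is_better=False)

# A negative prediction no longer raises ValueError.
print(clipped_msle(np.array([1.0, 2.0]), np.array([-0.5, 2.0])))
```

Passing scoring=clipped_msle_scorer to GridSearchCV in place of the string 'neg_mean_squared_log_error' should then avoid the error. Note that clipping only changes how the folds are scored; the fitted model can still return negative values from predict.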

















python scikit-learn xgboost gridsearchcv msle
















asked Nov 15 '18 at 14:14









gabboshow

























1 Answer














If you know that your dependent variable (y) is always positive, you can use a loss function that constrains the predictions to the positive domain as well.



One example supported in XGBoost is Gamma regression (see reg:gamma). Alternatively, you can design your own loss function, such as the Mean Squared Log Error; in that case you would have to derive its first- and second-order derivatives.
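
As a rough illustration of the custom-loss route, here is a sketch of the gradient and Hessian of the per-sample loss 0.5 * (log1p(pred) - log1p(y))^2 with respect to the prediction. The function name squared_log_error_objective is hypothetical, and the (y_true, y_pred) signature assumes the callable is handed to the scikit-learn wrapper as its objective. The floor just above -1 is an added safeguard so that log1p stays defined.

```python
import numpy as np


def squared_log_error_objective(y_true, y_pred):
    """Gradient and Hessian of 0.5 * (log1p(y_pred) - log1p(y_true))**2."""
    # Safeguard (an assumption of this sketch): keep predictions above -1
    # so that log1p is defined.
    y_pred = np.maximum(y_pred, -1 + 1e-6)
    diff = np.log1p(y_pred) - np.log1p(y_true)
    grad = diff / (y_pred + 1)             # first derivative w.r.t. y_pred
    hess = (1 - diff) / (y_pred + 1) ** 2  # second derivative w.r.t. y_pred
    return grad, hess


# Assumed usage, not executed here because it needs xgboost installed:
#   xgb.XGBRegressor(objective=squared_log_error_objective)
grad, hess = squared_log_error_objective(np.array([1.0, 3.0]), np.array([0.5, 4.0]))
print(grad, hess)
```

The built-in alternative mentioned above needs no derivation: xgb.XGBRegressor(objective='reg:gamma') uses a log link, so its predictions are positive. Recent XGBoost releases also ship a built-in reg:squaredlogerror objective, which may make the hand-written version unnecessary.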






answered Nov 19 '18 at 14:17

Bar
