statsmodels OLS giving a TypeError in Python
I am trying to fit a set of features with statsmodels' OLS linear regression model.
I am adding a few features at a time. With the first two features it works fine, but when I keep adding new features it raises an error.
Traceback (most recent call last):
File "read_xml.py", line 337, in <module>
model = sm.OLS(Y, X).fit()
...
File "D:pythonprojectstestprojtest_env\lib\site-packages\statsmodels\base\data.py", line 132, in _handle_constant
if not np.isfinite(ptp_).all():
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
So I cast the input to float using
X = X.astype(float)
Then a different error appears.
Traceback (most recent call last):
File "read_xml.py", line 339, in <module>
print(model.summary())
...
File "D:pythonprojectstestprojtest_env\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1824, in sf
place(output, (1-cond0)+np.isnan(x), self.badvalue)
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
My code looks like this.
new_df0 = pd.concat([df_lex[0], summary_df[0]], axis = 0, join = 'inner')
new_df1 = pd.concat([df_lex[1], summary_df[1]], axis = 0, join = 'inner')
data = pd.concat([new_df0, new_df1], axis = 1)
print(data.shape)
X = data.values[0:6,:]
Y = data.values[6,:]
Y = Y.reshape(1,88)
X = X.T
Y = Y.T
X = X.astype(float)
model = sm.OLS(Y, X).fit()
predictions = model.predict(X)
print(model.summary())
The first error is triggered by model = sm.OLS(Y, X).fit().
The second error is triggered by model.summary().
But with some other features, there are no errors.
new_df0 = pd.concat([df_len[0], summary_df[0]], axis = 0, join = 'inner')
new_df1 = pd.concat([df_len[1], summary_df[1]], axis = 0, join = 'inner')
data = pd.concat([new_df0, new_df1], axis = 1)
print(data.shape)
X = data.values[0:2,:]
Y = data.values[2,:]
Y = Y.reshape(1,88)
X = X.T
Y = Y.T
X = X.astype(float)
print(X.shape)
print(Y.shape)
model = sm.OLS(Y, X).fit()
predictions = model.predict(X)
print(model.summary())
It looks like it works when I have only two features, but when six different features are added it gives the errors. My main concern is to understand the error: I have read similar questions related to plots in Python, but here the error is raised inside library functions, not in my code. Any suggestions for debugging are highly appreciated.
python python-3.x statsmodels sklearn-pandas
asked Nov 14 '18 at 19:16 by akalanka
One thought: what does data.dtypes show? It looks like something that is not an array-like object is getting passed to the np.isfinite and/or np.isnan functions.
– jtweeder
Nov 16 '18 at 18:23
I found a solution when I let one of my friends look into my code. I was only considering X as input and forgetting Y altogether. Y was just 1/0. He proposed casting Y with astype(float) as well, and the model is working again.
– akalanka
Nov 18 '18 at 18:37
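The dtype question raised in these comments can be checked with a minimal sketch using NumPy only. The array `values` is a hypothetical stand-in for `data.values` (the real data is not shown in the question), and the sketch assumes the DataFrame held mixed Python objects, so that `.values` comes back with dtype object. `isfinite_fails` is an illustrative helper, not a library function.

```python
import numpy as np

# Hypothetical stand-in for data.values: an object array, which is what
# pandas returns when the columns do not share one numeric dtype.
values = np.array([[0.5, 1.5, 2.5, 3.5],
                   [1, 0, 1, 0]], dtype=object)

X = values[0:1, :].T.astype(float)  # cast to float, as in the question
Y = values[1, :].reshape(1, 4).T    # never cast, so still dtype object

print(X.dtype, Y.dtype)


def isfinite_fails(arr):
    """Return True if np.isfinite rejects the array's dtype."""
    try:
        np.isfinite(arr)
    except TypeError:
        return True
    return False


print(isfinite_fails(X))                # float array: the ufunc is supported
print(isfinite_fails(Y))                # object array: raises the TypeError
print(isfinite_fails(Y.astype(float)))  # after casting, the check passes
```

Printing the dtype of both X and Y before fitting is a cheap way to see which of the two inputs still carries dtype object.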
2 Answers
Casting Y to float as well did the trick:
Y = Y.astype(float)
answered Nov 18 '18 at 18:37 by akalanka
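A possible explanation of why the second error only showed up in model.summary(), sketched with NumPy alone. It assumes that the coefficients come from a matrix product involving Y (here the pseudoinverse of X times Y); that is an assumption about the fitting step, not something shown in the question. The data is made up, and `least_squares_coefs` and `isnan_fails` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(88, 6))  # made-up float features

# 1/0 labels stored as Python ints in an object array, like the uncast Y
Y_obj = np.array([i % 2 for i in range(88)], dtype=object).reshape(88, 1)


def least_squares_coefs(X, Y):
    """Ordinary least squares via the pseudoinverse (illustrative only)."""
    return np.dot(np.linalg.pinv(X), Y)


def isnan_fails(arr):
    """Return True if np.isnan rejects the array's dtype."""
    try:
        np.isnan(arr)
    except TypeError:
        return True
    return False


# The product succeeds, but the object dtype of Y carries over to the result,
# so a later np.isnan call on anything derived from it raises TypeError.
coefs_obj = least_squares_coefs(X, Y_obj)
print(coefs_obj.dtype, isnan_fails(coefs_obj))

# Casting Y first keeps everything in float64.
coefs = least_squares_coefs(X, Y_obj.astype(float))
print(coefs.dtype, isnan_fails(coefs))
```

Under that assumption the fit itself goes through with an object-dtype Y, and the failure is deferred until something downstream calls a float-only ufunc such as np.isnan on the results.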
Please use
model = sm.OLS(df.Y, df.X, missing='drop').fit()
It looks like there is a NaN value in some variable. By default missing is 'none', so no NaN checking is done, and this might be the cause.
answered Nov 15 '18 at 13:17 by sukhbinder
It still gives me the same 'isnan' error at model.summary(). So I wonder whether this is related to the output of sm.OLS(...) because some of my input values are NaNs.
– akalanka
Nov 15 '18 at 19:04
Assuming I have NaNs in my input feature dataframe, I used df_lex[i].replace([np.inf, -np.inf, np.nan], x) to replace them with x, where x was 0 or 0.0001 (a small value). Still the same error.
– akalanka
Nov 15 '18 at 21:49
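These comments report that replacing NaNs did not help, which is consistent with the problem being the dtype rather than missing values. A small sketch of a check that separates the two cases; `diagnose` is a hypothetical helper, not part of statsmodels or NumPy.

```python
import numpy as np


def diagnose(arr):
    """Classify an input array before passing it to a model."""
    arr = np.asarray(arr)
    # Object, string, and bool arrays are not numeric dtypes.
    if not np.issubdtype(arr.dtype, np.number):
        return f"non-numeric dtype {arr.dtype}: cast with astype(float)"
    # Only a numeric array can be checked for NaN or inf values.
    if not np.isfinite(arr).all():
        return "contains NaN or inf"
    return "ok"


print(diagnose(np.array([1, 0, 1], dtype=object)))  # dtype problem
print(diagnose(np.array([1.0, np.nan, 0.0])))       # missing-value problem
print(diagnose(np.array([1.0, 0.0, 1.0])))          # nothing to fix
```

Running such a check on both X and Y distinguishes the case where missing='drop' or a NaN replacement could help from the case where only a cast to float will.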