Classification Scores differ between H2O4GPU and Scikit-Learn
I've begun evaluating a random forest classifier using precision and recall. However, despite the train and test sets being identical for the CPU and GPU implementations of the classifier, I'm seeing differences in the returned evaluation scores. Is this a known bug within the library by chance?
Both code samples are below for reference.
Scikit-Learn (CPU)
from sklearn.metrics import recall_score, precision_score
from sklearn.ensemble import RandomForestClassifier
rf_cpu = RandomForestClassifier(n_estimators=5000, n_jobs=-1)
rf_cpu.fit(X_train, y_train)
rf_cpu_pred = rf_cpu.predict(X_test)
recall_score(rf_cpu_pred, y_test)
precision_score(rf_cpu_pred, y_test)
CPU Recall: 0.807186
CPU Precision: 0.82095
H2O4GPU (GPU)
from h2o4gpu.metrics import recall_score, precision_score
from h2o4gpu import RandomForestClassifier
rf_gpu = RandomForestClassifier(n_estimators=5000, n_gpus=1)
rf_gpu.fit(X_train, y_train)
rf_gpu_pred = rf_gpu.predict(X_test)
recall_score(rf_gpu_pred, y_test)
precision_score(rf_gpu_pred, y_test)
GPU Recall: 0.714286
GPU Precision: 0.809988
Tags: python, scikit-learn, random-forest, h2o, h2o4gpu
asked Nov 13 '18 at 22:42 – Greg
1 Answer
Correction: I realized the inputs for precision and recall were in the wrong order. The order is always (y_true, y_pred), per the Scikit-Learn documentation.
Corrected Evaluation Code
recall_score(y_test, rf_gpu_pred)
precision_score(y_test, rf_gpu_pred)
answered Nov 13 '18 at 23:33 – Greg
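For anyone wondering why the swapped arguments change the numbers rather than raising an error: swapping y_true and y_pred transposes the confusion matrix, so the value reported as "precision" is actually the recall, and vice versa. A minimal sketch with made-up labels (not the data from the question):
from sklearn.metrics import precision_score, recall_score

# Toy labels purely for illustration
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

# Correct order: (y_true, y_pred)
print(precision_score(y_true, y_pred))  # 0.666... (2 TP / 3 predicted positives)
print(recall_score(y_true, y_pred))     # 0.5      (2 TP / 4 actual positives)

# Swapped order: the two metrics trade places
print(precision_score(y_pred, y_true))  # 0.5      == recall_score(y_true, y_pred)
print(recall_score(y_pred, y_true))     # 0.666... == precision_score(y_true, y_pred)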
Did that change the values/results? – Andreas Mueller, Nov 14 '18 at 1:14
@AndreasMueller It actually did... for whatever reason, the Scikit-Learn numbers are now much lower (precision ~0.6, recall ~0.6). The H2O4GPU numbers are a touch higher. – Greg, Nov 14 '18 at 17:27
Can you try using the same scoring function both times, just to make sure? But I imagine the main difference is in the hyperparameters. – Andreas Mueller, Nov 15 '18 at 2:02
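Following up on that suggestion, a minimal sketch of a like-for-like evaluation, assuming X_train, y_train, X_test, y_test and the fitted rf_cpu and rf_gpu models from the question: score both sets of predictions with the same Scikit-Learn metric functions, in the correct (y_true, y_pred) order, so any remaining gap comes from the models themselves (defaults, seeds, hyperparameters) rather than from the evaluation code.
from sklearn.metrics import precision_score, recall_score

# Predictions from the models fitted in the question
rf_cpu_pred = rf_cpu.predict(X_test)
rf_gpu_pred = rf_gpu.predict(X_test)

# Identical metric calls for both models, with y_true first
for name, pred in [("CPU (scikit-learn)", rf_cpu_pred), ("GPU (h2o4gpu)", rf_gpu_pred)]:
    print(name,
          "precision=%.4f" % precision_score(y_test, pred),
          "recall=%.4f" % recall_score(y_test, pred))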