Overlapping communication and computation issue

























I am a student whose research involves using MPI and OpenACC to accelerate our in-house CFD code on multiple GPUs. I am using Open MPI 2.0.0 and the PGI 17.5 compiler. I am now facing a significant issue related to the "progression of operations in MPI". When running on multiple GPUs, the communication overhead in my code is very large, so I want to overlap the communication between the hosts (CPUs) with the computation on the devices (GPUs). In practice, however, the actual communication often starts only after the computation finishes. So even though I structured my code for overlap, no overlap occurs, because Open MPI does not provide asynchronous progression. I have also done a lot of testing, including inserting MPI_Test calls to probe the requests, and found that MPI usually makes progress (i.e., actually sends or receives the data) only while I am blocked in a call to MPI_Wait, at which point no overlap is possible at all. My goal is to use overlap to hide the communication latency and thus improve the performance of my code. Can anyone offer suggestions? I would greatly appreciate your help!
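For reference, here is a minimal sketch of the pattern I am attempting (buffer names, sizes, and the interior-compute stub are hypothetical placeholders; in the real code the interior update is an OpenACC kernel launched with the async clause, and the program must be built with mpicc and launched with mpirun): post the nonblocking halo exchange, compute the interior in chunks, and poke the MPI progress engine with MPI_Testall between chunks.

```c
/* Sketch of the intended overlap pattern (hypothetical buffers/sizes).
 * Build: mpicc overlap.c -o overlap ; run: mpirun -np 2 ./overlap */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define HALO_COUNT 1024
#define N_CHUNKS   8

/* Stand-in for one chunk of the interior computation; on the GPU this
 * would be an OpenACC kernel launched with the async clause. */
static void compute_interior_chunk(double *u, int n)
{
    for (int i = 0; i < n; ++i)
        u[i] = u[i] * 0.5 + 1.0;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *send_buf = malloc(HALO_COUNT * sizeof(double));
    double *recv_buf = malloc(HALO_COUNT * sizeof(double));
    double *interior = malloc(HALO_COUNT * sizeof(double));
    for (int i = 0; i < HALO_COUNT; ++i)
        send_buf[i] = interior[i] = (double)rank;

    int left  = (rank - 1 + size) % size;   /* ring exchange */
    int right = (rank + 1) % size;
    MPI_Request reqs[2];

    /* 1. Post the nonblocking halo exchange up front. */
    MPI_Irecv(recv_buf, HALO_COUNT, MPI_DOUBLE, left,  0,
              MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(send_buf, HALO_COUNT, MPI_DOUBLE, right, 0,
              MPI_COMM_WORLD, &reqs[1]);

    /* 2. Overlap: compute the interior in chunks and call MPI_Testall
     *    between chunks, since without a progress thread the transfers
     *    may only advance while the code is inside an MPI call. */
    int flag = 0;
    for (int chunk = 0; chunk < N_CHUNKS; ++chunk) {
        compute_interior_chunk(interior, HALO_COUNT);
        MPI_Testall(2, reqs, &flag, MPI_STATUSES_IGNORE);
    }

    /* 3. Ensure both transfers completed before the boundary update. */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    if (rank == 0)
        printf("halo received from rank %d\n", left);

    free(send_buf); free(recv_buf); free(interior);
    MPI_Finalize();
    return 0;
}
```

Even written this way, the transfers only advance during the MPI_Testall calls, which is exactly the behavior I am seeing.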



Best,



Cheng
































  • There is no progress thread in Open MPI 2.0.0. Open MPI 4.0.0 was released yesterday, and I recommend you give it a try. By the way, which interconnect are you using?

    – Gilles Gouaillardet
    Nov 13 '18 at 21:30











  • Hi Gilles, thank you very much for your reply! It looks like Open MPI 4.0.0 is not installed on the cluster. 100 Gbps EDR InfiniBand is used for low-latency MPI traffic between compute nodes (I am using the NewRiver cluster at my university: arc.vt.edu/computing/newriver).

    – Cheng
    Nov 13 '18 at 22:03











  • The good news is you can install and use Open MPI as a user in your home directory !

    – Gilles Gouaillardet
    Nov 13 '18 at 22:41











  • Gilles, Got it! I'll give it a try. Thank you so much for your help!

    – Cheng
    Nov 14 '18 at 0:26
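For anyone following along, a user-local build of the kind Gilles suggests typically looks like the following (the version number, download URL, and install prefix are assumptions; check the Open MPI download page for the current release):

```shell
# Build and install Open MPI into the home directory (no root needed).
# Version/URL/prefix are assumptions; adjust to the current release.
wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.0.tar.bz2
tar xjf openmpi-4.0.0.tar.bz2
cd openmpi-4.0.0
./configure --prefix=$HOME/opt/openmpi-4.0.0
make -j 8 install
# Put the user-local install first on the paths before compiling/running.
export PATH=$HOME/opt/openmpi-4.0.0/bin:$PATH
export LD_LIBRARY_PATH=$HOME/opt/openmpi-4.0.0/lib:$LD_LIBRARY_PATH
```

After exporting the paths, `which mpicc` and `mpirun --version` should report the home-directory install rather than the system one.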
















Tags: mpi openmpi







asked Nov 13 '18 at 21:24 by Cheng











