Huge Time Taken for Hive Query to execute
WITH Employee AS
( SELECT E.Com_Id,
Com.Sinl_ID CLASS.Details_ID,
Employee E
INNER JOIN employee_Det Det ON E.Com_ID=Det.Com_ID
INNER JOIN Employee_Sig Sig ON E.Com_ID=Sig.Main_Com_ID
INNER JOIN Emp_Sig_Det Com Sig.Com_ID=Com.Com_ID
INNER JOIN Employee_Class CLASS ON Com.F_ID=Class.Class_Id
******N*******
INNER JOIN Employee_AB_Class ABC ON E.Com_ID=AB.Main_Com_Id
INNER JOIN Emp_AB_Class_Sig ABC_Sig ON AB.COM_ID=ABC_SIG.COM_ID
INNER JOIN Employee_Class Class1 ABC_Sig.F_ID=Class1.Class_Id
WHERE Det.ODate='2015-01-01'
AND Sig.ODate='2015-01-01'
AND Com.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND Abc.ODate='2015-01-01'
AND ABC_Sig.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND ),
DEPT AS
( SELECT E.Com_Id,
Com.Sinl_ID CLASS.Details_ID,
DEPT E
INNER JOIN DEPT_Det Det ON E.Com_ID=Det.Com_ID
INNER JOIN DEPT_Sig Sig ON E.Com_ID=Sig.Main_Com_ID
INNER JOIN DEPT_Sig_Det Com Sig.Com_ID=Com.Com_ID
INNER JOIN DEPT_Class CLASS ON Com.F_ID=Class.Class_Id
******N*******
INNER JOIN DEPT_AB_Class ABC ON E.Com_ID=AB.Main_Com_Id
INNER JOIN DEPT_AB_Class_Sig ABC_Sig ON AB.COM_ID=ABC_SIG.COM_ID
INNER JOIN DEPT_Class Class1 ABC_Sig.F_ID=Class1.Class_Id
WHERE Det.ODate='2015-01-01'
AND Sig.ODate='2015-01-01'
AND Com.Date='2015-01-01'
AND Class.ODate='2015-01-01'
AND Abc.ODate='2015-01-01'
AND ABC_Sig.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND ),
SELECT *
FROM Employee
UNION ALL
SELECT *
FROM DEPT;
Counts of the following Tables are too huge to be processed in
Hive, Can you please, provide an hint how to increase the performance
Employee Counts
Det - 15 Million
Sig - 15 Million
Com - 10 Million
Class - 10 Million
Abc.ODate - 1000 Records
ABC_Sig -10 Million
Class.ODate- 10 Million
Dept Counts
Det - 25 Million
Sig - 25 Million
Com -20 Million
Class - 20 Million
Abc.ODate - 1000 Records
ABC_Sig -50 Million
Class - 50 Million
sql hive
add a comment |
WITH Employee AS
( SELECT E.Com_Id,
Com.Sinl_ID CLASS.Details_ID,
Employee E
INNER JOIN employee_Det Det ON E.Com_ID=Det.Com_ID
INNER JOIN Employee_Sig Sig ON E.Com_ID=Sig.Main_Com_ID
INNER JOIN Emp_Sig_Det Com Sig.Com_ID=Com.Com_ID
INNER JOIN Employee_Class CLASS ON Com.F_ID=Class.Class_Id
******N*******
INNER JOIN Employee_AB_Class ABC ON E.Com_ID=AB.Main_Com_Id
INNER JOIN Emp_AB_Class_Sig ABC_Sig ON AB.COM_ID=ABC_SIG.COM_ID
INNER JOIN Employee_Class Class1 ABC_Sig.F_ID=Class1.Class_Id
WHERE Det.ODate='2015-01-01'
AND Sig.ODate='2015-01-01'
AND Com.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND Abc.ODate='2015-01-01'
AND ABC_Sig.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND ),
DEPT AS
( SELECT E.Com_Id,
Com.Sinl_ID CLASS.Details_ID,
DEPT E
INNER JOIN DEPT_Det Det ON E.Com_ID=Det.Com_ID
INNER JOIN DEPT_Sig Sig ON E.Com_ID=Sig.Main_Com_ID
INNER JOIN DEPT_Sig_Det Com Sig.Com_ID=Com.Com_ID
INNER JOIN DEPT_Class CLASS ON Com.F_ID=Class.Class_Id
******N*******
INNER JOIN DEPT_AB_Class ABC ON E.Com_ID=AB.Main_Com_Id
INNER JOIN DEPT_AB_Class_Sig ABC_Sig ON AB.COM_ID=ABC_SIG.COM_ID
INNER JOIN DEPT_Class Class1 ABC_Sig.F_ID=Class1.Class_Id
WHERE Det.ODate='2015-01-01'
AND Sig.ODate='2015-01-01'
AND Com.Date='2015-01-01'
AND Class.ODate='2015-01-01'
AND Abc.ODate='2015-01-01'
AND ABC_Sig.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND ),
SELECT *
FROM Employee
UNION ALL
SELECT *
FROM DEPT;
Counts of the following Tables are too huge to be processed in
Hive, Can you please, provide an hint how to increase the performance
Employee Counts
Det - 15 Million
Sig - 15 Million
Com - 10 Million
Class - 10 Million
Abc.ODate - 1000 Records
ABC_Sig -10 Million
Class.ODate- 10 Million
Dept Counts
Det - 25 Million
Sig - 25 Million
Com -20 Million
Class - 20 Million
Abc.ODate - 1000 Records
ABC_Sig -50 Million
Class - 50 Million
sql hive
Count you have posted, is it for a single day (as mentioned in where condition) or for whole table. by DefaultPPD
is set, if not try to set and see if it improves. Also try to post yourquery plan
.
– Gaurang Shah
Nov 11 at 15:48
This is for a single day and also date is partition column
– kalis
Nov 11 at 16:06
how about putting query plan and checking if ppd is set to true
– Gaurang Shah
Nov 11 at 16:24
What is ppd? Thanks
– kalis
Nov 11 at 17:19
putexplain
in front of your query run it and put the output here. also run following `hive -e "set hive.optimize.ppd" and out the output here.
– Gaurang Shah
Nov 11 at 17:29
add a comment |
WITH Employee AS
( SELECT E.Com_Id,
Com.Sinl_ID CLASS.Details_ID,
Employee E
INNER JOIN employee_Det Det ON E.Com_ID=Det.Com_ID
INNER JOIN Employee_Sig Sig ON E.Com_ID=Sig.Main_Com_ID
INNER JOIN Emp_Sig_Det Com Sig.Com_ID=Com.Com_ID
INNER JOIN Employee_Class CLASS ON Com.F_ID=Class.Class_Id
******N*******
INNER JOIN Employee_AB_Class ABC ON E.Com_ID=AB.Main_Com_Id
INNER JOIN Emp_AB_Class_Sig ABC_Sig ON AB.COM_ID=ABC_SIG.COM_ID
INNER JOIN Employee_Class Class1 ABC_Sig.F_ID=Class1.Class_Id
WHERE Det.ODate='2015-01-01'
AND Sig.ODate='2015-01-01'
AND Com.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND Abc.ODate='2015-01-01'
AND ABC_Sig.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND ),
DEPT AS
( SELECT E.Com_Id,
Com.Sinl_ID CLASS.Details_ID,
DEPT E
INNER JOIN DEPT_Det Det ON E.Com_ID=Det.Com_ID
INNER JOIN DEPT_Sig Sig ON E.Com_ID=Sig.Main_Com_ID
INNER JOIN DEPT_Sig_Det Com Sig.Com_ID=Com.Com_ID
INNER JOIN DEPT_Class CLASS ON Com.F_ID=Class.Class_Id
******N*******
INNER JOIN DEPT_AB_Class ABC ON E.Com_ID=AB.Main_Com_Id
INNER JOIN DEPT_AB_Class_Sig ABC_Sig ON AB.COM_ID=ABC_SIG.COM_ID
INNER JOIN DEPT_Class Class1 ABC_Sig.F_ID=Class1.Class_Id
WHERE Det.ODate='2015-01-01'
AND Sig.ODate='2015-01-01'
AND Com.Date='2015-01-01'
AND Class.ODate='2015-01-01'
AND Abc.ODate='2015-01-01'
AND ABC_Sig.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND ),
SELECT *
FROM Employee
UNION ALL
SELECT *
FROM DEPT;
Counts of the following Tables are too huge to be processed in
Hive, Can you please, provide an hint how to increase the performance
Employee Counts
Det - 15 Million
Sig - 15 Million
Com - 10 Million
Class - 10 Million
Abc.ODate - 1000 Records
ABC_Sig -10 Million
Class.ODate- 10 Million
Dept Counts
Det - 25 Million
Sig - 25 Million
Com -20 Million
Class - 20 Million
Abc.ODate - 1000 Records
ABC_Sig -50 Million
Class - 50 Million
sql hive
WITH Employee AS
( SELECT E.Com_Id,
Com.Sinl_ID CLASS.Details_ID,
Employee E
INNER JOIN employee_Det Det ON E.Com_ID=Det.Com_ID
INNER JOIN Employee_Sig Sig ON E.Com_ID=Sig.Main_Com_ID
INNER JOIN Emp_Sig_Det Com Sig.Com_ID=Com.Com_ID
INNER JOIN Employee_Class CLASS ON Com.F_ID=Class.Class_Id
******N*******
INNER JOIN Employee_AB_Class ABC ON E.Com_ID=AB.Main_Com_Id
INNER JOIN Emp_AB_Class_Sig ABC_Sig ON AB.COM_ID=ABC_SIG.COM_ID
INNER JOIN Employee_Class Class1 ABC_Sig.F_ID=Class1.Class_Id
WHERE Det.ODate='2015-01-01'
AND Sig.ODate='2015-01-01'
AND Com.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND Abc.ODate='2015-01-01'
AND ABC_Sig.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND ),
DEPT AS
( SELECT E.Com_Id,
Com.Sinl_ID CLASS.Details_ID,
DEPT E
INNER JOIN DEPT_Det Det ON E.Com_ID=Det.Com_ID
INNER JOIN DEPT_Sig Sig ON E.Com_ID=Sig.Main_Com_ID
INNER JOIN DEPT_Sig_Det Com Sig.Com_ID=Com.Com_ID
INNER JOIN DEPT_Class CLASS ON Com.F_ID=Class.Class_Id
******N*******
INNER JOIN DEPT_AB_Class ABC ON E.Com_ID=AB.Main_Com_Id
INNER JOIN DEPT_AB_Class_Sig ABC_Sig ON AB.COM_ID=ABC_SIG.COM_ID
INNER JOIN DEPT_Class Class1 ABC_Sig.F_ID=Class1.Class_Id
WHERE Det.ODate='2015-01-01'
AND Sig.ODate='2015-01-01'
AND Com.Date='2015-01-01'
AND Class.ODate='2015-01-01'
AND Abc.ODate='2015-01-01'
AND ABC_Sig.ODate='2015-01-01'
AND Class.ODate='2015-01-01'
AND ),
SELECT *
FROM Employee
UNION ALL
SELECT *
FROM DEPT;
Counts of the following Tables are too huge to be processed in
Hive, Can you please, provide an hint how to increase the performance
Employee Counts
Det - 15 Million
Sig - 15 Million
Com - 10 Million
Class - 10 Million
Abc.ODate - 1000 Records
ABC_Sig -10 Million
Class.ODate- 10 Million
Dept Counts
Det - 25 Million
Sig - 25 Million
Com -20 Million
Class - 20 Million
Abc.ODate - 1000 Records
ABC_Sig -50 Million
Class - 50 Million
sql hive
sql hive
edited Nov 11 at 15:45
Gaurang Shah
2,94611233
2,94611233
asked Nov 11 at 12:58
kalis
427
427
Count you have posted, is it for a single day (as mentioned in where condition) or for whole table. by DefaultPPD
is set, if not try to set and see if it improves. Also try to post yourquery plan
.
– Gaurang Shah
Nov 11 at 15:48
This is for a single day and also date is partition column
– kalis
Nov 11 at 16:06
how about putting query plan and checking if ppd is set to true
– Gaurang Shah
Nov 11 at 16:24
What is ppd? Thanks
– kalis
Nov 11 at 17:19
putexplain
in front of your query run it and put the output here. also run following `hive -e "set hive.optimize.ppd" and out the output here.
– Gaurang Shah
Nov 11 at 17:29
add a comment |
Count you have posted, is it for a single day (as mentioned in where condition) or for whole table. by DefaultPPD
is set, if not try to set and see if it improves. Also try to post yourquery plan
.
– Gaurang Shah
Nov 11 at 15:48
This is for a single day and also date is partition column
– kalis
Nov 11 at 16:06
how about putting query plan and checking if ppd is set to true
– Gaurang Shah
Nov 11 at 16:24
What is ppd? Thanks
– kalis
Nov 11 at 17:19
putexplain
in front of your query run it and put the output here. also run following `hive -e "set hive.optimize.ppd" and out the output here.
– Gaurang Shah
Nov 11 at 17:29
Count you have posted, is it for a single day (as mentioned in where condition) or for whole table. by Default
PPD
is set, if not try to set and see if it improves. Also try to post your query plan
.– Gaurang Shah
Nov 11 at 15:48
Count you have posted, is it for a single day (as mentioned in where condition) or for whole table. by Default
PPD
is set, if not try to set and see if it improves. Also try to post your query plan
.– Gaurang Shah
Nov 11 at 15:48
This is for a single day and also date is partition column
– kalis
Nov 11 at 16:06
This is for a single day and also date is partition column
– kalis
Nov 11 at 16:06
how about putting query plan and checking if ppd is set to true
– Gaurang Shah
Nov 11 at 16:24
how about putting query plan and checking if ppd is set to true
– Gaurang Shah
Nov 11 at 16:24
What is ppd? Thanks
– kalis
Nov 11 at 17:19
What is ppd? Thanks
– kalis
Nov 11 at 17:19
put
explain
in front of your query run it and put the output here. also run following `hive -e "set hive.optimize.ppd" and out the output here.– Gaurang Shah
Nov 11 at 17:29
put
explain
in front of your query run it and put the output here. also run following `hive -e "set hive.optimize.ppd" and out the output here.– Gaurang Shah
Nov 11 at 17:29
add a comment |
active
oldest
votes
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53248975%2fhuge-time-taken-for-hive-query-to-execute%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53248975%2fhuge-time-taken-for-hive-query-to-execute%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Count you have posted, is it for a single day (as mentioned in where condition) or for whole table. by Default
PPD
is set, if not try to set and see if it improves. Also try to post yourquery plan
.– Gaurang Shah
Nov 11 at 15:48
This is for a single day and also date is partition column
– kalis
Nov 11 at 16:06
how about putting query plan and checking if ppd is set to true
– Gaurang Shah
Nov 11 at 16:24
What is ppd? Thanks
– kalis
Nov 11 at 17:19
put
explain
in front of your query run it and put the output here. also run following `hive -e "set hive.optimize.ppd" and out the output here.– Gaurang Shah
Nov 11 at 17:29