How to count duplicates by multiple records in subtable in MySQL
Say I have the following table structure:
products
id | name | price
products_ean
id | product_id | ean
A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.
What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?
I've tried something like the following, but that makes the query really slower:
SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12
mysql mysqli
add a comment |
Say I have the following table structure:
products
id | name | price
products_ean
id | product_id | ean
A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.
What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?
I've tried something like the following, but that makes the query really slower:
SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12
mysql mysqli
Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.
– Madhur Bhaiya
Nov 14 '18 at 9:31
add a comment |
Say I have the following table structure:
products
id | name | price
products_ean
id | product_id | ean
A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.
What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?
I've tried something like the following, but that makes the query really slower:
SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12
mysql mysqli
Say I have the following table structure:
products
id | name | price
products_ean
id | product_id | ean
A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.
What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?
I've tried something like the following, but that makes the query really slower:
SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12
mysql mysqli
mysql mysqli
asked Nov 14 '18 at 9:10
user2191227user2191227
14010
14010
Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.
– Madhur Bhaiya
Nov 14 '18 at 9:31
add a comment |
Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.
– Madhur Bhaiya
Nov 14 '18 at 9:31
Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.
– Madhur Bhaiya
Nov 14 '18 at 9:31
Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.
– Madhur Bhaiya
Nov 14 '18 at 9:31
add a comment |
1 Answer
1
active
oldest
votes
Using joins is the simplest way to generate related information. I've GROUP BY
the product.id which means the eans
are the aggregated field because those are the only ones that can duplicate. I've added HAVING
part after the query to select only those results with 2 or more (its optional).
SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2
On query efficiency, having the product_id,ean as a composite primary key for the products_ean
table is probably most efficient. Since that is unique its not obvious why the products_ean.id
column is needed.
I guess OP is trying to find thoseean
values which exist in multiple product(s) ?
– Madhur Bhaiya
Nov 14 '18 at 9:30
I didn't read it that way, however the output wasn't specified.GROUP_CONCAT(ean) as ean_list
could be added to theSELECT
results.
– danblack
Nov 14 '18 at 9:33
1
maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.
– Madhur Bhaiya
Nov 14 '18 at 9:34
The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index onproduct_id
+ean
, and a regular index onean
. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)
– user2191227
Nov 14 '18 at 12:13
@danblack one optimization I can think of is changingleft join
tojoin
. OP has clarified. so +1
– Madhur Bhaiya
Nov 14 '18 at 17:13
|
show 1 more comment
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53296514%2fhow-to-count-duplicates-by-multiple-records-in-subtable-in-mysql%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Using joins is the simplest way to generate related information. I've GROUP BY
the product.id which means the eans
are the aggregated field because those are the only ones that can duplicate. I've added HAVING
part after the query to select only those results with 2 or more (its optional).
SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2
On query efficiency, having the product_id,ean as a composite primary key for the products_ean
table is probably most efficient. Since that is unique its not obvious why the products_ean.id
column is needed.
I guess OP is trying to find thoseean
values which exist in multiple product(s) ?
– Madhur Bhaiya
Nov 14 '18 at 9:30
I didn't read it that way, however the output wasn't specified.GROUP_CONCAT(ean) as ean_list
could be added to theSELECT
results.
– danblack
Nov 14 '18 at 9:33
1
maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.
– Madhur Bhaiya
Nov 14 '18 at 9:34
The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index onproduct_id
+ean
, and a regular index onean
. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)
– user2191227
Nov 14 '18 at 12:13
@danblack one optimization I can think of is changingleft join
tojoin
. OP has clarified. so +1
– Madhur Bhaiya
Nov 14 '18 at 17:13
|
show 1 more comment
Using joins is the simplest way to generate related information. I've GROUP BY
the product.id which means the eans
are the aggregated field because those are the only ones that can duplicate. I've added HAVING
part after the query to select only those results with 2 or more (its optional).
SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2
On query efficiency, having the product_id,ean as a composite primary key for the products_ean
table is probably most efficient. Since that is unique its not obvious why the products_ean.id
column is needed.
I guess OP is trying to find thoseean
values which exist in multiple product(s) ?
– Madhur Bhaiya
Nov 14 '18 at 9:30
I didn't read it that way, however the output wasn't specified.GROUP_CONCAT(ean) as ean_list
could be added to theSELECT
results.
– danblack
Nov 14 '18 at 9:33
1
maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.
– Madhur Bhaiya
Nov 14 '18 at 9:34
The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index onproduct_id
+ean
, and a regular index onean
. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)
– user2191227
Nov 14 '18 at 12:13
@danblack one optimization I can think of is changingleft join
tojoin
. OP has clarified. so +1
– Madhur Bhaiya
Nov 14 '18 at 17:13
|
show 1 more comment
Using joins is the simplest way to generate related information. I've GROUP BY
the product.id which means the eans
are the aggregated field because those are the only ones that can duplicate. I've added HAVING
part after the query to select only those results with 2 or more (its optional).
SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2
On query efficiency, having the product_id,ean as a composite primary key for the products_ean
table is probably most efficient. Since that is unique its not obvious why the products_ean.id
column is needed.
Using joins is the simplest way to generate related information. I've GROUP BY
the product.id which means the eans
are the aggregated field because those are the only ones that can duplicate. I've added HAVING
part after the query to select only those results with 2 or more (its optional).
SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2
On query efficiency, having the product_id,ean as a composite primary key for the products_ean
table is probably most efficient. Since that is unique its not obvious why the products_ean.id
column is needed.
edited Nov 14 '18 at 20:45
answered Nov 14 '18 at 9:24
danblackdanblack
2,5241319
2,5241319
I guess OP is trying to find thoseean
values which exist in multiple product(s) ?
– Madhur Bhaiya
Nov 14 '18 at 9:30
I didn't read it that way, however the output wasn't specified.GROUP_CONCAT(ean) as ean_list
could be added to theSELECT
results.
– danblack
Nov 14 '18 at 9:33
1
maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.
– Madhur Bhaiya
Nov 14 '18 at 9:34
The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index onproduct_id
+ean
, and a regular index onean
. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)
– user2191227
Nov 14 '18 at 12:13
@danblack one optimization I can think of is changingleft join
tojoin
. OP has clarified. so +1
– Madhur Bhaiya
Nov 14 '18 at 17:13
|
show 1 more comment
I guess OP is trying to find thoseean
values which exist in multiple product(s) ?
– Madhur Bhaiya
Nov 14 '18 at 9:30
I didn't read it that way, however the output wasn't specified.GROUP_CONCAT(ean) as ean_list
could be added to theSELECT
results.
– danblack
Nov 14 '18 at 9:33
1
maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.
– Madhur Bhaiya
Nov 14 '18 at 9:34
The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index onproduct_id
+ean
, and a regular index onean
. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)
– user2191227
Nov 14 '18 at 12:13
@danblack one optimization I can think of is changingleft join
tojoin
. OP has clarified. so +1
– Madhur Bhaiya
Nov 14 '18 at 17:13
I guess OP is trying to find those
ean
values which exist in multiple product(s) ?– Madhur Bhaiya
Nov 14 '18 at 9:30
I guess OP is trying to find those
ean
values which exist in multiple product(s) ?– Madhur Bhaiya
Nov 14 '18 at 9:30
I didn't read it that way, however the output wasn't specified.
GROUP_CONCAT(ean) as ean_list
could be added to the SELECT
results.– danblack
Nov 14 '18 at 9:33
I didn't read it that way, however the output wasn't specified.
GROUP_CONCAT(ean) as ean_list
could be added to the SELECT
results.– danblack
Nov 14 '18 at 9:33
1
1
maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.
– Madhur Bhaiya
Nov 14 '18 at 9:34
maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.
– Madhur Bhaiya
Nov 14 '18 at 9:34
The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on
product_id
+ean
, and a regular index on ean
. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)– user2191227
Nov 14 '18 at 12:13
The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on
product_id
+ean
, and a regular index on ean
. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)– user2191227
Nov 14 '18 at 12:13
@danblack one optimization I can think of is changing
left join
to join
. OP has clarified. so +1– Madhur Bhaiya
Nov 14 '18 at 17:13
@danblack one optimization I can think of is changing
left join
to join
. OP has clarified. so +1– Madhur Bhaiya
Nov 14 '18 at 17:13
|
show 1 more comment
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53296514%2fhow-to-count-duplicates-by-multiple-records-in-subtable-in-mysql%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.
– Madhur Bhaiya
Nov 14 '18 at 9:31