How to count duplicates by multiple records in subtable in MySQL










0















Say I have the following table structure:



products
id | name | price

products_ean
id | product_id | ean


A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.



What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?



I've tried something like the following, but that makes the query really slower:



SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12









share|improve this question






















  • Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.

    – Madhur Bhaiya
    Nov 14 '18 at 9:31
















0















Say I have the following table structure:



products
id | name | price

products_ean
id | product_id | ean


A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.



What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?



I've tried something like the following, but that makes the query really slower:



SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12









share|improve this question






















  • Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.

    – Madhur Bhaiya
    Nov 14 '18 at 9:31














0












0








0








Say I have the following table structure:



products
id | name | price

products_ean
id | product_id | ean


A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.



What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?



I've tried something like the following, but that makes the query really slower:



SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12









share|improve this question














Say I have the following table structure:



products
id | name | price

products_ean
id | product_id | ean


A product can (unfortunately) have multiple EAN numbers. Two products can have one or more of the same EAN numbers.



What is the best practice to count the amount of duplicate products by comparing multiple EAN numbers from the products_ean table?



I've tried something like the following, but that makes the query really slower:



SELECT
`products`.`name`,
(
SELECT
COUNT(*)
FROM
`products_ean`
WHERE
`ean` IN(
SELECT
`ean`
FROM
`products_ean`
WHERE
`product_id` = `products`.`id`
) AND `products_ean`.`product_id` != `products`.`id`
GROUP BY `product_id`
) AS `ProductEANCount`
FROM
`products`
LIMIT 12






mysql mysqli






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 14 '18 at 9:10









user2191227user2191227

14010




14010












  • Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.

    – Madhur Bhaiya
    Nov 14 '18 at 9:31


















  • Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.

    – Madhur Bhaiya
    Nov 14 '18 at 9:31

















Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.

– Madhur Bhaiya
Nov 14 '18 at 9:31






Are you trying to find those ean values which exist in multiple product(s) ? Some sample data and expected output would really be helpful.

– Madhur Bhaiya
Nov 14 '18 at 9:31













1 Answer
1






active

oldest

votes


















2














Using joins is the simplest way to generate related information. I've GROUP BY the product.id which means the eans are the aggregated field because those are the only ones that can duplicate. I've added HAVING part after the query to select only those results with 2 or more (its optional).



SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2


On query efficiency, having the product_id,ean as a composite primary key for the products_ean table is probably most efficient. Since that is unique its not obvious why the products_ean.id column is needed.






share|improve this answer

























  • I guess OP is trying to find those ean values which exist in multiple product(s) ?

    – Madhur Bhaiya
    Nov 14 '18 at 9:30











  • I didn't read it that way, however the output wasn't specified. GROUP_CONCAT(ean) as ean_list could be added to the SELECT results.

    – danblack
    Nov 14 '18 at 9:33






  • 1





    maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.

    – Madhur Bhaiya
    Nov 14 '18 at 9:34











  • The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on product_id+ean, and a regular index on ean. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)

    – user2191227
    Nov 14 '18 at 12:13












  • @danblack one optimization I can think of is changing left join to join. OP has clarified. so +1

    – Madhur Bhaiya
    Nov 14 '18 at 17:13










Your Answer






StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53296514%2fhow-to-count-duplicates-by-multiple-records-in-subtable-in-mysql%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









2














Using joins is the simplest way to generate related information. I've GROUP BY the product.id which means the eans are the aggregated field because those are the only ones that can duplicate. I've added HAVING part after the query to select only those results with 2 or more (its optional).



SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2


On query efficiency, having the product_id,ean as a composite primary key for the products_ean table is probably most efficient. Since that is unique its not obvious why the products_ean.id column is needed.






share|improve this answer

























  • I guess OP is trying to find those ean values which exist in multiple product(s) ?

    – Madhur Bhaiya
    Nov 14 '18 at 9:30











  • I didn't read it that way, however the output wasn't specified. GROUP_CONCAT(ean) as ean_list could be added to the SELECT results.

    – danblack
    Nov 14 '18 at 9:33






  • 1





    maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.

    – Madhur Bhaiya
    Nov 14 '18 at 9:34











  • The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on product_id+ean, and a regular index on ean. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)

    – user2191227
    Nov 14 '18 at 12:13












  • @danblack one optimization I can think of is changing left join to join. OP has clarified. so +1

    – Madhur Bhaiya
    Nov 14 '18 at 17:13















2














Using joins is the simplest way to generate related information. I've GROUP BY the product.id which means the eans are the aggregated field because those are the only ones that can duplicate. I've added HAVING part after the query to select only those results with 2 or more (its optional).



SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2


On query efficiency, having the product_id,ean as a composite primary key for the products_ean table is probably most efficient. Since that is unique its not obvious why the products_ean.id column is needed.






share|improve this answer

























  • I guess OP is trying to find those ean values which exist in multiple product(s) ?

    – Madhur Bhaiya
    Nov 14 '18 at 9:30











  • I didn't read it that way, however the output wasn't specified. GROUP_CONCAT(ean) as ean_list could be added to the SELECT results.

    – danblack
    Nov 14 '18 at 9:33






  • 1





    maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.

    – Madhur Bhaiya
    Nov 14 '18 at 9:34











  • The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on product_id+ean, and a regular index on ean. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)

    – user2191227
    Nov 14 '18 at 12:13












  • @danblack one optimization I can think of is changing left join to join. OP has clarified. so +1

    – Madhur Bhaiya
    Nov 14 '18 at 17:13













2












2








2







Using joins is the simplest way to generate related information. I've GROUP BY the product.id which means the eans are the aggregated field because those are the only ones that can duplicate. I've added HAVING part after the query to select only those results with 2 or more (its optional).



SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2


On query efficiency, having the product_id,ean as a composite primary key for the products_ean table is probably most efficient. Since that is unique its not obvious why the products_ean.id column is needed.






share|improve this answer















Using joins is the simplest way to generate related information. I've GROUP BY the product.id which means the eans are the aggregated field because those are the only ones that can duplicate. I've added HAVING part after the query to select only those results with 2 or more (its optional).



SELECT p.id, name, price, count(ean) as eans
FROM products p
JOIN products_ean e
ON p.id = e.product_id
GROUP BY p.id
HAVING eans >= 2


On query efficiency, having the product_id,ean as a composite primary key for the products_ean table is probably most efficient. Since that is unique its not obvious why the products_ean.id column is needed.







share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 14 '18 at 20:45

























answered Nov 14 '18 at 9:24









danblackdanblack

2,5241319




2,5241319












  • I guess OP is trying to find those ean values which exist in multiple product(s) ?

    – Madhur Bhaiya
    Nov 14 '18 at 9:30











  • I didn't read it that way, however the output wasn't specified. GROUP_CONCAT(ean) as ean_list could be added to the SELECT results.

    – danblack
    Nov 14 '18 at 9:33






  • 1





    maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.

    – Madhur Bhaiya
    Nov 14 '18 at 9:34











  • The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on product_id+ean, and a regular index on ean. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)

    – user2191227
    Nov 14 '18 at 12:13












  • @danblack one optimization I can think of is changing left join to join. OP has clarified. so +1

    – Madhur Bhaiya
    Nov 14 '18 at 17:13

















  • I guess OP is trying to find those ean values which exist in multiple product(s) ?

    – Madhur Bhaiya
    Nov 14 '18 at 9:30











  • I didn't read it that way, however the output wasn't specified. GROUP_CONCAT(ean) as ean_list could be added to the SELECT results.

    – danblack
    Nov 14 '18 at 9:33






  • 1





    maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.

    – Madhur Bhaiya
    Nov 14 '18 at 9:34











  • The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on product_id+ean, and a regular index on ean. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)

    – user2191227
    Nov 14 '18 at 12:13












  • @danblack one optimization I can think of is changing left join to join. OP has clarified. so +1

    – Madhur Bhaiya
    Nov 14 '18 at 17:13
















I guess OP is trying to find those ean values which exist in multiple product(s) ?

– Madhur Bhaiya
Nov 14 '18 at 9:30





I guess OP is trying to find those ean values which exist in multiple product(s) ?

– Madhur Bhaiya
Nov 14 '18 at 9:30













I didn't read it that way, however the output wasn't specified. GROUP_CONCAT(ean) as ean_list could be added to the SELECT results.

– danblack
Nov 14 '18 at 9:33





I didn't read it that way, however the output wasn't specified. GROUP_CONCAT(ean) as ean_list could be added to the SELECT results.

– danblack
Nov 14 '18 at 9:33




1




1





maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.

– Madhur Bhaiya
Nov 14 '18 at 9:34





maybe other way around; group_concat(product_id) and group by on ean. But i would wait for OP to clarify further.

– Madhur Bhaiya
Nov 14 '18 at 9:34













The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on product_id+ean, and a regular index on ean. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)

– user2191227
Nov 14 '18 at 12:13






The query works, but unfortunately it is still really slow (14 secs). To clarify some uncertainties: 1) 1 product will never have the same EAN in the products_ean table (so product_id + ean is unique) 2) I don't wish to retrieve the EANS that have been found, just the count is enough I have a unique index on product_id+ean, and a regular index on ean. One (pretty) important field I forgot to mention is the 'webshop_id'. Products will also not have the same EAN with the same webshop_id (sorry I forgot to mention this earlier)

– user2191227
Nov 14 '18 at 12:13














@danblack one optimization I can think of is changing left join to join. OP has clarified. so +1

– Madhur Bhaiya
Nov 14 '18 at 17:13





@danblack one optimization I can think of is changing left join to join. OP has clarified. so +1

– Madhur Bhaiya
Nov 14 '18 at 17:13



















draft saved

draft discarded
















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53296514%2fhow-to-count-duplicates-by-multiple-records-in-subtable-in-mysql%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Use pre created SQLite database for Android project in kotlin

Darth Vader #20

Ondo