Slow query performance: MongoDB with $lookup and $and/$or

I'm trying to find a way to create an engine that translates GraphQL query filters to MongoDB aggregations while keeping the performance. Our application has the requisite of limiting the results from collection A by applying filters to collection B, C and even D sometimes.

For better understanding, here's a sample about how a filter is translated to MongoDB.

This:


 "filter": 
 "return": null,
 "AND": [
 "customer_WITH": 
 "OR": [
 "code": "CUSTOMER NAME"
 , 
 "commercialName_LIKE": "CUSTOMER NAME"
 , 
 "corporateName_LIKE": "CUSTOMER NAME"
 ]
 
 ],
 "OR": [
 "dispatcher_WITH": 
 "company_WITH": 
 "corporateName_LIKE": "COMPANY NAME"
 
 
 , 
 "redispatcher_WITH": 
 "company_WITH": 
 "corporateName_LIKE": "COMPANY NAME"
 
 
 ],
 "reversal": null

Gets translated to this:

[
 "$match": 
 "return": 
 "$eq": null
 ,
 "reversal": 
 "$eq": null
 ,
 "company": 
 "$eq": ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
 
 
, 
 "$lookup": 
 "as": "dispatcher",
 "from": "shippers",
 "localField": "dispatcher",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$dispatcher",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "dispatcher.company",
 "from": "companies",
 "localField": "dispatcher.company",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$dispatcher.company",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "redispatcher",
 "from": "shippers",
 "localField": "redispatcher",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$redispatcher",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "redispatcher.company",
 "from": "companies",
 "localField": "redispatcher.company",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$redispatcher.company",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "customer",
 "from": "customers",
 "localField": "customer",
 "foreignField": "_id"
 
, 
 "$match": 
 "$or": [
 "dispatcher.company.corporateName": 
 "$regex": /sCOMPANYsNAME/
 
 , 
 "redispatcher.company.corporateName": 
 "$regex": /sCOMPANYsNAME/
 
 ],
 "$and": [
 "$or": [
 "customer.code": 
 "$eq": "CUSTOMER NAME"
 
 , 
 "customer.commercialName": 
 "$regex": /CUSTOMERsNAME/
 
 , 
 "customer.corporateName": 
 "$regex": /CUSTOMERsNAME/
 
 ]
 ]
 
, 
 "$unwind": 
 "path": "$customer",
 "preserveNullAndEmptyArrays": true
 
, 
 "$group": 
 "_id": "$invoiceNo",
 "__rootId": 
 "$first": "$_id"
 ,
 "company": 
 "$first": "$company"
 ,
 "customer": 
 "$first": "$customer._id"
 ,
 "dispatcher": 
 "$first": "$dispatcher._id"
 ,
 "redispatcher": 
 "$first": "$redispatcher._id"
 ,
 "driverPlate": 
 "$first": "$driverPlate"
 ,
 "key": 
 "$first": "$key"
 ,
 "activities": 
 "$first": "$activities"
 ,
 "serialNo": 
 "$first": "$serialNo"
 ,
 "invoiceNo": 
 "$first": "$invoiceNo"
 ,
 "incidents": 
 "$first": "$incidents"
 ,
 "deliveries": 
 "$first": "$deliveries"
 ,
 "return": 
 "$first": "$return"
 
 
, 
 "$project": 
 "_id": "$__rootId",
 "company": "$company",
 "customer": "$customer",
 "dispatcher": "$dispatcher",
 "redispatcher": "$redispatcher",
 "driverPlate": "$driverPlate",
 "key": "$key",
 "activities": "$activities",
 "serialNo": "$serialNo",
 "invoiceNo": "$invoiceNo",
 "incidents": "$incidents",
 "deliveries": "$deliveries",
 "return": "$return"
 
, 
 "$sort": 
 "invoiceNo": -1
 
, 
 "$limit": 51
]

The engine is smart enough to reallocate to the first position $match properties that don't require $lookups and right after $lookups if they do, however if they are within a $and/$or condition block, then they are reallocated after the last $lookup, regardless of what properties are there.

I could scan for what is used inside the $and and deconstruct it into new reallocated $match phases, but I need to figure how to handle the $or operator: I can't apply the same desconstruction idea on it because this would invalidate the condition.

So my question is: Is there an alternative way to use the phase $lookup along with $and/$or and improve the performance drastically?

Creating more indexes won't help because they're not used for the $lookup. Moving up $match phases, as the MongoDB team would suggest is also not possible because it would break the conditions. So I'm out of ideas now.

Best regards.

edited Nov 12 '18 at 18:00

asked Nov 12 '18 at 17:40

Rubens Felipe

112

Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.

– Neil Lunn
Nov 12 '18 at 20:46

Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.

– Rubens Felipe
Nov 13 '18 at 0:10

Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.

– Neil Lunn
Nov 13 '18 at 0:15

Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.

– Neil Lunn
Nov 13 '18 at 0:17

add a comment |

For better understanding, here's a sample about how a filter is translated to MongoDB.

This:


 "filter": 
 "return": null,
 "AND": [
 "customer_WITH": 
 "OR": [
 "code": "CUSTOMER NAME"
 , 
 "commercialName_LIKE": "CUSTOMER NAME"
 , 
 "corporateName_LIKE": "CUSTOMER NAME"
 ]
 
 ],
 "OR": [
 "dispatcher_WITH": 
 "company_WITH": 
 "corporateName_LIKE": "COMPANY NAME"
 
 
 , 
 "redispatcher_WITH": 
 "company_WITH": 
 "corporateName_LIKE": "COMPANY NAME"
 
 
 ],
 "reversal": null

Gets translated to this:

[
 "$match": 
 "return": 
 "$eq": null
 ,
 "reversal": 
 "$eq": null
 ,
 "company": 
 "$eq": ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
 
 
, 
 "$lookup": 
 "as": "dispatcher",
 "from": "shippers",
 "localField": "dispatcher",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$dispatcher",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "dispatcher.company",
 "from": "companies",
 "localField": "dispatcher.company",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$dispatcher.company",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "redispatcher",
 "from": "shippers",
 "localField": "redispatcher",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$redispatcher",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "redispatcher.company",
 "from": "companies",
 "localField": "redispatcher.company",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$redispatcher.company",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "customer",
 "from": "customers",
 "localField": "customer",
 "foreignField": "_id"
 
, 
 "$match": 
 "$or": [
 "dispatcher.company.corporateName": 
 "$regex": /sCOMPANYsNAME/
 
 , 
 "redispatcher.company.corporateName": 
 "$regex": /sCOMPANYsNAME/
 
 ],
 "$and": [
 "$or": [
 "customer.code": 
 "$eq": "CUSTOMER NAME"
 
 , 
 "customer.commercialName": 
 "$regex": /CUSTOMERsNAME/
 
 , 
 "customer.corporateName": 
 "$regex": /CUSTOMERsNAME/
 
 ]
 ]
 
, 
 "$unwind": 
 "path": "$customer",
 "preserveNullAndEmptyArrays": true
 
, 
 "$group": 
 "_id": "$invoiceNo",
 "__rootId": 
 "$first": "$_id"
 ,
 "company": 
 "$first": "$company"
 ,
 "customer": 
 "$first": "$customer._id"
 ,
 "dispatcher": 
 "$first": "$dispatcher._id"
 ,
 "redispatcher": 
 "$first": "$redispatcher._id"
 ,
 "driverPlate": 
 "$first": "$driverPlate"
 ,
 "key": 
 "$first": "$key"
 ,
 "activities": 
 "$first": "$activities"
 ,
 "serialNo": 
 "$first": "$serialNo"
 ,
 "invoiceNo": 
 "$first": "$invoiceNo"
 ,
 "incidents": 
 "$first": "$incidents"
 ,
 "deliveries": 
 "$first": "$deliveries"
 ,
 "return": 
 "$first": "$return"
 
 
, 
 "$project": 
 "_id": "$__rootId",
 "company": "$company",
 "customer": "$customer",
 "dispatcher": "$dispatcher",
 "redispatcher": "$redispatcher",
 "driverPlate": "$driverPlate",
 "key": "$key",
 "activities": "$activities",
 "serialNo": "$serialNo",
 "invoiceNo": "$invoiceNo",
 "incidents": "$incidents",
 "deliveries": "$deliveries",
 "return": "$return"
 
, 
 "$sort": 
 "invoiceNo": -1
 
, 
 "$limit": 51
]

So my question is: Is there an alternative way to use the phase $lookup along with $and/$or and improve the performance drastically?

Best regards.

edited Nov 12 '18 at 18:00

asked Nov 12 '18 at 17:40

Rubens Felipe

112

Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.

– Neil Lunn
Nov 12 '18 at 20:46

Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.

– Rubens Felipe
Nov 13 '18 at 0:10

Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.

– Neil Lunn
Nov 13 '18 at 0:15

Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.

– Neil Lunn
Nov 13 '18 at 0:17

add a comment |

For better understanding, here's a sample about how a filter is translated to MongoDB.

This:


 "filter": 
 "return": null,
 "AND": [
 "customer_WITH": 
 "OR": [
 "code": "CUSTOMER NAME"
 , 
 "commercialName_LIKE": "CUSTOMER NAME"
 , 
 "corporateName_LIKE": "CUSTOMER NAME"
 ]
 
 ],
 "OR": [
 "dispatcher_WITH": 
 "company_WITH": 
 "corporateName_LIKE": "COMPANY NAME"
 
 
 , 
 "redispatcher_WITH": 
 "company_WITH": 
 "corporateName_LIKE": "COMPANY NAME"
 
 
 ],
 "reversal": null

Gets translated to this:

[
 "$match": 
 "return": 
 "$eq": null
 ,
 "reversal": 
 "$eq": null
 ,
 "company": 
 "$eq": ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
 
 
, 
 "$lookup": 
 "as": "dispatcher",
 "from": "shippers",
 "localField": "dispatcher",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$dispatcher",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "dispatcher.company",
 "from": "companies",
 "localField": "dispatcher.company",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$dispatcher.company",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "redispatcher",
 "from": "shippers",
 "localField": "redispatcher",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$redispatcher",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "redispatcher.company",
 "from": "companies",
 "localField": "redispatcher.company",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$redispatcher.company",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "customer",
 "from": "customers",
 "localField": "customer",
 "foreignField": "_id"
 
, 
 "$match": 
 "$or": [
 "dispatcher.company.corporateName": 
 "$regex": /sCOMPANYsNAME/
 
 , 
 "redispatcher.company.corporateName": 
 "$regex": /sCOMPANYsNAME/
 
 ],
 "$and": [
 "$or": [
 "customer.code": 
 "$eq": "CUSTOMER NAME"
 
 , 
 "customer.commercialName": 
 "$regex": /CUSTOMERsNAME/
 
 , 
 "customer.corporateName": 
 "$regex": /CUSTOMERsNAME/
 
 ]
 ]
 
, 
 "$unwind": 
 "path": "$customer",
 "preserveNullAndEmptyArrays": true
 
, 
 "$group": 
 "_id": "$invoiceNo",
 "__rootId": 
 "$first": "$_id"
 ,
 "company": 
 "$first": "$company"
 ,
 "customer": 
 "$first": "$customer._id"
 ,
 "dispatcher": 
 "$first": "$dispatcher._id"
 ,
 "redispatcher": 
 "$first": "$redispatcher._id"
 ,
 "driverPlate": 
 "$first": "$driverPlate"
 ,
 "key": 
 "$first": "$key"
 ,
 "activities": 
 "$first": "$activities"
 ,
 "serialNo": 
 "$first": "$serialNo"
 ,
 "invoiceNo": 
 "$first": "$invoiceNo"
 ,
 "incidents": 
 "$first": "$incidents"
 ,
 "deliveries": 
 "$first": "$deliveries"
 ,
 "return": 
 "$first": "$return"
 
 
, 
 "$project": 
 "_id": "$__rootId",
 "company": "$company",
 "customer": "$customer",
 "dispatcher": "$dispatcher",
 "redispatcher": "$redispatcher",
 "driverPlate": "$driverPlate",
 "key": "$key",
 "activities": "$activities",
 "serialNo": "$serialNo",
 "invoiceNo": "$invoiceNo",
 "incidents": "$incidents",
 "deliveries": "$deliveries",
 "return": "$return"
 
, 
 "$sort": 
 "invoiceNo": -1
 
, 
 "$limit": 51
]

So my question is: Is there an alternative way to use the phase $lookup along with $and/$or and improve the performance drastically?

Best regards.

edited Nov 12 '18 at 18:00

asked Nov 12 '18 at 17:40

Rubens Felipe

112

For better understanding, here's a sample about how a filter is translated to MongoDB.

This:


 "filter": 
 "return": null,
 "AND": [
 "customer_WITH": 
 "OR": [
 "code": "CUSTOMER NAME"
 , 
 "commercialName_LIKE": "CUSTOMER NAME"
 , 
 "corporateName_LIKE": "CUSTOMER NAME"
 ]
 
 ],
 "OR": [
 "dispatcher_WITH": 
 "company_WITH": 
 "corporateName_LIKE": "COMPANY NAME"
 
 
 , 
 "redispatcher_WITH": 
 "company_WITH": 
 "corporateName_LIKE": "COMPANY NAME"
 
 
 ],
 "reversal": null

Gets translated to this:

[
 "$match": 
 "return": 
 "$eq": null
 ,
 "reversal": 
 "$eq": null
 ,
 "company": 
 "$eq": ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
 
 
, 
 "$lookup": 
 "as": "dispatcher",
 "from": "shippers",
 "localField": "dispatcher",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$dispatcher",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "dispatcher.company",
 "from": "companies",
 "localField": "dispatcher.company",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$dispatcher.company",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "redispatcher",
 "from": "shippers",
 "localField": "redispatcher",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$redispatcher",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "redispatcher.company",
 "from": "companies",
 "localField": "redispatcher.company",
 "foreignField": "_id"
 
, 
 "$unwind": 
 "path": "$redispatcher.company",
 "preserveNullAndEmptyArrays": true
 
, 
 "$lookup": 
 "as": "customer",
 "from": "customers",
 "localField": "customer",
 "foreignField": "_id"
 
, 
 "$match": 
 "$or": [
 "dispatcher.company.corporateName": 
 "$regex": /sCOMPANYsNAME/
 
 , 
 "redispatcher.company.corporateName": 
 "$regex": /sCOMPANYsNAME/
 
 ],
 "$and": [
 "$or": [
 "customer.code": 
 "$eq": "CUSTOMER NAME"
 
 , 
 "customer.commercialName": 
 "$regex": /CUSTOMERsNAME/
 
 , 
 "customer.corporateName": 
 "$regex": /CUSTOMERsNAME/
 
 ]
 ]
 
, 
 "$unwind": 
 "path": "$customer",
 "preserveNullAndEmptyArrays": true
 
, 
 "$group": 
 "_id": "$invoiceNo",
 "__rootId": 
 "$first": "$_id"
 ,
 "company": 
 "$first": "$company"
 ,
 "customer": 
 "$first": "$customer._id"
 ,
 "dispatcher": 
 "$first": "$dispatcher._id"
 ,
 "redispatcher": 
 "$first": "$redispatcher._id"
 ,
 "driverPlate": 
 "$first": "$driverPlate"
 ,
 "key": 
 "$first": "$key"
 ,
 "activities": 
 "$first": "$activities"
 ,
 "serialNo": 
 "$first": "$serialNo"
 ,
 "invoiceNo": 
 "$first": "$invoiceNo"
 ,
 "incidents": 
 "$first": "$incidents"
 ,
 "deliveries": 
 "$first": "$deliveries"
 ,
 "return": 
 "$first": "$return"
 
 
, 
 "$project": 
 "_id": "$__rootId",
 "company": "$company",
 "customer": "$customer",
 "dispatcher": "$dispatcher",
 "redispatcher": "$redispatcher",
 "driverPlate": "$driverPlate",
 "key": "$key",
 "activities": "$activities",
 "serialNo": "$serialNo",
 "invoiceNo": "$invoiceNo",
 "incidents": "$incidents",
 "deliveries": "$deliveries",
 "return": "$return"
 
, 
 "$sort": 
 "invoiceNo": -1
 
, 
 "$limit": 51
]

So my question is: Is there an alternative way to use the phase $lookup along with $and/$or and improve the performance drastically?

Best regards.

javascript mongodb performance graphql gql

edited Nov 12 '18 at 18:00

asked Nov 12 '18 at 17:40

Rubens Felipe

112

edited Nov 12 '18 at 18:00

asked Nov 12 '18 at 17:40

Rubens Felipe

112

edited Nov 12 '18 at 18:00

asked Nov 12 '18 at 17:40

Rubens Felipe

112

asked Nov 12 '18 at 17:40

Rubens Felipe

112

asked Nov 12 '18 at 17:40

Rubens Felipe

112

Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.

– Neil Lunn
Nov 12 '18 at 20:46

Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.

– Rubens Felipe
Nov 13 '18 at 0:10

Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.

– Neil Lunn
Nov 13 '18 at 0:15

Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.

– Neil Lunn
Nov 13 '18 at 0:17

add a comment |

Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.

– Neil Lunn
Nov 12 '18 at 20:46

Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.

– Rubens Felipe
Nov 13 '18 at 0:10

Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.

– Neil Lunn
Nov 13 '18 at 0:15

Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.

– Neil Lunn
Nov 13 '18 at 0:17

Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.

– Neil Lunn
Nov 12 '18 at 20:46

Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.

– Rubens Felipe
Nov 13 '18 at 0:10

Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.

– Neil Lunn
Nov 13 '18 at 0:15

Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.

– Neil Lunn
Nov 13 '18 at 0:17

add a comment |

0

active

oldest

votes

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53267388%2fslow-query-performance-mongodb-with-lookup-and-and-or%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

0

active

oldest

votes

0

active

oldest

votes

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Pfthb