Store and query NxN matrix
I am looking for a way to store and query an N x N matrix.
The data will not change frequently.
The search must be fast.
I have to be able to query the distance (and any other stored data, such as flights, buses, trains, and so on) from any city to any other.
Example:
From NY to Boston: Distance: 215 mi, Data: No trains.
From Boston to NY: Empty.
From Washington DC to NY: Distance: 225 mi, Data: -
To represent the data, I made a matrix.
I am working with MongoDB as the database.
First option:
I am thinking of representing each row of the matrix as a document:
{
  city: "NY",
  destinations: [
    { city: "Boston", distance: "215 mi", data: "no trains" },
    { city: "Washington DC", distance: "226 mi", data: "" }
  ]
}
Cons: some documents would have an array of nearly 4000 destination cities.
Pros: there are only 4000 documents in the collection.
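To make the access pattern of the first option concrete, here is a minimal in-memory sketch in Python. It is not MongoDB code: the dictionaries only mirror the document shape shown above, and the names `rows` and `find_connection` are hypothetical, introduced for illustration.

```python
# In-memory model of the first option: one document per origin city.
# Field names mirror the question; `rows` and `find_connection` are
# hypothetical names used only for this illustration.
rows = {
    "NY": {
        "city": "NY",
        "destinations": [
            {"city": "Boston", "distance": "215 mi", "data": "no trains"},
            {"city": "Washington DC", "distance": "226 mi", "data": ""},
        ],
    },
    # Nothing is stored from Boston, matching "From Boston to NY: Empty".
    "Boston": {"city": "Boston", "destinations": []},
}


def find_connection(rows, origin, destination):
    """Return the embedded entry for origin -> destination, or None."""
    row = rows.get(origin)
    if row is None:
        return None
    # Linear scan over the embedded array (up to ~4000 entries per row).
    for entry in row["destinations"]:
        if entry["city"] == destination:
            return entry
    return None
```

Calling `find_connection(rows, "NY", "Boston")` returns the Boston entry, while `find_connection(rows, "Boston", "NY")` returns `None`. The point of the sketch is that a single-pair lookup has to locate the row first and then search inside its array.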
Second Option:
Create one document for each connection between two cities.
Example:
{
  from: "NY",
  to: "Boston",
  distance: "215 mi",
  data: "no trains"
}
Pros: I think that with an index it may be faster to search.
Cons: with 4000 x 4000 combinations, the collection may have nearly 12,000,000 documents.
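The second option can be sketched the same way, again as a plain Python model rather than MongoDB code. The dictionary `pair_index` is a stand-in for an index on the pair of fields (in MongoDB this would presumably be a compound index such as `createIndex({ from: 1, to: 1 })`, which is an assumption and not something stated above). The names `connections`, `pair_index`, `find_pair`, `n_cities`, and `max_pairs` are hypothetical.

```python
# In-memory model of the second option: one document per city pair.
# All variable and function names here are hypothetical.
connections = [
    {"from": "NY", "to": "Boston", "distance": "215 mi", "data": "no trains"},
    {"from": "NY", "to": "Washington DC", "distance": "226 mi", "data": ""},
]

# Stand-in for an index on (from, to): maps each pair to its document.
pair_index = {(doc["from"], doc["to"]): doc for doc in connections}


def find_pair(pair_index, origin, destination):
    """Return the document for origin -> destination, or None if absent."""
    return pair_index.get((origin, destination))


# Upper bound on the document count for 4000 cities, excluding a city
# paired with itself. Missing connections reduce the real count.
n_cities = 4000
max_pairs = n_cities * (n_cities - 1)
```

Here `find_pair(pair_index, "NY", "Boston")` is a single keyed lookup, and directions with no stored connection simply have no document. For 4000 cities, `max_pairs` is 15,996,000, so an estimate of about 12,000,000 documents implies that some pairs are empty.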
Any help is appreciated.
mongodb
It's a pretty broad question and really depends mostly on what the intended "query pattern" actually is. I would generally be pretty loath to have an array with 4000 or more items, though, unless there was a specific advantage in doing so. I suggest actually trying both and benchmarking which one works best instead of asking someone else to "greenlight" a choice for you.
– Neil Lunn
Nov 13 '18 at 13:16
Thank you, Neil Lunn, for your reply, but I am not looking for someone to greenlight a choice for me. Maybe there are better options that I am not considering, or maybe someone has experience with this and can help me.
– sebacipo
Nov 13 '18 at 13:39
@sebacipo: I'm with Neil here. These need to be measured/benchmarked.
– Sergio Tulentsev
Nov 13 '18 at 14:04
@sebacipo but my intuition tells me that the first one is a terrible, terrible solution. Its only advantage is that the database contains only a few supersized documents. But it's not that much of an advantage. It's not like there's a limit on the number of documents in a collection. Looks like you want to use this as a proxy metric for query performance, right? That's what you need to measure: ease/performance of queries.
– Sergio Tulentsev
Nov 13 '18 at 14:06
asked Nov 13 '18 at 13:05
sebacipo