Store and query NxN matrix










I am looking for a way to store and query an N x N matrix.
The data will not change frequently.
The search must be fast.



I need to be able to query the distance (and any other stored data, such as flights, buses, trains, and so on) from any city to any other.



Example:
From NY to Boston: Distance: 215 mi, Data: No trains.



From Boston to NY: Empty.



From Washington DC to NY: Distance: 225 mi, Data: -



To represent the data I made a matrix (the image is not available).



I am working with MongoDB as the database.



First option:
I am thinking of representing each row of the matrix as a document.



 
    {
      city: "NY",
      destinations: [
        { city: "Boston", distance: "215 mi", data: "no trains" },
        { city: "Washington DC", distance: "226 mi", data: "" }
      ]
    }



Cons: some documents would have an array of nearly 4000 destination cities.



Pros: there are only 4000 documents in the collection.
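As a rough illustration of the first option, here is a minimal Python sketch that models one row document as a plain dictionary and finds a destination by scanning the embedded array. It does not talk to MongoDB; the helper name `find_destination` is hypothetical, and the sample values are the ones from the example above.

```python
from typing import Optional

# One "row" document per origin city (option 1), modeled as a plain dict.
ny_row = {
    "city": "NY",
    "destinations": [
        {"city": "Boston", "distance": "215 mi", "data": "no trains"},
        {"city": "Washington DC", "distance": "226 mi", "data": ""},
    ],
}


def find_destination(row: dict, destination: str) -> Optional[dict]:
    """Scan the embedded array for one destination (hypothetical helper)."""
    for entry in row["destinations"]:
        if entry["city"] == destination:
            return entry
    return None  # no connection stored for this pair


print(find_destination(ny_row, "Boston"))
```

In this sketch the row is found by its `city`, but picking one entry out of an array of up to about 4000 elements is a linear scan.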



Second option:
Create one document for each pair of connected cities:



Example:




    {
      from: "NY",
      to: "Boston",
      distance: "215 mi",
      data: "no trains"
    }



Pros: I think that with an index it may be faster for searches.
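To make the index idea concrete, here is a minimal Python sketch, not MongoDB code: it stores one record per ordered pair and uses a dictionary keyed by `(from, to)` as a stand-in for a compound index on those two fields. The names `connections`, `pair_index`, and `lookup` are hypothetical.

```python
from typing import Optional

# One document per ordered city pair (option 2), modeled as plain dicts.
connections = [
    {"from": "NY", "to": "Boston", "distance": "215 mi", "data": "no trains"},
    {"from": "Washington DC", "to": "NY", "distance": "225 mi", "data": "-"},
]

# Stand-in for a compound index on (from, to): maps a pair to its document.
pair_index = {(doc["from"], doc["to"]): doc for doc in connections}


def lookup(origin: str, destination: str) -> Optional[dict]:
    """Return the document for origin -> destination, or None if absent."""
    return pair_index.get((origin, destination))


print(lookup("NY", "Boston"))   # the NY -> Boston document
print(lookup("Boston", "NY"))   # None: pairs are directional
```

In MongoDB the analogous step would be a compound index on the two fields, for example `db.connections.createIndex({ from: 1, to: 1 })`, where the collection name `connections` is an assumption.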



Cons: with 4000 x 4000 combinations, the collection may have nearly 12,000,000 documents.
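For a sense of scale, the document count for the second option can be bounded with simple arithmetic. The sketch below assumes 4000 cities and no city-to-itself entries, and it treats the 12,000,000 figure above as the number of non-empty pairs; the variable names are illustrative.

```python
n_cities = 4000

# Upper bound: every ordered pair (a, b) with a != b gets its own document.
max_pairs = n_cities * (n_cities - 1)
print(max_pairs)  # 15996000

# Share of the matrix that is filled if about 12,000,000 pairs have data.
estimated_docs = 12_000_000
fill_ratio = estimated_docs / max_pairs
print(round(fill_ratio, 2))  # 0.75
```

Under these assumptions the estimate corresponds to roughly three quarters of all possible ordered pairs having data.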



Any help is appreciated.



























  • It's a pretty broad question, and it really depends mostly on what the intended "query pattern" actually is. I would generally be pretty loath to have an array with 4000 or more items, though, unless there was a specific advantage in doing so. I suggest actually trying both and benchmarking which one works best instead of asking someone else to "greenlight" a choice for you.

    – Neil Lunn
    Nov 13 '18 at 13:16











  • Thank you Neil Lunn for your reply, but I am not looking for someone to greenlight a choice for me. Maybe there are better options that I am not considering, or maybe someone has experience with this and can help me.

    – sebacipo
    Nov 13 '18 at 13:39











  • @sebacipo: I'm with Neil here. These need to be measured/benchmarked.

    – Sergio Tulentsev
    Nov 13 '18 at 14:04











  • @sebacipo but my intuition tells me that the first one is a terrible, terrible solution. Its only advantage is that the database contains only a few supersized documents. But it's not that much of an advantage. It's not like there's a limit on the number of documents in a collection. Looks like you want to use this as a proxy metric for query performance, right? That's what you need to measure: ease/performance of queries.

    – Sergio Tulentsev
    Nov 13 '18 at 14:06
















mongodb
















asked Nov 13 '18 at 13:05









sebacipo







