Elassandra Search for Replicated Data











How is the token_range decided in Elassandra when a query is distributed to the nodes?

What happens when the data is replicated across Elassandra nodes?

How are duplicate results filtered out?







      elasticsearch-5 cassandra-3.0 elassandra
















asked Nov 14 '18 at 6:59
Divs






















          2 Answers
My understanding is that queries travel around the cluster in much the same way as they otherwise do in Cassandra.

Data replication is not a concern for the Elasticsearch side of things. It creates its own tables to hold its search information, and those tables are replicated through the standard Cassandra mechanism. If you understand how Cassandra replication works, the Elasticsearch data behaves the same way.

Duplicates are avoided because each search node is given a non-overlapping range of tokens to take care of. In other words, one node is asked to return results for tokens 1, 2, 3, the next node for 4, 5, 6, and the third node for 7, 8, 9. Therefore there won't be any overlap, and no actual filtering of duplicates has to take place afterwards.
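
The 1, 2, 3 / 4, 5, 6 / 7, 8, 9 split can be illustrated with a small Python toy model. This is only a sketch and not Elassandra code: the node names, the integer tokens, and the `local_search` helper are made up for illustration, and replication is simulated by giving every node a full copy of the data.

```python
# Toy model of non-overlapping token ranges (not Elassandra code).
# Tokens are small integers; every node is assumed to hold a replica
# of every document, so duplicates would be possible without ranges.
documents = {token: f"doc-{token}" for token in range(1, 10)}

# Hypothetical assignment: each node answers only for its own range.
assigned_ranges = {
    "node-a": range(1, 4),   # tokens 1, 2, 3
    "node-b": range(4, 7),   # tokens 4, 5, 6
    "node-c": range(7, 10),  # tokens 7, 8, 9
}

def local_search(local_docs, token_range):
    """Return only the local documents whose token falls in token_range."""
    return [doc for token, doc in sorted(local_docs.items()) if token in token_range]

# The coordinator simply concatenates the per-node results.
results = []
for node, token_range in assigned_ranges.items():
    results.extend(local_search(documents, token_range))

print(results)
```

Because the ranges do not overlap, each document shows up once in `results` even though every node holds a copy of it.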

















answered Feb 1 at 17:35
Alexis Wilke























Elassandra distributes the query to nodes according to the search_strategy_class of the targeted index. There are two strategies: PrimaryFirstSearchStrategy (the default) and RandomSearchStrategy.

Primary first search strategy

Every node is involved in the query and is responsible for returning the documents it owns as a primary node. When a node is down, the next replica is used as a substitute.
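
A rough way to picture the primary-first behavior, including the substitution when a node is down, is the Python sketch below. It is a toy model under stated assumptions, not Elassandra's implementation: the four node names, `RF = 2`, the placement of replicas on the next nodes in ring order, and the `primary_first` helper are all hypothetical.

```python
# Toy ring: 4 nodes in ring order, RF = 2 (hypothetical, not Elassandra code).
ring = ["node-a", "node-b", "node-c", "node-d"]
RF = 2

def replicas(primary_index):
    """Nodes holding the range whose primary is ring[primary_index].

    Assumes replicas sit on the next nodes in ring order.
    """
    return [ring[(primary_index + i) % len(ring)] for i in range(RF)]

def primary_first(alive):
    """Map each primary range to the first live replica that holds it."""
    plan = {}
    for i, primary in enumerate(ring):
        owner = next((n for n in replicas(i) if n in alive), None)
        if owner is None:
            raise RuntimeError(f"range of {primary} is unavailable")
        plan.setdefault(owner, []).append(f"range-of-{primary}")
    return plan

all_up = primary_first(alive=set(ring))
b_down = primary_first(alive=set(ring) - {"node-b"})
print(all_up)
print(b_down)
```

With every node up, each node answers for its own primary range only. With `node-b` down, its range is handed to the next replica, so every range is still answered exactly once.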



Random search strategy

When RF > 1, the full ring can be reached with only a subset of nodes. The random search strategy takes advantage of this by randomly choosing such a subset of nodes to improve search efficiency.
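
To see why a subset of nodes can cover the whole ring when RF > 1, here is a small Python sketch. It is only an illustration: the six node names, `RF = 2`, the replica placement, and the naive greedy pass over a shuffled node list are assumptions made for brevity, and the thread does not show how Elassandra actually picks its subset.

```python
import random

# Toy ring: 6 nodes in ring order, RF = 2 (hypothetical, not Elassandra code).
ring = ["node-a", "node-b", "node-c", "node-d", "node-e", "node-f"]
RF = 2

def replicas(range_index):
    """Nodes holding range number range_index (primary plus next nodes)."""
    return [ring[(range_index + i) % len(ring)] for i in range(RF)]

def random_subset(rng):
    """Pick nodes in random order until every range is assigned once."""
    order = ring[:]
    rng.shuffle(order)
    plan = {}
    uncovered = set(range(len(ring)))
    for node in order:
        # Ranges this node holds that nobody has been asked for yet.
        held = {i for i in uncovered if node in replicas(i)}
        if held:
            plan[node] = sorted(held)
            uncovered -= held
        if not uncovered:
            break
    return plan

plan = random_subset(random.Random(0))
print(plan)
```

Each selected node is asked only for the ranges listed next to it, so the ranges stay non-overlapping while some nodes may be left out of the query entirely.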



Both strategies add a token_range filter to each sub-query according to the behavior described above. Therefore, the filtering happens locally on each node, not on the coordinator node.

















answered Feb 13 at 8:46
barth


























