How to use Spark to query Hive?










I am trying to use Spark to run queries on a Hive table.
I have followed many articles on the internet, but without success.
I have moved the hive-site.xml file to the Spark location.

Could you please explain how to do this? I am using Spark 1.6.

Thank you in advance.

Please find my code below.



import sqlContext.implicits._
import org.apache.spark.sql.SaveMode
// Load the CSV file and drop the header row
val hospitalDataText = sc.textFile("/user/cloudera/spark/servicesDemo.csv")
val header = hospitalDataText.first()
val hospitalData = hospitalDataText.filter(a => a != header)
// Map each comma-separated line to a typed record
case class Services(uhid: String, locationid: String, doctorid: String)
val hData = hospitalData.map(_.split(",")).map(p => Services(p(0), p(1), p(2)))
val hosService = hData.toDF()
// Write the DataFrame as Parquet files under the Hive warehouse path
hosService.write.format("parquet").mode(SaveMode.Append).save("/user/hive/warehouse/hosdata")


This code created a 'hosdata' folder at the specified path, containing data in Parquet format.

But when I checked in Hive whether the table had been created, I could not see any table named 'hosdata'.

So I ran the commands below.



hosService.write.mode("overwrite").saveAsTable("hosData")
sqlContext.sql("show tables").show


This shows the following result:

+--------------------+-----------+
|           tableName|isTemporary|
+--------------------+-----------+
|             hosdata|      false|
+--------------------+-----------+


But when I check in Hive again, I still cannot see the table 'hosdata'.

Could anyone let me know what step I am missing?

















      apache-spark hive














edited Nov 16 '18 at 4:15 by Kedar Divekar
asked Nov 15 '18 at 2:41 by Kedar Divekar






















          1 Answer














There are multiple ways to query Hive using Spark.

1. As in the Hive CLI, you can run queries using Spark SQL.

2. Spark-shell lets you run Spark code in which you define the variables you need, such as the Hive context and the Spark configuration object. The context's sql() method lets you execute the same queries that you would have executed on Hive.
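As a minimal sketch of the second option, the code below queries the table from the question through a HiveContext. It assumes Spark 1.6 built with Hive support and a hive-site.xml visible to Spark that points to the same metastore Hive uses; neither is confirmed by the question. The table name hosdata and its columns come from the question, while the object name, the helper sampleQuery, and the LIMIT value are illustrative.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object QueryHiveSketch {
  // Builds the HiveQL text; kept separate so it can be checked without a cluster
  def sampleQuery(table: String, limit: Int): String =
    s"SELECT uhid, locationid, doctorid FROM $table LIMIT $limit"

  def main(args: Array[String]): Unit = {
    // The master URL is expected to be supplied by spark-submit
    val sc = new SparkContext(new SparkConf().setAppName("QueryHiveSketch"))

    // HiveContext (Spark 1.x API) picks up hive-site.xml from Spark's conf directory or classpath, if present
    val hiveContext = new HiveContext(sc)

    // List the tables registered in the current database of the metastore
    hiveContext.sql("SHOW TABLES").show()

    // Run the same kind of query you would run in the Hive CLI
    hiveContext.sql(sampleQuery("hosdata", 10)).show()

    sc.stop()
  }
}
```

In spark-shell, sc already exists, so only the HiveContext lines are needed there.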

Performance tuning is also an important aspect: you can use broadcast and other methods for faster execution.
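The answer does not say which broadcast technique it means. One common reading is a broadcast join, where a small table is shipped to every executor so the large table does not need to be shuffled. The sketch below shows that reading only, using the broadcast function from org.apache.spark.sql.functions. The doctors table is hypothetical and is assumed to be small and to have a doctorid column; hosdata is the table from the question.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.broadcast
import org.apache.spark.sql.hive.HiveContext

object BroadcastJoinSketch {
  // Inner-joins on `key`, hinting Spark to broadcast the small side to all executors
  def joinWithLookup(large: DataFrame, small: DataFrame, key: String): DataFrame =
    large.join(broadcast(small), key)

  def run(hiveContext: HiveContext): DataFrame = {
    val services = hiveContext.table("hosdata") // table from the question
    val doctors = hiveContext.table("doctors")  // hypothetical small lookup table
    joinWithLookup(services, doctors, "doctorid")
  }
}
```

Whether this helps depends on the lookup table being small enough to fit in executor memory.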



          Hope this helps.
















answered Nov 15 '18 at 7:43 by Rahul Singh Bajaj












• I have added the exact code above. Could you please let me know what I am missing there?
  – Kedar Divekar, Nov 21 '18 at 4:39
















