Parse Dataframe and store output in a single file [duplicate]









This question already has an answer here:



  • Spark split a column value into multiple rows (1 answer)



I have a DataFrame in Spark SQL (Scala) with columns A and B holding the following values:



A   B
1   a|b|c
2   b|d
3   d|e|f


I need to store the output in a single text file in the following format:



1 a
1 b
1 c
2 b
2 d
3 d
3 e
3 f


How can I do that?










scala apache-spark apache-spark-sql

edited Nov 10 at 9:41 by SCouto
asked Nov 10 at 8:59 by Nick

marked as duplicate by user6910411, Nov 10 at 10:56

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

          2 Answers

















          Accepted answer (score 2)










          You can get the desired DataFrame with an explode and a split:



          val resultDF = df.withColumn("B", explode(split($"B", "\\|")))


          Result



          +---+---+
          | A| B|
          +---+---+
          | 1| a|
          | 1| b|
          | 1| c|
          | 2| b|
          | 2| d|
          | 3| d|
          | 3| e|
          | 3| f|
          +---+---+


          Then you can save it to a single file with coalesce(1):



           resultDF.coalesce(1).rdd.saveAsTextFile("desiredPath")
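
          Note that .rdd on a DataFrame gives an RDD[Row], so saveAsTextFile writes each row as "[1,a]" rather than "1 a". A minimal sketch of one way to match the requested layout, assuming the resultDF built above:

           // Sketch (not part of the original answer): format each Row as "A B" text before saving
           resultDF.rdd
             .map(row => s"${row.get(0)} ${row.get(1)}")  // Row -> "1 a"
             .coalesce(1)                                  // one partition -> one part file
             .saveAsTextFile("desiredPath")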





          answered Nov 10 at 9:47 by SCouto




















          • explode function is not recognized in my code. What dependency do I need to add?
            – Nick
            Nov 10 at 10:17






          • this should be enough: import org.apache.spark.sql.functions._
            – SCouto
            Nov 10 at 10:20
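
          For completeness, a short sketch of the imports the accepted snippet relies on (assuming the SparkSession is available as a value named spark, which the $"B" syntax needs):

           import org.apache.spark.sql.functions.{explode, split}  // explode and split used above
           import spark.implicits._                                // enables the $"B" column syntax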

















          Answer (score 0)













          You can do something like this:



          val df = ???
          val resDF = df.withColumn("B", explode(split(col("B"), "\\|")))

          resDF.coalesce(1).write.option("delimiter", " ").csv("path/to/file")
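
          A small end-to-end sketch under the same assumptions (a SparkSession available as spark, plus the sample data from the question). Note that csv(...) writes a directory containing a single part-* file rather than a file with exactly that name:

           import org.apache.spark.sql.functions.{col, explode, split}
           import spark.implicits._  // for toDF on a local Seq

           // Hypothetical sample data matching the question
           val df = Seq((1, "a|b|c"), (2, "b|d"), (3, "d|e|f")).toDF("A", "B")

           val resDF = df.withColumn("B", explode(split(col("B"), "\\|")))
           resDF.show()  // 8 rows: (1,a) (1,b) (1,c) (2,b) (2,d) (3,d) (3,e) (3,f)

           // Space-separated output, written as a directory containing one part file
           resDF.coalesce(1).write.option("delimiter", " ").csv("path/to/file")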





          answered Nov 10 at 9:47 by Chitral Verma




















          • explode(split(col : this part of your code is not recognized
            – Nick
            Nov 10 at 10:15










          • col comes from org.apache.spark.sql.functions
            – Chitral Verma
            Nov 10 at 11:16
















