Transform CSV Column Values into Single Row
My data in CSV looks like this:

Actual Data (image)

and I want to transform it into:

Expected Data (image)

(hivetablename.hivecolumnname = dbtablename.dbtablecolumn)

that is, by joining the multiple row values into a single row value as shown above. Please note that 'AND' is a literal placed between the conditions being built; it appears after every condition except the last one, so the final record contributes only the bare condition (xx=yy).

I would like the solution in Scala with Spark. Many thanks in advance!
scala apache-spark-2.0
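For what it's worth, here is a minimal sketch of one way to do this in Spark 2.x. It assumes the CSV carries the four columns used in the attempt quoted below (hivetablename, hivetablecolumn, dbtablename, dbtablecolumn): concat builds each per-row condition, collect_list gathers all conditions into one array, and concat_ws places the " AND " literal only between elements, so nothing follows the last condition.

    import org.apache.spark.sql.functions._
    import spark.implicits._

    // Read the CSV with a header row; column names are assumed from the question.
    val df = spark.read.option("header", "true").csv("file:///conf.csv")

    // Build one "hive.col = db.col" condition per row.
    val conds = df.select(
      concat($"hivetablename", lit("."), $"hivetablecolumn",
             lit(" = "),
             $"dbtablename", lit("."), $"dbtablecolumn").as("cond"))

    // Collapse every row into a single value. concat_ws inserts the
    // separator only between elements, never after the last one.
    val joined = conds.agg(concat_ws(" AND ", collect_list($"cond")).as("combined"))

    joined.show(false)

Note that collect_list does not guarantee any particular row order; if the order of the conditions matters, carry a sortable key column through and sort on it first.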
asked Nov 9 at 13:54, edited Nov 9 at 14:28
Farhan Soomro
12
Could you please share what you have tried so far?
– ulubeyn
Nov 9 at 14:40
Many thanks for the response. I'm a learner, so I couldn't go much further. Here's what I've been able to achieve:

    val dfconf = spark.read.option("header", "true").csv("file:///conf.csv").alias("conf")
    dfconf.show(10, false)
    val newDF1 = dfconf.withColumn("Y", lit("1"))
    val newDF2 = newDF1.withColumn("Z", lit(","))
    val newDF3 = newDF2.withColumn("comb",
      concat($"hivetablename", lit("."), $"hivetablecolumn",
             lit("="),
             $"dbtablename", lit("."), $"dbtablecolumn"))
    newDF3.show()
    val newDF4 = newDF3.groupBy("Y").agg(collect_set("Z").as("combined"))
– Farhan Soomro
Nov 9 at 14:51
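Building on that attempt: the helper columns Y and Z are not needed, and collect_set("Z") only gathers the comma literal rather than the conditions. A sketch of how the comb column from the comment above could be collapsed instead:

    // Continuing from newDF3 in the comment above. collect_list keeps
    // duplicates (unlike collect_set), and concat_ws puts the " AND "
    // literal only between elements, so none trails the last condition.
    val combined = newDF3.agg(concat_ws(" AND ", collect_list($"comb")).as("combined"))
    combined.show(false)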