Why is Key always 0 when creating map

My code is supposed to extract a Map from a DataFrame. The map will be used later for some calculations (mapping each Credit to the best-matching original Billing). However, the first step already fails: the TransactionId is always retrieved as 0.

Simplified version of the code:

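// Assumes a spark-shell style session with a SparkSession named spark in scope
import org.apache.spark.sql.types.DoubleType
import spark.implicits._

case class SalesTransaction(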
 CustomerId : Int,
 Score : Int,
 Revenue : Double,
 Type : String,
 Credited : Double = 0.0,
 LinkedTransactionId : Int = 0,
 IsProcessed : Boolean = false
 )
val df = Seq(
 (1, 1, 123, "Sales", 100),
 (1, 2, 122, "Credit", 100),
 (1, 3, 99, "Sales", 70),
 (1, 4, 101, "Sales", 77),
 (1, 5, 102, "Credit", 75),
 (1, 6, 98, "Sales", 71),
 (2, 7, 200, "Sales", 55),
 (2, 8, 220, "Sales", 55),
 (2, 9, 200, "Credit", 50),
 (2, 10, 205, "Sales", 50)
).toDF("CustomerId", "TransactionId", "TransactionAttributesScore", "TransactionType", "Revenue")
 .withColumn("Revenue", $"Revenue".cast(DoubleType))
 .repartition($"CustomerId")

//map generation:
val m2 : Map[Int, SalesTransaction] =
 df.map(row => (
 row.getAs("TransactionId")
 , new SalesTransaction(row.getAs("CustomerId")
 , row.getAs("TransactionAttributesScore")
 , row.getAs("Revenue")
 , row.getAs("TransactionType")
 )
 )
 ).collect.toMap

m2.foreach(m => println("key: " + m._1 +" Value: "+ m._2))

The output contains only the very last record, because every value returned by row.getAs("TransactionId") is null (which shows up as 0 in the m2 Map), so the tuple created in each iteration is (null, [current row's SalesTransaction]).
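To illustrate why a constant key leaves just one entry, here is a minimal plain-Scala sketch (no Spark involved). The pairs are made-up stand-ins for the tuples that collect would return, and the object name is only for illustration. It shows that toMap keeps a single value per key, with later pairs overwriting earlier ones.

```scala
object DuplicateKeyDemo {
  // Hypothetical stand-in for the (key, value) tuples returned by collect,
  // where every key came out as 0
  val pairs: Seq[(Int, String)] = Seq((0, "row 1"), (0, "row 2"), (0, "row 3"))

  // toMap keeps one entry per key; a later pair replaces an earlier one
  val collapsed: Map[Int, String] = pairs.toMap

  def main(args: Array[String]): Unit =
    collapsed.foreach { case (k, v) => println("key: " + k + " Value: " + v) }
}
```

So ten rows that all share the key 0 end up as a map of size one, which matches the single record in the output.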

Could you please advise me on what might be wrong with my code? I'm quite new to Scala and must be missing some syntactical nuance here.

asked Nov 8 at 18:57

Dan

scala apache-spark

2 Answers

accepted

You can also pass the type explicitly, using row.getAs[Int]("TransactionId"), as shown below:

val m2 : Map[Int, SalesTransaction] =
 df.map(row => (
 row.getAs[Int]("TransactionId"), 
 new SalesTransaction(row.getAs("CustomerId"),
 row.getAs("TransactionAttributesScore"),
 row.getAs("Revenue"),
 row.getAs("TransactionType"))
 )
 ).collect.toMap

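It is generally safer to use the explicitly typed form of getAs, i.e. getAs[T], to avoid errors like this: with the type parameter written out, the result type no longer depends on what the compiler infers from the surrounding context.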
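A minimal sketch of the idea without Spark, assuming a simplified stand-in for Row: FakeRow and its getAs are hypothetical and only mimic the shape of a generic accessor that casts an untyped value to whatever T the caller asks for.

```scala
object TypedGetAsDemo {
  // Hypothetical, simplified stand-in for a Spark Row: values are stored untyped
  final case class FakeRow(values: Map[String, Any]) {
    // The caller chooses T; the method itself only performs a cast
    def getAs[T](name: String): T = values(name).asInstanceOf[T]
  }

  val row = FakeRow(Map("TransactionId" -> 7, "TransactionType" -> "Sales"))

  // T is written out for each call, so nothing inside the tuple is left to inference
  val pair = (row.getAs[Int]("TransactionId"), row.getAs[String]("TransactionType"))
}
```

Because each call names its type, pair is a plain (Int, String) regardless of where the expression appears.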

answered Nov 10 at 16:36

user238607

The issue is related to the data type inferred for row.getAs("TransactionId"), even though the underlying $"TransactionId" column is an integer. Assigning the value to an explicitly typed Int variable resolved the issue:

//… code above unchanged
val m2 : Map[Int, SalesTransaction] =
 df.map { row =>
  val mKey : Int = row.getAs("TransactionId") // the Int annotation forces getAs to return an Int
  val mValue : SalesTransaction = new SalesTransaction(row.getAs("CustomerId")
   , row.getAs("TransactionAttributesScore")
   , row.getAs("Revenue")
   , row.getAs("TransactionType")
  )
  (mKey, mValue) // the tuple is the result of the block for this row
 }.collect.toMap
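Why the annotated variable helps can be sketched in plain Scala, assuming a simplified stand-in for Row (FakeRow, Tx and the sample values are hypothetical): when the type parameter appears only in the result type, the compiler takes it from the expected type at the call site, such as a declared variable type or a constructor parameter type.

```scala
object ExpectedTypeDemo {
  // Hypothetical, simplified stand-in for a Spark Row
  final case class FakeRow(values: Map[String, Any]) {
    def getAs[T](name: String): T = values(name).asInstanceOf[T]
  }

  // Hypothetical, trimmed-down transaction type
  final case class Tx(customerId: Int, revenue: Double)

  val row = FakeRow(Map("TransactionId" -> 3, "CustomerId" -> 1, "Revenue" -> 70.0))

  // The declared type Int is the expected type, so T is inferred as Int
  val key: Int = row.getAs("TransactionId")

  // Constructor parameter types play the same role for each argument
  val value: Tx = Tx(row.getAs("CustomerId"), row.getAs("Revenue"))
}
```

This is also why the constructor arguments in the snippet above work without a type parameter, while the bare first element of the tuple in the original code had no expected type to guide inference.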

answered Nov 9 at 20:44

Dan
