dcast not retaining variable type as character, error in vapply when a variable is NA

In attempting to reshape data using reshape2::dcast, I am encountering an error involving NA entries. Sample data are at the end.

The data are reshaped from long to wide, but on occasion some parameters have all NA entries, which appears to be causing the issue. Or at least, I think it is. If I remove any parameter like that, Ammonia in this example, the error goes away.

Debugging dcast, I traced the error to this line:

ordered <- vaggregate(.value = value, .group = overall, 
 .fun = fun.aggregate, ..., .default = fill, .n = n)

which results in the error:

Error in vapply(indices, fun, .default) : 
values must be type 'character',
but FUN(X[[1]]) result is type 'integer'

Seeing that the NA variable is first in line, I thought the aggregate function might default to integer even though the entire column is character, but moving those rows did not solve it. The only fix I have found is na.omit, which removes that parameter completely. My expected output would retain any parameters that are all NA, if possible. In addition, if a day/depth is not sampled, it should be retained and its entries should be ns (not sampled). Is there a way to solve this error without removing every all-NA parameter before reshaping?

Reproducible example (data are below dcast code):

library(reshape2)
dcast(df, station + date + depth ~ parmcode, value.var = "value_qualif", fill="ns") # throws the vapply error shown above

dcast(na.omit(df), station + date + depth ~ parmcode,
 value.var = "value_qualif", fill="ns") # solves error, but removes parameter completely

Example data:

df <- structure(list(station = c("A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A"), date = c("7/2/2018", 
"7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", 
"7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", 
"7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", 
"7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", 
"7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/2/2018", "7/1/2018", 
"7/1/2018", "7/1/2018", "7/1/2018", "7/1/2018", "7/1/2018", "7/1/2018", 
"7/1/2018", "7/1/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", 
"7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", 
"7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", 
"7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", 
"7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", "7/9/2018", 
"7/9/2018", "7/9/2018"), depth = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 1L, 1L, 1L, 
12L, 12L, 12L, 18L, 18L, 18L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 18L, 
18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L), parmcode = c("CDOM", 
"DENSITY", "DO", "ENTERO", "PH", "TOTAL", "XMS", "TEMP", "SAL", 
"FECAL", "TOTAL", "FECAL", "ENTERO", "CDOM", "XMS", "TEMP", "SAL", 
"PH", "DO", "DENSITY", "DO", "DENSITY", "TOTAL", "FECAL", "PH", 
"CDOM", "XMS", "TEMP", "SAL", "ENTERO", "AMMONIA AS N", "AMMONIA AS N", 
"AMMONIA AS N", "AMMONIA AS N", "AMMONIA AS N", "AMMONIA AS N", 
"AMMONIA AS N", "AMMONIA AS N", "AMMONIA AS N", "TOTAL", "XMS", 
"TEMP", "SAL", "PH", "DO", "DENSITY", "CDOM", "FECAL", "ENTERO", 
"CDOM", "FECAL", "ENTERO", "PH", "DO", "TEMP", "XMS", "TOTAL", 
"DENSITY", "SAL", "TOTAL", "FECAL", "ENTERO", "XMS", "TEMP", 
"SAL", "PH", "DO", "DENSITY", "CDOM"), value_qualif = c(NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, "<2", "<2", "<2", "1.3", "69.67", 
"16.6", "33.7", "8.1", "7.6", "24.622", "5.5", "25.279", "<2", 
"<2", "7.8", "1.38", "72.96", "13.2", "33.61", "<2", NA, NA, 
NA, NA, NA, NA, NA, NA, NA, "<2", "77.82", "20.8", "33.72", "8.2", 
"8.8", "23.58", "1.01", "<2", "<2", "1.78", "<2", "<2", "8", 
"6.5", "13.5", "67.19", "2e", "25.197", "33.58", "2e", "2e", 
"<2", "75.53", "12.9", "33.61", "7.9", "5.5", "25.34", "1.77"
)), class = "data.frame", row.names = c(NA, -69L))

Some tangentially related questions that don't answer my question are "POSIXct values become numeric in reshape2 dcast" and "Error with custom aggregate function for a cast() call in R reshape2".

With na.omit, my output is:

  station     date depth CDOM DENSITY  DO ENTERO FECAL  PH   SAL TEMP TOTAL   XMS
1       A 7/2/2018    12  1.3  24.622 7.6     <2    <2 8.1  33.7 16.6    <2 69.67
2       A 7/2/2018    18 1.38  25.279 5.5     <2    <2 7.8 33.61 13.2    <2 72.96
3       A 7/9/2018     1 1.01   23.58 8.8     <2    <2 8.2 33.72 20.8    <2 77.82
4       A 7/9/2018    12 1.78  25.197 6.5     <2    <2   8 33.58 13.5    2e 67.19
5       A 7/9/2018    18 1.77   25.34 5.5     <2    2e 7.9 33.61 12.9    2e 75.53

Expected output without using na.omit is:

  station     date depth AMMONIA CDOM DENSITY  DO ENTERO FECAL  PH   SAL TEMP TOTAL   XMS
1       A 7/2/2018     1      ns   ns      ns  ns     ns    ns  ns    ns   ns    ns    ns
2       A 7/2/2018    12      ns  1.3  24.622 7.6     <2    <2 8.1  33.7 16.6    <2 69.67
3       A 7/2/2018    18      ns 1.38  25.279 5.5     <2    <2 7.8 33.61 13.2    <2 72.96
4       A 7/9/2018     1      ns 1.01   23.58 8.8     <2    <2 8.2 33.72 20.8    <2 77.82
5       A 7/9/2018    12      ns 1.78  25.197 6.5     <2    <2   8 33.58 13.5    2e 67.19
6       A 7/9/2018    18      ns 1.77   25.34 5.5     <2    2e 7.9 33.61 12.9    2e 75.53

asked Nov 9 at 15:06

Anonymous coward


r reshape2 dcast


1 Answer


accepted

The actual issue is that every parameter has at most one value for each triple of (station, date, depth), except for AMMONIA AS N, which has three NA entries for each of its triples.

For instance,

dcast(df, station + date + depth ~ parmcode, value.var = "value_qualif")
# Aggregation function missing: defaulting to length
#   station     date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1       A 7/1/2018     1            3    0       0  0      0     0  0   0    0     0   0
# 2       A 7/1/2018    12            3    0       0  0      0     0  0   0    0     0   0
# 3       A 7/1/2018    18            3    0       0  0      0     0  0   0    0     0   0
# 4       A 7/2/2018     1            0    1       1  1      1     1  1   1    1     1   1
# 5       A 7/2/2018    12            0    1       1  1      1     1  1   1    1     1   1
# 6       A 7/2/2018    18            0    1       1  1      1     1  1   1    1     1   1
# 7       A 7/9/2018     1            0    1       1  1      1     1  1   1    1     1   1
# 8       A 7/9/2018    12            0    1       1  1      1     1  1   1    1     1   1
# 9       A 7/9/2018    18            0    1       1  1      1     1  1   1    1     1   1
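The "defaulting to length" message also hints at where the error in the question likely comes from: with duplicates present and no fun.aggregate supplied, the aggregation returns integer counts, while fill = "ns" asks for a character result. Below is a minimal base-R sketch of that kind of mismatch. It assumes, based on the error text in the question, that the fill value ends up as the template of a vapply call; the groups list is a hypothetical stand-in for one group of duplicated values.

```r
# Hypothetical stand-in for one group holding three duplicate NA values
groups <- list(c(NA_character_, NA_character_, NA_character_))

# length() returns an integer, but the template "ns" demands a character,
# so vapply() refuses the result; the error message is captured as a string
res <- tryCatch(
  vapply(groups, length, "ns"),
  error = function(e) conditionMessage(e)
)
print(res)

# With an integer template the same call succeeds and yields the count
counts <- vapply(groups, length, integer(1))
print(counts)
```

So the NA values themselves are not the problem here; the duplicated key combinations are.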

Once we remove the duplicate rows, everything works smoothly:

dcast(df[!duplicated(df), ], station + date + depth ~ parmcode, value.var = "value_qualif")
#   station     date depth AMMONIA AS N CDOM DENSITY   DO ENTERO FECAL   PH   SAL TEMP TOTAL   XMS
# 1       A 7/1/2018     1         <NA> <NA>    <NA> <NA>   <NA>  <NA> <NA>  <NA> <NA>  <NA>  <NA>
# 2       A 7/1/2018    12         <NA> <NA>    <NA> <NA>   <NA>  <NA> <NA>  <NA> <NA>  <NA>  <NA>
# 3       A 7/1/2018    18         <NA> <NA>    <NA> <NA>   <NA>  <NA> <NA>  <NA> <NA>  <NA>  <NA>
# 4       A 7/2/2018     1         <NA> <NA>    <NA> <NA>   <NA>  <NA> <NA>  <NA> <NA>  <NA>  <NA>
# 5       A 7/2/2018    12         <NA>  1.3  24.622  7.6     <2    <2  8.1  33.7 16.6    <2 69.67
# 6       A 7/2/2018    18         <NA> 1.38  25.279  5.5     <2    <2  7.8 33.61 13.2    <2 72.96
# 7       A 7/9/2018     1         <NA> 1.01   23.58  8.8     <2    <2  8.2 33.72 20.8    <2 77.82
# 8       A 7/9/2018    12         <NA> 1.78  25.197  6.5     <2    <2    8 33.58 13.5    2e 67.19
# 9       A 7/9/2018    18         <NA> 1.77   25.34  5.5     <2    2e  7.9 33.61 12.9    2e 75.53

dcast(df[!duplicated(df), ], station + date + depth ~ parmcode, value.var = "value_qualif", fill = "ns")
#   station     date depth AMMONIA AS N CDOM DENSITY  DO ENTERO FECAL  PH   SAL TEMP TOTAL   XMS
# 1       A 7/1/2018     1           ns   ns      ns  ns     ns    ns  ns    ns   ns    ns    ns
# 2       A 7/1/2018    12           ns   ns      ns  ns     ns    ns  ns    ns   ns    ns    ns
# 3       A 7/1/2018    18           ns   ns      ns  ns     ns    ns  ns    ns   ns    ns    ns
# 4       A 7/2/2018     1           ns   ns      ns  ns     ns    ns  ns    ns   ns    ns    ns
# 5       A 7/2/2018    12           ns  1.3  24.622 7.6     <2    <2 8.1  33.7 16.6    <2 69.67
# 6       A 7/2/2018    18           ns 1.38  25.279 5.5     <2    <2 7.8 33.61 13.2    <2 72.96
# 7       A 7/9/2018     1           ns 1.01   23.58 8.8     <2    <2 8.2 33.72 20.8    <2 77.82
# 8       A 7/9/2018    12           ns 1.78  25.197 6.5     <2    <2   8 33.58 13.5    2e 67.19
# 9       A 7/9/2018    18           ns 1.77   25.34 5.5     <2    2e 7.9 33.61 12.9    2e 75.53

Alternatively, you could run

dcast(df, station + date + depth ~ parmcode, value.var = "value_qualif", 
 fill = NA_character_, fun.aggregate = head, n = 1)
#   station     date depth AMMONIA AS N CDOM DENSITY   DO ENTERO FECAL   PH   SAL TEMP TOTAL   XMS
# 1       A 7/1/2018     1         <NA> <NA>    <NA> <NA>   <NA>  <NA> <NA>  <NA> <NA>  <NA>  <NA>
# 2       A 7/1/2018    12         <NA> <NA>    <NA> <NA>   <NA>  <NA> <NA>  <NA> <NA>  <NA>  <NA>
# 3       A 7/1/2018    18         <NA> <NA>    <NA> <NA>   <NA>  <NA> <NA>  <NA> <NA>  <NA>  <NA>
# 4       A 7/2/2018     1         <NA> <NA>    <NA> <NA>   <NA>  <NA> <NA>  <NA> <NA>  <NA>  <NA>
# 5       A 7/2/2018    12         <NA>  1.3  24.622  7.6     <2    <2  8.1  33.7 16.6    <2 69.67
# 6       A 7/2/2018    18         <NA> 1.38  25.279  5.5     <2    <2  7.8 33.61 13.2    <2 72.96
# 7       A 7/9/2018     1         <NA> 1.01   23.58  8.8     <2    <2  8.2 33.72 20.8    <2 77.82
# 8       A 7/9/2018    12         <NA> 1.78  25.197  6.5     <2    <2    8 33.58 13.5    2e 67.19
# 9       A 7/9/2018    18         <NA> 1.77   25.34  5.5     <2    2e  7.9 33.61 12.9    2e 75.53
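The head-based variant leaves <NA> in the cells. If ns is wanted there as well, one option is to replace the missing values after casting. The following is a base-R sketch on a small hypothetical wide table; the wide data frame and the choice of id columns are assumptions for illustration, not output of the call above.

```r
# Hypothetical wide result with character value columns containing NA
wide <- data.frame(
  station = c("A", "A"),
  depth   = c(1L, 12L),
  CDOM    = c(NA, "1.3"),
  DO      = c(NA, "7.6"),
  stringsAsFactors = FALSE
)

# Everything except the id columns is treated as a value column
id_cols    <- c("station", "depth")
value_cols <- setdiff(names(wide), id_cols)

# Replace NA with "ns" column by column, leaving the id columns untouched
wide[value_cols] <- lapply(wide[value_cols], function(col) {
  col[is.na(col)] <- "ns"
  col
})

print(wide)
```

Restricting the replacement to the value columns keeps an integer column such as depth from being coerced to character.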

See this answer regarding NA_character_.
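As a rough illustration of why the typed NA matters: a plain NA is logical, so it cannot serve as a template for character results, whereas NA_character_ can. The sketch below uses base R only; groups and first_or_na are hypothetical names, not part of reshape2.

```r
# Hypothetical groups: one with two values, one empty
groups <- list(c("<2", "<2"), character(0))

# Return the first value, or a character NA for an empty group
first_or_na <- function(x) if (length(x) > 0) x[1] else NA_character_

typeof(NA)             # "logical"
typeof(NA_character_)  # "character"

# A character template accepts both "<2" and the character NA
ok <- vapply(groups, first_or_na, NA_character_)

# A plain NA template is logical, so the character results are rejected
bad <- tryCatch(
  vapply(groups, first_or_na, NA),
  error = function(e) "type mismatch"
)
```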

answered Nov 9 at 15:48

Julius Vainora


I see now, those may be sampled in triplicate. This was the inverse of what I thought was wrong. I had to modify the duplicate call since I have unique sample IDs for those duplicates in my actual data, but this did the trick.
– Anonymous coward
Nov 9 at 16:26
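For data like that described in the comment, where an extra ID column makes otherwise duplicate rows distinct, duplicated() can be restricted to the key columns. This is a sketch only: sample_id is a hypothetical column name, and keeping the first replicate is an assumption that may not suit data where the replicates carry different values.

```r
# Toy data: three replicate rows that differ only in a hypothetical sample_id
toy <- data.frame(
  station      = "A",
  date         = "7/1/2018",
  depth        = 1L,
  parmcode     = "AMMONIA AS N",
  sample_id    = c("s1", "s2", "s3"),
  value_qualif = NA_character_,
  stringsAsFactors = FALSE
)

keys <- c("station", "date", "depth", "parmcode")

# On full rows nothing counts as a duplicate, because sample_id differs
any(duplicated(toy))

# Restricting the check to the key columns keeps one row per combination
toy_unique <- toy[!duplicated(toy[keys]), ]
nrow(toy_unique)
```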

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53228268%2fdcast-not-retaining-variable-type-as-character-error-in-vapply-when-a-variable%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

The actual issue is that all the parameters have only one value for each triple of (station, date, depth) except for AMMONIA AS N, which has three NA entries.

For instance,

dcast(df, station + date + depth ~ parmcode, value.var = "value_qualif")
# Aggregation function missing: defaulting to length
# station date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1 A 7/1/2018 1 3 0 0 0 0 0 0 0 0 0 0
# 2 A 7/1/2018 12 3 0 0 0 0 0 0 0 0 0 0
# 3 A 7/1/2018 18 3 0 0 0 0 0 0 0 0 0 0
# 4 A 7/2/2018 1 0 1 1 1 1 1 1 1 1 1 1
# 5 A 7/2/2018 12 0 1 1 1 1 1 1 1 1 1 1
# 6 A 7/2/2018 18 0 1 1 1 1 1 1 1 1 1 1
# 7 A 7/9/2018 1 0 1 1 1 1 1 1 1 1 1 1
# 8 A 7/9/2018 12 0 1 1 1 1 1 1 1 1 1 1
# 9 A 7/9/2018 18 0 1 1 1 1 1 1 1 1 1 1

Once we remove the duplicate rows everything works smoothly

dcast(df[!duplicated(df), ], station + date + depth ~ parmcode, value.var = "value_qualif")
# station date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1 A 7/1/2018 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 2 A 7/1/2018 12 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 3 A 7/1/2018 18 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4 A 7/2/2018 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 5 A 7/2/2018 12 <NA> 1.3 24.622 7.6 <2 <2 8.1 33.7 16.6 <2 69.67
# 6 A 7/2/2018 18 <NA> 1.38 25.279 5.5 <2 <2 7.8 33.61 13.2 <2 72.96
# 7 A 7/9/2018 1 <NA> 1.01 23.58 8.8 <2 <2 8.2 33.72 20.8 <2 77.82
# 8 A 7/9/2018 12 <NA> 1.78 25.197 6.5 <2 <2 8 33.58 13.5 2e 67.19
# 9 A 7/9/2018 18 <NA> 1.77 25.34 5.5 <2 2e 7.9 33.61 12.9 2e 75.53

dcast(df[!duplicated(df), ], station + date + depth ~ parmcode, value.var = "value_qualif", fill = "ns")
# station date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1 A 7/1/2018 1 ns ns ns ns ns ns ns ns ns ns ns
# 2 A 7/1/2018 12 ns ns ns ns ns ns ns ns ns ns ns
# 3 A 7/1/2018 18 ns ns ns ns ns ns ns ns ns ns ns
# 4 A 7/2/2018 1 ns ns ns ns ns ns ns ns ns ns ns
# 5 A 7/2/2018 12 ns 1.3 24.622 7.6 <2 <2 8.1 33.7 16.6 <2 69.67
# 6 A 7/2/2018 18 ns 1.38 25.279 5.5 <2 <2 7.8 33.61 13.2 <2 72.96
# 7 A 7/9/2018 1 ns 1.01 23.58 8.8 <2 <2 8.2 33.72 20.8 <2 77.82
# 8 A 7/9/2018 12 ns 1.78 25.197 6.5 <2 <2 8 33.58 13.5 2e 67.19
# 9 A 7/9/2018 18 ns 1.77 25.34 5.5 <2 2e 7.9 33.61 12.9 2e 75.53

Alternatively, you could run

dcast(df, station + date + depth ~ parmcode, value.var = "value_qualif", 
 fill = NA_character_, fun.aggregate = head, n = 1)
# station date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1 A 7/1/2018 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 2 A 7/1/2018 12 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 3 A 7/1/2018 18 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4 A 7/2/2018 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 5 A 7/2/2018 12 <NA> 1.3 24.622 7.6 <2 <2 8.1 33.7 16.6 <2 69.67
# 6 A 7/2/2018 18 <NA> 1.38 25.279 5.5 <2 <2 7.8 33.61 13.2 <2 72.96
# 7 A 7/9/2018 1 <NA> 1.01 23.58 8.8 <2 <2 8.2 33.72 20.8 <2 77.82
# 8 A 7/9/2018 12 <NA> 1.78 25.197 6.5 <2 <2 8 33.58 13.5 2e 67.19
# 9 A 7/9/2018 18 <NA> 1.77 25.34 5.5 <2 2e 7.9 33.61 12.9 2e 75.53

See this answer regarding NA_character_.

answered Nov 9 at 15:48

Julius Vainora

26.3k75877

I see now, those may be sampled in triplicate. This was the inverse of what I thought was wrong. I had to modify the duplicate call since I have unique sample IDs for those duplicates in my actual data, but this did the trick.
– Anonymous coward
Nov 9 at 16:26

add a comment |

up vote
1
down vote

accepted

The actual issue is that all the parameters have only one value for each triple of (station, date, depth) except for AMMONIA AS N, which has three NA entries.

For instance,

dcast(df, station + date + depth ~ parmcode, value.var = "value_qualif")
# Aggregation function missing: defaulting to length
# station date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1 A 7/1/2018 1 3 0 0 0 0 0 0 0 0 0 0
# 2 A 7/1/2018 12 3 0 0 0 0 0 0 0 0 0 0
# 3 A 7/1/2018 18 3 0 0 0 0 0 0 0 0 0 0
# 4 A 7/2/2018 1 0 1 1 1 1 1 1 1 1 1 1
# 5 A 7/2/2018 12 0 1 1 1 1 1 1 1 1 1 1
# 6 A 7/2/2018 18 0 1 1 1 1 1 1 1 1 1 1
# 7 A 7/9/2018 1 0 1 1 1 1 1 1 1 1 1 1
# 8 A 7/9/2018 12 0 1 1 1 1 1 1 1 1 1 1
# 9 A 7/9/2018 18 0 1 1 1 1 1 1 1 1 1 1

Once we remove the duplicate rows everything works smoothly

dcast(df[!duplicated(df), ], station + date + depth ~ parmcode, value.var = "value_qualif")
# station date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1 A 7/1/2018 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 2 A 7/1/2018 12 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 3 A 7/1/2018 18 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4 A 7/2/2018 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 5 A 7/2/2018 12 <NA> 1.3 24.622 7.6 <2 <2 8.1 33.7 16.6 <2 69.67
# 6 A 7/2/2018 18 <NA> 1.38 25.279 5.5 <2 <2 7.8 33.61 13.2 <2 72.96
# 7 A 7/9/2018 1 <NA> 1.01 23.58 8.8 <2 <2 8.2 33.72 20.8 <2 77.82
# 8 A 7/9/2018 12 <NA> 1.78 25.197 6.5 <2 <2 8 33.58 13.5 2e 67.19
# 9 A 7/9/2018 18 <NA> 1.77 25.34 5.5 <2 2e 7.9 33.61 12.9 2e 75.53

dcast(df[!duplicated(df), ], station + date + depth ~ parmcode, value.var = "value_qualif", fill = "ns")
# station date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1 A 7/1/2018 1 ns ns ns ns ns ns ns ns ns ns ns
# 2 A 7/1/2018 12 ns ns ns ns ns ns ns ns ns ns ns
# 3 A 7/1/2018 18 ns ns ns ns ns ns ns ns ns ns ns
# 4 A 7/2/2018 1 ns ns ns ns ns ns ns ns ns ns ns
# 5 A 7/2/2018 12 ns 1.3 24.622 7.6 <2 <2 8.1 33.7 16.6 <2 69.67
# 6 A 7/2/2018 18 ns 1.38 25.279 5.5 <2 <2 7.8 33.61 13.2 <2 72.96
# 7 A 7/9/2018 1 ns 1.01 23.58 8.8 <2 <2 8.2 33.72 20.8 <2 77.82
# 8 A 7/9/2018 12 ns 1.78 25.197 6.5 <2 <2 8 33.58 13.5 2e 67.19
# 9 A 7/9/2018 18 ns 1.77 25.34 5.5 <2 2e 7.9 33.61 12.9 2e 75.53

Alternatively, you could run

dcast(df, station + date + depth ~ parmcode, value.var = "value_qualif", 
 fill = NA_character_, fun.aggregate = head, n = 1)
# station date depth AMMONIA AS N CDOM DENSITY DO ENTERO FECAL PH SAL TEMP TOTAL XMS
# 1 A 7/1/2018 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 2 A 7/1/2018 12 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 3 A 7/1/2018 18 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4 A 7/2/2018 1 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 5 A 7/2/2018 12 <NA> 1.3 24.622 7.6 <2 <2 8.1 33.7 16.6 <2 69.67
# 6 A 7/2/2018 18 <NA> 1.38 25.279 5.5 <2 <2 7.8 33.61 13.2 <2 72.96
# 7 A 7/9/2018 1 <NA> 1.01 23.58 8.8 <2 <2 8.2 33.72 20.8 <2 77.82
# 8 A 7/9/2018 12 <NA> 1.78 25.197 6.5 <2 <2 8 33.58 13.5 2e 67.19
# 9 A 7/9/2018 18 <NA> 1.77 25.34 5.5 <2 2e 7.9 33.61 12.9 2e 75.53

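See this answer regarding NA_character_. In short, a plain NA is logical, whereas fill has to have the same type (character) as the values returned by the aggregation function.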
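
The type check can be reproduced with base R alone. The sketch below assumes that dcast hands `fill` to `vapply` as the result template, which is what the `vaggregate` call and the error message quoted in the question suggest. The list `groups` and the helper `msg` are made up for illustration.

```r
# Two groups of character values; group "b" mimics the triplicated NA rows.
groups <- list(a = "1.3", b = c(NA_character_, NA_character_, NA_character_))

# Helper (hypothetical): return the error message instead of stopping.
msg <- function(expr) {
  tryCatch({ expr; "ok" }, error = function(e) conditionMessage(e))
}

# length() returns integers, so a character template such as "ns" is rejected.
# This is the situation in the question: duplicates make dcast default to length.
msg(vapply(groups, length, "ns"))

# head(x, 1) returns character, so a logical template (plain NA) is rejected too.
msg(vapply(groups, function(x) head(x, 1), NA))

# A character template matches the character results, so this call succeeds.
first <- vapply(groups, function(x) head(x, 1), NA_character_)
first
```

The first two calls fail for the same reason: the template and the function result have different types. Only the third combination, a character-returning function with a character template, goes through.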

answered Nov 9 at 15:48

Julius Vainora

26.3k75877

I see now, those may be sampled in triplicate. This was the inverse of what I thought was wrong. I had to modify the duplicate call since I have unique sample IDs for those duplicates in my actual data, but this did the trick.
– Anonymous coward
Nov 9 at 16:26
