In R data.table, what is the difference between using dot (.) and c in j with .SD? How does it work?
Take DT as the mtcars data table.
library(data.table)
DT <- as.data.table(mtcars)
When I pass multiple expressions in j along with .SD and wrap them in dot, .(), as in the code below,
DT[ , .(lapply(.SD, sum), .N), by = (cyl) ]
the result comes out vertically, without the column names.
O/P:
   cyl     V1 N
1:   6  138.2 7
2:   6 1283.2 7
3:   6    856 7
4:   6   25.1 7
5:   6  21.82 7
6:   6 125.84 7
7:   6      4 7
8:   6      3 7
But when I replace the dot (.) in j with c, as below,
DT[ , c(lapply(.SD, sum), .N), by = (cyl) ]
the result comes out horizontally.
O/P:
   cyl   mpg   disp   hp  drat     wt   qsec vs am gear carb  N
     6 138.2 1283.2  856 25.10 21.820 125.84  4  3   27   24  7
     4 293.3 1156.5  909 44.78 25.143 210.51 10  8   45   17 11
     8 211.4 4943.4 2929 45.21 55.989 234.81  0  2   46   49 14
In another case, without lapply, exactly the opposite happens.
DT[ , c(sum(mpg), .N), by = (cyl) ]
gives the output vertically
O/P:
   cyl    V1
1:   6 138.2
2:   6   7.0
3:   4 293.3
4:   4  11.0
5:   8 211.4
6:   8  14.0
whereas a dot (.) in j gives the output horizontally.
DT[ , .(sum(mpg), .N), by = (cyl) ]
O/P:
   cyl    V1  N
1:   6 138.2  7
2:   4 293.3 11
3:   8 211.4 14
Why does this happen? Why is the result laid out this way?
r data.table
asked yesterday
Deb
1 Answer
accepted
DT[, .(sum(mpg), .N), by = (cyl) ] # creates a list with 2 elements (2 columns)
DT[, list(sum(mpg), .N), by = (cyl) ] # same as above; .() is an alias for list()
DT[, c(sum(mpg), .N), by = (cyl) ] # creates a vector of length 2 (2 rows per group)
Another simplified example:
DT[ , .(col1 = 1, col2 = 2), by = (cyl) ]
DT[ , list(col1 = 1, col2 = 2), by = (cyl) ]
DT[ , c(element1 = 1, element2 = 2), by = (cyl) ]
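The shapes of these results can be checked directly. This is a minimal sketch, assuming the data.table package is installed; the names wide and long are only illustrative and do not come from the original code.

```r
library(data.table)

DT <- as.data.table(mtcars)

# A list in j: each list element becomes a column, one row per group.
wide <- DT[, .(col1 = 1, col2 = 2), by = cyl]

# An atomic vector in j: each vector element becomes a row of a single column.
long <- DT[, c(element1 = 1, element2 = 2), by = cyl]

dim(wide)  # 3 groups by 3 columns (cyl, col1, col2)
dim(long)  # 6 rows (2 per group) by 2 columns
```

Comparing dim() on both results is a quick way to see whether j returned a list or a vector.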
To address your last point,
DT[ , c(element1 = 1, element2 = 2, element3 = list(3)), by = (cyl) ]
DT[ , c(element1 = 1, element2 = 2, element3 = 3 ), by = (cyl) ]
The behavior comes from how the c function works.
Because lapply (list apply) returns a list, c appends .N as a new list element in c(lapply(.SD, sum), .N).
So you end up with n list elements and therefore n columns.
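The same distinction can be reproduced in base R, without data.table. This is only a sketch: the three-column subset of mtcars and the names sums, flat and nested are assumptions made for illustration. It also hints at why .(lapply(.SD, sum), .N) comes out vertically: list() keeps the lapply result as one nested element, which data.table then appears to treat as a single list column.

```r
# Named list with one sum per column (stands in for lapply(.SD, sum)).
sums <- lapply(mtcars[c("mpg", "disp", "hp")], sum)
n <- nrow(mtcars)  # stands in for .N

# c() on a list appends n as one more list element: a flat list of length 4.
flat <- c(sums, N = n)

# list() wraps the whole lapply result as a single element: length 2, nested.
nested <- list(sums, N = n)

length(flat)            # 4
names(flat)             # "mpg" "disp" "hp" "N"
length(nested)          # 2
is.list(nested[[1]])    # TRUE
```

A flat list of four elements maps to four columns, while the nested list has only two top-level elements, the first of which is itself a list.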
Just for fun:
DT[ , c(lapply(.SD, sum), .N), by = (cyl) ]
DT[ , c(sapply(.SD, sum), .N), by = (cyl) ] # sapply simplifies the result to a vector, so c() combines everything into one vector and you end up with many rows
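The lapply versus sapply difference can also be seen outside data.table. The sketch below uses a hypothetical three-column subset of mtcars, and the variable names are illustrative only.

```r
cols <- mtcars[c("mpg", "disp", "hp")]
n <- nrow(cols)  # stands in for .N

from_lapply <- c(lapply(cols, sum), N = n)  # lapply returns a list, so c() gives a list
from_sapply <- c(sapply(cols, sum), N = n)  # sapply simplifies to a numeric vector

is.list(from_lapply)     # TRUE
is.list(from_sapply)     # FALSE
is.numeric(from_sapply)  # TRUE
length(from_sapply)      # 4
```

Both results have four elements, but only the list version is spread across columns in j; the atomic vector is stacked into rows.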
edited yesterday
answered yesterday
Andre Elrico
@Deb see if this is enough explanation.
– Andre Elrico
yesterday
Thank you so much! This explains a lot. But I still have one doubt.
DT[ , (lapply(.SD, sum)), by = cyl ]
returns a list with all columns, as I understand it, but why does
DT[ , .(lapply(.SD, sum)), by = cyl ]
return the result in many rows when lapply should also return a list? What is the dot changing here?
– Deb
yesterday