Subset a dataframe using colnames from another dataframe
Question:
I have a particular problem where I want to subset a given dataframe columnwise where the column names are stored in another dataframe.
Example using mtcars dataset:
options(stringsAsFactors = FALSE)
col_names <- c("hp,disp", "disp,hp,mpg")
df_col_names <- as.data.frame(col_names)
vec <- df_col_names[1,] # first row contains "hp" and "disp"
mtcars_new <- mtcars[, c("hp", "disp")] ## assuming that vec gives colnames
I even tried inserting double quotes to each of the words using the following:
Attempted solution:
options(stringsAsFactors = FALSE)
col_names <- c("hp,disp", "disp,hp,mpg")
df_col_names <- as.data.frame(col_names)
df_col_names$col_names <- gsub("(\\w+)", '"\\1"', df_col_names$col_names)
vec <- df_col_names[1,]
vec2 <- gsub("(\\w+)", '"\\1"', vec)
mtcars_new <- mtcars[,vec2] ## this should be same as mtcars[, c("hp", "disp")]
Expected Solution
mtcars_new <- mtcars[,vec2]
is equal to mtcars_new <- mtcars[, c("hp", "disp")]
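A quick check in base R suggests why the quoting attempt cannot work: gsub() still returns one string that merely contains quote characters, so `[` looks for a single column with that literal name. This is only a sketch; the names spec, quoted, result and cols are illustrative and not from the original post.

```r
# Reproduce the quoting attempt on a plain string (base R only).
spec <- "hp,disp"

# Wrapping each word in double quotes still yields ONE string, not two names.
quoted <- gsub("(\\w+)", '"\\1"', spec)
length(quoted)  # 1

# Indexing with that single string asks for a column whose name includes the
# quotes and the comma; no such column exists, so an error is raised.
result <- tryCatch(mtcars[, quoted], error = function(e) conditionMessage(e))

# Splitting on the comma gives a character vector of length 2 instead.
cols <- strsplit(spec, ",", fixed = TRUE)[[1]]
identical(mtcars[, cols], mtcars[, c("hp", "disp")])  # TRUE
```

Column names have to arrive as separate elements of a character vector; adding quote characters inside a string does not turn it into several strings.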
r dataframe
Not clear what kind of output you want. Is it one data frame with columns hp,disp and another one with disp,hp,mpg? Please show how you want it to be.
– AntoniosK
Nov 13 '18 at 10:50
Ok, editing the question to make it more clear: it's hp,disp
– Anubhav Dikshit
Nov 13 '18 at 10:50
Question asked Nov 13 '18 at 10:47 by Anubhav Dikshit; edited Nov 13 '18 at 10:50
2 Answers
Do you need this?
lapply(strsplit(as.character(df_col_names$col_names), ","), function(x) mtcars[x])
#[[1]]
#                   hp  disp
#Mazda RX4         110 160.0
#Mazda RX4 Wag     110 160.0
#Datsun 710         93 108.0
#Hornet 4 Drive    110 258.0
#Hornet Sportabout 175 360.0
#.....
#[[2]]
#                   disp  hp  mpg
#Mazda RX4         160.0 110 21.0
#Mazda RX4 Wag     160.0 110 21.0
#Datsun 710        108.0  93 22.8
#Hornet 4 Drive    258.0 110 21.4
#Hornet Sportabout 360.0 175 18.7
#....
Here, we split each string of column names on the comma (",") and then use lapply to subset those columns from the data frame. This returns a list of data frames, one per row of df_col_names, so the list is as long as df_col_names has rows.
If you only want the columns named in the first row, you could do
mtcars[strsplit(as.character(df_col_names$col_names[1]), ",")[[1]]]
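If the strings may contain stray spaces or misspelled names, the same split-and-subset idea can be wrapped in a small function. The following is only a sketch under those assumptions; subset_by_spec is a hypothetical helper name, not something from the answer or from any package.

```r
# Hypothetical helper: turn comma-separated column specs into a named list
# of data frames, failing early on names that are not present in `data`.
subset_by_spec <- function(data, specs) {
  specs <- as.character(specs)
  # Split each spec on the comma and trim surrounding whitespace.
  pieces <- lapply(strsplit(specs, ",", fixed = TRUE), trimws)
  out <- lapply(pieces, function(cols) {
    missing_cols <- setdiff(cols, names(data))
    if (length(missing_cols) > 0) {
      stop("unknown column(s): ", paste(missing_cols, collapse = ", "))
    }
    data[cols]  # list-style indexing always returns a data frame
  })
  names(out) <- specs  # label each result with the spec it came from
  out
}

# Same specs as in the question, with an extra space added for illustration.
df_col_names <- data.frame(col_names = c("hp,disp", "disp, hp,mpg"),
                           stringsAsFactors = FALSE)
res <- subset_by_spec(mtcars, df_col_names$col_names)
names(res[["hp,disp"]])  # "hp" "disp"
names(res[[2]])          # "disp" "hp" "mpg"
```

Naming the list elements after the original strings makes it easier to pick out one result later without relying on row position.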
Here's another way to do this:
library(stringr)  # provides str_split()
col_names <- c("hp,disp", "disp,hp,mpg")
vec2 <- unlist(str_split(col_names[[1]], ","))
mtcars_new <- mtcars[, vec2]
This picks the first element of the col_names vector, splits it on the separator, unlists the result (because str_split() returns a list), and then uses the resulting vector of names to subset the mtcars data frame.
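One detail worth checking with this style of indexing: when the split yields a single name, mtcars[, vec2] drops to a plain vector rather than a data frame. The sketch below uses base R's strsplit() in place of str_split() so it runs without extra packages; the variable names are illustrative.

```r
# Splitting returns a list with one element per input string.
parts <- strsplit("hp,disp", ",", fixed = TRUE)
is.list(parts)  # TRUE

# A spec that names just one column.
one_name <- strsplit("hp", ",", fixed = TRUE)[[1]]

as_vector <- mtcars[, one_name]                # matrix-style: drops to a vector
kept      <- mtcars[, one_name, drop = FALSE]  # stays a data frame
by_list   <- mtcars[one_name]                  # list-style: stays a data frame

is.data.frame(as_vector)  # FALSE
is.data.frame(kept)       # TRUE
is.data.frame(by_list)    # TRUE
```

If later code expects a data frame regardless of how many names were given, drop = FALSE or list-style indexing is probably the safer choice.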
I accepted @Ronak answer but I upvoted yours. Thanks so much!
– Anubhav Dikshit
Nov 13 '18 at 11:09
First answer (strsplit with lapply): answered Nov 13 '18 at 11:01 by Ronak Shah
Second answer (str_split): answered Nov 13 '18 at 11:03 by Randall Helms