fitting grouped regression model and extrapolating

I have a dataframe with the following columns: electricity consumption E (over 24 hours), hour h and temperature t.
I would like to extrapolate the consumption per hour for temperatures where I do not have data.

I have been following eddis's reply from Apply grouped model back onto data

combined_profiles <- data.table(df)

# Make a model for each hour
my.models <- combined_profiles[, list(Model = list(lm(E ~ t))),
                               keyby = h]

# Make predictions on dataset
setkey(combined_profiles, h)
combined_profiles[my.models, prediction := predict(i.Model[[1]], .SD), by = .EACHI]

I have tried adding a dataframe with the new temperatures as new data to the prediction.

 newtemp<- data.frame(temp_round=c(6,7))
 combined_profiles[my.models, prediction := predict(newdata=newtemp,i.Model[[1]], .SD), by = .EACHI]

but this gives me the following error: Error in se.fit || interval != "none" : invalid 'x' type in 'x || y'

Could anyone please help me change this so that it predicts demand for temperatures outside the measured data?

For the iris example, my question would be: how can I extrapolate Sepal.Length for Sepal.Width values that are not in the data?

Thanks!

edited Nov 11 at 12:36

asked Nov 10 at 22:35

maaar

You are asking us to read too many of the neurons on your cerebral cortex.
– 42-
Nov 10 at 23:50

r dplyr

1 Answer

Interpolating

library(tidyverse)
library(data.table)

First, a dplyr version, to clarify what the data.table solution should produce:

df <- as_tibble(iris)
df
#> # A tibble: 150 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct> 
#> 1 5.1 3.5 1.4 0.2 setosa 
#> 2 4.9 3 1.4 0.2 setosa 
#> 3 4.7 3.2 1.3 0.2 setosa 
#> 4 4.6 3.1 1.5 0.2 setosa 
#> 5 5 3.6 1.4 0.2 setosa 
#> 6 5.4 3.9 1.7 0.4 setosa 
#> 7 4.6 3.4 1.4 0.3 setosa 
#> 8 5 3.4 1.5 0.2 setosa 
#> 9 4.4 2.9 1.4 0.2 setosa 
#> 10 4.9 3.1 1.5 0.1 setosa 
#> # ... with 140 more rows

We can just mutate() the fitted values:

df %>%
  group_by(Species) %>% # for each Species
  mutate(
    pred = lm(Sepal.Length ~ Sepal.Width)$fitted.values
  )
#> # A tibble: 150 x 6
#> # Groups: Species [3]
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species pred
#> <dbl> <dbl> <dbl> <dbl> <fct> <dbl>
#> 1 5.1 3.5 1.4 0.2 setosa 5.06
#> 2 4.9 3 1.4 0.2 setosa 4.71
#> 3 4.7 3.2 1.3 0.2 setosa 4.85
#> 4 4.6 3.1 1.5 0.2 setosa 4.78
#> 5 5 3.6 1.4 0.2 setosa 5.12
#> 6 5.4 3.9 1.7 0.4 setosa 5.33
#> 7 4.6 3.4 1.4 0.3 setosa 4.99
#> 8 5 3.4 1.5 0.2 setosa 4.99
#> 9 4.4 2.9 1.4 0.2 setosa 4.64
#> 10 4.9 3.1 1.5 0.1 setosa 4.78
#> # ... with 140 more rows

data.table

The same logic can be applied to this df with data.table.

setDT(df)[, pred := lm(Sepal.Length ~ Sepal.Width)$fitted.values, by = Species]

Here pred := defines a new column holding the fitted values, and by = Species fits the model separately for each Species group.

Then we get the same result:

df
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species pred
#> 1: 5.1 3.5 1.4 0.2 setosa 5.055715
#> 2: 4.9 3.0 1.4 0.2 setosa 4.710470
#> 3: 4.7 3.2 1.3 0.2 setosa 4.848568
#> 4: 4.6 3.1 1.5 0.2 setosa 4.779519
#> 5: 5.0 3.6 1.4 0.2 setosa 5.124764
#> --- 
#> 146: 6.7 3.0 5.2 2.3 virginica 6.611440
#> 147: 6.3 2.5 5.0 1.9 virginica 6.160673
#> 148: 6.5 3.0 5.2 2.0 virginica 6.611440
#> 149: 6.2 3.4 5.4 2.3 virginica 6.972054
#> 150: 5.9 3.0 5.1 1.8 virginica 6.611440

Extrapolating

First of all, the column name in newdata must be the same as the predictor name used in the model formula.

newtemp <- data.frame(Sepal.Width = c(6, 7))
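To make the two pitfalls concrete, here is a minimal base R sketch on the setosa rows only. The object names (setosa, fit, bad_new, good_new) are chosen for illustration and are not from the question. It assumes no global object called Sepal.Width exists. The second call mirrors what appears to happen in the question: an extra unnamed argument placed after the model is matched to se.fit, which would explain the error about 'x || y'.

```r
# Fit on one group only, so the numbers match the setosa rows shown in this answer
setosa <- subset(iris, Species == "setosa")
fit <- lm(Sepal.Length ~ Sepal.Width, data = setosa)

# 1) Column name differs from the predictor in the formula:
#    predict() looks for Sepal.Width and does not find it in newdata
bad_new <- data.frame(temp_round = c(6, 7))
name_error <- tryCatch(
  predict(fit, newdata = bad_new),
  error = function(e) conditionMessage(e)
)

# 2) An extra unnamed argument after the model is matched to se.fit,
#    which is the situation in the question's predict(..., .SD) call
good_new <- data.frame(Sepal.Width = c(6, 7))
position_error <- tryCatch(
  predict(fit, setosa, newdata = good_new),
  error = function(e) conditionMessage(e)
)

# 3) Matching column name and no stray positional argument
good <- predict(fit, newdata = good_new)
good
```

Only the third call returns predictions; the first two are caught and their error messages stored as strings, so they can be inspected with print(name_error) and print(position_error).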

Since this is an aggregation in data.table, you can return .(predict(mod, newdata)) for each group:

dt <- as.data.table(df)

dt[, .(pred = predict(lm(Sepal.Length ~ Sepal.Width, data = .SD), newdata = newtemp)), by = Species]
#> Species pred
#> 1: setosa 6.781940
#> 2: setosa 7.472429
#> 3: versicolor 8.730201
#> 4: versicolor 9.595279
#> 5: virginica 9.316043
#> 6: virginica 10.217578
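If you would rather keep the fitted models around, as in the question's my.models approach, the fit and the prediction can be split into two steps. This is only a sketch: the names models and preds are made up for illustration, and it assumes the models are small enough to store in a list column.

```r
library(data.table)

dt <- as.data.table(iris)
newtemp <- data.frame(Sepal.Width = c(6, 7))

# Step 1: one lm per Species, stored in a list column
models <- dt[, .(Model = list(lm(Sepal.Length ~ Sepal.Width, data = .SD))),
             keyby = Species]

# Step 2: predict from each stored model on the new predictor values
preds <- models[, .(Sepal.Width = newtemp$Sepal.Width,
                    pred = predict(Model[[1]], newdata = newtemp)),
                by = Species]
preds
```

Within each group Model is a list of length one, so Model[[1]] extracts the lm object before it is passed to predict(). The stored models can then be reused for other newdata without refitting.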

If you also want the newdata values as a column for each group, you can just add another term inside the list .().

I used %>% here for readability.

df %>%
  data.table() %>%
  .[,
    .(newdata = unlist(newtemp, use.names = FALSE),
      pred = predict(lm(Sepal.Length ~ Sepal.Width, data = .SD), newdata = newtemp)),
    by = Species]
#> Species newdata pred
#> 1: setosa 6 6.781940
#> 2: setosa 7 7.472429
#> 3: versicolor 6 8.730201
#> 4: versicolor 7 9.595279
#> 5: virginica 6 9.316043
#> 6: virginica 7 10.217578
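The same pattern should carry over to the data in the question, with one model per hour h. The sketch below uses simulated data, since the real data were not posted: the table profiles, the coefficients used to generate E, and the result name extrap are all assumptions for illustration. Note also that a straight line fitted to the observed temperatures may not hold far outside that range.

```r
library(data.table)

set.seed(1)

# Hypothetical stand-in for the question's data: 24 hours, temperatures 10..19
profiles <- data.table(h = rep(0:23, each = 10),
                       t = rep(seq(10, 19), times = 24))
profiles[, E := 5 + 0.1 * h - 0.2 * t + rnorm(.N, sd = 0.05)]

# Temperatures outside the measured range; the column is named t, as in lm(E ~ t)
newtemp <- data.frame(t = c(6, 7))

# One linear model per hour, evaluated at the new temperatures
extrap <- profiles[, .(t = newtemp$t,
                       E_pred = predict(lm(E ~ t, data = .SD), newdata = newtemp)),
                   by = h]
head(extrap)
```

The result has one row per combination of hour and new temperature, so it can be appended to the measured data with rbind(..., fill = TRUE) if a single table is needed.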

edited Nov 11 at 13:24

answered Nov 11 at 2:47

Blended

42617

Thanks, that works great. Is there a way to add a column with the newdata to the prediction?
– maaar
Nov 11 at 13:12

@maaar, Do you mean the newdata column?
– Blended
Nov 11 at 13:14

I added the column. You can just add another term inside the list() in data.table. I think it throws an error if the data.frame is not unlisted first.
– Blended
Nov 11 at 13:27
