'generator' object is not callable when using if inside a for loop in a list comprehension
I am getting the following error when I run this:

    df['initial_referrer'].apply(lambda x: value.split("utm_campaign=",1)[1] if 'utm_campaign' in value else np.nan for value in x.split('&'))

    TypeError: 'generator' object is not callable

I am not sure what the error means or how to modify my code to get rid of it. I have read a couple of similar questions here but could not figure out what the issue could be.

The values in df['initial_referrer'] look like this:

    df['initial_referrer'].head()
    0    /login/index.php
    1    /login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1
    2    /login/index.php
    3    /login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1
    4    /login/index.php

From these I want to extract the value of utm_campaign, which here is login-day1. I was originally using a for loop with an if statement, but it was taking a very long time (days) to process 20 million rows. Therefore I wanted to use a generator expression or list comprehension to process it faster.

python pandas performance
asked Nov 12 '18 at 10:23 by Gagan
edited Nov 12 '18 at 11:31 by jpp

You're not using a list comprehension, you are using a generator expression.
– juanpa.arrivillaga
Nov 12 '18 at 10:25

Basically I was doing this using iterrows earlier, but it was taking a lot of time, so I thought of doing it using apply.
– Gagan
Nov 12 '18 at 10:28

Use itertuples. But .apply isn't going to be much faster. It's basically a plain Python for-loop under the hood.
– juanpa.arrivillaga
Nov 12 '18 at 10:28

If you'd extract a Minimal, Complete, and Verifiable example, you might find the error yourself. Also, it would make this question much more valuable to others.
– Ulrich Eckhardt
Nov 12 '18 at 10:39

@UlrichEckhardt let me know if the question makes sense now. I don't think a downvote would serve the purpose, though, but anyway.
– Gagan
Nov 12 '18 at 10:44

1 Answer
It's instructive to first use apply with a regular function:

    def func(x):
        return [value.split("utm_campaign=",1)[1] if 'utm_campaign' in value else np.nan
                for value in x.split('&')]

    df['initial_referrer'].apply(func)

Notice the square brackets representing the list comprehension. You need to translate this to your lambda function:

    df['initial_referrer'].apply(lambda x: [value.split("utm_campaign=",1)[1] if 'utm_campaign' in value else np.nan for value in x.split('&')])

But the latter is unreadable. You are better off writing a regular function.
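
As for why the original call raised the TypeError: without the square brackets, Python parses the whole argument as a generator expression whose elements are lambdas, so apply receives a generator instead of a function and fails when it tries to call it. Below is a minimal sketch of that behaviour without pandas. Here call_with_row is a hypothetical stand-in for apply, and the module-level x is an assumption (for example a variable left over from an earlier loop); without such a variable the broken call would raise NameError instead.

```python
def call_with_row(func):
    """Hypothetical stand-in for Series.apply: calls func on one sample row."""
    return func("/login/index.php?utm_source=INTERNAL&utm_campaign=login-day1")


# Assumption: a global x already exists, so x.split('&') below can be evaluated.
x = "leftover&value"

try:
    # Parsed as: call_with_row((lambda x: ...) for value in x.split('&'))
    # so the argument is a generator object, not a function.
    call_with_row(lambda x: value.upper() for value in x.split('&'))
except TypeError as exc:
    error_message = str(exc)

# With square brackets the comprehension becomes part of the lambda body.
fixed = call_with_row(lambda x: [value.upper() for value in x.split('&')])

print(error_message)
print(fixed)
```

The only difference between the two calls is the pair of square brackets, which is what turns the argument back into a single callable.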

Note pd.Series.apply is a Python-level loop. You can use map instead and will likely see a performance improvement:

    df['initial_referrer'] = list(map(func, df['initial_referrer'].values))

Or even a list comprehension:

    df['initial_referrer'] = [func(x) for x in df['initial_referrer'].values]
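
To decide between the two, it may help to time them on a sample first. The following is a rough sketch using the standard-library timeit module on a plain Python list rather than a DataFrame column. The rows list is made-up sample data based on the question, and None stands in for np.nan so the snippet runs without NumPy; actual timings will depend on the machine and the data.

```python
import timeit


def func(x):
    # Same shape as func above, with None standing in for np.nan.
    return [value.split("utm_campaign=", 1)[1] if 'utm_campaign' in value else None
            for value in x.split('&')]


# Made-up sample data modelled on the rows shown in the question.
rows = [
    "/login/index.php",
    "/login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1",
] * 5000

# Both variants should produce identical results.
via_map = list(map(func, rows))
via_comprehension = [func(x) for x in rows]

# Time each variant over a few repetitions.
map_seconds = timeit.timeit(lambda: list(map(func, rows)), number=5)
comprehension_seconds = timeit.timeit(lambda: [func(x) for x in rows], number=5)

print(f"map: {map_seconds:.3f}s, list comprehension: {comprehension_seconds:.3f}s")
```

Note that func returns one list per row (one entry per '&'-separated piece), so both variants store lists rather than a single campaign string.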

answered Nov 12 '18 at 10:28 by jpp
edited Nov 12 '18 at 10:33

Thanks jpp. Any particular idea which way would be faster? I have 20 million rows to process.
– Gagan
Nov 12 '18 at 10:32

@Gagan, see my last 2 suggestions. Test and time them to see which is most efficient.
– jpp
Nov 12 '18 at 10:33

Thanks @jpp, I changed the function to this:

    def func(x):
        return [value.split("utm_campaign=",1)[1] for value in x.split('&') if 'utm_campaign' in value][0]

and the statement to this:

    df['initial_referrer_1'] = [func(x) if 'utm_campaign' in x else np.nan for x in df['initial_referrer'].values]

And it worked.
– Gagan
Nov 12 '18 at 11:12
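
The extraction in the last comment could also be sketched with a compiled regular expression, which returns one value per row without splitting every string on '&'. This is an alternative that was not proposed in the thread. CAMPAIGN_RE and extract_campaign are hypothetical names, the pattern assumes the campaign value ends at the next '&' or '#', and None stands in for np.nan so the snippet runs without NumPy or pandas.

```python
import re

# Assumption: the parameter follows '?' or '&' and its value ends at '&' or '#'.
CAMPAIGN_RE = re.compile(r"[?&]utm_campaign=([^&#]*)")


def extract_campaign(url):
    """Return the utm_campaign value of url, or None if it is absent."""
    match = CAMPAIGN_RE.search(url)
    return match.group(1) if match else None


# Sample rows copied from the question.
rows = [
    "/login/index.php",
    "/login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1",
]

campaigns = [extract_campaign(url) for url in rows]
print(campaigns)
```

With pandas, a similar pattern could probably be passed to Series.str.extract with expand=False, which yields NaN for rows that do not match; whether that is faster on 20 million rows would need to be timed.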