'generator' object is not callable when using if inside a for loop in a list comprehension

I get the following error when I run this:

df['initial_referrer'].apply(lambda x: value.split("utm_campaign=",1)[1] if 'utm_campaign' in value else np.nan for value in x.split('&'))

TypeError: 'generator' object is not callable

I am not sure what the error means or how to modify my code to get rid of it. I have read a couple of similar questions here but could not figure out what the issue could be.

The values in df['initial_referrer'] look like this:

df['initial_referrer'].head()
0 /login/index.php
1 /login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1
2 /login/index.php
3 /login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1
4 /login/index.php

From these, I want to extract the value of utm_campaign, which here is login-day1. I was originally using a for loop with an if statement, but it was taking a very long time (days) to process 20 million rows. That is why I wanted to use a generator expression or list comprehension to process it faster.

edited Nov 12 '18 at 11:31

jpp

asked Nov 12 '18 at 10:23

Gagan

You're not using a list comprehension; you are using a generator expression.

– juanpa.arrivillaga
Nov 12 '18 at 10:25

Basically, I was doing this with iterrows earlier, but it took a lot of time, so I thought of doing it with apply.

– Gagan
Nov 12 '18 at 10:28

Use itertuples. But .apply isn't going to be much faster; it's basically a plain Python for-loop under the hood.

– juanpa.arrivillaga
Nov 12 '18 at 10:28

If you'd extract a Minimal, Complete, and Verifiable example, you might find the error yourself. Also, it would make this question much more valuable to others.

– Ulrich Eckhardt
Nov 12 '18 at 10:39

@UlrichEckhardt let me know if the question makes sense now. I don't think a downvote serves the purpose, but anyway.

– Gagan
Nov 12 '18 at 10:44

python pandas performance

1 Answer

It's instructive to first use apply with a regular function:

import numpy as np

def func(x):
    # One entry per '&'-separated part: the campaign value, or NaN if absent
    return [value.split("utm_campaign=", 1)[1] if 'utm_campaign' in value else np.nan
            for value in x.split('&')]

df['initial_referrer'].apply(func)

Notice the square brackets, which make this a list comprehension. Without them, the expression is a generator expression. Your lambda function needs the same brackets:

df['initial_referrer'].apply(lambda x: [value.split("utm_campaign=",1)[1] if 'utm_campaign' in value else np.nan for value in x.split('&')])

But the latter is unreadable. You are better off writing a regular function.
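
To see why the missing brackets produce this particular error, here is a minimal sketch in plain Python (no pandas; the names parts, gen and func are illustrative only). A generator expression passed as the sole argument of a call needs no parentheses of its own, so the unbracketed version hands apply a generator of lambdas rather than a function. Note also that the outermost iterable of a generator expression is evaluated immediately, so in the original line x.split('&') uses whatever x already exists in the enclosing scope.

```python
# Stand-in for the '&'-separated parts of one referrer string.
parts = ["utm_source=INTERNAL", "utm_campaign=login-day1"]

# Without brackets, "lambda ... for value in ..." is one generator
# expression that would yield a lambda per element of parts.
gen = (lambda x: x.upper() for value in parts)

try:
    gen("abc")  # roughly what happens when the argument gets called per row
except TypeError as exc:
    error_message = str(exc)

# With brackets inside the lambda body, we get a function returning a list.
func = lambda x: [value.upper() for value in x.split("&")]
result = func("a&b")

print(error_message)
print(result)
```

The first print shows the same TypeError text as in the question; the second shows that the bracketed form is an ordinary callable.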

Note that pd.Series.apply is a Python-level loop. You can use map on the underlying array instead and may see a modest performance improvement:

df['initial_referrer'] = list(map(func, df['initial_referrer'].values))

Or even a list comprehension:

df['initial_referrer'] = [func(x) for x in df['initial_referrer'].values]
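
Which of the two is faster depends on the data and the environment, so it is worth measuring. Below is a rough timing sketch using timeit from the standard library. It makes several assumptions: a plain Python list (sample, hypothetical data modelled on the question) stands in for df['initial_referrer'].values, and None stands in for np.nan so the snippet runs without NumPy or pandas.

```python
import timeit

def func(x):
    # Same shape as the function above, with None standing in for np.nan
    return [v.split("utm_campaign=", 1)[1] if "utm_campaign" in v else None
            for v in x.split("&")]

# Hypothetical sample; in practice use a slice of the real column.
sample = [
    "/login/index.php",
    "/login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1",
] * 1000

# Both approaches should produce identical results.
via_map = list(map(func, sample))
via_comprehension = [func(x) for x in sample]

# Time each approach over the same number of repetitions.
t_map = timeit.timeit(lambda: list(map(func, sample)), number=20)
t_comp = timeit.timeit(lambda: [func(x) for x in sample], number=20)
print(f"map: {t_map:.3f}s  list comprehension: {t_comp:.3f}s")
```

Timings on a small sample only hint at behaviour on 20 million rows, so rerun with a larger slice before choosing.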

edited Nov 12 '18 at 10:33

answered Nov 12 '18 at 10:28

jpp

Thanks jpp. Any idea which way would be faster? I have 20 million rows to process.

– Gagan
Nov 12 '18 at 10:32

@Gagan, see my last two suggestions. Test and time them to see which is most efficient.

– jpp
Nov 12 '18 at 10:33

Thanks @jpp, I changed the function to this:

def func(x):
    return [value.split("utm_campaign=",1)[1] for value in x.split('&') if 'utm_campaign' in value][0]

and the statement to this:

df['initial_referrer_1'] = [func(x) if 'utm_campaign' in x else np.nan for x in df['initial_referrer'].values]

And it worked.

– Gagan
Nov 12 '18 at 11:12
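
For the single-value extraction described in the last comment, one possible alternative is to let the standard library parse the query string. This is a swapped-in technique, not what the thread used: the helper name campaign is hypothetical, and it returns None instead of np.nan so that the sketch runs without NumPy.

```python
from urllib.parse import parse_qs, urlsplit

def campaign(url):
    """Return the first utm_campaign value in url, or None if there is none."""
    query = urlsplit(url).query          # text after '?', '' if absent
    values = parse_qs(query).get("utm_campaign")
    return values[0] if values else None

# Rows modelled on the sample shown in the question.
rows = [
    "/login/index.php",
    "/login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1",
]
extracted = [campaign(r) for r in rows]
print(extracted)
```

This avoids relying on the position of utm_campaign within the string, at the cost of fully parsing each query. Whether it is faster than the split-based version would need to be timed.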
