'generator' object is not callable when using an if inside a for loop in a list comprehension

























I get the following error when I run this:



df['initial_referrer'].apply(lambda x: value.split("utm_campaign=",1)[1] if 'utm_campaign' in value else np.nan for value in x.split('&'))



TypeError: 'generator' object is not callable




I am not sure what the error means or how to change my code to get rid of it. I have read a couple of similar questions here but could not figure out what the issue is.



The values in df['initial_referrer'] look like this:



df['initial_referrer'].head()
0 /login/index.php
1 /login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1
2 /login/index.php
3 /login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1
4 /login/index.php


From these I want to extract the value of utm_campaign, which is login-day1 here. I was originally using a for loop with an if statement, but it was taking a very long time (days) to process 20 million rows, so I wanted to use a generator expression or list comprehension to process it faster.





























  • 1





    you're not using a list-comprehension, you are using a generator expression

    – juanpa.arrivillaga
    Nov 12 '18 at 10:25











  • Basically, I was doing this with iterrows earlier, but it was taking a lot of time, so I thought of using apply instead.

    – Gagan
    Nov 12 '18 at 10:28











  • Use itertuples. But .apply isn't going to be much faster; it's basically a plain Python for-loop under the hood.

    – juanpa.arrivillaga
    Nov 12 '18 at 10:28











  • If you'd extract a Minimal, Complete, and Verifiable example, you might find the error yourself. Also, it would make this question much more valuable to others.

    – Ulrich Eckhardt
    Nov 12 '18 at 10:39











  • @UlrichEckhardt let me know if the question makes sense now. I don't think a downvote serves the purpose, but anyway.

    – Gagan
    Nov 12 '18 at 10:44















python pandas performance














edited Nov 12 '18 at 11:31 by jpp

asked Nov 12 '18 at 10:23 by Gagan







1 Answer
































It's instructive to first use apply with a regular function:



import numpy as np

def func(x):
    # One entry per '&'-separated piece: the campaign value if present, else NaN.
    return [value.split("utm_campaign=", 1)[1] if 'utm_campaign' in value else np.nan
            for value in x.split('&')]

df['initial_referrer'].apply(func)


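Notice the square brackets, which make this a list comprehension. Without them, Python parses the whole argument as a generator expression that yields lambdas, so apply receives a generator object instead of a function and raises the TypeError when it tries to call it. Your lambda needs the same brackets: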



df['initial_referrer'].apply(lambda x: [value.split("utm_campaign=",1)[1] if 'utm_campaign' in value else np.nan for value in x.split('&')])


But the latter is unreadable. You are better off writing a regular function.
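To see why the bracket-less version fails, here is a minimal sketch in plain Python. `call_it` is a hypothetical stand-in for `Series.apply` (it only tries to call its argument), and the global `x` is assumed to be left over from earlier code, which it would have to be for the original line to reach the TypeError rather than a NameError.

```python
x = "utm_medium=EMAIL&utm_campaign=login-day1"  # assumed leftover global


def call_it(func):
    """Hypothetical stand-in for Series.apply: just calls what it is given."""
    return func(x)


# No brackets: the argument is a generator expression that yields lambdas,
# so call_it receives a generator object, not a function.
try:
    call_it(lambda s: s.upper() for part in x.split('&'))
except TypeError as exc:
    message = str(exc)

# With brackets inside the lambda body, the argument is a single function
# that returns a list.
parts = call_it(lambda s: [part.upper() for part in s.split('&')])

print(message)
print(parts)
```

The lambda body ends where the `for` begins, so the `for` clause belongs to the enclosing call's argument rather than to the lambda.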



Note pd.Series.apply is a Python-level loop. You can use map instead and will likely see a performance improvement:



df['initial_referrer'] = list(map(func, df['initial_referrer'].values))


Or even a list comprehension:



df['initial_referrer'] = [func(x) for x in df['initial_referrer'].values]
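Which of the two is faster is best settled by measuring. The following is a rough timing sketch using the standard-library `timeit` module; the `urls` list is a made-up sample standing in for `df['initial_referrer'].values`, and `NAN` is a plain-Python stand-in for `np.nan` so the snippet runs without third-party packages.

```python
import timeit

# Hypothetical sample standing in for df['initial_referrer'].values
urls = [
    "/login/index.php",
    "/login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1",
] * 5000

NAN = float("nan")  # plain-Python stand-in for np.nan


def func(x):
    return [value.split("utm_campaign=", 1)[1] if 'utm_campaign' in value else NAN
            for value in x.split('&')]


def with_map():
    return list(map(func, urls))


def with_comprehension():
    return [func(u) for u in urls]


# Run each variant a few times and report the total wall-clock time.
t_map = timeit.timeit(with_map, number=5)
t_comp = timeit.timeit(with_comprehension, number=5)
print(f"map: {t_map:.3f}s  comprehension: {t_comp:.3f}s")
```

The numbers depend on the machine and the data, so it is worth rerunning this on a representative slice of the real column before committing to either form.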






























  • Thanks jpp. Any idea which way would be faster? I have 20 million rows to process.

    – Gagan
    Nov 12 '18 at 10:32












  • @Gagan, see my last 2 suggestions. Test and time them to see which is most efficient.

    – jpp
    Nov 12 '18 at 10:33











  • Thanks @jpp. I changed the function to def func(x): return [value.split("utm_campaign=",1)[1] for value in x.split('&') if 'utm_campaign' in value][0] and the statement to df['initial_referrer_1'] = [func(x) if 'utm_campaign' in x else np.nan for x in df['initial_referrer'].values], and it worked.

    – Gagan
    Nov 12 '18 at 11:12
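The version in the last comment returns a single value per row rather than a list. One possible way to express the same idea is with a precompiled regular expression, sketched below. `CAMPAIGN_RE`, `extract_campaign`, and the sample `urls` list are illustrative names not taken from the thread, and `None` stands in for `np.nan` to keep the snippet free of third-party imports.

```python
import re

# Capture everything after 'utm_campaign=' up to the next '&' or end of string.
CAMPAIGN_RE = re.compile(r"utm_campaign=([^&]*)")


def extract_campaign(url, missing=None):
    """Return the utm_campaign value from url, or `missing` if it is absent."""
    match = CAMPAIGN_RE.search(url)
    return match.group(1) if match else missing


# Hypothetical sample standing in for the DataFrame column
urls = [
    "/login/index.php",
    "/login/index.php?utm_source=INTERNAL&utm_medium=EMAIL&utm_campaign=login-day1",
]
campaigns = [extract_campaign(u) for u in urls]
print(campaigns)  # [None, 'login-day1']
```

With a DataFrame this would presumably be used as `df['initial_referrer_1'] = [extract_campaign(x, np.nan) for x in df['initial_referrer'].values]`, which does the containment check and the extraction in one pass over each string.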


















edited Nov 12 '18 at 10:33

answered Nov 12 '18 at 10:28 by jpp











