Structuring a large Python repository to not import everything

I'm having an issue managing imports in a big software repo that we have. For the sake of clarity, let's pretend the repo looks something like this:

repo/
    __init__.py
    utils/
        __init__.py
        math.py
        readers.py
        ...
    ...

Now our __init__.py files are set up so that we can do something like this:

from repo.utils import IniReader

In this example repo/utils/__init__.py would have

from .readers import IniReader, DatReader

This structure has worked out well for us from a readability standpoint, but we are now facing issues when trying to deploy applications.

The issue is this... let's pretend I'm writing an app that looks like this:

from repo.utils import IniReader

if __name__ == '__main__':
    r = IniReader('blah.ini')
    print(r.fields)

Now the statement from repo.utils import IniReader executes repo/utils/__init__.py, which in this case imports both IniReader and DatReader. Let's pretend that DatReader looks something like this:

import numpy as np
import scipy
import tensorflow
from .math import transform

class DatReader:
    ...

which adheres to PEP 8, with all the imports at the top of the file.

The problem here is that DatReader requires some heavyweight imports (numpy, scipy and tensorflow are huge libraries). To make matters worse, the module behind from .math import transform might itself contain something like from repo.contrib import lookup. That executes repo/contrib/__init__.py, which starts a chain reaction and ends up importing our entire repository.

This really hasn't been a problem for those of us developers who have a full development environment set up, but now that we're trying to ship applications (internally) this import hell is becoming an issue.

Is there a standard solution to this problem? We've talked about just keeping the __init__.py files empty, or about not putting all the imports at the top of a file as PEP 8 recommends. Both of these solutions come with compromises, so if anyone has suggestions or references, I'd love to hear them.

Thanks!

asked Nov 10 at 17:10

matt

python deployment import


1 Answer

accepted

It might be helpful to take a step back for a moment and look at the fundamental issue that you seem to be facing, namely: "How do I deal with missing Python packages on users' machines?"

Basically there are two categories of solutions to this problem:

Help to make the missing packages available on the user's machine.
- You could distribute your code as a package that users install with pip. If the package declares its dependencies, pip downloads and installs any that are missing automatically.
- You could freeze your code, i.e. convert it into a standalone application that already bundles all the required packages.

Divide your package dependencies into mandatory and optional ones, and adapt your code so that the absence of an optional package doesn't break everything else.
- As you already noted, you could trim the package-level imports (i.e. the imports in __init__.py files) so that optional packages are not loaded prematurely. In your case that would mean removing the DatReader import from repo/utils/__init__.py. Note that this only helps if DatReader and its heavy imports live in a module of their own: from .readers import IniReader still executes all of readers.py, including every import at the top of that file.
- As you also noted, you could move the imports of optional packages inside the classes or functions that need them. Style-wise this is not ideal, but the code is still perfectly valid. It normally doesn't matter that the import statement is executed every time the function runs: the module is loaded only once, and later executions just look it up in the sys.modules cache.
- You could wrap the imports of optional packages in try-except clauses. This prevents an error at import time (though of course you'll still get an error once you try to use a class or function that depends on the missing package).
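
As a rough illustration of the function-level import option above, here is a minimal sketch. It assumes the small standard-library module colorsys as a stand-in for a heavy optional dependency, and to_hsv is a made-up function name, not something from the question's repo. It shows that running the import statement a second time reuses the already loaded module.

```python
import sys


def to_hsv(red, green, blue):
    # Deferred import: `colorsys` stands in for a heavy optional dependency.
    # It is loaded on the first call; later calls hit the sys.modules cache.
    import colorsys

    return colorsys.rgb_to_hsv(red, green, blue)


first = to_hsv(1.0, 0.0, 0.0)
module_after_first_call = sys.modules["colorsys"]

second = to_hsv(1.0, 0.0, 0.0)
# Same module object both times: the second `import` did not reload anything.
same_module = sys.modules["colorsys"] is module_after_first_call
```

The cost of this approach is mostly stylistic: the dependency is no longer visible at the top of the file, so it is worth a comment explaining why the import is deferred.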

Example of an import in a try-except clause:

import warnings

try:
    import scipy
except ImportError:
    warnings.warn("The python package `scipy` could not be imported. As a result "
                  "the class `repo.utils.DatReader` will not be functional.")

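A possible extension of the try-except example, sketched under assumptions: heavy_optional_backend is a hypothetical package name chosen so that the import fails, and both reader classes are simplified stand-ins for the ones in the question. The idea is to remember that the import failed and raise a descriptive error only when the dependent class is actually used.

```python
try:
    import heavy_optional_backend  # hypothetical name, assumed NOT installed
except ImportError:
    heavy_optional_backend = None  # sentinel: the dependency is unavailable


class IniReader:
    """Needs only the standard library, so it always works."""

    def __init__(self, path):
        self.path = path


class DatReader:
    """Needs the optional dependency; fails at use time, not at import time."""

    def __init__(self, path):
        if heavy_optional_backend is None:
            raise ImportError(
                "DatReader requires the optional package "
                "'heavy_optional_backend', which is not installed."
            )
        self.path = path


ini = IniReader("blah.ini")  # works without the optional package

try:
    DatReader("blah.dat")
    dat_error = None
except ImportError as exc:
    dat_error = str(exc)  # descriptive message instead of a NameError later
```

Without the sentinel, using DatReader after a failed import would typically surface as a NameError somewhere inside the class, which is harder to trace back to the missing package.
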
Now to come back to your original question, "Is there a standard solution to this problem?": I'd say no. There's no single silver bullet. All solutions come with their own advantages and disadvantages, and you'll have to decide which one is best for your specific situation.
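
One middle ground that may be worth weighing, if keeping from repo.utils import IniReader matters, is a lazy __init__.py based on a module-level __getattr__ (PEP 562, Python 3.7+). The sketch below is an illustration under assumptions, not the asker's code: it builds a throwaway package called demo_repo in a temporary directory, and it assumes each reader lives in its own module (ini_reader.py and dat_reader.py are made-up names), since laziness cannot help while both classes share one file.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Body of a hypothetical lazy `utils/__init__.py` (module-level __getattr__).
LAZY_INIT = textwrap.dedent('''
    import importlib

    # Public name -> submodule that defines it (assumed one reader per module).
    _LAZY = {"IniReader": ".ini_reader", "DatReader": ".dat_reader"}

    def __getattr__(name):
        # Python calls this only when `name` is not already a module attribute.
        if name not in _LAZY:
            raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
        value = getattr(importlib.import_module(_LAZY[name], __name__), name)
        globals()[name] = value  # cache, so the hook is skipped next time
        return value
''')

# Build a throwaway package on disk so the sketch can actually be run.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_repo" / "utils"
pkg.mkdir(parents=True)
(root / "demo_repo" / "__init__.py").write_text("")
(pkg / "__init__.py").write_text(LAZY_INIT)
(pkg / "ini_reader.py").write_text("class IniReader:\n    pass\n")
# Stand-in for the module whose top-level imports are heavyweight.
(pkg / "dat_reader.py").write_text("class DatReader:\n    pass\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_repo.utils import IniReader  # resolved through __getattr__

ini_loaded = "demo_repo.utils.ini_reader" in sys.modules
dat_loaded = "demo_repo.utils.dat_reader" in sys.modules
print(ini_loaded, dat_loaded)
```

In this sketch only the module that defines IniReader gets imported; the DatReader module stays untouched until someone asks for that name. The trade-off is that static tools and readers can no longer see the re-exported names as ordinary imports.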

edited Nov 10 at 23:43

answered Nov 10 at 23:36

Xukrao



