What is the Python 3 way to ensure the correct dimension of array arguments?
In my newbie Python 3.7 project, the arguments of many functions are numpy.ndarray's. These must be two-dimensional r x n matrices. The row dimension r is essential: some functions require 1 x n vectors, others 2 x n matrices, with r up to three and possibly more. There are also functions defined for any r x n array. (The column dimension n is not essential for design purposes.)
From my Matlab experience, this requirement can get confusing and error-prone. So I've considered the following approaches:
1. Document the method arguments (of course!)
2. Unit tests (of course!)
3. Do validation and throw exceptions inside some functions. (However, this is not very functional, nor performant.)
4. Define data classes: OneRow, TwoRows, ThreeRows and FourPlusRows. Each has an ndarray field, validated in the constructor. The upside includes type hints and better domain modelling, a la DDD. A downside is extra complexity.
Question: Given the type hints introduced in Python 3 and the trend towards functional programming, what's the current pythonic approach to this problem?
python arrays numpy typehints
assert a.shape[0] == r?
– Xiaoyu Lu
Nov 12 '18 at 19:05
Type hints haven't been implemented in numpy.
– hpaulj
Nov 12 '18 at 19:29
Where possible, numpy code is written to work with 'any' dimensions. Where not, testing ndim and shape is fine. Whether it raises an error or adjusts shape is your choice.
– hpaulj
Nov 12 '18 at 19:32
Regarding approach 3: this is sometimes necessary to get SIMD vectorization, or unrolling of small loops in JIT-compiled code, and thus a quite significant speedup. github.com/numba/llvmlite/issues/270
– max9111
Nov 13 '18 at 9:03
edited Nov 12 '18 at 19:57
hpaulj
asked Nov 12 '18 at 19:01
Tupolev._
1 Answer
One of the best things about Python is duck typing, and NumPy is in general very compatible with that design approach. Say you have a vector-only function vecfunc. You can add some boilerplate to the beginning of the function that will inflate any 1D arrays into 1 x n vectors:
    def vecfunc(arr):
        if arr.ndim == 1:
            # Prepend an axis: shape (n,) becomes (1, n)
            arr = arr[None, :]
        # ...function body goes here...
This will avoid any problems due to arr having too few dimensions, and will likely still give correct behavior in most cases. However, it doesn't do anything to prevent a user from passing in, say, an r x n x m array, or a 15 x n array. Ultimately, you're going to have to go with approach 3 for a bunch of this stuff and just throw some exceptions where it seems appropriate. For example:
    def vecfunc(arr):
        if not 0 < arr.ndim < 3:
            raise ValueError("arr must have ndim of 1 or 2. arr.ndim: %d" % arr.ndim)
        elif arr.ndim == 1:
            arr = arr[None, :]
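The check above only covers the number of dimensions. A minimal sketch of how it could be extended to enforce the row count as well, which is what the question is really after: the helper name require_rows and its rows parameter are hypothetical, not part of NumPy. np.atleast_2d is the NumPy function that performs the same 1D to 1 x n promotion as arr[None, :].

```python
import numpy as np


def require_rows(arr, rows=None):
    """Return arr as a 2D array, raising ValueError if its shape is unacceptable.

    Hypothetical helper: `rows` is the required row count, or None for any r.
    """
    arr = np.asarray(arr)
    if not 0 < arr.ndim < 3:
        raise ValueError("arr must have ndim of 1 or 2. arr.ndim: %d" % arr.ndim)
    # Promote a 1D array of length n to shape (1, n); 2D input is unchanged.
    arr = np.atleast_2d(arr)
    if rows is not None and arr.shape[0] != rows:
        raise ValueError(
            "arr must have %d row(s). arr.shape: %s" % (rows, arr.shape)
        )
    return arr


def vecfunc(arr):
    arr = require_rows(arr, rows=1)  # accepts shape (n,) or (1, n) only
    return arr.sum(axis=1)
```

Keeping the check in one small function means each public function pays for a single call, and the error messages stay consistent across the code base.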
If it makes you feel any better, the code bases of both numpy and scipy have those kinds of shape-based exception checks in a number of functions, when and where they're needed.
Of course, you could always leave off adding those kinds of exception checks until the very end of developing any given function. You may be surprised at the range of input that produces reasonable behavior.
If you're dead set on type annotations, you can get something similar by writing your code using Cython. For example, if you wanted an add function that only took 2D integer arrays, you could write the following function in a .pyx file:
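    import numpy as np

    def add(long[:, :] arr1, long[:, :] arr2):
        # The typed memoryviews fix the number of dimensions (2) and the element type (C long).
        assert tuple(arr1.shape) == tuple(arr2.shape)
        # "l" is NumPy's type code for C long, matching the memoryview element type.
        # (np.long is not available in every NumPy release.)
        result = np.zeros((arr1.shape[0], arr1.shape[1]), dtype=np.dtype("l"))
        cdef long[:, :] result_view = result
        cdef Py_ssize_t x, y  # typed indices keep the loops at C level
        for x in range(arr1.shape[0]):
            for y in range(arr1.shape[1]):
                result_view[x, y] = arr1[x, y] + arr2[x, y]
        return result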
For more details on writing and compiling Cython, see the Cython documentation.
This isn't so much "type annotations" as it is actual strong typing, but it may do what you want. Sadly, I wasn't able to find a way to fix the size of a single dimension, just the total number of dimensions.
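If signatures that name the expected shape still matter, the data-class idea from the question (approach 4) can be combined with plain runtime validation. The following is only a sketch under assumptions: the class names Rows and TwoRows, the ROWS attribute, and the function cross_terms are hypothetical. The shape is checked once at construction time, so a type checker can verify which wrapper a function accepts, but not the array shape itself.

```python
from dataclasses import dataclass
from typing import ClassVar, Optional

import numpy as np


# eq=False avoids the generated __eq__, which would compare arrays element-wise.
@dataclass(frozen=True, eq=False)
class Rows:
    """Hypothetical wrapper around a 2D array with an optional fixed row count."""

    data: np.ndarray
    ROWS: ClassVar[Optional[int]] = None  # None means any number of rows

    def __post_init__(self):
        if self.data.ndim != 2:
            raise ValueError("data must be 2D. data.ndim: %d" % self.data.ndim)
        if self.ROWS is not None and self.data.shape[0] != self.ROWS:
            raise ValueError(
                "data must have %d row(s). data.shape: %s"
                % (self.ROWS, self.data.shape)
            )


class TwoRows(Rows):
    ROWS = 2


def cross_terms(m: TwoRows) -> np.ndarray:
    # The annotation documents the 2 x n requirement; validation already happened.
    return m.data[0] * m.data[1]
```

The trade-off is the one the question already names: callers must wrap their arrays, in exchange for a check that runs once per array rather than once per function call.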
Good answer. Probably I should stop thinking "static types" while in Python.
– Tupolev._
Nov 13 '18 at 11:47
edited Nov 12 '18 at 22:57
answered Nov 12 '18 at 21:41
tel