What is the Python 3 way to ensure the correct dimension of array arguments?

In my newbie Python 3.7 project, many functions take numpy.ndarray arguments. These must be two-dimensional r x n matrices. The row dimension r is essential: some functions require 1 x n vectors, others 2 x n matrices, with r going up to three and possibly more. There are also functions defined for any r x n array. (The column dimension n is not essential for design purposes.)

From my Matlab experience, this requirement can get confusing and error-prone. So I've considered the following approaches:

1. Document the method arguments (of course!)

2. Unit tests (of course!)

3. Do validation and throw exceptions inside some functions. (However, this is not very functional, nor performant.)

4. Define data classes: OneRow, TwoRows, ThreeRows and FourPlusRows. Each has an ndarray field, validated in the constructor. The upside includes type hints and a better domain modelling, a la DDD. A downside is extra complexity.
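For illustration, the data-class approach could look roughly like the sketch below. It assumes the standard-library dataclasses module (available from Python 3.7); TwoRows is the name used above, while the data field and the midpoint function are hypothetical names made up for this example.

```python
from dataclasses import dataclass

import numpy as np


# eq=False avoids the generated __eq__, which would compare the arrays
# element-wise and fail when the result is used as a single boolean.
@dataclass(frozen=True, eq=False)
class TwoRows:
    """Hypothetical wrapper for a 2 x n matrix."""

    data: np.ndarray

    def __post_init__(self):
        # Validate once, at construction time, so functions that accept
        # a TwoRows instance do not need to re-check the shape.
        if self.data.ndim != 2 or self.data.shape[0] != 2:
            raise ValueError(
                "expected a 2 x n array, got shape %s" % (self.data.shape,)
            )


def midpoint(points: TwoRows) -> np.ndarray:
    """Illustrative function: mean of the two rows, returned as a 1 x n array."""
    return points.data.mean(axis=0, keepdims=True)
```

The type hint on midpoint documents the expected row count, but the check itself still happens at run time, in the constructor.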

Question: Given the type hints introduced in Python 3 and the trend towards functional programming, what's the current pythonic approach to this problem?

edited Nov 12 '18 at 19:57 by hpaulj

asked Nov 12 '18 at 19:01 by Tupolev._

assert a.shape[0] == r?

– Xiaoyu Lu
Nov 12 '18 at 19:05

Type hints haven't been implemented in numpy.

– hpaulj
Nov 12 '18 at 19:29

Where possible, numpy code is written to work with 'any' dimensions. Where not, testing ndim and shape is fine. Whether it raises an error or adjusts shape is your choice.

– hpaulj
Nov 12 '18 at 19:32

Regarding 3.): this is sometimes necessary to get SIMD vectorization or unrolling of small loops in JIT-compiled code, and thus a quite significant speedup. github.com/numba/llvmlite/issues/270

– max9111
Nov 13 '18 at 9:03

python arrays numpy typehints


1 Answer
One of the best things about Python is duck typing, and NumPy is in general very compatible with that design approach. Say you have a vector-only function vecfunc. You can add some boilerplate to the beginning of the function that will inflate any 1D array into a 1 x n vector:

def vecfunc(arr):
    if arr.ndim == 1:
        arr = arr[None, :]

    # ...function body goes here...

This will avoid any problems due to arr having too few dimensions, and will likely still give correct behavior in most cases. However, it doesn't do anything to prevent a user from passing in, say, an r x n x m array, or a 15 x n array. Ultimately, you're going to have to go with approach 3 for a lot of this and just raise exceptions where it seems appropriate. For example:

def vecfunc(arr):
    if not 0 < arr.ndim < 3:
        raise ValueError("arr must have ndim of 1 or 2. arr.ndim: %d" % arr.ndim)
    elif arr.ndim == 1:
        arr = arr[None, :]

If it makes you feel any better, the code bases of both numpy and scipy have those kinds of shape-based exception checks in a number of functions, when and where they're needed.
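The ndim check above can be extended to the row count r from the question. The following is only a sketch: as_rows and its rows parameter are hypothetical names, and the error messages simply follow the style of the example above.

```python
import numpy as np


def as_rows(arr, rows=None):
    """Return arr as a 2-D r x n array, optionally requiring r == rows.

    as_rows and its rows parameter are hypothetical names for this sketch.
    """
    arr = np.asarray(arr)
    if arr.ndim == 1:
        arr = arr[None, :]  # promote a length-n vector to 1 x n
    if arr.ndim != 2:
        raise ValueError("arr must have ndim of 1 or 2. arr.ndim: %d" % arr.ndim)
    if rows is not None and arr.shape[0] != rows:
        raise ValueError(
            "arr must have %d row(s). arr.shape: %s" % (rows, arr.shape)
        )
    return arr
```

A function that needs a 2 x n matrix could then start with arr = as_rows(arr, rows=2), which keeps the validation in one place.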

Of course, you could always leave off adding those kinds of exception checks until the very end of developing any given function. You may be surprised at the range of input that produces reasonable behavior.
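As a small illustration of that last point, here is a sketch of a function written against the last axis only, so it needs no shape checks at all. The function name center is hypothetical.

```python
import numpy as np


def center(arr):
    """Subtract each row's mean; 'center' is a hypothetical example function."""
    arr = np.asarray(arr, dtype=float)
    # axis=-1 always refers to the column dimension n, and keepdims=True
    # keeps the mean broadcastable against arr for any number of leading axes.
    return arr - arr.mean(axis=-1, keepdims=True)
```

The same body handles a length-n vector, an r x n matrix, or a stack of such matrices, because broadcasting takes care of the leading dimensions.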

If you're dead set on type annotations, you can get something similar by writing your code using Cython. For example, if you wanted an add function that only took 2D integer arrays, you could write the following function in a .pyx file:

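import numpy as np

def add(long[:, :] arr1, long[:, :] arr2):
    # The typed memoryviews fix the number of dimensions (2) and the element
    # type (C long); other inputs are rejected when the function is called.
    assert tuple(arr1.shape) == tuple(arr2.shape)

    # "l" is NumPy's type code for C long, matching the memoryview type
    # (np.long is not available in every NumPy release).
    result = np.zeros((arr1.shape[0], arr1.shape[1]), dtype=np.dtype("l"))
    cdef long[:, :] result_view = result

    for x in range(arr1.shape[0]):
        for y in range(arr1.shape[1]):
            result_view[x, y] = arr1[x, y] + arr2[x, y]

    return result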

For more details on writing and compiling Cython, see the Cython documentation.

This isn't so much "type annotations" as it is actual strong typing, but it may do what you want. Sadly, I wasn't able to find a way to fix the size of a single dimension, just the total number of dimensions.

edited Nov 12 '18 at 22:57

answered Nov 12 '18 at 21:41 by tel

Good answer. Probably I should stop thinking "static types" while in Python.

– Tupolev._
Nov 13 '18 at 11:47
