What is the Python 3 way to ensure the correct dimension of array arguments?
In my newbie Python 3.7 project, the arguments of many functions are numpy.ndarray objects. These must be two-dimensional r x n matrices. The row dimension r is essential: some functions require 1 x n vectors, others 2 x n matrices, and so on for r = 3 and possibly more. There are also functions defined for any r x n array. (The column dimension n is not essential for design purposes.)



From my Matlab experience, this requirement can get confusing and error-prone. So I've considered the following approaches:



  1. Document the method arguments (of course!)

  2. Unit tests (of course!)

  3. Do validation and throw exceptions inside some functions. (However, this is not very functional, nor performant.)

  4. Define data classes: OneRow, TwoRows, ThreeRows and FourPlusRows. Each has an ndarray field, validated in the constructor. The upside includes type hints and a better domain modelling, a la DDD. A downside is extra complexity.
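
For concreteness, option 4 might look roughly like the sketch below. The base class FixedRows and the function midpoint are hypothetical names used only for illustration; only OneRow and TwoRows come from the list above, and this is one possible layout rather than an established pattern.

```python
from dataclasses import dataclass

import numpy as np


# eq=False: element-wise array comparison makes a generated __eq__ ambiguous.
@dataclass(frozen=True, eq=False)
class FixedRows:
    """Hypothetical base wrapper: a 2D array with a required row count."""

    data: np.ndarray
    rows = None  # class-level constant, not a dataclass field (no annotation)

    def __post_init__(self):
        if self.data.ndim != 2:
            raise ValueError(f"expected a 2D array, got ndim={self.data.ndim}")
        if self.rows is not None and self.data.shape[0] != self.rows:
            raise ValueError(
                f"expected {self.rows} row(s), got {self.data.shape[0]}"
            )


class OneRow(FixedRows):
    rows = 1


class TwoRows(FixedRows):
    rows = 2


def midpoint(a: TwoRows) -> OneRow:
    # The type hint documents the contract; the constructor enforced it.
    return OneRow(a.data.mean(axis=0, keepdims=True))
```

The validation runs once, at construction, so functions that accept TwoRows do not need to repeat the check.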

Question: Given the type hints introduced in Python 3 and the trend towards functional programming, what's the current pythonic approach to this problem?
  • assert a.shape[0] == r?

    – Xiaoyu Lu
    Nov 12 '18 at 19:05











  • Type hints haven't been implemented in numpy.

    – hpaulj
    Nov 12 '18 at 19:29











  • Where possible, numpy code is written to work with 'any' dimensions. Where it isn't, testing ndim and shape is fine. Whether it raises an error or adjusts the shape is your choice.

    – hpaulj
    Nov 12 '18 at 19:32











  • Regarding 3.): This is sometimes necessary to get SIMD vectorization or unrolling of small loops in JIT-compiled code, and thus a quite significant speedup. github.com/numba/llvmlite/issues/270

    – max9111
    Nov 13 '18 at 9:03
python arrays numpy typehints
edited Nov 12 '18 at 19:57
hpaulj

asked Nov 12 '18 at 19:01
Tupolev._
1 Answer
One of the best things about Python is duck typing, and NumPy is in general very compatible with that design approach. Say you have a vector-only function vecfunc. You can add some boilerplate to the beginning of the function that will inflate any 1D array into a 1 x n vector:



    def vecfunc(arr):
        # Promote a 1D array of length n to a 1 x n row vector.
        if arr.ndim == 1:
            arr = arr[None, :]

        ...  # function body goes here


This will avoid any problems due to arr having too few dimensions, and will likely still give correct behavior in most cases. However, it doesn't do anything to prevent a user from passing in, say, an r x n x m array, or a 15 x n array. Ultimately, you're going to have to go with approach 3 for a bunch of this stuff and just throw some exceptions where it seems appropriate. For example:



    def vecfunc(arr):
        # Reject 0D input and anything with more than two dimensions.
        if not 0 < arr.ndim < 3:
            raise ValueError("arr must have ndim of 1 or 2. arr.ndim: %d" % arr.ndim)
        elif arr.ndim == 1:
            arr = arr[None, :]


If it makes you feel any better, the code bases of both numpy and scipy have those kinds of shape-based exception checks in a number of functions, when and where they're needed.
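
The ndim check alone still lets a 15 x n array through a function that expects a single row. One way to cover the row count as well is to pull the checks into a small shared helper. The sketch below is only an illustration: as_rows and its rows parameter are hypothetical names, not part of NumPy.

```python
import numpy as np


def as_rows(arr, rows=None):
    """Return arr as a 2D array, optionally requiring an exact row count.

    Hypothetical helper: the name and signature are illustrative only.
    """
    arr = np.asarray(arr)
    if arr.ndim == 1:
        arr = arr[None, :]  # promote a length-n vector to 1 x n
    if arr.ndim != 2:
        raise ValueError("arr must have ndim of 1 or 2. arr.ndim: %d" % arr.ndim)
    if rows is not None and arr.shape[0] != rows:
        raise ValueError("arr must have %d row(s), got %d" % (rows, arr.shape[0]))
    return arr


def vecfunc(arr):
    # Accepts a length-n vector or a 1 x n array, nothing else.
    arr = as_rows(arr, rows=1)
    return arr.sum(axis=1)
```

Functions that work for any r x n array can call as_rows(arr) without the rows argument and get only the dimensionality check.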



Of course, you could always leave off adding those kinds of exception checks until the very end of developing any given function. You may be surprised at the range of input that produces reasonable behavior.
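
As a small illustration of that point, here is a function written without any shape checks that still behaves sensibly for a plain vector and for any r x n array. The function column_norms is a made-up example, not something from the question.

```python
import numpy as np


def column_norms(arr):
    """Euclidean norm of each column of an r x n array (hypothetical example)."""
    arr = np.atleast_2d(arr)  # a length-n vector becomes 1 x n
    return np.sqrt((arr ** 2).sum(axis=0))
```

A length-n vector, a 2 x n matrix and a 7 x n matrix all return an array of n norms, so no explicit validation was needed for those cases.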



If you're dead set on type annotations, you can get something similar by writing your code using Cython. For example, if you wanted an add function that only took 2D integer arrays, you could write the following function in a .pyx file:



    import numpy as np

    def add(long[:, :] arr1, long[:, :] arr2):
        # Typed memoryviews enforce two dimensions and a C long element type.
        assert tuple(arr1.shape) == tuple(arr2.shape)

        # "l" is the NumPy type code for C long, matching the memoryview type.
        result = np.zeros((arr1.shape[0], arr1.shape[1]), dtype="l")
        cdef long[:, :] result_view = result
        cdef Py_ssize_t x, y

        for x in range(arr1.shape[0]):
            for y in range(arr1.shape[1]):
                result_view[x, y] = arr1[x, y] + arr2[x, y]

        return result


For more details on writing and compiling Cython, see the docs linked above.



This isn't so much "type annotations" as it is actual static typing, but it may do what you want. Sadly, I wasn't able to find a way to fix the size of a single dimension, just the total number of dimensions.
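
For comparison, a pure-Python variant of the same function can carry ordinary annotations. This is a sketch under assumptions, not the Cython approach described here: it assumes a NumPy release that provides numpy.typing.NDArray, the annotation describes only the dtype, and the dimensions are still checked at runtime.

```python
import numpy as np
import numpy.typing as npt

IntArray = npt.NDArray[np.int64]  # alias for readability (illustrative name)


def add(arr1: IntArray, arr2: IntArray) -> IntArray:
    # The annotations describe the dtype only; dimensions are checked at runtime.
    if arr1.ndim != 2 or arr2.ndim != 2:
        raise ValueError("both arguments must be 2D")
    if arr1.shape != arr2.shape:
        raise ValueError("shape mismatch: %s vs %s" % (arr1.shape, arr2.shape))
    return arr1 + arr2
```

A static checker can use the annotations to flag a wrong dtype, while the row and column counts remain a runtime concern.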
  • Good answer. Probably I should stop thinking "static types" while in Python.

    – Tupolev._
    Nov 13 '18 at 11:47
edited Nov 12 '18 at 22:57

answered Nov 12 '18 at 21:41
tel