Should I use numpy (or pylab) as a python environment by using `from numpy import *`?

一个人的身影 2021-01-04 08:32

I use pylab (more specifically numpy) in all of my Python programs. The exceptions are very rare, if any. So far, I have gotten into the habit of importing numpy in the following way: `from numpy import *`.

5 answers
  • 2021-01-04 09:18

    I would say that it is an advantage to know where every function call is coming from. It gives you more control over what is in your namespace and avoids all sorts of potential conflicts that will be a pain to debug. If you think import numpy as np is tedious, just wait until you have some third-party module that redefines a function name and you have to track down some mysterious behavior that you weren't anticipating.
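A minimal sketch of the kind of conflict described above. It uses two standard-library modules, `math` and `cmath`, as stand-ins for numpy and a third-party package (an assumption, so that the example runs without extra dependencies); `star_import` is a hypothetical helper. Both modules export `sqrt`, so with star imports the last import silently decides which one gets called:

```python
import cmath
import math

def star_import(*module_names):
    """Run `from <module> import *` for each module, in order, in a fresh namespace."""
    namespace = {}
    for name in module_names:
        exec(f"from {name} import *", namespace)
    return namespace

# Same two modules, opposite import order.
math_last = star_import("cmath", "math")
cmath_last = star_import("math", "cmath")

# The unqualified name `sqrt` refers to whichever module was imported last.
assert math_last["sqrt"] is math.sqrt
assert cmath_last["sqrt"] is cmath.sqrt

print(cmath_last["sqrt"](-1))  # complex square root: 1j
try:
    math_last["sqrt"](-1)
except ValueError as exc:
    print("math.sqrt rejects negative input:", exc)

# With qualified names there is nothing to track down.
print(math.sqrt(4), cmath.sqrt(-4))
```

With qualified names (`math.sqrt`, `cmath.sqrt`) the import order stops mattering, which is the control over the namespace this answer is arguing for.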

  • 2021-01-04 09:26

    It is not a significant problem if numpy is the only module you import like this. Never EVER import any other modules like this in your scripts, unless that module was written by you, you know everything about it, and it is reasonably small. (For instance, sometimes you split a module into two files so that you can compartmentalize better.)

    General Rule: Your code readability will not suffer significantly by importing widely used modules (such as numpy) in this manner. But never never import more than one.

    My Rule: I NEVER do this kind of import. I always do something like "import numpy as np" if it is going to be used a lot.

  • 2021-01-04 09:29

    Let's tackle it from the other way around: I get your code to debug, and I see that you call:

    zeros(5)
    

    It is tedious to search through your source to see whether this is np.zeros or something you redefined elsewhere, and since pylab has 930 names, this can happen easily.
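One way to answer that question at runtime is to ask the object where it was defined. The sketch below uses the standard-library `json` module as a stand-in for numpy (an assumption, so that it runs without third-party packages). `defining_module` is a hypothetical helper and `my_script` is a made-up script name:

```python
import json

def defining_module(obj):
    """Return the name of the module that defines `obj`, or None if unknown."""
    return getattr(obj, "__module__", None)

# Simulate a script called "my_script" that starts with `from json import *`.
script_namespace = {"__name__": "my_script"}
exec("from json import *", script_namespace)
print(defining_module(script_namespace["dumps"]))  # json

# Somewhere later, the same script redefines the name.
exec("def dumps(obj):\n    return repr(obj)", script_namespace)
print(defining_module(script_namespace["dumps"]))  # my_script

# With a qualified name, the origin is visible at the call site.
print(json.dumps([1, 2, 3]))
```

This only helps once you suspect a name. Writing `np.zeros(5)` in the first place makes the check unnecessary.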

  • 2021-01-04 09:32
    1. Using from numpy import * changes the behavior of any, all and sum. For example, with Python's builtins:

      any([[False]])
      # True
      all([[True, False], [False, False]])
      # True
      sum([[1,2],[3,4]], 1) 
      # TypeError: unsupported operand type(s) for +: 'int' and 'list'
      

      Whereas if you use from numpy import *, the results are completely different:

      from numpy import *
      any([[False]])
      # False
      all([[True, False], [False, False]])
      # False
      sum([[1,2],[3,4]], 1) 
      # array([3, 7])
      

      The full set of name collisions can be found this way (thanks to @Joe Kington and @jolvi for pointing this out):

      import builtins
      import numpy as np
      np_names = set(np.__all__)
      builtin_names = set(dir(builtins))
      print([name for name in np_names & builtin_names if not name.startswith('__')])
      # e.g. ['any', 'all', 'sum'] (the exact list depends on the numpy version)
      
    2. This can lead to very confusing bugs since someone testing or using your code in a Python interpreter without from numpy import * may see completely different behavior than you do.

    3. Using multiple imports of the form from module import * can compound the problem with even more collisions of this sort. If you nip this bad habit in the bud, you'll never have to worry about this (potentially confounding) bug.

      The order of the imports could also matter if both modules redefine the same name.

      And it makes it harder to figure out where functions and values come from.

    4. While it is possible to use from numpy import * and still access Python's builtins, it is awkward:

      import builtins
      from numpy import *
      any([[False]])            # numpy's any: False
      builtins.any([[False]])   # Python's builtin any: True
      

      and less readable than:

      import numpy as np
      np.any([[False]])   # numpy's any: False
      any([[False]])      # Python's builtin any: True
      
    5. As the Zen of Python says,

      Namespaces are a honking great idea -- let's use more of those!

    My advice would be to never use from module import * in any script, period.
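The collision check from point 1 can be generalized to any module. The following is a sketch only: `star_import_collisions` is a hypothetical helper, and it is demonstrated on the standard-library modules `os` and `math` so that it runs without numpy installed.

```python
import builtins
import importlib

def star_import_collisions(module_name):
    """List the builtin names that `from <module_name> import *` would shadow."""
    module = importlib.import_module(module_name)
    # A star import honors __all__ when it is defined; otherwise it takes
    # every name that does not start with an underscore.
    exported = getattr(module, "__all__", None)
    if exported is None:
        exported = [name for name in vars(module) if not name.startswith("_")]
    builtin_names = {name for name in dir(builtins) if not name.startswith("_")}
    return sorted(builtin_names.intersection(exported))

print(star_import_collisions("os"))    # includes 'open'
print(star_import_collisions("math"))  # includes 'pow'
```

Calling `star_import_collisions("numpy")` on a machine with numpy installed gives the list of shadowed builtins for that numpy version.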

  • 2021-01-04 09:35

    Just to elaborate on what other people have said, numpy is an especially bad module to use import * with.

    pylab is meant for interactive use, and it's fine there. No one wants to type pylab.zeros over and over in a shell when they could just type zeros. However, as soon as you start writing code, everything changes. You're typing it once and it's staying around potentially forever, and other people (e.g. yourself a year down the road) are probably going to be trying to figure out what the heck you were doing.

    In addition to what @unutbu already said about overriding Python's builtin sum, float, int, etc., and to what everyone has said about not knowing where a function came from, numpy and pylab are very large namespaces.

    numpy has 566 functions, variables, classes, etc. within its namespace. That's a lot! pylab has 930! (And with pylab, these come from quite a few different modules.)

    Sure, it's easy enough to guess where zeros or ones or array is from, but what about source or DataSource or lib.utils? (All of these will be in your local namespace if you do from numpy import *.)

    If you have an even slightly larger project, there's a good chance you're going to have a local variable or a variable in another file that's named similarly to something in a big module like numpy. Suddenly, you start to care a lot more about exactly what it is that you're calling!

    As another example, how would you distinguish between pylab's fft function and numpy's fft module?

    Depending on whether you do

    from numpy import *
    from pylab import *
    

    or:

    from pylab import *
    from numpy import *
    

    fft is a completely different thing with completely different behavior! (i.e. trying to call fft in the second case will raise an error.)
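The same function-versus-module clash can be reproduced with the standard library alone. This is a sketch by analogy, not the pylab case itself: the `time` module exports a function that is also called `time`, so a star import rebinds the name from the module to the function. `run` is a hypothetical helper.

```python
import types

def run(source):
    """Execute `source` in a fresh namespace and return that namespace."""
    namespace = {}
    exec(source, namespace)
    return namespace

star_last = run("import time\nfrom time import *")
module_last = run("from time import *\nimport time")

# After the star import, `time` is the function, so `time.sleep` no longer exists.
assert callable(star_last["time"])
assert not hasattr(star_last["time"], "sleep")

# With the order reversed, `time` is the module, so calling `time()` fails.
assert isinstance(module_last["time"], types.ModuleType)
try:
    module_last["time"]()
except TypeError as exc:
    print("calling the module fails:", exc)
```

Which of the two you get depends only on the order of two import lines, exactly as with fft above.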

    All in all, you should always avoid from module import *, but it's an especially bad idea in the case of numpy, scipy, et al. because they're such large namespaces.

    Of course, all that having been said, if you're just futzing around in a shell trying to quickly get a plot of some data before moving on to actually doing something with it, then sure, use pylab. That's what it's there for. Just don't write something that way that anyone might try to read later on down the road!

    </rant>
