2016-09-02

Python import basics in plain English

In my own experience learning Python, and that of others on Python teams I've worked with, a common hurdle is understanding how Python does imports.

The basics are actually very simple, but the documentation tends to be a little neckbeardy and dense, and hard to grok if you're new to the language. So I thought I'd list common, simple practical examples of Python imports in case they help someone.

To import data and functions from somewhere else (another .py file in your project, a standard library module like os, or a third-party library you installed with pip), you have the following options:

import <module>
import <module> as <other_name>
from <module> import a, b, c
from <module> import *

Let's look at what these options mean.

import <module>

import <module> means your program can access everything that is defined in <module> (variables, classes, functions, etc.), as long as you prefix those names with "<module>.". For example, if <module> is the "os" library (it comes with Python), which defines a function called getpid and a variable named name, your program can do this:

>>> import os
>>> os.name
'posix'
>>> os.getpid()
51678

This works with your own libraries too. Say you created a Python file named network_functions.py, which contains a constant named BANDWIDTH and a function named connect(url). You can then do:

>>> import network_functions
>>> network_functions.BANDWIDTH
1024
>>> network_functions.connect('https://google.com')
Connecting...
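To try this without creating a project first, here is a minimal sketch that writes a stand-in network_functions.py into a temporary directory and imports it. The body of connect() is invented for illustration (the original file isn't shown), and adding the directory to sys.path is only a shortcut to make the example self-contained:

```python
import os
import sys
import tempfile

# A tiny stand-in for network_functions.py. The connect() body is hypothetical.
module_source = (
    "BANDWIDTH = 1024\n"
    "\n"
    "def connect(url):\n"
    "    return 'Connecting to ' + url\n"
)

# Write the file into a fresh temporary directory.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "network_functions.py"), "w") as f:
    f.write(module_source)

# Make that directory importable, then import the module by its
# file name without the .py extension.
sys.path.insert(0, module_dir)
import network_functions

print(network_functions.BANDWIDTH)
print(network_functions.connect("https://google.com"))
```

In a real project you would normally just put network_functions.py next to the script that imports it.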

If you don't want to type out the whole prefix every time, you have other options. The prefix can get unwieldy when your modules are nested in subdirectories, e.g. import lib.network.connection_functions.

import <module> as <other_name>

This lets you use <other_name> instead of the module's full name. 

>>> import lib.network.connection_functions as netfunc
>>> netfunc.BANDWIDTH
1024
>>> netfunc.connect('https://google.com')
Connecting...
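Since lib.network.connection_functions only exists in this post's imaginary project, here is the same idea sketched with two nested modules that ship with Python. The alias names osp and ET are just conventions, and the XML string is made up for the example:

```python
import os.path as osp
import xml.etree.ElementTree as ET

# The alias stands in for the full dotted name.
full_path = osp.join("lib", "network", "connection_functions.py")
file_name = osp.basename(full_path)

# Parse a small, made-up XML snippet through the short alias.
root = ET.fromstring("<config><bandwidth>1024</bandwidth></config>")
bandwidth = int(root.find("bandwidth").text)

# The alias is only a local nickname; the module keeps its real name.
real_name = ET.__name__

print(file_name, bandwidth, real_name)
```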

from <module> import a, b, c

This lets you import only what you need from <module> into your current program's namespace. This means everything you imported from the external module can be called by its bare name in your program:

>>> from lib.network.connection_functions import BANDWIDTH, connect
>>> BANDWIDTH
1024
>>> connect('https://google.com')
Connecting...
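A runnable version of the same pattern, sketched with the standard library so it works without the imaginary lib package. The file path is invented for the example:

```python
from os.path import basename, splitext

# Both functions are now available by their bare names.
file_name = basename("/tmp/lib/network_functions.py")
module_name, extension = splitext(file_name)

print(module_name, extension)
```

Only basename and splitext were brought in here. Anything else from os.path would still need its own import.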

from <module> import *

This imports everything from the module into your current program's namespace, so you can call everything from the module by its bare name. I strongly recommend against it, and I'll explain why.

>>> from lib.network.connection_functions import *
>>> BANDWIDTH
1024
>>> connect('https://google.com')
Connecting...

This is almost never a good idea, because you don't always know or control what is defined in an external module, and there can be name collisions, e.g. functions or variables with the same name, so you may not be using the variable or function you expect! For example:

In file helpers.py:
def connect(url):
    print("Connecting to " + url)

In file network.py:
def connect(url):
    print("Hacking into " + url)

>>> from helpers import *
>>> from network import *
>>> connect('https://google.com')
# which one is called? (network's: the later import silently replaces the earlier one)

>>> from network import *
>>> from helpers import *
>>> connect('https://google.com')
# which one is called? (helpers' this time, because it was imported last)
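You can watch this kind of collision happen with two modules that come with Python. Both math and cmath define a function named sqrt, so whichever star import runs last decides which one you get. This is a small sketch meant to be run as a script or typed at the prompt:

```python
# Both modules define sqrt; the second star import overwrites the first.
from math import *
from cmath import *
cmath_imported_last = sqrt(4)   # cmath.sqrt returns a complex number

# Swap the order and the bare name now points at math.sqrt instead.
from cmath import *
from math import *
math_imported_last = sqrt(4)    # math.sqrt returns a float

print(cmath_imported_last)
print(math_imported_last)
```

Python raises no warning in either case. The name is simply rebound.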

This example may seem contrived, but it's very common to import a bunch of modules written by different people, and some variable or function names are common or obvious enough that they may appear several times in different modules. Why wouldn't they, after all? Joe doesn't know about Jill's (or your own) module, so they have no reason to coordinate and ensure they're not using the same function names. 

If you use from <module> import * with several modules, the odds are very good you'll call a function and actually invoke one that's not the one you expect. And that can be really tricky to debug. 

So what should you do if you need a bunch of functionality from a module, but don't want to import every single function and variable by name like this?

from <module> import var1, var2, var3, fun1, fun2, fun4 # etc 

It's simple! Don't use from <module> import *. Instead, use import <module> and presto, your program can use everything from <module>, as long as you put the <module>. prefix before the names.

>>> import os
>>> os.getpid()
51678
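The prefix also keeps same-named functions from different modules apart. As a small sketch, the standard math and cmath modules each define sqrt, and plain imports let you use both side by side:

```python
import math
import cmath

# The prefix makes it obvious which sqrt is being called.
real_root = math.sqrt(4)        # float result
complex_root = cmath.sqrt(-4)   # complex result; math.sqrt(-4) would raise ValueError

print(real_root, complex_root)
```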

Handy Tips

Do you ever want to know the variables or functions defined in a module you imported without having to Google them? Just use dir, which lists the names, or vars, which gives you a dictionary mapping each name to its value:

>>> import os
>>> dir(os)
['EX_CANTCREAT', 'EX_CONFIG', 'EX_DATAERR', 'EX_IOERR', 'EX_NOHOST', 'EX_NOINPUT', 'EX_NOPERM', 'EX_NOUSER', 'EX_OK', 'EX_OSERR', 'EX_OSFILE', 'EX_PROTOCOL', 'EX_SOFTWARE', 'EX_TEMPFAIL', 'EX_UNAVAILABLE', 'EX_USAGE', 'F_OK', 'NGROUPS_MAX', 'O_APPEND', 'O_ASYNC', 'O_CREAT', 'O_DIRECTORY', 'O_DSYNC', 'O_EXCL', 'O_EXLOCK', 'O_NDELAY', 'O_NOCTTY', 'O_NOFOLLOW', 'O_NONBLOCK', 'O_RDONLY', 'O_RDWR', 'O_SHLOCK', 'O_SYNC', 'O_TRUNC', 'O_WRONLY', 'P_NOWAIT', 'P_NOWAITO', 'P_WAIT', 'R_OK', 'SEEK_CUR', 'SEEK_END', 'SEEK_SET', 'TMP_MAX', 'UserDict', 'WCONTINUED', 'WCOREDUMP', 'WEXITSTATUS', 'WIFCONTINUED', 'WIFEXITED', 'WIFSIGNALED', 'WIFSTOPPED', 'WNOHANG', 'WSTOPSIG', 'WTERMSIG', 'WUNTRACED', 'W_OK', 'X_OK', '_Environ', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '_copy_reg', '_execvpe', '_exists', '_exit', '_get_exports_list', '_make_stat_result', '_make_statvfs_result', '_pickle_stat_result', ] # and a bunch more

>>> vars(os)
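{'WTERMSIG': <built-in function WTERMSIG>, 'lseek': <built-in function lseek>, 'EX_IOERR': 74, 'EX_NOHOST': 68, 'seteuid': <built-in function seteuid>, 'pathsep': ':', 'execle': <function execle at 0x...>, '_Environ': <class os._Environ at 0x...>, } # and a bunch more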
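Those raw dumps are long, so it often helps to filter them. A minimal sketch (the variable names are my own) that keeps only the public names and looks one up in the dictionary that vars returns:

```python
import os

# dir() returns a list of names; drop the ones starting with an underscore.
public_names = [name for name in dir(os) if not name.startswith("_")]
has_getpid = "getpid" in public_names

# vars() returns the module's namespace as a dict of name -> object.
namespace = vars(os)
same_function = namespace["getpid"] is os.getpid

print(len(public_names), has_getpid, same_function)
```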

What's Next?

In a later post, I'll cover other tricky aspects of imports, namely how Python maps import statements to Python code files in directories, and how to debug ImportError: No module named <module> problems that can occur depending on what directories your files are in.

Hopefully that was helpful!
