Auto Installing Packages with Python's sys.meta_path

by James Johnson

Python provides sys.meta_path as a way to hook all imports and get the first chance at handling the import of a module. Learn how to (ab)use sys.meta_path to autoload modules from GitHub and PyPI if the package hasn’t already been installed.

sys.meta_path Details

Python’s sys.meta_path is defined as:

A list of finder objects that have their find_module() methods called to see if one of the objects can find the module to be imported. The find_module() method is called at least with the absolute name of the module being imported. If the module to be imported is contained in a package then the parent package’s __path__ attribute is passed in as a second argument. The method returns None if the module cannot be found, else returns a loader.

sys.meta_path is searched before any implicit default finders or sys.path.

See PEP 302 for the original specification.
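
To see that ordering in practice, here is a minimal sketch of a finder that sits at the front of sys.meta_path, records every name it is asked about, and then declines. It assumes Python 3.4 or later, where the import system asks finders through find_spec() (PEP 451) rather than the find_module() method quoted above. LoggingFinder is a hypothetical name, and colorsys is just a small standard-library module picked for the demonstration.

```python
import sys


class LoggingFinder(object):
    """Records every module name the import system asks about."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path=None, target=None):
        self.seen.append(fullname)
        # Returning None declines the request, so the remaining
        # finders (including the default ones) get their turn.
        return None


finder = LoggingFinder()
sys.meta_path.insert(0, finder)      # front of the list: asked first

sys.modules.pop("colorsys", None)    # make sure the import is not cached
import colorsys                      # triggers a real lookup

sys.meta_path.remove(finder)         # clean up the hook again
print(finder.seen)
```

Because the finder always returns None, the import itself still succeeds through the normal machinery. The hook only observes.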

To summarize, a finder object will look something like the one below, and must return either a loader object or None if the module can’t be found:

class Finder(object):
    def find_module(self, fullname, path=None):
        # return a loader that can load the module that
        # is being imported, or None if this finder
        # can't handle it
        return None

and a loader object will look like the one below, and must either return the loaded module or raise an exception (preferably ImportError, unless an existing exception is being propagated):

class Loader(object):
    def load_module(self, fullname):
        # actually load the module, add it to sys.modules,
        # set its attributes, and return it
        raise ImportError("cannot load " + fullname)

Oftentimes it makes sense to make a single object serve as both the finder and the loader.
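
As a rough illustration of the one-object-does-both pattern, the sketch below serves modules from an in-memory dictionary of source strings. It is written against the modern protocol (find_spec, create_module and exec_module, Python 3.4+) so that it runs on current interpreters. InMemoryImporter and hello_virtual are made-up names for this example.

```python
import importlib
import importlib.util
import sys


class InMemoryImporter(object):
    """Acts as both finder and loader for modules held as source strings."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source code

    # --- finder half ---
    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.sources:
            return None  # not ours; let the next finder try
        # Name ourselves as the loader for this module.
        return importlib.util.spec_from_loader(fullname, self)

    # --- loader half ---
    def create_module(self, spec):
        return None  # None means "use the default module object"

    def exec_module(self, module):
        # Run the stored source inside the new module's namespace.
        exec(self.sources[module.__name__], module.__dict__)


importer = InMemoryImporter({"hello_virtual": "GREETING = 'hi from memory'"})
sys.meta_path.append(importer)

hello_virtual = importlib.import_module("hello_virtual")
print(hello_virtual.GREETING)  # prints: hi from memory
```

Appending (rather than inserting at the front) means the importer is only consulted after the default finders have failed, which is the same placement the PyPI hook later in this post uses.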

Loader objects have a few responsibilities (see PEP 302 for more details):

  • If the module to be loaded (fullname) is already in sys.modules, that module must be returned. If not, the new module must be added to sys.modules.
  • The following attributes must be set on newly-created modules:
    • __file__
    • __name__
    • __loader__
    • __package__
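
Those responsibilities can be sketched by hand as follows. This is only an illustration of the PEP 302 contract, not a loader you would register today: load_module() is called directly here instead of through an import statement, and StringLoader, pep302_demo and the placeholder __file__ value are all invented for the example.

```python
import sys
import types


class StringLoader(object):
    """PEP 302 style loader for a module held as a source string."""

    def __init__(self, source):
        self.source = source

    def load_module(self, fullname):
        # An already-imported module must be returned as-is.
        if fullname in sys.modules:
            return sys.modules[fullname]

        module = types.ModuleType(fullname)          # sets __name__
        module.__file__ = "<string:%s>" % fullname   # placeholder, no real file
        module.__loader__ = self
        # Empty string for a top-level module, parent name otherwise.
        module.__package__ = fullname.rpartition(".")[0]

        # Register before executing so that imports made by the
        # module's own code can already see it.
        sys.modules[fullname] = module
        try:
            exec(self.source, module.__dict__)
        except Exception:
            # Don't leave a half-initialised module behind.
            del sys.modules[fullname]
            raise
        return module


loader = StringLoader("ANSWER = 6 * 7")
first = loader.load_module("pep302_demo")
second = loader.load_module("pep302_demo")   # served from sys.modules
print(first.ANSWER, first is second)         # prints: 42 True
```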

Auto Installing Packages From PyPI

The script autopypi.py below will auto-install a package from PyPI when it is imported, provided it is not already installed and is a valid package on PyPI:

#!/usr/bin/env python
# encoding: utf-8

import imp
import os
import sys

import pip
from pip.commands.search import SearchCommand
import virtualenv

class PyPiPathHook(object):
    def __init__(self):
        # create a virtualenv at ~/.pypi_autoload
        user_home = os.path.expanduser("~")
        self.venv_home = os.path.join(user_home, ".pypi_autoload")
        if not os.path.exists(self.venv_home):
            virtualenv.create_environment(self.venv_home)
        activate_script = os.path.join(self.venv_home, "bin", "activate_this.py")
        execfile(activate_script, dict(__file__=activate_script))

    def find_module(self, fullname, path=None):
        if "." in fullname:
            return None

        try:
            imp.find_module(fullname)
        except ImportError:
            # not importable yet, so fall through and try PyPI
            pass
        else:
            # it's already accessible, we don't need to do anything
            return None

        if self._package_exists_in_pypi(fullname):
            pip.main(["install", fullname, "--prefix", self.venv_home])

        # if the install succeeded, the package is now reachable by
        # the normal import procedures (it should be on sys.path),
        # so we return None to make Python attempt a normal import
        return None

    def _package_exists_in_pypi(self, fullname):
        searcher = SearchCommand()
        options, args = searcher.parse_args([fullname])
        matches = searcher.search(args, options)
        # search results are fuzzy, so only accept an exact name match
        for match in matches:
            if match["name"] == fullname:
                return True

        return False

sys.meta_path.append(PyPiPathHook())

There are four notable things about the script above:

  1. The load_module method was not implemented. Instead of returning self or some other loader object and loading the module ourselves, None is always returned. This makes Python use its standard import mechanisms (e.g. looking in sys.path for the module, etc.) after we have installed the package to a location on sys.path.

  2. virtualenv can be used directly from Python. See this post for more info.

  3. Pip commands can be imported and used directly so that raw, unformatted results can be processed. In this example, I really didn’t want to parse what pip printed just to know whether a package exists. Using SearchCommand from pip.commands.search directly lets us work with the raw results before they are formatted and printed.

  4. imp.find_module attempts to find the package using Python’s normal procedures. This is a good test to tell if a package is installed on a system. See the imp.find_module docs for more information.
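
Points 3 and 4 lean on imp and on pip's internal modules, which newer releases no longer offer: imp was removed in Python 3.12, and pip does not document an importable API. The sketch below shows one way the same three questions (is it importable, does it exist on PyPI, how do I install it) might be answered on a current Python 3. It swaps in importlib.util.find_spec, PyPI's JSON endpoint and a pip subprocess, none of which the original script uses. All function names are hypothetical, and the sketch assumes the import name and the PyPI project name are the same, which is not always true.

```python
import importlib.util
import subprocess
import sys
import urllib.error
import urllib.parse
import urllib.request


def is_importable(name):
    """True if a top-level module called `name` can already be found."""
    return importlib.util.find_spec(name) is not None


def exists_on_pypi(name):
    """True if PyPI's JSON API has a project called `name` (needs network)."""
    url = "https://pypi.org/pypi/%s/json" % urllib.parse.quote(name)
    try:
        with urllib.request.urlopen(url) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # Unknown projects answer 404; any other HTTP error is
        # also treated as "not available" in this sketch.
        return False


def pip_install_command(name, prefix):
    """Build the argv that installs `name` under `prefix` with this interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", "--prefix", prefix, name]


def install(name, prefix):
    """Run pip in a subprocess; raises CalledProcessError on failure."""
    subprocess.check_call(pip_install_command(name, prefix))


print(is_importable("json"))
print(is_importable("surely_not_a_real_module_name"))
```

Only the two is_importable() calls run here. exists_on_pypi() and install() touch the network and the filesystem, so they are left for you to call deliberately.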

Simply importing the script above will add PyPI auto-install functionality to your Python programs:

import autopypi
import xmltodict

print("xmltodict was successfully imported!")

Which outputs:

jelly$> python example.py
Collecting xmltodict
Installing collected packages: xmltodict
Successfully installed xmltodict-0.10.1
Collecting defusedexpat
Installing collected packages: defusedexpat
Successfully installed defusedexpat-0.4
xmltodict was successfully imported!