
Generate flawless documentation of Python code using Sphinx

Photo by Dustin Humes on Unsplash

You can build beautiful, standardised and stylised documentation using just the docstrings in a few simple steps.

A Data Scientist holds many responsibilities when working on a project, and one that is usually left until the last minute is documentation. Perhaps you’re diligent with writing docstrings for classes and functions (well done!) — but should that be the resting place of your documentation?

In my opinion, documentation should sit independently from your code. Your team (or you in a few months’ time) shouldn’t have to trawl through hundreds of lines of code in your Python modules to understand what’s going on. You can build beautiful, standardised and stylised documentation using just the docstrings in a few simple steps and make your project speak for itself.

In this article, I’ll focus on using the Sphinx framework for autodocumenting Python modules. I’ll also be using the Cookiecutter Data Science project template in Visual Studio Code (VS Code) due to its seamless integration with Sphinx and standardised directory structure. Whilst the official Sphinx tutorial documentation is a great resource for those wanting to take a deep dive into this topic, my aim for this article is to be a helpful ‘quick start’ guide to take you through the key steps. Enjoy 🙂

A note on docstrings

The key to good documentation is docstrings. These are the comment blocks that sit within each class, class method and function and describe the nature of the code, along with the inputs, outputs and raised errors. There are three core docstring formats: Google, reStructuredText (reST) and NumPy. They all contain the same information; the only difference is how they are formatted. You can see examples of each docstring format here.

I’ll be using the Google docstring format as it is easy to read and takes up less space than the others. The code block below is a typical example of a Google docstring:

"""Description of the function, class or method etc.

Args:
    varA (str): Description of varA
    varB (bool): Description of varB

Returns:
    list: Description of returned list

Raises:
    ValueError: Description of raised error
"""
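To make the template concrete, here is a minimal sketch of a function annotated in the Google style. The function name filter_words and its parameters are hypothetical and only serve as an illustration; they are not part of the demo module used later.

```python
def filter_words(words, min_length, uppercase=False):
    """Keep the words that are at least min_length characters long.

    Args:
        words (list): Words to filter
        min_length (int): Minimum number of characters a word must have
        uppercase (bool): Whether to convert the kept words to upper case

    Returns:
        list: Words that passed the filter, in their original order

    Raises:
        ValueError: If min_length is negative
    """
    if min_length < 0:
        raise ValueError("min_length must not be negative")
    kept = [word for word in words if len(word) >= min_length]
    return [word.upper() for word in kept] if uppercase else kept
```

Python stores the docstring on the function itself (filter_words.__doc__), which is what tools such as Sphinx read once the module has been imported.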

Top tip: install the ‘autoDocstring - Python Docstring Generator’ extension in VS Code to automatically generate a docstring template when you type three double quotation marks (i.e. the start of a docstring). Be sure to finish writing the function before generating the docstring so that all inputs, outputs and errors are included in the template that gets generated for you!

Let’s move on to making the documentation!

For the purpose of this demo, I have created a Python module, demo.py, which contains a class and three basic functions (all annotated with docstrings, with the exception of one function). It is this module that I’ll be building documentation for in this article. The contents of the demo.py module are shown below:

Contents of demo.py module to be documented. Snapshot taken using CodeSnap extension in VS Code.

1. Setup

The first thing is to get everything set up. You’ll need to install VS Code and set up a new project along with Sphinx. There are a few options here. You can either a) set up a new project using Cookiecutter (where the relevant Sphinx setup will be generated along with standardised directories) or b) create your own project structure and install Sphinx separately.

option A — install Cookiecutter

In the terminal, pip install Cookiecutter and then create a new project:

pip install cookiecutter
cookiecutter https://github.com/drivendata/cookiecutter-data-science

Next, answer the questions that appear in the terminal window and your new project will be created. The Sphinx framework will be stored in the /docs directory of your project.

option B — Sphinx quickstart

If the Cookiecutter template doesn’t take your fancy, you can create your own project structure from scratch and install Sphinx. It is a good idea to make a documentation directory and run the Sphinx quickstart there. In the terminal:

mkdir docs
cd docs

pip install sphinx
sphinx-quickstart

2. Understanding Sphinx folder structure

After you’ve installed Sphinx using one of the options above, several files will appear in the documentation directory of your project. The conf.py file is the key configuration file, which you’ll edit to make your documentation bespoke; more detail on this in the next section. The index.rst file acts as a contents page for your documentation. You can find more information on the index.rst file here. The getting-started.rst and commands.rst files are suggested templates that are generated by the Cookiecutter template. You can remove these if necessary. The make files (make.bat and Makefile) are used to actually build the documentation. You don’t need to edit these, but you will call them in the terminal when you’re ready to build the documentation.

Default Sphinx files installed

3. The conf.py file

The configuration file is where the magic happens. This file is used during the build process and so it is crucial that you have this set up correctly. Below are some steps to modifying the conf.py file:

Uncomment the sys.path line (line 20 in my setup):

# sys.path.insert(0, os.path.abspath('.'))

Change the path passed to os.path.abspath to the location of the code you want documenting, relative to the conf.py file. For example, the Python modules that I want documenting sit within the src/ directory of my project. Hence I will change the path to look in the src/ directory, which is located in the parent folder of the conf.py file. You can specify the relative location using the . and / syntax:

import os
import sys

sys.path.insert(0, os.path.abspath('../src'))

"""
# you can use the following syntax to specify relative locations:

'.' # current path of conf.py
'..' # parent path of conf.py
'../..' # parent of the parent path of conf.py
"""

The relative location of the directory containing the Python modules to the documentation folder. In this example, ‘demo.py’ is the module to be documented, located in the src/data/ directory.
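A detail worth knowing is that os.path.abspath resolves a relative path against the current working directory rather than against the file it is written in. In a normal Sphinx build that is the documentation folder, so ‘../src’ behaves as described. If you prefer a path that does not depend on the working directory at all, a minimal sketch is to anchor it to the location of conf.py instead. The helper name source_dir_from is hypothetical, and the commented usage line assumes that __file__ is available inside conf.py.

```python
from pathlib import Path


def source_dir_from(conf_file, relative="../src"):
    """Resolve `relative` against the folder that contains conf_file."""
    return (Path(conf_file).parent / relative).resolve()


# Possible usage inside conf.py (assumption: __file__ points at conf.py):
# sys.path.insert(0, str(source_dir_from(__file__)))
```

The result is the same directory as before; the only difference is that it no longer matters where the build command is launched from.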

Add in the relevant extensions. You’ll need to add in some extensions to the conf.py file to gain extra functionality when creating your documentation. These are all optional and you can have some fun exploring the different extensions available here. Here are the 5 extensions that I recommend at minimum:

sphinx.ext.autodoc: use documentation from docstrings.
autodocsumm: generate a tabular summary of all docstrings at the top of the html page by listing out the docstring summaries only. Useful when you have a lot of docstrings. Note: you will need to pip install autodocsumm in the terminal.
sphinx.ext.napoleon: enables Sphinx to parse Google docstrings.
sphinx.ext.viewcode: adds a link to a html page containing the source code for each module.
sphinx.ext.coverage: provides a summary of how many classes/functions etc. have docstrings. Good coverage signifies that a codebase is well explained.

Here’s how to include these extensions in the conf.py file (line 29 in my setup):

# add the extension names to the empty list variable 'extensions'
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.napoleon',
    'autodocsumm',
    'sphinx.ext.coverage'
]

# add this line for the autosummary functionality
autodoc_default_options = {'autosummary': True}
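To get a feel for what ‘docstring coverage’ means, here is a small standalone sketch that counts how many functions and classes in a piece of source code carry a docstring. It uses Python’s built-in ast module and is only an illustration of the idea; it is not how sphinx.ext.coverage is implemented, and the function name docstring_coverage is hypothetical.

```python
import ast


def docstring_coverage(source):
    """Return (documented, total) for the functions and classes in source."""
    tree = ast.parse(source)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    # ast.get_docstring returns None when a definition has no docstring
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return documented, len(nodes)


SAMPLE = '''
class Greeter:
    """Say hello."""

    def greet(self, name):
        """Return a greeting for name."""
        return "Hello " + name


def undocumented(x):
    return x
'''

documented, total = docstring_coverage(SAMPLE)
print(f"{documented} of {total} definitions have a docstring")
```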

Change the theme. The default theme of the documentation is quite clean, although you may prefer to play around with different options by changing the ‘html_theme’ variable (line 94 in my setup) from ‘default’ to one of the standard theme options or some third party options. In this demo, I’ll show the default and Read the Docs themes.

html_theme = 'sphinx_rtd_theme'  # Read the Docs theme. This variable is 'default' by default.

Note: you will need to pip install any non-standard (third-party) themes. For the Read the Docs theme, that is pip install sphinx-rtd-theme.
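Putting the steps of this section together, the edited parts of conf.py might look roughly like the sketch below. It assumes the project layout used in this article (code in src/, documentation in docs/) and includes sphinx.ext.viewcode from the recommended list; the rest of the generated conf.py is left untouched.

```python
# Sketch of the edited parts of docs/conf.py (assumes the code lives in ../src)
import os
import sys

# make the modules in src/ importable so that autodoc can find them
sys.path.insert(0, os.path.abspath('../src'))

extensions = [
    'sphinx.ext.autodoc',   # pull documentation from docstrings
    'sphinx.ext.napoleon',  # parse Google-style docstrings
    'sphinx.ext.viewcode',  # link to the highlighted source code
    'sphinx.ext.coverage',  # report objects without docstrings
    'autodocsumm',          # summary tables (pip install autodocsumm)
]

# switch on the autodocsumm summary tables for the autodoc directives
autodoc_default_options = {'autosummary': True}

# third-party theme (pip install sphinx-rtd-theme)
html_theme = 'sphinx_rtd_theme'
```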

4. Make the html pages

Now that your conf.py file is set up and you have glorious docstrings in your code, we’re ready to do some scraping and build some html pages.

Generate .rst files of your python packages

These files are the precursor to the html pages and are written in reStructuredText, the native format for Sphinx. They need to be generated before making the html files. You’ll use the sphinx-apidoc command, which scans the directory you pass to it for Python packages and modules (i.e. .py files) and writes .rst files containing autodoc directives for them. There are some optional parameters that you can find in the documentation, but I used the following template:

Note: in the terminal, change directory to the root of the project before running the following command.

sphinx-apidoc -f -o output_dir module_dir/

-f (force overwriting any existing generated files).
-o output_dir (directory to place the output files. If it does not exist, it is created). Note: replace ‘output_dir’ with a directory name of your choice. I set mine to the /docs directory.
module_dir (location of Python packages to document).

After running this command, there should be newly generated .rst files in the docs folder.
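If you would rather trigger this step from a Python script (for example as part of a larger build routine), a minimal sketch is to assemble the same command and hand it to subprocess. The function names apidoc_command and run_apidoc are hypothetical, and the ‘docs’ and ‘src’ paths in the comment are the ones from my setup; running it requires Sphinx to be installed.

```python
import subprocess


def apidoc_command(output_dir, module_dir, force=True):
    """Build the sphinx-apidoc argument list described above."""
    command = ["sphinx-apidoc"]
    if force:
        command.append("-f")  # overwrite previously generated .rst files
    command += ["-o", output_dir, module_dir]
    return command


def run_apidoc(output_dir, module_dir):
    """Run sphinx-apidoc and raise an error if the command fails."""
    subprocess.run(apidoc_command(output_dir, module_dir), check=True)


# Example, run from the root of the project:
# run_apidoc("docs", "src")
```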

Contents of documentation folder after running sphinx-apidoc command to generate .rst files

Notice that two new .rst files have been generated: data.rst and modules.rst. In addition to modules.rst, a .rst file will be generated for each directory that contains at least one Python module. In my example, data.rst is generated as I have saved my demo.py file in the src/data directory. If you have multiple directories of Python modules within the module_dir location you passed to sphinx-apidoc (the same location you added to sys.path in conf.py), then multiple .rst files will be generated. Note: these files do not contain the scraped documentation just yet; they just contain the directives that autodoc needs to make the html files in the next step.

Edit index.rst file

Remember, index.rst acts as a contents page, so we must edit this file to include all the Python modules we want documenting. Luckily, modules.rst references all the Python packages that sphinx-apidoc found, so you can simply add this file to index.rst.

To do this, open the index.rst file and add ‘modules’ underneath the toctree (table of contents tree) section. Make sure there is a blank line between the :maxdepth: parameter and the names of the .rst files.
Note: ‘getting-started’ and ‘commands’ will already be in the index.rst file. You can delete them from this file if you do not want html pages generated for them (although a ‘getting-started’ page is probably a good idea!)

Contents of the index.rst file. I have added in ‘modules’ so that the modules.rst file is used in the html generation process.
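For reference, the sketch below holds a sample index.rst as a Python string and uses a small, hypothetical helper (toctree_entries) to list the pages named under the toctree. The sample text is an assumption about what the file roughly looks like after the edit, not a copy of my file; note the blank line between the :maxdepth: option and the entries.

```python
SAMPLE_INDEX = """\
Contents:

.. toctree::
   :maxdepth: 2

   getting-started
   commands
   modules
"""


def toctree_entries(rst_text):
    """Return the page names listed under the first toctree directive."""
    entries = []
    in_toctree = False
    for line in rst_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(".. toctree::"):
            in_toctree = True
            continue
        if not in_toctree or not stripped:
            continue  # skip text before the directive and blank lines
        if not line.startswith((" ", "\t")):
            break  # an unindented line ends the directive
        if stripped.startswith(":"):
            continue  # an option such as :maxdepth: 2
        entries.append(stripped)
    return entries


print(toctree_entries(SAMPLE_INDEX))
```

A quick check like this can confirm that ‘modules’ is listed before you build the html pages.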

Make html files

Now we can use the make files in your documentation directory to build the html files. These files will appear in the _build/html/ directory within your documentation folder. You can preview these in VS Code if you install the ‘HTML Preview’ extension.

Change directory to where the make.bat file is located and run the following command in cmd terminal:

make html

Note: if you are using the Windows PowerShell terminal (rather than cmd), use the following syntax:

.\make.bat html

Top tip: if the make html command raises a warning that states ‘autodoc: failed to import module’, this is most likely because autodoc cannot find your modules, as sys.path has not been configured correctly in conf.py. Make sure it points to the directory where your Python modules are located.
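One way to narrow down such an import problem outside of Sphinx is to check whether Python itself can locate the module once the source directory is on sys.path. The sketch below does that with importlib; the helper name can_import is hypothetical, and the module name and path in the comment are the ones from my setup.

```python
import importlib.util
import os
import sys


def can_import(module_name, source_dir):
    """Return True if module_name can be found once source_dir is on sys.path."""
    path = os.path.abspath(source_dir)
    sys.path.insert(0, path)
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # raised when a parent package of a dotted name cannot be found
        return False
    finally:
        sys.path.remove(path)


# Example, run from the docs folder:
# can_import("data.demo", "../src")
```

If this returns False for the path you put in conf.py, the path (or a missing package folder) is the likely culprit rather than Sphinx.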

Editing html files

If you wish to edit your docstrings and update your html files with the changes, then you can do so using the following command:

make clean html

Let’s take a look at our documentation!

As I mentioned above, I have created documentation for my Python module demo.py in two different themes, seen in the images below: ‘default’ (left image) and ‘Read the Docs’ (right image). The content is identical but the look and feel are different. Let’s take note of the core features:

Navigation bar on the left-hand side
A summary of all classes or functions belonging to the module in tables at the top of the page (thanks to the ‘autodocsumm’ extension)
Detailed list of docstring components for all functions and classes below the summary

Examples of documentation html pages for a sample Python module using the default theme (left image) and the Read the Docs theme (right image), generated using Sphinx.

Once you’ve created the html pages, you’ll notice a variety of hierarchical html pages will be generated. These will include a home page and pages for each package and module. Have a look around the pages to familiarise yourself with their structure and read the official documentation to see how you can customise them further.

For example, if you wanted to be able to see the raw code of each function in the documentation, add the extension ‘sphinx.ext.viewcode’ to the conf.py file. This will add a hyperlink next to each function or class method which will reveal the raw code to allow for easy interrogation without having to delve into the codebase.

Closing thoughts

And there we have it. Simple and beautiful documentation of your Python modules made in a few easy steps using Sphinx! I hope you have enjoyed learning about autodocumentation and find it to be a useful tool to implement in your projects. If you have any useful tips then feel free to add a comment 🙂

Step by Step Basics: Code Autodocumentation was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
