This post is part 2 of the "Making a Python Package" series:
- Making a Python Package
- Making a Python Package II - writing docstrings
- Making a Python Package III - making an installable package
- Making a Python Package IV - writing unit tests
- Making a Python Package V - Testing with Tox
- Making a Python Package VI - including data files
- Making a Python Package VII - deploying
- Making a Python Package VIII - summary
Note: To get the material for this blog post, visit the v0.2 tag of the Romans! GitHub project. To get it locally, assuming you cloned the previous version, run
# clear last set of changes
$ git reset --hard HEAD
# checkout this version
$ git checkout tags/v0.2
To install
If you want to write really good docstrings, you should install pydocstyle, a linter for docstrings:
$ pip install pydocstyle
Making a Python Package II: writing docstrings
Before we write a package that other people can download and install, it is worth taking the time to write some proper documentation. Right now, if we load up the Python interpreter, the docstrings for our code are not that helpful:
>>> import roman.roman
>>> help(roman.roman)
gives the help screen
Help on module roman.roman in roman:
NAME
roman.roman
FUNCTIONS
int_to_roman_string(number)
Converts a positive integer into a Roman numeral
roman_string_to_int(numeral_string)
Converts a Roman numeral string to integer form
DATA
ROMAN_SYMBOLS = [('M', 1000), ('CM', 900), ('D', 500), ('CD', 400), ('...
FILE
<Location of file on your system>
A few things to note here:
- The end user probably doesn't care about how we solved this problem. The variable ROMAN_SYMBOLS isn't that interesting to them!
- The functions here are probably simple enough that the one-line description is enough. For user-facing functions, it is nice to tell the user what the inputs are, what values are returned, and which exceptions might get raised.
Hiding from the user
Let's start by hiding ROMAN_SYMBOLS from the help function. The Python help function automatically lists all module-level names, functions included, that don't start with a leading underscore (assuming the module doesn't define __all__). If we don't want a variable or function to show up, we just need to start its name with an underscore. This allows us to guide the user to only the things they care about. Simply renaming ROMAN_SYMBOLS to _ROMAN_SYMBOLS is enough to make sure it doesn't show up in the help screen.
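As a rough illustration of the underscore rule (an approximation, not how help is implemented internally), the sketch below builds a throwaway module object and lists the names that survive the filter. The module name demo_roman, the helper public_names, and the placeholder function are all made up for this example.

```python
import types


def public_names(module):
    """Return the module-level names that do not start with an underscore."""
    return sorted(name for name in vars(module) if not name.startswith("_"))


# A throwaway module object standing in for roman/roman.py (hypothetical).
demo = types.ModuleType("demo_roman")
demo._ROMAN_SYMBOLS = [("M", 1000), ("CM", 900)]  # hidden: leading underscore
demo.int_to_roman_string = lambda number: ""      # placeholder, not the real function

print(public_names(demo))  # ['int_to_roman_string']
```

Only the public function is listed; the renamed table drops out, along with the dunder attributes every module carries.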
Google / numpy format for docstrings
A good docstring for a user-facing function should contain the following parts:
- A one-line summary of what the function does. This should end with a period.
- A list of arguments, and their types, with a description.
- A list of returned values and their types.
- An example (optional).
- A list of raised exceptions (if any).
Here is the docstring for roman.int_to_roman_string(number) to illustrate these guidelines:
"""Converts a positive integer into a Roman numeral.
Args:
number: a positive integer to be converted into a roman numeral
Returns:
The Roman numeral for number, as a string.
Examples:
>>> int_to_roman_string(5)
'V'
>>> int_to_roman_string(2019)
'MMXIX'
"""
If you want to get really fancy, you can run pydocstyle roman/roman.py and pydocstyle roman/temperature.py to check for ways in which your docstrings don't conform to convention. It is actually pretty detailed. For example, if you run it on the v0.2 tag of my GitHub repo, it will complain that I use "Converts" rather than "Convert" in the summary, and that I don't have a blank line after my examples! We will fix these things before finally deploying the package, but it is interesting that pydocstyle does limited grammar checking!
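To see how the int_to_roman_string docstring shown earlier sits inside a function, indentation included, here is a minimal sketch. The body is not taken from the project: the table _SKETCH_SYMBOLS holds the standard Roman numeral values and is an assumption made for this example, as is the greedy loop.

```python
# Standard Roman numeral values, largest first (assumed; not copied from the repo).
_SKETCH_SYMBOLS = [
    ("M", 1000), ("CM", 900), ("D", 500), ("CD", 400),
    ("C", 100), ("XC", 90), ("L", 50), ("XL", 40),
    ("X", 10), ("IX", 9), ("V", 5), ("IV", 4), ("I", 1),
]


def int_to_roman_string(number):
    """Converts a positive integer into a Roman numeral.

    Args:
        number: a positive integer to be converted into a Roman numeral

    Returns:
        The Roman numeral for number, as a string.

    Examples:
        >>> int_to_roman_string(5)
        'V'
        >>> int_to_roman_string(2019)
        'MMXIX'
    """
    pieces = []
    for symbol, value in _SKETCH_SYMBOLS:
        # Take as many copies of this symbol as fit, then carry the remainder.
        count, number = divmod(number, value)
        pieces.append(symbol * count)
    return "".join(pieces)


# The first line of the docstring is the one-line summary.
print(int_to_roman_string.__doc__.splitlines()[0])
```

The summary line, the blank line after it, and the indented Args, Returns, and Examples sections are what tools like help and pydocstyle read.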
What about temperature.py
If we look at v0.1 of the code, the help for temperature.py is not useful:
>>> import roman.temperature
>>> help(roman.temperature)
NAME
roman.temperature
FUNCTIONS
convert(temp, from_unit, to_unit)
Converts temp in unit from_unit to to_unit
convert_C_to_F(tempC)
convert_C_to_K(tempC)
convert_F_to_C(tempF)
convert_F_to_K(tempF)
convert_K_to_C(tempK)
convert_K_to_F(tempK)
convert_all(temp, unit)
Returns a dictionary, converting temp in 'unit' to all different
units
DATA
ABSOLUTE_ZERO_IN_C = -273.15
CONVERSIONS = {('C', 'C'): <function <lambda>>, ('C', 'F'): <function ...
UNITS = 'KFC'
UNIT_NAMES = {'Celsius': 'C', 'Centigrade': 'C', 'Fahrenheit': 'F', 'K...
We aren't going to pay a lot of attention to the details of temperature.py, but it is worth looking at how we tidied up the docstrings:
- We changed the 6 convert_X_to_Y(temp) functions to _convert_X_to_Y(temp) to hide them from the user. We want the user to use convert or convert_all.
- We also changed CONVERSIONS to _CONVERSIONS, as we don't want the user to access the dictionary of conversion functions directly.
- The functions raise errors if we give them a unit they don't know, so we included exceptions in the docstring.
- We chose to raise an exception if the user tries to generate a temperature below absolute zero, unless they indicate this is okay.
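The last two points can be sketched as follows. This is not the code from temperature.py: the private tables _TO_CELSIUS and _FROM_CELSIUS are hypothetical stand-ins, and only the signature, the ABSOLUTE_ZERO_IN_C constant, and the two exception types follow the post.

```python
ABSOLUTE_ZERO_IN_C = -273.15

# Hypothetical private lookup tables; the leading underscore hides them from help().
_TO_CELSIUS = {
    "K": lambda t: t + ABSOLUTE_ZERO_IN_C,
    "F": lambda t: (t - 32) * 5 / 9,
    "C": lambda t: t,
}
_FROM_CELSIUS = {
    "K": lambda t: t - ABSOLUTE_ZERO_IN_C,
    "F": lambda t: t * 9 / 5 + 32,
    "C": lambda t: t,
}


def convert_all(temp, unit, allow_neg_abs=False):
    """Converts temp expressed in unit to Kelvin, Fahrenheit, and Celsius.

    Raises:
        KeyError: If unit is not one of 'K', 'F', or 'C'
        ValueError: If the temperature is below absolute zero, and
            allow_neg_abs is False
    """
    # An unknown unit fails the dictionary lookup, which raises KeyError.
    temp_c = _TO_CELSIUS[unit](temp)
    if temp_c < ABSOLUTE_ZERO_IN_C and not allow_neg_abs:
        raise ValueError("temperature is below absolute zero")
    return {name: from_c(temp_c) for name, from_c in _FROM_CELSIUS.items()}


print(convert_all(0, "C"))  # {'K': 273.15, 'F': 32.0, 'C': 0}
```

Each entry in the Raises section maps to one line of code here, which makes the docstring easy to keep honest as the function changes.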
Here is what the docstrings end up being for temperature:
>>> import roman.temperature
>>> help(roman.temperature)
NAME
roman.temperature
FUNCTIONS
convert(temp, from_unit, to_unit)
Converts temp expressed in from_unit to the numeric value expressed in
to_unit.
Args:
temp (numeric): the numeric value of the temperature in from_unit.
from_unit (string): one of 'K', 'F', or 'C' to express if temp is given
in Kelvin, Fahrenheit, or Celsius respectively.
to_unit (string): one of 'K', 'F', or 'C' to express the unit to
convert the temperature in to.
Returns:
The numeric value of the temperature in to_unit
Examples:
# convert 0C into F
>>> convert(0, 'C', 'F')
32
# convert 0F into C
>>> convert(0, 'F', 'C')
-17.777777778
# there is one temp where C and F have same numeric value
>>> convert(-40, 'F', 'C')
-40
Raises:
KeyError: If either from_unit or to_unit is not 'K', 'F', or 'C'
convert_all(temp, unit, allow_neg_abs=False)
Converts temp expressed in unit to Kelvin, Fahrenheit, and Celsius
Args:
temp (numeric): the numeric value of the temperature.
unit (string): one of 'K', 'F', or 'C' to express if temp is given in
Kelvin, Fahrenheit, or Celsius respectively.
allow_neg_abs: set to True to allow temperatures below absolute zero.
Returns:
A dictionary with keys representing the unit, and values representing
the temperature in that unit.
Examples:
>>> convert_all(0, 'C')
{'K': 273.15, 'F': 32.0, 'C': 0}
>>> convert_all(212, 'F')
{'K': 373.15, 'F': 212, 'C': 100}
Raises:
KeyError: If unit is not one of 'K', 'F', or 'C'
ValueError: If the temperature is below absolute zero, and
allow_neg_abs is False
DATA
ABSOLUTE_ZERO_IN_C = -273.15
UNITS = 'KFC'
UNIT_NAMES = {'Celsius': 'C', 'Centigrade': 'C', 'Fahrenheit': 'F', 'K...
FILE
<location of temperature.py on your machine>
These docstrings are also available to you if you use Shift+Tab in your Jupyter notebook. Some people would advocate docstrings this detailed all the time. For many functions, this is overkill when you are just using them locally. If you are writing a package, though, you are expecting other people to be able to use your code. Be a responsible developer and write a good docstring for your users.
Summary and next steps
This article showed how to write docstrings. Technically, this has nothing to do with writing packages. You can (and should!) write thoughtful docstrings even when not writing packages. Similarly, your package will install and run without any docstrings. But you should remember all the times you typed help(matplotlib.plot) only to get plot(x,y,*kw,**args) and how sad it made you; don't do this to someone else. And if you do decide to publish a package without a good docstring, make sure you remove your address from every corner of the internet first!
We saw
- What the parts of a "good" docstring are.
- That we can hide irrelevant detail by starting functions and variables with a leading underscore.
- That we can use pydocstyle <filename> to check for deviations of our docstrings from the standard.
Next up, we will learn how to use setuptools to install the package for use anywhere on your machine (and so that others can install it from GitHub).