Documenting a Model¶
We can write the documentation and examples for our new model within the model docstring. We should always include at least two examples so that others will know how to use our model! We shoot for at least one command line example, and one Python example. We can put the examples in restructured text format within the class’s docstring.
This tutorial is meant to go after the Packaging a Model tutorial.
It’s okay if you skipped to here though you can get up to speed by running the
following commands. These commands create the model package containing starter
code, change directory into the package, install the package, and run the
setuptools egg_info
hook to register the model with the entry_points
system.
$ dffml service dev create model dffml-model-myslr
$ cd dffml-model-myslr
$ python -m pip install -e .[dev]
$ python setup.py egg_info
Python docstrings¶
A docstring in Python is a comment immediately following a class or function
definition. Python will print out an object’s docstring if the object is passed
to the help()
function.
$ python -c 'from dffml_model_myslr.myslr import MySLRModel; help(MySLRModel)'
Help on class MySLRModel in module dffml_model_myslr.myslr:
class MySLRModel(dffml.model.model.SimpleModel)
| MySLRModel(config: 'BaseConfig') -> None
|
| Example Logistic Regression training one variable to predict another.
|
The docstring goes right after our class
definition. We prefix the block
comment """
with the character r
which means that the block comment
will be treated as a “raw literal”. For more information on raw literals see
https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals.
In our case it essentially means that when we have a backslash it will be
treated as a backslash and not an escape character.
dffml_model_myslr/myslr.py
65@entrypoint("myslr")
66class MySLRModel(SimpleModel):
67 r"""
Quick rST background¶
When you see something with .. `` and ``::
at the end that’s called a
“directive”. For example you’ll see code-block
and literalinclude
which
are directives. Each string that follows ::
after a directive is known as an
“argument”. On the lines immediately following a directive you may see one or
more of what’s called an “option”. These are strings with :
on either end.
consoletest¶
We’ll be using the consoletest Sphinx extension which adds extra options to
directives. Some options that you’ll see are :test:
and :filepath:
.
The full documentation for the consoletest Sphinx extension can be found here:
consoletest Sphinx extension.
The consoletest Sphinx extension will be used for Testing Examples. As we write our documentation we’ll keep in mind the functionality that consoletest provides. The goal is to write documentation that a user would be able to copy paste file data and commands from. The consoletest plugin will help us simulate the actions of the user. Allowing us to write the same files and run the same commands we’re asking the developer to. Just as the developer will, the consoletest plugin will start every test by creating an empty directory in which to do run it’s test. Our documentation is telling the developer what files to create within that directory and what commands to run.
We use the :test:
option on any directive that we want the consoletest
plugin to run. We’ll use the :filepath:
option whenever we want
to write data in a code-block
to a file within the test environment’s
temporary directory. When the literalinclude
directive is used to display
contents of a file, we can also have the contents that are being displayed
copied into a file of a given name within the test environment by using the
:filepath:
option.
Sometimes you may just want to show a code-block
or a literalinclude
that consoletest shouldn’t run. Just leave off the :test:
option in these
situations.
Example Datasets¶
We can include text data to be written to files within code-block
directives. You may want to download data from the internet instead. Running a
command via code-block:: console
which you’ll see an example of shortly is a
good way to do that.
We usually start by telling the user to create the training and test datasets. We always try to make it clear to the user what file they should be writing it by putting the filename in bold above the file contents.
dffml_model_myslr/myslr.py
70 The dataset used for training
71
72 **train.csv**
73
74 .. code-block::
75 :test:
76 :filepath: train.csv
77
78 x,y
79 0.0,0
80 0.1,1
81 0.2,2
82 0.3,3
83 0.4,4
84 0.5,5
85
86 The dataset used for testing
87
88 **test.csv**
89
90 .. code-block::
91 :test:
92 :filepath: test.csv
93
94 x,y
95 0.6,6
96 0.7,7
Example CLI Commands¶
After the user has saved the training and test dataset files. We’ll give then the command line example usage. We’re going to document how to train the model, assess it’s accuracy, and use it for prediction.
When there is a way that a code-block
should be highlighted, we can give it
as an argument. console
, is the way that code examples that users should run
in the console should be highlighted. There is no proper way to highlight a
.csv
file, which is why we didn’t talk about highlighting in the
Example Datasets section.
dffml_model_myslr/myslr.py
98 Train the model
99
100 .. code-block:: console
101 :test:
102
103 $ dffml train \
104 -model myslr \
105 -model-features x:float:1 \
106 -model-predict y:int:1 \
107 -model-location tempdir \
108 -sources f=csv \
109 -source-filename train.csv
110
111 Assess the accuracy
112
113 .. code-block:: console
114 :test:
115
116 $ dffml accuracy \
117 -model myslr \
118 -scorer mse \
119 -model-features x:float:1 \
120 -model-predict y:int:1 \
121 -model-location tempdir \
122 -features y:int:1 \
123 -sources f=csv \
124 -source-filename test.csv
125 1.0
126
127 Make a prediction
128
129 **predict.csv**
130
131 .. code-block::
132 :test:
133 :filepath: predict.csv
134
135 x
136 0.8
137
138 .. code-block:: console
139 :test:
140
141 $ dffml predict all \
142 -model myslr \
143 -model-features x:float:1 \
144 -model-predict y:int:1 \
145 -model-location tempdir \
146 -sources f=csv \
147 -source-filename predict.csv
148 [
149 {
150 "extra": {},
151 "features": {
152 "x": 0.8
153 },
154 "key": "0",
155 "last_updated": "2020-11-15T16:22:25Z",
156 "prediction": {
157 "y": {
158 "confidence": 1.0,
159 "value": 7.999999999999998
160 }
161 }
162 }
163 ]
Example Python Usage¶
Try to include as much of the examples inline as possible. However, when it
comes time to include a Python file in an example it’s best to leave it as a
separate file and use literalinclude
to display the contents. We do this
because the auto formatter, black, can only
format Python files. It can’t format examples within rST within a docstring.
dffml_model_myslr/myslr.py
165 Example usage of Logistic Regression using Python
166
167 **example_myslr.py**
168
169 .. literalinclude:: ../examples/example_myslr.py
170 :test:
171 :filepath: example_myslr.py
By specifying the :filepath:
we copied the contents of the Python example
to the test environment’s directory. The last thing we need to do is run the
Python example and show what it’s output would be. Then end the docstring with
another """`
.
dffml_model_myslr/myslr.py
173 .. code-block:: console
174 :test:
175
176 $ python example_myslr.py
177 Accuracy: 1.0
178 {'x': 0.9, 'y': 4}
Testing Examples¶
Using the
consoletest
module we can test the code-block
sections within the docstring. DFFML has a
run_consoletest
function we
will be using.
We have an integration test file which will use
run_consoletest
to test the model’s docstring.
tests/test_integration.py
from dffml import AsyncTestCase, run_consoletest
from dffml_model_myslr.myslr import MySLRModel
class TestIntegrationMySLRModel(AsyncTestCase):
async def test_docstring(self):
await run_consoletest(MySLRModel)
We can run the test using the unittest
module. The create command gave us
both unit tests and integration tests. We want to only run the integration test
right now (tests.test_integration
).
$ python -m unittest -v tests.test_integration
Note
This tutorial will be updated to show how to build a documentation website just like this one which will display an HTML version of restructured text.