Prediction Using IO Operations

This example shows how to train a model using the DFFML Python API and then use that model to make predictions on input read from stdin.

DFFML offers several models. This example uses the Simple Linear Regression model (slr), which is included in the dffml package.

First we train the model, then we create a DataFlow that makes predictions on user input.

main.py

import asyncio
from dffml import *

slr_model = SLRModel(
    features=Features(Feature("Years", int, 1),),
    predict=Feature("Salary", int, 1),
    location="tempdir",
)


# This DataFlow reads input from stdin using the `AcceptUserInput`
# operation. The input string, which corresponds to the feature `Years`,
# is converted to an `int` or `float` by the `literal_eval` operation.
# The `create_mapping` operation builds a mapping from the numeric output
# of `literal_eval`, e.g. {"Years": 34}.
# The mapping is then fed to the `model_predict` operation, which uses the
# `slr` model defined above (and trained in `main()`) to make a prediction.
# The prediction is then printed to stdout by the `print_output` operation.
dataflow = DataFlow(
    operations={
        "get_user_input": AcceptUserInput,
        "literal_eval_input": literal_eval,
        "create_feature_map": create_mapping,
        "predict_using_model": model_predict,
        "print_predictions": print_output,
    },
    configs={"predict_using_model": ModelPredictConfig(model=slr_model)},
)
dataflow.flow.update(
    {
        "literal_eval_input": InputFlow(
            inputs={"str_to_eval": [{"get_user_input": "InputData"}]}
        ),
        "create_feature_map": InputFlow(
            inputs={
                "key": ["seed.Years"],
                "value": [{"literal_eval_input": "str_after_eval"}],
            }
        ),
        "predict_using_model": InputFlow(
            inputs={"features": [{"create_feature_map": "mapping"}]}
        ),
        "print_predictions": InputFlow(
            inputs={"data": [{"predict_using_model": "prediction"}]}
        ),
    }
)
dataflow.update()
dataflow.seed.append(
    Input(
        value="Years",
        definition=create_mapping.op.inputs["key"],
        origin="seed.Years",
    )
)


async def main():
    # Train the model
    await train(
        slr_model,
        {"Years": 0, "Salary": 10},
        {"Years": 1, "Salary": 20},
        {"Years": 2, "Salary": 30},
        {"Years": 3, "Salary": 40},
    )
    # Run the dataflow
    async for ctx, results in MemoryOrchestrator.run(dataflow, {"inputs": []}):
        pass


if __name__ == "__main__":
    asyncio.run(main())

When you run the above code, the AcceptUserInput operation waits for input on stdin.

$ python main.py
Enter the value: 21

The feature value (the str "21") is converted to an int by the literal_eval operation. Before this value can be passed to the model_predict operation, it must be wrapped in a mapping (dict), because model_predict takes a mapping of feature names to feature values. The create_mapping operation builds that mapping ({"Years": 21}) and passes it to model_predict, which makes the prediction. The print_output operation then prints the prediction to stdout.
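The two conversion steps described above can be sketched in plain Python. This is only an analogue built on the standard library's ast.literal_eval, not DFFML's own operations; the helper name to_feature_map is hypothetical and introduced purely for illustration.

```python
import ast


def to_feature_map(key, raw):
    """Hypothetical helper: parse a string typed by the user and wrap
    the parsed value in a one-entry mapping keyed by the feature name."""
    # ast.literal_eval safely parses Python literals such as "21" or "21.5"
    value = ast.literal_eval(raw)
    # Build the {feature name: feature value} mapping for the predictor
    return {key: value}


feature_map = to_feature_map("Years", "21")
print(feature_map)  # {'Years': 21}
```

Because the string is parsed as a literal, "21" yields an int and "21.5" would yield a float, matching the int/float behaviour noted in the comments of main.py.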

Running main.py with this input prints:

{'Salary': {'confidence': 1.0, 'value': 220.0}}
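The predicted value of 220.0 can be checked by hand. The sketch below assumes that the slr model fits an ordinary least-squares line through the training points; it is a standalone calculation in plain Python and does not call DFFML. The function name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept
    (assumed here to be what a simple linear regression computes)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Same training data as in main.py
years = [0, 1, 2, 3]
salary = [10, 20, 30, 40]

slope, intercept = fit_line(years, salary)
prediction = slope * 21 + intercept
print(slope, intercept, prediction)  # 10.0 10.0 220.0
```

The training points lie exactly on the line Salary = 10 * Years + 10, so an input of 21 years gives 10 * 21 + 10 = 220.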

The DataFlow that we have created is shown in the following diagram:

[Image: DataFlow diagram (dataflow_diagram2.svg)]

To regenerate the DataFlow diagram, run:

$ dffml service dev export examples.io.io_usage:dataflow | \
    dffml dataflow diagram -stages processing -configloader json /dev/stdin

Copying and pasting the output of the above command into the mermaidjs live editor renders the graph.