High Level ML

async dffml.high_level.ml.predict(model, *args: Union[BaseSource, Record, Dict[str, Any], List], update: bool = False, keep_record: bool = False)

Make a prediction using a machine learning model.

The model must be trained before using it to make a prediction.

Parameters:
  • model (Model) – Machine learning model to use. See Models for the available model options.

  • *args (list) – Input data for prediction. Each item can be a dict, a Record, a filename, or one of the data Sources.

  • update (boolean, optional) – If True, prediction data within records will be written back to all sources given. Defaults to False.

  • keep_record (boolean, optional) – If True, the results will be kept as Record objects instead of being converted to (record.key, features, predictions) tuples. Defaults to False.

Returns:

Record objects if keep_record is True, otherwise (record.key, features, predictions) tuples.

Return type:

async iterator
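Because the results arrive through an async iterator, they have to be consumed with `async for` or an async comprehension. The following is a minimal sketch of that pattern only. `fake_predict` is a hypothetical stand-in written for this illustration, not part of dffml, and the keys and values it yields are made up; only the `(key, features, predictions)` tuple shape and the `predictions["Salary"]["value"]` access follow the description on this page.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, Tuple


# Hypothetical stand-in for predict(): an async generator that yields
# (key, features, predictions) tuples. It is NOT part of dffml.
async def fake_predict(
    *rows: Dict[str, Any]
) -> AsyncIterator[Tuple[str, Dict[str, Any], Dict[str, Any]]]:
    for index, row in enumerate(rows):
        await asyncio.sleep(0)  # give control back to the event loop
        # Made-up prediction value, purely for illustration
        yield str(index), dict(row), {"Salary": {"value": row["Years"] * 10 + 10}}


async def collect() -> list:
    # An async comprehension drains the async iterator into a list
    return [
        (key, features, predictions["Salary"]["value"])
        async for key, features, predictions in fake_predict(
            {"Years": 6}, {"Years": 7}
        )
    ]


results = asyncio.run(collect())
print(results)
```

Collecting into a list is convenient for small inputs. Iterating with `async for` keeps memory use flat when the input is large.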

Examples

>>> import asyncio
>>> from dffml import *
>>>
>>> model = SLRModel(
...     features=Features(
...         Feature("Years", int, 1),
...     ),
...     predict=Feature("Salary", int, 1),
...     location="tempdir",
... )
>>>
>>> async def main():
...     await train(
...         model,
...         {"Years": 0, "Salary": 10},
...         {"Years": 1, "Salary": 20},
...         {"Years": 2, "Salary": 30},
...         {"Years": 3, "Salary": 40},
...     )
...     async for i, features, prediction in predict(
...         model,
...         {"Years": 6},
...         {"Years": 7},
...     ):
...         features["Salary"] = round(prediction["Salary"]["value"])
...         print(features)
>>>
>>> asyncio.run(main())
{'Years': 6, 'Salary': 70}
{'Years': 7, 'Salary': 80}
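The printed salaries can be checked by hand. The sketch below assumes that `SLRModel` fits an ordinary least-squares line through the training records; this is an assumption based on the model's name, not a description of dffml's implementation. `fit_line` is a helper written for this illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Training records from the example above
years = [0, 1, 2, 3]
salaries = [10, 20, 30, 40]

slope, intercept = fit_line(years, salaries)

# Evaluate the fitted line at the two inputs passed to predict()
predictions = {x: round(slope * x + intercept) for x in (6, 7)}
print(predictions)
```

The training points lie exactly on the line Salary = 10 * Years + 10, which gives 70 and 80 for 6 and 7 years.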

async dffml.high_level.ml.score(model, accuracy_scorer: Union[AccuracyScorer, AccuracyContext], features: Union[Feature, Features], *args: Union[BaseSource, Record, Dict[str, Any], List]) -> float

Assess the accuracy of a machine learning model.

Provide records to the model to assess how accurately it predicts. The model should already be instantiated and trained.

Parameters:
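  • model (Model) – Machine learning model to use. See Models for the available model options.

  • accuracy_scorer (AccuracyScorer or AccuracyContext) – Scorer used to assess the model's predictions.

  • features (Feature or Features) – Feature or features to score, such as the feature the model predicts.

  • *args (list) – Input data to score the model against. Each item can be a dict, a Record, one of the data Sources, or a filename whose extension matches one of the data sources.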

Returns:

A decimal value representing the percent of the time the model made the correct prediction. For some models and scorers the value has another meaning. See the documentation for the model you're using for further details.

Return type:

float

Examples

>>> import asyncio
>>> from dffml import *
>>>
>>> model = SLRModel(
...     features=Features(
...         Feature("Years", int, 1),
...     ),
...     predict=Feature("Salary", int, 1),
...     location="tempdir",
... )
>>>
>>> async def main():
...     await train(
...         model,
...         {"Years": 0, "Salary": 10},
...         {"Years": 1, "Salary": 20},
...         {"Years": 2, "Salary": 30},
...         {"Years": 3, "Salary": 40},
...     )
...     print(
...         "Accuracy:",
...         await score(
...             model,
...             MeanSquaredErrorAccuracy(),
...             Feature("Salary", int, 1),
...             {"Years": 4, "Salary": 50},
...             {"Years": 5, "Salary": 60},
...         ),
...     )
>>>
>>> asyncio.run(main())
Accuracy: 0.0
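The value 0.0 here is an error measure, so lower is better. A minimal sketch of the arithmetic, assuming `MeanSquaredErrorAccuracy` computes the standard mean squared error (an assumption based on its name) and that the trained model behaves like the line Salary = 10 * Years + 10, which passes through all four training points. Both helper functions are written for this illustration.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def predict_salary(years):
    # Assumed fitted line: it passes through every training record above
    return 10 * years + 10


# Records passed to score() in the example above
test_records = [
    {"Years": 4, "Salary": 50},
    {"Years": 5, "Salary": 60},
]

actual = [record["Salary"] for record in test_records]
predicted = [predict_salary(record["Years"]) for record in test_records]

mse = mean_squared_error(actual, predicted)
print("Accuracy:", mse)

# For contrast: predictions that are each off by 10 give an error of 100.0
off_by_ten = mean_squared_error(actual, [p - 10 for p in predicted])
print("Off by ten:", off_by_ten)
```

Both test records fall exactly on the fitted line, so every squared difference is zero.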

async dffml.high_level.ml.train(model, *args: Union[BaseSource, Record, Dict[str, Any], List])

Train a machine learning model.

Provide records to the model to train it. The model should already be instantiated.

Parameters:
  • model (Model) – Machine learning model to use. See Models for the available model options.

  • *args (list) – Input data for training. Could be a dict, Record, filename, one of the data Sources, or a filename with the extension being one of the data sources.
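The filename form of `*args` can be prepared with the standard library alone. This is a minimal sketch that assumes CSV is one of the available data sources and that column names are matched to feature names; neither is stated in this section. The file name `training.csv` is chosen for illustration, and the call to `train` is left as a comment so the block runs without dffml installed.

```python
import csv
import os
import tempfile

# Same training records as the dict-based example on this page
records = [
    {"Years": 0, "Salary": 10},
    {"Years": 1, "Salary": 20},
    {"Years": 2, "Salary": 30},
    {"Years": 3, "Salary": 40},
]

# Write the records to a CSV file with a header row
path = os.path.join(tempfile.mkdtemp(), "training.csv")
with open(path, "w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["Years", "Salary"])
    writer.writeheader()
    writer.writerows(records)

# Read the file back to confirm what was written (values come back as str)
with open(path, newline="") as handle:
    rows_read = list(csv.DictReader(handle))

print(rows_read[0])

# Under the assumptions above, the path would then be passed in place of
# the dicts, inside an async function:
#     await train(model, path)
```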

Examples

>>> import asyncio
>>> from dffml import *
>>>
>>> model = SLRModel(
...     features=Features(
...         Feature("Years", int, 1),
...     ),
...     predict=Feature("Salary", int, 1),
...     location="tempdir",
... )
>>>
>>> async def main():
...     await train(
...         model,
...         {"Years": 0, "Salary": 10},
...         {"Years": 1, "Salary": 20},
...         {"Years": 2, "Salary": 30},
...         {"Years": 3, "Salary": 40},
...     )
>>>
>>> asyncio.run(main())