Source Dataframe

Expose Pandas DataFrame as DFFML Source

class dffml.source.dataframe.DataFrameSource(config)[source]

Proxy for a pandas DataFrame

Examples

You can pass a pandas DataFrame to this class directly via the Python API, or create a DataFrame from another data source via the Python API or the command line.

The following example creates a DataFrame from an HTML table via the command line.

Create an HTML table.

index.html

<table>
  <tr>
    <th>Years</th>
    <th>Salary</th>
  </tr>
  <tr>
    <td>0</td>
    <td>10</td>
  </tr>
  <tr>
    <td>1</td>
    <td>20</td>
  </tr>
  <tr>
    <td>2</td>
    <td>30</td>
  </tr>
</table>

Start an HTTP server to serve the HTML page containing the table.

$ python -m http.server 8000

In another terminal, list all the records in the source.

$ dffml list records \
    -sources table=dataframe \
    -source-table-html http://127.0.0.1:8000/index.html \
    -source-table-protocol_allowlist http://

[
    {
        "extra": {},
        "features": {
            "Salary": 10,
            "Years": 0
        },
        "key": "0"
    },
    {
        "extra": {},
        "features": {
            "Salary": 20,
            "Years": 1
        },
        "key": "1"
    },
    {
        "extra": {},
        "features": {
            "Salary": 30,
            "Years": 2
        },
        "key": "2"
    }
]
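
As a rough illustration of how the table rows end up as the records shown above, the sketch below reproduces the same output shape using only the Python standard library. It is not DFFML's implementation, which goes through a pandas DataFrame. `url_allowed`, `TableParser`, and `table_to_records` are hypothetical names introduced here, and treating each `protocol_allowlist` entry as a URL prefix is an assumption based on the `http://` value passed on the command line.

```python
from html.parser import HTMLParser


def url_allowed(url, protocol_allowlist):
    """Hypothetical check: accept a URL only if it starts with an allowed prefix."""
    return any(url.startswith(prefix) for prefix in protocol_allowlist)


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._in_cell = False

    def handle_data(self, data):
        # Ignore whitespace between tags; keep only text inside cells
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def table_to_records(html):
    """Use the first row as feature names and the row position as the key."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [
        {"key": str(index), "features": dict(zip(header, map(int, row))), "extra": {}}
        for index, row in enumerate(body)
    ]


HTML = """<table>
  <tr><th>Years</th><th>Salary</th></tr>
  <tr><td>0</td><td>10</td></tr>
  <tr><td>1</td><td>20</td></tr>
  <tr><td>2</td><td>30</td></tr>
</table>"""

records = table_to_records(HTML)
print(records[0])
```

Each record's key is the row position as a string, and the column headers become feature names, matching the JSON listing above.
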
CONFIG

alias of DataFrameSourceConfig

CONTEXT

alias of DataFrameSourceContext

class dffml.source.dataframe.DataFrameSourceConfig(dataframe: 'pandas.DataFrame' = None, predictions: List[str] = <factory>, html: str = None, html_table_index: int = 0, protocol_allowlist: List[str] = <factory>)[source]
no_enforce_immutable()

By default, all properties of a config object are immutable. To mutate them, you must explicitly call this method and use it as a context manager.

Examples

>>> from dffml import config
>>>
>>> @config
... class MyConfig:
...     C: int
>>>
>>> config = MyConfig(C=2)
>>> with config.no_enforce_immutable():
...     config.C = 1
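
To show the general pattern of a guard that a context manager lifts temporarily, here is a minimal standard-library sketch. `FrozenConfig` is a hypothetical stand-in and not how DFFML implements its config objects; raising `AttributeError` on a blocked assignment is likewise an assumption made for the illustration.

```python
from contextlib import contextmanager


class FrozenConfig:
    """Hypothetical config-like object whose attributes are read-only by default."""

    def __init__(self, **fields):
        # Bypass our own __setattr__ while setting up the instance
        object.__setattr__(self, "_enforce_immutable", True)
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        if self._enforce_immutable:
            raise AttributeError(f"{name} is immutable")
        object.__setattr__(self, name, value)

    @contextmanager
    def no_enforce_immutable(self):
        # Lift the guard only for the duration of the with block
        object.__setattr__(self, "_enforce_immutable", False)
        try:
            yield self
        finally:
            object.__setattr__(self, "_enforce_immutable", True)


cfg = FrozenConfig(C=2)

# Outside the context manager, assignment is rejected
try:
    cfg.C = 1
    blocked = False
except AttributeError:
    blocked = True

# Inside the context manager, assignment is allowed
with cfg.no_enforce_immutable():
    cfg.C = 1
```

The `finally` clause restores the guard even if the body of the `with` block raises, so the object never stays mutable by accident.
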
class dffml.source.dataframe.DataFrameSourceContext(parent: BaseSource)[source]
async record(key: str) Record[source]

Get a record from the source or add it if it doesn’t exist.

Examples

>>> import asyncio
>>> from dffml import *
>>>
>>> async def main():
...     async with MemorySource(records=[Record("example", data=dict(features=dict(dead="beef")))]) as source:
...         # Open, update, and close
...         async with source() as ctx:
...             example = await ctx.record("example")
...             # Let's also try calling `record` for a record that doesn't exist.
...             one = await ctx.record("one")
...             await ctx.update(one)
...             async for record in ctx.records():
...                 print(record.export())
>>>
>>> asyncio.run(main())
{'key': 'example', 'features': {'dead': 'beef'}, 'extra': {}}
{'key': 'one', 'extra': {}}
async records() AsyncIterator[Record][source]

Yields the records retrieved from self.src

Examples

>>> import asyncio
>>> from dffml import *
>>>
>>> async def main():
...     async with MemorySource(records=[Record("example", data=dict(features=dict(dead="beef")))]) as source:
...         async with source() as ctx:
...             async for record in ctx.records():
...                 print(record.export())
>>>
>>> asyncio.run(main())
{'key': 'example', 'features': {'dead': 'beef'}, 'extra': {}}
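
Because the signature above returns an `AsyncIterator`, the records have to be consumed with `async for` rather than indexed like a list. If a list is needed, an async comprehension can collect them. The sketch below shows this with `fake_records`, a hypothetical async generator standing in for `ctx.records()` so that the snippet runs without DFFML installed.

```python
import asyncio


async def fake_records():
    # Hypothetical stand-in for ctx.records(); yields plain dicts
    for key in ("0", "1", "2"):
        yield {"key": key, "features": {}, "extra": {}}


async def collect():
    # An async comprehension drains the iterator into a list
    return [record async for record in fake_records()]


collected = asyncio.run(collect())
print(len(collected))
```
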
async update(record: Record)[source]

Updates a record in the source

Examples

>>> import asyncio
>>> from dffml import *
>>>
>>> async def main():
...     async with MemorySource(records=[]) as source:
...         # Open, update, and close
...         async with source() as ctx:
...             example = Record("one", data=dict(features=dict(feed="face")))
...             # ... Update one into our records ...
...             await ctx.update(example)
...             # Let's check out our records after calling `record` and `update`.
...             async for record in ctx.records():
...                 print(record.export())
>>>
>>> asyncio.run(main())
{'key': 'one', 'features': {'feed': 'face'}, 'extra': {}}