skdag.DAGBuilder

class skdag.DAGBuilder(infer_dataframe=False)

Helper utility for creating a skdag.DAG.

DAGBuilder allows a graph to be defined incrementally by specifying one node (step) at a time. Graph edges are defined by providing optional dependency lists that reference each step by name. Note that steps must be defined before they are used as dependencies.
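The ordering rule above can be illustrated with a minimal sketch. ToyBuilder below is a hypothetical stand-in written for this illustration only; it is not skdag's implementation, and the exception type that the real DAGBuilder raises may differ. It assumes deps is a list of step names (a dict keyed by step name also works, since iterating a dict yields its keys).

```python
class ToyBuilder:
    """Hypothetical stand-in for illustration; not skdag's implementation."""

    def __init__(self):
        self.steps = {}  # step name -> estimator (any object in this sketch)
        self.edges = []  # (dependency, step) pairs

    def add_step(self, name, est, deps=None):
        deps = list(deps or [])
        if name in self.steps:
            raise ValueError(f"step {name!r} already defined")
        # A dependency must refer to a step that has already been added.
        for dep in deps:
            if dep not in self.steps:
                raise ValueError(
                    f"step {name!r} depends on undefined step {dep!r}"
                )
        self.steps[name] = est
        self.edges.extend((dep, name) for dep in deps)
        return self  # returning self enables method chaining


# Steps are added one at a time, each naming only earlier steps.
builder = (
    ToyBuilder()
    .add_step("impute", "imputer")
    .add_step("pca", "reducer", deps=["impute"])
)

# Referencing a step before it is defined is rejected.
try:
    ToyBuilder().add_step("lr", "model", deps=["pca"])
    error_message = None
except ValueError as err:
    error_message = str(err)
```

Checking dependencies at the moment a step is added means the graph is acyclic by construction, because an edge can only point from an earlier step to a later one.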

Parameters
infer_dataframe : bool, default=False

If True, every call to add_step() that leaves dataframe_columns as None is treated as if dataframe_columns="infer" had been passed. This effectively makes the resulting DAG coerce step outputs into pandas DataFrames wherever possible.
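The defaulting rule described for infer_dataframe can be sketched as a small function. resolve_dataframe_columns is a hypothetical helper introduced here for illustration; it is not part of skdag's API and only mirrors the rule as documented.

```python
def resolve_dataframe_columns(dataframe_columns, infer_dataframe):
    """Hypothetical helper mirroring the documented defaulting rule."""
    # Only an unset (None) value is replaced; explicit values always win.
    if dataframe_columns is None and infer_dataframe:
        return "infer"
    return dataframe_columns


unset_with_flag = resolve_dataframe_columns(None, infer_dataframe=True)
unset_without_flag = resolve_dataframe_columns(None, infer_dataframe=False)
explicit_with_flag = resolve_dataframe_columns(["a", "b"], infer_dataframe=True)
```

Under this reading, an explicit dataframe_columns value passed to add_step() is never overridden by the builder-level flag.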

See also

skdag.DAG

The estimator DAG created by this utility.

Examples

>>> from skdag import DAGBuilder
>>> from sklearn.decomposition import PCA
>>> from sklearn.impute import SimpleImputer
>>> from sklearn.linear_model import LogisticRegression
>>> dag = (
...     DAGBuilder()
...     .add_step("impute", SimpleImputer())
...     .add_step("vitals", "passthrough", deps={"impute": slice(0, 4)})
...     .add_step("blood", PCA(n_components=2, random_state=0), deps={"impute": slice(4, 10)})
...     .add_step("lr", LogisticRegression(random_state=0), deps=["blood", "vitals"])
...     .make_dag()
... )
>>> print(dag.draw().strip())
o    impute
|\
o o    blood,vitals
|/
o    lr