abraxos.transform.transform

abraxos.transform.transform(df, transformer, chunks=2)

Applies a transformation function to a DataFrame with error isolation.

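If transformer raises an exception on a chunk, that chunk is split into smaller subchunks (the number is set by chunks) and each subchunk is retried recursively, which narrows the failure down to individual rows. Rows that still fail as single-row DataFrames are collected in errored_df, and the exceptions they raised are collected in errors.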

Parameters:
  • df (pd.DataFrame) – The input DataFrame to transform.

  • transformer (Callable[[pd.DataFrame], pd.DataFrame]) – A function that transforms a DataFrame and returns a new DataFrame.

  • chunks (int, optional) – Number of subchunks to divide the DataFrame into if transformation fails (default is 2).

Returns:

A named tuple with:

  • errors: A list of exceptions that occurred during transformation.

  • errored_df: A DataFrame of rows that could not be transformed.

  • success_df: A DataFrame of successfully transformed rows.

Return type:

TransformResult

Examples

>>> import pandas as pd
>>> from abraxos.transform import transform
>>> def double_values(df): return df.assign(value=df['value'] * 2)
>>> df = pd.DataFrame({'value': [1, 2, 3]})
>>> result = transform(df, double_values)
>>> result.success_df
   value
0      2
1      4
2      6
>>> result.errored_df.empty
True
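
The recursive splitting described above can be sketched roughly as follows. This is an illustration of the general technique only, not the abraxos implementation: isolate_errors, SketchResult, and checked_sqrt are hypothetical names, and the ceiling-division chunking and the ordering of the output rows are assumptions.

```python
from typing import Callable, List, NamedTuple

import pandas as pd


class SketchResult(NamedTuple):
    # Hypothetical stand-in that mirrors the documented TransformResult fields.
    errors: List[Exception]
    errored_df: pd.DataFrame
    success_df: pd.DataFrame


def isolate_errors(
    df: pd.DataFrame,
    transformer: Callable[[pd.DataFrame], pd.DataFrame],
    chunks: int = 2,
) -> SketchResult:
    """Illustrative sketch: retry failing chunks on ever smaller pieces."""
    errors: List[Exception] = []
    bad: List[pd.DataFrame] = []
    good: List[pd.DataFrame] = []

    def visit(part: pd.DataFrame) -> None:
        try:
            good.append(transformer(part))
        except Exception as exc:
            if len(part) <= 1:
                # Cannot split further: record the row and its exception.
                errors.append(exc)
                bad.append(part)
                return
            # Ceiling division; at least 2 pieces so each retry is smaller.
            size = -(-len(part) // max(chunks, 2))
            for start in range(0, len(part), size):
                visit(part.iloc[start:start + size])

    visit(df)
    # Fall back to an empty frame with the input's columns when a list is empty.
    errored_df = pd.concat(bad) if bad else df.iloc[0:0]
    success_df = pd.concat(good) if good else df.iloc[0:0]
    return SketchResult(errors, errored_df, success_df)


def checked_sqrt(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical transformer that rejects any chunk containing a negative value.
    if (df["value"] < 0).any():
        raise ValueError("negative input")
    return df.assign(value=df["value"] ** 0.5)


df = pd.DataFrame({"value": [4, -1, 9, 16]})
result = isolate_errors(df, checked_sqrt)
```

In this sketch the row holding -1 ends up in result.errored_df with one ValueError in result.errors, while the other three rows are transformed. A larger chunks value means more, smaller retries after each failure.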