abraxos.extract.read_csv_chunks
- abraxos.extract.read_csv_chunks(path, chunksize, **kwargs)
Reads a CSV file in chunks and captures malformed lines.
- Parameters:
path (str) – Path to the CSV file.
chunksize (int) – Number of rows per chunk.
**kwargs (dict) – Additional arguments passed to pandas.read_csv.
- Yields:
ReadCsvResult – A named tuple containing bad lines and the parsed DataFrame for the chunk.
- Return type:
collections.abc.Generator[abraxos.extract.ReadCsvResult,None,None]
Examples
>>> for result in read_csv_chunks('data.csv', chunksize=100):
...     print(result.bad_lines)
...     print(result.dataframe)
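
The chunk-and-capture pattern described above can be illustrated with a small standard-library sketch. This is not the abraxos implementation: `read_csv_chunks_sketch` and `ChunkResult` are hypothetical names, the sketch returns lists of dicts rather than a pandas DataFrame, it treats any row whose field count differs from the header as malformed, and it counts malformed rows toward the chunk size, which the real function may not do.

```python
import csv
import io
from collections import namedtuple
from itertools import islice

# Hypothetical stand-in for ReadCsvResult: it holds plain rows, not a DataFrame.
ChunkResult = namedtuple("ChunkResult", ["bad_lines", "rows"])


def read_csv_chunks_sketch(handle, chunksize):
    """Yield ChunkResult tuples covering at most `chunksize` raw rows each."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to yield
    while True:
        batch = list(islice(reader, chunksize))
        if not batch:
            break
        # Assumption: a row is malformed when its field count differs from the header.
        good = [row for row in batch if len(row) == len(header)]
        bad = [row for row in batch if len(row) != len(header)]
        yield ChunkResult(bad_lines=bad, rows=[dict(zip(header, row)) for row in good])


# In-memory sample with two malformed rows (one too long, one too short).
sample = io.StringIO("id,name\n1,a\n2,b,extra\n3,c\n4\n5,e\n")
results = list(read_csv_chunks_sketch(sample, chunksize=2))
for result in results:
    print(result.bad_lines, result.rows)
```

Yielding one result per chunk keeps memory use bounded by the chunk size, and pairing each chunk with its own rejected rows lets the caller log or repair bad input without a second pass over the file.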