abraxos.extract.read_csv

abraxos.extract.read_csv(path, *, chunksize=None, **kwargs)

Reads a CSV file, optionally in chunks, and captures malformed lines alongside the parsed data.

Parameters:
  • path (str) – Path to the CSV file.

  • chunksize (int, optional) – Number of rows per chunk. If specified, the file is read in chunks. If None (default), the entire file is read at once.

  • **kwargs (dict) – Additional arguments passed to pandas.read_csv.

Returns:

If chunksize is None, returns a single ReadCsvResult. Otherwise, returns a generator yielding ReadCsvResult for each chunk.

Return type:

ReadCsvResult or Generator of ReadCsvResult
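
Because the return shape depends on chunksize, code that accepts either form may want to normalize it first. The following is a minimal sketch of one way to do that. The helper name iter_results is hypothetical and not part of abraxos, and the sketch assumes that a single ReadCsvResult is not itself an iterator. Plain tuples stand in for real results so the snippet runs without any data file.

```python
from collections.abc import Iterator


def iter_results(result_or_results):
    """Yield results from either a single result or an iterator of results.

    Hypothetical helper, not part of abraxos.
    """
    if isinstance(result_or_results, Iterator):
        # Chunked call: a generator of per-chunk results.
        yield from result_or_results
    else:
        # Unchunked call: one result covering the whole file.
        yield result_or_results


# Stand-in values; with abraxos these would be read_csv(...) return values.
single = ("bad lines", "dataframe")
chunked = (pair for pair in [("bad 1", "df 1"), ("bad 2", "df 2")])

whole_file = list(iter_results(single))
per_chunk = list(iter_results(chunked))
```

With this helper, the same loop body can handle a call made with or without chunksize.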

Examples

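>>> from abraxos.extract import read_csv
>>> # Read the whole file at once.
>>> result = read_csv('data.csv')
>>> print(result.bad_lines)
>>> print(result.dataframe)
>>> # Read the file in chunks of 50 rows.
>>> for result in read_csv('data.csv', chunksize=50):
...     print(result.bad_lines)
...     print(result.dataframe)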
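
For readers curious how malformed lines can be captured with pandas alone, here is a minimal sketch. It is not the abraxos implementation, which may work differently. It relies on the callable form of the on_bad_lines argument to pandas.read_csv, which requires pandas 1.4 or later and engine="python". The function name read_csv_collecting_bad_lines and the in-memory sample data are assumptions made for illustration.

```python
import io

import pandas as pd


def read_csv_collecting_bad_lines(source, **kwargs):
    """Parse a CSV and return (bad_lines, dataframe).

    Illustrative only; this is not how abraxos is necessarily implemented.
    """
    bad_lines = []

    def _collect(fields):
        # pandas passes the split fields of a row that has too many columns.
        bad_lines.append(fields)
        # Returning None tells pandas to skip the row.
        return None

    # Callable on_bad_lines is only supported by the Python parsing engine.
    df = pd.read_csv(source, engine="python", on_bad_lines=_collect, **kwargs)
    return bad_lines, df


# The third line has three fields under a two-column header.
sample = io.StringIO("a,b\n1,2\n3,4,5\n6,7\n")
bad, df = read_csv_collecting_bad_lines(sample)
print(bad)
print(df)
```

Note that pandas treats only rows with too many fields as bad; rows with too few fields are padded with missing values rather than reported.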