csv data source

blackstork/builtin, v0.4.2

Description

Loads CSV files whose names match a provided glob pattern, or a single CSV file from a provided path.

Either the glob or the path argument must be set.

When the path argument is specified, the data source returns only the content of that file. When the glob argument is specified, the data source returns a list of dicts, each containing a file's content and the file's metadata.

Note: the data source assumes that the CSV file has a header row. It turns each subsequent line into a map with the column titles as keys.

For example, a CSV file with the following data:

column_A	column_B	column_C
Foo	true	42
Bar	false	4.2

will be represented as the following data structure:

[
  {"column_A": "Foo", "column_B": true, "column_C": 42},
  {"column_A": "Bar", "column_B": false, "column_C": 4.2}
]
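
The header-to-map conversion above can be sketched in Python. This is only an illustration and not the Fabric implementation: the helper names `coerce` and `rows_as_maps` are hypothetical, and the type-coercion rules (booleans, then integers, then floats, otherwise strings) are an assumption inferred from the example output, not taken from the Fabric source.

```python
import csv
import io


def coerce(cell):
    """Guess a JSON-like type for one CSV cell (assumed rules, not Fabric's)."""
    if cell is None:
        # DictReader fills missing trailing cells with None.
        return None
    lowered = cell.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(cell)
        except ValueError:
            continue
    return cell


def rows_as_maps(text, delimiter=","):
    """Treat the first line as the header and map every later line onto it."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [{key: coerce(value) for key, value in row.items()} for row in reader]


sample = "column_A,column_B,column_C\nFoo,true,42\nBar,false,4.2\n"
rows = rows_as_maps(sample)
```

Here `rows` holds one dict per data line, keyed by the column titles from the header line.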

When glob is used and multiple files match the pattern, the data source will return a list of dicts, for example:

[
  {
    "file_path": "path/file-a.csv",
    "file_name": "file-a.csv",
    "content": [
      {"column_A": "Foo", "column_B": true, "column_C": 42},
      {"column_A": "Bar", "column_B": false, "column_C": 4.2}
    ]
  },
  {
    "file_path": "path/file-b.csv",
    "file_name": "file-b.csv",
    "content": [
      {"column_C": "Baz", "column_D": 1},
      {"column_C": "Clu", "column_D": 2}
    ]
  }
]
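
The shape of the glob result can be reproduced with a short Python sketch. It is an approximation under stated assumptions: `load_glob` is a hypothetical helper, matches are sorted only to make the output deterministic (the ordering Fabric uses is not documented here), and cell values are left as strings for brevity.

```python
import csv
import glob
import os


def load_glob(pattern, delimiter=","):
    """Return one dict per matching file: its path, its name, and its rows."""
    results = []
    # Sorting is a choice made for this sketch, not documented behaviour.
    for file_path in sorted(glob.glob(pattern)):
        with open(file_path, newline="", encoding="utf-8") as handle:
            # Values stay as strings here; no type coercion is attempted.
            content = list(csv.DictReader(handle, delimiter=delimiter))
        results.append({
            "file_path": file_path,
            "file_name": os.path.basename(file_path),
            "content": content,
        })
    return results
```

Calling `load_glob("path/file-*.csv")` would return a list with the same `file_path`, `file_name`, and `content` keys as the structure above, one entry per matching file.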

The data source is built-in, which means it is part of the fabric binary. It is available out of the box, with no installation required.

Configuration

The data source supports the following configuration arguments:

config data csv {
  # CSV field delimiter
  #
  # Optional string.
  # Must have a length of 1
  # Default value:
  delimiter = ","
}
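
To illustrate what the `delimiter` setting controls, here is a minimal Python sketch of parsing with a custom single-character delimiter. The function `read_rows` and its error message are hypothetical; how Fabric itself reports an invalid delimiter is not covered by this page.

```python
import csv
import io


def read_rows(text, delimiter=","):
    """Parse CSV text with a configurable single-character delimiter."""
    if len(delimiter) != 1:
        # Mirrors the "Must have a length of 1" constraint described above.
        raise ValueError("delimiter must have a length of 1")
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# A semicolon-delimited file parsed with delimiter = ";"
rows = read_rows("name;count\nFoo;42\n", delimiter=";")
```

With the default `","` the same input would be read as a single column named `name;count`, which is why the delimiter has to match the file.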

Usage

The data source supports the following execution arguments:

data csv {
  # A glob pattern to select CSV files to read
  #
  # Optional string.
  # For example:
  # glob = "path/to/file*.csv"
  # 
  # Default value:
  glob = null

  # A file path to a CSV file to read
  #
  # Optional string.
  # For example:
  # path = "path/to/file.csv"
  # 
  # Default value:
  path = null
}