Custom Importer

class CustomImporter

A flexible importer for custom data formats in thermal analysis.

This class provides methods to import data from non-standard file formats, offering customizable options for data parsing and preprocessing.

__init__(file_path: str, column_names: List[str], separator: str = ',', decimal: str = '.', encoding: str = 'utf-8', skiprows: int = 0)

Initialize the CustomImporter.

Parameters:
  • file_path – Path to the data file.

  • column_names – List of column names in the order they appear in the file.

  • separator – Column separator in the file. Defaults to ','.

  • decimal – Decimal separator used in the file. Defaults to '.'.

  • encoding – File encoding. Defaults to 'utf-8'.

  • skiprows – Number of rows to skip at the beginning of the file. Defaults to 0.
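
To make these parameters concrete, the sketch below parses a small semicolon-separated sample with comma decimals using only the standard library. The helper parse_columns is hypothetical: it illustrates what separator, decimal, and skiprows mean for such a file and is not how CustomImporter is implemented.

```python
from typing import Dict, List


def parse_columns(text: str, column_names: List[str], separator: str = ",",
                  decimal: str = ".", skiprows: int = 0) -> Dict[str, List[float]]:
    """Hypothetical stand-in showing the role of each parameter."""
    columns: Dict[str, List[float]] = {name: [] for name in column_names}
    for line in text.splitlines()[skiprows:]:
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split(separator)
        for name, field in zip(column_names, fields):
            # Normalize the decimal mark before converting to float
            columns[name].append(float(field.strip().replace(decimal, ".")))
    return columns


# One header row, ';' between columns, ',' as the decimal mark
sample = "Time;Temperature\n0,0;25,1\n0,5;25,9\n"
parsed = parse_columns(sample, ["time", "temperature"],
                       separator=";", decimal=",", skiprows=1)
print(parsed)
```

With skiprows=1 the header row is skipped, so only numeric rows reach the float conversion.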

import_data() -> Dict[str, np.ndarray]

Import data from the file.

Returns:

Dictionary mapping each column name to a NumPy array of that column's values.

Return type:

Dict[str, np.ndarray]

Raises:
  • ValueError – If the file format is not recognized or supported.

  • FileNotFoundError – If the specified file does not exist.
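
One way to handle the two documented exceptions is sketched below. The wrapper try_import and the stub class are hypothetical; the stub stands in for a CustomImporter pointed at a missing file so that the example runs without any data on disk.

```python
from typing import Any, Dict, Optional


def try_import(importer: Any) -> Optional[Dict[str, Any]]:
    """Call importer.import_data() and report the documented errors."""
    try:
        return importer.import_data()
    except FileNotFoundError as exc:
        print(f"File not found: {exc}")
    except ValueError as exc:
        print(f"Unrecognized or unsupported format: {exc}")
    return None


class MissingFileImporter:
    """Hypothetical stub standing in for a CustomImporter with a bad path."""

    def import_data(self) -> Dict[str, Any]:
        raise FileNotFoundError("path/to/missing.csv")


result = try_import(MissingFileImporter())
```

Here result is None, which lets calling code decide whether to retry with another path or different parsing options.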

static detect_delimiter(file_path: str, num_lines: int = 5) -> str

Attempt to detect the delimiter used in the file.

Parameters:
  • file_path – Path to the data file.

  • num_lines – Number of lines to check. Defaults to 5.

Returns:

Detected delimiter.

Return type:

str

Raises:

ValueError – If unable to detect the delimiter.
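
The library's detection logic is not documented here. As an illustration only, a simple heuristic is to pick the candidate character that occurs the same non-zero number of times on every inspected line. The function guess_delimiter and its candidate set are assumptions made for this sketch.

```python
from typing import Optional, Sequence


def guess_delimiter(lines: Sequence[str], candidates: str = ",;\t|") -> str:
    """Return the candidate that occurs equally often (at least once) on every line."""
    best: Optional[str] = None
    best_count = 0
    for candidate in candidates:
        # Collect the distinct per-line counts, ignoring blank lines
        counts = {line.count(candidate) for line in lines if line.strip()}
        if len(counts) == 1:
            count = counts.pop()
            if count > best_count:
                best, best_count = candidate, count
    if best is None:
        raise ValueError("Unable to detect the delimiter")
    return best


# Commas appear only as decimal marks, so their count differs on the header row
sample_lines = ["Time;Temp;Mass", "0,0;25,1;10,02", "0,5;25,9;10,01"]
detected = guess_delimiter(sample_lines)
print(repr(detected))
```

Requiring a consistent count across lines is what keeps a comma used as a decimal mark from being mistaken for the column separator in this sample.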

static suggest_column_names(file_path: str, delimiter: str | None = None) -> List[str]

Suggest column names based on the first row of the file.

Parameters:
  • file_path – Path to the data file.

  • delimiter – Delimiter to use. If None, will attempt to detect.

Returns:

Suggested column names.

Return type:

List[str]

Raises:

ValueError – If unable to suggest column names.
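
As a rough illustration of deriving names from a first row, the sketch below splits a header line on the delimiter and tidies each field. The helper names_from_header is hypothetical, and the actual cleaning rules used by suggest_column_names may differ.

```python
from typing import List


def names_from_header(header_line: str, delimiter: str) -> List[str]:
    """Split the first row and strip whitespace and double quotes from each field."""
    names = [field.strip().strip('"') for field in header_line.split(delimiter)]
    if any(not name for name in names):
        raise ValueError("Unable to suggest column names")
    return names


columns = names_from_header('Time (min); "Temperature (C)" ;Mass (mg)', ";")
print(columns)
```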

Usage Example

from pkynetics.data_import import CustomImporter

# Detect delimiter and suggest column names
delimiter = CustomImporter.detect_delimiter('path/to/custom_data.csv')
suggested_columns = CustomImporter.suggest_column_names('path/to/custom_data.csv', delimiter=delimiter)

print(f"Detected delimiter: {delimiter}")
print(f"Suggested columns: {suggested_columns}")

# Initialize the CustomImporter
importer = CustomImporter(
    'path/to/custom_data.csv',
    suggested_columns,
    separator=delimiter,
    decimal='.',
    encoding='utf-8',
    skiprows=1  # skip the header row that supplied the suggested column names
)

# Import the data
data = importer.import_data()

# Access the imported data
for column in suggested_columns:
    print(f"{column}: {data[column][:5]}...")  # Print first 5 values of each column

Key Features

  1. Flexible data import for non-standard formats

  2. Automatic delimiter detection

  3. Column name suggestion

  4. Customizable import parameters (separator, decimal format, encoding, etc.)

  5. Explicit errors (ValueError, FileNotFoundError) for unsupported formats, undetectable delimiters, and missing files

Notes

  • The CustomImporter is particularly useful when dealing with data formats not covered by the standard TGA and DSC importers.

  • Call detect_delimiter and suggest_column_names before initializing the CustomImporter, then pass their results to the constructor, to help ensure the file is parsed correctly.

  • Make sure to specify the correct decimal separator and encoding to avoid data misinterpretation.
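
The standard-library sketch below shows the two kinds of mismatch that the last note warns about: a comma decimal that float() rejects, and Latin-1 bytes that are not valid UTF-8. It is independent of CustomImporter, and the sample values are made up for illustration.

```python
# A comma decimal is rejected by float() until it is normalized.
raw_value = "25,1"
try:
    float(raw_value)
    comma_parsed = True
except ValueError:
    comma_parsed = False
normalized = float(raw_value.replace(",", "."))

# A degree sign written as Latin-1 is a single byte that is invalid in UTF-8.
header_bytes = "Temperature (\u00b0C)".encode("latin-1")
try:
    header_bytes.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False
header_text = header_bytes.decode("latin-1")

print(comma_parsed, normalized, decoded_as_utf8, header_text)
```

Both failures are loud in this sketch; in practice a wrong setting can also produce silently shifted or garbled values, which is why checking the first few imported rows is worthwhile.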

See Also

  • tga_importer(): For importing standard Thermogravimetric Analysis (TGA) data

  • dsc_importer(): For importing standard Differential Scanning Calorimetry (DSC) data