Data Writer

Overview

The DataWriter module writes data to several output formats: CSV, HYPER, GeoPackage, Parquet, and Shapefile. Each format is handled by a dedicated writer class, so support for additional formats can be added with minimal changes to the existing codebase.

Each writer method of the DataWriter class instantiates the corresponding format-specific writer and delegates the write operation to it.
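The delegation described above can be illustrated with a minimal sketch. This is not the package's actual implementation: the class names `CsvWriterSketch` and `DataWriterSketch` are hypothetical, and the sketch uses only the standard library with rows given as a non-empty list of dictionaries rather than a pandas DataFrame.

```python
import csv
import tempfile
from pathlib import Path


class CsvWriterSketch:
    """Hypothetical format-specific writer (not the package's real class)."""

    def write(self, rows, folder_path, file_name):
        # Assumes rows is a non-empty list of dicts sharing the same keys.
        target = Path(folder_path) / file_name
        target.parent.mkdir(parents=True, exist_ok=True)
        with target.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        return target


class DataWriterSketch:
    """Hypothetical facade: each method instantiates the matching writer."""

    def to_csv(self, rows, config=None, **kwargs):
        # Accept a configuration dictionary, keyword arguments, or both.
        options = {**(config or {}), **kwargs}
        return CsvWriterSketch().write(rows, **options)


# Small demonstration that writes into a temporary folder.
demo_dir = tempfile.mkdtemp()
written = DataWriterSketch().to_csv(
    [{"Name": "John", "Age": 25}, {"Name": "Alice", "Age": 28}],
    {"folder_path": demo_dir, "file_name": "output.csv"},
)
```

Under this design, supporting a new format would mean adding one writer class and one thin method on the facade, leaving the existing writers untouched.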

Usage Example:

Here’s a brief example to demonstrate the usage of the DataWriter module to write data to a CSV file:

import pandas as pd
from src.utils.data_writer.data_writer import DataWriter

# Sample data
data = {
    "Name": ["John", "Alice", "Bob", "Emily"],
    "Age": [25, 28, 22, 30],
    "City": ["New York", "London", "Paris", "Sydney"],
    "Salary": [50000, 55000, 45000, 60000],
}
dataframe = pd.DataFrame(data)

# Initialise DataWriter
data_writer = DataWriter()

# Using keyword arguments:
data_writer.to_csv(
    dataframe, folder_path="path/to/destination_folder", file_name="output.csv"
)

# Using a configuration dictionary:
config_csv = {"folder_path": "path/to/destination_folder", "file_name": "output.csv"}
data_writer.to_csv(dataframe, config_csv)

The required arguments differ between writer methods. The table below lists them for each method:

Writer Method	Required Arguments
to_csv	folder_path, file_name
to_parquet	folder_path, file_name
to_shapefile	folder_path, file_name
to_geopackage	folder_path, file_name, layer_name
to_hyper	file_path, table_name
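The table above can also be expressed as a lookup for checking a configuration dictionary before calling a writer method. The following is a sketch only: `REQUIRED_ARGUMENTS` and `missing_arguments` are hypothetical names and are not part of the DataWriter module.

```python
# Required arguments per writer method, copied from the table above.
REQUIRED_ARGUMENTS = {
    "to_csv": ("folder_path", "file_name"),
    "to_parquet": ("folder_path", "file_name"),
    "to_shapefile": ("folder_path", "file_name"),
    "to_geopackage": ("folder_path", "file_name", "layer_name"),
    "to_hyper": ("file_path", "table_name"),
}


def missing_arguments(method_name, config):
    """Return the required keys absent from config, in table order."""
    return [key for key in REQUIRED_ARGUMENTS[method_name] if key not in config]


# Example: a GeoPackage configuration that lacks its layer name.
config_gpkg = {"folder_path": "path/to/destination_folder", "file_name": "output.gpkg"}
missing = missing_arguments("to_geopackage", config_gpkg)
```

An empty list means the configuration contains every argument the table lists for that method.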
