Saving to a single CSV

write_single_csv(sdf, path, **csvKwargs)

Writes a single CSV file from a Spark DataFrame.

Parameters:
sdf (DataFrame): The Spark DataFrame to write.
path (str): The destination path for the CSV file.
**csvKwargs: Additional keyword arguments to pass to the DataFrame's write.csv method.

Example: write_single_csv(spark.range(1), "temp/file.csv", header=True, mode='overwrite')

Source code in pysparky/io/csv.py
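import glob
import os
import shutil
import tempfile

from pyspark.sql import DataFrame


def write_single_csv(sdf: DataFrame, path: str, **csvKwargs):
    """
    Writes a single CSV file from a Spark DataFrame.

    Parameters:
    sdf (DataFrame): The Spark DataFrame to write.
    path (str): The destination path for the CSV file.
    **csvKwargs: Additional keyword arguments to pass to the DataFrame's write.csv method.

    Example:
    write_single_csv(spark.range(1), "temp/file.csv", header=True, mode='overwrite')
    """
    with tempfile.TemporaryDirectory() as temp_dir:
        # One partition means Spark emits exactly one part-*.csv file.
        # temp_dir already exists, so csvKwargs needs mode='overwrite' (or 'append').
        sdf.repartition(1).write.csv(temp_dir, **csvKwargs)
        csv_file = glob.glob(os.path.join(temp_dir, "*.csv"))[0]
        # shutil.move also works when temp_dir and path are on different filesystems.
        shutil.move(csv_file, path)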
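
The last two steps of the function above (finding the part file and moving it to the destination) can be tried without a Spark session. The sketch below is illustrative and is not part of pysparky: `collect_single_csv` and `demo` are hypothetical helpers, and the file names `part-00000-demo-c000.csv` and `_SUCCESS` only imitate the layout of a one-partition Spark output directory. It uses `shutil.move` for the final step, which also works when the source and destination are on different filesystems, and it raises an error instead of indexing blindly when the directory does not contain exactly one CSV file.

```python
import glob
import os
import shutil
import tempfile


def collect_single_csv(spark_output_dir: str, dest_path: str) -> str:
    """Move the only *.csv file in spark_output_dir to dest_path (hypothetical helper)."""
    part_files = glob.glob(os.path.join(spark_output_dir, "*.csv"))
    if len(part_files) != 1:
        raise ValueError(
            f"expected exactly one CSV part file, found {len(part_files)}"
        )
    # Create the destination's parent directory if it does not exist yet.
    dest_dir = os.path.dirname(dest_path)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)
    shutil.move(part_files[0], dest_path)
    return dest_path


def demo() -> str:
    """Simulate a one-partition output directory, collect it, return the CSV text."""
    with tempfile.TemporaryDirectory() as work_dir:
        out_dir = os.path.join(work_dir, "spark_out")
        os.makedirs(out_dir)
        # Made-up file names that imitate what Spark writes for one partition.
        with open(os.path.join(out_dir, "part-00000-demo-c000.csv"), "w") as f:
            f.write("id\n0\n")
        open(os.path.join(out_dir, "_SUCCESS"), "w").close()

        dest = collect_single_csv(
            out_dir, os.path.join(work_dir, "result", "file.csv")
        )
        with open(dest) as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

The explicit count check matters in practice: if a compression codec is passed through the keyword arguments, the part file may end in something other than `.csv`, and the glob would then match nothing.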