from pyspark.sql import SparkSession
# Create a SparkSession
spark = SparkSession.builder.appName("CSV Read").getOrCreate()
# Read the CSV file
df = spark.read.csv("path/to/your/file.csv", header=True, inferSchema=True)
# Display the first 20 rows of the DataFrame
df.show()
# Perform further transformations or analysis on the DataFrame as needed
# Stop the SparkSession
spark.stop()
In the code above, we import SparkSession from pyspark.sql, create a SparkSession, and then use spark.read.csv() to read the CSV file. The header=True option tells Spark that the first row of the file contains the column names, and inferSchema=True tells Spark to infer each column's data type by scanning the data, which requires an extra pass over the file.
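If you already know the column types, you can avoid that extra pass by supplying an explicit schema instead of inferSchema=True. Here is a minimal sketch; the field names and types are hypothetical and should be replaced with the ones in your file:

from pyspark.sql.types import StructType, StructField, IntegerType, StringType, DoubleType

# Hypothetical schema; replace the field names and types with your own
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("price", DoubleType(), True),
])

df = spark.read.csv("path/to/your/file.csv", header=True, schema=schema)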
After reading the CSV file, you can perform additional transformations, filtering, or analysis on the DataFrame df, as sketched below. Finally, stop the SparkSession with spark.stop() to release its resources.
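As a quick sketch of such a transformation, the snippet below filters rows and counts them by group; the columns price and category are hypothetical and must match the headers in your CSV:

# Hypothetical columns; adjust the names to match your CSV header
expensive = df.filter(df["price"] > 100)
expensive.groupBy("category").count().show()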
Make sure to replace "path/to/your/file.csv" with the actual path to your CSV file.