This guide discusses how to convert a string column to float through examples in CSV using Python. These examples cover how to use pandas and csv packages in the conversion.
We will save the following data in employees-salaries.csv and use it in our examples.
"FName","LName","Salary","Increment","New Salary" "Bob","Smith","3,000","0.12","3,360" "Lewis","Walker","5,600","0.09","6,104" "Deborah" ,"Shawn","12,100","0.14","13,794" "Smith","Rowe","2,800","0.08","3,024" "Kael","Fernandez","7,200","0.10","7,920"
Example 1: Convert a Single String Column in a CSV File into Float Using Pandas
Given a pandas data frame, df, you can convert a column into a float using the following line:
1 |
df["column_name"] = df["column_name"].astype(float) |
The following example converts the New Salary column in the CSV data given above into float. The pre-requisite is to remove commas in the numbers before doing the conversion.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
<a id="post-3259-_frh9ry2n0ja1"></a>import pandas as pd # Read the CSV file into a pandas DataFrame df = pd.read_csv("employees-salaries.csv") # Remove commas in numbers df["New Salary"] = df["New Salary"].str.replace(",", "") # Check the column data type before changing print("New Salary column before conversion: ", df["New Salary"].dtypes) # Convert the desired column from string to float df["New Salary"] = df["New Salary"].astype(float) # Replace "column_name" with the actual column name # Check the new data type print("New Salary column after conversion: ", df["New Salary"].dtypes) # Write the updated DataFrame back to a CSV file df.to_csv("output.csv", index=False) |
Output:
New Salary column before conversion: object New Salary column after conversion: float64
Example 2: Convert Multiple Columns in a CSV File into Float Using Pandas
Given a data frame named df, you can convert multiple columns into float using the line below:
1 |
df = df.astype({"col1": float, "col2": float, "col3": float}) |
Here is an example of converting the Salary and New Salary columns.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
import pandas as pd # Read the CSV file into a pandas DataFrame df = pd.read_csv("employees-salaries.csv") # Remove commas in numbers df["New Salary"] = df["New Salary"].str.replace(",", "") df["Salary"] = df["Salary"].str.replace(",", "") # Check the columns data type before conversion print("New Salary column before conversion: ", df["New Salary"].dtypes) print("Salary column before conversion: ", df["Salary"].dtypes) # Convert the desired column from string to float df = df.astype({"New Salary": float, "Salary": float}) # Replace "column_name" with the actual column name # Check the new data types print("New Salary column after conversion: ", df["New Salary"].dtypes) print("Salary column after conversion: ", df["Salary"].dtypes) # Write the updated DataFrame back to a CSV file df.to_csv("output2.csv", index=False) |
Example 3: Convert String Column (s) into a Numeric Using pandas.to_numeric() Function
The pandas.to_numeric() function converts a column into an integer or float based on input. The general syntax for the function is as follows.
1 |
df["Column_name"] = pd.to_numeric(df["Column_name"]) |
Here is an example of how to use it to convert the New Salary Column.
1 2 3 4 5 6 7 8 9 |
import pandas as pd df = pd.read_csv("employees-salaries.csv") # Remove commas in values under the New Salary column df["New Salary"] = df["New Salary"].str.replace(",", "") # Convert New Salary to numeric - Int or Float df["New Salary"] = pd.to_numeric(df["New Salary"]) print(df.dtypes) df.to_csv("output3.csv", index=False) |
Output:
FName object LName object Salary object Increment float64 New Salary int64
Then you can apply pandas.to_numeric() on multiple columns using the code below.
1 2 3 4 5 6 7 8 9 10 |
<a id="post-3259-_ers29tqc2grj"></a>import pandas as pd df = pd.read_csv("employees-salaries.csv") # Columns to convert cols_to_convert = ["Salary", "New Salary"] # Convert the two columns into numeric - int or float. # errors="coerce" means values that can't be converted are replaced with NaN df[cols_to_convert] = df[cols_to_convert].apply(pd.to_numeric, errors="coerce") print(df.dtypes) df.to_csv("output4.csv", index=False) |
Output:
FName object LName object Salary float64 Increment float64 New Salary float64
Example 4: Convert String Column(s) to Floats Using the CSV Package
This method iterates through the CSV rows and converts column(s) to float based on the index.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
import csv # Open the input CSV file with open("employees-salaries.csv", "r") as file: reader = csv.reader(file) rows = list(reader) # Get the header row header_row = rows[0] # Get all the records rows = rows[1:] # Iterate over each row in the CSV for row in rows: # Convert the desired column from string to float try: # Assuming the column to convert is at index 2 float_value = float(row[2].replace(",", "")) # Update the row with the converted value row[2] = float_value except ValueError: pass # Handle any non-convertible values if needed # Write the updated rows to the output CSV file with open("output2.csv", "w", newline="") as file: writer = csv.writer(file) # Write the header row into the output CSV file writer.writerow(header_row) # Write the rows into the CSV file writer.writerows(rows) |