Validating CSV Files in Rails

Validating CSV imports in Rails

Last Updated on September 3, 2020 by Christopher G Mendla

I needed to validate CSV files that were being imported into a Rails app. The validation was simple: Look for blank entries, ensure columns used for associations (IDs) were integers and check that certain other entries were numeric.

The first thing I checked was to make sure that the header of the CSV file matched what we expected. The simplest way to do this was to set an ‘expected header’ and then check the header on the CSV.

Here is a stripped down example for a hypothetical employee list. In the controller, I would  have :

expected_header = ["EmployeeNo", "First", "Last", "Bonus"]
 csv_error = true if !helpers.check_header(expected_header,csv_file)
We are using Rails 5 so we can use helpers. to call a method in a helper for the controller.
In the helper, we put
def check_header(expected_header,csv_file)
  header =, 'r') { |csv| csv.first }
  valid_csv = true 
  for i in (0..header.size-1)
    if header[i].downcase != expected_header[i].downcase
    valid_csv = false

if !valid_csv
  $csv_error = "Header:
  Expected Header:
  #{expected_header} "
return valid_csv
A couple of notes on the text below
  • We have already set csv_file in the controller.
  • We are grabbing the first line of the csv which should be the header.
  • We walk through each header.  .downcase is used because there were some differences in case from different sources.
  • we compare the expected header with the header and return valid_csv if it matches
The next lines in the controller are as follows. We are checking to see if we got an error checking the header. If not, then we walk through the data.
if !csv_error
  CSV.foreach(csv_file, :headers => true) do |row|
  csv_error = true if !helpers.check_employee_csv(row)
The method called in the helper is something like
def check_lists_csv(row)
  ok = true
  ok = false unless data_is_present(row, 0, "EmployeeID)
  ok = false unless data_is_integer(row, 0, "EmployeeID)
  ok = false unless data_is_present(row, 1, "First")
  ok = false unless data_is_present(row, 2, "Last")
  ok = false unless data_is_present(row, 3, "Bonus)
  ok = false unless_data_is_numeric(row, 3, "Bonus")
  return ok
Notes on the code above:
  • row is passed from the controller and contains the line of data from the csv
  • the second parameter is the index of the column. The column numbers begin with 0
  • the third parameter is the name of the column for the error message.
Here are the methods that check for blank, integer and numeric in the helper
def data_is_numeric(row, column_number, column_name)
  data_numeric = true
  if row[column_number.to_i].to_s.match(/A[+-]?d+?(.d+)?Z/) == nil
    $csv_error = $csv_error + "Error in row #{$INPUT_LINE_NUMBER}          >> #{row} >>  #{column_name} is not numeric
   data_numeric = true
return data_numeric
def data_is_integer(row, column_number, column_name)
data_integer = true
  if row[column_number.to_i].to_s.match(/^(0|[1-9][0-9]*)$/) ==   nil
  $csv_error = $csv_error + "Error in row #{$INPUT_LINE_NUMBER} >>   #{row} >>  #{column_name} is not an integer
  data_integer = true
return data_integer
def data_is_present(row, column_number, column_name)
  data_present = true
  if row[column_number.to_i].blank?
    $csv_error = $csv_error + "Error in row #{$INPUT_LINE_NUMBER} >> #{row} >>  #{column_name} is blank
  data_present = false
return data_present

If the csv validations pass, then we process the csv

if !csv_error
  CSV.foreach(csv_file, :headers => true) do |row|
  file = "Employee"
  redirect_to file_error_path and return
  $csv_error = $csv_error + "Your CSV was NOT uploaded  Please fix the error(s) and retry"
NOTE – I know that there are problems with global variables but they work for the app.
The file error view is simply

There was an error in the file upload
The file you were attempting to upload was not replaced

<%=  $csv_error.html_safe %>

This works and requires about 3 seconds to process about 40k of records.

My first attempt was taking about 8 minutes to process 40k of records. The problem was that I was trying to address variations in case with something like

ok = false unless data_is_present(row, "EmployeeID") || data_is_present(row, "Employeeid")

We had CSVs with both versions of the header. For some reason, that was taking a lot of time to process. By using the index number of the header, it avoided whatever was causing the slow processing.

Note – This code is from a couple of years back and there are deviations from best practices such as redundantly returning the value for a method.

1 thought on “Validating CSV imports in Rails”

  1. Olá amigo, estou tentando validar um CSV, e acabei achando seu post e gostaria de saber como posso migrar esse seu código para Ruby puro. É um desafio que tenho que realizar mas como sou novo no Ruby e também na área de de programação principalmente back-end, estou perdido. Se houver a possibilidade me socorrer ficarei muito grato.

Leave a Reply

Your email address will not be published. Required fields are marked *