
5 Best Practices for Building a CSV Uploader

Our comprehensive breakdown of key best practices for building a delightful CSV import experience.

Andrew Luo

Andrew is the co-founder & CTO at OneSchema.

July 31, 2024

Uploading a CSV file is a common capability in web applications across every industry. However, it's also one of the most notoriously frustrating experiences an application can offer. Having a delightful CSV upload experience can set you apart from your competitors and make the right first impression with your customers.

The best CSV uploaders are intuitive, performant, and give the customer ample opportunity to correct issues without leaving your product. Whether your customers are uploading contacts, users, or financial data, the same core principles apply to developing your CSV upload experience.

Building a delightful upload experience is a deceptively complicated undertaking. Without proper planning, teams run into numerous delays. Proper project planning and awareness of the edge cases of CSV upload will lead to a much smoother launch. 

See also: Customer survey: How much did it cost to build your CSV importer in-house?

CSV Upload: Build vs. Buy?

Engineering teams are often faced with the decision to build technology in-house vs. leveraging a pre-built SaaS solution. OneSchema is an embeddable CSV uploader that allows your engineering team to launch a delightful data upload feature in minutes instead of months. However, leveraging a hosted solution is not the best option for every engineering team.

Advantages of Building CSV Upload

  • Full control over design and features to fit your use case
  • Ability to update the software to adapt to changing internal and external needs
  • No recurring 3rd party vendor fees

Advantages of Using a Hosted Upload Tool

  • Launch in minutes instead of months
  • Edge-case handling for parsing, mapping, and validation is maintained by the vendor
  • Engineering time stays focused on your core product

Read more: Full comparison of the build vs buy tradeoff for engineering teams

5 Steps to Building a Delightful CSV Uploader

So you’ve decided to build your own CSV uploader. Here are 5 steps to ensure you get the experience right for your customers.

Read more: Lessons learned building a CSV importer, with founder Lior Harel

1. Define your data model(s)

First, define what data you want to collect from your customers. Consider the following questions when defining your data model:

  • What columns of data will my customers be uploading?
  • What data type validations should be run on each column?
  • What other validations, if any, should be run on the data?
  • Do I expect my customer to upload data in a single file or multiple files?
  • Do I want my customer to be able to create columns that don’t exist in my system (custom fields) during their upload experience? 
  • Do I need to validate data against my database or a 3P data source? Will this impact the performance of my upload experience?
  • Will the data uploaded vary by customer (see also: template overrides)?
  • How big are the typical files my customers will be uploading?
  • How many columns are being uploaded? 

From here, you can define your data template: the specification for the files, columns, data types, and validations for the data customers will be uploading. 

A data template is a description of your data model's columns and required validations
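A data template like the one described above can be sketched as a small, typed structure. This is an illustrative shape, not a standard format: the column keys, labels, and validation rules here are hypothetical examples for a "contacts" import.

```typescript
// Hypothetical data template for a "contacts" import: each column declares
// an internal key, a user-facing label, and a validator that returns an
// error message or null.
type ColumnSpec = {
  key: string;                                  // internal field name
  label: string;                                // header shown to the user
  required: boolean;
  validate: (value: string) => string | null;   // error message, or null if valid
};

const contactsTemplate: ColumnSpec[] = [
  {
    key: "email",
    label: "Email",
    required: true,
    validate: (v) =>
      /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(v) ? null : "Invalid email address",
  },
  {
    key: "signup_date",
    label: "Signup Date",
    required: false, // optional column: empty values are allowed
    validate: (v) =>
      v === "" || /^\d{4}-\d{2}-\d{2}$/.test(v) ? null : "Expected YYYY-MM-DD",
  },
];
```

Declaring the template as data (rather than scattering checks through the import code) makes it straightforward to support per-customer template overrides later.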


2. Design your user experience

Understanding your desired workflow

There are several other nuances of your workflow that will influence the design of your experience. You should also consider questions like:

  • How important is conversion to my CSV upload experience? Should my uploader require all data to be perfectly clean before allowing any data to be uploaded?
  • What kinds of errors do I expect my customers will have in their data? Will the person performing the upload be able to resolve the errors, or will they need to ask others on their team for support?
  • Will customers upload more data in the same format in the future?
  • Do I want my customer to be able to return to an abandoned upload session?
  • Are there certain types of common errors I want to be able to resolve on behalf of my customer (e.g., date formats, capitalization)?
  • How do I want to educate customers about the expected data to be uploaded?
  • How do I want to explain to customers errors in the data?
  • Does my column mapping experience need to accommodate a very large number of columns (20+)?

The answers to these questions will determine whether you want to invest in features like partial data submissions, advanced error-resolution workflows, saved mappings, session resume, and user education components.

What are the common steps in a CSV upload interface?

There are typically 4 steps in a CSV file upload interface:

1. File upload

A component where your user can drag-and-drop a file or use a file picker to select a file to upload. It can be helpful to include user education on this pane to explain to your customers what data you'd like them to upload.

2. Header row selection (optional)

Allows your user to select which row of data is the header. We only recommend implementing this step if your customers frequently have data where the header is not the first row. The vast majority of our customers choose to skip this pane.
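If you do implement this step, a default suggestion saves the user a click. One possible heuristic, sketched below under the assumption that rows are already parsed into string arrays: pick the first row where every cell is non-empty and non-numeric. Real files may need a smarter score (e.g. uniqueness of values across the row).

```typescript
// Suggest a header row: the first row whose cells are all non-empty and
// non-numeric. Falls back to row 0 if nothing qualifies. A heuristic
// sketch only -- dates and IDs can fool simple rules like this.
function suggestHeaderRow(rows: string[][]): number {
  const looksLikeHeader = (row: string[]) =>
    row.length > 0 &&
    row.every((cell) => cell.trim() !== "" && isNaN(Number(cell)));
  const idx = rows.findIndex(looksLikeHeader);
  return idx === -1 ? 0 : idx; // fall back to the first row
}
```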

3. Column mapping

Allows your user to map columns from their uploaded CSV file to your column headers. A nice-to-have feature on this pane is suggested column mappings, typically based on fuzzy matching or historical uploads.
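A minimal version of suggested mappings can be built on normalized name matching, as sketched below; production mappers typically layer on fuzzy distance (e.g. Levenshtein) and learn from historical uploads. The header and key names here are hypothetical.

```typescript
// Suggest template keys for uploaded headers by comparing names with
// case, spaces, and punctuation stripped. Unmatched headers are left for
// the user to map manually.
function suggestMappings(
  uploadedHeaders: string[],
  templateKeys: string[]
): Map<string, string> {
  const normalize = (s: string) => s.toLowerCase().replace(/[^a-z0-9]/g, "");
  const suggestions = new Map<string, string>();
  for (const header of uploadedHeaders) {
    const match = templateKeys.find((key) => normalize(key) === normalize(header));
    if (match) suggestions.set(header, match);
  }
  return suggestions;
}
```

Even this simple normalization handles common variants like "E-Mail" vs. "email"; anything it misses falls through to manual mapping rather than guessing wrong.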

4. Data review

Shows your customer the errors in their file and assists them with data conversion.

3. Choose a CSV parsing library

The CSV format is unfortunately not as consistent as you might imagine. CSV file parsing is one of the most notoriously bug-prone aspects of CSV import (right behind data validation). While there are great open-source libraries out there, none is perfect, and most will require you to build and maintain an edge-case-handling layer on top. Over time, as your customers encounter errors, your parsing layer will become more robust. Common issues you may encounter with CSV parsing libraries include special characters, misplaced delimiters, and performance.
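To illustrate why parsing is bug-prone, here is a sketch of what even a "minimal" parser must handle for a single line: quoted fields, delimiters inside quotes, and escaped quotes (`""`). Real libraries additionally deal with newlines inside quoted fields, byte-order marks, alternate delimiters, and streaming large files.

```typescript
// Parse one CSV line per RFC 4180 quoting rules: fields may be wrapped in
// double quotes, a quote inside a quoted field is written as "", and
// delimiters inside quotes are literal characters.
function parseCsvLine(line: string, delimiter = ","): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === delimiter) {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}
```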

OneSchema is built in Rust, so we use rust-csv for parsing. Here are a few other libraries we recommend:

  • papaparse. Papa Parse is a popular JavaScript library that runs in both the browser and Node.js. It is extensively maintained by its creator and the open-source community.
  • csv-parser. csv-parser is built for performance and is a great choice if you're expecting to work with large files.
  • fast-csv. Despite its name, fast-csv is actually one of the slower libraries. However, it has a few advanced features, like callbacks to modify headers and transform rows.

Read more here for a full rundown of the best CSV parsing libraries.

4. Build data validation logic

Typically, the most time-consuming aspect of building and maintaining a CSV importer is the data validation logic.

There are typically 3 types of validations you may need in your CSV importer:

  • Data type validations. For each column, you’ll likely want to validate that the data conforms to your data type specification. For example, you may want to validate that all numbers are formatted with two digits past the decimal point and are written without commas, or that all URLs end with a valid TLD. Data type validations are often achieved via regular expressions.
  • Multi-column validations. You may need to implement more complex business logic that involves data from multiple columns in a record. Common validations in this category include referential integrity and mathematical validations.
  • Database or 3P validations. You may want to validate that uploaded data is valid given information in an existing database or from a third-party (3P) data source. Common use cases for database validations include de-duplication and referential integrity.
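The first two categories above can be sketched as small rule functions. The specific rules shown (two-decimal amounts, start date not after end date) are hypothetical examples, and the column names are illustrative.

```typescript
// A validation rule returns an error tied to a column, or null if the
// value (or record) passes.
type RowError = { column: string; message: string };

// Data type validation: amounts must have exactly two decimal places and
// no thousands separators, checked with a regular expression.
const amountRule = (v: string): RowError | null =>
  /^\d+\.\d{2}$/.test(v)
    ? null
    : { column: "amount", message: "Expected a number like 1234.50" };

// Multi-column validation: a record's start date must not come after its
// end date. ISO 8601 dates compare correctly as strings.
const dateOrderRule = (row: Record<string, string>): RowError | null =>
  row.start_date <= row.end_date
    ? null
    : { column: "end_date", message: "end_date precedes start_date" };
```

Keeping each rule as an independent function makes it easy to run all rules per row and report every error at once on the data review pane, instead of failing on the first problem.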

OneSchema supports an out-of-the-box library of validations for CSV files that can be used with or without OneSchema’s CSV import UI. Read more here about the benefits of leveraging a pre-built data validation library.

{{blog-content-cta}}

5. Launch & Iterate

Once you’ve defined and implemented your user experience, you’re ready to launch your new CSV upload feature!

Home-rolled CSV uploaders often take several months to roll out fully, as there are edge cases that are difficult to debug without live user traffic. For example, it is hard to harden your CSV parsing layer without seeing unusual files, or to catch edge cases in your data validation logic without real data. Data mapping is another common step that requires extensive testing, especially if you implement logic to automatically map data for your customer.

We recommend rolling out your CSV importer to a subset of beta users and doing a few user research sessions to make sure customers understand how to use the feature. We also frequently see developers implement a rollback feature for bad imports as they can be quite painful to reverse manually.
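One common rollback design, sketched below, is to tag every inserted record with an import id so a bad import can be reversed in a single operation. The in-memory "table" here stands in for a real database; in production this would be a column plus a bulk delete (ideally in a transaction).

```typescript
// Every imported row carries the id of the import batch that created it.
type ImportedRow = { importId: string; data: Record<string, string> };

const table: ImportedRow[] = []; // stand-in for a database table

// Insert a batch of rows, all tagged with the same import id.
function importRows(importId: string, rows: Record<string, string>[]): void {
  for (const data of rows) table.push({ importId, data });
}

// Roll back a bad import by deleting everything it created.
// Returns the number of rows removed.
function rollbackImport(importId: string): number {
  const before = table.length;
  for (let i = table.length - 1; i >= 0; i--) {
    if (table[i].importId === importId) table.splice(i, 1);
  }
  return before - table.length;
}
```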

Most teams also choose to publish a support page on how to use their CSV import tool. Unless it has been impeccably designed, it's a common area for support requests and having a FAQ with common points of confusion can save your support team a lot of time.

Conclusion

CSV uploaders can vary drastically in complexity depending on the specifics of your data model and use case. They can take anywhere from 2 weeks to 6 months (or more!) to launch and typically involve engineering, product, and design.

It can also be helpful to review the CSV import experience in other SaaS products, as you may notice design patterns that would work well for your customers. Your competitors' data upload experiences will be especially instructive, since they are likely to have similar data and customers. However, there is no substitute for user testing: there are often nuances to your data model that you will need to clarify with your users.

Especially if your data upload experience sits on a critical flow (such as customer onboarding), we highly recommend leaving ample time for testing. If you have any questions about the design of your experience, we are more than happy to share our expertise! Feel free to reach out to us at support@oneschema.co.
