Function Repository Resource:

ImportCSVToDataset

Import CSV and TSV files to datasets

Contributed by: Anton Antonov

ResourceFunction["ImportCSVToDataset"][fname]

imports the CSV or TSV file fname into a Dataset.

ResourceFunction["ImportCSVToDataset"][fname,"format"]

imports the file fname into a Dataset, assuming the file is in the specified format.

Details and Options

ResourceFunction["ImportCSVToDataset"] takes all options of Import for CSV as well as the options Method, "ColumnNames" and "RowNames".
The value of the Method option can be one of Automatic, Import or ImportString.
Here are the options and default values:
"ColumnNames"       True                                          whether to treat the first row as column names
"RowNames"          False                                         whether to treat the first column as row names
CharacterEncoding   "UTF8ISOLatin1"                               raw character encoding used in the file
"CurrencyTokens"    {{"$", "£", "¥", "€"}, {"c", "¢", "p", "F"}}  currency units to be skipped when importing numerical values
"DateStringFormat"  None                                          date format, given as a DateString specification
"FieldSeparators"   ","                                           field separator
"FillRows"          Automatic                                     whether to fill rows to the max column length
"HeaderLines"       0                                             number of lines to assume as headers
"IgnoreEmptyLines"  False                                         whether to ignore empty lines
Method              Import                                        importing method to use
"NumberPoint"       "."                                           decimal point string
"Numeric"           Automatic                                     whether to import data fields as numbers if possible
"SkipLines"         0                                             number of lines to skip at the beginning of the file
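
Because most of these options are the ones Import accepts for CSV, their effect can be previewed with the built-in ImportString alone. The following is a minimal sketch, not taken from the resource function itself: the variable name csvText and its contents are illustrative, and it assumes "SkipLines" behaves as described in the table above.

```wolfram
(* Illustrative data: one comment line, then a header row and two data rows *)
csvText = "# exported by some tool\nV1,V2\n1,1\n2,2";

(* Skip the first line before parsing the remaining lines as CSV *)
ImportString[csvText, "CSV", "SkipLines" -> 1]
```

The same option would presumably be given to ResourceFunction["ImportCSVToDataset"] in the same rule form, "SkipLines" -> 1.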

Examples

Basic Examples (1) 

Import a CSV file from GitHub:

In[1]:=
ResourceFunction[
 "ImportCSVToDataset"]["https://raw.githubusercontent.com/antononcube/\
MathematicaVsR/master/Data/MathematicaVsR-Data-Titanic.csv"]
Out[1]=

Scope (1) 

If the first argument is a CSV string (not a file name or URL), then that string can be imported by specifying ImportString as the importing method:

In[2]:=
tbl = "V1,V2
1,1
2,2
3,3";
ResourceFunction["ImportCSVToDataset"][tbl, "CSV", Method -> ImportString]
Out[3]=
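
For comparison, a similar dataset can be assembled from built-in functions only. This is a rough sketch of the idea, not the resource function's actual implementation; the names csvText and rows are illustrative, and the string mirrors the tbl value used in this example.

```wolfram
(* Illustrative CSV string with explicit newline characters *)
csvText = "V1,V2\n1,1\n2,2\n3,3";

(* Parse into a list of rows; the first row holds the column names *)
rows = ImportString[csvText, "CSV"];

(* Pair each data row with the column names and wrap the result in a Dataset *)
Dataset[AssociationThread[First[rows] -> #] & /@ Rest[rows]]
```

Each data row becomes an association keyed by the header row, which is the list-of-associations form a Dataset displays with named columns.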

Options (5) 

ColumnNames (1) 

Treat the first row and column of the file as ordinary data:

In[4]:=
ResourceFunction[
 "ImportCSVToDataset"]["https://raw.githubusercontent.com/antononcube/\
MathematicaVsR/master/Data/MathematicaVsR-Data-Titanic.csv", "ColumnNames" -> False]
Out[4]=

RowNames (1) 

Treat both the first row and the first column of the file as names:

In[5]:=
ResourceFunction[
 "ImportCSVToDataset"]["https://raw.githubusercontent.com/antononcube/\
MathematicaVsR/master/Data/MathematicaVsR-Data-Titanic.csv", "RowNames" -> True]
Out[5]=
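
In Dataset terms, row names correspond to an association whose keys are the row names and whose values are associations of column values. The sketch below builds such a structure by hand with built-in functions; it is an illustration under assumed data (csvText, rows and header are hypothetical names), not the code the resource function runs.

```wolfram
(* Illustrative CSV string: the first column holds row names *)
csvText = "id,V1,V2\na,1,1\nb,2,2";

(* Parse into rows; drop the corner cell to get the column names *)
rows = ImportString[csvText, "CSV"];
header = Rest[First[rows]];

(* Key each row by its first element, and key the remaining values by column name *)
Dataset[
 Association[
  Map[First[#] -> AssociationThread[header -> Rest[#]] &, Rest[rows]]]]
```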

FieldSeparators (3) 

CSV files can use field separators other than commas, such as ";":

In[6]:=
tbl = "V1;V2
1;1
2;2
3;3";

Import using the default field separator:

In[7]:=
ResourceFunction["ImportCSVToDataset"][tbl, "CSV", Method -> ImportString]
Out[7]=

Import using the proper field separator:

In[8]:=
ResourceFunction["ImportCSVToDataset"][tbl, Method -> ImportString, "FieldSeparators" -> ";"]
Out[8]=
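
A comparable result can be sketched with the built-in "Table" import format, which accepts a "FieldSeparators" option. This is an alternative built-in route shown for illustration, not a description of how the resource function handles separators; semicolonText and rows are hypothetical names.

```wolfram
(* Illustrative semicolon-separated data, matching the tbl string above *)
semicolonText = "V1;V2\n1;1\n2;2\n3;3";

(* Split fields on ";" instead of the default separators of the "Table" format *)
rows = ImportString[semicolonText, "Table", "FieldSeparators" -> ";"];

(* Use the first row as column names for the remaining rows *)
Dataset[AssociationThread[First[rows] -> #] & /@ Rest[rows]]
```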

Properties and Relations (2) 

ImportCSVToDataset will give similar results to Import with "Dataset" specified as the result type:

In[9]:=
Import["https://raw.githubusercontent.com/antononcube/MathematicaVsR/\
master/Data/MathematicaVsR-Data-Titanic.csv", "Dataset", "HeaderLines" -> 1]
Out[9]=

A dataset exported as CSV can be imported back to a dataset:

In[10]:=
titanicDS = ResourceFunction["ImportCSVToDataset"][
   "https://raw.githubusercontent.com/antononcube/MathematicaVsR/\
master/Data/MathematicaVsR-Data-Titanic.csv", "RowNames" -> True];

Export the dataset to a temporary location as a CSV file:

In[11]:=
exportedCSV = Export[FileNameJoin[{$TemporaryDirectory, "titanic.csv"}], titanicDS, "CSV"];

Use ImportCSVToDataset to bring it back to a dataset:

In[12]:=
ResourceFunction["ImportCSVToDataset"][exportedCSV, "RowNames" -> True]
Out[12]=

Possible Issues (1) 

Importing a CSV file as TSV produces a one-column dataset, because the commas are not treated as field separators:

In[13]:=
ResourceFunction[
 "ImportCSVToDataset"]["https://raw.githubusercontent.com/antononcube/\
MathematicaVsR/master/Data/MathematicaVsR-Data-Titanic.csv", "TSV"]
Out[13]=

Publisher

Anton Antonov

Version History

  • 2.1.0 – 29 March 2021
  • 2.0.0 – 09 March 2020
  • 1.0.0 – 18 December 2019

Author Notes

Except for the options Method, "FieldSeparators", "RowNames" and "ColumnNames", all options are the same as the (hidden) Import CSV options. See: https://reference.wolfram.com/language/ref/format/CSV.html
