Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource: ReadScan
Evaluate a function applied to each record in a file
ResourceFunction["ReadScan"][f,file] evaluates f applied to each expression read from file in turn.
ResourceFunction["ReadScan"][f,file,type] evaluates f applied to each object of the specified type read from file in turn.
ResourceFunction["ReadScan"][f,file,{type1,type2,…}] evaluates f applied to each list of the specified types read from file in turn.
Print expressions from a file:
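A minimal sketch of the two-argument form, assuming a hypothetical temporary file written with Put so that it holds one expression per record. The file name and sample expressions are illustrative only:

```wolfram
(* Hypothetical sample file holding three expressions *)
file = FileNameJoin[{$TemporaryDirectory, "readscan-expressions.txt"}];
Put[{1, 2, 3}, "hello", x^2, file];

(* With no type given, each record is read as an expression and passed to Print *)
ResourceFunction["ReadScan"][Print, file]

(* Remove the sample file *)
DeleteFile[file];
```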
Print lines from a file:
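A minimal sketch of reading line by line, assuming the String type has the same meaning it has in Read and ReadList (one line of text per record). The file and its contents are hypothetical:

```wolfram
(* Hypothetical sample file with three lines of text *)
file = FileNameJoin[{$TemporaryDirectory, "readscan-lines.txt"}];
Export[file, "first line\nsecond line\nthird line", "Text"];

(* The String type reads one line at a time; Print is applied to each line *)
ResourceFunction["ReadScan"][Print, file, String]

(* Remove the sample file *)
DeleteFile[file];
```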
Process groups of records in bulk from a file:
Print records from a file:
Specify record separators:
ReadScan[f,file,type] is comparable to Scan[f,ReadList[file,type]], but it may use less memory:
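The difference between the two approaches can be sketched with built-in functions only. The helper streamScan below is a hypothetical illustration of the streaming idea for a single record type; it is not the repository function's actual implementation:

```wolfram
(* Hypothetical sample file of numbers, one per line *)
file = FileNameJoin[{$TemporaryDirectory, "readscan-numbers.txt"}];
Export[file, StringRiffle[Range[5], "\n"], "Text"];

(* Approach 1: ReadList builds the whole list in memory before Scan runs *)
total1 = 0;
Scan[(total1 += #) &, ReadList[file, Number]];

(* Approach 2: read one record at a time from a stream, so only the
   current record is held in memory. Handles a single type only. *)
streamScan[f_, path_, type_] :=
  Module[{stream = OpenRead[path], item},
    While[(item = Read[stream, type]) =!= EndOfFile, f[item]];
    Close[stream];
  ];

total2 = 0;
streamScan[(total2 += #) &, file, Number];

(* Both approaches visit the same records *)
{total1, total2}

(* Remove the sample file *)
DeleteFile[file];
```

Both totals come out equal because the same records are visited in the same order; only the peak memory use differs.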
ReadScan used with Reap and Sow can replicate behavior of ReadList:
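A minimal sketch of the Reap and Sow pattern, assuming a hypothetical non-empty sample file. Sow is used as the scanned function, so every record read is collected by the enclosing Reap:

```wolfram
(* Hypothetical sample file with three lines *)
file = FileNameJoin[{$TemporaryDirectory, "readscan-reap.txt"}];
Export[file, "alpha\nbeta\ngamma", "Text"];

(* Sow each line as it is read; Reap returns {result, {sownValues}} *)
collected = Reap[ResourceFunction["ReadScan"][Sow, file, String]][[2, 1]];

(* Compare with reading the whole file at once *)
sameAsReadList = (collected === ReadList[file, String])

(* Remove the sample file *)
DeleteFile[file];
```

If the file is empty, nothing is sown and the second element of the Reap result is an empty list, so the part extraction [[2, 1]] would need a guard.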
Processing a large file having a simple structure can use a lot of memory if done with Import:
ReadScan can achieve similar results with significantly less memory:
If the lines of the file are to be processed, but not stored in the current session, substantially less memory is required:
When processing groups of lines, the final list may include EndOfFile:
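The same padding can be seen with the built-in Read, which returns EndOfFile for each object requested past the end of the file. The sketch below assumes that grouped reads in ReadScan behave like Read with a list of types; the three-line file is hypothetical:

```wolfram
(* Hypothetical sample file with an odd number of lines *)
file = FileNameJoin[{$TemporaryDirectory, "readscan-groups.txt"}];
Export[file, "a\nb\nc", "Text"];

stream = OpenRead[file];
firstGroup = Read[stream, {String, String}];   (* a full pair of lines *)
secondGroup = Read[stream, {String, String}];  (* only one line remains *)
Close[stream];

(* Strip the EndOfFile padding before processing a short final group *)
cleaned = DeleteCases[secondGroup, EndOfFile]

(* Remove the sample file *)
DeleteFile[file];
```

A function passed to ReadScan for grouped reads can apply the same DeleteCases step to its argument so that the final, shorter group is handled like the others.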
Using Import may be faster:
Download and extract a somewhat large, "nearly-flat" JSON file from USDA FoodData Central:
Import requires a large amount of memory:
Processing the file line-by-line requires much less memory, and allows for fine-grained control of how each line is processed:
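A minimal sketch of per-line processing, assuming a hypothetical file in which every line is a complete JSON object. The field names "name" and "kcal" are invented for illustration and do not reflect the structure of the FoodData Central file:

```wolfram
(* Hypothetical sample file: one JSON object per line *)
file = FileNameJoin[{$TemporaryDirectory, "readscan-json.txt"}];
Export[file,
  "{\"name\": \"apple\", \"kcal\": 52}\n{\"name\": \"bread\", \"kcal\": 265}",
  "Text"];

(* Parse each line on its own and keep only a running total,
   so the parsed records are never stored together *)
total = 0;
ResourceFunction["ReadScan"][
  (total += ImportString[#, "RawJSON"]["kcal"]) &,
  file, String];
total

(* Remove the sample file *)
DeleteFile[file];
```

Because each line is parsed and discarded in turn, the function applied to it can filter, transform, or skip records individually.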
Clean up by deleting the extracted file:
This work is licensed under a Creative Commons Attribution 4.0 International License