how to read and write comma-separated value (.csv) files

introduction

CSV files are a traditional means of writing database or spreadsheet files to disk. Many modern programs, such as MS Excel and MS Access, can read and write these files. Their advantage nowadays is that they and the data in them can easily be manipulated with a simple text editor, rather than needing the full application which wrote them.

The structure of a CSV file is very simple. It is a text file, and usually starts with a list of fields, separated by commas. For example, if the file was to store people’s names and addresses, the first line of the file might look like:

Forename,Surname,Address1,Address2,City,PostCode

Subsequent lines contain the data corresponding to these fields, e.g:

John,Smith,12 Any Street,A District,Some Town,X1 2YZ

It is easy to use VB to read and write these files, but the problem then is to separate the data values, separated by commas, into individual items which can be stored in variables. You can either write a parser to do this, or more conveniently use the VB6 Split() function.

the Split() function

This function has the following syntax:

array = Split(string expression [,delimiter character [,number of substrings to return [,compare method]]])

The string expression contains a string which consists of a list of values separated by the delimiter. The delimiter in the case of a CSV file is a comma, but in other cases it could be a space, tab, asterisk, or any other character which does not occur in the data values themselves. 

The first parameter is not optional, but the last three are. The first of these is the delimiter character which separates data items. In the case of a CSV file, this is a comma, but it can be any valid character. If this parameter is missing, then the delimiter is assumed by default to be a space. If it is set to a zero-length string, all the substrings in the expression are joined together and returned as a single element without the separating commas.

The third parameter is an integer containing the number of substrings to return. By default this is –1, which means return all strings.

The final parameter indicates how the value in the delimiter string will be compared to the expression to be split. In other words, if the delimiter is a capital A, then you only want to match upper case As in the expression. All lower case letter a should be ignored. The potential values here are –1 (vbUseCompareOption, which uses the setting of the Option Compare statement if there is one); 0 (vbCompareBinary, which performs a binary comparison which will be case-sensitive), 1 (vbCompareText, which performs a text comparison and is case-insensitive) and 2 (vbCompareDatabase, which depends on information contained in a database).

The Split() function returns a one-dimensional array containing the required data items.

example

To split a CSV file into sub-strings:

Private Sub GetSubStrings(sLine As String)
‘ sLine is a single line read from the .CSV file, e.g. using the Line Input statement

‘ declare a string array
Dim sTokens() As String
Dim sTemp As String
Dim nI As Integer

‘ compare parameter is irrelevant here really because commas don’t have a case!
sTokens = Split(sLine, “,”, -1, vbCompareText)

For nI = 0 To UBound(sTokens)
sTemp = sTokens(nI)
‘ additional code here to do whatever is required with the substring
Next nI

End Sub

In the example given in the introduction, this would split the first line of the file into a string array with each element containing one field name. The array would have 6 elements, and since arrays are zero-based (unless you change this with the Option Base statement) the upper bound is 5. The routine above merely takes each element in turn and processes it in whatever fashion is required.

writing a CSV file

This is just as simple, but uses the Join() function to create the CSV lines. This has the following syntax:

Join(array expression [,delimiter string])

The array is required and is a one-dimensional array of substrings which will be joined into a single string. The substrings will be separated by the delimiter; if this is missing, a space is used. If it is set to an empty string, then all substrings are joined together with no delimiter.