C#: Split CSV Values with a Regular Expression

And they said it couldn’t be done.

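You can split a line of comma-separated data into its fields, stripping the text-qualifying quotes along the way, in a single line of code with this regular expression. It works on one line at a time and assumes that quoted fields contain no embedded quote characters: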

((?<=\")[^\"]*(?=\"(,|$)+)|(?<=,|^)[^,\"]*(?=,|$))

Here is the implementation in C#:

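using System;
using System.Text.RegularExpressions;

// One line of CSV; three fields are quoted, and two of those contain commas.
string values = "111,222,\"33,44,55\",666,\"77,88\",\"99\"";

// Each match is one field, with any surrounding quotes already excluded.
MatchCollection matches =
    new Regex("((?<=\")[^\"]*(?=\"(,|$)+)|(?<=,|^)[^,\"]*(?=,|$))").Matches(values);

foreach (Match match in matches)
{
    Console.WriteLine(match.Value);
}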

Output:

111  
222  
33,44,55  
666  
77,88  
99
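
If the one-liner is hard to read, it may help to spell it out. The sketch below is the same expression written with RegexOptions.IgnorePatternWhitespace, so each piece can carry its own comment. The names CsvFieldPattern, Annotated, and Split are made up for illustration and are not part of the original one-liner.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CsvFieldPattern
{
    // Same pattern as the one-liner. With IgnorePatternWhitespace, unescaped
    // whitespace is ignored and '#' starts a comment that runs to end of line.
    public static readonly Regex Annotated = new Regex(@"
        (
            (?<="")        # quoted field: preceded by an opening quote
            [^""]*         # everything up to the next quote, commas included
            (?=""(,|$)+)   # followed by a closing quote, then a comma or end of input
        |
            (?<=,|^)       # unquoted field: preceded by a comma or start of input
            [^,""]*        # a run of characters that are neither comma nor quote
            (?=,|$)        # followed by a comma or end of input
        )",
        RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns the matched field values in order.
    public static List<string> Split(string line)
    {
        var fields = new List<string>();
        foreach (Match match in Annotated.Matches(line))
        {
            fields.Add(match.Value);
        }
        return fields;
    }
}
```

The lookbehind and lookahead assertions are what keep the quotes and delimiters out of the results: they must be present around a field, but they are never part of the match itself.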

To do the same with a line of tab-separated values (TSV), swap each comma in the pattern for a tab:

Regex tsvRegex = new Regex("((?<=\")[^\"]*(?=\"(\t|$)+)|(?<=\t|^)[^\t\"]*(?=\t|$))");
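
Since the CSV and TSV patterns differ only in the delimiter, the pattern could also be built from a delimiter argument. What follows is a minimal sketch rather than anything from the original one-liner: DelimitedSplitter, BuildPattern, and Split are hypothetical names. It assumes a single-character delimiter such as a comma, tab, semicolon, or pipe, and it passes the delimiter through Regex.Escape so that characters like | are matched literally.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class DelimitedSplitter
{
    // Hypothetical helper: builds the same two-alternative pattern for a
    // single-character delimiter (assumed to be comma, tab, semicolon, or pipe).
    public static Regex BuildPattern(char delimiter)
    {
        // Escape so that metacharacters such as '|' are treated as literals.
        string d = Regex.Escape(delimiter.ToString());
        return new Regex(
            "((?<=\")[^\"]*(?=\"(" + d + "|$)+)|(?<=" + d + "|^)[^" + d + "\"]*(?=" + d + "|$))");
    }

    // Hypothetical helper: splits one line and returns the field values.
    public static string[] Split(string line, char delimiter)
    {
        return BuildPattern(delimiter)
            .Matches(line)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }
}
```

With this in place, DelimitedSplitter.Split(line, ',') behaves like the CSV expression and DelimitedSplitter.Split(line, '\t') like the TSV one. If you split many lines, you would probably want to build the Regex once and reuse it rather than rebuilding it on every call.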