Archive for January, 2006

Parse Comma Delimited File with quotes and commas in .NET

Several developers here have asked me how to parse a comma-delimited line whose fields can themselves contain commas and quotes. I decided to post this because my searching on the web did not turn up a good answer. I originally tried the matching methods of the Regex class, and they did not work very well: I called Regex.Matches(), looped through the resulting collection of matches, and got unexpected results. I used Expresso to test my regular expression, and it worked fine there with this expression: \s*(?!$)*\"(?<value>[^\s]*([^\"]*)*\")|(?<value>[^,]*)\s*. When I put this into .NET and used a for each loop to go through the matches and read the group "value", every other entry came back blank, which is not what I wanted. After some searching and tweaking, I arrived at the solution below, which works great: instead of matching the fields, it splits the line on every comma that is not inside a quoted field.

Here is the sample line to be parsed:

Field1,Field2,Field3,Field4,Field5,"Field6, Field7"

Here is the code:

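// Console requires "using System;" at the top of the file.
string line = "Field1,Field2,Field3,Field4,Field5,\"Field6, Field7\"";
// Split on every comma that is followed by an even number of double quotes,
// that is, every comma that is not inside a quoted field.
System.Text.RegularExpressions.Regex r = new System.Text.RegularExpressions.Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string[] result = r.Split(line);
for (int i = 0; i < result.Length; i++)
{
    Console.WriteLine(result[i]);
}
Console.ReadLine();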

(You can convert the C# to VB.NET here: http://www.developerfusion.co.uk/utilities/convertcsharptovb.aspx)

The result looks like this:

Field1
Field2
Field3
Field4
Field5
"Field6, Field7"
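
Note that the last field still carries its surrounding quotes. If you want them removed, here is a minimal sketch of one way to do it. The class name CsvLineSplitter and the method SplitFields are my own hypothetical names, not part of .NET, and the sketch assumes a field counts as quoted only when it both starts and ends with a double quote. The pattern is the same one used above, just written as a verbatim string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the .NET Framework.
static class CsvLineSplitter
{
    // Same pattern as in the post; inside a verbatim string, "" stands for one quote.
    private static readonly Regex CommaOutsideQuotes =
        new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");

    // Splits one line on commas outside quotes, then strips one pair of
    // enclosing double quotes from any field that has them.
    public static string[] SplitFields(string line)
    {
        string[] fields = CommaOutsideQuotes.Split(line);
        for (int i = 0; i < fields.Length; i++)
        {
            string f = fields[i];
            if (f.Length >= 2 && f[0] == '"' && f[f.Length - 1] == '"')
            {
                fields[i] = f.Substring(1, f.Length - 2);
            }
        }
        return fields;
    }
}
```

Calling CsvLineSplitter.SplitFields(line) on the sample line should give the same six fields, with the last one coming back as Field6, Field7 without the quotes.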
