Regular expressions (regex)

This page is a list of handy regular expressions I wrote for different projects. Feel free to use any expression you find here.

CSV parsing


Some people will tell you that CSV files cannot be parsed with regular expressions. A lot of sites state this and tell you that you need a real 'parser' in code. For a single row, however, this is not true: parsing one row of CSV data with a regular expression is not too difficult.

Expression for semicolon-separated CSV:

("([^"]*|"{2})*"(;|$))|"[^"]*"(;|$)|[^;]+(;|$)|(;)

Expression for comma-separated CSV:

("([^"]*|"{2})*"(,|$))|"[^"]*"(,|$)|[^,]+(,|$)|(,)

As you read the expressions, you will notice that they match the fields themselves instead of splitting on the delimiters. Each field in the parsed row produces one match.

These patterns support:

  • quoted fields
  • unquoted fields
  • empty fields
  • escaped quotes in fields

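Note that every match also contains the delimiter that ends it: a semicolon or comma, or nothing for the last field. An empty field matches as the bare delimiter, so an empty field at the very end of a row (after a trailing delimiter) produces no match.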
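To see the supported field kinds in action, here is a minimal Python sketch. It uses the semicolon expression verbatim. The helper names raw_matches and field_value and the sample row are made up for illustration, and the clean-up step (dropping the trailing delimiter, removing the surrounding quotes, collapsing doubled quotes) is an assumption about what you usually want to do with a match, not part of the expression itself.

```python
import re

# Semicolon pattern copied verbatim from the page.
FIELD_RE = re.compile(r'("([^"]*|"{2})*"(;|$))|"[^"]*"(;|$)|[^;]+(;|$)|(;)')


def raw_matches(row):
    """Return every match as-is, trailing delimiter included."""
    return [m.group(0) for m in FIELD_RE.finditer(row)]


def field_value(raw, delimiter=";"):
    """Turn one raw match into the field's value (hypothetical helper)."""
    # An unquoted field cannot contain the delimiter and a quoted field
    # ends with a quote, so a trailing delimiter is always the separator.
    if raw.endswith(delimiter):
        raw = raw[:-1]
    # Drop the surrounding quotes and collapse doubled quotes.
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        raw = raw[1:-1].replace('""', '"')
    return raw


# One row with an unquoted, a quoted, an empty and an escaped-quote field.
row = 'plain;"quoted; with delimiter";;"say ""hi"""'
print(raw_matches(row))
print([field_value(raw) for raw in raw_matches(row)])
# ['plain', 'quoted; with delimiter', '', 'say "hi"']
```

Keeping the raw match and the cleaned value separate makes it easy to check what the expression itself returned before any post-processing.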

For example, matching this line:

This;is;"a test";of;the;"csv parser";using;"semi-colon separated fields"

will give you these matches:

  1. This;
  2. is;
  3. "a test";
  4. of;
  5. the;
  6. "csv parser";
  7. using;
  8. "semi-colon separated fields"

The patterns do NOT support newlines within fields: they work on one record at a time, and you have to split the input into records yourself.
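Splitting into records yourself can be as simple as going line by line. The following Python sketch assumes one record per line and uses the comma expression verbatim. The function names parse_record and parse_text and the sample data are hypothetical.

```python
import re

# Comma pattern copied verbatim from the page.
FIELD_RE = re.compile(r'("([^"]*|"{2})*"(,|$))|"[^"]*"(,|$)|[^,]+(,|$)|(,)')


def parse_record(record):
    """Parse one record (one line) into a list of field values."""
    fields = []
    for match in FIELD_RE.finditer(record):
        raw = match.group(0)
        if raw.endswith(","):
            raw = raw[:-1]                      # drop the trailing delimiter
        if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
            raw = raw[1:-1].replace('""', '"')  # unquote and unescape
        fields.append(raw)
    return fields


def parse_text(text):
    """Split the text at line breaks (assumption: one record per line)."""
    return [parse_record(line) for line in text.splitlines() if line]


sample = 'name,city\n"Doe, Jane",Utrecht\n'
print(parse_text(sample))
# [['name', 'city'], ['Doe, Jane', 'Utrecht']]
```

Because the text is split at every line break before the expression runs, a quoted field that contains a newline would be cut in two, which is exactly the limitation described above.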

Placeholder parsing


Placeholders are often used in Java when constructing strings, or when creating a Spring context that contains variables. JSP pages also use placeholders to delimit EL expressions.

Placeholders look like this:

Welcome, ${userName}. Today is ${date}, there are ${numOnlineUsers} users online.

Sometimes, you'll want to implement placeholders in your own code, and parse/replace them at runtime. Here is a regular expression that will match (un-nested) placeholders:

\$\{([^\$\{\}]+)\}

Applied to the example above, this expression will produce the following matches:

  1. ${userName}
  2. ${date}
  3. ${numOnlineUsers}
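
Replacing the placeholders at runtime could look like the Python sketch below. The pattern in the sketch puts the quantifier inside the group, so group 1 holds the whole placeholder name, and it excludes braces from the name. The function name render, the decision to leave unknown placeholders untouched, and the sample values are assumptions made for illustration.

```python
import re

# The "+" sits inside the group so group 1 holds the whole name, and
# braces are excluded from the name so a match stops at the first "}".
PLACEHOLDER_RE = re.compile(r'\$\{([^${}]+)\}')


def render(template, values):
    """Replace each ${name} with values[name]; unknown names stay as-is."""
    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        return match.group(0)  # assumption: keep unknown placeholders
    return PLACEHOLDER_RE.sub(substitute, template)


template = ("Welcome, ${userName}. Today is ${date}, "
            "there are ${numOnlineUsers} users online.")

print(PLACEHOLDER_RE.findall(template))
# ['userName', 'date', 'numOnlineUsers']

# Hypothetical values, just to show the substitution.
print(render(template, {"userName": "Ada", "date": "Monday", "numOnlineUsers": 12}))
# Welcome, Ada. Today is Monday, there are 12 users online.
```

Leaving unknown placeholders in place makes missing values easy to spot in the output. Raising an error instead would be an equally reasonable choice.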