Mastering Regular Expressions
Saturday, 25 September 2004
I’ve been using regular expressions for quite a while now, and had basically taught myself how to use them by trial and error. I thought that I had most of the standard (as in non-proprietary) features licked, so much so that I was able to write a highly praised chapter about them in my Foundation PHP for Flash book.
However, for the ASDocGen project I needed to parse code with regular expressions, and that presented quite a challenge. After a while it became clear to me that either regular expressions weren’t going to be powerful enough for what I wanted to do, or maybe there was something I was missing, so I decided to buy myself a copy of Mastering Regular Expressions.
This book is pretty much the de facto book on regular expression development, and the first three chapters taught me a bit about regular expressions that I’d missed when teaching myself. However, the real kicker for me came when reading the fourth chapter, which attempts to demystify the internal workings of regular expression engines. On the face of it, the chapter sounds really boring, but I found that once I understood how the various engine types work, I was able to see exactly where my old expressions had been letting me down.
One of the additional chapters for the 2nd edition is the chapter on regular expressions in .NET, and it opened my eyes to a technique that really made parsing code a lot easier: matching nested constructs. This allows you to match balanced nested patterns, such as code blocks enclosed in braces, something that is impossible to do with normal regular expressions. This chapter is available for download from the O’Reilly site if you want to see this for yourself. Just a bit of a shame for me that the code for this chapter was in VB.NET, but it’s not that hard to translate the code into C# or JScript.NET if you’re after something a little more modern.
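To give a rough idea of what "matching nested constructs" means in practice, here is a minimal sketch in Python. It is not the .NET technique from the book: Python's built-in re module has no equivalent of .NET's balancing constructs, so this sketch uses a plain regular expression only to find the braces and keeps track of the nesting depth in ordinary code. The function name find_balanced_blocks and the sample input are made up for illustration.

```python
import re


def find_balanced_blocks(source: str) -> list:
    """Return every outermost {...} block in source, nested braces included.

    Hypothetical helper for illustration only. It ignores the fact that
    braces can also appear inside string literals and comments.
    """
    blocks = []
    depth = 0      # how many braces are currently open
    start = None   # index of the brace that opened the current outer block
    for match in re.finditer(r"[{}]", source):
        if match.group() == "{":
            if depth == 0:
                start = match.start()
            depth += 1
        elif depth > 0:  # a closing brace that has a matching opener
            depth -= 1
            if depth == 0:
                blocks.append(source[start:match.end()])
        # a stray closing brace at depth 0 is simply skipped
    return blocks


sample = "function a() { if (x) { y(); } } function b() { z(); }"
print(find_balanced_blocks(sample))
```

The depth counter is doing the job that a single traditional pattern cannot: remembering how many braces are still open. As I understand it, the .NET approach keeps that same count inside the expression itself, which is why it can match balanced blocks in one go.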
I’d recommend this book to anyone and everyone working with regular expressions – I certainly learned a lot from it and it was well worth its low price tag!





