Regex Explorer

Pattern Matching Made Simple

Regular expressions are at the core of powerful scripting languages including Perl and Tcl. Many tasks can be simplified with some knowledge of regular expressions. Indeed, some daunting tasks are solved very easily with a little regex knowledge.

The Regex Explorer from Ivybridge Simulation is an interactive assistant that helps you write regular expressions quickly. Read in a sample of the text you want to search, then edit your regular expression; the tool gives instant feedback, showing which parts of the text have been matched. When you make a mistake the match disappears, and the undo and redo buttons help you to backtrack.

Download

Version 1.0

Operating System  MD5SUM                            Download
Windows           815f84e4f18b084c2b55e0f36bd48e23  regex10.exe
Linux             e29f59376cacc91ac353ca18c3c06a84  regex10.bin

Licence

Version 1.0 is free to download and use at work or home.

Manual

The program is quite self-explanatory. Here are the steps to take:

  • Choose Perl or Tcl with the radio buttons
  • Type, paste or open a text file (File menu) into the Search Space window
  • Edit the regular expression in the Regex entry window
  • See where your regex matches in the Matches window
  • Highlight and cross-reference matches using the mouse pointer

Note: A Tcl interpreter is built into this application. A Perl interpreter must be installed on your machine for Perl regular expression matching to work.


Regex Syntax Primer

Character Classes

.   Dot           The regex wildcard. In both Perl and Tcl, dot matches any character. However, in Perl dot only matches the newline character if the s option is used. In Tcl dot always matches the newline character unless the -linestop option is used.
\w  Alphanumeric  Matches any letter (upper or lower case), digit or underscore.
\d  Digit         Matches any digit, 0-9.
\s  Whitespace    Matches any whitespace character, including spaces, tabs and newlines.
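
The difference the s option makes to dot can be checked with a few lines of Perl. This is only a minimal sketch: the sample text and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text spanning two lines.
my $text = "first line\nsecond line";

# Without /s, dot does not match the newline, so this fails.
my $without_s = ($text =~ /line.second/)  ? 1 : 0;

# With /s, dot also matches the newline, so this succeeds.
my $with_s    = ($text =~ /line.second/s) ? 1 : 0;

# \w, \d and \s each match one character from their class.
my $classes   = ("a1 " =~ /^\w\d\s$/) ? 1 : 0;

print "without /s: $without_s\n";   # prints 0
print "with /s:    $with_s\n";      # prints 1
print "classes:    $classes\n";     # prints 1
```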

Quantifiers

?  0 or 1     The preceding character class or group should appear zero times or exactly once.
*  0 or more  The preceding character class or group should appear zero or more times.
+  1 or more  The preceding character class or group should appear one or more times.

For example, searching with the regex

  \d+ \w+

for a match in the string

  There are 27 desks on order

finds a number followed by a space and a word, here "27 desks".
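
The three quantifiers can be compared side by side in Perl. The following is a small sketch rather than anything taken from the tool itself; the test words "ac", "abc" and "abbbc" are invented for illustration.

```perl
use strict;
use warnings;

# The example from the text: $& holds the text of the last successful match.
my $text  = "There are 27 desks on order";
my $found = ($text =~ /\d+ \w+/) ? $& : '';
print "Matched: '$found'\n";    # prints Matched: '27 desks'

# Try each quantifier on the letter b, between a and c.
my @results;
for my $word ("ac", "abc", "abbbc") {
    my $optional = ($word =~ /^ab?c$/) ? 'yes' : 'no';   # zero or one b
    my $star     = ($word =~ /^ab*c$/) ? 'yes' : 'no';   # zero or more b
    my $plus     = ($word =~ /^ab+c$/) ? 'yes' : 'no';   # one or more b
    push @results, "$word $optional $star $plus";
}
print "word ? * +\n";
print "$_\n" for @results;
```

Only "ac" fails the + test, because + needs at least one b, and only "abbbc" fails the ? test, because ? allows at most one.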

Groups

(...)    Group   Round brackets are used to delimit groups or sub-expressions. The group can be repeated using a quantifier, and the substring matched by the sub-expression is copied into a back-reference variable for reading after a successful match.
(.(..))  Groups  Groups can be nested but not overlapped. The order of the back-references depends on the order of the opening brackets.

For example, searching "There are 27 desks on order" with "(\d+) (\w+)" matches a number followed by a word and places the two matches in separate variables: 27 in the first variable and desks in the second.
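
The numbering of nested groups can be seen by wrapping the same pattern in an outer pair of brackets. This is a minimal Perl sketch; the array name @groups is only illustrative.

```perl
use strict;
use warnings;

my $text = "There are 27 desks on order";

# Opening brackets are counted left to right:
#   group 1 = the outer group, group 2 = the number, group 3 = the word.
# In list context a successful match returns the captured groups in order.
my @groups = $text =~ /((\d+) (\w+))/;

print "\$1 = $groups[0]\n";   # prints $1 = 27 desks
print "\$2 = $groups[1]\n";   # prints $2 = 27
print "\$3 = $groups[2]\n";   # prints $3 = desks
```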

Imagine that a large text file contains a line saying how many desks there are. This short Tcl script opens the file and reports how many desks are on order:

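set f [open filename.txt]
# Read the file one line at a time; gets returns -1 at end of file
while {[gets $f line] >= 0} {
  # The whole match is stored in the variable named -> and the group in number
  if {[regexp {(\d+) desks} $line -> number]} {
    puts "Found $number desks!"
  }
}
close $f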

There's even less to it in Perl:

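open( my $fh, '<', 'filename.txt' ) or die "Cannot open filename.txt: $!";
while( <$fh> ) {
  # $1 holds the text captured by the first group
  /(\d+) desks/ and print "Found $1 desks!\n";
}
close( $fh );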

Do you need to learn some more Perl? Start by working through the Quick Start Perl article from Doulos, which I wrote while a member of the Doulos consulting team.

Do you need to learn lots of Perl or Tcl? I can recommend the 3-day Essential Perl and Essential Tcl/Tk training courses from Doulos. I wrote much of this material and, as a Doulos-approved training provider, might even end up delivering it for you.

Simon Dempsey
Consultant
Nov 2006

Copyright © 2006-2018 Ivybridge Simulation Email: info6@ivysim.com Phone: 07704 874512