Thursday, July 17, 2008

Replace with Regular Expression

This article explains a handy use of the Regular Expression Replace method.

Level: Intermediate

Knowledge Required:
Regular Expressions

Description:
I am currently working on a tool (a short project) that converts code into HTML. I need it because formatting code as HTML by hand, before I can paste it into Blogger, takes time.

I will be using regular expressions (the Regex class) heavily in this project. Meanwhile, I am sharing the core technique here.

In code we use keywords, words that are highlighted in a different color, for example "Dim", "Me", "As", "For", etc. I use a "span" or "font" tag to change the text color, e.g.

<span style="color:blue;">Dim</span>

Keywords are usually separated by white space:

Dim myVar As String

This line has 3 keywords: Dim, As, String

All of them are separated by spaces. But some keywords, such as "Me", refer to an instance and are followed by a member access:

Me.TextBox1.Text = "ABC"

Here "Me" is followed by "." (full stop) instead of a space.

Based on these findings, I decided:

To convert the code into HTML, replace each keyword with the same keyword wrapped in tags. Take this line:

Me.TextBox1.Text = "ABC"

We can replace the above line with,

<span style="color:blue;">Me</span>.TextBox1.Text = "ABC"

So here we are doing a replace, which gave me the idea to use a regular expression, because its Replace method is very flexible. First we need to decide the pattern. For the above case the pattern will be,

(\W|^)(Me)(\W|$)

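In the above pattern the () brackets indicate groups, so the pattern has 3 groups: (\W|^) matches a non-word character or the start of the text, (Me) matches the keyword, and (\W|$) matches a non-word character or the end of the text. We will use these groups in the replace string.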
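To see what each group captures for the sample line, here is a minimal sketch. The module name GroupDemo and the function DescribeGroups are made up for illustration; only Regex.Match and Match.Groups come from the .NET Regex class.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupDemo
    ' Hypothetical helper: returns the three groups of the first match,
    ' each wrapped in square brackets so empty captures stay visible.
    Public Function DescribeGroups(ByVal code As String) As String
        Dim m As Match = Regex.Match(code, "(\W|^)(Me)(\W|$)", RegexOptions.IgnoreCase)
        If Not m.Success Then Return "(no match)"
        Return "[" & m.Groups(1).Value & "][" & m.Groups(2).Value & "][" & m.Groups(3).Value & "]"
    End Function

    Sub Main()
        ' Group 1 is empty because ^ matched at the start of the text,
        ' group 2 is the keyword, group 3 is the "." that follows it.
        Console.WriteLine(DescribeGroups("Me.TextBox1.Text = ""ABC"""))   ' [][Me][.]
    End Sub
End Module
```

Note that group 1 can be an empty string: at the start of the text there is no character before the keyword, so the ^ alternative matches without consuming anything.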

Next we need to decide what the replace string should be.

$1<font color=blue>$2</font>$3

The above string means that while replacing,

1) Put Group #1
2) Put Font Tag
3) Put Group #2
4) Close the Font Tag
5) Put the Group #3

That's it. Here is the complete code,

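Dim sCode As String
Dim sHTML As String

sCode = "Me.TextBox1.Text=""Abc"""

' $1 and $3 put back whatever was matched around the keyword;
' $2 is the keyword itself, wrapped in the font tag.
sHTML = System.Text.RegularExpressions.Regex.Replace( _
    sCode, _
    "(\W|^)(Me)(\W|$)", _
    "$1<font color=blue>$2</font>$3", _
    System.Text.RegularExpressions.RegexOptions.IgnoreCase _
)

' Prints: <font color=blue>Me</font>.TextBox1.Text="Abc"
Debug.Print(sHTML)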
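
The same idea can be extended to several keywords at once. The sketch below is a variant, not the pattern used above: it swaps (\W|^) and (\W|$) for the zero-width word boundary \b, and it assumes a short, hypothetical keyword list. The reason for the swap is that the three-group pattern consumes the character after each keyword, so in "Dim myVar As String" the space after "As" is used up and "String" is then missed. The names KeywordHighlighter, Keywords and Highlight are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module KeywordHighlighter
    ' Hypothetical keyword list for illustration only; a real tool
    ' would need the full set of VB keywords.
    Private ReadOnly Keywords As String() = {"Dim", "As", "String", "Me", "For"}

    ' Wraps every whole-word keyword in a blue span.
    ' \b matches a position, not a character, so the text around the
    ' keyword is left alone and only one group ($1) is needed.
    Public Function Highlight(ByVal code As String) As String
        Dim pattern As String = "\b(" & String.Join("|", Keywords) & ")\b"
        Return Regex.Replace(code, pattern, "<span style=""color:blue;"">$1</span>", RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Console.WriteLine(Highlight("Dim myVar As String"))
        Console.WriteLine(Highlight("Me.TextBox1.Text = 1"))
    End Sub
End Module
```

This variant still colors keywords that appear inside string literals or comments, so it is only a starting point for a full converter.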
