We have a large amount of text and HTML stored in a database. Some of the words in the text need to be replaced with hyperlinks. The words are part of a glossary, and we want each one to show its glossary entry on mouseover.
e.g. "The contact is available" would become "The <a href="#" onmouseover="ShowGlossary107">contact</a> is available".
The problem we are trying to solve is that the word will at times appear as part of an existing link or inside an HTML tag. Here is an example of the text we are talking about. In the following text we need to replace all instances of the word "contact" with a mouseover hyperlink to a glossary of terms.
line1. This is the Contact
line2. <a href="[login to view URL]" title="Contact">Google</a>
line3. Contact can be contact at <a href="[login to view URL]">contact</a>.
line4. Contact can be found if you use the contact links at <a href="[login to view URL]">Contact information</a>
We have been doing a straight VBScript Replace (the code is ASP) and finding that it breaks the hyperlinks.
e.g. line 2 would become
<a href="[login to view URL]" title="<a href="[login to view URL]'Contact'>Contact</a>">Google</a>
which totally trashes the existing link.
What we want is for words that are part of a link, an img src, or anything else inside an HTML tag not to be replaced.
So we need a regex (or any other method) to ensure that words that are part of HTML tags are not broken.
<a href="[login to view URL]" title="Contact">Contact</a> would not be touched, because title="Contact" is part of an HTML tag and the >Contact</a> text is already inside an existing link.
Also, words that are part of any other HTML tags are to be left untouched,
e.g. <img src="[login to view URL]" title="This is a contact image">. Likewise, if we attempt to replace the word "contact" in the content, the following would be left untouched:
<a href="[login to view URL]" title="My Contact"> Here is my contact </a>
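
One possible way to meet the requirements above is to split the markup into tags and text runs, copy every tag through unchanged, and only replace the word in text that is not already inside an <a> element. The sketch below is in Python purely for illustration (our code is ASP/VBScript, and the same idea should port). The function name link_glossary_terms, the glossary id 107 and the page.html / c.gif URLs are made-up placeholders, the link format simply mirrors the ShowGlossary107 example above, and it assumes that attribute values never contain a literal ">" character.

```python
import re

# Splits markup into text runs (even indexes) and tags (odd indexes).
# Assumption: no attribute value or comment contains a literal ">".
TAG_RE = re.compile(r"(<[^>]*>)")
TAG_NAME_RE = re.compile(r"</?\s*([a-zA-Z0-9]+)")


def link_glossary_terms(html, term, glossary_id):
    """Wrap whole-word matches of `term` in a glossary link, leaving tag
    internals and text inside existing <a> elements untouched."""
    word_re = re.compile(r"\b%s\b" % re.escape(term), re.IGNORECASE)

    def wrap(match):
        # Hypothetical link format, copied from the example in the text.
        return '<a href="#" onmouseover="ShowGlossary%d">%s</a>' % (
            glossary_id, match.group(0))

    out = []
    anchor_depth = 0  # greater than 0 while inside an existing <a>...</a>
    for i, piece in enumerate(TAG_RE.split(html)):
        if i % 2 == 1:
            # A tag: copy it verbatim and track <a> open/close.
            name = TAG_NAME_RE.match(piece)
            if name and name.group(1).lower() == "a":
                if piece.startswith("</"):
                    anchor_depth = max(0, anchor_depth - 1)
                else:
                    anchor_depth += 1
            out.append(piece)
        elif anchor_depth > 0:
            # Text that is already part of a link: leave it alone.
            out.append(piece)
        else:
            out.append(word_re.sub(wrap, piece))
    return "".join(out)


sample = ('Contact can be contact at '
          '<a href="page.html" title="My Contact">contact</a>. '
          '<img src="c.gif" title="This is a contact image">')
print(link_glossary_terms(sample, "contact", 107))
```

Only the first two occurrences in the sample are plain text outside a link, so only those are wrapped; the title attributes and the existing link text pass through unchanged. Markup where ">" can appear inside an attribute value would need a real HTML parser instead of the simple tag pattern used here.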