Extract tag within pre tag

Hi,
I'm going nuts. I've got heaps of HTML pages that have to be adjusted, because they contain "a href" tags inside "pre" ... "/pre" blocks, and those tags have to be removed. The code looks something like this:
 
<pre class="brush:xml;toolbar:false;gutter:true;">&lt;constructiongroups xmlns:xsd=&quot;http://www.w3.org/2001/XMLSchema-instance&quot;<br>
 xsd:noNamespaceSchemaLocation=&quot;svgdescription.xsd&quot;&gt;<br>&lt;constructiongroup&gt;<br>&lt;fza&gt;5&lt;/fza&gt;<br>&lt;<a title="HST" href="1802.htm#o2441">hst</a>&gt;10&lt;/hst&gt;<br>&lt;<a title="HT" href="1802.htm#o2442">ht</a>&gt;30&lt;/ht&gt;<br>
 

In the meantime I found espresso to work with, but it doesn't satisfy my needs. I found the following regex:
 
(?:<pre .*\b>)*(?:<a[^>].*href="[0-9]+\.htm\#o[0-9]+"[^>]*>)(?:</a>)(?:</pre)
 
It isn't a complete solution yet, though, and I don't know how to finish it. At the moment the regex works partially, but not as a whole.
Can someone help me, PLEASE?
Cheerio,
Heike
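
One way this could be tackled is in two steps instead of one big regex: first find each "pre" block, then remove the "a" tags only inside that block while keeping the link text. Below is a minimal sketch in Python (the language is an assumption, the post doesn't name one). The names PRE_BLOCK, ANCHOR_TAG and strip_anchors_in_pre are made up for illustration, and the sketch assumes every "pre" block is closed with "/pre" and that "pre" blocks are not nested.

```python
import re

# One <pre ...> ... </pre> block. DOTALL lets '.' cross line breaks,
# and the lazy '.*?' stops at the first closing </pre>.
PRE_BLOCK = re.compile(r"<pre\b[^>]*>.*?</pre>", re.IGNORECASE | re.DOTALL)

# An opening <a ...> or a closing </a> tag, but not the text between them.
# The \b keeps the pattern from matching other tags such as <abbr>.
ANCHOR_TAG = re.compile(r"</?a\b[^>]*>", re.IGNORECASE)


def strip_anchors_in_pre(html: str) -> str:
    """Remove <a> tags (keeping their text) inside every <pre> block."""
    # The lambda receives each matched <pre> block and returns it
    # with the anchor tags deleted; text outside <pre> is untouched.
    return PRE_BLOCK.sub(lambda m: ANCHOR_TAG.sub("", m.group(0)), html)


if __name__ == "__main__":
    page = (
        '<pre class="brush:xml;">'
        '&lt;<a title="HST" href="1802.htm#o2441">hst</a>&gt;10&lt;/hst&gt;<br>'
        "</pre>"
    )
    print(strip_anchors_in_pre(page))
```

Note that this removes every "a" tag inside a "pre" block, not only the ones pointing to targets like "1802.htm#o2441". If only those should go, the ANCHOR_TAG pattern would have to be narrowed. Links outside "pre" blocks stay as they are.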
