Adds a regular expression that is applied to all markup files before they are parsed and scanned for images, scripts, style sheets and other resources. Every match of the regular expression pattern is replaced by the given format string.
Currently, the patterns are always case sensitive.
For example, if you want to remove all <OBJECT> tags from the resulting web archive MHT, you can use AddPreParsingFilter("<OBJECT.*</OBJECT>", ""). Note that the closing ">" is intentionally omitted from the opening tag in the pattern, so that the match also covers tags that carry attributes.
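The effect of this pattern can be tried outside the component. The following is a minimal sketch that uses Python's re module as a stand-in for the component's regular expression engine, which is an assumption: the two engines may differ in details. The helper name apply_filter and the sample markup are hypothetical, and only the empty format string is used here because the format string syntax described in Appendix D is not reproduced.

```python
import re

# Pattern from the example above. The opening tag has no closing ">",
# so attributes such as classid="..." are covered by ".*".
PATTERN = r"<OBJECT.*</OBJECT>"


def apply_filter(markup: str, pattern: str, replacement: str) -> str:
    """Replace every match of pattern in markup (case sensitive)."""
    return re.sub(pattern, replacement, markup)


# Hypothetical sample markup, one with upper-case and one with lower-case tags.
upper = '<p>a</p><OBJECT classid="x"><PARAM name="n"></OBJECT><p>b</p>'
lower = '<p>a</p><object classid="x"></object><p>b</p>'

print(apply_filter(upper, PATTERN, ""))  # the OBJECT element is removed
print(apply_filter(lower, PATTERN, ""))  # unchanged: the match is case sensitive
```

In Python, ".*" is greedy and does not cross line breaks by default, so two <OBJECT> elements on one line are removed together with everything between them. Whether the component's engine behaves the same way is not stated here and should be checked before relying on it.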
Pre-parsing and post-parsing filters are designed to apply custom changes to all markup files embedded into the web archive MHT or downloaded using the MakeSite method. Pre-parsing filters may also affect the result of the parsing. For example, if your pre-parsing filter removes <script> tags, then dynamically loaded images will not be embedded and there will be no scripts in the final web archive MHT. If the same pattern and format string are used as a post-parsing filter, the scripts will still be removed from the final web archive MHT, but the images will be embedded.
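The difference in ordering can be illustrated with a toy pipeline. This is only a sketch under stated assumptions, not the component's implementation: build_archive, scan_resources and the sample page are hypothetical, and the "parser" here simply searches the text for image file names instead of handling scripts the way the component does.

```python
import re

SCRIPT = r"<script.*</script>"  # hypothetical filter pattern


def scan_resources(markup):
    """Toy stand-in for parsing: collect image file names found in the text."""
    return re.findall(r"[\w./-]+\.(?:png|gif|jpg)", markup)


def build_archive(markup, pre=None, post=None):
    """Apply an optional pre-filter, scan for resources, then apply a post-filter.

    pre and post are (pattern, replacement) pairs.
    """
    if pre:
        markup = re.sub(pre[0], pre[1], markup)
    resources = scan_resources(markup)
    if post:
        markup = re.sub(post[0], post[1], markup)
    return markup, resources


page = '<img src="logo.png"><script>load("late.gif")</script>'

# Pre-parsing: the script is gone before scanning, so late.gif is never found.
pre_markup, pre_res = build_archive(page, pre=(SCRIPT, ""))

# Post-parsing: scanning sees the script first, so late.gif is still collected.
post_markup, post_res = build_archive(page, post=(SCRIPT, ""))

print(pre_markup, pre_res)
print(post_markup, post_res)
```

Both runs produce the same final markup without the script. Only the list of collected resources differs, which mirrors the behaviour described above.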
Performance note: Complex regular expressions may degrade performance.
More information:
(1) For more information about regular expressions please read: http://www.codeproject.com/string/re.asp.
(2) For more information about format strings please read Appendix D.