This is a list of my most effective Proxomitron filters. Together, they kill almost any form of advertising and neutralize many other common web annoyances. I've described what each filter is intended to do and given technical explanations for some of the more complicated ones. My active set of filters includes most of these plus a few others, and I see zero advertising with no noticeable decrease in page loading speed even on a fairly fast cable connection. If you notice ads slipping through or pages taking a long time to load while Proxomitron is at 100% CPU usage, give these a shot.
Send a fake Referer (yes, it's misspelled, don't ask me why) header of the form http://currenthost/ with every browser request. I have referrer turned off at the browser level already, so this is only for the benefit of server-side scripts that foolishly think the Referer header means anything.
    In = FALSE
    Out = TRUE
    Key = "Referer: Fake referrer info (Out)"
    Replace = "http://\h/"
This filter illustrates the use of an empty matching expression in order to add a new header. Proxomitron will create the header if it is not present or replace it if it is. In this case the replacement uses \h to construct the fake referrer information. If you want to use the Randomize referrer info filter as well, add the line URL = "*/?" to this one.
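As a rough illustration of what the replacement does, here is a small JavaScript sketch that builds the same http://currenthost/ value from a request URL. The function name fakeReferer is made up for this example and is not part of Proxomitron. It assumes that \h expands to the host name of the URL being requested, without any port number.

```javascript
// Hypothetical helper: mimic Replace = "http://\h/" for a given request URL.
// Assumption: \h stands for the host name of the requested URL.
function fakeReferer(requestUrl) {
  const host = new URL(requestUrl).hostname;
  return "http://" + host + "/";
}

console.log(fakeReferer("http://www.example.com/news/story.html?id=7"));
// http://www.example.com/
```

Whatever page you ask for, the server only ever sees its own front page as the referrer.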
These three filters block most cookies I don't want. This doesn't do much for me except reduce the number of times I have to click Refuse when I have cookies set to Prompt instead of my usual Disable. The first blocks any attempt to set a cookie that persists after the browser is closed. I figure there is no reason for any site to set a persistent cookie, other than those I log into with a user/password, and so I just kill 'em all. The second filter targets cookies with the name "ASPSESSIONID". Many servers running Microsoft software set this by default and most never use it for anything, so it's just a nuisance. The third filter I can't take credit for. It was created by someone on the Proxomitron mailing list, but it was so clever I thought I'd include it here. It forbids setting cookies on anything but HTML pages. This simultaneously cripples most third-party tracking and stops you from getting bombarded with prompts from servers that attempt to set the same cookie dozens of times.
In general, I think you're better off using settings within the browser to filter cookies rather than relying on proxy filters, because there are several ways to get cookies in and out, each requiring a different approach to filter externally. If your browser doesn't give you enough control over cookies (such as deleting them all on exit, persistent or not) then I say it's defective. Replace it with one that does and be happier. :)
    In = TRUE
    Out = FALSE
    Key = "Set-Cookie: Kill persistent cookies (In)"
    Match = "*; expires=*"

    In = TRUE
    Out = FALSE
    Key = "Set-Cookie: Never accept ASPSESSIONID cookies (In)"
    Match = "*ASPSESSIONID*"

    In = TRUE
    Out = FALSE
    Key = "Set-Cookie: Never accept bugged cookies (In)"
    Match = "$IHDR(Content-Type:(^*html*))*"
All three filters look at the Set-Cookie header, which is the most common way for a web site to request that a cookie be stored. Here is a typical example of this from the Google search engine. I've split it into multiple lines for display purposes, but it would normally be a single long line.
    Set-Cookie: PREF=ID=51495baf28a1278f:TM=992791317:LM=992791317;
        domain=.google.com; path=/;
        expires=Sun, 17-Jan-2038 19:14:07 GMT
The first line is the name and value of the cookie, which in this case looks like an ID number and two timestamps (in number of seconds since midnight Jan 1, 1970). The second tells the browser when it should send this cookie back in the Cookie header. This cookie is set for any URLs of the form [^/]++.google.com/*, to use Proxomitron's syntax. Finally, the last line tells the browser that it should remember the cookie for the next 36 years. That seems a little excessive, doesn't it? I highly doubt my hard drive will even last that long.
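To check the arithmetic, this short JavaScript sketch decodes the TM value and measures the distance to the expiry date. It assumes that TM really is the time the cookie was set, which is only my reading of the value.

```javascript
// Decode the TM value from the example cookie and compare it with the expiry date.
const setAt = new Date(992791317 * 1000);                    // seconds -> milliseconds
const expires = new Date(Date.UTC(2038, 0, 17, 19, 14, 7));  // 17-Jan-2038 19:14:07 GMT

const msPerYear = 365.25 * 24 * 60 * 60 * 1000;
const lifetimeYears = (expires - setAt) / msPerYear;

console.log(setAt.toISOString());       // 2001-06-17T15:21:57.000Z
console.log(lifetimeYears.toFixed(1));  // 36.6
```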
The "Kill persistent cookies" filter looks for that ; expires= string. There is no replacement text, which means that if a match is successful, Proxomitron will delete the header entirely (rather than passing a blank Set-Cookie header on to the browser).
"Never accept ASPSESSIONID cookies" is even simpler. Same story, just look for the string ASPSESSIONID instead and kill the entire line if found.
The final one, "Never accept bugged cookies" is a little different. It doesn't care about the contents of Set-Cookie itself, but uses Proxomitron's $IHDR function to look at the incoming Content-Type header instead. If it matches (^*html*), that is, does not contain "html", the filter deletes the Set-Cookie header by matching its contents against * and replacing them with nothing.
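Put together, the three rules amount to something like the JavaScript sketch below. This is only a restatement of the logic, not how Proxomitron works internally. The function name is invented, matching is assumed to be case-insensitive, and a missing Content-Type header is assumed to count as "not HTML".

```javascript
// Sketch of the three Set-Cookie rules. Returns true when the header should be dropped.
// Assumptions: case-insensitive matching; an absent Content-Type is treated as non-HTML.
function shouldDropSetCookie(setCookieValue, contentType) {
  const value = setCookieValue.toLowerCase();
  const type = (contentType || "").toLowerCase();

  if (value.includes("; expires=")) return true;   // "*; expires=*"
  if (value.includes("aspsessionid")) return true; // "*ASPSESSIONID*"
  if (!type.includes("html")) return true;         // $IHDR(Content-Type:(^*html*))
  return false;
}

console.log(shouldDropSetCookie("session=abc; path=/", "text/html")); // false
console.log(shouldDropSetCookie("session=abc; path=/", "image/gif")); // true
```

Only a session cookie set by an HTML page survives all three rules.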
Some servers are misconfigured and send the wrong Content-Type header. This violates the HTTP standards and can cause problems ranging from images not displaying, to binary files being rendered as HTML instead of prompting you to save them, to Proxomitron not filtering web pages it should. The following filter fixes the Content-Type header of common file types according to the file extension in the URL. You will need to install the MIME-List blockfile.
    In = TRUE
    Out = FALSE
    Key = "Content-Type: Fix MIME types (In)"
    Match = "(text/(^html)|*unknown|(^?))$URL(http(s|)://[^/]+*.([a-z0-9]+{2,5}&&$LST(MIME-List))((^?)|\?))"
    Replace = "\0"
The filter first checks the original Content-Type header. The most common "wrong" values are text/plain, unknown, or no header at all. To reduce the chance of false positives, the filter activates only if it matches one of these. Then it looks for the last . in the URL and matches what follows against MIME-List. The list stores the corrected Content-Type in the variable \0, which is used as the replacement text.
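The same decision can be sketched in JavaScript as follows. The MIME_LIST map is a hypothetical stand-in for the MIME-List blockfile with a handful of entries I picked for the example, and fixContentType is an invented name. The header test only approximates the matching expression.

```javascript
// Hypothetical stand-in for the MIME-List blockfile: extension -> Content-Type.
const MIME_LIST = new Map([
  ["gif", "image/gif"],
  ["jpg", "image/jpeg"],
  ["png", "image/png"],
  ["css", "text/css"],
  ["zip", "application/zip"],
]);

// Returns the corrected Content-Type, or the original value if the filter would not fire.
function fixContentType(contentType, url) {
  const type = (contentType || "").trim().toLowerCase();
  // Fire only on a suspicious header: empty, ending in "unknown", or text/* other than HTML.
  const suspicious =
    type === "" ||
    type.endsWith("unknown") ||
    (type.startsWith("text/") && !type.startsWith("text/html"));
  if (!suspicious) return contentType;

  // Take the extension after the last "." in the path, ignoring any query string.
  const path = new URL(url).pathname;
  const match = /\.([a-z0-9]{2,5})$/i.exec(path);
  if (!match) return contentType;
  return MIME_LIST.get(match[1].toLowerCase()) || contentType;
}

console.log(fixContentType("text/plain", "http://example.com/pics/logo.gif")); // image/gif
```

Checking the header first is what keeps a correctly labeled text/html page at a URL ending in .gif from being touched.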
A three-part filter that blocks almost any Javascript ad. Essentially kills any script that refers to an ad site (as listed in AdList). For AdList itself, see my blocklists. With such a long list, this is an expensive filter CPU-wise, but worth it, as it makes Javascript almost bearable.
    Name = "Kill ad scripts (part 1 - <SCRIPT>)"
    Active = TRUE
    Multi = TRUE
    URL = "$TYPE(htm)"
    Bounds = "<script*</script>"
    Limit = 10240
    Match = "*://(\w&&$LST(AdList)*)*"
    Replace = "<span class=prox kill=Script detail=\9></span>"

    Name = "Kill ad scripts (part 2 - <NOSCRIPT>)"
    Active = TRUE
    URL = "$TYPE(htm)"
    Limit = 2048
    Match = "<span class=prox kill=Script\1</span>"
            "(\2&&(^*<(no|)script)*)<noscript*</noscript>"
    Replace = "<span class=prox kill=Script+\1</span>\2"

    Name = "Kill ad scripts (part 3 - external files)"
    Active = TRUE
    URL = "$TYPE(js)"
    Limit = 1024
    Match = "://$LST(AdList)"
    Replace = "Script from \9 killed.\k"
This set of filters illustrates the use of Proxomitron's blocklist feature. Let's examine them one at a time. The first one uses the $TYPE function introduced in Beta Five to restrict the filter to only HTML pages. (Equivalently, we could also use $IHDR(Content-Type:*html*).) This is mainly for efficiency, since the filter doesn't apply to Javascript or CSS files. The bounds match looks for any script element and ensures that we won't accidentally match <script> ... </script> ... <script> ... </script>. I set a byte limit of 10kb because some scripts can be quite long. This does increase the amount of processing required, so you can lower it if you want.
The matching expression is very simple: just look for :// (the start of a URL) and then try to find anything in AdList. Putting the literal :// before the call to AdList greatly improves the speed of these filters, but means that it might miss some relative URLs like src="/ads/buyme?abcdef1234". Such URLs will still be blocked by the URL-Killer filter anyway, so it hasn't been a problem. If there is a match in AdList, we save that part of the text in the variable \1. Yep, list expressions can be parenthesized and stored just like any others. The final * will match anything up to the end that wasn't accounted for by AdList. Remember that when using bounds, the matching expression must match all of the text that the bounds expression does.
Now on to the replacement text. Since the idea is to kill the entire script, the replacement expression uses very little from the input text. My version of AdList stores the keyword that triggered the match in the variable \9. I replace the script element with a span element to show that something was killed. See this explanation of why I prefer this to a replacement like <font size=1>[ad killed]</font>.
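For readers more at home in Javascript than in Proxomitron's matching language, the first filter behaves roughly like the sketch below. AD_HOSTS is a tiny made-up stand-in for AdList, killAdScripts is an invented name, and the byte limit is not modeled.

```javascript
// Hypothetical, tiny stand-in for AdList; the real blocklist is much longer.
const AD_HOSTS = ["ads.example.net", "banners.example.org"];

// Replace each <script>...</script> that mentions "://" + an ad host with a marker span.
// The non-greedy [\s\S]*? plays the role of the bounds: it stops at the first </script>.
function killAdScripts(html) {
  return html.replace(/<script\b[\s\S]*?<\/script>/gi, (script) => {
    const lower = script.toLowerCase();
    const hit = AD_HOSTS.find((host) => lower.includes("://" + host));
    return hit ? `<span class=prox kill=Script detail=${hit}></span>` : script;
  });
}

const page =
  '<p>Hi</p><script src="http://ads.example.net/show.js"></script>' +
  "<script>var x = 1;</script>";
console.log(killAdScripts(page));
```

The harmless second script passes through untouched, while the first is replaced by a span naming the host that triggered the match.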
Most ad scripts come with their own noscript section so even those with Javascript turned off can't escape. Since I turn Javascript on and off on a whim, I want to make sure I'm protected either way. The second filter in the set looks for a previously killed script tag (already replaced with a span element) and kills the next noscript tag, provided there are no other script tags in between. In order to match text previously replaced, the first filter must have "Allow multiple matches" checked. This filter doesn't need a bounds match; it just looks for the replacement text of the previous filter (<span ...) followed by a noscript section, and then lops that section off.
There's a subtlety here in the use of the && and ^ operators. Each subexpression of && must not only match, but also consume the same amount of text. Since the negative expression ^*<(no|)script does not consume anything, we need to append an extra * outside of the negation, but still within the &&. Thanks to Scott (the Proxomitron author) for explaining this to me. I hope I've reproduced his explanation adequately.
The last filter handles scripts that are linked in externally rather than embedded in HTML. Hence we restrict the filter to Javascript files. It merely scans the whole file (no bounds match needed) for :// followed by something in AdList. If it finds anything, it kills the connection with Proxomitron's \k command. This may cause syntax errors by truncating Javascript files mid-statement, but this shouldn't matter unless you have Javascript error reporting turned on in your browser.
This filter shows a very powerful technique that allows you to override common Javascript functions like window.open. To use this, you will need to place files start.js and end.js in your Proxomitron\html directory. (Important: Make sure you bypass Proxomitron when downloading these two files.) The files can contain any code you want. I was feeling particularly evil when I wrote mine, setting the screen resolution to a random number for example. I've devoted a separate page to this filter since it has such great potential.
The nice thing about this filter is that it puts the full power of Javascript at your disposal for once, whereas it's usually just a weapon available only to the page author. It even handles pages that are "encrypted" with Javascript. Fight fire with fire I say. The disadvantage is that it is doing a lot of things to the Javascript engine that browser developers never expected. As a result some of the tricks I'd like to do either don't work in some browsers or cause them to crash outright. Much of start.js is disabled for Internet Explorer for this reason.
The possibilities for start.js and end.js are endless, but in order to prevent breaking the scripts you do want to keep, you should name any global variables or functions with PROX or something equally unlikely to be used within a script. If your start.js defines a variable i and an existing script also uses that name, things could get screwy.
    Name = "Tame Javascript (part 1)"
    Active = TRUE
    URL = "$TYPE(htm)"
    Limit = 256
    Match = "<start>"
    Replace = "<!--//--><script type="text/javascript" "
              "src="http://bweb..local.ptron/start.js"></script>\r\n"

    Name = "Tame Javascript (part 2)"
    Active = TRUE
    URL = "$TYPE(htm)"
    Limit = 256
    Match = "<end>"
    Replace = "\r\n<!--//--><script type="text/javascript" "
              "src="http://bweb..local.ptron/end.js"></script>"
Both of these filters are fairly simple, actually. The first calls start.js at the beginning of every HTML file and the second calls end.js at the end. The real complexity is in the two external Javascript files. Again, see the Proxomitron and Javascript page for more details. I used external files rather than embedding the code within the filters themselves to make editing easier as both files are quite long (around 300 lines).
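To give a flavor of what such a file can contain, here is a minimal sketch of overriding window.open. It is not taken from my start.js. Every name ending in PROX is invented for this example, and a fake window object stands in for the real one so the code also runs outside a browser.

```javascript
// Sketch of a start.js-style override. All names ending in PROX are made up here.
function tameWindowPROX(win) {
  const originalOpenPROX = win.open; // keep a reference to the real function
  win.blockedPopupsPROX = [];        // PROX suffix avoids clashing with page scripts

  win.open = function (url, name, features) {
    // Record the attempt instead of opening a window, and hand back a harmless stub
    // so that page code calling focus() or close() on the result does not fail.
    win.blockedPopupsPROX.push(String(url));
    return { closed: true, focus() {}, close() {} };
  };
  return originalOpenPROX;           // the caller may restore it later
}

// Stand-in for the browser's window object.
const fakeWindow = { open: () => "real popup" };
tameWindowPROX(fakeWindow);
fakeWindow.open("http://ads.example.net/popup.html", "ad", "width=300");
console.log(fakeWindow.blockedPopupsPROX);
```

In a real page you would pass window itself, and because the file is loaded first, every later script sees the replacement.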
The next set of filters serves as a very effective ad blocker. Together they kill frames, iframes, ilayers, objects, embeds, applets, links, forms, and images that refer to an ad site. Get AdList and AdKeys from the blocklist page.
    Name = "Kill ad containers"
    Active = TRUE
    Bounds = "<frame\s*>|<iframe\s*</iframe>|<ilayer\s*</ilayer>|<object\s*</object>|<embed\s*>|<applet\s*</applet>"
    Limit = 2048
    Match = "<(\w)\1*src=$AV(*://$LST(AdList)*)*"
    Replace = "<span class=prox kill=\1 detail=\9></span>"

    Name = "Kill ad links"
    Active = TRUE
    Bounds = "<a\s*</a>"
    Limit = 1024
    Match = "*href=$AV($LST(AdKeys)*)*"
    Replace = "<span class=prox kill=Link detail=\9></span>"

    Name = "Kill ad forms"
    Active = TRUE
    Bounds = "<form\s*</form>"
    Limit = 2048
    Match = "*action=$AV(*://$LST(AdList)*)*"
    Replace = "<span class=prox kill=Form detail=\9></span>"

    Name = "Kill ad images"
    Active = TRUE
    Bounds = "<i(m(g|age)|nput)\s*>"
    Limit = 256
    Match = "<(\w)\1*src=$AV($LST(AdKeys)*)*"
    Replace = "<span class=prox kill=\1 detail=\9></span>"
If you understood my long-winded explanation of the "Kill ad scripts" filters, these should be nothing new. Instead of script and noscript, we target frame, iframe, ilayer, object, embed, applet, a (links), form, img, and the non-standard image. Note the ordering here. "Kill ad links" comes before "Kill ad images" because most banner ads are in fact links that just so happen to contain an image. If we kill the entire link first, we'll never even have to look at the img tag, so performance improves. Similarly, "Kill ad containers" comes before the others because an iframe often contains a normal banner ad as fallback content for browsers that don't support iframe.
This filter disables many of the Javascript events that advertisers like to use to trigger more content to be thrown at you. It handles both body tags and code within scripts, like document.onunload = annoy_user;. Note that the "Tame Javascript" filter above also does this within end.js.
    Name = "Stop most Javascript event handlers"
    Active = FALSE
    Limit = 64
    Match = "on(load|unload|abort|focus|blur|resize|move|keypress|"
            "keydown|keyup|select|mousedown|mouseup|mousemove)\1="
    Replace = "PROXon\1="
I could have made this filter better by restricting the replacement to script elements and body tags. However this would have made it much more complicated, probably requiring several filters to do the same job, analogous to the "Kill ad script" filters. So I went with this simpler, albeit incorrect, version instead. It probably won't matter unless you are reading a page about Javascript event handlers. You often have to make cost-benefit judgements like this when writing your own filters. Keep in mind that it is impossible for any page filter to be 100% correct, a topic I may delve further into on a separate page at some point.
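The trade-off is easy to see in a Javascript approximation of the filter. The function name is invented and the regular expression only mirrors the matching expression loosely, but it shows both the intended effect and the false positive on ordinary text.

```javascript
// Rough regular-expression equivalent of the matching expression.
const EVENT_NAMES =
  "load|unload|abort|focus|blur|resize|move|keypress|" +
  "keydown|keyup|select|mousedown|mouseup|mousemove";

// Rename every "onEVENT=" so the browser no longer recognizes the handler.
function stopEventHandlers(text) {
  const pattern = new RegExp("on(" + EVENT_NAMES + ")=", "gi");
  return text.replace(pattern, "PROXon$1=");
}

console.log(stopEventHandlers('<body onload="popup()">'));
// <body PROXonload="popup()">
console.log(stopEventHandlers("<p>Set onload=init in the body tag.</p>"));
// <p>Set PROXonload=init in the body tag.</p>
```

The second call is the "incorrect" case: plain page text that merely talks about onload= gets rewritten too.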
This is an improved version of the standard "Stop Javascript Redirects" filter. The intent is to stop pages that attempt to force you back into their own frameset even if you manually load the URL for one of the frames. This is normally done via Javascript like if (self == top) location.replace('/frameset.html');. It looks for other ways of setting the current URL and tries to preserve the functionality of the page by displaying the redirect as a normal link rather than killing it altogether.
If a page needs the redirect to work properly, you can still follow it manually, similar to the "Anti-Auto-Refresher" filter. This filter might cause problems on pages that do the redirect after the page is fully loaded (such as during an OnChange event for a drop-down box--another cheap trick I wish authors would quit using). The reason is that the filter relies on document.write, which may not work after the page has been rendered.
    Name = "Convert Javascript redirects to links (simple)"
    Active = FALSE
    Limit = 256
    Match = "[dpstw]&(((window|document|this|self|top|parent) . )+{1,*} )"
            "location ((. href|)=([^=]*)\1|. replace "
            "$NEST(\(,\1,\))) ([;}])\2"
    Replace = "document.write('<a class=proxlink target=_top "
              "href=' + unescape('%22') + (\1) + unescape('%22') "
              "+ '>[Redirect]</a>')\2"
If you are using "Tame Javascript", here is a smarter version of the filter that allows automatic redirects after the page loads.
    Name = "Convert Javascript redirects to links"
    Active = TRUE
    Limit = 256
    Match = "[dpstw]&(((window|document|this|self|top|parent) . )+{1,*} )\1"
            "location (. href|)=([^=]*)\2 ([;}])\3"
    Replace = "setLocationPROX(\2)\3"
This one's a bit complicated. The first thing to notice is the [dpstw]& at the beginning of the match. This is only for performance reasons. The real matching expression looks for sequences like window.parent. followed by location, but this match is expensive without a fixed starting point. To speed things up, note that any successful match must start with the letters d, p, s, t, or w. The extra & condition at the beginning checks for exactly that before doing the more difficult full match. While it creates more work for a successful match, unsuccessful ones are much faster. Unsuccessful matches also happen to be by far more common, so the result is a near doubling in overall performance of this filter.
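The same trick works in any language. The sketch below is a Javascript analogy, not a description of Proxomitron's internals: a cheap first-letter test rejects most positions before the expensive pattern is tried. The pattern is a simplified version of the real matching expression and the names are invented.

```javascript
// Expensive pattern, anchored at the candidate position: one or more "window." style
// prefixes followed by "location".
const FULL_PATTERN =
  /^(?:(?:window|document|this|self|top|parent)\s*\.\s*)+location\b/;
const FIRST_LETTERS = new Set(["d", "p", "s", "t", "w"]);

// Scan every position, but only run the expensive pattern where the first letter fits.
function scan(text) {
  const hits = [];
  let fullChecks = 0;
  for (let i = 0; i < text.length; i++) {
    if (!FIRST_LETTERS.has(text[i])) continue; // cheap rejection, the common case
    fullChecks++;
    if (FULL_PATTERN.test(text.slice(i))) hits.push(i);
  }
  return { hits, fullChecks };
}

const result = scan("if (self == top) window.location = '/frameset.html';");
console.log(result.hits, result.fullChecks);
```

Only the positions that start with one of the five letters ever reach the full pattern, which is where the savings come from.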
On to the next part. We need to account for both location = and location.replace(), but not location ==. The second and third lines of the matching expression do this, using $NEST to match parentheses in the replace case. Either way, the new destination is stored in \2.
The filter must replace one Javascript statement with another in order to preserve the syntax of the script. The matching expression absorbs everything up to and including the ; or } of the original statement and replaces it with a new compound statement in braces. If the page is in the process of loading, use document.write to write out a link with \2 as its destination. Otherwise, call the original statement and go directly to the new URL. I was careful not to use double-quotes here because the entire script might itself be within double-quotes as in this example:
Case | Code |
---|---|
Input | <input type=button onclick="window.location = 'next_page.html';"> |
Wrong | <input type=button onclick="{ if (pageIsLoadingPROX) document.write('<a class=proxlink target=_top href="' + ('next_page.html') + '">[Redirect]<\/a>'); else window.location = ('next_page.html'); }"> |
Better | <input type=button onclick="{ if (pageIsLoadingPROX) document.write('<a class=proxlink target=_top href=' + unescape('%22') + ('next_page.html') + unescape('%22') + '>[Redirect]<\/a>'); else window.location = ('next_page.html'); }"> |
Using unescape('%22') instead of " avoids the problem of the extra double-quote closing the onclick= attribute prematurely. It's still not perfect, but it does improve the reliability of the filter somewhat.
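A helper in the spirit of setLocationPROX might look like the sketch below. This is a hypothetical reconstruction for illustration; the real function lives in start.js and may well differ. The page object is a stand-in for the browser so the code can run anywhere, and decodeURIComponent plays the role of unescape.

```javascript
// Hypothetical helper in the spirit of setLocationPROX; the real start.js may differ.
// "page" stands in for the browser objects so the sketch also runs outside a browser.
function makeSetLocationPROX(page) {
  return function setLocationPROX(url) {
    if (page.pageIsLoadingPROX) {
      // decodeURIComponent('%22') yields the quote character without one appearing
      // in the source text, the same idea as unescape('%22') in the filter.
      const quote = decodeURIComponent('%22');
      page.write('<a class=proxlink target=_top href=' + quote + url + quote +
                 '>[Redirect]</a>');
    } else {
      page.navigate(url); // page already rendered: follow the redirect directly
    }
  };
}

// Minimal fake page that records what happened.
const page = {
  pageIsLoadingPROX: true,
  written: [],
  visited: [],
  write(html) { this.written.push(html); },
  navigate(url) { this.visited.push(url); },
};
const setLocationPROX = makeSetLocationPROX(page);
setLocationPROX('next_page.html'); // while loading: writes a link
page.pageIsLoadingPROX = false;
setLocationPROX('next_page.html'); // after loading: navigates
console.log(page.written, page.visited);
```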
I hate that Javascript-created windows usually don't have scrollbars or an address bar so you can't even tell what site you're looking at anymore. This filter gives popups the same controls as normal windows. It also prevents popups from displaying offscreen or with small sizes. "Tame Javascript" includes this one as well.
    Name = "Ignore scripts' settings for new windows"
    Active = TRUE
    Multi = TRUE
    Limit = 512
    Match = ". open \( \w\1, \w\2, *\)"
    Replace = ".open(\1, \2, 'toolbar=1, scrollbars=1, location=1, "
              "status=1, menubar=1, resizable=1, width=500, height=500')"
This filter prevents Javascript from modifying the window status bar. There's no legitimate reason for any web page to have this ability as far as I'm concerned. 'Nuf said. "Tame Javascript" handles this too.
    Name = "Stop Javascript status bar scrollers"
    Active = TRUE
    Limit = 64
    Match = ".(default|)\1status ="
    Replace = ".\1statusPROX ="
Various HTTP headers can control whether your browser caches pages or not. Each of these may also be expressed within the page itself as a meta tag. While there may be a few cases where it makes sense not to cache a page, most often this is abused by page authors to inflate their site statistics and force you to load a new set of banner ads every time you look at their page. This filter kills these meta tags to prevent unnecessary reloads.
    Name = "Kill cache-breaking META tags"
    Active = TRUE
    Bounds = "<meta\s*>"
    Limit = 256
    Match = "*(expires|pragma|last-modified|cache|store)*"
I don't need this filter anymore thanks to Opera's "Minimum font size" setting, but many have asked for a way to increase the small fonts used on many pages. This filter has two parts, one for font tags and one for CSS.
    Name = "Increase small font sizes (<FONT>)"
    Active = FALSE
    Bounds = "<font*>"
    Limit = 256
    Match = "\1size=$AV([#0-2])\2"
    Replace = "\1size="3"\2"

    Name = "Increase small font sizes (CSS)"
    Active = FALSE
    Bounds = "{*}"
    Limit = 256
    Match = "\1font-size : ([#0-15]px|[#0-11]pt|x-small|xx-small)*\2"
    Replace = "\1font-size: 12pt\2"
For some reason, authors very often set their page background to a blinding white. I guess they feel the need to leave some sort of impression on their readers, even if it is just an afterimage burned into their retinas. This filter dims these backgrounds without altering their color so much that other colors on the page clash with it.
    Name = "Dim white backgrounds"
    Active = TRUE
    Limit = 32
    Match = "(b(gcolor=|ackground(-color|) : ))\1"
            "$AV((white|((#|)(([ef][0-9a-f])+{3}|[ef]+{3})))(;|)\2)"
    Replace = "\1#E0E0E0\2"
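The color test in the matching expression is compact, so here is what it accepts, written as a Javascript sketch. The function name is made up, and the regular expression covers only the color value itself, not the surrounding attribute or the optional trailing semicolon.

```javascript
// "white", or a hex color whose red, green and blue parts each start with E or F,
// in either six-digit (#FEF8E0) or three-digit (#FFF) form. The "#" is optional.
const NEAR_WHITE = /^(?:white|#?(?:(?:[ef][0-9a-f]){3}|[ef]{3}))$/i;

// Returns the dimmed color for near-white values and leaves everything else alone.
function dimIfNearWhite(color) {
  return NEAR_WHITE.test(color.trim()) ? "#E0E0E0" : color;
}

console.log(dimIfNearWhite("#FFFFFF")); // #E0E0E0
console.log(dimIfNearWhite("#fef8e0")); // #E0E0E0
console.log(dimIfNearWhite("#ff0000")); // #ff0000
```

Pure red survives because only one of its three components is in the E or F range.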
Many pages have links that automatically open in new windows. Since I prefer to choose when to create new windows on my desktop, I wrote this filter to catch most of them. Opera has an "Allow documents to create windows" setting, but I have to leave it enabled for my Google filter (below) to work. Guess what, "Tame Javascript" has this one covered too. Did I mention that that's a very powerful filter yet? I thought so.
    Name = "Suppress _blank/_new"
    Active = TRUE
    Bounds = "<(a|base)\s*>"
    Limit = 256
    Match = "\1target=$AV((_|)(blank|new))\2"
    Replace = "\1\2"
This blocks any small (less than 10 pixels wide or tall) images on a page. Often these are just spacers, which I think are a sign of poor page design and, in any event, are a waste of a network connection. Although it wasn't the intent, the filter will also block many so-called "web bugs". I don't really believe in blocking web bugs this way because they can be anything external to the page, an iframe, a background image, an external Javascript, or even a stylesheet. Thus I think it is better to block them the same way you block banner ads--by host or URL.
    Name = "Kill small images"
    Active = TRUE
    Bounds = "<im(g|age)\s*>"
    Limit = 256
    Match = "*(width|height)\1=$AV(([#0-10])\2)*"
This filter attempts to work around links of the form http://someadsite.com/track?http://therealsite.com that track where you go and how you got there. Since not every URL of this form is necessarily tracking you, the filter leaves the original link intact, but adds another one in front of it. To save screen space, the new link uses a small image, which you must place in your Proxomitron\html directory. You can feel free to borrow mine or make one of your own. I think it was supposed to be a Proxomitron logo with an arrow through it meaning "Direct". Hey, I've seen worse.
    Name = "Bypass redirects in links"
    Active = TRUE
    Bounds = "<a(rea|)\s*[^\-]>"
    Limit = 512
    Match = "(*(href|onclick))\1=$AV((?(*\?|*")*)\2(http(s|)|ftp)\3"
            "(:|%3a)(/|%2f)+{2}([^&'"]+)\4\5)\6"
    Replace = "<a href="\3://$UESC(\4)" class=proxlink>"
              "<img src="http://local.ptron/direct.gif" "
              "alt=Direct width=16 height=16></a>"
              "\1="\2\3://\4\5"\6"
This one's complicated. I've revised it probably a dozen times by now. The bounds match looks for opening a or area tags. The byte limit of 512 should be enough to handle even those URLs that send everything down to the color of your underwear to the marketers. Fortunately, URLs longer than that are rare--well, almost. Anyway, the best way to explain this filter is by example. We'll use this made-up link:
    <a href="http://someadsite.com/track?user=sheep&url=http://therealsite.com/home.html&product=soma">
The match works as follows:
Sub-expression | Match | Explanation |
---|---|---|
(*(href|onclick))\1= | <a href= | The beginning of the link up to the href or OnClick attribute. |
$AV(...) | "http://someadsite. ... =soma" | The entire quoted attribute value is consumed by $AV, but only the part inside the quotes is passed on to the match inside. |
(?(*\?|*")*)\2 | http://someadsite.com/track?user=sheep&url= | Match at least one character, then any string containing either a question mark or a single or double quote character. Store everything up to the protocol into \2. $AV prevents reading past the end of the href attribute. |
(http(s|)|ftp)\3 | http | The protocol of the real destination, http, https, or ftp. |
(:|%3a)(/|%2f)+{2} | :// | The separator :// either literally or encoded using percent-hex notation. |
([^&'"]+)\4 | therealsite.com/home.html | Absorb the rest of the href attribute up to & or a quotation mark into \4. This is the real destination of the link. Remember that + unlike ++ is a blind match. It will not backtrack and try to consume less if something fails later. This is ok in this case because the remainder of this match expression has no assertions in it, so it can't fail. But keep this distinction in mind when writing filters, and use ++ when in doubt. |
\5 | &product=soma | The rest of the attribute value is stored in \5. |
\6 | > | The remainder of the original a or area tag goes in \6. |
Now for the replacement:
Sub-expression | Replacement | Explanation |
---|---|---|
<a href=" | <a href=" | The beginning of the new, direct link. |
\3:// | http:// | The protocol of the real destination followed by ://. |
$UESC(\4) | therealsite.com/home.html | The path of the new URL. Use $UESC to convert any other percent-hex sequences into their single character equivalents. |
" class=proxlink><img src="http://local.ptron/direct.png" alt=Direct width=16 height=16></a> | " class=proxlink><img src="http://local.ptron/direct.png" alt=Direct width=16 height=16></a> | The end of the new, direct link. The class=proxlink makes it easy to hide the new links when printing. |
\1="\2\3://\4\5"\6 | The original text | Insert the entire original link unmodified. |
So the final replacement text is
    <a href="http://therealsite.com/home.html" class=proxlink>
    <img src="http://local.ptron/direct.png" alt=Direct width=16 height=16></a>
    <a href="http://someadsite.com/track?user=sheep&url=http://therealsite.com/home.html&product=soma">
Piece of cake, right?
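If the tables above are hard to follow, the core of the match can be approximated with an ordinary regular expression. The Javascript sketch below is a loose translation under my own naming. It only extracts the direct URL rather than rewriting the tag, and it uses decodeURIComponent as a rough analogue of $UESC.

```javascript
// Roughly mirrors the inner match: a prefix containing "?" (\2), the protocol (\3),
// "://" written literally or percent-encoded, then the target up to & ' or " (\4).
const EMBEDDED_URL = /^(.+?\?.*?)(https?|ftp)(?::|%3a)(?:\/|%2f){2}([^&'"]+)/i;

// Returns the direct URL hidden inside a tracking link, or null if there is none.
function directUrl(href) {
  const m = EMBEDDED_URL.exec(href);
  if (!m) return null;
  let target = m[3];
  try {
    target = decodeURIComponent(target); // undo %2F and friends
  } catch (e) {
    // Malformed percent-escapes: keep the raw text rather than fail.
  }
  return m[2] + "://" + target;
}

console.log(directUrl(
  "http://someadsite.com/track?user=sheep&url=http://therealsite.com/home.html&product=soma"
));
// http://therealsite.com/home.html
```

A link with no second URL after the question mark returns null, which corresponds to the filter leaving the tag alone.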
Opera version 5.11 was my first experience with the Flash plugin. It didn't take me long to realize that it was indeed as obnoxious and intrusive as I had feared. Rather than disable it completely, I wrote a filter to convert Flash animations embedded within a page into links. If you want to see a particular animation, just click on the link. You can also disable this filter on certain sites by adding them to the AllowFlash list.
    Name = "Convert Flash animations to links"
    Active = TRUE
    URL = "^$LST(AllowFlash)"
    Bounds = "<object\s*</object>|<embed\s*>"
    Limit = 4096
    Match = "<\1\s*(shockwave|macromedia|flash|swf)*&"
            "(*(src=$AVQ(\2)|"
            "*<param name=$AV(movie) value=$AVQ(\2) >)*)"
    Replace = "<a href="\2"><span class=prox kill=\1 "
              "detail=Flash> \2</span></a>"
There are two ways to insert an animation into a page, object and embed. I think object is the standard way and embed is an old Netscape-ism, but I could be wrong about that. In any event, both are still in use, so the bounds check for this filter will allow either. I used a 4kb byte limit because by the time you add in all the parameters, the code can be pretty long. The first thing to look for is the telltale signs of Flash: Keywords like shockwave, macromedia, flash, or the extension swf are a dead giveaway. If we just wanted to kill the animation and not worry about linking to it, we could stop here and replace with the empty string.
Once we're sure we're dealing with Flash, we need to capture the URL of the movie so we can use it in our link later. This is trickier than it sounds because there are usually other URLs around, e.g. the location of the Flash plugin itself in case you don't have it. In addition, the object and embed forms put the movie URL in a different place. So look for either src=... or <param name=movie value=...> anywhere within the bounds. Capture this string into \2.
Now the replacement text will be a link with destination \2. Inside the link is my usual span trick, which allows me to hide the link altogether if I want. If you want the animation to always open in a separate window you could add target=_blank to the link, but I find it's easy enough to Shift-Click or whatever instead.
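The two-step logic (first confirm it is Flash, then dig out the movie URL from whichever place it lives) can be sketched in Javascript like this. The function name is invented and the regular expressions are simplified guesses at typical markup, so treat it as an outline rather than a faithful port of the filter.

```javascript
// Returns the movie URL of a Flash <object> or <embed>, or null if it is not Flash.
function flashMovieUrl(element) {
  // Step 1: look for the telltale keywords anywhere in the element.
  if (!/shockwave|macromedia|flash|swf/i.test(element)) return null;

  // Step 2a: the embed form keeps the URL in a src attribute.
  const src = /\bsrc\s*=\s*["']?([^"'\s>]+)/i.exec(element);
  if (src) return src[1];

  // Step 2b: the object form keeps it in <param name=movie value=...>.
  const param =
    /<param\s+name\s*=\s*["']?movie["']?\s+value\s*=\s*["']?([^"'\s>]+)/i.exec(element);
  return param ? param[1] : null;
}

console.log(flashMovieUrl(
  '<embed src="intro.swf" type="application/x-shockwave-flash" width=400 height=300>'
));
// intro.swf
```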
If you use the Google search engine a lot like I do, you may find this filter handy. It causes search results from Google to open in a second window. This window gets reused for each search result so it's easy to walk down the list until you find what you're looking for. In addition, different search strings each have their own second window so if you have two searches going at once, they won't interfere with each other.
Some people reported problems getting this filter to work initially. It turns out that Google sends slightly different code to Linux users, and that code wasn't being caught by this filter. I added the (<!--*-->|) to account for this and I think it works now, but if not, let me know.
    Name = "Open Google links in separate window"
    Active = TRUE
    URL = "www.google.com/search\?"
    Limit = 256
    Match = "<p>(<!--*-->|)<a(\s*)\1>&"
            "$URL(http://www.google.com/search\?(*\&|)(as_|)q=([^\&]+)\2)"
    Replace = "<p><a\1 class=proxlink target="google-\2">"
              "[New window]</a>\r\n<a\1>"
Here's a simpler version of the filter that uses the same window for multiple searches.
    Name = "Open Google links in separate window"
    Active = TRUE
    URL = "www.google.com/search\?"
    Limit = 256
    Match = "<p>(<!--*-->|)<a(\s*)\1>"
    Replace = "<p><a\1 class=proxlink target=google>"
              "[New window]</a>\r\n<a\1>"
This is a cached copy of http://www.geocities.com/u82011729/prox/filter.html