Questions and Answers / phpAds
« on: August 09, 2002, 01:22:24 AM »
This might be of interest to some of you. If you want to read about
how some of the web sites incorporate ads into their pages, here is
an article I happened upon for developers using a tool called phpAds:


Site Specific /
« on: July 15, 2002, 06:30:52 AM »
Yahoo has made changes to pages which happen
to make it easy to strip out a few advertisement sections.
This filter removes tables which contain certain key words:
ADVERTISEMENT which conveniently marks a table of ads,
MTFSID which seems to always be part of a comment line inside
the table for the advertisement banner on the page, and
MTFAD:N:750x100 which catches some other banners that don't
have an MTFSID comment.

Name = "Kill advertisement tables"
Active = TRUE
URL = "(story.|)*"
Bounds = "$NEST(<table,</table>)"
Limit = 10000
Replace = "<!-- Yahoo 1 removed -->"
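
The listing above shows no Match line, so here is a rough sketch of one that tests for the three key words described earlier. It assumes a plain alternation wrapped in wildcards is enough; the expression actually used in the filter is not shown in this post, so treat this as a guess at its shape rather than the original:

```
Match = "*(ADVERTISEMENT|MTFSID|MTFAD:N:750x100)*"
```

Because the Bounds line already limits the filter to one nested table at a time, the leading and trailing * simply allow the key word to appear anywhere inside that table.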

So far I haven't seen this remove any legitimate content,
although it would on older-style Yahoo pages such as
and if you used it on them (which is why I've been so
specific with the URL match).
Hopefully they are moving those other pages to this new format!
I'm sure this filter will evolve; I've rewritten it several times
since I started posting this
(and have now edited it once since posting!)

Edited by - pooms on 15 Jul 2002  07:44:12

I seem to be getting a Microsoft JET error if I try to reply to any
posts on the forum:
Microsoft JET Database Engine error '80040e10'

No value given for one or more required parameters.

/pforum/post.asp, line 120

I'm assuming that this isn't just me.


Microsoft Help / MSN Messenger
« on: June 28, 2002, 07:21:58 AM »
If you are using MSN Messenger, you might be interested in this:
I don't use it so I don't know if you can configure it to
"tunnel" through HTTP and use Proxomitron or not. Probably


Security / Windows Media Player Pragma: log-line
« on: June 24, 2002, 06:27:44 PM »
I have Windows Media Player configured to go through Proxomitron and I
recently noticed an HTTP POST that was triggered at the end of a video
clip embedded in an HTML page. This POST contained an HTTP header starting
Pragma: log-line=  
followed by a whole bunch of stuff.
Included in it was the IP address of my computer, as well as the name of
my computer.
Unfortunately I haven't been able to cause this POST to happen again, so I don't
know what it was that triggered it. And in my trying to recreate it, I
forgot to copy and save the header line from the Log Window.  
So I'm not certain if it was "log-line" or "log_line".
I haven't found anything on the net that provides any clue about this
header. For now I've put in a header filter that looks for any occurrence
of my computer name in a Pragma header and calls $ALERT. Hopefully I'll
be able to catch this again and figure out what is going on.
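
A rough sketch of what such a header filter might look like. MYPC is a placeholder for the computer name and the alert text is made up. It assumes $ALERT can be chained onto the match with & in the same way other commands are, and it leaves out the Replace line, so check what happens to the header itself before relying on it:

```
In = FALSE
Out = TRUE
Key = "Pragma: Alert on computer name (Out)"
Match = "(*MYPC*)&$ALERT(Pragma header contains my computer name)"
```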


Privacy / Mail Bug
« on: June 22, 2002, 01:46:35 AM »
I just received an HTML mail message from my ISP which contains this bug:

<IMG src="">

a quick Google search on "flosensing" shows up a number of sites using
this CGI program, so I think it is a good candidate for the kill list.

I'm also not too happy that other links in the email show up like this:

<A class=lightgrey
            Privacy Commitment</A>

So the URL contains an identifier that has a core part in common with the identifier
sent to the mail bug. I'm just guessing, but if that part of the identifier
is unique for each person the mail is sent to, this would
allow them to correlate my email address with the fact that I clicked on
a link. In the above example, ironically, it is a link to their "Privacy

Now I could be wrong, and the identifier is the same for everyone who got
the email, so I'm going to ask a few friends to send me copies they received.


Other / Enabling filtering of text/xml
« on: June 12, 2002, 08:18:44 PM »
It took me a long time to figure out how to apply content filters
to text/xml documents, so in case it saves anyone else the trouble:
you first have to create a header filter for the Content-Type header
that explicitly enables filtering. Something like this:

Key = "Content-Type: Enable text/xml filtering (In)"
Match = "(text/xml)&$FILTER(true)"
Replace = ""

There are a few other xml related content types that I probably want to
enable (eg application/xml, application/soap), so it will probably make
sense to expand this to use a LST of content types to enable.
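
A rough sketch of that list-based version. It assumes a list called XmlTypes (a made-up name) has been added to the configuration's blocklists and points at a text file with one content type per line:

```
Key = "Content-Type: Enable XML filtering (In)"
Match = "$LST(XmlTypes)&$FILTER(true)"
Replace = ""
```

The list file itself might then contain nothing more than the types mentioned above:

```
text/xml
application/xml
application/soap
```

Adding another content type later only means adding a line to the list, with no change to the filter.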


Other / Removing XML prolog statement
« on: June 12, 2002, 08:02:15 PM »
I recently ran into a problem with the following web page:
where I got an error saying "The XML page cannot be displayed"
and "Cannot have a DOCTYPE declaration outside of a prolog".
The problem appears to be that the page starts with an XML prolog statement
that looks like:
<?xml version="1.0" encoding="iso-8859-1"?>
and Proxomitron filters that had matched on <start> had placed stuff before
this line. Apparently MSIE doesn't like it if the XML prolog statement
isn't at the beginning of the page. Since the XML prolog statement is
an optional thing, I ended up removing it from the page in order to
get around the problem:

Name = "Remove XML prolog"
Active = TRUE
URL = "$TYPE(htm)"
Bounds = "$NEST(<?xml, ?>)"
Limit = 256
Match = "*"

Opera doesn't seem to have this problem.


Questions and Answers / 4.3 Crashing and Hanging
« on: June 11, 2002, 03:36:38 PM »
I started using 4.3 last night and have run into a couple of
crashes and one case where it hung and I had to kill it
through the Task Manager.

The first time it died, I wasn't paying attention, so I'm not sure
what the cause was. The second time, though, I figured out what
happened and can repeat it. If you scroll all the way to the bottom
of either the Web Page or Header filters lists, there is a strip
of grey below the very last entry. I was trying to unselect the
last filter entry and clicked a bit too low in the dark grey area
below the checkbox. That killed Proxomitron.

I managed to hang 4.3 twice in the same manner, so this may also
be repeatable. I was switching between configuration files and
had selected File/Load Config File and clicked to open the new Config
file I wanted to load. I hadn't noticed, but there was one active
connection, and I got the warning dialogue with the Retry, Kill
Connections options. It didn't seem like there was any good reason
for there to be an active connection, so I selected the kill
connections option. At that point it completely froze. It wasn't
taking up any CPU time, but there was nothing I could do other
than kill it through the Task Manager.

Other than these problems (which can easily be avoided) I haven't
run into anything else.



The simplified home page is meant to accentuate advertisements

The page's redesign would further accentuate interactive types of advertisements on the site

a test of our proxomitron filter-writing skills

Privacy / block confimax email delivery confirmation
« on: May 21, 2002, 06:44:12 PM »
Recently I've received a few email spams that use the confimax automatic email
delivery confirmation service:
The HTML email contains the following text:
<a href=""><img src="" border="0" alt="Delivery confirmed by"></a>
I've now placed in my kill list, hopefully spammers
using this service will now think that emails to my address never get


Questions and Answers / Configuring Java clients to use HTTP Proxy
« on: April 25, 2002, 06:25:55 PM »
If you are developing your own Java clients that talk to HTTP
servers, or if you are using someone else's Java program
that does this, you can set Java properties to cause it to
use an HTTP Proxy. Simply add the following to the command line:

-DproxySet=true -DproxyHost=localhost -DproxyPort=8080

and your Java client will use Proxomitron.


Cosmetic / Remove Dramatic Transition Effects from Web Pages
« on: April 23, 2002, 02:30:40 AM »
One web site that I visit has this very tacky, stupid looking
transition effect when you leave one of their pages. (Visit to see what I mean). I finally
got around to looking at the HTML to see what they were doing
and it is an IE specific <META> tag. I found the following page that describes
the tacky page transitions that can be done, which makes it easy
to write a rule to remove them:

Here is my rule:

Name = "Remove Transition Effects"
Active = TRUE
Limit = 256
Match = "<meta*http-equiv=$AV((site|page)-(enter|exit))*content=$AV((RevealTrans|BlendTrans)*)*>"
Replace = "<!-- Transition Effect Removed -->"

Edited by - pooms on 23 Apr 2002  05:29:01

Privacy / Remove NewChannel
« on: April 21, 2002, 05:54:47 AM »
NewChannel is a product that tracks web site visitors and analyzes
their movements. If you visit a website that has NewChannel embedded
in their HTML pages, every two seconds or so that you are on that site,
information is sent to their server.

This article describes what NewChannel does:,2997,s%253D400%2526a%253D4992,00.asp

Here's a simple rule to remove the NewChannel tags from pages:
Name = "Remove Newchannel"
Active = TRUE
Bounds = "<NewChannel*</NewChannel>"
Limit = 150
Match = "*"
Replace = "<!-- NewChannel Tags Removed -->"

NewChannel appears to host the service on servers in their own domain, so
I also added [^/] to my AdList to make sure no traffic
goes to their site.

I wrote this rule when I discovered that my ADSL provider is using
NewChannel. I only noticed it because I happened to have Proxomitron's
Log Window open when I visited the site, and noticed the continual
stream of HTTP traffic even though I wasn't doing anything.
To verify that the rule works, you can go to NewChannel as they have it running on their
own site.

Site Specific / Remove MSN from and
« on: April 21, 2002, 05:26:08 AM »
Certain sites such as and
are "wrapped" by MSN stuff, generally a header, a footer and
some additional advertising block on the side.
I wrote the following rule to get rid of the MSN content, based
on the fact that they conveniently inserted HTML comments to mark
the beginning and end of the MSN content.

Name = "Kill MSN Content"
Active = TRUE
Bounds = "<!--( |-)("|BEGIN |)MSN (Header|Footer|SideBar|module)*MSN (Header|Footer|SideBar|module)* -->"
Limit = 10000
Match = "*"
Replace = "<!-- MSN stuff removed -->"


