Topics - altosax

1
Other / new tutorial about the if-then-else structure
« on: August 23, 2002, 12:47:54 AM »
hi friend,
i've written this new tutorial about the use of an implied if-then-else structure in filters. it originated from the discussion i had on this forum with hpguru, sidki, jor, jd and tegghead. but, as always, members who took no part in that discussion have contributed too, because everyone here has learned a lot from their posts, so thank you all.
here is the tutorial. i wrote it in a couple of hours, so i'm sure there are many other things that could be added: let me know your comments and suggestions.

-------------------- start here -----------------------------
***********************************************************
THE PROXOMITRON FILTERS TUTORIAL
How to write filters using an implied if-then-else structure
by altosax

Created August 22, 2002 - Updated August 23, 2002
***********************************************************


INTRODUCTION
This tutorial explains how to write filters for The Proxomitron, by Scott R. Lemmon, that perform an if-then-else check. If you don't understand what I wrote in this tutorial, please read the Proxomitron help first.


UNDERSTAND THE PROBLEM
The Proxomitron matching language is a dedicated language created by Scott R. Lemmon. It uses a subset of the characters and symbols found in regular expressions and lets users write their own filters.
Although it is very powerful and allows the user to write really complex filters, the Proxomitron matching language is not a programming language, so it cannot provide some of the logical structures you find in high-level programming languages.
A useful structure that the user may need while writing a filter is the if-then-else check. With it he can test whether a condition is true and do one thing if it is, or something else if it is not.
The main intent of this tutorial is to show how this can be done using the Proxomitron language and its built-in commands.


SOME EXAMPLES REQUIRING AN IF-THEN-ELSE
The following examples come from real questions posted by some Proxomitron users at http://asp.flaaten.dk/pforum/ that started the discussion about how to write an implied if-then-else structure in filters.

- Example 1
Suppose you have to close the <head> tag when </head> is missing.
To solve this problem you can reason like this:

IF </head> exists
THEN do nothing
ELSE
IF </head> is missing
THEN add it

- Example 2
You want to set the image border to 0 whether or not the border attribute exists.
To solve this problem you can reason like this:

IF the border attribute exists
THEN set it to 0
ELSE
IF the border attribute does not exist
THEN add it

- Example 3
You want to write a simple filter to view the page comments.
To solve this problem you can reason like this:

IF the opening comment tag is encountered
THEN modify it to something else
ELSE
IF the closing comment tag is encountered
THEN modify it to something else


HOW TO WRITE THE IF-THEN-ELSE
The most effective way to write this kind of structure in a filter is to use 2 different matching expressions separated by an OR function, each followed by a SET command.
This table shows the correspondences:

IF = first matching expression
THEN = first SET command
ELSE = |
IF = second matching expression
THEN = second SET command

If you have to test more than one IF case, you can add the required number of else-if-then blocks to the basic structure, as follows:

ELSE = |
IF = new matching expression
THEN = new SET command

After doing this, you simply have to put the variables set in the IF clauses into the replacement text.
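For readers who find a general-purpose language easier to follow, the same structure can be sketched outside The Proxomitron. The following is only a rough Python analogue using the standard re module, not Proxomitron syntax: each alternative of the pattern is an IF, the "|" is the ELSE, and the text chosen for the branch that matched is the THEN. The names BRANCHES and apply_branches are made up for this illustration, and the two branches are borrowed from Example 3.

```python
import re

# Each entry pairs a matching expression (the IF) with the text to emit
# (the THEN). The branch names and the table itself are illustrative only.
BRANCHES = [
    ("open_comment", r"<!--", "<small><!--"),
    ("close_comment", r"-->", "--></small>"),
]

# Joining the branches with "|" gives the ELSE between the IF clauses.
PATTERN = re.compile("|".join(f"(?P<{name}>{expr})" for name, expr, _ in BRANCHES))
REPLACEMENTS = {name: text for name, _, text in BRANCHES}


def apply_branches(html):
    """Replace every match with the text belonging to the branch that matched."""
    def pick(match):
        # lastgroup holds the name of the alternative that actually matched
        return REPLACEMENTS[match.lastgroup]
    return PATTERN.sub(pick, html)


print(apply_branches("a<!-- x -->b"))
```

Adding one more else-if-then is then just one more entry in the table, which mirrors adding one more "|" clause to the Match field.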


SOLVING THE PROPOSED EXAMPLES
Here you can see how easy it is to solve the previous problems using an if-then-else structure.

- Example 1
We have to close the <head> tag when </head> is missing.

IF </head> exists = first matching expression
THEN do nothing = first SET command
ELSE = |
IF </head> is missing = second matching expression
THEN add it = second SET command

Match = "</head>$SET(1=</head>)"
        "|"
        "<body$SET(1=</head><body)"
Replace = "\1"

We can learn from this example that:
a) "do nothing" is equal to "set the variable to what matched";
b) when the code we are searching for is missing, we have to match what follows it and then inject the missing code in front of what matched.

- Example 2
We have to set the image border to 0 whether or not the border attribute exists.

IF the border attribute exists = first matching expression
THEN set it to 0 = first SET command
ELSE = |
IF the border attribute does not exist = second matching expression
THEN add it = second SET command

Bounds = "<im(g|age)\s*>"
Match = "\1 border=$AV(*)\3$SET(2=border="0")"
        "|"
        "\1>$SET(2=border="0">)"
Replace = "\1 \2\3"

This time we had to use Bounds to define the limits within which the filter has to be applied. Also, in the second clause, we have to leave the closing > bracket out of the matching variables, otherwise the added attribute could never be inserted into the tag. This is the same as before: we need to match what follows to inject our code in front of it.
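The same reasoning can be checked outside The Proxomitron with a small Python sketch. It is an approximation under stated assumptions, not a translation of the filter: the IMG_TAG pattern stands in for Bounds, BORDER_ATTR stands in for "border=$AV(*)", attribute values are assumed not to contain ">", and the function name force_zero_border is invented for this example.

```python
import re

# Stand-in for the Bounds field: one whole <img ...> or <image ...> tag.
IMG_TAG = re.compile(r"<im(?:g|age)\b[^>]*>", re.IGNORECASE)

# Stand-in for "border=$AV(*)": the attribute with a quoted or bare value.
BORDER_ATTR = re.compile(
    r"""\sborder\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE
)


def force_zero_border(html):
    """Set border="0" on every image tag, adding the attribute if it is missing."""
    def fix_tag(match):
        tag = match.group(0)
        if BORDER_ATTR.search(tag):
            # IF the border attribute exists THEN set it to 0
            return BORDER_ATTR.sub(' border="0"', tag, count=1)
        # ELSE inject the attribute in front of the closing bracket
        closing = "/>" if tag.endswith("/>") else ">"
        body = tag[: -len(closing)].rstrip()
        return f'{body} border="0"{closing}'
    return IMG_TAG.sub(fix_tag, html)


print(force_zero_border('<img src="a.gif"> <img src="b.gif" border=2 alt=x>'))
```

As in the filter, the ELSE branch works by holding back the closing bracket and writing the new attribute in front of it.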

- Example 3
We have to write a simple filter to view the page comments.

IF the opening comment tag is encountered = first matching expression
THEN modify it to something else = first SET command
ELSE = |
IF the closing comment tag is encountered = second matching expression
THEN modify it to something else = second SET command

Match = "<!--$SET(1=<small><!--)"
        "|"
        "-->$SET(1=--></small>)"
Replace = "\1"

This last filter differs from the first and the second: they check whether something we are looking for is present in the html code or not, while the third filter checks for two different pieces of code. As we'll see in the next paragraph, this kind of filter cannot be simplified, while the first 2 can.


SIMPLIFY THE MATCHING EXPRESSION WHEN POSSIBLE
Using the if-then-else structure helps a lot when writing filters, but afterwards you should always try to tweak them to make them simpler or more effective. So the if-then-else structure can be considered a way to approach the solution, but the result it provides often requires some further modifications to arrive at a definitive version of the filters you wrote.
To better understand what this means, the filters used as examples in this tutorial will be tweaked to show how to reach their final form.

- Example 1

Match = "</head>$SET(1=</head>)"
        "|"
        "<body$SET(1=</head><body)"
Replace = "\1"

The first IF clause matches when </head> is present, but in this case we don't need to set the \1 variable because we can store </head> directly into \1. Also, in the second IF clause, we can store what matched in the \2 variable:

Match = "(</head>)\1"
        "|"
        "(<body)\2$SET(1=</head>)"
Replace = "\1\2"

Because this filter applies only once per page, we can also use the STOP command to exclude it for the rest of the page, so its final version is:

Match = "(</head>)\1$STOP()"
        "|"
        "(<body)\2$SET(1=</head>)"
Replace = "\1\2$STOP()"

- Example 2

Bounds = "<im(g|age)\s*>"
Match = "\1 border=$AV(*)\3$SET(2=border="0")"
        "|"
        "\1>$SET(2=border="0">)"
Replace = "\1 \2\3"

Both IF clauses set the \2 variable to border="0", so we can write it directly in the replacement without setting anything:

Bounds = "<im(g|age)\s*>"
Match = "\1 border=$AV(*)\3>"
        "|"
        "\1>"
Replace = "\1 border="0"\3>"

Because the IF clauses are very similar, we can also merge them using a check for something OR nothing:

Bounds = "<im(g|age)\s*>"
Match = "\1 (border=$AV(*)\3|)>"
Replace = "\1 border="0"\3>"
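The "something OR nothing" check has a direct counterpart in ordinary regular expressions: an optional group. The Python sketch below is only an analogue of the merged filter, with the same caveats as any regex-based HTML handling (attribute values are assumed not to contain ">"). The names ATTR_VALUE, IMG and zero_border_single_pass are invented for this illustration.

```python
import re

# A quoted or bare attribute value, roughly what $AV(*) accepts.
ATTR_VALUE = r"""(?:"[^"]*"|'[^']*'|[^\s>]+)"""

# Group 1: the tag up to the border attribute (or the whole tag if absent).
# The optional group is the "(border=...|)" part; group 2 is what follows it.
IMG = re.compile(
    r"(<im(?:g|age)\b[^>]*?)"
    r"(?:\sborder\s*=\s*" + ATTR_VALUE + r"([^>]*?))?"
    r"\s*>",
    re.IGNORECASE,
)


def zero_border_single_pass(html):
    """One pattern, one replacement: border="0" is written in both cases."""
    def rebuild(match):
        rest = match.group(2) or ""  # empty when there was no border attribute
        return f'{match.group(1)} border="0"{rest}>'
    return IMG.sub(rebuild, html)


print(zero_border_single_pass('<img border=1 src=a> <img src="a.gif">'))
```

Because the replacement always writes border="0", the two IF clauses collapse into one expression, exactly as in the merged filter.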

- Example 3

Match = "<!--$SET(1=<small><!--)"
        "|"
        "-->$SET(1=--></small>)"
Replace = "\1"

In this case we can't do anything to simplify or improve the filter, so this is a case where we really need an if-then-else structure. The same is true every time we have to match different pieces of code and replace each with a different replacement expression.


CONCLUSION
The use of an implied if-then-else structure can be a useful trick for writing filters. It should not be seen as a result but as a way to arrive at a solution: although filters can be written without knowing anything about it, when this trick can be applied it provides an easy way to solve problems in writing a filter.


NOTE
I believe in the Open Source Definition and in the Open Source Community. This tutorial is provided as public domain and copyleft. You can freely distribute and/or modify it but, please, don't remove this note.
------------------------ end here -------------------------

also, i've attached it to this message in a .txt zipped format that you can easily download:

http://uploaded/altosax/2002823191958_tutorial2.zip

<edit>: this is the most recent version, which you can also find in p-faq

regards to all,
altosax.

Edited by - altosax on 01 Sep 2002  19:02:21

2
Community Discussions (Non-Forum Related) / proxomitron compressed
« on: August 18, 2002, 12:48:36 PM »
these days i'm playing with upx, the ultimate packer for executables:

http://upx.sourceforge.net/

and i've compressed proxomitron from its 332800 bytes to 168960 bytes, about half the original size.
the resulting file is self-extracting and works as well as the original one.
this is just an experiment, and i intend to return to the unpacked one in the next few days, but this could be a suggestion for scott to reduce the size of the distributed package. also, upx can compress the dll.

some huge exe files also seem to take less time to load, but i'm not so sure.

that's all,
if you are interested, try it yourself,
altosax.

 

3
Community Discussions (Non-Forum Related) / w2k sp3 and privacy question
« on: August 04, 2002, 11:50:54 AM »
Be sure to read the new EULA/privacy statement for Windows 2000 Service Pack 3. It has an interesting portion about how Windows Update and Automatic Update (which gets installed with SP3) can, once you agree to this license, send the following pieces of info to Microsoft. This was posted on the MS focus list by Javier Sanchez:

"With the latest version of Windows Update (essentially a mandatory download and now part of SP3) you consent to sending the following information to Microsoft:

* Operating-system version number and Product Identification number
* Internet Explorer version number
* Version numbers of other software
* Plug and Play ID numbers of hardware devices

This is stated in the "Windows Update Privacy Statement" which you can read at <http://v4.windowsupdate.microsoft.com/en/about.asp?>  You can also follow the "About Windows Update" link off the WindowsUpdate page. Don't bother trying to right-click, they've made sure to disable that."

altosax.

 

4
Spam Blockers / Columbus' egg surrounded by comments
« on: July 28, 2002, 02:40:31 PM »
Columbus' egg surrounded by comments

afterwards, it seemed so simple.

the reason i wrote a new filter some time ago to kill comments-surrounded ads was to avoid scanning a huge list every time <!-- was found in a web page.

as all of you have read in the previous three threads (jor/jd's, mine and sidki's, all posted in the spam blockers section), there are different approaches to kill this kind of nosey stuff. but i haven't abandoned my initial idea, so the solution i propose now is really simple: use two different filters, mine and sidki's, working in conjunction.

you will understand the idea better by reading the filters:

Name = "Kill Comments-surrounded Ads [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 12000
Match = "<!-- (auto|begin|start) $LST(AdComments)"
Replace = "<span style=display:none;>[Killed Comments-surrounded Ads]</span>"

Name = "Remove Comment-Block Ads [sidki]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 12000
Match = "<!--[^>]++{0,30}$LST(AdCommentPairs)"
Replace = "<span style=display:none;>[Killed Comments-surrounded Ads]</span>"

when an opening <!-- is found, my filter is tried first, but it calls the AdComments.txt list ONLY if auto, begin or start match. so the list is scanned ONLY when one of them matches, otherwise it is skipped. this requires checking only 3 keywords and not the huge AdComments.txt.

when a comment not starting with auto, begin or start is found, the AdCommentPairs.txt list is scanned, but the second list contains no comments duplicated from the first, so using two different lists dramatically reduces the number of lines to scan.

the only case where both lists are scanned is when a comment starting with auto, begin or start is found and it is not contained in the first list. but if you find a new comment, simply add it to the right list. on the other hand these two lists, if merged, are equal to the one used now by sidki's filter, so this apparent problem is caused ONLY by the missing entry.

also, i use a trick to skip both lists on false matches. to use this trick too, add this line to your managedtags.txt list:
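for anyone who wants to see the scanning order spelled out, here is a rough python sketch of the idea. it is not how proxomitron works internally, only a model of the logic described above: a cheap keyword gate decides whether the first list is scanned at all, and the second list is scanned only when the first gave no hit. the two one-entry lists are tiny made-up stand-ins for AdComments.txt and AdCommentPairs.txt, and kill_comment_ads and first_match are names invented for this example.

```python
import re

# cheap gate: only comments opening with one of these keywords reach list 1
KEYWORD_GATE = re.compile(r"<!--\s*(?:auto|begin|start)\s+", re.IGNORECASE)
COMMENT_OPEN = re.compile(r"<!--\s*")

# tiny stand-ins for the two list files (one entry each, for illustration)
GATED_LIST = [re.compile(r"BURST\b.*?END BURST\b.*?-->", re.I | re.S)]
PAIR_LIST = [re.compile(r"ads begin\b.*?ads end\b.*?-->", re.I | re.S)]

PLACEHOLDER = "<span style=display:none;>[Killed Comments-surrounded Ads]</span>"


def first_match(patterns, text, pos):
    """Return the first list entry that matches at pos, or None."""
    for pattern in patterns:
        hit = pattern.match(text, pos)
        if hit:
            return hit
    return None


def kill_comment_ads(html):
    """Return the cleaned html and a count of how often each list was scanned."""
    out, pos = [], 0
    scans = {"gated": 0, "pairs": 0}
    while True:
        start = html.find("<!--", pos)
        if start == -1:
            out.append(html[pos:])
            break
        out.append(html[pos:start])
        hit = None
        gate = KEYWORD_GATE.match(html, start)
        if gate:
            # list 1 is scanned only after a keyword matched
            scans["gated"] += 1
            hit = first_match(GATED_LIST, html, gate.end())
        if hit is None:
            # list 2 is scanned when list 1 was skipped or gave no hit
            scans["pairs"] += 1
            hit = first_match(PAIR_LIST, html, COMMENT_OPEN.match(html, start).end())
        if hit:
            out.append(PLACEHOLDER)
            pos = hit.end()
        else:
            out.append("<!--")
            pos = start + 4
    return "".join(out), scans


page = ("x<!-- BEGIN BURST --><img><!-- END BURST -->y"
        "<!-- ads begin -->a<!-- ads end -->z<!-- plain -->")
print(kill_comment_ads(page))
```

the scan counters make the point of the two-list split visible: a plain comment costs one pass over the second list only, and the first list is never touched unless a keyword matched.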

<!-- (begin|open|start) (left|right|head|footer|main|menu|javascript) (^ad)$SET(1=)

and now here are the lists as they are at this time (these are more updated than the sidki one posted in the other thread):

----------------AdComments.txt begin----------------------
# Proxomitron4 URL killfile: $LST(AdComments)
# Created by altosax on July 08, 2002
# Updated on August 03, 2002
#
# List for "Kill Comments-surrounded Ads [vm]" filter.
# It removes ad-blocks surrounded by listed comments.
# To make it safer, add here longer possible comments.
# Also, you need to add both starting and ending
# comments, separated by *.

# AUTO
Banner Insertion Begin * (Auto Banner Insertion)\1 Complete *-->

# BEGIN
468 Ad area * End (468 Ad area)\1 *-->
ADVERT POWER * END (ADVERT POWER)\1 *-->
BAD ASS Advertising * END OF (BAD ASS)\1 RANDOM ADVERTISEMENTS *-->
Ban Man Pro * End (Ban Man Pro)\1 *-->
BURST * END (BURST)\1 *-->
CLICK2NET CODE * END (CLICK2NET CODE)\1 *-->
Crucial advertisement * end (Crucial advertisement)\1 *-->
EXIT CODE * END (EXIT CODE)\1 *-->
Flycast Ad Copyright * End (Flycast Ad)\1 Copyright *-->
ITALIA HYPERBANNER * END (ITALIA HYPERBANNER)\1 *-->
LINKEXCHANGE CODE * END (LINKEXCHANGE CODE)\1 *-->
linswap Code * End (linswap Code)\1 *-->
Linux Waves Banner Exchange * End (Linux Waves)\1 Banner Exchange *-->
MPU * END (MPU)\1 *-->
Nedstat Basic code * End (Nedstat Basic code)\1 *-->
of MAFIA * end of (MAFIA)\1 *-->
of SpyLOG * end of (SpyLOG)\1 *-->
of technojobs ad * end of technojobs ad *-->
of Top100 * end of (Top100)\1 *-->
of TopList * end of (TopList)\1 *-->
PayCounter * End (PayCounter)\1 *-->
PayPal Logo * End (PayPal Logo)\1 *-->
RealHomepageTools * End (RealHomepageTools)\1 *-->
RICH-MEDIA BURST * END (BURST)\1 *-->
SEXCOUNTER ADVANCED CODE * END (SEXCOUNTER)\1 ADVANCED CODE *-->
SexList Counter Code * End (SexList Counter)\1 Code *-->
SEXLIST REFERRER-STATS CODE * END (SEXLIST REFERRER-STATS)\1 CODE *-->
SEXTRACKER CLIT CODE * DONE WITH (SEXTRACKER CLIT CODE)\1 *-->
SEXTRACKER CODE * END (SEXTRACKER)\1 CODE *-->
SITEWISE * END (SITEWISE)\1 *-->
Tracker * End (Tracker)\1 *-->
TT Side CODE * END (TT Side)\1 CODE *-->
TXTAD ROTATE * END (TXTAD ROTATE)\1 *-->
WEBSIDESTORY CODE * END (WEBSIDESTORY)\1 CODE *-->
Web-Stat code * End (Web-Stat)\1 code *-->
ZEDO * end (ZEDO)\1 *-->
: Pop-Up Window * END: (Pop-Up Window)\1 *-->
n Cash 2002 HTML Code * Ende (Cash 2002)\1 HTML Code *-->
ning Advertising nAdvert * End (Advertising nAdvert)\1 *-->

# START
ADCYCLE STANDARD * END (ADCYCLE)\1 CODE *-->
EROTISM HEADER CODE * END (EROTISM HEADER)\1 CODE *-->
EROTISM FOOTER CODE * END (EROTISM FOOTER)\1 CODE *-->
Gamma Entertainment * End (Gamma Entertainment)\1 *-->
of ExtremeDM Code * End of (ExtremeDM Code)\1 *-->
OF GENERIC SITEWISE * END OF (GENERIC SITEWISE)\1 *-->
of NedStat * end of (NedStat)\1 *-->
of Recommend-it Code * End of (Recommend-it)\1 Code *-->
of ReferStat * End of (ReferStat)\1 *-->
of Sex Trail Safe-Code * End of (Sex Trail)\1 Safe-Code *-->
OF SITEWISE * END OF (SITEWISE)\1 *-->
of TheCounter.com * End of (TheCounter.com)\1 *-->
OF WEBTRENDS LIVE * END OF (WEBTRENDS LIVE)\1 *-->
Product-Specific Links * End (Product-Specific)\1 Links *-->
RedMeasure * END (RedMeasure)\1 *-->

# GENERIC AD COMMENTS FOR ANY DOMAIN
(of|[^a-z]|) ad(s|)[^a-z] * end (of|[^a-z]|) (ad(s|)[^a-z])\1 *-->
(of|[^a-z]|) advertis(ing|ements) * end (of|[^a-z]|) (advertis(ing|ements))\1 *-->
(of|[^a-z]|) banner(s|)[^a-z] * end (of|[^a-z]|) (banner(s|)[^a-z])\1 *-->

-------------------AdComments.txt end--------------------------

and here the second list:

-----------------AdCommentPairs.txt begin----------------------
# NoAddURL
# Proxomitron4 URL killfile: $LST(AdCommentPairs)
# List for "Remove Comment-Block Ads [sidki]" filter.
# It removes Ad-blocks surrounded by listed comments.
# Keywords by sidki, Jor, JD, altosax
# Created by sidki on July 12, 2002
# This version by altosax. Updated August 03, 2002
###############################################
# Checked And Ordered, No Leading Wildcard Here
# ---------------------------------------------
ACTIVEADV BEGIN BANNER * END (BANNER)\1 *-->
AD BOX BEGINS * (BOX/BAR ENDS)\1 *-->
ADDFREESTATS.COM * END (ADDFREESTATS.COM)\1 *-->
ads begin * (ads)\1 end *-->
Adspace * / (Adspace)\1 *-->
Adv Ins Banner * End (Adv Ins Banner)\1 *-->
AD POSITION * End (AD POSITION)\1 *-->
Banner Ad Cell * / Banner (Ad Cell)\1 *-->
Banner code begin * (Banner code)\1 end *-->
Click.it Mondadori * Fine (Click.it)\1 Mondadori *-->
DoubleClick Bottom Ad BEGIN * (DoubleClick Bottom Ad)\1 END *-->
DoubleClick Javascript BEGIN * (DoubleClick Javascript)\1 END *-->
DoubleClick Top Ad BEGIN * (DoubleClick Top Ad)\1 END *-->
FASTCLICK.COM * (FASTCLICK.COM)\1 *-->
HotLogs * (HotLog)\1s *-->
HTML BANNER AD * / (HTML BANNER)\1 AD *-->
HTTPADS * / (HTTPADS)\1 *-->
Inizio Codice Shinystat * Fine (Codice Shinystat)\1 *-->
KMiNDEXs * (KMiNDEX)\1s *-->
new ad code * end (new ad code)\1 *-->
OSDN Navbar * End (OSDN Navbar)\1 *-->
Pair Promotion Begin * End (Pair Promotion)\1 *-->
PayPopup.com Advertising * (PayPopup.com)\1 Advertising *-->
Rating@Mail.ru COUNTER * <!-- / (COUNTER)\1 -->
Russian LinkExchange code * (Russian LinkExchange)\1 code *-->
SexKey Original code * (SexKey)\1 Original code *-->
SpyLOG * <!-- SpyLOG -->
STATS4ALL_START * (STATS4ALL_END)\1 *-->
TOPCTO begin * (TOPCTO)\1 end *-->
TopList COUNTER * (TopList COUNTER)\1 *-->
TOPLIST * (TOPLIST)\1 END *-->
VC active * (VC active)\1 *-->
WebMeasure start * (WebMeasure)\1 slutt *-->
##############################################################################
# [^>]++{0,30} ==> To Move In The Above Category Or In The AdComments.txt List
# ----------------------------------------------------------------------------
(1st|2nd|3rd|4th|5th|6th) Ad*((1st|2nd|3rd|4th|5th|6th) Ad)\1 *-->
1000stars*(1000stars)\1 (?)++{0,90}-->
123Advertising*(123Advertising)\1 (?)++{0,90}-->
4-F-R-E-E*(4-F-R-E-E)\1 *-->
468X60 AD*(468X60 AD)\1 *-->
AD BANNER*(AD BANNER)\1 *-->
Ad code*(Ad code)\1 *-->
AD TABLE*(AD TABLE)\1 *-->
ad(vertisement|) 468x60*end (ad(vertisement|[^a-z]))\1 *-->
ADCALL*(ADCALL)\1 (?)++{0,90}-->
ADCYCLE.COM*(ADCYCLE.COM)\1 (?)++{0,90}-->
ADDFREESTATS (EASY|NORMAL) CODE*END (ADDFREESTATS)\1 *-->
ADnetz.net Code*(ADnetz.net Code)\1 *-->
AdSolution-Tag*(AdSolution-Tag)\1 *-->
AdultPlex.Com*(AdultPlex.Com)\1 (?)++{0,90}-->
Advert Block*(Advert Block)\1 *-->
advertisement code*(advertisement code)\1 *-->
Advertising.com Banner Code*(Advertising.com)\1 (?)++{0,90}-->
Advertizment Flash*(Advertizment Flash)\1 *-->
Affiliate Code*(Affiliate Code)\1 *-->
affiliate links*(affiliate links)\1 *-->
Amateur Pages Code*(Amateur Pages Code)\1(?)++{0,90}-->
Anonymizer*(Anonymizer)\1 (?)++{0,90}-->
Bananer Ad*(Bananer Ad)\1 *-->
Banner Ad*(Banner Ad)\1 *-->
Banner Exchange Code*(Banner Exchange Code)\1 *-->
# Banner*/ (Banner)\1 *-->
BannerAlto*(BannerAlto)\1 *-->
BarelyLegal Banner*(BarelyLegal Banner)\1(?)++{0,90}-->
BEGIN ADs*END (AD)\1s *-->
begin clickXchange*end (clickXchange)\1(?)++{0,90}-->
Begin HBtrack*End (HBtrack)\1(?)++{0,90}-->
BelStat.be Counter*(BelStat.be Counter)\1(?)++{0,90}-->
BOT AD*(BOT AD)\1 *-->
btpromo*(btpromo)\1 (?)++{0,90}-->
BUTTON ADS*(BUTTON ADS)\1 *-->
CASH COUNT BANNER*(CASH COUNT BANNER)\1 *-->
CibleClick*(CibleClick)\1 (?)++{0,90}-->
Click-Counter*(Click-Counter)\1 *-->
cobranding*(cobranding)\1 (?)++{0,90}-->
Coolerguys advertisement*(advertisement)\1 *-->
Counter Code*(Counter Code)\1 *-->
Counters*END (Counters)\1 *-->
DarkCounter*(DarkCounter)\1 (?)++{0,90}-->
dialerfactory*(dialerfactory)\1(?)++{0,90}-->
DoubleClick ADJ*(DoubleClick ADJ)\1 *-->
dynad*(dynad)\1 (?)++{0,90}-->
eMerite code*(eMerite code)\1 *-->
Extract.Ru banner*(Extract.Ru banner)\1 *-->
focusIN code*(focusIN code)\1 *-->
frameJammer_hp*(frameJammer_hp)\1 *-->
freecom*(freecom)\1 (?)++{0,90}-->
FreepageScript1*(FreepageScript1)\1 *-->
friendplay.com*(friendplay.com)\1 (?)++{0,90}-->
Gallery Host*(Gallery Host)\1 *-->
GeoGuide*(GeoGuide)\1 *-->
HitBox Ads*(HitBox Ads)\1 *-->
Hittrack Tracker*(Hittrack Tracker)\1 *-->
Home Free*(Home Free)\1 *-->
Honor System*(Honor System)\1 *-->
HumanTag Monitor*(HumanTag Monitor)\1 *-->
Hustler Banner*(Hustler Banner)\1 *-->
Impression code*(Impression code)\1 *-->
Impressions-Counter*(Impressions-Counter)\1 *-->
InDepthInfoAd*(InDepthInfoAd)\1 (?)++{0,90}-->
INLIVE CODE*(INLIVE CODE)\1 *-->
INVOEGCODE*(INVOEGCODE)\1 (?)++{0,90}-->
IVWs*(IVWs)\1 (?)++{0,90}-->
Land Banner*(Land Banner)\1 *-->
LIVE WIRE MEDIA CODE*(LIVE WIRE MEDIA CODE)\1(?)++{0,90}-->
MAJOR SPONSORS*(MAJOR SPONSORS)\1 *-->
MILLTO_BAR*(MILLTO_BAR)\1 *-->
Money 4u HTML Code*(Money 4u HTML Code)\1 *-->
NetworXXX*(NetworXXX)\1 (?)++{0,90}-->
NEWS TICKER*(NEWS TICKER)\1 *-->
Newsensations Banner*(Newsensations Banner)\1(?)++{0,90}-->
OAS (AD|TAG|SETUP|function)*(OAS (AD|TAG|SETUP|function))\1 *-->
PIGPORN*(PIGPORN)\1 (?)++{0,90}-->
POPUNDER.COM CODE*(POPUNDER.COM CODE)\1 *-->
popup code*(popup code)\1 *-->
PornTrack JavaScript Code*(PornTrack JavaScript Code)\1 *-->
PROBE CODE*(PROBE CODE)\1 *-->
p?ginas de galeon*(p?ginas de galeon)\1 *-->
rail ad*(rail ad)\1 *-->
RBC counter*(RBC counter)\1 *-->
RealTracker*(RealTracker)\1 (?)++{0,90}-->
RICH MEDIA CODE*(RICH MEDIA CODE)\1 *-->
RmbClick Advertisng*(RmbClick Advertisng)\1 *-->
roadmap code*(roadmap code)\1 *-->
rsct-click-info*(rsct-click-info)\1 *-->
SE Toolbar*(SE Toolbar)\1 *-->
Sex Swap Code*(Sex Swap Code)\1 *-->
Sexlist#*(Sexlist)\1# *-->
SEXSEARCH.COM COUNTER*(SEXSEARCH.COM COUNTER)\1 *-->
SexyAVS.com Code*(SexyAVS.com Code)\1 *-->
side ads*(side ads)\1 *-->
Sitestat4 code*(Sitestat4 code)\1 *-->
skyscraper ad*(skyscraper ad)\1(?)++{0,90}-->
sponcode*(sponcode)\1 (?)++{0,90}-->
sponsor ad*(sponsor ad)\1 *-->
#SPONSOR TABLE*(SPONSOR TABLE)\1(?)++{0,90}-->
sponsors code*(sponsors code)\1 *-->
sponsorship*(sponsorship)\1 (?)++{0,90}-->
technojobs ad*(technojobs ad)\1(?)++{0,90}-->
TELLERCODE*(TELLERCODE)\1 (?)++{0,90}-->
text ad*end (text ad)\1 *-->
THEBANNER.DE Code*(THEBANNER.DE Code)\1 *-->
Topsites BANNER*(Topsites BANNER)\1(?)++{0,90}-->
Totally Pornstars Banner*(Totally Pornstars Banner)\1(?)++{0,90}-->
TOWER Ad code*(TOWER Ad code)\1 *-->
Tracking Code*(Tracking Code)\1 *-->
TRAFFIC IMPRESSION*(TRAFFIC IMPRESSION)\1 *-->
Traffic-Network*(Traffic-Network)\1 *-->
TRAFFICHOME*(TRAFFICHOME)\1 (?)++{0,90}-->
TrafficMarketPlace*(TrafficMarketPlace)\1 (?)++{0,90}-->
web audit counter*(web audit counter)\1(?)++{0,90}-->
webbot bot="HitCounter"*webbot bot="(HitCounter)\1" *-->
WHD Code*(WHD Code)\1 *-->
www.HyperCount.com*(www.HyperCount.com)\1 (?)++{0,90}-->
www.paidbanner.de*(www.paidbanner.de)\1 (?)++{0,90}-->
X-IT CODE*(X-IT CODE)\1 *-->
XXX COUNTER*(XXX COUNTER)\1 *-->
[^a-z]Ad Start*([^a-z]Ad)\1 End *-->

# GENERIC AD COMMENTS FOR ANY DOMAIN
ad(s|)[^a-z] * end (ad(s|)[^a-z])\1 *-->
advertis(ing|ements) * end (advertis(ing|ements))\1 *-->
banner(s|)[^a-z] * end (of|[^a-z]|) (banner(s|)[^a-z])\1 *-->

# USER ADDED COMMENTS

---------------------AdCommentPairs.txt end--------------------

and also this to complete the post (as arne pointed out, updated with the recent tegghead suggestion):

----------------------ManagedTags.txt begin--------------------
# Proxomitron4 URL killfile: $LST(ManagedTags)
# Created by altosax on May 23, 2002
# Updated on July 31, 2002
#
# List for "Tag Manager [vm]" filter.
# Use $SET(1=) to preserve the content of the tag.
# Use $SET(1=what_do_you_like) to replace the tag.
# Remove $SET(1=) to kill the tag.

<font\s*>$SET(1=)
<meta\s(\w=$AV(*) )+>
<!doctype*>$SET(1=)
$NEST(<noscript,</noscript>)
$NEST(<h[#1:6],</h?>)$SET(1=)
$NEST(<style,</style>)$SET(1=)
$NEST(<select,</select>)$SET(1=)
$NEST(<textarea,</textarea>)$SET(1=)
<!-- (begin|open|start) (left|right|head|footer|main|menu|javascript) (^ad)$SET(1=)
----------------------ManagedTags.txt end-----------------------

sorry for such a long post, but it was necessary to explain my ideas in detail.

<edit>: comments lists update to August 03
<edit>: thanks to sidki for his new comments

regards,
altosax.

Edited by - altosax on 03 Aug 2002  17:22:44

Edited by - altosax on 03 Aug 2002  19:13:18

5
Community Discussions (Non-Forum Related) / summer riddle
« on: July 25, 2002, 12:28:32 AM »
hi all friends,
read this summer riddle and try to solve it for your fun.

there is a man in a boat at the centre of a lake.
he also has a big stone in his boat.
if the man throws the stone into the lake,
will the level of the water rise,
fall or stay the same?

too hard? too simple?
let me know,
altosax.

 

6
i'm going to become rich
read here:

--------------------------------------------
Fax No:234-1-7597602
[email protected]

Dear Sir,

I crave your indulgence as I contact you in such a surprising manner.
But I respectfully insist you  read this letter carefully as  I am
optimistic it will open doors for unimaginable financial reward for
both of us.

This business transaction might not fall within the wide spectrum of
your business activities, but I plead your assistance, as
your flair for profitable business is needed.

Permit me to introduce myself, I am Mr.FRED UBAKA Manager, Union Bank
of Nig.Plc, I am writing this letter to ask for your support and
co-operation to carry out this business opportunity in my department.

Every five years, Nigerian banks transfer to its treasury millions of
dollars of unclaimed deceased depositors funds in compliance with the
banking laws and guidelines, in majority of cases with reference to
my bank-union bank of Nig.Plc, the money normally runs into several
millions of dollars.

A foreigner, Late Engineer Johnson Creek, an Oil Merchant/Contractor
in Nigeria, until his death four years ago in a ghastly air crash
has a closing balance of US9,500,000.00 (Nine Million, Five Hundred
Thousand Dollars) Ever since his death and up till this time
of writing, no next-of-kin or relation of his has come forward
to claim his money with us. Fervent valuable efforts have been made
by Union Bank to get in touch with any of the Creek's family or
relatives but proved to no avail.

Naturally, as long as Johnson's money remains unclaimed, the bank
remains richer in free funds with his money. However, with
my position, I can present you to claim the fund as a relative/next
of kin if you agree with me on private basis. Yes, I can present you.

The request for you as next of kin in this business is occasioned by
the fact that the customer was a foreigner and a Nigerian cannot stand
as next of kin to a foreigner. Hence, I plead for your assistance
with the intention of getting this unclaimed money amounting to
US9,500,000.00 (Nine Million, Five Hundred Thousand Dollars Only)
transferred into a company/private bank account that you will present.

You stand to gain a negotiable percentage of the fund if you agree to help
me actualize this opportunity; I will not contact any other company or
person until I am convinced that you are not interested in this proposal.

Contact me immediately indicating your willingness and also give your
direct and confidential fax and phone numbers for the effective
communication that this transaction requires.

I will furnish you with the procedure and also enlighten you on how
the funds will be disbursed and shared on receipt of your response.
I will also require your advice in the areas of investment as
I plan to establish a business venture in your country with my share.

Please reach me only on this private email address:[email protected]
or Fax No:234-1-7597602. Remember that a business of this nature
needs to be kept confidential.

Yours Sincerely,

Mr.Fred Ubaka.
-----------------------------------------------------

 

7
Site Specific / new filter for google
« on: July 10, 2002, 02:15:21 PM »
this is a filter i posted some time ago on yahoo groups, today i found it on my hard disk:

Name = "Google: return 100 hits per page [vm]"
Active = TRUE
URL = "[^/]++google"
Limit = 12
Match = "</form>"
Replace = "<input type=hidden name=num value=100></form>"

it forces google to return 100 results per page. you can also change the value as you need.
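if you want to try the idea outside proxomitron, the same rewrite can be sketched in a few lines of python. this is only an illustration of what the filter does to the page text: the function name add_results_per_page is made up, and only the num field and its value come from the filter above.

```python
import re

# match the closing form tag regardless of case, as an html filter would
FORM_CLOSE = re.compile(r"</form>", re.IGNORECASE)


def add_results_per_page(html, count=100):
    """Insert a hidden num field in front of every closing form tag."""
    hidden = f"<input type=hidden name=num value={count}>"
    # keep the original closing tag and put the hidden field before it
    return FORM_CLOSE.sub(lambda match: hidden + match.group(0), html)


print(add_results_per_page("<form action=/search><input name=q></form>"))
```

changing count has the same effect as editing the value in the Replace field of the filter.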

regards,
altosax.

 

8
Spam Blockers / Kill Comments-surrounded Ads [vm], new filter
« on: July 10, 2002, 10:11:50 AM »
hi all friends,
i'm working on a new filter that kills comments-surrounded ads.

the idea is not new, but the various existing filters take different approaches. i know for example "Kill Comment-pair delimited Ads" and "Kill Comment-block Ads", but the first calls both AdPath and AdDomain, and the second tries to match all possible variations of the comments, making filtering very expensive.

the basic assumption of my filter is that it does not necessarily have to match ALL comments-surrounded ads, because everyone here already has filters that remove ads. but every time it calls the list there should be a match, removing a big block of code and reducing the work of the other filters (Banner Blaster, Kill Javascript Banners, Kill Nosey Javascripts and so on).
this means that false matches have to be resolved without calling the list. and there are a lot of false matches: <script>, <style>, real comments, commented ads not included in the list. all of them contain the <!-- part.

for this reason i haven't a match field like:
<!-- [^>]++ $LST(...)... or like
<!-- (auto|begin|...|) $LST(...)...
because this way the list is ALWAYS scanned, even with <script>, <style> and so on.

i really appreciated the comments that sidki sent me privately, so i say a public "thank you, my friend" to him.
now the thread is open, post your comments here.

this is the filter (its position in the filter set is not important because, thanks to its matching expression, it will match before the other ad killers):

Name = "Kill Comments-surrounded Ads [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 12000
Match = "<!-- (auto|begin|start) $LST(AdComments) *-->"
Replace = "<!-- Killed Comments-surrounded Ads -->"

as you can see, it is really simple. i could also modify the matching expression this way:

Match = "<!-- (auto|begin|start)(^header|footer) $LST(AdComments) *-->"

but i've encountered it on one site only. if i discover that it is more common i could implement this change.

i've also made an alerting filter to be notified when a page contains a comment that matches the filter but is not included in the list. this way i can read the code and include it in the list:

Name = "Notify new Comments-surrounded Ads [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 32
Match = "<!-- (auto|begin|insert|loader|open|start)\1"
Replace = "<!-- \1$ALERT(Found <!-- \1)"

place it just AFTER the killer one, so all target comments that pass the first will generate an alert. note that i use it only to build the list. it doesn't filter anything, so if you don't care about adding new comments to the list you really don't need it.

and this is the current list:

# Proxomitron4 URL killfile: $LST(AdComments)
# Created by altosax on July 08, 2002
# Updated on July 13, 2002
#
# List for "Kill Comments-surrounded Ads [vm]" filter.
# To make it safer, use the longest possible comment text.
# Also, you need to add both the starting and the ending comment,
# separated by *. Do not add the ending --> here

# AUTO
Banner Insertion Begin * Auto Banner Insertion Complete

# BEGIN
468 Ad area * End 468 Ad area
Ad Space * END Ad Space
ADVERT POWER * END ADVERT POWER
BAD ASS Advertising * END OF BAD ASS RANDOM ADVERTISEMENTS
Ban Man Pro * End Ban Man Pro
BANNER -- * end BANNER
BURST * END BURST
Crucial advertisement * end Crucial advertisement
Flycast Ad Copyright * End Flycast Ad Copyright
ITALIA HYPERBANNER * END ITALIA HYPERBANNER
LINKEXCHANGE CODE * END LINKEXCHANGE CODE
linswap Code * End linswap Code
Linux Waves Banner Exchange * End Linux Waves Banner Exchange
Nedstat Basic code * End Nedstat Basic code
ning Advertising nAdvert * End Advertising nAdvert
of MAFIA * end of MAFIA
of SpyLOG * end of SpyLOG
of Top100 * end of Top100
of TopList * end of TopList
PayCounter * End PayCounter
PayPal Logo * End PayPal Logo
RealHomepageTools * End RealHomepageTools
RICH-MEDIA BURST * END BURST
SEXCOUNTER ADVANCED CODE * END SEXCOUNTER ADVANCED CODE
SexList Counter Code * End SexList Counter Code
SEXLIST REFERRER-STATS CODE * END SEXLIST REFERRER-STATS CODE
SEXTRACKER CLIT CODE * DONE WITH SEXTRACKER CLIT CODE
SEXTRACKER CODE * END SEXTRACKER CODE
WEBSIDESTORY CODE * END WEBSIDESTORY CODE
ZEDO * end ZEDO

# INSERT (still empty)
# LOADER (still empty)
# OPEN (still empty)

# START
Gamma Entertainment * End Gamma Entertainment
of Ads- * End of Ads-
of ExtremeDM Code * End of ExtremeDM Code
of NedStat code * end of NedStat code
OF SITEWISE * END OF SITEWISE
OF WEBTRENDS LIVE * END OF WEBTRENDS LIVE
RedMeasure * END RedMeasure
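
to show how an entry of the form "opening text * closing text" pairs up with the killer filter, here is a rough python model. it is only a sketch: it assumes that a blank in an entry stands for any amount of whitespace and that * stands for any run of characters, it ignores the filter's Limit, and the function names are made up. only three entries of the list are used.

```python
import re

# Small excerpt of the list above; each entry is "opening text * closing text".
AD_COMMENTS = [
    "Ad Space * END Ad Space",
    "BURST * END BURST",
    "of SpyLOG * end of SpyLOG",
]

REPLACEMENT = "<!-- Killed Comments-surrounded Ads -->"


def entry_to_regex(entry):
    """Turn one list entry into a regex fragment (rough approximation)."""
    tokens = [".*?" if tok == "*" else re.escape(tok) for tok in entry.split()]
    # A blank in the entry is modelled as "any amount of whitespace".
    return r"\s*".join(tokens)


def build_killer(entries):
    """Model of: <!-- (auto|begin|start) $LST(AdComments) *-->"""
    alternatives = "|".join(entry_to_regex(e) for e in entries)
    pattern = r"<!--\s*(?:auto|begin|start)\s*(?:" + alternatives + r")\s*.*?-->"
    return re.compile(pattern, re.IGNORECASE | re.DOTALL)


def kill_ads(html, entries=AD_COMMENTS):
    """Replace every listed comment-surrounded block with the short marker."""
    return build_killer(entries).sub(REPLACEMENT, html)


page = (
    'before<!-- Begin Ad Space --><img src="ad.gif"><!-- END Ad Space -->'
    "after<!-- begin content -->"
)
print(kill_ads(page))
```

the comment "<!-- begin content -->" starts with a keyword but is not in the list, so it is left alone; that is the case the notify filter is meant to report.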

<edit>: the list was updated; new ideas are in the message below </edit>

regards,
altosax.

Edited by - altosax on 13 Jul 2002  00:57:25

9
Questions and Answers / mime fix list doesn't fix this site
« on: July 05, 2002, 12:01:06 AM »
hi all,
i found that the site www peachpage com (add the dots) is sent by the server as text/plain.
even with the "fix mime type" header filter active (the standard one that comes with proxomitron, with paul rupe's list), it still isn't filtered at all. i got it filtered by activating the header filter "filter text/plain", but i would prefer to use the mime fix list: after all IT IS a web page, so its content-type should be fixed to text/html by "fix mime type".
so the question is: why doesn't "fix mime type" work?

does anyone have a suggestion?

regards,
altosax.

 

10
Feature-Block / protect your ip from malicious applet
« on: July 01, 2002, 03:16:27 PM »
i've just posted this filter at computer cops to solve this problem:

http://reglos.de/myaddress/demo4.html

here is the filter:

Name = "Protect your IP from Applet [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Bounds = "<applet*>"
Limit = 256
Match = " name=$AVQ(*)1"
Replace = "1"

regards,
altosax.

 

11
Microsoft Help / modify the resources within a windows file
« on: June 29, 2002, 04:46:25 PM »
hi all,
i'm playing a little with resources within windows dll's.
i've modified my file shdoclc.dll to create a link to google cache directly in the page that windows shows when a web page is unavailable.

if you are able to do this yourself, here are the steps.

1.
download resource hacker from http://www.users.on.net/johnson/resourcehacker/

2.
copy and paste the file windows/system/shdoclc.dll elsewhere to modify the copy, not the original. open the copy with rh, and go to 23/http_404/your_language_id

3.
add this function into the script:


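function GoogleCache(){
   // address of the error page; the requested url follows the "#"
   DocURL = document.location.href;
      
   // position of "://" in the requested url (the search starts at index 4
   // to skip a "://" at the very beginning of the address)
   protocolIndex=DocURL.indexOf("://",4);
   
   // first slash after the server name
   serverIndex=DocURL.indexOf("/",protocolIndex + 3);

   // the requested url begins just after the "#"
   BeginURL=DocURL.indexOf("#",1) + 1;
   if (protocolIndex - BeginURL > 7)
      urlresult=""
   
   urlresult=DocURL.substring(BeginURL,serverIndex);

   // text shown for the link: the server name without the protocol
   displayresult=DocURL.substring(protocolIndex + 3 ,serverIndex);

   // strip characters that could break out of the generated html
   forbiddenChars = new RegExp("[<>\'\"]", "g");   // Global search/replace
   urlresult = urlresult.replace(forbiddenChars, "");
   displayresult = displayresult.replace(forbiddenChars, "");

   document.write('<A target=_top HREF="http://www.google.com/search?q=cache:' + urlresult + '">' + displayresult + "</a>");

}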


4.
add this html code at the end of the listed item:


      <li id="list5">Search the requested page in the Google cache:<br> <script> GoogleCache(); </script>. </li>


5.
compile the script and save the file. reboot in dos, rename the original file and copy your modified file in place of the original. then reboot again.
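
to check what GoogleCache() actually extracts, the same index arithmetic can be replayed in python on a sample address. this is only a sketch: the shape of the error-page address (resource url, then "#", then the requested url) and the name http_404.htm are assumptions, and so are the quote characters in the forbidden set, because that line of the script is garbled in the post.

```python
import re

# Assumed shape of the error-page address: resource URL, "#", requested URL.
SAMPLE_URL = "res://shdoclc.dll/http_404.htm#http://www.example.com/page.html"


def google_cache_link(doc_url: str) -> str:
    """Mirror the index arithmetic of GoogleCache() on a plain string."""
    # Skip the "://" of "res://" by starting the search at index 4.
    protocol_index = doc_url.find("://", 4)
    # First slash after the host name of the requested URL.
    server_index = doc_url.find("/", protocol_index + 3)
    # The requested URL starts right after the "#".
    begin_url = doc_url.find("#", 1) + 1

    url_result = doc_url[begin_url:server_index]
    display_result = doc_url[protocol_index + 3:server_index]

    # Same idea as forbiddenChars: drop <, >, ' and " (assumed set).
    forbidden = re.compile(r"[<>'\"]")
    url_result = forbidden.sub("", url_result)
    display_result = forbidden.sub("", display_result)

    return (
        '<A target=_top HREF="http://www.google.com/search?q=cache:'
        + url_result + '">' + display_result + "</a>"
    )


print(google_cache_link(SAMPLE_URL))
```

with this sample the link points at the cached copy of the server root (http://www.example.com), not of the full page address, because the substring ends at the first slash after the server name.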

if someone sends me the file, i can do it for them. i can't post my modified file because all the dialogs it contains are in italian.

note: if you do this, don't tell bill ;)

regards,
altosax.



Edited by - altosax on 29 Jun 2002  17:47:56

12
Cosmetic / Super Opener - Text Links Only, work in progress
« on: May 25, 2002, 03:11:50 PM »
hi all members,
i've started working on superopener to create a text-links-only version. i'll post all the changes i make in this thread, to discuss them with you. this filter is really hard to study, but i think there are some filter gurus here who can help me.

first of all, excuse me for the long (and probably hard to read) message. i know the jd5000 solution, but understanding his filter is more or less as hard as understanding the bpm original, so i've started from the bpm version, trying to make no errors.

this is the second of the three parts of the original filter set, in its unmodified form. all my considerations refer to this part.



Name = "Links >^ SUPER-OPENER BETA 37 (aB) (bC)"
Active = TRUE
Multi = TRUE
URL = "$TYPE(htm)"
Bounds = "<a\s*(<(\\|)/a>|(<a\s))(^<!-- BPM_(W|A) -->)"
Limit = 450
Match = "<a"
        "("
        "([^>]++(\shref=$AV(*))\1[^>]+>)"
        "&&"
        "((\shref=$AV(*)|\starget=$AV(_blank|_new)"
        "|(\sclass=$AV(*))\2|(\sstyle=$AV(*))\3|((\s[^ ]+)|>)\#))+"
        ")"
        ""
        "( $NEST(<,(^(\\|)/a(^?))*,> )+)\4"
        ""
        "("
        "<(/a>|a\s)"
        "$SET(6= class="BPM-supero-d")"
        "$SET(7=&loz;)"
        "|"
        "(&[^; ]++; |[^<] )\5"
        "($NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
        "<((\\|)/a>|a\s)$SET(7=<font size=-2>&loz;</font>)"
        "|"
        "(( (&[^; ]++;|[^<])"
        "$NEST( <,(^(\\|)/a(^?))*,>)+)++{1,2})\5"
        "( (&[^; ]++;|[^<])"
        " $NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
        "<((\\|)/a>|a\s)"
        "|"
        "( (&[^; ]++;|[^<])"
        " $NEST(<,(^(\\|)/a(^?))*,> )+"
        "(&[^; ]++;|[^<])"
        " (&[^; ]++;|[^<]|$NEST(<,(^(\\|)/a(^?))*,> )+)++ )\5"
        "( (&[^; ]++;|[^<])"
        " $NEST(<,(^(\\|)/a(^?))*,> )+"
        "(&[^; ]++;|[^<])"
        " $NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
        "<((\\|)/a>|a\s)"
        ")"
Replace = "<a\1\2\3\@\4\5</a>"
          "<a id=BPM-supero\6"
          " title=Open?in?new?window"
          " target=_blank\1\2\3>"
          "\7\8</a>"
          "<!-- BPM_W -->"



changelog and explanations:

1. removed the filter "Comments >^ Remove temporary proxomitron comment tags (C)"

it's the third part of the superopener filter set. you don't really need it, because it only removes the short comment "<!-- BPM_W -->" that "Links >^ SUPER-OPENER BETA 37 (aB) (bC)" adds to each matching link. btw, you have to consider this trade-off: with "Comments" active, this filter is checked once more for each character of the web page that doesn't match the previous filters (so the number of additional checks equals the number of non-matching characters in the page); with "Comments" disabled, every active filter is checked against each character of "<!-- BPM_W -->", for each link matched by superopener. the number of additional checks in this case is (14 x N x M), where 14 is the length of the comment, N is the number of active filters and M is the number of links matched in the page. it's clear that disabling "Comments" pays off only if you also stop the "Links" filter from adding "<!-- BPM_W -->".
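
the two costs can be put side by side with a few lines of python. this is only the arithmetic from the paragraph above, not a measurement; the function names and the sample numbers are invented.

```python
# The temporary comment that the "Links" filter appends to each matched link.
MARKER = "<!-- BPM_W -->"


def checks_with_cleanup_filter(non_matching_chars: int) -> int:
    """Extra checks while the "Comments" cleanup filter stays active:
    one more filter test for every character no earlier filter matched."""
    return non_matching_chars


def checks_with_marker_left_in(active_filters: int, matched_links: int) -> int:
    """Extra checks when the marker is written but never removed:
    every active filter is tried on each character of each marker."""
    return len(MARKER) * active_filters * matched_links


# Invented example: 30 active filters, 100 matched links, 20000 plain characters.
print(checks_with_cleanup_filter(20000))
print(checks_with_marker_left_in(30, 100))
```

with these invented numbers, leaving the markers in the page costs more than the cleanup filter does, which is why both have to go together.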

2. removed (^<!-- BPM_(W|A) -->) from Bounds
3. removed <!-- BPM_W --> from Replace
4. removed Multi = TRUE

these changes are strictly related, because superopener needs to add <!-- BPM_W --> to avoid matching again the code it has just written. in fact, it needs the option Multi = TRUE to let other filters match the <a> tags, but it also needs a protection for the <a> tag that it adds at the end of the link, to avoid an infinite loop (this is the reason for the ^ function in the Bounds). so you can safely remove Multi = TRUE only if superopener is placed at the very end of your filter set. from this point on i'll assume that you have placed superopener at the end of your filter set.

5. simplified Bounds to "<a\s*(/a>|(<a\s))"

due to the wildcard * there is no need to have <(\\|) in the first part of the expression. i've left (<a\s) in the second part because it lets superopener also match wrongly nested links of the form <a..<a.., correcting them in the replacement code. this is the first point where i don't agree with the jd5000 solution; btw the discussion is open and suggestions are welcome.

- note about Bounds: it is possible to add an exclusion for images in the bounds, something like (^<img*>), but my goal is to remove from superopener all the matching code related to image links, so i can't add this exception yet: first i have to find and remove the unnecessary code. i'll add the exception only at the end.

ok, here i stopped my first session. this is the current result:



Name = "Super Opener - Text Links Only 2"
Active = TRUE
URL = "$TYPE(htm)"
Bounds = "<a\s*(/a>|(<a\s))"
Limit = 450
Match = "<a"
        "("
        "([^>]++(\shref=$AV(*))\1[^>]+>)"
        "&&"
        "((\shref=$AV(*)|\starget=$AV(_blank|_new)"
        "|(\sclass=$AV(*))\2|(\sstyle=$AV(*))\3|((\s[^ ]+)|>)\#))+"
        ")"
        ""
        "( $NEST(<,(^(\\|)/a(^?))*,> )+)\4"
        ""
        "("
        "<(/a>|a\s)"
        "$SET(6= class="BPM-supero-d")"
        "$SET(7=&loz;)"
        "|"
        "(&[^; ]++; |[^<] )\5"
        "($NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
        "<((\\|)/a>|a\s)$SET(7=<font size=-2>&loz;</font>)"
        "|"
        "(( (&[^; ]++;|[^<])"
        "$NEST( <,(^(\\|)/a(^?))*,>)+)++{1,2})\5"
        "( (&[^; ]++;|[^<])"
        " $NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
        "<((\\|)/a>|a\s)"
        "|"
        "( (&[^; ]++;|[^<])"
        " $NEST(<,(^(\\|)/a(^?))*,> )+"
        "(&[^; ]++;|[^<])"
        " (&[^; ]++;|[^<]|$NEST(<,(^(\\|)/a(^?))*,> )+)++ )\5"
        "( (&[^; ]++;|[^<])"
        " $NEST(<,(^(\\|)/a(^?))*,> )+"
        "(&[^; ]++;|[^<])"
        " $NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
        "<((\\|)/a>|a\s)"
        ")"
Replace = "<a\1\2\3\@\4\5</a>"
          "<a id=BPM-supero\6"
          " title=Open?in?new?window"
          " target=_blank\1\2\3>"
          "\7\8</a>"



now some comments on the code:


"([^>]++(\shref=$AV(*))\1[^>]+>)"

it matches <a this_content>, no problem.
---------------

"&&"
"((\shref=$AV(*)|\starget=$AV(_blank|_new)"
"|(\sclass=$AV(*))\2|(\sstyle=$AV(*))\3|((\s[^ ]+)|>)\#))+"

it matches <a this_content>, no problem. at this point the <a..> tag is entirely stored in the \1, \2, \3 variables and the stack
---------------

"( $NEST(<,(^(\\|)/a(^?))*,> )+)\4"

here it is!! this code matches image links, because it matches <a..><this_image><even_doubled></a>. we need neither this line of code nor the \4 variable in the replacement. but if this line is removed, we need something to match <a..>this_text</a>.
---------------

"<(/a>|a\s)"
"$SET(6= class="BPM-supero-d")"
"$SET(7=&loz;)"

if the link is an image link, this code matches the closing tag </a> or a possibly wrongly nested <a> tag, and sets \6 for the style and \7 to make the addition of the lozenge possible. in a text-only version we need only the first line, and can safely remove the class "BPM-supero-d" from the first of the 3 superopener filters. note also that the closing </a> tag, even when it matches, is not stored anywhere, but is explicitly written in the replacement.
----------------

"(&[^; ]++; |[^<] )\5"
"($NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
"<((\\|)/a>|a\s)$SET(7=<font size=-2>&loz;</font>)"

this seems to be an error. i think the second line of this code is unnecessary, but i still haven't fully understood this block of code, so i won't write any conclusions yet. btw, this could be the bug that sometimes adds the lozenge to text links.
----------------

"(( (&[^; ]++;|[^<])"
"$NEST( <,(^(\\|)/a(^?))*,>)+)++{1,2})\5"
"( (&[^; ]++;|[^<])"
" $NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
"<((\\|)/a>|a\s)"
"|"
"( (&[^; ]++;|[^<])"
" $NEST(<,(^(\\|)/a(^?))*,> )+"
"(&[^; ]++;|[^<])"
" (&[^; ]++;|[^<]|$NEST(<,(^(\\|)/a(^?))*,> )+)++ )\5"
"( (&[^; ]++;|[^<])"
" $NEST(<,(^(\\|)/a(^?))*,> )+"
"(&[^; ]++;|[^<])"
" $NEST(<,(^(\\|)/a(^?))*,> )+ )\8"
"<((\\|)/a>|a\s)"

here is where i need your help. this code applies to text links, but i'm not sure we need such a complicated thing. do you agree? for example, i think we can remove (\\|) from all the $NEST commands.


that's all. thank you for your patience in reading all this. i'm looking forward to your comments and suggestions. right now i'm still testing the first result, mainly its compatibility with the rest of the filter set. it's still a text+images version, but as you all understand this work needs to be done step by step.

see you later in this thread,
regards to all,
altosax.



Edited by - altosax on 25 May 2002  16:16:22

Edited by - altosax on 25 May 2002  21:36:51

13
Feature-Block / tag manager (new filter)
« on: May 23, 2002, 05:53:00 PM »
hi all friends,
i've made this filter to remove, skip or protect selected elements from filtering.

we all know that proxomitron, for each character of the page, scans all active filters to find matches and applies the first matching filter it finds. if the matching filter has the multiple option enabled, it also applies the second matching filter and so on, otherwise it skips to the next character. when a non-multiple filter matches, proxomitron applies the filter and then skips to the first character just after the matched code.

if you have N filters enabled and M characters of non-matching code, you get (N x M) filter checks without any benefit. the main goal of this filter is to speed up filtering of non-matching code by matching harmless elements and forcing proxomitron to skip the rest of the tag's content. with the simple trick of setting one variable equal to another, i've combined this feature with the ability to remove useless and annoying tags.
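
the (N x M) cost and the effect of skipping a whole block can be tried out with a toy scanner in python. it is a simplified model, not proxomitron's real engine: single-match filters only, regular expressions instead of the matching language, and invented stand-in filters. the tag manager stand-in is placed first only to keep the example short.

```python
import re


def count_checks(page, filters):
    """Count filter tests in a simplified, single-match scan of the page."""
    checks = 0
    pos = 0
    while pos < len(page):
        advance = 1  # no filter matched: move on by one character
        for pattern in filters:
            checks += 1
            match = pattern.match(page, pos)
            if match and match.end() > pos:
                advance = match.end() - pos  # skip the matched code
                break
        pos += advance
    return checks


# Three stand-in filters that never match the sample page.
ORDINARY = [re.compile(p) for p in (r"<blink\b", r"<marquee\b", r"<bgsound\b")]

# Stand-in for one list entry of the tag manager: a whole <style> block.
TAG_MANAGER = re.compile(r"<style\b.*?</style>", re.IGNORECASE | re.DOTALL)

PAGE = "<style>" + "x" * 100 + "</style>text"

print(count_checks(PAGE, ORDINARY))
print(count_checks(PAGE, [TAG_MANAGER] + ORDINARY))
```

without the tag manager every filter is tried at every character of the style block; with it, the block is consumed by a single check and only the remaining text is scanned by all filters.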



[Blocklists]
List.ManagedTags = "..\Lists\ManagedTags.txt"

[Patterns]
Name = "Tag Manager [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Bounds = "$LST(ManagedTags)"
Limit = 2048
Match = ""
Replace = "\1"

This is my current list of managed tags:

<!doctype*>$SET(1=)
# $NEST(<title,</title>)$SET(1=)
$NEST(<style,</style>)$SET(1=)
<font*>$SET(1=)
$NEST(<h?,</h?>)$SET(1=)
$NEST(<noscript,</noscript>)
$NEST(<select,</select>)$SET(1=)
$NEST(<textarea,</textarea>)$SET(1=)

note1: the list MUST CONTAIN only common tags that are very likely to be present in the html document, because each entry of the list works like one more active filter. a large number of entries slows down filtering; this is the reason why i've left out many other tags.

note2: this filter replaces my previous "Remove content of <noscript> tag". to kill a tag and its content, remove $SET(1=). to replace a tag and its content, modify $SET(1=what_do_you_like). to skip filtering of a harmless tag, add it to the list in the same form as the other tags. as an example, i've added a commented-out <title> tag; remove the # to protect it.

note3: if you use filters that match content of the managed tags, you have to place them BEFORE tag manager and allow them for multiple matches. do not allow multiple matches for tag manager (otherwise it becomes useless).

note4: you can also use this filter for a very aggressive filtering. for example, you can disable javascript [$NEST(<script,</script>)], stylesheet [$NEST(<style,</style>)], embedded object [<embed*>], flash [<object*>], inline frame [$NEST(<iframe,</iframe>)], meta tag [<meta*>] and so on (layer, inline layer, sound in background...).

note5: i've created this filter to manage single "blocks" of code, but you can also use it for filtering, for example by adding something like <xsl:script*activex*</xsl:script>$SET(1=<!-- activex killed -->)

ok, that's all. let me know your suggestions to improve this filter.
it is now part of my filter set at
http://virgolamobile.50megs.com/proxomitron.html

regards,
altosax.

 

14
Feature-Block / the definitive "banner blaster/replacer"
« on: May 04, 2002, 09:43:35 PM »
i've modified this set of 3 filters, removing "Multi = TRUE", because of scott's recent explanation on the yahoo group.

-------cut here----------

Name = "Banner Blaster (limit text)"
Active = TRUE
URL = "$TYPE(htm)"
Bounds = "<(a\s[^>]++href=*</a>|input*>|layer*>)"
Limit = 900
Match = "(<layer*|\1<i(mg|mage|nput)*src=$AV(*)*>\3)"
        "&(*(href|src)=$AV($LST(AdKeys)*)|"
        "*http://*<i(mg|mage|nput)\s(*>&&"
        "(*width=[#460-480]&*height=[#55-60]*)|"
        "(*width=[#88]&*height=[#31]*)))"
        "&((*alt="")$SET(2=Ad)|*alt=$AV((?+{18})\2*|\2)|$SET(2=Ad))"
Replace = "<center>\1<font size=1 color=red>[\2]</font>\3</center>"


Name = "Banner Blaster (full text)"
Active = FALSE
URL = "$TYPE(htm)"
Bounds = "<(a\s[^>]++href=*</a>|input*>|layer*>)"
Limit = 900
Match = "(<layer*|\1<i(mg|mage|nput)*src=$AV(*)*>\3)"
        "&(*(href|src)=$AV($LST(AdKeys)*)|"
        "*http://*<i(mg|mage|nput)\s(*>&&"
        "(*width=[#460-480]&*height=[#55-60]*)|"
        "(*width=[#88]&*height=[#31]*)))"
        "&((*alt="")$SET(2=Ad)|*alt=$AV(\2)|$SET(2=Ad))"
Replace = "<center>\1<font size=1 color=red>[\2]</font>\3</center>"


Name = "Banner Replacer"
Active = TRUE
URL = "$TYPE(htm)"
Bounds = "<a\s[^>]++href=*</a>"
Limit = 800
Match = "<img (\1border=\w|) \2 src=$AV(*) (\3border=\w|) \4"
        "&(*(src|href)=$AV($LST(AdKeys)*)|"
        "(*width=[#460-480] & *height=[#55-60])|"
        "(*width=[#88] & *height=[#31]))*"
Replace = "<img \1 border=1 \2 src=http://Local.ptron/killed.gif \3 \4"


----------cut here-----------

i've found that without multiple matches allowed the filtering is sped up, especially if you use super opener.
just my opinion, but i think that these filters have reached their definitive version and can not be further improved.

note: you need to use only one of the two banner blasters.
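
apart from the AdKeys list, the three filters share the same size test on width and height. written out in python it looks like the sketch below; it assumes that the numeric ranges are inclusive, and the function name is made up.

```python
# Inclusive (low, high) ranges for width and height, taken from the filters above.
BANNER_SIZES = [
    ((460, 480), (55, 60)),  # full banner range, e.g. 468x60
    ((88, 88), (31, 31)),    # 88x31 button
]


def looks_like_banner(width: int, height: int) -> bool:
    """True when the image size falls in one of the ranges the filters test."""
    return any(
        w_lo <= width <= w_hi and h_lo <= height <= h_hi
        for (w_lo, w_hi), (h_lo, h_hi) in BANNER_SIZES
    )


print(looks_like_banner(468, 60))
print(looks_like_banner(200, 200))
```

an image that fails this test can still be caught by the filters if its href or src matches an entry of the AdKeys list.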

regards,
altosax.



Edited by - altosax on 04 May 2002  22:46:06

15
Feature-Block / remove content of <noscript> tags
« on: May 04, 2002, 09:39:53 PM »

i've made this little filter to speed up filtering:

Name = "Remove content of <noscript> tags"
Active = TRUE
URL = "$TYPE(htm)"
Bounds = "$NEST(<noscript,</noscript>)"
Limit = 1024
Match = "*"

and this is my explanation:

1. if you have javascript enabled, you never see the content of noscript tags;
2. most of the time, noscript tags contain banner code that matches many filters;
3. the most used banner filters scan all the ad lists every time they match banner code.

i suggest placing this filter at the very top of your filter set, just after tame javascript.
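
outside proxomitron, the effect of the filter can be imitated in python with a regular expression. it is only an approximation: $NEST also copes with nested tags, which this pattern does not, the Limit of 1024 is not modelled, and the function name is invented.

```python
import re

# A <noscript> block together with everything inside it.
NOSCRIPT = re.compile(r"<noscript\b.*?</noscript\s*>", re.IGNORECASE | re.DOTALL)


def strip_noscript(html: str) -> str:
    """Remove every <noscript>...</noscript> block, tags included."""
    return NOSCRIPT.sub("", html)


page = '<p>a</p><NOSCRIPT><img src="ad.gif"></noscript><p>b</p>'
print(strip_noscript(page))
```

once the block is gone, the banner code inside it never reaches the ad filters, so their lists are not scanned for it.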

regards,
altosax.

 

Pages: [1] 2