Author Topic: Kill Comments-surrounded Ads [vm], new filter  (Read 2711 times)

altosax

  • Sr. Member
  • ****
  • Posts: 328
Kill Comments-surrounded Ads [vm], new filter
« on: July 10, 2002, 10:11:50 AM »
hi all friends,
i'm working on a new filter that kills comments-surrounded ads.

the idea is not new, but the existing filters take different approaches. i know, for example, "Kill Comment-pair delimited Ads" and "Kill Comment-block Ads", but the first calls both AdPath and AdDomain, and the second tries to match all possible variations of the comments, which makes filtering much more expensive.

the basic assumption of my filter is that it does not have to match ALL comments-surrounded ads, because everyone here already has filters that remove ads. instead, every time it calls the list there should be a match, which removes a large block of code and reduces the work of the other filters (Banner Blaster, Kill Javascript Banners, Kill Nosey Javascripts and so on).
this means that false matches have to be resolved without calling the list. and there are a lot of false matches: <script>, <style>, real comments, commented ads not included in the list. all of them contain the <!-- part.

for this reason i don't use a match field like:
<!-- [^>]++ $LST(...)... or like
<!-- (auto|begin|...|) $LST(...)...
because this way the list is ALWAYS scanned, even for <script>, <style> and so on.
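
a rough python sketch of the reasoning above (this is not proxomitron syntax; the names AD_COMMENTS, GATE and scan are made up for illustration, and the two list entries are just samples). the cheap keyword test runs first, so <script>, <style> and ordinary comments never reach the list:

```python
import re

# Hypothetical stand-in for the AdComments list: (start, end) text pairs.
AD_COMMENTS = [("BURST", "END BURST"), ("ZEDO", "end ZEDO")]

# Cheap gate: only comments that open with one of these keywords reach the list.
GATE = re.compile(r"<!--\s*(auto|begin|start)\s*", re.IGNORECASE)


def scan(html):
    """Return (hits, scans): list matches found and how often the list was consulted."""
    hits = scans = 0
    for opener in re.finditer(r"<!--", html):
        gate = GATE.match(html, opener.start())
        if not gate:
            continue  # <script>, <style>, ordinary comments: the list is never touched
        scans += 1
        rest = html[gate.end():].lower()
        if any(rest.startswith(start.lower()) for start, _ in AD_COMMENTS):
            hits += 1
    return hits, scans


page = (
    "<script><!-- var x = 1; //--></script>"
    "<!-- a real comment -->"
    "<!-- BEGIN BURST --><img><!-- END BURST -->"
)
hits, scans = scan(page)
```

with this sample page there are four "<!--" openers, but only one of them passes the gate and causes a list lookup.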

i really appreciated the comments that sidki sent me privately, so i say a public "thank you, my friend" to him.
now the thread is open, so post your comments here.

this is the filter (its position in the filter set is not important, because its matching expression makes it match before the other ad killers):

Name = "Kill Comments-surrounded Ads [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 12000
Match = "<!-- (auto|begin|start) $LST(AdComments) *-->"
Replace = "<!-- Killed Comments-surrounded Ads -->"
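
to make the behaviour of the filter concrete, here is a minimal python emulation. it is only an approximation under stated assumptions: proxomitron's space and * wildcards are translated to \s* and .*?, Limit is modelled as a window on how far a match may extend, and the two list entries and the function names are illustrative only.

```python
import re

# Sample entries only; the real list is the AdComments file.
AD_COMMENTS = [("BURST", "END BURST"), ("Ad Space", "END Ad Space")]
REPLACEMENT = "<!-- Killed Comments-surrounded Ads -->"


def build_pattern(start, end):
    """Approximate '<!-- (auto|begin|start) start * end *-->' as a Python regex."""
    body = re.escape(start) + r".*?" + re.escape(end) + r".*?-->"
    return re.compile(r"<!--\s*(?:auto|begin|start)\s*" + body,
                      re.IGNORECASE | re.DOTALL)


PATTERNS = [build_pattern(s, e) for s, e in AD_COMMENTS]


def kill_comment_ads(html, limit=12000):
    """Replace every listed comment-surrounded block no longer than `limit` characters."""
    out = []
    pos = 0
    while pos < len(html):
        for pattern in PATTERNS:
            # The third argument stops the match from reading past the limit window.
            match = pattern.match(html, pos, pos + limit)
            if match:
                out.append(REPLACEMENT)
                pos = match.end()
                break
        else:
            out.append(html[pos])
            pos += 1
    return "".join(out)


page = ("<p>hi</p><!-- BEGIN BURST --><a href=x><img src=ad.gif></a>"
        "<!-- END BURST --><p>bye</p>")
cleaned = kill_comment_ads(page)
```

a block that does not fit inside the limit is simply left alone, which is why the limit has to be large enough for the longest ad block you expect.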

as you can see, it is really simple. i could also modify the matching expression this way:

Match = "<!-- (auto|begin|start)(^header|footer) $LST(AdComments) *-->"

but i've encountered that case on one site only. if i discover that it is more common, i could implement this change.

i've also made an alerting filter that notifies me when a page contains a comment matching the filter but not included in the list. this way i can read the code and add it to the list:

Name = "Notify new Comments-surrounded Ads [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 32
Match = "<!-- (auto|begin|insert|loader|open|start)\1"
Replace = "<!-- \1$ALERT(Found <!-- \1)"

place it just AFTER the killer one, so that all target comments that get past the first will generate an alert. note that i use it only to build the list; it doesn't filter anything, so if you don't care about adding new comments to the list you really don't need it.
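
the notifier can be pictured with this small python sketch. it assumes the page has already been through the killer filter, so whatever keyword comments are still present are the ones missing from the list; the function name and the sample comment are hypothetical.

```python
import re

# Same keyword set as the notify filter's Match line.
OPENER = re.compile(r"<!--\s*(auto|begin|insert|loader|open|start)", re.IGNORECASE)


def new_comment_alerts(filtered_html):
    """Return one alert string per keyword comment still present after the killer ran."""
    return ["Found <!-- " + m.group(1) for m in OPENER.finditer(filtered_html)]


leftover = ("<!-- Killed Comments-surrounded Ads -->"
            "<!-- Begin NewNetwork code -->x<!-- End NewNetwork code -->")
alerts = new_comment_alerts(leftover)
```

the page text itself is not changed here, just as the real filter writes the matched comment start back unchanged.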

and this is the current list:

# Proxomitron4 URL killfile: $LST(AdComments)
# Created by altosax on July 08, 2002
# Updated on July 13, 2002
#
# List for "Kill Comments-surrounded Ads [vm]" filter.
# To make it safer, add the longest possible comments here.
# Also, you need to add both the starting and the ending comment,
# separated by *. Do not add the ending --> here.

# AUTO
Banner Insertion Begin * Auto Banner Insertion Complete

# BEGIN
468 Ad area * End 468 Ad area
Ad Space * END Ad Space
ADVERT POWER * END ADVERT POWER
BAD ASS Advertising * END OF BAD ASS RANDOM ADVERTISEMENTS
Ban Man Pro * End Ban Man Pro
BANNER -- * end BANNER
BURST * END BURST
Crucial advertisement * end Crucial advertisement
Flycast Ad Copyright * End Flycast Ad Copyright
ITALIA HYPERBANNER * END ITALIA HYPERBANNER
LINKEXCHANGE CODE * END LINKEXCHANGE CODE
linswap Code * End linswap Code
Linux Waves Banner Exchange * End Linux Waves Banner Exchange
Nedstat Basic code * End Nedstat Basic code
ning Advertising nAdvert * End Advertising nAdvert
of MAFIA * end of MAFIA
of SpyLOG * end of SpyLOG
of Top100 * end of Top100
of TopList * end of TopList
PayCounter * End PayCounter
PayPal Logo * End PayPal Logo
RealHomepageTools * End RealHomepageTools
RICH-MEDIA BURST * END BURST
SEXCOUNTER ADVANCED CODE * END SEXCOUNTER ADVANCED CODE
SexList Counter Code * End SexList Counter Code
SEXLIST REFERRER-STATS CODE * END SEXLIST REFERRER-STATS CODE
SEXTRACKER CLIT CODE * DONE WITH SEXTRACKER CLIT CODE
SEXTRACKER CODE * END SEXTRACKER CODE
WEBSIDESTORY CODE * END WEBSIDESTORY CODE
ZEDO * end ZEDO

# INSERT (still empty)
# LOADER (still empty)
# OPEN (still empty)

# START
Gamma Entertainment * End Gamma Entertainment
of Ads- * End of Ads-
of ExtremeDM Code * End of ExtremeDM Code
of NedStat code * end of NedStat code
OF SITEWISE * END OF SITEWISE
OF WEBTRENDS LIVE * END OF WEBTRENDS LIVE
RedMeasure * END RedMeasure
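
for anyone who wants to check or post-process the list outside proxomitron, this is a small python sketch that reads the "start * end" layout used above. it is a simplification: real list entries are full matching patterns, while this treats them as plain text, and parse_ad_comments is a made-up name.

```python
def parse_ad_comments(text):
    """Split each 'start * end' entry into a (start, end) pair, skipping # lines."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or section comment such as "# BEGIN"
        start, sep, end = line.partition("*")
        if not sep:
            continue  # no * separator: not a start/end pair
        pairs.append((start.strip(), end.strip()))
    return pairs


SAMPLE = """# BEGIN
BURST * END BURST
BANNER -- * end BANNER

# START
RedMeasure * END RedMeasure
"""
pairs = parse_ad_comments(SAMPLE)
```

note that the keyword itself (auto, begin, start) is not part of an entry, because the filter has already matched it before the list is consulted.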

<edit>: the list was updated; new ideas are in the message below </edit>

regards,
altosax.

Edited by - altosax on 13 Jul 2002  00:57:25
 

altosax

Kill Comments-surrounded Ads [vm], new filter
« Reply #1 on: July 10, 2002, 06:04:17 PM »
this is the filter set i'm using now for debugging and for building the list. when the code contains a false match, it is revealed by "Notify Exclusion Keywords"; then i add it to "Pseudo Comments...". when the code contains a new comment that gets past the first 3 filters, it is revealed by "Notify new Comments..." and i can add it to the list. the current version of the list is kept up to date in the message above.

Name = "Pseudo Comments-surrounded Ads Jumper [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 32
Match = "<!-- (begin (left|right|footer))\1"
Replace = "<!-- \1"

Name = "Notify Exclusion Keywords [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 32
Match = "<!-- (auto|begin|insert|loader|open|start)\1 (left|right|head|bottom|footer|table)\2"
Replace = "<!-- \1 \2$ALERT(Found <!-- \1 \2)"

Name = "Kill Comments-surrounded Ads [vm]"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 12000
Match = "<!-- (auto|begin|start) $LST(AdComments) *-->"
Replace = "<span style=display:none;>Killed Comments-surrounded Ads</span>"

Name = "Notify new Comments-surrounded Ads [vm]"
Active = TRUE
URL = "$TYPE(htm)(^$LST(Reviewed))"
Limit = 32
Match = "<!-- (auto|begin|insert|loader|open|start)\1"
Replace = "<!-- \1$ALERT(Found <!-- \1)"

when the list is more or less complete, the only active filter will be the main one. the other 3 will be removed.
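
the order of the four filters matters, and this python sketch shows why. it assumes that at each position the first filter that matches consumes the text, so later filters never see it (this is how the set above is meant to work, but treat it as an assumption about the engine). the filter labels, the single list entry and the run function are illustrative only.

```python
import re

AD_COMMENTS = [("BURST", "END BURST")]  # sample entry only
KEYWORDS = r"(?:auto|begin|insert|loader|open|start)"
KILLED = "<span style=display:none;>Killed Comments-surrounded Ads</span>"


def ad_block(start, end):
    return (r"<!--\s*(?:auto|begin|start)\s*" + re.escape(start)
            + r".*?" + re.escape(end) + r".*?-->")


# (label, pattern, action) in the same order as the filter set above.
FILTERS = [
    ("jumper", re.compile(r"<!--\s*begin\s*(?:left|right|footer)", re.I), "pass"),
    ("notify-exclusion",
     re.compile(r"<!--\s*" + KEYWORDS + r"\s*(?:left|right|head|bottom|footer|table)", re.I),
     "alert"),
    ("kill", re.compile("|".join(ad_block(s, e) for s, e in AD_COMMENTS), re.I | re.S), "kill"),
    ("notify-new", re.compile(r"<!--\s*" + KEYWORDS, re.I), "alert"),
]


def run(html):
    """Apply the filters with first-match-wins at each position; return (output, alerts)."""
    out, alerts, pos = [], [], 0
    while pos < len(html):
        for label, pattern, action in FILTERS:
            match = pattern.match(html, pos)
            if not match:
                continue
            if action == "kill":
                out.append(KILLED)
            else:
                out.append(match.group(0))  # text passes through unchanged
                if action == "alert":
                    alerts.append((label, match.group(0)))
            pos = match.end()
            break
        else:
            out.append(html[pos])
            pos += 1
    return "".join(out), alerts


page = ("<!-- begin left column --><!-- start table -->"
        "<!-- BEGIN BURST -->ad<!-- END BURST --><!-- Begin FooAds -->")
output, alerts = run(page)
```

in this sample the jumper silently swallows "begin left", the exclusion notifier reports "start table", the killer removes the BURST block, and only the unknown "Begin FooAds" comment reaches the last notifier.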

regards,
altosax.

Edited by - altosax on 13 Jul 2002  00:53:54