Old Proxomitron Forums

Proxomitron Filters - Discussions welcome => Spam Blockers => Topic started by: sidki3003 on August 15, 2002, 02:31:01 AM

Title: Banner Killers: Another Two
Post by: sidki3003 on August 15, 2002, 02:31:01 AM
Updated 2002-08-25, AdDimensions.txt is newer than in the zip file.

Note: This message *always* contains the current versions of the filters and the AdDims list.


These are the two filters that do the main job in my config.
If you don't like to see what's been killed, you're better off with other filters.

[Blocklists]
List.AdList = "..\Lists\URL Killfile.txt"
List.AdPaths = "..\Lists\AdPathList.txt"
List.AdDomains = "..\Lists\AdDomainList.txt"
List.AdHosts = "..\Lists\AdHostList.txt"
List.AdDims = "..\Lists\AdDimensions.txt"
List.Bypass_Ads = "..\Lists\Bypass Ads.txt"

[Patterns]
Name = "Kill: Banners (linked)"
Active = TRUE
URL = "($TYPE(htm)|$TYPE(js))(^$LST(Bypass_Ads))"
Bounds = "<as*</a>"
Limit = 1536
Match = "<a[^>]++shref=$AV(1)*> (([^('][^<>]++)3 <*/a>|)&*ssrc=$AV(4)*&(*alt=$AV(2)|)&"
        "("
        ""
        "("
        "<a[^>]++shref(=|*ssrc=)$AV((http|ftp)(s|)://(^h)*)"
        "&*"
        "("
        "<[^>]+>"
        "&&*ssrc=$AV(4)*"
        "&&*width=[#41:*]*"
        "&&$LST(AdDims)*"
        ")*"
        ")"
        ""
        "|"
        ""
        "("
        "<a[^>]++shref(=|*ssrc=)"
        "("
        ""
        "$AV("
        "("
        "((^(http|ftp)(s|)://(^h))*[./-_?&:=]|)"
        "($LST(AdPaths))8"
        "([./-_?&:="']|(^?))"
        "*)"
        "&$SET(9=AdPr 8)"
        ")*"
        ""
        "|"
        "$AV("
        "(http|ftp)(s|)://(^h)"
        "("
        "$LST(AdList)"
        "|"
        "*[./-_?&:=]"
        "(ad|promo(s|)|ban|banner(s|)"
        ")8"
        "([./-_?&:="']|(^?))"
        "&$SET(9=AdP 8)"
        ")*"
        ""
        ")"
        ")"
        "&*src="
        "((*width=$AV(6) & *height=$AV(7)) *>$SET(5= 6x7)|)"
        ")"
        ""
        ")"
Replace = "<span class=prox style=display:inline;><center>"
          "<a href="1" target="_top" title="2"><font color=crimson>[Banner: </font></a>"
          "<a href="4" target="_top" title="3"><font color=crimson>95]</font></a>"
          "</center></span>"

Name = "Kill: Banners (not linked)"
Active = TRUE
URL = "($TYPE(htm)|$TYPE(js))(^$LST(Bypass_Ads))"
Bounds = "<((img|image|input|frame)s*>|iframe*</iframe>|layer*</layer>|ilayer*</ilayer>|applet*</applet>|object*</object>|embed*>(*</embed>|))"
Limit = 1536
Match = "*<"
        "("
        ""
        "("
        "("
        "(img|image|input)4s*src=$AV(((http|ftp)(s|)://(^h)*)1)"
        "|"
        "(frame|iframe|layer|ilayer|embed)4s*src=$AV(1)"
        "|"
        "(object|applet)4(*scodebase=$AV(*)&(*ssrc=$AV(1)|))"
        ")*"
        "&"
        "("
        "[^>]+>"
        "&&*width=[#41:*]*"
        "&&$LST(AdDims)*"
        ")*"
        ")"
        ""
        "|"
        ""
        "(img|image|input|frame|iframe|layer|ilayer|embed)4s"
        "("
        "*src="
        "("
        "$AV("
        "("
        "((^(http|ftp)(s|)://(^h))*[./-_?&:=]|)"
        "($LST(AdPaths))8"
        "([./-_?&:="']|(^?))*"
        ")1"
        "&$SET(9=AdPr 8)"
        ")"
        "|"
        "$AV(((http|ftp)(s|)://(^h)$LST(AdList)*)1)"
        "|"
        "$AV(((http|ftp)(s|)://(^h)*[./-_?&:=]"
        "(ad|promo(s|)|ban|banner(s|))8"
        "([./-_?&:="']|(^?))*)1&$SET(9=AdP 8))"
        ")*"
        "&"
        "((*width=$AV(6) & *height=$AV(7)) *>$SET(5= 6x7)|)"
        ")"
        ""
        ")"
        "&(*alt=$AVQ(2)$SET(3= title=2)|)"
Replace = "<span class=prox style=display:inline;>"
          "<a class="prox" id="proxlower" href='1' target="_top"3>[4: 95]</a></span>"

AdList, AdPaths, AdDomains, AdHosts are pretty standard. Based on Paul Rupe's lists and part of most non-default configs.
The only list I rewrote is AdDims, which is why I post it below. It is a work in progress.

------------------------------ AdDimensions.txt ------------------------------
# All banner dimensions list 2.0 beta (NOADDURL)
#
# For use in "by-size" banner filters in Proxomitron Naoko4
#
# Evgeny AKA Homeric

# 1.7.2001 Michael Bürschgens

# some dimensions by JD
# sidki 2002-03-13
# updated 2002-08-25

# ------------------------------------------------------
# common banners

(*width=([#460:490])6 & *height=([#60:85]|[#93]|[#98:105]|[#170])7)$SET(9=a.common.1 6x7)
(*width=([#120]|[#173]|[#230:240]|[#400:500])6 & *height=([#59]|[#60])7)$SET(9=a.common.2 6x7)

# ------------------------------------------------------
# square banners

(*width=([#125])6 & *height=([#125])7)$SET(9=a.square.1 6x7)
(*width=([#120])6 & *height=([#120])7)$SET(9=a.square.2 6x7)
(*width=([#100])6 & *height=([#100])7)$SET(9=a.square.3 6x7)
#(*width=([#200])6 & *height=([#200])7)$SET(9=a.square.4 6x7)

# ------------------------------------------------------
# buttons (88x31)

#(*width=([#88:91]|[#98:101])6 & *height=([#30:32])7)$SET(9=a.button.1 6x7)
#(*width=([#100]|[#45])6 & *height=([#30:32])7)$SET(9=a.button.2 6x7)

# ------------------------------------------------------
# Rare standard banners

(*width=([#120])6 & *height=([#240])7)$SET(9=a.rare.1 6x7)
(*width=([#230])6 & *height=([#30:33])7)$SET(9=a.rare.2 6x7)

# ------------------------------------------------------
# Non-standard banners (primarily adult sites)

(*width=([#400])6 & *height=([#80]|[#100]|[#120]|[#150])7)$SET(9=a.non-st.1 6x7)
(*width=([#450])6 & *height=([#150])7)$SET(9=a.non-st.2 6x7)
(*width=([#150])6 & *height=([#94])7)$SET(9=a.non-st.3 6x7)

# ------------------------------------------------------
# Miscellaneous graphics

(*width=([#200])6 & *height=([#300])7)$SET(9=a.misc.1 6x7)
(*width=([#425])6 & *height=([#225])7)$SET(9=a.misc.2 6x7)
(*width=([#336]|[#338])6 & *height=([#280]|[#282])7)$SET(9=a.misc.3 6x7)

# ------------------------------------------------------
# Monster banners

(*width=([#120]|[#160])6 & *height=([#600])7)$SET(9=a.monster.1 6x7)
(*width=([#720:760])6 & *height=([#85:100])7)$SET(9=a.monster.2 6x7)

# ------------------------------------------------------
# Tower banners

(*width=([#125])6 & *height=([#400])7)$SET(9=a.tower.1 6x7)
(*width=([#60])6 & *height=([#468])7)$SET(9=a.tower.2 6x7)

# ------------------------------------------------------
# Trackers

(*width=([#41])6 & *height=([#38])7)$SET(9=a.tracker.1 6x7)

# ------------------------------------------------------
# Site specific
$URL(http://[^/]++.yahoo.com)&(*width=([#300])6 & *height=([#250])7)$SET(9=a.Yahoo.1 6x7)


# ------------------------------------------------------
# User sizes go here...
----------------------------- /AdDimensions.txt ------------------------------
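
For readers who don't read Proxomitron's matching language, the logic of the list above can be modelled in a few lines of Python. This is only an illustrative sketch, not Proxomitron syntax: the names RULES, in_ranges and classify are made up here, and only three of the list's entries are carried over.

```python
# Each rule: (label, allowed width ranges, allowed height ranges).
# Ranges are inclusive (low, high) pairs, mirroring the [#low:high] entries.
RULES = [
    ("a.common.1", [(460, 490)], [(60, 85), (93, 93), (98, 105), (170, 170)]),
    ("a.square.1", [(125, 125)], [(125, 125)]),
    ("a.monster.1", [(120, 120), (160, 160)], [(600, 600)]),
]


def in_ranges(value, ranges):
    """True if value falls inside any of the inclusive ranges."""
    return any(low <= value <= high for low, high in ranges)


def classify(width, height):
    """Return 'label WIDTHxHEIGHT' for the first matching rule, else None."""
    for label, widths, heights in RULES:
        if in_ranges(width, widths) and in_ranges(height, heights):
            return f"{label} {width}x{height}"
    return None
```

The returned string plays the role of the info text that the list stores for the replacement, for example "a.common.1 468x60" for a 468x60 image.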

Notes:
Some things are a bit long-winded, because I was running out of variables.

Watch out for word wrap; better to use the link below.

The whole thing is here (uploaded/sidki3003/200282534357_Banner_Killers.zip).

/sidki

Edited by - sidki3003 on 25 Aug 2002  15:25:01
Title: Banner Killers: Another Two
Post by: JD5000 on August 15, 2002, 03:51:58 AM
Geez Sidki, you're gonna make me release another update.



--------
Infopros Joint :: Computer Related Links And Discussion (http://infoprosjoint.net/PN/html/index.php)
Title: Banner Killers: Another Two
Post by: sidki3003 on August 15, 2002, 03:56:16 AM
LMAO

 
Title: Banner Killers: Another Two
Post by: JD5000 on August 15, 2002, 06:10:20 AM
Hmmmm.... Great job sidki! So far I'm really digging these filters.

One question, why does the "web bug" filter have to be in the middle?

Title: Banner Killers: Another Two
Post by: JD5000 on August 15, 2002, 07:17:33 AM
Dang... I really like them! Gave me an idea for my kill marks too.

Another question, why do you call adpaths straight from the match?



Title: Banner Killers: Another Two
Post by: sidki3003 on August 15, 2002, 12:39:28 PM
Hi JD,

Great to hear that it works as expected in other configs as well.

quote:

One question, why does the "web bug" filter have to be in the middle?



"Kill: Banners (not linked)" would catch many of them and i guess you don't want detailed info about every 1x1 pixel image, right?
If you place the web bug filter above it, the web bug filter matches first, since both use the same bounds (<img>, that is).


quote:

Another question, why do you call adpaths straight from the match?



The filters do the following:
First, check for the image dimensions, because it's the fastest.
This applies to off-site images only, except for (iframe|layer|ilayer|embed).

Second, check if a *relative* path matches "AdPaths" ((href|src)="/foo/adverts/foo/...").
That's the second fastest.

Third, check if a relative path contains certain keywords, that would be too general for "AdPaths".

Fourth, check if an *absolute* path ((href|src)="http://foo.com/foo/adverts/foo/...") matches AdList (calling in turn AdPaths, AdDomains, AdHosts).
That is the slowest check and only done if the others fail.

edit: Ehm... it's not exactly like that, the two filters behave a little differently.
But you got the idea.

/sidki
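
The cheapest-first ordering described in this post can be sketched in Python. This is a rough model under stated assumptions, not the filters' actual logic (which, as noted, differs a little between the two): the sets below are invented stand-ins for AdDims, AdPaths, the keyword alternatives and AdList, and classify is a hypothetical name.

```python
from urllib.parse import urlparse

# Hypothetical stand-ins for the real lists; the entries are invented.
BANNER_SIZES = {(468, 60), (125, 125)}                      # role of AdDims
AD_PATH_SEGMENTS = {"adverts", "banners"}                   # role of AdPaths
AD_KEYWORDS = {"ad", "promo", "promos", "ban", "banner"}    # too general for AdPaths
AD_HOSTS = {"ads.example.com"}                              # role of AdHosts/AdDomains


def classify(url, width=None, height=None):
    """Return the name of the first check that fires, or None."""
    # 1. Dimensions: the cheapest test, so it runs first.
    if (width, height) in BANNER_SIZES:
        return "size"

    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]

    if not parsed.scheme:
        # 2. Relative path checked against the path list.
        if AD_PATH_SEGMENTS.intersection(segments):
            return "relative-path"
        # 3. Relative path checked against the broader keywords.
        if AD_KEYWORDS.intersection(segments):
            return "keyword"
        return None

    # 4. Absolute URL: the full, slowest lookup, reached only if nothing else fired.
    if parsed.hostname in AD_HOSTS or AD_PATH_SEGMENTS.intersection(segments):
        return "absolute"
    return None
```

With these sample sets, classify("/foo/adverts/foo/x.gif") stops at the second check, while classify("http://ads.example.com/x.gif") falls through to the last one.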

Edited by - sidki3003 on 15 Aug 2002  17:05:59
Title: Banner Killers: Another Two
Post by: Jor on August 15, 2002, 04:40:06 PM
Filters seem to work great in my config

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 15, 2002, 04:52:25 PM
Cool! I wasn't so sure about that, especially with the info display and the concatenated links for browsers other than IE.

 
Title: Banner Killers: Another Two
Post by: Jor on August 15, 2002, 04:59:19 PM
Oh, I took out the display (made the links into class proxlink, which are completely hidden from display).

But the base filters are a nice addition -- they seldom come into play with my config, as it is pretty sharp at catching rogue ads, but the few that slip through are now targeted by these filters.

Edited by - Jor on 15 Aug 2002  19:23:59
Title: Banner Killers: Another Two
Post by: JD5000 on August 15, 2002, 07:02:00 PM
Hehe, I kept the replace code. But I think I'm going to remove the links for "Kill: Banners (linked)". I mean, when I try to click them, they are killed by the URL-killer. LoL

I also liked how you gave each a diff color. So... I color coded all my kills using the "id" tag.


Edited by - JD5000 on 15 Aug 2002  20:02:41
Title: Banner Killers: Another Two
Post by: sidki3003 on August 15, 2002, 07:24:29 PM
quote:

I mean, when I try to click them, they are killed by the URL-killer. LoL




 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 16, 2002, 07:41:47 PM
Uh oh, the "Kill: Banners (not linked)" filter didn't do a nice job with the iframes here:
http://channels.netscape.com/ns/browsers/7/default.jsp
Hope I fixed it; the update is in the first post.

Note:
If an iframe is written by a script within the URL, the link won't work; it is just informational.
Check the browser's status bar (hmmm... at least with IE) to see what would have been loaded.

/sidki


 
Title: Banner Killers: Another Two
Post by: JD5000 on August 16, 2002, 08:53:35 PM
Ok, I must not be thinking straight... What did you change? I can't see it... I know it's going to be obvious. LoL

Title: Banner Killers: Another Two
Post by: sidki3003 on August 16, 2002, 09:12:04 PM
I replaced href="1" with href='1'.

 
Title: Banner Killers: Another Two
Post by: JD5000 on August 17, 2002, 12:48:58 AM


Title: Banner Killers: Another Two
Post by: TEggHead on August 17, 2002, 08:42:08 AM
FWIW,

A long time ago I found a banner blaster from ScoJo, based on the IAB size standards and using no list. At the time I was also using an AdDims list, and I noticed that the listless version was a couple of factors faster than my own. So I switched to ScoJo's version without a list... here it is, modified and extended since the original...

It's gonna wrap, so I put it in a code block, but you'll probably still have to make your window wider to let it unwrap.


Name = "Banner Blaster ImgSize IAB Standards (IMG/INPUT SRC) (ScoJo) kill"
Active = TRUE
URL = "*&(^$IHDR(Content-Type:*PrxOriginalType*))"
Bounds = "<I(mg|nput)[^>]+>"
Limit = 512
Match = "<((*width=( ((|\|)("|))[#0:5]( ((|\|)("|)) &*height=( ((|\|)("|))[#0:5]( ((|\|)("|))) $SET(8=Beacon)"
        " |(*width=( ((|\|)("|))[#1]  ( ((|\|)("|)) &*height=( ((|\|)("|))[#1]  ( ((|\|)("|))) $SET(8=Beacon)"
        " |(*width=( ((|\|)("|))[#100]( ((|\|)("|)) &*height=( ((|\|)("|))[#101]( ((|\|)("|))) $SET(8=100x100)"
        " |(*width=( ((|\|)("|))[#110]( ((|\|)("|)) &*height=( ((|\|)("|))[#110]( ((|\|)("|))) $SET(8=110x110)"
        " |(*width=( ((|\|)("|))[#120]( ((|\|)("|)) &*height=( ((|\|)("|))[#60] ( ((|\|)("|))) $SET(8=120x60)"
        " |(*width=( ((|\|)("|))[#120]( ((|\|)("|)) &*height=( ((|\|)("|))[#90] ( ((|\|)("|))) $SET(8=120x90)"
        " |(*width=( ((|\|)("|))[#120]( ((|\|)("|)) &*height=( ((|\|)("|))[#240]( ((|\|)("|))) $SET(8=120x240)"
        " |(*width=( ((|\|)("|))[#120]( ((|\|)("|)) &*height=( ((|\|)("|))[#600]( ((|\|)("|))) $SET(8=120x600)"
        " |(*width=( ((|\|)("|))[#125]( ((|\|)("|)) &*height=( ((|\|)("|))[#126]( ((|\|)("|))) $SET(8=125x125)"
        " |(*width=( ((|\|)("|))[#125]( ((|\|)("|)) &*height=( ((|\|)("|))[#600]( ((|\|)("|))) $SET(8=125x600)"
        " |(*width=( ((|\|)("|))[#160]( ((|\|)("|)) &*height=( ((|\|)("|))[#600]( ((|\|)("|))) $SET(8=160x600)"
        " |(*width=( ((|\|)("|))[#180]( ((|\|)("|)) &*height=( ((|\|)("|))[#150]( ((|\|)("|))) $SET(8=180x150)"
        " |(*width=( ((|\|)("|))[#200]( ((|\|)("|)) &*height=( ((|\|)("|))[#55] ( ((|\|)("|))) $SET(8=200x55)"
        " |(*width=( ((|\|)("|))[#230]( ((|\|)("|)) &*height=( ((|\|)("|))[#33] ( ((|\|)("|))) $SET(8=230x33)"
        " |(*width=( ((|\|)("|))[#234]( ((|\|)("|)) &*height=( ((|\|)("|))[#60] ( ((|\|)("|))) $SET(8=234x60)"
        " |(*width=( ((|\|)("|))[#240]( ((|\|)("|)) &*height=( ((|\|)("|))[#400]( ((|\|)("|))) $SET(8=240x400)"
        " |(*width=( ((|\|)("|))[#250]( ((|\|)("|)) &*height=( ((|\|)("|))[#250]( ((|\|)("|))) $SET(8=250x250)"
        " |(*width=( ((|\|)("|))[#300]( ((|\|)("|)) &*height=( ((|\|)("|))[#250]( ((|\|)("|))) $SET(8=300x250)"
        " |(*width=( ((|\|)("|))[#336]( ((|\|)("|)) &*height=( ((|\|)("|))[#280]( ((|\|)("|))) $SET(8=336x280)"
        " |(*width=( ((|\|)("|))[#468]( ((|\|)("|)) &*height=( ((|\|)("|))[#60] ( ((|\|)("|))) $SET(8=468x60)"
        " |(*width=( ((|\|)("|))[#468]( ((|\|)("|)) &*height=( ((|\|)("|))[#68] ( ((|\|)("|))) $SET(8=468x68)"
        " |(*width=( ((|\|)("|))[#468]( ((|\|)("|)) &*height=( ((|\|)("|))[#80] ( ((|\|)("|))) $SET(8=468x80)"
        " |(*width=( ((|\|)("|))[#468]( ((|\|)("|)) &*height=( ((|\|)("|))[#100]( ((|\|)("|))) $SET(8=468x100)"
        " |(*width=( ((|\|)("|))[#470]( ((|\|)("|)) &*height=( ((|\|)("|))[#60] ( ((|\|)("|))) $SET(8=470x60)"
        " |(*width=( ((|\|)("|))[#80] ( ((|\|)("|)) &*height=( ((|\|)("|))[#40] ( ((|\|)("|))) $SET(8=80x40)"
        " |(*width=( ((|\|)("|))[#81] ( ((|\|)("|)) &*height=( ((|\|)("|))[#63] ( ((|\|)("|))) $SET(8=81x63)"
        " |(*width=( ((|\|)("|))[#88] ( ((|\|)("|)) &*height=( ((|\|)("|))[#31] ( ((|\|)("|))) $SET(8=88x31)"
        " |(*width=( ((|\|)("|))[#88] ( ((|\|)("|)) &*height=( ((|\|)("|))[#32] ( ((|\|)("|))) $SET(8=88x32)"
        " |(*width=( ((|\|)("|))[#89] ( ((|\|)("|)) &*height=( ((|\|)("|))[#31] ( ((|\|)("|))) $SET(8=89x31)"
        " )*>                                                     "
        "&&<*(SRC=)([(\"']+|)1(((f|ht"+"|ht)tp(s|)(:|%3a)(/|%2f)+{2}|(/|%2f))|)3 5(["']|)2s6>"
        "&(^<*SRC=$AV(*/(ts|transparent|trans|tiny|spc|spacer|space|shim|s|pixel|pix|null|lin|leftnav_shim|lg-dot|empty|dummy|dot"
        "      |clear_pixel|cleardot|clear|circlespot001|c|box_??|box_?|blank|black|b|1x1|1ptrans|1pix|1).gif)*>)"
Replace = "<IMG 1http://Proxomi.Tron:82/_Images/BugOnE.gif?352 ALT=8 width=2 height=2>"


quote:
(*width=([#720:760])7 & *height=([#85:100])8) *>$SET(9=a.monster.2 7x8)

SidKi, if yer running out of variables, unfold the size tests, that'll give you two vars extra to play with

(If you're going internal (no list), put the size tests in numerical order; I found it sometimes skipped sizes otherwise.)

JarC


 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 17, 2002, 12:39:07 PM
Hey, lots of things to think about! THX

In case anybody else is fiddling with ad dimensions, here is the internal dimension list of the current WebWasher version (3.2 beta 4): Click me (uploaded/sidki3003/2002817133631_WebWasher_wwlist.zip).

I'm currently testing the sizes that are not yet part of the AdDims list.

/sidki

Edited by - sidki3003 on 17 Aug 2002  13:40:30
Title: Banner Killers: Another Two
Post by: altosax on August 17, 2002, 12:57:47 PM
tegghead filter:

quote:


        " |(*width=( ((|\|)("|))[#100]( ((|\|)("|)) &*height=( ((|\|)("|))[#101]( ((|\|)("|))) $SET(8=100x100)"
        " |(*width=( ((|\|)("|))[#125]( ((|\|)("|)) &*height=( ((|\|)("|))[#126]( ((|\|)("|))) $SET(8=125x125)"




they should be:


        " |(*width=( ((|\|)("|))[#100]( ((|\|)("|)) &*height=( ((|\|)("|))[#100]( ((|\|)("|))) $SET(8=100x100)"
        " |(*width=( ((|\|)("|))[#125]( ((|\|)("|)) &*height=( ((|\|)("|))[#125]( ((|\|)("|))) $SET(8=125x125)"


altosax.

Edited by - altosax on 17 Aug 2002  13:59:41
Title: Banner Killers: Another Two
Post by: TEggHead on August 18, 2002, 12:18:32 AM
I know, it's deliberate; that's also why I did not change the var text accordingly... I found these two matched a bit too often on regular images that I wanted to keep...

so I just changed the size... and left them in so I'd know why...



Edited by - TEggHead on 18 Aug 2002  01:20:28
Title: Banner Killers: Another Two
Post by: sidki3003 on August 18, 2002, 11:36:26 PM
Updated:
Takes care of multiple hrefs
Fixes some problems with missing </a> tags
Some speed-ups
Some bug fixes
Some other things

Changes are in the first post.


 
Title: Banner Killers: Another Two
Post by: lnminente on August 19, 2002, 12:05:54 AM
Hi sidki. Beautiful filters.

But one thing with flash:

<object
      codebase=http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=4,0,2,0
      height=60 width=468 classid=clsid:D27CDB6E-AE6D-11cf-96B8-444553540000>
                                    <param name="movie" value="http://www.exito.com/BANNER.swf">
                                    <param name="quality" value="best">
                                    <param name="play" value="true">
                                    <embed
      src="http://www.exito.com/BANNER.swf"
      type="application/x-shockwave-flash" width="468" height="60"
      pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"
      quality="best" play="true">
</embed>
                                 </object>

get converted as:

<span class=prox style=display:inline;><a class="prox" id="proxlower" href='http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=4,0,2,0' target="_top">[object:
a.common.1 468x60]</a></span>

and the href should be
"http://www.exito.com/BANNER.swf"

Another example code is:
<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=4,0,2,0" width="468" height="60">
                              <param name="movie" value="portal.swf">
                              <param name="quality" value="best">
                              <param name="play" value="true">
                              <embed src="portal.swf" type="application/x-shockwave-flash" width="468" height="60" pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash" quality="best" play="true">
                           </object>


Regards

Edited by - lnminente on 19 Aug 2002  01:10:04
Title: Banner Killers: Another Two
Post by: sidki3003 on August 19, 2002, 05:50:37 AM
Thanks! ... Fixed.

Update in the first post, or for those who feel familiar with their *.cfg:

Replace

"(object)4s*codebase=$AV(1)"

with

"(object)4(*scodebase=$AV(*)&(*ssrc=$AV(1)|))"

/sidki


 
Title: Banner Killers: Another Two
Post by: altosax on August 19, 2002, 12:12:10 PM
hi sidki,
i've started to analyze your banner killers, but i still haven't decided whether to replace the banner blaster i use with yours. this makes me suffer because i really like the banner blaster, so i'll take some time yet to come to a solution.
in the meantime i've found that you dupe the calls to adpath, the first directly in the filter code and the second through the adlist.
why this waste of time?

regards,
altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 19, 2002, 01:30:08 PM
Hi altosax,
quote:

this makes me suffer because i really like the banner blaster, so i'll take some time yet to come to a solution.

quote:

in the meantime i've found that you dupe the calls to adpath, the first directly in the filter code and the second through the adlist.
why this waste of time?

Look at the code again.
There is no double call:
Either i have a relative path ((/)foo/adverts/...)
-> Just AdPaths is called.
Or i have one with protocol (http://foo.com/adverts/...)
-> The whole enchilada (AdHosts, AdDomains, AdPaths) is called.

regards, sidki




Edited by - sidki3003 on 19 Aug 2002  14:31:10
Title: Banner Killers: Another Two
Post by: altosax on August 19, 2002, 07:34:23 PM
that was not my point,
i just hadn't seen the difference between AdP and AdPr

regards,
altosax.

Edited by - altosax on 19 Aug 2002  20:41:38
Title: Banner Killers: Another Two
Post by: sidki3003 on August 19, 2002, 07:58:03 PM
Ah, i understand.


 
Title: Banner Killers: Another Two
Post by: lnminente on August 21, 2002, 01:17:00 AM
Hi all.

I would like one thing: I prefer to always see the same words when I replace the banners, namely [Ad-Dim] and [Ad-Word]. That way my eyes don't have to spend time reading the information about the killed ad.

I suggest:

for "Kill: Banners (not linked)"
Replace = "<span class=prox style=display:inline;>"
          "<a class="prox" id="proxlower" href='1' target="_top"3 title="4: 95">[Ad-]</a></span>"

where the blank after "Ad-" would be "Dim" or "Word"

for "Kill: Banners (linked)" i would like it to work the same way.

Well, this is how i would like it.

Regards to all.

Edited by - lnminente on 21 Aug 2002  02:47:38
Title: Banner Killers: Another Two
Post by: sidki3003 on August 21, 2002, 01:33:02 AM
Hi lnminente,

Sorry, but i will not do that. That's why i wrote the disclaimer (2nd line 1st post).

2 suggestions:
If you don't want to see anything at all, just replace "display:inline" with "display:none". Same thing with most of my filters.

Jor modified the filters for his current config, and JD for his upcoming one.
Take a look at their versions.

regards, sidki

 
Title: Banner Killers: Another Two
Post by: lnminente on August 21, 2002, 01:45:37 AM
The suggestion kaput

Well, i hope someone likes the tip.

Note: I like to see what i kill. But i prefer this other way.

Regards

Edited by - lnminente on 21 Aug 2002  02:56:23
Title: Banner Killers: Another Two
Post by: sidki3003 on August 21, 2002, 02:15:26 AM


 
Title: Banner Killers: Another Two
Post by: TEggHead on August 21, 2002, 09:29:09 AM
quote:
Well, i hope someone likes the tip.


Ola Inminente,


It's still a good tip; I use this type of replacement a lot myself. But that is the beauty of Prox: if you want it a bit different, you can freely change the filter to suit your needs/preferences. You can even combine the two and keep the replacements hidden, visible only when you want.

(Take a look at AdContainerRemover; it comes with an HTML file to test whether your browser supports bookmarklets, with which you can toggle the replacement visibility on or off.)

BTW. You may not have realized this yet, but it is also a great method to give a hint which filter did it...my Linked ad filters insert [Ad-], my banner filters insert [BNR-], my ActiveX [AX-] etc.



 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 21, 2002, 03:14:22 PM
lnminente,

I completely forgot about that option LOL. Modify the filter to suit *your* needs.
If you do so, be aware that "title=" is already in use.
For "Kill: Banners (not linked)":

Replace = "<span class=prox style=display:inline;>"
"<a class="prox" id="proxlower" href='1' target="_top"3>[4: 95]</a></span>"

3 resolves to "title=2"
where 2 is the alternative text of the image.


JarC,
quote:
quote:
(*width=([#720:760])7 & *height=([#85:100])8) *>$SET(9=a.monster.2 7x8)

SidKi, if yer running out of variables, unfold the size tests, that'll give you two vars extra to play with

I tried hard to understand what you meant by that, but you lost me here.
The only simplification i can think of is by using the stack for the list like so:

(*width=([#720:760])# & *height=([#85:100])#) *>$SET(9=a.monster.2 #x#)

But then it's gone for future versions of the filter ...


/sidki


 
Title: Banner Killers: Another Two
Post by: lnminente on August 21, 2002, 03:28:40 PM
Hi all.

To TEggHead: Many thanks, you made two beautiful filters for me that i couldn't have made myself.

To Sidki: Hi, thanks for the information. I still have to analyze your filters in depth to modify them to my needs.

Regards.

 
Title: Banner Killers: Another Two
Post by: TEggHead on August 21, 2002, 09:42:33 PM
quote:

(*width=([#720:760])7 & *height=([#85:100])8) *>$SET(9=a.monster.2 7x8)
quote:

SidKi, if yer running out of variables, unfold the size tests, that'll give you two vars extra to play with

I tried hard to understand what you meant by that, but you lost me here.



I saw that one later and using this format, unfolding would take a lot of lines, but what I meant was something like this one

(*width=([#400])7 & *height=([#80]|[#100]|[#120]|[#150])8) *>$SET(9=a.nonst.1 7x8)

and writing it like
(*width=[#400] & *height=[#80]) *>$SET(9=a.nonst.1 400x80)
(*width=[#400] & *height=[#100]) *>$SET(9=a.nonst.1 400x100)
(*width=[#400] & *height=[#120]) *>$SET(9=a.nonst.1 400x120)
(*width=[#400] & *height=[#150]) *>$SET(9=a.nonst.1 400x150)

which would free 7 and 8 if it weren't for the existing range notations already in use...simple if it is only 2-5 but not with a range of 40+



 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 21, 2002, 11:14:08 PM
Ok, got it

A lot of extra lines indeed. 41x16 for the "a.monster.2" line.
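
The cost of unfolding can be checked with a short calculation. The Python sketch below only does the arithmetic, counting both endpoints of each range; values_in and unfolded_lines are hypothetical helper names, not anything from Proxomitron.

```python
def values_in(ranges):
    """Count the integers covered by a list of inclusive (low, high) ranges."""
    return sum(high - low + 1 for low, high in ranges)


def unfolded_lines(width_ranges, height_ranges):
    """One unfolded list line is needed per width/height combination."""
    return values_in(width_ranges) * values_in(height_ranges)


# Width 400 with four possible heights: the small example unfolded by hand.
print(unfolded_lines([(400, 400)], [(80, 80), (100, 100), (120, 120), (150, 150)]))  # 4

# Width 720..760 and height 85..100, as in the a.monster.2 line.
print(unfolded_lines([(720, 760)], [(85, 100)]))  # 656 (41 widths x 16 heights)
```

So unfolding is cheap for entries with a handful of fixed sizes, but a single wide range multiplies into hundreds of lines.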

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 22, 2002, 05:30:13 AM
Updated:

Format of AdDims list changed -> speed-up
Additions to AdDims list
Absolute paths on the same host are treated like relative paths
Attempt to make the code more transparent for others (and me)
Some bugfixes
Some tweaks
Some other things

Thanks to:
JD for test driving, adding some tweaks, etc...
lnminente for pointing to yet another banner construct

Changes in the first post


 
Title: Banner Killers: Another Two
Post by: lnminente on August 22, 2002, 03:54:59 PM
Hi sidki, this one doesn't get matched for me:

<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.#version=5,0,30,0" height="60" width="468">
<param name="movie" value="sdb.swf">
<param name="quality" value="best">
<param name="play" value="true">
<embed height="60" pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash" src="sdb.swf" type="application/x-shockwave-flash" width="468" quality="best" play="true">
</object>

Note: Just now, i was going to tell you about one bug, but i realized that you had already corrected it

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 22, 2002, 04:00:40 PM
Scratches head ...
So do you get it matched or not? ...

 
Title: Banner Killers: Another Two
Post by: lnminente on August 22, 2002, 04:07:25 PM
It doesn't match, see here: http://www.terra.es/personal9/woodstock15h/images2/new_page_5.htm

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 22, 2002, 04:19:34 PM
Hmm ... the filter matches here.

Maybe a copy/paste error?
Else try...
The debug view: log window -> edit -> view HTML Debug info
Turning off the other filters, especially those with <img> bounds.

 
Title: Banner Killers: Another Two
Post by: lnminente on August 22, 2002, 04:33:31 PM
You are right, the filter matches; there was a problem importing the lists.

Sorry for the scare. And thanks again Sidki.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 22, 2002, 04:51:51 PM
No problem

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 23, 2002, 02:59:05 AM
'lil bugfix

Changes in the first post


 
Title: Banner Killers: Another Two
Post by: altosax on August 23, 2002, 11:12:06 AM
hi sidki,
just some little tweaks.

1.
"<[^>]++>&&*ssrc=$AV(4)*&&$LST(AdDims)*"

in the linked one you can replace a double check with a single check by changing ++ to +, like this:

"<[^>]+>&&*ssrc=$AV(4)*&&$LST(AdDims)*"

2.
"((*width=$AV(([#0:*])6) & *height=$AV(([#0:*])7)) *>$SET(5= 6x7)|)"

in both filters you can remove the check and make the assignment to variables faster this way:

"((*width=$AV(6) & *height=$AV(7)) *>$SET(5= 6x7)|)"

3.
in my banner blaster i've recently modified bounds to match both <iframe*> and <iframe*>*</iframe> this way:

<iframe*>(*</iframe>|)


still evaluating,
altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 23, 2002, 01:46:41 PM
Hi altosax,

I checked your smart modifications:
quote:

1.
"<[^>]+>&&*ssrc=$AV(4)*&&$LST(AdDims)*"


Right, does the same and is faster.
quote:

2.
"((*width=$AV(6) & *height=$AV(7)) *>$SET(5= 6x7)|)"


Same and faster.
quote:

3.
<iframe*>(*</iframe>|)


That one looks simple but it's not. I'll be back with that later.


The changes will be in the next update.

Thanks


regards, sidki

 
Title: Banner Killers: Another Two
Post by: lnminente on August 23, 2002, 03:25:23 PM
Another tip for speed:

Looking at AdDimensions.txt, i see that the smallest width is 41 (i'm not sure it is exactly 41).

It would be faster to first test that the width is 41 or bigger and, if it is not, to skip checking AdDimensions.txt altogether.

Hope you like it
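
A minimal Python sketch of this guard, assuming a simplified lookup table: MIN_LISTED_WIDTH, SIZES and lookup are hypothetical names, and the two entries are carried over from the list posted in this thread, where 41 is the smallest width among the active entries.

```python
# Smallest width among the active entries of the list posted in this thread.
MIN_LISTED_WIDTH = 41

# Two entries carried over from that list: (width, height) -> label.
SIZES = {(41, 38): "a.tracker.1", (125, 125): "a.square.1"}


def lookup(width, height):
    """Return (label, list_was_consulted)."""
    if width < MIN_LISTED_WIDTH:
        # Cheap rejection: nothing in the list can match, so skip it.
        return None, False
    return SIZES.get((width, height)), True
```

Small icons and spacer images, which are very common on pages, never reach the list at all, which is where the saving would come from.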



Edited by - lnminente on 23 Aug 2002  16:27:49
Title: Banner Killers: Another Two
Post by: sidki3003 on August 23, 2002, 03:32:54 PM
Hi lnminente,

Sounds interesting
I'll check that out.

 
Title: Banner Killers: Another Two
Post by: lnminente on August 23, 2002, 03:41:09 PM
   

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 23, 2002, 05:37:10 PM
Hi altosax and lnminente,

altosax,

3.
<iframe*>(*</iframe>|)

I'm happy about this one.
It does indeed check the entire byte range for "</iframe>" before falling back to "nothing".
That is, it does the opposite of:
<iframe*>(|*</iframe>)

I also changed <object> and <embed> accordingly.
Not sure about layer/ilayer.
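
The effect of the alternative order can be reproduced with an analogy in Python's re module, whose alternation also tries the left branch first. This is only an analogy under that assumption, not Proxomitron's matcher; the patterns and the helper name bounded are made up for illustration.

```python
import re

# Left branch first: look for the closing tag, fall back to nothing.
CLOSING_FIRST = re.compile(r"<iframe[^>]*>(?:.*?</iframe>|)", re.S)
# Left branch first: take nothing, so the closing tag is never looked for.
EMPTY_FIRST = re.compile(r"<iframe[^>]*>(?:|.*?</iframe>)", re.S)


def bounded(pattern, html):
    """Return the text a pattern would take as its bounds at the start of html."""
    return pattern.match(html).group(0)


closed = '<iframe src="ad.html"><p>fallback</p></iframe><p>after</p>'
unclosed = '<iframe src="ad.html"><p>rest of page</p>'

print(bounded(CLOSING_FIRST, closed))    # the whole <iframe ...>...</iframe>
print(bounded(EMPTY_FIRST, closed))      # only the opening <iframe ...> tag
print(bounded(CLOSING_FIRST, unclosed))  # only the opening tag (fallback branch)
```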



lnminente,

*Big* speed improvement, especially for the "Kill: Banners (not linked)" filter.
Thanks a lot for this idea!


I changed so much, i hope i didn't break anything else.
So i'll have to test it for a while.

regards, sidki


 
Title: Banner Killers: Another Two
Post by: Jor on August 23, 2002, 06:23:10 PM
Hi,

A few notes/questions:
Kill: Banners (linked) uses <as*</a>. Isn't $NEST(<a,</a>) faster?

Kill: Banners (not linked) uses a lot of strings which could be optimized.
Mine looks like this: <i(mg|nput)s*>|<frames*>|$NEST(<iframe,</iframe>)|$NEST(<layer,</layer>)|$NEST(<ilayer,</ilayer>)|$NEST(<object,</object>)|<embeds*>|$NEST(<applet,</applet>)
(As you can see I also added applet and frame. The latter because in MSIE's backwards compatibility mode, it works the same as iframe when used inline, with the exception that it has no closing tag).

Question: is there a difference in functionality between <iframe*>(*</iframe>|) or <iframe*>(|*</iframe>) and $NEST(<iframe,</iframe>)?
Iframes without a closing tag don't work anyway.

Also, embed has no closing tag: it was never part of the HTML standard, and thus did not transfer to XHTML, which added closing tags for all elements. Correct HTML uses <object> instead, and this tag does require a closing tag.

Typical <embed> usage (only in HTML4.01 Transitional Documents and lesser) is still:
<EMBED SRC="/path/file.cmx" WIDTH="100" HEIGHT="200">
<NOEMBED>
  <P>Sorry, but you do not have a Corel CMX plugin for
   displaying Corel CMX image files. Here is an alternate
   version, as a regular GIF.</P>
<IMG SRC="/path/file.gif" HEIGHT="200" WIDTH="100"
 ALT="stupid example image">
</NOEMBED>


Lastly, is the post edited with all changes, or need I download the zipfile again to get the last version of AdDims as well?

Edited by - Jor on 23 Aug 2002  19:26:25
Title: Banner Killers: Another Two
Post by: lnminente on August 23, 2002, 06:38:17 PM
quote:

lnminente,

*Big* speed improvement, especially for the "Kill: Banners (not linked)" filter.
Thanks a lot for this idea!



It's good to hear it. Thanks to you, my friend.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 23, 2002, 06:46:51 PM
Hi Jor,

Just the answers to 2 questions. I have to look into the others more carefully.

Nested/not nested <a> bounds:
Well, at least not statistically significant. Here is what i typically get:

not nested:
Sample Text: 4031 bytes
Successful Matches: 6
Avg time: 6.762277 (milliseconds)

nested:
Sample Text: 4031 bytes
Successful Matches: 6
Avg time: 6.739955 (milliseconds)

Plus: See Scott's comments on using $NEST() in such cases not being a good idea.


Update: The first post is always edited to contain the latest changes to filters and list.
Today's changes are not yet included though.

/sidki


 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 23, 2002, 08:21:22 PM
quote:

Kill: Banners (not linked) uses a lot of strings which could be optimized.
Mine looks like this: <i(mg|nput)s*>|<frames*>|$NEST(<iframe,</iframe>)|$NEST(<layer,</layer>)|$NEST(<ilayer,</ilayer>)|$NEST(<object,</object>)|<embeds*>|$NEST(<applet,</applet>)


i(mg|nput)s:
Do you see a speed difference or is it just to shorten the line?

$NEST():
After reading *this* (http://asp.flaaten.dk/pforum/topic.asp?TOPIC_ID=620#2191) and *that* (http://asp.flaaten.dk/pforum/topic.asp?ARCHIVE=&whichpage=1&TOPIC_ID=565#1883), i did some benchmarks to compare nested versus not nested.
The differences were really minor (talking about my box of course).
Since then i don't use $NEST() with structures that aren't nested.
quote:

(As you can see I also added applet and frame. The latter because in MSIE's backwards compatibility mode, it works the same as iframe when used inline, with the exception that it has no closing tag).


I'll include those in the filter.
quote:

Question: is there a difference in functionality between <iframe*>(*</iframe>|) or <iframe*>(|*</iframe>) and $NEST(<iframe,</iframe>)?
Iframes without a closing tag don't work anyway.


I didn't know that, thanks for sharing. As to using $NEST(), see above.
quote:

Also, embed has no closing tag: it was never part of the HTML standard, and thus did not transfer to XHTML, which added closing tags for all elements. Correct HTML uses <object> instead, and this tag does require a closing tag.


embed:
I see both, <embed> with and without closing tag in practice, and the source i mostly use mentions both, too.
http://developer.netscape.com/docs/manuals/htmlguid/tags14.htm#1286379
So maybe it's just a matter of taste if you want to leave </embed> alone or not.

object:
Not sure about that one, i think i've seen it working without closing tag (IE6).



/sidki

Edited by - sidki3003 on 23 Aug 2002  22:12:19
Title: Banner Killers: Another Two
Post by: altosax on August 23, 2002, 08:40:09 PM
jor wrote:

quote:

Question: is there a difference in functionality between <iframe*>(*</iframe>|) or <iframe*>(|*</iframe>) and $NEST(<iframe,</iframe>)?
Iframes without a closing tag don't work anyway.



you are right. i don't know why i've changed my bounds, probably after reading something. now i changed them back to <iframe*</iframe>.

<edit>: i've found where i've read that. here:
http://asp.flaaten.dk/pforum/topic.asp?whichpage=1&ARCHIVEVIEW=&TOPIC_ID=790#3428

thanks,
altosax.

Edited by - altosax on 24 Aug 2002  19:24:41
Title: Banner Killers: Another Two
Post by: sidki3003 on August 23, 2002, 08:52:31 PM
But let's not forget the difference between "<tag*>(*</tag>|)" and "<tag*>(|*</tag>)".
At least to me that wasn't obvious at all.

 
Title: Banner Killers: Another Two
Post by: altosax on August 23, 2002, 09:11:44 PM
sidki wrote:

quote:

But let's not forget the difference between "<tag*>(*</tag>|)" and "<tag*>(|*</tag>)".
At least to me that wasn't obvious at all.



because of the OR function, if the first expression is true the second will never be evaluated. this means that if you write:

<tag*>(|*</tag>)

the *</tag> will never be evaluated because the first expression _nothing_ is always true. in fact, once <tag*> has matched, _nothing_ always matches right after it.
instead, if you write:

<tag*>(*</tag>|)

the *</tag> will always be evaluated first, so if it exists it will match.
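
The ordering effect described here can be reproduced with Python's `re` module, which also tries alternatives left to right. This is only an analogy: the patterns are rough Python translations (an assumption), not Proxomitron expressions, and the sample HTML is made up.

```python
import re

# Closing tag tried first, empty alternative as the fallback.
CLOSE_FIRST = re.compile(r"<iframe[^>]*>(?:.*?</iframe>|)", re.S)

# Empty alternative tried first; it succeeds at once, so the closing tag
# is never looked for.
EMPTY_FIRST = re.compile(r"<iframe[^>]*>(?:|.*?</iframe>)", re.S)

closed = '<iframe src="ad.html">fallback text</iframe> rest'
unclosed = '<iframe src="ad.html"> no closing tag here'

close_first_match = CLOSE_FIRST.match(closed).group(0)
empty_first_match = EMPTY_FIRST.match(closed).group(0)
unclosed_match = CLOSE_FIRST.match(unclosed).group(0)

print(close_first_match)  # whole element, up to and including </iframe>
print(empty_first_match)  # opening tag only
print(unclosed_match)     # opening tag only (fallback to "nothing")
```

With the closing tag listed first, the empty alternative only acts as a fallback for an unclosed tag; with the empty alternative listed first, the match always stops at the opening tag.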

hth,
altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 23, 2002, 09:24:27 PM
I've got that by now

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 24, 2002, 12:38:38 AM
Update:
Big speed-up in the AdDims call
Added two more tags to check
Quite a few other things

The filter has turned into sort of teamwork.
So big thanks to all who contributed.

Changes in the first post.

Edited by - sidki3003 on 24 Aug 2002  04:21:00
Title: Banner Killers: Another Two
Post by: altosax on August 24, 2002, 12:30:44 PM
hi sidki,
in bruce eckel's latest book, the recent beta release of "thinking in proxomitron language", i've found these 3 contributions that you can apply to your banner filters.

1.
In the first one, you can replace the line:

Match = "<a[^>]++shref=$AV(1)*> (([^('][^<>]++)3 <*/a>|)&*ssrc=$AV(4)*&(*alt=$AV(2)|)&"

with these (alt text snipped after first 18 chars):

Match = "<a[^>]++shref=$AV(1)*> (([^('][^<>]++)3 <*/a>|)&*ssrc=$AV(4)*"
        "&((*alt="")$SET(2=Ad)|*alt=$AV((?+{18})2*|2)|$SET(2=Ad))&"

or these if you prefer (alt text not snipped):

Match = "<a[^>]++shref=$AV(1)*> (([^('][^<>]++)3 <*/a>|)&*ssrc=$AV(4)*"
        "&((*alt="")$SET(2=Ad)|*alt=$AV(2)|$SET(2=Ad))&"

This comes from the modifications he made to the default "Banner Blaster" and always returns a value for the 2 variable, so you will always have a title in the replacement expression.
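
The intent of the snipped variant (always end up with a title: "Ad" when alt is missing or empty, otherwise the first 18 characters of the alt text) can be sketched in Python. The helper name `banner_title` is hypothetical, and the sketch only handles double-quoted alt values, which is a simplification.

```python
import re

# Simplified: only matches alt="..." with double quotes.
ALT_RE = re.compile(r'\balt\s*=\s*"([^"]*)"', re.I)


def banner_title(tag: str, max_len: int = 18, default: str = "Ad") -> str:
    """Return a title for the replacement link built from the tag's alt text."""
    m = ALT_RE.search(tag)
    # Missing alt attribute or empty alt text: fall back to the default.
    if m is None or m.group(1) == "":
        return default
    # Otherwise keep at most max_len characters of the alt text.
    return m.group(1)[:max_len]
```

Passing a large `max_len` would correspond to the second variant, where the alt text is not snipped.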

2.
Because you already have matched the bounds, in the first filter you can replace this:

"<[^>]+>"
"&&*ssrc=$AV(4)*"
"&&*width=[#41:*]*"
"&&$LST(AdDims)*"

with this:

"<[^>]+>"
"&*ssrc=$AV(4)"
"&*width=[#41:*]"
"&$LST(AdDims)"

and in the second one you can replace this:

"[^>]+>"
"&&*width=[#41:*]*"
"&&$LST(AdDims)*"

with this:

"[^>]+>"
"&*width=[#41:*]"
"&$LST(AdDims)"

this way you don't have to re-match the whole bounds, which gives a little speed improvement.

3.
according to what he wrote, and tegghead also, you could rewrite the addims list, manually setting the replacement instead of using the variables when possible. this should improve the speed a little bit. so the line:

(*width=([#120]|[#173]|[#230:240]|[#400:500])6 & *height=([#60])7)$SET(9=a.common.2 6x7)

could be re-written as:

(*width=([#120]|[#173]|[#230:240]|[#400:500])6 & *height=[#60])$SET(9=a.common.2 6x60)

the same for all other lines.


search for all of bruce eckel's free books at http://www.mindview.net/

regards,
altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 24, 2002, 02:58:27 PM
Hi altosax,

1.
((*alt="")$SET(2=Ad)|*alt=$AV(2)|$SET(2=Ad))

"Ad" is not really an extra info, is it? We know that through the replacement link anyway.

I prefer to see the entire alt text, since it's shown only in the flyover.

2.
Not so.

"<[^>]+>"
"&&*ssrc=$AV(4)*"
"&&*width=[#41:*]*"
"&&$LST(AdDims)*"

makes sure that these tests all match within the *same* "<[^>]+>" range.

Example:
http://www.extremetech.com/
In AdDims a.button.2 must be uncommented for this.

<a href="http://www.extremetech.com/category2/0,3971,236,00.asp" class="bgcolor2">
<img src="/images/spacer.gif" width="26" height="1" border="0">
<img src="http://common.ziffdavisinternet.com/util_get_image/1/0,3363,i=12372,00.jpg" width="100" height="30" alt="" border="0">
<img src="/images/spacer.gif" width="26" height="1" border="0">
</a>

With

"<[^>]+>"
"&*ssrc=$AV(4)*"
"&*width=[#41:*]*"
"&$LST(AdDims)*"

the AdDims call matches the 2nd <img*>, while the src test matches the first.
Avoiding that is why i made this routine in the first place.
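
The mismatch described above can be modeled in Python using the extremetech snippet from this post. This is a simplified model of the problem, not a reimplementation of Proxomitron's matching: `independent_checks` tests src and dimensions anywhere inside the link, while `same_tag_checks` requires both inside one and the same tag. Both function names are hypothetical, and `AD_DIMS` is a placeholder standing in for the a.button.2 entry.

```python
import re

ANCHOR = '''<a href="http://www.extremetech.com/category2/0,3971,236,00.asp" class="bgcolor2">
<img src="/images/spacer.gif" width="26" height="1" border="0">
<img src="http://common.ziffdavisinternet.com/util_get_image/1/0,3363,i=12372,00.jpg" width="100" height="30" alt="" border="0">
<img src="/images/spacer.gif" width="26" height="1" border="0">
</a>'''

AD_DIMS = {(100, 30)}  # placeholder, not the real AdDimensions.txt

TAG_RE = re.compile(r"<img[^>]+>", re.I)
SRC_RE = re.compile(r'\ssrc="([^"]*)"', re.I)
WIDTH_RE = re.compile(r'\swidth="(\d+)"', re.I)
HEIGHT_RE = re.compile(r'\sheight="(\d+)"', re.I)


def tag_dims(tag):
    """Return (width, height) of a tag, or None if either is missing."""
    w, h = WIDTH_RE.search(tag), HEIGHT_RE.search(tag)
    return (int(w.group(1)), int(h.group(1))) if w and h else None


def independent_checks(anchor):
    """src and dimensions are tested separately, anywhere in the link."""
    src = SRC_RE.search(anchor)
    has_ad_dims = any(tag_dims(t) in AD_DIMS for t in TAG_RE.findall(anchor))
    return src.group(1) if src and has_ad_dims else None


def same_tag_checks(anchor):
    """src and dimensions must both come from one and the same tag."""
    for tag in TAG_RE.findall(anchor):
        src = SRC_RE.search(tag)
        if src and tag_dims(tag) in AD_DIMS:
            return src.group(1)
    return None


print(independent_checks(ANCHOR))  # src of the first <img> (the spacer)
print(same_tag_checks(ANCHOR))     # src of the <img> with ad dimensions
```

The independent version reports the spacer's src even though the ad dimensions belong to the second image, which is the mix-up the same-tag form is meant to avoid.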

Same goes for filter 2.

3.
Stuffing a (small) string into a variable and recalling it later happens at the speed of light (IMO).
JarC's point was to give me more variables for the filters, if i could get rid of 6 (and 7).

I searched for Bruce Eckel's Proxomitron book on the link you posted and on Google, too.
Nothing. Can you give me a direct link?



Jor,

Do you have example links for frame and applet tags with adish dimensions?
I want to see if the check should take place for all links or for off-site links only.


Also, did anyone find any entries in AdPaths being too restrictive for the check on the current host?
If so, they can be moved from the list to the filters. It's this line:
"(ad|promo(s|)|ban|banner(s|)"



regards, sidki

Edited by - sidki3003 on 24 Aug 2002  17:01:25
Title: Banner Killers: Another Two
Post by: altosax on August 24, 2002, 03:48:03 PM
sidki wrote:

quote:

I searched for Bruce Eckel's Proxomitron book on the link you posted and on Google, too.
Nothing. Can you give me a direct link?



i've sent it to you directly in your mailbox.

just another question. what does the blue code mean?

Match = "<a[^>]++shref=$AV(1)*> (([^('][^<>]++)3 <*/a>|)

i think you should remove it because it's useless. it will never match due to the bounds.

in fact it will always be the part in red that matches, because the Bounds = "<as*</a>" imply that <a[^>]++shref=$AV(1)*> can not match with _nothing_, so you can safely remove that part.

do you agree?
altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 24, 2002, 04:22:11 PM
Hi altosax,

Book: Thanks! It's fantastic

Code: Try the test window, e.g.
<a href="http://foo.com/"><img src="http://foo.com/foo.jpg" width="469" height="60"></a>
It's because [^('] consumes a char.

regards, sidki


 
Title: Banner Killers: Another Two
Post by: altosax on August 24, 2002, 05:29:58 PM
hi sidki,
i've analyzed this line:

"(<a[^>]++shref|*ssrc)=$AV((http|ftp)(s|)://(^h)*)"

the first expression always matches, so when href is encountered the linked url is evaluated. if the location is on the same host, the second expression of the or function will be evaluated. but when this happens, due to the wildcard and the fact that <a[^>]++shref is also inside the parens, the evaluation has to restart from the beginning, and due to the space it stops every time s is matched until the subsequent word is src.

so i propose:

"<a[^>]++shref(=|*ssrc=)$AV((http|ftp)(s|)://(^h)*)"

this way the scanning of the chars never has to restart from the beginning. when href is encountered, the presence of the = sign is verified and then the host is checked; otherwise the second part of the or function is evaluated, but this time the scanning of the chars continues from that point.

i know that this new code is not elegant, so if you don't like it, throw it into the garbage :)

regards,
altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 24, 2002, 06:06:36 PM
Hi altosax,

quote:

so i propose:

"<a[^>]++shref(=|*ssrc=)$AV((http|ftp)(s|)://(^h)*)"


Yes, that's definitely better.
quote:

i know that this new code is not elegant, so if you don't like it, throw it into the garbage :)


But it's faster than
"<a[^>]++shref(|*ssrc)=$AV((http|ftp)(s|)://(^h)*)"


I'll check if that's possible with the other such lines as well.

regards, sidki

 
Title: Banner Killers: Another Two
Post by: altosax on August 24, 2002, 06:16:26 PM
sidki wrote:

quote:

But it's faster than
"<a[^>]++shref(|*ssrc)=$AV((http|ftp)(s|)://(^h)*)"



this doesn't work because the first expression is always true, e.g. it is always true that href_nothing_= matches, so the *ssrc will never be evaluated!

altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 24, 2002, 06:27:29 PM
It *does* work. Try this in the test window:

<a href="http://shonen.knife.com/"><img src="http://shonen.knife.com/foo.jpg" width="469" height="60"></a>

<a href="http://shonen.knife.com/"><img src="http://foo.com/foo.jpg" width="469" height="60"></a>

<a href="http://foo.com/"><img src="http://shonen.knife.com/foo.jpg" width="469" height="60"></a>

<a href="http://foo.com/"><img src="http://foo.com/foo.jpg" width="469" height="60"></a>

But as i said, your version is faster anyway.

THX, sidki


 
Title: Banner Killers: Another Two
Post by: altosax on August 24, 2002, 07:42:35 PM
quote:

It *does* work.



i'm not sure about that. the filter matches the image size of the banner, but what you have to test is whether it matches the href part or the src part.
this means that it could also match the href part (because the link refers to a different host) and the image size.
you should test with a link on the same host and an image on a different host: this will demonstrate that the second part of the or expression is evaluated.

if you just prefer to match the external links and not the source of the image, you can also remove the *ssrc check.

now this new question:
why don't you try making the filter fail in bounds, moving the [^('] check there?
i can't suggest anything this time because it is difficult for me to understand in depth how the 3 variable works.

altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 24, 2002, 07:57:10 PM
quote:

you should test with a link on the same host and an image on a different host: this will demonstrate that the second part of the or expression is evaluated.


"shonen.knife.com" *is* the same host.

The URL of the test window is:
http://www.Shonen.Knife.com/Naoko/Michie/Atsuko/kappa.ex.cgi?jackalope
quote:

why don't you try making the filter fail in bounds, moving the [^('] check there?
i can't suggest anything this time because it is difficult for me to understand in depth how the 3 variable works.


I don't get you here.
([^('][^<>]++)3 captures the text of the link and displays it in the flyover of the 2nd replacement link.
The expression within is a long story of bug hunting.

regards, sidki



Edited by - sidki3003 on 24 Aug 2002  21:13:47
Title: Banner Killers: Another Two
Post by: altosax on August 24, 2002, 08:49:43 PM
sometimes, when looking too far ahead, we risk forgetting the simplest things.
i've just realized that all 3 of these expressions are wrong:

(<a[^>]++shref|*ssrc)=$AV((http|ftp)(s|)://(^h)*)
<a[^>]++shref(=|*ssrc=)$AV((http|ftp)(s|)://(^h)*)
<a[^>]++shref(|*ssrc)=$AV((http|ftp)(s|)://(^h)*)

the reason is that the second expression of an OR function will be evaluated only when the first fails. in all 3 expressions, instead, the first expression is always true.

the right one is:

<a[^>]++shref=($AV((http|ftp)(s|)://(^h)*)|*ssrc=$AV((http|ftp)(s|)://(^h)*))

i apologize for the things i wrote in the past messages. sorry to all readers.

altosax.

 
Title: Banner Killers: Another Two
Post by: altosax on August 24, 2002, 08:52:49 PM
quote:

"shonen.knife.com" *is* the same host.



it seems that shonen.knife.com is different from www.shonen.knife.com

altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 24, 2002, 09:12:58 PM
The only thing i can say to the above two posts is:
Please use the test window to verify such things.

I'm out of words now ...


 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 25, 2002, 02:50:40 AM
Update:
Additions to the AdDims list
Changes to the tags being scanned
A bug fix
Some optimizations
Some other things

If there aren't any serious bugs, this will be the last update for a while.

Thanks for contributing, as always.
Suggestions welcome, as always.


Both message and zip file contain the current versions.
Changes in the first post.


Have fun, sidki




Edited by - sidki3003 on 25 Aug 2002  13:20:08
Title: Banner Killers: Another Two
Post by: altosax on August 25, 2002, 09:40:42 AM
hi sidki,
i've made a lot of tests with these expressions:

(<a[^>]++shref|*ssrc)=$AV((http|ftp)(s|)://(^h)*)
<a[^>]++shref(=|*ssrc=)$AV((http|ftp)(s|)://(^h)*)
<a[^>]++shref(|*ssrc)=$AV((http|ftp)(s|)://(^h)*)
<a[^>]++shref=($AV((http|ftp)(s|)://(^h)*)|*ssrc=$AV((http|ftp)(s|)://(^h)*))

they all work the same way, but they shouldn't.
the first 3 expressions are wrongly coded because the first part of the or function is never false, so the second part should not be evaluated. yet it is evaluated, and i don't know why. probably this is the way proxomitron treats the code.
btw, everyone here should use well-written code anyway, so i suggest using:

<a[^>]++shref=($AV((http|ftp)(s|)://(^h)*)|*ssrc=$AV((http|ftp)(s|)://(^h)*))

in the attachment i've posted the results of my tests.

http://uploaded/altosax/200282510406_tests.zip

regards,
altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 25, 2002, 01:17:24 PM
quote:

probably this is the way proxomitron treats the code.


That's the point here.

If you have a filter like ...
Match = "(a|c)d"
... the string "abcde" will be changed to "abe".
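
The same behaviour can be checked with Python's `re` module, used here only as an analogy and assuming the filter's replacement text is empty. At position 0 the first alternative `a` matches, but no `d` follows, so that attempt fails as a whole; scanning continues, and at `c` the second alternative matches and is followed by `d`.

```python
import re

# "(a|c)d" with an empty replacement, applied to the sample string.
result = re.sub(r"(a|c)d", "", "abcde")

print(result)  # abe
```

An alternative that matches does not end the search: if the rest of the expression fails after it, the remaining alternatives and later positions are still tried.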


 
Title: Banner Killers: Another Two
Post by: altosax on August 25, 2002, 02:10:53 PM
simple and clear,

thanks,
altosax.

 
Title: Banner Killers: Another Two
Post by: altosax on August 25, 2002, 02:13:45 PM
in the addims list a | is missing in this line:

(*width=([#120]|[#173]|[#230:240]|[#400:500])6 & *height=([#59][#60])7)$SET(9=a.common.2 6x7)

altosax.

 
Title: Banner Killers: Another Two
Post by: sidki3003 on August 25, 2002, 02:26:30 PM
Changed (first post).

Thanks, sidki