Author Topic: Please help me to understand.....  (Read 3459 times)

UPieper

  • Newbie
  • *
  • Posts: 6
« on: July 27, 2002, 08:31:25 AM »
....the following


I have this active filter in my default.cfg:

Name = "BASE16 to ASCII"
Active = TRUE
Multi = TRUE
URL = "$TYPE(htm)&(^$LST(NoCon))"
Bounds = "<a*>"
Limit = 512
Match = "*\1"
Replace = "$UESC(\1)"

I never really understood what this filter does. Is it really necessary to have this filter active on every page?

Thank you very much for your help

UPieper


 
 

altosax

  • Sr. Member
  • ****
  • Posts: 328
« Reply #1 on: July 27, 2002, 09:12:57 AM »
welcome to this forum, upieper.

you really need this filter, because it serves other filters. it converts urls written in base16 to ascii characters. in other words, when a url is obscured because it is written as %68%74%74%70%3A%2F%2F%77%77%77... and so on, this filter converts it to http://www... and so on. this way, the subsequent filters can match the code. this also means that you need to place this filter BEFORE your banner killer.
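For readers who want to see the conversion itself, here is a minimal Python sketch of the same idea. It is not Proxomitron code: `unquote` from the standard library merely stands in for what `$UESC` is described as doing, and the escaped string is a made-up example.

```python
from urllib.parse import unquote

# Hypothetical obfuscated link target: every character is written as %hh.
obscured = "%68%74%74%70%3A%2F%2F%77%77%77"

# unquote() replaces each %hh escape with the character it encodes.
decoded = unquote(obscured)

# A later filter looking for "http://" can only see it after decoding.
found_before = "http://" in obscured
found_after = "http://" in decoded

print(decoded, found_before, found_after)
```

This is also why the order matters: a pattern that looks for plain text in the link only has something to match once the escapes are gone.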

i've found the filter you are using now in zxlist and tweaked it as follows:

Name = "BASE16 to ASCII v.2.1"
Active = TRUE
Multi = TRUE
URL = "$TYPE(htm)(^$LST(NoCon))"
Bounds = "$NEST(<a,</a>)"
Limit = 512
Match = "(*%??*)\1"
Replace = "$UESC(\1)"

you can replace yours with the updated version.

regards,
altosax.

 
 

UPieper

  • Newbie
  • *
  • Posts: 6
« Reply #2 on: July 27, 2002, 12:08:37 PM »
Hi,

thanks for your help.

I've replaced my old filter with yours and there's one thing I noticed: your filter is activated much less often than my old one. E.g. on a Google results page the old filter appears in the log window approx. 170 times, and your filter kicks in only 27 times. Is that OK?

Quote removed by Arne. No need to quote the entire post above.

 

altosax

  • Sr. Member
  • ****
  • Posts: 328
« Reply #3 on: July 27, 2002, 02:35:24 PM »
this is not true at all.
both filters match exactly the same number of times (170 active links), but your first one replaces every occurrence of the <a> tag, while mine replaces only the tags that need to be converted, not those already in ascii chars. look at the matching expression to understand why ;)
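The difference in the log counts can be illustrated with a rough Python sketch. The regular expressions below only approximate the Proxomitron patterns (they are assumptions, not the real matching engine), and the HTML snippet is invented for the example.

```python
import re
from urllib.parse import unquote

# Invented sample page: three links, only one of which uses %hh escapes.
html = (
    '<a href="http://example.com/plain">plain</a>'
    '<a href="http://example.com/%61%64%73">escaped</a>'
    '<a href="http://example.com/other">other</a>'
)

# Rough stand-in for Bounds = "<a*>": every opening anchor tag is a candidate.
tags = re.findall(r"<a\b[^>]*>", html)

# Rough stand-in for a "*%??*" match: keep only tags that contain
# a "%" followed by two more characters.
needs_decoding = [t for t in tags if re.search(r"%..", t)]

# Only those tags are rewritten; the others are left untouched.
rewritten = [unquote(t) for t in needs_decoding]

print(len(tags), len(needs_decoding), rewritten)
```

Every link is still examined, but only the ones that actually contain an escape are rewritten, so fewer replacements show up in the log.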

regards,
altosax.

 
 

UPieper

  • Newbie
  • *
  • Posts: 6
« Reply #4 on: July 27, 2002, 03:16:23 PM »
So, it's clear that your modified filter is much more effective than the old one because of this:

Match = "(*%??*)\1"

What's the benefit of using

Bounds = "$NEST(<a,</a>)" instead of Bounds = "<a*>" ?

Thanks so much for your "lessons"!  ;-)





 
 

altosax

  • Sr. Member
  • ****
  • Posts: 328
« Reply #5 on: July 27, 2002, 05:01:54 PM »
the benefit is that this also converts the text of the link, in case that was obscured too.

 
 

TEggHead

  • Jr. Member
  • **
  • Posts: 93
« Reply #6 on: July 29, 2002, 09:52:42 AM »
With regard to the necessity of this filter: it is only really needed if you are concerned about href targets being obfuscated. Be aware that if only part of a link uses the %nn notation, immediately following a ? or #, then the processing script may need it that way, and changing it may break functionality (rare but possible).
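A small Python sketch of that rare breakage, using the standard library's query-string parser as a stand-in for a server-side script. The query string is invented for the example.

```python
from urllib.parse import parse_qs, unquote

# Invented query string: %26 is an escaped "&" that belongs to the value of q.
query = "q=fish%26chips&lang=en"

# What the receiving script sees if the link is left alone.
before = parse_qs(query, keep_blank_values=True)

# What it sees if the whole link was unescaped first: the "&" now
# splits the value and creates a stray parameter.
after = parse_qs(unquote(query), keep_blank_values=True)

print(before)
print(after)
```

The escape after the ? was carrying meaning, so decoding it changes which parameters the script receives.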

Also with regards to the bounds

<A*> will only convert the first part of the link, not the linktext (which is even more rare to see in base16)

$NEST(<A,</A>), as AltoSax said, also takes the linktext into account. What he forgot to say is that <A*</A> would have done the same; using $NEST just makes the filter quite a bit faster (it is an optimized Prox function for finding the closing tag).
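To make the effect of the two kinds of bounds concrete, here is a hedged Python sketch. The regular expressions are only approximations of `<a*>` and `<a*</a>` (or `$NEST(<a,</a>)`), and the link is a made-up example.

```python
import re
from urllib.parse import unquote

# Made-up link whose href and link text are both partly escaped.
link = '<a href="http://example.com/%61%64">%63%6C%69%63%6B</a>'

# Approximation of Bounds = "<a*>": the opening tag only.
tag_only = re.compile(r"<a\b[^>]*>", re.IGNORECASE)

# Approximation of "<a*</a>" or $NEST(<a,</a>): the tag, its content, and the close.
whole_link = re.compile(r"<a\b.*?</a>", re.IGNORECASE | re.DOTALL)

def decode_within(pattern, html):
    """Unescape %hh sequences only inside the spans that pattern matches."""
    return pattern.sub(lambda m: unquote(m.group(0)), html)

tag_result = decode_within(tag_only, link)
whole_result = decode_within(whole_link, link)

print(tag_result)
print(whole_result)
```

With the narrow bounds the link text stays escaped; with the wide bounds everything up to the closing tag is converted. The speed difference mentioned above is specific to Proxomitron and is not something this sketch measures.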

You may also want to check whether a blocklist is actually defined behind $LST(NoCon), as this is the file to which you add links that you don't want converted (the negator ^ makes this an exclusion list). Either view the lists section of default.cfg or use config -> blockfile to see which text file is associated with that list name.

HTH
JarC



Edited by - TEggHead on 29 Jul 2002  11:01:59
 

altosax

  • Sr. Member
  • ****
  • Posts: 328
« Reply #7 on: August 03, 2002, 04:12:01 PM »
TEggHead wrote:

quote:

<A*> will only convert the first part of the link, not the linktext (which is even more rare to see in base16)



you are right that the linktext is rarely seen in base16, but i've found many examples.
anyway, don't forget that there is often an <img> tag between the <a> and </a> tags: if the href attribute in the <a> tag is obfuscated, the src attribute in the <img> tag will surely be obfuscated too, so i prefer $NEST(<A,</A>), which lets ad images be matched even when their source is in base16.

altosax.

 
 

TEggHead

  • Jr. Member
  • **
  • Posts: 93
« Reply #8 on: August 04, 2002, 11:50:26 AM »
quote:

don't forget anyway that in the <a>*</a> tags often there is an <img> tag:


You are absolutely correct, when I wrote this I was strictly thinking of text-only links.