Author Topic: Kill Comment Block Ads: Another One  (Read 10251 times)

altosax

  • Sr. Member
  • ****
  • Posts: 328
Kill Comment Block Ads: Another One
« Reply #15 on: July 28, 2002, 09:30:03 AM »
hi morpheus,
i've visited that page and found that the comments you provided:

begin PROMOS & ADS * end PROMOS & ADS *-->

and

begin NESTED COUNT * end NESTED COUNT *-->

which i found there, are site specific ones. also, adding the first one to the list doesn't make the filter match, because the byte limit is too small for that comment. i don't want to increase the byte limit just to match a site specific comment, since a larger limit risks false matches on other sites (i don't know sidki's opinion about this; i'm sure he'll let me know once he reads this).
if you often visit that site, add the comments to your own list and increase the byte limit.
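To make the byte limit point concrete, here is a rough Python sketch. It is not Proxomitron's matcher: the regex is only an approximate translation of a paired-comment list entry, and `match_within_limit`, `PAIRED` and `ad_block` are hypothetical names made up for the illustration. It assumes the limit simply caps how far ahead of the opening `<!--` the filter may look.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Approximate regex stand-in for a "keyword * keyword * -->" list entry.
PAIRED = re.compile(r"<!--[^>]*?PROMOS & ADS.*?PROMOS & ADS.*?-->", FLAGS)


def match_within_limit(pattern, page, limit):
    """Return the first match that fits inside `limit` characters, else None."""
    start = page.find("<!--")
    while start != -1:
        # The matcher may only look `limit` characters past the start.
        end = min(len(page), start + limit)
        found = pattern.match(page, start, end)
        if found:
            return found
        start = page.find("<!--", start + 1)
    return None


# Made-up ad block that is several hundred characters long.
ad_block = ("<!-- begin PROMOS & ADS -->"
            + "<img src='ad.gif'>" * 40
            + "<!-- end PROMOS & ADS -->")

small = match_within_limit(PAIRED, ad_block, 256)    # limit too small
large = match_within_limit(PAIRED, ad_block, 1024)   # limit large enough
print(len(ad_block), small is None, large is not None)
```

With the small limit the closing keyword is never seen, so nothing matches; raising the limit fixes that, at the cost of letting every entry in the list look further ahead on every page.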
the cpu work on a web page depends on the whole filter set. to benchmark a single filter, load the page with proxomitron bypassed, copy its source and paste it into the proxomitron test window. then compare the filters. to time positive matches, add the comments to both lists; to time false matches, remove the comments from both lists. in the test window, click on "profile" to view the benchmarks.
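The same positive-match versus false-match comparison can be sketched outside Proxomitron. The Python below is only an analogy for the test window's "profile" button: the regex, the two sample pages and the `profile` helper are all assumptions, and the timings say nothing about Proxomitron's own engine.

```python
import re
import timeit

FLAGS = re.IGNORECASE | re.DOTALL

# Hypothetical regex stand-in for one paired-comment list entry.
PAIRED = re.compile(r"<!--[^>]*?PROMOS & ADS.*?PROMOS & ADS.*?-->", FLAGS)

filler = "<p>some page text</p>\n" * 200

# Page where the entry matches (positive match).
hit_page = (filler + "<!-- begin PROMOS & ADS -->"
            + "<img src='ad.gif'>" * 50
            + "<!-- end PROMOS & ADS -->" + filler)

# Page whose comments the entry does not match (false match).
miss_page = (filler + "<!-- begin NAVIGATION -->"
             + "<img src='nav.gif'>" * 50
             + "<!-- end NAVIGATION -->" + filler)


def profile(pattern, page, runs=200):
    """Average seconds per search over `runs` repetitions."""
    return timeit.timeit(lambda: pattern.search(page), number=runs) / runs


hit_time = profile(PAIRED, hit_page)
miss_time = profile(PAIRED, miss_page)
print(f"positive match: {hit_time:.6f} s, false match: {miss_time:.6f} s")
```

Timing both cases matters because a filter spends most of its life failing on pages that contain no ad comment at all.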

thank you for your contribution,
i'm looking forward to your next one,
altosax.

Edited by - altosax on 28 Jul 2002  10:34:54
 

altosax

  • Sr. Member
  • ****
  • Posts: 328
Kill Comment Block Ads: Another One
« Reply #16 on: July 28, 2002, 11:26:08 AM »
sidki wrote:

quote:

You could get some problems with my part of the list, since you have no "(?)++{0,90}" part.
Just keep an eye on situations like:

<!-- Start dummy.com -->
<IFRAME SRC="http://www.something.com/ads/B2.pl?iframe" MARGINWIDTH=0 MARGINHEIGHT="0" HSPACE="0" VSPACE="0" FRAMEBORDER="0" SCROLLING="NO" WIDTH="468" HEIGHT="60">
<a href="http://dummy.com/ads/B2.pl?banner=NonSSI;page=01"
...
large chunk of code
...
<!-- End dummy.com -->

and this in the list:
[^>]++{0,30}dummy.com*(dummy.com)\1 *-->



i think this is not a problem. even if the filter matches the first and the second occurrence of dummy.com instead of the first and the third, it will always match the whole comment block because of the subsequent * and the closing -->.
in this case the remaining code will be matched by * until the filter reaches the closing -->. the opposite could be true instead: the filter with (?)++{0,90}--> could fail because the closing --> is not within the next 90 chars, while *--> will match until --> is found.

so i'm thinking that the (?)++{0,90}--> part can be safely replaced by *-->.
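This argument can be tried out with a small Python sketch. The two regexes are only approximate translations of the list entry, one ending in an unbounded tail and one in a tail capped at 90 characters; Python's `re` is not Proxomitron's matcher, and the sample block with its long trailing note is made up.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Tail written as "* -->": anything up to the next closing "-->".
unbounded = re.compile(r"<!--[^>]*?dummy\.com.*?dummy\.com.*?-->", FLAGS)

# Tail capped at 90 characters, standing in for "(?)++{0,90}-->".
bounded = re.compile(r"<!--[^>]*?dummy\.com.*?dummy\.com.{0,90}?-->", FLAGS)

# Made-up block: the closing comment carries more than 90 characters
# of extra text between the keyword and the final "-->".
block = (
    "<!-- Start dummy.com -->"
    '<a href="http://dummy.com/ads/B2.pl?banner=NonSSI;page=01">ad</a>'
    "<!-- End dummy.com " + "trailing note " * 8 + "-->"
)

unbounded_hit = unbounded.search(block)
bounded_hit = bounded.search(block)

# The unbounded tail pairs the first and second "dummy.com" (the one in
# the href) and still swallows the whole block; the capped tail finds
# no "-->" within 90 characters of any second keyword and gives up.
print(unbounded_hit.group(0) == block, bounded_hit is None)
```

The sketch only covers the case where nothing but the real closing --> follows the second keyword.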

btw, these are only morning thoughts :)
they are not to be taken seriously,
altosax.

 
 

sidki3003

  • Sr. Member
  • ****
  • Posts: 476
Kill Comment Block Ads: Another One
« Reply #17 on: July 28, 2002, 11:30:29 AM »
MD, let's quote you for the third time :
quote:

This is a cool filter - it missed some like

begin PROMOS & ADS and
begin txtad


I share altosax's opinion about this, but if you want to customize the filter:
Increase the byte limit to 17000.
Append this to AdCommentPairs.txt:
PROMOS & ADS*(PROMOS & ADS)\1(?)++{0,90}-->
"txtad" is part of the above large comment block.
quote:

Also, cpu use is high - 58% as opposed to 12% for the standard paired ad catcher.. must be all those wild cards..


Yes. Advantage of "Kill comment-block ads old":
Less cpu usage, faster.

Disadvantage:

No soup for comments like:
Counter * /Counter
AD BOX BEGINS * AD BOX ENDS

Risk of unpaired matches:
Begin PayCounter * End Burst Main Media Code
This happens because the closing comment is missing sometimes.

sidki


Edited by - sidki3003 on 28 Jul 2002  12:43:29
 

sidki3003

  • Sr. Member
  • ****
  • Posts: 476
Kill Comment Block Ads: Another One
« Reply #18 on: July 28, 2002, 11:42:10 AM »
Good morning altosax ,
quote:

...the filter will always match the whole comment due to the subsequent * and the closing -->.



That's correct, as long as there is no comment in between.
Here is an example (counter * /counter is in my list):
http://mikhed.narod.ru/en/programs/index.htm

BTW, i updated the list (first post) to include your keywords.

sidki


Edited by - sidki3003 on 28 Jul 2002  12:51:56
 

altosax

  • Sr. Member
  • ****
  • Posts: 328
Kill Comment Block Ads: Another One
« Reply #19 on: July 28, 2002, 12:01:49 PM »
sidki wrote:

quote:

That's correct, as long as there is no comment in between.



but if there is a comment in between, both wildcards *--> and (?)++{0,90}--> will match the one they find first!

altosax.

 
 

sidki3003

  • Sr. Member
  • ****
  • Posts: 476
Kill Comment Block Ads: Another One
« Reply #20 on: July 28, 2002, 12:09:21 PM »
I'd suggest an experiment. In the list, replace
COUNTER*/(COUNTER)\1(?)++{0,90}-->
by
COUNTER*/(COUNTER)\1*-->
Then go to the above URL and you'll see what i mean.
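For readers who cannot load that page, here is a hedged Python sketch of one way the two endings can behave differently. The page fragment is invented, not copied from the URL above: it assumes a counter script whose URL happens to contain "/counter" and which is wrapped in its own inner comment. The regexes are approximate translations of the two list entries, and Python's `re` only stands in for Proxomitron's matcher.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Stand-in for COUNTER*/COUNTER*--> (unbounded tail).
loose = re.compile(r"<!--[^>]*?COUNTER.*?/COUNTER.*?-->", FLAGS)

# Stand-in for COUNTER*/COUNTER(?)++{0,90}--> (tail capped at 90 chars).
bounded = re.compile(r"<!--[^>]*?COUNTER.*?/COUNTER.{0,90}?-->", FLAGS)

# Hypothetical page fragment with an inner script comment.
page = (
    "<p>before</p>"
    "<!-- COUNTER -->"
    "<script><!--\n"
    'src = "http://stats.example/counter?id=1";\n'
    + "// padding line\n" * 10
    + "// --></script>"
    "<noscript>text link</noscript>"
    "<!-- /COUNTER -->"
    "<p>after</p>"
)

loose_hit = loose.search(page).group(0)
bounded_hit = bounded.search(page).group(0)

# The loose entry pairs COUNTER with "/counter" in the URL and stops at
# the inner "// -->", leaving half of the block on the page.
loose_rest = loose.sub("", page)

# The capped entry rejects that pairing (no "-->" within 90 chars) and
# moves on to the real closing comment.
bounded_rest = bounded.sub("", page)

print(loose_rest)
print(bounded_rest)
```

In this made-up case the capped tail acts as a check that the closing keyword really sits inside a comment that ends shortly after it.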

sidki

 
 

altosax

  • Sr. Member
  • ****
  • Posts: 328
Kill Comment Block Ads: Another One
« Reply #21 on: July 28, 2002, 12:18:44 PM »
sidki wrote:

quote:

Here is an example (counter * /counter is in my list):
http://mikhed.narod.ru/en/programs/index.htm



this is safer and solves the problem:

Rating@Mail.ru COUNTER * <!-- / (COUNTER)\1 -->

regards.

 
 

MorpheusDreamlord

  • Jr. Member
  • **
  • Posts: 74
Kill Comment Block Ads: Another One
« Reply #22 on: July 28, 2002, 06:35:25 PM »
Hi sidki & altosax,

<quote>
Risk of unpaired matches:
Begin PayCounter * End Burst Main Media Code
This happens because the closing comment is missing sometimes.
</quote>

Yes, I've noticed this - I've had a few sites (simtel net australia, for example)
that I had to put into a bypass-comment-killer list - otherwise the entire download listing would disappear!!

Umm, but surely a missing closing comment would also mean your new filter would miss the ad? (I guess missing it is better than killing it all, LOL!!)

I liked altosax's idea of using BOTH filters - from his other post of 28-07-2002:
use the <--- start|begin|..... one to get the quick, easy ones, which are then skipped by the heavy duty killer (this current one). Missed ads are killed by the HD killer filter (hey, my names here, not really a heavy duty..).

Guys, this entire subject is really interesting, thanks for the new ideas!!

(side track here - is the second filter in this new ad killer (the filter that records slip-through ads pairs) supposed to record the full comment name?

On mine it just records the comment name up to the first space -
"start reget blaster ad" is recorded as "... ... reget .." (ad is made up, and the ... represents stuff the filter adds itself in the filter code))

|
Come to the Dreaming...

altosax

  • Sr. Member
  • ****
  • Posts: 328
Kill Comment Block Ads: Another One
« Reply #23 on: July 28, 2002, 08:01:16 PM »
md wrote:

quote:

(side track here - is the second filter in this new ad killer (the filter that records slip-through ads pairs) supposed to record the full comment name?



hi md.
i've modified the entries of sidki's list, placing in the # NO LEADING WILDCARD section the comment as is, not only the keyword. my idea is to remove the leading wildcards altogether because they slow down the filtering (without them the filter can fail sooner on false matches). you can use my modified version of the sidki filter (the one marked as [sidki]) or the original sidki one, and use either the sidki list (removing the duplicate entries i already have in the list for the filter marked as [vm]) or my modified version of it. they are all compatible, which is an advantage of these filters.

<edit>: i misunderstood what you meant. yes, the log filter written by sidki logs the whole comment (its limit is 12000 bytes) but only if the keywords in the match field match the code in it. my log filter instead is:

Multi = TRUE
Limit = 512
Match = "<!-- 1 -->"
Replace = "<!-- 1 -->&$ADDLST(your_list,your_comment   1    u)"

and is placed BEFORE the filters. this way i can find the exact comment text to replace the entry listed as keywords.</edit>
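The idea of logging every short comment before the ad filters run can be sketched in Python as follows. This is an illustration only, not a translation of the filter above: `log_comments` is a hypothetical helper, the 512 limit is treated as a cap on the length of the whole comment, and the sample page reuses the made-up "reget blaster" comment from this thread.

```python
import re

# Any HTML comment; group 1 is the text between "<!--" and "-->".
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def log_comments(page, url, limit=512):
    """Collect 'comment text<TAB>url' lines for comments up to `limit` chars."""
    entries = []
    for found in COMMENT.finditer(page):
        # Skip comments longer than the limit, as a short byte limit would.
        if len(found.group(0)) <= limit:
            entries.append(f"{found.group(1).strip()}\t{url}")
    return entries


sample = ("<!-- start reget blaster ad --><img src='ad.gif'>"
          "<!-- end reget blaster ad -->"
          "<!--" + "x" * 600 + "-->")

log = log_comments(sample, "http://example.com/")
for line in log:
    print(line)
```

Because the whole comment text is recorded rather than a keyword, the log shows exactly what would have to go into the list.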

regards,
altosax.

Edited by - altosax on 28 Jul 2002  21:18:12
 

sidki3003

  • Sr. Member
  • ****
  • Posts: 476
Kill Comment Block Ads: Another One
« Reply #24 on: July 28, 2002, 08:05:02 PM »
quote:
Umm, but surely having a close comment missing would also mean your new filter would miss it?

Yep.
quote:
is the second filter in this new ad killer (the filter that records slip-through ads pairs) supposed to record the full comment name?

Nope. Just the expression it is looking for.

THX for the feedback :)