Topics - xartica

1
Privacy / akamai... and web beacons
« on: May 22, 2002, 07:44:29 AM »
As far as privacy goes, I normally "shrug off" any worries about "web bugs".
Today, however, I got to thinking: even if the buggers are "harmless enough" they are STEALING tcp connections -- thereby delaying the loading of other page elements, eh?

I've seen configs where Proxxers are blocking akamai altogether, but I don't see that as a feasible option. They carry (cache) a lot of valid content for many sites I visit, especially imagefiles.

By the time I cobbled this filter to address the immediate problem
(akamai... slash partnername and/or ad-client account number ...clear.gif)
the thought process brought a whole bunch of questions and thoughts I'd like to discuss.

Name = "strip Akamai web beacons"
Active = FALSE
Multi = TRUE
Bounds = "<i(m(g|age)|nput)\s*>"
Limit = 512
Match = "\1 src=$AV( \2 akamai \3)&
      *(
        height=$AV([#1-15])
        |width=$AV([#1-15])
        |clear.gif
        |style=$AV(*(hidden|none))
       )*
       $SET(9=CAVEAT: hidden property may be assigned via external CSS)"
Replace = "
 <!-- SUSPECTED akamai web beacon nixed -->
"
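For anyone who wants to reason about the filter outside of Proxomitron, here is a rough Python approximation of the same test. It is only a sketch: the names (`looks_like_akamai_beacon`, `strip_beacons`) are made up for illustration, the attribute parsing is deliberately simple, and the style check is looser than the filter's (it accepts "hidden" or "none" anywhere in the style value).

```python
import re

# Same tags the Bounds line covers: <img>, <image>, <input>
TAG_RE = re.compile(r"<i(?:mg|mage|nput)\s[^>]*>", re.IGNORECASE)
# name=value pairs; the value may be double-quoted, single-quoted, or bare
ATTR_RE = re.compile(r"""([\w-]+)\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""")

REPLACEMENT = "<!-- SUSPECTED akamai web beacon nixed -->"


def attrs(tag):
    """Return the tag's attributes as a dict with lowercased names."""
    return {name.lower(): value.strip("\"'") for name, value in ATTR_RE.findall(tag)}


def is_tiny(value):
    """True for a plain numeric dimension between 1 and 15."""
    return value.isdigit() and 1 <= int(value) <= 15


def looks_like_akamai_beacon(tag):
    if len(tag) > 512:  # mirrors Limit = 512
        return False
    a = attrs(tag)
    if "akamai" not in a.get("src", "").lower():
        return False
    style = a.get("style", "").lower()
    return (
        is_tiny(a.get("height", ""))
        or is_tiny(a.get("width", ""))
        or "clear.gif" in tag.lower()
        or "hidden" in style
        or "none" in style
    )


def strip_beacons(html):
    """Replace suspected beacons with the comment, leave other tags alone."""
    def swap(match):
        tag = match.group(0)
        return REPLACEMENT if looks_like_akamai_beacon(tag) else tag
    return TAG_RE.sub(swap, html)
```

Like the filter itself, this cannot see a hidden property that is assigned via external CSS.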

Question:
What about OBJECT, APPLET and EMBED tags?

Question:
What about DHTML runtime replacements?
(ala document.links[1].whatever.innerText = "URL of element not prox-parsed when the page loaded")

Question:
What about URL strings assigned via variables, declared within external script files? One example I've seen is:
{snip}
imgsRoot = 'http://a1208.g.akamai.net/g/7/1208/380/1d/sportsillustrated.cnn.com';
{snip}
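
To get a feel for how often this pattern shows up, a script file can be scanned for string assignments that point at akamai hosts. The snippet below is only a sketch of that idea in Python, not a filter; the function name and the regex are my own assumptions, and it only catches simple `name = 'url'` assignments like the one quoted above.

```python
import re

# identifier = 'http(s)://...akamai...'  (single or double quotes)
ASSIGN_RE = re.compile(
    r"""([A-Za-z_$][\w$]*)\s*=\s*(['"])(https?://[^'"]*akamai[^'"]*)\2""",
    re.IGNORECASE,
)


def find_akamai_assignments(script_source):
    """Return (variable name, URL) pairs for akamai URLs assigned to variables."""
    return [(m.group(1), m.group(3)) for m in ASSIGN_RE.finditer(script_source)]


sample = "imgsRoot = 'http://a1208.g.akamai.net/g/7/1208/380/1d/sportsillustrated.cnn.com';"
for name, url in find_akamai_assignments(sample):
    print(name, "->", url)
```

URLs that are assembled from several pieces at runtime would slip past a scan like this, which is really the heart of the question.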



 

2
Privacy / web beacon found within stylesheet linkTag
« on: May 22, 2002, 07:17:14 AM »
--- (in the Yahoo!Prox-list forum) Michael Bürschgens wrote:
> I've found the following line in a webpage:
>
> <link rel=stylesheet type="text/css"
> href="http://www.house27.ch/counter/trans.php?ID=9322">
>
> Since I've never seen this before I think it is a new idea to slip
> through filters.

--- my reply:

Yep, it's definitely a web beacon ~~ calling that URL returned
a zero-length text/html content-typed document.

Here's the counteracting webfilter I propose:

Name = "strip web beacons posing as stylesheets"
Active = TRUE
Bounds = "<link\s*>"
Limit = 512
Match = "*rel=$AV(stylesheet)*&"
        "*href=$AV(*([?=]|.pl|.php|.cgi)*)"
        "|(^*href=$AV(*(.css|.txt)*))"

Here's my rationale:

~~ 512-byte limit because the LINK tag may be padded with several
attributes

~~ path to a valid CSS should never have a questionMark or equalSign
(I've seen valid stylesheets returned with commas in the path, FWIW)

~~ the file extension patterns might seem "obvious" but if they're
not explicitly stated, "href=pathname/MuckUp.css.cgi" could slip by

~~ Although dot-css is the convention, I continually encounter a lot
of dot-txt -named stylesheets
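
To make the rationale concrete, here is the same decision logic sketched in Python. It is an approximation for discussion, not the filter itself: the function name and the simple attribute parsing are my own assumptions. A stylesheet LINK is flagged if its href contains a question mark, an equal sign, or a common script extension, or if it contains neither ".css" nor ".txt".

```python
import re

# name=value pairs; the value may be double-quoted, single-quoted, or bare
ATTR_RE = re.compile(r"""([\w-]+)\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""")

SCRIPT_MARKERS = ("?", "=", ".pl", ".php", ".cgi")
STYLE_EXTENSIONS = (".css", ".txt")


def attrs(tag):
    """Return the tag's attributes as a dict with lowercased names."""
    return {name.lower(): value.strip("\"'") for name, value in ATTR_RE.findall(tag)}


def is_suspect_stylesheet_link(tag):
    if len(tag) > 512:  # mirrors Limit = 512
        return False
    a = attrs(tag)
    if a.get("rel", "").lower() != "stylesheet":
        return False
    href = a.get("href", "").lower()
    # First alternative: the href looks like a query or a script
    if any(marker in href for marker in SCRIPT_MARKERS):
        return True
    # Second alternative: the href names neither a .css nor a .txt file
    return not any(ext in href for ext in STYLE_EXTENSIONS)


for tag in (
    '<link rel=stylesheet type="text/css" href="http://www.house27.ch/counter/trans.php?ID=9322">',
    '<link rel="stylesheet" href="pathname/MuckUp.css.cgi">',
    '<link rel="stylesheet" href="/styles/main.css">',
):
    print(is_suspect_stylesheet_link(tag), tag)
```

Note how "MuckUp.css.cgi" is caught by the explicit ".cgi" marker even though it also contains ".css".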


Discussion invited:
Should the filter also include (look for) .asp and other executables?
I think accounting for the common script extensions is enough ~~
because, eventually... some dastardly weenie will just
serve all his stylesheets from a www2.domain.com webserver which has been
configured so that ".css" files are associated with (handled by) perl
and are executable. The script will transparently count ya & will
return the (a) valid stylesheet.

-xartica


 

3
Arne, are you still using/recommending this filter?
I found it (active) in my config along with a comment saying that I
added it in Feb 2002... but I can't remember EVER seeing it match.

In = FALSE
Out = TRUE
Key = "URL-Killer: Multi Ads blaster -Arne (Out)"
URL = "$LST(AdDims)"
Replace = "Ads killed ARNEk"


the external (AdDims) blocklist for the filter contains:
=================================================
=================================================

#  banners (468x60, 470x60 (RB1 Network)...)

*(
      (*width=[#468-470] & *height=[#60])
      |
      (*width=([#60]|[#173]|[#230-240]) & *height=[#60])
      ) *>$SET(9=banner)

#  buttons and counters (88x31)
*(
      *width=[#81] & *height=[#63]
      ) *>$SET(9=counter)

*(
      *width=[#88-89] & *height=([#30-31]|[#60-62])
      ) *>$SET(9=button)

#  Part 2------------------------------------------------------
#  These sizes are not used too often for banners
#  The images that have these sizes can be safely removed if they
#  are not related to the site itself
#  ------------------------------------------------------------

# banners (468x*, 470x*...)
*http://*(
  *width=[#468]
  |(*width=([#470]|[#480]) & *height=[#40-120])
  |(*width=([#60]|[#173]|[#230-240]|[#400-500]) & *height=[#60])
  ) *>$SET(9=banner1)

# square banners (100x100 RB2 Network rb2.design.ru)
*http://*(
     (*width=[#95-105] & *height=[#95-105])
     |(*width=[#120-130] & *height=[#120-130])
  ) *>$SET(9=square1)

# Rare standard banners
*http://*(
     (*width=[#390-392] & *height=[#70-72])
     |(*width=[#120] & *height=([#60]|[#90]|[#240]))
     |(*width=[#230] & *height=[#30-33])
  ) *>$SET(9=rarebanner1)

# Non-standard banners (primarily adult sites)
*http://*(
     (*width=[#459-461] & *height=([#55-70]|[#80-90]|[#136]))
     |(*width=[#400] & *height=([#80]|[#100]|[#120]|[#150]))
     |(*width=[#450] & *height=([#80]|[#90]|[#125-130]|[#150]))
  ) *>$SET(9=non-standard1)

# Miscellaneous graphics
*http://*(
      (*width=[#100] & *height=[#50])
      |(*width=[#200] & *height=([#55-60]|[#300]))
      |(*width=[#250] & *height=[#150])
  ) *>$SET(9=misc)

#  User sizes go here...
# USER SECTION
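
For reference when checking whether the list could ever match, here is how I read the first three size rules (the ones above "Part 2"), sketched in Python. The function name is hypothetical and this only models the width/height ranges; it says nothing about what text Proxomitron actually tests the list against, and the Part 2 rules (which also require "http://") are left out.

```python
def classify_ad_size(width, height):
    """Return the label the Part 1 rules would set, or None if no rule applies."""
    # banners: 468-470 x 60, plus 60/173/230-240 x 60
    if height == 60 and (
        468 <= width <= 470 or width in (60, 173) or 230 <= width <= 240
    ):
        return "banner"
    # counters: 81 x 63
    if width == 81 and height == 63:
        return "counter"
    # buttons: 88-89 wide, 30-31 or 60-62 high
    if 88 <= width <= 89 and (30 <= height <= 31 or 60 <= height <= 62):
        return "button"
    return None


for size in ((468, 60), (88, 31), (81, 63), (300, 200)):
    print(size, classify_ad_size(*size))
```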

 
