The Un-Official Proxomitron Forum
sidki's config set: 2005-06-09


- sidki3003 - May. 13, 2005 04:54 AM

Seikatsu Wrote: Well, if somebody could stop for a moment and look at why the fields on Weather.com's detail page are empty, I would appreciate it. Thanks.
Here's the ad-comment list from the beta. Rename it to *.ptxt and then replace the old version with this one. That should fix it. :)

sidki


- Seikatsu - May. 13, 2005 05:13 AM

By the way, when are you thinking of releasing your next update? Just out of curiosity... :)


- sidki3003 - May. 13, 2005 05:26 AM

The format of quite a few lists has changed, many filters were adjusted and aren't backwards compatible, so i don't like to upload/post any updates.
Well, i do, but only on request, if someone gets a broken page and an updated filter fixes it.
Next thing will be a new config version, due out within the couple-of-days to one-month range.

sidki


- z12 - May. 13, 2005 03:05 PM

Hi sidki

Sorry it took so long to reply, I had to think a while.

I think this is much simpler than my previous hacks. Basically, it prevents the 1 sec cache for replies that had the mime-type fixed to non-HTML.

This is what I'm running right now; "unmodified" is relative to your original filter.

# unmodified
Code:
In = FALSE
Out = TRUE
Key = "Cache-Control: 1 Kill: Cache!     5.01.20 (cch!) [srl] (d.0) (Out)"
URL = "(^$KEYCHK(^C)|$KEYCHK(^S))$TST(keyword=*.(i_cache|i_cache_h):[12].*)"
Match = "\0&($TST(keyword=*.s_ptron.*)|$LOG(CGET $DTM(c) : Cache-Control killed: \0))"


# unmodified
Code:
In = TRUE
Out = FALSE
Key = "Cache-Control: 2 max-age 1 Day: Cache!     4.12.24 (cch!) [z12] (d.0) (In)"
URL = "(^$KEYCHK(^C)|$KEYCHK(^S)|$RESP([45]))$TST(keyword=*.(i_cache|i_cache_h):[12].*)"
Match = "\0&((^(public , |)max-age=[#86400:*])?$LOG(CRESP $DTM(c) : Cache-Control replaced: \0)|(^?))"
Replace = "public, max-age=86400"


# previously known as "Cache-Control: 4"
# modified url & match, allow 1 sec cache for 200 & not mime-type fixed --- see ETag Filter ---
Code:
In = TRUE
Out = FALSE
Key = "Cache-Control: 3 max-age 1s if HTML: Cache-H!     4.12.24 (cch!) [sd] (d.0) (In)"
URL = "(^$KEYCHK(^C)|$KEYCHK(^S))$TST(keyword=*.i_cache_h:[12].*)$RESP(200)"
Match = "(^$TST(eTag=*prxEtag2*))$IHDR(content-type:*(html|xml)*)"
Replace = "max-age=1"


# previously known as "Cache-Control: 3"
# modified url, allow Cache-Control on 304 only if mime-type fixed --- see ETag Filter ---
Code:
In = TRUE
Out = FALSE
Key = "Cache-Control: 4 Kill if 3xx Response: Cache-H!     4.12.23 (cch!) [srl] (d.0) (In)"
URL = "$RESP(3)(^$TST(eTag=*prxEtag2*))(^$KEYCHK(^C)|$KEYCHK(^S))$TST(keyword=*.i_cache_h:[12].*)"
Match = "?"

# unmodified
Code:
In = TRUE
Out = TRUE
Key = "Cache-Control: 5 no-cache: Fresh!     4.12.26 (cch! ^ctrl) [sd] (d.0) (In+Out)"
URL = "(^$KEYCHK(^S))($KEYCHK(^C)|$TST(keyword=*.i_fresh:[12].*))"
Replace = "no-store, no-cache, max-age=0"


# all new, adds prxEtag only if content-type was changed to non html and cache-control is 1sec
# still need to fix the content-type check :)
Code:
In = TRUE
Out = FALSE
Key = "ETag: 1 Add Proxo ETag (cch!) (In)"
URL = "$TST(keyword=*.(i_cache|i_cache_h):[12].*)$RESP(200)"
Match = "\0&$IHDR(Cache-Control:max-age=1)$IHDR(content-type:([^;]+&&(^*(html|xml))$SET(1=prxETag2;\0)*)*PrxMsg=*)"
Replace = "\1$LOG(CRESP $DTM(c) : added proxo eTag)"

# disabled for now, don't think it's needed
Code:
In = FALSE
Out = FALSE
Key = "ETag: 2 Restore Proxo ETag (cch!) (In)"
URL = "$TST(keyword=*.(i_cache|i_cache_h):[12].*)$RESP(304)"
Match = "\0&(^?)$TST(eTag=?*)$SET(2=$GET(eTag))|$TST(eTag=(*prxETag?;)\1*)$SET(2=\1\0)"
Replace = "\2$LOG(CRESP $DTM(c) : restored ETag)"


# similar to yours, I'm still prefixing the ETag, doh
Code:
In = FALSE
Out = TRUE
Key = "If-None-Match: 1 Remove Prox ETag (cch!) (Out)"
Match = "(prxETag?;)\0\1$SET(eTag=\0\1)"
Replace = "\1"

# looks to be same as yours
Code:
In = FALSE
Out = TRUE
Key = "If-None-Match: 2 Remove if empty (cch!) (Out)"
Match = "(^?)$LOG(CRESP $DTM(c) : removed If-None-Match:)"

# unmodified
Code:
In = FALSE
Out = TRUE
Key = "If-None-Match: 3 Kill: Fresh!     4.08.11 (cch! ^ctrl) [u] (d.0) (Out)"
URL = "(^$KEYCHK(^S))($KEYCHK(^C)|$TST(keyword=*.i_fresh:[12].*))"
Match = "\0&$LOG(CGET $DTM(c) : If-None-Match killed: \0)"


# similar to yours, adds it for 200 replies, if not html & it's missing
# need to fix content-type match
Code:
In = TRUE
Out = FALSE
Key = "Last-Modified: 2 Add if Missing (cch!) (In)"
URL = "$TST(keyword=*.(i_cache|i_cache_h):[12].*)$RESP(200)"
Match = "(^?)(^$IHDR(content-type:([^;]+&&*(html|xml)*)*))"
Replace = "$DTM(I)$LOG(CRESP $DTM(c) : added Last-Modified)"


I think we might be close

Mike
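
The "Last-Modified: 2 Add if Missing" idea in the post above can be sketched outside Proxomitron. This is only a rough Python illustration of the conditions described there (200 reply, type is not html/xml, header missing), not the filter itself. The function name `add_last_modified` and the plain dict of headers are made up for the example, and it assumes that `$DTM(I)` yields an HTTP-style date.

```python
from email.utils import formatdate


def add_last_modified(status, headers):
    """Return a copy of headers with Last-Modified added when it is missing.

    Hypothetical helper: only 200 replies are touched, only when the media
    type (the part before any ';') is not html/xml, and only when the
    header is absent.
    """
    result = dict(headers)
    media_type = result.get("Content-Type", "").split(";")[0].lower()
    is_markup = "html" in media_type or "xml" in media_type
    if status == 200 and not is_markup and "Last-Modified" not in result:
        # usegmt=True produces an HTTP-style date ending in "GMT"
        result["Last-Modified"] = formatdate(usegmt=True)
    return result
```

A browser that receives a Last-Modified value can later send If-Modified-Since, which is what makes a 304 reply possible for these files.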


- sidki3003 - May. 14, 2005 09:45 AM

Nice, much less overhead that way for the most common situation, where nothing needs to be fixed. :)
Also great that you found a solution for that 304 problem.

I didn't change much: Got rid of some wildcard tests and moved a few things around. I also replaced the Cache-Control test ($IHDR/$OHDR tests tend to consume time) in the ETag filter with a test for what follows "PrxMsg=".

I've (re-)added the opposite test (wasn't html but it is now) to the ETag filter. It seems useful to me, also because for those notorious asp/aspx/php extensions that are not text/html, the content-type gets removed and re-evaluated by the content-type sniffers. Of course, this test only makes sense if you hit the refresh button at some point.

I've added application/xhtml+xml only to "Cache-Control: 3 max-age 1s". XHTML gets caught by the ETag filter anyway with the above-mentioned opposite test. RSS feeds can come as text/xml, application/rdf+xml, and numerous variations thereof. Also, i don't think it's necessary to add overhead to the filters just because of possibly mislabeled XHTML - those "&&" tests looked a bit heavy to me.

Code:
[HTTP headers]
In = TRUE
Out = FALSE
Key = "Cache-Control: 3 max-age 1s if HTML: Cache-H!     5.05.14 (cch!) [sd] (d.0) (In)"
URL = "(^$KEYCHK(^C)|$KEYCHK(^S)|$TST(flag=*.etag:no-html.*))$TST(keyword=*.i_cache_h:[12].*)"
Match = "$TST(flag=*.etag:html.*)|$IHDR(Content-Type:( ) (text/html|application/xhtml\+xml))"
Replace = "max-age=1"

In = TRUE
Out = FALSE
Key = "Cache-Control: 4 Kill if 3xx Response: Cache-H!     5.05.14 (cch!) [srl] (d.0) (In)"
URL = "$RESP(3)(^$KEYCHK(^C)|$KEYCHK(^S)|$TST(flag=*.etag:*))$TST(keyword=*.i_cache_h:[12].*)"
Match = "?"

In = TRUE
Out = FALSE
Key = "ETag: Append Prox Field: Cache-H!     5.05.14 (cch!) [z12] (d.0) (In)"
URL = "$RESP(200)(^$KEYCHK(^C)|$KEYCHK(^S))$TST(keyword=*.i_cache_h:[12].*)"
Match = "(?$SET(1=\0; )*|)\0&$IHDR(Content-Type: ((^ text/html)*: text/html$SET(2=no-html)| text/html*: (^text/html)$SET(2=html)))"
Replace = "\1PrxMsg=\2"

In = FALSE
Out = TRUE
Key = "If-Modified-Since: 1 Strip Prox Field     5.05.14 [sd] (d.0) (Out)"
Match = "\0; PrxMsg=\1"
Replace = "\0$LOG(CGET $DTM(c) : If-Modified-Since: Prox Field stripped: \1)"

In = FALSE
Out = TRUE
Key = "If-Modified-Since: 2 Kill: Fresh!     4.08.11 (cch! ^ctrl) [srl] (d.0) (Out)"
URL = "(^$KEYCHK(^S))($KEYCHK(^C)|$TST(keyword=*.i_fresh:[12].*))"
Match = "\0&$LOG(CGET $DTM(c) : If-Modified-Since killed: \0)"

In = FALSE
Out = TRUE
Key = "If-None-Match: 1 Strip Prox Field - Set Flag     5.05.14 [z12] (d.0) (Out)"
Match = "(\0; |)PrxMsg=\1&$SET(flag=$GET(flag)etag:\1.)"
Replace = "\0$LOG(CGET $DTM(c) : If-None-Match: Prox Field stripped: \1)"

In = FALSE
Out = TRUE
Key = "If-None-Match: 2 Kill if empty     5.05.14 [z12] (d.0) (Out)"
Match = "( )(^?)"

In = FALSE
Out = TRUE
Key = "If-None-Match: 3 Kill if IMS present: Cache!     5.05.14 (cch!) [sd] (d.0) (Out)"
URL = "(^$KEYCHK(^C)|$KEYCHK(^S))$TST(keyword=*.(i_cache|i_cache_h):[12].*)"
Match = "(?*)\1&$OHDR(If-Modified-Since:(^*PrxMsg)*)$LOG(CGET $DTM(c) : If-None-Match killed due to IMS: \1)"

In = FALSE
Out = TRUE
Key = "If-None-Match: 4 Kill: Fresh!     4.08.11 (cch! ^ctrl) [u] (d.0) (Out)"
URL = "(^$KEYCHK(^S))($KEYCHK(^C)|$TST(keyword=*.i_fresh:[12].*))"
Match = "\0&$LOG(CGET $DTM(c) : If-None-Match killed: \0)"

In = TRUE
Out = FALSE
Key = "Last-Modified: 2 Add if Missing: Cache!     5.05.14 (cch!) [z12] (d.0) (In)"
URL = "$RESP(200)$TST(keyword=*.(i_cache|i_cache_h):[12].*)"
Match = "(^?|$IHDR(Content-Type:( ) text/html))"
Replace = "$DTM(I)\; PrxMsg=added$LOG(CRESP $DTM(c) : Last-Modified added)"

sidki
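
The "let the browser store a variable" trick in the filters above (append a `PrxMsg=` field to the ETag on the way in, strip it from If-None-Match on the way out) can be illustrated with a small Python sketch. It is a reading of the filters, not a translation: the function names are hypothetical, the two content types are passed in separately instead of being parsed from the Content-Type header, and "no-html" is taken to mean "was text/html, fixed to something else" while "html" means the opposite.

```python
def append_prox_field(etag, original_type, fixed_type):
    """Response side: tag the ETag when the content type was changed."""
    was_html = original_type.strip().lower().startswith("text/html")
    is_html = fixed_type.strip().lower().startswith("text/html")
    if was_html and not is_html:
        state = "no-html"
    elif is_html and not was_html:
        state = "html"
    else:
        return etag  # nothing was fixed, leave the ETag alone
    prefix = etag + "; " if etag else ""
    return prefix + "PrxMsg=" + state


def strip_prox_field(if_none_match):
    """Request side: remove the tag; return (clean value, state or None)."""
    head, sep, state = if_none_match.partition("PrxMsg=")
    if not sep:
        return if_none_match, None
    if head.endswith("; "):
        head = head[:-2]
    return head, state
```

For example, `append_prox_field('"abc"', "text/html", "image/gif")` gives `"abc"; PrxMsg=no-html`, and `strip_prox_field` on that value hands back the original `"abc"` plus the state. An empty clean value corresponds to the case handled by "If-None-Match: 2 Kill if empty".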


- z12 - May. 14, 2005 12:01 PM

Hi sidki

Very nice, those are some professional looking filters! :) I'm going to put those in after this post.

As for giving credit to me for some of those filters, I'm not sure that's right. These filters would not exist if not for your ideas and efforts. I was just trying to cache a few more things.

On a different note, I've noticed many of your header filters use code similar to this:
Code:
Match = "$IHDR(Content-Type:( ) (blah blah))"

I was wondering about the ( ) after the header field name. I've been hesitant to use that without understanding what's happening with it.

Mike


- sidki3003 - May. 14, 2005 01:33 PM

Hi Mike, so we made it! Although we probably need to watch these filters for a while to see if they are doing okay.

Quote: These filters would not exist if not for your ideas and efforts.
...and vice versa. Your idea to let the browser store a variable is really innovative.
I'll add sd to some of them - and your handle to "max-age: 1s". :)

Quote: I was wondering about the ( ) after the header field name.
That's because of a Prox quirk. I've documented it here (first para).


edit: Forgot about the changes to always cache local.ptron HTML.

IncludeExclude.ptxt:
Code:
# Proxomitron's own files: Bypass everything except the content-type and cache
# related header filters. This entry is REQUIRED.
# -----------------------------------------------------------------------------
local.ptron/      $SET(keyword=.a_web.a_headers.s_ptron.i_cache:2.)

Code:
[HTTP headers]
In = FALSE
Out = TRUE
Key = "! : Redir: . From /local.ptron/ to local.ptron     5.01.24b [mona] (d.r) (Out)"
URL = "[^/]+/local.ptron/\1&$SET(keyword=.a_web.a_headers.s_ptron.i_cache:2.)&$RDIR(http://local.ptron/\1)"

sidki


- sidki3003 - May. 14, 2005 07:21 PM

Here is a fix. In the version above i hadn't added back the exclusion of 4xx/5xx responses for "max-age: 1s".
Code:
[HTTP headers]
In = TRUE
Out = FALSE
Key = "Cache-Control: 3 max-age 1s if HTML: Cache-H!     5.05.14 (cch!) [sd z12] (d.0) (In)"
URL = "(^$KEYCHK(^C)|$KEYCHK(^S)|$RESP([45])|$TST(flag=*.etag:no-html.*))$TST(keyword=*.i_cache_h:[12].*)"
Match = "$TST(flag=*.etag:html.*)|$IHDR(Content-Type:( ) (text/html|application/xhtml\+xml))"
Replace = "max-age=1"

I wonder anyway what to do with all those favicon errors. Someone requested to not redirect them to a local file anymore, so that the browsers would show their default icon, which is understandable.

I changed it then to redirect to a local 404, which prevents downloading of the error pages, but still requires a remote connection on each revisit (incl. back/forward navigation) and fills up the log window.

Maybe it's better for most people to redirect these errors to killed.gif again...

sidki
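
The decision made by the revised "Cache-Control: 3 max-age 1s if HTML" filter in this post can be written out as a plain function. This is a simplified Python sketch under stated assumptions: the name `wants_one_second_cache` is invented, the `$KEYCHK` tests are left out, `etag_state` stands for the "etag:html" / "etag:no-html" flag, and `cache_h_enabled` stands for the `i_cache_h` keyword test.

```python
HTML_TYPES = ("text/html", "application/xhtml+xml")


def wants_one_second_cache(status, content_type, etag_state=None,
                           cache_h_enabled=True):
    """Return True when the reply should get "Cache-Control: max-age=1"."""
    if not cache_h_enabled:
        return False
    if 400 <= status <= 599:
        return False  # error replies are excluded again
    if etag_state == "no-html":
        return False  # type was fixed to non-HTML, keep the long cache
    if etag_state == "html":
        return True  # wasn't HTML before, but it is now
    # str.startswith accepts a tuple of prefixes
    return content_type.strip().lower().startswith(HTML_TYPES)
```

With these rules a 404 page labeled text/html is left alone, and so is text/xml, which matches the remark about not treating every XML type as HTML.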


- ProxRocks - May. 14, 2005 08:41 PM

If memory serves, wasn't the favicon redirect changed away from killed.gif because of a Firefox issue?

Didn't it have something to do with Firefox's tab not showing any icon, default or otherwise, whether or not the tab's page contained a favicon?

I'm afraid I just don't remember what the details were, but since my faded memory points to a Firefox user raising the issue, perhaps a $TST(original-user-agent) test is our solution: redirect to killed.gif for whichever of IE or Fx it suits best, and do the opposite for the other...

Sorry, wish I could recall the details - I remember the topic surfacing, I just don't recall 'what happened' and 'why'...


- sidki3003 - May. 14, 2005 09:33 PM

Yes, i think it was a Firefox user, it's the same thing with Opera tho.

Right, making it User-Agent dependent would be a way. Hmmm... thing is, i'm starting to doubt that this redir-to-local-404 thingy is even the best option for Firefox/Opera. It's okay to leave that for usual image errors, otherwise you would never see the ALT text, if the pic is missing.

But for favicons i could bring back the old behavior as an extra filter (that defaults to active). If users prefer to see the default icon (instead of no icon) for sites that have no icon, they could untick that filter, and error handling would fall back to redir-to-local-404. That may be a way to serve everyone's favicon needs.

Thanks for the input, helped me to get things sorted. :)

sidki


- z12 - May. 15, 2005 02:26 AM

hmm... I never noticed the favicon issue, although at one point in time I must have.

Code:
user_pref("browser.chrome.favicons", false);

Now, the only time I see them is when code is included for them:

Code:
<LINK REL="SHORTCUT ICON" HREF="http://www.dslreports.com/front/dslicon1.ico">

I haven't really looked at when the browser requests favicons, but I'm wondering if maybe a dummy favicon could be added to the header code if there is no Link to an .ico image.

Mike
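
The "dummy favicon" idea from the post above could look roughly like this in Python. It is only a sketch of the suggestion, not something from the config set: the function name is hypothetical, the regular expressions are deliberately loose, and using `http://local.ptron/killed.gif` as the dummy icon is an assumption.

```python
import re

# Loose check for an existing <link rel="icon"> or <link rel="shortcut icon">
LINK_ICON = re.compile(
    r"<link\b[^>]*\brel\s*=\s*[\"']?(?:shortcut\s+)?icon\b", re.IGNORECASE)
HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)


def add_dummy_favicon(html, icon_url="http://local.ptron/killed.gif"):
    """Insert a favicon link right after <head> if the page has none."""
    if LINK_ICON.search(html):
        return html  # page already names its own icon
    match = HEAD_OPEN.search(html)
    if not match:
        return html  # no <head> to attach the link to
    tag = '<link rel="shortcut icon" href="%s">' % icon_url
    return html[:match.end()] + tag + html[match.end():]
```

Whether a browser then skips its own request for /favicon.ico depends on the browser, as the following posts discuss.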


- sidki3003 - May. 15, 2005 10:25 AM

Yeah, that Firefox setting defaults to true.

My Maxthon copy handles "<link rel='shortcut icon'..." rather oddly:
First it looks for root/favicon.ico; if it's there, it uses that one and ignores the link tag (so this doesn't work there); if it's absent, it looks for the link tag and uses that one.
I assume that this behavior is hardcoded in the IE engine.

edit: Here is that filter. Hit "configure -> ok -> save -> reload" a few times after merging to get it into the right position.

If on, favicon errors are cached for one day, but you don't see the browser's default icon for favicon-less pages.
Else, errors are redirected to a local URL, but not cached.
Code:
[HTTP headers]
In = TRUE
Out = FALSE
Key = "! : Redir: . Kill Favicon Error Responses     5.05.15 [sd] (d.0) (In)"
URL = "$RESP(([45]*)\0)$TST(uFile=favicon)&$RDIR(http://local.ptron/killed.gif)$LOG(wRESP $DTM(c) : Connection killed: Favicon Error: \0)"

sidki
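
What the "Kill Favicon Error Responses" filter decides can be sketched in Python as follows. This is an illustration only: the function name is made up, and treating `uFile` as the file name without its extension is an assumption about the config's variables.

```python
import posixpath
from urllib.parse import urlsplit

KILLED_GIF = "http://local.ptron/killed.gif"


def favicon_error_redirect(status, url):
    """Return the local redirect target for failed favicon requests, else None."""
    filename = posixpath.basename(urlsplit(url).path)
    stem = filename.rsplit(".", 1)[0].lower()  # file name without extension
    if 400 <= status <= 599 and stem == "favicon":
        return KILLED_GIF
    return None
```

A 404 for `http://example.com/favicon.ico` is redirected, while a 200 for the same URL or a 404 for any other image is left alone.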


- z12 - May. 15, 2005 12:08 PM

I just reset that setting back to default to see what firefox does with favicons.

The way Maxthon is functioning doesn't sound too bad. If there's already an ico, no need to grab it again. At least the link tag is checked if there is no icon. Makes it kind of hard to replace it though, for any single page.

What puzzles me are sites that don't use the link tag when they do have a favicon, given the popularity of tabbed browsing. I was just at yahoo news, and if firefox hadn't asked for the favicon, I wouldn't have got it.

On the other hand, I downloaded some of yahoo's css files to look em over, and was getting "bad request" errors because firefox was trying to get favicons to go along with em.

Anyway, I'm going to play around with favicons for a while this morning.

By the way, I saw a few more odd things at yahoo news I want to look into:

Code:
+++GET 552+++
GET /v10/us/news/pages/homepage/ootb.php?u=/ap/20050515/ap_on_el_pr/exit_polls HTTP/1.1
Host: news.yahoo.com

...

+++GET 556+++
GET /v10/us/news/pages/homepage/ootb.php?u=/ap/20050515/ap_on_el_pr/exit_polls HTTP/1.1
Host: news.yahoo.com

All the requests [552-556] were identical and generated by clicking on a link to the article one time.

Mike
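
Repeated requests like the ones in the log excerpt above can be spotted mechanically. The following Python sketch assumes the log text uses exactly the `+++GET n+++` markers shown in the excerpt, followed by the request line and a Host header; the function name is hypothetical.

```python
import re
from collections import defaultdict


def find_duplicate_requests(log_text):
    """Map each repeated (request line, host) pair to its request numbers."""
    seen = defaultdict(list)
    # With one capture group, re.split yields
    # [preamble, number, body, number, body, ...]
    blocks = re.split(r"^\+\+\+GET (\d+)\+\+\+[ \t]*$", log_text,
                      flags=re.MULTILINE)
    for number, body in zip(blocks[1::2], blocks[2::2]):
        lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
        if not lines:
            continue
        request_line = lines[0]
        host = ""
        for ln in lines[1:]:
            if ln.lower().startswith("host:"):
                host = ln.split(":", 1)[1].strip()
                break
        seen[(request_line, host)].append(int(number))
    return {key: nums for key, nums in seen.items() if len(nums) > 1}
```

Running it over a saved log window would list each request line that was sent more than once, together with the request numbers involved.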


- z12 - May. 15, 2005 12:10 PM

Just saw the favicon filter you posted. Going to add it now.

Mike


- sidki3003 - May. 15, 2005 12:40 PM

z12 Wrote: Makes it kind of hard to replace it though, for any single page.
Yep! I think Mozilla/Opera do it the right (== opposite) way.

Quote: What puzzles me are sites that don't use the link tag when they do have a favicon, given the popularity of tabbed browsing.
I think that's exactly because of the above-mentioned IE behavior; IE's market share is still well above 80% according to WebSideStory (Germany has pretty cool stats there :) ).

Quote: All the requests [552-556] were identical and generated by clicking on a link to the article one time.
By clicking on what link exactly? I don't see them here.

sidki
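
The two favicon lookup orders compared in this exchange can be put side by side in a tiny Python sketch. It models only what the posts describe (Maxthon/IE: root file first, then link tag; Mozilla/Opera: the opposite), and the function and parameter names are invented for the illustration.

```python
def pick_favicon(root_icon_exists, link_href, link_first):
    """Return the icon URL a browser would use, or None.

    root_icon_exists: True if /favicon.ico exists at the site root
    link_href: href of the page's <link rel="shortcut icon">, or None
    link_first: True models the Mozilla/Opera order as described here,
                False models the Maxthon/IE order
    """
    root = "/favicon.ico" if root_icon_exists else None
    if link_first:
        return link_href or root
    return root or link_href
```

With a root icon present and a per-page link tag, the Maxthon/IE order returns the root icon, which is why a single page cannot replace it there.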