$URL match problem
Nov. 22, 2004, 02:42 PM
Post: #1
 
Hi kuruden,

I noticed one of my Content-Type header filters wasn't working.

Here's the Proxomitron code that works (4.5):
Code:
In = FALSE
Out = TRUE
Key = "Accept: 2. Capture File Extension (out)"
Match = "$URL((\w://\w/)+{1,*}([^.\?]+.)+{1,*}([a-z]+)\2 +\w)$SET(Fext=\2)&(^?)"

In = TRUE
Out = FALSE
Key = "Content-Type: 1a. Check if html (In)"
Match = "text*&$TST(Fext=$LST(Mime-List))"
Replace = "\0"

It's another two-part header filter. The Accept filter is designed never to match; it only sets the variable Fext. Fext is the file extension of the URL, so it should capture things like gif, jpeg, and so on.

The 2nd filter actually fixes the Content-Type header.
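To make the two-step idea concrete outside the filter syntax, here is a minimal Python sketch, not a translation of the pattern above. The names `capture_extension`, `MIME_LIST` and `extension_is_listed` are hypothetical, and the list contents are placeholders, since the actual Mime-List is not shown in this post.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical stand-in for the Mime-List; the real list is not shown in the post.
MIME_LIST = {"gif", "jpeg", "jpg", "png"}


def capture_extension(url: str) -> str:
    """Step 1 (request side): return the URL's file extension, without the dot."""
    path = urlsplit(url).path          # drops the query string and fragment
    _, ext = posixpath.splitext(path)  # e.g. "/images/logo.gif" gives ".gif"
    return ext[1:].lower()


def extension_is_listed(url: str) -> bool:
    """Step 2 (response side): test the captured extension against the list."""
    return capture_extension(url) in MIME_LIST
```

The first function plays the role of the Accept filter that only sets Fext; the second plays the role of the $TST(Fext=$LST(Mime-List)) check.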

I suspected the problem was with the text matching within the $URL command in the Accept filter, so I edited the Content-Type filter as follows, to see what was matching:

Code:
[HDR: Content-Type, 1a. Check if html (In)]
Type=in
Category=Header
Title=HDR: Content-Type, 1a. Check if html (In)
Header=Content-Type
Match=(?*)\0&*$URL((\w://\w/)+{1,*}([^.\?]+.)+{1,*}([a-z]+)\2 +\w)
Replace=\0$LOG(B\2)

This test code, which works in Proxomitron, seemed to send Proximodo into an endless loop when Firefox loaded a page. After waiting a while for the page to load, I checked the Log window and it was stuck on the first reply. I closed the Log window, then Proximodo, then Firefox.

They all seemed to close OK, but Windows Task Manager showed a Proximodo process still running, which I then ended.

I'm not sure whether the problem is related to matching the text inside the $URL or whether it's a problem with $URL itself.

I'd appreciate hearing your thoughts on this.

Mike
Nov. 22, 2004, 09:05 PM
Post: #2
 
This $URL command is okay; the problem comes from " +".
I didn't put infinite-loop protection in "+" processing, but I will.
Space-plus means "any number of spaces, repeated any number of times", and the engine was looping while repeating the empty (no-space) match any number of times.

Anyway, why do you put a plus after a space in your filter? Just a space is enough, or \s+ would be okay too....
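The looping described above, and one common way to guard against it, can be sketched in Python. This is an illustration only, not Proximodo's actual code: `match_spaces` and `repeat` are hypothetical names, and the guard shown (stop as soon as an iteration consumes nothing) is just one possible protection.

```python
def match_spaces(text: str, pos: int) -> int:
    """Match any number of spaces (possibly zero) and return the new position."""
    end = pos
    while end < len(text) and text[end] == " ":
        end += 1
    return end


def repeat(item, text: str, pos: int):
    """Apply `item` repeatedly, as "+" does, and return (position, repetitions)."""
    count = 0
    while True:
        end = item(text, pos)
        if end == pos:
            # Zero-width match: without this check the loop would never end,
            # because repeating an empty match makes no progress.
            break
        pos = end
        count += 1
    return pos, count
```

Without the `end == pos` check, `repeat(match_spaces, "ab", 1)` would spin forever, since each iteration succeeds while consuming nothing.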
Nov. 22, 2004, 11:11 PM
Post: #3
 
Hi kuruden,

kuruden Wrote: Anyway, why do you put a plus after a space in your filter? Just a space is enough, or \s+ would be okay too....

Actually, it shouldn't be there. :)

When I first made the filter, I had spaces separating the various portions of the filter for clarity. Apparently, I forgot to remove that one.

kuruden Wrote: space-plus means "any number of spaces, repeated any number of times"

Any number can be zero also. If I want to ensure at least one match, I use +{n} or +{min,max}. (Unless, of course, it needs to match exactly once; then I don't use + at all.)

Often I use this:
Code:
https+://
It matches either http or https when matching URLs.
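For readers more used to conventional regular expressions, a rough Python analogue may help. It assumes, as described in this post, that "+" repeats the preceding item zero or more times, so `s+` corresponds roughly to regex `s*`. The name `HTTP_OR_HTTPS` is made up for the example.

```python
import re

# Rough regex analogue of the filter pattern "https+://":
# the trailing "s" may appear zero or more times.
HTTP_OR_HTTPS = re.compile(r"https*://")


def is_http_url(url: str) -> bool:
    """Return True if the URL starts with http:// or https://."""
    return HTTP_OR_HTTPS.match(url) is not None
```

Note that zero-or-more also accepts "httpss://"; in regex terms, `https?://` would be the stricter form that allows at most one "s".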

Anyway, I'm glad you noticed that extra space. I'll fix that and give it another go.

Thanks
Mike