how to clean foo
Apr. 06, 2006, 07:11 PM
Post: #1
suppose I have this link and this filter

<a href="../cgi-bin/linez/load.cgi?http://arstechnica.com/news.ars/post/20060406-9999.html&foo=Is this a really important article? 04-06-2006">

Matching expression:
../cgi-bin/linez/load.cgi?\1

Replacement:
\1

This strips the ../cgi-bin/linez/load.cgi? prefix and leaves the plain article link to arstechnica.com.

Fine, but the &foo part is also unwanted, and I would like to drop everything from &foo= to the end of the link.

IOW, this part would be dropped:
&foo=Is this a really important article? 04-06-2006

This is the link that should remain after the filter:
http://arstechnica.com/news.ars/post/20060406-9999.html
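
For reference, the rewrite being asked for can be sketched outside Proxomitron. This is only an illustration in Python, not filter syntax; the helper name clean_link is hypothetical, and it assumes the wrapper prefix and the &foo= marker appear exactly as in the example link.

```python
# Hypothetical helper: illustrates the desired rewrite, not a Proxomitron filter.
WRAPPER = "../cgi-bin/linez/load.cgi?"

def clean_link(href: str) -> str:
    """Drop the load.cgi? wrapper and everything from &foo= onward."""
    if href.startswith(WRAPPER):
        href = href[len(WRAPPER):]
    # Cut at the first "&foo=" if present; otherwise the link is kept as-is.
    return href.split("&foo=", 1)[0]

raw = (
    "../cgi-bin/linez/load.cgi?"
    "http://arstechnica.com/news.ars/post/20060406-9999.html"
    "&foo=Is this a really important article? 04-06-2006"
)
print(clean_link(raw))
# prints: http://arstechnica.com/news.ars/post/20060406-9999.html
```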
Apr. 06, 2006, 07:27 PM
Post: #2
 
Guest,

Your filter Match expression just needs a few additional items. Try it like this:
Code:
Match = "../cgi-bin/linez/load.cgi?\1*\2&foo*>"
Replace = "\2""
and see what happens. Note that there are three quote marks in that Replace expression. The one immediately following the \2 is needed to close off the href attribute value. Without it, the replacement link won't work. (The original closing quote sat at the very end of the &foo... text, which used to get through and is now dropped.)


Oddysey

I'm no longer in the rat race - the rats won't have me!
Apr. 06, 2006, 08:20 PM
Post: #3
 
Nope, that doesn't work.

With the suggested changes, it no longer strips ../cgi-bin/linez/load.cgi? either.

As soon as I remove the changes and go back to the original filter, it strips ../cgi-bin/linez/load.cgi? again.

...but of course I also want to kill the &foo= part.
Apr. 06, 2006, 09:39 PM
Post: #4
 
Try this filter:

Code:
[Patterns]
Name = "Clean URL"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 256
Match = "$AV(../cgi-bin/linez/load.cgi\?\1\&foo*)"
Replace = ""\1""
Apr. 06, 2006, 09:59 PM
Post: #5
 
To match characters that Proxomitron itself uses as matching operators, you may need to "escape" them by preceding them with a backslash.
http://www.proxomitron.info/45/help/Matc...Rules.html
You probably want to match a ? and an & so you should use \? and \&.

Code:
[Patterns]
Name = "New HTML filter"
Active = FALSE
Limit = 256
Match = "../cgi-bin/linez/load.cgi\?\1\&*">"
Replace = "\1">"
However, those unrestrained wildcards may be trouble.
It's probably best to start your Match with <, end with >, and use Bounds.
Bounds can simplify the Match and remove "Limit" worries.
For instance, Bounds like
Code:
<a\shref*>
would limit the filter to matching the HTML between any <a href and the first following >, regardless of the Byte Limit.

Match could then be
Code:
\1../cgi-bin/linez/load.cgi\?\2\&*

\1 is <a href="
\2 is http://arstechnica.com/news.ars/post/20060406-9999.html

So the Replace would be
Code:
\1\2">

Putting it all together:
Code:
[Patterns]
Name = "New HTML filter"
Active = FALSE
Bounds = "<a\shref*>"
Limit = 256
Match = "\1../cgi-bin/linez/load.cgi\?\2\&*"
Replace = "\1\2">"
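
The Bounds, Match, and Replace steps above can be approximated in Python to see what each stage contributes. This is a rough sketch, not Proxomitron's matching engine: the regular expressions and the names BOUNDS, MATCH, rewrite_tag, and clean_html are assumptions made for illustration.

```python
import re

# Rough analogue of Bounds: isolate each <a href ...> tag up to the first ">".
BOUNDS = re.compile(r"<a\s+href[^>]*>", re.IGNORECASE)

# Rough analogue of Match: group 1 is the text before the wrapper, group 2 is
# the real URL up to the first "&"; the rest of the tag is consumed and dropped.
MATCH = re.compile(r"(.*?)\.\./cgi-bin/linez/load\.cgi\?(.*?)&.*", re.DOTALL)

def rewrite_tag(tag_match: re.Match) -> str:
    tag = tag_match.group(0)
    m = MATCH.fullmatch(tag)
    if m is None:
        return tag  # tag does not use the load.cgi wrapper; leave it alone
    # Analogue of the Replace: \1, then \2, then a closing quote and ">".
    return f'{m.group(1)}{m.group(2)}">'

def clean_html(html: str) -> str:
    # Only text inside the bounded tags is ever rewritten.
    return BOUNDS.sub(rewrite_tag, html)

page = (
    '<p><a href="../cgi-bin/linez/load.cgi?'
    "http://arstechnica.com/news.ars/post/20060406-9999.html"
    '&foo=Is this a really important article? 04-06-2006">story</a></p>'
)
print(clean_html(page))
```

Note that, like the \& in the Match, this sketch cuts at the first & after the wrapper, so a target URL that carries its own query parameters would lose them.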

Might want to add a URL Match for efficiency.

"Sharing your web filters" at http://www.proxomitron.info/45/help/Web%...lters.html

HTH
Apr. 06, 2006, 10:17 PM
Post: #6
 
Kye-U Wrote: Try this filter:

Code:
[Patterns]
Name = "Clean URL"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 256
Match = "$AV(../cgi-bin/linez/load.cgi\?\1\&foo*)"
Replace = ""\1""
I'd want to add an = at least, so the match only starts at an attribute's equals sign:
Code:
[Patterns]
Name = "Clean URL"
Active = TRUE
URL = "$TYPE(htm)"
Limit = 256
Match = "=$AV(../cgi-bin/linez/load.cgi\?\1\&foo*)"
Replace = "="\1""
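
As a rough picture of what anchoring on = and matching the attribute value does, here is a Python sketch. It assumes the value may be double-quoted, single-quoted, or unquoted, and it always writes the result back with double quotes, as the Replace above does. The names ATTR, clean_attr, and clean_tag are hypothetical, and this is not how Proxomitron implements $AV.

```python
import re

WRAPPER = "../cgi-bin/linez/load.cgi?"

# "=" followed by a double-quoted, single-quoted, or unquoted attribute value.
ATTR = re.compile(r"""=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]*))""")

def clean_attr(m: re.Match) -> str:
    # Exactly one of the three groups holds the value, without its quotes.
    value = next(g for g in m.groups() if g is not None)
    if not value.startswith(WRAPPER) or "&foo" not in value:
        return m.group(0)  # not the wrapped link; leave the attribute untouched
    url = value[len(WRAPPER):].split("&foo", 1)[0]
    return f'="{url}"'  # re-quote the cleaned value with double quotes

def clean_tag(tag: str) -> str:
    return ATTR.sub(clean_attr, tag)

tag = (
    '<a href="../cgi-bin/linez/load.cgi?'
    "http://arstechnica.com/news.ars/post/20060406-9999.html"
    '&foo=Is this a really important article? 04-06-2006">'
)
print(clean_tag(tag))
# prints: <a href="http://arstechnica.com/news.ars/post/20060406-9999.html">
```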