The Un-Official Proxomitron Forum

Full Version: How to test a variable against another variable?
I am trying to write a filter that adds links to Google cache result pages so that I can keep browsing within the cache, just like this userscript does.

I intend to add links only for URLs on the same host, so I use the code below to extract the host address and store it in a global variable for later use.

Code:
(This is Google's cache of <a[^>]+>http://\1/*</a>)\0$SET(currentHost=\1)

The next step is to match only the URLs from the same host. I tried the code below, but it seems the $GET command can't be expanded in the match field, just as the help file states:
Code:
<a*href=$AV($GET(currentHost)/\2)*</a>

Then I tried to capture the host part into a variable:
Code:
<a*href=$AV(\7/\2)*</a>

and tested it against the currentHost variable in the replace field, but that doesn't work either.
Code:
$TST(\7=$GET(currentHost))$SET(......)

Any ideas?
I find variables confusing sometimes; try reading Sidki's "Techniques" file, I think some of it might apply to what you're trying to accomplish: http://sidki.proxfilter.net/prox/sidki-e...niques.txt
I read that already and even searched sidki's config set, but couldn't find an example of testing one variable against another.

Is there any other way to do what I want to do?
(Aug. 21, 2008 04:04 AM) whenever Wrote: The next step is to match only the URLs from the same host. I tried the code below, but it seems the $GET command can't be expanded in the match field, just as the help file states:
Code:
<a*href=$AV($GET(currentHost)/\2)*</a>
I finally worked out a way, using a memory block list for the hostname match.

The filter below only demonstrates the technique and may not be ready for real use.
Code:
[Patterns]
Name = "Add cache links to Google cache results page"
Active = TRUE
URL = "[^/]+/search\?*q=cache"
Limit = 256
Match = "(This is Google*s cache of <a[^>]+>http://\1/*</a>)\0$SET(currentHost=\1)$ADDLST(Mem-Temp,\1)|"
        "(<a*href=$AV(http://$LST(Mem-Temp)/\1|/\1)*</a>)\0$SET(2= <a href="http://www.google.com/search?hl=en&q=cache:http://$GET(currentHost)/\1"><img src="http://www.google.com/favicon.ico" style="border: none;" /></a>)"
Replace = "\0\2"
whenever Wrote: The next step is to match only the URLs from the same host. I tried the code below, but it seems the $GET command can't be expanded in the match field, just as the help file states:
Code:
<a*href=$AV($GET(currentHost)/\2)*</a>
...snip...
Any ideas?

Here's a test filter that does what you want, I think.
Code:
[Patterns]
Name = "New HTML filter"
Active = FALSE
Limit = 256
Match = "$SET(foo=http://www.foobar.com)"
        "(<a*href=$AV($TST(foo)/\2)*</a>)\3"
Replace = "\3"

In the log window, these will match:
Code:
<a href="http://www.foobar.com/foo_for_all/" >foo for all </a>
<a href="http://www.foobar.com/foo_for_one/" >foo for one </a>

This one does not:
Code:
<a href="http://www.nofoo.com/foo_for_all/" >foo for all </a>

HTH
z12
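For readers who want to see the intent of the test above outside Proxomitron, here is a rough Python sketch. It is only an analogy: the value of foo and the three sample links come from the post, while the function name same_host_links and the regular expression are assumptions of this sketch, not a translation of Proxomitron's matching engine.

```python
import re

# Value stored in the filter's "foo" variable in the post above.
FOO = "http://www.foobar.com"

# The three sample links from the post.
SAMPLES = [
    '<a href="http://www.foobar.com/foo_for_all/" >foo for all </a>',
    '<a href="http://www.foobar.com/foo_for_one/" >foo for one </a>',
    '<a href="http://www.nofoo.com/foo_for_all/" >foo for all </a>',
]

def same_host_links(lines, prefix):
    """Return the anchors whose href starts with prefix followed by '/'."""
    # re.escape makes the stored value match literally.
    pattern = re.compile(
        r'<a\b[^>]*\bhref="' + re.escape(prefix) + r'/[^"]*"[^>]*>.*?</a>'
    )
    return [line for line in lines if pattern.search(line)]

matched = same_host_links(SAMPLES, FOO)
for line in matched:
    print(line)
```

Only the two www.foobar.com links are printed; the www.nofoo.com link is skipped, which mirrors the log window result described above.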
Thank you, z12. I found that the trick is documented in the help file too, right in the $TST() command section.

Each time I reread the help file, I learn something I had not noticed or understood before.

So, here is an improved filter that doesn't use a memory block list.
Code:
[Patterns]
Name = "Add cache links to Google cache results page"
Active = TRUE
URL = "$TYPE(htm)[^/]+/search\?*q=cache"
Limit = 256
Match = "This is Google*s cache of <a[^>]+>http://\1/*</a>$SET(currentHost=\1)PrxFail|"
        "(<a*</a>&&*href=$AV(http://$TST(currentHost)/\1|/\1)*)\0$SET(2= <a href="http://www.google.com/search?hl=en&q=cache:http://$GET(currentHost)/\1"><img src="http://www.google.com/favicon.ico" style="border: none;" /></a>)"
Replace = "\0\2"
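As a way to check the overall effect of the filter above without loading it into Proxomitron, the rewrite can be approximated in Python. This is a minimal sketch under assumptions: the function name add_cache_links is made up for illustration, the "[cache]" text link stands in for the favicon image used in the filter, and the regular expression only handles double-quoted href values.

```python
import re

def add_cache_links(html, host):
    """Append a Google cache link after each anchor that points at host."""
    # The href is either absolute on the same host or root-relative ("/path").
    anchor = re.compile(
        r'(<a\s[^>]*?href="(?:http://' + re.escape(host) + r')?/([^"]*)"[^>]*>.*?</a>)',
        re.IGNORECASE | re.DOTALL,
    )

    def repl(m):
        path = m.group(2)
        cache = "http://www.google.com/search?q=cache:http://" + host + "/" + path
        # Keep the original anchor and add the extra link right after it.
        return m.group(1) + ' <a href="' + cache + '">[cache]</a>'

    return anchor.sub(repl, html)

page = ('<a href="http://example.com/a.html">A</a> '
        '<a href="/b.html">B</a> '
        '<a href="http://other.org/c.html">C</a>')
print(add_cache_links(page, "example.com"))
```

The links to example.com and the root-relative link each get a cache link appended; the link to other.org is left alone.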
whenever Wrote: Each time I reread the help file, I learn something I had not noticed or understood before.

Same with me. It's definitely not a file you read just once.

As Kye-U mentioned, sidki's Techniques.txt is also quite helpful, especially with regard to variables.

BTW, as per your thread title, this tests if two variables, to & tc, are equal:
Code:
$TST(to=$TST(tc))

While this may seem obvious to some, it took me a while to figure it out.

z12
Great! It seems obvious now that I know the trick above.

I learned another trick about $TST() at Castlecops; maybe you will be interested in it too.
If you use "This is Google's cache of", the filter will only work for English-language users. In my case, being Spanish, I see "Esta es la versión en caché de", so it will never match. Please, let's try to write global filters.

How? By taking "currentHost" from the actual URL of the cache page:
Code:
$URL(*cache:*:(*)\5/*)$SET(currentHost=\5)

Code:
[Patterns]
Name = "Google cache: Add cache links to Google cache results page {whenever,ln}081010"
Active = TRUE
URL = "$TYPE(htm)[^/]+/search\?*q=cache"
Bounds = "$NEST(<a\s,</a>)"
Limit = 256
Match = "(*href="
        "$TST(V_Postbody=1)$URL((*cache:*:)\4(*)\5/*)$SET(currentHost=\5)"
        "$AV(http://$TST(currentHost)/\1|/\1)*)\2"
Replace = "\2<a href="http://www.google.com/search?q=cache:http://$GET(currentHost)/\1"><img src="http://www.google.com/favicon.ico" style="border: none;" /></a>"

I like this filter, thanks for it.

Updated: Changed "/search?hl=en&q=cache" to "/search?q=cache"
That's better. I made some improvements based on your filter to cover the situations below:

cache:www.test.com&other=
No "http://" after "cache:" and no trailing "/".

cache:www.test.com
currentHost is the last parameter.

Code:
[Patterns]
Name = "Add cache links to Google cache results page {whenever,ln}081010"
Active = TRUE
URL = "$TYPE(htm)[^/]+/search\?*q=cache"
Bounds = "<a*</a>"
Limit = 256
Match = "$URL(*cache:(http://|)(\3)([/&]*|))$SET(currentHost=\3)"
        "(*href=$AV(http://$TST(currentHost)/\1|/\1)*)\2"
Replace = "\2<a href="http://www.google.com/search?hl=en&q=cache:http://$GET(currentHost)/\1"><img src="http://www.google.com/favicon.ico" style="border: none;" /></a>"
Better again. Thanks!
Since the host is extracted on every match, there is no longer any need for a global variable. The new version below also improves the host extraction and handles the percent-encoded forms of ":" and "://" (%3a and %3a%2f%2f).

Code:
[Patterns]
Name = "Add cache links to Google cache results page {whenever,ln}081011"
Active = TRUE
URL = "$TYPE(htm)[^/]+/search\?*q=cache"
Bounds = "<a*</a>"
Limit = 256
Match = "$URL(*cache(:|%3a)(http(://|%3a%2f%2f)|)([^/&]+)\1*)"
        "(*href=$AV(http://$TST(\1)/\2|/\2)*)\3"
Replace = "\3<a href="http://www.google.com/search?q=cache:http://\1/\2"><img src="http://www.google.com/favicon.ico" style="border: none;" /></a>"
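The host extraction in the $URL() part above can be tried out separately with a rough Python equivalent. This is a sketch under assumptions: the name cached_host and the sample URLs are invented for illustration, and re.IGNORECASE is used so that both %3a and %3A match. The character class [^/&]+ is kept as in the filter, so it shares the filter's limits.

```python
import re

# Rough counterpart of: cache(:|%3a)(http(://|%3a%2f%2f)|)([^/&]+)
CACHE_HOST = re.compile(
    r'cache(?::|%3a)(?:http(?:://|%3a%2f%2f))?([^/&]+)',
    re.IGNORECASE,
)

def cached_host(url):
    """Return the host that follows 'cache:' in a cache URL, or None."""
    m = CACHE_HOST.search(url)
    return m.group(1) if m else None

# Hypothetical URLs covering the cases discussed in the thread.
EXAMPLES = [
    "http://www.google.com/search?q=cache:http://www.test.com/page.html",
    "http://www.google.com/search?q=cache:www.test.com&other=",
    "http://www.google.com/search?q=cache:www.test.com",
    "http://www.google.com/search?q=cache%3Ahttp%3A%2F%2Fwww.test.com/page.html",
]

for url in EXAMPLES:
    print(cached_host(url))
```

Each of the four sample URLs yields www.test.com, whether or not "http://" is present, whether the host is followed by "/" or "&" or nothing, and whether ":" and "://" are percent-encoded.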
whenever Wrote: I learned another trick about $TST() at Castlecops; maybe you will be interested in it too.

At first glance, the Castlecops $TST looks a bit odd.

Basically it's just this:
Code:
$TST(VariableName=Matching expression)
$TST(VariableName=*)
...
$TST(keyword=*)
$TST(keyword=\1)

Both the * and \1 match the current value of the VariableName.
The \1 has the added bonus of saving that value.
You can do this wherever you can use $TST().

BTW, personally, I would put the \s back into your filter's Bounds expression.
Code:
Bounds = "<a\s*</a>"
It's a bit safer.
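The reason the \s helps can be illustrated with a small Python sketch. It is an analogy only: the two regular expressions are approximate stand-ins for the Bounds expressions <a*</a> and <a\s*</a>, and the sample HTML is made up for the demonstration.

```python
import re

# Approximate stand-ins for the two Bounds expressions.
LOOSE = re.compile(r'<a.*?</a>', re.IGNORECASE | re.DOTALL)     # like <a*</a>
STRICT = re.compile(r'<a\s.*?</a>', re.IGNORECASE | re.DOTALL)  # like <a\s*</a>

html = '<abbr title="x">HTML</abbr> and <a href="/page">a link</a>'

loose_match = LOOSE.search(html).group(0)
strict_match = STRICT.search(html).group(0)

print(loose_match)   # starts at <abbr and runs to the first </a>
print(strict_match)  # only the real anchor
```

Without the whitespace requirement, any tag whose name begins with "a" (abbr, area, address and so on) can open the match, so the bounds may swallow unrelated markup before the real link.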

Perhaps you should post your filter in Website Customization when you're done with it.
More people will probably see it there.

z12
You are right about the \s. The fixed version is posted here.
Agreed!

I sometimes used a\s, but sometimes I removed it to test for area; I still haven't tested that, though. I will play with (a|area)\s.

More ideas are welcome ;)

Edited: Deleted what I said about $NEST or $INEST. It's useful for tables, but not here. It can also be confusing, so I removed it for that reason.