Basic Negate Question
Oct. 12, 2010, 12:43 PM
Post: #1
Basic Negate Question
I am trying to understand the negate function in the text matching language. I have never tried to write a filter using it before other than a single character negation.

Why does
*(^arstechnica)*

match when given http://www.arstechnica.com

Below is the only passage in the Proxomitron help files that directly relates to whole-word negation:

"For example, "(^foo|bar)" would match anything that's not "foo" or "bar". Note that a negated expression consumes no characters - it just test them. I think Perl calls this a "negative forward assertion"?"

Does anyone have enlightenment?

I am trying to blank out any urls that are NOT something.arstechnica.something.

Even if there is a better way to do this, I need to understand why the expression above produces a match when I think it should not.
Oct. 12, 2010, 03:15 PM
Post: #2
RE: Basic Negate Question
(Oct. 12, 2010 12:43 PM)qz33 Wrote:  Why does
*(^arstechnica)*

match when given http://www.arstechnica.com

The leading wildcard can match nothing.

(^arstechnica)* then matches the whole URL, because the text at that point does not start with arstechnica.

HTH
Oct. 12, 2010, 03:31 PM
Post: #3
RE: Basic Negate Question
(^foo) matches at a position that is not followed by "foo".

Try the filter below, which puts the match of the first * into \1, (^arstechnica) into \2 and the last * into \3.

Code:
[Patterns]
Name = "Test filter"
Active = FALSE
Limit = 256
Match = "\1((^arstechnica))\2\3"
Replace = "\\1 is: \1\r\n"
          "\\2 is: \2\r\n"
          "\\3 is: \3\r\n"

When given http://www.arstechnica.com, it outputs:

Quote:
\1 is:
\2 is:
\3 is: http://www.arstechnica.com

The first * starts by matching zero characters, that is, at the position before the first h.
At that position the text is not followed by "arstechnica", so (^arstechnica) matches.
The last * matches all the remaining characters.
...
So the whole expression matches.
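
The same walk-through can be reproduced outside Proxomitron with a rough Python analogue. This is only a sketch under assumptions: the lazy `(.*?)` stands in for Proxomitron's `*`, and the negative lookahead `(?!arstechnica)` stands in for `(^arstechnica)`; Python's `re` is a different engine, so it illustrates the idea rather than Proxomitron's exact behavior.

```python
import re

url = "http://www.arstechnica.com"

# Assumed analogue of the test filter's Match line:
#   (.*?)             plays the role of the first *  (\1)
#   ((?!arstechnica)) plays the role of (^arstechnica) (\2)
#   (.*)              plays the role of the last *   (\3)
pattern = re.compile(r"(.*?)((?!arstechnica))(.*)")

m = pattern.match(url)
first, negation, rest = m.groups()

print(repr(first))     # text taken by the first wildcard
print(repr(negation))  # the lookahead itself consumes no characters
print(repr(rest))      # text taken by the last wildcard
```

The first wildcard and the negation both come back empty, and the last wildcard takes the whole URL, which mirrors the filter output quoted above.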

HTH.
Oct. 12, 2010, 06:17 PM (This post was last modified: Oct. 13, 2010 01:30 AM by qz33.)
Post: #4
RE: Basic Negate Question
Great so far! Thanks.
So I tried this

(src|href)="http://*.(^arstechnica).*

against href="http://static.arstechnica.com/apple/ (yes, I know it is missing the closing quotation mark)

and it did not match, which is what I wanted and expected.

But when I put the "." inside the negated sub-expression like this

(src|href)="http://*.(^arstechnica.)*

it matches against href="http://static.arstechnica.com/apple/

I think it should not, and I do not understand why it does. It seems to me that the expression would match up to the FIRST ".", and everything after that IS "arstechnica.com/apple/", which should not match because of the (^arstechnica.) part.

So can someone help with the why here?

I guess I really should be using (src|href)="http://\0.(^arstechnica.)\1\2
Oct. 13, 2010, 12:30 AM (This post was last modified: Oct. 13, 2010 02:06 AM by JJoe.)
Post: #5
RE: Basic Negate Question
If I understand correctly,


Code:
(src|href)="http://*.(^arstechnica).*

(^arstechnica) doesn't actually 'consume' anything. It just checks that arstechnica is not there.
So the part of the expression that 'consumes' is (src|href)="http://*..* and, since that needs two periods in a row, it doesn't match much.

Code:
(src|href)="http://*.(^arstechnica.)*
against
href="http://static.arstechnica.com/apple/

The (src|href)="http://*.(^arstechnica.) part does match href="http://static.arstechnica. because that second period is followed by com/apple/ rather than by "arstechnica.".
*.(^arstechnica.) didn't stop looking for a match after static. was found; when the negation failed there, the * carried on to the next period.
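
For anyone who wants to watch the two cases side by side, here is a rough Python analogue. It is a sketch under assumptions, not Proxomitron itself: `.*?` stands in for `*`, `\.` for the literal period, and `(?!...)` for `(^...)`. The names `outside` and `inside` are just labels for this example.

```python
import re

text = 'href="http://static.arstechnica.com/apple/'

# Period OUTSIDE the negation: the lookahead must be followed at once by a
# literal period, so the text would need two periods in a row.
outside = re.compile(r'(?:src|href)="http://.*?\.(?!arstechnica)\..*')

# Period INSIDE the negation: the wildcard is free to grow past "static."
# until it reaches a period that is not followed by "arstechnica.".
inside = re.compile(r'(?:src|href)="http://(.*?)\.(?!arstechnica\.)(.*)')

no_match = outside.match(text)
m = inside.match(text)

print(no_match)    # the first expression finds nothing to match
print(m.group(1))  # what the wildcard ended up consuming
print(m.group(2))  # what was left after the accepted period
```

In this analogue the wildcard ends up holding static.arstechnica, which is the backtracking step described above: the first period is rejected, so the search moves on to the second one.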

Code:
http://\0.(^arstechnica.)\1\2

This will fail when every period is followed by "arstechnica.". Probably not going to happen.
\1 will always be empty.

Why do these match http://static.rstechnica.com/apple/ but not http://static.arstechnica.com/apple/ or do they?

http://(^*.arstechnica)*
http://([^.]+.(^arstechnica))+{1,*}*
http://((*.)+{1}(^arstechnica))+{1,*}*
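
One way to experiment with the first of these outside Proxomitron is a Python analogue. This is a sketch under assumptions: `(?!.*\.arstechnica)` is my stand-in for `(^*.arstechnica)`, and Python's `re` may differ from Proxomitron in the details, so treat it as a hint rather than proof.

```python
import re

# Assumed analogue of http://(^*.arstechnica)* : the wildcard sits INSIDE
# the negation, so the check covers every period that follows http://.
pattern = re.compile(r"http://(?!.*\.arstechnica).*")

misspelled = pattern.match("http://static.rstechnica.com/apple/")
real = pattern.match("http://static.arstechnica.com/apple/")

print(misspelled is not None)  # does the misspelled host get through?
print(real is not None)        # does the real host get through?
```

Because the wildcard is inside the negation, the check cannot be dodged by moving on to a later period. The flip side is that this analogue also rejects any URL that has .arstechnica anywhere after http://, including in the path.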

HTH
Oct. 13, 2010, 02:35 AM
Post: #6
RE: Basic Negate Question
Ok it is slowly sinking in now. I believe I am not being strict enough with the meaning of the operator symbols.

Man is this just a retard question for you guys? I really do appreciate it.