Basic Negate Question
The Un-Official Proxomitron Forum (https://www.prxbx.com/forums), Forum: Proxomitron Program
Basic Negate Question - qz33 - Oct. 12, 2010 12:43 PM

I am trying to understand the negate function in the text matching language. I have never written a filter using it, other than to negate a single character.

Why does *(^arstechnica)* match when given http://www.arstechnica.com

Below is the only reference in the Proxomitron help files that relates directly to whole-word negation:

"For example, "(^foo|bar)" would match anything that's not "foo" or "bar". Note that a negated expression consumes no characters - it just test them. I think Perl calls this a "negative forward assertion"?"

Does anyone have enlightenment? I am trying to blank out any URLs that are NOT something.arstechnica.something. Even if there is a better way to do this, I need to understand why the expression above produces a match when I think it should not.

RE: Basic Negate Question - JJoe - Oct. 12, 2010 03:15 PM

(Oct. 12, 2010 12:43 PM)qz33 Wrote: Why does

The leading wildcard can match nothing.
(^arstechnica)* matches

HTH

RE: Basic Negate Question - whenever - Oct. 12, 2010 03:31 PM

(^foo) matches a position that is not followed by "foo".

Try the filter below, which puts the match of the first * into \1, (^arstechnica) into \2, and the last * into \3.

Code:
[Patterns]

When given http://www.arstechnica.com, it outputs:

Quote:
\1 is:

The first * starts by matching zero characters, that is, the position before the first h. At that position, the text is not followed by "arstechnica", so (^arstechnica) matches. The last * matches all the remaining characters ... So the whole expression matches.

HTH.

RE: Basic Negate Question - qz33 - Oct. 12, 2010 06:17 PM

Great so far! Thanks. So I tried this

(src|href)="http://*.(^arstechnica).*

against

href="http://static.arstechnica.com/apple/

(yes, I know it is missing a closing quotation mark) and it did not match, which is what I wanted and expected. But when I put the "." inside the negated sub-expression, like this

(src|href)="http://*.(^arstechnica.)*

it matches against

href="http://static.arstechnica.com/apple/

which again I think it should not. I do not understand why: it seems that it would match up to the FIRST ".", and everything after that IS "arstechnica.com/apple/", which should fail to match because of the (^arstechnica.) part. Can someone help with the why here?

I guess I really should be using

(src|href)"http://\0.(^arstechnica.)\1\2

RE: Basic Negate Question - JJoe - Oct. 13, 2010 12:30 AM

If I understand correctly,

Code:
(src|href)="http://*.(^arstechnica).*

(^arstechnica) doesn't actually 'consume' anything. It just checks to see if arstechnica is not there. So, the part of the expression that 'consumes' is (src|href)="http://*..* and it doesn't match much.

Code:
(src|href)="http://*.(^arstechnica.)*

The (src|href)="http://*.(^arstechnica.) part does match href="http://static.arstechnica. because that "arstechnica." is not followed by "arstechnica.". In other words, *.(^arstechnica.) didn't stop looking for a match after static. was found.

Code:
http://\0.(^arstechnica.)\1\2

This will fail only when every period is followed by "arstechnica.". Probably not going to happen. \1 will always be empty.

Why do these match
http://static.rstechnica.com/apple/
but not
http://static.arstechnica.com/apple/
or do they?

http://(^*.arstechnica)*
http://([^.]+.(^arstechnica))+{1,*}*
http://((*.)+{1}(^arstechnica))+{1,*}*

HTH

RE: Basic Negate Question - qz33 - Oct. 13, 2010 02:35 AM

OK, it is slowly sinking in now. I believe I am not being strict enough with the meaning of the operator symbols. Is this just a really basic question for you guys? I really do appreciate it.
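
For anyone more at home with Perl-style regexes, here is a rough Python sketch of the behaviour discussed in this thread. It is an analogy only: Python's `(?!...)` negative lookahead stands in for Proxomitron's `(^...)`, `.*` stands in for `*`, and `\.` stands in for a literal period (this assumes `.` is literal in Proxomitron patterns, as the usage above suggests). The function names are made up for illustration, and Proxomitron's matcher may differ from Python's `re` in details such as how wildcards expand.

```python
import re

def loose_match(url):
    # Analogue of *(^arstechnica)* : the leading wildcard may match nothing,
    # so the lookahead is tested at position 0, where the text starts with
    # "http", not "arstechnica". The trailing wildcard then takes everything.
    return re.fullmatch(r"(.*?)(?!arstechnica)(.*)", url)

def dotted_match(url):
    # Analogue of http://*.(^arstechnica.)* : when the first period fails the
    # lookahead, the wildcard keeps expanding until it reaches a period that
    # is NOT followed by "arstechnica." (here, the one before "com").
    return re.match(r"http://(.*?)\.(?!arstechnica\.)(.*)", url)

def anchored_match(url):
    # Analogue of http://(^*.arstechnica)* : the wildcard sits INSIDE the
    # negation, so the test covers the whole remainder and cannot be dodged
    # by moving to a later position.
    return re.match(r"http://(?!.*\.arstechnica)(.*)", url)

if __name__ == "__main__":
    m = loose_match("http://www.arstechnica.com")
    print(repr(m.group(1)), repr(m.group(2)))

    m = dotted_match("http://static.arstechnica.com/apple/")
    print(repr(m.group(1)), repr(m.group(2)))

    print(anchored_match("http://static.arstechnica.com/apple/"))
    print(anchored_match("http://static.rstechnica.com/apple/").group(1))
```

The point of the third function is placement: a negation that follows a wildcard only has to succeed at one of the positions the wildcard can reach, while a negation that contains the wildcard has to hold for the entire rest of the text.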