The Un-Official Proxomitron Forum
web page filter question



- qwer993 - May. 09, 2005 07:00 PM

Hi
To make better use of the space on my laptop screen (I'm using my RSS reader) and avoid a lot of scrolling, I want to keep only the body text of the page and leave out top ads, side menus, etc. If the page is, say, 200k, I keep maybe only 20k, adding just some start and end HTML tags. I've tried to make a Proxomitron web filter like this:

Matching expression:
* <sometag> \1 <sometag> *
Replacement text:
<sometag> \1 <sometag>

but it won't work, maybe because the byte limit is exceeded?

Is Proxomitron the right tool for doing this thing? If yes, how do I do it? If not, does anyone know any suitable tool?

thanks
tom


- sidki3003 - May. 09, 2005 07:42 PM

Hi, normally you can do...
Code:
[Patterns]
Name = "kill all above <orig-tag>"
Active = FALSE
Bounds = "*<orig-tag>"
Limit = 32767
Match = "*&$STOP()"
Replace = "<new-tag>"

Name = "kill all below </orig-tag>"
Active = FALSE
Limit = 16
Match = "</orig-tag >"
Replace = "</new-tag>\k"
...but only if you know that <orig-tag> actually appears on that page and is no more than 32767-10 bytes from the top; otherwise things get *really* slow. "\k" is a meta-character that kills the connection once you've got all the content you want.
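
For anyone who wants to try the same idea outside Proxomitron, here is a rough Python sketch of what the two filters do together. It is only an illustration, not how Proxomitron works internally: the function name trim_page and its parameters are made up for this example, it works on an already-downloaded string rather than a stream, and the limit is counted in characters rather than bytes.
Code:
```python
def trim_page(html, open_tag, close_tag, new_open, new_close, limit=32767):
    """Rough model of the two filters above (hypothetical helper, not a Proxomitron API)."""
    head = ""
    # First filter: look for open_tag only within the first `limit` characters,
    # then replace everything up to and including it with new_open.
    start = html.find(open_tag, 0, limit)
    if start != -1:
        head = new_open
        html = html[start + len(open_tag):]
    # Second filter: at the first close_tag, emit new_close and drop the rest,
    # which is roughly what "\k" achieves by killing the connection.
    end = html.find(close_tag)
    if end != -1:
        html = html[:end] + new_close
    return head + html


page = "<html><div id=ad>ad</div><article>text</article><footer>x</footer></html>"
print(trim_page(page, "<article>", "</article>", "<html><body>", "</body></html>"))
```
If open_tag is not found within the limit, the top of the page is left alone, but the cut at close_tag still happens, since the two filters act independently.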

Moved to "Filter Help"

sidki


- qwer993 - May. 10, 2005 09:51 AM

Works very well. Thanks. tom