external filters
Jan. 29, 2017, 03:21 PM
Post: #1
hey guys,
since privoxy provides the option to use external filters: can someone please help me understand this? i have limited resources to run privoxy. when my files are bigger than a couple of MB, privoxy just does not start. what i want to do is load the blacklisted domains from the hp hosts database and block those domains. since the list is about 18 MB, it is not possible to load it. can i use the external filter option for this? best, chris
Jan. 29, 2017, 11:41 PM
(This post was last modified: Dec. 08, 2017 10:24 PM by Faxopita.)
Post: #2
RE: external filters
It's time to review the whole of section 8.4 of the Privoxy user manual, which covers filtering with metacharacters. Offending domain names often contain recurring terms, so why not compress your list using metacharacters? Your blacklist probably contains the word “analytic” a ton of times, for example. You could remove all entries containing “analytic” and then add only…

Code:
.*analytic*.

Attached to this post is an archive to help you eliminate a lot of the redundant entries in your blacklist. Inside the archive, the files “meta_col.txt” and “meta_col.rgx” contain the same entries; the second is only an extended REGEX version of the first. You use “meta_col.rgx” to compress your blacklist. Once that is done, you simply add the content of “meta_col.txt” to your blacklist.

Before we start compressing your blacklist, I suggest you review the content of “meta_col.txt” and add or remove entries as needed. Then replicate your changes in “meta_col.rgx”.

If your blacklist is really big, I suggest you install the tool Parallel to speed things up. Also, why not install pv, a progress monitor, as well? The following command will compress your blacklist at maxed-out CPU usage:

Code:
cp your_blacklist your_blacklist.bak

If you prefer the “classic way” instead of the above variation…

Code:
grep -vf meta_col.rgx your_blacklist > your_blacklist_compressed

Then you can append the content of “meta_col.txt” to your new compressed blacklist.

Code:
cat meta_col.txt >> your_blacklist_compressed

Job done!

Note: if you have a huge blacklist (hundreds of thousands of entries), the “classic way” of compacting it can be very slow because it does not use all of your CPU cores.

---

Some other generic terms you could use in your blacklist as well:

Code:
.*casino*.

Of course, you would first have removed the numerous entries containing those words.

---

Minuscule donations are always appreciated…

Code:
BTC --> 34WKogWorDoReJ2MSxw8rTsrGD87VMAPJY
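As a rough illustration of the “classic way” described above, the three steps (backup, filter, append) could be wrapped in one shell function. This is only a sketch and not the command from the post: the function name `compress_blacklist` is made up, and the `-E` flag is an assumption chosen because “meta_col.rgx” is described as extended REGEX.

```shell
#!/bin/sh
# Sketch only: compress_blacklist is a hypothetical helper name.
# Usage: compress_blacklist BLACKLIST PATTERNS_RGX PATTERNS_TXT
compress_blacklist() {
    list=$1
    rgx=$2
    txt=$3

    # Keep a backup of the original list.
    cp "$list" "$list.bak" || return 1

    # Drop every entry matched by one of the regexes (one per line).
    # -E is an assumption; grep exits with 1 when no line survives,
    # which is not an error for this purpose.
    grep -E -v -f "$rgx" "$list" > "${list}_compressed" || [ $? -eq 1 ] || return 1

    # Append the generic metacharacter entries that replace what was dropped.
    cat "$txt" >> "${list}_compressed"
}
```

Calling `compress_blacklist your_blacklist meta_col.rgx meta_col.txt` would leave the result in `your_blacklist_compressed` and the untouched original in `your_blacklist.bak`.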
The following 1 user says Thank You to Faxopita for this post: kik0s
Jan. 30, 2017, 05:13 AM
Post: #3
RE: external filters
(Jan. 29, 2017 03:21 PM)kik0s Wrote: hey guys,

external-filter is not really what you want in your case. external-filter is a way for Privoxy to hand content to an external application to parse, edit, or save it, i.e. to do many things that Privoxy cannot do itself. A hosts file of 18 MB is not really efficient. I think you should use EasyList instead; just convert it into Privoxy's format using adblock2privoxy: https://projects.zubr.me/wiki/adblock2privoxy
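To make the distinction concrete: an external filter is a program that receives page content on standard input and prints the rewritten content on standard output. The sketch below assumes only that stdin/stdout contract. The function name `drop_tracker_lines` and the host `tracker.example` are invented for illustration, and wiring it into Privoxy (filter file entry plus the external-filter action) should be taken from the Privoxy user manual rather than from this example.

```shell
#!/bin/sh
# Sketch only: a content rewriter in the shape an external filter would take.
# It edits page content; it does not block domains and does not reduce the
# memory Privoxy needs for a large blocklist.
drop_tracker_lines() {
    # Delete every line that mentions the made-up host tracker.example.
    sed '/tracker\.example/d'
}

# If saved as a standalone script, the last line would call the function:
#   drop_tracker_lines
```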
The following 1 user says Thank You to cattleyavns for this post: kik0s
Jan. 30, 2017, 01:46 PM
(This post was last modified: Jan. 30, 2017 06:55 PM by kik0s.)
Post: #4
RE: external filters
(Jan. 30, 2017 05:13 AM)cattleyavns Wrote:(Jan. 29, 2017 03:21 PM)kik0s Wrote: hey guys,

actually that's what i was using, but the problem is that the tool creates a lot more files, and adding all of the action and filter files kills my privoxy. limited ram to load them, so no way.

@Faxopita thanks for that. will try it. maybe that's the key.

edit: is it possible to block like buttons with a hosts file?
Jan. 31, 2017, 03:37 AM
Post: #5
RE: external filters
The following 1 user says Thank You to cattleyavns for this post: kik0s
Jan. 31, 2017, 08:31 AM
(This post was last modified: Feb. 13, 2017 02:23 PM by Faxopita.)
Post: #6
RE: external filters
(Jan. 31, 2017 03:37 AM)cattleyavns Wrote:(Jan. 30, 2017 01:46 PM)kik0s Wrote: edit:

Years ago, when I wanted to use Privoxy alongside the hosts file, the latter was ignored. Apparently it's either the hosts file or Privoxy, not both at the same time. I confirm that a hosts file only blocks domains.

However, you can use Privoxy together with a local DNS resolver such as Unbound, which can be used to block domains too. That's your definitive solution if my first one above gives you limited results. Note that, like the hosts file, Unbound does not accept regular expressions. Under this scenario, you could use Unbound to block domains (just like hosts) and Privoxy to block requests based on the path part of the URL.

To block a path (Privoxy):

Code:
{ +block{plug-ins} }

To block a domain (Unbound):

Code:
local-zone: "touchbymediametrie.com" redirect

Now you should have the best of both worlds and enjoy a renewed experience with your computer.

---

Converting a hosts File into Unbound local-data

To convert StevenBlack's hosts file, for example, into Unbound local-data, you could issue the following command:

Code:
wget -O - https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts | grep '^0\.0\.0\.0' | \

Taken from here. I tested it and it works. Of course, you repeat this with other source lists and amend the script accordingly. An alternative way to do the conversion is provided by the script `unbound-block-hosts`.

Beware! Unbound hates duplicates; be sure to remove any. If you need to whitelist an entry, simply remove it from `unbound_blacklist.conf` and restart Unbound.

---

Another great way to block ad/tracking domains is to use Unbound's void zones; explained here. The main advantage is that there is no need to list every subdomain of the ad/tracking domain.
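A hosts-to-Unbound conversion along these lines can be sketched with awk. This is not necessarily the command referenced in the post: the function name `hosts_to_unbound` is hypothetical, and pairing each `local-zone` line with a `local-data` record that answers 0.0.0.0 is an assumption about the desired output. Duplicates are dropped because, as noted above, Unbound rejects them.

```shell
#!/bin/sh
# Sketch only: hosts_to_unbound is a hypothetical helper name.
# Reads a hosts file on stdin, writes Unbound configuration lines to stdout.
hosts_to_unbound() {
    awk '
        # Keep "0.0.0.0 <name>" entries, skip the self-referencing
        # "0.0.0.0 0.0.0.0" line, and emit each name only once.
        $1 == "0.0.0.0" && $2 != "0.0.0.0" && !seen[$2]++ {
            printf "local-zone: \"%s\" redirect\n", $2
            printf "local-data: \"%s A 0.0.0.0\"\n", $2
        }
    '
}
```

It could be fed from the download shown in the post, for example `wget -O - https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts | hosts_to_unbound > unbound_blacklist.conf`.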
The following 1 user says Thank You to Faxopita for this post: kik0s
May. 07, 2017, 04:08 AM
Post: #7
RE: external filters
Good help and suggestions, as always, found on this forum.
Instead of the hp hosts file, which is bloated, or any other large hosts file, I use a smaller list that is basically a DNS block list. Found here: https://github.com/AdguardTeam/AdguardDN...filter.txt It is easy to convert into a Privoxy action file (edit it in Notepad or any other editor). Cheers!
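The conversion mentioned above could also be scripted instead of done by hand. The sketch below assumes the list uses Adblock-style `||domain^` lines, which is an assumption about the file format, and it ignores every other kind of rule. The function name `adguard_to_privoxy` and the block reason text are made up.

```shell
#!/bin/sh
# Sketch only: adguard_to_privoxy is a hypothetical helper name.
# Reads an Adblock-style list on stdin, writes a Privoxy action file section.
adguard_to_privoxy() {
    # One block action for all patterns that follow.
    printf '{ +block{DNS blocklist entry} }\n'

    # Turn "||example.com^" into ".example.com" (domain plus subdomains);
    # comments, exceptions (@@) and rules with options are skipped.
    sed -n 's/^||\([A-Za-z0-9.-]\{1,\}\)\^$/.\1/p' | LC_ALL=C sort -u
}
```

Run as `adguard_to_privoxy < filter.txt > dnsblock.action`; both file names are placeholders.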
The following 1 user says Thank You to oldsod for this post: Faxopita
May. 07, 2017, 09:49 AM
Post: #8
RE: external filters
dns blocking nowadays is a bad idea. converting the list into an action file is a good choice, that's true. with dns there are a lot of problems because the ad services are switching to ssl, and then you will have trouble with loading times. privoxy, though, works fine and has no problems blocking those hosts, even when you are not using proxhttps.
May. 07, 2017, 04:08 PM
Post: #9
RE: external filters
Yes. Privoxy is fast and very smooth.
I am not using proxhttps, so SSL or TLS is not an issue for me in terms of speed or capability. Then again, proxhttps could be a factor when Privoxy uses a large domain block list. Would ".example.com" (as seen in a large domain list) have any influence on Privoxy's speed with SSL/TLS filtering through proxhttps? I do not know.
May. 07, 2017, 04:19 PM
Post: #10
RE: external filters
As to your original post: probably not.
The file(s) would still be too large. More resources would probably resolve the issue.
May. 07, 2017, 04:28 PM
Post: #11
RE: external filters
if you use an action file containing a host blacklist, it should be loaded first, since privoxy works through the lists in sequence, one by one. proxhttps only helps when the host isn't blacklisted and the url contains something ad-related after the /. on its own privoxy only sees the host, but proxhttps decrypts the request, so you will also be able to block some additional stuff.
Jul. 16, 2018, 10:40 PM
Post: #12
RE: external filters
(Jan. 29, 2017 11:41 PM)Faxopita Wrote:

I find your code very useful, with one exception: my rules are prepared by a script that runs them through adblock2privoxy, so they contain a bunch of element-hiding "##" and whitelist "@@||" rules, which get removed by this regex. One has to temporarily move all those rules to another file and then append them to the one prepared by your script. My scripts for preparation and adblock2privoxy conversion are still a work in progress, but I would definitely welcome the addition of regex deduplication.
Sep. 12, 2018, 10:41 AM
Post: #13
RE: external filters
I investigated for some time why the msn.com page layout broke for me, and it turns out it was due to one of the filters supplied by meta_col.txt. Among them was .local, which made Privoxy catch hyperlinks containing locale=, which was not what was intended. I suggest that those rules after the dot should end with /, as they are meant for hosts anyway. For me, changing it and other similar rules to .local/ fixed the issue.
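The change suggested above could be applied to a whole file in one pass. The sketch below only performs the text edit described in this post: it appends a trailing / to entries that start with a dot and look like plain host suffixes, and it leaves entries with metacharacters or an existing path alone. The function name `anchor_host_rules` is made up, and which entries count as "similar rules" is an assumption.

```shell
#!/bin/sh
# Sketch only: anchor_host_rules is a hypothetical helper name.
# Reads pattern lines on stdin, writes the adjusted lines to stdout.
anchor_host_rules() {
    # ".local" becomes ".local/"; lines containing "*" or "/" do not match
    # the expression and pass through unchanged.
    sed 's|^\(\.[A-Za-z0-9.-]*[A-Za-z0-9]\)$|\1/|'
}
```

For example, `anchor_host_rules < meta_col.txt > meta_col_anchored.txt`, where the output file name is a placeholder.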