Another Filtering Proxy
May. 27, 2015, 02:26 PM
Post: #16
RE: Another Filtering Proxy
I spent several hours but couldn't figure out why. It seems to be related to urllib3's headers handling.
Anyway, attached is the unreleased v0.5 with an ugly patch. Let me know if it works.
May. 28, 2015, 04:26 AM
(This post was last modified: May. 28, 2015 07:42 AM by cattleyavns.)
Post: #17
RE: Another Filtering Proxy
Great job! I think we should rewrite self.headers.update too; I'm trying to do that now. urllib3's header handling is not good, at least at this time, so I think we should move away from their header feature entirely and use the built-in library as much as possible.

Can you tell me how to get this line into URLFilter.py and modify it as I want?

Code: headers = urllib3._collections.HTTPHeaderDict()

I'm adding a proxy feature to AFProxy using proxy_from_url, but I want to patch the problem above by setting headers = self.headers (req.headers in URLFilter.py).

I'm learning Python, but I have a really tough question about "threading"; threading with Python is not easy at all. I would like to ask you some questions and hope you will help me:

- In threading, how can we download a big file in parts but join them one by one, instead of waiting for them all to finish and then joining?

Code (save as .py and then run it):

Code: import os, requests

What I want is:
- For example, we have a big file with a 100 MB file size.
- We split that file using Content-Length.
- We use the "threading" module to download that file in parts, to get the fastest possible download speed, instead of downloading the parts one by one without threading and then joining them.
- The problem is with threading's "join()": we cannot stream the file or write it to disk immediately, the way Free Download Manager/FlashGet do, because join() waits for all threads to finish.
- But without join(), the script simply does not work; the file size comes out as 0 bytes because the file is written before the download tasks finish.
- So I want threading to work like this (see the sketch after this list):
  + Download a file with 4 threads.
  + When thread 1 finishes, stream thread 1's data, then wait until thread 2 finishes and join thread 2's data onto thread 1's. Even if threads 3 and 4 finish earlier than thread 2, they must not be joined onto thread 1, because that would corrupt the file; we must wait until thread 2 finishes, join 1 with 2, and only then join 3 and 4.
May. 28, 2015, 09:53 AM
Post: #18
RE: Another Filtering Proxy
(May. 28, 2015 04:26 AM)cattleyavns Wrote: I'm adding a proxy feature to AFProxy using proxy_from_url, but I want to patch the problem above by setting headers = self.headers (req.headers in URLFilter.py).

I am not sure I get your point, but self.headers is available as req.headers in URLFilter.py, and you can operate on it as freely as you want.

(May. 28, 2015 04:26 AM)cattleyavns Wrote: The problem is with threading's "join()": we cannot stream the file or write it to disk immediately, the way Free Download Manager/FlashGet do, because join() waits for all threads to finish.

I think you can create the file in advance, and in each thread write the data to the specified offset via f.seek(offset, from_what). You also need to take care of Semaphore acquire() and release(). They are all documented in the manual.
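Along the lines of the suggestion above (create the file in advance, then let each thread seek to its own offset), here is a minimal hedged sketch; the URL, file name and part count are placeholders, and it assumes the server honours Range requests.

Code:
import threading
import requests

URL = "http://example.com/bigfile.bin"   # placeholder URL
PARTS = 4

total = int(requests.head(URL).headers["Content-Length"])
step = total // PARTS

# Create the file in advance at its final size.
with open("bigfile.bin", "wb") as f:
    f.truncate(total)

lock = threading.Lock()   # serialize seek()+write() on the shared file handle

def fetch(f, start, end):
    data = requests.get(URL, headers={"Range": "bytes=%d-%d" % (start, end)}).content
    with lock:
        f.seek(start)     # jump to this part's offset in the pre-allocated file
        f.write(data)

with open("bigfile.bin", "r+b") as f:
    threads = []
    for i in range(PARTS):
        start = i * step
        end = total - 1 if i == PARTS - 1 else (i + 1) * step - 1
        t = threading.Thread(target=fetch, args=(f, start, end))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()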
May. 29, 2015, 04:07 AM
(This post was last modified: May. 29, 2015 04:13 AM by cattleyavns.)
Post: #19
RE: Another Filtering Proxy
I also want to report a funny bug in AFProxy:

thread.dameon = True

According to the official documentation it must be:

thread.daemon = True

But if I change it to thread.daemon = True, AFProxy simply stops working, so I decided to remove the line by commenting it out. Do we really need this line "thread.daemon = True"?

(May. 28, 2015 09:53 AM)whenever Wrote: I am not sure I get your point, but self.headers is available as req.headers in URLFilter.py, and you can operate on it as freely as you want.

Thank you, but the headers variable I want to get at and change is the "headers" in headers=headers from version 0.4; I want to change it from, for example, URLFilter.py. I can already change self.headers via req.headers.

(May. 28, 2015 09:53 AM)whenever Wrote: I think you can create the file in advance, and in each thread write the data to the specified offset via f.seek(offset, from_what). You also need to take care of Semaphore acquire() and release(). They are all documented in the manual.

Thank you. Here is what I have so far; I hope it contributes a little if you want to add new features to AFProxy, along with a way to do bandwidth throttling (a speed limit).

Split a file into parts, then download and join them in parallel:

Code: import threading

Limit download speed:

Code: """Rate limiters with shared token bucket."""
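The attachments themselves are not reproduced here; the following is only a minimal hedged sketch of the shared token-bucket idea behind the speed limit, not cattleyavns's attached code. The rate value is illustrative.

Code:
import threading
import time

class TokenBucket:
    """Allow roughly `rate` bytes per second, shared between download threads."""

    def __init__(self, rate):
        self.rate = float(rate)     # tokens (bytes) refilled per second
        self.tokens = float(rate)   # tokens currently available
        self.updated = time.time()
        self.lock = threading.Lock()

    def consume(self, n):
        """Block until n tokens are available, then take them."""
        while True:
            with self.lock:
                now = time.time()
                # Refill based on elapsed time, capped at one second's worth.
                self.tokens = min(self.rate, self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= n:
                    self.tokens -= n
                    return
            time.sleep(n / self.rate)   # wait roughly long enough for a refill

# Usage: every thread calls bucket.consume(len(chunk)) before writing a chunk.
bucket = TokenBucket(100 * 1024)   # cap combined download speed at ~100 KiB/s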
Jun. 03, 2015, 07:56 AM
Post: #20
RE: Another Filtering Proxy
(May. 29, 2015 04:07 AM)cattleyavns Wrote: I also want to report a funny bug of AFProxy: Well, you found an old bug, well done! We don't need it. We can safely remove that line. (May. 29, 2015 04:07 AM)cattleyavns Wrote: Thank you, but the headers variable I want to get and change is the "headers" in headers=headers from version 0.4, I want to change it with for example "URLFilter.py", I already could change self.headers with req.headers. In version 0.4, what you change via req.headers in URLFilter.py will be copied to headers via below lines in AFProxy.py. I don't think you need to do extra work. Code: headers = urllib3._collections.HTTPHeaderDict() (May. 29, 2015 04:07 AM)cattleyavns Wrote: Thank you, here is what I get so far, hope this contribute a little bit if you want to add new feature to AFProxy, also the way to do bandwidth throttling (speed limit): Thanks, but I want AFProxy to focus on filtering. You are free to make a new proxy based on AFProxy with whatever new features you like. |
Jun. 05, 2015, 12:26 PM
(This post was last modified: Jun. 05, 2015 12:42 PM by cattleyavns.)
Post: #21
RE: Another Filtering Proxy
(Jun. 03, 2015 07:56 AM)whenever Wrote: Well, you found an old bug, well done!

Well, I think we do need daemon: without it we cannot press Ctrl + C to exit AFProxy, and I feel a little uncomfortable using the "X" button to exit instead. Without daemon we also miss the "on exit" event provided by the "atexit" module (import atexit). Do you have any idea how to make thread.daemon = True or thread.setDaemon(True) work? I tried to restore this feature, but all I got was AFProxy no longer working.

(Jun. 03, 2015 07:56 AM)whenever Wrote: In version 0.4, whatever you change via req.headers in URLFilter.py is copied into headers by the lines below in AFProxy.py. I don't think you need to do any extra work.

I found another bug: we should move

Code: ########## Apply HeaderFilterOut ##########

right above "headers = urllib3._collections.HTTPHeaderDict()" in your quote above; otherwise we cannot change/add/remove headers.

(Jun. 03, 2015 07:56 AM)whenever Wrote: Thanks, but I want AFProxy to focus on filtering. You are free to make a new proxy based on AFProxy with whatever new features you like.

Great, thank you for that offer. Here is my patch to make AFProxy work, at least partially, with a SOCKS proxy using urllib(2). My implementation looks horrible, but it is better than nothing, right? It is based on version 0.4 because that version is stable.

It needs another module:

Code: pip install pySocks

My approach has its own problems, for example:

Code: http://prxbx.com

You might install Bitvise SSH Client or AdvOR and set the listen port to 10080, or change the line with "10080" to point to your own SOCKS proxy.

Changelog:

Code: - Added socks support

Test: http://ghacks.net/ip/
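The attached patch itself is not reproduced here; as a rough, hedged sketch of the underlying technique, PySocks can redirect every new socket through a SOCKS5 proxy like this (127.0.0.1:10080 is just the example port mentioned above):

Code:
import socket
import socks              # provided by the pySocks package

# Route all new sockets through the local SOCKS5 proxy (e.g. Bitvise or AdvOR).
socks.set_default_proxy(socks.SOCKS5, "127.0.0.1", 10080)
socket.socket = socks.socksocket

import urllib.request
# Anything built on the socket module now goes through the proxy.
print(urllib.request.urlopen("http://ghacks.net/ip/").read(200))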
Jun. 17, 2015, 09:14 AM
(This post was last modified: Jun. 17, 2015 09:23 AM by cattleyavns.)
Post: #22
RE: Another Filtering Proxy
Okay, to continue: I fixed a SERIOUS problem with the http.server library.

Technical details:
+ Use Firefox.
+ Open the Network tool (Tools -> Developer Tools -> Network).
+ Open http://www.facebook.com.
+ Find this URL ('ai.php'; filter for it with the Network tool's search box). Its response status icon is filled with pink instead of green, and pink means error:

Quote: https://www.facebook.com/ai.php?ego=++++++++++++

This happens because the http.server library parses 'raw_requestline' the wrong way, so our GET|POST|CONNECT|HEAD command ends up looking like this:

Code: __user+++++GET

And there is probably no do___user+++++GET handler, only do_GET.

Here is my patch: I modified http.server's parse_request function and embedded it into ProxyTool.py. Just replace your ProxyTool.py with:

Code: #!/usr/bin/env python3
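The attached ProxyTool.py is not reproduced above. As a hedged sketch of the idea (not the actual patch), this shows the difference between a strict split of the request line and a more tolerant one:

Code:
# Illustrative request line whose path contains raw spaces (the kind of
# malformed request a strict parser rejects or mis-handles).
requestline = "GET /ai.php?ego=some value with spaces HTTP/1.1"

# Plain split: the path is shattered into several words, so a strict
# "command, path, version = words" unpacking no longer works.
print(requestline.split())

# Tolerant split: first token is the command, last token is the version,
# everything in between (spaces included) is kept as the path.
command, rest = requestline.split(" ", 1)
path, version = rest.rsplit(" ", 1)
print(command, repr(path), version)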
The following 1 user says Thank You to cattleyavns for this post: defconnect |
Jul. 01, 2015, 08:55 AM
(This post was last modified: Jul. 01, 2015 09:22 AM by cattleyavns.)
Post: #23
RE: Another Filtering Proxy
Do you think we can make AFProxy filter HTTPS websites without the help of pyOpenSSL? pyOpenSSL is not a small Python library: it requires "cffi" and "cryptography", and both of those need to be compiled with GCC (on Linux), which reduces the portability of AFProxy. I had a hard time installing pyOpenSSL and making it work on Lubuntu, and only managed after a bunch of apt-get installs... So I want to replace pyOpenSSL with Python's native SSL support to do MITM.
So my goal is to rewrite CertTool.py and replace all the pyOpenSSL code with some pure-Python crypto library (one without C extensions that have to be compiled).
Jul. 19, 2015, 07:32 AM
Post: #24
RE: Another Filtering Proxy
(Jun. 05, 2015 12:26 PM)cattleyavns Wrote: Do you have any idea how to make thread.daemon = True work?

We need to keep the main thread from quitting, so that it can catch the KeyboardInterrupt exception. (A minimal sketch of the idea is shown after this post.)

Code: ...

(Jun. 05, 2015 12:26 PM)cattleyavns Wrote: I found another bug: we should move ...

It should already be fixed in the latest testing version here.

(Jun. 17, 2015 09:14 AM)cattleyavns Wrote: This happens because the http.server library parses 'raw_requestline' the wrong way, so our GET|POST|CONNECT|HEAD command ends up looking like this:

It's more of a browser problem, because it seems the browser didn't compose the request line correctly. It's not the duty of http.server to validate the request commands.

(Jul. 01, 2015 08:55 AM)cattleyavns Wrote: So my goal is to rewrite CertTool.py and replace all the pyOpenSSL code with some pure-Python crypto library (one without C extensions that have to be compiled).

How are your findings? I don't think we have many choices unless a similar module becomes part of the standard Python installation.
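The code attached above is not shown; as a minimal hedged sketch of the daemon idea (not the exact AFProxy change), keeping the main thread alive while the server runs in a daemon thread lets Ctrl+C work:

Code:
import threading
import time

def serve_forever():
    while True:            # stand-in for the proxy's serving loop
        time.sleep(1)

t = threading.Thread(target=serve_forever)
t.daemon = True            # daemon threads are killed when the main thread exits
t.start()

try:
    while True:
        time.sleep(60)     # keep the main thread alive so it receives Ctrl+C
except KeyboardInterrupt:
    print("shutting down")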
Jul. 19, 2015, 05:53 PM
(This post was last modified: Jul. 19, 2015 05:55 PM by cattleyavns.)
Post: #25
RE: Another Filtering Proxy
(Jul. 19, 2015 07:32 AM)whenever Wrote: How are your findings? I don't think we have many choices unless a similar module becomes part of the standard Python installation.

Well, I'm temporarily giving up on that for now, because I tried a lot and didn't get anywhere. I'm trying other things instead, for example making AFProxy work as a SOCKS proxy. mitmproxy did that (it can modify HTTPS traffic), so I think I will try to do the same. Here is a draft version that works well with Python 3; it still cannot decrypt HTTPS content, but that is only one more step away. SOCKS is much better than an HTTP or HTTPS proxy: it works with almost all protocols, like email and chat, and it can encrypt data between client and server, so better privacy (while still being able to block ads and modify webpages; mitmproxy does it).

mitmproxy: http://mitmproxy.org/doc/features/socksproxy.html

Make sure you import the certificate from mitm.it so mitmproxy can filter HTTPS traffic.

Command line: mitmdump --socks -p 1080

Then set your browser's SOCKS5 proxy to 127.0.0.1:1080.

Do you have any advice for me? I think moving to SOCKS will be great!
Jul. 20, 2015, 03:31 AM
(This post was last modified: Jul. 20, 2015 03:34 AM by whenever.)
Post: #26
RE: Another Filtering Proxy
mitmproxy depends on pyOpenSSL too. Check https://github.com/mitmproxy/netlib/blob...rtutils.py
In fact, pyOpenSSL is used only for making certificates. I used to have a version of CertTool.py that uses the native openssl command-line tool only; you can modify it to work with other certificate-manipulation tools if you like. On the other hand, I'm not sure content filtering is available at the SOCKS level. I had thought the SOCKS mode of mitmproxy was just a frontend for the HTTP proxy backend.
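Not the old CertTool.py itself, just a hedged sketch of the approach mentioned above: a certificate can be generated by shelling out to the native openssl tool, for example (file names and subject are placeholders):

Code:
import subprocess

# Generate a self-signed certificate and key with the openssl CLI.
subprocess.check_call([
    "openssl", "req", "-x509", "-newkey", "rsa:2048",
    "-keyout", "key.pem", "-out", "cert.pem",
    "-days", "365", "-nodes",
    "-subj", "/CN=example.org",
])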
Jul. 20, 2015, 06:17 AM
Post: #27
RE: Another Filtering Proxy
Great! Thanks for sharing.
At first I also thought that mitmproxy's SOCKS mode could not filter webpages, like HandyCache's SOCKS mode, but I was wrong: it probably can filter webpages, including HTTPS ones. It does match requests by IP address instead of by domain (http://8.8.8.8 instead of http://www.google.com, for example), but we could use the Host request header to work around that limitation.
Jan. 05, 2016, 04:53 PM
Post: #28
RE: Another Filtering Proxy
Hi whenever, any updates on Another Filtering Proxy?
Jan. 06, 2016, 08:44 AM
Post: #29
RE: Another Filtering Proxy
I'm sorry, I haven't been working on it any more. I've been quite busy for the past half year, and it looks like that will continue.
Jan. 06, 2016, 07:40 PM
Post: #30
RE: Another Filtering Proxy