python - Understanding the proxies parameter in requests module
I'm using the requests module in my script, and I want to understand the proxies parameter of the get() method. This answer posted the following code to illustrate the usage of the proxies parameter:
    import requests

    http_proxy = "10.10.1.10:3128"
    https_proxy = "10.10.1.11:1080"
    ftp_proxy = "10.10.1.10:3128"

    proxydict = {"http": http_proxy, "https": https_proxy, "ftp": ftp_proxy}

    # url and headers are assumed to be defined elsewhere in the script
    r = requests.get(url, headers=headers, proxies=proxydict)
Here are my questions:
1. Why are we passing more than one proxy to get()? How does get() use them? Does it try them one by one?
2. Given a proxy, say a.b.c.d:port, how do I know its protocol type? When I buy premium proxies from hidemyass.com, it sends the proxies in ip:port format and doesn't send the protocol type. What should I pass to the requests.get() method?
I have these doubts because I don't know much about proxies in general or how they work. It would be great if someone could explain that as well.
.get() uses the proxy whose key in the dictionary matches the scheme of the URL. That is, if you access 'http://www.google.com/', the proxy with key 'http' (in your example, http_proxy) will be used. If you access 'https://www.google.com/', the proxy with key 'https' (in your example, https_proxy) will be used.

The short answer is that your paid proxy should accept both HTTP and HTTPS URLs.
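The lookup described above can be sketched in a few lines. This is only an illustration of the behaviour, not Requests' actual internals; `pick_proxy` is a hypothetical helper name, and the addresses are the ones from the question.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Return the proxy whose key matches the URL's scheme, or None.

    Hypothetical helper: it mimics the scheme-based lookup described
    above and is not part of the requests library.
    """
    scheme = urlparse(url).scheme.lower()
    return proxies.get(scheme)


proxy_dict = {
    "http": "10.10.1.10:3128",
    "https": "10.10.1.11:1080",
    "ftp": "10.10.1.10:3128",
}

# Only one entry is consulted per request: the one matching the URL scheme.
print(pick_proxy("http://www.google.com/", proxy_dict))   # 10.10.1.10:3128
print(pick_proxy("https://www.google.com/", proxy_dict))  # 10.10.1.11:1080
```

So the proxies are not tried one after another; each request consults at most one entry of the dictionary.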
In practice, this is made more complicated because Requests does two unexpected things. Firstly, if you use proxy addresses in the form you've provided in your question (i.e. ip:port), Requests will assume that the protocol used to access the proxy is the same as the protocol you're proxying. That is, http_proxy will be internally converted to "http://10.10.1.10:3128", and https_proxy to "https://10.10.1.11:1080". This is probably not what you want, so you should be explicit and use the form scheme://ip:port.

The second thing is that Requests has real problems with HTTPS through proxies. In general you should assume they don't work, though it's a bit more complex than that.
Both of these problems will be addressed in the planned v2.0 release.
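To make the first point concrete, here is a minimal sketch of being explicit about the proxy's scheme. It assumes your provider's ip:port proxy is reached over plain HTTP, which is an assumption you should check with the provider; `with_scheme` and `build_proxies` are hypothetical helper names, not part of Requests.

```python
def with_scheme(address, scheme="http"):
    """Prefix a bare ip:port address with a scheme.

    Addresses that already carry a scheme are returned unchanged.
    """
    if "://" in address:
        return address
    return f"{scheme}://{address}"


def build_proxies(address, proxy_scheme="http"):
    """Build a proxies dict that sends http and https URLs to one proxy.

    Assumption: the proxy itself is contacted using proxy_scheme
    (plain HTTP by default), whatever the scheme of the target URL.
    """
    proxy_url = with_scheme(address, proxy_scheme)
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("10.10.1.10:3128")
print(proxies)
```

The resulting dictionary is what you would pass as the proxies argument of requests.get(), so that nothing is left for the library to guess.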
I've written a blog post about proxies in Requests, if you'd like to know more.
As to how proxies work, their purpose is to accept HTTP requests and forward them on to their destination. They're usually used for one of two reasons: either to mutate HTTP requests (and potentially drop them entirely), or to cache HTTP requests/responses. Wikipedia has an excellent article to get you started.
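The three behaviours mentioned above (forwarding, mutating or dropping, and caching) can be modelled with a toy in-memory class. This is only a sketch with no real networking: `ToyProxy`, `fake_origin` and the host names are made up for illustration, and a real proxy would also have to honour cache-control rules.

```python
from urllib.parse import urlparse


class ToyProxy:
    """In-memory stand-in for a forwarding proxy (no real networking)."""

    def __init__(self, upstream, blocked_hosts=()):
        self.upstream = upstream              # callable: (url, headers) -> body
        self.blocked_hosts = set(blocked_hosts)
        self.cache = {}                       # url -> cached response body

    def handle(self, url, headers=None):
        headers = dict(headers or {})
        if urlparse(url).hostname in self.blocked_hosts:
            return None                       # drop the request entirely
        if url in self.cache:
            return self.cache[url]            # serve from cache, skip upstream
        headers["Via"] = "1.1 toy-proxy"      # mutate the request before forwarding
        body = self.upstream(url, headers)    # forward to the destination
        self.cache[url] = body
        return body


calls = []


def fake_origin(url, headers):
    """Pretend destination server that records what it receives."""
    calls.append((url, headers))
    return f"response for {url}"


proxy = ToyProxy(fake_origin, blocked_hosts={"blocked.example"})
first = proxy.handle("http://example.com/")       # forwarded upstream
second = proxy.handle("http://example.com/")      # answered from the cache
dropped = proxy.handle("http://blocked.example/") # never forwarded
```

After these three calls the fake origin has been contacted only once: the second request was answered from the cache and the third was dropped.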