python - Understanding the proxies parameter in requests module


I'm using the requests module in a script, and I want to understand the proxies parameter of the get() method. This answer posted the following code to illustrate the usage of the proxies parameter:

    http_proxy  = "10.10.1.10:3128"
    https_proxy = "10.10.1.11:1080"
    ftp_proxy   = "10.10.1.10:3128"

    proxydict = {"http": http_proxy, "https": https_proxy, "ftp": ftp_proxy}

    r = requests.get(url, headers=headers, proxies=proxydict)

Here are my questions:

  1. Why pass more than one proxy to get()? How does get() use them? Does it try them one by one?

  2. Given a proxy, say a.b.c.d:port, how do I know its protocol type? When I buy premium proxies from hidemyass.com, it sends the proxies in ip:port format and doesn't send the protocol type. What should I pass to the requests.get() method?

I have these doubts because I don't know much about proxies in general or how they work. It would be great if someone could explain this well.

  1. .get() uses the proxy whose key in the dictionary matches the scheme of the URL. That is, if you access 'http://www.google.com/', the proxy under the key 'http' (in your example, http_proxy) is used. If you access 'https://www.google.com/', the proxy under the key 'https' (in your example, https_proxy) is used.

  2. The short answer is that a paid proxy should accept both HTTP and HTTPS URLs.

    In practice, this is made more complicated by Requests, which does two unexpected things. Firstly, if you use proxy addresses in the form you've provided in the question (i.e. ip:port), Requests will assume that the protocol used to access the proxy is the same as the protocol you're proxying. That is, http_proxy will be internally converted to "http://10.10.1.10:3128", and https_proxy to "https://10.10.1.11:1080". This is probably not what you want, so you should be explicit and use the form scheme://ip:port.

    The second thing is that Requests has real problems with HTTPS through proxies. In general you should assume they don't work, though it's a bit more complex than that.

    Both of these problems are being addressed in the planned v2.0 release.
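The two points above (the proxy is chosen by the URL's scheme, and the proxy address should carry its own explicit scheme) can be sketched roughly as follows. This is a simplified model for illustration, not the library's actual selection code. The helper names `with_scheme` and `pick_proxy` are hypothetical, and the default of `http://` assumes the proxy itself is reached over plain HTTP, which you should confirm with your provider.

```python
from urllib.parse import urlsplit


def with_scheme(proxy, default_scheme="http"):
    """Return the proxy as scheme://host:port, adding default_scheme if none is given.

    Hypothetical helper: the "http" default is an assumption about the proxy.
    """
    return proxy if "://" in proxy else f"{default_scheme}://{proxy}"


def pick_proxy(url, proxies):
    """Return the proxy whose key matches the URL's scheme, or None if there is none.

    Hypothetical helper that mimics the scheme-based lookup described above.
    """
    return proxies.get(urlsplit(url).scheme.lower())


# Addresses taken from the question, now written with an explicit scheme.
proxies = {
    "http": with_scheme("10.10.1.10:3128"),
    "https": with_scheme("10.10.1.11:1080"),
}

print(pick_proxy("http://www.google.com/", proxies))
print(pick_proxy("https://www.google.com/", proxies))
```

A dictionary built this way can be passed as `requests.get(url, proxies=proxies)`, exactly like `proxydict` in the question.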

I've written a blog post about proxies in Requests, if you'd like to know more.

As for how proxies work, their purpose is to accept HTTP requests and forward them on to their destination. They are used for one of two reasons: either to mutate HTTP requests (and potentially drop them entirely), or to cache HTTP requests/responses. Wikipedia has an excellent article to get you started.

