1. The differences between Python's httplib, urllib and urllib2, and how to use them
2. A question about copying the result of urllib2.urlopen in Python
The differences between Python's httplib, urllib and urllib2, and how to use them
Overview
First, let's look at the differences between them.
urllib and urllib2
urllib and urllib2 are both modules for making URL requests, but urllib2 can accept an instance of the Request class to set the headers of the request, while urllib only accepts a URL.
This means that with urllib you cannot, for example, fake your User-Agent string.
urllib provides the urlencode method for building GET query strings, while urllib2 does not. This is why urllib is often used together with urllib2.
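To make the difference concrete, here is a minimal sketch (the empty url is just a placeholder, as in the examples below): urllib.urlopen only takes a URL, while urllib2.urlopen can take a Request that carries custom headers.
#coding:utf-8
import urllib
import urllib2
url = ''   # placeholder URL
# urllib: only the URL, no way to set request headers
page1 = urllib.urlopen(url).read()
# urllib2: a Request instance can carry headers such as User-Agent
req = urllib2.Request(url, headers={ 'User-Agent' : 'Mozilla/4.0' })
page2 = urllib2.urlopen(req).read()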
Currently, most HTTP requests are made through urllib2.
GET method
For example, with Baidu, we need to urlencode the dictionary { 'wd':'xxx' } and append the result to the URL as a query string:
#coding:utf-8
import urllib
import urllib2
url = ''
values = { 'wd':'D_in'}
data = urllib.urlencode(values)
print data
url2 = url+'?'+data
response = urllib2.urlopen(url2)
the_page = response.read()
print the_page
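As a side note (not from the original post), urlencode also escapes unsafe characters, which is why it is preferred over building the query string by hand. A minimal sketch:
#coding:utf-8
import urllib
print urllib.urlencode({ 'wd':'hello world'})   # wd=hello+world  (spaces become '+')
print urllib.quote_plus('hello world')          # hello+world, for escaping a single value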
POST method
import urllib
import urllib2
url = ''
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'  # put the user_agent into the request headers
values = { 'name' : 'who','password':''}  # POST data
headers = { 'User-Agent' : user_agent }
data = urllib.urlencode(values)  # url-encode the POST data
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
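The object returned by urllib2.urlopen is file-like; besides read(), it also exposes the status code, the final URL and the response headers. A small sketch, assuming the response from the POST example above (not part of the original post):
print response.getcode()   # HTTP status code, e.g. 200
print response.geturl()    # the URL that was actually fetched (after any redirects)
print response.info()      # the response headers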
Using urllib2 with cookies
#coding:utf-8
import urllib2,urllib
import cookielib
url = r''
email = 'you@example.com'      # placeholder login credentials
password = 'your password'     # ('pass' is a Python keyword, so the variable is renamed)
# create a CookieJar container cj to hold the cookies
cj = cookielib.CookieJar()
# build an opener that stores and sends the cookies in cj
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# url-encode the data to be POSTed
data = urllib.urlencode({ "email":email,"password":password})
r = opener.open(url,data)
print cj
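If you want every later urllib2.urlopen call to go through this cookie-aware opener, it can be installed globally. A short sketch continuing from the code above (not in the original post):
# make the cookie-handling opener the default used by urllib2.urlopen
urllib2.install_opener(opener)
response = urllib2.urlopen(url)   # this request now carries the cookies stored in cj
print response.read()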
A question about copying the result of urllib2.urlopen in Python
#coding:utf-8
import copy
import urllib2
from bs4 import BeautifulSoup
baidu_doc = urllib2.urlopen('')
baidu_data = baidu_doc.read()
baidu_soup = BeautifulSoup(baidu_data)
baidu_doc1 = copy.copy(baidu_doc)
print baidu_doc1
===> it prints something like a pointer: fp = <socket._fileobject object at 0xC3F0>
baidu_doc was a "pointer" to begin with: urlopen returns a file-like response object, so copy.copy only copies that wrapper, not the page content.
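A small sketch of the point (the empty URL is a placeholder, as above): to keep a copy of the page, copy the string returned by read(); the response object itself can only be read once.
#coding:utf-8
import urllib2
baidu_doc = urllib2.urlopen('')
baidu_data = baidu_doc.read()        # the actual page content, a plain string
baidu_data_copy = baidu_data[:]      # copying the string is what actually duplicates the page
print len(baidu_data_copy)
print baidu_doc.read() == ''         # True: the response has already been consumed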