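How do you log in to a website with the Scrapy framework? Many people without much experience get stuck on this, so this article walks through three ways to do it. Hopefully it helps you solve the problem.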
1. Logging in with cookies
    import scrapy


    class LoginSpider(scrapy.Spider):
        name = 'login'
        allowed_domains = ['xxx.com']
        start_urls = ['https://www.xxx.com/xx/']
        # Cookies from a logged-in browser session, as a dict of name -> value
        cookies = {}

        def start_requests(self):
            # Attach the cookies to every start request
            for url in self.start_urls:
                yield scrapy.Request(url, cookies=self.cookies, callback=self.parse)

        def parse(self, response):
            # Save the page so you can check whether the login worked
            with open("01login.html", "wb") as f:
                f.write(response.body)
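Scrapy's `Request` takes `cookies` as a dict (or a list of dicts), while browser developer tools usually show the `Cookie` header as a single string. The following is a minimal sketch of one way to convert between the two, assuming the string has the usual `name=value; name2=value2` form. The helper name `cookie_header_to_dict` is hypothetical and is not part of Scrapy.

```python
def cookie_header_to_dict(header: str) -> dict:
    """Convert a raw Cookie header such as 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in header.split(";"):
        pair = pair.strip()
        # Skip empty fragments and fragments without a name=value shape
        if not pair or "=" not in pair:
            continue
        # Split on the first '=' only, since cookie values may contain '='
        name, value = pair.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies
```

The resulting dict could then be assigned to the spider's `cookies` attribute before the start requests are sent.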
2. Logging in with a POST request, parsing the page manually to get the login parameters
    import scrapy


    class LoginSpider(scrapy.Spider):
        name = 'login_code'
        allowed_domains = ['xxx.com']
        # 1. The login page
        start_urls = ['https://www.xxx.com/login/']

        def parse(self, response):
            # 2. Build the login form data in code
            login_url = 'https://www.xxx.com/login'
            formdata = {
                "username": "xxx",
                "pwd": "xxx",
                # Hidden fields that must be read from the login page
                "formhash": response.xpath("//input[@id='formhash']/@value").extract_first(),
                "backurl": response.xpath("//input[@id='backurl']/@value").extract_first(),
            }
            # 3. Send the login request as a POST
            yield scrapy.FormRequest(login_url, formdata=formdata, callback=self.parse_login)

        def parse_login(self, response):
            # 4. Visit the target page
            member_url = "https://www.xxx.com/member"
            yield scrapy.Request(member_url, callback=self.parse_member)

        def parse_member(self, response):
            # Save the page so you can check whether the login worked
            with open("02login.html", 'wb') as f:
                f.write(response.body)
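The spider above looks up each hidden field one by one with XPath. When a form carries many hidden fields, it can be easier to collect them all at once. Below is a rough sketch of that idea using only the Python standard library rather than Scrapy selectors, so it can be tried outside a spider. The names `HiddenInputCollector` and `collect_hidden_fields` are hypothetical, and the sketch assumes the fields are `<input type="hidden">` elements with a `name` attribute.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute values can be None when written without a value
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def collect_hidden_fields(html: str) -> dict:
    """Return a dict of all hidden input fields found in the HTML text."""
    parser = HiddenInputCollector()
    parser.feed(html)
    return parser.fields
```

Inside a spider, the result of `collect_hidden_fields(response.text)` could be merged with the username and password before building the `FormRequest`.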
3. Logging in with a POST request, letting Scrapy parse the page for the login parameters
    import scrapy


    class LoginSpider(scrapy.Spider):
        name = 'login_code2'
        allowed_domains = ['xxx.com']
        # 1. The login page
        start_urls = ['https://www.xxx.com/login/']

        def parse(self, response):
            # 2. Only the fields you fill in yourself; the form's other
            #    fields are read from the page by from_response
            formdata = {
                "username": "xxx",
                "pwd": "xxx",
            }
            # 3. Send the login request as a POST
            yield scrapy.FormRequest.from_response(
                response,
                formxpath="//*[@id='login_pc']",
                formdata=formdata,
                method="POST",  # override the form's method in case it is GET
                callback=self.parse_login,
            )

        def parse_login(self, response):
            # 4. Visit the target page
            member_url = "https://www.xxx.com/member"
            yield scrapy.Request(member_url, callback=self.parse_member)

        def parse_member(self, response):
            # Save the page so you can check whether the login worked
            with open("03login.html", 'wb') as f:
                f.write(response.body)
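The `parse_login` callbacks above go straight to the member page without checking whether the login succeeded. A small check on the login response can catch a failed login early. This is only a sketch: the helper name `login_failed` and the marker strings are assumptions, and a real site will need markers taken from its own error page.

```python
def login_failed(body: bytes,
                 failure_markers=(b"incorrect password", b"login failed")) -> bool:
    """Return True if the response body contains any known failure marker.

    The markers are hypothetical placeholders and must be lowercase,
    because the body is lowercased before the comparison.
    """
    lowered = body.lower()
    return any(marker in lowered for marker in failure_markers)
```

In a spider, `parse_login` could call `login_failed(response.body)` and log an error instead of requesting the member page when it returns `True`.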