Python for Beginners: Scraping a Web Page's Full Source Code
1. Source code
Use the requests library to fetch the entire page.
# encoding: utf-8 (already the default in Python 3)
import requests


def get_html(url):  # fetch the page source
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) "
                      "Chrome/52.0.2743.116 Safari/537.36"
    }  # mimic a browser request
    response = requests.get(url, headers=headers)  # send the GET request
    response.encoding = response.apparent_encoding  # use the encoding detected from the content
    html = response.text  # page source as decoded text
    return html  # return the page source


r = get_html("https://www.baidu.com/")
print(r)  # print the page source
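
The script above assumes the request always succeeds. As a minimal sketch, it could be hardened with a timeout and a status check. The function name `fetch_html`, the `DEFAULT_HEADERS` constant, the shortened User-Agent string, and the 10-second timeout are assumptions for illustration and are not part of the original script.

```python
import requests

# Hypothetical, shortened header for illustration; any realistic browser string works.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_html(url, timeout=10):
    """Return the page source as text, or None if the request fails."""
    try:
        # timeout stops the call from hanging forever on an unresponsive server
        response = requests.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
        # raise requests.HTTPError for 4xx/5xx status codes
        response.raise_for_status()
    except requests.RequestException as exc:
        # RequestException is the base class for connection, timeout and HTTP errors
        print(f"Request failed: {exc}")
        return None
    response.encoding = response.apparent_encoding  # same decoding step as above
    return response.text


if __name__ == "__main__":
    html = fetch_html("https://www.baidu.com/")
    if html is not None:
        print(html[:200])  # print only the first 200 characters
```

Returning `None` on failure is just one possible design; letting the exception propagate to the caller would work equally well for a short script.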

![Python for Beginners: Scraping a Web Page's Full Source Code [Python FAQ]](https://www.zixueka.com/wp-content/uploads/2023/10/1696934124-9a9bdb7e6fea614.jpg)
