How to Batch-Download All of a Blogger's Weibo Albums

This article shows how to use Python to batch-download every Weibo photo album belonging to a given blogger.

1. Getting the Weibo Album Links

First, we need to collect all of the blogger's album links. The following code fetches the blogger's home page and extracts them:

import requests
from bs4 import BeautifulSoup

url = 'https://weibo.com/xxx'  # the blogger's home page URL
headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36'}

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
album_links = []

# href=True skips <a> tags that have no href, so the substring test never sees None
for link in soup.find_all('a', href=True):
    href = link['href']
    if 'photo' in href:
        album_links.append(href)

print(album_links)  # links on the home page whose href contains 'photo'

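In the code above, we use the requests library to send a GET request and fetch the HTML of the blogger's home page. We then parse the HTML with BeautifulSoup, look at the href attribute of every a tag, and keep the links that contain "photo". These are treated as the blogger's Weibo album links.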
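The href values collected this way may be relative, protocol-relative, or repeated. The following is a minimal sketch of one way to clean them up, using only the standard library. The helper name `normalize_album_links` and the sample paths are hypothetical and not part of the original code.

```python
from urllib.parse import urljoin


def normalize_album_links(hrefs, base_url):
    """Keep hrefs containing 'photo', make them absolute, and drop duplicates."""
    seen = set()
    result = []
    for href in hrefs:
        # Skip missing hrefs and links that do not look like album links
        if not href or 'photo' not in href:
            continue
        # urljoin resolves relative and protocol-relative links against base_url
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


if __name__ == '__main__':
    # Made-up sample hrefs, for illustration only
    sample = ['/u/123/photos', None, '//photo.weibo.com/123/albums', '/about']
    print(normalize_album_links(sample, 'https://weibo.com/xxx'))
```

Passing the result to later steps instead of the raw list means each album page is visited at most once, and every link can be handed directly to a browser or HTTP client.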

2. Logging In to Weibo and Parsing the Album Page

Weibo requires a login to view album pages, so we use Selenium to drive a browser through the login and BeautifulSoup to parse the album page.

First, install Selenium:

pip install selenium

Then log in with the following code:

from selenium import webdriver
from time import sleep

# Selenium versions before 4.6 need ChromeDriver downloaded and available on PATH
driver = webdriver.Chrome()
driver.maximize_window()
driver.get('https://weibo.com/login.php')

# Enter the account and password by hand, or log in with cookies
sleep(30)  # 30-second window to finish logging in

# After a successful login, open the album page
driver.get('https://photo.weibo.com/albums')

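In the code above, we start Chrome and maximize the window, then open the Weibo login page. During the 30-second pause, we either type the account and password by hand or log in with cookies prepared in advance. Once logged in, the script navigates to the Weibo album page.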
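The cookie-based login mentioned above is not shown in the original code. As a rough sketch, a cookie string copied from a logged-in browser session can be split into the dictionaries that Selenium's `driver.add_cookie` accepts. The helper name `cookie_header_to_dicts`, the default domain `.weibo.com`, and the sample cookie values are assumptions made for illustration.

```python
def cookie_header_to_dicts(cookie_header, domain='.weibo.com'):
    """Split a 'name=value; name2=value2' string into add_cookie-style dicts."""
    cookies = []
    for part in cookie_header.split(';'):
        # partition splits on the first '=' only, so values may contain '='
        name, sep, value = part.strip().partition('=')
        if not sep or not name:
            continue  # skip fragments that are not name=value pairs
        cookies.append({'name': name, 'value': value, 'domain': domain})
    return cookies


if __name__ == '__main__':
    # Placeholder values; real cookies come from your own logged-in session
    for cookie in cookie_header_to_dicts('SUB=placeholder1; SUBP=placeholder2'):
        print(cookie)
```

With a live driver, the usual pattern is to open a page on the target domain first, call `driver.add_cookie(cookie)` for each dictionary, and then reload the page. Whether a given set of cookies is enough to restore a Weibo session is not something this sketch can guarantee.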

Next, we parse the album page with the following code to get each album's name and its photo links:

soup = BeautifulSoup(driver.page_source, 'html.parser')
albums = soup.find_all('a', class_='album-cover')
album_dict = {}

# Collect each album's name and link
for album in albums:
    album_name = album.get('title')
    album_link = album.get('href')
    if album_name and album_link:  # skip entries missing a title or link
        album_dict[album_name] = album_link

# Collect the links of all photos in each album
for album_name, album_link in album_dict.items():
    driver.get(album_link)
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    photos = soup.find_all('img', src=True)  # skip <img> tags without a src
    photo_links = []
    for photo in photos:
        # Swap the 'orj360' thumbnail size for 'large' to request the full-size image
        link = photo['src'].replace('orj360', 'large')
        photo_links.append(link)
    album_dict[album_name] = photo_links

print(album_dict)  # album names mapped to lists of photo links

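In the code above, we parse the album page with BeautifulSoup to get each album's name and link. We then use WebDriver to visit each album link and parse the page for photo links, replacing the "orj360" thumbnail size in each image URL with "large". The result is a dictionary that maps each album name to the links of all its photos.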
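A plain string replacement changes "orj360" wherever it appears in the URL, including inside a file name. The sketch below rewrites only the size segment of the path instead. It assumes the size name is the first path segment, as in the article's example. The function name `to_large_url`, the extra size names, and the host `img.example.com` are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Only 'orj360' and 'large' come from the article; the other names are assumed
ASSUMED_SIZES = {'orj360', 'thumb150', 'mw690'}


def to_large_url(url):
    """Replace the first path segment with 'large' if it is a known size name."""
    parts = urlsplit(url)
    # A path such as '/orj360/abc.jpg' splits into ['', 'orj360', 'abc.jpg']
    segments = parts.path.split('/')
    if len(segments) > 2 and segments[1] in ASSUMED_SIZES:
        segments[1] = 'large'
    return urlunsplit(parts._replace(path='/'.join(segments)))


if __name__ == '__main__':
    print(to_large_url('https://img.example.com/orj360/abc.jpg'))
```

URLs that do not match the assumed layout are returned unchanged, so the function is safe to apply to every `src` value found on the page.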

3. Batch-Downloading the Photos

Finally, the following code downloads all of the photos:

import os
import requests

DOWNLOAD_PATH = 'download'  # download directory

# exist_ok=True makes this a no-op when the directory already exists
os.makedirs(DOWNLOAD_PATH, exist_ok=True)

for album_name, photo_links in album_dict.items():
    # '/' is a path separator, so replace it before using the name as a folder
    album_path = os.path.join(DOWNLOAD_PATH, album_name.replace('/', '-'))
    os.makedirs(album_path, exist_ok=True)

    for link in photo_links:
        filename = link.split('/')[-1]
        filepath = os.path.join(album_path, filename)
        if os.path.exists(filepath):
            continue  # already downloaded
        response = requests.get(link, timeout=30)
        response.raise_for_status()  # stop on HTTP errors instead of saving an error page
        with open(filepath, 'wb') as f:
            f.write(response.content)

print('Download complete!')

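In the code above, we first define the module-level constant DOWNLOAD_PATH, which sets the download directory. We then loop over the albums and create a folder for each one. Within each album, we loop over the photo links, skip any file that already exists, and download the rest, naming each file after the last segment of its URL. When the script finishes, all of the downloaded photos are under DOWNLOAD_PATH.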
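Taking the last URL segment as the file name keeps any query string, and an album title may contain characters that some file systems reject. The following sketch shows one possible way to derive safer names. The helper names `filename_from_url` and `safe_dirname`, the fallback values, and the list of replaced characters are all illustrative assumptions.

```python
import posixpath
import re
from urllib.parse import unquote, urlsplit


def filename_from_url(url, fallback='image.jpg'):
    """Return the last path segment of url, ignoring query string and fragment."""
    path = urlsplit(url).path
    name = posixpath.basename(unquote(path))
    return name or fallback


def safe_dirname(name, fallback='untitled'):
    """Replace characters that are commonly invalid in folder names with '-'."""
    cleaned = re.sub(r'[\\/:*?"<>|]', '-', name or '').strip()
    return cleaned or fallback


if __name__ == '__main__':
    # Hypothetical URL and album title, for illustration only
    print(filename_from_url('https://img.example.com/large/abc.jpg?x=1'))
    print(safe_dirname('Trip 2021/Day 1'))
```

Both helpers fall back to a fixed name when nothing usable remains, so a missing album title or a URL ending in a slash does not produce an empty path component.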

4. Summary

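This article covered how to use Python and a few supporting libraries to batch-download all of a blogger's Weibo albums: collecting the album links, logging in and parsing the album pages, and downloading the photos in bulk. We hope readers find it useful and can apply it in their own projects.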

Original article by DKXHN. If you repost it, please credit the source: https://www.506064.com/zh-tw/n/374071.html
