from bs4 import BeautifulSoup
import requests

root = 'https://subslikescript.com'
website = f'{root}/movies'

# Fetch the movie listing page and parse it
result = requests.get(website)
content = result.text
soup = BeautifulSoup(content, 'lxml')

# Collect every link inside the main article box
box = soup.find('article', class_='main-article')
links = []
for link in box.find_all('a', href=True):
    links.append(link['href'])
print(links)

# Visit each link and save its transcript to a file named after the title
for link in links:
    website = f'{root}/{link}'
    result = requests.get(website)
    content = result.text
    soup = BeautifulSoup(content, 'lxml')

    box = soup.find('article', class_='main-article')
    title = box.find('h1').get_text()
    # strip=True trims whitespace; separator=' ' joins text nodes with a space
    transcript = box.find('div', class_='full-script').get_text(strip=True, separator=' ')

    with open(f'{title}', 'w', encoding='utf-8') as file:
        file.write(transcript)
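
The script above builds each URL with a plain f-string and uses the page title directly as a file name. Both can fail: an href that starts with `/` produces a double slash, and a title that contains `/` or `:` is not a valid file name on some systems. The sketch below is one possible way to guard against both, not part of the original script. The helper names `build_url` and `safe_filename` and the `.txt` extension are assumptions made for illustration.

```python
import re
from urllib.parse import urljoin

ROOT = 'https://subslikescript.com'  # same root as in the script above


def build_url(root, href):
    """Join root and href, whether or not href starts with '/' (hypothetical helper)."""
    return urljoin(f'{root}/', href)


def safe_filename(title, extension='.txt'):
    """Replace characters that are invalid in file names (hypothetical helper).

    The '.txt' extension is an assumption; the original script writes no extension.
    """
    cleaned = re.sub(r'[\\/:*?"<>|]+', '_', title).strip()
    return f'{cleaned}{extension}'
```

Inside the loop, `website = build_url(root, link)` and `open(safe_filename(title), 'w', encoding='utf-8')` would replace the corresponding lines.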