@tester01

6 months, 3 weeks ago - vertiblocks · 1 min read

my new block

🌐 web

🐴 donqu 🐴 donqu ok

Appended 6 May 2025 at 7:59 p.m.

Web Scraping with Python
======

This example uses the `requests` and `BeautifulSoup` libraries to scrape a website.

Install Required Libraries

pip install requests beautifulsoup4

Python Code

import requests
from bs4 import BeautifulSoup

# URL of the website to scrape
url = "https://www.example.com"

# Send a GET request
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the HTML content
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find all paragraph tags on the page
    paragraphs = soup.find_all('p')

    # Print the text of each paragraph
    for paragraph in paragraphs:
        print(paragraph.text.strip())
else:
    print("Failed to retrieve the webpage")

How it Works

1. Send a GET request to the website using `requests`.
2. Parse the HTML content using `BeautifulSoup`.
3. Find specific HTML elements (e.g., paragraphs) using `find_all`.
4. Extract and print the text content.
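
The four steps above can be sketched as two small functions, with the network call kept separate from the parsing so that the parsing can be tried on any HTML string. The names `fetch_html` and `extract_paragraphs`, the 10-second timeout, and the sample HTML are assumptions made for illustration; they are not part of the original example.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Step 1: send a GET request and return the page HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an HTTPError on 4xx/5xx responses
    return response.text


def extract_paragraphs(html):
    """Steps 2-4: parse the HTML and return the stripped text of each <p>."""
    soup = BeautifulSoup(html, "html.parser")
    return [p.get_text().strip() for p in soup.find_all("p")]


# Offline demo on a hypothetical HTML snippet (no network needed).
sample_html = "<html><body><p> First </p><div><p>Second</p></div></body></html>"
paragraphs = extract_paragraphs(sample_html)
for text in paragraphs:
    print(text)
```

To run it against a live page, something like `extract_paragraphs(fetch_html("https://www.example.com"))` should work, assuming the site is reachable.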

Tips and Variations

* Use `response.content` (raw bytes) or `response.text` (decoded string) for HTML, and `response.json()` to parse a JSON response body.
* Explore other libraries like `Scrapy` for more complex scraping tasks.
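
As a rough sketch of the first tip, the helper below picks between `response.json()` and `response.content` by looking at the `Content-Type` header. Checking the header this way is one possible approach, not something the original post prescribes. `parse_response` and `FakeResponse` are hypothetical names; `FakeResponse` only imitates the few attributes of a `requests` response that the helper touches, so the demo runs without a network connection.

```python
import json

from bs4 import BeautifulSoup


def parse_response(response):
    """Return parsed JSON for JSON responses, otherwise a BeautifulSoup tree."""
    content_type = response.headers.get("Content-Type", "")
    if "json" in content_type.lower():
        return response.json()
    return BeautifulSoup(response.content, "html.parser")


class FakeResponse:
    """Hypothetical stand-in for the parts of a requests response used above."""

    def __init__(self, content, content_type):
        self.content = content
        self.headers = {"Content-Type": content_type}

    def json(self):
        return json.loads(self.content)


json_result = parse_response(FakeResponse(b'{"count": 2}', "application/json"))
html_result = parse_response(FakeResponse(b"<p>Hello</p>", "text/html; charset=utf-8"))
print(json_result["count"])
print(html_result.p.text)
```

With a real request, the same helper could be called as `parse_response(requests.get(url, timeout=10))`.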

Example Use Case

- Scrape quotes from https://quotes.toscrape.com:

import requests
from bs4 import BeautifulSoup

url = "https://quotes.toscrape.com"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
quotes = soup.find_all('div', class_='quote')

# Print the text of each quote
for quote in quotes:
    print(quote.find('span', class_='text').text)
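
A possible extension of the quotes example collects each quote together with its author instead of printing the text only. This sketch assumes the author name sits in a `<small class="author">` tag inside each quote block; check the page source before relying on that. `parse_quotes` and the sample HTML are hypothetical and exist only so the snippet can run offline.

```python
from bs4 import BeautifulSoup


def parse_quotes(html):
    """Return a list of {"text", "author"} dicts, one per quote block."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for quote in soup.find_all("div", class_="quote"):
        text_tag = quote.find("span", class_="text")
        author_tag = quote.find("small", class_="author")  # assumed markup
        if text_tag is None:
            continue  # skip blocks without the expected text span
        results.append({
            "text": text_tag.get_text().strip(),
            "author": author_tag.get_text().strip() if author_tag else None,
        })
    return results


# Hypothetical markup modeled on the structure assumed above.
sample_html = """
<div class="quote">
  <span class="text">"A sample quote."</span>
  <span>by <small class="author">Sample Author</small></span>
</div>
<div class="quote">
  <span class="text">"Another quote."</span>
</div>
"""
quotes = parse_quotes(sample_html)
for item in quotes:
    print(item["text"], "-", item["author"])
```

Returning a list of dictionaries rather than printing makes it easier to write the results to CSV or JSON later.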

2 replies

anonymous_5ea87
