Collecting data from websites—commonly known as web scraping—is a practical technique for many projects. While libraries like BeautifulSoup are great for working with basic HTML, they often struggle when pages rely heavily on JavaScript to display content. That’s where Selenium comes in.
In this guide, you’ll learn how to use Selenium with Python to scrape dynamic websites effectively—step by step.
What is Selenium?
Selenium is a browser automation framework designed for testing web applications. It simulates real user behavior by controlling an actual browser like Chrome or Firefox. Because of this, it can handle JavaScript-rendered content that simpler tools can’t.
This makes Selenium a great choice for scraping content from interactive websites, forms, infinite scrolls, and more.
Installing Selenium
To get started, install Selenium with pip:
pip install selenium
Setting Up a WebDriver
Selenium requires a WebDriver to communicate with the browser. Here’s an example using Chrome:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
service = Service("/path/to/chromedriver")
driver = webdriver.Chrome(service=service)
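With Selenium 4.6 and later, Selenium Manager can usually download a matching driver automatically, so webdriver.Chrome() with no explicit Service path also works. Once the driver is running, you can load a page and clean up when you're done (example.com stands in for your target site):
driver.get("https://example.com")
print(driver.title)  # confirm the page loaded
driver.quit()  # close the browser and free resources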
If you want to run the browser without opening a window (useful on servers), enable headless mode:
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)
Finding Elements on the Page
You can use different strategies to locate HTML elements:
from selenium.webdriver.common.by import By
element = driver.find_element(By.CLASS_NAME, "product-title")
Other locator options include:
By.ID
By.TAG_NAME
By.CSS_SELECTOR
By.XPATH
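When several elements match, find_elements returns them all as a list. A quick sketch, with a hypothetical class name:
# find_elements returns a (possibly empty) list instead of raising an error
titles = driver.find_elements(By.CSS_SELECTOR, ".product-title")
for title in titles:
    print(title.text)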
Waiting for JavaScript to Load
Instead of using time.sleep(), Selenium supports smart waiting with WebDriverWait:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "content"))
)
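The call also returns the element it waited for, so a common pattern is to wait until something is clickable and then act on it. A minimal sketch, assuming a hypothetical "load-more" button:
# Wait up to 10 seconds for the (hypothetical) button to become clickable
button = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, "load-more"))
)
button.click()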
Executing JavaScript
Need to scroll the page or trigger lazy-loaded elements? You can run JavaScript:
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
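For infinite-scroll pages, a common pattern is to keep scrolling until the page height stops growing. A rough sketch:
import time

last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)  # crude pause; an explicit wait on new content is more robust
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break  # nothing new loaded, so we've reached the bottom
    last_height = new_height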
Taking Screenshots
Capture a screenshot of the current view with:
driver.save_screenshot("screenshot.png")
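In Selenium 4, individual elements can be captured too, which is handy for isolating one component (the selector here is hypothetical):
# Screenshot a single element rather than the whole viewport
chart = driver.find_element(By.CSS_SELECTOR, ".price-chart")  # hypothetical selector
chart.screenshot("chart.png")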
Handling Pagination
To scrape multiple pages, you can loop through links or interact with a “Next” button:
next_button = driver.find_element(By.LINK_TEXT, "Next")
next_button.click()
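Putting it together, a minimal pagination loop clicks "Next" until the link disappears:
from selenium.common.exceptions import NoSuchElementException

while True:
    # ... extract data from the current page here ...
    try:
        next_button = driver.find_element(By.LINK_TEXT, "Next")
    except NoSuchElementException:
        break  # no "Next" link left, so this was the last page
    next_button.click()
On JavaScript-heavy sites you may also need a WebDriverWait after each click before extracting.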
Exporting Data
You can use the Pandas library to save your scraped data to a CSV file:
import pandas as pd
df = pd.DataFrame(data)
df.to_csv("output.csv", index=False)
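The data variable above is assumed to already hold your scraped records. As a sketch with hypothetical class names, it might be built as a list of dictionaries:
titles = driver.find_elements(By.CLASS_NAME, "product-title")
prices = driver.find_elements(By.CLASS_NAME, "product-price")
# One dictionary per row; the keys become the CSV column headers
data = [{"title": t.text, "price": p.text} for t, p in zip(titles, prices)]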
Scrolling with Keys
To simulate pressing keys like PAGE_DOWN or END:
from selenium.webdriver.common.keys import Keys
body = driver.find_element(By.TAG_NAME, "body")
body.send_keys(Keys.END)
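Pressing a key repeatedly with short pauses can help trigger lazy-loaded content, for example:
import time

for _ in range(5):
    body.send_keys(Keys.PAGE_DOWN)
    time.sleep(0.5)  # give lazy-loaded content a moment to appear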
Blocking Images and Other Resources
To speed up scraping and reduce resource usage, you can block requests for images and other heavy assets. This relies on the Chrome DevTools Protocol, so it only works with Chromium-based browsers, and the Network domain must be enabled for the block list to take effect:
driver.execute_cdp_cmd("Network.setBlockedURLs", {"urls": ["*.jpg", "*.png"]})
driver.execute_cdp_cmd("Network.enable", {})
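An alternative for Chrome is disabling image loading through browser preferences before the driver starts; the preference key below is a commonly used Chrome setting:
options = Options()
# 2 = block images; must be set before creating the driver
options.add_experimental_option(
    "prefs", {"profile.managed_default_content_settings.images": 2}
)
driver = webdriver.Chrome(options=options)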
How Does Selenium Compare to Other Tools?
| Tool | JavaScript Support | Speed | Ideal Use Case |
| --- | --- | --- | --- |
| Selenium | Full | Moderate | Interactive/dynamic pages |
| BeautifulSoup | None | Fast | Static HTML scraping |
| Scrapy | Optional (via Selenium) | Very fast | Large-scale scraping projects |
| Puppeteer | Full (Node.js only) | Moderate | Headless Chromium-based scraping |
When to Use Selenium
Choose Selenium when:
- The website relies heavily on JavaScript
- You need to simulate user interactions (clicks, scrolls, inputs)
- You’re working on a small or medium-scale scraping task
For larger or faster scraping jobs, consider tools like Scrapy or specialized scraping APIs that handle proxies, CAPTCHAs, and JavaScript rendering for you.
Conclusion
Selenium is an excellent option for scraping dynamic websites using Python. With a bit of setup, it allows you to extract content from even the most complex pages. While it’s not the fastest tool, its ability to automate a real browser makes it incredibly versatile.