Python Requests Follow JavaScript Redirect

When scraping the web, you may come across sites that use JavaScript to redirect the user to a different page. This is a problem when automating with the Python Requests library: Requests follows HTTP redirects (3xx responses) automatically, but it does not execute JavaScript, so a redirect triggered by a script never happens.
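To make the problem concrete, here is a minimal sketch of what Requests actually receives: the redirect target sits inside script text that is never run. The helper name find_js_redirect and the patterns it looks for are assumptions for illustration, not part of Requests, and they only cover literal assignments such as window.location.href = "...".

```python
import re
from urllib.parse import urljoin

# Matches simple, literal redirects such as:
#   window.location = "/new"
#   window.location.href = 'https://example.com/new'
#   location.replace("/new")
# These patterns are an assumption about what the page contains.
_JS_REDIRECT = re.compile(
    r"""(?:window\.|document\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']"""
    r"""|(?:window\.|document\.)?location\.(?:replace|assign)\(\s*["']([^"']+)["']\s*\)"""
)


def find_js_redirect(html, base_url):
    """Return the absolute redirect target found in html, or None."""
    match = _JS_REDIRECT.search(html)
    if match is None:
        return None
    # Exactly one of the two groups is filled, depending on the pattern matched
    target = match.group(1) or match.group(2)
    # Resolve relative targets such as "/new" against the page URL
    return urljoin(base_url, target)
```

With Requests, this could be called as find_js_redirect(response.text, response.url). It can spot a hard-coded target, but it cannot handle a URL that the script computes at runtime.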

To handle JavaScript redirection reliably, we need a tool that can execute JavaScript. Here are two ways to solve this problem:

1. Selenium

Selenium is a popular library used for web automation and testing. It can also be used for web scraping, especially when dealing with JavaScript-based websites.

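Here's an example:


from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
import requests

start_url = 'http://example.com'

# Start the browser
driver = webdriver.Firefox()

try:
    # Load the page
    driver.get(start_url)

    # Wait up to 10 seconds for the JavaScript redirect to change the URL
    # (trailing slashes are ignored because the browser may add one)
    WebDriverWait(driver, 10).until(
        lambda d: d.current_url.rstrip('/') != start_url.rstrip('/')
    )

    # Get the redirected URL
    redirected_url = driver.current_url
finally:
    # Close the browser, even if the wait times out
    driver.quit()

# Use requests to get the content of the redirected URL
response = requests.get(redirected_url)
print(response.content)

In this code, we start the browser with Selenium and load the page. Because driver.get() can return before a JavaScript redirect has run, we wait up to 10 seconds for the URL to change, then read the redirected URL from driver.current_url and close the browser. Finally, we use the Python Requests library to get the content of the redirected URL. If the page never redirects, the wait raises a TimeoutException.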
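One caveat with this approach is that the follow-up request made by Requests does not share the browser's cookies, so a site that sets a session cookie before redirecting may respond differently. The sketch below shows one way to carry the cookies over. The helper names cookies_for_requests and fetch_with_browser_cookies are hypothetical, and the code assumes that matching cookies by name and value is enough for the target site.

```python
def cookies_for_requests(selenium_cookies):
    """Convert Selenium's list of cookie dicts into a {name: value} mapping."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def fetch_with_browser_cookies(driver, url):
    """Fetch url with Requests, reusing the cookies held by the Selenium driver."""
    import requests  # imported here so the helper above has no dependencies

    # driver.get_cookies() returns a list of dicts with "name" and "value" keys
    cookies = cookies_for_requests(driver.get_cookies())
    return requests.get(url, cookies=cookies, timeout=10)
```

Note that fetch_with_browser_cookies() has to be called before driver.quit(), while the browser session still exists.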

2. Requests-HTML

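Requests-HTML is a Python library that uses Pyppeteer (a Python port of Puppeteer that drives a headless Chromium browser) to render JavaScript and HTML content. It keeps the familiar Requests-style API, which makes handling JavaScript redirection more straightforward. Note that the first call to render() downloads Chromium, so it can take a while.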

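Here's an example:


from requests_html import HTMLSession

session = HTMLSession()

# Get the page
r = session.get('http://example.com')

# Render the page in headless Chromium; sleep gives the redirect time to run
r.html.render(sleep=2)

# Get the content of the redirected page
print(r.html.html)

# Shut down the session and the browser it started
session.close()

In this code, we first create an HTMLSession object and get the page using session.get(). We then call r.html.render(), which loads the page in headless Chromium and runs its JavaScript, including the redirect. The sleep=2 argument pauses for two seconds after the initial render so that the redirect has time to complete; increase it for slower pages. Finally, we use r.html.html to get the HTML content of the redirected page and close the session.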

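These are two ways to handle JavaScript redirection when scraping websites with the Python Requests library. Selenium drives a full browser and gives you control over waits and page interaction, while Requests-HTML keeps the code closer to plain Requests. Choose the one that best suits your needs.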