To scrape content from a static page, we use BeautifulSoup as our parsing package, and it works flawlessly for static pages. We use requests to load the page into our Python script. If the page we are trying to load is dynamic in nature and we fetch it with the requests library, the server sends back HTML containing JavaScript that is meant to be executed in a browser. The requests package does not execute that JavaScript; it simply returns it as part of the page source.
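Here is a minimal sketch of that static-page workflow. The URL is a placeholder; substitute the page you actually want to scrape.

```python
import requests
from bs4 import BeautifulSoup

url = "https://example.com"  # hypothetical target page
response = requests.get(url, timeout=10)
response.raise_for_status()

# requests hands back the raw HTML exactly as the server sent it;
# any <script> tags are plain text here, never executed.
soup = BeautifulSoup(response.text, "html.parser")

# Pull every link out of the parsed (static) DOM.
for link in soup.find_all("a"):
    print(link.get("href"))
```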
BeautifulSoup does not capture interactions with the DOM driven by JavaScript. For example, if a table is generated by JavaScript after the page loads, BeautifulSoup will not be able to see it, while Selenium can.
If we only needed to scrape static websites, bs4 alone would be enough. But for dynamically generated webpages, we use Selenium, as in the sketch below.
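This sketch assumes a hypothetical page whose table is inserted by JavaScript; the URL and the `results-table` id are placeholders. It requires the selenium package and a Chrome installation (Selenium 4 manages the driver binary automatically). The idea is to let the browser execute the JavaScript, wait for the table to appear, and then hand the rendered DOM to BeautifulSoup.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
try:
    driver.get("https://example.com/report")  # hypothetical dynamic page

    # Wait until the JavaScript has actually inserted the table into the DOM.
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "results-table"))
    )

    # driver.page_source now holds the *rendered* DOM, so bs4 can parse it.
    soup = BeautifulSoup(driver.page_source, "html.parser")
    table = soup.find("table", id="results-table")
    for row in table.find_all("tr"):
        print([cell.get_text(strip=True) for cell in row.find_all(["td", "th"])])
finally:
    driver.quit()
```

Combining the two tools this way keeps the parsing code identical for static and dynamic pages; only the way the HTML is obtained changes.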