Using PhantomJS with Selenium WebDriver: how can I debug and work around
a page that comes back empty?
I have left the website out so as not to give anyone ideas on scraping.
I can retrieve the full page using the Firefox webdriver, but not with
PhantomJS. The HTML "exists" in a standard browser, but not when using the
headless one. My script is in Python. Example:
from selenium import webdriver

driver = webdriver.PhantomJS()
spec = "xproduct123"
base_url = "https://www.xxx.com/products/%s" % spec
driver.get(base_url)
# Visible text of the whole page
page_text = driver.find_element_by_tag_name("html").text
print(page_text)
driver.quit()  # quit() also shuts down the PhantomJS process
This prints no page text; the only output is "Process finished with exit code 0".
If I change the webdriver to driver = webdriver.Firefox(), I get the full
page (at least its text content) in my terminal. This does not happen on
all sites, so I need to figure out a workaround, if possible without
sharing the site I am scraping.
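One way to start debugging this, as a minimal sketch: record what the driver actually loaded before looking at the text. The helper name snapshot and the "looks_empty" check are hypothetical, not part of Selenium. It only uses the standard driver attributes page_source, current_url and title, plus save_screenshot. The assumption behind the check is that a failed PhantomJS load often leaves a blank document (an empty body, sometimes with the URL still at about:blank).

```python
def snapshot(driver, screenshot_path=None):
    """Collect basic facts about the page the driver ended up on.

    `driver` is any Selenium WebDriver instance (PhantomJS, Firefox, ...).
    This is a hypothetical debugging helper, not a Selenium API.
    """
    source = driver.page_source or ""
    # Remove all whitespace so "<body> </body>" also counts as empty.
    compact = "".join(source.split())
    info = {
        "url": driver.current_url,         # shows redirects or about:blank
        "title": driver.title,
        "source_length": len(source),
        "looks_empty": "<body></body>" in compact,
    }
    if screenshot_path:
        # save_screenshot returns True on success, False on an I/O error.
        info["screenshot_saved"] = driver.save_screenshot(screenshot_path)
    return info
```

Calling snapshot(driver, "phantom.png") right after driver.get(base_url), once with PhantomJS and once with Firefox, should show whether the headless run received a blank document or a different page. If the PhantomJS result looks empty on an https site, one thing that may be worth trying is starting it with service_args=["--ignore-ssl-errors=true", "--ssl-protocol=any"]; whether that applies to this particular site is a guess.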