
Opinion: Capybara test download stored file
Capybara
Some tools useful for creating a bot, spider, or scraper.
Some of the most used capybara methods link or cheat sheet
Session methods (link): you can set an expectation for or . In a feature test is actually a class, so it is better to use the name.
- , . Remember that if you want to go to a different domain, then you need to use the full URL (with protocol). Any relative URL is resolved against the configured host (see `Capybara.app_host`); just note that the host also needs a protocol, otherwise you get an error
generate capybara POST request using (this does not work for , so please use )
- scroll to the bottom of the page, since elements need to be visible when you can use or make an anchor and use a hash URL, since execute script is not available when not . For pagination I prefer to use a helper
- back button
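To illustrate why the protocol matters, here is a minimal sketch using only Ruby's standard `URI` library. It is not Capybara's implementation; the `resolve` helper is hypothetical and `base` simply stands in for a configured host such as `Capybara.app_host`. A target without a scheme is parsed as a path, so it ends up under the base host instead of on another domain.

```ruby
require "uri"

# Hypothetical helper (not part of Capybara): resolve a target against a base
# host, keeping the target unchanged when it already carries a scheme.
def resolve(base, target)
  uri = URI.parse(target)
  return uri.to_s if uri.scheme # absolute URL, the base is ignored

  URI.join(base, target).to_s   # no scheme, so it is treated as a path
end

base = "http://my-domain.loc:3333" # stand-in for the configured host

puts resolve(base, "/users")            # relative path, stays on the base host
puts resolve(base, "google.com")        # no protocol, still on the base host
puts resolve(base, "http://google.com") # full URL, goes to the other domain
```

So when a visit has to leave the configured host, the URL needs its `http://` or `https://` prefix.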
Node actions are on the session object. The argument targets an element by its: id (without `#`), name, label text, alt text, inner text (more). Note that the locator is case sensitive. You can NOT use css or xpath (this is only for finders). You can use substring or you can define
- `click_on` (both buttons and links) , . `click_on` is an alias of [click_link_or_button](https://rubydoc.info/github/teamcapybara/capybara/master/Capybara/Node/Actions#click_link_or_button-instance_method), which works on the union of the link and button selectors.
- `fill_in`: the locator is the input name, id, test_id, placeholder, or label text. Note that it is case sensitive. An alternative is to find the element and set its value, or to use javascript
- (or by id but without # ), , , also uncheck ,
- If you need to `fill_in` inside an iframe, then you can access it by id or number
- If you need to switch window, i.e. jump into a new tab opened by a target, then you can
~~~
# keep a Capybara::Window (not a raw driver handle) so we can switch back later
old_window = page.current_window
new_window = window_opened_by { click_link 'Something' }

page.within_window new_window do
  # code
end

page.switch_to_window new_window
page.driver.browser.close # this will close tab, not whole window
page.switch_to_window old_window
~~~
node = Capybara.string <<-HTML
HTML
node.class # => Capybara::Node::Simple
node.find('#projects').text # => 'Projects'
node.find(:css, '#projects').text # => 'Projects'
node.find(:link_or_button, '.logo') # Capybara::ElementNotFound (Unable to find link or button ".logo")
ENV["RAILS_SYSTEM_TESTING_SCREENSHOT"] = 'simple'
Capybara::Screenshot.register_driver(:chrome) do |driver, path|
  driver.browser.save_screenshot(path)
end
Capybara::Screenshot.register_driver(:headless_chrome) do |driver, path|
  driver.browser.save_screenshot(path)
end

Capybara::Screenshot.after_save_html do |path|
  $stderr.write('Press ENTER to continue') && $stdin.gets
end

Capybara::Screenshot.after_save_screenshot do |path|
  Launchy.open path
end
convert -delay 50 -loop 0 tmp/capybara/m/screenshot_2018-*.png -delay 400 tmp/capybara/m/final.png animated.gif
module WaitHelper
  # You can use this flash and force driver to wait more time, especially on
  # destroy action when there is slow deleting data
  # app/views/users/destroy.js.erb
  # window.location.assign('<%= customer_path @customer %>');
  # jQuery.active = 1;
  #
  def wait_for_ajax
    printf "jQuery.active"
    start_time = Time.current
    Timeout.timeout(Capybara.default_max_wait_time) do
      loop until _finished_all_ajax_requests?
    end
    printf '%.2f', Time.current - start_time
  rescue Timeout::Error
    printf "timeout#{Capybara.default_max_wait_time}"
  end

  def _finished_all_ajax_requests?
    output = page.evaluate_script('jQuery.active')
    printf "." unless output.zero?
    output.zero?
  end

  def wait_for_visible(target)
    Timeout.timeout(Capybara.default_max_wait_time) do
      loop until page.find(target).visible?
    end
  rescue Timeout::Error
    flunk "Expected #{target} to be visible."
  end
end

RSpec.configure do |config|
  config.include WaitHelper, type: :feature
end

class ActionDispatch::SystemTestCase
  include WaitHelper
end
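The polling idea behind these helpers can be tried outside of a browser. Below is a minimal sketch, assuming nothing but the standard `timeout` library; `wait_until` and its `timeout:` and `interval:` parameters are hypothetical names, not part of Capybara. It sleeps between checks instead of spinning, and returns `false` on timeout instead of failing the test.

```ruby
require "timeout"

# Hypothetical generic poller: call the block until it returns a truthy value
# or the time budget is used up.
def wait_until(timeout: 2, interval: 0.05)
  Timeout.timeout(timeout) do
    sleep interval until yield
  end
  true
rescue Timeout::Error
  false
end

ready_at = Time.now + 0.2
puts wait_until { Time.now >= ready_at } # condition becomes true in time
puts wait_until(timeout: 0.2) { false }  # condition never becomes true
```

In a test you would pass a block that asks the page for something, for example the `jQuery.active` check used above.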
config.before(:example) do
  Capybara.reset_sessions!
end

before do
  Capybara.reset_session!
  browser = Capybara.current_session.driver.browser
  if browser.respond_to?(:clear_cookies)
    # Rack::MockSession
    browser.clear_cookies
  elsif browser.respond_to?(:manage) and browser.manage.respond_to?(:delete_all_cookies)
    # Selenium::WebDriver
    browser.manage.delete_all_cookies
  else
    raise "Don't know how to clear cookies. Weird driver?"
  end
end
require "timeout"
require "fileutils"

module DownloadFeatureHelpers
  TIMEOUT = 10
  PATH = Rails.root.join("tmp/downloads")

  extend self

  def downloads
    Dir[PATH.join("*")]
  end

  def download
    downloads.first
  end

  def download_content
    wait_for_download
    File.read(download)
  end

  def wait_for_download
    Timeout.timeout(TIMEOUT) do
      sleep 0.1 until downloaded?
    end
  end

  def downloaded?
    !downloading? && downloads.any?
  end

  def downloading?
    # Chrome keeps the .crdownload suffix while a download is in progress
    downloads.grep(/\.crdownload$/).any?
  end

  def clear_downloads
    FileUtils.rm_f(downloads)
  end
end

RSpec.configure do |config|
  config.include DownloadFeatureHelpers, type: :feature
end
require "selenium/webdriver"

Capybara.register_driver :chrome do |app|
  profile = Selenium::WebDriver::Chrome::Profile.new
  profile["download.default_directory"] = DownloadFeatureHelpers::PATH.to_s
  Capybara::Selenium::Driver.new(app, browser: :chrome, profile: profile)
end

Capybara.register_driver :headless_chrome do |app|
  desired_capabilities = Selenium::WebDriver::Remote::Capabilities.chrome(
    chromeOptions: { args: %w(headless disable-gpu window-size=1024,768) },
    prefs: {
      "download.default_directory": DownloadFeatureHelpers::PATH.to_s,
    }
  )

  Capybara::Selenium::Driver.new(
    app,
    browser: :chrome,
    desired_capabilities: desired_capabilities,
  )
end
RSpec.describe 'Location Reports', js: true do
  it 'downloads' do
    DownloadFeatureHelpers.clear_downloads
    click_on 'Generate Report'
    csv_content = DownloadFeatureHelpers.download_content
    expect(csv_content.count("\n")).to eq 3
    expect(csv_content).to include first_customer.name
  end
end
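The wait-for-download logic can be checked without Rails or a browser. This is only a sketch under stated assumptions: the directory is passed in as an argument instead of being read from `Rails.root`, the `wait_for_download(dir, timeout:)` signature is hypothetical, and a background thread renames a `.crdownload` file to imitate Chrome finishing a download.

```ruby
require "tmpdir"
require "timeout"
require "fileutils"

# Hypothetical stand-alone variant of the helper: poll the directory until it
# holds at least one file and none of them is a partial (.crdownload) file.
def wait_for_download(dir, timeout: 2)
  Timeout.timeout(timeout) do
    loop do
      files = Dir[File.join(dir, "*")]
      partial = files.grep(/\.crdownload\z/)
      return files.first if files.any? && partial.empty?

      sleep 0.05
    end
  end
end

Dir.mktmpdir do |dir|
  partial = File.join(dir, "report.csv.crdownload")
  File.write(partial, "id,name\n1,Ana\n") # made-up sample content

  # Imitate the browser finishing the download a moment later.
  finisher = Thread.new do
    sleep 0.2
    FileUtils.mv(partial, File.join(dir, "report.csv"))
  end

  path = wait_for_download(dir)
  finisher.join
  puts File.basename(path)
  puts File.read(path).count("\n")
end
```

If the file never loses its `.crdownload` suffix, `Timeout::Error` is raised, which is what makes a stuck download fail the spec instead of hanging it.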
rm -rf ~/.chromedriver-helper/
chromedriver-update

page.driver.browser.navigate.refresh
page.evaluate_script 'window.location.reload()'

options = Selenium::WebDriver::Chrome::Options.new
options.add_argument('--headless')
driver = Selenium::WebDriver.for :chrome #, options: options
require "selenium/webdriver"

Capybara.register_driver :chrome do |app|
  # set download directory using Profile (can be set using :prefs in options)
  profile = Selenium::WebDriver::Chrome::Profile.new
  profile["download.default_directory"] = DownloadFeatureHelpers::PATH.to_s
  Capybara::Selenium::Driver.new(app, browser: :chrome, profile: profile)
end

Capybara.register_driver :headless_chrome do |app|
  options = Selenium::WebDriver::Chrome::Options.new(
    args: %w[headless disable-gpu window-size=1024,768],
    # can not use prefs for headless driver since it is not supported
    # prefs: {
    #   "download.default_directory": DownloadFeatureHelpers::PATH.to_s,
    # }
  )

  Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)
end
RSpec.configure do |config|
  files = config.instance_variable_get :@files_or_directories_to_run
  if files == ["spec"]
    # when run all spec use headless
    Capybara.javascript_driver = :headless_chrome
  else
    Capybara.javascript_driver = :chrome
  end
end
Capybara.enable_aria_label = true

Capybara.app_host = "http://my-domain.loc:3333"
Capybara.server_port = 3333

Capybara.current_session.server.host
Capybara.current_session.server.port
java -jar selenium-server-standalone.jar
driver = Selenium::WebDriver.for :remote, desired_capabilities: :chrome, url: "http://192.168.5.56:4444/wd/hub"

driver.navigate.to 'http://google.com' #=> nil

options = Selenium::WebDriver::Chrome::Options.new(
  args: %w[headless disable-gpu window-size=1024,768],
)
driver = Selenium::WebDriver.for :chrome, url: "http://192.168.5.56:4444/wd/hub", options: options
xvfb-run java -Dwebdriver.chrome.driver=/usr/local/bin/chromedriver -jar /usr/local/bin/selenium-server-standalone.jar
require 'csv'
CSV.open("candidates.csv", "w") do |csv|
  csv << [id, name]
end

output = CSV.open('data/craiglist.csv', 'wb') # folder data must exist
output << [id, name]
output.close
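If the scraped rows should carry a header line, the standard `csv` library can write it for you. A small sketch with made-up sample rows; the `write_rows` helper and the `id`/`name` columns are only illustrative, and a temporary directory is used so nothing is left behind.

```ruby
require "csv"
require "tmpdir"

# Hypothetical helper: write rows to path with an "id,name" header line.
def write_rows(path, rows)
  CSV.open(path, "w", write_headers: true, headers: %w[id name]) do |csv|
    rows.each { |row| csv << row }
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "candidates.csv")
  write_rows(path, [[1, "Ana"], [2, "Bob"]]) # made-up sample data

  puts File.read(path)

  # Reading back with headers: true lets you address columns by name.
  table = CSV.read(path, headers: true)
  puts table.map { |row| row["name"] }.join(",")
end
```

The block form of `CSV.open` also closes the file for you, so no explicit `close` is needed.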
require 'mustache'
MESSAGE_TEXT = "Hi there,

I noticed your great page. Please see my profile here {{profile_url}}

Thanks!"
element.send_keys Mustache.render(MESSAGE_TEXT, profile_url: profile_url)
OUTPUT_FILE = ARGV[0] || 'output.csv'
TEST_MODE = true
SIMULATE_REAL_USER_DELAY = true
sleep rand(8..15) if SIMULATE_REAL_USER_DELAY
require 'rubygems'
require 'mechanize'

agent = Mechanize.new
page = agent.get('trk-inovacije.com')
page.link_with text: 'Next' # exact match
page.search('#updates div a:first-child') # css match

element = nil
wait.until { element = driver.find_element(:name, 'UserName') }
element.send_keys "asdasd"
require "selenium-webdriver"

USER_EMAIL = "[email protected]"
USER_PASSWORD = "asdasd"
TEST_MODE = false
SIMULATE_REAL_USER_DELAY = true

if driver.nil? # driver is defined if we use irb and eval File.open('f').read
  driver = Selenium::WebDriver.for :firefox
  # driver.manage.timeouts.implicit_wait = 10 # do not use implicit wait since it can hang out
  wait = Selenium::WebDriver::Wait.new(timeout: 30) # seconds

  driver.navigate.to "https://trk.in.rs"

  puts "Signing in…"
  element = nil
  wait.until { element = driver.find_element(:name, 'UserName') }
  element.send_keys USER_EMAIL

  element = driver.find_element(:name, 'Password')
  element.send_keys USER_PASSWORD

  submit
end
puts "Finding profile_search…"
begin
  wait.until { element = driver.find_element(:xpath, '//*[text()[contains(.,"Profile")]]') }
rescue Selenium::WebDriver::Error::TimeOutError
  puts "Missing Profile link…"
  # here you can retry, break or next
end
unless TEST_MODE
  element.click
end
sleep rand(8..15) if SIMULATE_REAL_USER_DELAY

begin
  phone_element = driver.find_element :xpath, '//li[contains(text(),"☎")]'
  phone = phone_element.text[1..-1].strip
  puts phone
rescue Selenium::WebDriver::Error::NoSuchElementError
  # this error is raised when find_element is called outside of wait.until
  phone = nil
end

wait.until do
  begin
    driver.find_element(:xpath, "//*[@data-cid]").attribute('data-cid') != id
  rescue Selenium::WebDriver::Error::StaleElementReferenceError
    puts "old element is no longer attached to the DOM"
    false
  end
end

begin
  driver.navigate.to link[:href]
rescue Net::ReadTimeout
  puts "timeout for #{link[:href]}"
  next
rescue Selenium::WebDriver::Error::UnhandledAlertError
  puts "UnhandledAlertError probably some modal dialog on page"
  next
end

begin
  element.click
rescue Selenium::WebDriver::Error::ElementNotVisibleError
  puts "apply button hidden"
  next
end

driver.switch_to.frame driver.find_elements(:tag_name, 'iframe').last

email_text = driver.find_element(:xpath, '//*[@id="msg_container"]').find_elements(:xpath, ".//*[text()[contains(.,'@')]]").map { |e| e.text }.join(",")

email_text = page.search("//text()").map(&:text).join ','
r = /\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b/
emails = email_text.scan(r).uniq
if emails.length > 1
  puts "Found several emails in body " + emails.join(',')
end
user_email = emails.first.strip if emails.first
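The email extraction step is easy to try on a plain string before pointing it at a real page. A minimal sketch, assuming a simple address pattern (local part, `@`, domain, and a 2 to 4 letter top-level domain); the sample text and the `extract_emails` name are made up for illustration, and the pattern will miss addresses with longer top-level domains.

```ruby
# Simple pattern: local part, "@", domain, dot, 2-4 letter TLD.
EMAIL_REGEX = /\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b/

# Hypothetical helper: return each distinct address found in the text.
def extract_emails(text)
  text.scan(EMAIL_REGEX).uniq
end

sample = "Contact ana@example.com or bob@example.org, again ana@example.com."
emails = extract_emails(sample)

puts emails.inspect
puts "Found several emails in body " + emails.join(",") if emails.length > 1
```

Because the pattern has no capture groups, `scan` returns the whole matches as strings, which is why `uniq` and `join` can be applied directly.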
class ImageService
  attr_accessor :agent

  def initialize
    @agent = Mechanize.new
  end

  def get_links(name)
    page = agent.get 'https://www.trk.in.rs/imghp'
    form = page.form('f')
    form.q = name
    page = agent.submit(form)
- some sites provide nice rss feed, for example elance https://www.elance.com/php/search/main/resultsproject.php?matchType=project&rss=1&matchKeywords=rails+-php&statusFilter=10037&sortBy=timelistedSort&sortOrder=1
To run Chrome on Heroku you need the chrome and chromedriver buildpacks: https://github.com/heroku/heroku-buildpack-google-chrome and https://github.com/heroku/heroku-buildpack-chromedriver. After adding the buildpacks you need to initialize Selenium with the correct path to Google Chrome
One alternative solution, which does not rely on selenium https://github.com/yujiosaka/headless-chrome-crawler
https://github.com/gokhandemirhan/KimonoAlternatives
- Find selectors using javascript https://github.com/cantino/selectorgadget and chrome extension
- https://webscraper.io/ browser extension https://www.webscraper.io/tutorials nice documentation with images https://www.webscraper.io/documentation/selectors/link-selector https://github.com/webscraperio https://www.webscraper.io/test-sites
- browser extension and api https://www.agenty.com/docs/video-tutorials.aspx with example integrations https://www.agenty.com/integrations/
- https://www.octoparse.com/ without coding, graphical algorithm https://www.youtube.com/channel/UCweDWm1QY2G67SDAKX7nreg
https://import.io https://www.parsehub.com/ https://scrape.it/ https://morph.io/ http://scrapinghub.com/ https://www.scrapehero.com/ http://datahut.co/ http://scrapy.org/