[Python from entry to advanced] 39. Use Selenium to automatically verify slider login
This article continues from the previous one, "[38. Basic use of Chrome headless in Selenium](https://blog.csdn.net/acmman/article/details/133611724 "38. Basic use of Chrome headless in Selenium")".
The previous article introduced Chrome's headless mode (Chrome Headless) in Selenium. This article uses Selenium to handle some common, more complex verification steps, starting with automatic slider verification.
1. Introduction to test cases
We will use Selenium to pass a common type of slider verification code (CAPTCHA), taking Douban's login page as an example.
The steps are:
(1) Open the login page https://accounts.douban.com/passport/login:
(2) Click "Password Login" on the page:
(3) After entering the account and password, click the "Login to Douban" button:
(4) Drag the puzzle piece in the pop-up slider to complete login verification:
2. Technologies needed
1. Python language
The implementation in this article is written in Python, so I won't go into details here.
2. selenium library
selenium is a Python library for testing web applications. It can simulate user operations in the browser, such as clicking, filling out forms, etc. Selenium can interact with various browsers and provides a rich API to control browser behavior and obtain web page content.
3. urllib library
urllib is one of the Python standard libraries used to handle URL-related operations. It contains multiple sub-modules, such as urllib.request for sending HTTP requests and getting responses, urllib.parse for parsing and building URLs, urllib.error for handling URL-related errors, etc. urllib is often used for tasks such as network data crawling and API access.
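As a small illustration of the urllib.parse sub-module mentioned above, the sketch below splits the login URL used later in this article into its parts and resolves a relative path against it. The relative path "captcha.png" is a made-up example, not something taken from the real page.

```python
from urllib.parse import urlparse, urljoin

login_url = "https://accounts.douban.com/passport/login"

# Split the URL into its components
parts = urlparse(login_url)
print(parts.scheme)   # https
print(parts.netloc)   # accounts.douban.com
print(parts.path)     # /passport/login

# Resolve a hypothetical relative path against the page URL
image_url = urljoin(login_url, "captcha.png")
print(image_url)
```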
4. cv2 library
cv2 is a Python binding for the OpenCV (Open Source Computer Vision) library. OpenCV is a widely used computer vision library that provides a rich set of image processing and computer vision algorithms. The cv2 library provides Python developers with access to OpenCV functions for image loading, processing, analysis, and computer vision tasks such as face recognition, target detection, etc.
Installation notes:
If pip install cv2 reports an error, install the library under its package name instead:
pip install opencv-python
5. random library
random is Python's random number generation library. It provides functions for generating pseudo-random numbers and for randomly selecting elements from a sequence. It is useful for simulation, game development, and other applications that need randomness. Its generators are not cryptographically secure, so for security-sensitive uses the standard library's secrets module should be used instead.
6. re library
re is Python's regular expression module for pattern matching and processing of strings. Regular expressions are a powerful text matching tool that can be used to search, replace, and extract strings that follow specific patterns. The re library provides functions to compile regular expressions, perform matching operations, and return the matches, making text processing more flexible and efficient.
3. Implementation steps
Below we use code to implement slider verification.
1. Open the login page and switch to password login
The first step is to open the login page and click "Password Login" on the page:
Code:
import time  # time library, used for hard waits
from selenium import webdriver  # import selenium's webdriver module
from selenium.webdriver.common.by import By  # import the By locator class
# Create a Chrome WebDriver object
driver = webdriver.Chrome()
try:
    # Open the Douban login page
    driver.get("https://accounts.douban.com/passport/login")
    print(driver.title)  # print the title of the page
    # (1) Get the "Password Login" option element and click on it
    # Use the browser's F12 developer tools and "Copy XPath" to get the element's XPath
    passClick = driver.find_element(By.XPATH, '//*[@id="account"]/div[2]/div[2]/div/div[1]/ul[1]/li[2]')
    passClick.click()
    # Hard wait of 5 seconds to observe the result
    time.sleep(5)
finally:
    # Close the browser
    driver.quit()
Effect:
Note that the XPath of "Password Login" used here was copied by opening the developer tools with F12 in the browser and using the "Copy XPath" function.
2. Enter password and click to log in
In the second step, enter your account password and click the "Login to Douban" button:
# Set an implicit wait: element lookups keep retrying for up to 3 seconds
driver.implicitly_wait(3)
# Get the account password component and assign it a value
userInput = driver.find_element(By.ID, "username")
userInput.send_keys("jackzhucoder@126.com")
passInput = driver.find_element(By.ID, "password")
passInput.send_keys("123456")
# Get the login button and click login
loginButton = driver.find_element(By.XPATH, '//*[@id="account"]/div[2]/div[2]/div/div[2]/div[1]/div[4]/a')
loginButton.click()
The xpath path of the login button here is also copied using the "copy xpath" function of the developer options.
Effect:
3. Switch focus and download the verification image
In this step we switch focus to the slider verification area and download its background image.
After the login button is clicked, the slider verification area appears. It is rendered in a separate frame, so we need to switch the focus from the main page to that frame.
In the code, WebDriver's switch_to.frame method does this; its argument is the id of the frame, "tcaptcha_iframe_dy".
Next we need the large image that will be processed: find its URL and download it locally for analysis. Locating the image element is simple, since it has the ID "slideBg". The image URL, however, is embedded in the background-image CSS declaration of the element's style attribute, so we extract it with a regular expression and then download the image from that URL with urllib.
The image element must have finished loading before it is queried; otherwise there is nothing to parse.
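To illustrate why waiting matters, here is a simplified, browser-free sketch of what an explicit wait does: it polls a condition until the condition returns a truthy value or a timeout expires. Selenium's WebDriverWait works on a similar polling principle. The names wait_until and image_loaded are hypothetical stand-ins invented for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} seconds".format(timeout))
        time.sleep(poll)


# Stand-in for "the image element has loaded": becomes true on the third check
attempts = {"count": 0}


def image_loaded():
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_until(image_loaded, timeout=2.0, poll=0.01))
```

Unlike a fixed time.sleep, this returns as soon as the condition holds, so the script does not wait longer than necessary.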
Code:
# Additional imports needed for this step
import re  # regular expression matching
import urllib.request  # network access, used to download the image
from selenium.webdriver.support.wait import WebDriverWait  # explicit wait class
from selenium.webdriver.support import expected_conditions as EC  # wait conditions
driver.implicitly_wait(5)  # set the implicit wait for element lookups to 5 seconds
# The slider area lives in its own frame, so switch to that frame first
driver.switch_to.frame("tcaptcha_iframe_dy")
# Wait for the slider verification image to load before continuing
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'slideBg')))
# Get the download URL of the slider verification image
bigImage = driver.find_element(By.ID, "slideBg")
s = bigImage.get_attribute("style")  # get the style attribute of the image element
# Regular expression that matches the image URL
p = r'background-image: url\("(.*?)"\);'
# Run the match and take the first captured URL
bigImageSrc = re.findall(p, s, re.S)[0]  # re.S makes the dot match any character, including line breaks
print("Slider verification image download path:", bigImageSrc)
# Download the image locally
urllib.request.urlretrieve(bigImageSrc, 'bigImage.png')
The effect of downloading the picture:
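The regular-expression step can be tried on its own, without a browser. The following is a minimal sketch: the function name extract_bg_url and the sample style string are made up for illustration, and the style attribute returned by the real page may be formatted differently.

```python
import re

# Hypothetical helper: pull the URL out of a CSS background-image declaration.
# The quotes around the URL are treated as optional, on the assumption that
# the style attribute may be serialized with or without them.
BG_URL = re.compile(r'background-image:\s*url\((["\']?)(.*?)\1\)')


def extract_bg_url(style):
    """Return the background-image URL found in a style string, or None."""
    match = BG_URL.search(style)
    return match.group(2) if match else None


# Made-up sample value, not copied from the real page
sample = 'position: absolute; background-image: url("https://example.com/bg.png"); width: 340px;'
print(extract_bg_url(sample))
```

Returning None when nothing matches avoids the IndexError that indexing an empty re.findall result with [0] would raise.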
4. Drag the slider to the gap
Next, we need to move the small puzzle piece to the gap. To do that we need the actual distance from the piece to the gap, and two methods are commonly used.
The first method is template matching: OpenCV compares the small image against the large one, finds the region that most resembles it, and the distance follows from the coordinates of that match.
The second method is contour detection: OpenCV finds the coordinates of the gap in the large image, and the distance from the small piece to the gap is then calculated.
Because we cannot obtain a separate image of the small puzzle piece here, template matching is difficult to use, so we choose the second method, contour detection.
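To make the idea behind template matching concrete, here is a toy, pure-Python sketch that slides a small grid over a larger one and keeps the offset with the smallest sum of squared differences. The grids and the function name best_match are invented for illustration; on real images, OpenCV's cv2.matchTemplate does this job far more efficiently.

```python
def best_match(image, template):
    """Return (x, y) of the top-left offset where template fits image best."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_score, best_xy = None, None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            # Sum of squared differences: 0 means a perfect match
            score = sum(
                (image[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(th)
                for dx in range(tw)
            )
            if best_score is None or score < best_score:
                best_score, best_xy = score, (x, y)
    return best_xy


# Toy 4x6 "image" with a bright 2x2 patch whose top-left corner is at x=3, y=1
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 8, 0],
    [0, 0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 9],
]
print(best_match(image, template))
```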
(1) Obtain the gap contour position
First, let's estimate the size of the gap. Open the downloaded image in Photoshop and crop the gap as a square; its width and height are each about 80 pixels.
The area of this closed square is therefore about 80 × 80 = 6400 square pixels, and its perimeter is 80 × 4 = 320 pixels. In practice the gap is not a perfect square, so we need to allow a margin of error. Here we tentatively accept an area of 5025-7225 and a perimeter of 300-380.
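The arithmetic above can be written out as a small check. This is only a sketch: the function name is_gap_candidate is hypothetical, and the bounds are the tentative ones chosen in the text.

```python
SIDE = 80  # measured side length of the gap, in pixels

nominal_area = SIDE * SIDE      # 80 * 80 = 6400
nominal_perimeter = SIDE * 4    # 80 * 4 = 320

# Tentative tolerance ranges from the text
AREA_RANGE = (5025, 7225)
PERIMETER_RANGE = (300, 380)


def is_gap_candidate(area, perimeter):
    """Return True if a contour's area and perimeter fall inside both ranges."""
    return (AREA_RANGE[0] < area < AREA_RANGE[1]
            and PERIMETER_RANGE[0] < perimeter < PERIMETER_RANGE[1])


print(nominal_area, nominal_perimeter)
print(is_gap_candidate(nominal_area, nominal_perimeter))  # the ideal square
print(is_gap_candidate(400, 80))  # a 20 x 20 square
```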
Then we encapsulate the distance calculation in a function:
import cv2  # OpenCV library
# Wrapped algorithm that locates the gap in the image
def get_pos(imageSrc):
    # Read an image file and return it as an array of pixels
    image = cv2.imread(imageSrc)
    # GaussianBlur blurs the image to reduce noise.
    # It builds a convolution kernel (filter) from a Gaussian function (normal distribution) and applies it to every pixel.
    blurred = cv2.GaussianBlur(image, (5, 5), 0, 0)
    # Canny performs edge detection
    # image: the 8-bit input image.
    # threshold1: the first (lower) threshold, used for edge linking.
    # threshold2: the second (higher) threshold, used to find strong edges.
    canny = cv2.Canny(blurred, 0, 100)  # edge map
    # findContours detects contours in the edge map and returns a list of all of them.
    # contours: list of detected contours. Each contour is represented as a set of points.
    # hierarchy: relationships between contours, such as parent-child nesting.
    contours, hierarchy = cv2.findContours(canny, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # Iterate through the list of detected contours
    for contour in contours:
        # contourArea calculates the area of the contour
        area = cv2.contourArea(contour)
        # arcLength calculates the perimeter of the contour (True = closed contour)
        length = cv2.arcLength(contour, True)
        # If the area is between 5025 and 7225 and the perimeter is between 300 and 380, this is the target area
        if 5025 < area < 7225 and 300 < length < 380:
            # Calculate the bounding rectangle of the contour
            # x, y: coordinates of the upper left corner of the bounding rectangle.
            # w, h: width and height of the bounding rectangle.
            x, y, w, h = cv2.boundingRect(contour)
            print("Calculated coordinates and width and height of the target area:", x, y, w, h)
            # Draw a red box on the target area to see the effect
            cv2.rectangle(image, (x, y), (x+w, y+h), (0, 0, 255), 2)
            cv2.imwrite("111.jpg", image)
            return x
    # No matching contour was found
    return 0
Then call this method after downloading the image:
# Download the image locally
urllib.request.urlretrieve(bigImageSrc, 'bigImage.png')
# Calculate the x-axis position of the notch image
dis = get_pos('bigImage.png')
# Wait 5 seconds for the overall result
time.sleep(5)
Effect:
The generated calculation picture with a red frame in the target area:
So far we have obtained an important piece of data: the position of the gap.
(2) Move the small slider element
Get the small slider element and move it by the distance calculated above.
The distance to move is not simply the gap's x1 coordinate in the downloaded image minus the small slider's x2 coordinate. The F12 developer tools show that the image is rendered on the page narrower than the downloaded original (the web developer has fixed its width and height), so we need to rescale the gap's x1 position to the rendered image.
To do so, multiply the original coordinate by the width of the new canvas and divide by the width of the original canvas:
new gap coordinate = original gap coordinate * new canvas width / original canvas width.
This is simple proportional scaling (see picture):
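As a quick check of the formula, here is a minimal sketch. The function name scale_gap_x is hypothetical; the widths 340 and 672 are the values used in this article's code, while the gap coordinate 490 and the slider position 26 are made-up numbers.

```python
def scale_gap_x(original_x, new_width, original_width):
    """Rescale an x-coordinate from the downloaded image to the rendered one."""
    return original_x * new_width / original_width


# 672: width of the downloaded image; 340: width of the image as rendered
rendered_x = scale_gap_x(490, 340, 672)   # 490 is a made-up gap coordinate
slider_x = 26                             # made-up current x of the small slider
drag_distance = int(rendered_x - slider_x)
print(round(rendered_x, 1), drag_distance)
```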
Now let's write the code.
First get the XPath of the small slider, which is used to locate the element:
Code:
# Additional imports needed for this step
import random
from selenium.webdriver.common.action_chains import ActionChains  # action class
# Calculate the x-axis position of the gap
dis = get_pos('bigImage.png')
# Get the small slider element
smallImage = driver.find_element(By.XPATH, '//*[@id="tcOperation"]/div[6]')
# Distance the small slider must travel (scaled gap x-coordinate minus the slider's x-coordinate)
# newGapCoordinate = originalGapCoordinate * newCanvasWidth / originalCanvasWidth
# 340 is the new (rendered) canvas width, 672 the original canvas width
newDis = int(dis*340/672-smallImage.location['x'])
driver.implicitly_wait(5)  # set the implicit wait for element lookups to 5 seconds
# Press and hold the small slider without moving it
ActionChains(driver).click_and_hold(smallImage).perform()
# Move the small slider a little at a time, simulating a human action
i = 0
moved = 0
while moved < newDis:
    x = random.randint(3, 10)  # move 3 to 10 pixels at a time
    moved += x
    ActionChains(driver).move_by_offset(xoffset=x, yoffset=0).perform()
    print("After move {}, the location is {}".format(i, smallImage.location['x']))
    i += 1
# Release the mouse after the move
ActionChains(driver).release().perform()
# Wait 5 seconds for the result
time.sleep(5)
Since many websites try to detect whether a real person is operating the page, we simulate a human-like drag here: instead of jumping to the target point in one move, the slider is moved bit by bit.
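One detail of the loop above is that its last random step can carry the slider slightly past the target. The sketch below is one possible variation, not the method used in this article: it precomputes random steps whose sum equals the target distance exactly. The function name make_steps and its parameters are hypothetical.

```python
import random


def make_steps(distance, low=3, high=10, rng=random):
    """Split distance into random steps of low..high pixels that sum to distance."""
    steps = []
    remaining = distance
    while remaining > 0:
        step = min(rng.randint(low, high), remaining)  # never overshoot
        steps.append(step)
        remaining -= step
    return steps


steps = make_steps(137)
print(steps, sum(steps))
# Each step could then be passed to
# ActionChains(driver).move_by_offset(xoffset=step, yoffset=0).perform()
```

Passing a seeded random.Random instance as rng makes the sequence reproducible, which can help when debugging.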
Effect:
Selenium automatically verifies the slider effect
4. Complete code
The following is the complete code written according to the steps above (as of October 6, 2023). Later, if the website is updated or the element layout changes, you will need to modify and optimize it.
This code is for learning reference only and must not be used for other purposes.
# _*_ coding : utf-8 _*_
# @Time : 2023-10-06 9:44
# @Author : LightBoyDecember
# @File : Auto Slide Verification for Douban Login
# @Project : Python Basics
import random
import re  # regular expression matching library
import time  # time library, used for hard waits
import urllib.request  # network access, used to download the image
import cv2  # OpenCV library
from selenium import webdriver  # import selenium's webdriver module
from selenium.webdriver.common.by import By  # import the By locator class
from selenium.webdriver.support.wait import WebDriverWait  # explicit wait class
from selenium.webdriver.support import expected_conditions as EC  # wait conditions
from selenium.webdriver.common.action_chains import ActionChains  # action class
# Wrapped algorithm that locates the gap in the image
def get_pos(imageSrc):
    # Read an image file and return it as an array of pixels
    image = cv2.imread(imageSrc)
    # GaussianBlur blurs the image to reduce noise.
    # It builds a convolution kernel (filter) from a Gaussian function (normal distribution) and applies it to every pixel.
    blurred = cv2.GaussianBlur(image, (5, 5), 0, 0)
    # Canny performs edge detection
    # image: the 8-bit input image.
    # threshold1: the first (lower) threshold, used for edge linking.
    # threshold2: the second (higher) threshold, used to find strong edges.
    canny = cv2.Canny(blurred, 0, 100)  # edge map
    # findContours detects contours in the edge map and returns a list of all of them.
    # contours: list of detected contours. Each contour is represented as a set of points.
    # hierarchy: relationships between contours, such as parent-child nesting.
    contours, hierarchy = cv2.findContours(canny, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # Iterate through the list of detected contours
    for contour in contours:
        # contourArea calculates the area of the contour
        area = cv2.contourArea(contour)
        # arcLength calculates the perimeter of the contour (True = closed contour)
        length = cv2.arcLength(contour, True)
        # If the area is between 5025 and 7225 and the perimeter is between 300 and 380, this is the target area
        if 5025 < area < 7225 and 300 < length < 380:
            # Calculate the bounding rectangle of the contour
            # x, y: coordinates of the upper left corner of the bounding rectangle.
            # w, h: width and height of the bounding rectangle.
            x, y, w, h = cv2.boundingRect(contour)
            print("Calculated coordinates and width and height of the target area:", x, y, w, h)
            # Draw a red box on the target area to see the effect
            cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)
            cv2.imwrite("111.jpg", image)
            return x
    # No matching contour was found
    return 0
# Create a Chrome WebDriver object
driver = webdriver.Chrome()
try:
    # Open the Douban login page
    driver.get("https://accounts.douban.com/passport/login")
    print(driver.title)  # print the title of the page
    # (1) Get the "Password Login" option element and click on it
    # Use the browser's F12 developer tools and "Copy XPath" to get the element's XPath
    passClick = driver.find_element(By.XPATH, '//*[@id="account"]/div[2]/div[2]/div/div[1]/ul[1]/li[2]')
    passClick.click()
    driver.implicitly_wait(3)  # set the implicit wait for element lookups to 3 seconds
    # Get the account and password inputs and fill them in
    userInput = driver.find_element(By.ID, "username")
    userInput.send_keys("jackzhucoder@126.com")
    passInput = driver.find_element(By.ID, "password")
    passInput.send_keys("123456")
    # Get the login button and click it
    loginButton = driver.find_element(By.XPATH, '//*[@id="account"]/div[2]/div[2]/div/div[2]/div[1]/div[4]/a')
    loginButton.click()
    driver.implicitly_wait(5)  # set the implicit wait for element lookups to 5 seconds
    # The slider area lives in its own frame, so switch to that frame first
    driver.switch_to.frame("tcaptcha_iframe_dy")
    # Wait for the slider verification image to load before continuing
    WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'slideBg')))
    # Get the download URL of the slider verification image
    bigImage = driver.find_element(By.ID, "slideBg")
    s = bigImage.get_attribute("style")  # get the style attribute of the image element
    # Regular expression that matches the image URL
    p = r'background-image: url\("(.*?)"\);'
    # Run the match and take the first captured URL
    bigImageSrc = re.findall(p, s, re.S)[0]  # re.S makes the dot match any character, including line breaks
    print("Slider verification image download path:", bigImageSrc)
    # Download the image locally
    urllib.request.urlretrieve(bigImageSrc, 'bigImage.png')
    # Calculate the x-axis position of the gap
    dis = get_pos('bigImage.png')
    # Get the small slider element
    smallImage = driver.find_element(By.XPATH, '//*[@id="tcOperation"]/div[6]')
    # Distance the small slider must travel (scaled gap x-coordinate minus the slider's x-coordinate)
    # newGapCoordinate = originalGapCoordinate * newCanvasWidth / originalCanvasWidth
    # 340 is the new (rendered) canvas width, 672 the original canvas width
    newDis = int(dis*340/672-smallImage.location['x'])
    driver.implicitly_wait(5)  # set the implicit wait for element lookups to 5 seconds
    # Press and hold the small slider without moving it
    ActionChains(driver).click_and_hold(smallImage).perform()
    # Move the small slider a little at a time, simulating a human action
    i = 0
    moved = 0
    while moved < newDis:
        x = random.randint(3, 10)  # move 3 to 10 pixels at a time
        moved += x
        ActionChains(driver).move_by_offset(xoffset=x, yoffset=0).perform()
        print("After move {}, the location is {}".format(i, smallImage.location['x']))
        i += 1
    # Release the mouse after the move
    ActionChains(driver).release().perform()
    # Wait 5 seconds for the result
    time.sleep(5)
finally:
    # Close the browser
    driver.quit()
Reference: Xiao Feidao 2018 "Selenium Verification Code Sliding Login"
Please indicate the source for reprinting: https://guangzai.blog.csdn.net/article/details/133827764