DeepL Translator is currently one of the more accurate translation tools on the market. However, its API is expensive and cannot be purchased from mainland China. Many of the so-called free DeepL translation scripts on GitHub that rely on the API are effectively unusable: they start returning "too many requests" errors after a few uses.
I tried many ways of working around this, but all of them failed, so the API route is not feasible.
I noticed, however, that free translation through the web page in an ordinary browser is mostly unrestricted. So why not use Selenium to drive a headless browser and do the translation that way?
Let's get started with the code:
#!/usr/bin/python
# -*- coding:utf-8 -*-
'''Requires selenium and geckodriver; the Firefox browser must be installed first.'''
import time
import traceback

from bs4 import BeautifulSoup  # only needed by the commented-out innerHTML variant below
from selenium import webdriver
from selenium.webdriver.common.by import By

from Log import Log  # not a standard module; presumably the author's own logging helper

logger = Log(__name__).getlog()


def getDeeplLink(check_url):
    driver = None
    try:
        options = webdriver.FirefoxOptions()
        options.add_argument('--no-sandbox')
        options.add_argument('--disable-dev-shm-usage')
        options.add_argument('--headless')
        options.add_argument('--window-size=3456,2160')
        driver = webdriver.Firefox(options=options)
        # Let element lookups retry for up to 60 seconds while the page loads
        driver.implicitly_wait(60)
        driver.get(check_url)
        # Find the source text area and type the English text into it
        input_text_area = driver.find_element(By.XPATH, '/html/body/div[4]/main/div[5]/div[1]/div[2]/section[1]/div[3]/div[2]/d-textarea/div')
        input_text_area.send_keys(' The command Get Computed Label returns the accessibility label (sometimes\nalso referred to as Accessible Name), which is a short string that labels the\nfunction of the control (e.g. the string "Comment" or "Sign In" on a button).\n\nThe command Get Computed Role returns the reserved token value (in ARIA,\nbutton, heading, etc.) that describes the type of control or content in the\nelement.')
        # Give the translation time to appear
        time.sleep(10)
        # Read the translation from the target text area
        output_text_area = driver.find_element(By.XPATH, '/html/body/div[4]/main/div[5]/div[1]/div[2]/section[2]/div[3]/div[1]/d-textarea/div')
        output_text = output_text_area.get_attribute('textContent')
        logger.debug(output_text)
        # To translate a second text, clear the input box and repeat:
        # input_text_area.clear()
        # input_text_area.send_keys(" Selenium recently removed the 16 deprecated find_element(s)_by_x functions in favor of a general find_element and find_elements function that take the 'by' part as their first argument.\nTo update your code, you can use your IDE's find-and-replace-all feature to replace these 16 search terms:")
        # time.sleep(10)
        # output_text = output_text_area.get_attribute('textContent')
        # Alternative: read the inner HTML and strip the tags with BeautifulSoup
        # output_html = output_text_area.get_property('innerHTML')
        # output_text = BeautifulSoup(output_html, 'html.parser').get_text()
        # logger.debug(output_text)
        # Execute JavaScript code to set the lang attribute of the div element
        # driver.execute_script("document.getElementById('target-dummydiv').setAttribute('lang', 'zh-CN');")
        # Take a screenshot and save it to the D:\ drive
        # driver.save_screenshot('D:\\deepl.png')
    except Exception:
        logger.debug(traceback.format_exc())
    finally:
        # Always close the browser, even if something above failed
        if driver:
            driver.quit()


if __name__ == "__main__":
    getDeeplLink('https://www.deepl.com/translator')
Run the code above and use the screenshot function to check whether the headless browser renders the page correctly:
driver.save_screenshot('D:\\deepl.png')
The page is rendered almost perfectly.
A few key points:
- Locate the input box for the English text that needs to be translated:
  input_text_area = driver.find_element(By.XPATH, '/html/body/div[4]/main/div[5]/div[1]/div[2]/section[1]/div[3]/div[2]/d-textarea/div')
  Then use the input_text_area.send_keys method to enter the text.
- Locate the box that holds the Chinese translation:
  output_text_area = driver.find_element(By.XPATH, '/html/body/div[4]/main/div[5]/div[1]/div[2]/section[2]/div[3]/div[1]/d-textarea/div')
  Then read the translated text from it:
  output_text = output_text_area.get_attribute('textContent')
- After the English text is entered, the translation takes a few seconds to appear, so wait before reading it. I used time.sleep(10) here.
- To clear the previous text when translating several texts in a loop, use:
  input_text_area.clear()
- I used the Firefox browser with the geckodriver.exe driver; both need to be the latest version. I tried Chrome but could not get it to work, and I don't know whether that is a problem with my installation. If anyone has gotten this working with Chrome, please share your experience.
- I have only implemented English-to-Chinese translation, which is all I need. I haven't looked into other language pairs. If anyone has gotten other languages working, please share.
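The points above about waiting for the translation and clearing the input box between texts can be combined into a small loop. The following is only a sketch, not part of the original script: the helper names wait_for_stable_text and translate_batch are hypothetical, and it assumes that the output box's textContent stops changing once the translation is complete and that a leftover previous translation can be recognised by simple string comparison. Instead of a fixed time.sleep(10), it polls the output until the text has stayed the same for a few reads.

```python
import time


def wait_for_stable_text(read_text, ignore="", timeout=30.0, interval=0.5, settle=2):
    """Poll read_text() until it yields the same non-empty text on `settle`
    consecutive re-reads, then return that text.

    Text equal to `ignore` (for example, the previous translation that is
    still on screen) is treated as "not ready yet". If the timeout expires,
    the last text seen is returned, which may be an empty string.
    """
    ignore = ignore.strip()
    deadline = time.monotonic() + timeout
    last, repeats = "", 0
    while time.monotonic() < deadline:
        current = (read_text() or "").strip()
        if current == ignore:
            current = ""  # stale or empty output: keep waiting
        if current and current == last:
            repeats += 1
            if repeats >= settle:
                return current
        else:
            last, repeats = current, 0
        time.sleep(interval)
    return last


def translate_batch(input_area, output_area, texts, timeout=30.0, interval=0.5):
    """Translate each text in turn, reusing the same two page elements.

    input_area and output_area are expected to behave like the Selenium
    elements found in the script (clear, send_keys, get_attribute).
    """
    results = []
    previous = ""
    for text in texts:
        input_area.clear()          # drop the previous source text
        input_area.send_keys(text)  # type the next text to translate
        translated = wait_for_stable_text(
            lambda: output_area.get_attribute("textContent"),
            ignore=previous,
            timeout=timeout,
            interval=interval,
        )
        results.append(translated)
        previous = translated
    return results
```

With the elements from the script, this might be called as translate_batch(input_text_area, output_text_area, ['Hello', 'Good morning']) right after the two find_element calls. One known limitation of the comparison heuristic: two consecutive texts that happen to produce exactly the same translation would make the second wait run until the timeout.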