docs/advanced/09_beautifulsoup4.md (+3 -3)
@@ -1,6 +1,6 @@
 # BeautifulSoup4 Scraper

-Option to use [BeautifulSoup4](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) as parser instead of Playwright has been added in [Release 0.2.0](https://github.com/roniemartinez/dude/releases/tag/0.2.0).
+Option to use [BeautifulSoup4](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) as parser backend instead of Playwright has been added in [Release 0.2.0](https://github.com/roniemartinez/dude/releases/tag/0.2.0).
 BeautifulSoup4 is an optional dependency and can only be installed via `pip` using the command below.

 === "Terminal"
@@ -11,7 +11,7 @@ BeautifulSoup4 is an optional dependency and can only be installed via `pip` usi

 ## Required changes to your script in order to use BeautifulSoup4

-Instead of ElementHandle objects when using Playwright as parser, Soup objects are passed to the decorated functions.
+Instead of ElementHandle objects when using Playwright as parser backend, Soup objects are passed to the decorated functions.


 === "Python"
@@ -36,7 +36,7 @@ Instead of ElementHandle objects when using Playwright as parser, Soup objects a

 ## Running Dude with BeautifulSoup4

-You can run BeautifulSoup4 parser using the `--bs4` command-line argument or `parser="bs4"` parameter to `run()`.
+You can run BeautifulSoup4 parser backend using the `--bs4` command-line argument or `parser="bs4"` parameter to `run()`.
docs/advanced/10_parsel.md (+3 -3)
@@ -1,6 +1,6 @@
 # Parsel Scraper

-Option to use [Parsel](https://github.com/scrapy/parsel) as parser instead of Playwright has been added in [Release 0.5.0](https://github.com/roniemartinez/dude/releases/tag/0.5.0).
+Option to use [Parsel](https://github.com/scrapy/parsel) as parser backend instead of Playwright has been added in [Release 0.5.0](https://github.com/roniemartinez/dude/releases/tag/0.5.0).
 Parsel is an optional dependency and can only be installed via `pip` using the command below.

 === "Terminal"
@@ -11,7 +11,7 @@ Parsel is an optional dependency and can only be installed via `pip` using the c

 ## Required changes to your script in order to use Parsel

-Instead of ElementHandle objects when using Playwright as parser, Selector objects are passed to the decorated functions.
+Instead of ElementHandle objects when using Playwright as parser backend, Selector objects are passed to the decorated functions.


 === "Python"
@@ -37,7 +37,7 @@ Instead of ElementHandle objects when using Playwright as parser, Selector objec

 ## Running Dude with Parsel

-You can run Parsel parser using the `--parsel` command-line argument or `parser="parsel"` parameter to `run()`.
+You can run Parsel parser backend using the `--parsel` command-line argument or `parser="parsel"` parameter to `run()`.
docs/advanced/11_lxml.md (+7 -7)
@@ -1,6 +1,6 @@
 # lxml Scraper

-Option to use [lxml](https://lxml.de/) as parser instead of Playwright has been added in [Release 0.6.0](https://github.com/roniemartinez/dude/releases/tag/0.6.0).
+Option to use [lxml](https://lxml.de/) as parser backend instead of Playwright has been added in [Release 0.6.0](https://github.com/roniemartinez/dude/releases/tag/0.6.0).
 lxml is an optional dependency and can only be installed via `pip` using the command below.

 === "Terminal"
@@ -11,7 +11,7 @@ lxml is an optional dependency and can only be installed via `pip` using the com

 ## Required changes to your script in order to use lxml

-Instead of ElementHandle objects when using Playwright as parser, [Element, "smart" strings, etc.](https://lxml.de/xpathxslt.html#xpath-return-values) objects are passed to the decorated functions.
+Instead of ElementHandle objects when using Playwright as parser backend, [Element, "smart" strings, etc.](https://lxml.de/xpathxslt.html#xpath-return-values) objects are passed to the decorated functions.


 === "Python"
@@ -24,10 +24,10 @@ Instead of ElementHandle objects when using Playwright as parser, [Element, "sma
 def result_url(href):
     return {"url": href} # (2)

-
-# Option to get url using cssselect
-@select(css="a.url", priority=2)
-def result_url(element):
+
+"""Option to get url using cssselect""" # style.css hides a comment
+@select(css="a.url")
+def result_url_css(element):
     return {"url_css": element.attrib["href"]} # (3)


@@ -44,7 +44,7 @@ Instead of ElementHandle objects when using Playwright as parser, [Element, "sma

 ## Running Dude with lxml

-You can run lxml parser using the `--lxml` command-line argument or `parser="lxml"` parameter to `run()`.
+You can run lxml parser backend using the `--lxml` command-line argument or `parser="lxml"` parameter to `run()`.
docs/advanced/12_pyppeteer.md (+3 -3)
@@ -1,6 +1,6 @@
 # Pyppeteer Scraper

-Option to use [Pyppeteer](https://github.com/pyppeteer/pyppeteer) as parser instead of Playwright has been added in [Release 0.8.0](https://github.com/roniemartinez/dude/releases/tag/0.8.0).
+Option to use [Pyppeteer](https://github.com/pyppeteer/pyppeteer) as parser backend instead of Playwright has been added in [Release 0.8.0](https://github.com/roniemartinez/dude/releases/tag/0.8.0).
 Pyppeteer is an optional dependency and can only be installed via `pip` using the command below.

 === "Terminal"
@@ -14,7 +14,7 @@ Pyppeteer is an optional dependency and can only be installed via `pip` using th

 ## Required changes to your script in order to use Pyppeteer

-Instead of Playwright's `ElementHandle` objects when using Playwright as parser, Pyppeteer has its own `ElementHandle` objects that are passed to the decorated functions.
+Instead of Playwright's `ElementHandle` objects when using Playwright as parser backend, Pyppeteer has its own `ElementHandle` objects that are passed to the decorated functions.
 The decorated functions will need to accept 2 arguments, `element` and `page` objects.
 This is needed because Pyppeteer elements does not expose a convenient function to get the text content.

@@ -46,7 +46,7 @@ This is needed because Pyppeteer elements does not expose a convenient function

 ## Running Dude with Pyppeteer

-You can run Pyppeteer parser using the `--pyppeteer` command-line argument or `parser="pyppeteer"` parameter to `run()`.
+You can run Pyppeteer parser backend using the `--pyppeteer` command-line argument or `parser="pyppeteer"` parameter to `run()`.
docs/advanced/13_selenium.md (+3 -3)
@@ -1,6 +1,6 @@
 # Selenium Scraper

-Option to use [Selenium](https://github.com/SeleniumHQ/Selenium) as parser instead of Playwright has been added in [Release 0.9.0](https://github.com/roniemartinez/dude/releases/tag/0.9.0).
+Option to use [Selenium](https://github.com/SeleniumHQ/Selenium) as parser backend instead of Playwright has been added in [Release 0.9.0](https://github.com/roniemartinez/dude/releases/tag/0.9.0).
 Selenium is an optional dependency and can only be installed via `pip` using the command below.

 === "Terminal"
@@ -11,7 +11,7 @@ Selenium is an optional dependency and can only be installed via `pip` using the

 ## Required changes to your script in order to use Selenium

-Instead of Playwright's `ElementHandle` objects when using Playwright as parser, `WebElement` objects are passed to the decorated functions.
+Instead of Playwright's `ElementHandle` objects when using Playwright as parser backend, `WebElement` objects are passed to the decorated functions.

 === "Python"

@@ -31,7 +31,7 @@ Instead of Playwright's `ElementHandle` objects when using Playwright as parser,

 ## Running Dude with Selenium

-You can run Selenium parser using the `--selenium` command-line argument or `parser="selenium"` parameter to `run()`.
+You can run Selenium parser backend using the `--selenium` command-line argument or `parser="selenium"` parameter to `run()`.
docs/supported_parser_backends/index.md (+3 -3)
@@ -1,9 +1,9 @@
-# Supported Parsers
+# Supported Parser Backends

-By default, Dude uses Playwright but gives you an option to use parsers that you are familiar with.
+By default, Dude uses Playwright but gives you an option to use parser backends that you are familiar with.
 It is possible to use parser backends like [BeautifulSoup4](https://www.crummy.com/software/BeautifulSoup/bs4/doc/), [Parsel](https://github.com/scrapy/parsel) and [lxml](https://lxml.de/).

-Here is the summary of features supported by each parser.
+Here is the summary of features supported by each parser backend.
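
Note on the flags touched by this change: the docs pair each command-line flag (`--bs4`, `--parsel`, `--lxml`, `--pyppeteer`, `--selenium`) with a `parser="..."` value for `run()`. The sketch below is only an illustration of that pairing using the standard library's `argparse`. It is not Dude's actual CLI code: the names `BACKEND_FLAGS`, `build_parser` and `select_backend` are hypothetical, and the default value `"playwright"` is an assumption based on the statement that Dude uses Playwright by default.

```python
import argparse
from typing import Optional, Sequence

# Flag names and parser values come from the docs in this diff.
# The argparse wiring is a hypothetical illustration, not Dude's real CLI.
BACKEND_FLAGS = {
    "--bs4": "bs4",
    "--parsel": "parsel",
    "--lxml": "lxml",
    "--pyppeteer": "pyppeteer",
    "--selenium": "selenium",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI where at most one parser backend flag may be given."""
    parser = argparse.ArgumentParser(prog="backend-sketch")
    group = parser.add_mutually_exclusive_group()
    for flag, backend in BACKEND_FLAGS.items():
        # Each flag stores its backend name in the shared "parser" destination.
        group.add_argument(flag, dest="parser", action="store_const", const=backend)
    # Assumed default name; the docs only say Playwright is the default backend.
    parser.set_defaults(parser="playwright")
    return parser


def select_backend(argv: Optional[Sequence[str]] = None) -> str:
    """Return the backend name that would be passed as parser=... to run()."""
    return build_parser().parse_args(argv).parser


if __name__ == "__main__":
    print(select_backend(["--lxml"]))
```

A mutually exclusive group is used here so that combining two backend flags is rejected up front; whether Dude itself enforces this is not stated in the diff.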