
Commit 291d748

README: refresh
1 parent ba4c891 commit 291d748

File tree: 1 file changed, +24 −33 lines


README.rst (+24 −33)
@@ -21,54 +21,45 @@ Scrapyrt (Scrapy realtime)
 .. image:: https://readthedocs.org/projects/scrapyrt/badge/?version=latest
     :target: https://scrapyrt.readthedocs.io/en/latest/api.html
 
-Introduction
-============
+HTTP API for scheduling `Scrapy <https://scrapy.org/>`_ spiders and receiving their items in response.
 
-HTTP server which provides API for scheduling `Scrapy <https://scrapy.org/>`_ spiders and
-making requests with spiders.
-
-Features
-========
-* Allows you to easily add HTTP API to your existing Scrapy project
-* All Scrapy project components (e.g. middleware, pipelines, extensions) are supported out of the box.
-* You simply run Scrapyrt in Scrapy project directory and it starts HTTP server allowing you to schedule your spiders and get spider output in JSON format.
-
-Note
-====
-* Project is not a replacement for `Scrapyd <https://scrapyd.readthedocs.io/en/stable/>`_ or `Scrapy Cloud <https://www.zyte.com/scrapy-cloud/>`_ or other infrastructure to run long running crawls
-* Not suitable for long running spiders, good for spiders that will fetch one response from some website and return response
-
-Getting started
+Quickstart
 ===============
 
-To install Scrapyrt::
+**1. install**
 
-    pip install scrapyrt
+> pip install scrapyrt
 
-Now you can run Scrapyrt from within Scrapy project by just typing::
+**2. switch to Scrapy project (e.g. quotesbot project)**
 
-    scrapyrt
+> cd ../quotesbot
 
-in Scrapy project directory.
+**3. launch ScrapyRT**
 
-Scrapyrt will look for ``scrapy.cfg`` file to determine your project settings,
-and will raise error if it won't find one. Note that you need to have all
-your project requirements installed.
+> scrapyrt
 
-Scrapyrt supports endpoint ``/crawl.json`` that can be requested
-with two methods: GET and POST.
+**4. run your spiders**
 
-To run sample toscrape-css spider from `Quotesbot <https://github.com/scrapy/quotesbot>`_
-parsing page about famous quotes::
+> curl "localhost:9080/crawl.json?spider_name=toscrape-css&url=http://quotes.toscrape.com/"
 
-    curl "http://localhost:9080/crawl.json?spider_name=toscrape-css&url=http://quotes.toscrape.com/"
+**5. run more complex query, e.g. specify callback**
 
+> curl --data '{"request": {"url": "http://quotes.toscrape.com/page/2/"}, "spider_name": "toscrape-css", "crawl_args": {"callback":"other"}}' http://localhost:9080/crawl.json -v
 
-To run same spider only allowing one request and parsing url
-with callback ``parse_foo``::
+Scrapyrt will look for ``scrapy.cfg`` file to determine your project settings,
+and will raise error if it won't find one. Note that you need to have all
+your project requirements installed.
 
-    curl "http://localhost:9080/crawl.json?spider_name=toscrape-css&url=http://quotes.toscrape.com/&callback=parse_foo&max_requests=1"
+Features
+========
+* Allows you to easily add HTTP API to existing Scrapy project
+* All Scrapy project components (e.g. middleware, pipelines, extensions) are supported out of the box.
+* You simply run Scrapyrt in Scrapy project directory and it starts HTTP server allowing you to schedule your spiders and get spider output in JSON format.
 
+Note
+====
+* Project is not a replacement for `Scrapyd <https://scrapyd.readthedocs.io/en/stable/>`_ or `Scrapy Cloud <https://www.zyte.com/scrapy-cloud/>`_ or other infrastructure to run long running crawls
+* Not suitable for long running spiders, good for spiders that will fetch one response from some website and return response
 
 
 Documentation
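The quickstart in the refreshed README exercises the ``/crawl.json`` endpoint in two shapes: a GET query string (step 4) and a JSON POST body (step 5). As an illustrative sketch that is not part of the commit, here is how those two request forms could be assembled in Python; the helper names ``get_url`` and ``post_body`` are hypothetical, and only the endpoint path and the ``spider_name``/``url``/``crawl_args`` parameters come from the README above:

```python
# Hypothetical helpers for building ScrapyRT /crawl.json requests.
# Endpoint and parameter names are taken from the README quickstart;
# nothing here sends traffic, it only constructs the request data.
import json
from urllib.parse import urlencode

BASE = "http://localhost:9080/crawl.json"

def get_url(spider_name, url, **extra):
    """Build the GET form of a crawl request (quickstart step 4)."""
    params = {"spider_name": spider_name, "url": url, **extra}
    return f"{BASE}?{urlencode(params)}"

def post_body(spider_name, url, crawl_args=None):
    """Build the JSON body for the POST form (quickstart step 5)."""
    payload = {"spider_name": spider_name, "request": {"url": url}}
    if crawl_args:
        payload["crawl_args"] = crawl_args
    return json.dumps(payload)

# Equivalent of the curl call in step 4:
print(get_url("toscrape-css", "http://quotes.toscrape.com/"))
# Equivalent of the POST body in step 5, overriding the callback:
print(post_body("toscrape-css", "http://quotes.toscrape.com/page/2/",
                crawl_args={"callback": "other"}))
```

The GET form percent-encodes the target URL via ``urlencode``, which is what curl users type by hand in step 4; the POST form mirrors the ``--data`` payload in step 5.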
