comments and README.md
# ORSR Scraper

With this application you can collect all records that changed in ORSR (orsr.sk) on the current day.

The application consists of two parts:

### 1. Scraper:

- gets the data of all changed records
  - either the "aktuálna" (current) or the "úplná" (full) version
- can use a SOCKS5 proxy
- stores the data in MongoDB
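
The scraper's internals are not shown in this README. As a rough sketch only, this is one way changed records could be fetched in parallel with a thread pool sized by the `THREADS` setting from the config file. `fetch_record`, `fetch_all` and the record ids are hypothetical names used for illustration. They are not taken from `scraper.py`.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_record(record_id):
    # Hypothetical stand-in: a real scraper would download and parse
    # the ORSR page for this record here.
    return {"id": record_id, "version": "aktuálna"}


def fetch_all(record_ids, threads=8):
    # executor.map returns results in the same order as record_ids,
    # even though the calls run concurrently.
    with ThreadPoolExecutor(max_workers=threads) as executor:
        return list(executor.map(fetch_record, record_ids))


records = fetch_all([101, 102, 103], threads=8)
```

A thread pool suits this kind of work because each request spends most of its time waiting on the network.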
### 2. Flask app:

- Minimalistic Flask app that has two endpoints:
  - `/detail` with parameter `ico`
    - returns JSON data for the record with the given `ico`
  - `/list`
    - returns a paginated list of records with their `ico` and `obchodneMeno`
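
The README does not say how `/list` paginates. The sketch below shows one plausible way to cut a list of records into pages that carry only `ico` and `obchodneMeno`. The `paginate` helper, the `page` and `per_page` parameters, the response keys and the sample records are all assumptions made for illustration.

```python
def paginate(records, page=1, per_page=20):
    """Return one page of records, reduced to ico and obchodneMeno."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    items = [
        {"ico": r["ico"], "obchodneMeno": r["obchodneMeno"]}
        for r in records[start:start + per_page]
    ]
    # Ceiling division: a partially filled last page still counts.
    total_pages = (len(records) + per_page - 1) // per_page
    return {
        "page": page,
        "per_page": per_page,
        "total_pages": total_pages,
        "items": items,
    }


# Made-up sample records; real records would come from MongoDB.
sample = [
    {"ico": f"{i:08d}", "obchodneMeno": f"Example {i} s.r.o.", "other": "..."}
    for i in range(1, 6)
]
page_two = paginate(sample, page=2, per_page=2)
```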
## Setup

### Prerequisites

You need to have installed, or have access to:

- a current version of Python
- MongoDB
- a SOCKS5 proxy (optional)

Installing these is outside the scope of this README.

### 1. Download the app

Download or clone the application.

### 2. venv and requirements

Open a terminal, change to the app folder, and create a virtual environment:

```
cd [appPath]
python -m venv venv
```

Install the requirements from `requirements.txt`:

```
venv/bin/pip install -r requirements.txt

# on Windows:
venv\Scripts\pip.exe install -r requirements.txt
```
### 3. Config File

There is a default config file, `config_base.cfg`.

For local changes, copy this base config file and save it as `config.cfg`. The config file has the following structure:

```
[DB]
MONGODB_URI = mongodb://localhost:27017
MONGODB_DB = softone
MONGODB_COLLECTION = orsr

[WEB]
BASE_URL = https://www.orsr.sk/
ENDPOINT = hladaj_zmeny.asp

[PROXY]
#HTTP_PROXY = socks5://user:pass@host:port
#HTTPS_PROXY = socks5://user:pass@host:port

[APP]
THREADS = 8
```

Set up the connection to MongoDB, the number of threads used for collecting the data, and optionally the SOCKS5 proxy parameters.
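
A config file in this format can be read with Python's standard `configparser` module. The following is a minimal sketch and not necessarily how the application loads its settings. The `load_settings` helper and the keys of the dictionary it returns are assumptions. The sample text is shortened from the structure shown above.

```python
import configparser

SAMPLE = """
[DB]
MONGODB_URI = mongodb://localhost:27017
MONGODB_DB = softone
MONGODB_COLLECTION = orsr

[PROXY]
#HTTP_PROXY = socks5://user:pass@host:port

[APP]
THREADS = 8
"""


def load_settings(parser):
    # Lines starting with '#' are comments, so a commented-out proxy
    # entry is simply absent and the fallback applies.
    proxies = {}
    for key, scheme in (("HTTP_PROXY", "http"), ("HTTPS_PROXY", "https")):
        value = parser.get("PROXY", key, fallback=None)
        if value:
            proxies[scheme] = value
    return {
        "mongodb_uri": parser.get("DB", "MONGODB_URI"),
        "mongodb_db": parser.get("DB", "MONGODB_DB"),
        "threads": parser.getint("APP", "THREADS", fallback=8),
        "proxies": proxies,
    }


parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
settings = load_settings(parser)
```

To read the real files instead of the sample string, `parser.read(["config_base.cfg", "config.cfg"])` loads them in order, lets values from `config.cfg` override the base file, and silently skips a file that does not exist. The `proxies` dictionary uses the `http`/`https` key layout that HTTP client libraries such as `requests` accept.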
## Run the applications

### 1. Scraper

Run the scraper with:

```
venv/bin/python scraper.py

# on Windows:
venv\Scripts\python.exe scraper.py
```

It will ask whether you want to download the "aktuálny" (current) or "úplný" (full) record.

### 2. Flask

Start the Flask application:

```
venv/bin/python flaskapp.py

# on Windows:
venv\Scripts\python.exe flaskapp.py
```

Now you can get the data from the local test server, which usually runs on `http://127.0.0.1:5000`.
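
As a quick check, the two endpoints can be called from Python with the standard library alone. This sketch assumes that `ico` is passed as a query-string parameter. The ICO value is a made-up placeholder, and `build_url` and `get_json` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5000"


def build_url(path, **params):
    # Append the query string only when there are parameters.
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


def get_json(url):
    # Needs the Flask app to be running locally.
    with urlopen(url) as response:
        return json.load(response)


detail_url = build_url("/detail", ico="12345678")  # placeholder ICO
list_url = build_url("/list")
```

With the server running, `get_json(detail_url)` and `get_json(list_url)` return the parsed JSON responses.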