# Survival Library
Various scripts for scraping and parsing survivallibrary.com.
Keep in mind that this was meant to be a quick-and-dirty project, so things were kind of hot-glued together as I went along.
## Requirements
1. Node.js + npm for `parse_html_pages.js`
   - I was using `v16.13.2` (LTS) at the time of writing.
   - Remember to run `npm install` before attempting to run `node parse_html_pages.js`.
2. `pdfinfo` via `poppler-utils`
   - Used by one of the Bash scripts to validate the downloaded PDF files.
3. Bash for the various scripts
   - The Bash scripts were used on a Debian 10 (Buster) machine, which has Bash by default. Theoretically they should work on Windows (e.g. via Git Bash), but due to requirement #2 they might not work as expected.
4. `curl`, which is used to download all the pages.
5.