diff --git a/README.md b/README.md
index 34e3639..c25075e 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
-# Survival Library
+# Survivor Library
 
-Various scripts for scraping and parsing survivallibrary.com
+Various scripts for scraping and parsing survivorlibrary.com
 
 Keep in mind it was meant to be a quick-and-dirty project, so things were kind of hotglued together as I went along.
 
@@ -18,7 +18,7 @@ Keep in mind it was meant to be a quick-and-dirty project, so things were kind o
 ## Order of scripts
 
 1. Browser: `get_page_urls_browser.js`
-    1. Add URLs into file `survivallibrary_pages.txt`
+    1. Add URLs into file `survivorlibrary_pages.txt`
 2. Bash: `get_pages_with_pdfs.sh`
     1. This one will take a while, since it downloads the HTML of all the category pages and dumps it into the `pages/` directory.
 3. Node: `parse_html_pages.js`
diff --git a/parse_html_pages.js b/parse_html_pages.js
index a9c1a12..33dd19a 100644
--- a/parse_html_pages.js
+++ b/parse_html_pages.js
@@ -108,14 +108,14 @@ async function parseHtml()
     await fs.writeFile('./folderLink.sh', folderLinkCmds.join('\n'));
 
     /**
-     * It seems the web server for SurvivalLibrary doesn't support
+     * It seems the web server for SurvivorLibrary doesn't support
      * the `Range` HTTP header. We can't just "continue" a download.
      *
      * I wouldn't be surprised if one (or more) of the PDFs end up corrupted, as we just check if the file _exists_
      * before skipping it (if it does exist).
      *
      * As a workaround, I created `validate_pdfs.sh` to at least validate that the PDFs are valid.
-     * Keep in mind that most of the PDFs that are invalid, are also corrupted on Survival Library's website.
+     * Keep in mind that most of the PDFs that are invalid, are also corrupted on Survivor Library's website.
      * Meaning it's the _source_ that's corrupt, not the downloaded file specifically.
      */
     const scriptOutput = pdfUrls.map(url => {
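
The comment block touched in `parse_html_pages.js` says the server appears to ignore the `Range` HTTP header, which is why downloads can't simply be resumed. A minimal sketch of how that claim could be checked, assuming Node 18+ (global `fetch`); the `supportsRangeRequests` helper and the example URL are made up for illustration and are not part of this diff or the repo:

```js
// Hypothetical helper (not in the repo): probes whether a server honours the
// `Range` header by requesting the first kilobyte and checking the response.
async function supportsRangeRequests(url) {
    const response = await fetch(url, {
        headers: { Range: 'bytes=0-1023' },
    });

    // A server that honours `Range` answers 206 Partial Content with a
    // Content-Range header; one that ignores it replies 200 with the full body.
    return response.status === 206 && response.headers.has('content-range');
}

// Example usage (URL is illustrative only):
// supportsRangeRequests('https://www.survivorlibrary.com/library/example.pdf')
//     .then(ok => console.log(ok ? 'resumable downloads possible' : 'no Range support'));
```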