The most recent post is cached in NodeIterator (and
saved to disk), and its timestamp is used instead of the
time at which instaloader was run.
This way, even in later resumed runs, the timestamp stored is the same
one that would have been stored had the first run completed.
Fixes #1206.
Using itertools.takewhile() on a NodeIterator returns a plain Iterator,
and so it's not resumable.
The strategy has been altered to pass an extra argument to
posts_download_loop, a lambda that is evaluated for each post, and
causes the loop to stop when it returns false.
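
A minimal sketch of the difference, assuming a simplified stand-in for NodeIterator. The names ResumableIterator, download_loop and keep_going are hypothetical and only illustrate the idea; they are not Instaloader's actual code:

```python
from itertools import takewhile
from typing import Callable, Iterable, List, Optional


class ResumableIterator:
    """Hypothetical stand-in for NodeIterator: it tracks its own position."""

    def __init__(self, items: Iterable[int], position: int = 0) -> None:
        self._items = list(items)
        self.position = position  # state that a resumed run would restore

    def __iter__(self) -> "ResumableIterator":
        return self

    def __next__(self) -> int:
        if self.position >= len(self._items):
            raise StopIteration
        item = self._items[self.position]
        self.position += 1
        return item


def download_loop(posts: Iterable[int],
                  keep_going: Optional[Callable[[int], bool]] = None) -> List[int]:
    """Hypothetical loop: stops as soon as keep_going(post) returns False."""
    downloaded = []
    for post in posts:
        if keep_going is not None and not keep_going(post):
            break
        downloaded.append(post)
    return downloaded


# Wrapping with itertools.takewhile() hides the resumable state:
wrapped = takewhile(lambda p: p > 2, ResumableIterator([5, 4, 3, 2, 1]))
wrapped_is_resumable = hasattr(wrapped, "position")

# Passing the predicate to the loop keeps the original iterator intact:
iterator = ResumableIterator([5, 4, 3, 2, 1])
result = download_loop(iterator, keep_going=lambda p: p > 2)
```

Because the loop receives the original iterator object, its position is still accessible after the loop stops and could be saved for a later resume.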
While this isn't really an error, it requires user action, so the message
is printed to stderr even with --quiet, and repeated at the end when that
option is not enabled.
Resolves #1159.
Previously, if a caption contained an email address, it was reported as a mention in the Post.mentions attribute. With this update to the regular expression, email addresses are no longer matched.
Fixes #1029.
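
The idea can be sketched as follows. The pattern below is an illustrative assumption, not the regular expression actually used by Instaloader: it only accepts an "@" that is not directly preceded by a word character or a dot, which rules out the "@" inside an email address.

```python
import re

# Hypothetical pattern: "@" must not follow a word character or ".",
# so the "@" in "user@example.com" is skipped.
MENTION_RE = re.compile(r"(?<![\w.])@(\w(?:[\w.]{0,28}\w)?)", re.ASCII)


def find_mentions(caption: str) -> list:
    """Return the mentioned usernames found in a caption, lowercased."""
    return [name.lower() for name in MENTION_RE.findall(caption)]


mentions = find_mentions("Thanks @alice. Mail me at bob@example.com or ask @Carol_1")
```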
Adds the --latest-stamps command line option, pointing to a file where
the latest time each profile was scraped is stored. On the next run, only
posts newer than that time are downloaded.
Fixes #1122.
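
A rough sketch of how such a stamps file could work, assuming an INI-style file with one section per profile. The file layout, the "latest" key and the helper names load_stamp, save_stamp and newer_posts are assumptions made for illustration, not necessarily what Instaloader does:

```python
import configparser
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def load_stamp(path: str, profile: str) -> Optional[datetime]:
    """Read the stored timestamp for a profile, or None if there is none."""
    config = configparser.ConfigParser()
    config.read(path)  # a missing file is silently ignored
    raw = config.get(profile, "latest", fallback=None)
    return datetime.fromisoformat(raw) if raw else None


def save_stamp(path: str, profile: str, stamp: datetime) -> None:
    """Store the timestamp for a profile, keeping other profiles' entries."""
    config = configparser.ConfigParser()
    config.read(path)
    if not config.has_section(profile):
        config.add_section(profile)
    config.set(profile, "latest", stamp.isoformat())
    with open(path, "w") as stamps_file:
        config.write(stamps_file)


def newer_posts(post_times: Iterable[datetime],
                last_stamp: Optional[datetime]) -> List[datetime]:
    """Keep only posts newer than the stored timestamp (all if none stored)."""
    if last_stamp is None:
        return list(post_times)
    return [t for t in post_times if t > last_stamp]
```

On a first run no stamp exists, so every post is kept; later runs only keep posts newer than the stored time.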
- When a post contained sidecars, the filename used for the caption, JSON etc. was changed, so it no longer matched the original filename.
- Now, if the post contains sidecars, a separate local variable is used to build the filenames for the sidecar media.
This fetch was done for sidecar posts that contain a video when going
through a profile. The fetched information is already present with the
new profile query introduced in the last commit, making this full
metadata fetch query unnecessary. Instaloader now better evaluates if
that fetch must be done or not.
The fetch was also (possibly unnecessarily) made when accessing
get_sidecar_posts() on a Post that has been loaded with
load_structure_from_file().
- This is necessary because IG no longer returns the full structure for the old query_hash.
- The old query_hash is therefore replaced with the current one.