mirror of https://github.com/instaloader/instaloader.git synced 2024-11-23 10:42:30 +01:00
199 Commits

Author SHA1 Message Date
André Koch-Kramer
1928db63bb Added ability to download videos within sidecars
Closes #41
2017-08-25 13:50:58 +02:00
Alexander Graf
ca54088bdc Merge branch 'v3-dev' 2017-08-24 21:38:50 +02:00
André Koch-Kramer
1a26d7336c Fix TypeError on retry in get_json() 2017-08-24 21:17:49 +02:00
André Koch-Kramer
bb71c40b56 Wait smarter to avoid HTTP error code 429
Additional sleeps are necessary because Instagram is rate limiting
GraphQL queries. The error does not occur as long as no more than 100
queries are made in a sliding window of eleven minutes.
Ports a894c2d to version 3.
2017-08-24 18:30:46 +02:00
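The sliding-window limit described in bb71c40b56 could be tracked roughly as follows. This is a minimal sketch and not Instaloader's actual code: the class `SlidingWindowLimiter`, its methods, and the injectable `clock` parameter are hypothetical names chosen for illustration; only the figures of 100 queries and eleven minutes come from the commit message.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_queries within any window of window_seconds.

    Hypothetical helper for illustration; defaults follow the commit message.
    """

    def __init__(self, max_queries=100, window_seconds=11 * 60,
                 clock=time.monotonic):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self.clock = clock
        self.timestamps = deque()  # times of the queries still in the window

    def wait_time(self):
        """Return how many seconds to sleep before the next query is allowed."""
        now = self.clock()
        # Drop queries that have already left the sliding window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_queries:
            return 0.0
        # The window is full: wait until the oldest query drops out of it.
        return self.window_seconds - (now - self.timestamps[0])

    def record(self):
        """Remember that a query is being made now."""
        self.timestamps.append(self.clock())
```

A caller would sleep for `limiter.wait_time()` seconds and then call `limiter.record()` before each GraphQL query. Passing the clock as a parameter keeps the logic testable without real sleeps.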
Alexander Graf
2e47642f74 Remove --shorter-output from --help 2017-08-24 17:51:39 +02:00
Alexander Graf
9939eab8ec Minor fixes to README.rst 2017-08-24 17:23:46 +02:00
Alexander Graf
c61d6a93b2 Restructure README.rst 2017-08-24 17:10:32 +02:00
Alexander Graf
bbdd3873e2 --skip-videos -> --no-videos; no --shorter-output 2017-08-24 16:03:24 +02:00
Alexander Graf
43bfd6b5ab Put all our GitHub keywords in setup.py 2017-08-24 15:54:07 +02:00
Alexander Graf
dd99417e7b Require win_unicode_console on Windows Python 3.5 2017-08-24 15:40:41 +02:00
Alexander Graf
8cf1997460 Evaluate --only-if filter smarter
If --only-if='likes>1000 or viewer_has_liked' is given, it is not
necessary to evaluate viewer_has_liked if the post has more than 1000
likes. The new implementation smartly handles this case.
2017-08-23 15:38:16 +02:00
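One way to get the behaviour described in 8cf1997460 is to let Python's own short-circuiting of `or` and `and` decide which names are looked up at all. The sketch below is an illustration under assumptions, not the implementation from this commit: `LazyPostProperties` and `post_matches` are hypothetical names, and plain `eval()` is used only for brevity (it should not be fed untrusted input).

```python
from collections.abc import Mapping


class LazyPostProperties(Mapping):
    """Hypothetical stand-in for a post whose properties are costly to fetch."""

    def __init__(self, **values):
        self._values = values
        self.evaluated = []  # records which properties were actually looked up

    def __getitem__(self, name):
        value = self._values[name]  # raises KeyError for unknown names
        self.evaluated.append(name)
        return value

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


def post_matches(filter_str, properties):
    """Evaluate filter_str, resolving names lazily through properties."""
    # Names are looked up in `properties` only when the expression reaches
    # them, so `a or b` never touches b if a is already true.
    return bool(eval(filter_str, {"__builtins__": {}}, properties))
```

With `LazyPostProperties(likes=2000, viewer_has_liked=False)`, the filter `'likes>1000 or viewer_has_liked'` matches and only `likes` ends up in `evaluated`.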
Alexander Graf
ce38f5880f Call download_stories in _error_catcher
download_stories() may throw a BadResponseException, which should not
abort download_profile(). Now, all calls of
download_stories() are within an _error_catcher context.
2017-08-22 09:21:47 +02:00
Lars Lindqvist
79f143b1f8 Minor stories-related cleanup 2017-08-20 20:32:45 +02:00
Alexander Graf
ad34fc09b6 Wait longer after HTTP err 429 (Too Many Requests) 2017-08-20 11:48:19 +02:00
Alexander Graf
566ef02b94 Change sleep interval between requests
These are now adapted to how many requests have already been done. With
the current settings, Instaloader makes no more than

12 requests in the first ten seconds,
28 requests in the first minute,
40 requests in the first two minutes,
63 requests in the first five minutes,
90 requests in the first ten minutes,
and after that 50 requests per ten minutes.

This should make it less likely that Instaloader is rate-limited by
Instagram, while still being fast if downloading only a few posts.

Further, option --no-sleep is hidden in --help output and README.rst.
2017-08-20 11:31:46 +02:00
Alexander Graf
6300c217b3 Fixes to instaloader.Post metadata retrieval
Fixes a KeyError which occurred when fetching some information not
available in Post._node. Makes all properties not throw exceptions,
rather they return None if the information cannot be obtained.
sidecar_edges is now a method get_sidecar_edges(). get_comments() does
not do any additional requests if the post does not have any comment.

Fixes #39.
2017-08-20 10:37:58 +02:00
Alexander Graf
d967400cb4 Fix downloading hashtags with unicode characters
Non-latin characters in the referer string used in the HTTP headers are
now properly quoted.
2017-08-19 22:44:08 +02:00
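The quoting mentioned in d967400cb4 can be illustrated with the standard library's `urllib.parse.quote`, which percent-encodes the UTF-8 bytes of non-ASCII characters so that the header value stays plain ASCII. This is a sketch only: the function name `hashtag_referer` and the URL pattern are assumptions for illustration, not taken from the commit.

```python
from urllib.parse import quote


def hashtag_referer(hashtag):
    """Build an ASCII-only Referer value for a hashtag page.

    The URL pattern is an assumption made for this example.
    """
    # quote() percent-encodes everything outside its safe ASCII set.
    return "https://www.instagram.com/explore/tags/{}/".format(quote(hashtag))
```

For plain ASCII hashtags the result is unchanged, so the quoting can be applied unconditionally.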
Alexander Graf
9ee98a2925 Use lowercase hashtags and profile names
Since both hashtags and profile names are case insensitive, this might
be a useful normalization and could work around some user-induced bugs.
2017-08-19 18:26:42 +02:00
Alexander Graf
ee9993d7c2 Filter posts with --only-if=FILTER
where FILTER is a boolean expression in python syntax where all names
are evaluated to instaloader.Post properties.

Examples:

instaloader --login=USER --only-if='viewer_has_liked' :feed

instaloader --only-if='likes>1000 and comments>5' profile
2017-08-19 17:54:23 +02:00
Lars Lindqvist
09d2592635 Use anonymous session for HEAD request.
The default Instaloader headers aren't passed with a simple `requests.head()`, so leakage of user agents such as `"python-requests/2.18.1"` will occur.
2017-08-19 15:17:43 +02:00
Alexander Graf
0f64768dd8 Post class representing an Instagram Post
This simplifies accessing properties of a Post. Method download_post()
remains in class Instaloader rather than Post, as it fits there better.

Also, since it is now easily possible, all download_*() functions now
have a filter_func parameter. Its meaning has been inverted to be
consistent with how a filter is commonly understood: A post is downloaded
iff filter_func is None or evaluates to True.

Post.get_comments() foreports commit 86fb80d ("Avoid GraphQL queries if
all comments in metadata").
2017-08-19 13:02:49 +02:00
Lars Lindqvist
ccdac0305f Simplify profpic regex.
The original method of substituting 2048x2048 for whatever resolution was given seemed somewhat convoluted. This accomplishes the same thing, except raising an exception if the given url is not on the right domain.
2017-08-13 23:37:53 +02:00
André Koch-Kramer
d28bd3ab52 Release of version 2.2.3 2017-08-13 14:39:35 +02:00
Alexander Graf
4ce6826f82 Don't retry downloads with 404 status
Instead of retrying a download attempt answered with a 404, the download
is aborted after the first attempt. Thanks to the _error_catcher(), a
message is printed and Instaloader goes on with the next files to
download.

Further, this commit removes the unneeded NodeUnavailableException and
adjusts docstrings accordingly.
2017-08-13 12:57:31 +02:00
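The retry policy described in 4ce6826f82 can be sketched as a loop that retries other failures but gives up at once on a 404. All names here (`download_with_retries`, `NotFoundError`, `DownloadFailedError`, and the `fetch` callable returning a status code and body) are hypothetical and chosen for illustration; the actual code in Instaloader is structured differently.

```python
class NotFoundError(Exception):
    """Raised when the server answers 404; retrying would not help."""


class DownloadFailedError(Exception):
    """Raised when every attempt ended with a retryable error."""


def download_with_retries(fetch, max_attempts=3):
    """Call fetch() until it succeeds, aborting immediately on a 404.

    fetch is a hypothetical callable returning (status_code, body).
    """
    last_status = None
    for _ in range(max_attempts):
        last_status, body = fetch()
        if last_status == 200:
            return body
        if last_status == 404:
            # A missing file will stay missing, so stop after the first attempt.
            raise NotFoundError("404 Not Found")
        # Any other status is treated as transient and retried.
    raise DownloadFailedError(
        "giving up after {} attempts (last status {})".format(
            max_attempts, last_status))
```

A surrounding error-catching context can then log the `NotFoundError` and continue with the next file, which matches the behaviour the commit message describes.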
Alexander Graf
adeb588471 Fix KeyError when generating NodeUnavailable error
Fixes #32.
2017-08-13 12:51:39 +02:00
Alexander Graf
57329482f3 Evaluate status field in JSON response
In GraphQL queries, this field should be set to "ok". Checking this
addresses #31 (and maybe other issues as well) and simplifies error
handling.
2017-08-13 11:02:47 +02:00
André Koch-Kramer
a894c2d206 Wait to avoid error code 429 (Too Many Requests)
Additional sleeps are necessary because Instagram is rate limiting
GraphQL queries. The error does not occur as long as no more than 100
queries are made in a sliding window of eleven minutes.
Fixes #29
2017-08-12 19:57:54 +02:00
André Koch-Kramer
86fb80d7e4 Avoid GraphQL queries if all comments in metadata
If get_node_metadata() is able to provide all comments of a node, no
additional query is needed. GraphQL queries are especially expensive in
time because no more than 100 can be made in ten minutes. Since
get_node_metadata() does not use GraphQL queries, this is a useful
tradeoff.
+ Additional error handling
2017-08-12 19:04:22 +02:00
Alexander Graf
dcc48e37df Minor fix regarding setup.py
Now, setup.py does not assume to be called from the path where the
source tree resides. This fixes getting the long_description and the
version if setup.py is called from outside.
2017-08-11 20:34:47 +02:00
Alexander Graf
117124d1dd Options --no-captions and --no-geotags
These options instruct instaloader to not save captions or geotags
respectively, even if the relevant information can be obtained without
any additional queries to Instagram.

This feature was proposed in #25, and thus this commit should close #25.
2017-08-11 19:54:10 +02:00
Alexander Graf
0d9af81ae7 Minor enhancements
Get rid of NonfatalException (an exception is nonfatal iff it is
caught somewhere)

Foreport fixes for #26 and #30.

The current __version__ string is now kept in instaloader.py rather than
setup.py. This lets instaloader.__version__ always deliver the version
of the actually-loaded Instaloader module.

Minor changes to README.rst, error handling and which class methods are
public.

With these and the changes of the previous commit, we saved 31 lines of
code, indicating that it might be easier to understand and to maintain.
2017-08-11 18:20:58 +02:00
Alexander Graf
9eddc03cf1 Don't rely on fromtimestamp() raising ValueError
Instead of relying on that datetime.fromtimestamp() raises ValueError if
the timestamp of a story is in milliseconds rather than seconds, we now
compare the timestamp value with a timestamp of the year 2286 to decide
whether to divide it by 1000 or not.

This is motivated by #30, where an OSError is raised in
datetime.fromtimestamp() under Windows.
2017-08-09 18:27:05 +02:00
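The check described in 9eddc03cf1 can be sketched like this. The exact threshold used by the commit is not stated beyond "the year 2286", so the constant below (1 January 2286, UTC) is an assumption, as are the names `YEAR_2286` and `story_datetime`.

```python
from datetime import datetime, timezone

# Assumed threshold: any value above this cannot be a plausible timestamp
# in seconds, so it is taken to be in milliseconds.
YEAR_2286 = datetime(2286, 1, 1, tzinfo=timezone.utc).timestamp()


def story_datetime(timestamp):
    """Convert a story timestamp given in seconds or milliseconds."""
    if timestamp > YEAR_2286:
        timestamp = timestamp / 1000  # milliseconds to seconds
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)
```

Comparing against a fixed value avoids depending on which exception, if any, `datetime.fromtimestamp()` raises for out-of-range input on a given platform.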
André Koch-Kramer
42864997b3 get_node_metadata() now also catches TypeError
Concerns #26
2017-08-08 00:43:52 +02:00
Alexander Graf
58882f508e Major code cleanup
Remove many code duplications, merely by using more pythonic idioms.

Use GraphQL more often.

Better cope with errors: All requests can be retried; failed requests do
not cause program termination; all error strings are repeated to the
user at the end of execution.

download_post() (formerly download_node()) does not repeat node metadata
request (before this commit, this request was executed up to three
times).
2017-08-06 19:27:46 +02:00
Alexander Graf
5d83a4ccf6 Release of version 2.2.2 2017-08-06 10:49:36 +02:00
Alexander Graf
56c8ac729f Cleanup .gitignore 2017-08-05 13:43:26 +02:00
André Koch-Kramer
987d95c048 Forgot "file=" in some print statements 2017-08-05 00:22:43 +02:00
André Koch-Kramer
9b5d4e34fc Raise and catch NodeUnavailableException
In case a node cannot be downloaded or its metadata is needed and
cannot be retrieved, a NodeUnavailableException is raised and the
corresponding node will be skipped.
Concerns #26
2017-08-04 19:19:56 +02:00
Alexander Graf
838ea645a8 Let Travis automatically deploy releases to PyPi 2017-08-04 11:18:20 +02:00
Alexander Graf
01fc150e78 Release of version 2.2.1 2017-08-03 07:33:38 +02:00
Alexander Graf
46ac119a10 Fix AttributeError when not logging in
Mentioned in #26. Bug was introduced in 0ad50c1.
2017-08-03 07:31:58 +02:00
André Koch-Kramer
6b520f6a2d Release of version 2.2 2017-08-01 16:41:42 +02:00
André Koch-Kramer
c3a5557140 Additionally catch HTTPError and RequestException
Concerns issue #26.
2017-08-01 16:30:59 +02:00
André Koch-Kramer
0ad50c1526 Fix error on downloading own private profile
Closes #27.
2017-07-31 21:18:42 +02:00
André Koch-Kramer
9fbe9b0903 Retry get requests for downloading pictures
Tries to work around #26.
2017-07-31 20:34:27 +02:00
Alexander Graf
4b8b257672 Fix rst formatting in README 2017-07-29 21:51:51 +02:00
Alexander Graf
b1b90f8abf target :stories; flags --stories & --stories-only
This allows invoking the new download_stories() function contributed
in #28 from the command line.
2017-07-29 17:51:39 +02:00
Alexander Graf
66f69b5c21 Consistently use datetime type for handling dates 2017-07-29 11:08:52 +02:00
André Koch-Kramer
48d6f0226b Added Disclaimer to README 2017-07-29 05:24:48 +02:00
André Koch-Kramer
82ae31cea5 Fix download_stories() to get all available posts
download_stories() did and does not check if a story is "unseen". The
response to the query of the '/feed/reels_tray/' URL provides all
available stories of the user's followees. Nevertheless, some of them do
not contain an 'items' field which has no causal relationship with their
status in terms of "seen" or "unseen". Therefore, to overcome this lack
of 'items' the 'feed/user/TARGET_USERID/reel_media/' URL needs to be
queried for each relevant followee whose 'items' were not provided in
the first place.
2017-07-29 04:12:26 +02:00