mirror of https://github.com/instaloader/instaloader.git synced 2024-08-17 12:19:38 +02:00
Commit Graph

173 Commits

Author SHA1 Message Date
Lars Lindqvist
2dbd510486 Ignore device_timestamp for stories.
I've come across several implausible values for `device_timestamp` (such as 182428140, which would be in October 1975, assuming seconds). I guess this is due to improperly configured phones, or maybe some third-party software that mangles the EXIF data on images before posting. In any case, by the nature of stories, the `taken_at` timestamp (presumably when the Instagram servers received the post) ought to be approximately when an image was actually taken, so there is no real value in trying to use the timestamp provided by the photo-taking device.
2017-08-27 22:51:30 +02:00
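As a quick sanity check of the value quoted in the commit above, the following sketch converts it under both unit interpretations. Only the number itself comes from the commit message; the variable names are illustrative.

```python
from datetime import datetime, timezone

# Value quoted in the commit message above.
device_timestamp = 182428140

# Interpreted as seconds since the Unix epoch, the value lands in October 1975.
as_seconds = datetime.fromtimestamp(device_timestamp, tz=timezone.utc)

# Interpreted as milliseconds, it lands in the first days of January 1970.
as_millis = datetime.fromtimestamp(device_timestamp / 1000, tz=timezone.utc)

print(as_seconds.isoformat())
print(as_millis.isoformat())
```

Either reading is implausible for a story posted in 2017, which supports ignoring the field.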
Alexander Graf
5aff8273b0 Minor tweaks to documentation 2017-08-26 12:42:04 +02:00
André Koch-Kramer
1928db63bb Added ability to download videos within sidecars
Closes #41
2017-08-25 13:50:58 +02:00
André Koch-Kramer
1a26d7336c Fix TypeError on retry in get_json() 2017-08-24 21:17:49 +02:00
André Koch-Kramer
bb71c40b56 Wait smarter to avoid HTTP error code 429
Additional sleeps are necessary because Instagram is rate limiting
GraphQL queries. The error does not occur if no more than 100 queries
are made in a sliding window of eleven minutes.
Ports a894c2d to version 3.
2017-08-24 18:30:46 +02:00
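The sliding-window rule described in the commit above (at most 100 queries within any eleven minutes) can be sketched as follows. This is a minimal illustration, not Instaloader's actual code: the class name `SlidingWindowLimiter` and its methods are hypothetical, and the clock and sleep functions are injectable only so the logic can be exercised without real waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_queries` calls within any `window_seconds` span."""

    def __init__(self, max_queries=100, window_seconds=11 * 60,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_queries
        self._window = window_seconds
        self._clock = clock      # injectable, so tests need not wait
        self._sleep = sleep
        self._stamps = deque()   # times of the queries still inside the window

    def _purge(self, now):
        # Forget queries that have left the sliding window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()

    def wait_time(self):
        """Seconds to wait before the next query may be sent (0.0 if none)."""
        now = self._clock()
        self._purge(now)
        if len(self._stamps) < self._max:
            return 0.0
        return self._window - (now - self._stamps[0])

    def acquire(self):
        """Block until a query is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            self._sleep(delay)
        now = self._clock()
        self._purge(now)
        self._stamps.append(now)
```

A caller would invoke `acquire()` right before each GraphQL query; the limits in the defaults are the ones stated in the commit message.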
Alexander Graf
2e47642f74 Remove --shorter-output from --help 2017-08-24 17:51:39 +02:00
Alexander Graf
bbdd3873e2 --skip-videos -> --no-videos; no --shorter-output 2017-08-24 16:03:24 +02:00
Alexander Graf
8cf1997460 Evaluate --only-if filter smarter
If --only-if='likes>1000 or viewer_has_liked' is given, it is not
necessary to evaluate viewer_has_liked if the post has more than 1000
likes. The new implementation smartly handles this case.
2017-08-23 15:38:16 +02:00
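One way to get the behavior described in the commit above is to resolve names lazily and let Python's own short-circuiting of `or` and `and` skip the lookups that are not needed. The sketch below is an assumption about how this could be done, not a description of Instaloader's implementation; `LazyPost` and `matches` are hypothetical names, and `eval` is only acceptable here because the expression comes from the user's own command line.

```python
class LazyPost:
    """Hypothetical stand-in for a post whose properties are costly to fetch."""

    def __init__(self, loaders):
        self._loaders = loaders   # property name -> zero-argument callable
        self.fetched = []         # records which properties were actually needed

    def __getitem__(self, name):
        if name not in self._loaders:
            raise KeyError(name)
        self.fetched.append(name)
        return self._loaders[name]()


def matches(filter_expr, post):
    # Names in the expression are looked up through post.__getitem__,
    # and only when evaluation actually reaches them.
    return bool(eval(filter_expr, {"__builtins__": {}}, post))


post = LazyPost({"likes": lambda: 1500, "viewer_has_liked": lambda: False})
result = matches("likes>1000 or viewer_has_liked", post)
print(result, post.fetched)
```

Because `likes>1000` is already true, `viewer_has_liked` is never looked up.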
Alexander Graf
ce38f5880f Call download_stories in _error_catcher
download_stories() may throw a BadResponseException, which should not
cause abortion of download_profile(). Now, all calls of
download_stories() are within an _error_catcher context.
2017-08-22 09:21:47 +02:00
Lars Lindqvist
79f143b1f8 Minor stories-related cleanup 2017-08-20 20:32:45 +02:00
Alexander Graf
ad34fc09b6 Wait longer after HTTP err 429 (Too Many Requests) 2017-08-20 11:48:19 +02:00
Alexander Graf
566ef02b94 Change sleep interval between requests
These are now adapted to how many requests have already been done. With
the current settings, Instaloader makes no more than

12 requests in the first ten seconds,
28 requests in the first minute,
40 requests in the first two minutes,
63 requests in the first five minutes,
90 requests in the first ten minutes,
and after that 50 requests per ten minutes.

This should make it less likely that Instaloader is rate-limited by
Instagram, while still being fast if downloading only a few posts.

Further, option --no-sleep is hidden in --help output and README.rst.
2017-08-20 11:31:46 +02:00
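The request budget listed in the commit above can be written down as a small lookup. This is only one reading of those numbers: the function name `max_requests_by` is hypothetical, and the linear growth after the first ten minutes is an assumption about what "50 requests per ten minutes" means.

```python
# (end of window in seconds, cumulative request cap), as listed in the commit message.
INITIAL_CAPS = [(10, 12), (60, 28), (120, 40), (300, 63), (600, 90)]
STEADY_REQUESTS = 50   # requests allowed per steady-state window
STEADY_WINDOW = 600    # steady-state window length in seconds


def max_requests_by(elapsed_seconds):
    """Upper bound on requests made within the first `elapsed_seconds`."""
    for window_end, cap in INITIAL_CAPS:
        if elapsed_seconds <= window_end:
            return cap
    # Assumption: beyond ten minutes the budget grows linearly
    # at 50 requests per further ten minutes.
    extra = (elapsed_seconds - 600) * STEADY_REQUESTS / STEADY_WINDOW
    return 90 + int(extra)


for seconds in (10, 60, 120, 300, 600, 1200):
    print(seconds, max_requests_by(seconds))
```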
Alexander Graf
6300c217b3 Fixes to instaloader.Post metadata retrieval
Fixes a KeyError which occurred when fetching some information not
available in Post._node. Makes all properties not throw exceptions,
rather they return None if the information cannot be obtained.
sidecar_edges is now a method get_sidecar_edges(). get_comments() does
not do any additional requests if the post does not have any comment.

Fixes #39.
2017-08-20 10:37:58 +02:00
Alexander Graf
d967400cb4 Fix downloading hashtags with unicode characters
Non-latin characters in the referer string used in the HTTP headers are
now properly quoted.
2017-08-19 22:44:08 +02:00
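Quoting non-Latin characters for use in an HTTP header can be done with the standard library. The sketch below illustrates the idea from the commit above; the function name `referer_for_hashtag` and the URL layout are assumptions made for the example.

```python
from urllib.parse import quote


def referer_for_hashtag(hashtag):
    # Assumed URL layout, for illustration only.
    # quote() percent-encodes the UTF-8 bytes of every non-ASCII character.
    return "https://www.instagram.com/explore/tags/" + quote(hashtag) + "/"


referer = referer_for_hashtag("münchen")
print(referer)
```

The resulting string is pure ASCII, so it is safe to place in a header value.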
Alexander Graf
9ee98a2925 Use lowercase hashtags and profile names
Since both hashtags and profile names are case insensitive, this might
be a useful normalization and could work around some user-induced bugs.
2017-08-19 18:26:42 +02:00
Alexander Graf
ee9993d7c2 Filter posts with --only-if=FILTER
where FILTER is a boolean expression in python syntax where all names
are evaluated to instaloader.Post properties.

Examples:

instaloader --login=USER --only-if='viewer_has_liked' :feed

instaloader --only-if='likes>1000 and comments>5' profile
2017-08-19 17:54:23 +02:00
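A filter of the kind described in the commit above, a Python expression whose names refer to post properties, can be compiled by rewriting every bare name into an attribute access. The following is a minimal sketch under that assumption; the `Post` class here is a hypothetical stand-in with three attributes, and `make_filter` is an illustrative name.

```python
import ast


class Post:
    """Hypothetical stand-in exposing a few post properties."""

    def __init__(self, likes, comments, viewer_has_liked):
        self.likes = likes
        self.comments = comments
        self.viewer_has_liked = viewer_has_liked


class _NamesToPostAttributes(ast.NodeTransformer):
    def visit_Name(self, node):
        # Rewrite a bare name such as `likes` into `post.likes`.
        attribute = ast.Attribute(
            value=ast.Name(id="post", ctx=ast.Load()),
            attr=node.id,
            ctx=ast.Load(),
        )
        return ast.copy_location(attribute, node)


def make_filter(filter_expr):
    """Compile a filter expression into a function taking a post."""
    tree = ast.parse(filter_expr, mode="eval")
    tree = ast.fix_missing_locations(_NamesToPostAttributes().visit(tree))
    code = compile(tree, "<only-if>", "eval")
    return lambda post: bool(eval(code, {"__builtins__": {}}, {"post": post}))


popular = make_filter("likes>1000 and comments>5")
print(popular(Post(likes=2000, comments=10, viewer_has_liked=False)))
```

Compiling once and reusing the resulting function avoids re-parsing the expression for every post.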
Lars Lindqvist
09d2592635 Use anonymous session for HEAD request.
The default Instaloader headers aren't passed with a simple `requests.head()`, so leakage of user agents such as `"python-requests/2.18.1"` will occur.
2017-08-19 15:17:43 +02:00
Alexander Graf
0f64768dd8 Post class representing an Instagram Post
This simplifies accessing properties of a Post. Method download_post()
remains in class Instaloader rather than Post, as it fits there better.

Also, since it is now easily possible, all download_*() functions now
have a filter_func parameter. Its meaning has been inverted to be
consistent with how a filter is commonly understood: A post is downloaded
iff filter_func is None or evaluates to True.

Post.get_comments() foreports commit 86fb80d ("Avoid GraphQL queries if
all comments in metadata").
2017-08-19 13:02:49 +02:00
Lars Lindqvist
ccdac0305f Simplify profpic regex.
The original method of substituting 2048x2048 for whatever resolution was given seemed somewhat convoluted. This accomplishes the same thing, except that it raises an exception if the given URL is not on the right domain.
2017-08-13 23:37:53 +02:00
Alexander Graf
4ce6826f82 Don't retry downloads with 404 status
Instead of retrying a download attempt answered with a 404, the download
is aborted after the first attempt. Thanks to the _error_catcher(), a
message is printed and Instaloader goes on with the next files to
download.

Further, this commit removes the unneeded NodeUnavailableException and
adjusts docstrings accordingly.
2017-08-13 12:57:31 +02:00
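The retry policy described in the commit above can be sketched like this: transient failures are retried, while a 404 ends the attempt immediately. The names `download_with_retries` and `NotFoundError` are hypothetical, and `fetch` is an assumed callable returning a `(status_code, body)` pair rather than a real HTTP client.

```python
class NotFoundError(Exception):
    """Raised when the server answers 404; retrying would not help."""


def download_with_retries(fetch, url, max_attempts=3):
    for _ in range(max_attempts):
        status, body = fetch(url)
        if status == 200:
            return body
        if status == 404:
            # The file does not exist, so abort after the first attempt.
            raise NotFoundError("404 when accessing {}".format(url))
        # Any other status is treated as transient and retried.
    raise ConnectionError(
        "giving up on {} after {} attempts".format(url, max_attempts))
```

A caller that wraps each download in an error-collecting context can then report the failure and continue with the next file.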
Alexander Graf
57329482f3 Evaluate status field in JSON response
In GraphQL queries, this field should be set to "ok". Checking this
addresses #31 (and maybe other issues as well) and simplifies error
handling.
2017-08-13 11:02:47 +02:00
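Checking the status field described in the commit above amounts to one comparison after decoding the response. The sketch below assumes the response body is a JSON object; `parse_graphql_response` is a hypothetical helper name, and `ValueError` is an arbitrary choice of exception for the example.

```python
import json


def parse_graphql_response(text):
    """Decode a JSON response and insist on status == "ok"."""
    data = json.loads(text)
    status = data.get("status") if isinstance(data, dict) else None
    if status != "ok":
        raise ValueError("unexpected status in response: {!r}".format(status))
    return data


print(parse_graphql_response('{"status": "ok", "data": {}}'))
```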
Alexander Graf
dcc48e37df Minor fix regarding setup.py
Now, setup.py does not assume to be called from the path where the
source tree resides. This fixes getting the long_description and the
version if setup.py is called from outside.
2017-08-11 20:34:47 +02:00
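The usual way to make a script independent of the current working directory, as the commit above describes for setup.py, is to resolve files relative to the script's own location. A minimal sketch, with the hypothetical helper name `resolve_next_to`:

```python
import os


def resolve_next_to(script_path, filename):
    """Return the path of `filename` located in the same directory as the script."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, filename)


print(resolve_next_to(os.path.join("some", "tree", "setup.py"), "README.rst"))
```

Inside a real setup.py one would pass `__file__` as `script_path`.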
Alexander Graf
117124d1dd Options --no-captions and --no-geotags
These options instruct instaloader to not save captions or geotags
respectively, even if the corresponding information can be obtained without
any additional queries to Instagram.

This feature was proposed in #25, and thus this commit should close #25.
2017-08-11 19:54:10 +02:00
Alexander Graf
0d9af81ae7 Minor enhancements
Get rid of NonfatalException (an exception is nonfatal iff it is
caught somewhere)

Foreport fixes for #26 and #30.

The current __version__ string is now kept in instaloader.py rather than
setup.py. This lets instaloader.__version__ always deliver the version
of the actually-loaded Instaloader module.

Minor changes to README.rst, error handling and which class methods are
public.

With these and the changes of the previous commit, we saved 31 lines of
code, indicating that it might be easier to understand and to maintain.
2017-08-11 18:20:58 +02:00
Alexander Graf
58882f508e Major code cleanup
Remove many code duplications, merely by using more pythonic idioms.

Use GraphQL more often.

Better cope with errors: All requests can be retried; failed requests do
not cause program termination; all error strings are repeated to the
user at the end of execution.

download_post() (formerly download_node()) does not repeat node metadata
request (before this commit, this request was executed up to three
times).
2017-08-06 19:27:46 +02:00
André Koch-Kramer
987d95c048 Forgot "file=" in some print statements 2017-08-05 00:22:43 +02:00
André Koch-Kramer
9b5d4e34fc Raise and catch NodeUnavailableException
In case a node cannot be downloaded, or its metadata is needed and cannot
be retrieved, a NodeUnavailableException is raised and the corresponding
node is skipped.
Concerns #26
2017-08-04 19:19:56 +02:00
Alexander Graf
46ac119a10 Fix AttributeError when not logging in
Mentioned in #26. Bug was introduced in 0ad50c1.
2017-08-03 07:31:58 +02:00
André Koch-Kramer
c3a5557140 Additionally catch HTTPError and RequestException
Concerns issue #26.
2017-08-01 16:30:59 +02:00
André Koch-Kramer
0ad50c1526 Fix error on downloading own private profile
Closes #27.
2017-07-31 21:18:42 +02:00
André Koch-Kramer
9fbe9b0903 Retry get requests for downloading pictures
Tries to work around #26.
2017-07-31 20:34:27 +02:00
Alexander Graf
b1b90f8abf target :stories; flags --stories & --stories-only
This allows invoking the new download_stories() function contributed
in #28 from the command line.
2017-07-29 17:51:39 +02:00
Alexander Graf
66f69b5c21 Consistently use datetime type for handling dates 2017-07-29 11:08:52 +02:00
André Koch-Kramer
82ae31cea5 Fix download_stories() to get all available posts
download_stories() did and does not check if a story is "unseen". The
response to the query of the '/feed/reels_tray/' URL provides all
available stories of the user's followees. However, some of them lack
an 'items' field, and this is unrelated to whether they are "seen" or
"unseen". Therefore, the 'feed/user/TARGET_USERID/reel_media/' URL needs
to be queried for each relevant followee whose 'items' were not provided
in the first response.
2017-07-29 04:12:26 +02:00
André Koch-Kramer
3e0b81ad56 copy_session() now also copies session.headers
Store a shallow copy of the headers rather than just create bindings
between the headers of the original and the newly created session object.
2017-07-29 01:54:42 +02:00
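The difference between binding to the original headers and storing a shallow copy, as mentioned in the commit above, can be seen with plain dictionaries. The dictionaries here are stand-ins for session headers, and the header values are made up for the example.

```python
original_headers = {"User-Agent": "example-agent", "Referer": "https://www.instagram.com/"}

bound = original_headers           # second name for the same dictionary
copied = original_headers.copy()   # independent shallow copy

# A later change to the original session's headers...
original_headers["Referer"] = "https://www.instagram.com/changed/"

# ...shows up through the binding but not in the copy.
print(bound["Referer"])
print(copied["Referer"])
```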
André Koch-Kramer
99620ec766 Allow subdirectories in filename pattern
Changed invocations of os.makedirs() in order to respect directories in
the filename pattern.
2017-07-29 01:40:53 +02:00
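Respecting directories inside the filename pattern, as the commit above describes, means creating the directory of the final joined path rather than only the base directory. A minimal sketch with the hypothetical helper `prepare_target`:

```python
import os
import tempfile


def prepare_target(dirname, filename):
    """Create every directory needed for dirname/filename and return the path."""
    path = os.path.join(dirname, filename)
    # The filename part may itself contain subdirectories, so create
    # the directory of the joined path, not just `dirname`.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as base:
    target = prepare_target(os.path.join(base, "profile"),
                            os.path.join("2017", "07", "post"))
    created = os.path.isdir(os.path.join(base, "profile", "2017", "07"))
    print(created)
```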
André Koch-Kramer
13d288ef36 Minor fixes of download_stories()
Concerning pull request #28.
2017-07-28 19:49:48 +02:00
Lars Lindqvist
b85e8fd793 Get stories per user 2017-07-28 18:36:15 +02:00
André Koch-Kramer
1699954761 Disable some trolling pylint features 2017-07-27 22:26:33 +02:00
André Koch-Kramer
7cac6d53f2 Minor code adaptions for consistency reasons
Concerning pull request #28.
2017-07-27 22:18:43 +02:00
Lars Lindqvist
c355338010 Stories POC 2017-07-27 16:59:21 +02:00
André Koch-Kramer
c299e9d1a2 Updated README + {date} default format added 2017-07-26 19:13:56 +02:00
Alexander Graf
151ccfd71d Enhance behavior of waiting between requests
Unless --no-sleep is given, Instaloader waits between requests to the
Instagram servers.

This commit fixes and enhances this behavior. Now, --no-sleep is always
obeyed. Between requests to the instagram.com servers, there is now a
delay of 250 ms ~ 2000 ms. Requests to the file servers do not cause a
delay.
2017-07-26 15:10:12 +02:00
Alexander Graf
56b27fb26f Fix download_feed_pics()
Now we also use GraphQL queries for retrieving the user's feed.
2017-07-25 21:01:48 +02:00
Alexander Graf
8572e527ec Options --dirname-pattern and --filename-pattern
Instaloader downloads all posts in

  <DIRNAME>/<FILENAME>+(suffix and extension)

which are now generated by the templates given with --dirname-pattern
and --filename-pattern. These templates may contain specifiers such as
'{target}', '{profile}', '{date}' and '{shortcode}'.

Default for --dirname-pattern is '{target}', default for
--filename-pattern is '{date:%Y-%m-%d_%H-%M-%S}'

The former options --no-profile-subdir and --hashtag-username were
removed, because their behavior can now be achieved like this:

--no-profile-subdir and --hashtag-username:
--dirname-pattern='.' --filename-pattern='{profile}__{date:%Y-%m-%d_%H-%M-%S}'

--no-profile-subdir, but not --hashtag-username:
--dirname-pattern='.' --filename-pattern='{target}__{date:%Y-%m-%d_%H-%M-%S}'

--hashtag-username but not --no-profile-subdir:
--dirname-pattern='{profile}'

This adds the option proposed in #23, to encode both the hashtag and the
profile name in the file's path when downloading by hashtag, e.g.:
--dirname-pattern='{target}' --filename-pattern='{profile}_{date:%Y-%m-%d_%H-%M-%S}'

(Closes #23)
2017-07-25 18:45:01 +02:00
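The templates in the commit above follow Python's `str.format` syntax, where a `datetime` value accepts a strftime-style format after the colon. The sketch below shows how such patterns expand; `build_path` is a hypothetical helper and the field values are made up.

```python
import os
from datetime import datetime


def build_path(dirname_pattern, filename_pattern, **fields):
    """Expand both patterns with the given fields and join the results."""
    dirname = dirname_pattern.format(**fields)
    filename = filename_pattern.format(**fields)
    return os.path.join(dirname, filename)


example = build_path(
    "{target}",
    "{profile}_{date:%Y-%m-%d_%H-%M-%S}",
    target="#hashtag",
    profile="someuser",
    date=datetime(2017, 7, 25, 18, 45, 1),
    shortcode="ABC123",   # available to patterns, unused by these two
)
print(example)
```

The suffix and file extension mentioned in the commit would be appended to the resulting path.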
Alexander Graf
5068c9453e graphql_query() method for GraphQL queries 2017-07-24 12:08:08 +02:00
André Koch-Kramer
051a6fa9d0 args.comments 2017-07-21 15:32:41 +02:00
André Koch-Kramer
169ce1a300 Download comments
Close #5
2017-07-20 22:36:30 +02:00
Alexander Graf
ee8e159d56 Fix README regarding when profiles are found by ID 2017-07-20 18:19:15 +02:00
Alexander Graf
b3f916b371 Fix pathname capitalization inconsistency issues 2017-07-20 18:08:16 +02:00
André Koch-Kramer
1fdce16f46 Fix get_followees() and implement get_followers() 2017-07-20 18:01:29 +02:00
Alexander Graf
dd513f7190 Let anonymous loader inherit all options 2017-07-20 15:24:57 +02:00
Alexander Graf
7198f1ad9f Restructure --help and Options section in README 2017-07-20 14:54:22 +02:00
Alexander Graf
58c12d5618 Allow changing HTTP User Agent string 2017-07-20 11:25:46 +02:00
Alexander Graf
1e10ab8669 Revert "Replaced usages of shortcode with mediaid"
This reverts commit 715582138b.

It broke downloading sidecars and did not introduce any advantageous
behavior.
2017-07-14 15:29:09 +02:00
Alexander Graf
c0eecd1bd2 Usability fixes and improvements
On module level:

Cleaner exception handling for load_session_from_file

interactive_login logs in interactively now, always asking the user for
password. Before, it had an optional password parameter determining
whether it was interactive or not.

On application level:

Warn if profile specifiers are used which require login, but no --login
flag was given (@profile, :feed-all, :feed-liked).

Clearly warn that --password is insecure.
2017-07-14 11:04:32 +02:00
André Koch-Kramer
8607135740 Satisfy pylint 2017-07-14 05:37:36 +02:00
André Koch-Kramer
ca2829becc Fixed and reimplemented get_username_by_id() 2017-07-14 05:18:18 +02:00
André Koch-Kramer
715582138b Replaced usages of shortcode with mediaid 2017-07-13 22:33:01 +02:00
André Koch-Kramer
184c521646 Added functions to convert mediaid <-> shortcode 2017-07-06 22:26:25 +02:00
André Koch-Kramer
5ae3d7090f Minor bug fixes
- Adjust comment in test()
- Added exception handling when loading a sessionfile
- Corrected control flow in interactive_login()
2017-06-30 15:45:38 +02:00
Alexander Graf
4768fdbd10 --hashtag-username to store by-username
With --hashtag-username given, if downloading per #hashtag, instead of
per username, for each picture an additional request to the Instagram
server is issued to lookup the picture's username. Instead of storing
files in #hashtag/timestamp.jpg, files are stored in
username/timestamp.jpg as it is the default when not downloading per
hashtag.

This closes #22.
2017-06-27 09:19:29 +02:00
Alexander Graf
591dfd31e4 --no-profile-subdir to encode profile in filename
Fixes #22.
2017-06-25 14:55:44 +02:00
Alexander Graf
caf75a8135 Refactor Instaloader's methods into a class 2017-06-24 22:43:40 +02:00
Alexander Graf
e0924e8d08 Fix --geotags 2017-04-22 17:54:21 +02:00
Alexander Graf
655dbb552d Further clarify meaning of --count 2017-04-22 17:34:49 +02:00
André Koch-Kramer
fdb8e94c64 Fixed ":feed-liked" functionality 2017-04-22 17:26:48 +02:00
Alexander Graf
b3c83f420c --count to limit posts at #hashtag and :feed-* 2017-04-22 17:21:02 +02:00
André Koch-Kramer
2106c2d5f6 Satisfy pylint after their update 2017-04-22 11:13:10 +02:00
André Koch-Kramer
0e943189e5 Parse graphql structure for sidecars 2017-04-22 10:50:12 +02:00
André Koch-Kramer
8e77a1c125 Fixed download_node() 2017-04-22 01:39:52 +02:00
André Koch-Kramer
361445519a Adapt new graphql layout used by Instagram
Fixes #19.
2017-04-21 18:10:41 +02:00
Alexander Graf
6b345e1f52 Adapt video downloading to new format
This should fix the video downloading issue reported at #18.
2017-04-20 09:17:59 +02:00
Alexander Graf
3e1360160d Download pictures with #hashtag
Instaloader is now capable of downloading all pictures associated with
one #hashtag with:
instaloader #hashtag

This implements the feature requested with #18.
2017-04-17 12:16:22 +02:00
Alexander Graf
a7d1c5bbb0 Make check_id() exceptions Non-fatal
The check_id() step, including the get_username_by_id(), which is used
to determine whether a profile's account name has changed since the last
download using its unique ID when Instaloader is operating logged-in,
is actually optional and should not cause termination in any case.
2017-04-10 21:05:58 +02:00
Alexander Graf
dc748a0541 Download all pictures of Sidecar nodes 2017-03-25 21:08:54 +01:00
Alexander Graf
72c647829a Don't fail if --sessionfile does not contain '/' 2017-03-21 14:58:13 +01:00
Alexander Graf
d246268630 Retry download anonymously if profile does not exist
In case you are blocked by a public profile which you intend to
download, the server responds as if the profile did not exist. Now in
this case, we retry the download using an anonymous session.
2017-03-19 12:52:07 +01:00
Alexander Graf
23a0e32e8e Clarify --login is required for downloading followees 2017-03-19 12:51:20 +01:00
Alexander Graf
00f6f47fa9 fix get_id_by_username() 2017-02-13 10:20:45 +01:00
Alexander Graf
be477e8a88 Fix very minor packaging issues
- State in README.rst and setup.py metainfo that we require Python>=3.5

- Let Travis-CI test against newer versions of Python

- Let instaloader --help show where to report issues
2017-02-13 09:57:03 +01:00
Alexander Graf
02509d3c40 Fix downloading (set max_id only if not zero)
This should fix #17.
2017-02-13 09:50:20 +01:00
Alexander Graf
84c2a823c4 fix typing 2016-12-22 16:05:25 +01:00
Alexander Graf
86f8b2f018 Annotate all types 2016-12-22 13:20:41 +01:00
André Koch-Kramer
98c2847afd Implemented feature: store geotags/locations 2016-09-22 18:28:13 +02:00
Alexander Graf
1d506b5f95 Minor documentation improvements 2016-09-19 19:26:59 +02:00
Alexander Graf
508c629d2b Equalify summary in *.py and README.md 2016-09-18 16:41:43 +02:00
Alexander Graf
1ff6dd9d30 Mini refactoring and docstrings
Closes #12.
2016-09-18 16:35:25 +02:00
Alexander Graf
d5c13b1295 Globally disable pylint too-many-arguments warning 2016-09-18 15:43:24 +02:00
Alexander Graf
3ac8ffbc84 Reduce code duplication introducing download_node 2016-09-18 15:41:12 +02:00
Alexander Graf
c2957e389f Have setuptools setup.py for serious distribution
This is a) cooler and b) a requirement for deploying it on PyPI.

It removes need of __all__ shit (which is hard to keep updated), and
allows installing instaloader easily as a global module and executable.
Additionally it removes __init__.py.
2016-09-18 14:43:12 +02:00
Alexander Graf
70c91e000e Targets :feed-all and :feed-liked to load feed
Closes #14.
2016-09-17 20:53:03 +02:00
Alexander Graf
9cd93c9414 Have function providing access to user's feed
Closes #13.
2016-09-16 23:24:28 +02:00
Alexander Graf
5dc9be47cb Make instaloader usable as package
This commit allows doing `import instaloader` when instaloader is
located in a subdirectory "instaloader". This makes it easier to use
instaloader e.g. when it is imported using git submodules feature.
2016-08-18 10:04:54 +02:00
André Koch-Kramer
0088ee5e9e Added handling of UnicodeEncodeError
- Only try to print captions if possible
- Added option '--shorter-output' to disable output of captions
2016-08-04 19:36:36 +02:00
Alexander Graf
0678a8118a Properly escape \ in regex string 2016-08-03 20:29:36 +02:00
André Koch-Kramer
b71179371d Do not sleep when --no-sleep is given 2016-08-03 20:25:16 +02:00
Alexander Graf
05104b7438 Have better error handling when working on files
try ... except FileNotFoundError is better than os.path.isfile.
2016-08-03 13:51:25 +02:00
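The point of the commit above is that checking with os.path.isfile and then opening leaves a gap in which the file can disappear, whereas simply trying to open it and handling the failure does not. A minimal sketch, with a hypothetical helper `read_id_file` and a made-up file name and content:

```python
import os
import tempfile


def read_id_file(path):
    """Return the stored ID, or None if the file does not exist."""
    try:
        with open(path, "r") as id_file:
            return id_file.read().strip()
    except FileNotFoundError:
        # No separate existence check is needed, and there is no window
        # between checking and opening in which the file could vanish.
        return None


with tempfile.TemporaryDirectory() as base:
    id_path = os.path.join(base, "id")
    before = read_id_file(id_path)
    with open(id_path, "w") as id_file:
        id_file.write("12345\n")
    after = read_id_file(id_path)
    print(before, after)
```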
Alexander Graf
ce8bdb18e0 Have newline in id files
This is better. A line in a textfile must terminate with a \n character.
2016-08-03 13:50:47 +02:00
André Koch-Kramer
77d0d272fc Implementation of get_id_by_username()
+ updated README.md
2016-08-02 21:27:39 +02:00