# yt-dlp/yt_dlp/extractor/worldstarhiphop.py

from .common import InfoExtractor


class WorldStarHipHopIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www|m)\.worldstar(?:candy|hiphop)\.com/(?:videos|android)/video\.php\?.*?\bv=(?P<id>[^&]+)'
    _TESTS = [{
        'url': 'http://www.worldstarhiphop.com/videos/video.php?v=wshh6a7q1ny0G34ZwuIO',
        'md5': '9d04de741161603bf7071bbf4e883186',
        'info_dict': {
            'id': 'wshh6a7q1ny0G34ZwuIO',
            'ext': 'mp4',
            'title': 'KO Of The Week: MMA Fighter Gets Knocked Out By Swift Head Kick!',
        },
    }, {
        'url': 'http://m.worldstarhiphop.com/android/video.php?v=wshh6a7q1ny0G34ZwuIO',
        'only_matching': True,
    }]

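    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)

        # Collect media from any HTML5 <video>/<audio> tags on the page.
        entries = self._parse_html5_media_entries(url, webpage, video_id)

        # No HTML5 media found: hand the URL over to the generic extractor.
        if not entries:
            return self.url_result(url, 'Generic')

        # The patterns are tried in order; the first one that matches wins.
        title = self._html_search_regex(
            [r'(?s)<div class="content-heading">\s*<h1>(.*?)</h1>',
             r'<span[^>]+class="tc-sp-pinned-title">(.*)</span>'],
            webpage, 'title')

        # Use the first HTML5 entry as the result and fill in id and title.
        info = entries[0]
        info.update({
            'id': video_id,
            'title': title,
        })
        return info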
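

# Illustrative sketch only, not part of the upstream extractor: it shows which
# URLs the _VALID_URL pattern above accepts and what ends up in the `id` group.
# The names _DEMO_VALID_URL and _demo_match_id are hypothetical helpers, and
# the pattern is copied here only so that the sketch runs on its own with the
# standard library.
import re

_DEMO_VALID_URL = (
    r'https?://(?:www|m)\.worldstar(?:candy|hiphop)\.com/'
    r'(?:videos|android)/video\.php\?.*?\bv=(?P<id>[^&]+)'
)


def _demo_match_id(url):
    """Return the value of the `v` query parameter, or None if url does not match."""
    mobj = re.match(_DEMO_VALID_URL, url)
    return mobj.group('id') if mobj else None


# Usage, assuming the helper above:
#   _demo_match_id('http://m.worldstarhiphop.com/android/video.php?v=wshh6a7q1ny0G34ZwuIO')
# The `v` parameter does not have to come first in the query string, because
# `.*?\bv=` lazily skips any parameters that precede it.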


# Commit history for this file (from the blame view, oldest first):
#   2013-06-23 22:04:08 +02:00  Move WorldStarHipHop into its own file
#   2013-06-25 03:58:49 +02:00  added Youtube embed detection to WorldstarIE
#   2013-06-27 18:28:45 +02:00  Allow moving tests into IE files
#       Allow adding download tests right in the IE file. This will cut down
#       on merge conflicts and make it more likely that new IE authors will
#       add tests right away.
#   2014-03-23 13:49:15 +01:00  [worldstarhiphop] Modernize
#   2014-09-29 05:02:58 +02:00  [worldstarhiphop] Correct title extraction
#   2015-05-14 12:00:57 +02:00  [worldstarhiphop] Support Android URLs (fixes #5629)
#   2016-02-14 10:37:17 +01:00  [refactor] Single quotes consistency
#   2017-04-08 11:01:56 +02:00  [wshh] Extract html5 entries and delegate to generic extractor (closes #12676)