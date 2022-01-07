TikTok Scraper & Downloader

Scrape and download useful information from TikTok.

No login or password is required.

This is not an official API. It is a scraper that uses the TikTok Web API to collect media and related metadata.

As of right now it is NOT possible to download video without the watermark

Features

Download unlimited post metadata from the User, Hashtag, Trends, or Music-Id pages

Save post metadata to the JSON/CSV files

Download media with and without the watermark and save to the ZIP file

Download single video without the watermark from the CLI

Sign URL to make custom request to the TikTok API

Extract metadata from the User, Hashtag and Single Video pages

Save progress and download only videos that were not downloaded before. This feature works only from the CLI and only when the download flag is on.

View and manage previously downloaded posts history in the CLI

Scrape and download user, hashtag, music feeds and single videos specified in the file in batch mode

CLI: save progress to avoid downloading same videos

Rewrite everything in TypeScript

Improve proxy support

Add tests

Download video without the watermark

Indicate in the output file (CSV/JSON) whether the video was downloaded

Build and run from Docker

CLI: Scrape and download in batch

CLI: Load proxies from a file

CLI: Optional ZIP

Renew API

Set WebHook URL (CLI)

Add new method to collect music metadata

Add Manual Pagination

Improve documentation

Download audio files
Web interface

Don't forget about tests

yarn test

yarn build

tiktok-scraper requires Node.js v10+ to run.

Install from NPM

npm i -g tiktok-scraper

Install from YARN

yarn global add tiktok-scraper

In Terminal

$ tiktok-scraper --help

Usage: tiktok-scraper <command> [options]

Commands:
  tiktok-scraper user [id]                 Scrape videos from username. Enter only username
  tiktok-scraper hashtag [id]              Scrape videos from hashtag. Enter hashtag without #
  tiktok-scraper trend                     Scrape posts from current trends
  tiktok-scraper music [id]                Scrape posts from a music id number
  tiktok-scraper video [id]                Download single video without the watermark
  tiktok-scraper history                   View previous download history
  tiktok-scraper from-file [file] [async]  Scrape users, hashtags, music, videos mentioned in a file. 1 value per 1 line

Options:
  --version            Show version number [boolean]
  --session            Set session cookie value. Sometimes session can be helpful when scraping data from any method [default: ""]
  --session-file       Set path to the file with list of active sessions. One session per line! [default: ""]
  --timeout            Set timeout between requests. Timeout is in Milliseconds: 1000 mls = 1 s [default: 0]
  --number, -n         Number of posts to scrape. If you will set 0 then all posts will be scraped [default: 0]
  --since              Scrape no posts published before this date (timestamp). If set to 0 the filter is deactived [default: 0]
  --proxy, -p          Set single proxy [default: ""]
  --proxy-file         Use proxies from a file. Scraper will use random proxies from the file per each request. 1 line 1 proxy. [default: ""]
  --download, -d       Download video posts to the folder with the name input [id] [boolean] [default: false]
  --asyncDownload, -a  Number of concurrent downloads [default: 5]
  --hd                 Download video in HD. Video size will be x5-x10 times larger and this will affect scraper execution speed. This option only works in combination with -w flag [boolean] [default: false]
  --zip, -z            ZIP all downloaded video posts [boolean] [default: false]
  --filepath           File path to save all output files. [default: "/Users/karl.wint/Documents/projects/javascript/tiktok-scraper"]
  --filetype, -t       Type of the output file where post information will be saved. 'all' - save information about all posts to the 'json' and 'csv' [choices: "csv", "json", "all", ""] [default: ""]
  --filename, -f       Set custom filename for the output files [default: ""]
  --noWaterMark, -w    Download video without the watermark. NOTE: With the recent update you only need to use this option if you are scraping Hashtag Feed. User/Trend/Music feeds will have this url by default [boolean] [default: false]
  --store, -s          Scraper will save the progress in the OS TMP or Custom folder and in the future usage will only download new videos avoiding duplicates [boolean] [default: false]
  --historypath        Set custom path where history file/files will be stored [default: "/var/folders/d5/fyh1_f2926q7c65g7skc0qh80000gn/T"]
  --remove, -r         Delete the history record by entering "TYPE:INPUT" or "all" to clean all the history. For example: user:bob [default: ""]
  --webHookUrl         Set webhook url to receive scraper result as HTTP requests. For example to your own API [default: ""]
  --method             Receive data to your webhook url as POST or GET request [choices: "GET", "POST"] [default: "POST"]
  --help               Show help [boolean]

Examples:
  tiktok-scraper user USERNAME -d -n 100 --session sid_tt=dae32131231
  tiktok-scraper trend -d -n 100 --session sid_tt=dae32131231
  tiktok-scraper hashtag HASHTAG_NAME -d -n 100 --session sid_tt=dae32131231
  tiktok-scraper music MUSIC_ID -d -n 50 --session sid_tt=dae32131231
  tiktok-scraper video https://www.tiktok.com/@tiktok/video/6807491984882765062 -d
  tiktok-scraper history
  tiktok-scraper history -r user:bob
  tiktok-scraper history -r all
  tiktok-scraper from-file BATCH_FILE ASYNC_TASKS -d

When running from Docker you cannot use --filepath and --historypath, but you can mount a volume (the host path where all files will be saved) with -v.

docker build . -t tiktok-scraper

Example 1: All files, including the history file, will be saved in the directory from which you run Docker ($(pwd))

docker run -v $(pwd):/usr/app/files tiktok-scraper user tiktok -d -n 5 -s

Example 2: All files, including the history file, will be saved in /User/blah/downloads

docker run -v /User/blah/downloads:/usr/app/files tiktok-scraper user tiktok -d -n 5 -s

.user(id, options)
.hashtag(id, options)
.trend('', options)
.music(id, options)
.userEvent(id, options)
.hashtagEvent(id, options)
.trendEvent('', options)
.musicEvent(id, options)
.getUserProfileInfo('USERNAME', options)
.getHashtagInfo('HASHTAG', options)
.signUrl('URL', options)
.getVideoMeta('WEB_VIDEO_URL', options)
.getMusicInfo('https://www.tiktok.com/music/original-sound-6801885499343571718', options)

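const options = {
    // Number of posts to scrape; 0 scrapes all posts
    number: 50,
    // Scrape no posts published before this date (timestamp); 0 disables the filter
    since: 0,
    // List of session cookie values
    sessionList: ['sid_tt=21312213'],
    // Single proxy
    proxy: '',
    // Set to true when the id passed to user() is a user id rather than a username
    by_user_id: false,
    // Number of concurrent downloads
    asyncDownload: 5,
    asyncScraping: 3,
    // File path to save all output files
    filepath: `CURRENT_DIR`,
    // Custom filename for the output files
    fileName: `CURRENT_DIR`,
    filetype: `na`,
    // Custom request headers
    headers: {
        'user-agent': "BLAH",
        referer: 'https://www.tiktok.com/',
        cookie: `tt_webid_v2=68dssds`,
    },
    // Download video without the watermark
    noWaterMark: false,
    // Download video in HD
    hdVideo: false,
    verifyFp: '',
    useTestEndpoints: false
};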
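The since option filters posts by publication time. Below is a minimal sketch of building options for "posts from the last N days". It assumes that since is a Unix timestamp in seconds, the same unit as createTime in the example output further down; this is an assumption, not something the option list states. The helper name sinceDaysAgo is hypothetical and not part of tiktok-scraper.

```javascript
// Hypothetical helper, not part of tiktok-scraper.
// Assumption: `since` is a Unix timestamp in seconds.
function sinceDaysAgo(days, now = Date.now()) {
    const msPerDay = 24 * 60 * 60 * 1000;
    // Date.now() is in milliseconds, so convert to whole seconds.
    return Math.floor((now - days * msPerDay) / 1000);
}

// Options for "at most 50 posts published in the last 7 days".
const recentOptions = {
    number: 50,
    since: sinceDaysAgo(7),
};
```

If the scraper turns out to expect milliseconds, drop the division by 1000.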

Don't forget to check the examples folder

const TikTokScraper = require('tiktok-scraper');

// `options` in the calls below is an options object like the one described above

// User feed by username
(async () => {
    try {
        const posts = await TikTokScraper.user('USERNAME', {
            number: 100,
            sessionList: ['sid_tt=58ba9e34431774703d3c34e60d584475;']
        });
        console.log(posts);
    } catch (error) {
        console.log(error);
    }
})();

// User feed by user id
(async () => {
    try {
        const posts = await TikTokScraper.user(`USER_ID`, {
            number: 100,
            by_user_id: true,
            sessionList: ['sid_tt=58ba9e34431774703d3c34e60d584475;']
        });
        console.log(posts);
    } catch (error) {
        console.log(error);
    }
})();

// Trending feed
(async () => {
    try {
        const posts = await TikTokScraper.trend('', {
            number: 100,
            sessionList: ['sid_tt=58ba9e34431774703d3c34e60d584475;']
        });
        console.log(posts);
    } catch (error) {
        console.log(error);
    }
})();

// Hashtag feed
(async () => {
    try {
        const posts = await TikTokScraper.hashtag('HASHTAG', {
            number: 100,
            sessionList: ['sid_tt=58ba9e34431774703d3c34e60d584475;']
        });
        console.log(posts);
    } catch (error) {
        console.log(error);
    }
})();

// User profile information
(async () => {
    try {
        const user = await TikTokScraper.getUserProfileInfo('USERNAME', options);
        console.log(user);
    } catch (error) {
        console.log(error);
    }
})();

// Hashtag information
(async () => {
    try {
        const hashtag = await TikTokScraper.getHashtagInfo('HASHTAG', options);
        console.log(hashtag);
    } catch (error) {
        console.log(error);
    }
})();

// Single video metadata
(async () => {
    try {
        const videoMeta = await TikTokScraper.getVideoMeta('https://www.tiktok.com/@tiktok/video/6807491984882765062', options);
        console.log(videoMeta);
    } catch (error) {
        console.log(error);
    }
})();
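When scraping several accounts, one failed request should not abort the rest. Below is a minimal sketch that calls user() for each username in turn. The function name scrapeUsers and the shape of the returned summary are illustrative assumptions. The scraper is passed in as a parameter; in real use it would be the module returned by require('tiktok-scraper').

```javascript
// Hypothetical helper: scrape several usernames one after another.
// `scraper` is any object exposing user(id, options), such as the
// tiktok-scraper module.
async function scrapeUsers(scraper, usernames, options = {}) {
    const results = {};
    for (const username of usernames) {
        try {
            const posts = await scraper.user(username, options);
            // The result carries the scraped posts in `collector`.
            results[username] = { ok: true, count: posts.collector.length };
        } catch (error) {
            // Record the failure and continue with the next username.
            results[username] = { ok: false, error: String(error.message || error) };
        }
    }
    return results;
}
```

For example, scrapeUsers(require('tiktok-scraper'), ['USERNAME'], { number: 10 }) would resolve to an object keyed by username. Running the requests sequentially rather than in parallel keeps the request rate low, which may help when IP blocking is a concern.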

const TikTokScraper = require('tiktok-scraper');

const users = TikTokScraper.userEvent("tiktok", { number: 30 });
users.on('data', json => {
    // scraped data in JSON format
});
users.on('done', () => {
    // scraping completed
});
users.on('error', error => {
    // error
});
users.scrape();

const hashtag = TikTokScraper.hashtagEvent("summer", { number: 250, proxy: 'socks5://1.1.1.1:90' });
hashtag.on('data', json => {
    // scraped data in JSON format
});
hashtag.on('done', () => {
    // scraping completed
});
hashtag.on('error', error => {
    // error
});
hashtag.scrape();
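The event API delivers data as it arrives. Below is a minimal sketch that wraps an event scraper in a Promise and resolves with everything received. collectPosts is a hypothetical name, and the sketch relies only on the 'data', 'done' and 'error' events and the scrape() method shown above.

```javascript
// Hypothetical helper: gather everything an event scraper emits.
// `emitter` is the object returned by userEvent(), hashtagEvent(), etc.
function collectPosts(emitter) {
    return new Promise((resolve, reject) => {
        const posts = [];
        emitter.on('data', (json) => posts.push(json));
        emitter.on('done', () => resolve(posts));
        emitter.on('error', (error) => reject(error));
        // Start scraping only after all listeners are attached.
        emitter.scrape();
    });
}
```

For example, collectPosts(TikTokScraper.userEvent('tiktok', { number: 30 })) would resolve once 'done' fires.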

NOT REQUIRED

A very common problem is TikTok blacklisting your IP or proxy. In that case, try setting a session; this improves the chances of success.

Get the session:

Open https://www.tiktok.com/ in any browser

Log in to your account

Right-click -> Inspector -> Network

Refresh the page -> select any request made to TikTok -> go to the Request Headers section -> Cookies

Find the sid_tt value in the cookies. It usually looks like this: sid_tt=521kkadkasdaskdj4j213j12j312; This is your authenticated session cookie value, which should be used to scrape the user/hashtag/music/trending feeds
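If you copy the whole Cookie request header rather than a single value, the sid_tt pair can be pulled out programmatically. A minimal sketch follows; extractSidTt is a hypothetical helper and not part of tiktok-scraper.

```javascript
// Hypothetical helper: pick the sid_tt pair out of a raw Cookie header.
// Returns a string such as "sid_tt=abc123;" or null when it is absent.
function extractSidTt(cookieHeader) {
    for (const part of cookieHeader.split(';')) {
        const pair = part.trim();
        if (pair.startsWith('sid_tt=')) {
            return `${pair};`;
        }
    }
    return null;
}
```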

Set the session:

CLI:

Set a single session with the --session option. For example: --session sid_tt=521kkadkasdaskdj4j213j12j312;

Set the path to a file with a list of sessions with the --session-file option. For example: --session-file /var/bob/sessionList.txt

Example content of /var/bob/sessionList.txt:

sid_tt=521kkadkasdaskdj4j213j12j312
sid_tt=521kkadkasdaskdj4j213j12j312
sid_tt=521kkadkasdaskdj4j213j12j312
sid_tt=521kkadkasdaskdj4j213j12j312

MODULE: set the session with the sessionList option. For example: sessionList: ["sid_tt=521kkadkasdaskdj4j213j12j312;", "sid_tt=12312312312312;"]
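The same one-session-per-line file used with --session-file can also feed the module's sessionList option. Below is a minimal sketch, assuming a file in the format shown above; parseSessionList and loadSessionList are hypothetical helper names, not part of tiktok-scraper.

```javascript
const fs = require('fs');

// Hypothetical helper: turn the text of a session file into an array,
// dropping blank lines and surrounding whitespace.
function parseSessionList(text) {
    return text
        .split(/\r?\n/)
        .map((line) => line.trim())
        .filter((line) => line.length > 0);
}

// Hypothetical helper: read the file and build the option value.
function loadSessionList(path) {
    return parseSessionList(fs.readFileSync(path, 'utf8'));
}

// Usage (path taken from the example above):
// const options = { sessionList: loadSessionList('/var/bob/sessionList.txt') };
```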

This part is related to the MODULE usage (NOT THE CLI)

The {videoUrl} value is bound to the cookie value {tt_webid_v2}, which can contain any value

Method 1: default headers

When you extract videos from a user, hashtag, music, or trending feed, or from a single video, the response contains, besides the video metadata, a headers object with the parameters that were used to extract the data. This is the important part: to access or download a video through the {videoUrl} value, you must use the same {headers} values.

headers: {
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.80 Safari/537.36",
    "referer": "https://www.tiktok.com/",
    "cookie": "tt_webid_v2=689854141086886123"
},
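Below is a minimal sketch of Method 1 using only Node's built-in https and fs modules: it reuses the headers object from a scraper result to fetch videoUrl. The function names and the destination path are illustrative assumptions, and redirects are not followed.

```javascript
const fs = require('fs');
const https = require('https');

// Hypothetical helper: build the request options for a video download,
// reusing the exact headers that the scraper returned.
function buildRequestOptions(headers) {
    return { headers: { ...headers } };
}

// Hypothetical helper: stream the video at `videoUrl` to `destination`.
// Resolves with the destination path once the file is fully written.
function downloadVideo(videoUrl, headers, destination) {
    return new Promise((resolve, reject) => {
        https.get(videoUrl, buildRequestOptions(headers), (response) => {
            if (response.statusCode !== 200) {
                response.resume(); // discard the body
                reject(new Error(`Unexpected status ${response.statusCode}`));
                return;
            }
            const file = fs.createWriteStream(destination);
            response.pipe(file);
            file.on('finish', () => resolve(destination));
            file.on('error', reject);
        }).on('error', reject);
    });
}
```

With a result from user(), this could be called as downloadVideo(posts.collector[0].videoUrl, posts.headers, 'video.mp4').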

Method 2: custom headers

You can pass your own headers with the {options}.

const headers = {
    "user-agent": "BOB",
    "referer": "https://www.tiktok.com/",
    "cookie": "tt_webid_v2=BOB"
};

getVideoMeta('WEB_VIDEO_URL', { headers });
user('USERNAME', { headers });
hashtag('HASHTAG', { headers });
trend('', { headers });
music('MUSIC_ID', { headers });

Example output for the methods: user, hashtag, trend, music, userEvent, hashtagEvent, musicEvent, trendEvent

{
    headers: {
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.80 Safari/537.36',
        referer: 'https://www.tiktok.com/',
        cookie: 'tt_webid_v2=689854141086886123'
    },
    collector: [{
        id: 'VIDEO_ID',
        text: 'CAPTION',
        createTime: '1583870600',
        authorMeta: {
            id: 'USER ID',
            name: 'USERNAME',
            following: 195,
            fans: 43500,
            heart: '1093998',
            video: 3,
            digg: 95,
            verified: false,
            private: false,
            signature: 'USER BIO',
            avatar: 'AVATAR_URL'
        },
        musicMeta: {
            musicId: '6808098113188120838',
            musicName: 'blah blah',
            musicAuthor: 'blah',
            musicOriginal: true,
            playUrl: 'SOUND/MUSIC_URL',
        },
        covers: {
            default: 'COVER_URL',
            origin: 'COVER_URL',
            dynamic: 'COVER_URL'
        },
        imageUrl: 'IMAGE_URL',
        videoUrl: 'VIDEO_URL',
        videoUrlNoWaterMark: 'VIDEO_URL_WITHOUT_THE_WATERMARK',
        videoMeta: { width: 480, height: 864, ratio: 14, duration: 14 },
        diggCount: 2104,
        shareCount: 1,
        playCount: 9007,
        commentCount: 50,
        mentions: ['@bob', '@sam', '@bob_again', '@and_sam_again'],
        hashtags: [{
            id: '69573911',
            name: 'PlayWithLife',
            title: 'HASHTAG_TITLE',
            cover: [Array]
        }...],
        downloaded: true
    }...],
    zip: '/{CURRENT_PATH}/user_1552963581094.zip',
    json: '/{CURRENT_PATH}/user_1552963581094.json',
    csv: '/{CURRENT_PATH}/user_1552963581094.csv'
}
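The result object is plain data, so it is easy to post-process. Below is a minimal sketch that summarizes a result using only fields shown in the example output (collector, id, playCount, downloaded); summarizePosts is a hypothetical helper and not part of tiktok-scraper.

```javascript
// Hypothetical helper: summarize a scraper result object.
function summarizePosts(result) {
    const posts = result.collector || [];
    let mostPlayed = null;
    for (const post of posts) {
        if (mostPlayed === null || post.playCount > mostPlayed.playCount) {
            mostPlayed = post;
        }
    }
    return {
        total: posts.length,
        // Count only posts explicitly marked as downloaded.
        downloaded: posts.filter((post) => post.downloaded === true).length,
        mostPlayedId: mostPlayed ? mostPlayed.id : null,
    };
}
```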

getUserProfileInfo

{
    secUid: 'MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM',
    userId: '107955',
    isSecret: false,
    uniqueId: 'tiktok',
    nickName: 'TikTok',
    signature: 'Make Your Day',
    covers: ['COVER_URL'],
    coversMedium: ['COVER_URL'],
    following: 490,
    fans: 38040567,
    heart: '211522962',
    video: 93,
    verified: true,
    digg: 29,
}

getHashtagInfo

{
    challengeId: '4231',
    challengeName: 'love',
    text: '',
    covers: [],
    coversMedium: [],
    posts: 66904972,
    views: '194557706433',
    isCommerce: false,
    splitTitle: ''
}

getVideoMeta

{
    headers: {
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.80 Safari/537.36',
        referer: 'https://www.tiktok.com/',
        cookie: 'tt_webid_v2=689854141086886123'
    },
    collector: [{
        id: '6807491984882765062',
        text: 'We’re kicking off the #happyathome live stream series today at 5pm PT!',
        createTime: '1584992742',
        authorMeta: { id: '6812221792183403526', name: 'blah' },
        musicMeta: {
            musicId: '6822233276137213677',
            musicName: 'blah',
            musicAuthor: 'blah'
        },
        imageUrl: 'IMAGE_URL',
        videoUrl: 'VIDEO_URL',
        videoUrlNoWaterMark: 'VIDEO_URL_WITHOUT_THE_WATERMARK',
        videoMeta: { width: 480, height: 864, ratio: 14, duration: 14 },
        covers: { default: 'COVER_URL', origin: 'COVER_URL' },
        diggCount: 49292,
        shareCount: 339,
        playCount: 614678,
        commentCount: 4023,
        downloaded: false,
        hashtags: [],
    }]
}

getMusicInfo

{
    music: {
        id: '6882925279036066566',
        title: 'doja x calabria',
        playUrl: 'dfdfdfdf',
        coverThumb: 'dfdfdf',
        coverMedium: 'dfdfdf',
        coverLarge: 'fdfdf',
        authorName: 'bryce',
        original: true,
        playToken: 'ffdfdf',
        keyToken: 'dfdfdfd',
        audioURLWithcookie: false,
        private: false,
        duration: 46,
        album: '',
    },
    author: {
        id: '6835300004094166021',
        uniqueId: 'mashupsbybryce',
        nickname: 'bryce',
        avatarThumb: 'dfdfd',
        avatarMedium: 'dfdfdf',
        avatarLarger: 'dfdfdf',
        signature: 'hi ily :)\n\n70k sounds cool tbh\n\n👇follow my soundcloud & insta👇',
        verified: false,
        secUid: 'MS4wLjABAAAA1_5bjLAamayD4rv3q49qJGa_7dZ5jzExTO0ozOybqIwwhw5TAg_iM25lkO94DM3K',
        secret: false,
        ftc: false,
        relation: 0,
        openFavorite: false,
        commentSetting: 0,
        duetSetting: 0,
        stitchSetting: 0,
        privateAccount: false,
    },
    stats: { videoCount: 361700 },
    shareMeta: {
        title: 'bryceyouloser | ♬ doja x calabria | on TikTok',
        desc: '361.0k videos - Watch awesome short ' + 'videos created with ♬ doja x calabria',
    },
};

MIT

Free Software