Archive Direct Messages from Twitter

24,475

I have created a tool (https://github.com/Mincka/DMArchiver) to download my direct messages, with the ability to also download the uploaded images, videos and GIFs (as MP4).

Because it does not rely on the API, it is possible to download more than 200 messages. The script just simulates the "scrolling method" described by dimethylarginine and parses the result.

The main idea is to make requests in a loop to the following URL, sending a valid auth_token cookie value for authentication, and to parse the JSON response: https://twitter.com/messages/with/conversation?id=1337&max_entry_id=1337

The max_entry_id value is not required for the first request. In each subsequent iteration, use the value of the min_entry_id variable from the previous response as the new max_entry_id to get the next 20 older messages. When max_entry_id is not in the JSON response, you are at the beginning of the thread.
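The loop described above can be sketched roughly as follows. This is only an illustration, not the code from DMArchiver: `iter_pages` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever function performs the HTTP request and returns the decoded JSON as a dict. The extra check for a missing min_entry_id is an assumption added as a safety guard.

```python
from typing import Callable, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield decoded JSON pages, newest first, until the thread start.

    fetch_page is a hypothetical callable: it receives the max_entry_id
    to send (None for the first request) and returns the JSON response
    as a dict.
    """
    max_entry_id = None  # not required for the first request
    while True:
        page = fetch_page(max_entry_id)
        # Hand every page to the caller, including the last one; the
        # caller decides whether it contains anything worth parsing.
        yield page

        # Stop condition from the text: no max_entry_id in the response.
        # The min_entry_id check is an extra guard (assumption), since
        # the loop cannot continue without it.
        if "max_entry_id" not in page or "min_entry_id" not in page:
            break

        # The oldest entry of this page becomes the upper bound of the next.
        max_entry_id = page["min_entry_id"]
```

A caller would pass its own request function, for example `for page in iter_pages(my_fetch): ...`, and extract the messages from each page as it arrives.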

Some headers are also required to get a proper response from Twitter:

'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0'
'Accept': 'application/json, text/javascript, */*; q=0.01'
'X-Requested-With': 'XMLHttpRequest'
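As a rough sketch, the URL, cookie and headers above could be combined into a single request object like this. It uses only the Python standard library and builds the request without sending it. `build_request` and its parameters are hypothetical names chosen for illustration, and the sketch does not check whether the endpoint still answers as described.

```python
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://twitter.com/messages/with/conversation"

# Headers quoted in the text above.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) "
                  "Gecko/20100101 Firefox/47.0",
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "X-Requested-With": "XMLHttpRequest",
}


def build_request(conversation_id: str, auth_token: str,
                  max_entry_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) one request for a page of messages."""
    params = {"id": conversation_id}
    if max_entry_id is not None:
        # Omitted on the first request, as described in the text.
        params["max_entry_id"] = max_entry_id
    url = BASE_URL + "?" + urllib.parse.urlencode(params)

    headers = dict(HEADERS)
    # Authentication relies on the auth_token cookie of a logged-in session.
    headers["Cookie"] = "auth_token=" + auth_token
    return urllib.request.Request(url, headers=headers)
```

The returned object can then be passed to `urllib.request.urlopen` and the body decoded with `json.loads`.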

Currently, the output of the tool is only available as an IRC-like conversation but I may add other output styles in the future (HTML, JSON, XML...).

Author by

allo

Updated on January 08, 2020

Comments

  • allo
    allo over 4 years

    Is there any way to download your own direct messages in order to archive them?

    The Twitter API limits the call to the latest 200 DMs, so it cannot be used to download a full archive of longer conversations.

    The official Twitter Archive does not seem to contain the messages at all. Most third-party services (which you might not want to give access to your messages anyway) will be using the API, and the best they can do is poll often enough to stay within the 200 DM limit.

    Is there any other way to get the messages from Twitter? Scrolling back on the site seems to work, but it always loads older messages in small steps, and copy & paste from there gives a rather ugly result, too.

    It does not need the full Twitter API information; just the handle, time and message (and maybe media links, if possible) should be available.