Filter data in Twitter Streaming API
Take a look at the filter stream of the API:
You can pass a set of keywords as a filter to track on Twitter; under the current limits you can track up to 400 keywords.
After retrieving the tweets you still have to filter them again yourself to remove noisy data.
So if you can describe what you are looking for with a set of keywords, you will get what you want, but there will always be noise in your data, because it is almost impossible to define something that precisely through simple keyword filtering.
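As a rough sketch of the keyword side, the filter endpoint takes the keywords as one comma-separated `track` parameter. The helper below only builds that value and makes no network call; `build_track_param` and `MAX_TRACK_KEYWORDS` are names made up for this illustration, and the limit of 400 is simply the figure mentioned above, which may differ for your access level.

```python
MAX_TRACK_KEYWORDS = 400  # figure quoted above; an assumption, check your access level


def build_track_param(keywords, limit=MAX_TRACK_KEYWORDS):
    """Return a comma-separated value for the `track` parameter."""
    cleaned = []
    seen = set()
    for kw in keywords:
        kw = kw.strip()
        key = kw.lower()
        # Skip empty entries and case-insensitive duplicates, keep first-seen order.
        if kw and key not in seen:
            seen.add(key)
            cleaned.append(kw)
    if len(cleaned) > limit:
        raise ValueError(f"{len(cleaned)} keywords exceed the limit of {limit}")
    return ",".join(cleaned)


# Parameters you would send along with the request to statuses/filter.json.
params = {"track": build_track_param(["XYZ", "xyz ", "XYZ store"])}
print(params)
```

Failing early on the limit is a design choice of this sketch, so an oversized keyword list is caught before the request is sent.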
For example, let's assume you want to track all tweets related to a brand named XYZ. To get tweets about brand XYZ you might use a one-word keyword set that contains only "XYZ". The API will give you all the tweets containing XYZ, but assume that "XYZ" also has a meaning in some language: people who speak that language will tweet that word and you will receive those tweets too. Also assume there is a city called XYZ, and people send check-in messages from it. At that point you need to filter out the tweets that are not related to your topic, either by language detection or by contextual information retrieval. But the key is to specify your keyword set about the topic you want to cover.
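The second filtering pass described above could look roughly like this. It is only a sketch: it relies on the `lang` and `text` fields that each tweet object in the stream carries, and it uses a plain list of context words as a very crude stand-in for real contextual analysis. The function names and the context terms are hypothetical, and the sample lines are made-up data rather than real API output.

```python
import json


def is_relevant(tweet, allowed_langs, context_terms):
    """Crude relevance check for one decoded tweet (a dict)."""
    # `lang` is Twitter's own language guess; messages without it are dropped too.
    if tweet.get("lang") not in allowed_langs:
        return False
    text = tweet.get("text", "").lower()
    # Require at least one (hypothetical) context term in the tweet text.
    return any(term in text for term in context_terms)


def filter_stream_lines(lines, allowed_langs, context_terms):
    """Yield only the relevant tweets from an iterable of JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:  # the stream sends blank keep-alive lines
            continue
        tweet = json.loads(line)
        if is_relevant(tweet, allowed_langs, context_terms):
            yield tweet


# Made-up sample input: one brand tweet, one other-language tweet, one check-in.
sample = [
    '{"text": "Loving my new XYZ phone", "lang": "en"}',
    '',
    '{"text": "XYZ XYZ", "lang": "tr"}',
    '{"text": "I am at XYZ", "lang": "en"}',
]
kept = list(filter_stream_lines(sample, {"en"}, ["phone", "store"]))
print([t["text"] for t in kept])
```

Note that this runs on your side after the data has arrived, so it reduces what you process, not what the API sends.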
Cheers.
Lukas
Updated on October 09, 2022
Comments
-
Lukas over 1 year
I'm currently experimenting with the Twitter Streaming API. Everything works like a charm, but the API sends me tons of data that I don't need. Is there a way to filter the data the API sends me?
I'm using the following stream: https://stream.twitter.com/1.1/statuses/filter.json
-
Lukas about 11 years
Hi, thanks for that, but the problem is that I don't even want to receive the "noisy" data, as I want to process lots of tweets in less time :) Maybe it isn't even possible to get a "short" version of the tweets from the API.
-
cubbuk about 11 years
@LucèBrùlè I edited my answer to clarify what the noisy data is.
-
user1599964 almost 11 years
@cubbuk: Suppose I specified 3 keywords in the filter. When I get data from the streaming API, is there a way (other than searching manually on my own) to detect which of the three keywords the tweet matched?
-
cubbuk almost 11 years
@user1599964 As far as I know, Twitter doesn't provide any info about that; you have to figure it out yourself.
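Figuring it out yourself can be sketched as below. This is a local approximation, not Twitter's actual matching logic: it only looks at the tweet text, splits it into lowercase word tokens, and treats a tracked phrase as matched when every one of its terms appears as a whole token. The function name `matched_keywords` and the sample keywords are made up for illustration.

```python
import re


def matched_keywords(text, keywords):
    """Return the tracked phrases whose terms all occur in `text`."""
    tokens = set(re.findall(r"\w+", text.lower()))
    hits = []
    for phrase in keywords:
        terms = re.findall(r"\w+", phrase.lower())
        # A phrase counts as matched when every one of its terms is a token.
        if terms and all(term in tokens for term in terms):
            hits.append(phrase)
    return hits


tracked = ["XYZ", "fast", "new phone"]
print(matched_keywords("Breakfast with my new XYZ phone", tracked))
# prints ['XYZ', 'new phone']
```

Because the comparison is on whole tokens, "fast" is not reported for a tweet that only contains "breakfast" in this sketch; whether the API itself delivers such tweets is a separate question.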
-
user1599964 almost 11 years
@cubbuk: Yes, I figured that out. Can you have a look at this question and let me know your views: stackoverflow.com/questions/16602483/…
-
S Gaber over 10 years
Is there any tool that can help me with language detection?
-
Krishna Kalyan about 8 years
@cubbuk: Will the streaming API also include tweets like abXYZcd or XYZmn? Does it give me tweets that contain the filter keyword as a substring? For example, if I filter for "fast", will it give me tweets like "breakfast"?
-
cubbuk about 8 years
@KrishnaKalyan I just don't know the current status, sorry.