@dataknut
Just a bit of dataknut fun woven around the day job.
You’ll be wanting Section 6 for the trending hashtags…
CC-BY unless otherwise noted.
See:
The idea is to extract and visualise tweets and re-tweets of #schoolstrike4climate (see https://www.schoolstrike4climate.com/).
Why? Err…. Just. Because.
Code borrows extensively from https://github.com/mkearney/rtweet
The analysis used rtweet to ask the Twitter search API to extract ‘all’ tweets containing the #schoolstrike4climate hashtag in the ‘recent’ twitterVerse.
Because the search API only reaches back over ‘recent’ tweets and is not guaranteed to return every match, it is possible that not quite all tweets have been extracted, although it seems likely that we have captured most recent human tweeting, which was the main intention.
Future work should instead use the Twitter streaming API.
## [1] "Found 7 files matching #schoolstrike4climate in ~/Data/twitter/"
The data has:
Figure 5.1 shows the number of tweets and tweeters in the data extract by day. The quotes, tweets and re-tweets have been separated.
If you are in New Zealand and you are wondering why there are no tweets today (2019-03-16), the answer is that Twitter data (and these plots) work in UTC, and (y)our today() may not have started yet in UTC. Don’t worry, all the tweets are here - it’s just our old friend the timezone… :-)
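The timezone effect can be checked with a few lines of code. The sketch below is in Python purely for illustration (the report itself was produced in R), and it assumes a fixed UTC+13 offset for New Zealand Daylight Time, which applied in March 2019. The `utc_day` helper is a hypothetical name, not something from the original analysis.

```python
from datetime import datetime, timedelta, timezone

# Assumption: New Zealand Daylight Time (NZDT) is UTC+13, as it was in March 2019.
NZDT = timezone(timedelta(hours=13), "NZDT")


def utc_day(local_time: datetime) -> str:
    """Return the UTC calendar date (YYYY-MM-DD) for a timezone-aware time."""
    return local_time.astimezone(timezone.utc).date().isoformat()


# 09:00 on 16 March in New Zealand is 20:00 on 15 March in UTC,
# so a tweet sent then is plotted under the previous day.
morning_nz = datetime(2019, 3, 16, 9, 0, tzinfo=NZDT)
print(morning_nz.date().isoformat(), "->", utc_day(morning_nz))
```

Under this assumption the UTC date only catches up with the New Zealand date at 13:00 local time.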
Next we’ll try by screen name.
Figure 5.2 is a really bad visualisation of all tweeters tweeting over time. Each row of pixels is a tweeter (the names are probably illegible) and a green dot indicates a few tweets in the given day while a red dot indicates a lot of tweets.
So let’s re-do that for the top 50 tweeters so we can see their tweetStreaks (tm)…
Top tweeters:
screen_name | nTweets |
---|---|
NoahsArkCrew | 157 |
D_Melissa2 | 84 |
pezmico | 84 |
Glo_man | 69 |
buoyancybackup | 69 |
DawnRoseTurner | 66 |
lin_nah | 63 |
Beccabluesky | 63 |
GreenpeaceNZ | 58 |
NoAdaniOz | 56 |
ClimateStrikeGL | 51 |
Feenwald | 48 |
heidi_k_edmonds | 45 |
FibrodisKo | 44 |
daniel_scholler | 42 |
And their tweetStreaks are shown in Figure 5.3…
Any twitterBots…?
We wanted to make a nice map but sadly we see that most tweets have no lat/long set.
geo_coords (lat, long) | nTweets |
---|---|
(not set) | 54702 |
-34.6089, -58.4397 | 1 |
-33.86751, 151.20797 | 1 |
-33.8731575, 151.2061157 | 1 |
-37.8, 144.967 | 1 |
4.60987, -74.082 | 1 |
40.78100519, -73.97325538 | 1 |
-37.81328358, 144.97403895 | 2 |
-41.2889, 174.777 | 1 |
19.4156206, -99.1913432 | 1 |
coords_coords (long, lat) | nTweets |
---|---|
(not set) | 54702 |
-58.4397, -34.6089 | 1 |
151.20797, -33.86751 | 1 |
151.2061157, -33.8731575 | 1 |
144.967, -37.8 | 1 |
-74.082, 4.60987 | 1 |
-73.97325538, 40.78100519 | 1 |
144.97403895, -37.81328358 | 2 |
174.777, -41.2889 | 1 |
-99.1913432, 19.4156206 | 1 |
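The two tables above hold the same points in opposite order: geo_coords is latitude then longitude, while coords_coords is longitude then latitude. A minimal Python sketch (illustrative only, not the R code used for this report) shows one way to check that a pair of values refers to the same place. The function names `parse_pair` and `same_point` are hypothetical, and the `a|b` input format is assumed from the way the values print in the raw tables.

```python
from typing import Optional, Tuple


def parse_pair(text: str) -> Optional[Tuple[float, float]]:
    """Parse an 'a|b' string into two floats; return None if either part is empty."""
    parts = [p.strip() for p in text.split("|")]
    if len(parts) != 2 or not all(parts):
        return None
    return float(parts[0]), float(parts[1])


def same_point(geo: str, coords: str) -> bool:
    """True when geo (lat|long) and coords (long|lat) describe the same location."""
    g, c = parse_pair(geo), parse_pair(coords)
    return g is not None and c is not None and g == (c[1], c[0])


# The Wellington row from the two tables, in each ordering.
print(same_point("-41.2889|174.777", "174.777|-41.2889"))
```

Getting the order wrong matters for mapping: a latitude of 174.777 is not a valid value, so a swapped pair would either fail or land in the wrong place.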
The location field appears to be pulled from the user’s profile, although it may also be a ‘guesstimate’ of current location.
Top locations for tweets:
location | nTweets |
---|---|
NA | 14050 |
Australia | 1250 |
New Zealand | 466 |
London | 443 |
Melbourne, Victoria | 378 |
Melbourne, Australia | 371 |
Sydney, New South Wales | 305 |
Sydney | 305 |
United States | 292 |
Auckland, New Zealand | 285 |
Sydney, Australia | 269 |
Earth | 257 |
Canada | 257 |
London, England | 243 |
Melbourne | 239 |
Top locations for tweeters:
location | nTweeters |
---|---|
NA | 8542 |
Australia | 493 |
London | 251 |
Melbourne, Victoria | 209 |
United States | 194 |
New Zealand | 191 |
London, England | 187 |
Sydney, New South Wales | 176 |
Melbourne, Australia | 167 |
Sydney, Australia | 145 |
Canada | 143 |
Sydney | 134 |
Melbourne | 131 |
Earth | 113 |
United Kingdom | 113 |
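The two location tables differ because the first counts every tweet while the second counts each tweeter once per location. The Python sketch below illustrates the distinction with made-up rows (the screen names and the helper names are hypothetical, and this is not the R code behind the tables). Note that free-text variants such as "Sydney" and "Sydney, Australia" are treated as different locations, just as they are in the tables.

```python
from collections import Counter

# Hypothetical rows: one (screen_name, location) pair per tweet.
tweets = [
    ("alice", "Sydney"),
    ("alice", "Sydney"),
    ("bob", "Sydney"),
    ("carol", None),  # no location set in the profile
]


def tweets_per_location(rows):
    """Count tweets per location, labelling missing locations as 'NA'."""
    return Counter(loc if loc else "NA" for _, loc in rows)


def tweeters_per_location(rows):
    """Count distinct screen names per location."""
    unique = {(name, loc if loc else "NA") for name, loc in rows}
    return Counter(loc for _, loc in unique)


print(tweets_per_location(tweets).most_common())
print(tweeters_per_location(tweets).most_common())
```

A prolific tweeter inflates the first count but adds only one to the second, which is why the rankings in the two tables are not identical.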
Now try the full place name - rarely available.
place_full_name | nTweets |
---|---|
NA | 54415 |
Auckland, New Zealand | 37 |
Sydney, New South Wales | 29 |
Melbourne, Victoria | 21 |
Wellington City, New Zealand | 14 |
Adelaide, South Australia | 13 |
Miami Beach, FL | 10 |
Manhattan, NY | 8 |
Brisbane, Queensland | 7 |
Walthamstow, London | 7 |
Vancouver, British Columbia | 6 |
Viña del Mar, Chile | 6 |
Canberra, Australian Capital Territory | 5 |
Old Treasury Building | 4 |
Newcastle, New South Wales | 4 |
There are a lot of problems with this approach (see Section 7), but Figure 6.1 shows trends over time (watch for lines of apparently dissimilar hashtags where the macron fix has failed) and Figure 6.2 shows the totals to date.
Figure 6.1 uses plotly to avoid having to render a large legend - just hover over the lines to see who is who…
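The report does not show how its macron fix works, so the following Python sketch is only one plausible approach, under the assumption that the aim is to count macronised and plain spellings of a hashtag together. It lower-cases each hashtag and drops combining marks, which removes other accents as well as macrons. The names `normalise` and `count_hashtags` are hypothetical.

```python
import re
import unicodedata
from collections import Counter

HASHTAG = re.compile(r"#\w+")


def normalise(tag: str) -> str:
    """Lower-case a hashtag and drop combining marks such as macrons."""
    decomposed = unicodedata.normalize("NFD", tag.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def count_hashtags(texts):
    """Count normalised hashtags across an iterable of tweet texts."""
    counts = Counter()
    for text in texts:
        # Compose characters first so a vowel and its macron stay inside one \w match.
        composed = unicodedata.normalize("NFC", text)
        counts.update(normalise(tag) for tag in HASHTAG.findall(composed))
    return counts


sample = ["#Ōtautahi strike", "#otautahi #SchoolStrike4Climate"]
print(count_hashtags(sample).most_common())
```

If a fix of this kind misses a variant, the same hashtag shows up as two separate lines in a trend plot, which matches the symptom described above.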
Loads of them. But primarily:
As ever, #YMMV.
Analysis completed in 67.664 seconds (1.13 minutes) using knitr in RStudio with R version 3.5.1 (2018-07-02) running on x86_64-redhat-linux-gnu.
A special mention must go to https://github.com/mkearney/rtweet (Kearney 2018) for the Twitter API interaction functions.
Other R packages used:
Dowle, M, A Srinivasan, T Short, S Lianoglou with contributions from R Saporta, and E Antonyan. 2015. Data.table: Extension of Data.frame. https://CRAN.R-project.org/package=data.table.
Kearney, Michael W. 2018. Rtweet: Collecting Twitter Data. https://cran.r-project.org/package=rtweet.
R Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Sievert, Carson, Chris Parmer, Toby Hocking, Scott Chamberlain, Karthik Ram, Marianne Corvellec, and Pedro Despouy. 2016. Plotly: Create Interactive Web Graphics via ’Plotly.js’. https://CRAN.R-project.org/package=plotly.
Wickham, Hadley. 2007. “Reshaping Data with the reshape Package.” Journal of Statistical Software 21 (12): 1–20. http://www.jstatsoft.org/v21/i12/.
———. 2009. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.
———. 2016. Stringr: Simple, Consistent Wrappers for Common String Operations. https://CRAN.R-project.org/package=stringr.
Wickham, Hadley, Jim Hester, and Romain Francois. 2016. Readr: Read Tabular Data. https://CRAN.R-project.org/package=readr.
Xie, Yihui. 2016. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://CRAN.R-project.org/package=knitr.
———. 2018. Bookdown: Authoring Books and Technical Documents with R Markdown. https://github.com/rstudio/bookdown.
Zhu, Hao. 2019. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.