My submission for #TidyTuesday 2021 week 34. A look at StarTrek TNG voice interactions with the Enterprise’s computer. In this submission I focus on the ‘locate’ command to find someone on the ship.
This week’s #TidyTuesday dataset is all about Star Trek The Next Generation.
In particular, the data collected by www.speechinteraction.org/TNG/ is about voice interactions of the characters with the ship’s computer. While the dataset comprises all kinds of voice interactions (questions, commands and other utterances) I focus on the ‘locate-command’ alone.
With it, characters can locate other people on the ship, if they are looking for them.1
First let’s load the required packages:
library("tidyverse")
library("ggrepel")
library("ggdark")
library("showtext")
library("reactable")
Then we need to setup the custom fonts for the plot. In this post I do not load the jolly_theme.R
2.
The Star Trek related fonts come with the {trekfont}
-package.
font <- c("StarNext", "TNGcast")
path <- system.file(paste0("fonts/", font, ".ttf"), package = "trekfont")
for (i in 1:2) font_add(font[i], path[i])
font_add_google("Open Sans")
font_families()
#> [1] "sans" "serif" "mono" "wqy-microhei"
#> [5] "StarNext" "TNGcast" "Open Sans"
It all begins with the download of the #TidyTuesday dataset from github:
#load the data and store locally for future runs of the code
tuesdata <- tidytuesdayR::tt_load(2021, week = 34)
#>
#> Downloading file 1 of 1: `computer.csv`
computer <- tuesdata$computer
Counting the number of location-commands is quite easy, as the dataset contains a column specifying who issues the command:
# count how often a character located someone else
searches_by_people <- computer %>%
# ignore interactions by the computer and ignore the Wake Word "Computer" itself
filter(sub_domain == "Locate", !str_detect(char, pattern = "[Cc]omputer"), type != "Wake Word")
searches_by_people_count <- searches_by_people %>%
count(char, sort = TRUE) %>%
mutate(
char = str_to_lower(char),
# use last name for later joining
char = ifelse(char == "beverly", "crusher", char),
char = ifelse(char == "geordi", "la forge", char)
)
Checking, how often someone is being looked for is not as straight forward. Due to time limitations I took a shortcut and compromised possible mis-counts. I basically filter the voice commands for occurrences of the main characters’ names.
# Define People of interest (this is not a complete cast list, but the result of skimming ~90 entries)
people <- str_to_lower(c(
"data", "picard", "captain", "riker", "pulaski", "Goss", "Tam Elbrun", "Barclay",
"Dalen Quaice", "Hill and Selar", "Worf", "La Forge", "Vash", "Diana", "Troi", "Crusher", "Ensign Ro",
"Alexander Rozhenko", "Uhnari", "Morag"
))
# Create a Regex pattern by collapsing the vector with the "or" operator
people_pattern <- paste0(people, collapse = "|")
people_searched <- searches_by_people %>%
mutate(
# make the interactions strings to lower case
interaction_lower = str_to_lower(interaction),
# reduce the interactions strings to the searched person
# e.g. from "computer, locate commander riker" --> "riker" is extracted.
# Caution: This is not the best / generalizable way, but a rather hacky approach
# due to limited time. It works for this use case / dataset.
person_of_interest = str_extract(interaction_lower, pattern = people_pattern)
) %>%
select(interaction, person_of_interest) %>%
filter(!is.na(person_of_interest)) %>%
count(person_of_interest, sort = TRUE) %>%
mutate(person_of_interest = ifelse(person_of_interest == "captain", "picard", person_of_interest))
I created a csv containing the glyphs used for the characters of interest in the TNGcast-font. In Addition I took the appropriate Federation Uniform Colors from the {trekcolors}
package.
relevant <- read_csv2("res/relevant.csv", col_names = TRUE)
# let's take a look:
relevant %>%
filter(!is.na(char)) %>%
reactable(
# global reactable options
defaultSorted = "char",
# defaultSortOrder = "desc",
searchable = TRUE,
highlight = TRUE,
rowStyle = list(cursor = "pointer"),
theme = reactableTheme(
highlightColor = "#1BC7DC"
),
# formatting individual columns
columns =
list(
char = colDef(
name = "Character Name",
sortable = TRUE,
minWidth = 150
),
char_label = colDef(
name = "Label glyph",
minWidth = 50,
sortable = TRUE
),
char_col = colDef(
name = "Uniform color HEX",
minWidth = 100,
sortable = TRUE,
style = function(value) {
list(background = value)
}
)
)
)
As last step before plotting the data is combined:
Now, that the data has been prepared the plot can be drawn.
whereabouts %>%
ggplot(aes(searching, searched)) +
geom_point(size = 3) +
geom_label_repel(
aes(label = char_label, color = char_col),
box.padding = 0.5,
label.padding = 0.5,
max.time = 1,
max.iter = 100000,
family = "TNGcast",
size = 30
) +
labs(
title = "Where is Captain Picard?",
subtitle = "How often did Characters in 'StarTrek TNG' ask the computer to locate someone on the Starship Enterprise\nvs. how often are they being located via the computer.\n",
x = "Times searching someone",
y = "Times being searched",
caption = "\n@c_gebhard | #TidyTuesday Week 34 (2021)\nData source: http://www.speechinteraction.org/TNG/"
) +
coord_trans(x = "sqrt", y = "sqrt") +
scale_x_continuous(breaks = c(0:6, 10, 15, 18)) +
scale_y_continuous(breaks = c(1:7)) +
scale_color_identity() +
dark_theme_minimal() +
theme(
plot.title = element_text(
family = "StarNext",
face = "bold",
size = rel(3),
hjust = 0,
vjust = 5
),
plot.subtitle = element_text(
family = "Open Sans",
size = rel(1.3),
hjust = 0
),
plot.caption = element_text(
size = rel(1.1),
face = "italic",
hjust = 1
),
plot.caption.position = "plot",
plot.margin = margin(1.5, 0.4, 0.4, 0.4, unit = "cm"),
axis.title = element_text(
face = "bold",
size = rel(1.3)
),
axis.title.x = element_text(margin = margin(t = 15, r = 0, b = 0, l = 0)),
axis.title.y = element_text(margin = margin(t = 0, r = 15, b = 0, l = 0), angle = 90),
axis.text = element_text(
size = rel(1.3)
)
)
Figure 1: Characters in StarTrek The Next Generatio (TNG) frequently interact with the shep’s computer via voice commands. One of the computer’s functions is to locate a person on the ship. Within the speechinteractions.org/TNG/ dataset, these ‘locate-commands’ were filtered and analysed. The characters are plotted in regard to how often the used the ‘locate-command’ to find someone vs. how often they are being located via the computer.
ggsave("tt21-34_picard.png", dpi = 96, height = 8, width = 10)
Note that the “officially” submitted plot3 differs from the one above. To meet the deadline I submitted a simpler version with a simple scatterplot.
Being a Star Trek fan I really enjoyed working on the dataset. In this post I shared what I learned in regard to custom fonts and using the {reactable}
package. I hope it was informative to read. If there’s something missing, let me know:
For attribution, please cite this work as
Gebhard (2021, Aug. 26). jolly data: Tea, Earl Grey, Hot. Retrieved from https://jollydata.blog/posts/2021-08-22-tea-earl-grey-hot-tidytuesday-2021-week-34/
BibTeX citation
@misc{gebhard2021tea,, author = {Gebhard, Christian A.}, title = {jolly data: Tea, Earl Grey, Hot}, url = {https://jollydata.blog/posts/2021-08-22-tea-earl-grey-hot-tidytuesday-2021-week-34/}, year = {2021} }