My submission for #TidyTuesday 2021 week 34. A look at Star Trek TNG voice interactions with the Enterprise’s computer. In this submission I focus on the ‘locate’ command to find someone on the ship.
August 26, 2021
This week’s #TidyTuesday dataset is all about Star Trek The Next Generation.
In particular, the data collected by www.speechinteraction.org/TNG/ is about voice interactions of the characters with the ship’s computer. While the dataset comprises all kinds of voice interactions (questions, commands and other utterances), I focus on the ‘locate’ command alone.
With it, characters can locate other people on the ship when they are looking for them.
First let’s load the required packages:
library("tidyverse")
library("ggrepel")
library("ggdark")
library("showtext")
library("reactable")
Then we need to set up the custom fonts for the plot. In this post I do not load the jolly_theme.R script.
The Star Trek related fonts come with the {trekfont} package.
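The font registration could look roughly like the sketch below. It is not the post's original code: it assumes that {trekfont} is installed and ships its `.ttf` files in a `fonts` folder, and that "Open Sans" is fetched from Google Fonts, which needs internet access. The variable names `trek_fonts` and `trek_paths` are made up for illustration.

```r
library(showtext)

# Assumption: {trekfont} is installed and keeps its .ttf files in "fonts/"
trek_fonts <- c("StarNext", "TNGcast")
trek_paths <- system.file(paste0("fonts/", trek_fonts, ".ttf"), package = "trekfont")

# Register each font file under its family name
for (i in seq_along(trek_fonts)) {
  font_add(family = trek_fonts[i], regular = trek_paths[i])
}

# Assumption: "Open Sans" comes from Google Fonts (requires internet access)
font_add_google("Open Sans")

# Let showtext handle text rendering in all subsequent plots
showtext_auto()

# List the font families that are now available
font_families()
```

If the registration succeeds, `font_families()` should list the new families next to the defaults.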
#> [1] "sans" "serif" "mono" "wqy-microhei" "StarNext"
#> [6] "TNGcast" "Open Sans"
It all begins with the download of the #TidyTuesday dataset from GitHub:
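A minimal sketch of the download, assuming the file is read directly from the TidyTuesday repository with {readr}. The URL follows the repository's usual layout for the week of 2021-08-17, and the object name `computer` matches the code further down. The original post may have used a different route, for example the {tidytuesdayR} package.

```r
library(readr)

# Assumption: the week 34 data lives under data/2021/2021-08-17/ in the repository
computer_url <- paste0(
  "https://raw.githubusercontent.com/rfordatascience/tidytuesday/",
  "master/data/2021/2021-08-17/computer.csv"
)

# Read the voice interactions into a tibble (needs internet access)
computer <- read_csv(computer_url)
```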
Counting the number of location-commands is quite easy, as the dataset contains a column specifying who issues the command:
# count how often a character located someone else
searches_by_people <- computer %>%
  # ignore interactions by the computer and ignore the Wake Word "Computer" itself
  filter(sub_domain == "Locate", !str_detect(char, pattern = "[Cc]omputer"), type != "Wake Word")

head(searches_by_people) |>
  knitr::kable()
name | char | line | direction | type | pri_type | domain | sub_domain | nv_resp | interaction | char_type | is_fed | error | value_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
102 | Young Ensign | You must be new to these galaxy class starships sir. (puts hand on the black surface saying) Tell me the location of Commander Data. | At the touch and the words ‘Tell me’ the black surface comes alive with light patterns showing appropriate information. | Command | Command | InfoSeek | Locate | FALSE | Tell me the location of Commander Data. | Person | TRUE | FALSE | 421 |
110 | Riker | Computer tell me Captain Picard’s location! | NA | Command | Command | InfoSeek | Locate | FALSE | Computer tell me Captain Picard’s location! | Person | TRUE | FALSE | 326 |
116 | Data | Computer where are the captain and Commander Riker? | NA | Question | Question | InfoSeek | Locate | FALSE | Computer where are the captain and Commander Riker? | Person | TRUE | FALSE | 236 |
116 | Picard | (beat thinks) Locate Lieutenant Commander Data. | NA | Command | Command | InfoSeek | Locate | FALSE | Locate Lieutenant Commander Data. | Person | TRUE | FALSE | 289 |
126 | Ralph | Ah… let’s see. (to computer) Ah… I want to go to a… the… ah… (he shrugs; then to himself:) Where would the captain be? | To his astonishemnt the computer answers: | Question | Question | InfoSeek | Locate | FALSE | Ah… let’s see. (to computer) Ah… I want to go to a… the… ah… (he shrugs; then to himself:) Where would the captain be? | Person | TRUE | FALSE | 367 |
126 | Ralph | Ah… let’s see. (to computer) Ah… I want to go to a… the… ah… (he shrugs; then to himself:) Where would the captain be? | To his astonishemnt the computer answers: | Question | Question | IoT | Locate | FALSE | Ah… let’s see. (to computer) Ah… I want to go to a… the… ah… (he shrugs; then to himself:) Where would the captain be? | Person | TRUE | FALSE | 367 |
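The per-character tally in the next table can be produced with `count()`. The sketch below uses a small made-up tibble, `toy_searches`, in place of `searches_by_people`. Lower-casing the names is an assumption, made so that they line up with the lower-case names in the table.

```r
library(dplyr)
library(stringr)

# Toy stand-in for `searches_by_people`: only the `char` column matters here
toy_searches <- tibble(
  char = c("Young Ensign", "Riker", "Data", "Picard", "Ralph", "Ralph")
)

toy_counts <- toy_searches %>%
  # lower-case the names so they can be matched against other tables later
  mutate(char = str_to_lower(char)) %>%
  # one row per character, most frequent first
  count(char, sort = TRUE)

toy_counts
```

Note that the two Ralph rows in the preview above hold the same utterance filed under two domains (InfoSeek and IoT), so a tally like this counts it twice.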
char | n |
---|---|
picard | 18 |
riker | 9 |
data | 6 |
crusher | 4 |
worf | 4 |
mrs. troi | 3 |
la forge | 2 |
ralph | 2 |
troi | 2 |
young ensign | 1 |
Checking how often someone is being looked for is not as straightforward. Due to time limitations I took a shortcut and accepted possible miscounts: I simply filter the voice commands for occurrences of the characters’ names.
As there are fewer than 100 rows with locate commands, I skimmed them for the names used to locate people and put these in a vector.
# Define People of interest (this is not a complete cast list, but the result of skimming ~90 entries)
people <- str_to_lower(c(
  "data", "picard", "captain", "riker", "pulaski", "Goss", "Tam Elbrun", "Barclay",
  "Dalen Quaice", "Hill and Selar", "Worf", "La Forge", "Vash", "Diana", "Troi", "Crusher", "Ensign Ro",
  "Alexander Rozhenko", "Uhnari", "Morag"
))
# Create a Regex pattern by collapsing the vector with the "or" operator
people_pattern <- paste0(people, collapse = "|")
people_searched <- searches_by_people %>%
  mutate(
    # make the interaction strings lower case
    interaction_lower = str_to_lower(interaction),
    # reduce the interaction strings to the searched person
    # e.g. from "computer, locate commander riker" --> "riker" is extracted.
    # Caution: This is not the best / generalizable way, but a rather hacky approach
    # due to limited time. It works for this use case / dataset.
    person_of_interest = str_extract(interaction_lower, pattern = people_pattern),
    # treat "captain" as Picard; recode before counting so that both
    # spellings end up in a single row
    person_of_interest = ifelse(person_of_interest == "captain", "picard", person_of_interest)
  ) %>%
  select(interaction, person_of_interest) %>%
  filter(!is.na(person_of_interest)) %>%
  count(person_of_interest, sort = TRUE)

people_searched |>
  knitr::kable()
person_of_interest | n |
---|---|
picard | 7 |
data | 5 |
la forge | 5 |
riker | 4 |
barclay | 3 |
worf | 3 |
alexander rozhenko | 2 |
dalen quaice | 2 |
troi | 2 |
crusher | 1 |
ensign ro | 1 |
goss | 1 |
hill and selar | 1 |
morag | 1 |
pulaski | 1 |
tam elbrun | 1 |
vash | 1 |
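Two limits of this shortcut are worth keeping in mind: `str_extract()` returns only the first match per line, and a plain alternation also matches inside longer words. The sketch below illustrates both with a shortened, hypothetical name list (`toy_people`) and three example lines, the last of which is made up. Word boundaries and `str_extract_all()` are one possible refinement, not what the post itself uses.

```r
library(stringr)

# Hypothetical, shortened name list; the full vector is defined in the post
toy_people <- c("data", "picard", "captain", "riker")

# \\b anchors keep "data" from matching inside words such as "database"
toy_pattern <- paste0("\\b(", paste0(toy_people, collapse = "|"), ")\\b")

toy_lines <- str_to_lower(c(
  "Computer where are the captain and Commander Riker?",
  "Locate Lieutenant Commander Data.",
  "Search the database." # made-up line to show the word-boundary effect
))

# str_extract() keeps only the first match per line ...
first_hit <- str_extract(toy_lines, toy_pattern)

# ... while str_extract_all() returns every name mentioned in a line
all_hits <- str_extract_all(toy_lines, toy_pattern)

first_hit
all_hits
```

With the first-match approach, a question about two people (as in the first line) is credited to only one of them.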
I created a CSV file containing the glyphs that represent the characters of interest in the TNGcast font. In addition, I took the matching Federation uniform colors from the {trekcolors} package.
char | char_label | char_col |
---|---|---|
Alexander Rozhenko | a | #CCCCCC |
Crusher | c | #1A6384 |
Data | d | #AD722C |
Picard | j | #5B1414 |
La Forge | l | #AD722C |
Pulaski | p | #1A6384 |
Riker | r | #5B1414 |
Troi | t | #582f5e |
Worf | w | #AD722C |
Barclay | z | #AD722C |
As a last step before plotting, the data is combined:
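A minimal sketch of how such a combination could look, using toy excerpts of the three tables above. The column names `searching`, `searched`, `char_label` and `char_col` are taken from the plotting code; the `toy_*` objects, the join order and the replacement of missing counts with 0 are assumptions.

```r
library(dplyr)
library(stringr)

# Toy excerpts of the three tables shown above (not the full data)
toy_searching <- tibble(char = c("picard", "data"), searching = c(18, 6))
toy_searched <- tibble(char = c("picard", "data", "barclay"), searched = c(7, 5, 3))
toy_glyphs <- tibble(
  char       = c("Picard", "Data", "Barclay"),
  char_label = c("j", "d", "z"),
  char_col   = c("#5B1414", "#AD722C", "#AD722C")
)

toy_whereabouts <- toy_glyphs %>%
  # the count tables use lower-case names, so align the glyph table first
  mutate(char = str_to_lower(char)) %>%
  left_join(toy_searched, by = "char") %>%
  left_join(toy_searching, by = "char") %>%
  # characters without a count in one table get 0 instead of NA
  mutate(across(c(searching, searched), ~ coalesce(.x, 0)))

toy_whereabouts
```

Starting from the glyph table keeps only the characters that have a glyph and a color, which is what the plot needs.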
Now that the data has been prepared, the plot can be drawn.
whereabouts %>%
  ggplot(aes(searching, searched)) +
  geom_point(size = 3) +
  geom_label_repel(
    aes(label = char_label, color = char_col),
    box.padding = 0.5,
    label.padding = 0.5,
    max.time = 1,
    max.iter = 100000,
    family = "TNGcast",
    size = 30
  ) +
  labs(
    title = "Where is Captain Picard?",
    subtitle = "How often did Characters in 'StarTrek TNG' ask the computer to locate someone on the Starship Enterprise\nvs. how often are they being located via the computer.\n",
    x = "Times searching someone",
    y = "Times being searched",
    caption = "\n@c_gebhard | #TidyTuesday Week 34 (2021)\nData source: http://www.speechinteraction.org/TNG/"
  ) +
  coord_trans(x = "sqrt", y = "sqrt") +
  scale_x_continuous(breaks = c(0:6, 10, 15, 18)) +
  scale_y_continuous(breaks = c(1:7)) +
  scale_color_identity() +
  dark_theme_minimal() +
  theme(
    plot.title = element_text(
      family = "StarNext",
      face = "bold",
      size = rel(3),
      hjust = 0,
      vjust = 5
    ),
    plot.subtitle = element_text(
      family = "Open Sans",
      size = rel(1.3),
      hjust = 0
    ),
    plot.caption = element_text(
      size = rel(1.1),
      face = "italic",
      hjust = 1
    ),
    plot.caption.position = "plot",
    plot.margin = margin(1.5, 0.4, 0.4, 0.4, unit = "cm"),
    axis.title = element_text(
      face = "bold",
      size = rel(1.3)
    ),
    axis.title.x = element_text(margin = margin(t = 15, r = 0, b = 0, l = 0)),
    axis.title.y = element_text(margin = margin(t = 0, r = 15, b = 0, l = 0), angle = 90),
    axis.text = element_text(
      size = rel(1.3)
    )
  )
Note that the “officially” submitted plot differs from the one above. To meet the deadline I submitted a simpler version, a plain scatterplot.
@online{gebhard2021,
author = {Gebhard, Christian},
title = {Tea, {Earl} {Grey,} {Hot}},
date = {2021-08-26},
url = {https://christiangebhard.com/posts/2021-08-22-tea-earl-grey-hot-tidytuesday-2021-week-34/tea-earl-grey-hot-tidytuesday-2021-week-34.html},
langid = {en}
}
Comments
Being a Star Trek fan, I really enjoyed working on the dataset. In this post I shared what I learned about custom fonts and using the {reactable} package. I hope it was informative to read. If there’s something missing, let me know: