Examining EOD Duration from Raw Recordings
Last updated on 2026-05-08 | Edit this page
Overview
Questions
- How do we load JSON EOD recordings into R?
- How is EOD duration measured from a raw waveform?
- Did 11-ketotestosterone treatment change EOD duration over time?
Objectives
- Load JSON recording files into R using the
jsonlitepackage - Navigate a list of recording objects to extract waveform data and metadata
- Visualize a raw EOD waveform using
ggplot2 - Measure EOD duration by applying a threshold to a normalized waveform
- Build a tidy dataset by applying a duration function across many files
- Check ANOVA assumptions using Levene’s and Shapiro-Wilk tests
- Perform a one-way ANOVA and interpret the result
- Interpret a faceted comparison plot of EOD duration across treatment groups
Introduction
In the previous episodes, you learned how testosterone shapes the electric organ discharge (EOD) in Brienomyrus brachyistius, and you performed the implantation procedure yourself. Now it’s time to analyze the data.
EOD recordings were captured daily over 8 days for animals implanted with either 11-ketotestosterone (11-kt) or a blank control. Each recording session is stored as a JSON file containing multiple trials. Our goal is to:
- Load a raw EOD waveform and visualize it
- Write a function that measures EOD duration from any recording file
- Apply that function across all recording sessions to build a dataset
- Test whether 11-kt treatment changed EOD duration over time
Loading EOD Recording Files in R
JSON files can be read in R using the jsonlite package.
The read_json() function returns a list that maps directly
onto the file structure.
R
library(jsonlite)
eod_data <- read_json("data/eod_duration/DAY0_MAY5/BB_T48.json")
cat("Number of recordings:", length(eod_data), "\n")
OUTPUT
Number of recordings: 11
Each element of eod_data is one recording trial, stored
as a named list. Use names(eod_data[[1]]) to see the
available fields and $ to access them.
Challenge 1: Explore the recording metadata (5 min)
Using the eod_data object already loaded, print the
specimen number, species, temperature, sampling rate, and number of
waveform samples for the first recording trial.
Hint: names(eod_data[[1]]) shows the available
fields.
R
names(eod_data[[1]])
cat("Specimen:", eod_data[[1]]$specimenno, "\n")
cat("Species: ", eod_data[[1]]$species, "\n")
cat("Temp: ", eod_data[[1]]$temp, "\n")
cat("Rate: ", eod_data[[1]]$Rate, "\n")
cat("Samples: ", length(eod_data[[1]]$wave), "\n")
Visualizing an EOD Waveform
We convert sample indices to milliseconds using the sampling rate
stored in the file, then plot with ggplot2.
R
library(ggplot2)
rate_hz <- eod_data[[1]]$Rate
wave <- unlist(eod_data[[1]]$wave)
p1_idx <- which.max(wave)
time_ms <- (seq_along(wave) - p1_idx) / rate_hz * 1000
waveform_df <- data.frame(time_ms = time_ms, voltage = wave)
ggplot(waveform_df, aes(x = time_ms, y = voltage)) +
geom_line(linewidth = 0.6) +
coord_cartesian(xlim = c(-2, 2)) +
theme_classic() +
labs(
x = "Time (ms)",
y = "Voltage (V)",
title = paste("Raw EOD waveform —", eod_data[[1]]$specimenno, "Day 0")
)

Challenge 2: Compare waveforms across recording sessions (5 min)
Load the first recording from
data/eod_duration/DAY0_MAY7/BBRAC_T48_MAY7.json and plot it
alongside the May 5 waveform. Do the shapes look similar?
R
eod_may7 <- read_json("data/eod_duration/DAY0_MAY7/BBRAC_T48_MAY7.json")
wave7 <- unlist(eod_may7[[1]]$wave)
p1_idx7 <- which.max(wave7)
time7_ms <- (seq_along(wave7) - p1_idx7) / eod_may7[[1]]$Rate * 1000
df_compare <- rbind(
data.frame(time_ms = time_ms, voltage = wave, session = "May 5"),
data.frame(time_ms = time7_ms, voltage = wave7, session = "May 7")
)
ggplot(df_compare, aes(x = time_ms, y = voltage, color = session)) +
geom_line(linewidth = 0.6) +
coord_cartesian(xlim = c(-2, 2)) +
theme_classic() +
scale_colour_manual(values = c("May 5" = "grey50", "May 7" = "tomato")) +
labs(x = "Time (ms)", y = "Voltage (V)",
title = "Day 0 EOD waveforms — two recording sessions",
color = NULL)
Both are Day 0 baseline recordings taken before any treatment, so the waveform shapes should be similar.
Measuring EOD Duration
EOD duration is not read directly from the file — it must be calculated from the waveform. The standard approach is:
- Normalize the waveform so that the peak-to-peak amplitude = 1
- Apply a threshold (typically 10% of the maximum absolute value) to define where the EOD “starts” and “ends”
- Calculate duration as the time elapsed between the first and last threshold crossings
R
# 1. Normalize to peak-to-peak = 1
wave_norm <- wave / (max(wave) - min(wave))
# 2. Apply 10% threshold
threshold <- 0.1 * max(abs(wave_norm))
above_thresh <- which(abs(wave_norm) > threshold)
start_idx <- min(above_thresh)
end_idx <- max(above_thresh)
# 3. Duration in milliseconds
duration_ms <- (end_idx - start_idx) / rate_hz * 1000
cat("EOD duration (10% threshold):", round(duration_ms, 3), "ms\n")
OUTPUT
EOD duration (10% threshold): 0.19 ms
We can visualize the threshold on the normalized waveform to see exactly which part of the signal is being measured:
R
norm_df <- data.frame(time_ms = time_ms, voltage = wave_norm)
ggplot(norm_df, aes(x = time_ms, y = voltage)) +
geom_line(linewidth = 0.6) +
geom_hline(yintercept = c(threshold, -threshold),
linetype = "dashed", color = "steelblue", linewidth = 0.5) +
geom_vline(xintercept = c(time_ms[start_idx], time_ms[end_idx]),
linetype = "dashed", color = "tomato", linewidth = 0.5) +
coord_cartesian(xlim = c(-2, 2)) +
theme_classic() +
labs(
x = "Time (ms)",
y = "Normalized voltage",
title = "EOD duration measurement (10% threshold)"
)

Challenge 3: Sensitivity of the threshold (5 min)
Repeat the duration calculation using a 20% threshold instead of 10%. How does the measured duration change, and why?
This is important: the choice of threshold is an analytical decision that must be reported in any publication.
R
threshold_20 <- 0.2 * max(abs(wave_norm))
above_20 <- which(abs(wave_norm) > threshold_20)
duration_20ms <- (max(above_20) - min(above_20)) / rate_hz * 1000
cat("Duration at 20% threshold:", round(duration_20ms, 3), "ms\n")
cat("Duration at 10% threshold:", round(duration_ms, 3), "ms\n")
A higher threshold excludes the low-amplitude tails of the waveform, making the measured duration shorter. Neither threshold is “wrong” — but the choice must be consistent across all animals in a study and clearly reported.
Building a Dataset Across All Recording Sessions
Now that we know how to measure EOD duration from a single file, we can apply the same logic to every recording session in the study. Rather than computing durations by hand for each file, we write a function and let R do the work.
The study metadata is stored in recordings.csv, which
maps each JSON file to its animal, day, and treatment group:
R
library(dplyr)
metadata <- read.csv("data/eod_duration/recordings.csv")
metadata$Treatment <- as.factor(metadata$Treatment)
head(metadata)
OUTPUT
file specimenno day Treatment
1 DAY0_MAY5/BB_T48.json BB48 0 11-kt
2 DAY0_MAY5/BB_T49.json BB49 0 11-kt
3 DAY0_MAY5/BB_T50.json BB50 0 control
4 DAY0_MAY5/BB_T51.json BB51 0 control
5 DAY0_MAY7/BBRAC_T48_MAY7.json BB48 1 11-kt
6 DAY0_MAY7/BBRAC_T49_MAY7.json BB49 1 11-kt
We write a small function that takes a file path, loads the JSON, computes EOD duration for every trial, and returns the mean:
R
mean_eod_duration <- function(file_path, threshold = 0.1) {
recs <- read_json(file_path)
durations <- sapply(recs, function(rec) {
w <- unlist(rec$wave)
rate <- rec$Rate
w_norm <- w / (max(w) - min(w))
thresh <- threshold * max(abs(w_norm))
above <- which(abs(w_norm) > thresh)
(max(above) - min(above)) / rate * 1000
})
mean(durations)
}
We then apply this function to every row in the metadata table:
R
metadata$EOD <- sapply(
file.path("data/eod_duration", metadata$file),
mean_eod_duration
)
head(metadata)
OUTPUT
file specimenno day Treatment EOD
1 DAY0_MAY5/BB_T48.json BB48 0 11-kt 0.1900000
2 DAY0_MAY5/BB_T49.json BB49 0 11-kt 0.2260000
3 DAY0_MAY5/BB_T50.json BB50 0 control 0.1900000
4 DAY0_MAY5/BB_T51.json BB51 0 control 0.1900000
5 DAY0_MAY7/BBRAC_T48_MAY7.json BB48 1 11-kt 0.2472727
6 DAY0_MAY7/BBRAC_T49_MAY7.json BB49 1 11-kt 0.2000000
Longitudinal trajectory
With the full dataset assembled, we can visualize how EOD duration changes across the 8-day experiment for each treatment group. Points show the group mean; error bars show ±1 standard deviation.
R
ggplot(metadata, aes(x = day, y = EOD, group = Treatment, color = Treatment)) +
stat_summary(
fun = "mean",
geom = "pointrange",
fun.max = function(x) mean(x) + sd(x),
fun.min = function(x) mean(x) - sd(x),
position = position_dodge(width = 0.2, preserve = "total"),
size = 0.7,
fatten = 0.5
) +
scale_x_continuous(name = "Day", breaks = 0:8) +
scale_y_continuous(
name = "EOD duration (ms)",
breaks = scales::breaks_pretty(10)
) +
scale_colour_manual(values = c("11-kt" = "tomato", "control" = "grey40")) +
theme_classic() +
theme(
axis.text = element_text(size = 9),
legend.text = element_text(size = 8),
legend.title = element_text(size = 9)
)

Testing assumptions
Before running an ANOVA, we verify two key assumptions:
-
Homogeneity of variance — tested with Levene’s test
(
car::leveneTest) - Normality of residuals — tested with the Shapiro-Wilk test
R
suppressPackageStartupMessages(library(car))
# Extract Day 0 baseline data
d0 <- metadata %>% filter(day == 0)
res_aov_d0 <- aov(EOD ~ Treatment, data = d0)
lev_d0 <- leveneTest(res_aov_d0)
print(lev_d0)
OUTPUT
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 1 2.9908e+30 < 2.2e-16 ***
2
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R
shp_d0 <- shapiro.test(residuals(res_aov_d0))
print(shp_d0)
OUTPUT
Shapiro-Wilk normality test
data: residuals(res_aov_d0)
W = 0.94466, p-value = 0.683
If neither test is significant (p > 0.05), the ANOVA assumptions are met.
Challenge 4: Full statistical comparison (10 min)
Work through the following steps:
Step 1. Inspect the Day 0 ANOVA — do the treatment groups differ at baseline?
R
summary(res_aov_d0)
Step 2. Extract Day 8 data and run the same ANOVA. Check assumptions first.
R
d8 <- metadata %>% filter(day == 8)
res_aov_d8 <- aov(EOD ~ Treatment, data = d8)
leveneTest(res_aov_d8)
shapiro.test(residuals(res_aov_d8))
summary(res_aov_d8)
Step 3 (stretch). Reproduce the faceted plot below, showing Day 0 and Day 8 distributions side by side with individual data points overlaid.
R
# Step 1 — baseline check
summary(res_aov_d0)
# Non-significant: groups were similar before treatment.
# Step 2 — end-of-treatment comparison
d8 <- metadata %>% filter(day == 8)
res_aov_d8 <- aov(EOD ~ Treatment, data = d8)
leveneTest(res_aov_d8)
shapiro.test(residuals(res_aov_d8))
summary(res_aov_d8)
# Step 3 — faceted plot
d_plot <- bind_rows(
mutate(d0, timepoint = "Day 0"),
mutate(d8, timepoint = "Day 8")
)
ggplot(d_plot, aes(x = Treatment, y = EOD, color = Treatment)) +
stat_summary(
fun = "mean",
geom = "pointrange",
fun.max = function(x) mean(x) + sd(x),
fun.min = function(x) mean(x) - sd(x),
fatten = 1.8
) +
geom_point(shape = 2, position = position_jitter(width = 0.07, seed = 84)) +
facet_wrap(~timepoint) +
scale_y_continuous(
name = "EOD duration (ms)",
breaks = scales::breaks_pretty(10)
) +
scale_colour_manual(values = c("11-kt" = "tomato", "control" = "grey40")) +
theme_classic() +
theme(
axis.text = element_text(size = 9),
legend.position = "none",
panel.border = element_rect(colour = "black", fill = NA)
)
-
jsonlite::read_json()loads JSON files as nested R lists; each recording trial is a named list with fields likewave,Rate,temp, andspecimenno - EOD duration is defined by a threshold criterion applied to the normalized waveform — the choice of threshold is an analytical decision that must be reported
- A metadata CSV that maps files to experimental conditions lets you build a tidy dataset by applying a duration function across all recording sessions
- Always verify ANOVA assumptions (Levene’s test for equal variance; Shapiro-Wilk on residuals for normality) before interpreting results