Examining EOD Duration from Raw Recordings

Last updated on 2026-05-08 | Edit this page

Overview

Questions

  • How do we load JSON EOD recordings into R?
  • How is EOD duration measured from a raw waveform?
  • Did 11-ketotestosterone treatment change EOD duration over time?

Objectives

  • Load JSON recording files into R using the jsonlite package
  • Navigate a list of recording objects to extract waveform data and metadata
  • Visualize a raw EOD waveform using ggplot2
  • Measure EOD duration by applying a threshold to a normalized waveform
  • Build a tidy dataset by applying a duration function across many files
  • Check ANOVA assumptions using Levene’s and Shapiro-Wilk tests
  • Perform a one-way ANOVA and interpret the result
  • Interpret a faceted comparison plot of EOD duration across treatment groups

Introduction


In the previous episodes, you learned how testosterone shapes the electric organ discharge (EOD) in Brienomyrus brachyistius, and you performed the implantation procedure yourself. Now it’s time to analyze the data.

EOD recordings were captured daily over 8 days for animals implanted with either 11-ketotestosterone (11-kt) or a blank control. Each recording session is stored as a JSON file containing multiple trials. Our goal is to:

  1. Load a raw EOD waveform and visualize it
  2. Write a function that measures EOD duration from any recording file
  3. Apply that function across all recording sessions to build a dataset
  4. Test whether 11-kt treatment changed EOD duration over time

Loading EOD Recording Files in R


JSON files can be read in R using the jsonlite package. The read_json() function returns a list that maps directly onto the file structure.

R

library(jsonlite)

eod_data <- read_json("data/eod_duration/DAY0_MAY5/BB_T48.json")

cat("Number of recordings:", length(eod_data), "\n")

OUTPUT

Number of recordings: 11 

Each element of eod_data is one recording trial, stored as a named list. Use names(eod_data[[1]]) to see the available fields and $ to access them.

Challenge

Challenge 1: Explore the recording metadata (5 min)

Using the eod_data object already loaded, print the specimen number, species, temperature, sampling rate, and number of waveform samples for the first recording trial.

Hint: names(eod_data[[1]]) shows the available fields.

R

names(eod_data[[1]])
cat("Specimen:", eod_data[[1]]$specimenno, "\n")
cat("Species: ", eod_data[[1]]$species, "\n")
cat("Temp:    ", eod_data[[1]]$temp, "\n")
cat("Rate:    ", eod_data[[1]]$Rate, "\n")
cat("Samples: ", length(eod_data[[1]]$wave), "\n")

Visualizing an EOD Waveform


We convert sample indices to milliseconds using the sampling rate stored in the file, then plot with ggplot2.

R

library(ggplot2)

rate_hz <- eod_data[[1]]$Rate
wave    <- unlist(eod_data[[1]]$wave)
p1_idx  <- which.max(wave)
time_ms <- (seq_along(wave) - p1_idx) / rate_hz * 1000

waveform_df <- data.frame(time_ms = time_ms, voltage = wave)

ggplot(waveform_df, aes(x = time_ms, y = voltage)) +
  geom_line(linewidth = 0.6) +
  coord_cartesian(xlim = c(-2, 2)) +
  theme_classic() +
  labs(
    x = "Time (ms)",
    y = "Voltage (V)",
    title = paste("Raw EOD waveform —", eod_data[[1]]$specimenno, "Day 0")
  )
Line plot of a single EOD waveform showing voltage in volts on the y-axis and time in milliseconds on the x-axis. The waveform has a characteristic biphasic shape with a positive P1 peak followed by a negative P2 trough.
A single raw EOD waveform recorded from B. brachyistius. The biphasic waveform is characteristic of mormyrid electric fish.
Challenge

Challenge 2: Compare waveforms across recording sessions (5 min)

Load the first recording from data/eod_duration/DAY0_MAY7/BBRAC_T48_MAY7.json and plot it alongside the May 5 waveform. Do the shapes look similar?

R

eod_may7  <- read_json("data/eod_duration/DAY0_MAY7/BBRAC_T48_MAY7.json")
wave7     <- unlist(eod_may7[[1]]$wave)
p1_idx7   <- which.max(wave7)
time7_ms  <- (seq_along(wave7) - p1_idx7) / eod_may7[[1]]$Rate * 1000

df_compare <- rbind(
  data.frame(time_ms = time_ms,  voltage = wave,  session = "May 5"),
  data.frame(time_ms = time7_ms, voltage = wave7, session = "May 7")
)

ggplot(df_compare, aes(x = time_ms, y = voltage, color = session)) +
  geom_line(linewidth = 0.6) +
  coord_cartesian(xlim = c(-2, 2)) +
  theme_classic() +
  scale_colour_manual(values = c("May 5" = "grey50", "May 7" = "tomato")) +
  labs(x = "Time (ms)", y = "Voltage (V)",
       title = "Day 0 EOD waveforms — two recording sessions",
       color = NULL)

Both are Day 0 baseline recordings taken before any treatment, so the waveform shapes should be similar.

Measuring EOD Duration


EOD duration is not read directly from the file — it must be calculated from the waveform. The standard approach is:

  1. Normalize the waveform so that the peak-to-peak amplitude = 1
  2. Apply a threshold (typically 10% of the maximum absolute value) to define where the EOD “starts” and “ends”
  3. Calculate duration as the time elapsed between the first and last threshold crossings

R

# 1. Normalize to peak-to-peak = 1
wave_norm <- wave / (max(wave) - min(wave))

# 2. Apply 10% threshold
threshold    <- 0.1 * max(abs(wave_norm))
above_thresh <- which(abs(wave_norm) > threshold)

start_idx <- min(above_thresh)
end_idx   <- max(above_thresh)

# 3. Duration in milliseconds
duration_ms <- (end_idx - start_idx) / rate_hz * 1000
cat("EOD duration (10% threshold):", round(duration_ms, 3), "ms\n")

OUTPUT

EOD duration (10% threshold): 0.19 ms

We can visualize the threshold on the normalized waveform to see exactly which part of the signal is being measured:

R

norm_df <- data.frame(time_ms = time_ms, voltage = wave_norm)

ggplot(norm_df, aes(x = time_ms, y = voltage)) +
  geom_line(linewidth = 0.6) +
  geom_hline(yintercept = c(threshold, -threshold),
             linetype = "dashed", color = "steelblue", linewidth = 0.5) +
  geom_vline(xintercept = c(time_ms[start_idx], time_ms[end_idx]),
             linetype = "dashed", color = "tomato", linewidth = 0.5) +
  coord_cartesian(xlim = c(-2, 2)) +
  theme_classic() +
  labs(
    x     = "Time (ms)",
    y     = "Normalized voltage",
    title = "EOD duration measurement (10% threshold)"
  )
Normalized EOD waveform with horizontal dashed lines at +10% and -10% of the peak amplitude. Vertical dashed lines mark the measured start and end of the EOD, highlighting the duration window.
Threshold-based EOD duration measurement on the normalized waveform.
Challenge

Challenge 3: Sensitivity of the threshold (5 min)

Repeat the duration calculation using a 20% threshold instead of 10%. How does the measured duration change, and why?

This is important: the choice of threshold is an analytical decision that must be reported in any publication.

R

threshold_20  <- 0.2 * max(abs(wave_norm))
above_20      <- which(abs(wave_norm) > threshold_20)

duration_20ms <- (max(above_20) - min(above_20)) / rate_hz * 1000
cat("Duration at 20% threshold:", round(duration_20ms, 3), "ms\n")
cat("Duration at 10% threshold:", round(duration_ms, 3), "ms\n")

A higher threshold excludes the low-amplitude tails of the waveform, making the measured duration shorter. Neither threshold is “wrong” — but the choice must be consistent across all animals in a study and clearly reported.

Building a Dataset Across All Recording Sessions


Now that we know how to measure EOD duration from a single file, we can apply the same logic to every recording session in the study. Rather than computing durations by hand for each file, we write a function and let R do the work.

The study metadata is stored in recordings.csv, which maps each JSON file to its animal, day, and treatment group:

R

library(dplyr)

metadata <- read.csv("data/eod_duration/recordings.csv")
metadata$Treatment <- as.factor(metadata$Treatment)

head(metadata)

OUTPUT

                           file specimenno day Treatment
1         DAY0_MAY5/BB_T48.json       BB48   0     11-kt
2         DAY0_MAY5/BB_T49.json       BB49   0     11-kt
3         DAY0_MAY5/BB_T50.json       BB50   0   control
4         DAY0_MAY5/BB_T51.json       BB51   0   control
5 DAY0_MAY7/BBRAC_T48_MAY7.json       BB48   1     11-kt
6 DAY0_MAY7/BBRAC_T49_MAY7.json       BB49   1     11-kt

We write a small function that takes a file path, loads the JSON, computes EOD duration for every trial, and returns the mean:

R

mean_eod_duration <- function(file_path, threshold = 0.1) {
  recs <- read_json(file_path)
  durations <- sapply(recs, function(rec) {
    w      <- unlist(rec$wave)
    rate   <- rec$Rate
    w_norm <- w / (max(w) - min(w))
    thresh <- threshold * max(abs(w_norm))
    above  <- which(abs(w_norm) > thresh)
    (max(above) - min(above)) / rate * 1000
  })
  mean(durations)
}

We then apply this function to every row in the metadata table:

R

metadata$EOD <- sapply(
  file.path("data/eod_duration", metadata$file),
  mean_eod_duration
)

head(metadata)

OUTPUT

                           file specimenno day Treatment       EOD
1         DAY0_MAY5/BB_T48.json       BB48   0     11-kt 0.1900000
2         DAY0_MAY5/BB_T49.json       BB49   0     11-kt 0.2260000
3         DAY0_MAY5/BB_T50.json       BB50   0   control 0.1900000
4         DAY0_MAY5/BB_T51.json       BB51   0   control 0.1900000
5 DAY0_MAY7/BBRAC_T48_MAY7.json       BB48   1     11-kt 0.2472727
6 DAY0_MAY7/BBRAC_T49_MAY7.json       BB49   1     11-kt 0.2000000

Longitudinal trajectory

With the full dataset assembled, we can visualize how EOD duration changes across the 8-day experiment for each treatment group. Points show the group mean; error bars show ±1 standard deviation.

R

ggplot(metadata, aes(x = day, y = EOD, group = Treatment, color = Treatment)) +
  stat_summary(
    fun     = "mean",
    geom    = "pointrange",
    fun.max = function(x) mean(x) + sd(x),
    fun.min = function(x) mean(x) - sd(x),
    position = position_dodge(width = 0.2, preserve = "total"),
    size    = 0.7,
    fatten  = 0.5
  ) +
  scale_x_continuous(name = "Day", breaks = 0:8) +
  scale_y_continuous(
    name   = "EOD duration (ms)",
    breaks = scales::breaks_pretty(10)
  ) +
  scale_colour_manual(values = c("11-kt" = "tomato", "control" = "grey40")) +
  theme_classic() +
  theme(
    axis.text    = element_text(size = 9),
    legend.text  = element_text(size = 8),
    legend.title = element_text(size = 9)
  )
Line plot showing mean EOD duration in milliseconds on the y-axis and experimental day on the x-axis, with separate colored lines for the 11-kt and control groups. Error bars represent one standard deviation.
Mean EOD duration (±SD) across experimental days by treatment group.

Testing assumptions

Before running an ANOVA, we verify two key assumptions:

  • Homogeneity of variance — tested with Levene’s test (car::leveneTest)
  • Normality of residuals — tested with the Shapiro-Wilk test

R

suppressPackageStartupMessages(library(car))

# Extract Day 0 baseline data
d0 <- metadata %>% filter(day == 0)

res_aov_d0 <- aov(EOD ~ Treatment, data = d0)

lev_d0 <- leveneTest(res_aov_d0)
print(lev_d0)

OUTPUT

Levene's Test for Homogeneity of Variance (center = median)
      Df    F value    Pr(>F)
group  1 2.9908e+30 < 2.2e-16 ***
       2
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R

shp_d0 <- shapiro.test(residuals(res_aov_d0))
print(shp_d0)

OUTPUT


	Shapiro-Wilk normality test

data:  residuals(res_aov_d0)
W = 0.94466, p-value = 0.683

If neither test is significant (p > 0.05), the ANOVA assumptions are met.

Challenge

Challenge 4: Full statistical comparison (10 min)

Work through the following steps:

Step 1. Inspect the Day 0 ANOVA — do the treatment groups differ at baseline?

R

summary(res_aov_d0)

Step 2. Extract Day 8 data and run the same ANOVA. Check assumptions first.

R

d8 <- metadata %>% filter(day == 8)

res_aov_d8 <- aov(EOD ~ Treatment, data = d8)

leveneTest(res_aov_d8)
shapiro.test(residuals(res_aov_d8))

summary(res_aov_d8)

Step 3 (stretch). Reproduce the faceted plot below, showing Day 0 and Day 8 distributions side by side with individual data points overlaid.

R

# Step 1 — baseline check
summary(res_aov_d0)
# Non-significant: groups were similar before treatment.

# Step 2 — end-of-treatment comparison
d8 <- metadata %>% filter(day == 8)
res_aov_d8 <- aov(EOD ~ Treatment, data = d8)
leveneTest(res_aov_d8)
shapiro.test(residuals(res_aov_d8))
summary(res_aov_d8)

# Step 3 — faceted plot
d_plot <- bind_rows(
  mutate(d0, timepoint = "Day 0"),
  mutate(d8, timepoint = "Day 8")
)

ggplot(d_plot, aes(x = Treatment, y = EOD, color = Treatment)) +
  stat_summary(
    fun     = "mean",
    geom    = "pointrange",
    fun.max = function(x) mean(x) + sd(x),
    fun.min = function(x) mean(x) - sd(x),
    fatten  = 1.8
  ) +
  geom_point(shape = 2, position = position_jitter(width = 0.07, seed = 84)) +
  facet_wrap(~timepoint) +
  scale_y_continuous(
    name   = "EOD duration (ms)",
    breaks = scales::breaks_pretty(10)
  ) +
  scale_colour_manual(values = c("11-kt" = "tomato", "control" = "grey40")) +
  theme_classic() +
  theme(
    axis.text       = element_text(size = 9),
    legend.position = "none",
    panel.border    = element_rect(colour = "black", fill = NA)
  )
Key Points
  • jsonlite::read_json() loads JSON files as nested R lists; each recording trial is a named list with fields like wave, Rate, temp, and specimenno
  • EOD duration is defined by a threshold criterion applied to the normalized waveform — the choice of threshold is an analytical decision that must be reported
  • A metadata CSV that maps files to experimental conditions lets you build a tidy dataset by applying a duration function across all recording sessions
  • Always verify ANOVA assumptions (Levene’s test for equal variance; Shapiro-Wilk on residuals for normality) before interpreting results