Examining EOD Duration from Raw Recordings

Last updated on 2026-05-08 | Edit this page

Overview

Questions

How do we load JSON EOD recordings into R?
How is EOD duration measured from a raw waveform?
Did 11-ketotestosterone treatment change EOD duration over time?

Objectives

Load JSON recording files into R using the jsonlite package
Navigate a list of recording objects to extract waveform data and metadata
Visualize a raw EOD waveform using ggplot2
Measure EOD duration by applying a threshold to a normalized waveform
Build a tidy dataset by applying a duration function across many files
Check ANOVA assumptions using Levene’s and Shapiro-Wilk tests
Perform a one-way ANOVA and interpret the result
Interpret a faceted comparison plot of EOD duration across treatment groups

Introduction

In the previous episodes, you learned how testosterone shapes the electric organ discharge (EOD) in Brienomyrus brachyistius, and you performed the implantation procedure yourself. Now it’s time to analyze the data.

EOD recordings were captured daily over 8 days for animals implanted with either 11-ketotestosterone (11-kt) or a blank control. Each recording session is stored as a JSON file containing multiple trials. Our goal is to:

Load a raw EOD waveform and visualize it
Write a function that measures EOD duration from any recording file
Apply that function across all recording sessions to build a dataset
Test whether 11-kt treatment changed EOD duration over time

Loading EOD Recording Files in R

JSON files can be read in R using the jsonlite package. The read_json() function returns a list that maps directly onto the file structure.

R

library(jsonlite)

eod_data <- read_json("data/eod_duration/DAY0_MAY5/BB_T48.json")

cat("Number of recordings:", length(eod_data), "\n")

OUTPUT

Number of recordings: 11

Each element of eod_data is one recording trial, stored as a named list. Use names(eod_data[[1]]) to see the available fields and $ to access them.

Challenge

Challenge 1: Explore the recording metadata (5 min)

Using the eod_data object already loaded, print the specimen number, species, temperature, sampling rate, and number of waveform samples for the first recording trial.

Hint: names(eod_data[[1]]) shows the available fields.

Show me the solution

R

names(eod_data[[1]])
cat("Specimen:", eod_data[[1]]$specimenno, "\n")
cat("Species: ", eod_data[[1]]$species, "\n")
cat("Temp:    ", eod_data[[1]]$temp, "\n")
cat("Rate:    ", eod_data[[1]]$Rate, "\n")
cat("Samples: ", length(eod_data[[1]]$wave), "\n")

Visualizing an EOD Waveform

We convert sample indices to milliseconds using the sampling rate stored in the file, then plot with ggplot2.

R

library(ggplot2)

rate_hz <- eod_data[[1]]$Rate
wave    <- unlist(eod_data[[1]]$wave)
p1_idx  <- which.max(wave)
time_ms <- (seq_along(wave) - p1_idx) / rate_hz * 1000

waveform_df <- data.frame(time_ms = time_ms, voltage = wave)

ggplot(waveform_df, aes(x = time_ms, y = voltage)) +
  geom_line(linewidth = 0.6) +
  coord_cartesian(xlim = c(-2, 2)) +
  theme_classic() +
  labs(
    x = "Time (ms)",
    y = "Voltage (V)",
    title = paste("Raw EOD waveform —", eod_data[[1]]$specimenno, "Day 0")
  )

Line plot of a single EOD waveform showing voltage in volts on the y-axis and time in milliseconds on the x-axis. The waveform has a characteristic biphasic shape with a positive P1 peak followed by a negative P2 trough. — A single raw EOD waveform recorded from *B. brachyistius*. The biphasic waveform is characteristic of mormyrid electric fish.

Challenge

Challenge 2: Compare waveforms across recording sessions (5 min)

Load the first recording from data/eod_duration/DAY0_MAY7/BBRAC_T48_MAY7.json and plot it alongside the May 5 waveform. Do the shapes look similar?

Show me the solution

R

eod_may7  <- read_json("data/eod_duration/DAY0_MAY7/BBRAC_T48_MAY7.json")
wave7     <- unlist(eod_may7[[1]]$wave)
p1_idx7   <- which.max(wave7)
time7_ms  <- (seq_along(wave7) - p1_idx7) / eod_may7[[1]]$Rate * 1000

df_compare <- rbind(
  data.frame(time_ms = time_ms,  voltage = wave,  session = "May 5"),
  data.frame(time_ms = time7_ms, voltage = wave7, session = "May 7")
)

ggplot(df_compare, aes(x = time_ms, y = voltage, color = session)) +
  geom_line(linewidth = 0.6) +
  coord_cartesian(xlim = c(-2, 2)) +
  theme_classic() +
  scale_colour_manual(values = c("May 5" = "grey50", "May 7" = "tomato")) +
  labs(x = "Time (ms)", y = "Voltage (V)",
       title = "Day 0 EOD waveforms — two recording sessions",
       color = NULL)

Both are Day 0 baseline recordings taken before any treatment, so the waveform shapes should be similar.

Measuring EOD Duration

EOD duration is not read directly from the file — it must be calculated from the waveform. The standard approach is:

Normalize the waveform so that the peak-to-peak amplitude = 1
Apply a threshold (typically 10% of the maximum absolute value) to define where the EOD “starts” and “ends”
Calculate duration as the time elapsed between the first and last threshold crossings

R

# 1. Normalize to peak-to-peak = 1
wave_norm <- wave / (max(wave) - min(wave))

# 2. Apply 10% threshold
threshold    <- 0.1 * max(abs(wave_norm))
above_thresh <- which(abs(wave_norm) > threshold)

start_idx <- min(above_thresh)
end_idx   <- max(above_thresh)

# 3. Duration in milliseconds
duration_ms <- (end_idx - start_idx) / rate_hz * 1000
cat("EOD duration (10% threshold):", round(duration_ms, 3), "ms\n")

OUTPUT

EOD duration (10% threshold): 0.19 ms

We can visualize the threshold on the normalized waveform to see exactly which part of the signal is being measured:

R

norm_df <- data.frame(time_ms = time_ms, voltage = wave_norm)

ggplot(norm_df, aes(x = time_ms, y = voltage)) +
  geom_line(linewidth = 0.6) +
  geom_hline(yintercept = c(threshold, -threshold),
             linetype = "dashed", color = "steelblue", linewidth = 0.5) +
  geom_vline(xintercept = c(time_ms[start_idx], time_ms[end_idx]),
             linetype = "dashed", color = "tomato", linewidth = 0.5) +
  coord_cartesian(xlim = c(-2, 2)) +
  theme_classic() +
  labs(
    x     = "Time (ms)",
    y     = "Normalized voltage",
    title = "EOD duration measurement (10% threshold)"
  )

Normalized EOD waveform with horizontal dashed lines at +10% and -10% of the peak amplitude. Vertical dashed lines mark the measured start and end of the EOD, highlighting the duration window. — Threshold-based EOD duration measurement on the normalized waveform.

Challenge

Challenge 3: Sensitivity of the threshold (5 min)

Repeat the duration calculation using a 20% threshold instead of 10%. How does the measured duration change, and why?

This is important: the choice of threshold is an analytical decision that must be reported in any publication.

Show me the solution

R

threshold_20  <- 0.2 * max(abs(wave_norm))
above_20      <- which(abs(wave_norm) > threshold_20)

duration_20ms <- (max(above_20) - min(above_20)) / rate_hz * 1000
cat("Duration at 20% threshold:", round(duration_20ms, 3), "ms\n")
cat("Duration at 10% threshold:", round(duration_ms, 3), "ms\n")

A higher threshold excludes the low-amplitude tails of the waveform, making the measured duration shorter. Neither threshold is “wrong” — but the choice must be consistent across all animals in a study and clearly reported.

Building a Dataset Across All Recording Sessions

Now that we know how to measure EOD duration from a single file, we can apply the same logic to every recording session in the study. Rather than computing durations by hand for each file, we write a function and let R do the work.

The study metadata is stored in recordings.csv, which maps each JSON file to its animal, day, and treatment group:

R

library(dplyr)

metadata <- read.csv("data/eod_duration/recordings.csv")
metadata$Treatment <- as.factor(metadata$Treatment)

head(metadata)

OUTPUT

                           file specimenno day Treatment
1         DAY0_MAY5/BB_T48.json       BB48   0     11-kt
2         DAY0_MAY5/BB_T49.json       BB49   0     11-kt
3         DAY0_MAY5/BB_T50.json       BB50   0   control
4         DAY0_MAY5/BB_T51.json       BB51   0   control
5 DAY0_MAY7/BBRAC_T48_MAY7.json       BB48   1     11-kt
6 DAY0_MAY7/BBRAC_T49_MAY7.json       BB49   1     11-kt

We write a small function that takes a file path, loads the JSON, computes EOD duration for every trial, and returns the mean:

R

mean_eod_duration <- function(file_path, threshold = 0.1) {
  recs <- read_json(file_path)
  durations <- sapply(recs, function(rec) {
    w      <- unlist(rec$wave)
    rate   <- rec$Rate
    w_norm <- w / (max(w) - min(w))
    thresh <- threshold * max(abs(w_norm))
    above  <- which(abs(w_norm) > thresh)
    (max(above) - min(above)) / rate * 1000
  })
  mean(durations)
}

We then apply this function to every row in the metadata table:

R

metadata$EOD <- sapply(
  file.path("data/eod_duration", metadata$file),
  mean_eod_duration
)

head(metadata)

OUTPUT

                           file specimenno day Treatment       EOD
1         DAY0_MAY5/BB_T48.json       BB48   0     11-kt 0.1900000
2         DAY0_MAY5/BB_T49.json       BB49   0     11-kt 0.2260000
3         DAY0_MAY5/BB_T50.json       BB50   0   control 0.1900000
4         DAY0_MAY5/BB_T51.json       BB51   0   control 0.1900000
5 DAY0_MAY7/BBRAC_T48_MAY7.json       BB48   1     11-kt 0.2472727
6 DAY0_MAY7/BBRAC_T49_MAY7.json       BB49   1     11-kt 0.2000000

Longitudinal trajectory

With the full dataset assembled, we can visualize how EOD duration changes across the 8-day experiment for each treatment group. Points show the group mean; error bars show ±1 standard deviation.

R

ggplot(metadata, aes(x = day, y = EOD, group = Treatment, color = Treatment)) +
  stat_summary(
    fun     = "mean",
    geom    = "pointrange",
    fun.max = function(x) mean(x) + sd(x),
    fun.min = function(x) mean(x) - sd(x),
    position = position_dodge(width = 0.2, preserve = "total"),
    size    = 0.7,
    fatten  = 0.5
  ) +
  scale_x_continuous(name = "Day", breaks = 0:8) +
  scale_y_continuous(
    name   = "EOD duration (ms)",
    breaks = scales::breaks_pretty(10)
  ) +
  scale_colour_manual(values = c("11-kt" = "tomato", "control" = "grey40")) +
  theme_classic() +
  theme(
    axis.text    = element_text(size = 9),
    legend.text  = element_text(size = 8),
    legend.title = element_text(size = 9)
  )

Line plot showing mean EOD duration in milliseconds on the y-axis and experimental day on the x-axis, with separate colored lines for the 11-kt and control groups. Error bars represent one standard deviation. — Mean EOD duration (±SD) across experimental days by treatment group.

Testing assumptions

Before running an ANOVA, we verify two key assumptions:

Homogeneity of variance — tested with Levene’s test (car::leveneTest)
Normality of residuals — tested with the Shapiro-Wilk test

R

suppressPackageStartupMessages(library(car))

# Extract Day 0 baseline data
d0 <- metadata %>% filter(day == 0)

res_aov_d0 <- aov(EOD ~ Treatment, data = d0)

lev_d0 <- leveneTest(res_aov_d0)
print(lev_d0)

OUTPUT

Levene's Test for Homogeneity of Variance (center = median)
      Df    F value    Pr(>F)
group  1 2.9908e+30 < 2.2e-16 ***
       2
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R

shp_d0 <- shapiro.test(residuals(res_aov_d0))
print(shp_d0)

OUTPUT


	Shapiro-Wilk normality test

data:  residuals(res_aov_d0)
W = 0.94466, p-value = 0.683

If neither test is significant (p > 0.05), the ANOVA assumptions are met.

Challenge

Challenge 4: Full statistical comparison (10 min)

Work through the following steps:

Step 1. Inspect the Day 0 ANOVA — do the treatment groups differ at baseline?

R

summary(res_aov_d0)

Step 2. Extract Day 8 data and run the same ANOVA. Check assumptions first.

R

d8 <- metadata %>% filter(day == 8)

res_aov_d8 <- aov(EOD ~ Treatment, data = d8)

leveneTest(res_aov_d8)
shapiro.test(residuals(res_aov_d8))

summary(res_aov_d8)

Step 3 (stretch). Reproduce the faceted plot below, showing Day 0 and Day 8 distributions side by side with individual data points overlaid.

Show me the solution

R

# Step 1 — baseline check
summary(res_aov_d0)
# Non-significant: groups were similar before treatment.

# Step 2 — end-of-treatment comparison
d8 <- metadata %>% filter(day == 8)
res_aov_d8 <- aov(EOD ~ Treatment, data = d8)
leveneTest(res_aov_d8)
shapiro.test(residuals(res_aov_d8))
summary(res_aov_d8)

# Step 3 — faceted plot
d_plot <- bind_rows(
  mutate(d0, timepoint = "Day 0"),
  mutate(d8, timepoint = "Day 8")
)

ggplot(d_plot, aes(x = Treatment, y = EOD, color = Treatment)) +
  stat_summary(
    fun     = "mean",
    geom    = "pointrange",
    fun.max = function(x) mean(x) + sd(x),
    fun.min = function(x) mean(x) - sd(x),
    fatten  = 1.8
  ) +
  geom_point(shape = 2, position = position_jitter(width = 0.07, seed = 84)) +
  facet_wrap(~timepoint) +
  scale_y_continuous(
    name   = "EOD duration (ms)",
    breaks = scales::breaks_pretty(10)
  ) +
  scale_colour_manual(values = c("11-kt" = "tomato", "control" = "grey40")) +
  theme_classic() +
  theme(
    axis.text       = element_text(size = 9),
    legend.position = "none",
    panel.border    = element_rect(colour = "black", fill = NA)
  )

Key Points

jsonlite::read_json() loads JSON files as nested R lists; each recording trial is a named list with fields like wave, Rate, temp, and specimenno
EOD duration is defined by a threshold criterion applied to the normalized waveform — the choice of threshold is an analytical decision that must be reported
A metadata CSV that maps files to experimental conditions lets you build a tidy dataset by applying a duration function across all recording sessions
Always verify ANOVA assumptions (Levene’s test for equal variance; Shapiro-Wilk on residuals for normality) before interpreting results