Natural Sound Dataset #1 (NS1)

Dataset Source: NS1 Dataset

Original Paper: “Network receptive field modeling reveals extensive integration and multi-feature selectivity in auditory cortical neurons” by Nicol S. Harper, Oliver Schoppe, Ben D. B. Willmore, Zhanfeng F. Cui, Jan W. H. Schnupp, Andrew J. King.

Papers Using the Dataset:

“A dynamic network model of temporal receptive fields in primary auditory cortex” by M. Rahman, B. D. B. Willmore, A. J. King, N. S. Harper.
“Measuring the performance of neural models” by O. Schoppe, N. S. Harper, B. D. B. Willmore, A. J. King, J. W. H. Schnupp.

Dataset Details:

Population fitting: ✅

Description of Stimuli:

20 clips of natural sounds (speech, ferret vocalizations, other animal vocalizations, and environmental sounds), each ~5 seconds in duration.
Clips were played in random order.
Each clip was repeated 20 times.

Description of Neurons:

Total Number of Neurons: 119
Valid Neurons: 73
Valid Neurons Criterion: noise ratio < 40 (the noise ratio of each neuron is provided with the dataset)
Neuron Types: 549 single- and multi-unit recordings from six adult pigmented ferrets (five female and one male), of which 284 were single units.
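The validity criterion above amounts to a strict threshold on the noise ratio. The sketch below illustrates it on made-up records; the list-of-dicts layout, the `select_valid` helper, and the example values are assumptions for illustration, not the dataset's actual structure.

```python
# Hypothetical per-neuron records; the real noise ratios ship with the dataset.
neurons = [
    {"cell_id": 0, "noise_ratio": 12.5},
    {"cell_id": 1, "noise_ratio": 40.0},
    {"cell_id": 2, "noise_ratio": 87.3},
    {"cell_id": 3, "noise_ratio": 39.9},
]

NOISE_RATIO_MAX = 40.0  # criterion stated above: noise ratio < 40


def select_valid(records, threshold=NOISE_RATIO_MAX):
    """Keep records whose noise ratio is strictly below the threshold."""
    return [r for r in records if r["noise_ratio"] < threshold]


valid = select_valid(neurons)
valid_ids = [r["cell_id"] for r in valid]
print(valid_ids)
```

Note that a neuron with a noise ratio of exactly 40 is excluded, because the comparison is strict.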

Available data:

The data require MATLAB processing. See the readme file included with the dataset for more information.
Structured neuron array with 119 elements. Each element contains:
- Sound waveform of each stimulus (y, fs).
- Spike times of each repeat of each stimulus.
- Information on the recording, on the noise ratio, etc.

Processing needed:

Transforming the sound waveform into a 34-band spectrogram.
Choosing 73 neurons out of 119 based on their noise ratio.
Transforming the spike times of each repeat of each stimulus into PSTHs.
The source paper describes how each of these steps was carried out.
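The spike-time-to-PSTH step listed above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `spikes_to_psth` is hypothetical, the spike times are made up, and it assumes spike times are given in seconds from stimulus onset. The 5 ms bin width mirrors the `dt_ms=5` setting used elsewhere in this page.

```python
import numpy as np


def spikes_to_psth(spike_times_per_repeat, duration_s, dt_ms=5.0):
    """Average spike counts per time bin across repeats.

    spike_times_per_repeat: list of sequences of spike times in seconds,
    one sequence per repeat (units are an assumption).
    Returns an array of shape (n_bins,) with the mean spike count per bin.
    """
    dt_s = dt_ms / 1000.0
    n_bins = int(round(duration_s / dt_s))
    edges = np.arange(n_bins + 1) * dt_s
    counts = np.zeros((len(spike_times_per_repeat), n_bins))
    for i, times in enumerate(spike_times_per_repeat):
        # Histogram one repeat's spikes into the shared bin edges.
        counts[i], _ = np.histogram(np.asarray(times, dtype=float), bins=edges)
    return counts.mean(axis=0)


# Two made-up repeats of a 20 ms stimulus.
repeats = [[0.001, 0.012], [0.003, 0.018]]
psth = spikes_to_psth(repeats, duration_s=0.02, dt_ms=5.0)
print(psth)
```

Dividing the result by the bin width in seconds would convert it to a firing rate in spikes per second, if that is the preferred unit.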

Benchmark results

Model backbone	Rank	Remarks	Params / neuron	Performance (CCraw / CCnorm) [%]	Paper (backbone)
StateNet	🥇	GRU, pop	30,465	55.6 / 75.1	Rançon et al.
Transformer	🥈	pop	29,205	53.9 / 73.0	Rançon et al.
2D-CNN	🥉	pop	36,275	51.8 / 70.1	Pennington et al.
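For readers unfamiliar with the two metrics in the table, here is a hedged sketch of how they could be computed. It assumes that CCnorm follows the definition in "Measuring the performance of neural models" (cited above), where the covariance between the trial-averaged response and the prediction is normalized by an estimate of the signal power. The function names and the synthetic data are illustrative, and this is not the benchmark's evaluation code.

```python
import numpy as np


def cc_raw(responses, prediction):
    """Pearson correlation between the trial-averaged response and the prediction.

    responses: array (n_repeats, n_bins); prediction: array (n_bins,).
    """
    mean_resp = responses.mean(axis=0)
    return float(np.corrcoef(mean_resp, prediction)[0, 1])


def cc_norm(responses, prediction):
    """Correlation normalized by the estimated signal power."""
    n_repeats = responses.shape[0]
    mean_resp = responses.mean(axis=0)
    # Signal power: (Var(sum over repeats) - sum of per-repeat variances) / (N (N - 1))
    var_of_sum = responses.sum(axis=0).var(ddof=1)
    sum_of_vars = responses.var(axis=1, ddof=1).sum()
    signal_power = (var_of_sum - sum_of_vars) / (n_repeats * (n_repeats - 1))
    cov = np.cov(mean_resp, prediction)[0, 1]
    return float(cov / np.sqrt(prediction.var(ddof=1) * signal_power))


# Synthetic check: noisy repeats of a sine wave, scored against the true signal.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0.0, 2.0 * np.pi, 200))
responses = signal + rng.normal(0.0, 1.0, size=(20, 200))
raw = cc_raw(responses, signal)
norm = cc_norm(responses, signal)
print(raw, norm)
```

Both functions return fractions, whereas the table reports percentages. Because the normalization discounts trial-to-trial noise, CCnorm is at least as large as CCraw whenever both are positive, which matches the pattern in the table.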

Setup

The easiest path is to auto-download the response data from OSF (no account required), together with the pre-computed spectrogram from the DNet GitHub repo:

from deepSTRF.datasets.audio import NS1Dataset

ds = NS1Dataset(download=True, dt_ms=5)

The default cache directory is platformdirs.user_cache_dir('deepSTRF')/NS1, which can be overridden via $DEEPSTRF_DATA_DIR. download=True is idempotent, so calling it again when the files are already present is safe.
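The lookup order described above can be illustrated with a small stand-alone sketch. This is not deepSTRF's code: `resolve_ns1_dir` is a hypothetical helper, `default_root` stands in for the value of `platformdirs.user_cache_dir('deepSTRF')`, and the sketch assumes that the environment variable replaces the root while the `NS1` subfolder is appended either way. The library may resolve the path differently.

```python
import os
from pathlib import Path


def resolve_ns1_dir(default_root, environ=None):
    """Return the NS1 cache folder, preferring $DEEPSTRF_DATA_DIR when set.

    Assumption: the variable overrides the root directory only, and the
    NS1 subfolder is appended in both cases.
    """
    environ = os.environ if environ is None else environ
    root = environ.get("DEEPSTRF_DATA_DIR") or default_root
    return Path(root) / "NS1"


default_dir = resolve_ns1_dir("/tmp/cache/deepSTRF", environ={})
override_dir = resolve_ns1_dir(
    "/tmp/cache/deepSTRF", environ={"DEEPSTRF_DATA_DIR": "/data/deepstrf"}
)
print(default_dir, override_dir)
```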

Alternatively, to set up the data manually:

Download the response data from the original OSF repository.
Download the pre-computed stimulus spectrogram (test_data_5ms.mat) from the DNet code repository of Rahman et al. (2019).
Place MetadataSHEnCneurons.mat, the extracted spikesandwav/ folder, and test_data_5ms.mat inside a data/ folder.
Instantiate the dataset with ds = NS1Dataset('/path/to/data', dt_ms=5), pointing at that folder.
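Before instantiating the dataset from a manual layout, it can help to confirm that the three expected entries are in place. The check below is a sketch built only on the file and folder names listed in the steps above; `missing_ns1_entries` is a hypothetical helper and is not part of deepSTRF.

```python
import tempfile
from pathlib import Path

# Entries named in the manual-layout steps above.
EXPECTED_ENTRIES = ("MetadataSHEnCneurons.mat", "spikesandwav", "test_data_5ms.mat")


def missing_ns1_entries(data_dir):
    """Return the expected files or folders that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


# Demonstration on a temporary folder that lacks the metadata file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "spikesandwav").mkdir()
    (Path(tmp) / "test_data_5ms.mat").touch()
    missing = missing_ns1_entries(tmp)
print(missing)
```

An empty list means the folder matches the layout described above; it does not validate the contents of the files.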

Filtering

stim_meta carries a "type" field with values {"water_sounds", "ferret_vocalization", "insects_buzzing", "human_speech", "unknown"}. nrn_meta carries cell_id, area ("A1"), depth_um, noise_ratio, single_n / single_t (single-unit flags), n_electrodes (total number of electrodes on the rig at recording time), and electrode_number (the 1-indexed electrode on which the cell was recorded). These fields can be combined with the base-class selection API:

ds.select_stims_by_attr("type", "human_speech")     # only the 4 speech stims
ds.select_pop_by_nrn_attr("single_t", "Yes")        # only single units
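To make the selection semantics concrete without relying on deepSTRF internals, the sketch below filters plain metadata dicts by attribute equality. The `select_by_attr` helper and the sample records are hypothetical, including the "No" value for `single_t`; only the field names and the "human_speech", "water_sounds", "unknown", "A1", and "Yes" values come from the description above.

```python
def select_by_attr(records, attr, value):
    """Return indices of records whose `attr` field equals `value`."""
    return [i for i, rec in enumerate(records) if rec.get(attr) == value]


# Made-up metadata using the field names listed above.
stim_meta = [
    {"type": "human_speech"},
    {"type": "water_sounds"},
    {"type": "human_speech"},
    {"type": "unknown"},
]
nrn_meta = [
    {"cell_id": 0, "area": "A1", "single_t": "Yes"},
    {"cell_id": 1, "area": "A1", "single_t": "No"},  # "No" is an assumed value
]

speech_idx = select_by_attr(stim_meta, "type", "human_speech")
single_idx = select_by_attr(nrn_meta, "single_t", "Yes")
print(speech_idx, single_idx)
```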

NS1's nrn_masks has full coverage (every cell saw every stimulus), so the bidirectional rule does not prune any cell. It will, however, prune cells in concatenated datasets that mix NS1 with sparser sources.