Session Browser 2022
All session times are shown in Pacific Standard Time (PST).
Shihab Shamma
Topic areas: neural coding
Fri, 11/11 9:00AM - 10:00AM | Keynote Lecture
Abstract
Perception and action engage extensive sensory and motor interactions with predictive signals playing the major role in skill learning and cognition. I shall briefly describe ECoG recordings of such responses during speech vocalizations and discuss the role of the auditory-motor mappings in learning how to speak. These results are then generalized to sensory perception and imagination of music and speech with EEG and MEG recordings, leading to a brief account of how to decode imagined music and speech from these non-invasive signals.
Hiroyuki Kato
Topic areas: neural coding hierarchical organization correlates of behavior/perception
Fri, 11/11 2:35PM - 3:00PM | Young Investigator Spotlight
Abstract
Information flow in the sensory cortex is classically considered as feedforward-hierarchical computation, where higher-order cortices receive inputs from the primary cortex to extract more complex sensory features. However, recent human studies challenged such simple serial transformation by demonstrating robust speech perception in patients with primary auditory cortex lesions. To understand the interplay between the primary and higher-order cortices during sensory feature extraction, my laboratory focuses on the mouse primary (A1) and secondary (A2) auditory cortices as a model system. Using in vivo two-photon calcium imaging and unit recording in awake animals, we recently identified A2 as a locus for extracting temporally-coherent harmonic sounds. Interestingly, acute optogenetic inactivation of A1 did not disturb animals’ performance for the A2-dependent harmonics discrimination task. Moreover, we found short-latency (less than 10 ms) auditory input onto layer 6 (L6) of A2, which was as fast as the primary lemniscal input to A1 L4. We performed a series of anatomical and electrophysiological experiments and found that A2 L6 receives short-latency inputs from neurons along the non-lemniscal pathway, bypassing A1. These results align with the findings in humans and together indicate parallel and distributed, rather than simply feedforward, processing of auditory information across cortical areas. Nevertheless, it is important to note that A1 and A2 neurons have mutual excitatory influences on each other, as demonstrated by our area-targeted perturbation experiments. In this talk, I will further discuss our ongoing experiments investigating how A1 and A2 circuits integrate parallel ascending pathways to achieve unified sensory representations.
Nathan A. Schneider, Tomas Suarez Omedas, Rebecca F. Krall and Ross S. Williamson
Topic areas: correlates of behavior/perception neural coding
Keywords: Descending Categorization Auditory Cortex Behavior
Fri, 11/11 12:15PM - 1:00PM | Podium Presentations 1
Abstract
Auditory-guided behavior is ubiquitous in everyday life: auditory information constantly guides our decisions and actions. Nestled amongst several populations, extratelencephalic (ET) neurons reside in the deep layers of auditory cortex (ACtx) and provide a primary means of routing auditory information to diverse subcortical targets associated with decision-making, action, and reward. To investigate the role of ET neurons in auditory-guided behavior, we developed a head-fixed choice task in which mice categorized the rate of sinusoidal amplitude-modulated (sAM) noise bursts as either high or low to receive a water reward. We first established ACtx necessity using bilateral optogenetic inhibition (with GtACR2), then used two-photon calcium imaging alongside selective GCaMP8s expression to monitor the activity of ET (N=3 mice, n=~180 neurons/day/animal) and layer (L)2/3 intratelencephalic (IT) (N=3 mice, n=~450 neurons/day/animal) populations. Clustering analyses of ET and L2/3 IT populations revealed heterogeneous response motifs that correlated with various stimulus and task variables. One such motif, primarily present in ET neurons, corresponded to “categorical” firing patterns (i.e., neurons that responded best to low or high sAM rates). This categorical selectivity was not present early in training, and longitudinal recording revealed that ET neurons shifted their response profiles dynamically across learning to reflect these discrete perceptual categories. Our stimulus set included a sAM rate at the category boundary, rewarded probabilistically, allowing us to investigate stimulus-independent choice. Using statistical approaches to visualize high-dimensional neural activity, we found that ET population activity, in response to this boundary stimulus, reflected behavioral choice, regardless of reward outcome. Further quantification using neural decoding analyses confirmed that behavioral choice could be robustly predicted from ET activity. Both choice and categorical selectivity were notably lessened in the L2/3 IT population, hinting at a unique ET role. Critically, ET categorical selectivity was only evident during active behavioral engagement and disappeared during passive presentation of identical stimuli. This suggests that learned categorical selectivity is shaped via top-down inputs that act as a flexible, task-dependent filter, a hypothesis that we are actively pursuing. These results suggest that the ACtx ET projection system selectively propagates behaviorally relevant signals brain-wide and is critical for auditory-guided behavior.
Iran Roman, Eshed Rabinovitch, Elana Golumbic and Edward Large
Topic areas: speech and language correlates of behavior/perception hierarchical organization
Keywords: speech envelope tracking neural oscillation dynamical systems model
Fri, 11/11 12:15PM - 1:00PM | Podium Presentations 1
Abstract
Speech is a pseudo-periodic signal with an envelope frequency that dynamically fluctuates around 5 Hz. One influential hypothesis proposed in recent years is that intrinsic neural oscillations entrain to the speech rhythm to track and predict speech timing. However, given that natural speech is not strictly periodic, but contains irregular pauses and continuous changes in speech rate, whether this type of stimulus can be effectively tracked or predicted by neural oscillations has been highly debated. Here we present a simple and parsimonious computational model of neural oscillation that is able to dynamically and continuously synchronize with the speech envelope. Our work is an adaptation of a previously proposed model that captures rhythmic complexities in music (Roman, Roman and Large 2020), extended to deal with stimuli that are not strictly periodic. The model has a natural frequency of oscillation, which it dynamically adapts to match the stimulus frequency using Hebbian learning. Additionally, an elastic force pulls the system back towards its natural frequency in the absence of a stimulus. Using automatic differentiation in TensorFlow and gradient descent to optimize parameters, the model was trained to maximize the correlation between its activity and the speech onsets in a corpus of spoken utterances. The model was validated on an independent set of stimuli not included in the training data. First, using phase coupling only, performance reached a mean predictive power of r = 0.33 (0.24 < r < 0.42). Next, when frequency was also allowed to adapt dynamically, the model achieved a mean predictive power of r = 0.40 (0.29 < r < 0.50). For comparison, we ran an ablation study in which the oscillator model was decoupled from the stimulus (i.e., neither phase coupling nor frequency adaptation); there we observed chance-level predictive power of r = -0.001. These results demonstrate the theoretical plausibility that neural oscillations synchronize to continuous speech, exploiting the principles of neural resonance and Hebbian learning. This model paves the way for future research to empirically test the mechanistic hypothesis that speech processing is mediated by entrainment of neural oscillations.
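The model class described above can be sketched compactly. Below is a minimal, illustrative phase oscillator with Hebbian frequency adaptation and an elastic pull back to its natural frequency; it is a generic sketch of this family of models, not the authors' TensorFlow implementation, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def adaptive_oscillator(stimulus, fs, f_natural=5.0,
                        k_phase=2.0, k_freq=1.0, elasticity=0.5):
    """Phase oscillator that entrains to an onset signal.

    Phase is pulled toward stimulus onsets (phase coupling), frequency is
    adjusted by a Hebbian-style rule, and an elastic force pulls the
    frequency back toward its natural value in the absence of input.
    """
    dt = 1.0 / fs
    omega0 = 2 * np.pi * f_natural                # natural frequency (rad/s)
    phi, omega = 0.0, omega0
    activity = np.empty(len(stimulus))
    for i, s in enumerate(stimulus):
        dphi = omega - k_phase * s * np.sin(phi)      # phase coupling
        domega = (-k_freq * s * np.sin(phi)           # Hebbian frequency adaptation
                  - elasticity * (omega - omega0))    # elastic restoring force
        phi += dphi * dt
        omega += domega * dt
        activity[i] = np.cos(phi)
    return activity

# Toy usage: the 5 Hz oscillator entrains to a 4 Hz pulse train.
fs = 100
t = np.arange(0, 10, 1 / fs)
onsets = (np.sin(2 * np.pi * 4 * t) > 0.99).astype(float)
output = adaptive_oscillator(onsets, fs)
```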
Srihita Rudraraju, Brad Theilman, Michael Turvey and Timothy Gentner
Topic areas: correlates of behavior/perception neural coding
Keywords: perception and cognition predictive coding error birdsong
Fri, 11/11 12:15PM - 1:00PM | Podium Presentations 1
Abstract
Predictive coding (PC), a theoretical framework in which the brain compares a generative model to incoming sensory signals, has been employed to explain perceptual and cognitive phenomena. There is little understanding, however, of how PC might be implemented mechanistically in auditory neurons. Here, we examined neural responses in the caudomedial nidopallium (NCM) and caudal mesopallium (CM), analogs of higher-order auditory cortex, in anesthetized European starlings listening to conspecific songs. We trained a feedforward temporal prediction model (TPM) to predict short segments (10.5 ms) of future birdsong from the past 170 ms of spectrographic samples, thereby defining a “latent” predictive feature space. To examine PC, we fit each neuron’s composite receptive field (CRF) to either: 1) all spectrotemporal features (fft-CRF), 2) only the predictive spectrotemporal features (tpm-CRF), or 3) prediction-error spectrotemporal features computed as the mean squared error (mse-CRF). In NCM (n = 541 neurons), the tpm-CRFs yield excellent predictions of empirical spiking responses to novel song, slightly higher than the fft-CRFs (70.41% vs. 67.92% variance; p < 5.5×10⁻⁸, paired t-test), but mse-CRFs yield significantly poorer predictions (11.15%; p = 0.0, paired t-test). Unlike in NCM, however, the mse-CRF predicted a significant proportion of the CM response variance (53.61%; p < 1.7×10⁻¹⁹⁰, t-test CM vs. NCM). We showed that NCM spiking responses are best modeled by predictive features of song, while CM responses capture both predictive and error features. This provides strong support for the notion of a feature-based predictive auditory code implemented in single neurons in songbirds.
Timothy Tyree, Mike Metke and Cory Miller
Topic areas: memory and cognition hierarchical organization multisensory processes neural coding
Keywords: Hippocampus Primate Recognition Decoders
Fri, 11/11 3:00PM - 3:45PM | Podium Presentations 2
Abstract
The ability to recognize the identity of other individuals is integral to social living, as it is necessary for the myriad cognitive processes routinely employed by individuals navigating the complex dynamics of societies, such as memory, decision-making, and communication, amongst others. While evidence of neural representations of individual identity in a single sensory modality (e.g., face, voice, odor) exists in several species, compelling evidence for cross-modal representations of individual identity has been limited. Here we tested whether cross-modal representations of identity are also evident in the hippocampus of a nonhuman primate, the common marmoset. We recorded the activity of ~2400 single neurons while presenting N=4 subjects with faces and voices of a large battery of familiar conspecifics as both unimodal and cross-modal stimuli. During cross-modal stimulus presentations, the face and voice were either from the same individual (Match) or different individuals (Mismatch). Our population-level decoder could almost perfectly distinguish between Match and Mismatch trials, even though subjects were presented with ~10 different individuals in each recording session. This decoder was so robust that ~50 randomly selected neurons from the entire population achieved ~80% reliability, suggesting that cross-modal representations of identity in the primate hippocampus are highly distributed. These compelling findings shed new light on the nature of identity representations in the primate hippocampus by demonstrating that cross-modal representations of identity are not only evident but are a facet of social recognition supported by a robust, distributed coding mechanism.
Meredith Schmehl, Surya Tokdar and Jennifer Groh
Topic areas: correlates of behavior/perception multisensory processes neural coding subcortical processing
Keywords: auditory sound localization audiovisual audiovisual integration multisensory multisensory integration inferior colliculus macaque primate
Fri, 11/11 3:00PM - 3:45PM | Podium Presentations 2
Abstract
Visual cues can influence brain regions that are sensitive to auditory space (Schmehl & Groh, Annual Review of Vision Science 2021). However, how such visual signals in auditory structures contribute to perception is poorly understood. One possibility is that visual inputs help the brain distinguish among different sounds, allowing better localization of behaviorally relevant sounds in noisy environments (i.e., the cocktail party phenomenon). Our lab previously reported that when two sounds are present, auditory neurons may switch between encoding each sound across time (Caruso et al., Nature Communications 2018). We sought to study how pairing a visual cue with one of two sounds might change these time-varying responses (e.g., Atilgan et al., Neuron 2018). We trained one rhesus macaque to localize one or two sounds in the presence or absence of accompanying lights. While the monkey performed this task, we recorded extracellularly from single neurons in the inferior colliculus (IC), a critical auditory region that receives visual input and has visual and eye movement-related responses. We found that pairing a light with a sound can change a neuron's response to that sound, even if the neuron is unresponsive to light. Further, when two sounds are present and one is paired with a light, neurons are more likely to spend individual trials responding to the visually paired sound. Together, these results suggest that the IC alters its sound representation in the presence of visual cues, providing insight into how the brain combines visual and auditory information into perceptual objects.
Meike M. Rogalla, Gunnar L. Quass, Deepak Dileepkumar, Alex Ford, Gunseli Wallace, Harry Yardley and Pierre F. Apostolides
Topic areas: auditory disorders correlates of behavior/perception neural coding subcortical processing
Keywords: spatial plasticity monaural hearing loss auditory midbrain
Fri, 11/11 3:00PM - 3:45PM | Podium Presentations 2
Abstract
Spatial hearing enables humans and animals to localize sounds in their vicinity, which contributes to survival. Unlike vision or touch, the peripheral auditory system lacks a spatial map at the sensory receptor level. Sound source location is therefore derived centrally, mainly from binaural cues but also from monaural cues. In the case of unilateral hearing loss, binaural cues are no longer available, which limits spatial hearing. However, monaurally occluded humans and animals can regain sound localization following perceptual training. This observable re-learning of sound localization is assumed to rely on context-dependent re-calibration of the representation of auditory space by monaural cues. Thus, central experience-dependent auditory plasticity mechanisms must exist to re-calibrate sound localization circuits. The “shell” nuclei of the inferior colliculus (shell IC) are hypothesized to act as plasticity loci for sound localization cues. However, the neural population coding of spatial information in the mammalian shell IC remains poorly understood. We developed an acoustic delivery system that presents sound stimuli from distinct spatial positions within the horizontal frontal field by moving a speaker around the animal's head while we perform cellular-resolution 2-photon Ca2+ imaging in the shell IC of awake, head-fixed mice. We found that neurons in the murine shell IC are spatially tuned, and that the population coding follows a surprisingly different pattern than previously shown for other auditory regions: in contrast to the central IC, where spatial tuning shows a contralateral dominance, we found both contra- and ipsi-lateral selective neurons, such that a single hemisphere contained a representation of the entire horizontal field. Although previous data suggested a monotonic code for spatial representations in the mammalian auditory system, many shell IC neurons were tuned to discrete contra- and ipsi-lateral positions. Tuning seemed impervious to representational drift but required binaural integration: tuning broadened or shifted towards the contralateral hemifield after an ear plug was inserted into the left ear. To our knowledge, these results are the first insight into spatial population codes of the mammalian shell IC. Future studies will test whether active engagement in a localization task is required for plasticity of spatial tuning during monaural hearing loss.
Corentin Puffay, Jonas Vanthornhout, Marlies Gillis, Bernd Accou, Hugo Van Hamme and Tom Francart
Topic areas: speech and language correlates of behavior/perception neural coding novel technologies
Keywords: EEG decoding deep learning speech auditory system linguistics
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The extent to which the brain tracks a speech stimulus can be measured for natural continuous speech by modeling the relationship between stimulus features and the corresponding EEG. Recently, neural tracking of linguistic features has been shown using subject-specific linear models. As linguistic features are processed in higher cortical areas, we expect their response to have a nonlinear component that a linear model cannot capture. We therefore present a deep learning model that provides a nonlinear, subject-independent mapping from linguistic features to EEG, and we assess the added value of these features over acoustic and lexical features. Sixty normal-hearing subjects listened attentively to 10 stories of 14 minutes each while their EEG was recorded. We here define neural tracking as the ability of the model to associate EEG with speech, and we measure it as the classification accuracy on a match-mismatch task. We compare the baseline model (including acoustic and lexical representations) with a model that additionally includes different phoneme- and word-level linguistic representations. Using subject-specific fine-tuning on a subject-independent pre-trained model, we found significant linguistic tracking on top of acoustic and lexical tracking for some features. We showed that some of the linguistic features carry additional information beyond acoustic and lexical features. The benefit of our deep learning model is that it may need less subject-specific training data than a linear model and that it can model nonlinear relations between stimulus features and EEG. We will further use this model to objectively measure speech understanding.
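The match-mismatch evaluation used above can be summarized in a few lines. The sketch below assumes a trained similarity model `model(eeg, speech)` returning a scalar score; the interface is hypothetical and stands in for the authors' deep network.

```python
def match_mismatch_accuracy(eeg_segs, matched_segs, mismatched_segs, model):
    """Fraction of segments where the EEG is judged more similar to the
    true (matched) speech segment than to a time-shifted (mismatched) one."""
    correct = sum(model(eeg, match) > model(eeg, mismatch)
                  for eeg, match, mismatch
                  in zip(eeg_segs, matched_segs, mismatched_segs))
    return correct / len(eeg_segs)
```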
Cynthia King, Stephanie Lovich, David Kaylie, Christopher Shera and Jennifer Groh
Topic areas: multisensory processes subcortical processing
Keywords: audiovisual integration eye movements middle ear muscles outer hair cells
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Eye movements are critical to linking vision and spatial hearing – every eye movement shifts the relative relationship between visual (eye-centered) and auditory (head-centered) frames of reference, which requires constant updating of incoming sensory information in order to integrate the two sensory inputs. Previous neurophysiological studies have revealed eye movement-related modulation of the auditory pathway. We recently discovered a unique type of low frequency otoacoustic emission that accompanies eye movements. These eye movement-related eardrum oscillations (EMREOs) occur in the absence of external sound and carry precise information about saccade magnitude, direction, and timing (Gruters et al 2018, Murphy et al 2020). However, it is not well understood how these eye movement-related effects in the auditory periphery contribute mechanistically to hearing. Two auditory motor systems may be involved in generating EMREOs: the middle ear muscles and the cochlear outer hair cells. To gain insight into which systems are involved and how they contribute, we are presently investigating the EMREOs in human subjects with dysfunction involving these systems compared to a normal hearing population. The impact of hearing loss on the EMREO is examined by comparing responses from individuals with different hearing pathologies to normal hearing population data. We find EMREOs are abnormal in subjects with hearing impairment, most commonly being abnormally small in individuals who have impaired outer hair cell or stapedius function. Future work is needed to assess if patients with these types of hearing loss have specific impairments in the perceptual process of integrating visual and auditory spatial information.
David R. Quiroga Martinez, Leonardo Bonetti, Robert Knight and Peter Vuust
Topic areas: memory and cognition correlates of behavior/perception thalamocortical circuitry/function
Keywords: Imagery music MEG decoding oscillations
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Imagine a song you know by heart. With little effort, you could play it vividly in your mind. However, little is known about how the brain represents and holds in mind such musical “thoughts”. Here, we leverage time-generalized decoding from MEG brain source activations to show that listened and imagined melodies are represented in auditory cortex, thalamus, middle cingulate cortex, and precuneus. Accuracy patterns reveal that during both listening and imagining, sounds are represented as a melodic group, while during listening they are also represented individually. Opposite brain activation patterns distinguish between melodies during listening compared to imagining. Furthermore, encoding, imagining, and retrieving melodies enhances delta and theta power in frontopolar regions, and suppresses alpha and beta power in sensorimotor and auditory regions. Our work sheds light on the neural dynamics of listened and imagined musical sound sequences.
Steven Eliades and Joji Tsunada
Topic areas: correlates of behavior/perception neuroethology/communication
Keywords: Auditory cortex vocalization sensory-motor marmoset non-human primate
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
During both human speech and non-human primate vocalization, there is a well-described suppression of activity in the auditory cortex. Despite this suppression, the auditory cortex remains sensitive to perturbations in sensory feedback, and this sensitivity has been shown to be important in feedback-dependent vocal control. Although the mechanisms of suppression and vocal feedback encoding are unclear, this process has been suggested to represent an error signal encoding the difference between sensory-motor prediction and feedback inputs. However, direct evidence for the existence of such an error signal is lacking. In this study, we investigated the responses of auditory cortical neurons in marmoset monkeys during vocal production, testing frequency-shifted feedback of varying magnitude and direction. Consistent with an error-signal hypothesis, we found that population-level neural activity increased with the magnitude of feedback shifts but was symmetric between positive and negative frequency changes. This feedback sensitivity was strongest in vocally suppressed units and in units whose frequency tuning overlapped that of the vocal acoustics. Individual units tested with multiple feedback shifts often showed preferences for either positive or negative shifts, with only a minority showing sensitivity to shifts in both directions. Frequency tuning distributions differed between units preferring one feedback direction over the other. These results suggest that vocal feedback sensitivity in the auditory cortex is consistent with a vocal error signal, seen at both the individual-unit and population level.
Jian Carlo Nocon, Howard Gritton, Xue Han and Kamal Sen
Topic areas: neural coding
Keywords: Auditory cortex Computational modeling Complex scene analysis Neural coding Cortical inhibition Parvalbumin
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Cortical representations underlying complex scene analysis emerge from circuits with a tremendous diversity of cell types. However, cell type-specific contributions to these representations are not well understood. Specifically, how are competing dynamic stimuli from different spatial locations represented by cortical circuits and cell types? Recently, we investigated complex scene analysis in mouse auditory cortex (ACtx) using a cocktail party-like paradigm, in which we presented target sounds in the presence of maskers from different spatial configurations and quantified neural discrimination performance. We found that cortical neurons were sensitive to spatial configuration, with high discrimination performance at specific combinations of target and masker locations (“hotspots”). Further, optogenetically suppressing parvalbumin (PV) neurons in ACtx degraded discrimination via changes in rapid temporal modulations in rate and spike timing over several timescales. These results suggest PV neurons contribute to complex scene analysis by enhancing cortical temporal coding and reducing network noise. Here, we propose a network model of ACtx to explain these observations. The model consists of different spatial channels, with excitatory and multiple inhibitory neuron types based on experimental data. Our results suggest PV neurons mediate “within-channel” inhibition in the cortical network, while a distinct population of inhibitory neurons mediates “cross-channel” surround inhibition. Because complex scene analysis is modulated by behavioral state, we then extend the model to simulate top-down modulation via other inhibitory populations. Finally, we hypothesize a mapping of the distinct inhibitory neuron populations in the model to those in cortex (PV, SOM, and VIP) to generate experimentally testable predictions for cell type-specific responses in passive versus task-engaged conditions.
James Bigelow, Ryan Morrill, Timothy Olsen, Jefferson DeKloe, Christoph Schreiner and Andrea Hasenstaub
Topic areas: memory and cognition correlates of behavior/perception multisensory processes neural coding
Keywords: auditory cortex population coding crossmodal attention arousal
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Mounting evidence suggests synchronous activity among multiple neurons may be an essential aspect of cortical information processing and transmission. Previous work in auditory cortex (AC) has shown that coordinated neuronal ensemble (cNE) events contain more information about sound features than individual member neuron spikes on a per spike basis. Moreover, preferred sound features encoded by single neurons often depend on whether they were spiking with a given cNE. In addition to sound encoding, it is well known that AC neurons are influenced by diverse non-auditory inputs reflecting motor activity, arousal, crossmodal sensory events, and attention. Nevertheless, it remains unknown whether these extramodal inputs are processed through population-level encoding in the same way as acoustic signals from the ascending auditory pathway. In the present study, we addressed this question by examining single neurons and cNEs in AC of mice performing an audiovisual attention switching task. As in previous studies, many single units and cNEs responded to sounds and were often modulated by non-auditory variables including movement velocity, pupil size, and modality specific attention. Importantly, we found that cNE representation of non-auditory inputs contained more information about events and states than member neuron spikes, similar to patterns previously described for sounds. Furthermore, modulation of single neuron activity by non-auditory inputs often depended on whether its activity coincided with a cNE. Our findings suggest auditory and extra-modal inputs may be subject to similar processing by cNEs in AC, and support prior work suggesting cNE activity as an information processing motif in cortex.
Ramtin Mehraram, Marlies Gillis, Maaike Vandermosten and Tom Francart
Topic areas: auditory disorders speech and language correlates of behavior/perception
Keywords: EEG hearing loss temporal response function connectomics network natural speech
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Introduction: The impact of hearing loss on brain functionality is a matter of interest in audiology research. However, there is still a lack of knowledge of hearing-impairment-related alterations in the neural networks engaged when listening to natural speech. We investigated the relationship between acoustic features of speech (spectrogram, acoustic onsets) and the spatial distribution of neural activity in hearing-impaired (HI) and normal-hearing (NH) participants. Methods: Our sample comprised 14 HI and 14 NH age-matched participants. High-density EEG (64 channels) was recorded while the participants listened to a 12-minute-long story. Temporal response functions (TRFs) for the speech spectrogram and acoustic onsets were obtained through linear forward modelling. EEG network connectivity was measured, and subnetworks associated with TRF peak latencies were identified through correlation tests using Network Based Statistics. Results: For the NH group, positive correlations emerged between the connectivity strength of an α-band (7.5-14.5 Hz) right-temporo-frontal network component and the latency of the N1 peak of the spectrogram-TRF, and between a right-occipito-parietal θ-band (4.5-7 Hz) network component and the latency of the P2 peak of the spectrogram-TRF. For the HI group, an inter-hemispheric-frontal and right-parietal θ-band network component negatively correlated with the P1 peak of the acoustic onset-TRF. Conclusion: Our results show that neural response features in the HI and NH groups relate to different brain subnetworks. Whilst the physiological association between functional connectivity and spectrogram-TRF peaks is lost in HI, an abnormal correlation with the earliest part of the onset-TRF emerges, suggesting that hearing impairment results in alteration of the earliest auditory mechanisms.
Matthew Banks, Bryan Krause, Hiroto Kawasaki, Mitchell Steinschneider and Kirill Nourski
Topic areas: memory and cognition speech and language correlates of behavior/perception hierarchical organization
Keywords: Functional connectivity Network topology Limbic cortex Intracranial electrophysiology
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Introduction: Critical organizational features of human auditory cortex (AC) remain unclear, including the definition of information processing streams and the relationship of canonical AC to the rest of the brain. We investigated these questions using diffusion map embedding (DME) applied to intracranial electroencephalographic (iEEG) data from neurosurgical patients. DME maps data into a space where proximity represents similar connectivity to the rest of the network. Methods: Resting state data were obtained from 6487 recording sites in 46 patients (20 female). Regions of interest (ROI) were located in AC, temporo-parietal auditory-related, prefrontal, sensorimotor and limbic/paralimbic cortex. Functional connectivity was averaged within ROI then across subjects, thresholded and normalized, then analyzed using DME. Results: ROIs exhibited a hierarchical organization, symmetric between hemispheres and robust to the choice of iEEG frequency band and connectivity metric. Tight clusters of canonical auditory and prefrontal ROIs were maximally segregated in embedding space. Planum polare (PP) and the lower bank of the superior temporal sulcus (STSL) were located at a distance from the auditory cluster. Clusters consistent with ventral and dorsal auditory processing streams were paralleled by a cluster suggestive of a third stream linking auditory and limbic structures. Portions of anterior temporal cortex were characterized as global hubs. Conclusions: The separation of PP and STSL from AC suggests a higher order function for these ROIs. The limbic stream may carry memory- and emotion-related auditory information. This approach will facilitate identifying network changes during active speech and language processing and elucidating mechanisms underlying disorders of auditory processing.
Laura Gwilliams, Matthew Leonard, Kristin Sellers, Jason Chung, Barundeb Dutta and Edward Chang
Topic areas: speech and language correlates of behavior/perception neural coding
Keywords: speech language single neuron activity neuropixels human auditory cortex superior temporal gyrus
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Decades of lesion and brain imaging studies have identified the superior temporal gyrus (STG) as a critical brain area for speech perception. Here, we used high-resolution multi-laminar Neuropixels arrays to record from hundreds of neurons in the human STG while participants listened to natural speech. We found that neurons exhibit tuning to complex spectro-temporal acoustic cues, which correspond to phonetic and prosodic speech features. However, single neuron activity across the cortical layers demonstrated a highly heterogeneous set of tuning profiles across the depth of the cortex, revealing a novel dimension of speech encoding in STG. Finally, single neuron speech-evoked responses across cortical layers were compared with field potentials recorded at the cortical surface. We found that high-frequency field potential activity reflects the contributions of neurons across all depths, encompassing the diversity of tuning response profiles across cortical layers. Together, these results demonstrate an important axis of speech encoding in STG, namely single neuron tuning across the cortical laminae. L.G. and M.K.L. contributed equally.
Ole Bialas, Edmund Lalor, Emily Teoh and Andrew Anderson
Topic areas: speech and language correlates of behavior/perception
Keywords: temporal response function naturalistic sounds speech perception
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The past decade has seen an increase in efforts to understand the processing of speech and language using naturalistic stimuli. One popular approach has been to use forward encoding models to predict neural responses to speech based on different representations of that speech. For example, it has been shown that adding a categorical representation of phonemes to an acoustic (spectrogram) representation of a speech stimulus increases the accuracy of the predicted EEG response. This has been taken as evidence for a cortical representation of sublexical linguistic units that is categorical and robust with respect to spectrotemporal variation. However, subsequent studies argued that this gain could be explained more parsimoniously based on acoustic features alone. Here, we address this issue by investigating whether the spectral variance across utterances of the same phoneme determines how strongly that phoneme contributes to predicting EEG responses to speech. Simply put, if a phoneme were pronounced exactly the same way every time, adding a categorical phoneme label to a spectrogram of the speech would be redundant and should not increase predictive accuracy. Conversely, the accuracy gained by incorporating a phoneme label should be greater the more variably that phoneme is pronounced. Using temporal response functions, we predicted the brain responses of subjects who listened to segments of an audiobook, based on a spectrogram as well as a phoneme representation of the acoustic input. We investigate whether the loss in predictive accuracy after deleting a phoneme is correlated with its variability, and whether this relationship is reflected in the weights assigned by the model. Our preliminary results suggest that the variance between utterances of a phoneme does affect its predictive power, supporting the idea that EEG indexes a robust cortical representation of language tokens.
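For readers unfamiliar with the method, a minimal sketch of the forward-modeling step follows: a ridge-regularized temporal response function maps lagged stimulus features (spectrogram plus one-hot phoneme labels) to EEG, and a phoneme's contribution is the drop in prediction accuracy after removing it. Array names and the regularization value are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def lagged(X, n_lags):
    """Stack time-lagged copies of the features: (T, F) -> (T, F * n_lags)."""
    T, F = X.shape
    out = np.zeros((T, F * n_lags))
    for lag in range(n_lags):
        out[lag:, lag * F:(lag + 1) * F] = X[:T - lag]
    return out

def trf_prediction_r(spec, phon, eeg, n_lags=32, lam=1e3, split=0.8):
    """Fit a ridge TRF on training data and return the correlation between
    predicted and actual EEG on held-out data. `spec` is (T, bands),
    `phon` is (T, phonemes) one-hot, `eeg` is (T,) for one channel."""
    X = lagged(np.hstack([spec, phon]), n_lags)
    n = int(split * len(eeg))
    w = np.linalg.solve(X[:n].T @ X[:n] + lam * np.eye(X.shape[1]),
                        X[:n].T @ eeg[:n])         # ridge solution
    pred = X[n:] @ w
    return np.corrcoef(pred, eeg[n:])[0, 1]
```

A phoneme's predictive power can then be estimated as `trf_prediction_r` with the full feature set, minus the same quantity after zeroing that phoneme's column.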
Danna Pinto, Adi Brown and Elana Zion Golumbic
Topic areas: speech and language
Keywords: Speech Processing Attention Cocktail Party Own-Name
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Detecting that someone has said your name is one of the most famous examples of incidental processing of supposedly task-irrelevant speech. However, empirical investigation of this so-called “cocktail party effect” has yielded conflicting results. Here we present a novel empirical approach for revisiting this effect under highly ecological conditions, using speech stimuli and tasks relevant to real life and immersing participants in a multisensory virtual environment of a café. Participants listened to narratives of conversational speech from a character sitting across from them, and were told to ignore a stream of announcements spoken by a barista character in the back of the café. Unbeknownst to them, the barista stream sometimes contained their own name or semantic violations. We used combined measurements of brain activity (EEG), eye-gaze patterns, physiological responses (GSR), and behavior to gain a well-rounded description of the response profile to the task-irrelevant barista stream. Both the own-name and semantic-violation probes elicited unique neural and physiological responses relative to control stimuli, indicating that the system was able to process these words and detect their unique status, despite being task-irrelevant. Interestingly, these responses were covert in nature and were not accompanied by systematic gaze shifts towards the barista character. This pattern demonstrates that under these highly ecological conditions, listeners incidentally pick up information from task-irrelevant speech and are not severely limited by a lack of sufficient processing resources. This invites a more nuanced discourse about how the brain deals with simultaneous stimuli in real-life environments and emphasizes the dynamic and non-binary nature of attention.
Matthew McGill, Caroline Kremer, Kameron Clayton, Kamryn Stecyk, Yurika Watanabe, Desislava Skerleva, Eve Smith, Chelsea Rutagengwa, Sharon Kujawa and Daniel Polley
Topic areas: auditory disorders correlates of behavior/perception
Keywords: Hyperacusis Inhibition Hearing Loss Behavior Optogenetics
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Noise exposure that damages cochlear sensory cells and afferent nerve endings is associated with reduced feedforward inhibition from parvalbumin+ (PV) cortical GABAergic neurons, resulting in hyperactive, hyperresponsive, and hypercorrelated spiking in auditory cortex (ACtx) pyramidal neurons. To better establish how the disinhibited neural circuit pathology studied in laboratory animals relates to loudness hypersensitivity and other clinical phenotypes observed in humans with sensorineural hearing loss, we developed a two-alternative forced-choice classification task for head-fixed mice to probe changes in the perception of loudness after controlled cochlear injuries. At baseline (N=17) and in sham-exposed control mice (N=6), behavioral classification of soft versus loud varied smoothly across a 40-80 dB SPL range. After noise exposures that caused either “pure” cochlear neural damage (N=6) or mixed sensorineural pathology (N=5), mice rapidly developed loudness hypersensitivity that manifested as a 9 dB shift in their loudness transition threshold. As expected, bilateral silencing of auditory cortex via optogenetic activation of PV neurons did not affect tone detection probability, but it had an interesting effect on loudness perception: PV activation strongly biased mice to report high-intensity sounds as soft (N=6). Taken together, these data suggest that cortical PV neurons function as a perceptual volume knob; sounds are perceived as louder than normal following acoustic exposures that reduce PV-mediated cortical inhibition, but softer than normal when PV neurons are artificially activated via optogenetics. Clinically, these data enrich the growing literature that identifies PV pathology as a critical point of dysfunction in auditory perceptual disorders.
Stephanie Lovich, David Kaylie, Cynthia King, Christopher Shera and Jennifer Groh
Topic areas: correlates of behavior/perception multisensory processes subcortical processing
Keywords: multisensory eye movement hair cells middle ear muscles sub-cortical
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The auditory, visual, and oculomotor systems work together to aid spatial perception. We have recently reported an oscillation of the eardrum that is time-locked to the onset of an eye movement in the absence of sounds or visual stimuli. These eye movement-related eardrum oscillations (EMREOs) suggest that interactions between the auditory, visual, and oculomotor systems may begin as early as the ear itself. Much is still unknown about this phenomenon. Open questions include: 1) Which motor systems of the inner and middle ear contribute to this eardrum oscillation? Potential candidates include the stapedius muscle, tensor tympani muscle, and/or outer hair cells. 2) What neural circuits drive this oscillation? 3) What are the cognitive or perceptual effects of this oscillation, especially with respect to sound localization? To study the anatomical and neural circuits, we use the rhesus monkey as a model in which to perform controlled invasive surgical and pharmacological manipulations. The rhesus monkey performs saccadic eye movements on time scales similar to those of human participants, and we are able to record ear-canal changes in the same manner as with human participants. Monkeys have a highly reproducible oscillation in both ears, comparable to humans, including alternating phase of the oscillation between the ears and separable horizontal and vertical components related to the horizontal and vertical components of the eye movement. Finally, the monkey model allows a single, specific surgical or pharmacological intervention after baseline data collection, with data collection resuming almost immediately after the procedure and extending over thousands of trials.
Rien Sonck, Jonas Vanthornhout, Estelle Bonin, Aurore Thibaut, Steven Laureys and Tom Francart
Topic areas: speech and language
Keywords: Consciousness Language Neural tracking
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Objectives. Following a severe brain injury, some patients fall into a coma and may subsequently experience disorders of consciousness (DOC). A patient's hearing disability is a confounding factor that can interfere with the assessment of their consciousness and lead to misdiagnosis. To reduce misdiagnosis, we propose to assess the hearing and language abilities of patients with DOC using electroencephalography (EEG), investigating their auditory steady-state responses (ASSR) and neural tracking of the speech envelope. Methods. Sixteen adults participated in the first experiment. While their EEG was recorded, they listened passively to three ASSR stimuli, with amplitude modulation frequencies of 3.1 Hz, 40.1 Hz, and 102.1 Hz; each frequency provides information about a different brain region along the auditory pathway. These stimuli were presented both sequentially (i.e., single ASSR) and simultaneously (i.e., multiplexed ASSR). In a second, still-ongoing experiment (n=2), patients with DOC first listen to a multiplexed ASSR stimulus; we then measure neural tracking of the speech envelope while they listen to a story in their native language, in a foreign language, and in noise. Results. We have shown that the signal-to-noise ratio of evoked multiplexed ASSR responses does not significantly differ from that of evoked single ASSR responses. Furthermore, our preliminary results indicate that neural tracking of the speech envelope is measurable in patients with DOC. Conclusion. Multiplexed ASSR is a valid replacement for single ASSR and can shorten EEG measurements, which is crucial for patients with DOC as they quickly become exhausted. Moreover, neural tracking of the speech envelope might be a promising tool for analyzing DOC patients' speech processing abilities.
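As background on the ASSR measure above, response strength is commonly expressed as the power at the modulation frequency relative to neighboring frequency bins. The sketch below is this standard estimate, not necessarily the authors' exact analysis.

```python
import numpy as np

def assr_snr_db(eeg, fs, mod_freq, n_neighbors=10):
    """SNR of an ASSR: power in the FFT bin at the modulation frequency,
    relative to the mean power of the surrounding (noise) bins, in dB."""
    power = np.abs(np.fft.rfft(eeg)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), 1 / fs)
    k = np.argmin(np.abs(freqs - mod_freq))    # signal bin
    noise = np.r_[power[k - n_neighbors:k],    # bins below and above the signal
                  power[k + 1:k + 1 + n_neighbors]]
    return 10 * np.log10(power[k] / noise.mean())
```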
Carolina Fernandez Pujol, Andrew Dykstra and Elizabeth G Blundon
Topic areas: correlates of behavior/perception thalamocortical circuitry/function
Keywords: consciousness auditory modeling
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Electroencephalography and magnetoencephalography are excellent methods for capturing human neural activity on a millisecond time scale, yet little is known about the laminar and biophysical basis of the signals they measure. Here, we used a reduced but realistic cortical circuit model, the Human Neocortical Neurosolver (HNN), to shed light on the laminar specificity of brain responses associated with auditory conscious perception under multitone masking. HNN provides a canonical model of a neocortical column circuit, including both excitatory pyramidal and inhibitory basket neurons in layers II/III and layer V. We found that the difference in event-related responses between perceived and unperceived target tones could be accounted for by additional input to supragranular layers arriving from either the non-lemniscal thalamus or cortico-cortical feedback connections. Layer-specific spiking activity of the circuit revealed that the additional negative-going peak that was present for detected but not undetected target tones was accompanied by increased firing of layer-V pyramidal neurons. These results are consistent with current cellular models of conscious processing and help bridge the gap between the macro and micro levels of analysis of perception-related brain activity.
Vinay Raghavan, James O'Sullivan and Nima Mesgarani
Topic areas: speech and language neural coding novel technologies
Keywords: auditory attention decoding glimpsing model event-related potential
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Individuals suffering from hearing loss struggle to attend to speech in complex acoustic environments. While hearing aids suppress background noise, a talker can only be amplified when it is known who the listener aims to attend to. Neuroscientific advances have allowed us to determine the focus of a listener’s attention from their neural recordings, including non-invasive electroencephalography (EEG) and intracranial EEG (iEEG), a process known as auditory attention decoding (AAD). Recent research suggests that attention differentially influences glimpsed and masked speech event encoding in auditory cortex. However, these differences have yet to be leveraged for stimulus reconstruction (SR), and they suggest that event-related potentials (ERPs) to glimpsed and masked speech features also contain robust signatures of attention. Therefore, we sought to characterize attention decoding accuracy using SR- and ERP-based methods that leverage differences in the neural representations of glimpsed and masked speech. Here, we obtained iEEG responses in auditory cortex while subjects attended to one talker in a two-talker mixture. We also analyzed two publicly-available EEG datasets with the same task. We used linear decoding models to reconstruct the glimpsed, masked, and combined envelope and to classify ERPs to glimpsed, masked, and combined acoustic edge events. We found AAD through the classification of glimpsed and masked ERPs was most accurate at shorter time durations when utilizing iEEG recordings. Fewer differences in performance were observed in low-frequency EEG. These results suggest glimpsed and masked ERP-based AAD is preferable when using intracranial recordings due to increased performance at low latencies.
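For context, the baseline stimulus-reconstruction approach that the glimpsed/masked analysis above extends works roughly as follows: a backward model reconstructs an envelope from lagged neural data, and attention is assigned to whichever talker's envelope correlates better with the reconstruction. This is a generic sketch with an assumed pre-trained decoder `w` (e.g., fit by ridge regression), not the authors' models.

```python
import numpy as np

def decode_attention(neural, env_a, env_b, w, n_lags=16):
    """Label the attended talker by correlating a reconstructed envelope
    with each talker's envelope. `neural` is (T, channels); a backward
    model uses post-stimulus lags, so time t draws on neural data at t+lag."""
    T, C = neural.shape
    X = np.zeros((T, C * n_lags))
    for lag in range(n_lags):
        X[:T - lag, lag * C:(lag + 1) * C] = neural[lag:]
    recon = X @ w                                  # reconstructed envelope
    r_a = np.corrcoef(recon, env_a)[0, 1]
    r_b = np.corrcoef(recon, env_b)[0, 1]
    return "A" if r_a > r_b else "B"
```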
Chloe Weiser, Bailey King, Maya Provencal, Jennifer Groh and Cynthia King
Topic areas: correlates of behavior/perception multisensory processes
Keywords: sound localization auditory perception simultaneous auditory signals
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The visual system can detect many different simultaneous stimuli. This is achieved in part due to small neuronal receptive fields in primary visual cortex, as well as neuronal multiplexing of simultaneously presented stimuli (Li et al., 2016; Caruso et al., 2018). However, the mammalian auditory system appears to lack a map of auditory space, possibly limiting the processing of multiple simultaneous sounds. Previous work reported that humans can accurately detect 2-4 simultaneous sounds, depending on stimulus type (Yost and Zhong 2017; Yost et al. 2019). Participants in these earlier studies reported the number of detectable sound locations, but a 2-interval forced-choice task comparing different numbers of sound locations might be more sensitive. Here, subjects were asked which of two sets of spatially differentiated auditory stimuli involved more distinct locations. Each trial contained two presentations of 6 fixed-frequency noise bands, randomly spread across 1-6 of 8 speakers evenly spaced around the horizontal frontal field. For each stimulus pair, the first stimulus (the benchmark) always played from 3 randomly assigned speakers; for the second stimulus, the number of speakers was randomized between 1 and 6. Subjects indicated whether the first or second stimulus used more speakers. Subjects completed two 600-trial sessions (200 ms and 1000 ms stimulus duration). Results align with previous work showing that humans can detect roughly 2-4 sound locations at a time. Longer stimulus duration did not improve task performance. Thus, to the extent that multiple sound locations are multiplexed in the auditory system, additional detection time does not increase the number of sounds encoded.
Aurélie Bidet-Caulet, Philippe Albouy and Roxane Hoyer
Topic areas: memory and cognition correlates of behavior/perception
Keywords: auditory attention distraction EEG human development
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Distractibility is the propensity to behaviorally react to irrelevant information. It relies on a balance between voluntary and involuntary attention. Voluntary attention enables performing an ongoing task efficiently over time by selecting relevant information and inhibiting irrelevant stimuli, whereas involuntary attention is captured by an unexpected salient stimulus, leading to distraction. Voluntary and involuntary attention rely on partially overlapping dorsal and ventral brain networks, which undergo significant development during childhood. The developmental trajectory of distractibility has been behaviorally characterized using a recently developed paradigm, the Competitive Attention Test (CAT). In young children, increased distractibility was found to result mostly from reduced sustained attention and enhanced distraction, and in teenagers from decreased motor control and increased impulsivity. However, it is not clear how these behavioral developmental changes are implemented in the brain. To address this question, we recorded electrophysiological (EEG) signals and behavioral responses from 3 age groups (6-7, 11-13, and 18-25 years old) performing the CAT. To assess voluntary attention orienting, the CAT includes informative and uninformative visual cues before an auditory target to detect. To measure distraction, the CAT comprises trials with a task-irrelevant complex sound preceding the target sound. Moreover, the rates of different types of false alarms, late responses, and missed responses provide behavioral measures of sustained attention, impulsivity, and motor control. EEG brain responses to relevant and irrelevant sounds were investigated. The CNV before the target was modulated by cue information in adults only. In response to the target, a cue effect was found on the N1 in teenagers and on the P3b in children. These effects of voluntary orienting at different moments of the task according to age suggest a shift from a reactive to a proactive strategy from age 6 to adulthood. In response to the irrelevant sounds, a larger and longer RON was found in children and teenagers compared to adults, suggesting difficulties in reorienting back to the task before adulthood. These brain changes during childhood are associated with a behavioral increase in sustained attention and a decrease in distraction and impulsivity. These findings give important insights into how the developing brain shapes child behavior.
Bshara Awwad, Yurika Watanabe, Olivia Stevenson, Liam Casey, Kameron Clayton and Daniel Polley
Topic areas: auditory disorders neural coding
Keywords: Hearing loss disinhibition hyperexcitability neural plasticity
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
After sensorineural hearing loss, auditory cortex (ACtx) neurons become hyperresponsive to sound (excess central gain), a core feature of tinnitus and hyperacusis. Ex vivo experiments suggest that auditory hyperresponsivity could arise through two mechanisms: disinhibition via reduced feedforward inhibition, or sensitization via enhanced glutamatergic inputs. Here, we developed a novel optogenetic approach to put these ideas to the test in the intact ACtx. We used a triple-virus strategy to express ChR2 in parvalbumin+ (PV) GABAergic neurons and a somatically restricted, red-shifted opsin in contralateral neurons that project to the ACtx via the corpus callosum, allowing independent optical control over select populations of inhibitory (PV) and excitatory (callosal) neurons. High-density translaminar recordings were made from the high-frequency region of A1 in awake, head-fixed mice up to three days following acoustic trauma or an innocuous sham exposure. Sound intensity growth functions from regular-spiking putative pyramidal neurons were markedly increased after acoustic trauma (n = 484 units in 6 mice), particularly in layer 5, compared to sham exposure (n = 402 units in 5 mice). Dual optogenetic activation revealed that excess auditory gain was accompanied by a striking disinhibition, as measured from reduced PV-mediated suppression of spiking, without any evidence of sensitization to direct activation of excitatory callosal neurons. Hyperresponsivity of deep-layer projection neurons via disinhibition could induce strong coupling with limbic brain regions and negative auditory affect after acoustic trauma, a possibility that we are exploring in ongoing experiments via dual recordings from the ACtx and basolateral amygdala.
Rose Ying, Daniel Stolzberg and Melissa Caras
Topic areas: correlates of behavior/perception subcortical processing
Keywords: inferior colliculus medial geniculate nucleus perceptual learning task-dependent plasticity
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Training can improve the detection of near-threshold stimuli, a process called perceptual learning. Previous research has shown that perceptual learning strengthens task-dependent modulations of auditory cortical activity. However, it is unclear whether these changes emerge in the ascending auditory pathway and are inherited by the auditory cortex, or arise in the cortex de novo. To address this, we implanted Mongolian gerbils with chronic microelectrode arrays in either the central nucleus of the inferior colliculus (CIC) or the ventral medial geniculate nucleus (vMGN). We recorded single- and multi-unit activity as animals trained and improved on an aversive go/no-go amplitude modulation (AM) detection task, and during passive exposure to the same AM sounds. AM-evoked firing rates and vector strengths were calculated and transformed into the signal detection metric d’. Neural thresholds were obtained for each training day by fitting d’ values across AM depths and determining the depth at which d’ = 1. As expected, CIC neurons encoded AM using a temporal strategy. Neural thresholds were similar during task and passive conditions, suggesting an absence of task-dependent modulation in the CIC. However, both task and passive neural thresholds improved, suggesting that the CIC does display learning-related plasticity independent of task engagement. vMGN neurons used both temporal and rate strategies to encode AM. As in the CIC, neural thresholds recorded during task performance improved, suggesting that learning-based plasticity is also present in the vMGN. However, unlike in the CIC, rate-based neural thresholds in the vMGN were better during task performance compared to passive exposure, suggesting that the vMGN is subject to task-dependent modulation. Notably, the magnitude of task-dependent modulation increased over the course of training, similar to what has been reported in the auditory cortex. These findings suggest that training may improve neural sensitivity at or below the level of the auditory midbrain, and simultaneously strengthen non-sensory modulations of the auditory thalamus. Our results contribute to a deeper understanding of the circuits supporting perceptual learning, and may ultimately inform strategies for improving sound perception in the hearing-impaired.
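The neurometric pipeline described above (vector strength, d’, and a neural threshold at d’ = 1) can be sketched as follows; the sigmoid form and starting values are illustrative assumptions rather than the authors' exact fitting procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

def vector_strength(spike_times, mod_freq):
    """Phase-locking of spikes to the AM cycle (0 = none, 1 = perfect)."""
    phases = 2 * np.pi * mod_freq * np.asarray(spike_times)
    return np.abs(np.mean(np.exp(1j * phases)))

def dprime(am_vals, unmod_vals):
    """Separation of AM-evoked from unmodulated responses, in pooled-SD units."""
    pooled_sd = np.sqrt(0.5 * (np.var(am_vals) + np.var(unmod_vals)))
    return (np.mean(am_vals) - np.mean(unmod_vals)) / pooled_sd

def neural_threshold(depths_db, dprimes):
    """Fit d' across AM depths with a sigmoid and return the depth at d' = 1."""
    sigmoid = lambda x, a, b, c: c / (1 + np.exp(-a * (x - b)))
    (a, b, c), _ = curve_fit(sigmoid, depths_db, dprimes,
                             p0=[0.5, -9.0, 3.0], maxfev=10000)
    return b - np.log(c - 1) / a        # invert the fitted sigmoid at d' = 1
```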
John Kyle Cooper, Marlies Gillis, Lotte Van den Eynde, Jonas Vanthornhout and Tom Francart
Topic areas: speech and language correlates of behavior/perception neural coding
Keywords: EEG Language Lateralization
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Motivation: Previous research has shown that the left hemisphere of the brain plays an important role in language understanding. We investigate the effects of language understanding on hemispheric lateralization of brain activity using electroencephalography (EEG) and temporal response functions (TRFs) in two experimental paradigms that use different languages. Using these languages, we observe how brain activity changes depending on whether the listener understands the language. Methods: In a first study, 26 native Dutch-speaking subjects listened to a Dutch story and a Frisian story while their EEG was recorded. In a second study, 4 subjects listened to stories in Dutch, French, and Italian while their EEG was recorded. Results: In the TRFs, increased left-hemisphere lateralization was found at 150 ms when listeners were presented with their native language. When listeners were presented with their second language, there was decreased left-hemisphere lateralization at 150 ms. Decreased left-hemisphere lateralization was also observed when listeners were presented with languages they do not understand. Conclusion: Using EEG measurements, we analyzed the effects of language understanding on hemispheric lateralization in TRFs. The hemispheric lateralization observed when subjects listen to their second language is inconsistent with the lateralization observed when they listen to their native language. Therefore, hemispheric lateralization could be used to assess native language understanding in individuals with hearing impairment and/or neurological disorders (e.g., aphasia). Acknowledgments: Financial support for this project is provided by a Ph.D. grant from the Research Foundation Flanders (FWO).
Katharina S Bochtler, Fred Dick, Lori L Holt, Andrew J King and Kerry M M Walker
Topic areas: memory and cognition
auditory attention sound statistics duration discrimination ferrets attentional gain | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The ability to direct our attention towards a single sound source, such as a friend's voice in a crowded room, is essential in our acoustical world. This process is thought to rely, in part, on directing attention to different sound dimensions, such as frequency. Previous investigations have shown task-dependent changes in the frequency tuning of auditory cortical neurons when ferrets actively detect or discriminate a particular sound frequency (e.g. Fritz et al. 2010). However, questions remain about how attentional gain can arise based on sound statistics: specifically, to what extent can this modulation occur even if frequency is not a necessary component of the task demands? Mondor & Bregman (1994) demonstrated that human listeners' reaction times on a tone duration task were slower when the presented tone frequency was unexpected (i.e. of low probability). Here, we test the hypothesis that the statistical likelihood of sound frequencies alone can also affect animals' behavioural decisions on orthogonal dimensions of sounds. We trained ferrets on a 2-alternative forced choice tone duration discrimination task in which we manipulated the statistical likelihood of tone frequencies. Our results show that, similar to humans, ferrets' reaction times on this duration judgement task increased for low-probability frequencies, while their accuracy remained stable across frequencies. These results suggest that attentional filters are employed during listening, even for an acoustical dimension (frequency) that is orthogonal to the task demands (duration). Our future experiments will use this task in combination with microelectrode recordings to investigate the neurophysiological basis of statistics-based attentional filtering in the auditory cortex.
Ziyi Zhu, Celine Drieu and Kishore Kuchibhotla
Topic areas: correlates of behavior/perception
Reinforcement learning Two-alternative forced choice Auditory discrimination Computational modelling | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Learning is not only the acquisition of knowledge, but also the ability to express that knowledge when needed. We tested the acquisition of task knowledge using non-reinforced probe trials while mice were trained to perform a balanced, wheel-based auditory two-alternative forced choice (2AFC) task. During probe trials, animals exhibited surprisingly higher accuracy and lower directional bias early in learning, as compared to reinforced trials, suggesting that they had already acquired unbiased knowledge of the stimulus-action contingency but expressed this knowledge much more slowly under reinforcement. Why do animals exhibit this gap in accuracy and directional bias between acquisition and expression, despite having already acquired the stimulus-action associations? Animals may (1) exhibit motor biases that they slowly learn to suppress, (2) continue to explore different choice alternatives, or (3) base decisions on recent trial history, including choice and reward, rather than current stimuli. To test between these and other potential drivers, we first used a generalized linear model to separate different contributors to animals' choices during learning, including stimulus identity, trial-history effects, and a continuous but slowly evolving directional preference (not influenced by stimulus or history factors), which we term action bias. Action bias, but not trial history, was the most important contributor to choice besides stimulus identity, and partially bridged the gap between acquisition and expression. We then asked whether the structure of this action bias is static, reflecting a motor bias, or dynamic, reflecting changing behavioral strategies. Individual animals showed distinct states with left or right bias in blocks of tens to hundreds of trials and transitioned between both directions and unbiased states throughout learning, suggesting a dynamic bias structure. As learning progressed, animals exhibited less extreme bias, but continued to transition in and out of low-bias states even at expert-level performance. Taken together, behavioral expression may reflect an action-bias-driven exploratory process that is uncoupled from acquisition, evolves during learning, and persists to a lower degree at the expert level to potentially maintain flexibility.
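A stripped-down version of such a choice GLM might look like the sketch below. The data are simulated, the history terms are reduced to one-trial-back regressors, and the slowly evolving action bias is approximated with a small cosine basis over trial index; the model in the abstract is richer, so treat this only as a schematic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulated trial-by-trial data for one mouse (1 = right, 0 = left).
rng = np.random.default_rng(2)
n = 2000
stim = rng.integers(0, 2, n)                      # rewarded side per trial
drift = np.sin(np.linspace(0, 3 * np.pi, n))      # hidden slow action bias
p_right = 1 / (1 + np.exp(-(2.0 * (stim - 0.5) + drift)))
choice = rng.binomial(1, p_right)
prev_choice = np.r_[0, choice[:-1]]
prev_reward = np.r_[0, (choice[:-1] == stim[:-1]).astype(float)]

# Design matrix: stimulus, one-trial-back history, and smooth basis
# functions of trial index that absorb a slowly drifting bias.
t = np.linspace(0, 1, n)
bias_basis = np.stack([np.cos(np.pi * k * t) for k in range(1, 6)], axis=1)
X = np.column_stack([stim, prev_choice, prev_reward, bias_basis])

glm = LogisticRegression(max_iter=1000).fit(X, choice)
names = ["stimulus", "prev choice", "prev reward"] + \
        [f"bias basis {k}" for k in range(1, 6)]
for name, w in zip(names, glm.coef_[0]):
    print(f"{name:>12s}: {w:+.2f}")
```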
Xiu Zhai, Alex Clonan, Ian Stevenson and Monty Escabi
Topic areas: speech and language correlates of behavior/perception
speech in noise natural sounds modulation frequency speech recognition auditory midbrain model | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Being able to recognize sounds in competing noise is a critical task of the normally functioning auditory system. Here we performed human psychoacoustic studies to assess how the spectrum and modulation content of natural background sounds mask the recognition of speech digits. Native English speakers with normal hearing (0-20 dB threshold, 0.25-8 kHz) listened to digits in various original and perturbed maskers at 72 dB SPL and -9 dB SNR. Phase-randomized (PR) and spectrum-equalized (SE) background variants were used to dissociate spectrum versus modulation masking effects. Response accuracy showed differences across sounds and conditions, indicating that masking can be attributed to the modulation content and its higher-order structure. For instance, the PR speech babble exhibited an increase in accuracy, indicating that the modulation content is a major masking component. For construction noise, by comparison, the modulations tended to improve accuracy. Thus, individual backgrounds can produce varied outcomes, and the unique modulation content of each background can affect digit identification beneficially or detrimentally. We next developed an auditory midbrain model to determine whether masker interference in a physiologically inspired modulation space could predict the perceptual trends. Sounds were decomposed through a cochlear filterbank and a subsequent set of spectro-temporal receptive fields that model modulation sensitivity and map the waveform into temporal and spectral modulations. These outputs were then sent to a logistic regression model to estimate perceptual transfer functions and ultimately predict response accuracy. Cross-validated predictions demonstrate that the model accounts for ~90% of the perceptual response variance. The model also outperformed predictions obtained using a cochlear model, which accounted for only ~60% of the variance. The perceptually derived transfer functions subsequently allow us to identify salient cues that impact recognition in noise. For instance, slow background modulations (<8 Hz) tended to reduce accuracy, whereas spectral modulations in speech in the voicing harmonicity range tended to improve accuracy. These findings demonstrate that the modulation content of environmental sounds can have adversarial masking outcomes on speech recognition and that an auditory midbrain-inspired representation can predict and identify high-order cues that contribute to listening in noise.
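The overall structure of the model, a filterbank front end whose modulation statistics feed a logistic readout of recognition accuracy, can be caricatured in a few lines. The sketch below substitutes an ordinary STFT for the cochlear model and random placeholder labels for the real psychoacoustic data, so it illustrates only the shape of the pipeline.

```python
import numpy as np
from scipy.signal import stft
from sklearn.linear_model import LogisticRegression

def modulation_features(waveform, fs, n_bands=32, n_rates=16):
    """Crude front-end stand-in: log band envelopes from an STFT, then the
    temporal modulation spectrum of each band, averaged across bands."""
    _, _, Z = stft(waveform, fs=fs, nperseg=512)
    spec = np.log(np.abs(Z[:n_bands]) + 1e-6)
    mod = np.abs(np.fft.rfft(spec - spec.mean(axis=1, keepdims=True), axis=1))
    return mod.mean(axis=0)[:n_rates]          # low modulation rates only

# Placeholder trials: masker waveforms labeled by digit recognition outcome.
fs = 16000
rng = np.random.default_rng(3)
X = np.array([modulation_features(rng.normal(size=fs), fs) for _ in range(200)])
y = rng.integers(0, 2, 200)

clf = LogisticRegression(max_iter=1000).fit(X, y)
# The fitted weights play the role of a perceptual transfer function over
# modulation rate: negative weights mark masker modulations that hurt digits.
print(clf.coef_[0].round(2))
```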
Nicholas Audette, Wenxi Zhou, Alessandro La Chioma and David Schneider
Topic areas: memory and cognition correlates of behavior/perception neural coding neuroethology/communication
Predictive Processing Auditory Cortex Sensory-motor Learning Expectation Forelimb Behavior | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Many of the sensations experienced by an organism are caused by its own actions, and accurately anticipating both the sensory features and timing of self-generated stimuli is crucial to a variety of behaviors. In the auditory cortex, neural responses to self-generated sounds exhibit frequency-specific suppression, suggesting that movement-based predictions may be implemented early in sensory processing. Yet it remains unknown whether this modulation results from a behaviorally specific and temporally precise prediction, and whether corresponding expectation signals are present locally in the auditory cortex. To address these questions, we trained mice to expect the precise acoustic outcome of a forelimb movement using a closed-loop sound-generating lever. Dense neuronal recordings in the auditory cortex revealed suppression of responses to self-generated sounds that was specific to the expected acoustic features, specific to a precise position within the movement, and specific to the movement that was coupled to sound during training. Prediction-based suppression was concentrated in L2/3 and L5, where deviations from expectation also recruited a population of prediction-error neurons that was otherwise unresponsive. Recording in the absence of sound revealed abundant movement signals in deep layers that were biased toward neurons tuned to the expected sound, as well as expectation signals that were present across cortical depths and peaked at the time of expected auditory feedback. Together, these findings reveal that predictive processing in the mouse auditory cortex is consistent with a learned internal model linking a specific action to its acoustic outcome with a temporal resolution of tens of milliseconds, while identifying distinct populations of neurons that anticipate expected stimuli and differentially process expected versus unexpected outcomes.
Yuko Tamaoki, Varun Pasapula, Tanya Danaphongse, Samantha Kroon, Olayinka Olajubutu, Michael Borland and Crystal Engineer
Topic areas: speech and language subcortical processing
Autism Neurodevelopmental Disorder Vagus Nerve Stimulation Inferior Colliculus Electrophysiology | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Receptive language deficits are often observed in individuals with autism spectrum disorders (ASD). Auditory cortex neurons in children with ASD respond more slowly and more weakly than those in typically developing children. Prenatal exposure to valproic acid (VPA), an anticonvulsant medication, increases the risk for ASD, and symptoms associated with ASD are often observed following exposure, including altered sensory processing and deficits in language development. Impairments in sensory processing are also seen in rodents prenatally exposed to valproic acid. These rodents display deficits in speech sound discrimination ability, and these behavioral characteristics are accompanied by changes in cortical activity patterns. In the primary auditory cortex (A1), the normal tonotopic map observed in typically hearing animals is reorganized and degraded in VPA-exposed rats. In VPA-exposed animals, neurons in the midbrain regions, such as the superior olivary complex and inferior colliculus, have disrupted morphology. A method to improve these neural deficits throughout the auditory pathway is needed. We have developed a new approach to drive plasticity that enhances recovery after neurological damage: vagus nerve stimulation (VNS) paired with sound presentation. The aims of this study are to 1) document differences in the multi-unit inferior colliculus (IC) response to sounds in VPA-exposed rats in comparison to saline-exposed control rats, and 2) investigate the ability of VNS paired with sounds to reverse the maladaptive plasticity in the inferior colliculus of VPA-exposed rats. In these experiments, we test the hypothesis that VNS paired with speech sound and tone presentation will reverse maladaptive plasticity and restore neural responses to sounds in VPA-exposed rats. Our results suggest that VPA-exposed rats displayed weaker IC responses to speech sounds than control rats, and that VNS-sound pairing is an effective method to enhance auditory processing: VNS-sound pairing strengthened the IC response to both the paired sound and sounds that were acoustically similar. Insights derived from this study may influence the development of new behavioral and sensory techniques to treat communication impairments that result in part from a degraded neural representation of sounds.
Nathan Vogler, Violet Tu, Alister Virkler, Ruoyi Chen, Tyler Ling, Jay Gottfried and Maria Geffen
Topic areas: correlates of behavior/perception multisensory processes
Auditory Auditory Cortex Olfactory Multisensory | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
In complex environments, the brain must integrate information from multiple sensory modalities, including the auditory and olfactory systems. However, little is known about how the brain integrates auditory and olfactory stimuli. Here, we investigated the mechanisms underlying auditory-olfactory integration using anatomy, electrophysiology, and behavior. We first used viral tracing strategies to investigate the circuits underlying auditory-olfactory integration. Our results demonstrate direct inputs to the auditory cortex (ACx) from the piriform cortex (PCx), mainly from the posterior PCx, suggesting an anatomical substrate for olfactory integration in ACx. We next developed an experimental system for delivering combinations of auditory and olfactory stimuli during in vivo electrophysiology, and tested the effect of odor stimuli on auditory cortical responses to sound in awake mice. Odor stimuli modulate the responses of ACx neurons in a stimulus- and sound level-dependent manner, suggesting a neural substrate for olfactory integration in ACx. Finally, we trained mice on a sound detection Go/No-Go task to assess how odor stimuli affect auditory perception and behavior. Odors facilitate auditory perception by lowering sound detection thresholds. Together, our findings reveal novel circuits and mechanisms for auditory-olfactory integration involving the ACx.
Gavin Mischler, Menoua Keshishian, Stephan Bickel, Ashesh Mehta and Nima Mesgarani
Topic areas: speech and language
neural adaptation deep neural networks modeling noise-robust gain control | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The human auditory system displays a robust capacity to adapt to sudden changes in background noise, allowing for continuous speech comprehension despite changes in background environments. However, despite comprehensive studies characterizing this ability, the computations that enable the brain to achieve this are not well understood. The first step towards understanding a complex system is to propose a suitable model, but the classical and easily interpreted model for the auditory system, the spectro-temporal receptive field (STRF), cannot match the nonlinear dynamics of noise adaptation. To overcome this, we utilize a deep neural network (DNN) to model neural adaptation to noise, illustrating its effectiveness at reproducing the complex dynamics at the levels of both individual electrodes and the cortical population. By closely inspecting the model’s STRF-like computations over time, we find that the model alters both the gain and shape of its receptive field when adapting to a sudden noise change, enabling multiple noise filtering methods to be used. Further, we find that models of electrodes in nonprimary auditory cortex exhibit different filtering changes compared to primary auditory cortex, suggesting differences in noise filtering mechanisms along the cortical hierarchy. These findings demonstrate the capability of deep neural networks to model complex neural adaptation and offer new hypotheses about the computations that the auditory cortex performs to enable noise-robust speech perception in real-world, dynamic environments.
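One way to inspect a network's STRF-like computation "over time", in the spirit of this abstract, is to linearize the model around a given input: the gradient of the predicted response with respect to each spectrogram bin is a locally valid receptive field. The sketch below estimates that gradient by finite differences on a toy stand-in model; it illustrates the idea only, and is not the authors' implementation.

```python
import numpy as np

def local_strf(model, spectrogram, eps=1e-3):
    """Locally linear receptive field at one input: the gradient of the
    model's predicted response w.r.t. each spectrogram bin, estimated by
    central finite differences (model is any callable returning a scalar)."""
    grad = np.zeros_like(spectrogram)
    s = spectrogram.copy()
    for idx in np.ndindex(spectrogram.shape):
        s[idx] += eps
        up = model(s)
        s[idx] -= 2 * eps
        down = model(s)
        s[idx] += eps                      # restore original value
        grad[idx] = (up - down) / (2 * eps)
    return grad

# Toy stand-in for a trained DNN: a fixed nonlinear readout of the input.
rng = np.random.default_rng(4)
w = rng.normal(size=(16, 20))
toy_model = lambda s: np.tanh((w * s).sum())

S = rng.normal(size=(16, 20))              # frequency x time spectrogram
strf_now = local_strf(toy_model, S)
print("locally linear STRF shape:", strf_now.shape)
```

Comparing such locally linear filters estimated just before and after a background change would expose the gain and shape adjustments described above.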
Hiroaki Tsukano and Hiroyuki Kato
Topic areas: memory and cognition correlates of behavior/perception hierarchical organization
Auditory Cortex Orbitofrontal Cortex Habituation | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Sensory stimuli lose their perceptual salience after repetitive exposure (“habituation”). We previously reported that daily passive sound exposure attenuates neural responses in the mouse primary auditory cortex (A1), and local inhibition by somatostatin-expressing neurons (SOM neurons) mediates this plasticity. In the current study, we further explored the source of top-down inputs that control SOM neurons to trigger habituation. We first conducted retrograde tracing and found that A1 receives projections from the frontal cortical areas, including the orbitofrontal cortex (OFC). Interestingly, optogenetic activation of the OFC axon terminals suppressed A1 neuronal activity, suggesting a top-down inhibitory control of sensory representations. To investigate the plasticity of OFC top-down inputs during habituation, we performed two-photon calcium imaging of OFC axon terminals in A1 during daily passive exposure to tones. We found that tone-evoked activity of OFC axons was enhanced over days, suggesting their contribution to the attenuation of A1 sound responses. Finally, we examined the causal role of OFC in habituation by its pharmacological inactivation during chronic calcium imaging of A1 neural activity. Strikingly, acute muscimol infusion into OFC reversed the pre-established A1 habituation, indicating the requirement of the OFC in sensory habituation. Together, these results suggest the predictive gating of sensory activity by a global circuit mechanism recruiting the frontal top-down inputs.
Kai Lu and Robert Liu
Topic areas: memory and cognition neuroethology/communication
Auditory cortex learning innate | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In nature, animals learn to modify their innate behaviors to better adapt to the environment, transitioning their actions from pre-existing stereotypes to novel, more adaptive ones. Here we explored the neural basis for such transitions by investigating how freely moving female mice learn to override the way they innately search for pups by relying on a new sound that predicts where pups will be found. Naïve virgins were trained in a T-maze to enter one of two arms cued by an amplitude-modulated band-pass noise and rewarded with pups, which were then retrieved back to the nest at the main stem. All mice (N=9) initially used an innate spatial-memory-based strategy of searching the arm where a pup was presented in the prior trial. Within 8 days, all animals learned (70% correct) to use the sound to locate pups. We recorded single-unit/multi-unit spiking in auditory cortices (AC, N~1200) and medial prefrontal cortices (mPFC, N~600) during learning. Our results showed that: (1) AC responses before choosing were significantly different between correct and wrong trials (55% of units), suggesting top-down modulation; (2) sensitivity to the nest sound increased over training (p < 0.001) as performance using the sound improved. Meanwhile, mPFC neurons exhibited higher population activity when animals made wrong choices (p < 0.001), suggesting they help evaluate choice outcomes. Future work will model how these sensory and prefrontal changes during learning work together to promote switching from the innate strategy to the more efficient auditory strategy. Grant: R01DC008343
Justin Yao, Klavdia Zemlianova, David Hocker, Cristina Savin, Christine Constantinople, Sueyeon Chung and Dan Sanes
Topic areas: correlates of behavior/perception neural coding
parietal cortex auditory perception decision-making neural response manifold geometric analysis | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The process by which sensory evidence contributes to perceptual choices requires an understanding of its transformation into decision variables. Here, we address this issue by evaluating the neural representation of acoustic information in auditory cortex-recipient parietal cortex while gerbils either performed an auditory discrimination task or passively listened to identical acoustic stimuli. Gerbils were required to discriminate between two amplitude modulation (AM) rates, 4 versus 10 Hz, as a function of AM duration (100-2000 msec). Task performance improved with increasing AM duration and reached an optimum at approximately 800 msec. Decoded activity of simultaneously recorded parietal neurons reflected psychometric sensitivity during task performance. Decoded activity during passive listening was poorer than during task performance, but scaled with increasing AM duration. This suggests that the parietal cortex could accumulate this sensory evidence for the purpose of forming a decision variable. To test whether decision variables emerge within parietal cortex activity, we applied principal component and geometric analyses to the neural responses. Both analyses revealed the emergence of decision-relevant, linearly separable manifolds on a behaviorally relevant timescale, but only during task engagement. Finally, using a clustering analysis, we found three subpopulations of neurons that may reflect the encoding of separate segments of task performance: stimulus integration and motor preparation or execution. Taken together, our findings demonstrate how the parietal cortex integrates and transforms encoded auditory information to guide sound-driven perceptual decisions.
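The logic of the decoding analysis can be illustrated with a time-resolved linear decoder. In this sketch the population tensor is simulated, with a "choice" signal injected into late time bins of a subset of units; cross-validated accuracy climbing above chance marks when a linearly separable decision variable emerges.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Simulated data: trials x neurons x time bins, one choice label per trial.
rng = np.random.default_rng(5)
n_trials, n_neurons, n_bins = 200, 60, 40
X = rng.normal(size=(n_trials, n_neurons, n_bins))
choice = rng.integers(0, 2, n_trials)
X[choice == 1, :10, 20:] += 0.5        # late-emerging choice signal

# Decode choice independently at each time bin with cross-validation.
acc = np.array([cross_val_score(LogisticRegression(max_iter=1000),
                                X[:, :, b], choice, cv=5).mean()
                for b in range(n_bins)])
above = np.flatnonzero(acc > 0.7)
print("first bin above 70% correct:", above[0] if above.size else "none")
```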
Marlies Gillis, Jonas Vanthornhout and Tom Francart
Topic areas: speech and language
Neural tracking Speech processing Linguistic tracking Speech understanding | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In recent years, increasing attention has been devoted to understanding and characterizing the neural responses associated with speech understanding. A distinction can be made between merely hearing the stimulus and understanding the presented speech, whereby the listener can connect to the meaning of the storyline. To investigate the neural response to speech, one can examine neural tracking, i.e., the phenomenon whereby the brain time-locks to specific aspects of the speech. Although envelope tracking is thought to capture certain aspects of speech understanding, this is not always true (Verschueren et al., 2022). A possible solution is to focus on higher-level language features, derived from the speech's content, which can capture the neural correlates of speech understanding (e.g., Brodbeck et al., 2018; Broderick et al., 2018; Weissbart et al., 2020). This study evaluated whether neural tracking of these higher-level language features, i.e., linguistic tracking, gives more insight into whether the listener understood the presented speech. We investigated the EEG responses of 19 normal-hearing young participants (6 men) who listened to a Dutch story, a Frisian story (Frisian was not familiar to the participants), and a word list in which individual words were understood but the context did not make sense. We hypothesized that the Dutch story would show linguistic tracking, as the storyline can be understood, while this would not be the case for the Frisian story and the word list. Preliminary results indicate that the Dutch story showed more linguistic neural tracking than the Frisian story, which in turn showed higher linguistic tracking than the word list. The results obtained by analyzing linguistic tracking converged with subjectively rated speech understanding, i.e., the answer to the question 'how much of the content of the speech did you comprehend?'. The Dutch story was fully intelligible, followed by the Frisian story, rated around 50%, while speech understanding for the word list was rated around 10%. Our preliminary results indicate that linguistic tracking can capture the effect of speech understanding. These results open doors toward understanding language disorders and improving their diagnosis and treatment.
Vishal Choudhari, Cong Han, Stephan Bickel, Ashesh Mehta and Nima Mesgarani
Topic areas: speech and language correlates of behavior/perception novel technologies
Auditory Attention Decoding Spatial Attention Cognitively-Controlled Hearing Aids Speech Separation Deep Learning | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Hearing-impaired listeners experience difficulty in attending to a specific talker in the presence of interfering talkers. Cognitively-controlled hearing aids aim to address this problem by decoding the attended talker from neural signals using auditory attention decoding (AAD) algorithms, separating the speech mixture into individual streams, and selectively enhancing the attended speech stream. Prior work investigating AAD algorithms has often used simple acoustic scenes: the attended and unattended talkers are usually of different sexes and stationary in space with no relative motion, and background noise is often ignored. Such scenes do not mimic real-life settings. More importantly, the talkers could also be engaged in conversations, which calls for attention switches during turn-taking. For AAD algorithms to be operable in real-world settings, it is imperative that they generalize to challenging and unpredictable changes in the acoustic scene. We designed an AAD task that replicates real-life acoustic scenes. The task involved two concurrent talkers (either same or different sex) that were spatially separated and continuously moving in the presence of background noise. These talkers independently engaged in two distinct conversations, with different talkers taking turns in each conversation. Electrocorticography (ECoG) data were collected from two epilepsy patients. The participants were instructed to attend to the conversation that was cued at the start of each trial. A deep learning-based binaural speech separation algorithm was used to causally separate the speech streams of the talkers in the acoustic scene while also preserving their location information. Spatiotemporal filters were trained to reconstruct the spectrograms and trajectories of the attended and unattended talkers from the neural recordings. These reconstructions were then compared with the spectrograms and trajectories yielded by the binaural speech separation algorithm to determine the attended and unattended talkers. The binaural speech separation algorithm helped enhance the attended talker both subjectively and objectively. Trajectories and spectrograms of the attended talker were reconstructed from neural data with accuracies significantly above chance levels, and the attended talker could be correctly decoded with an accuracy of 82% using a window size of 4 seconds. These results suggest that our speech separation and AAD algorithms can generalize well to challenging real-life settings.
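A minimal stimulus-reconstruction AAD loop, under strong simplifying assumptions (simulated "neural" channels that preferentially track one talker, envelopes standing in for spectrograms and trajectories, and a plain ridge backward model), could look like this:

```python
import numpy as np

rng = np.random.default_rng(6)
fs, n_ch = 64, 32                       # 64 Hz features, 32 channels
n = fs * 240                            # 4 minutes
env_a = np.abs(rng.normal(size=n))      # separated talker envelopes
env_b = np.abs(rng.normal(size=n))
# Simulated neural data that track talker A more strongly than talker B.
neural = (np.outer(env_a, rng.normal(size=n_ch))
          + 0.3 * np.outer(env_b, rng.normal(size=n_ch))
          + rng.normal(size=(n, n_ch)))

# Train a ridge backward model on the first half (talker A attended).
half, lam = n // 2, 1e2
w = np.linalg.solve(neural[:half].T @ neural[:half] + lam * np.eye(n_ch),
                    neural[:half].T @ env_a[:half])

# Test in 4-s windows: pick the talker whose envelope best matches the
# reconstruction from neural data.
win, correct = 4 * fs, []
for s in range(half, n - win, win):
    recon = neural[s:s + win] @ w
    r_a = np.corrcoef(recon, env_a[s:s + win])[0, 1]
    r_b = np.corrcoef(recon, env_b[s:s + win])[0, 1]
    correct.append(r_a > r_b)
print(f"attention decoding accuracy: {np.mean(correct):.0%}")
```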
Alexander Kazakov, Maciek M. Jankowski, Ana Polterovich, Johannes Niediek and Israel Nelken
Topic areas: cross-species comparisons neural coding
Decision making Reinforcement learning Artificial neural network | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
When an animal is trained on a complex task, different parts of the task may be learned at different rates. Since reward is usually provided only at the trial's end, it cannot be used to infer within-trial learning trends. Behavioral features such as speed or trial duration capture trends in the animal's decision-making, but do not necessarily indicate that the animal is improving on the task. We propose to study learning of sub-parts of the task through a fine-level analysis of animal behavior via a Markov Decision Process (MDP) model. We applied this approach to rats performing a sound localization task. We observed that (1) rat behavior approached the optimal policy gradually throughout training; (2) most of the policy refinement occurred at a specific, short (<1 s) segment of the trial; and (3) the first trials of each day showed sub-optimal performance that improved during the session. Lastly, we modeled the rat using artificial agents guided by a deep neural network (DNN) and observed similar features of learning in the artificial agents as in the real rats. We then investigated how the task is encoded by the agent's DNN. Preliminary results indicate that the strongest connections between neurons were crucial for precise network activity: the action accuracy of the network dropped by 50% when the strongest 5% of the weights were erased. However, removing the smallest 40% of the weights also reduced accuracy, by 20%. Further study of this information processing may be used to generate working hypotheses for learning in biological brains.
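The pruning analysis is straightforward to reproduce in outline. The sketch below applies magnitude-based pruning to a toy random-weight network and measures how many action choices change relative to the unpruned network; the 50% and 20% figures in the abstract come from the trained agents, not from this illustration.

```python
import numpy as np

def prune_by_magnitude(weights, frac, largest=True):
    """Zero out a fraction of connections, selected by absolute weight."""
    w = weights.copy()
    k = int(frac * w.size)
    order = np.argsort(np.abs(w), axis=None)
    w.flat[order[-k:] if largest else order[:k]] = 0.0
    return w

# Toy stand-in for the agent's network: one hidden layer, 6 actions.
rng = np.random.default_rng(7)
W1, W2 = rng.normal(size=(24, 64)), rng.normal(size=(64, 6))
policy = lambda s, A, B: np.argmax(np.maximum(s @ A, 0) @ B, axis=1)

states = rng.normal(size=(1000, 24))
baseline = policy(states, W1, W2)
for frac, largest in [(0.05, True), (0.40, False)]:
    acts = policy(states, prune_by_magnitude(W1, frac, largest),
                  prune_by_magnitude(W2, frac, largest))
    label = "largest" if largest else "smallest"
    print(f"pruning {label} {frac:.0%}: "
          f"{(acts == baseline).mean():.0%} of actions unchanged")
```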
Yang Zhang, Sherry Xinyi Shen, Adnan Bibic and Xiaoqin Wang
Topic areas: speech and language cross-species comparisons
Language network Dual auditory pathways Evolution | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Auditory dorsal and ventral pathways in the human brain play important roles in supporting language processing. However, the evolutionary course of the dual auditory pathways remains largely unclear. By parcellating the auditory cortex of marmosets, macaques, and humans using the same individual-based analysis method and tracking the fiber pathways originating from the auditory cortex based on multi-shell diffusion-weighted magnetic resonance imaging (dMRI), homologous auditory dorsal and ventral fiber pathways were identified. Ventral pathways were found to be well conserved in the three primate species analyzed but extended to more anterior regions in humans. In contrast, dorsal pathways showed evolutionary divergence in two aspects: first, dorsal pathways in humans have stronger connections to higher-level auditory regions, extending beyond the corresponding regions in non-human primates; second, left lateralization of the dorsal pathways was found only in humans. Moreover, dorsal pathways in marmosets are more similar to those in humans than to those in macaques. These results demonstrate the evolutionary continuity and divergence of the dual auditory pathways in primate brains, suggesting that the putative neural networks supporting human language processing emerged before the lineage of the New World primates diverged from that of the Old World primates and continued to evolve in parallel thereafter.
Yoon Kyoung Kim, Jihoon Kim, Jeongyoon Lee and Han Kyoung Choe
Topic areas: memory and cognition correlates of behavior/perception
Auditory cortex Anterior cingulate cortex Autism spectrum disorder | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Autism spectrum disorder (ASD) is a developmental disability characterized by social deficits and repetitive behavior. In addition, intellectual disability, a common comorbidity of ASD, aggravates caregivers' burden by hampering cognitive therapies. To devise a neuromodulation procedure that can rescue learning deficits in ASD, we first characterized the learning deficits of an ASD mouse model using a go/no-go pure-tone discrimination task. Cntnap2 knockout mice, a mouse model of ASD, learned significantly more slowly than wildtype controls while reaching a similar plateau on the learning curve. The prefrontal cortex is known to be important for attention and decision-making during discrimination tasks, while the sensory cortex also shows distinct activity during sensory discrimination learning. Fiber photometry analysis of top-down attention control mediated by projections from the anterior cingulate cortex (ACC) to the primary auditory cortex (Au1) revealed that population calcium transients in the ASD model are abnormally regulated during tone discrimination. Optogenetic manipulation of ACC-to-Au1 neurons bidirectionally enhanced or decreased discrimination performance in wildtype mice. This observation inspired us to optogenetically stimulate the ACC-Au1 circuit of ASD-model mice during learning. The optogenetic activation of ACC-Au1 projection neurons indeed enhanced learning efficiency to a level similar to that of non-stimulated wildtype mice. In summary, we report that manipulation of ACC-Au1 neurons bidirectionally modulates discrimination performance, and that regulation of ACC-Au1 neurons reduces the learning deficit in the ASD model, suggesting its therapeutic potential for the intellectual disabilities associated with ASD.
Ana Polterovich, Maciej M Jankowski, Johannes Niediek, Alex Kazakov and Israel Nelken
Topic areas: correlates of behavior/perception
Auditory cortex Behavior Electrophysiology Timing Rodents | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Auditory cortex plays an important role in the computations underlying sound localization. Here we study the neural activity in the auditory cortex of freely moving rats that perform a self-initiated sound localization and identification task. To this end we constructed the Rat Interactive Foraging Facility (RIFF): a large circular arena with 6 interaction areas (IAs), each with a water port, a food port, and two loudspeakers. Rat behavior is monitored online using video tracking and nose-poke identification, and neural responses are recorded using a logger on the head of the animal. In the task studied here, auditory cues consisted of 6 different modified human words, each associated with one IA. When a rat reached the center of the arena, one of the sounds was presented once every 2 seconds from its associated IA, and the rat had to reach the correct IA within 20 seconds in order to collect a reward. Control tasks included pure localization and pure discrimination tasks for the trained rats. The rats learned all tasks rapidly with minimal guidance. They performed best when both the localization and discrimination cues were available, but were able to collect rewards even when either of the cues was missing. Sound-driven neuronal responses were largely as previously described in anesthetized animals, although responses to the same sound presented in active and passive conditions could differ. In addition to the sound-driven responses, we observed large, reproducible slow modulations in firing rates that typically lasted a few seconds (much longer than sound-driven responses) and were locked to self-initiated behavioral events before and after sound presentation. These firing-rate modulations were often larger than the responses to sounds. The slow modulations were partially correlated with non-auditory, behaviorally related variables such as speed of motion and head-turn direction, but in many neurons were best explained as a slowly varying function of the time within the trial. We conclude that most spiking activity in the auditory cortex during sound-guided behavior tracks the time course of the task rather than the sounds.
Mousa Karayanni, Yonatan Loewenstein and Israel Nelken
Topic areas: correlates of behavior/perception multisensory processes
Learning Exploration Complex behavior | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
We wanted to study the way freely moving rats explore a complex yet controllable environment. For this purpose we implemented a complex decision task in the Rat Interactive Foraging Facility (RIFF), a large experimental environment with 6 interaction areas (IAs) in which the rat can poke and receive food and water rewards. To obtain a reward, the rats are required to perform a sequence of pokes in the various IAs in a particular randomly chosen, unmarked order. The task can be described as a multi-state Markov Decision Process (MDP): the states are ordered from the initial state to the final state and are marked with distinct auditory and visual cues. The MDP had one more state than the length of the sequence. In the final state, poking in any of the IAs rewarded the rat equally. Each of the other states is associated with one "correct" IA, such that poking in it advances the animal to the next state; poking in any other IA resets the animal to the initial state. The identities of the correct IAs were kept fixed until the animal reached a satisfactory level of performance, and then changed (with no indication). Remarkably, in a 3-state MDP, rats managed to successfully learn up to 5 different sequences within a session of less than 2 hours. Moreover, they learned the correct IA associated with the initial state before learning the correct IA associated with the second state, suggesting that the rats solve the task by learning each state separately. However, fewer state visits were required to learn the correct IA of the second state than the first. In each state, the probability of finding the correct IA decreased when conditioned on the number of errors, as expected from a random search pattern with repetitions, but the decrease was faster than expected, suggesting an unstructured exploration strategy with a tendency to repeat unsuccessful attempts. In conclusion, rats were able to learn a complex task in the RIFF, and their behavior was consistent with hierarchical learning and random exploration with some biases in port selection.
Carolyn Sweeney, Maryse Thomas, Kasey Smith, Anna Stewart, Lucas Vattino, Cathryn MacGregor and Anne Takesian
Topic areas: correlates of behavior/perception
VIP Serotonin Interneuron Auditory Cortex | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Auditory perceptual learning induces plasticity in the primary auditory cortex (A1), which can improve hearing perception, but the neural mechanisms that promote these changes are unclear. Neuromodulators such as serotonin (5-HT), acetylcholine, and dopamine can trigger plasticity in the adult cortex. However, little is known about the actions of these neuromodulators within A1 circuits and their function in auditory learning. Here, we focused on 5-HT signaling in mouse A1, its cortical targets, and its effects on auditory perceptual learning. Cortical layer 1 (L1) is a major site for neuromodulatory projections, including those from the serotonergic raphe nuclei. Our work and that of others has demonstrated that VIP (vasoactive intestinal peptide)-expressing interneurons in L1 robustly express the ionotropic 5-HT receptor 5HT3A-R. Additionally, they receive bottom-up input from the auditory thalamus. This circuitry suggests that both sensory inputs and 5-HT may engage L1 circuits during learning. To understand how VIP interneurons are activated in vivo by sensory and behavioral stimuli, we expressed the calcium indicator GCaMP6f selectively in VIP interneurons and used in vivo 2-photon calcium imaging in awake mice to assess the responses of these interneurons to a variety of sound stimuli as well as appetitive and aversive reinforcers that are known to activate serotonergic neurons. Our results reveal heterogeneous responses within the VIP population; many neurons were selectively activated by specific, complex sounds or behavioral cues. To understand the function of 5-HT release and VIP activation during auditory perceptual learning, we developed an appetitive go/no-go auditory frequency discrimination task. Mice showed robust improvements in their perceptual thresholds over the course of three weeks of training. Ongoing fiber photometry studies are monitoring VIP interneuron activity and 5-HT release across perceptual learning using calcium sensor recordings in VIP neurons and signals reported by the fluorescent 5-HT sensor GRAB5HT. Preliminary results show a prominent increase in 5-HT release during rewarded trials as the mice undergo associative learning; after associative learning, analysis of fluorescent 5-HT transients can discriminate 'Hit' trials from other trial types on single trials. This ongoing work will help define the dynamics and function of 5-HT release and VIP interneuron activity during perceptual learning.
Pieter De Clercq, Jonas Vanthornhout, Maaike Vandermosten and Tom Francart
Topic areas: speech and language
auditory processing neural envelope tracking mutual information EEG | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The human brain tracks the temporal envelope of speech, which contains essential cues for speech understanding. Linear models (decoders/temporal response functions) are the most popular tool for studying neural envelope tracking, as they provide temporal and spatial information on speech processing. However, information on how speech is processed can be lost because nonlinear relations are precluded. Alternatively, a mutual information (MI) analysis can detect nonlinear relations while retaining temporal and spatial information. In the present study, we directly compare linear models with the MI analysis and investigate whether MI captures nonlinear relations between the brain and the envelope. We analyzed EEG data of 64 participants listening to a story. First, we compared the MI analysis with linear models. Second, we tested whether the MI analysis captures nonlinear components in the data by first removing all linear components using least-squares regression and then applying the MI analysis to the residual data. Envelope tracking estimated with the MI analysis correlated strongly with outcomes obtained from linear models (r=0.93), and temporal and spatial patterns of speech processing were highly similar using both methods. At the single-subject level, we detected significant nonlinear relationships between the EEG and the envelope using the MI analysis. The MI analysis thus robustly detects nonlinear envelope tracking beyond the limits of linear models. Therefore, we conclude that the MI analysis is a statistically more powerful tool for studying neural envelope tracking. In addition, it retains the temporal and spatial characteristics of speech processing, an advantage lost when using more complex (nonlinear) deep neural networks.
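The core of the analysis, estimating MI after the linear component has been regressed out, can be sketched with a simple histogram MI estimator. The data below are synthetic, with a quadratic term standing in for a nonlinear relation, and the estimator is a generic choice rather than the one used in the study.

```python
import numpy as np

def binned_mi(x, y, bins=16):
    """Histogram estimate of mutual information (bits) between two signals."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])).sum())

rng = np.random.default_rng(8)
n = 20000
env = rng.normal(size=n)                                  # envelope feature
eeg = 0.6 * env + 0.4 * env**2 + rng.normal(size=n)       # nonlinear tracking

print("MI(EEG, envelope):     ", round(binned_mi(eeg, env), 3), "bits")

# Remove the linear component by least squares; any MI left in the residual
# reflects nonlinear envelope tracking.
beta = np.linalg.lstsq(env[:, None], eeg, rcond=None)[0]
resid = eeg - env * beta
print("MI(residual, envelope):", round(binned_mi(resid, env), 3), "bits")
```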
Joonyeup Lee and Gideon Rothschild
Topic areas: memory and cognition correlates of behavior/perception neural coding
Auditory Cortex Sound Sequences Offset Responses Neural Coding Learning Two-photon imaging | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Behaviorally relevant sounds are often composed of distinct acoustic units organized into specific temporal sequences. The meaning of such sound sequences can therefore be fully recognized only when they have terminated. However, the neural mechanisms underlying the perception of sound sequences remain unclear. Here, we use two-photon calcium imaging in the auditory cortex of behaving mice to test the hypothesis that neural responses to termination of sound sequences (“Off-responses”) encode their acoustic history and behavioral salience. We find that auditory cortical Off-responses encode preceding sound sequences and that learning to associate a sound sequence with a reward induces enhancement of Off-responses relative to responses during the sound sequence (“On-responses”). Furthermore, learning enhances network-level discriminability of sound sequences by Off-responses. Last, learning-induced plasticity of Off-responses but not On-responses lasts to the next day. These findings identify auditory cortical Off-responses as a key neural signature of acquired sound-sequence salience.
Jan Wh Schnupp, Alexa N Buck, Sarah Buchholz, Theresa A Preyer, Henrike Budig, Felix Kleinschroth and Nicole Roßkothen-Kuhl
Topic areas: novel technologies
cochlear implants deafness binaural hearing interaural time differences | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Early-deaf human patients whose hearing is restored with bilateral cochlear implants (biCIs) are usually insensitive to interaural time differences (ITDs), an important cue for binaural hearing. This insensitivity has usually been attributed to a lack of auditory input during a presumed sensitive period for the development of normal binaural hearing. However, our group was recently able to show that neonatally deafened (ND) rats fitted with biCIs in early adulthood and given precisely synchronized binaural stimulation from the outset are able to lateralize ITDs with exquisite sensitivity, reaching thresholds of ~50 μs (Rosskothen-Kuhl et al., 2021, eLife, doi: 10.7554/eLife.59300). Here we present results from several key follow-on psychoacoustic experiments with our ND biCI rats, which have yielded a number of new and important insights. First, by varying the pulse rate of the binaural stimuli delivered, we were able to show that ITD sensitivity remains surprisingly good for pulse rates of up to 900 pps, but drops sharply at 1800 pps. Electric ITD sensitivity thus declines only at pulse rates higher than the upper limit for acoustic ITDs, and good ITD sensitivity with CIs is achievable at pulse rates used in clinical practice. Second, by independently varying envelope and pulse-timing ITDs, we were able to show that ITD discrimination is dominated by the timing of the pulses, and envelope ITDs are essentially useless as a cue under CI stimulation. Third, by independently varying ITDs and ILDs, we were able to show that time-intensity trading ratios for electric hearing are as small as 20 μs/dB. Result 1 indicates that delivering good ITDs via CIs need not be incompatible with the high pulse rates needed for good speech encoding, but results 2 and 3 indicate that the essentially random pulse-timing ITDs delivered by current, desynchronized clinical processors are a very significant problem: pulse-timing ITDs would normally be interpreted as powerful lateralization cues, which can confound even very large interaural level difference cues unless the animal becomes desensitized to ITDs.
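Behavioral ITD thresholds of the kind reported here are commonly obtained by fitting a psychometric function to lateralization data. A generic sketch, with made-up trial counts and a cumulative-logistic form chosen purely for convenience (the authors' fitting procedure may differ):

```python
import numpy as np
from scipy.optimize import curve_fit

def p_right(itd_us, bias, sigma):
    """Cumulative logistic: probability of lateralizing to the right as a
    function of ITD (positive = right ear leading)."""
    return 1.0 / (1.0 + np.exp(-(itd_us - bias) / sigma))

# Made-up lateralization data for one implanted animal.
itds = np.array([-160, -80, -40, -20, 0, 20, 40, 80, 160], dtype=float)  # us
n_trials = 60
rng = np.random.default_rng(9)
n_right = rng.binomial(n_trials, p_right(itds, bias=5, sigma=35))

(bias, sigma), _ = curve_fit(p_right, itds, n_right / n_trials, p0=(0, 50))
# The 75%-correct point of the fitted curve sits at bias + sigma * ln(3).
threshold = bias + sigma * np.log(3)
print(f"ITD threshold: {threshold:.0f} us")
```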
Samantha Moseley, Christof Ferhman, Jonah Weissman and Chad D Meliza
Topic areas: neural coding
Noise Avian Auditory | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Vocal communication requires the ability to detect a signal of interest within a background of competing sounds with similar acoustics (the “cocktail-party problem”). This ability has been attributed to higher-order auditory neurons that produce selective responses to vocalizations that are invariant to increasing levels of background noise. Noise-invariant neurons have been observed in several nonhuman species, but it is not yet known how they develop. We hypothesize that noise-invariance requires early experience in complex acoustic conditions. Zebra finches (Taeniopygia guttata) are an excellent model for studying communication in noise to test this hypothesis. Because zebra finches live in large colonies, not only must adults solve the cocktail-party problem, but young fledglings need to isolate their tutor’s song from the colony background in order to learn and copy that song into adulthood. We predict that this early exposure to colony noise instructs the development of noise-invariant neurons in the zebra finch pallium. To test this, we reared birds in either the presence (n=7 birds) or absence (n=7 birds) of colony noise, then performed single-unit recordings throughout the auditory pallium. Responses were collected to conspecific stimuli embedded in synthetic colony noise at varying signal-to-noise ratios. Noise-invariance was quantified at the single-unit level by directly comparing neural responses to these auditory scenes with responses to the original foreground stimuli. Noise-invariance was also quantified within simultaneously recorded populations of 30–50 neurons using a linear decoder. As predicted, neurons in colony-reared birds were more invariant to noise.
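The invariance quantification described here, comparing responses to noisy scenes directly against responses to the clean foreground, reduces to a correlation per unit and signal-to-noise ratio. A toy sketch with simulated PSTHs:

```python
import numpy as np

def invariance_index(resp_clean, resp_scene):
    """Correlation between a unit's response to the clean foreground song
    and its response to the song embedded in colony noise; values near 1
    indicate a noise-invariant representation."""
    return np.corrcoef(resp_clean, resp_scene)[0, 1]

# Simulated PSTHs (spikes/bin) for one unit at several SNRs.
rng = np.random.default_rng(10)
clean = np.clip(rng.normal(5, 2, 200), 0, None)
for snr_db in (10, 0, -10):
    gain = 10 ** (-snr_db / 20)              # more noise at lower SNR
    scene = clean + gain * np.clip(rng.normal(2, 1, 200), 0, None)
    print(f"SNR {snr_db:+3d} dB: invariance = {invariance_index(clean, scene):.2f}")
```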
Siyu Zhu, Xiaohan Bao, Paisley Barnes and Stephen G. Lomber
Topic areas: auditory disorders
Plasticity Deaf Visual Electroencephalography Cat | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
When deprived of a sensory modality, the brain often compensates with supranormal performance in the other intact sensory systems. This phenomenon is known as cross-modal plasticity, whereby areas of the brain responsible for a given sensory modality are reorganized and repurposed because of the sensory loss. Approximately 1.33 billion people (18.5% of the world's population) are affected by hearing impairment, making it one of the most prevalent neurological disorders. Deaf humans and cats have superior visual motion detection abilities, and this advantage has been causally demonstrated to be mediated by reorganized auditory cortex. The present study sought to determine the electrophysiological responses of hearing and deaf cats to motion-onset stimuli of different velocities. Deafness was induced in the first postnatal month by systemic administration of ototoxic drugs. In maturity, we examined visually evoked potentials (VEPs) in both hearing and deaf cats generated from electroencephalogram (EEG) recordings in lightly anesthetized subjects. VEPs are an averaged and amplified record of the gross electrical potentials generated by the brain in response to visual stimulation, and examination of VEPs is a commonly used non-invasive ophthalmological technique to assess the functional state of the visual system. The stimulus consisted of 200-ms-long coherently leftward-moving dots with randomly generated positions at 10 speeds between 2 and 64 deg/sec. VEP waveforms were produced from the average of 160 trials for each speed. In both groups, peak amplitudes increased with increasing stimulus speed, and significantly larger peak amplitudes were observed in deaf subjects at higher speeds (8 deg/s and above, Mann-Whitney U test, p < 0.05). Cross-modal reorganization in auditory cortex underlying the significantly improved motion detection found in deaf subjects may reflect an increase in neuronal discharge to visual motion stimuli, which can lead to increased measurable VEP amplitudes. This study furthers current understanding of cortical plasticity during hearing loss and can establish the assessment of VEPs as an additional tool in the evaluation of cross-modal plasticity following hearing loss. This work was supported by grants from the Canadian Institutes of Health Research and the Natural Sciences and Engineering Research Council of Canada.
Jesse Herche, Cynthia King, Stephanie Lovich and Jennifer Groh
Topic areas: correlates of behavior/perception multisensory processes subcortical processing
Auditory Efferents Oto-acoustic emissions Audiovisual | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
How does what we see change what we hear? Sensory integration involves two principles in tension: the brain 1) builds its perceptions from a synthesis of sensory afferents and 2) controls its own inputs through sensory efferents. Where along ascending pathways do efferents act, and how do signals from one sensory modality alter another? In the auditory system, visual information biases sound responses across numerous CNS regions, from the brainstem to the primary auditory cortex. Where does audio-visual sensory interaction begin? Here, we tested whether visual stimuli exert measurable effects on the auditory periphery. Auditory efferent pathways control sensitivity and responsiveness to incoming sound (e.g. middle-ear reflexes, otoacoustic emissions). Our group's recent work identified an efferent process activated with eye movements. We postulated that visual content may also influence the auditory efferent system and tested this hypothesis with a silent visual stimulus known to produce auditory illusions in at least some participants: the so-called 'jumping pylons' stimulus. Microphone recordings captured ear-canal air-pressure changes in 14 human participants viewing jumping-pylon and freeze-frame stimuli in alternating blocks. We evaluated non-specific modulation of ear-canal noise across a wide frequency spectrum to identify either a global efferent signal or specific changes in ear-canal noise at the periodicity of the pylon jumping. We found broadband changes in 7 subjects (50%), suggesting a global influence of visual stimuli on auditory gain. To date, we have not observed a specific periodic signal elicited by the jumping pylons, even in the 4 subjects who experienced the illusion (29%). In conclusion, we hypothesize that, in some subjects, primary hearing structures modulate sound transduction based on inferences from the visual system. Future work should clarify the timing and variability of response phenotypes across subjects to assess the role of visually gated auditory efferents in shaping sound processing.
Brendan Williams, Tanya Danaphongse, Alfonso Reyes, Samantha Kroon, Varun Pasapula, Allan Jacob, Arjun Mehendale and Crystal Engineer
Topic areas: correlates of behavior/perception
Vagus Nerve Stimulation Valproic Acid Exposure Autism Spectrum Disorder Speech Discrimination Task | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Individuals with Autism Spectrum Disorder (ASD) often struggle with everyday communication. Specifically, they exhibit impairments in processing receptive and expressive language. These difficulties are thought to arise from improper development of neural structures along the auditory pathway, resulting in weakened and delayed responses to auditory cues. When responses are partially impaired, a cascade of processing errors can occur, resulting in a failure to correctly process sound. Typical function may be partially regained through extensive speech therapy, but many individuals still report deficits following treatment. To improve outcomes, an adjunctive therapy is needed. One such therapy is vagus nerve stimulation (VNS). Paired with an auditory cue, VNS has been shown to drive plasticity across the auditory pathway, and in rodents, VNS-sound pairing increases neural responses to sound. Using prenatal exposure to valproic acid (VPA), we model the physiological and behavioral deficits associated with ASD in rats. Prior work has shown that when VPA-exposed rats receive VNS-sound paired therapy, the physiological deficit in sound processing is overcome and neural performance is rescued. It has yet to be determined whether VNS-sound pairing can improve VPA-exposed rats' performance on a behavioral task. This study tests the hypothesis that VNS paired with successful trials on a speech discrimination task will improve the performance of VPA-exposed rats. Three groups of rats (saline-exposed control, VPA-exposed, VPA-exposed+VNS) were trained on a go/no-go speech sound discrimination task, with all three groups receiving identical surgical and training procedures. Our preliminary results suggest that VNS positively modulates performance on a consonant discrimination task: VPA+VNS rats show improved target-sound recognition and fewer nosepokes to non-target sounds than their VPA counterparts. This study could be fundamental in developing clinical strategies that implement VNS-paired auditory therapy to improve the auditory processing capabilities of individuals with ASD.
Alessandra Sacco, Stephen Gordon and Stephen Lomber
Topic areas: auditory disorders
Tractography Cat DTI Deaf | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Following sensory deprivation, areas and networks in the brain may adapt and rewire to compensate for the loss of input, resulting in a myriad of potential shifts in local functionality, white matter integrity, network organization, and more. These adaptations may engender regional disuse-driven atrophies and/or compensatory cross-modal plasticity (enhanced abilities in the remaining senses). To better elucidate the mechanisms supporting compensatory plasticity, this study investigated structural connectivity differences between hearing (n=8) and perinatally-deafened (n=8) cats, a well-studied animal model of auditory deprivation. Using DTI, connectional changes were explored throughout the entire brain in two analysis streams: ROI-ROI and modality-modality. For the latter, ROIs were combined according to primary function, resulting in 7 masks: auditory, visual, somatosensory, motor, frontal, non-cortical, and other. FSL's probtrackx was run from each gray matter ROI seed to the remaining 154 targets for the ROI-ROI analysis, and from each seed modality to the remaining 6 targets for the modality-modality analysis. For each case, relative streamline percentages were calculated from each seed to the remaining targets. Wilcoxon rank-sum tests were performed to compare percentages between groups and corrected using the Benjamini-Hochberg procedure. Results suggest structural plasticity in various regions throughout the deaf brain, not limited to sensory cortices. This included a significant decrease in connectivity between the dysgranular insular area and the fourth somatosensory cortex of the left hemisphere in deaf compared to hearing animals (p = 0.0463). Near-significant differences (p < 0.1) included a decrease in connectional strength between visual area 18 and the cingulate visual area, as well as increases between prepyriform cortex and area 21b, between the dorsal division of the agranular insular area and the dorsolateral division of prefrontal cortex, and between the posterior suprasylvian visual area and primary somatosensory cortex in deaf compared to hearing animals. For the modality-modality analysis, results suggest reduced communication between motor cortex and non-cortical structures, as indicated by a ~49% decrease in connectivity between the two modalities in deaf compared to hearing animals (p = 0.0420). Overall, this is the first study to examine tractography-based connectivity alterations following auditory deprivation in cats, and it suggests that deafness incites differential adaptations in areal communication both locally and globally.
Stephen Gordon and Stephen Lomber
Topic areas: auditory disorders
Cat Structural MRI Cortical Thickness Deaf | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Gray matter thickness derived from high-resolution magnetic resonance images is a useful non-invasive method of characterizing a property of the cerebral cortex. The purpose of this investigation was to examine if high-resolution MRI could be used to assess differences in cortical thickness covariance across ROIs between the brains of deaf cats and those of hearing controls. In this study, 29 adult hearing and 26 adult perinatally-deafened cats were scanned at 0.5 mm isotropic resolution to look for structural covariance across ROIs from a cat brain atlas (Stolzberg et al., 2017). Deafness was induced during the first postnatal month by systemic administration of ototoxic drugs. Hearing status, or lack thereof, was confirmed by the recording of auditory brainstem responses, and all animals were scanned after reaching sexual maturity. Thickness maps were obtained using the Advanced Normalization Tools (ANTs) DiReCT cortical thickness pipeline, and these data were then processed in MATLAB. Age and sex were regressed out of the raw thickness values independently for each group, and all subjects' values were normalized to the mean of that group. These corrected regional thickness values were then correlated against each other for each pair of ROIs within each group using Pearson's r, and the resulting p-values were corrected using the Benjamini-Hochberg method. Significant covariances were compared within and across the two groups. Plots of covariance vs. Euclidean distance between the ROI pairs were similar across hearing and deaf subjects, with each group showing increased significant covariances at very short distances (<5 mm apart) and a more consistent distribution from 5 mm out to the extremes (~35 mm). The mean covariance for bilaterally homologous ROIs was found to be significantly higher than for non-homologous ROI pairs in both groups. Numbers of significant covariances within and across sensory modalities differed between groups. A notable result was the near-complete absence of somatomotor covariances in the deaf group as compared to the hearing group. Results from this work point to a change in the structural connectivity across the cat brain following deafness, further supporting the plastic nature of the brain as seen in past tracer and behavioral studies.
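The covariance pipeline lends itself to a compact illustration. In this hedged sketch (toy thickness values and covariates, not the study's MATLAB code), age and sex are residualized out, values are normalized per ROI, and every ROI pair is correlated with Pearson's r before Benjamini-Hochberg correction:

```python
# Structural covariance sketch: residualize covariates, correlate ROI pairs.
# Subject counts, ROI counts, and covariates are placeholders.
import numpy as np
from scipy.stats import pearsonr
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(1)
n_subj, n_roi = 29, 50                       # e.g. the hearing group
thick = rng.random((n_subj, n_roi))          # raw regional thickness
covars = np.column_stack([np.ones(n_subj),   # intercept
                          rng.integers(1, 10, n_subj),   # age (arbitrary)
                          rng.integers(0, 2, n_subj)])   # sex (arbitrary)

# Regress age and sex out of thickness, then z-normalize per ROI
beta, *_ = np.linalg.lstsq(covars, thick, rcond=None)
resid = thick - covars @ beta
resid = (resid - resid.mean(0)) / resid.std(0)

iu = np.triu_indices(n_roi, k=1)             # unique ROI pairs
p = np.array([pearsonr(resid[:, i], resid[:, j]).pvalue
              for i, j in zip(*iu)])
reject, *_ = multipletests(p, method="fdr_bh")
print(f"{reject.sum()} of {len(p)} ROI-pair covariances significant")
```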
Suizi Tian, Yu-ang Cheng and Huan Luo
Topic areas: memory and cognition
auditory working memory temporal regularity rhythm | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Temporal regularities are known to facilitate perception, but whether and how they modulate working memory for auditory sequences remains unclear. In our behavioral experiment, human subjects were instructed to memorize a sequence of piano tones presented in either a rhythmic or an arrhythmic way. After a maintenance period, the same tone sequence with one tone altered in pitch was presented, and subjects reported whether the altered tone was higher or lower in pitch than the corresponding tone in the memorized sequence. We employed a hierarchical drift-diffusion model to characterize memory performance. Our results show that temporal regularity facilitates memory performance only when the number of tones in the sequence exceeds working memory capacity. Specifically, for short sequences with 4 tones, rhythmic and arrhythmic tone sequences showed similar performance. Importantly, for the longer sequences with 7 and 10 tones, rhythmic presentation improved the drift rate of the perceptual judgment compared to the arrhythmic condition. Taken together, temporal regularity improves auditory working memory capacity, presumably by distributing attention more efficiently over memorized items during encoding or by enhancing memory consolidation through neural oscillations.
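To make the drift-diffusion logic concrete, here is a minimal simulation (illustrative parameters, not the fitted hierarchical model): a higher drift rate, as reported for the rhythmic long sequences, yields faster and more accurate decisions.

```python
# Minimal drift-diffusion simulation: evidence accumulates with a given
# drift toward a decision bound. Parameters are illustrative only.
import numpy as np

def simulate_ddm(drift, bound=1.0, noise=1.0, dt=0.002, n_trials=2000, seed=0):
    rng = np.random.default_rng(seed)
    rts, correct = [], []
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        while abs(x) < bound:                 # accumulate until a bound is hit
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts.append(t)
        correct.append(x >= bound)            # upper bound = correct response
    return np.mean(rts), np.mean(correct)

for label, v in [("arrhythmic", 0.8), ("rhythmic", 1.2)]:
    rt, acc = simulate_ddm(v)
    print(f"{label}: mean RT = {rt:.2f} s, accuracy = {acc:.2%}")
```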
Aysegul Gungor Aydin, Michael Chimenti, Kevin Knudtson and Kasia Bieszczad
Topic areas: memory and cognition
Epigenetic Gene expression Learning & memory Temporal processing Auditory memory | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The ability to appreciate speech sounds with complex temporal features characteristic of human voices relies on learning to discriminate brief acoustic cues on the order of milliseconds. Learning-induced neurophysiological plasticity in the adult auditory system that operates on this timescale is essential for changes to sound-evoked activity that can enable memory for significant temporal sound features and guide behavior. Lasting neurophysiological changes require de novo gene expression within the auditory system. Gene expression is powerfully controlled by epigenetic mechanisms that modify and remodel chromatin to orchestrate a network of activity-dependent genes that dictate central auditory function, neuroplasticity, and, ultimately, successful sound-cued behavior. Blocking histone deacetylase 3 (HDAC3), one of the most widely studied epigenetic mechanisms in learning and memory processes, typically increases the accessibility of transcriptional regulatory proteins to promoters to enable activity-dependent gene expression during memory consolidation. In rodent models of memory for amplitude-modulated (AM) sound cues, HDAC3-inhibition promotes the formation of highly precise AM-cue memory by facilitating auditory cortical changes in temporal coding (Rotondo & Bieszczad 2021 J Neurophysiol). Results will be presented from genome-wide RNA-sequencing on auditory cortical and subcortical samples in trained rats treated with an HDAC3 inhibitor (vs. a group of vehicle-treated trained rats) learning an established AM rate discrimination task. Thus, HDAC3 manipulation provides an opportunity for a molecular-level investigation of activity-dependent genes that are involved in temporal information encoding within the auditory system. Identifying key differentially expressed genes (DEGs) between groups or individuals that formed highly precise temporal acoustic cue memory (vs. those that did not) opens the door to future discovery of the downstream circuit- and systems-level processes regulated by these gene products that are critical to the success of auditory learning and behavioral function.
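The abstract does not specify the differential-expression pipeline, so the following stand-in uses a simple per-gene Welch t-test on log-transformed counts with FDR correction to illustrate how candidate DEGs between inhibitor- and vehicle-treated groups might be flagged (all counts are simulated placeholders):

```python
# Schematic DEG contrast: per-gene Welch t-test on log2 counts + FDR.
# A real pipeline would typically use a dedicated tool (e.g. DESeq2-style
# count models); this is only a conceptual stand-in with fake data.
import numpy as np
from scipy.stats import ttest_ind
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(2)
n_genes = 20000
hdac3i = np.log2(rng.poisson(100, (6, n_genes)) + 1)    # placeholder counts
vehicle = np.log2(rng.poisson(100, (6, n_genes)) + 1)

t, p = ttest_ind(hdac3i, vehicle, axis=0, equal_var=False)
log2fc = hdac3i.mean(0) - vehicle.mean(0)
reject, *_ = multipletests(p, method="fdr_bh")
degs = np.where(reject & (np.abs(log2fc) > 1))[0]
print(f"{degs.size} candidate DEGs (FDR < 0.05, |log2FC| > 1)")
```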
Rashi Monga, Celine Drieu and Kishore Kuchibhotla
Topic areas: memory and cognition
learning and memory procedural memory auditory discrimination striatum | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Most conceptual and analytical models assume that procedural memory formation in animals is dominated by 'online' trial-and-error learning. Yet 'offline' processes are known to play a critical role in declarative forms of memory, including episodic-like memory in rodents. To what extent do offline processes contribute to procedural learning in rodents? To address this, we reasoned that if online processes dominate in a procedural learning task, then reducing the number of training trials per session by half would double the number of sessions required to reach expert performance. We trained water-restricted mice on an auditory go/no-go task in which they had to lick to one tone (S+) to obtain a water reward and withhold licking to another (S-) to avoid a timeout. We assayed learning in reinforced and non-reinforced (probe) trials to allow us to dissociate between 'acquisition' of task contingencies (measured in probe trials) and slower behavioral 'expression' (measured in reinforced trials). Mice (n=6) that received 140 reinforced trials per session took 4.1 ± 2.3 sessions to acquire the task contingencies (measured in probe trials) and 13.0 ± 4.9 sessions to express those contingencies (measured in reinforced trials). We then reduced the number of trials per session to 70 (n=8). Surprisingly, these mice acquired (5.0 ± 1.4 sessions) and expressed (11.6 ± 3.0 sessions) the task in the same number of sessions (p=0.6) despite experiencing only half the trials. These results suggest that offline processes contribute to procedural learning. To further test this possibility, we reduced the number of trials per session to 35. Mice (n=5) acquired the task contingencies in slightly more sessions (7.0 ± 0.8 sessions) but still with significantly fewer trials than predicted from the 140 or 70 trial-per-session tasks (p < 0.01). Interestingly, these mice took far more sessions to express the task contingencies in reinforced trials, with 3 of 5 mice not reaching criterion even after 50 sessions. These data support the model that procedural learning exhibits two dissociable learning processes and provide new behavioral evidence that acquisition leverages offline processes to enhance learning while expression depends more heavily on trial-and-error practice.
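The trial-budget logic is easy to check with the numbers given above: under purely online, trial-count-limited acquisition, sessions-to-acquire should scale inversely with trials per session.

```python
# Back-of-envelope check of the 'purely online' prediction, using the
# session counts reported in the abstract.
trials_to_acquire = 140 * 4.1   # ~574 trials at 140 trials/session
for trials_per_session, observed in [(140, 4.1), (70, 5.0), (35, 7.0)]:
    predicted = trials_to_acquire / trials_per_session
    print(f"{trials_per_session}/session: predicted {predicted:.1f} sessions "
          f"if purely online, observed {observed}")
```

With the abstract's values, the 35-trial group is predicted to need roughly 16 sessions but acquired the contingencies in about 7, consistent with an offline contribution to acquisition.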
Amber Kline, Michellee Garcia, Hiroaki Tsukano, Koun Onodera, Michael Kasten, Paul Manis and Hiroyuki Kato
Topic areas: hierarchical organization subcortical processing thalamocortical circuitry/function
auditory cortex thalamo-cortical projections hierarchical cortical organization | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
How our brain integrates information across parallel sensory channels to achieve coherent perception remains a central question in neuroscience. In the auditory system, sound information reaches the cortex via two anatomically distinct pathways—the “primary” lemniscal pathway, which is thought to carry fast and accurate representations of sounds, and the non-lemniscal pathway, which is generally described as a slower integrator of multisensory information. However, the potential roles of the non-lemniscal pathway in fast sound processing remain unclear. In this study, we identified a short-latency (<10 ms) input onto layer 6 (L6) of the secondary auditory cortex (A2), which was as fast as the lemniscal inputs to L4 of the primary auditory cortex (A1). We performed retrograde tracing and found that A2 L6 receives inputs from neurons along the non-lemniscal pathway: cochlear nucleus → external shell of the inferior colliculus (ECIC) → medial division of the medial geniculate nucleus (MGm) and the brachium of the inferior colliculus (BIC) → A2 L6. Using electrophysiological recordings, we confirmed the short-latency responses in these brain structures: 4-5 ms in ECIC and 5-7 ms in MGm/BIC. These anatomical and functional properties support a non-lemniscal origin of short-latency inputs that bypasses A1 and directly reaches the deep cortical layers of A2. Ongoing electrophysiology experiments in vitro and in vivo aim to understand how short-latency L6 input interacts with L4 input to shape sound representation in the auditory cortex and influence perceptual behaviors.
Zsuzsanna Kocsis, Joel I. Berger, Ryan M. Calmus, Bob McMurray, Mccall E. Sarrett, Phillip E. Gander, Christopher K. Kovach, Jeremy D. Greenlee, Aaron D. Boes, Thomas Nickl-Jockschat, Hiroto Kawasaki, Inyong Choi, Matthew A. Howard and Christopher I. Petkov
Topic areas: speech and language correlates of behavior/perception
source localized EEG intracranial EEG long-term effects of surgical impact plasticity | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
A fundamental question in neuroscience is the human brain’s capacity for plasticity. Recently we reported rare intracranial ECoG recordings obtained tens of minutes before and after a neurosurgical resection requiring disconnection of the left anterior temporal lobe (ATL) to treat intractable epilepsy [1]. The recordings during a speech semantic expectation task showed an immediate impact on speech sound processing and prediction. Here we report source-localized high-density EEG results with the same task in the two left hemisphere ATL disconnection patients previously reported and an additional right hemisphere patient. The EEG data were obtained 2-6 weeks before and 2 and 6-14 months after their surgical treatment procedure. We aimed to study the correspondence between the ECoG and EEG in the same patients and longer-term compensation throughout the brain. Behavioral data on the task showed a striking post-surgical impact on speech perception in the left hemisphere ATL patients, but a lack of such an effect in the right hemisphere patient, whose speech perception ability stayed within the normative range of control participants. In all three patients, the source-localized EEG signals after ATL disconnection showed magnified frontal and auditory area responses to the speech sounds in the hemisphere affected by the surgical procedure, recapitulating effects observed immediately after the surgical procedure in the intracranial ECoG recordings. Moreover, the contralateral hemisphere showed magnified responses to the speech sounds post-disconnection only when the left hemisphere was affected. The overall results establish key correspondences between the ECoG and source-localized EEG signals and identify forms of plasticity and compensation. [1] Kocsis et al. (2022) https://doi.org/10.1101/2022.04.15.488388
Johannes Niediek, Maciej M Jankowski, Ana Polterovich, Alexander Kazakov and Israel Nelken
Topic areas: memory and cognition correlates of behavior/perception
Reinforcement learning Sound-guided behavior Information theory Freely moving rats Electrophysiology | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Animal experiments in neuroscience often use coarse measures of behavior, e.g. trial outcome (correct/incorrect). However, such measures do not capture complex animal behavior, in which a trial consists of many actions. We developed a framework to study the behavior and simultaneous neuronal activity of freely moving rats at high resolution. We model behavioral tasks as Markov Decision Processes (MDPs), where a rat trajectory is described as a sequence of environment states and rat actions, with state transitions depending on the current state and action. We computed deterministic optimal policies (a policy is a rule that prescribes an action in each state). To describe non-deterministic rat behavior, we computed non-deterministic, information-limited policies that realize optimal reward rates at a prescribed amount of deviation from non-informative behavior, quantified as the Kullback-Leibler divergence from a default, non-informative policy (Tishby’s complexity, TC). We applied our framework to data from five female rats performing a complex auditory-guided task, implemented in a large environment with twelve nose-poke ports and loudspeakers. Rats had to position themselves at locations indicated by sounds. Despite the nontrivial task, rats reached high success rates within two 70-minute sessions. Observed rat trajectories resembled optimal policies, but were non-deterministic. We estimated the TC of rat movement and nose-poking and its change over time by comparing rat behavior with information-limited policies. Our model revealed a prolonged increase in TC over time. Significantly, this behavioral refinement was not discernible via reward rates and, to our knowledge, has not been described previously. The model captured individual propensities for preferring some foraging strategies over others, and a reduction in the tendency of rats to perform nose-pokes that do not yield rewards. Recording with chronically implanted silicon probes from the left insular cortex, we found that in many neurons, firing rates (averaged over ten minutes) strongly correlated with TC. The proportion of highly correlated units was significantly larger in real recordings than in surrogate data. Our model is based on first principles of information theory rather than on ad-hoc measures of behavior. Measures derived from this model bring new insights into rat behavior, and seem to reflect brain activity.
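For concreteness, here is a toy version of the complexity measure defined above (Tishby's complexity): the KL divergence of a stochastic policy from a default non-informative policy, averaged over state occupancy. States, actions, and distributions below are arbitrary placeholders, not the study's task or fitted policies.

```python
# Policy complexity as occupancy-weighted KL divergence from a default
# (uniform) policy. All quantities are toy placeholders.
import numpy as np

def tishby_complexity(policy, default, occupancy):
    """policy, default: (n_states, n_actions) row-stochastic matrices;
    occupancy: (n_states,) state-visitation distribution."""
    kl = np.sum(policy * np.log(policy / default), axis=1)  # KL per state
    return float(occupancy @ kl)                            # in nats

n_states, n_actions = 12, 4
default = np.full((n_states, n_actions), 1.0 / n_actions)   # non-informative
rng = np.random.default_rng(3)
policy = rng.dirichlet(np.ones(n_actions) * 0.3, size=n_states)  # peaked
policy = np.clip(policy, 1e-12, None)                        # avoid log(0)
policy /= policy.sum(axis=1, keepdims=True)
occupancy = np.full(n_states, 1.0 / n_states)
print(f"TC = {tishby_complexity(policy, default, occupancy):.3f} nats")
```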
Antonio Criscuolo, Molly J. Henry, Michael Schwartze, Hugo Merchant and Sonja Kotz
Topic areas: correlates of behavior/perception cross-species comparisons subcortical processing
Rhythm cognition delta-band oscillations temporal predictions | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Since Darwin, comparative research has shown that most animals share basic timing capacities, such as the ability to process temporal regularities and produce rhythmic behaviors. What seems to be more exclusive, however, are the capacities to generate temporal predictions and to display anticipatory behavior at salient time points. These abilities are associated with subcortical structures like the basal ganglia (BG) and cerebellum (CE), which are more developed in humans as compared to nonhuman animals. With a combined comparative and translational approach, we examined the phylogenetic trajectories of human rhythm cognition and tested the causal involvement of cortico-subcortical structures. We made use of a unique dataset including 2 macaque monkeys, 20 healthy young (HY) and 11 healthy old (HO) participants, and 22 stroke patients, 11 with focal lesions in the BG and 11 in the CE. We recorded EEG while participants listened to isochronous equitone sequences with a presentation rate of 1.5 Hz. We examined whether neural oscillatory activity in the delta band (1-3 Hz) internalized the timing of external events, by encoding temporal regularity and showing an anticipatory phase-alignment to expected tone onsets, ultimately indicating temporal predictions. Interestingly, macaque monkeys showed striking similarities to human participants: they showed a clear peak in the Fourier spectrum at 1.5 Hz, confirming the ability to encode temporal regularities. Furthermore, healthy participants’ and macaque monkeys’ delta-band activity displayed a coherent and anticipatory phase-alignment to expected tone onsets, as indexed by the mean vector length (MVL). HO and patients showed a similar peak in the Fourier spectrum at 1.5 Hz, but significantly differed in their MVL: BG patients showed a stronger phase-alignment to tone onsets as compared to the other groups. When compared to HY, however, HO and CE patients showed lower MVL, while BG patients were comparable. Our phylogenetic and translational approach demonstrates that, similarly to humans, macaque monkeys encode temporal regularities and formulate temporal predictions. Furthermore, the data suggest that ageing and CE lesions, but not BG lesions, alter temporal predictions, reducing the ability to track environmental rhythms. These observations provide crucial evidence for a differential but complementary role of CE and BG in the phylogenesis of human rhythm cognition.
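The phase-alignment index used here, the mean vector length, is simple to compute. A minimal sketch follows (placeholder EEG, an assumed sampling rate, and a generic delta-band filter; not the study's pipeline):

```python
# Mean vector length (MVL) of delta-band phase sampled at expected tone
# onsets. Signal and sampling rate are placeholders.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 250                                     # assumed sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)
eeg = np.random.default_rng(4).standard_normal(t.size)  # placeholder EEG
onsets = np.arange(1.0, 59.0, 1 / 1.5)       # 1.5 Hz isochronous onsets (s)

b, a = butter(3, [1, 3], btype="bandpass", fs=fs)       # delta band, 1-3 Hz
phase = np.angle(hilbert(filtfilt(b, a, eeg)))
onset_phase = phase[(onsets * fs).astype(int)]
mvl = np.abs(np.mean(np.exp(1j * onset_phase)))         # 0 = no alignment
print(f"MVL = {mvl:.3f}")                               # 1 = perfect alignment
```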
Vighneshvel Thiruppathi and Michael Brosch
Topic areas: neural coding novel technologies
Direct current stimulation Auditory cortex Macaque | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In our day-to-day lives, we are constantly inundated with temporally fragmented auditory information, which we retain and associate over time to achieve goal-directed behavior, e.g. memorizing and noting down a person’s phone number while it is being recited. The auditory cortex has long been known to be involved in sensory processing, but its role in cognitive processes is not fully explored. To decipher its role in behavior, external intervention is essential in order to establish a causal relationship. Electrical stimulation remains the simplest of all reversible activation techniques for causal research in systems neuroscience. Yet it is a powerful tool to induce a transient effect on neurons, which is crucial to delineate and study the effects on the course of an explicit cognitive function during a complex behavioral task. There has been tremendous interest in transcranial direct current stimulation (tDCS) since its rediscovery a decade ago; it involves localized macrostimulation of a cortical region using low-magnitude currents of either polarity, presumably causing shifts in neurons’ resting membrane potentials. Although it has been shown that transcranial anodal (positive) stimulation improves, and cathodal (negative) stimulation impairs, the behavioral performance of human subjects on auditory tasks, the underlying neuronal mechanisms have not been thoroughly investigated. Thus, we set out to bridge this gap by assessing the neural effects of intracranial direct current stimulation (iDCS) in the auditory cortex of macaque subjects (n = 3). The current study aims to address the following: 1) the quantitative relationship between current intensity and electrically evoked neuronal activity; 2) the instantaneous and prolonged stimulation effects on the temporal dynamics of neural activity; 3) the spatial extent of stimulation effects on neural activity; 4) the effects of various stimulation types on neural firing rate; and 5) the effects of different current intensities on sensory encoding in auditory cortex. To this end, we determined the efficacy of iDCS on spontaneous activity and on sensory encoding in auditory neurons, with the ultimate goal of manipulating cellular and network-level processes.
Annika Michalek, Lukas Hessel, Anja Oelschlegel, Patricia Wenk, Nicole Angenstein, Eike Budinger and Jürgen Goldschmidt
Topic areas: correlates of behavior/perception hierarchical organization multisensory processes novel technologies
fMRI CBF-SPECT Auditory stimulation FM tone Background noise EPI noise Anesthesia Mongolian gerbil | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
One of the most widely used approaches for BOLD-fMRI studies in rodents is echoplanar imaging (EPI) in medetomidine-anesthetized animals. The effects of anesthesia and EPI noise on brain activation patterns, particularly on auditory-evoked patterns, are largely unknown. Here we use cerebral blood flow (CBF) SPECT to image, in Mongolian gerbils, frequency-modulated (FM) tone-induced brain activation patterns under awake unrestrained conditions outside the MR scanner and under medetomidine anesthesia inside the MR scanner, in the presence or absence of EPI noise during simultaneous BOLD-fMRI. For SPECT imaging of CBF, gerbils were i.v. injected with 99mTc-HMPAO through chronically implanted external jugular vein catheters. Animals were continuously injected over 12 min with and without FM tone stimulation during the following conditions: awake unrestrained; medetomidine anesthetized; and medetomidine anesthetized plus EPI noise and BOLD-fMRI. SPECT scanning for read-out of the 99mTc brain distribution was performed after injection. 99mTc-HMPAO injections under anesthesia were done with the animals inside a Bruker 9.4T horizontal scanner; BOLD-fMRI was performed simultaneously with tracer injections using a GE-EPI sequence. CBF-SPECT images showed a clear right-lateralized increase in CBF in the primary auditory cortex (AC) under all FM tone stimulation conditions, consistent with right-lateralized BOLD responses to the same stimulus in the same region. EPI noise under anesthesia without FM tone stimulation led to a strong bilateral increase of CBF (>45%) in the inferior colliculus (IC) but only moderately increased CBF in AC. FM tone stimulation did not further increase CBF significantly in IC, and significant BOLD responses in IC were not detected at the group level. CBF response magnitudes to FM tone stimulation in AC were remarkably similar in awake and anesthetized gerbils, peaking at ca. 25%. In the presence of EPI noise the response was ca. 5% lower. Subcortical auditory pathways were more clearly delineated in anesthetized as compared to awake gerbils. To the best of our knowledge, this is the first study to disentangle the potentially confounding effects of EPI noise and medetomidine anesthesia on brain-wide activation patterns in rodent BOLD-fMRI. Auditory cortical activation patterns during medetomidine anesthesia closely mimic those in the awake state. Interfering effects on the auditory system in BOLD-fMRI can arise from EPI noise and most strongly affect the IC.
Fabian Schmidt, Lisa Reisinger, Patrick Neff, Ronny Hannemann and Nathan Weisz
Topic areas: auditory disorders speech and language correlates of behavior/perception
hearing impairment spectro-temporal response functions magnetoencephalography | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The current gold standard for the diagnosis of hearing loss is pure-tone audiometry. Yet the artificial pure tones used to assess hearing thresholds in pure-tone audiometry do not resemble real-life listening situations. Therefore, pure-tone audiometry provides only an incomplete picture of individual hearing impairment, as disorders such as supra-threshold hearing loss (i.e. hidden hearing loss) cannot be captured. Additionally, pure-tone audiometry depends heavily on subjective feedback. This can be problematic, as giving informed feedback is challenging for some patient groups (e.g. babies born deaf or elderly people with dementia). Here we propose the “Neurogram”, a possible way to overcome the shortcomings of pure-tone audiometry by using a combination of system identification approaches, magnetoencephalography, and a naturalistic listening situation (a radio play). By fitting linear encoding and decoding models, we regress features of an acoustic signal (e.g. spectrograms) from the related measured brain activity. We find that the decodability of acoustic information decreases with individual hearing capacity measured using pure-tone audiometry. Furthermore, we found a stronger relationship between subjective reports of speech perception (assessed using the Speech, Spatial and Qualities of Hearing Scale) and the proposed “Neurogram” than with pure-tone audiometry. In the future, we aim to further develop this approach and work towards a diagnostic procedure that allows clinicians to fit hearing aids optimally, based on a characterization of individual hearing impairment, without solely relying on subjective feedback.
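A toy version of the decoding step may help: a time-lagged linear (ridge) backward model that reconstructs an acoustic feature from multichannel recordings. Shapes, lags, and the regularizer below are illustrative assumptions, not the authors' settings.

```python
# Backward (decoding) model sketch: ridge regression from time-lagged
# multichannel activity to one acoustic feature. All data are random.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
n_times, n_chan, n_lags = 5000, 30, 20
meg = rng.standard_normal((n_times, n_chan))          # placeholder MEG
envelope = rng.standard_normal(n_times)               # acoustic feature

# Build a lagged design matrix (channels x lags); drop the wrapped rows
X = np.column_stack([np.roll(meg, lag, axis=0)
                     for lag in range(n_lags)])[n_lags:]
y = envelope[n_lags:]

model = Ridge(alpha=1.0).fit(X[:4000], y[:4000])
r = np.corrcoef(model.predict(X[4000:]), y[4000:])[0, 1]
print(f"reconstruction accuracy r = {r:.3f}")         # ~0 for random data
```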
Corinne E. Fischer, Nathan Churchill, Veronica Vuong, Melissa Leggieri, Michael Tau, Luis Fornazzari, Michael Thaut and Tom A. Schweizer
Topic areas: memory and cognition
Dementia Cognitive decline Music fMRI | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Listening to a playlist of autobiographically salient songs has been shown to have beneficial effects on cognitive performance, particularly in people with cognitive decline. However, the mechanisms by which music improves memory have not been elucidated. Using functional magnetic resonance imaging (fMRI), we investigated the effects of an in-home autobiographically salient music listening program (1 hr/day, 5-7 days/week, 3 weeks) on longitudinal changes in brain structure, function, and memory in musicians (n=6) and non-musicians (n=8) with early-stage cognitive decline. Brain activity and memory, assessed using the Montreal Cognitive Assessment (MoCA), were evaluated pre- and post-intervention. Changes in functional connectivity, white matter microstructure, and global memory associated with the intervention were assessed with a paired t-test. The results of the task-based scans consistently showed a decline in activation from pre- to post-intervention in the globus pallidus and right inferior frontal gyrus, suggesting greater neural efficiency. When examining effects of musicianship, we observed that musicians showed less longitudinal change. Resting-state functional connectivity analysis showed decreased connectivity between temporal and frontal network nodes, with musicians showing more longitudinal change in functional connectivity relative to non-musicians, who showed minimal changes. White matter microstructural analysis revealed longitudinal effects in structure, evident as decreased radial diffusivity in the right superior corona radiata, with musicians demonstrating more reduction relative to non-musicians. Finally, using a Wilcoxon signed-rank test, we found a significant effect on the memory subscore of the MoCA post-intervention (p=0.034). The study results indicate that repeated listening to autobiographically salient music may induce beneficial neuroplastic changes in cognition and that musicianship may modulate these processes.
Grant W. Zempolich and David M. Schneider
Topic areas: correlates of behavior/perception hierarchical organization neural coding thalamocortical circuitry/function
Auditory cortex Sensorimotor Prediction error | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
During auditory feedback-based behaviors, such as playing the violin, we form predictions for the sounds our movements produce and, if expectation and experience differ, adjust our subsequent actions. The auditory cortex produces activity well suited to support auditory feedback-based motor learning. In primates and mice, auditory cortical responses to self-generated sounds that match expectation are weak, while responses to self-generated sounds that violate expectation are large, consistent with auditory cortex producing prediction error signals during sound-generating behaviors. It remains unknown whether or how these signals are used to guide learning. To explore this question, we developed a novel auditory-guided behavior in which mice press a lever with their forelimb toward a 2 mm wide target zone that changes location approximately every 30 presses. Mice hear a 16 kHz tone when the lever enters the zone and an 8 kHz tone if the press exceeds the bounds of the zone. Presses that peak within the zone are rewarded when the lever returns to the starting position. Presses that are too short (producing no tones) or too long (producing both an entry and an exit tone) are unrewarded. Over 3 weeks, mice learn to produce precise lever presses that peak within the target zone. Performance errors decrease with training, and mice plateau at a level of performance where the variance of peak positions is approximately equal to the size of the zone. Following target relocation, mice rapidly adjust their lever movements and find the new zone within several trials, suggesting that they use acoustic feedback to guide behavior. Consistent with this conclusion, when we omit all tones on select trials, mice fail to find the target zone and performance error increases significantly. Preliminary multi-electrode array recordings from the auditory cortex of well-trained mice indicate that many neurons respond to both movement and sound. Some neurons respond differently to the 16 kHz entry tone depending on where the target zone is located, suggesting that auditory cortex encodes a combination of lever position and sound. These experiments establish a novel auditory-guided learning paradigm that may provide insights into how prediction errors are used to guide behavior.
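The task contingencies compress into a few lines of logic (the tone frequencies and the 2 mm zone are taken from the abstract; the function, units, and example values are illustrative):

```python
# Trial outcome logic for the lever-press task described above.
# Positions are in mm along the lever trajectory (illustrative units).
def classify_press(peak_mm, zone_start_mm, zone_width_mm=2.0):
    zone_end = zone_start_mm + zone_width_mm
    if peak_mm < zone_start_mm:
        return {"tones_hz": [], "rewarded": False}           # too short
    if peak_mm <= zone_end:
        return {"tones_hz": [16000], "rewarded": True}       # peaks in zone
    return {"tones_hz": [16000, 8000], "rewarded": False}    # overshoots

for peak in (3.0, 5.5, 8.0):                                 # example presses
    print(peak, classify_press(peak, zone_start_mm=5.0))
```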
Guan-En Graham, Michael Chimenti, Kevin Knudtson, Devin Grenard, Liesl Co and Kasia Bieszczad
Topic areas: memory and cognition
auditory memory and plasticity auditory cortex epigenetics gene expression learning and memory | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Forming auditory memories that last a lifetime requires experience-dependent neurophysiological plasticity. Gene expression enables lasting functional changes to auditory system processing. Epigenetic mechanisms are powerful molecular-level controllers underlying activity-dependent gene expression in the adult auditory system that can control long-lasting effects on neuronal function and strengthen learned behaviors. Epigenetic regulators such as the post-translational acetylation or deacetylation of the histone proteins that package DNA, by histone acetyltransferase (HAT) and histone deacetylase (HDAC) enzymes, are permissive or repressive to gene expression, respectively. Histone deacetylase 3 (HDAC3) often works with transcriptional machinery to enable activity-dependent de novo DNA transcription. Here, we systemically inhibit HDAC3 (HDAC3i), which increases auditory memory and facilitates auditory cortical plasticity (reviewed in Shang & Bieszczad, 2022), to present the first evidence of vast transcriptomic changes in the adult auditory cortex (ACx) induced by sound discrimination training. Genes implicated in learning-induced ACx plasticity, memory, and sound-cued behavior have until now been unknown. Bulk RNA-sequencing on ACx samples was performed to determine genome-wide effects of systemic HDAC3i in rats trained in a two-tone auditory discrimination task. We report that HDAC3i amplifies the magnitude of learning-dependent transcription by further up- or down-regulating unique subsets of induced genes (relative to vehicle and naïve groups). Interestingly, there are few unique differentially expressed genes (vs. vehicle), e.g., Adamts13, Cabin1, and Rexo4. As HDAC3i primarily amplified learning-induced ACx transcription changes, bioinformatic analysis in iPathwayGuide determined which molecular pathways were more strongly activated with HDAC3i vs. training alone. This analysis identified proteins involved with cholinergic and glutamatergic synapses, and key regulators of the MAPK/ERK pathway in synaptic plasticity and memory. qRT-PCR verified genes of interest (GOIs) identified by RNA-seq. Single-molecule fluorescent in situ hybridization (smFISH) visualized GOIs within ACx anatomy. Combining bulk RNA-seq with more sensitive and cell-type-specific smFISH reveals both broad (genome-wide) and subtle (GOI-level) learning-induced ACx transcription events. Together, these results identify ACx gene networks important for experience-dependent effects on sound processing and characterize the regulatory role of HDAC3 on ACx genetic targets that may be key for the neurophysiological plasticity events that support highly precise and lasting auditory memories.
Ryan Calmus, Benjamin Wilson, Yukiko Kikuchi, Zsuzsanna Kocsis, Hiroto Kawasaki, Timothy D. Griffiths, Matthew A. Howard and Christopher I. Petkov
Topic areas: memory and cognition speech and language neural coding
HUMAN ECOG INTRACRANIAL AGL STRUCTURE COMPUTATIONAL MODEL COGNITION CODING | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Understanding how the brain represents and binds auditory information distributed over time is a challenging problem, requiring computationally and neurobiologically informed approaches to solve. Perception of spoken language is a salient example, whereby syntactic knowledge facilitates “movement” and transformation of sequential acoustic events into hierarchical mental structures. By some accounts, a fundamental A/B “Merge” function is responsible for creating these mental structures, and may be implemented in other cognitive domains using similar neurocomputations. We previously proposed a computational model, VS-BIND, which applies vector-symbolic operations to population-level codes to bind A/B pairs of items segregated in time, forming a composite representation (Calmus et al., Phil. Trans. Roy. Soc. B, 2019). VS-BIND also specifies mechanistic roles for prefrontal areas 44/45 and motor/premotor cortex during interactions with temporal cortex. Here, we test this model using human intracranial recordings obtained during an Artificial Grammar Learning (AGL) task, in neurosurgery patients being monitored for epilepsy treatment. During the task, 12 patients listened to auditory speech sequences containing dependencies between adjacent and non-adjacent nonsense words, before being tested on their ability to distinguish novel “grammatical” and “ungrammatical” sequences. We analyzed the intracranial data using traditional methods, demonstrating engagement of the fronto-temporal network. Additionally, we applied a battery of established and novel multivariate analyses to reveal the representational geometry of regional speech encodings and the causal representational flow between frontal and temporal regions. The results show that certain prefrontal areas integrate relational information, including the ordinal position of items in a sequence, in concordance with the mechanistic and site-specific predictions of the VS-BIND model. Furthermore, we observed net causal representational flow consistent with feed-forward and feedback predictive signaling, suggesting that expectation-driven predictions are fed back to primary auditory cortex from prefrontal areas including 44/45. We are testing the tenets of VS-BIND using neural patterns directly derived from the neurophysiological signals recorded in the human patients, and comparing VS-BIND against alternative models. The results indicate a critical role for fronto-temporal areas in transforming the sensory world and, more specifically, provide initial evidence for fundamental A/B binding mechanisms associated with the transformation of sequential auditory items into mental structures.
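For readers unfamiliar with vector-symbolic binding, the following generic holographic-reduced-representation demo shows A/B binding and unbinding via circular convolution; it illustrates the class of operations VS-BIND builds on, not the model's actual implementation.

```python
# Generic HRR demo: bind two random vectors with circular convolution,
# then unbind with the approximate inverse to recover one of them.
import numpy as np

rng = np.random.default_rng(6)
d = 2048
a = rng.standard_normal(d) / np.sqrt(d)       # item A (unit-norm expected)
b = rng.standard_normal(d) / np.sqrt(d)       # item B

def bind(x, y):
    # circular convolution via FFT
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)))

def inv(x):
    # involution: approximate inverse for HRR vectors
    return np.roll(x[::-1], 1)

ab = bind(a, b)                               # composite A/B representation
b_hat = bind(ab, inv(a))                      # unbind with A to recover B
print(f"similarity of recovered B: {np.dot(b_hat, b):.2f}")  # near 1
```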
Christof Fehrman, Samantha Moseley, Jonah Weissman and C. Daniel Meliza
Topic areas: neural coding neuroethology/communication
Spectro-temporal receptive field Computational Model Spiking Neural Network | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Animals must be able to recognize salient auditory signals in a noisy environment. This noise is often spectrally and temporally complex and is highly correlated with conspecific vocalizations. Neurons in the avian auditory cortex have been shown to produce noise-invariant firing at relatively low signal-to-noise ratios (SNRs). It is unknown how they are able to filter out background noise while preserving the structure of the signal of interest. One mechanism for this noise invariance may lie in the structure of spectro-temporal receptive fields (STRFs). STRFs are often modeled as impulse response functions and act as filters on auditory signals. STRFs in the zebra finch (ZF) auditory cortex are known to overlap with the modulation power spectrum of the species’ song vocalizations. We hypothesized that this overlap provides a bank of signal filters that allows the system to filter out complex background noise. We tested our hypothesis by constructing a spiking neural network in which each neuron was given an STRF parameterized from a known distribution in the ZF auditory cortex. The STRFs used were drawn from either a temporally bandpass, a temporally wideband, or a mixed distribution. The mixed condition completely covers the modulation power spectrum of ZF song, and we hypothesized that this condition would produce the strongest noise-invariant responses. We trained the network to reconstruct a noiseless input signal of ZF song. We then varied the input song across three background noise conditions (white, synthetic, and conspecific) and eleven SNRs (from 70 to −10 dB). A noise-invariant network would produce song reconstructions that filter out the background noise. Consistent with our hypothesis, using the mixed STRFs produced the strongest noise-invariant networks across all three background noise conditions.
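The front end of such a network can be sketched as STRF filtering of a spectrogram followed by rectification; the Gabor-style kernel below is a generic stand-in for the fitted ZF STRF distributions, and all shapes and parameters are illustrative.

```python
# STRF-as-filter sketch: convolve a spectrogram with a 2D kernel and
# rectify the result into a firing-rate drive. Data are placeholders.
import numpy as np
from scipy.signal import fftconvolve

def gabor_strf(n_freq=32, n_time=40, f0=0.2, t0=0.1):
    """Separable spectro-temporal Gabor kernel (illustrative parameters)."""
    f = np.cos(2 * np.pi * f0 * np.arange(n_freq)) * np.hanning(n_freq)
    t = np.cos(2 * np.pi * t0 * np.arange(n_time)) * np.hanning(n_time)
    return np.outer(f, t)

rng = np.random.default_rng(7)
spectrogram = rng.random((32, 1000))            # freq x time, placeholder
strf = gabor_strf()
drive = fftconvolve(spectrogram, strf, mode="same")[16]  # one model neuron
rate = np.maximum(drive - drive.mean(), 0)      # rectified rate drive
print(rate.shape, float(rate.max()))
```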
Baptiste Bouvier, Patrick Susini, Nicolas Misdariis and Catherine Marquis-Favre
Topic areas: memory and cognition correlates of behavior/perception subcortical processing
Salience Attention Timbre Experimental psychology | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Attention allows a listener to select only the relevant information in a sound scene and to ignore the rest. However, some irrelevant sounds manage to capture our attention against our will. In the present study, the influence of timbre attributes on auditory attention capture is investigated. To address this issue, an additional-singleton paradigm was implemented, providing an indirect measure of the influence of a timbre attribute on attention capture. The participants’ task is to discriminate the target according to its duration. In the reference condition, the target is embedded in sequences of all-identical successive distractors. In the test condition, one of the distractors, called the singleton, has a different timbre. We examine how performance on the discrimination task is degraded when the singleton is present. In two experiments, the brightness (experiment 1) and roughness (experiment 2) of the singleton are varied (through variations of spectral centroid and depth of amplitude modulation, respectively). We observe that the higher the spectral centroid or the amplitude modulation depth of the singleton, the higher the error rates and response times. This reveals that auditory attention capture is a feature-driven effect, modulated in particular by variations of timbre attributes. This work provides new insights into the nature of auditory attentional capture and opens new perspectives for studying it as a feature-driven effect. The results obtained will be used to improve salience models.
Margherita Giamundo, Regis Trapeau, Simon Nougaret, Xavier De Giovanni, Luc Renaud, Thomas Brochier and Pascal Belin
Topic areas: correlates of behavior/perception
Auditory cortex Electrophysiology Voice perception | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The ability to extract and process voice information is crucial for the social life of humans and other primates. Neuroimaging studies have shown the existence of temporal voice areas (TVAs) selective for conspecific vocalizations in both humans and non-human primates (Bodin, Trapeau et al., 2021), supporting the hypothesis of a functional homology in cerebral voice processing between humans and their closest relatives. But how voice information is processed at the neuronal level in these areas is still unclear. To tackle this issue, we implanted two rhesus macaques with several high-density multi-electrode arrays in fMRI-localized voice areas of the superior temporal gyrus. Spiking activity was recorded during an auditory stimulation task (a pure tone detection task) in which a set of n=96 stimuli from four categories (human voices, macaque vocalizations, marmoset vocalizations, non-vocal sounds) was presented. A total of 1582 auditory-responsive single units (n=472) and multi-units (n=1110) were recorded from 4 arrays in the two monkeys. Analyses indicate that a moderate proportion (29%) of cells was selective for conspecific (macaque) vocalizations, considerably smaller than the proportion of face-selective cells in the middle face patches (Tsao et al., 2006), confirming previous findings (Perrodin et al., 2011). However, at the population level, decoding analysis shows that spiking activity in the different TVAs allows above-chance classification of conspecific vocalizations from non-vocal sounds, with higher accuracy for more anterior TVAs. Spiking activity in the TVAs also allows classification of macaque call types as early as 65 ms after the onset of the auditory stimulation, again with higher accuracy for the most anterior TVAs. Furthermore, a Representational Similarity Analysis of neuronal responses to the 96 stimuli shows that in the anterior TVAs, the Representational Dissimilarity Matrices capturing pairwise spiking activity differences between stimuli show a significant association with an ideal categorical model separating conspecific vocalizations from other sounds as early as 75 ms after stimulus onset. This association did not occur for non-vocal sounds. These results advance our understanding of the neural substrates of voice information processing in macaques, and open a unique comparative window by allowing direct comparison with similar data obtained in humans.
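The population decoding analysis can be illustrated with a toy cross-validated classifier on simulated spike counts (unit counts, the injected category signal, and the logistic decoder are placeholders; the study's exact decoder is not stated here):

```python
# Cross-validated classification of stimulus category from trial-wise
# spike counts. All data are simulated placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n_trials, n_units = 192, 100
labels = np.repeat([0, 1], n_trials // 2)   # 0 = non-vocal, 1 = macaque call
counts = rng.poisson(5, (n_trials, n_units)).astype(float)
counts[labels == 1, :20] += 2               # weak category signal in 20 units

acc = cross_val_score(LogisticRegression(max_iter=1000), counts, labels, cv=5)
print(f"decoding accuracy: {acc.mean():.2f} ± {acc.std():.2f}")
```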
Adam Attaheri, Aine Ni Choisdealbha, Giovanni M. Di Liberto, Sinead Rocha, Perrine Brusini, Natasha Mead, Helen Olawole-Scott, Panagiotis Boutris, Samuel Gibbon, Isabel Williams, Christina Grey, Sheila Flanagan, Dimitris Panayiotou, Alessia Phillips, Maria Alfaro E Oliveira, Carmel Brough and Usha Goswami
Topic areas: speech and language
Oscillations Cortical tracking EEG Infant Language | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In adults, neurophysiological signals in the delta and theta bands are known to “cortically track” the envelope of speech, and the phase of these low-frequency oscillations also temporally organizes the amplitude of high-frequency oscillations in a process called phase-amplitude coupling (PAC). The alignment of cortical neural signals to specific stimulus parameters of speech has been shown to play a key role in adult language processing, yet this role in early language measures is unexplored in infants. Here we report longitudinal EEG data from 112 infants (Cambridge BabyRhythm study), aged 4, 7, and 11 months, as they listened to nursery rhymes. After establishing the presence of stimulus-related neural signals (PSD), multivariate temporal response function (mTRF) analyses measured the strength and maturation of cortical speech tracking, whilst a normalised modulation index (nMI) assessed PAC. We replicated this experiment with 21 adult participants to see whether delta and theta cortical oscillatory networks differ when tracking speech in the infant and adult brain. Language data were recorded in both the adults and the infants (longitudinally) to see if individual differences in language performance could be predicted from our EEG results. Peaks in stimulus-related spectral power (PSD) were different in the two populations. In infants, PSD peaks were observed in the delta and theta ranges, with a developmental maturation of the theta peaks. Whilst stimulus-related increases in PSD power were present in the adult data at these frequencies, PSD peaked at different points. Both infants and adults showed significant cortical tracking of the sung speech in both delta and theta bands, but not in the alpha band. Furthermore, delta band tracking was significantly greater than theta band tracking in both populations. PAC was stronger for theta- versus delta-driven coupling in adults but was equal for delta- versus theta-driven coupling in infants. Finally, we describe which of our battery of language outcome measures were predicted by each of the above EEG measures. These data suggest that cortical speech tracking mechanisms are present early in infancy but undergo developmental changes into adulthood. Furthermore, cortical oscillation measures can predict certain elements of infant language performance.
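A minimal phase-amplitude coupling measure in the spirit of the normalized modulation index reported above: bin low-frequency phase, average the high-frequency amplitude envelope per bin, and quantify the deviation from uniformity. The signal, bands, and sampling rate are placeholders, and the Tort-style index here is a generic choice, not necessarily the authors' exact nMI.

```python
# Phase-amplitude coupling sketch: delta phase vs. gamma amplitude.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 250                                            # assumed sampling rate
x = np.random.default_rng(9).standard_normal(fs * 120)   # placeholder EEG

def bandpass(sig, lo, hi):
    b, a = butter(3, [lo, hi], btype="bandpass", fs=fs)
    return filtfilt(b, a, sig)

phase = np.angle(hilbert(bandpass(x, 1, 3)))        # delta phase
amp = np.abs(hilbert(bandpass(x, 30, 45)))          # gamma amplitude
bins = np.digitize(phase, np.linspace(-np.pi, np.pi, 19)) - 1  # 18 bins
mean_amp = np.array([amp[bins == k].mean() for k in range(18)])
p = mean_amp / mean_amp.sum()
mi = 1 + np.sum(p * np.log(p)) / np.log(18)         # 0 = no coupling
print(f"modulation index = {mi:.4f}")
```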
Jared Collina, May Xia, Gozde Erdil, Janaki Sheth, Konrad Kording, Yale Cohen and Maria Geffen
Topic areas: correlates of behavior/perception
Auditory cortex Behavior Categorization | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
In everyday life, because both sensory signals and neuronal responses are noisy, important cognitive tasks, such as auditory categorization, are based on uncertain information. At the neuronal level, categorization requires a transformation of sensory representations into a representation of category membership. While categorical representations have been found in the cortex, the cell types and auditory neuronal mechanisms supporting the emergence of these representations remain unknown. The broader goal of our project is to understand the neuronal circuits that support decision-making based on information in the region of uncertainty. Here, to test the role of the auditory cortex (AC) in creating and biasing categorical stimulus representations, we trained mice in a two-alternative forced-choice task in which mice categorize the frequency of a “target” sound into one of two overlapping categories (“low” or “high”). We reversibly suppressed cortical activity through bilateral muscimol injection in AC of trained mice in order to test the hypothesis that AC is involved in categorizing pure tones. Mice maintained their ability to perform the task. However, suppressing AC activity induced an attenuation in categorization accuracy for trained stimuli. In addition, inactivation of the AC resulted in a significant broadening of the psychometric slopes, an indicator of how certain mice are about stimuli near the perceived category boundary when making their decisions. These findings suggest that the auditory cortex controls categorization acuity but is not necessary for stimulus discrimination in general. Our results lay the groundwork for further exploration of the role of the auditory cortex in categorization behavior under uncertainty.
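The psychometric-slope result can be made concrete with a short fitting sketch (simulated choices; a logistic psychometric function with a shallower slope stands in for the broadened boundary under muscimol):

```python
# Fit logistic psychometric curves and compare slopes; a shallower slope
# corresponds to a broader category boundary. Choices are simulated.
import numpy as np
from scipy.optimize import curve_fit

def psychometric(x, x0, k):
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))      # p(choose "high")

freqs = np.linspace(-1, 1, 9)                       # centered log-frequency
conditions = {"control": 6.0, "muscimol": 3.0}      # true slopes (toy values)

for label, k_true in conditions.items():
    rng = np.random.default_rng(10)
    obs = rng.binomial(100, psychometric(freqs, 0, k_true)) / 100
    (x0, k), _ = curve_fit(psychometric, freqs, obs, p0=[0, 1])
    print(f"{label}: boundary = {x0:.2f}, slope = {k:.2f}")
```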
Rosstin Afsahi, Jessica Jacobs, Pawel Kusmierek, Patrik Wikman, Patrick Forcelli and Josef Rauschecker
Topic areas: speech and language correlates of behavior/perception neural coding
Auditory cortex Motor Language | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Dating back to Wernicke (1874), the dorsal-stream model of auditory and speech processing postulates that connections between auditory and motor regions of the brain are crucial for linking motor commands to their sensory outcomes (Rauschecker, 2011). Previous studies with functional MRI in our lab have shown that motor regions are activated when macaques listen to auditory-motor sequences they had previously learned to produce (Archakov et al., 2020). However, no previous study has specifically investigated connectivity between auditory cortex and regions of the premotor cortex (PMC) that physiologically respond to auditory input. To investigate this further, we injected Cholera Toxin B subunit (CTB) tagged with Alexa Fluor (AF) 488 and 594 fluorophores into anterior and posterior regions of the PMC, respectively, in adult macaques (Macaca mulatta). The exact locations of the injection sites were chosen based on the activations found in our previous fMRI study. The animals were euthanized after two weeks, the brains were blocked and cut in the coronal plane, and immunohistochemical staining was performed. The results showed that the highest concentration of labeled cells projecting to the posterior injection site was found in the posterior parietal cortex and in the IPS, including VIP and LIP. Additional label from both injections was found in SMA and the ACC. The current dorsal-stream model postulates that posterior parietal areas may serve as a relay station between auditory centers and the PMC. This may include area VIP, which has previously been shown to play a role in auditory processing (Lewis and Van Essen, 2000).
Kateryna Pysanenko, Daniel Suta and Josef Syka
Topic areas: subcortical processing
VOCALIZATION INFERIOR COLLICULUS MOUSE | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Communication sounds play an important role in the social interactions of mice. These species-specific vocalizations are found predominantly in the ultrasonic range, with frequencies typically between 50 and 80 kHz. We studied how ultrasonic vocalizations are encoded in the subcortical portion of the central auditory system in CBA mice, focusing on the inferior colliculus (IC). We recorded ultrasonic vocalizations in freely moving mice (1.5-18 months of age) while one female and one male were temporarily housed together in the same cage. The repertoire of communication sounds of the inbred CBA strain was dominated by relatively simple vocalizations consisting of a series of short bouts with upward and downward frequency modulation. The responses of IC neurons to tones, combinations of two tones, frequency-modulated tones, and ultrasonic vocalizations were recorded using a multichannel microelectrode in ketamine-xylazine anesthetized mice. Two-tone stimulation was applied to reveal inhibitory regions in the frequency response area. We found not only lower and/or upper inhibitory sidebands surrounding the excitatory region at the characteristic frequency; in high-frequency neurons specifically, inhibition formed more complex patterns. Neurons in the high-frequency region of the IC frequently displayed responses with several excitatory peaks when stimulated with upward or downward frequency-modulated tones, whereas those in the low-frequency region responded predominantly with a continuous excitatory reaction. Essentially, only neurons localized in the high-frequency region of the IC responded to species-specific vocalizations, and their responses did not follow all segments of the vocalization bouts. This finding suggests that inhibition might play an important role in the processing of species-specific vocalizations in the subcortical part of the auditory pathway in CBA mice.
Karli Nave, Erin Hannon and Joel Snyder
Topic areas: correlates of behavior/perception
eeg replication registered report rhythm music beat and meter | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Cognitive neuroscience research has attempted to disentangle stimulus-driven processing from conscious perceptual processing for decades. While some prior evidence for neural processing of perceived musical beat (periodic pulse) may be confounded by stimulus-driven neural activity, a seminal finding in auditory neuroscience provided evidence for perception-related brain responses to an imagined musical beat while holding stimulus features constant. Frequency tagging, which measures electrical brain activity at frequencies present in a stimulus, showed increased brain activity at beat-related frequencies when listeners imagined a metrical pattern while listening to an isochronous auditory stimulus (Nozaradan, Peretz, Missal, & Mouraux, 2011). However, it is unclear whether this finding constitutes repeatable evidence for conscious perception of the beat, what the population-level effect size is, and whether the effect is related to relevant music experience, such as music and dance training. This Registered Report (journal: Advances in Methods and Practices in Psychological Science) details the results of 13 independent conceptual replications of Nozaradan et al. (2011), all using the same pre-registered, vetted protocol (see https://osf.io/d8fmb for more information on our pre-registration). Listeners performed the same imagery tasks as in Nozaradan et al. (2011), with the addition of a behavioral task on each trial to measure conscious perception. Meta-analyses will examine the effect of imagery condition and estimate the meta-analytic effect size for each condition, as well as the presence (or lack thereof) of significant moderating effects of music and dance training. Logistic regression will estimate the predictive value of behavioral performance on brain activity on individual trials. With pre-registered data analysis culminating in Summer 2022, results will be presented at the conference. We will discuss possible explanations for discrepancies between these findings and the original study, and implications of the extensions provided by this registered report.
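Frequency tagging itself reduces to a few lines: Fourier-transform the EEG and compare the amplitude at beat-related frequencies with neighboring bins. The sketch below uses placeholder data and example frequencies in the spirit of the original design (a 2.4 Hz stimulus with 1.2 and 0.8 Hz imagined meters); the neighbor-bin SNR is one common convention, not necessarily this report's exact measure.

```python
# Frequency-tagging sketch: spectral amplitude at target frequencies
# relative to flanking bins. EEG is a random placeholder.
import numpy as np

fs, dur = 250, 60
x = np.random.default_rng(11).standard_normal(fs * dur)  # placeholder EEG
amp = np.abs(np.fft.rfft(x)) / x.size
freqs = np.fft.rfftfreq(x.size, 1 / fs)

for f_target in (2.4, 1.2, 0.8):       # stimulus and beat/meter rates
    i = np.argmin(np.abs(freqs - f_target))
    neighbors = np.r_[amp[i - 6:i - 1], amp[i + 2:i + 7]]  # flanking bins
    print(f"{f_target:.1f} Hz: SNR = {amp[i] / neighbors.mean():.2f}")
```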
Olivia Lombardi, Blake Sidleck, Jack Toth, Priya Agarwal, Danyall Saeed, Dylan Leonard, Abraham Eldo and Michele Insanally
Topic areas: neural coding
perceptual learning auditory cortex electrophysiology | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The ability to flexibly adapt to changing environments is the hallmark of an adaptive nervous system and is seen across the entire phylogenetic tree. In contrast, perceptual and cognitive inflexibility are implicated in many neurological disorders including hearing loss, autism, and schizophrenia. While sensory and frontal cortical areas have long been implicated in flexible behaviors, we lack a fundamental understanding of how information is gated within these circuits to select and update behavioral strategies based on sensory input and context. We trained mice to perform a go/no-go auditory reversal learning task that required animals to adapt their behavioral response to the same set of auditory cues. Specifically, animals were trained to respond to a target tone (11.2 kHz) and to withhold from responding to a nontarget tone (5.6 kHz) for water reward. Once animals learned this phase of the task, we implemented a rule switch and reversed which tone was rewarded, requiring animals to remap stimulus-reward contingencies. Chemogenetic silencing demonstrated that auditory cortex is required for reversal learning in mice. Using silicon probe recordings, we simultaneously monitored the activity of single units in the auditory cortex (AC, n=2,170 neurons) and frontal cortex (M2, n=2,688 neurons) during reversal learning. We found that neural response profiles during learning were highly heterogeneous, ranging from highly reliable or ‘classical’ responses to seemingly random ‘non-classical’ firing. Neural populations in both regions dynamically altered their response profiles during different phases of learning, allowing for the emergence of flexible behaviors.
Andrea Santi, Sharlen Moore, Aaron Wang, Jennifer Lawlor, Kelly Fogelson and Kishore Kuchibhotla
Topic areas: memory and cognition neural coding
Alzheimer's disease Auditory memory Neural activity | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Memories must be accessible for them to be useful. Alzheimer’s disease (AD) is a progressive form of dementia in which cognitive capacities slowly deteriorate due to underlying neurodegeneration. Interestingly, anecdotal observations have demonstrated that Alzheimer’s patients can exhibit cognitive fluctuations during all stages of the disease. In particular, it is thought that contextual factors are critical for unlocking these hidden memories. To date, however, exploration of the neural basis of cognitive fluctuations has been hampered by the lack of a behavioral approach in mouse models to dissociate memories from contextual performance. Our previous work demonstrated that interleaving ‘reinforced’ trials with trials without reinforcement (‘probe’ trials) in an auditory go/no-go discrimination task allows us to distinguish between acquired sensorimotor memories and their contextual expression. Here, we used this approach, together with two-photon calcium imaging in behaving AD-relevant mice (APP/PS1+), to determine whether amyloid accumulation impacts underlying sensorimotor memories (measured using ‘probe’ trials) and/or contextual performance (measured using ‘reinforced’ trials) in an age-dependent manner. Importantly, peripheral auditory function, measured using the threshold for detecting an auditory brainstem response, was similar between WT and APP/PS1+ mice. We found that while contextual performance is significantly impaired in young adult APP/PS1+ mice compared to age-matched controls, these animals show little to no impairment in the underlying sensorimotor memories. However, middle-aged APP/PS1+ mice show deficits in both domains. The impairment found in the young adults was accompanied by a reduction in stimulus selectivity and behavioral encoding in the auditory cortex of APP/PS1+ mice that can be partially restored in probe trials. Ongoing analyses aim to identify whether this impairment is cortex-wide or is concentrated near Aβ plaques. Finally, these effects were recapitulated by a reinforcement learning model that accounts for changes in contextual signals. The main network model parameters that differed between the control and the APP/PS1+ mice were those governing contextual scaling and behavioral inhibition. These results suggest that Aβ deposition impacts circuits involved in contextual computations before those involved in storing memories, and that neural circuit interventions, such as modulating inhibition, may hold promise to reveal hidden memories.
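As a loose illustration of the modeling idea (not the authors' actual network model), a learned stimulus-action value can be expressed through a context-dependent gain, so that reducing the contextual-scaling parameter degrades reinforced-trial performance while the underlying association remains intact:

```python
# Toy dissociation of memory vs. contextual expression: the stored values
# are identical across groups; only the contextual gain differs.
import numpy as np

def p_lick(value, context_gain, inhibition=0.5):
    drive = context_gain * value - inhibition
    return 1.0 / (1.0 + np.exp(-8 * drive))          # expression of memory

value_go, value_nogo = 1.0, 0.0                      # learned associations
for label, gain in [("control-like", 1.0), ("APP/PS1-like", 0.6)]:
    d = p_lick(value_go, gain) - p_lick(value_nogo, gain)
    print(f"{label}: discrimination in reinforced context = {d:.2f}")
```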
Nilay Atesyakar, Andrea Shang and Kasia Bieszczad
Topic areas: memory and cognition correlates of behavior/perception
auditory cortex plasticity auditory learning and memory epigenetics | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Experience-dependent representational plasticity in the adult auditory system is integral to long-term memory formation for sounds. Representational plasticity, here, refers to any form of experience-dependent neurophysiological plasticity that leads to prolonged changes in neurophysiological activity (e.g., tonotopic map expansion) that represent a sound dimension. Epigenetic mechanisms, which regulate gene expression essential for long-term memory, have been identified as robust modulators of experience-dependent neurophysiological plasticity in the auditory cortex that underlies the formation of memory for signal sounds (Shang & Bieszczad, 2022). For example, inhibition of the epigenetic enzyme histone deacetylase 3 (i-HDAC3) promotes signal-specific tuning bandwidth reduction and tonotopic map expansion for a signal acoustic frequency that predicts reward, while also increasing behavioral signal-specific responding to the trained frequency (relative to non-signal frequency cues) (Bieszczad et al., 2015; Shang et al., 2019; Rotondo & Bieszczad, 2020; Rotondo & Bieszczad, 2021). However, these results were observed under optimal (silent background) listening conditions that are rarely representative of real-world experience. Indeed, the extent to which the effects of i-HDAC3 on signal-specific neurophysiological and behavioral responses persist in novel backgrounds of noise is currently unknown. Thus, findings are presented that replicate a rodent (rat) model of sound-reward learning for a 5.0 kHz (60 dB) pure-tone frequency cue with in vivo auditory cortical (A1) multiunit electrophysiological recordings (as in Rotondo & Bieszczad, 2020) to examine the impact of background noise on the frequency-specificity of behavioral and cortical responses mediated by training with or without i-HDAC3. A1 frequency tuning is known to depend on signal-to-noise ratios in naïve rats (Teschner et al., 2016). To determine the effect of sound-reward learning and i-HDAC3 on this relationship, we assessed behavioral and A1 responses to signal and non-signal frequency cues presented under different signal-to-noise ratios (+0, +20, and +40 dB SNR). The results reveal how noisy backgrounds can ultimately impact the behavioral expression of highly frequency-specific memory. Our attempt to build a comprehensive model of the epigenetic regulation of neuroplasticity in A1 and long-term memory for sounds will further improve the ability to achieve successful hearing-related therapeutics in real-world environments, where sounds are seldom encountered in isolation.
Manon Obliger, Pascal Belin and Regis Trapeau
Topic areas: cross-species comparisons neuroethology/communication
Voice Conspecific vocalizations Auditory Cortex Marmoset fMRI Comparative | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Functional MRI studies in humans (Pernet et al., 2015) and macaques (Petkov et al., 2008; Bodin et al., 2021) suggest the existence of a “primate voice patch system”: a set of interconnected areas selective to conspecific vocalizations and subtending representations of increasing abstractness and invariance, potentially analogous to the primate face patch system. To date, a single marmoset auditory fMRI study has compared conspecific vocalizations to nonvocal sounds, with results suggesting the existence of bilateral voice patches in the marmoset anterior temporal lobe (Sadagopan et al., 2015). Here we aimed to replicate and extend these results by scanning n=6 marmosets under anesthesia during auditory stimulation with 4 categories: marmoset vocalizations, natural non-vocal sounds, scrambled vocalizations, and silence. We used a 3T scanner (Siemens PRISMA) and a commercial 16-channel marmoset head coil (Takashima/Rogue Research). Anesthesia was induced and maintained with sevoflurane delivered by a mask. A few minutes prior to functional scanning, the sevoflurane level was lowered to 1.5-1%. In later scanning sessions, N2O was added to the gaseous mix in order to further reduce the sevoflurane level to 0.5-0.8%. Functional scanning was performed at a spatial resolution of 1 mm (TR=773 ms) or 1.25 mm (TR=598 ms) using an optimized ‘clustered-sparse’ design. The comparison of EPI volumes acquired after sound stimulation vs. the silent baseline yielded varying results depending on the runs and sessions, with some subjects showing mostly subcortical activation and others showing clear bilateral activation of auditory cortex. Jackknife analyses indicate an effect of sevoflurane levels, with higher t-statistics for lower levels. The comparison of marmoset vocalizations vs. nonvocal sounds did not yield voice-selective activation, possibly owing to the large difference in spectral distribution between the two categories. However, the comparison with scrambled vocalizations did result in bilateral voice-selective activation in one subject, in a location of the temporal pole very similar to that reported before. Ongoing work includes more subjects and the addition of MION injections as a contrast agent. These results are expected to shed greater light on the neural architecture underlying voice information processing in marmosets, and provide a strong test of the hypothesis of a primate voice patch system.
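The jackknife analysis of sevoflurane effects can be sketched as a leave-one-run-out procedure over per-run statistics; the t-values below are hypothetical placeholders, not the study's data.

```python
import numpy as np

def jackknife(values):
    """Leave-one-out jackknife: bias-corrected estimate and standard
    error of the mean of per-run statistics."""
    x = np.asarray(values, float)
    n = len(x)
    loo = np.array([np.delete(x, i).mean() for i in range(n)])
    estimate = n * x.mean() - (n - 1) * loo.mean()
    se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))
    return estimate, se

# e.g., sound-vs-silence t-statistics per run (hypothetical):
print(jackknife([4.1, 3.8, 4.5]))   # runs at ~0.5-0.8% sevoflurane
print(jackknife([1.9, 2.2, 1.6]))   # runs at ~1-1.5% sevoflurane
```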
Rebecca Krall, Megan Arnold, Callista Chambers, Hailey King, Harry Morford, John Wiemann and Ross Williamson
Topic areas: correlates of behavior/perception hierarchical organization neural coding
Auditory categorization Corticofugal Sensory-guided behavior | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Senses connect our brain with the environment, allowing us to perceive the world around us. Auditory information enters the brain via a feedforward hierarchical pathway that canonically terminates in the auditory cortex (ACtx). An open question is how this information is processed and then routed brain-wide to influence behavior. Excitatory projection neurons in the ACtx comprise three groups: intratelencephalic (IT), extratelencephalic (ET), and corticothalamic (CT). These populations project broadly, targeting nodes of the ascending pathway as well as downstream regions classically associated with decision making, action, and reward. This organization allows for the broadcasting of behaviorally-relevant information and the shaping of auditory representations across brain-wide neural networks. We hypothesize that ACtx projection neurons provide a critical link between auditory input and behavioral output, necessary for auditory-guided behaviors. To investigate this, we trained head-fixed mice to categorize noise based on the rate of amplitude modulation (AM) and bilaterally silenced distinct neural populations during stimulus presentations using GtACR2 (on 20% of trials). To confirm that ACtx was necessary for performance of this task, we silenced all excitatory neurons and found that inhibition biased decisions towards one spout, ultimately leading to a significant reduction in categorization accuracy. Unexpectedly, silencing any of the 3 projection neuron classes had little effect on mice's ability to categorize, indicating that no single projection is necessary for task performance and suggesting that multiple projections may work synergistically. This disconnect between cell-specific and global inhibition led us to examine other consequences of inhibition across learning. We found that longitudinally inhibiting either ET or IT neurons led to a significant reduction in learning rate, evidenced by an increased number of trials and sessions to achieve task proficiency. Furthermore, ET and IT mice trained to expert level had lower accuracy and higher variability across sessions compared to wild-type controls. Using dynamic psychophysical modeling, we were able to infer differential learning strategies based on the factors influencing choice for each projection neuron class. Our current efforts are focused on using the GLM-HMM framework to investigate switching between these learning strategies to further probe the link between neocortical output and behavioral outcomes.
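The abstract's dynamic psychophysical modeling is not spelled out, but its spirit, inferring what drives choice from trial-by-trial behavior, can be illustrated with a simple logistic-regression (GLM) readout of choice from the stimulus and a history term. All data below are simulated, and all parameter choices (boundary, weights) are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_trials = 500
log_rate = rng.uniform(np.log(4), np.log(64), n_trials)  # AM rates, Hz
prev_choice = rng.integers(0, 2, n_trials)               # 1-trial history

# Simulated observer: choices driven by distance from a category boundary
boundary = np.log(16.0)
p_high = 1.0 / (1.0 + np.exp(-3.0 * (log_rate - boundary)))
choice = (rng.random(n_trials) < p_high).astype(int)     # 1 = "high" spout

X = np.column_stack([log_rate - boundary, prev_choice])
glm = LogisticRegression().fit(X, choice)
# glm.coef_ weighs stimulus vs. history; tracking these weights across
# sessions is one way to characterize a learning strategy.
```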
Joshua Hoddinott, Molly Henry and Jessica Grahn
Topic areas: memory and cognition correlates of behavior/perception
Rhythm Beat Music EEG Neural Entrainment Familiarity Predictability Training | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Humans often spontaneously synchronize movements to a perceived underlying pulse, or beat, in music. Beat perception may be indexed by the synchronization of neural oscillations to the beat, marked by increases in electrical amplitude at the same frequency as the beat in electroencephalography (EEG) signals (Nozaradan, Peretz, & Mouraux, 2012). Neural synchronization to the beat appears stronger for strong-beat than non-beat rhythms (Tal et al., 2017), and has been hypothesized to underlie the generation of an internal representation of the beat. However, because we are exposed disproportionately to strong-beat rhythms (e.g., in most music) in the daily environment, comparisons of neural responses to strong-beat and non-beat rhythms may be confounded by relative differences in familiarity. Thus, in this study we disentangled beat-related and familiarity-related effects by comparing EEG responses during the perception of strong-beat and non-beat rhythms that were either novel or familiar. First, we recorded EEG in response to a set of strong-beat, weak-beat, and non-beat rhythms. Then, subjects were familiarized with half of the rhythms over 4 behavioral sessions by listening to and tapping along with the stimuli. Finally, EEG in response to the full set of rhythms (half now familiar, half still unfamiliar) was recorded post-familiarization. Results show only small changes in EEG power post-familiarization, and little evidence of familiarity-driven increases in EEG power for weak- and non-beat rhythms. This suggests that oscillatory entrainment to the beat is not driven by stimulus familiarity.
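Frequency-tagged neural synchronization of the kind measured here is usually quantified as spectral amplitude at the beat frequency; a minimal sketch follows, with array shapes and parameter values as assumptions.

```python
import numpy as np

def beat_amplitude(eeg, fs, beat_hz):
    """Amplitude of the EEG spectrum at the beat frequency, averaged
    over channels. eeg: (n_channels, n_samples)."""
    n_samples = eeg.shape[-1]
    freqs = np.fft.rfftfreq(n_samples, 1.0 / fs)
    amps = np.abs(np.fft.rfft(eeg, axis=-1)) * 2.0 / n_samples
    idx = np.argmin(np.abs(freqs - beat_hz))
    return amps[:, idx].mean()

# e.g., compare pre- vs. post-familiarization epochs at the beat rate:
# beat_amplitude(epoch, fs=512, beat_hz=2.4)   # hypothetical values
```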
Jack Toth, Badr Albanna, Brian DePasquale, Saba Fadaei, Trisha Gupta, Kishore Kuchibhotla, Kanaka Rajan, Robert Froemke and Michele Insanally
Topic areas: neural coding
auditory cortex electrophysiology recurrent neural network | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Neuronal responses during behavior can range from highly reliable ‘classical’ responses to irregular or seemingly random ‘non-classically responsive’ firing. These spiking profiles have been documented throughout brain regions in response to many task-related signals. We combined in vivo cell-attached, extracellular, and whole-cell recordings during behavior with analyses of a novel task-performing spiking recurrent neural network. We recorded from the auditory cortex of rats and mice during a go/no-go auditory recognition task (rats: d’ = 2.8±0.1, N = 15; mice: d’ = 2.5±0.1, N = 7) and observed both classically responsive cells that were highly modulated relative to pre-trial baseline and non-classically responsive cells that were relatively unmodulated. To relate synaptic structure to spiking patterns, we developed a spiking recurrent neural network model incorporating excitatory and inhibitory spike-timing-dependent plasticity, trained to perform a similar go/no-go stimulus classification task. This model captures the distribution of responses observed in behaving rodents. Inactivation experiments revealed that classically and non-classically responsive units contributed to task performance via output and recurrent connections, respectively. Excitatory and inhibitory plasticity independently shaped spiking responses, increasing the number of non-classically responsive units while keeping all units engaged in performance. Local synaptic inputs predicted the spiking response properties of model units as well as those of auditory cortical neurons from in vivo whole-cell recordings during behavior, allowing us to predict the functional role of a neuron from its pattern of synaptic inputs. Thus, a diversity of neural response profiles emerges from synaptic plasticity rules, with distinct and important functions for network performance.
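The model's excitatory and inhibitory STDP rules are not given here; below is the classic pairwise STDP kernel as a point of reference, with illustrative time constants and amplitudes rather than the paper's parameters.

```python
import numpy as np

def stdp_dw(delta_t, a_plus=0.01, a_minus=0.012,
            tau_plus=0.020, tau_minus=0.020):
    """Pairwise STDP kernel: potentiation when the presynaptic spike
    leads the postsynaptic spike (delta_t = t_post - t_pre > 0),
    depression otherwise. Times in seconds."""
    delta_t = np.asarray(delta_t, float)
    return np.where(delta_t > 0,
                    a_plus * np.exp(-delta_t / tau_plus),
                    -a_minus * np.exp(delta_t / tau_minus))

print(stdp_dw([0.005, -0.005]))  # LTP for +5 ms, LTD for -5 ms
```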
Kateryna Pysanenko, Zbynek Bures and Josef Syka
Topic areas: neural coding
Lateralization Auditory cortex Rat | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In the present study, we examined hemispheric differences in the central auditory processing of temporally structured stimuli in adult rats. Evoked neuronal responses to temporally structured stimuli (sinusoidally frequency modulated [FM] tones, frequency sweeps, amplitude modulated [AM] tones and noise, click trains with constant and variable inter-click intervals) and natural vocalizations were recorded from the left (LAC) and right (RAC) auditory cortices in adult (4–6 months old) anaesthetized F344 rats. Using vector strength, modulation transfer functions, van Rossum distances, and a direction-selectivity index (DSI), synchronization with stimulus structure and response reliability were compared in the LAC and the RAC. The RAC exhibits a higher synchronization ability for all periodic stimuli except AM tones. Reproducibility of responses to periodic stimuli is also better in the RAC. On the other hand, the LAC mostly shows higher relative response magnitudes to temporally structured stimuli. The responses to vocalizations were similar in both hemispheres; however, the RAC exhibited a higher response to the onset of the second bout. The proportion of direction-selective neurons was higher in the RAC. The results show that coding of temporally structured stimuli in the RAC is based more on a temporal code, while the LAC relies more on a rate code. Direction selectivity is more developed in the RAC. These results confirm and extend our previous findings obtained both electrophysiologically and behaviorally. This work is supported by the project INTER-ACTION LTAIN 19201.
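Vector strength, one of the synchronization measures used in this study, has a standard definition: the length of the mean resultant vector of spike phases relative to the modulation cycle. A minimal implementation (the spike times are hypothetical):

```python
import numpy as np

def vector_strength(spike_times_s, mod_freq_hz):
    """VS = |mean(exp(i * phase))|: 1 for perfect phase locking,
    near 0 for phases spread uniformly over the modulation cycle."""
    phases = 2.0 * np.pi * mod_freq_hz * np.asarray(spike_times_s)
    return np.abs(np.mean(np.exp(1j * phases)))

# Spikes locked near the same phase of a 10-Hz AM tone give VS ~ 1:
print(vector_strength([0.101, 0.199, 0.302, 0.405], 10.0))
```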
Alexander Pei, Matt D. Schalles, Jason Mulsow, Dorian S. Houser, James J. Finneran, Peter L. Tyack and Barbara G. Shinn-Cunningham
Topic areas: correlates of behavior/perception cross-species comparisons
Auditory processing dolphin auditory evoked potentials hearing source reconstruction EEG | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Animals use biological sonar (echolocation) to sense their environment through echoes from self-generated clicks. Dolphins excel at echolocation; however, the brain regions in the dolphin that parse echoes to analyze underwater acoustical scenes remain largely unknown. Over the last few decades, human electroencephalography (EEG) has matured to support localization of brain regions involved in neural computations by leveraging dense multi-electrode recording arrays. This study explored whether this paradigm could be applied to dolphins. We recorded 16-channel EEG in two adult male dolphins during a passive-listening task in air. Synthetic clicks were delivered via a jawphone on the animal’s left jaw, at presentation rates of 2, 4, and 8 times per second with a 50-ms jittered interstimulus interval. Anterior electrodes 4 cm posterior to the blowhole exhibited a clear P1-N1-P2 complex with latencies ranging from 15 ms to 40 ms for all presentation rates. Faster presentation rates led to a decrease in the relative magnitudes of the event-related potentials (ERPs). We analyzed these ERPs using the weighted minimum norm eLORETA algorithm to localize neural regions responsible for sound processing. From a manually segmented T1 MRI structural scan of a dolphin, we identified scalp, skull, and brain tissue and generated a 3-layer boundary element model of the head out of triangular meshes, which was used by the eLORETA algorithm to estimate the locations of neural activity. Preliminary results indicate source activity was lateralized towards left parietal regions, which aligns with previous invasive localization of auditory cortices (Supin et al., 1978). However, the present estimates fall in subcortical gray matter, inferior to those previous reports, which localized activity to more superficial regions. This study provides a framework for EEG source localization in dolphins using anatomical and electrophysiological data, but requires further validation and testing.
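The abstract names eLORETA on a 3-layer BEM but not a software stack; one plausible sketch uses MNE-Python, which implements both. Everything below assumes a custom dolphin head model and pre-computed evoked data, source space, noise covariance, and coregistration (evoked, src, noise_cov, trans), so it is an assumed pipeline rather than the authors' code.

```python
import mne

def eloreta_sources(evoked, src, noise_cov, trans, subjects_dir):
    """Sketch: 3-layer BEM forward model plus an eLORETA inverse.
    The 'dolphin' surfaces would come from the manually segmented
    T1 scan, not from MNE's human defaults."""
    surfaces = mne.make_bem_model("dolphin", ico=4,
                                  conductivity=(0.3, 0.006, 0.3),  # scalp/skull/brain
                                  subjects_dir=subjects_dir)
    bem = mne.make_bem_solution(surfaces)
    fwd = mne.make_forward_solution(evoked.info, trans=trans,
                                    src=src, bem=bem)
    inv = mne.minimum_norm.make_inverse_operator(evoked.info, fwd, noise_cov)
    return mne.minimum_norm.apply_inverse(evoked, inv, lambda2=1.0 / 9.0,
                                          method="eLORETA")
```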
Greta Tuckute, Jenelle Feather, Dana Boebinger and Josh McDermott
Topic areas: hierarchical organization neural coding
neural network fMRI hierarchy auditory | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Deep neural networks are commonly used as models of the visual system, but are less explored in audition. We evaluated brain-model correspondence for publicly available audio neural network models along with in-house models trained on four different tasks. Most tested models out-predicted previous filter-bank models of auditory cortex, and exhibited systematic layer-region correspondence: middle layers best predicted primary auditory cortex while deep layers best predicted non-primary cortex. However, some of the publicly available models trained for engineering purposes produced substantially worse brain predictions. The results support the hypothesis that hierarchical models optimized for auditory tasks often learn representational transformations that coarsely resemble those in auditory cortex, but indicate that state-of-the-art engineering models can deviate substantially from biological systems.
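A common way to quantify the layer-region correspondence described here is cross-validated regularized regression from each layer's activations to measured responses; the sketch below assumes hypothetical array shapes and is not the authors' exact pipeline.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

def layer_prediction_scores(layer_activations, voxel_response):
    """For each network layer, cross-validated R^2 of ridge regression
    predicting one voxel's responses across sounds.
    layer_activations: dict layer_name -> (n_sounds, n_units);
    voxel_response: (n_sounds,)."""
    scores = {}
    for name, X in layer_activations.items():
        model = RidgeCV(alphas=np.logspace(-2, 5, 8))
        scores[name] = cross_val_score(model, X, voxel_response,
                                       cv=5, scoring="r2").mean()
    return scores  # the best layer indexes where the voxel sits in the hierarchy
```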
Alessandro La Chioma and David Schneider
Topic areas: memory and cognition multisensory processes
Auditory cortex Auditory processing Predictive processing Context | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Auditory perception relies on predicting the acoustic consequences of our actions. Correspondingly, neural circuits in the brain respond differently to expected versus unexpected self-generated sounds. In the real world, the same motor action can produce different sounds depending on the environment in which the behavior is produced - e.g., footsteps sound different when walking on concrete compared to fallen leaves. Yet it remains untested whether the brain can dynamically update predictions about self-generated sounds in a context-dependent manner and on an ethological timescale. To address this question, we developed a visual-acoustic virtual reality (VR) in which a head-fixed mouse on a treadmill repeatedly traverses two different environments, each consisting of a distinct visual corridor accompanied by distinct artificial footstep sounds. Following extensive behavioral acclimation, we made high-density neuronal recordings from auditory cortex as mice traversed the VR and experienced either expected or deviant footsteps. We observe strong overall suppression of neural responses to self-generated sounds compared to the same sounds heard passively. When mice hear footstep sounds in the wrong visual context, neural responses are on average larger than when mice hear footstep sounds in the correct context, consistent with an expectation violation. Expectation violations emerge almost immediately after a mouse enters a new context, suggesting rapid updating of predictions in parallel with behavior. Neurons with strong context-dependent modulation tend to reside in infragranular cortex. Together, our results suggest that the auditory cortex may combine auditory and motor signals with visual spatial cues for real-time, context-dependent processing of self-generated sounds.
Danyi Lu, Brett Bormann, Kendall Stewart, Jordan Roberts, Jeffrey Johnson, Katie Neverkovec and Gregg Recanzone
Topic areas: auditory disorders hierarchical organization neural coding
hearing loss normal aging rhesus macaque primary auditory cortex single-unit electrophysiology | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Age-related hearing loss (ARHL) afflicts approximately 1/3 of the geriatric population and increases exponentially with age. We recorded single-neuron activity in core auditory cortex in a passively listening male rhesus macaque monkey. Stimuli were 200-ms tones with frequencies ranging from 250 Hz to 19 kHz. The animal was 26-27 years old at the time of testing, equivalent to 78-81 human years, and had bilateral high-frequency hearing loss as measured by an ABR. Preliminary analysis showed that the tonotopic representation in AI was not as systematically spatially organized as has been seen in younger monkeys. The rostral (low frequency) region showed a normal tonotopic gradient from low to high best frequency (BF) in the rostral to caudal direction, which we define as the non-reorganized zone. The caudal region showed a severely disrupted BF organization, which we define as the reorganized zone. Comparing neurons between the reorganized and non-reorganized regions showed that individual neurons recorded simultaneously on the same electrode had greater variance in their BFs. The spontaneous, onset, and offset responses, but not sustained responses, were greater in neurons in the reorganized zone compared to those in the non-reorganized cortex. Finally, the spectral bandwidth measured for neurons in the reorganized zone was greater compared to neurons in the non-reorganized zone. These data indicate that the responses of neurons in the reorganized zone are altered in both spectral and temporal domains, across both the tonotopic gradient as well as between neighboring neurons. These differences may provide insights into the central mechanisms of ARHL.
I.M Dushyanthi Karunathilake, Christian Brodbeck, Shohini Bhattasali, Philip Resnik and Jonathan Z. Simon
Topic areas: speech and language neural coding
Speech encoding TRF Cortical Auditory Processing MEG | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Understanding speech requires analyzing acoustic waveforms via intermediate abstract representations, including phonemes, words, and ultimately meaning, along with other cognitive operations. While recent neurophysiological studies have reported that the brain tracks acoustic and linguistically meaningful units, the impact of different kinds of speech information on brain responses is not well understood. Additionally, how these feature responses are modulated by top-down mechanisms and speech comprehension remains unclear. To address these questions, we recorded magnetoencephalography (MEG) data from 30 healthy younger participants while they listened to four types of continuous speech-like passages: speech-envelope modulated noise, narrated English-like non-words, word-scrambled narrative, and true narrative. Using multivariate temporal response function (mTRF) analysis, we show that the cortical response time-locks to emergent features, from acoustics to linguistic processes at the sentential level, as incremental steps in the processing of speech input occur. Our results show that when the stimulus is unintelligible, the cortical response time-locks only to acoustic features, whereas for intelligible speech, the cortical response time-locks to both acoustic and linguistic features. For the case of narrated non-words, phoneme-based lexical uncertainty generates less activation than for true words, suggesting a lack of predictive coding error. Temporal analysis shows that non-word onsets generate smaller early responses than word onsets, but also stronger late responses, suggesting different neural mechanisms associated with accessing lexico-semantic memory traces. For the scrambled word passages, we find additional responses based on context-independent word surprisal, whereas for true narrative, the responses are additionally driven by context-based word surprisal. The unigram word surprisal responses show strong late peaks for the scrambled word passage, consistent with an N400-like response. The results also show that most language-dependent time-locked responses are left-lateralized, whereas lower-level acoustic feature responses are right-lateralized or strongly bilateral. Taken together, our results show that time-locked brain responses to linguistic units are influenced by speech content and level of processing, and these feature responses could be used to evaluate perception and comprehension.
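A single-feature version of the mTRF analysis can be written as ridge regression on time-lagged copies of a stimulus feature (acoustic envelope, phoneme or word-onset impulse trains, surprisal-weighted impulses); the sketch below makes simplifying assumptions about edge handling and regularization.

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_trf(feature, response, fs, tmin=-0.1, tmax=0.5, alpha=1.0):
    """Estimate a temporal response function by regressing the neural
    response on lagged copies of one stimulus feature."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    X = np.column_stack([np.roll(feature, lag) for lag in lags])
    X[:lags.max()] = 0                 # crude handling of wrap-around edges
    trf = Ridge(alpha=alpha).fit(X, response).coef_
    return lags / fs, trf              # kernel as a function of lag (s)
```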
Monty Escabi, Mina Sadeghi, Xiu Zhai, Delaina Pedrick, Heather Read and Ian Stevenson
Topic areas: correlates of behavior/perception neural coding subcortical processing
Auditory textures perception inferior colliculus auditory midbrain natural sounds | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Categorical perception, the ability to group sounds or images into discrete perceptual categories, enables humans and other animals to rapidly and robustly recognize and respond to stimuli. Categorical representations have been observed for speech and other vocalized sounds along relatively simple time and frequency dimensions. However, the high-order acoustic features and neural computations underlying categorical representations for more general sound categories are not well understood. We used texture synthesis to generate chimeric auditory textures. Chimeric textures were generated by morphing the summary statistics of running water and crackling fire sounds. The sounds were first used in human perception tasks where participants were required to either identify or discriminate chimeric textures. For both tasks, the chimeric texture statistics were varied by adjusting the summary statistics included during synthesis and the morph ratio (MR), from 0 (water extreme) to 1 (fire extreme). Parallel studies were also carried out to study the neural representation of chimeric textures in the inferior colliculus (IC) of unanesthetized rabbits. Neural activity was obtained using multi-channel silicon probes inserted along the principal frequency axis during passive listening to the chimeric textures. We demonstrate that shifting summary statistics by changing the morph ratio produces a robust shift in human listeners’ perception. Participants readily identified water and fire sounds in a categorical-like fashion with increasing morph ratio, with the sound correlation structure being the critical and necessary statistical cue responsible for the categorical effect. The categorical shift is accompanied by lower discrimination accuracy and larger JNDs for within-category as compared to across-category discrimination. We then demonstrate how the response statistics of neural ensembles can be used to accurately decode the sound category. Bayesian neural decoders were used to assess how population response summary statistics can categorize or discriminate the chimeric textures. Shifting the morph ratio produces a similar categorical-like shift in neural decoding performance, with the same trends of larger JNDs and lower accuracy as for human participants. The findings suggest that high-order statistical sound cues can drive categorical-like perception for textures, and that the neural response statistics in IC may contribute to this phenomenon (supported by NIDCD, R01DC015138; R01DC020097).
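The morphing operation itself is simple to state: each summary statistic of the chimera is a linear mixture of the water and fire statistics, weighted by MR. A sketch follows; the statistic names and the synthesis step (which iteratively imposes these statistics on noise) are omitted or assumed.

```python
import numpy as np

def morph_statistics(stats_water, stats_fire, mr):
    """Interpolate texture summary statistics: mr=0 returns the water
    statistics, mr=1 the fire statistics. Inputs are dicts of arrays
    (e.g., envelope moments, modulation power, correlations)."""
    return {key: (1.0 - mr) * np.asarray(stats_water[key])
                 + mr * np.asarray(stats_fire[key])
            for key in stats_water}

# e.g., a midpoint chimera: morph_statistics(water_stats, fire_stats, 0.5)
```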
Amy LeMessurier and Robert Froemke
Topic areas: correlates of behavior/perception neural coding neuroethology/communication subcortical processing
vocalizations auditory cortex corticocollicular maternal behavior | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The ability of mothers to detect and respond to sensory cues from infants is essential for survival in mammals. A key maternal behavior in mice is retrieving isolated pups into the nest in response to infant ultrasonic vocalizations (USVs). Retrieval is performed robustly by experienced mothers (dams), but can be learned by virgins co-housed with a dam and litter. Maternal experience induces plasticity in left auditory cortex that broadens tuning curves for USVs and increases reliability of responses, and activity in left auditory cortex is required for retrieval (Carcea et al., Schiavo et al. 2020, Marlin et al. 2015). How does this enhanced coding support pup retrieval? Descending projections from auditory cortex to subcortical targets may be crucial for retrieval. We used 2-photon in vivo calcium imaging and in vivo channelrhodopsin-assisted patching to examine encoding of USVs throughout maternal experience in auditory projection neurons, labeled via retrograde viral tracing. We measured responses to USVs in neurons projecting to either the inferior colliculus or posterior striatum. In corticocollicular neurons, imaging experiments revealed a striking increase in baseline activity during epochs of repeated USV presentations compared to tone presentation epochs. In contrast, baseline activity in corticostriatal neurons was equivalent during tone and USV presentation epochs. Time-locked responses to USVs were also substantially larger in corticocollicular compared to corticostriatal neurons (evoked dF/F: 5.6 +/- 0.55%, N=158 corticocollicular neurons from 4 mice; 2.9 +/- 0.2%, N=271 corticostriatal neurons from 3 mice; p < 0.001). In vivo patch measurements from optogenetically-identified neurons reflect this trend (evoked firing rate: 2.04 +/- 0.72 Hz, N=8 corticocollicular neurons from 3 mice; 0.24 +/- 0.44 Hz, N=5 corticostriatal neurons from 5 mice, p=0.12). We tested the involvement of select populations of neurons in pup retrieval using chemogenetics. Suppressing activity in left auditory cortex layer 5 neurons impaired retrieval performance in retrieving females (vehicle: 90 +/- 10% of pups retrieved; CNO: 41 +/- 7%, N=3 mice, 10 trials per condition). However, bilateral suppression of activity in corticostriatal neurons did not impair performance (vehicle: 87 +/- 5% of pups retrieved; CNO: 96 +/- 2%, N=3 mice, 10 trials per condition).
Matt Schalles, Jason Mulsow, Dorian Houser, Jim Finneran, Peter Tyack and Barbara Shinn-Cunningham
Topic areas: memory and cognition neuroethology/communication
dolphin attention auditory evoked potential (AEP) mismatch negativity (MMN) | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In a noisy acoustic environment, how does a dolphin attend to an auditory source of interest and ignore boat engine noise, snapping shrimp, and even another dolphin’s echo returns? Selective attention requires actively directing focus onto a target while ignoring competing sound sources. In many mammals, endogenous changes in task strategy, including the focus of selective attention, can modulate the magnitude of early cortical responses evoked by sound onsets (auditory evoked potentials, or AEPs). Here, we investigated whether task demands alter AEPs in an adult male bottlenose dolphin. Each trial played a rapid sequence of 10-ms tones at 20 kHz and 28 kHz, with a jittered 50-150 ms inter-stimulus interval. The dolphin was trained to whistle to a “target” tone, which was 10 dB more intense (140 dB re 1 μPa SPL) than other tones, and to withhold responses on catch trials. Correct responses earned a fish reward. We employed a 2 x 2 design in which, across blocks (each ~1-2 weeks of training and testing), we varied both the target frequency (20 kHz or 28 kHz) and the frequency context (whether 20 kHz or 28 kHz occurred frequently). Specifically, in each block, 80% of the tones were a standard of one frequency (either 20 kHz or 28 kHz) and the other 20% were the other frequency (deviants). We predicted larger AEP responses to tone deviants whose frequency was infrequent (reflecting an exogenous mismatch negativity, or MMN, response). We further expected the MMN to be larger when the deviant frequency matched that of the target, reflecting endogenous effects. Preliminary results from a midline electrode ~20 cm posterior to the blowhole suggest an increased MMN when the target and deviant frequencies match, for both 20- and 28-kHz conditions. The difference between standard and deviant peak-to-peak amplitudes of the N1-P2 AEP components was larger from about 50-100 ms post-stimulus onset for tones of the target frequency. This MMN was smaller (28 kHz) or nonexistent (20 kHz) for tones not matching the target. This suggests that dolphins exhibit top-down effects on auditory processing, particularly when goal-directed task demands interact with stimulus statistics and expectations.
Wusheng Liang, Barbara Shinn-Cunningham and Christopher Brown
Topic areas: correlates of behavior/perception
Auditory spatial attention Head-related transfer function Electroencephalography | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Auditory spatial attention plays an important role in our daily life by enabling us to selectively attend to different sound sources in busy acoustic environments. To study auditory spatial attention, three different methods for spatializing stimuli are commonly used: the head-related transfer function (HRTF), interaural level difference (ILD), and interaural time difference (ITD). However, few studies have asked whether the spatialization method affects the ability to direct spatial attention or alters concomitant neural signatures of attentional control. To address this question, we developed an auditory attention task in which participants were asked to remember a short sequence of syllables from a target direction while ignoring both a competing stream that was presented on all trials from the opposite hemifield and a rare interrupting sound (presented in 25% of the trials, randomly selected, also from the opposite hemifield). We compared the three spatialization methods and recorded electroencephalography (EEG) during task performance. Across subjects (N = 15), ITD spatialization yielded significantly worse target syllable recall than ILD or HRTF spatialization. In addition, the rare, salient interrupting sound caused a greater performance decrease (relative to performance on uninterrupted trials) for ITD spatialization. Preliminary EEG analysis showed greater attentional modulation of the P1 component of the event-related potential (measured from fronto-central electrodes) evoked by attended syllables for HRTF and ILD spatialization. Further analysis of the EEG data will be undertaken to better understand how the spatialization method affects the deployment of auditory spatial attention.
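For reference, ILD and ITD spatialization amount to attenuating or delaying the far-ear copy of a mono signal, while HRTF spatialization convolves with measured left/right impulse responses. A minimal sketch, with illustrative ITD/ILD magnitudes, for a source placed toward the left:

```python
import numpy as np

def spatialize_left(mono, fs, itd_s=0.0005, ild_db=10.0):
    """Return a stereo signal with the right (far) ear delayed by the
    ITD and attenuated by the ILD; set either cue to 0 to isolate the
    other. HRTF rendering would convolve with impulse responses instead."""
    delay = int(round(itd_s * fs))
    left = np.pad(mono, (0, delay))                        # leading ear
    right = np.pad(mono, (delay, 0)) * 10 ** (-ild_db / 20.0)
    return np.column_stack([left, right])
```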
Dori Grijseels, Daniella Fairbank and Cory Miller
Topic areas: neuroethology/communication
marmoset dyadic communication modelling computational turn-taking | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
An important aspect of dyadic communication is determining when each conversation partner vocalizes. Various animal species, including humans, employ implicit rules that determine the communication dynamics between partners. Common marmosets (Callithrix jacchus) use long-range phee calls to make contact with conspecifics in the absence of visual access. During these contact calls, marmosets rarely interrupt each other, and show coordination of calling time and duration, suggesting they also follow implicit rules for turn-taking. However, how marmosets determine when to call in these vocal exchanges is still unclear. In this study, we recorded the phee calls of pairs of monkeys that could hear, but not see, each other. Based on the timing data of these phee calls, we propose a novel stochastic model that captures marmoset calling dynamics. We show that a marmoset’s proclivity to respond to individual calls of its partner monkey is largely independent of that monkey’s behavior, including response delay and intercall interval. Instead, marmosets have periods of increased overall calling, which we term the active state, in which they are more likely to initiate and continue a conversation. Overall, we used natural monkey calling behavior to develop a novel model of marmoset communication dynamics. We plan to use this model in future experiments in which we replace one monkey with a computer-generated virtual monkey whose vocal behavior is determined by machine learning algorithms, allowing for naturalistic interrogation of the underlying processes that govern this communicative behavior, and the development of a powerful platform for neurobiological investigations.
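The proposed stochastic model is described only qualitatively here, but its core idea, call probability governed by a latent active/quiet state rather than by the partner's last call, can be sketched as a two-state process; all rates and probabilities below are hypothetical.

```python
import numpy as np

def simulate_caller(duration_s=600, dt=1.0, p_enter=0.01, p_exit=0.005,
                    rate_active=0.10, rate_quiet=0.005, seed=0):
    """Two-state caller: per time step, maybe switch latent state, then
    emit a phee call with a state-dependent probability."""
    rng = np.random.default_rng(seed)
    active, t, call_times = False, 0.0, []
    while t < duration_s:
        if rng.random() < (p_exit if active else p_enter):
            active = not active                 # latent state transition
        if rng.random() < (rate_active if active else rate_quiet) * dt:
            call_times.append(t)                # emit a call
        t += dt
    return call_times
```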
Camila Zugarramurdi, Lucía Fernández, Marie Lallier, Manuel Carreiras and Juan C. Valle-Lisboa
Topic areas: memory and cognition correlates of behavior/perception
reading neural synchronization prereaders | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Learning to read involves accessing phonological representations in order to establish grapheme-to-phoneme mappings. In turn, the development of phonological representations depends on successful segmentation of speech into discrete units. In recent decades, strong evidence has pointed to the role of cortical oscillations as an underlying mechanism for speech segmentation. In the present work, we used EEG to study synchronization between cortical oscillations and amplitude-modulated white noise at 2, 4, and 8 Hz in 40 Spanish-speaking pre-reading children, and its association with reading performance one year later. Results show synchronization between cortical oscillations and auditory stimuli at frequencies particularly relevant for speech processing (2 and 4 Hz), but not at less relevant frequencies (8 Hz). With respect to reading, we found that neural synchronization at 2 Hz, but not at 4 Hz, is associated with future reading performance. These findings provide novel evidence for the role that cortical oscillations play in auditory processing and in learning to read among prereaders.
Michael Turvey, Srihita Rudraraju, Bradley Theilman and Timothy Gentner
Topic areas: correlates of behavior/perception hierarchical organization neural coding neuroethology/communication
predictive coding mismatch extracellular songbird electrophysiology animal behavior bioacoustics | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Predictive coding is well-studied in the visual system and in certain paradigms in the auditory system. However, in the auditory system, research has focused mainly on highly tractable but ethologically unlikely stimuli such as pure sine tones. Here we use operant behavioral training to induce strong expectation for ethologically relevant stimuli in a match-to-sample paradigm in European starlings, Sturnus vulgaris, with sequentially patterned syllables from conspecific songs. After varying amounts of training, subjects were able to generalize rule-governed match/non-match responses to very large sets of novel song syllables. Following behavioral training and testing, we examined the evoked spiking activity in auditory forebrain regions analogous to mammalian secondary auditory cortices. Using 128-channel silicon extracellular electrode arrays, we recorded from higher auditory areas CM and NCM. In CM (n=398 well-isolated units), firing rates drop significantly with repeated presentations of the same syllable, and a strong mismatch response (significant increase in firing rate) is observed when the syllable changes. The strength of this mismatch response increases as the number of sample syllable repetitions increases. The mismatch response generalizes across large sets of song syllables, but only if those syllables are familiar from prior training. We do not see a change in response for silence, indicating that the observed “mismatch” response in these higher auditory areas is specific to feature identity and not just mean acoustic power. The results demonstrate that individual auditory-sensitive neurons are also sensitive to context, giving us more information about how learned signals are encoded in the brain.
Xiaomao Ding and Maria N. Geffen
Topic areas: correlates of behavior/perception neural coding
Auditory cortex Interneurons Optogenetics Perception Temporal regularity | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The statistical structure of sensory stimuli is a key component of how animals process and interact with the external world. In the auditory domain, temporal regularity is a common part of the statistical structure of stimuli, due to physical constraints on naturalistic sound sources. Human subjects exhibit differential responses to regular versus random sound sequences and also exhibit increased behavioral sensitivity to regular sequences of sound. In the mouse brain, temporal regularity modulates the neuronal response in the auditory cortex (AC) to novel tones in an oddball paradigm. Here, we investigated how inhibitory interneurons modulate differential responses to temporally regular and random sound sequences. We used 2-photon imaging to record calcium traces of AC neurons in head-fixed somatostatin (SST)-cre mice injected with the inhibitory opsin JAWS. The mice were presented with a set of moving ripples in either a regular or a random sequence. In both sequences, a novel ripple occasionally replaced one from the base set, and a laser was used to pseudo-randomly inhibit SST interneurons during sound presentation. Neurons in AC exhibited a novelty response, as demonstrated by an increased response to the novel ripple as compared to the same ripple presented as part of a random sequence. The novelty response was modulated by the regularity of the sound sequence: more neurons exhibited a novelty response in the regular sequence (36%) than in the random sequence (19%). This novelty response modulation by sequence regularity was abolished when SST activity was suppressed optogenetically: on novel ripple presentations with optogenetic inhibition of SST interneurons, the same number of neurons exhibited a significant response in both regular and random sequences. Our results establish that novelty responses in AC are modulated by sequence regularity, and that this modulation is controlled by a specific type of inhibitory interneuron. This modulation likely supports complex aspects of speech and music perception.
Emily Przysinda, Jillian O'Malley, Aaron Nidiffer, Bridget Shovestul, Abhishek Saxena, Stephanie Reda, Emily Dudek, David Dodell-Feder and Edmund Lalor
Topic areas: speech and language
language schizophrenia naturalistic auditory envelope | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Schizophrenia is a complex disorder that affects perception, cognition, and emotion. Patients with schizophrenia are known to show subtle differences in auditory processing that might be related to auditory hallucinations. These differences are often studied using paradigms that involve isolated aspects of auditory processing with repetitive and abstract auditory stimuli. Recent developments in analyses of different auditory and language components using naturalistic stimuli, such as continuous speech, may allow us to better understand the interaction of acoustic and language processing in patients with schizophrenia. This approach can also decrease potential confounds of attention and motivation. Furthermore, the stimuli can be tailored to questions of interest, allowing us to ask questions about language that may interact with other domains, such as social processing deficits in schizophrenia. Here, we use a naturalistic speech approach to examine basic auditory and linguistic processing during a video stimulus, "The Office" comedy TV show. We used linear methods to relate the EEG signal to the auditory envelope of the episode, which yields quantitative values that index how well the brain tracks the speech envelope. Our data demonstrate that we can robustly capture how the brain tracks the acoustic envelope of "The Office" and that this tracking is driven by activity over temporal scalp regions, as we would expect. Preliminary comparisons between patients with schizophrenia (n=7) and controls (n=17) suggest that patients with schizophrenia may track the acoustic envelope less than healthy controls. These results show that useful neural signatures of envelope tracking can be extracted from video stimuli, and that patients with schizophrenia may have deficits in processing basic acoustic features of naturalistic video. Future analyses will disentangle how much this reduced envelope tracking in schizophrenia derives from deficits in acoustic versus linguistic processing of the stimuli, and how these measures may interact with behavioral and clinical indices.
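The acoustic envelope used in this kind of analysis is commonly computed as the low-pass-filtered magnitude of the analytic signal; a minimal sketch follows, with the cutoff and filter order as assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def acoustic_envelope(audio, fs, cutoff_hz=8.0):
    """Broadband envelope: |analytic signal|, low-pass filtered to the
    slow modulations that EEG can plausibly track."""
    env = np.abs(hilbert(audio))
    b, a = butter(3, cutoff_hz / (fs / 2.0))
    return filtfilt(b, a, env)
# This envelope is then related to EEG with linear (e.g., TRF) methods.
```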
Sharlen Moore, Zyan Wang, Angel Lee and Kishore Kuchibhotla
Topic areas: memory and cognition correlates of behavior/perception
goal-directed behavior habit-like performance water restriction audiomotor learning | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Animals use different decision processes to efficiently adapt to complex environments. When operating in a goal-directed mode, animals deliberate amongst alternatives in a slow and cognitively demanding process. As the environment becomes predictable, animals can shift to a faster habit mode. To what extent are transitions from goal-directed to habitual behavior sudden or gradual? The nature and timescale of these transitions remain poorly understood. Standard methods to determine the presence of habits include outcome devaluation and contingency degradation. However, these are implemented at discrete time points, hindering the identification of the precise moment when a transition occurs. We hypothesized that by shifting an animal’s motivation from a need (survival) to a preference (palatability), we could use action rate variability as a behavioral indicator of goal-directed performance (variable) versus habit-like performance (stable) without impacting accuracy. We leveraged a recent protocol in which mice get ad libitum access to water with citric acid (CA), a less-palatable substance that fulfills hydration needs. We compared CA mice with mice under standard water-restriction (WR) protocols. Mice were trained in an auditory go/no-go task in which they learned to lick in response to one tone for a water reward and to withhold licking to another tone to avoid a timeout. All mice acquired task contingencies at similar rates. Interestingly, throughout training, most CA mice initially showed high action rate variability, regularly shifting between epochs of high and low engagement. Surprisingly, CA mice exhibited an abrupt reduction in action rate variability, typically at the beginning of a new session, suggesting a sudden shift to habit-like performance, an effect not evident in WR mice. Detailed lick analysis demonstrated a sudden increase in consummatory licks in CA mice post-transition, a signature of ‘automaticity’ related to habits. Ongoing work aims to isolate pupil-based biomarkers of goal-directed and habitual behavior. This approach allows us to identify naturalistic transitions between goal-directed and habitual behavior and provides an opportunity to uncover new insights into the neural basis of habit formation. Our data suggest that the transition from goal-directed to habit-like performance during learning is sudden, concomitant with automaticity, and may result from a winner-take-all process.
Nazineen Kandahari, Patrick Hullett, Jon Kleen, Vikram Rao, Robert Knowlton and Edward Chang
Topic areas: speech and language hierarchical organization
Heschl’s gyrus Heschl’s gyrus resection epilepsy surgery primary auditory cortex superior temporal gyrus speech network | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In classic speech network models, the primary auditory cortex, located along Heschl's gyrus, is the primary source of auditory input to the human speech processing area in the posterior superior temporal gyrus (pSTG). Because of this, Heschl's gyrus resection in the dominant hemisphere is often avoided and, when performed, typically only involves partial resection. However, recent work from our lab has added new insight into the classical speech network model, showing other parallel inputs to the pSTG and thus raising the possibility of Heschl's gyrus resection without speech impairment. In support of this, we present here a clinical case of a woman with severe medically refractory epilepsy with an epileptic focus in the left (dominant) Heschl's gyrus. Physiologic recordings from this area were consistent with primary auditory cortex localized to Heschl's gyrus. Furthermore, stimulation-based functional mapping did not disrupt speech perception or production, consistent with the dominant Heschl's gyrus not being necessary for speech perception. Given this, she underwent lesionectomy with total resection of Heschl's gyrus to treat her epilepsy. Immediately post-operatively, there were no speech perception or production deficits, and clinically, she remains seizure-free. While resection of the dominant hemisphere Heschl's gyrus warrants caution, this case illustrates the ability to resect Heschl's gyrus without speech impairment and further supports the existence of multiple parallel inputs to pSTG.
Jennifer Lawlor, Sarah Elnozahy, Farah Du, Fangchen Zhu, Aaron Wang, Tara Raam and Kishore Kuchibhotla
Topic areas: correlates of behavior/perception
sensorimotor learning longitudinal two-photon calcium imaging cholinergic axons auditory cortex | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
During sensorimotor learning, animals link a sensory cue with actions that are separated in time using circuits distributed throughout the brain. Learning thus requires neural mechanisms that can operate across wide spatiotemporal scales and promote learning-related plasticity. Neuromodulatory systems—with their broad projections and multiple timescales of activity—fulfill these criteria and could serve as a potent mechanism to link the different sensory and motor components. Yet the extent to which this proposed model of plasticity operates in real time during behavioral learning remains unknown. Acquisition of sensorimotor learning in a go/no-go task is much faster and more stereotyped than previously considered (Kuchibhotla et al., 2019). We trained mice to respond to one tone for a water reward (S+) and withhold from responding to another (S-). We interleaved reinforced trials with those where reinforcement was absent (“probe” trials). Early in learning, animals discriminated between S+ and S- in probe but not reinforced trials. This unmasked a rapid acquisition phase of learning, followed by a slower phase for reinforcement, termed ‘expression’. What role does neuromodulation play in task acquisition? Here, we test the hypothesis that cholinergic neuromodulation provides a ‘teaching signal’ that drives primary auditory cortex (A1) and links stimuli with reinforcement. We exploit our behavioral approach and combine it with longitudinal two-photon calcium imaging of cholinergic activity in A1 during discrimination learning. We report robust stimulus-evoked cholinergic activity to both S+ and S-, as well as stable licking-related activity throughout learning, at the level of the axon segment. While this activity mildly habituates in a passive control, in behaving animals the S+ and S- stimulus-evoked activity is enhanced (S+: duration; S-: amplitude and duration) during early learning. Additionally, we test the hypothesis that cholinergic neuromodulation impacts the rate of task acquisition. We bilaterally expressed ChR2 in cholinergic neurons within the basal forebrain of ChAT-cre mice and activated these neurons on both S+ and S- trials throughout learning. Test animals acquired the task faster than control groups, as measured in probe trials. These results suggest that phasic bursts of acetylcholine, projecting widely to cortical regions, directly impact the rate of discrimination learning.
Huaizhen Cai and Yale Cohen
Topic areas: multisensory processes neural coding
audiovisual task neural decoding non-human primates | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Because our environment is inherently multisensory, it is reasonable to speculate that our brain has evolved to preferentially process multisensory information. Indeed, multisensory activity has been found throughout the entire auditory hierarchy: from the middle ear to the prefrontal cortex. However, despite the large literature on multisensory processing, we do not have a full picture of the relationship between multisensory behavior and neural activity, especially in primate models. Here, we recorded auditory-cortex neural activity at different spatiotemporal scales in order to evaluate its contributions to multisensory behavior. Specifically, we recorded EEG, LFP, and single-unit activity while a monkey performed an audiovisual detection task, in which an ecologically relevant ‘coo’ call embedded in a chorus was delivered with or without a corresponding ‘coo’ video. The signal-to-noise ratio was varied from -10 to +10 dB in 5-dB steps. We report how well stimulus and task parameters can be decoded from individual neural signals (e.g., LFP or single units) and whether combinations of neural signals (e.g., LFP and single units) facilitate parameter decoding.
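The decoding analyses are not specified beyond individual and combined neural signals, so the sketch below shows one generic version: cross-validated linear decoding from trial-wise features, where concatenating feature sets tests whether combining signal types helps. Shapes and labels are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def decoding_accuracy(features, labels):
    """Cross-validated decoding of a task variable (e.g., audiovisual
    vs. auditory-only) from trial-wise neural features.
    features: (n_trials, n_features); labels: (n_trials,)."""
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, features, labels, cv=5).mean()

# decoding_accuracy(lfp_power, cond)                       # LFP alone
# decoding_accuracy(np.hstack([lfp_power, spikes]), cond)  # combined
```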
Mateo Lopez-Espejo and Stephen David
Topic areas: correlates of behavior/perception neural coding
Integration Context Sparse Code Ferret Population Code Working Memory | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Accurate auditory processing requires working memory and the integration of information from ongoing stimuli over time. Thus, the auditory cortex (AC) must keep track of the context of sensory information over many hundreds of milliseconds, longer than the typical latency of evoked cortical activity. The magnitude of this memory trace, how it interacts with the representation of ongoing stimuli, and how neuronal and circuit mechanisms contribute to it remain open questions. To study this problem, we measured the responses of neurons to brief natural sound probes presented in different sensory contexts, where context was defined as the natural sound immediately preceding a probe. We used context-dependent differences in probe response to measure the magnitude of memory and integration effects. Recordings were performed using multi-electrode arrays in the primary and secondary regions of the AC of awake, passively listening ferrets. We found significant contextual effects, which in some cases lasted more than 1 second, with neurons in secondary auditory regions showing stronger and longer-lasting effects. Contextual effects were greater when one of the contexts was silence, and smaller when it was the same sound as the probe. This implicates adaptation to the context's spectro-temporal features as a potential contributor to the effects. Context effects were limited to a subset of stimulus combinations in any single neuron, but the specific combinations were diverse across neurons in the local population. Thus, the local population supports a sparse representation of the ongoing sensory context. To gain insight into the mechanisms supporting context effects, we trained encoding models to predict the activity of individual neurons as a function of the stimulus and past population activity. Models containing both the past activity of neighboring neurons and that of the neuron itself were better able to account for long-lasting contextual modulation. The requirement for both inputs hints at a combined role of local network inputs and intrinsic properties of the neuron in generating these context representations.
Andrea Bae, Keanu Shadron, Roland Ferger and José Luis Peña
Topic areas: correlates of behavior/perception cross-species comparisons subcortical processing
sound localization binaural hearing brain oscillations | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Barn owls are specialists in sound localization. Their well-described midbrain stimulus selection network, a circuit containing a map of auditory space dedicated to localizing salient sounds, provides a unique opportunity to study the flow of information between midbrain and forebrain, where a transformation of coding scheme is indicated by the disappearance of the space map between these regions. This project investigates how the circuit conducts bottom-up relay of salient stimuli in environments with competing sounds. Earlier in vivo recordings in the owl’s optic tectum (OT) have shown that gamma oscillations are spatially tuned to both visual and auditory information, and may play a role in stimulus selection. However, previous recordings in deep midbrain structures, like OT, have relied on single electrodes in a single region, and an open question remains regarding how oscillations contribute to information flow during stimulus selection. Towards this end, we recorded spike responses and local field potentials in OT and one of its downstream forebrain regions simultaneously in anesthetized owls. So far, we observe heterogeneity in tuning properties to binaural cues in the forebrain, ranging from peaked tuning curves to broad tuning to contralateral space. However, while tuning may differ between brain regions, firing rates and gamma power positively correlate both during spontaneous activity in the absence of stimuli and during presentation of stimuli, suggesting connectivity. Future experiments will determine the role of gamma oscillations in promoting stimulus selection during presentation of two competing sounds, and determine whether gamma power is predictive of bottom-up relay of salient stimuli during sound orientation behavior.
Yufei Si, Shinya Ito, Alan Litke and David Feldheim
Topic areas: neural coding subcortical processing
Mouse Electrophysiology sound localization
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Locating the source of a specific sound in a complex environment and determining its saliency is critical for survival. A topographic map of auditory space is essential for such perception and has been found in the superior colliculus (SC) of a range of species. We have shown that mice use high-frequency monaural spectral cues and interaural level differences to compute spatial receptive fields (RFs) along the azimuth. Our previous studies used broadband noise as the stimulus to ensure coverage of a wide spectrum of sound frequencies. However, in a naturalistic environment, the auditory stimuli that an animal encounters will have specific spectral and temporal features that may not be present in broadband noise stimuli; even so, the animal can still localize sound sources. It remains unknown whether and how SC neurons respond to and localize naturalistic sounds. We used blind analysis of large-scale in vivo physiological recordings of SC neurons in response to a collection of naturalistic auditory stimuli, including looming and chirping sounds, as well as original and altered versions of ultrasonic pup calls. Our data show that CBA/CaJ mice of both sexes have subsets of SC neurons that respond to each of these stimuli and that a significant fraction of these neurons have spatially localized RFs. In particular, neurons with complementary response patterns may suggest neural computation in the local circuitry, while neurons with responses at late time scales may suggest top-down control from other auditory circuits outside of the SC. Additionally, altering the spectrotemporal structure of the pup call stimulus leads to a change in the spatiotemporal RFs of the pup-call-responsive neurons in the SC. Together, these findings 1) show that SC auditory neurons are able to respond to and localize naturalistic sounds; 2) demonstrate that SC neurons respond to auditory stimuli with a variety of response patterns, some of which suggest neural computation in the local circuitry and top-down control from other auditory circuits outside of the SC; and 3) suggest that computation in the SC may be used to extract spectrotemporal features from naturalistic sounds to perform sound localization.
HiJee Kang and Patrick Kanold
Topic areas: memory and cognition hierarchical organization neural coding
implicit learning auditory cortex calcium imaging memory
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
A key function of the brain is sensory perception in changing and uncertain environments. In audition, fast implicit learning of incoming auditory inputs is a fundamental ingredient of efficient perception and segregation in complex auditory scenes that contain both changing and unchanging elements. However, the neural mechanisms for implicit learning have not been clearly identified. We identify changes in neural responses to randomly re-occurring sounds, considered a mnemonic trace, in auditory cortex (ACtx) to study neural mechanisms for implicit learning. To do so, we present a series of spectro-temporally complex sounds, dynamic random chords (DRCs). DRCs are artificial sound sequences in which, at each time bin (20 ms bins for 1 s), multi-frequency tone pips with varying sound levels (50-90 dB SPL) are generated across frequency bins (10 frequency bins spanning 4-40 kHz on a logarithmic scale). The average sound level of each sound is set at 70 dB SPL. We use CBA x Thy1-GCaMP6s F1 mice to trace response changes of excitatory cells. While each DRC is generated afresh (Random sounds; 20 different tokens), one specific sequence (Target sound) re-appears on random trials 20 times. We play a series of these DRCs in random order to passively listening awake young adult mice while conducting two-photon imaging in three different subregions: thalamocortical layer L4 of the primary auditory field (A1), superficial layer L2/3 of A1, and secondary auditory fields (A2). We quantify whether the Target sound evokes distinctive neural dynamics through its repeated exposure, compared to Random sounds. We also identify whether the ‘learning cells’ are more prevalent among certain cell types, based on their tuning curves, per subregion. We observed a trend of greater adaptation of neural responses to the re-occurring Target sound compared to Random sounds in higher-order regions, seen as a larger response to Random sounds than to the re-occurring Target sound. Our results suggest that higher-order regions of the ACtx are more involved in implicit learning processes. Acknowledgement: This work is funded by U19NS107464.
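As an illustration of the stimulus design, the sketch below generates one DRC token with the stated parameters: 20 ms bins over 1 s, 10 log-spaced frequency bins from 4 to 40 kHz, and levels drawn from 50-90 dB SPL. The sample rate, the amplitude scaling relative to the 70 dB SPL mean, and the absence of onset/offset ramps are simplifying assumptions.

```python
# Sketch: one dynamic random chord (DRC) token, per the abstract's parameters.
import numpy as np

fs = 192_000                          # sample rate high enough for 40 kHz tones (assumed)
bin_dur, n_bins = 0.02, 50            # 20 ms time bins covering 1 s
freqs = np.logspace(np.log10(4e3), np.log10(40e3), 10)   # 10 log-spaced frequency bins
rng = np.random.default_rng(1)

t = np.arange(int(bin_dur * fs)) / fs
chords = []
for _ in range(n_bins):
    levels_db = rng.uniform(50, 90, size=freqs.size)      # level per frequency bin
    amps = 10 ** ((levels_db - 70) / 20)                  # amplitude re: the 70 dB SPL mean
    chord = (amps[:, None] * np.sin(2 * np.pi * freqs[:, None] * t)).sum(axis=0)
    chords.append(chord)

drc = np.concatenate(chords)          # one 1-s DRC token; regenerate per trial for
                                      # "Random" sounds, reuse one fixed token as "Target"
```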
Mark Saddler and Josh McDermott
Topic areas: correlates of behavior/perception neural coding subcortical processing
deep learning auditory nerve phase locking speech recognition voice recognition localization
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Neurons can encode information in the timing of their spikes in addition to their firing rates. The fidelity of spike timing is arguably greatest in the auditory nerve, whose action potentials are phase-locked to the fine-grained temporal structure of sound with sub-millisecond precision. However, the perceptual role of this temporal coding in hearing remains controversial. We investigated the issue with machine learning models optimized to perform real-world hearing tasks using simulated cochleae, asking whether phase locking in a model’s cochlear input was necessary to obtain human-like behavior. We trained deep artificial neural networks to recognize and localize words, voices, and environmental sounds using simulated auditory nerve representations of naturalistic auditory scenes. We manipulated the upper limit of phase locking via the lowpass cutoff in simulated inner hair cells. Networks using high-frequency phase locking replicated key aspects of human auditory behavior: task performance was robust to sound level and background noise. Degrading phase locking impaired performance, but much more so on some tasks than others. Voice recognition and sound localization (in both azimuth and elevation) were substantially impaired. Word recognition was largely left intact, though some phase-locked spike timing was necessary to account for the fluctuating masker benefit and the effects of tone vocoding in normal hearing humans. The results link neural coding to real-world perception and clarify conditions in which prostheses that fail to restore high-fidelity temporal coding (e.g., contemporary cochlear implants) could in principle restore near-normal hearing.
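The key manipulation, limiting phase locking via the inner-hair-cell lowpass cutoff, can be caricatured as rectification followed by lowpass filtering. The filter order and the two example cutoffs below are assumptions for illustration; the authors' simulated cochleae are far more detailed.

```python
# Sketch: lowering an inner-hair-cell lowpass cutoff removes temporal fine
# structure (phase locking) while preserving the slower envelope.
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 100_000
t = np.arange(int(0.1 * fs)) / fs
carrier = np.sin(2 * np.pi * 4000 * t)          # 4 kHz tone at the "basilar membrane"

def ihc_potential(x, cutoff_hz, fs):
    """Half-wave rectification followed by a lowpass that limits phase locking."""
    sos = butter(2, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, np.maximum(x, 0.0))

hi_fidelity = ihc_potential(carrier, cutoff_hz=10_000, fs=fs)  # fine structure retained
degraded = ihc_potential(carrier, cutoff_hz=50, fs=fs)         # envelope only
```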
Keanu Shadron and José Luis Peña
Topic areas: cross-species comparisons neural coding neuroethology/communication subcortical processing
Sound localization Barn owl Development
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In order to accurately perceive the world and respond accordingly, the brain must deal with noise inherent in sensory cues. One strategy is for the brain to learn which cues are reliable across contexts and to rely on those cues in the future. We investigated whether the reliability of binaural spatial cues drives the tuning of neurons that compute sound location in barn owls. Barn owls use the interaural time difference (ITD) to determine azimuthal location. Previous work showed that the signal-to-noise ratio of ITD varies across frequencies in a location-dependent manner, based on the acoustical properties of the head. Thus, for a given location, certain frequencies convey ITD cues more reliably, and neural tuning in the midbrain external nucleus of the inferior colliculus (ICx) reflects this pattern. We hypothesized that if the frequency tuning of ICx is driven by cue reliability across locations, then altering the pattern of cue reliability should adjust frequency tuning. Two barn owls were raised without the facial ruff, which modifies the pattern of ITD reliability. Results indicate that ICx neurons tuned to frontal locations respond to lower frequencies than normal, consistent with the change in ITD reliability. Recordings from the lateral shell of the central nucleus of the inferior colliculus (ICCls), immediately upstream of ICx, show normal tuning to frequencies across the owl’s hearing range. This suggests that the owls’ midbrain still encodes ITD at higher frequencies, but that these frequencies are not used for sound localization. These data suggest that stimulus statistics are used to guide neural tuning.
James Baldassano and Katrina Macleod
Topic areas: neural coding neuroethology/communication
Avian model Neural coding Superior olivary nucleus
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Inhibition is crucial for precise spiking during sensory processing. In the avian auditory brainstem, inhibition stems mostly from the superior olivary nucleus (SON). SON neurons receive excitatory input from the intensity-coding cochlear nucleus angularis (NA) and the coincidence-detecting nucleus laminaris (NL). SON neurons project back to the ipsilateral cochlear nucleus magnocellularis (NM), NA, and NL. A separate SON population projects to the contralateral SON, but whether these distinct populations have different physiological properties is unknown. Previous studies revealed two physiological types: tonic firing and single-spiking. We describe here a third phenotype, a temporally patterned tonic phenotype. In vivo studies have demonstrated at least four firing types, with some neurons able to phase lock. However, it is unclear how in vivo responses correspond with the in vitro responses, as well as whether these cell types correlate with the divergent postsynaptic targets. To investigate how SON neurons respond to more naturalistic, temporally fluctuating inputs, we used in vitro electrophysiology and applied ‘noisy’ current injections that mimic in vivo activity. Sensitivity to fluctuations was measured as a change in firing rate, while reliability was assessed using a shuffled autocorrelogram analysis. Our results showed that single-spiking neurons (n=15) were the most sensitive to temporally modulated input and had the highest reliability; non-temporally patterned tonic neurons (n=22) were the least sensitive to temporally modulated input and more closely resembled integrators; and temporally patterned tonic neurons (n=31) had moderate sensitivity and reliability in their firing. Intracellular labeling of recorded neurons allowed reconstruction of the axonal projections. The projection patterns were strongly related to the noise response types. Single-spiking neurons projected medially, toward the contralateral SON or ascending pathways. Temporally patterned tonic neurons projected ipsilaterally and dorsally in a fiber tract toward NM and NL. Finally, the non-temporally patterned SON neurons projected ipsilaterally via two different fiber tracts, either to NA or to NL and NM. These results suggest that SON neurons have physiological specializations that allow a range of temporal responsivity, consistent with the diversity of in vivo response patterns. The data further suggest that circuit specializations allow the processing of temporal information in functionally distinct pathways.
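For readers unfamiliar with the reliability metric, below is a minimal sketch of a shuffled autocorrelogram on hypothetical spike trains: spikes from repeated presentations of the same frozen noisy current are correlated only across trials, so a central peak above 1 indicates reliable, repeatable spike timing. The bin size, lag range, and normalization follow one common convention and are assumptions here.

```python
# Sketch: shuffled autocorrelogram (SAC) from repeated-trial spike times.
import numpy as np

def sac(trials, bin_s=0.001, max_lag_s=0.05, dur_s=1.0):
    edges = np.arange(0, dur_s + bin_s, bin_s)
    counts = np.array([np.histogram(tr, edges)[0] for tr in trials])
    n_lag = int(max_lag_s / bin_s)
    lags = np.arange(-n_lag, n_lag + 1)
    cc = np.zeros(lags.size)
    for i in range(len(trials)):
        for j in range(len(trials)):
            if i == j:
                continue                       # "shuffled": only across-trial pairs
            full = np.correlate(counts[i], counts[j], mode="full")
            mid = counts.shape[1] - 1
            cc += full[mid - n_lag: mid + n_lag + 1]
    rate = counts.mean() / bin_s               # mean firing rate for normalization
    n_pairs = len(trials) * (len(trials) - 1)
    norm = n_pairs * rate**2 * bin_s * dur_s   # expected chance coincidences
    return lags * bin_s, cc / norm             # ~1 at chance; central peak = reliable

rng = np.random.default_rng(2)
trials = [np.sort(rng.uniform(0, 1.0, 40)) for _ in range(10)]   # fabricated spikes
lags, ci = sac(trials)                         # ci near 1 everywhere for random spikes
```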
Jennifer Lawlor, Melville Wohlgemuth, Cynthia Moss and Kishore Kuchibhotla
Topic areas: subcortical processing
echolocating bat two-photon calcium imaging inferior colliculus tonotopy social calls
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Navigating our everyday world requires parsing stimulus information from constantly evolving sensory flows. The physical properties of stimuli may vary continuously, while behavioral responses are comparatively discrete. For example, humans can readily understand spoken words despite speech spectral properties varying across speakers. How do category-specific representations emerge in the brain? Here, we take advantage of an animal model long studied for its expert auditory sensing: the echolocating bat. The bat auditory system makes use of echoes from ultrasonic vocalizations to determine the identity and location of objects, and uses sound for social interactions with conspecifics. As such, the bat constructs its representation of the external environment using sound, making it a powerful model to investigate the brain’s representation of natural acoustic categories. We developed two-photon calcium imaging in the awake big brown bat, Eptesicus fuscus, to assay the activity of a population of neurons with cellular resolution. We expressed GCaMP6f in excitatory neurons of the inferior colliculus (IC), a central auditory hub, while using a thinned-skull approach to monitor large populations of cells in head-fixed subjects. We assessed functional auditory properties of thousands of neurons in awake, passively listening bats (n=3 bats) by presenting a large stimulus set, including pure tones of varying duration and playbacks of natural calls. We discovered a fine-scaled tonotopy in the superficial layers of the IC shell region, with cells' preferred frequencies increasing along both the caudolateral and rostromedial extents. Using two sets of natural call categories (exemplar ultrasonic social vocalizations and temporally matched echolocation calls), we show that the sampled population is category selective. We found that category-specific cells do not follow the tonotopic gradient but rather form small clusters spread across the IC. Large-scale population decoding reveals sharper boundaries across, rather than within, sound categories, even when stimuli show considerable spectrotemporal variation. Our two-photon calcium imaging in the echolocating bat reveals the relationship between traditionally defined functional auditory features and natural categories of sounds in large neural ensembles with unprecedented spatial fidelity.
Jacob Edwards and Sarah Woolley
Topic areas: speech and language cross-species comparisons neural coding neuroethology/communication
songbird birdsong neural tuning acoustics spectrotemporal temporal communication
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Hearing and speech are foundational to human social communication. Early auditory experience with native language guides vocal learning through the landscape of development and shapes auditory cortical coding for life. Like humans and unlike most other animals, songbirds learn to sing by copying the vocal sounds of adult tutors. As in humans, early vocal learning permanently shapes neural coding and perception of acoustic features. Prior work shows that juvenile songbirds that are reared and tutored by parents of a different species can successfully copy heterospecific song syllables. In contrast, cross-species tutoring experiments suggest that the temporal organization of syllables into sequences is determined by species genetics. Juveniles copy the syllables of their adoptive tutor’s songs, but produce those syllables with the temporal patterning typical of their own species, even when they have never heard conspecific song. We hypothesize that the secondary auditory cortex (caudal nidopallium; NC), where neurons are tuned by vocal learning, contains two populations of neurons: one that encodes syllable acoustics, and one that encodes temporal patterning. To test this hypothesis, we studied song behavior and auditory cortical coding in four species of songbirds with known relatedness and specific differences in song acoustics and temporal organization. Using single-unit electrophysiology, we compared responses of NC neurons to stimuli including natural songs of multiple species, songs with altered temporal structure, and two classes of synthetic sounds that systematically varied in spectrotemporal acoustics and temporal patterning. Across species, NC neurons responded most strongly to conspecific songs. In species that produce songs with complex acoustics, tuning to the spectrotemporal modulations found in conspecific song explained firing rate differences across songs. In species that produce simple songs, tuning to acoustic frequency explained neuronal responses. Analysis of single NC neurons indicated that neurons were sensitive to either acoustics or timing, but rarely both. A subset of neurons was sensitive to the position of a syllable within song, suggesting sensitivity to syllable order. These results indicate that separate subpopulations of secondary auditory cortical neurons process the spectrotemporal acoustics and sequence organization of vocal communication sounds, which could be controlled by distinct cellular mechanisms to shape the vocal learning landscape.
Jarrod Hicks and Josh McDermott
Topic areas: correlates of behavior/perception
sound texture auditory scene analysis streaming sound similarity
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Sound textures are created by the superposition of many similar acoustic events (e.g., rain falling, birds chirping, or people clapping) and are thought to be represented in the auditory system by statistics that summarize acoustic information over time. Real-world auditory scenes frequently contain multiple concurrent textures (as when birds chatter next to a babbling brook), raising the question of whether listeners can “hear out” (i.e., stream) individual textures. We sought to characterize “texture streaming” by asking whether listeners can estimate the number of sound texture sources in an auditory scene. In the first experiment, participants heard auditory scenes composed of one or two real-world textures and judged the number of distinct sound sources. Listeners performed above chance, tending to correctly judge the number of sources in each auditory scene. Inspection of judgments for individual scenes revealed consistent patterns of errors, indicating that particular combinations of textures tended to be mistakenly heard as a single stream. In a follow-up experiment, we asked participants to rate the similarity of texture pairs from the streaming experiment and found similarity ratings to be partially predictive of streaming judgments. However, a substantial portion of the explainable variance in streaming judgments could not be predicted from perceptual similarity of the source textures, suggesting additional (as yet not understood) principles of perceptual organization. Together, these experiments demonstrate the phenomenon of texture streaming—a neglected aspect of auditory scene analysis in which listeners are able to stream concurrent textures in auditory scenes.
Gregory Hamersky, Luke Shaheen and Stephen David
Topic areas: neural coding
Natural Sounds Auditory Streaming Ferrets
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
In everyday hearing, listeners encounter complex auditory scenes containing spectrally overlapping sound sources. Accurate perception requires streaming, i.e., the grouping of sound features into meaningful sources based on statistical regularities in the time and frequency domains. Numerous psychoacoustic studies have described auditory streaming as a perceptual phenomenon, but less is known about its underlying neural basis. The current study recorded single-unit activity in the auditory cortex (AC) of awake ferrets. Passively listening ferrets were presented with a series of natural sound excerpts from two broad, ethologically relevant categories: textures (backgrounds, BGs) and transients (foregrounds, FGs). FG and BG stimuli were presented individually and concurrently. Neural responses to overlapping pairs (BG+FG) were modeled as linear weighted combinations of responses to the isolated BG and FG. Model weights showed that BG+FG responses were consistently suppressed relative to responses to the individual sounds. Perceptually, FG stimuli typically pop out from BGs. Surprisingly, the linear model showed stronger FG-specific suppression than BG suppression in the neural responses. To investigate the mechanism supporting FG suppression, spectral and temporal statistical features of each sound were regressed against the degree to which a sound was suppressed and the degree to which it suppressed other sounds. Sounds with high temporal stationarity, which is more prominent in BGs, were likely to suppress a concurrent sound and less likely to be suppressed. Conversely, a sound with high stationarity on the spectral axis, more common in FG stimuli, was less likely to suppress a paired sound and more likely to be suppressed itself. Ongoing experiments are recording neural activity during presentation of synthetic versions of the natural sounds, in which the temporal and spectral statistics that drive stationarity are either matched or shuffled relative to the original sounds. These data will directly test the role of these spectro-temporal statistics in neural responses to concurrent stimuli.
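The core model is simple enough to state in a few lines: the response to the BG+FG pair is regressed on the two isolated responses, and the fitted weights quantify suppression. The sketch below uses fabricated PSTHs, constructed so the FG weight comes out small, to illustrate how FG-specific suppression would appear in the fit.

```python
# Sketch: linear weighted-combination model of concurrent-sound responses,
# r_both ~ w_bg * r_bg + w_fg * r_fg, fit by least squares.
import numpy as np

rng = np.random.default_rng(3)
r_bg = rng.random(200)                          # PSTH to background alone (fabricated)
r_fg = rng.random(200)                          # PSTH to foreground alone (fabricated)
r_both = 0.8 * r_bg + 0.4 * r_fg + 0.05 * rng.standard_normal(200)

X = np.column_stack([r_bg, r_fg])
(w_bg, w_fg), *_ = np.linalg.lstsq(X, r_both, rcond=None)
print(f"w_bg = {w_bg:.2f}, w_fg = {w_fg:.2f}")  # w_fg < w_bg -> FG-specific suppression
```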
Natsumi Homma and Christoph Schreiner
Topic areas: neural coding
primary auditory cortex rat signal-in-noise processing ventral auditory field vocalization
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In the auditory cortex, groups of neurons show reliable synchronous activity (coordinated neuronal ensembles; cNEs). How individual neurons fire together to encode sensory information is, however, not well understood. Here we examined encoding of vocalizations in the presence of background noise in two core rat auditory cortical fields, primary auditory cortex and the ventral auditory field (Sprague-Dawley rats, female, ~2-5 months). We obtained dense extracellular recordings using multichannel silicon probes and presented short segments of vocalizations embedded in various types of spectrotemporally modulated noise at six different signal-to-noise ratios. cNEs were identified using dimensionality reduction techniques (Lopes-dos-Santos et al., 2013; See et al., 2018), and the ability to decode vocalizations or noise types was estimated with a nearest-neighbor linear decoder (Foffani and Moxon, 2004) for both cNEs and individual cortical neurons. The results suggest that (i) cNEs are more likely than single units to decode vocalization and/or noise, (ii) the decoding accuracy of cNEs is improved compared to that of single units, and (iii) the improvement in cNE decoding is most prominent for low-level noise. These findings indicate that synchronous activity in the cortex could help refine spiking patterns and reduce the effects of noise in vocalization-in-noise processing. In addition, cNEs may contribute to emerging stimulus selectivity in downstream areas, supporting the idea that cNEs are a functional unit that enhances and conveys essential sensory information.
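A minimal sketch of the nearest-neighbor decoding step, in the spirit of the cited Foffani and Moxon (2004) approach: each held-out single-trial response vector is assigned to the stimulus whose mean template is closest in Euclidean distance. The toy data and leave-one-out scheme are assumptions for illustration.

```python
# Sketch: leave-one-out nearest-neighbor (template-matching) decoder.
import numpy as np

def nn_decode(responses, labels):
    """responses: (n_trials, n_bins); labels: (n_trials,). Returns accuracy."""
    correct = 0
    classes = np.unique(labels)
    for i in range(len(responses)):
        keep = np.arange(len(responses)) != i             # leave one trial out
        templates = np.vstack([responses[keep & (labels == c)].mean(axis=0)
                               for c in classes])
        d = np.linalg.norm(templates - responses[i], axis=1)
        correct += classes[np.argmin(d)] == labels[i]     # nearest template wins
    return correct / len(responses)

rng = np.random.default_rng(4)
labels = np.repeat(np.arange(3), 20)                      # 3 vocalizations x 20 trials
responses = rng.poisson(3, (60, 50)) + labels[:, None]    # separable toy PSTHs
print(f"decoding accuracy: {nn_decode(responses, labels):.2f}")
# Replacing single-unit rows with cNE activation vectors would compare the two,
# as in the abstract.
```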
Malinda McPherson, Dana Boebinger, Nancy Kanwisher and Josh McDermott
Topic areas: correlates of behavior/perception
pitch fMRI harmonic
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Sounds like speech and music are generally harmonic, with frequency components that are integer multiples of a common fundamental frequency (f0). Information in harmonic sounds is believed to be extracted in two ways: via individual frequency components, and via estimation of the fundamental frequency. The first mechanism is applicable irrespective of whether sounds are harmonic. We compared brain responses evoked by harmonic melodies to those from otherwise identical sounds whose frequency components had been jittered to be inharmonic (inconsistent with a single f0). Cortical responses were similar for harmonic and inharmonic melodies when tones were presented in quiet. However, the same regions of cortex had higher responses to harmonic melodies when tones were presented in noise, a manipulation known to cause listeners to rely more on estimates of f0. The results suggest that mechanisms selective to harmonic sounds are present in the auditory cortex but are co-located with non-selective mechanisms that are not sensitive to harmonic structure.
Roland Ferger, Keanu Shadron, Brian Fischer and José Luis Peña
Topic areas: correlates of behavior/perception neural coding subcortical processing
Sound localization Frequency Tuning Midbrain Barn Owl Modelling Cue Reliability Interaural Time Difference
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
All animals face the challenging task of transforming natural sensory information into percepts that can drive adaptive behavior. Due to the inherently noisy nature of almost all sensory cues, it is crucial to select or emphasize the most meaningful and reliable cues. While meaningfulness often varies based on the task or immediate stimulus statistics, some aspects of sensory cue reliability for detection and coding are determined by prevalent physical and/or physiological conditions. Previous work from our lab demonstrated that neurons in the barn owl's external nucleus of the inferior colliculus (ICX), which are sensitive to sound direction cues such as interaural time (ITD) and level (ILD) differences, exhibit frequency tuning that matches anticipated ITD statistics. In other words, ICX neurons, which integrate across separate frequency channels from the immediately upstream nucleus, respond most strongly to the frequencies that most reliably convey ITD cues for their respective sound direction. Notably, this is a function of the sound direction and the physical properties of the head. More recent work has addressed the development of this frequency tuning by changing the physical properties underlying cue reliability, and found that the frequency tuning in ICX can indeed be changed in a way consistent with its relationship to ITD cue reliability. This suggests that the frequency tuning observed in ICX is at least partially dependent on experienced reliability, as opposed to being solely genetically predetermined. In this study, we test the hypothesis that a Hebbian learning model can predict frequency tuning based on the reliability of the ITD cue and experience-dependent plasticity. This follows the idea that inputs from upstream neurons tuned to reliable frequencies are more likely to have high covariance across longer time scales than those from neurons tuned to unreliable frequencies. Hence, under a Hebbian learning rule, the reliable frequency channels are more likely to form stronger connections onto ICX neurons than their unreliable counterparts. A rather simple learning rule like this could explain both the formation of tuning patterns during development and plasticity in adult animals, as suggested by recent preliminary data and similar studies.
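The proposed mechanism reduces to a familiar learning rule, sketched below on synthetic inputs: channels whose activity covaries more with the postsynaptic response (here, the more reliable channels, which share a common ITD-driven signal) end up with the strongest weights under a normalized Hebbian update. The input model, learning rate, and normalization are illustrative assumptions, not the authors' model.

```python
# Sketch: reliable frequency channels win under a normalized Hebbian rule.
import numpy as np

rng = np.random.default_rng(5)
n_channels, n_steps = 20, 5000
reliability = np.linspace(0.2, 1.0, n_channels)   # per-channel ITD cue reliability
w = np.full(n_channels, 1.0 / n_channels)         # initial synaptic weights

for _ in range(n_steps):
    signal = rng.standard_normal()                # shared ITD-driven drive
    pre = reliability * signal + (1 - reliability) * rng.standard_normal(n_channels)
    post = w @ pre                                # model ICX response
    w += 1e-4 * pre * post                        # Hebbian update: dw ~ pre * post
    w = np.clip(w, 0, None) / np.linalg.norm(w)   # keep weights bounded

print(np.round(w, 2))   # weights end up largest for the most reliable channels
```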
Jennifer Mohn, Melissa Baese-Berk and Santiago Jaramillo
Topic areas: speech and language hierarchical organization neural coding
non-primary feature encoding mouse
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The ability to discriminate communication sounds is a key element of social interaction for humans and other animals alike. Communication sounds are often complex and vary in both spectral and temporal content. The neural mechanisms that underlie learning to discriminate complex sounds remain poorly understood. In this study, our goal was to evaluate to what extent the mouse can serve as a model for investigating these mechanisms. We first confirmed that mice can be trained to discriminate between sounds that vary in either spectral (formant transition, FT) or temporal (voice onset time, VOT) features, using frequency-shifted human speech sounds: /ba/, /da/, and /pa/. Then, we sought to understand whether neurons in the auditory cortex of mice are selective to FT, VOT, or both. Additionally, we tested whether these features are represented similarly across auditory cortical regions or whether areas specialize in encoding a single feature. We recorded simultaneously from primary, dorsal, and ventral auditory cortical neurons of passively listening, awake mice while we presented synthesized human speech sounds that varied in FT and VOT. We found populations of cells in all three areas that were selective to only one of the features, either FT or VOT. We also found some cells in all three regions that showed mixed selectivity for FT and VOT. Together, these results suggest that the mouse is an appropriate model for investigating how learning affects the way features of complex sounds are represented in the brain.
Jan Willem de Gee, Zakir Mridha, Matthew Thompson, Kit Jaspe and Matthew McGinley
Topic areas: memory and cognition auditory disorders multisensory processes subcortical processing
Cholinergic Neuromodulation Attention
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Attention is limited in capacity and costly to utilize. Therefore, organisms are driven by ongoing behavioral incentives to choose which stimuli to attend to (selective attention), and how much to attend to them (attentional effort). We previously showed that pupil-linked arousal helps regulate attentional effort to exploit increases in task utility (de Gee et al., 2022). Fluctuations in pupil size at constant luminance track the global arousal state of the brain and the activity of major neuromodulatory systems, including noradrenaline, acetylcholine, and perhaps also serotonin and dopamine. It is currently unknown which of these neuromodulatory systems are responsible for regulating attentional effort. In the attentional effort task, mice lick for a sugar-water reward when detecting unpredictable temporal coherence in an ongoing tone cloud. To detect all weak-coherence signals, mice would need to sustain an infeasibly high level of attentional effort across the sessions. To probe adaptive effort allocation, we switched the sugar-water reward size (droplet volume) between high and low values in blocks of 60 trials. Thus, mice should increase attentional effort during blocks of high reward. We simultaneously recorded pupil size, walking speed, and acetylcholine concentration in auditory cortex via two-photon imaging of GRAB-ACh (a GPCR-based sensor). Mice better detected the coherence in the high versus low reward context. Irrespective of block type, evoked ACh responses were bigger for hits versus false alarms (28.1% increase, p = 0.008). This indicates that phasic cholinergic signaling helps optimize task performance. Importantly, irrespective of outcome, evoked ACh responses were bigger in the high versus low reward blocks (23.8% increase, p = 0.001). These results were not due to differences between conditions in walking or lick statistics. In sum, we find that cholinergic signaling regulates attentional effort in mice. In ongoing work, we are determining the roles of other neuromodulatory systems in mediating these adaptive shifts in cognitive behavior. References: de Gee et al. 2022. Mice regulate their attentional intensity and arousal to exploit increases in task utility. bioRxiv.
Jingwen Li, Vladimir Jovanovic, Mikio Aoi and Cory Miller
Topic areas: correlates of behavior/perception neural coding neuroethology/communication
frontal cortex natural communication marmosets generalized linear model population encoding
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Communication is an interactive behavior involving the active, coordinated exchange of social signals between two or more individuals. Studies exploring the neural basis of communication in the primate brain have traditionally focused on how these signals are processed or produced, leaving many facets of vocal behavior largely unexplored. One challenge has been the limitations of traditional analyses for quantifying the often variable, dynamic nature of vocal behaviors, such as conversational exchanges. Here we recorded the activity of neurons in primate prefrontal and premotor cortex as freely-moving marmosets engaged in natural conversational exchanges, and applied different analysis approaches to determine how these substrates encode the various facets of this vocal behavior. We first applied traditional analysis of spike trains using peri-stimulus time histograms (PSTHs) and observed neurons responding to hearing calls and neurons showing a compensatory decrease when producing calls. To further explicate the encoding strategy of the frontal population in natural communication, we performed a generalized linear model (GLM) based analysis. The model includes external call events, internal behavioral states, and spike history with a sliding window, and gives a continuous readout of spike rate. The predicted spike rate successfully recapitulates the observed PSTHs. From the trained GLM kernels and coefficients, we identified more neurons in the population with a significant change in spike rate for communication events. Furthermore, we found state-related neurons that are modulated by internal behavioral states. Next, we performed dimensionality reduction and clustering analysis on the GLM kernels and coefficients. We observed distinctive clusters in the population playing different functional roles: one cluster that drives call production (firing before the call); one that responds only after the production of antiphonal calls in a conversation; and one that responds to hearing calls. Lastly, we used the GLM framework to decode the animal’s communicative behavior from the neural activity. These results show that the GLM analysis applied here to a continuous natural vocal behavior revealed elements of primate frontal cortex activity that were not evident using more traditional analyses, and suggest that this method may be a powerful tool to better understand the neural basis of communication in the primate brain.
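A minimal sketch of the kind of GLM described, fit on fabricated data: a Poisson regression of binned spike counts on lagged kernels for heard calls, produced calls, and the neuron's own spike history, yielding a continuous predicted spike rate. The bin size, kernel length, and use of sklearn's PoissonRegressor are assumptions, not the authors' implementation.

```python
# Sketch: Poisson GLM with call-event kernels and spike history.
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(6)
T, k = 6000, 20                                   # e.g., 10 ms bins, 200 ms kernels

heard = (rng.random(T) < 0.005).astype(float)     # heard-call onsets (fabricated)
produced = (rng.random(T) < 0.003).astype(float)  # produced-call onsets (fabricated)
spikes = rng.poisson(0.1, T).astype(float)        # binned spike counts (fabricated)

def lagged(x, k):
    """Design matrix of lags 0..k-1 of a 1-D signal, zero-padded at the start."""
    return np.column_stack([np.r_[np.zeros(j), x[:T - j]] for j in range(k)])

# Event kernels for heard/produced calls, plus spike history (lags 1..k-1 only,
# so the model never sees the bin it is predicting).
X = np.hstack([lagged(heard, k), lagged(produced, k), lagged(spikes, k)[:, 1:]])
glm = PoissonRegressor(alpha=1e-3, max_iter=500).fit(X, spikes)
rate = glm.predict(X)                             # continuous readout of spike rate
```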
Prachi Patel, Kiki van der Heijden, Stephan Bickel, Jose Herrero and Nima Mesgarani
Topic areas: correlates of behavior/perception hierarchical organization
sEEG Sound localization in humans Cocktail party spatial attention auditory cortex
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
How the human auditory cortex represents spatially separated simultaneous talkers, and how attention to either talker’s location or voice modulates the neural representations of attended and unattended speech, are unclear. Here, we measured neural responses from electrodes implanted in neurosurgical patients as they performed single-talker and multi-talker speech perception tasks. We found that spatial separation between talkers caused a preferential encoding of the contralateral speech in Heschl’s gyrus, planum temporale, and superior temporal gyrus. Location and spectrotemporal features were encoded in different aspects of the neural response. Specifically, a talker’s location changed the mean response level, while a talker’s spectrotemporal features altered the modulation of the response around its baseline. These components were differentially modulated by attention to either the voice or the location of the talker, which improved population decoding of attended speech features. Attentional modulation by the talker’s voice appeared only in auditory areas with longer latencies, but attentional modulation by location was present throughout. Our results show that spatial multi-talker speech perception relies upon a separable pre-attentive neural representation, which can be further tuned by top-down attention to the talker’s voice or location.
Stephen David, Wanyi Liu, Mateo Lopez Espejo, Kai Lu, Pingbo Yin, Shihab Shamma and Jonathan Fritz
Topic areas: memory and cognition correlates of behavior/perception cross-species comparisons hierarchical organization
auditory ferret frontal cortex behavior 2-AFC Go-NoGo neural representation sound discrimination
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
How are representations of sensory stimuli in frontal cortex (FC) shaped by task demands and reward structure? Ferrets can easily be trained to distinguish target pure tones from distractor broadband noise, using an aversive, conditioned avoidance (CA-GNG) paradigm (Fritz et al., 2003). Responses in FC were highly selective for target tones (NoGo) during active behavior and showed no response to the noise (Go) stimuli (Fritz et al., 2010). To clarify whether this highly selective, gated response to target tones was dependent upon design and reward structure of the behavioral paradigm, ferrets were trained to make the same acoustic discrimination, but now using two other behavioral paradigms. Three ferrets were trained on a positive, appetitive version (POS-GNG) of the Go-NoGo paradigm (David et al., 2012) and three additional ferrets were trained on a 2-alternative forced choice (2AFC) appetitive task. All ferrets learned the different tasks to behavioral criterion. We recorded neuronal activity in dorsal FC of head-fixed ferrets during passive listening to the acoustic stimuli, and also during task-engaged conditions for all three behavioral paradigms. Locations of recordings were neuroanatomically confirmed. We measured single unit responses to tones and noise in CA-GNG, POS-GNG and 2AFC tasks. We observed striking differences in the neural representation in FC in the three behavioral paradigm conditions. Unlike the highly selective response to target stimuli observed for the CA-GNG task, there were responses to either or both tonal targets and noisy distractors in the POS-GNG task. Furthermore, we observed no persistent post-passive responses in POS-GNG, unlike previous results in CA-GNG. The same response pattern was observed in recordings from FC during performance of the 2AFC paradigm. The similarity of neural responses in these two paradigms reflected similarities in behavioral responses. These findings reveal the highly task paradigm-dependent representation of sensory stimuli in FC during discrimination of identical acoustic stimuli and support earlier results in starling forebrain (Gentner and Margoliash, 2003). These studies help us better understand the way in which multiple factors, including task-engagement and attention, task design and motor responses, and reward valence, all contribute to shaping FC neural responses and representation of task-relevant sounds.
Rowan Gargiullo, Megan Zheng, Osama Hussein, Cedric Bowe, Carrissa Morgan, Kaitlyn A. Brooks and Chris C. Rodgers
Topic areas: memory and cognition correlates of behavior/perception
active sound localization mouse behavior freely moving sensorimotor integration auditory cortex hearing loss
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In natural behavior, we actively move our heads, eyes, hands, and bodies to collect sensory information. For instance, people are better able to localize sounds when they move their head while listening, an ability known as active sound localization. This active strategy is especially important for people with cochlear implants or single-sided hearing loss. However, our understanding of active sound localization is limited, because in most auditory studies the head is held still. To understand the sensorimotor strategies and neural circuitry underlying this ability, we have developed a mouse model of active sound localization. Freely moving mice are placed in an arena surrounded by eight speakers, one of which (the "goal") emits a continuous stream of noise bursts. The mouse is rewarded for going to the goal. Mice learned this task, as well as additional task variants that probe spatiotemporal integration by distributing the sound sources, or auditory attention by including distracters. Surgical induction of conductive hearing loss impaired the mice's performance, as did lesions of auditory cortex, indicating a role for central auditory pathways. In ongoing work, we plan to identify the motor strategies freely moving mice use to localize sound, how this ability is directed by a network of interacting brain regions, and how it enables recovery from hearing loss.
Ya-Ping Chen, Patrick Neff, Sabine Leske, Daniel D.E. Wong, Nicole Peter-Siegrist, Jonas Obleser, Tobias Kleinjung, Andrew Dimitrijevic, Sarang Dalal and Nathan Weisz
Topic areas: auditory disorders speech and language correlates of behavior/perception
cochlear implant degraded speech auditory processing multivariate analysis EEG
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Studies have shown that individuals with a cochlear implant (CI) for treating single-sided deafness experience improved speech perception in noise. However, it is not clear how single-sided CI users’ speech perception improves and how the neural representation of speech intelligibility changes over time. In the present longitudinal EEG study, we measured neural activity in response to variously degraded spoken words presented monaurally to the CI and non-CI ears in 10 single-sided CI users and 10 age- and sex-matched individuals with normal hearing. The spoken words were degraded at 9 levels: 3 levels of temporal degradation x 3 levels of spectral degradation. Subjective comprehension ratings of each word were also recorded. The data of single-sided CI participants were collected at four time points: pre-CI surgery, and 3, 6, and 12 months after surgery. We conducted representational similarity analysis (RSA) to characterize single-sided CI users’ improvement in the temporal dynamics and representational patterns of both CI and non-CI ears. After surgery, comprehension of the degraded words improved over time in both ears of single-sided CI users. Moreover, this improvement in both ears was reflected in increased similarity to the neural representational patterns of healthy controls/ears, although in different time windows. In single-sided CI users’ non-CI ears, the increased similarity appeared around 800 ms after stimulus onset, which is also where the peak decoding accuracy of spoken-word intelligibility appeared. In single-sided CI users’ CI ears, however, the increased similarity appeared around 500 ms. The present study implies that auditory cortical speech processing after CI implantation gradually normalizes within months. The CI benefits not only the CI ear but also the non-CI ear. These novel findings highlight the feasibility of tracking neural recovery after auditory input restoration with advanced multivariate analysis methods like RSA.
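For concreteness, here is a minimal RSA sketch on fabricated EEG data: a 9-condition representational dissimilarity matrix (RDM) is computed at each time point, and a patient RDM is correlated with a control RDM, so recovery would appear as rising similarity across sessions. The channel count, correlation-distance RDM, and Spearman comparison are common-convention assumptions, not the authors' exact pipeline.

```python
# Sketch: time-resolved representational similarity analysis (RSA).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(7)
n_cond, n_chan, n_time = 9, 64, 100                # 3 temporal x 3 spectral conditions

def rdm(erp):
    """erp: (n_cond, n_chan) -> vector of condition-pair dissimilarities."""
    return pdist(erp, metric="correlation")

patient = rng.standard_normal((n_cond, n_chan, n_time))   # fabricated evoked data
control = rng.standard_normal((n_cond, n_chan, n_time))

# Similarity of representational geometry at each time point
similarity = np.array([
    spearmanr(rdm(patient[..., t]), rdm(control[..., t]))[0]
    for t in range(n_time)
])  # rising similarity across post-surgery sessions would index recovery
```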
Alexander J. Billig, Sukhbinder Kumar, William Sedley, Phillip E. Gander, Joel I. Berger, Meher Lad, Maria Chait, Hiroto Kawasaki, Christopher Kovach, Matthew A. Howard and Timothy D. Griffiths
Topic areas: memory and cognition correlates of behavior/perception hierarchical organization neural coding
human auditory hippocampus memory intracranial fMRI
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
In addition to supporting declarative memory and navigation, the hippocampus can represent sensory and conceptual spaces (Behrens et al., Neuron 100:490-509). In rodents, hippocampal place cells also form discrete firing fields for particular sound frequencies (Aronov et al., Nature 543:719-722, 2017). To investigate whether not only auditory cortex but also the hippocampus maps auditory features in humans, we generated chords that are perceived on a continuum from "beepy" to "noisy", based on the number of simultaneous components. Unlike tone frequency, this "density" feature is not consistently associated with any physical spatial dimension. In an fMRI experiment, subjects held in mind a 2-s sound of fixed density and adjusted the density of an 8-s sound to match it. In other conditions we removed the memory component (adjustments were made freely with no target density), the adjustment component (button presses were made for odd/even judgments on spoken digits), or both. The design distinguished activity supporting auditory memory from that relating to sound adjustment, while also controlling for sensory and motor factors. Auditory working memory was associated with activity in anterior insula, paracingulate cortex, and inferior frontal gyrus. Density adjustment elicited bilateral hippocampal activity, which did not depend on the subject navigating toward a fixed target density. The density of target sounds could be decoded from multivoxel patterns in non-primary auditory cortex in planum temporale, and also in hippocampus. Intracranial recordings in neurosurgical patients performing a similar task revealed increases in auditory-cortical high-gamma (70-150 Hz) and in hippocampal delta/theta (1-8 Hz) power. A subset of single- and multi-units in both primary auditory cortex and hippocampus were tuned to particular densities, while others showed elevated activity during particular phases of the task (such as when holding the target sound in mind). We demonstrate involvement of a broad network, including auditory cortex and hippocampus, in mental navigation of a non-spatial sound dimension.
Sara Jamali, Stanislas Dehaene, Timo van Kerkoerle and Brice Bathellier
Topic areas: memory and cognition correlates of behavior/perception neural coding
sequence coding context auditory cortex mice adaptation
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The ability to extract temporal regularities at different time scales in sensory inputs, and to detect unexpected deviations from these regularities, is a key cognitive ability. The classical auditory oddball paradigm shows that the brain responds to sequence violations at a local time scale, but such responses also occur under anesthesia and therefore seem pre-attentive. In contrast, recent studies in humans and monkeys suggest that when the violation concerns regularities occurring over longer time scales, responses to the violation appear only in conscious, attentive subjects. To investigate whether local and global sequence violation responses exist in the mouse, we recorded from layers 1 to 5 of the auditory cortex using two-photon calcium imaging while mice passively listened to repetitions of 1-s-long sequences of five tones. The repeated short sequence contained either a single tone (AAAAA) or a local violation at its end (AAAAB). Purely global violations were generated by occasionally presenting the AAAAA sequence in a block where AAAAB was repeated. We found that a population of neurons in the auditory cortex specifically responds to such purely global violations at the end of the AAAAA sequence. Although small, this population contained enough information to predict violations in single trials. A larger fraction of neurons boosted their responses to combinations of local and global violations (AAAAB presented in an AAAAA block). These global responses were resistant to a wide increase of the inter-sequence interval (1.5 s - 25 s), ruling out short-term adaptation as their cause. However, global responses vanished when the difference between the A and B sounds was less salient to the mouse. In anesthetized animals, responses to local violations were reduced compared to awake animals, and the purely global violation responses disappeared. Finally, we show that some clusters of VIP interneurons have sequence-termination responses similar to those of the neurons responding to purely global violations. These results establish that the mouse brain is able to detect global violations in sound sequences in a subgroup of auditory cortex neurons, paving the way for the study of circuit mechanisms underlying long-term temporal regularity detection.
Su Jin Kim and Kishore Kuchibhotla
Topic areas: memory and cognition subcortical processing thalamocortical circuitry/function
sensorimotor learning plasticity overtraining optogenetics calcium imaging
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Sensorimotor learning requires linking sensory input with an action and an outcome. How a new sensorimotor pairing is initially encoded by the brain and then consolidated for long-term use remains elusive. Recent evidence from our lab suggests that inactivation of the auditory cortex (AC) during learning impacts the acquisition of task contingencies, but becomes dispensable after continued training. The mammalian auditory system is organized in a feedforward fashion from the inner ear to the auditory brainstem, the midbrain (inferior colliculus, IC), then the thalamus (medial geniculate body, MGB), and finally the AC. Here, we ask whether the AC ‘tutors’ subcortical circuits during learning, using longitudinal two-photon imaging of the MGB and IC alongside projection-specific optogenetics. To examine this, we trained mice to lick to a pure tone for a water reward (S+) and withhold licking to another tone (S-) to avoid a timeout. We optogenetically inactivated deep AC excitatory neurons (L4-L6; n=2 test stGtACR2, n=3 control SIO-stGtACR2), which project to subcortical auditory structures. Inactivation appears to impact behavioral performance early in learning. This effect was transient and waned across learning, such that after the animal achieved expert performance, inactivating the deep-layer neurons no longer impacted behavioral performance. We are now using two-photon imaging of thalamic axons to monitor the feedforward projections from the MGB to the AC across sensorimotor learning. Our ongoing work aims to understand the nature of the interactions between the AC and subcortical auditory structures across the full extent of learning, from naïve to overtrained.
Neha Hemant Joshi, Wing Yiu Ng, Daniel Duque Doncos, Jonathan Fritz and Shihab Shamma
Topic areas: speech and language correlates of behavior/perception
auditory cortex acoustic scene analysis auditory behavior correlates of auditory perception cocktail party ferret frontal cortex
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Stream segregation of complex sounds is a valuable behavioral ability that humans and animals often exercise effortlessly. Streaming has previously been studied extensively in animals using relatively simple stimuli such as tone sequences and noise segments. Here we report on a stream segregation task using speech, which allowed us to make a more direct cross-species comparison of auditory cortical responses in ferrets and humans. Two ferrets were trained to focus attention on a female speaker in a mixture of two simultaneous (overlapping male and female) speech streams, where both speakers were equally perceptually salient. The animals could reliably detect a target word uttered by the female speaker embedded in the speech mixture, with d-primes of 2.15 and 1.94 for the two animals. The main finding from single-unit recordings in the ferret auditory cortex is that during task performance, the representation of the target female speaker becomes selectively enhanced relative to the male speaker. Thus, neurons tuned to the spectral channels of the female voice become enhanced relative to those dominated by the male. This selective enhancement of the female spectral channels is accompanied by significant suppression of the male-dominated spectral channels. Neurons tuned to spectral channels that are shared between the two speakers remain relatively unchanged. These effects are observed in the primary auditory cortex (A1), but they are considerably amplified in the secondary auditory cortical areas (e.g., dorsal PEG) of the ferrets. The frontal cortex represents only the target female speaker, in that its responses are phase-locked to female-spoken syllables regardless of the male speech. In this presentation, we shall elucidate the characteristic changes at the single-neuron level in the auditory and frontal cortices of the ferret that contribute to these phenomena, and how they differ from analogous physiological results in the human brain, where attentional modulation seems to take place primarily in the secondary auditory cortical area of the superior temporal gyrus (STG) (O’Sullivan, James, et al. "Hierarchical encoding of attended auditory objects in multi-talker speech perception." Neuron 104.6 (2019): 1195-1209).
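The reported d-primes follow the standard signal-detection formula, d' = z(hit rate) - z(false-alarm rate), where z is the inverse normal CDF. The sketch below computes it with illustrative rates, not the animals' actual hit and false-alarm rates.

```python
# Sketch: d-prime from hit and false-alarm rates.
from scipy.stats import norm

def dprime(hit_rate, fa_rate):
    """d' = z(hit rate) - z(false-alarm rate)."""
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Illustrative rates producing a d' in the reported range (~2.1):
print(f"d' = {dprime(0.85, 0.14):.2f}")
```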
Megan Arnold, Rebecca Krall, Callista Chambers, Harry Morford and Ross Williamson
Topic areas: hierarchical organization subcortical processing thalamocortical circuitry/function
Descending Corticofugal Auditory Cortex
Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The auditory pathway comprises a complex series of processing stations dedicated to the processing and interpretation of our sense of sound. Incoming sensory signals traverse an ascending hierarchy from the cochlea to the primary auditory cortex (ACtx) before being propagated brain-wide through networks of excitatory projection neurons to inform distinct behavioral outcomes. These neurons fall into three broad classes: intratelencephalic (IT), extratelencephalic (ET), and corticothalamic (CT). Of these classes, ET cells are unique in that they form the only direct connection between the ACtx and myriad sub-cortical targets. Their unique organizational motifs place them in a privileged position to broadcast signals to multiple downstream targets simultaneously. However, the extent of their axonal collateralization, downstream spatial organization, and upstream monosynaptic connectivity remains unknown. To address these questions, we characterized the input/output circuitry of ACtx ET cells and compared their anatomical organization to that of IT and CT populations. Using intersectional viral tracing strategies involving both Cre- and Flp-recombinase, we drove selective expression of adeno-associated viruses in distinct populations of excitatory projection neurons, allowing us to characterize downstream organization with high spatial resolution and to identify local and long-range synaptic input through monosynaptic rabies tracing. Confirming prior reports, we found that ET neurons collateralize to non-lemniscal regions of the inferior colliculus and thalamus, as well as the lateral amygdala. Notably, ET cells that collateralize to the thalamus primarily originated from deep layer 6, rather than being evenly distributed across layers 5 and 6. Monosynaptic rabies tracing demonstrated widespread synaptic inputs to ET, IT, and CT populations. All three excitatory projection types received the bulk of their input (in varying degrees) from sensory cortices, including somatosensory and visual cortex, as well as higher-order regions in parietal and frontal cortex. Our ongoing experiments are focused on extending these findings to distinct ET organizational motifs. This work will provide an anatomical foundation for understanding how brain-wide interactions between distinct areas cooperate to orchestrate sensory perception, guide behavior, and become disrupted in perceptual and neurological disorders.
Gunnar L Quass, Meike M Rogalla, Alexander N Ford and Pierre F Apostolides
Topic areas: correlates of behavior/perception neural coding subcortical processing
Subcortical Processing Population Coding Inferior Colliculus
Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Active listening requires not only identifying sound features but also predicting their behavioral relevance. Behaviorally relevant representations are well documented in the auditory cortex, but whether similar activity arises earlier in the central hierarchy remains debated. The dorsal and external nuclei of the inferior colliculus (shell IC) are midbrain circuits that receive a variety of acoustic, multi-sensory, and neuromodulatory signals (Gruters & Groh 2012). We tested whether behavior and/or outcome signals are present in the shell IC during an auditory task using Ca2+ imaging, machine learning, and a reward-based discrimination task. We expressed the Ca2+ indicator GCaMP6f in shell IC neurons of 7 CBA/C57 Bl-6J mice. Animals were trained to report the presence or absence of amplitude modulation in a bandpass noise stimulus using a GO/NOGO paradigm. We used 2-photon microscopy to record neural activity as mice performed the task; modulation depth was varied to obtain psychometric functions. We analyzed the population activity at various epochs during the trial, and used a support vector machine (SVM) classifier and principal component analysis (PCA) to predict the trial outcome from neural population activity. We used the fluorescence data from shell IC neurons to train the SVM to predict the outcome of each trial (hit, miss, false alarm, correct rejection). Classification accuracy was highest after sound offset. However, significant classification was achieved even using activity from the first or second half of the stimulus. A similar accuracy was achieved when the SVM was trained on activity occurring prior to the mice’s behavioral response. The PCA data show a persistent processing difference for the same sounds on trials with different outcomes, lasting for several seconds. We further found significant task modulation of sound responses. Collectively, our data argue that neural activity in the auditory midbrain reflects a mixed selectivity of predictive and feedback information about behavioral outcome in response to behaviorally relevant sound features. Supported by NIH R01DC019090, The Whitehall Foundation, and the Hearing Health Foundation.
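A minimal sketch of the outcome-decoding step on fabricated data: a cross-validated linear SVM predicting the four trial outcomes from trial-wise population fluorescence. The scaling, kernel, and cross-validation scheme are illustrative assumptions; repeating the fit on activity restricted to different trial epochs asks when outcome information first appears, as in the abstract.

```python
# Sketch: decode trial outcome from population fluorescence with a linear SVM.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(8)
n_trials, n_neurons = 200, 150
outcomes = rng.integers(0, 4, n_trials)            # hit / miss / FA / CR codes
F = rng.standard_normal((n_trials, n_neurons)) + outcomes[:, None] * 0.2

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
acc = cross_val_score(clf, F, outcomes, cv=5)
print(f"decoding accuracy: {acc.mean():.2f} +/- {acc.std():.2f}")
```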
Keith J. Kaufman, Rebecca F. Krall, Megan P. Arnold, Tomas Suarez Omedas and Ross S. Williamson
Topic areas: correlates of behavior/perception neural coding
Auditory cortex Arousal Behavioral state Auditory processing Neural coding | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Stimulus-independent neural activity associated with arousal state, as indexed by pupil diameter, varies continuously and influences membrane potentials, cortical state, and sensory processing. Previous studies have shown that the response strength, reliability, and tuning properties of layer (L) 2/3 neurons in auditory cortex (ACtx) are modulated by arousal. The processing of acoustic information recruits a diverse set of excitatory neurons that span all cortical laminae. These excitatory neurons can be broadly categorized as intratelencephalic (IT), extratelencephalic (ET), or corticothalamic (CT), and are distinct in terms of their anatomy, morphology, and intrinsic and synaptic properties. These marked differences likely give rise to cell-type-specific, state-dependent changes in sensory coding. We combined two-photon calcium imaging and full-face videography in awake mice to investigate how arousal regulates the response properties of L2/3 IT (N=2; n=1119), L5 IT (N=6; n=769), ET (N=11; n=698), and CT (N=3; n=384) neurons in ACtx. We first analyzed the amplitude and reliability of neural responses to pure tones and found that both measures scaled monotonically with arousal in ET and CT neurons, but not in L2/3 and L5 IT neurons. We then computed the shared response variability across trials. While L5 IT cells were unmodulated by state, increases in arousal coincided with a reduction in shared variability for all other populations, indicating that neural activity becomes less correlated when animals are more alert. A measure of lifetime sparseness was used to quantify changes in neural response distributions as a function of state. The sparsity of L2/3 IT and ET neurons decreased at higher arousal states, whereas the sparsity of L5 IT cells increased monotonically with arousal and the sparsity of CT cells was lowest at intermediate arousal states. We then tested whether this arousal-dependent modulation degraded sound encoding by employing a statistical neural decoding analysis to predict stimulus identity at each state from the different neural populations. This analysis revealed an inverted-U function whereby intermediate states had the highest decoding accuracy, suggesting that stimulus representations are less faithfully encoded at low and high arousal levels. Together, these findings provide a detailed description of arousal-dependent changes in the sensory coding of specific cortical projection populations.
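The abstract does not specify which sparseness measure was used; one widely used choice is the lifetime sparseness index of Vinje & Gallant (2000), sketched below on toy response vectors.

```python
# Lifetime sparseness (Vinje & Gallant, 2000): 0 for a flat response
# distribution across stimuli, approaching 1 for highly selective neurons.
import numpy as np

def lifetime_sparseness(rates):
    """rates: a neuron's mean responses to N stimuli (non-negative)."""
    r = np.asarray(rates, dtype=float)
    n = r.size
    num = (r.sum() / n) ** 2        # (mean response)^2
    den = (r ** 2).sum() / n        # mean squared response
    return (1.0 - num / den) / (1.0 - 1.0 / n)

print(lifetime_sparseness([5, 0, 0, 0]))   # selective -> 1.0
print(lifetime_sparseness([2, 2, 2, 2]))   # uniform   -> 0.0
```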
Celine Drieu, Ziyi Zhu, Kylie Fuller, Aaron Wang, Sarah Elnozahy and Kishore Kuchibhotla
Topic areas: memory and cognition correlates of behavior/perception
auditory cortex longitudinal two-photon calcium imaging goal-directed sensorimotor learning dimensionality reduction | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Goal-directed learning is canonically considered a slow process with high inter-subject variability. Exploration of the neural mechanisms has therefore focused on identifying dynamics concomitant with these gradual improvements. Recent work, however, used performance in reinforced and non-reinforced ‘probe’ trials to show that goal-directed learning can be dissociated into two behavioral phases: rapid ‘acquisition’ of task contingencies (measured in probe trials), and slower ‘expression’ that reveals the learned content (measured in reinforced trials). To what extent is the auditory cortex (AC) involved in either learning phase? To address this, we trained mice to lick to a tone for water reward (S+) and withhold from licking to another tone (S−) to avoid a timeout. Optogenetic inactivation of the AC significantly impaired both acquisition and expression. Surprisingly, this inactivation-induced deficit gradually waned during expression, arguing for an ephemeral associative and teaching role for the AC rather than one focused on task execution. To determine how the two learning phases are implemented by AC networks, we used longitudinal two-photon calcium imaging of the same large population of excitatory neurons (n=8,235 neurons in 8 mice across 15 days) in layer II/III. We isolated learning-related dynamics by comparing mice learning the task (n=5) to those passively listening to the same tones over the same period (n=3). We used unsupervised low-rank tensor component analysis (TCA) to uncover low-dimensional network dynamics at different timescales. While stimulus-related habituation dominated in passive networks, three striking behavior-driven features emerged in learning networks. First, a population of S+-driven neurons rapidly shifted to firing later in the trial, suggesting a role in reward encoding at the timescale of acquisition. Second, a distinct subset of S−-responsive neurons gained a late-in-trial behavioral inhibition signal that increased gradually, at the timescale of expression. Third, reward-related signals appeared to be transient, ramping up and then gradually waning at expert levels. Thus, the AC plays a default, but temporary, role in goal-directed learning that is mediated by a network that shifts from being largely stimulus-driven to one optimized for behavioral needs.
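A minimal sketch of TCA on a neurons × time × trials activity tensor, using the tensorly library; the rank, tensor dimensions, and random data are illustrative, not the study's parameters.

```python
# Low-rank tensor component analysis (TCA) via non-negative CP
# decomposition of a neurons x time x trials tensor.
import numpy as np
import tensorly as tl
from tensorly.decomposition import non_negative_parafac

rng = np.random.default_rng(1)
data = rng.random((200, 60, 300))          # neurons x timepoints x trials
tensor = tl.tensor(data)

weights, factors = non_negative_parafac(tensor, rank=8, n_iter_max=200)
neuron_f, time_f, trial_f = factors        # one low-dim factor per mode
# trial_f (trials x rank) tracks how each component waxes or wanes over
# sessions, e.g. the transient reward-related components described above.
print(neuron_f.shape, time_f.shape, trial_f.shape)
```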
Aneesh Bal, Patricia Janak and Kishore Kuchibhotla
Topic areas: correlates of behavior/perception neural coding
multi-task learning compositionality latent learning | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Humans are able to learn a wide variety of tasks. Additionally, we can learn new skills over our lives without forgetting old ones, an ability termed “multi-task learning”. Despite these capabilities, it is unclear how populations of neurons in our brains allow for this to happen. To address this, we created an auditory multi-task learning paradigm in mice, wherein we can efficiently access, monitor, and manipulate the auditory processing neurons involved over learning. We formulated two distinct auditory Go-NoGo tasks. First, in a localization task, mice were presented with a 6 kHz pure tone on either the right (target stimulus) or left speaker (foil stimulus). In a second task, the frequency direction task, mice were played either an 8-16 kHz upsweep (target stimulus) or a 16-8 kHz downsweep (foil stimulus). Lastly, in a combined task, mice were presented with either an upsweep or a downsweep on either the right or left speaker; an upsweep on the right speaker was the target stimulus, while all other combinations were foil stimuli. Based on these tasks, we formulated two paradigms. In a skill combination paradigm, mice learned the localization task first, then the frequency direction task, and lastly the combined task. We found that mice (n=3) successfully learned two tasks in sequence, and some mice (n=2) exhibited expert-level performance on the combined task within the first 50 trials of exposure, suggesting an integration of skills across the tasks. In a task decomposition paradigm, mice first completed the combined task, followed by the localization and frequency direction tasks. We found that mice exhibited expert-level performance on the combined task (n=5) and, strikingly, when tested on the individual tasks, exhibited expert levels of performance within the first 50 trials, suggesting latent learning that occurs during the simultaneous learning of multiple tasks. These preliminary results validate two distinct behavioral paradigms for studying transfer learning (skill combination) and simultaneous learning (task decomposition) that can be used for future behavioral studies and neural investigations of the role of the auditory cortex and subcortical auditory structures in learning multiple tasks.
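For clarity, the three task contingencies can be written out explicitly; the stimulus labels below are ours, chosen for illustration, not the authors' code.

```python
# Illustrative encoding of the three Go-NoGo contingencies.
# Stimuli are (sound, speaker side) pairs; "go" = lick, "nogo" = withhold.
LOCALIZATION = {("tone_6kHz", "right"): "go", ("tone_6kHz", "left"): "nogo"}
FREQ_DIRECTION = {("upsweep", "any"): "go", ("downsweep", "any"): "nogo"}
COMBINED = {
    (sweep, side): ("go" if (sweep, side) == ("upsweep", "right") else "nogo")
    for sweep in ("upsweep", "downsweep")
    for side in ("left", "right")
}
print(COMBINED)   # only (upsweep, right) maps to "go"
```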
Kirill Nourski, Mitchell Steinschneider, Ariane Rhone, Hiroto Kawasaki and Matthew Howard
Topic areas: speech and language hierarchical organization
Alpha suppression Gamma activity Intracranial electroencephalography Speech | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Semantic novelty paradigms are useful tools to probe the cortical circuits involved in language processing. Neuropsychiatric disorders, including autism, schizophrenia, and disorders of consciousness, are characterized by aberrant detection of semantic novelty. This investigation took advantage of the superior spatio-temporal resolution of intracranial electroencephalography (iEEG) to examine semantic novelty processing in a large cohort of subjects with comprehensive electrode coverage. Subjects were adult neurosurgical patients (N = 38; 18 women) undergoing chronic invasive monitoring for medically intractable epilepsy. Cortical activity was recorded using depth and subdural electrodes (>5,700 contacts), with extensive coverage of lateral temporal, frontal, parietal, and limbic cortex. Stimuli were monosyllabic words from three semantic categories, presented in auditory target detection tasks. Each task included common words “cat”, “dog”, “five”, “ten”, “red”, and “white” (20 exemplars each), and ten novel words, five of which were in the target category. Cortical activity was measured as event-related band power in the broadband gamma (30-150 Hz) and alpha (8-14 Hz) bands (reflecting feedforward activation and release from feedback inhibition, respectively), and as averaged evoked potentials (AEPs). Effects of semantic novelty were measured as differences between responses to common and novel words. Linear mixed-effects models were used to assess regional and hemispheric differences in responses while accounting for across-subject heterogeneity. Semantic novelty was associated with greater task difficulty, indexed by lower target hit rates and longer reaction times. Responses to novel words (augmented gamma power and more pronounced alpha power suppression) were broadly present in the cortex, particularly in rostral temporal, limbic, and prefrontal areas. Prominent AEP effects were observed in the left anterior temporal lobe. There was a complex pattern of hemispheric asymmetries, with either left- or right-hemisphere bias evident for different brain areas and iEEG frequency bands. This study provides a framework for understanding activation patterns of brain areas involved in the processing of auditory semantic novelty. Novelty effects occur in anterior and medial temporal areas, which serve as global network hubs (Banks et al., bioRxiv 2022.02.06.479292) involved in higher-order sensory and speech processing. The current study helps lay the foundation for further clarifying aberrant speech and language processing in clinical neuropsychiatric populations.
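A minimal sketch of event-related band power in the two bands named above (band-pass filter, Hilbert envelope, baseline-normalized power); the filter order and baseline window are assumptions, not the authors' exact settings.

```python
# Event-related band power: band-pass filter, Hilbert envelope,
# then power in dB relative to a pre-stimulus baseline.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_power(x, fs, lo, hi):
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.abs(hilbert(filtfilt(b, a, x))) ** 2

fs = 1000
t = np.arange(0, 2, 1 / fs)
x = np.random.randn(t.size)                 # stand-in for one iEEG contact
gamma = band_power(x, fs, 30, 150)          # broadband gamma (30-150 Hz)
alpha = band_power(x, fs, 8, 14)            # alpha (8-14 Hz)
baseline = slice(0, int(0.2 * fs))          # first 200 ms as baseline
erbp_gamma = 10 * np.log10(gamma / gamma[baseline].mean())
print(erbp_gamma.shape)
```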
Kathleen Martin, Colin Bredenberg, Jordan Lei, Eero Simoncelli, Cristina Savin and Robert Froemke
Topic areas: correlates of behavior/perception
perceptual learning auditory cortex across-animal variability | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Perceptual learning is associated with altered cortical representations of task-relevant stimuli in trained animals relative to naive ones (Polley et al., 2006; Edeline et al., 1993). While there is substantial variability across animals in the degree of behavioral learning and the associated changes in neural representations, we lack an account of how experiences during learning may drive these differences. We therefore developed an experimental and computational framework for describing how sensory representations change during auditory perceptual learning. Mice were progressively trained to classify tone frequencies as either a single center frequency (11-16 kHz across animals) or non-center by licking left or right, respectively, for a water reward. Discrimination between center and non-center frequencies improved over 10-45 days through multiple phases (N=72). In a subset of animals, we recorded from layer 2/3 excitatory neurons in auditory cortex throughout learning using two-photon imaging (N=14). Despite similar behavioral performance at the end of training, animals exhibited one of two distinct activity profiles in auditory cortex: tuning profiles of excitatory neurons showed either a relative enhancement or a suppression of responses at frequencies reported as the center frequency. These response profiles emerged throughout learning, starting in the first phase, and were only present during behavioral engagement, not during passive listening to the same stimuli. To make sense of this across-animal variability in tuning, we developed a computational model to explore whether animal-specific choice preferences observed during learning could explain the individual variability in neural tuning. We trained a model neural network using reward-dependent Hebbian learning (Williams, 1992) to perform the task and examined whether initial choice preferences (rates of licking right and left), and the resulting reward statistics, are related to the learned neural representations. We found that higher rates of reward in trials with non-center frequencies early in learning led to larger-magnitude responses to the center frequency, a relationship confirmed in the data. Overall, our results suggest that, through its effects on reward statistics and consequent synaptic plasticity, choice preference during early auditory perceptual learning may play a causal role in producing across-animal variability in learned representations.
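A schematic of a reward-modulated Hebbian update in the spirit of Williams (1992); the toy network, reward contingency, and learning rate below are our illustrative choices, not the authors' model.

```python
# Three-factor update: weight change proportional to reward times the
# product of presynaptic input and the post-synaptic "surprise"
# (action minus its expected probability). No reward baseline is
# subtracted here, for brevity.
import numpy as np

rng = np.random.default_rng(2)
n_in, n_out, eta = 20, 2, 0.01
W = rng.normal(scale=0.1, size=(n_out, n_in))

for trial in range(1000):
    x = rng.random(n_in)                      # tone-evoked input pattern
    p = 1 / (1 + np.exp(-W @ x))              # lick-left/right probabilities
    action = (rng.random(n_out) < p) * 1.0    # stochastic choice units
    reward = 1.0 if action[0] == 1 else 0.0   # toy reward contingency
    W += eta * reward * np.outer(action - p, x)
```

In such a rule, early choice biases change which trials are rewarded, and the reward statistics in turn sculpt the learned weights, which is the mechanism the abstract links to across-animal variability.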
Menoua Keshishian, Samuel Thomas, Brian Kingsbury, Serdar Akkol, Stephan Bickel, Ashesh D. Mehta and Nima Mesgarani
Topic areas: speech and language neural coding
speech language neural coding representation deep neural network recurrent neural network automatic speech recognition | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Constructing computational models of spoken language processing in the human auditory system is hindered by the scarcity of neural data with the spatiotemporal resolution necessary for training models of sufficient complexity to accurately explain neural responses to novel stimuli. A common approach to handling this limitation is to train an artificial deep neural network on speech perception tasks using large amounts of data, and to gain insight into brain processes by comparing the model's representations with actual neural representations. There is increasing interest in deep language models trained to predict the next word in a sequence (e.g., GPT-2). A crucial difference between these models and the brain, however, is that the input to the auditory system is a highly variable sound, which such models ignore by operating on the transcript instead. Because of this unrealistic assumption, the similarity between the transformation of sound to meaning in biological and computational neural networks remains underspecified. Here, we chose an RNN-Transducer (RNN-T), an end-to-end automatic speech recognition model, as our computational model of speech perception. We first mapped the RNN-T layer activations to neural activity at electrode locations. The intracranial (ECoG and sEEG) neural data for this analysis were recorded from 15 patients undergoing epilepsy surgery who listened to 30 minutes of speech. Our analysis focused on the auditory cortex, including Heschl’s gyrus, the planum temporale, and the superior temporal gyrus. By comparing the predictability of neural activity at each site from different layers of the network, we grouped neural recording sites by their best-predicting layer and observed that deeper layers of the model better predicted downstream areas in the auditory pathway. To shed light on the mechanisms behind this improved prediction accuracy, we determined the degree of linguistic feature encoding in RNN-T layers from subphonetic to semantic levels. This analysis revealed a hierarchy of language encoding in the model, such that earlier layers are best at predicting phoneme-level information and later ones at word-level information. Together, these two levels of analysis show a progressive encoding of linguistic information across different layers of the network as well as across different regions of the auditory cortex.
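The layer-to-electrode mapping can be sketched as a per-layer ridge regression scored by held-out correlation; the dimensions, regularization strength, and random data below are placeholders.

```python
# Regress each electrode's response onto every model layer's activations,
# then label each site by its best-predicting layer.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
T, n_elec = 5000, 40
layers = [rng.normal(size=(T, 256)) for _ in range(12)]   # layer activations
Y = rng.normal(size=(T, n_elec))                          # neural responses

scores = np.zeros((len(layers), n_elec))
for li, X in enumerate(layers):
    Xtr, Xte, Ytr, Yte = train_test_split(X, Y, test_size=0.2, random_state=0)
    pred = Ridge(alpha=1.0).fit(Xtr, Ytr).predict(Xte)
    for e in range(n_elec):
        scores[li, e] = np.corrcoef(pred[:, e], Yte[:, e])[0, 1]

best_layer = scores.argmax(axis=0)    # one preferred layer per electrode
```

(A real pipeline would split by contiguous time blocks rather than at random, to avoid temporal leakage.)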
Vani G. Rajendran, Luis Prado, Alessio H. Rojas, Manuel Anglada-Tort, Nori Jacoby and Hugo Merchant
Topic areas: memory and cognition correlates of behavior/perception cross-species comparisons
sensorimotor synchronization rhythm perception music beat cross-species tone clouds tapping | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The ability to dance or move to music requires extracting an abstract rhythmic pattern from often continuous sound, projecting this pattern into the future, and precisely timing motor actions in anticipation of sound events that have not yet occurred. Despite music being a central feature of the human experience, very little is known about the dynamics of the brain networks that enable this ability. Recent studies in the macaque have revealed a capacity in this non-human primate species to synchronize predictively to visual and auditory metronomes. Subsequent electrophysiological studies in the macaque are now shedding unprecedented light on the dynamic encoding of rhythmic timing by neural populations in the supplementary (SMA) and pre-supplementary motor areas (pre-SMA) during rhythmic tapping to metronomes. However, synchronization to music involves an additional level of abstraction: discovering a salient rhythmic pattern in spectrotemporally complex sound. Here, we take this next step by training monkeys that are already expert metronome tappers to tap to two kinds of spectrotemporally complex sounds: repeating tone clouds and real music. Additionally, we asked a large sample of human listeners to tap along to the same stimuli. Based on comparisons between human and non-human tappers, we report on the similarities and differences observed during sensorimotor synchronization to complex stimuli, and on the possible implications of these observations for the neural dynamics involved.
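A standard quantity in such tapping studies is the signed tap-beat asynchrony; the sketch below (with simulated tap times) shows how a negative mean asynchrony indicates predictive rather than reactive tapping. The authors' actual analyses are not specified in the abstract.

```python
# Signed asynchrony between each tap and its nearest beat.
import numpy as np

beats = np.arange(0, 30, 0.5)                             # beat times (s), 120 BPM
taps = beats + np.random.normal(-0.03, 0.02, beats.size)  # toy tap times

nearest = beats[np.abs(taps[:, None] - beats[None, :]).argmin(axis=1)]
asynchrony = taps - nearest
print(f"mean asynchrony: {asynchrony.mean() * 1000:.1f} ms")  # < 0 -> predictive
```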
Xiaomin He, Vinay Raghavan and Nima Mesgarani
Topic areas: speech and language correlates of behavior/perception
speech intelligibility SNR auditory attention speech encoding speech perception | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Studies of the neural encoding of speech under adverse listening conditions have shown a selective representation of the target speaker relative to non-target speakers and background noises. However, how the physical degradation of target speech, or its perceptual breakdown, is reflected in neural responses remains unclear. Here, we investigated this question in a multi-talker speech perception scenario with different background noises. We studied how the signal-to-noise ratio (SNR) and intelligibility of target speech change the encoding of the speech envelope in EEG responses. We recruited 14 native English speakers and characterized their speech-in-noise perception using a standard speech intelligibility task. Next, we recorded EEG and pupillometry data while the subjects attended to one of two presented speakers under varying SNRs and background noises. For each subject, neural speech tracking for both the target and non-target speakers was assessed by computing the correlation between the reconstructed and true envelopes using linear decoders. Results show that the accuracy of neural speech tracking is significantly predictable from the intelligibility of the target speech but relatively insensitive to the SNR of the target speech and the type of background noise. We further identified a minimum intelligibility threshold of around 35%, below which neural discrimination of target and non-target speakers was not possible. Furthermore, we found that mean pupil dilation peaks close to this intelligibility threshold. These findings deepen our understanding of how the physical and perceptual properties of speech are encoded in EEG responses, and pave the way toward the detection and enhancement of target speech.
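A minimal sketch of envelope reconstruction with a time-lagged linear (ridge) decoder; the lag range, sampling rate, and regularization are assumptions, and the random data are placeholders.

```python
# Reconstruct the attended-speech envelope from lagged EEG, then score
# tracking by correlating reconstructed and true envelopes.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)
fs, T, n_ch, n_lags = 64, 60 * 64, 32, 16        # 60 s of EEG at 64 Hz
eeg = rng.normal(size=(T, n_ch))
env = rng.random(T)                              # target speech envelope

# Lagged design matrix: 0..250 ms of past EEG per sample
# (np.roll wraps circularly; edge effects are ignored in this sketch).
X = np.hstack([np.roll(eeg, k, axis=0) for k in range(n_lags)])
half = T // 2
dec = Ridge(alpha=10.0).fit(X[:half], env[:half])
r = np.corrcoef(dec.predict(X[half:]), env[half:])[0, 1]
print(f"reconstruction accuracy r = {r:.2f}")
```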
Kelsey L. Anbuhl, Marielisa Diez Castro, Nikki A. Lee and Dan H. Sanes
Topic areas: memory and cognition correlates of behavior/perception
Listening effort Cingulate cortex Auditory cortex gerbil | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Individuals with hearing loss (HL) often exert greater cognitive resources (i.e., listening effort) to understand speech, especially under challenging acoustic conditions. The resulting cognitive fatigue can impede language acquisition and can have long-term negative consequences for quality of life. However, the neural mechanisms that support listening effort are uncertain. Evidence from human studies suggests that the cingulate cortex is engaged under difficult listening conditions and can exert top-down modulation of the auditory cortex (AC). Here, we asked whether the gerbil cingulate cortex (Cg) sends anatomical projections to the AC, and whether it mediates effortful listening. Retrograde and anterograde virus tracers were injected into AC and Cg, respectively, to determine connectivity. To assess effortful listening, an amplitude modulation (AM) rate discrimination task was used, and stimulus parameters (AM rate, sound duration) were varied to adjust the difficulty of listening conditions. Using an appetitive Go-Nogo paradigm, gerbils were trained to discriminate between “Go” stimuli consisting of AM rates (4.5-12 Hz, broadband noise carrier, 100% depth) and a “Nogo” AM stimulus (4 Hz). Trials were clustered into ‘easy’ or ‘hard’ blocks, in which the sound duration was 1 s or 0.25 s, respectively. AM rate discrimination thresholds were determined from psychometric functions. Once asymptotic performance was reached, gerbils were implanted with bilateral Cg cannulae. To determine whether Cg is required for task performance, muscimol was infused bilaterally prior to testing to pharmacologically inactivate Cg, with saline infusions as controls. Cg recordings are currently being obtained to determine whether Cg neurons represent task difficulty. Viral tracing experiments revealed a strong descending projection from Cg to AC. Next, we asked whether locally inactivating Cg impairs perceptual performance. We found that Cg inactivation disrupted performance only for difficult listening conditions: thresholds for the 1 s (‘easy’) blocks remained the same across saline and muscimol conditions (~5 Hz AM), whereas thresholds for the 0.25 s blocks were elevated only in muscimol conditions (saline: ~5.5 Hz AM; muscimol: ~7 Hz AM). Taken together, the results reveal a descending cortical pathway from Cg to AC that mediates perceptual performance during difficult stimulus conditions. This pathway is a plausible circuit that may be undermined by HL.
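Threshold estimation from a psychometric function can be sketched as a logistic fit; the 50% floor and 70%-correct criterion below are illustrative assumptions, not necessarily the authors' procedure, and the data points are invented.

```python
# Fit a logistic psychometric function to proportion correct vs. AM rate
# and read off the rate at a criterion performance level.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    return 0.5 + 0.5 / (1 + np.exp(-k * (x - x0)))    # 50% guessing floor

rates = np.array([4.5, 5.0, 6.0, 8.0, 10.0, 12.0])    # Go AM rates (Hz)
p_correct = np.array([0.52, 0.60, 0.75, 0.88, 0.95, 0.97])
(x0, k), _ = curve_fit(logistic, rates, p_correct, p0=[6.0, 1.0])

criterion = 0.70
threshold = x0 - np.log(0.5 / (criterion - 0.5) - 1) / k   # invert the logistic
print(f"AM rate threshold at {criterion:.0%} correct: {threshold:.2f} Hz")
```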
Derek Nguyen, Sharon Crook and Yi Zhou
Topic areas: neural coding novel technologies
Neuropixels Auditory Cortex Neurophysiology | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
Sensory cortices contain six layers of neurons with diverse morphological and electrical properties. High-density silicon probes (e.g., Neuropixels) are new-generation digital neural probes designed to detect extracellular action potential (EAP) waveforms simultaneously across many cortical and/or subcortical structures in vivo. This new type of probe features a densely spaced electrode arrangement that improves spike-sorting accuracy and cell-type identification. In addition, spatiotemporal EAP waveforms can reflect cell-type-specific morphoelectrical features, such as dendritic backpropagation and capacitive currents. These features aid in understanding the functional roles of individual neurons and local cortical circuits. Here we study neuronal spiking representations of sound features in the marmoset auditory cortex using high-density electrode probes in the awake condition. The data analyses combine EAP waveform characterization, spike-time correlation, and response properties of neurons, yielding multidimensional information on cortical dynamics. The results reveal diverse sound tuning preferences of cortical neurons across cortical layers, and demonstrate the high spatial precision of the silicon probes in separating nearby neurons based on EAP waveforms and spiking dynamics.
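One standard EAP waveform feature used for putative cell-type separation is the trough-to-peak duration; below is a minimal sketch on a toy waveform (the Gaussian spike shape and the narrow/broad cutoffs are illustrative).

```python
# Trough-to-peak duration of a mean extracellular spike waveform.
import numpy as np

def trough_to_peak_ms(waveform, fs):
    """waveform: mean EAP (samples,); fs: sampling rate in Hz."""
    trough = np.argmin(waveform)
    peak = trough + np.argmax(waveform[trough:])
    return (peak - trough) / fs * 1000.0

fs = 30000                                   # typical AP-band sampling rate
t = np.arange(82) / fs
wf = (-np.exp(-((t - 1.0e-3) ** 2) / (0.1e-3) ** 2)
      + 0.4 * np.exp(-((t - 1.6e-3) ** 2) / (0.3e-3) ** 2))  # toy spike
print(f"{trough_to_peak_ms(wf, fs):.2f} ms")  # narrow ~<0.4 ms, broad >0.5 ms
```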
Maansi Desai, Alyssa Field, Jacob Cheek, Malinda Mullet, Elise Rickert, Donise Tran, Rosario DeLeon, William Schraegle, Nancy Nussbaum, Dave Clarke, Elizabeth Tyler-Kabara and Liberty Hamilton
Topic areas: speech and language
intracranial recordings electrocorticography (ECoG) stereoelectroencephalography (sEEG) speech perception development human auditory cortex encoding models | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Intracranial recordings have provided many insights into the neural circuitry of speech perception in adults. Similar research in pediatric populations is rare due to the difficulty of recordings and, in many cases, the inability of younger patient participants to tolerate monotonous experimental sessions. We addressed this gap by determining whether movie trailer stimuli could replace the more typical, less engaging sentence stimuli used to derive auditory receptive fields in children undergoing invasive surgical monitoring for epilepsy. We have previously shown this is possible in healthy participants using electroencephalography, but it has not been demonstrated in invasive recordings. We recorded stereoelectroencephalography (sEEG) from 6 patients (ages 4-19) at Dell Children’s Medical Center in Austin, Texas. Electrode coverage included right-hemisphere or bilateral coverage of auditory and language-related areas. All patients listened to and watched audiovisual movie trailer stimuli. A subset of 3 patients also listened to sentences taken from the TIMIT acoustic-phonetic corpus. We fit linear encoding models to describe the relationship between acoustic and linguistic stimulus features and the high-gamma power of the local field potential (70-150 Hz). Predicting neural activity from phonological features, the spectrogram, and the combination of both feature subspaces demonstrated robust model performance when training on movie trailers (mean r=0.15, max r=0.69) and TIMIT (mean r=0.31, max r=0.66) across bilateral superior temporal gyrus (in two patients), middle temporal gyrus, and insula. To determine whether receptive field models generalized across stimuli, we conducted a cross-prediction analysis, predicting TIMIT responses from movie trailer-derived receptive fields and vice versa. As observed previously, training and testing on the same stimulus type yielded better model performance. Still, using movie trailer stimuli to predict responses to TIMIT and vice versa worked almost as well as predicting from the same stimulus type (r=0.87, p < 0.001 for within- vs. between-stimulus type for TIMIT; r=0.93, p < 0.001 for movie trailers). This suggests that receptive field models derived from the more engaging audiovisual stimuli are similar to those derived from sentences. Taking advantage of this generalizability, we show receptive field results from childhood through adolescence, which has implications for understanding the development of speech processing in the brain.
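The cross-prediction logic can be sketched as follows, with placeholder feature and response matrices standing in for the stimulus features and high-gamma responses; this is a schematic, not the authors' pipeline.

```python
# Fit an encoding model on one stimulus set and evaluate it on the
# other (and vice versa), comparing cross- to within-set accuracy.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
X_mt, Y_mt = rng.normal(size=(4000, 120)), rng.normal(size=(4000, 64))
X_ti, Y_ti = rng.normal(size=(2000, 120)), rng.normal(size=(2000, 64))

def corr_per_electrode(model, X, Y):
    P = model.predict(X)
    return np.array([np.corrcoef(P[:, e], Y[:, e])[0, 1]
                     for e in range(Y.shape[1])])

half = X_ti.shape[0] // 2
within_model = Ridge(alpha=1.0).fit(X_ti[:half], Y_ti[:half])
cross_model = Ridge(alpha=1.0).fit(X_mt, Y_mt)      # trained on trailers
r_within = corr_per_electrode(within_model, X_ti[half:], Y_ti[half:])
r_cross = corr_per_electrode(cross_model, X_ti[half:], Y_ti[half:])
# Correlating r_cross with r_within per electrode quantifies how well
# trailer-derived receptive fields generalize to TIMIT.
```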
Yinghao Li, Prachi Patel, Stephan Bickel, Ashesh Mehta and Nima Mesgarani
Topic areas: memory and cognition correlates of behavior/perception novel technologies
Music EEG ECoG Deep Learning Artificial Neural Network | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The human auditory system is responsive to temporal dependencies that are an integral part of time-series signals such as music. How the musical context at different timescales is encoded in various regions of the human auditory cortex, and how musical expertise affects the encoding of context, remain unclear. To shed light on these questions, we used a transformer neural network, known for its contextualization over layers, to model musical pieces at different timescales. The 12-layer sequence-to-sequence, multi-task transformer model, comprising an encoder and a decoder, was trained on music MIDI data. Transformer features were derived from different layers of the encoder and used to predict neural responses recorded either with scalp or intracranial EEG. Scalp EEG was recorded from 10 musicians and 10 non-musicians, and iEEG data were recorded from 5 neurosurgical patients undergoing epilepsy surgery. All subjects listened to 30 minutes of music consisting of 8 Bach pieces. We applied a non-negative matrix factorization to the features from each layer of the transformer encoder to reduce dimensionality, and used the resulting features to linearly predict the EEG and iEEG responses. We found that prediction correlations increased monotonically over the layers of the transformer encoder for both EEG and iEEG data. This increase in correlation across layers was also larger for musicians than for non-musicians. We further performed k-means clustering of iEEG electrodes based on their correlation-by-layer profiles and identified two major clusters with distinct progressions. The first cluster of electrodes showed increasing correlation that plateaued after layer 5; the electrodes in this cluster were mostly located in TTS, HG, and PT. The second cluster showed increasing correlation that did not plateau before layer 8, with electrodes in the lateral aspect of STG, MTG, and IFG. These findings show the efficacy of deep learning models in revealing the multiscale pattern of music encoding in the human auditory system, demonstrate differences in how musical context is processed in musicians and non-musicians, and suggest a hierarchy in the contextual processing of music across brain regions.
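A compact sketch of the NMF-reduce, predict, and cluster pipeline described above; the dimensionalities, component count, and cluster number are placeholders, not the study's parameters.

```python
# Reduce each transformer layer with NMF, predict iEEG per layer,
# then k-means cluster electrodes by their correlation-vs-layer profiles.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
T, n_elec, n_layers = 3000, 50, 12
Y = rng.random((T, n_elec))                        # iEEG responses
profiles = np.zeros((n_elec, n_layers))
for li in range(n_layers):
    acts = rng.random((T, 512))                    # layer activations (>= 0)
    X = NMF(n_components=20, max_iter=300).fit_transform(acts)
    P = LinearRegression().fit(X[: T // 2], Y[: T // 2]).predict(X[T // 2:])
    for e in range(n_elec):
        profiles[e, li] = np.corrcoef(P[:, e], Y[T // 2:, e])[0, 1]

labels = KMeans(n_clusters=2, n_init=10).fit_predict(profiles)
```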
Ariel Edward Hight, Erin Glennon, Yew-Song Cheng, Julia K. Scarpa, Jonathan D. Neukam, Nicole H. Capach, Michele Insanally, Silvana Valtcheva, Robert C. Froemke and Mario A. Svirsky
Topic areas: speech and language correlates of behavior/perception cross-species comparisons neural coding
neuroplasticity cochlear implants humans rodents auditory cortex iEEG ECoG speech perception psychophysics | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
Neuroplasticity, and the resulting scales of learning and memory, ranges from minuscule synapses to neuron conglomerates, from milliseconds to years, and from sensory detection to complex speech perception. We address challenges in linking neuroplasticity across scales by focusing on cochlear implants (CIs). CIs are clinical auditory neuroprostheses that restore hearing to humans via electrical stimulation of the auditory nerve. Auditory cues provided by CIs are sufficient for speech perception, but asymptotic speech perception abilities are variable and adaptation periods are needed. We study this experimentally tractable adaptation process in adults, in both deafened rats and human CI subjects. (Rodent studies) First, we performed 60-channel intracranial electroencephalography (iEEG) recordings of tone- and CI-evoked potentials across the auditory cortex (ACtx) of normal-hearing and deafened rats with CIs. Spatial correlations of evoked activity indicate cochleotopic encoding of tone frequency; cochleotopy was less clear for CI stimuli. A supervised PCA/LDA decoder predicted, significantly above chance, both stimulus frequency (tones) and stimulated channel (CI) from single trials. Decoding performance across animals suggests that the encoding of CI stimuli is initially degraded compared to acoustic tones. Next, we trained rats to discriminate sound frequency via a self-initiated 2-alternative forced choice (2AFC) task. After ~3 weeks of acoustic training (N=18) or ~1 week of CI training (N=5), rats were able to discriminate tones and CI electrodes (d′ > 1), respectively. Lastly, we performed whole-cell recordings of ACtx neurons and measured excitatory and inhibitory postsynaptic currents (E/IPSCs) in trained vs. untrained animals. Evoked E/IPSC responses were irregular and long-latency prior to training; correlations of E/IPSCs improved significantly after training. (Human studies) We longitudinally tracked auditory psychophysical abilities and speech perception following initial CI activation in human subjects (N=3). Take-home testing was administered every other day for the first ~30 days following activation via a computer tablet loaded with validated tests of temporal acuity (modulation detection test, MDT; and gap detection) and spectral acuity (quick spectral modulation detection test, QSMD). Speech perception was tested periodically in-lab using standard tests of word and sentence recognition. Significant improvements were found in spectral acuity (QSMD) in 2 of 3 CI users, and speech performance improved significantly in all 3 CI users.
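A minimal sketch of a supervised PCA/LDA decoder for single-trial evoked responses; the shapes, class count, and random data are illustrative, not the study's recordings.

```python
# Project single-trial evoked potentials onto principal components,
# then classify stimulus identity with linear discriminant analysis.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n_trials, n_features = 300, 60 * 50     # e.g. 60 channels x 50 time bins
X = rng.normal(size=(n_trials, n_features))   # single-trial iEEG responses
y = rng.integers(0, 8, size=n_trials)         # tone frequency / CI channel

clf = make_pipeline(PCA(n_components=30), LinearDiscriminantAnalysis())
acc = cross_val_score(clf, X, y, cv=5).mean()
print(f"single-trial decoding accuracy: {acc:.2f} (chance = 0.125)")
```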
John Orczyk and Tobias Teichert
Topic areas: memory and cognition neural coding
Memory Tone decoding Echoic memory Machine learning | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The neural substrate of echoic memory (EM) is unclear. One hypothesis holds that EM stores information about past sounds in the absence of delay-period activity, in the form of a ‘negative trace’ of depleted vesicles. However, it remains unclear if and how information can be read out of such an activity-silent state. Here we test whether an activity-silent negative trace resides in A1, and whether it can be reactivated by the subsequent presentation of a non-informative sound. To test this hypothesis, we performed single-trial decoding of sound-evoked local field potentials (LFPs) recorded from a semi-chronically implanted 96-channel electrode array, with 45 electrodes covering the entire tonotopic map of primary auditory cortex in one macaque monkey. Each trial consisted of one of 21 pure tone pips of different frequencies, spanning about 7.5 octaves, followed by an identical white noise (WN) burst 350-450 ms after the tone. We first decoded tone identity from LFPs evoked by the tones themselves, using a support vector machine (SVM) classifier with a linear kernel and 5-fold cross-validation applied to data in a 10-ms sliding window. Decoding accuracy peaked at over 40% correct in the window 25-35 ms after tone onset and returned to chance levels around 400 ms later. In line with the reactivation hypothesis, the subsequent WN burst significantly boosted decoding accuracy compared to trials in which it was omitted (15.9 ± 1.1% vs. 10.4 ± 1.2%, p < 0.01). In line with the notion of a ‘negative’ trace, classification performance was below chance when classifiers trained on tone-evoked responses were applied to data from the subsequent WN-evoked response, and vice versa. In line with the short-lived nature of echoic memory, our findings suggest that decoding accuracy declines rather quickly with delay and is close to chance if the delay between tone and reactivation stimulus exceeds 1 second. In summary, our findings show that information in A1 about prior stimuli persists in an activity-silent ‘negative’ trace that can be reactivated for a limited amount of time by a subsequent non-informative stimulus.
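The key cross-temporal comparison, training on the tone window and testing on the WN window, can be sketched as follows. The data here are random placeholders (so both scores will sit near chance); below-chance transfer on real data would be the signature of a ‘negative’ trace.

```python
# Time-resolved decoding with cross-temporal generalization.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n_trials, n_chan, n_bins = 840, 45, 100        # 21 tones x 40 reps, 10-ms bins
lfp = rng.normal(size=(n_trials, n_chan, n_bins))
y = np.repeat(np.arange(21), 40)               # tone identity per trial

def window(data, b):                           # features from one time bin
    return data[:, :, b]

# Within-window accuracy (5-fold CV) in an early tone-evoked bin.
within = cross_val_score(SVC(kernel="linear"), window(lfp, 3), y, cv=5).mean()

# Cross-temporal transfer: fit on the tone window, test on the WN window.
clf = SVC(kernel="linear").fit(window(lfp, 3), y)
transfer = (clf.predict(window(lfp, 40)) == y).mean()
print(within, transfer)                        # compare both to chance (1/21)
```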
Erica Shook, Ching Fang, Justin Buck and Guillermo Horga
Topic areas: speech and language hierarchical organization
Deep Neural Networks Predictive Coding Sensory Feedback Noise Robustness | Fri, 11/11 4:00PM - 6:00PM | Posters 2
Abstract
The human auditory system is robust to many types of corrupting noise. However, the neural mechanism that drives this robustness is unclear and has rarely been explored in large-scale auditory network models. There is converging evidence from the visual neuroscience and computer vision literatures that feedback connections play an important role in making biological and artificial networks robust to noise, and there is evidence of substantial feedback connectivity in the human auditory cortex. Here, we use a recently introduced predictive coding scheme to augment a feedforward deep neural network trained to identify speech in corrupting noise. We find that the introduction of feedback connections improves speech identification across several types of corrupting noise. An analysis of network activity showed that the feedback connections denoise internal representations, and that this denoising drives the performance improvement. We also find that the extent of the improvement depends on which layers are augmented with feedback connections. Overall, this work demonstrates that increasing the biological realism of a deep neural network model of the auditory system improves robustness to corrupting noise, a key feature of human audition. Furthermore, our network model of auditory predictive coding provides a testbed for hypotheses regarding the dynamics of auditory representations.
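The core predictive coding intuition, that feedback iteratively refines a representation until it explains a noisy input, can be sketched with a linear generative model; this toy is ours, not the authors' network.

```python
# Iteratively refine a latent code r so that a feedback (generative)
# matrix W reconstructs the noisy input: r <- r + eta * W.T @ (x - W @ r).
import numpy as np

rng = np.random.default_rng(9)
d_in, d_latent, eta = 100, 20, 0.05
W = rng.normal(scale=1 / np.sqrt(d_latent), size=(d_in, d_latent))

r_true = rng.normal(size=d_latent)
x = W @ r_true + 0.5 * rng.normal(size=d_in)   # corrupted input

r = np.zeros(d_latent)
for _ in range(100):
    err = x - W @ r                            # top-down prediction error
    r += eta * W.T @ err                       # error-driven inference step

print(np.corrcoef(r, r_true)[0, 1])            # recovered vs. true latent
```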
Keshov Sharma, Mark Diltz, Theodore Lincoln and Lizabeth Romanski
Topic areas: correlates of behavior/perception multisensory processes neural coding neuroethology/communication
Prefrontal Cortex Primate Multisensory Faces Vocalizations Expression Identity Audiovisual Neural Coding | Fri, 11/11 10:15AM - 12:15PM | Posters 1
Abstract
The ventrolateral prefrontal cortex (VLPFC) is robustly active during the perception and integration of vocalizations and faces, suggesting it has a critical role in social communication. While categorical features of social stimuli, like identity and expression, have been shown to drive single-unit and population activity in the temporal lobe (Yang and Freiwald, 2021; Gothard et al., 2007), the impact of these features on VLPFC neuronal activity is unclear. Additionally, since single VLPFC neurons are largely multisensory, neuronal responses to faces or vocalizations in isolation do not reflect the population activity occurring during perception of dynamic communication stimuli. Thus, we recorded neural responses in macaque VLPFC to naturalistic audiovisual stimuli and determined whether neurons were driven by identity or expression. Recordings were made using multielectrode arrays while macaques viewed dynamic audiovisual movie stimuli that included agonistic (barks, screams), affiliative (coos), and neutral-valence (grunts) vocalizations. Comparison of the neural response to baseline indicated that more than half (285/406) of the recorded population was responsive to the stimuli. Analysis of single-neuron responses indicated that, on average, neurons did not exhibit strong selectivity for particular stimulus exemplars (0.41 ± 0.13), or for the identity (0.22 ± 0.13) or the vocalization type (expression) (0.22 ± 0.13) of face-vocalization pairs. Although a two-way ANOVA found main effects of identity, expression, or their interaction in the firing rates of 111 single neurons (n=285, p < 0.05), the mean decoding accuracies across the 285 single units for identity (0.38 ± 0.06) and expression (0.38 ± 0.05) were not appreciably higher than chance (0.33). However, when analyzed as a pseudopopulation, decoding accuracy increased as a function of population size for both identity (0.80 at n=280) and expression (0.64 at n=280). Principal component analysis of mean population activity across time revealed that population responses to the same identity followed similar trajectories in the response space, facilitating segregation from other identities. Our results suggest that identity is a critical feature of social stimuli that dictates the structure of population activity in the VLPFC during social communication.
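The population-size analysis can be sketched by decoding from random subsamples of the pseudopopulation; the sizes, trial counts, and random data below are placeholders, not the recorded responses.

```python
# Decoding accuracy as a function of pseudopopulation size: repeatedly
# subsample neurons, decode identity (or expression), and track how
# accuracy grows with the number of units included.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(10)
n_trials, n_neurons = 240, 285
X = rng.normal(size=(n_trials, n_neurons))   # pseudopopulation firing rates
y = rng.integers(0, 3, size=n_trials)        # 3 identities (chance = 0.33)

for size in (10, 40, 160, 280):
    accs = []
    for _ in range(20):                      # average over random subsamples
        cells = rng.choice(n_neurons, size, replace=False)
        accs.append(cross_val_score(SVC(kernel="linear"),
                                    X[:, cells], y, cv=5).mean())
    print(size, np.mean(accs))
```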