Abstract Browser 2021
Lizabeth Romanski
Topic areas: neuroethology/communication, neural coding, multisensory processes
Thursday, 10/22 10:00AM - 11:00AM | Keynote
Abstract
Informal discussion to follow at 11:15 am EDT (GMT-4) on Zoom (link below).
Srivatsun Sadagopan
Topic areas: correlates of behavior/perception, memory and cognition
Friday, 11/5 10:00AM - 10:30AM | Young Investigator Spotlight
Abstract
The auditory system effortlessly recognizes complex sounds such as speech or animal vocalizations despite immense variability in their production and additional variability imposed by the listening environment. Recent imaging and electrophysiological data from human subjects indicate that hierarchical computations in auditory cortex underlie such robust sound recognition. However, the rationale for how higher auditory processing stages must represent information, the computations performed in primary (A1) and subsequent auditory cortical areas to achieve robust recognition, as well as the underlying circuit mechanisms, remain largely unknown. Our central premise is that to reveal these mechanisms, we must engage the auditory cortex with complex and behaviorally relevant sounds. Using animal vocalizations (calls) as a model of such sounds, I will describe our lab’s approach to understanding cortical circuits underlying robust call recognition. First, I will briefly outline a hierarchical computational model that learns informative features of intermediate complexity from spectrotemporally dense input representations to enable efficient categorization. Supporting this model, I will describe electrophysiological data recorded from awake guinea pigs demonstrating that the transformation from a dense spectral-content-based representation to a sparse feature-based representation occurs in the superficial layers of A1. Next, I will describe ongoing work in which we are extending this model to achieve noise-invariant categorization by incorporating biologically feasible gain-control mechanisms. I will show that such a model approaches the behavioral performance of animals engaged in call categorization tasks. Finally, I will discuss future directions for utilizing this framework to further understand complex sound processing in normal and hearing-impaired subjects. Informal discussion to follow at 11:15 am EDT (GMT-4) on Zoom (link below).
Kasia M. Bieszczad
Topic areas: hierarchical organization, neural coding, thalamocortical circuitry/function
Friday, 11/5 12:15PM - 12:45PM | Young Investigator Spotlight
Abstract
Advances in modern molecular neuroscience have shown that mechanisms of regulation above the genome—or "epigenetic" mechanisms, like chromatin modification and remodeling—are important for learning and behavior. In adult brains, they set a permissive state for activity-dependent gene expression at the foundation of lasting synaptic change and memory, which my lab has investigated in the auditory system. In this talk, I will highlight the chromatin modifier histone deacetylase 3 (HDAC3), which has been called a "molecular brake pad" on memory formation because it normally obstructs the activity-dependent transcription required for long-term memory. Blocking HDAC3 in adult animals learning about the behavioral relevance of sound "releases the brakes" on physiological plasticity in the auditory cortex, allowing it to remodel in an unusually robust and multidimensional way by expanding the cortical representation of training sound features while also "tuning in" those receptive fields to respond to training sound cues more selectively. Furthermore, neural changes are mirrored in behavior: the same animals are also more likely to selectively respond to the precise sound features of trained cues (vs. novel sound cues). This brain-to-behavior relationship of increased sound cue-selectivity mediated by HDAC3 has been corroborated in several studies, extending also to subcortical auditory processing and to tasks that require discrimination of simple and complex acoustic signals or inhibitory associations, as in tone-cued extinction. HDAC3 appears to facilitate sound cue-selective neural and behavioral responsivity in an experience-dependent manner, opening the door to investigating epigenetic processes that control the dynamics of the auditory system from genes to behavior. Informal discussion to follow at 3:00 pm EDT (GMT-4) in Gathertown, Discussion Area 1 (link below).
Jonas Obleser, Charlie Schroeder, Molly Henry, Christoph Kayser, Noelle O'Connell
Topic areas: memory and cognition, neural coding, hierarchical organization
Thursday, 11/4 3:00PM - 4:00PM | Special Presentation
Abstract
Peter Lakatos passed away earlier this year. This symposium will present a tour d'horizon of his pioneering work and its impact on auditory neuroscience. At the end of the hour, we will be raising a glass in Peter's memory. Do prepare your favourite beverage and feel free to turn on your video for a communal "Cheers!"
Pradeep Dheerendra, Nicolas Barascud, Sukhbinder Kumar, Tobias Overath and Timothy D Griffiths
Topic areas: correlates of behavior/perception
Auditory object, change detection, statistical learning
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Auditory object analysis requires the fundamental perceptual process of detecting boundaries between auditory objects. However, the dynamics underlying the identification of discontinuities at object boundaries are not known. We investigated the cortical dynamics underlying this process by employing a synthetic stimulus composed of frequency-modulated ramps known as "acoustic textures", where boundaries were created by changing the underlying spectro-temporal coherence. We collected magnetoencephalographic (MEG) data from 14 subjects in a 275-channel CTF scanner. We observed a very slow (less than 1 Hz) drift in the neuromagnetic signal that started 430 ms after the boundary between textures and lasted for 1330 ms before decaying to the baseline (no-boundary) condition. This drift response was source-localized bilaterally to Heschl's gyrus, which a previous BOLD study (Overath et al., 2010) showed to be involved in the detection of auditory object boundaries. Time-frequency analysis demonstrated suppression in the alpha and beta bands after the drift signal.
Chenggang Chen and Xiaoqin Wang
Topic areas: neural coding
Sound Localization, Stimulus Context, Nonhuman Primate
Thu, 11/4 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
Responses of neurons in auditory cortex are influenced by stimulus context. Compared with spectral and temporal contextual effects, much less is known about spatial contextual effects. In this study, we explored how spatial contextual modulations evolve over time by stimulating neurons in awake marmoset auditory cortex with sequences of sounds delivered either randomly from various spatial locations (equal probability mode) or repeatedly from a single location (high probability mode). To our surprise, instead of inducing adaptation as expected from the well-documented stimulus-specific adaptation (SSA) literature, repetitive stimulation in the high probability mode from spatial locations away from the center of a neuron's spatial receptive field evoked lasting facilitation, observed in both extracellular and intracellular recordings from single neurons in auditory cortex. Nearly half of the sampled neuronal population exhibited this spatial facilitation, irrespective of stimulus type and visibility of the test speaker. Facilitation of longer duration occurred when the repetitive stimulation was delivered from speakers whose firing rates ranked lower than the best speaker's firing rate under the equal probability mode. The extent of the facilitation decreased with decreasing presentation probability of the test speaker. Interestingly, the induced facilitation did not change the spatial tuning selectivity, tuning preference, or spontaneous firing rate of the tested neurons. Taken together, our findings reveal a location-specific facilitation (LSF), rather than SSA, in response to repetitively presented sound stimuli, which has not previously been observed in auditory cortex. This form of spatial contextual modulation may play an important role in supporting functions such as auditory streaming and segregation.
Mohsen Alavash, Malte Wöstmann and Jonas Obleser
Topic areas: memory and cognition, correlates of behavior/perception, novel technologies
connectivity, neural oscillations, auditory attention, source EEG
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Recent advances in network neuroscience suggest that human brain resting-state functional connectivity is driven by bursts of high-amplitude co-fluctuations of hemodynamic responses measured using fMRI (Faskowitz et al., 2020). We here build on these new insights and ask (1) whether connectivity of intrinsic neural oscillations manifests similar high-amplitude co-fluctuations, and (2) whether and how these connectivity events unfold during the deployment of auditory attention. We reanalyzed two recently published EEG data sets, one of resting state (N = 154; Alavash, Tune, Obleser, 2021) and one of an auditory attention task (N = 33; Wöstmann, Alavash, Obleser, 2019). During the task, an auditory spatial cue asked participants to attend to one of two concurrent tone streams and judge the pitch difference (increase/decrease) in the target stream. We source-localized the EEG and unwrapped power-envelope correlations across time and cortex within the alpha and beta frequency ranges. This revealed connectivity events at sub-second resolution during which high-amplitude power-envelope co-fluctuations occurred. In line with recent fMRI work, alpha and beta connectivity derived from these events showed high similarity with static connectivity during both rest and the auditory task. Importantly, during the task, frontoparietal beta connectivity events occurred during processing of the spatial cue, while posterior alpha connectivity events occurred during processing of the tone streams. Our results suggest that high-amplitude power-envelope co-fluctuations drive connectivity of alpha and beta oscillations. Critically, these connectivity events appear to underlie the ongoing deployment of auditory attention in a functionally distinct manner, that is, anticipation of stimuli versus selective stimulus processing.
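For readers unfamiliar with the co-fluctuation approach referenced above (Faskowitz et al., 2020), the sketch below illustrates the general idea of deriving edge time series from band-limited power envelopes and selecting high-amplitude frames as "connectivity events". All variable names, the 95th-percentile threshold, and the simulated data are illustrative assumptions, not the authors' pipeline.

```python
# Illustrative sketch (not the authors' code): edge time series / co-fluctuation
# events from band-limited power envelopes, following the logic of
# Faskowitz et al. (2020). Names, threshold, and data are assumptions.
import numpy as np

def cofluctuation_events(envelopes, event_percentile=95.0):
    """envelopes: (n_regions, n_samples) power envelopes (e.g., alpha band).
    Returns the edge time series, a boolean mask of high-amplitude frames,
    and event-based vs. static connectivity estimates."""
    z = (envelopes - envelopes.mean(axis=1, keepdims=True)) / envelopes.std(axis=1, keepdims=True)
    n_regions = z.shape[0]
    iu = np.triu_indices(n_regions, k=1)
    # Edge time series: element-wise product of z-scored envelopes for each region pair.
    edge_ts = z[iu[0]] * z[iu[1]]                     # (n_edges, n_samples)
    amplitude = np.sqrt((edge_ts ** 2).sum(axis=0))   # RSS co-fluctuation amplitude per frame
    events = amplitude >= np.percentile(amplitude, event_percentile)
    event_fc = edge_ts[:, events].mean(axis=1)        # connectivity from event frames only
    static_fc = edge_ts.mean(axis=1)                  # equals Pearson r of the envelopes
    return edge_ts, events, event_fc, static_fc

# Example: 10 cortical sources, 2000 samples of simulated alpha power envelope.
rng = np.random.default_rng(0)
env = rng.gamma(shape=2.0, scale=1.0, size=(10, 2000))
edge_ts, events, event_fc, static_fc = cofluctuation_events(env)
```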
Carolina Fernandez Pujol, Elizabeth Blundon and Andrew Dykstra
Topic areas: memory and cognition, correlates of behavior/perception, neural coding, thalamocortical circuitry/function
auditory cortex, perception, magnetoencephalography, neural circuits, laminar
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Electroencephalography (EEG) and magnetoencephalography (MEG) are excellent tools for capturing human neural activity on a millisecond time scale, yet little is known about their underlying laminar and biophysical basis. Here, we used a reduced but realistic cortical circuit model - the Human Neocortical Neurosolver (HNN) - to shed light on the laminar specificity of brain responses associated with auditory conscious perception under multitone masking. HNN provides a canonical model of a neocortical column circuit, including both excitatory pyramidal and inhibitory basket neurons in layers II/III and layer V. We found that the difference in event-related responses between perceived and unperceived target tones could be accounted for by additional input to supragranular layers arriving from either the non-lemniscal thalamus or cortico-cortical feedback connections. Layer-specific spiking activity of the circuit revealed that the additional negative-going peak that was present for detected but not undetected target tones was accompanied by increased firing of layer-V pyramidal neurons. These results are consistent with current cellular models of conscious processing and help bridge the gap between the macro and micro levels of analysis of perception-related brain activity.
Isma Zulfiqar, Elia Formisano, Sriranga Kashyap, Peter de Weerd and Michelle Moerel
Topic areas: correlates of behavior/perception, multisensory processes
multisensory, periphery, temporal modulation, laminar, high-resolution fMRI
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Recent evidence supports the existence of multisensory processing in early auditory regions of the cortex. To study the source of these multisensory modulations, we investigated visual influences on the auditory cortex in a cortical depth-dependent manner using high resolution functional MRI at 7 Tesla. Specifically, given the reciprocal connectivity between early visual cortex representing the periphery and auditory cortex, we set out to explore audiovisual integration of peripherally presented stimuli. For 10 subjects, we collected anatomical data at 0.6 mm and functional data at 0.8 mm isotropic resolution. In a blocked design, the participants were presented with unisensory and audiovisual stimuli. Attention was directed either towards the auditory stimulus, or away from both stimuli. Our preliminary results showed multisensory enhancement in a cortical network comprising early sensory sites (auditory and visual), insula, and ventrolateral prefrontal cortex. Multisensory enhancement was present in the primary auditory cortex and increased along the auditory cortical hierarchy. These findings confirm that the primary auditory cortex is not uniquely unisensory. Additionally, we observed a task-dependent attentional modulation of multisensory enhancement in deep layers of the auditory belt. This suggests a top-down origin of attention that stems from long-range cortico-cortical feedback. Future analyses will include cortical depth-dependent connectivity analysis which may help discriminate between the frontal regions and visual cortex as sources of the observed context-dependent multisensory enhancement in deep layers of the auditory belt, and a multivariate analysis to increase sensitivity of our analysis by examining distributed multisensory effects.
Beate Wendt, Jörg Stadler and Nicole Angenstein
Topic areas: auditory disorders, speech and language
cochlear implant, duration processing, frequency processing, just noticeable differences, serial order judgement, speech processing
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
The perception of speech requires the processing of different basic acoustic parameters such as frequency, duration and intensity. If this fundamental processing is inefficient, it may lead to problems in speech perception. The present study investigates low-level auditory processing in adult cochlear implant (CI) users in the inexperienced and experienced states. Frequency, duration and intensity processing and serial order judgement were tested by stimulating the ear with the CI, and just noticeable differences were determined. Alternative forced choice measurements were performed shortly after the first fitting of the CI and again around two years or more later. Furthermore, speech processing was tested with German standard speech tests (Oldenburg sentence test (OLSA), Freiburg monosyllabic and multisyllabic word tests). In addition, the perception of consonants and vowels was tested. As expected, speech processing clearly improved over time. However, there was no significant improvement in low-level auditory processing. Correlations of performance between the different low-level tests and between the different speech tests were observed. In addition, a few correlations between the speech tests and low-level test performance were detected, e.g. between the OLSA and frequency processing only in the inexperienced state. Correlations between the word tests and the recognition of consonants and vowels were present in both the inexperienced and experienced states, but were particularly pronounced in the experienced state. The results may have implications for the rehabilitation of CI users.
Rachid Riad, Julien Karadayi, Anne-Catherine Bachoud-Levi and Emmanuel Dupoux
Topic areas: correlates of behavior/perception, novel technologies
spectro-temporal modulations, auditory neuroscience, interpretability of deep neural networks, audio signal processing
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Deep learning models have become potential candidates for auditory neuroscience research thanks to their recent successes in a variety of auditory tasks, yet these models often lack the interpretability needed to fully understand the exact computations they perform. Here, we propose a parametrized neural network layer that computes specific spectro-temporal modulations based on Gabor filters [learnable spectro-temporal filters (STRFs)] and is fully interpretable. We evaluated this layer on speech activity detection, speaker verification, urban sound classification, and zebra finch call type classification. We found that models based on learnable STRFs are on par with the state of the art for all tasks and obtain the best performance for speech activity detection. Because this layer remains a Gabor filter, it is fully interpretable, and we used quantitative measures to describe the distribution of the learned spectro-temporal modulations. The filters adapted to each task and focused mostly on low temporal and spectral modulations. The analyses show that the filters learned on human speech have spectro-temporal parameters similar to those measured directly in the human auditory cortex. Finally, we observed that the tasks were organized in a meaningful way: the human vocalization tasks lay close to each other, while bird vocalizations lay far from both human vocalizations and the urban sound tasks.
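As a rough illustration of the kind of Gabor-based spectro-temporal filter this abstract describes, the sketch below builds a single fixed 2D Gabor STRF and convolves it with a spectrogram. In the study the Gabor parameters are learned within a network; the parameter values and shapes here are assumptions for illustration only.

```python
# Illustrative sketch (assumptions, not the paper's implementation): a 2D Gabor
# spectro-temporal receptive field (STRF) applied to a spectrogram.
import numpy as np
from scipy.signal import fftconvolve

def gabor_strf(n_freq, n_time, spectral_mod, temporal_mod, sigma_f, sigma_t):
    """spectral_mod in cycles/channel, temporal_mod in cycles/frame."""
    f = np.arange(n_freq) - n_freq // 2
    t = np.arange(n_time) - n_time // 2
    F, T = np.meshgrid(f, t, indexing="ij")
    envelope = np.exp(-0.5 * ((F / sigma_f) ** 2 + (T / sigma_t) ** 2))
    carrier = np.cos(2 * np.pi * (spectral_mod * F + temporal_mod * T))
    return envelope * carrier

# Apply one fixed filter to a (frequency x time) spectrogram.
rng = np.random.default_rng(1)
spectrogram = rng.random((64, 200))                    # e.g., 64 mel channels, 200 frames
strf = gabor_strf(n_freq=9, n_time=21, spectral_mod=0.1,
                  temporal_mod=0.05, sigma_f=2.0, sigma_t=5.0)
feature_map = fftconvolve(spectrogram, strf, mode="same")
```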
Michelle Moerel, Agustin Lage-Castellanos, Omer Faruk Gulban and Federico De Martino
Topic areas: memory and cognition, correlates of behavior/perception, neural coding
frequency-based attention, ultra-high field fMRI, population receptive field mapping, auditory cortex
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Electrophysiological studies suggest that auditory attention induces rapid changes in neuronal feature preference and selectivity. Functional magnetic resonance imaging (fMRI) studies of human auditory cortex have revealed an increased BOLD response in neuronal populations tuned to attended sound features. Because fMRI studies have typically not been able to characterize the influence of attention on neuronal population receptive field properties, it is still unclear how the results obtained with fMRI in humans relate to the electrophysiological findings in animal models. We used ultra-high field fMRI to examine auditory processing while participants performed a detection task on ripple sounds. By manipulating the chance of target occurrence, participants alternately attended to low-frequency (300 Hz) or high-frequency (4 kHz) ripple sounds. Responses to natural sounds, rather than ripples, were used to compute neuronal population receptive fields (pRFs). We observed faster reaction times to noise bursts in attended compared to unattended ripples. In contrast with previous fMRI studies, the auditory cortical response to attended ripple sounds was lower. Maps of frequency preference (BF) and selectivity (tuning width; TW) were similar across attentional conditions, with the exception of a narrower TW with attention in parabelt locations with a BF close to the attended frequency. The narrower tuning width in voxels whose preferred frequency matches the attended one could underlie the observed lower response to attended, compared to non-attended, ripple sounds. The difference between our results and previous fMRI studies may be explained by differences in experimental design, and suggests that fundamentally different mechanisms may underlie different attentional settings.
Ying Fan and Huan Luo
Topic areas: memory and cognition
auditory sequence memory, content, structure
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Two forms of information – frequency (content) and ordinal position (structure) – have to be stored when retaining a sequence of auditory tones in working memory (WM). However, the neural representations and coding characteristics of content and structure, particularly during WM maintenance, remain elusive. Here, in two electroencephalography (EEG) studies in which human participants performed a delayed-match-to-sample task with a retrocue, by transiently perturbing the ‘activity-silent’ WM retention state and decoding the reactivated WM information, we demonstrate that content and structure are stored in a dissociable manner with distinct characteristics throughout the WM process. First, each tone in the sequence is associated with two codes in parallel, characterizing its frequency and its ordinal position, respectively. Second, during retention, a structural retrocue successfully reactivates structure but not content, whereas a following neutral white noise triggers content but not structure. Meanwhile, a content retrocue is able to reactivate both content and structure information, while a subsequent neutral visual impulse makes the maintained structure information detectable. Third, the structure representation remains stable, whereas the content code undergoes a dynamic transformation as memory progresses. Finally, the neutral-impulse-triggered content and structure reactivations during retention correlate with WM behavior on frequency and ordinal position, respectively. Overall, our results support distinct content and structure representations in auditory WM and provide efficient approaches for accessing silently stored WM information (both content and structure) in the human brain.
Danna Pinto, Anat Prior and Elana Zion-Golumbic
Topic areas: memory and cognition, speech and language
Statistical Learning, Language, EEG, Frequency Tagging
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Statistical Learning (SL) is hypothesized to play an important role in language development. However, the behavioral measures typically used to assess SL, particularly at the level of individual participants, are largely indirect and often have low sensitivity. Recently, a neural metric based on frequency-tagging has been proposed as an alternative and more direct measure for studying SL. Here we tested the sensitivity of frequency-tagging measures for studying SL in individual participants in an artificial language paradigm, using non-invasive EEG recordings of neural activity in humans. Importantly, we used carefully constructed controls to address potential acoustic confounds of the frequency-tagging approach. We compared the sensitivity of EEG-based metrics to both explicit and implicit behavioral tests of SL, and the correspondence between these presumed converging operations. Group-level results confirm that frequency-tagging can provide a robust indication of SL for an artificial language, above and beyond potential acoustic confounds. However, this metric had very low sensitivity at the level of individual participants, with significant effects found in only 30% of participants. Conversely, the implicit behavioral measures indicated that SL had occurred in 70% of participants, which is more consistent with the proposed ubiquitous nature of SL. Moreover, there was low correspondence between the different measures used to assess SL. Taken together, while some researchers may find the frequency-tagging approach suitable for their needs, our results highlight the methodological challenges of assessing SL at the individual level, and the potential confounds that should be taken into account when interpreting frequency-tagged EEG data.
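A minimal sketch of a generic frequency-tagging index is shown below: the amplitude spectrum of the EEG is compared at a target rate against neighbouring bins. The 4/3 Hz "word rate", sampling rate, and SNR definition are assumptions for illustration and are not taken from the study.

```python
# Illustrative sketch (assumed parameters, not the authors' pipeline): a
# frequency-tagging index for statistical learning. If syllables were presented
# at 4 Hz and tri-syllabic "words" at 4/3 Hz, SL should produce a spectral peak
# at the word rate over and above neighbouring frequencies.
import numpy as np

def tagging_snr(eeg, fs, target_hz, n_neighbours=10):
    """eeg: 1-D channel-averaged signal; returns amplitude at target_hz divided
    by the mean amplitude of neighbouring frequency bins (a simple SNR)."""
    spectrum = np.abs(np.fft.rfft(eeg)) / len(eeg)
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    target_bin = np.argmin(np.abs(freqs - target_hz))
    neighbours = np.r_[target_bin - n_neighbours:target_bin - 1,
                       target_bin + 2:target_bin + n_neighbours + 1]
    return spectrum[target_bin] / spectrum[neighbours].mean()

fs = 250.0
t = np.arange(0, 60, 1.0 / fs)                      # 60 s of simulated data
eeg = np.sin(2 * np.pi * (4 / 3) * t) + np.random.default_rng(2).normal(size=t.size)
word_rate_snr = tagging_snr(eeg, fs, target_hz=4 / 3)   # > 1 suggests a word-rate peak
```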
Lingyun Zhao, Alexander Silva and Edward Chang
Topic areas: speech and language
Speech, Stopping, Frontal cortex
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
An important capacity for normal speech is the ability to stop ongoing production quickly when required. This ability is crucial for maintaining smooth conversations, and deficits in it are often indicative of speech disorders. Previous studies have investigated the neural control of canceling speech and other motor outputs before their onset. However, it is largely unknown how the brain controls speech termination when one has already started speaking. Here we studied this question by directly recording neural activity from the human cortex while participants started and stopped their speech following visual cues. We found increased high-gamma activity near the end of production in the premotor cortex during cued stopping, which was not observed in the self-paced, natural finish of a sentence. Across single trials, a subset of premotor regions was activated according to the time of the stop cue or stop action. Activity in single electrodes and across populations distinguished whether stopping occurred before an entire word was finished. In addition, we asked how the neural process for stopping is related to concurrent articulatory control. We found that stop activity existed in regions largely separate from those in the sensorimotor cortex encoding articulator movements. Finally, we found that areas with stimulation-induced speech arrest overlapped with areas showing stop activity, suggesting that speech arrest may be caused by inhibition of speech production. Together, these data provide evidence that neural activity in the premotor cortex may act as an inhibitory control signal that underlies the stopping of ongoing speech production.
Lixia Gao, Xinjian Li and Xiaohui Wang
Topic areas: subcortical processing
Inferior colliculus, Cortical inactivation, Temporal representation, Rate representation, Time-varying signal
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Temporal processing is crucial for auditory perception and cognition, especially for communication sounds. Previous studies have shown that the auditory cortex and thalamus use temporal and rate representations to encode slowly and rapidly changing time-varying sounds. However, how the inferior colliculus (IC) encodes time-varying sounds at the millisecond scale remains unclear. In the present study, we investigated temporal processing by IC neurons in awake marmosets using Gaussian click trains with varying inter-click intervals (2–100 ms). Strikingly, we found that 28% of IC neurons exhibited rate representation with non-synchronized responses, in sharp contrast to the current view that the IC uses only a temporal representation to encode time-varying signals. Moreover, IC neurons with rate representation exhibited response properties distinct from those with temporal representation. We further demonstrated that reversible inactivation of the primary auditory cortex modulated 17% of the stimulus-synchronized responses and 21% of the non-synchronized responses of IC neurons, revealing that cortico-collicular projections play a role, but not a crucial one, in temporal processing in the IC. Our findings fill a gap in our understanding of auditory temporal processing in the IC of awake animals and provide new insights into temporal processing from the midbrain to the cortex.
Luis Rivera-Perez, Julia Kwapiszewski and Michael Roberts
Topic areas: subcortical processing
Inferior colliculus, neuromodulation, acetylcholine, nicotinic acetylcholine receptors, VIP neurons
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
The inferior colliculus (IC), the midbrain hub of the central auditory system, receives extensive cholinergic input from the pontomesencephalic tegmentum (PMT). Activation of nicotinic acetylcholine receptors (nAChRs) in the IC can enhance auditory performance by altering the excitability of neurons. However, how nAChR activation affects the excitability of specific neuron classes in the IC remains unknown. Our lab identified a distinct class of glutamatergic principal neurons in the IC that expresses vasoactive intestinal peptide (VIP). Using VIP-Cre x Ai14 mice and immunofluorescence, we found that cholinergic terminals are commonly located in close proximity to the somas and dendrites of VIP neurons. By using whole-cell electrophysiology, we found that acetylcholine (ACh) drives a strong, long-lasting excitatory effect in VIP neurons. Application of nAChR antagonists revealed that ACh excites VIP neurons via the activation of α3β4* nAChRs, a subtype that is relatively rare in the brain. Furthermore, we determined that cholinergic excitation of VIP neurons occurs by activating post-synaptic nAChRs located in VIP neurons themselves and does not require activation of presynaptic inputs. Finally, we found that trains of ACh puffs elicited temporal summation in VIP neurons, suggesting that cholinergic inputs can affect activity in the IC for prolonged periods in an in vivo setting. These results uncover the first cellular-level mechanisms of cholinergic modulation in the IC and a novel role for α3β4* nAChRs in the auditory system, and suggest that cholinergic inputs from the PMT can strongly affect auditory processing in the IC by increasing the excitability of VIP neurons.
Jian Carlo Nocon, Howard J. Gritton, Xue Han and Kamal Sen
Topic areas: neural coding
Parvalbumin, Cortical code, Temporal code, Rate code, Spike timing, Sparse coding, Cocktail party problem, Amplitude modulation, Spatial tuning
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Cortical coding of sensory stimuli plays a critical role in our ability to analyze complex scenes. Cortical coding can depend on both rate- and spike timing-based codes. However, cell type-specific contributions to cortical coding are not well understood. Parvalbumin (PV) neurons play a fundamental role in sculpting cortical responses, yet their specific contributions to rate vs. spike timing-based codes have not been directly investigated. Here, we address this question in auditory cortex using a cocktail party-like paradigm, integrating electrophysiology, optogenetic manipulations, and a family of spike-distance metrics to dissect the contributions of PV neurons to rate vs. timing-based coding. We find that PV neurons improve discrimination performance by enhancing lifetime sparseness, rapid temporal modulations, and spike timing reproducibility. These findings provide novel insights into the specific contributions of PV neurons to auditory cortical discrimination in the cocktail party problem via enhanced rate modulation and spike timing-based coding in cortex.
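One commonly used member of the family of spike-distance metrics is the van Rossum distance; the sketch below shows how such a metric trades off timing- versus rate-based sensitivity through its kernel time constant. This is a generic illustration with assumed parameters, not the authors' analysis code.

```python
# Illustrative sketch of one spike-distance metric (van Rossum distance);
# parameter values and usage are assumptions, not the authors' analysis code.
import numpy as np

def van_rossum_distance(spikes_a, spikes_b, tau=0.01, fs=1000.0, duration=1.0):
    """Spike times in seconds; exponential kernel with time constant tau (s).
    Small tau emphasizes spike timing, large tau approaches a rate code."""
    t = np.arange(0, duration, 1.0 / fs)
    kernel = np.exp(-t / tau)

    def filtered(spikes):
        train = np.zeros_like(t)
        idx = np.searchsorted(t, spikes)
        train[idx[idx < t.size]] = 1.0
        return np.convolve(train, kernel)[: t.size]

    diff = filtered(spikes_a) - filtered(spikes_b)
    return np.sqrt(np.sum(diff ** 2) / (tau * fs))

# Distance between two hypothetical responses to different stimuli:
d = van_rossum_distance(np.array([0.12, 0.30, 0.55]), np.array([0.15, 0.60]))
```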
Jennifer Lawlor, Melville Wohlgemuth, Cynthia Moss and Kishore Kuchibhotla
Topic areas: correlates of behavior/perception, neural coding, subcortical processing
Two-photon calcium imaging, Echolocation, Inferior Colliculus
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Navigating our everyday world requires parsing relevant information from constantly evolving sensory flows. How the brain processes and sorts sensory inputs is a central ongoing question in systems neuroscience. Here, we take advantage of a model long studied for its expert auditory sensing of the world: the echolocating bat. The echolocating bat produces ultrasonic vocalizations and listens to returning echoes to determine the identity and location of objects in the environment. While traditional electrophysiology techniques have provided key insights into network-level activity, they are limited in their ability to reveal micro-functional architecture and cell type-specific activity. We developed two-photon calcium imaging in the awake big brown bat, Eptesicus fuscus, to assay the activity of populations of neurons with cellular and sub-cellular resolution. We expressed GCaMP6f in the excitatory population of the inferior colliculus (IC), while using a head-fixation and thinned-skull surgical approach to longitudinally monitor the same local populations. We assessed the functional auditory properties of thousands of neurons in awake, passively listening bats (n=3) by presenting pure tones, white noise, frequency sweeps, echolocation and social calls, and other stimuli. We parametrically controlled features including the duration and delay of relevant stimuli. In preliminary analyses, we observe a novel fine-scale tonotopy in the superficial layers of the IC. In addition, presentation of ‘prey capture’ echolocation sequences varying in their spectral content (‘natural’ vs ‘artificial’) elicited population rotational dynamics reflecting the temporal structure of the call sequence, while population manifolds separated the spectral information.
Vibha Viswanathan, Barbara Shinn-Cunningham and Michael Heinz
Topic areas: speech and language, correlates of behavior/perception, neural coding, subcortical processing
scene analysis, temporal coherence, consonant confusions, comodulation masking release, cross-channel processing, wideband inhibition, computational modeling, speech intelligibility, cochlear nucleus
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Temporal coherence of sound fluctuations across different frequency channels is thought to aid auditory grouping and scene segregation, as in comodulation masking release. Although most prior studies focused on the cortical bases of temporal-coherence processing, neurophysiological evidence suggests that temporal-coherence-based scene analysis may start as early as the cochlear nucleus (the first auditory region supporting cross-channel processing over a wide frequency range). Accordingly, we hypothesized that aspects of temporal-coherence processing that could be realized in early auditory areas may shape speech understanding in noise. We explored whether physiologically plausible computational models could account for results from a behavioral experiment that measured consonant categorization in different masking conditions. Specifically, we tested whether within-channel masking of target-speech modulations predicted consonant confusions across the different conditions, and whether predictions were improved by adding across-channel temporal-coherence processing mirroring the computations known to exist in the cochlear nucleus. Consonant confusions provide a rich characterization of error patterns in speech categorization, and are thus crucial to rigorously test models of speech perception; however, to the best of our knowledge, they have not been utilized in prior studies of scene analysis. We find that within-channel modulation masking can reasonably account for category confusions, but that it fails when temporal fine structure cues are unavailable. However, the addition of across-channel temporal-coherence processing significantly improves confusion predictions across all tested conditions. Our results suggest that temporal-coherence processing strongly shapes speech understanding in noise, and that physiological computations that exist early along the auditory pathway may contribute to this process.
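As a loose illustration of the notion of across-channel temporal coherence invoked here, the sketch below band-pass filters a comodulated noise into a few peripheral-like channels and correlates their Hilbert envelopes. The filterbank, band edges, and modulation rate are assumptions and do not reproduce the authors' physiologically based model.

```python
# Illustrative sketch (filterbank and parameters are assumptions, not the
# authors' model): across-channel temporal coherence computed as the
# correlation between the envelopes of different frequency channels.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def channel_envelope(x, fs, lo, hi):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, x)))

def cross_channel_coherence(x, fs, bands):
    """bands: list of (lo, hi) tuples; returns the envelope-correlation matrix."""
    envs = np.vstack([channel_envelope(x, fs, lo, hi) for lo, hi in bands])
    return np.corrcoef(envs)

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
modulator = 1 + np.sin(2 * np.pi * 10 * t)                     # common 10-Hz modulation
x = modulator * np.random.default_rng(3).normal(size=t.size)   # comodulated noise
coherence = cross_channel_coherence(x, fs, bands=[(500, 1000), (1000, 2000), (2000, 4000)])
```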
Lucas Vattino, Maryse Thomas, Carolyn Sweeney, Rahul Brito, Cathryn Macgregor and Anne Takesian
Topic areas: correlates of behavior/perception, neural coding, thalamocortical circuitry/function
Interneurons, Network, Correlation
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Inhibitory interneurons in neocortical layer 1 (L1) convey behaviorally relevant information by integrating sensory-driven inputs with neuromodulatory signals. Their activity is known to regulate moment-to-moment encoding of the sensory environment in a context-dependent manner and can drive cortical plasticity mechanisms, both during postnatal development and adulthood. We and others have shown that these interneurons are heterogeneous and can be subdivided into two major classes defined by the expression of either neuron-derived neurotrophic factor (NDNF) or vasoactive intestinal peptide (VIP). It has been previously demonstrated that L1 interneurons make recurrent synaptic contacts and are also connected electrically through gap junctions, suggesting that they might form coordinated inhibitory networks. However, the connectivity patterns of specific L1 interneuron subtypes and the in vivo functional implications remain unclear. We performed fluorescence-guided whole-cell electrophysiology in slices of the mouse primary auditory cortex (A1) while optogenetically activating VIP or NDNF interneurons. We found that GABAA-mediated synaptic connections between NDNF interneurons were significantly stronger than those between VIP interneurons or other L1 interneurons, suggesting a robust NDNF recurrent inhibitory network. Next, we performed two-photon calcium imaging in A1 of awake, behaving mice to monitor the spontaneous and sound-driven activity of networks of VIP and NDNF interneurons. These results will reveal how behavioral context impacts the in vivo coordinated activity of these L1 inhibitory networks. Together, our findings suggest that distinct connectivity patterns among NDNF and VIP interneurons may underlie specialized functions in sensory encoding and cortical plasticity.
Rafay A. Khan, Brad Sutton, Yihsin Tai, Sara Schmidt, Somayeh Shahsavarani and Fatima T. Husain
Topic areas: auditory disorders
Tinnitus, Hearing loss, Neuroimaging, Tractography, Connectivity
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Tinnitus has been associated with both anatomical and functional plasticity. In white matter, tinnitus-associated changes have been reported in a range of regions, as have negative results. Some of this variation in findings may be attributable to small sample sizes and to the different methodologies employed by different groups. A further layer of complication is added by the unknown relationship between tinnitus and hearing loss. To evaluate whole-brain, network-level changes in structure, we investigated anatomical connectivity via fiber tractography. High-resolution diffusion imaging data were collected from 97 participants, who were divided into four groups: normal-hearing controls (CONNH, n=19), hearing loss controls (CONHL, n=17), tinnitus sufferers with normal hearing (TINNH, n=17) and tinnitus sufferers with hearing loss (TINHL, n=44). Group-level differences in connectivity were inspected in three nodes – the precuneus (representing the default mode network; DMN) and the bilateral auditory cortices (nodes of the auditory network). Three measures of connectivity were calculated – mean strength, local efficiency, and clustering coefficient. ANOVA revealed significant group differences for all three measures in the precuneus, but none that reached statistical significance in either auditory cortex. Post-hoc analysis revealed that group differences were primarily driven by the CONNH > TINHL and TINNH > TINHL contrasts, suggesting that DMN connectivity shows altered integration and segregation associated with tinnitus, which can be differentiated from connectivity changes associated with hearing loss. This study demonstrated the feasibility of studying tinnitus-related neural plasticity using fiber tractography, and the results provide an anatomical analog for findings previously reported in the functional connectivity literature.
Kate Christison-Lagay, Noah Freedman, Christopher Micek, Aya Khalaf, Sharif Kronemer, Mariana Gusso, Lauren Kim, Sarit Forman, Julia Ding, Mark Aksen, Ahmad Abdel-Aty, Hunki Kwon, Noah Markowitz, Erin Yeagle, Elizabeth Espinal, Jose Herrero, Stephan Bickel, James Young, Ashesh Mehta, Kun Wu, Jason Gerrard, Eyiyemisi Damisah, Dennis Spencer and Hal Blumenfeld
Topic areas: memory and cognition, correlates of behavior/perception
auditory perception, intracranial EEG, human
Thu, 11/4 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
Much recent work towards understanding the spatiotemporal dynamics of the neural mechanisms of conscious perception has focused on visual paradigms. To determine whether there are shared mechanisms for perceptual consciousness across sensory modalities, we developed an auditory task, in which target sounds (calibrated to 50% detection) were embedded in noise. Participants (patients undergoing intracranial electroencephalography for intractable epilepsy; n=31) reported if they perceived the sound, and the sound’s identity. Participants’ perception rate was 58.0% (2.0% SEM) when a target was present; false positive rate was 8.5% (1.4%). For perceived trials, they correctly identified the target in 89.2% (1.4%) of trials; identification accuracy for non-perceived trials was 40.2% (2.0%) (chance: 33%). Recordings from < 2,800 grey matter electrodes were analyzed for power in the high gamma range (40-115 Hz). We performed cluster-based permutation analyses to identify significant activity across perceived and not perceived conditions. For not-perceived trials, significant activity was restricted to auditory regions. Perceived trials also showed activity in auditory regions, but this was accompanied by activity in the right caudal middle frontal gyrus and non-auditory thalamus. Consistent with visual findings, in perceived trials we found (1) early auditory activity is followed by a wave that sweeps through auditory association regions into parietal and frontal cortices and (2) decreases below baseline in orbital frontal and rostral inferior frontal cortex. In summary, we found a broad network of cortical and subcortical regions involved in auditory perception that are similar to the networks observed with vision, suggesting shared general mechanisms for conscious perception.
Heidi Bliddal, Christian Bech Christensen, Cecilie Møller, Peter Vuust and Preben Kidmose
Topic areas: correlates of behavior/perception, novel technologies
Ear-EEG, EEG, beat perception
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Ear-EEG is a promising novel technology that records electroencephalography (EEG) from electrodes inside the ear, allowing discreet and mobile recording of EEG. Nozaradan et al. (2011) used scalp EEG to study neural responses to an isochronous sequence of sounds under three conditions: a control condition and two imagery conditions in which participants were instructed to imagine accents on every second (march) or third (waltz) beat. A significant peak was found at the frequency of the imagined beat only in the matching imagery conditions. Since no physical accents were present in the stimulus, the peaks at beat-related frequencies indicate higher-order processing of the sound sequence. The aim of the present combined scalp- and ear-EEG study (n = 20) was to determine whether neural correlates of beat perception can be measured using ear-EEG. To investigate this, we used an adapted version of the Nozaradan paradigm. Three different electrode reference configurations were tested: a literature-based reference, an in-ear reference, and an in-between-ears reference. The results showed that when the literature-based reference or the in-between-ears reference was used, a significantly greater peak was found at the march-related frequency in the march imagery condition and at the waltz-related frequency in the waltz imagery condition, compared with the other imagery condition (p < .02). In conclusion, it is possible to measure the neuronal correlates of beat perception using ear-EEG despite the markedly different electrode placement. The present study therefore brings us one step closer to using neuronal feedback to improve hearing aid algorithms.
Carla Griffiths, Joseph Sollini, Jules Lebert and Jennifer Bizley
Topic areas: memory and cognition, speech and language, correlates of behavior/perception, neural coding
Auditory cortex, Perceptual invariance, Auditory object, Encoding, Electrophysiology, Neural decoding, Ferret
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Perceptual invariance, the act of recognising auditory objects across identity-preserving variation and in the presence of other auditory stimuli, is critical to listening. To test perceptual invariance, we trained four ferrets in a Go/No-Go water reward task in which they identified a target word ("instruments") from a stream drawn from 54 other British English words (distractors). We then manipulated the mean fundamental frequency (F0) within and across trials. The ferrets identified the target word (chance = 33% hit rate) when the F0 was roved within a trial, with hit rates (female/male speaker) of 60%/40% for F1702, 68%/38% for F2002, 43%/38% for F1803, and 48%/52% for F1815. When the F0 was modified for the whole trial, the hit rates were 62%/42% (F1702), 48%/44% (F2002), 44%/35% (F1803), and 57%/40% (F1815). We recorded neural activity from the auditory cortex (AC) in one ferret and considered sites with a sound-onset response for Euclidean distance decoding. We computed a decoding score for pairwise discrimination of the target word from seven high-occurrence distractors, the target word reversed, and pink noise equal in duration and spectrally matched to the target word. We found neural responses that discriminated target from distractor responses across variation in F0. Moreover, in most cases, these responses did not carry F0 information in either target or distractor responses. Our preliminary results suggest that auditory objects are represented in the AC and that these responses are resistant to F0 change. Future work will incorporate hippocampal recordings to determine whether temporal coherence between the hippocampus and AC is required for auditory object recognition.
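A minimal sketch of Euclidean-distance decoding of the kind described (leave-one-out nearest-template classification between two words) is given below; the array shapes, trial counts, and simulated responses are assumptions, not the recorded data or the authors' code.

```python
# Illustrative sketch of pairwise Euclidean-distance decoding (leave-one-out
# nearest-template classification); shapes and names are assumptions.
import numpy as np

def pairwise_decoding_score(resp_a, resp_b):
    """resp_a, resp_b: (n_trials, n_timebins) responses to two words.
    Each held-out trial is assigned to the class with the nearer mean response."""
    correct = 0
    trials = [(a, 0) for a in resp_a] + [(b, 1) for b in resp_b]
    for i, (trial, label) in enumerate(trials):
        rest_a = np.array([t for j, (t, l) in enumerate(trials) if j != i and l == 0])
        rest_b = np.array([t for j, (t, l) in enumerate(trials) if j != i and l == 1])
        d_a = np.linalg.norm(trial - rest_a.mean(axis=0))
        d_b = np.linalg.norm(trial - rest_b.mean(axis=0))
        correct += int((d_a < d_b) == (label == 0))
    return correct / len(trials)

rng = np.random.default_rng(4)
target = rng.normal(1.0, 1.0, size=(20, 50))      # e.g., PSTHs to the target word
distractor = rng.normal(0.0, 1.0, size=(20, 50))  # PSTHs to one distractor word
score = pairwise_decoding_score(target, distractor)   # chance = 0.5
```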
Jules Lebert, Carla Griffiths, Joseph Sollini and Jennifer Bizley
Topic areas: correlates of behavior/perception, neural coding, thalamocortical circuitry/function
Auditory Scene Analysis, Stream segregation, Auditory Cortex, Ferret, Electrophysiology, Behavior
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Listening in the real world involves making sense of mixtures of multiple overlapping sounds. The brain decomposes such scenes into individual objects, and a sequence of related auditory objects forms a stream. We are investigating the role of the auditory cortex in the formation and maintenance of auditory streams. The temporal coherence theory (Shamma et al., 2011) has provided one explanation for stream formation, postulating that the brain creates a multidimensional representation of sounds along different feature axes and groups them based on their temporal coherence to form streams. Supporting this idea, neural correlates of the differences in perception elicited by synchronous and alternating tones have been found in the primary auditory cortex of behaving ferrets (Lu et al., 2017). However, the temporal coherence theory has yet to be tested with more naturalistic sounds composed of multiple streams. To this end, we trained ferrets to detect a target word in a stream of repeating distractor words, spoken by the same talker, played against a background of spatially separated noise (white, pink, and speech-shaped noise). Preliminary data were collected in the auditory cortex of behaving ferrets. Neural population-level analyses are being implemented to identify correlation structures. We hypothesize that when the animal can successfully segregate the noise and speech streams, the neural population will show a uniform correlation structure early in the trial that evolves into two distinct correlation structures. Stimulus reconstruction will be performed on the different clusters of neurons to investigate whether they encode different auditory streams.
Joan Belo, Maureen Clerc and Daniele Schon
Topic areas: memory and cognition, correlates of behavior/perception, novel technologies
Auditory Attention Detection, EEG, Auditory Attention, Cognitive Functions
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
EEG-based Auditory Attention Detection (AAD) methods have become popular in the field of auditory neuroscience because they could be of particular interest for better understanding how the brain processes naturalistic auditory stimuli, including speech and music. Such methods also open promising avenues in clinical applications, since they could be used to develop neuro-steered auditory aids. However, one major limitation of AAD is that its performance (i.e., reconstruction and decoding accuracy) varies greatly across individuals. We hypothesize that part of this inter-individual variability could be due to general cognitive abilities that are necessary to process complex auditory scenes. These abilities may include inhibition, working memory (WM) and sustained attention. In this study, we assess whether inhibition, WM and sustained attention abilities correlate with the reconstruction and decoding accuracies of a backward linear AAD method. More precisely, we postulate that the better these cognitive functions are, the higher the reconstruction and decoding accuracies of the method will be. To test this hypothesis, 30 participants were enrolled in an experimental paradigm in which they had to (1) actively listen to several dichotic stimuli while their neural activity was recorded and (2) complete online behavioral tests, from home, to measure the aforementioned cognitive abilities. Using a backward linear AAD method, we reconstruct the stimulus envelopes from the EEG data and obtain subject-specific reconstruction and decoding accuracies. We then correlate these performances with the behavioral data collected during the online cognitive tests.
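For context, a backward (stimulus-reconstruction) decoder maps multichannel EEG back onto the attended speech envelope; the sketch below shows a single-lag ridge-regression version. The study's decoder uses multiple time lags and its own validation scheme, so everything here (names, regularization, simulated data) is an illustrative assumption.

```python
# Illustrative sketch: a minimal backward model with a single time lag and
# ridge regularization (the study uses a multi-lag backward linear AAD decoder;
# all names and parameters here are assumptions).
import numpy as np

def fit_backward_decoder(eeg, envelope, alpha=1.0):
    """eeg: (n_samples, n_channels); envelope: (n_samples,) attended-speech envelope.
    Returns ridge weights mapping EEG back to the stimulus envelope."""
    X, y = eeg, envelope
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def reconstruction_accuracy(eeg, envelope, weights):
    recon = eeg @ weights
    return np.corrcoef(recon, envelope)[0, 1]

rng = np.random.default_rng(5)
envelope = np.abs(rng.normal(size=5000))
eeg = np.outer(envelope, rng.normal(size=32)) + rng.normal(size=(5000, 32))
w = fit_backward_decoder(eeg[:4000], envelope[:4000])
acc = reconstruction_accuracy(eeg[4000:], envelope[4000:], w)   # held-out correlation
```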
Maryse Thomas, Carolyn Sweeney, Esther Yu, Kasey Smith, Wisam Reid and Anne Takesian
Topic areas: correlates of behavior/perception, thalamocortical circuitry/function
auditory cortex, interneuron, NDNF, frequency discrimination, layer 1
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
A growing understanding of auditory cortical interneuron circuits has highlighted the role of specific inhibitory interneurons in shaping frequency tuning and discrimination acuity. In auditory cortex, neuron-derived neurotrophic factor (NDNF) interneurons reside primarily within neocortical layer 1 and are known to inhibit the distal dendrites of excitatory neurons while simultaneously disinhibiting their somata via projections to parvalbumin-positive interneurons. Here, we aim to investigate the function of NDNF interneurons in frequency tuning and behavioral frequency discrimination in mice. Using in vivo two-photon calcium imaging, we first characterized the tuning properties of NDNF interneurons in response to pure tones of varying frequencies and intensities. We found that a subset of NDNF interneurons exhibit robust frequency and intensity tuning comparable to excitatory neurons in layer 2/3. We next developed a behavioral frequency discrimination paradigm in which mice distinguished trains of repeating pure tones from trains of two alternating tones, allowing us to establish reliable frequency discrimination thresholds by varying the frequency of the alternating tone. We subsequently recorded the responses of both NDNF interneurons and layer 2/3 excitatory neurons to the training stimuli in both passive and behaving contexts. We identified individual neurons in both classes capable of discriminating between repeating and alternating tones. Finally, ongoing experiments using chemogenetic silencing of NDNF interneurons will test the necessity of their activity for task performance. This work will help establish the function of NDNF interneurons in frequency discrimination and further elucidate the contribution of specific inhibitory cortical circuits to sound processing.
James Baldassano and Katrina MacLeod
Topic areas: cross-species comparisons, neural coding, subcortical processing
Cochlear Nucleus, Avian auditory brainstem, Electrophysiology, Intrinsic physiology
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
The avian cochlear nucleus angularis (NA) plays a diverse role in encoding intensity information, including the acoustic temporal envelope. Recent work shows that the intrinsic properties of NA neurons may contribute to this functional diversity. The five electrophysiological neuron types present in NA (tonic 1, tonic 2, tonic 3, single spike, damped) show distinct levels of temporal sensitivity when stimulated in vitro in a manner that simulates natural biological input, ranging from pure integrators to temporally sensitive neurons known as differentiators. We investigated the role of low-threshold-activated Kv1 channels in driving this operating diversity by blocking them with the specific antagonist dendrotoxin (DTX). We found that Kv1 channels shaped the electrophysiological phenotypes of NA neuron types, particularly the single-spike, tonic 1, and tonic 2 neurons. When spike time reliability and fluctuation sensitivity were measured in DTX-sensitive NA neurons, we found that their temporal sensitivity to rapid fluctuations in their inputs was reduced by the drug. Finally, we show that DTX reduced spike threshold adaptation in these neurons. These results suggest that Kv1 channel properties act as high-pass filters and could be a driving force behind the temporal sensitivity of NA neurons.
Sophie Bagur, Jacques Bourg, Alexandre Kempf, Thibault Tarpin, Etienne Gosselin, Sebastian A Ceballo, Khalil Bergaoui, Yin Guo, Allan Muller, Jérôme Bourien, Jean Luc Puel and Brice Bathellier
Topic areas: hierarchical organization, neural coding, subcortical processing, thalamocortical circuitry/function
Auditory cortex, Auditory thalamus, Inferior colliculus, Neural coding, Population coding, Electrophysiology, Two-photon imaging
Thu, 11/4 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
Auditory perception relies on spectrotemporal pattern separation, yet the step-by-step transformations implemented by the auditory system remain elusive, partly due to the lack of systematic comparison of representations at each stage. To address this, we compared population representations from extensive two-photon calcium imaging and electrophysiology recordings in the auditory cortex, thalamus, and inferior colliculus of mice, and from a detailed cochlea model. Using a noise-corrected correlation metric, we measured the similarity of activity patterns generated by diverse canonical spectrotemporal motifs (pure tones, broadband noise and chords, amplitude and frequency modulations). This measure evidenced a decorrelation of sound representations that was maximal in the cortex, where the information carried by temporal and rate codes converged. The decorrelation was accompanied by response sparsening. The thalamus stood out from this global trend by its very dense code that recorrelated responses, possibly because it represents an anatomical bottleneck. The gradual decorrelation of time-independent representations found in the auditory system could be reproduced by a deep network trained to detect in parallel basic perceptual attributes of sounds (frequency and intensity ranges, temporal modulation types). In contrast, networks trained on tasks ignoring some perceptual attributes failed to decorrelate the related acoustic information as observed in the biological system. Finally, we tested the impact of introducing a thalamus-like bottleneck and found that it could account for the recorrelation of representations. Together, these results establish that the mouse auditory system makes information about diverse perceptual attributes accessible in a rate-based population code, and that correctly constrained neural networks reproduce these properties.
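The sketch below illustrates one generic form of noise-corrected pattern correlation, in which the raw correlation between trial-averaged population responses is divided by the geometric mean of each stimulus's split-half reliability. This is offered as an assumption about the kind of metric meant, not the authors' exact formula.

```python
# Illustrative sketch of a noise-corrected correlation between the population
# response patterns evoked by two sounds (generic split-half correction;
# an assumption, not the authors' exact metric).
import numpy as np

def noise_corrected_correlation(resp_a, resp_b):
    """resp_a, resp_b: (n_trials, n_neurons) single-trial population responses.
    Raw pattern correlation divided by the geometric mean of split-half reliabilities."""
    def split_half_reliability(resp):
        half1 = resp[0::2].mean(axis=0)
        half2 = resp[1::2].mean(axis=0)
        return np.corrcoef(half1, half2)[0, 1]

    raw = np.corrcoef(resp_a.mean(axis=0), resp_b.mean(axis=0))[0, 1]
    rel = split_half_reliability(resp_a) * split_half_reliability(resp_b)
    return raw / np.sqrt(rel) if rel > 0 else np.nan

rng = np.random.default_rng(6)
signal = rng.normal(size=200)                       # shared tuning across 200 neurons
resp_tone = signal + rng.normal(scale=2.0, size=(20, 200))
resp_chord = 0.5 * signal + rng.normal(scale=2.0, size=(20, 200))
r_corrected = noise_corrected_correlation(resp_tone, resp_chord)
```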
Koun Onodera and Hiroyuki Kato
Topic areas: thalamocortical circuitry/function
auditory cortex, deep layer, optogenetics
Fri, 11/5 10:45AM - 11:00AM | Short talk
Abstract
Revealing the principles governing interactions between the six layers of the cortex is fundamental for understanding how these intricately woven circuits work together to process sensory information. The information flow in the sensory cortex has been described as a predominantly feedforward sequence with deep layers as the output structure. Although recurrent excitatory projections from layer 5 (L5) to superficial L2/3 have been identified by anatomical and physiological studies, their functional impact on sensory processing remains unclear. Here, we use layer-selective optogenetic manipulations in the primary auditory cortex to demonstrate that feedback inputs from L5 suppress the activity of superficial layers, contrary to the prediction from their excitatory connectivity. This suppressive effect is predominantly mediated by translaminar circuitry through intratelencephalic (IT) neurons, with an additional contribution of subcortical projections by pyramidal tract (PT) neurons. Furthermore, L5 activation sharpened tone-evoked responses of superficial layers in both frequency and time domains, indicating its impact on cortical spectro-temporal integration. Together, our findings challenge the classical view of feedforward cortical circuitry and reveal a major contribution of inhibitory recurrence in shaping sensory representations. Informal discussion to follow at 11:15 am EDT (GMT-4) on Zoom (link below).
Eyal Kimchi, Yurika Watanabe, Devorah Kranz, Tatenda Chakoma, Miao Jing, Yulong Li and Daniel Polley
Topic areas: memory and cognition correlates of behavior/perception multisensory processes
acetylcholine neuromodulator cortex photometry
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Acetylcholine is an important neuromodulator of cortical function that guides neural plasticity in mammalian auditory cortex (ACtx). However, the neural effects of acetylcholine in ACtx have been primarily studied using exogenous pharmacologic modulation or electrical stimulation of cholinergic inputs. We still do not know when acetylcholine release occurs endogenously in ACtx. Here, we used fiber photometry to monitor a genetically encoded fluorescent indicator to measure acetylcholine release in ACtx in awake mice (N = 12). We compared drivers and correlates of acetylcholine release in ACtx with those in visual cortex (VCtx, N = 8) and medial prefrontal cortex (mPFC, N = 4). While sensory-evoked acetylcholine release in ACtx was expected for loud, novel, or behaviorally relevant sounds, we found that even moderate intensity ( < 50 dB SPL) tones or spectrotemporal ripples elicited strong, non-habituating acetylcholine release. Surprisingly, we observed similar results in VCtx, where auditory stimuli drove stronger acetylcholine release than visual stimuli. In contrast, acetylcholine levels in mPFC were less sensitive to passively presented audiovisual stimuli, yet were highly responsive to behavioral reinforcers including liquid reward, air puff, or electric shock. Acetylcholine levels in all regions were also correlated with pupil dilation and orofacial movements. These findings demonstrate that distributed acetylcholine levels are dynamic and regionally specialized, with shared associations in sensory cortices of different modalities that are distinct from mPFC. Knowing the drivers and correlates of endogenous acetylcholine levels shapes our understanding of physiologic neural plasticity and unveils opportunities for noninvasive monitoring or manipulation of endogenous neuromodulator release.
Ying Yu, Seung-Goo Kim and Tobias Overath
Topic areas: correlates of behavior/perception
MTurk Music perception Temporal processing Tonal structure 12-tone serialism
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Introduction: Temporal integration windows help us parse continuous information. They have been studied for speech perception but remain relatively unknown for music, especially with respect to their role in appreciating musical structure via tonality. Here, we tested whether two types of highly structured musical styles (Western tonal music and 12-tone atonal music) are analyzed with different temporal integration windows. Method: Stimuli were created using a quilting algorithm (Overath et al., 2015) that controlled the temporal structure (segment lengths were log-spaced multiples of 60 ms, up to 3840 ms). The tonal and atonal music selections were taken from J.S. Bach’s Violin Sonatas and Partitas & E. Krenek’s Sonata for Solo Violin No. 2, respectively. Naturalness ratings for each 12-s stimulus were collected via Amazon MTurk. Results: Analysis of the data collected using two-level GLMs revealed that naturalness ratings significantly increased with increasing segment length. On average, tonal music (Bach) was rated higher than atonal music (Krenek). The interaction between music type and segment length revealed that the effect of segment length was greater in tonal than in atonal music. Importantly, the naturalness ratings for atonal music plateaued at 1920 ms, but continued to increase for tonal music. Conclusions: The results show a greater sensitivity to the temporal structure of tonal music than non-tonal music in the general population with minimal musical training. Further studies are ongoing to determine the relationship between temporal windows of integration for speech and music.
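For intuition, the segment-length manipulation can be expressed in a few lines. The sketch below generates the doubling series of segment lengths (60 to 3840 ms) and applies a simplified segment reordering to a waveform; this is only an illustration, since the actual quilting algorithm of Overath et al. (2015) additionally selects segment orderings that preserve continuity at segment boundaries.

```python
import numpy as np

def segment_lengths_ms(base_ms=60, max_ms=3840):
    """Log-spaced (doubling) segment lengths: 60, 120, ..., 3840 ms."""
    lengths = []
    cur = base_ms
    while cur <= max_ms:
        lengths.append(cur)
        cur *= 2
    return lengths  # [60, 120, 240, 480, 960, 1920, 3840]

def simple_quilt(signal, fs, seg_ms, rng=None):
    """Cut a waveform into fixed-length segments and reorder them at random.
    A toy stand-in for stimulus quilting; the published algorithm also
    minimizes discontinuities at segment boundaries."""
    rng = np.random.default_rng(rng)
    seg_len = int(round(seg_ms / 1000 * fs))
    n_segs = len(signal) // seg_len
    segs = np.split(signal[:n_segs * seg_len], n_segs)
    rng.shuffle(segs)
    return np.concatenate(segs)
```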
Kelly Jahn, Jenna Browning-Kamins, Kenneth Hancock, Jacob Alappatt and Daniel Polley
Topic areas: auditory disorders correlates of behavior/perception
central gain tinnitus hyperacusis hypersensitivity loudness emotion
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Our experience of sound is deeply interwoven with emotion. While we have well-developed approaches to study the neural encoding of sound as it relates to acoustic sources and sound perception, rigorous quantitative approaches to assess the affective qualities of sound have received far less attention. Whether in the context of normal aging, sensorineural hearing loss, or combat PTSD, subjects often report aversive emotional reactions to sounds that are experienced as neutral by normal listeners. Here, we developed a comprehensive battery of objective physiological biomarkers that can quantitatively dissociate complaints of enhanced loudness perception and sound-evoked distress in human subjects, and which have the potential to evolve into a new class of diagnostic tools for evaluating sound intolerance. To quantify sound-related distress, we assess complementary behavioral and objective indices of arousal including sound-evoked changes in pupil diameter, skin conductance, and facial micro-movements. We quantify neural sound-level growth using cortical electroencephalography (EEG) alongside parallel psychophysical measures of loudness growth. To date, we have tested 23 subjects with normal hearing, 11 with tinnitus, and 3 with hyperacusis. Across subjects, sounds that elicit negative emotional reactions also lead to elevated physiological arousal (e.g., larger changes in pupil diameter and skin conductance responses) relative to neutral sounds. We also find that neural sound-level growth is steepest for individuals with tinnitus and in subjects who report subjective hypersensitivity. The development and validation of a comprehensive battery to quantify the multifaceted aspects of sound tolerance will prove valuable in diagnosing and managing disorders with core hypersensitivity phenotypes.
Gavin Mischler, Menoua Keshishian, Stephan Bickel, Ashesh Mehta and Nima Mesgarani
Topic areas: speech and language
Adaptation Computational Modeling Dynamic STRF Auditory Representations
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
The human auditory pathway displays a robust capacity to adapt to sudden changes in background noise. While this neural ability has been identified and characterized, its mechanism is not well understood due to the difficulty of interpreting such a nonlinear behavior. Traditional linear models are highly interpretable but fail to capture the nonlinear dynamics of adaptation. To overcome this, we employ convolutional neural networks (CNN) with ReLU activations, which can be easily interpreted as dynamic STRFs (DSTRF), since a linear equivalent function to the CNN can be found for each stimulus instance. We seek to interpret the nonlinear operations of the DSTRFs which have been trained to mimic the brain’s responses in order to understand how the brain may be dealing with changing noise conditions. We recorded intracranial EEG (iEEG) from neurosurgical patients who attended to speech with background noise that kept changing between several categories. We first demonstrate that a feedforward CNN trained to predict neural responses can produce the same rapid adaptation phenomenon as the auditory cortex, a feat which linear models cannot achieve. We further analyze the computations performed by these models and show how some neural regions react to a sudden change in background noise by altering their filters to deal with the new sound statistics. By looking at how these filters change over time particularly when adapting to a new background noise, we provide evidence that can explain how the brain achieves noise-robust speech perception in real-world environments.
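Because a ReLU network computes an exactly linear function within the activation pattern selected by a given input, the locally linear equivalent filter (the dynamic STRF) for each stimulus instance can be read out from the input-output gradient. A minimal sketch of that idea, assuming a PyTorch encoding model that maps a spectrogram to a predicted response time series (the model, shapes, and names are illustrative, not the authors' implementation):

```python
import torch

def dynamic_strf(model, stimulus):
    """Extract the locally linear equivalent filters (dynamic STRFs) of a
    piecewise-linear (ReLU) encoding model for one stimulus instance.

    model    : torch.nn.Module mapping a spectrogram (freq x time) to a
               predicted neural response of shape (time,)  [assumed]
    stimulus : torch.Tensor of shape (freq, time)
    Returns a tensor of shape (time, freq, time): for each output time bin,
    the linear filter the network effectively applies to this stimulus.
    """
    stimulus = stimulus.clone().requires_grad_(True)
    response = model(stimulus)                      # shape (time,)
    filters = []
    for t in range(response.shape[0]):
        # gradient of one output bin w.r.t. the input = the DSTRF at bin t
        grad, = torch.autograd.grad(response[t], stimulus, retain_graph=True)
        filters.append(grad.detach().clone())
    return torch.stack(filters)
```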
Timothy Olsen and Andrea Hasenstaub
Topic areas: neural coding
Sound offset response Short-term plasticity Interneurons PV SST
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
The neural response to sound is dependent on stimulus history, with sound-evoked responses subject to various forms of short-term plasticity (STP). In addition to spiking in response to sound, many neurons in the central auditory system fire action potentials in response to the cessation of a sound (so-called “offset responses”). Sound-offset responses are essential for sound processing, including the encoding of sound duration and frequency-modulated sweeps, which in turn are essential for the understanding of continuous sounds such as speech. Whether offset responses are also subject to STP is currently unknown. We presented awake mice with a series of repeating noise bursts, recorded through the layers of primary auditory cortex (A1) with a silicon probe, and measured dynamics of spiking response magnitudes from putative pyramidal (broad spiking: BS), PV and SST cells. BS and PV cells showed transient sound-evoked responses and frequent sound-offset responses. In contrast, SST cells generally exhibited sustained sound-evoked responses and few offset responses. Sound-evoked responses from BS cells typically depressed, whereas their sound-offset responses were more likely to remain stable or facilitate. In comparison, PV cells showed depressive responses for both sound-evoked and sound-offset responses, whereas SST cells showed increased sound-evoked facilitation and similar sound-offset STP. Across the population of all recorded cells, sound-evoked STP did not predict sound-offset STP, and vice-versa. This study reveals that sound-offset responses in A1 are subject to short-term plasticity, with different cell types in A1 showing varying amounts of sound-evoked and sound-offset adaptation and facilitation.
Benjamin Auerbach, Xiaopeng Liu, Kelly Radziwon and Richard Salvi
Topic areas: auditory disorders correlates of behavior/perception
autism fragile x syndrome hyperacusis
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Auditory processing impairments are a defining feature of autism spectrum disorders (ASD), most notably manifesting as extreme sensitivity and/or decreased tolerance to sound (i.e. hyperacusis). Fragile X syndrome (FX) is the leading inherited cause of ASD and, like the greater autistic population, a majority of FX individuals present with hyperacusis. Despite this prevalence and centrality to the autistic phenotype, relatively little is known about the nature and mechanisms of auditory perceptual impairments in FX and ASD. Using a combination of novel operant and innate perceptual-decision making paradigms, we found that a Fmr1 KO rat model of FX exhibits behavioral evidence for sound hypersensitivity in the form of abnormally fast auditory reaction times, increased sound avoidance behavior, and altered perceptual integration of sound duration and bandwidth. Simultaneous multichannel in vivo recordings from multiple points along the auditory pathway demonstrated that these perceptual changes were associated with sound-evoked hyperactivity and hyperconnectivity in the central auditory system. These results suggest that increased auditory sensitivity in FX is due to central auditory hyperexcitability and disrupted temporal and spatial integration of sound input. This novel symptoms-to-circuit approach has the potential to uncover fundamental deficits at the core of FX and ASD pathophysiology while also having direct clinical implications for one of the most disruptive features of these disorders.
Stephen Town, Katherine Wood and Jennifer Bizley
Topic areas: correlates of behavior/perception hierarchical organization
Auditory cortex Cortical inactivation Behavior Vowel discrimination Sound localization Spatial release from masking Distributed coding Functional specialization
Fri, 11/5 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
A central question in auditory neuroscience is how far brain regions are functionally specialized for processing specific sound features such as sound location and identity. In auditory cortex, correlations between neural activity and sounds support both the specialization of distinct cortical subfields, and encoding of multiple sound features within individual cortical areas. However, few studies have tested the causal contribution of auditory cortex to hearing in multiple contexts. Here we tested the role of auditory cortex in both spatial and non-spatial hearing. We reversibly inactivated the border between middle and posterior ectosylvian gyrus using cooling as ferrets (n=2) discriminated vowel sounds in clean and noisy conditions. The same subjects were then retrained to localize noise-bursts from six locations and retested with cooling. In both ferrets, cooling impaired both sound localization and vowel discrimination in noise, but not discrimination in clean conditions. We also tested the effects of cooling on vowel discrimination in noise when vowel and noise were colocated or spatially separated. Here, cooling exaggerated deficits discriminating vowels with colocalized noise, resulting in increased performance benefits from spatial separation of sounds and thus stronger spatial release from masking during cortical inactivation. Together our results show that an auditory cortical area may contribute to both spatial and non-spatial hearing, consistent with single unit recordings in the same brain region. The deficits we observed did not reflect general impairments in hearing, but rather account for performance in more realistic behaviors that require use of information about both sound location and identity.
Sam Watson, Torsten Dau and Jens Hjortkjær
Topic areas: speech and language correlates of behavior/perception neural coding subcortical processing
Amplitude modulation EEG Modulation Filterbank Neural correlates
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Critical information for the perception of speech and other natural sounds is encoded within the envelope. Consequently, the auditory system accurately tracks envelope modulations at multiple rates, and a ‘modulation filterbank’ model has seen some success in accounting for modulation detection and discrimination behaviour, as well as some speech intelligibility data. The present study investigates whether there is a neural basis for such a modulation filterbank, detectable in far-field envelope following responses (EFRs). It utilises novel stimuli to simultaneously interrogate temporal and rate-based EFR signatures of amplitude-modulation coding in humans, and the stability of this coding during modulation masking. A fixed target-frequency amplitude modulation (AM) is imposed upon a carrier, along with a noise-band AM masker at intervals ranging ±2 octaves around the target AM rate. The presence of the target is switched on/off periodically at a slow (~2 Hz) rate to create an EEG readout of rate-based encoding pathways, while temporal coding is captured at the target frequency. Under the theorised modulation filterbank, modulation-domain masking should increase with the proximity of the AM masker to the target rate. Masking is measured as a reduction in the EEG steady-state response at the target or switching rate as a function of masker position. Comparing behavioural and EFR modulation-masking curves, preliminary data suggest that neural temporal coding of envelopes is unaffected by the presence of a competing random-noise modulation masker.
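To make the stimulus design concrete, the sketch below builds a toy version of the described stimulus: a carrier whose envelope contains a fixed-rate target AM, gated on and off at a slow switching rate, plus a narrow-band noise AM masker centred a chosen number of octaves from the target rate. All parameter values and the band-limiting scheme are illustrative assumptions, not those of the study.

```python
import numpy as np

def efr_stimulus(fs=44100, dur=10.0, carrier_hz=4000.0,
                 target_am_hz=40.0, masker_offset_oct=1.0,
                 switch_hz=2.0, rng=None):
    """Toy masked-AM stimulus: gated target AM plus a noise-band AM masker."""
    rng = np.random.default_rng(rng)
    t = np.arange(int(dur * fs)) / fs
    carrier = np.sin(2 * np.pi * carrier_hz * t)

    # target AM, gated on and off by a slow square wave (~2 Hz switching)
    gate = (np.sin(2 * np.pi * switch_hz * t) > 0).astype(float)
    target_am = 0.5 * gate * np.sin(2 * np.pi * target_am_hz * t)

    # noise-band AM masker: narrow-band noise centred masker_offset_oct
    # octaves above the target AM rate (crude FFT-domain band-pass)
    masker_hz = target_am_hz * 2 ** masker_offset_oct
    noise = rng.standard_normal(t.size)
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(t.size, 1 / fs)
    band = (freqs > masker_hz / 2 ** 0.25) & (freqs < masker_hz * 2 ** 0.25)
    spec[~band] = 0
    masker_am = np.fft.irfft(spec, n=t.size)
    masker_am *= 0.25 / (np.abs(masker_am).max() + 1e-12)

    envelope = 1.0 + target_am + masker_am       # keep the envelope positive
    return carrier * np.clip(envelope, 0, None)
```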
Fangchen Zhu, Sarah Elnozahy, Jennifer Lawlor and Kishore Kuchibhotla
Topic areas: subcortical processing
cholinergic system axonal imaging two-photon imaging
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
The cholinergic basal forebrain (CBF) projects extensively to the auditory cortex (ACx). To date, however, little is known about the intrinsic sensory-evoked dynamics of the CBF. Here, we used simultaneous two-color, two-photon imaging of CBF axon projections and cortical neurons in the ACx to examine stimulus-evoked responses in head-fixed mice passively listening to a suite of auditory stimuli. We observed striking, non-habituating, phasic responses of CBF axons to neutral auditory stimuli that were correlated with tonic cholinergic activity – a known neural correlate of brain state. However, we observed no evidence of tonotopy in CBF axons; rather, there is a coarse population tuning to low-to-mid frequencies that is homogeneous across the ACx. Interestingly, individual axon segments exhibited heterogeneous tuning within imaging sites, allowing the CBF to respond to all frequencies presented. Despite this microscopic heterogeneity, the tuning of axon segments and nearby cortical neurons was uncoupled. Finally, using chemogenetic inactivation of the ACx and auditory thalamus while imaging CBF axons in the ACx, we demonstrated that inactivation of the auditory thalamus, but not ACx, disrupted the frequency tuning of CBF axons and significantly dampened their responsiveness. Our work proposes a novel, non-canonical function for the CBF in which the basal forebrain receives auditory input from the auditory thalamus, modulates these signals based on brain state, and then projects the multiplexed signal to the ACx. These signals are temporally synchronous with cortical responses but differ in their underlying tuning, providing a potential mechanism to influence cortical sensory representations during learning and task engagement.
Rose Ying and Melissa Caras
Topic areas: correlates of behavior/perception subcortical processing
inferior colliculus perceptual learning task-related plasticity electrophysiology
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Sensory stimuli that are alike in nature, such as tones with similar frequencies or slightly different shades of the same color, can be difficult to differentiate at first. However, training can lead to improvement in one’s ability to discriminate between the stimuli, a process called perceptual learning. Auditory perceptual learning is important for language learning, and it can also improve the use of assisted listening devices (Fu & Galvin, 2007). Auditory cortex is an important cortical hub for sensory information that also receives functional inputs from frontal regions associated with higher-order processing. Previous research has shown that perceptual learning strengthens the top-down modulation of auditory cortex. However, it is unclear whether these learning-related changes first emerge elsewhere in the ascending auditory processing pathway and are inherited by the auditory cortex, or arise in the cortex de novo. The inferior colliculus (IC) has been shown to display spectrotemporal task-related plasticity (Ryan & Miller, 1977; Slee & David, 2015), making it an attractive candidate region for the target of top-down projections that modulate neural improvement in perceptual learning. To explore this possibility, single-unit recordings were obtained from the IC of awake, freely-moving Mongolian gerbils during perceptual learning on an amplitude modulation detection task. Our results will determine whether the gerbil IC displays task-related plasticity, and whether learning-related plasticity occurs in the IC during perceptual training. These findings will contribute to a deeper understanding of the circuits behind perceptual learning, which can aid translational research in hearing loss and cochlear implant use.
Meredith Schmehl and Jennifer Groh
Topic areas: correlates of behavior/perception multisensory processes neural coding subcortical processing
multisensory integration sound localization inferior colliculus macaque
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Visual cues can influence brain regions that are sensitive to auditory space (Schmehl & Groh, Annual Review of Vision Science 2021). However, how such visual signals in auditory structures contribute in the perceptual realm is poorly understood. One possibility is that visual inputs help the brain distinguish among different sounds, allowing better localization of behaviorally relevant sounds in noisy environments (i.e., the cocktail party phenomenon). Our lab previously reported that when two sounds are present, auditory neurons may switch between encoding each individual sound across time (Caruso et al., Nature Communications 2018). We sought to study how pairing a light with one of two sounds might change these time-varying responses (e.g., Atilgan et al., Neuron 2018). We trained a rhesus macaque to localize one or two sounds in the presence or absence of accompanying lights. While the monkey performed this task, we recorded extracellularly from single neurons in the inferior colliculus (IC), a critical auditory region that receives visual input and has visual and eye movement-related responses. We found that pairing a light and sound can change an IC neuron's response to that sound, even if the neuron is unresponsive to light alone. Further, when two sounds are present, pairing a light with one of the sounds can change a neuron's time-varying response to the two sounds. Together, these results suggest that the IC alters its sound representation in the presence of visual cues, providing insight into how the brain combines visual and auditory information into a single perceptual object.
Jong Hoon Lee and Xiaoqin Wang
Topic areas: neural coding
vocalizations auditory belt marmoset auditory cortex high-density electrodes
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
In early stages of auditory processing, neurons faithfully encode acoustic features of sounds. As we move up the auditory pathway, the auditory system is thought to represent behaviorally relevant stimuli (e.g., species-specific vocalizations) in a manner that is invariant to differences or changes in their basic acoustic parameters when such differences fall within the range of natural stimuli. This invariance is believed to be the underlying mechanism for perceptual phenomena such as categorical perception or the perceptual magnet effect that have been documented in human speech perception (Kuhl 1991). Although there have been studies investigating the emergence of such invariance, there is much to discover in terms of where and how it is represented. In this study we investigated how core and belt areas of marmoset auditory cortex encode changes in center frequency of both pure tones and synthesized marmoset phee calls and compared the neural responses with corresponding behavioral measures. Marmoset is a highly vocal non-human primate species with a hearing range similar to that of humans. We observed that in the core area, responses to a given pair of stimuli (pure tones or phees) reflect their differences in center frequency. In the belt area, however, this was the case for the responses to pure tones, but not to phees. In particular, offset responses to frequency-shifted phees were invariant to changes in center frequency when the frequency shift was within the range of the center frequency of phee population samples recorded in our colony (Agamaite et al., 2015).
Ryan Calmus, Benjamin Wilson, Yukiko Kikuchi, Zsuzsanna Kocsis, Hiroto Kawasaki, Timothy D. Griffiths, Matthew A. Howard and Christopher I. Petkov
Topic areas: memory and cognition speech and language neural coding
Sequence learning Computational modelling Relational code Binding ECoG Electrophysiology Cognition Human
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Understanding how the brain represents and binds information distributed over time benefits from neurocomputationally informed approaches. Language exemplifies the temporal binding problem, where syntactic knowledge facilitates mental restructuring of words/phrases, yet the problem is broadly relevant. For instance, various animal species can learn auditory sequencing dependencies in Artificial Grammar Learning (AGL) tasks, and in primates fronto-temporal regions including the frontal operculum and areas 44/45 are implicated. We recently proposed a neurocomputational model, VS-BIND, which triangulates reported findings across frontal and auditory areas to define site-specific roles and interactions (Calmus et al., 2019). We are testing this model with human and primate intracranial recordings and here report tests undertaken on AGL data in neurosurgery patients being monitored for epilepsy treatment. In the AGL task, 12 patients listened to speech sequences containing adjacent and non-adjacent dependencies and were then tested on their ability to distinguish novel "grammatical" and "ungrammatical" sequences. Analysis of the intracranial data using traditional methods demonstrated fronto-temporal engagement, and we subsequently undertook novel multivariate analyses to reveal the representational geometry of regional relational encodings and inter-regional causal flow. Results revealed that prefrontal and auditory areas interact to integrate relational information, including the ordinal positions of items in a speech sequence, concordant with predictions of VS-BIND. We observed causal flow consistent with expectation-driven prefrontal feedback predictions to primary auditory cortex and feedforward auditory information flow. These results indicate critical fronto-temporal roles in transforming the auditory sensory world into mental structures, and show how the neural system integrates key speech sequence features into relational codes.
Rebecca Krall, Megan Arnold, Callista Chambers, Mara Frum and Ross Williamson
Topic areas: correlates of behavior/perception neural coding subcortical processing
Auditory categorization Corticofugal Sensory-guided behavior
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Sensory-guided behavior is ubiquitous in everyday life. During sensory-guided behavior, distinct populations of neurons encode relevant sensory information and transform it into an appropriate behavioral response. An open problem is identifying which neural circuits contribute to such behaviors. In the auditory system, information is propagated through a feed-forward hierarchy that runs from the cochlea to the primary auditory cortex (ACtx). Corticofugal neurons in the ACtx send projections to multiple nodes of this ascending pathway and target distinct downstream regions associated with decision making, action, and reward. Through these projections, corticofugal neurons are positioned to broadcast behaviorally relevant information and shape auditory representations across brain-wide networks. We hypothesized that distinct classes of ACtx projection neurons differentially mediate auditory-guided behaviors. To test this hypothesis, we developed a head-fixed behavioral task in which mice are trained to categorize amplitude-modulated noise bursts by licking one of two spouts yoked to the modulation frequency. To determine the role of ACtx in this behavior, we optogenetically silenced excitatory neurons during stimulus presentations using soma-targeted Guillardia theta anion-conducting channelrhodopsins (GtACRs). We found that inactivating ACtx excitatory neurons on 20% of trials did not disrupt task learning, allowing us to assess the contribution of ACtx neural populations longitudinally. We found that inhibition of excitatory neurons in the ACtx altered task performance in a manner dependent on the extent of learning and task difficulty. Our ongoing research seeks to characterize the specific contributions of distinct ACtx projection neuron classes using selective expression of GtACRs to inhibit activity across task acquisition and expression.
Justin Yao, Klavdia Zemlianova and Dan Sanes
Topic areas: correlates of behavior/perception
parietal cortex temporal integration alternative forced choice neural manifolds
Fri, 11/5 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
The transformation of sensory evidence into decision variables is fundamental to forming perceptual choices. We asked how the neural representation of acoustic information is transformed in the auditory-recipient parietal cortex, a region that is causally associated with sound-driven perceptual decisions (Yao et al., 2020). Neural activity was recorded wirelessly from parietal cortex while gerbils performed an alternative forced choice auditory temporal integration task, or during passive listening to the identical acoustic stimuli. Gerbils were required to discriminate between two amplitude modulated (AM) noise rates, 4 versus 10 Hz, as a function of signal duration (100-2000 ms). Task performance improved with increasing duration, and reached an optimum at ≥600 ms. We found that population activity from simultaneously recorded parietal neurons represented acoustic information (4 vs 10 Hz AM). A principal component analysis fit to trial averaged neural responses revealed low-dimensional encoding of acoustic information as neural trajectories (i.e., neural manifolds) differentiated across stimulus conditions. During task performance, decoded population activity reflected psychometric performance, which was consistent with low-dimensional encoding of acoustic information (seen in passive listening) and behavioral choices (left versus right). At stimulus onset, neural trajectories started at a similar position, but began to diverge toward the relevant decision subspace after ~300 ms of acoustic stimulation. Neural trajectories of incorrect trials tended to course along the opposing decision subspace, reflecting lapse rate or failed evidence accumulation. Taken together, our findings demonstrate that parietal cortex leverages the encoded auditory information to guide sound-driven perceptual decisions over a behaviorally-relevant time course.
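The low-dimensional trajectory analysis described here typically amounts to fitting PCA to trial-averaged population responses and projecting each stimulus condition into the shared component space. A minimal sketch of that step, with names and shapes assumed for illustration rather than taken from the authors' analysis:

```python
import numpy as np
from sklearn.decomposition import PCA

def neural_trajectories(rates_by_condition, n_components=3):
    """Fit PCA to trial-averaged population responses and return the
    low-dimensional trajectory (time course in PC space) per condition.

    rates_by_condition : dict mapping a condition label (e.g. '4 Hz AM',
        '10 Hz AM') to an array of shape (n_timebins, n_neurons) of
        trial-averaged firing rates.
    """
    # fit the principal components on all conditions stacked over time
    stacked = np.vstack(list(rates_by_condition.values()))
    pca = PCA(n_components=n_components).fit(stacked)
    # project each condition to obtain its trajectory through state space
    return {cond: pca.transform(resp)
            for cond, resp in rates_by_condition.items()}
```

Divergence of the resulting trajectories across conditions (and toward a decision subspace during task performance) is what the abstract refers to as differentiation of the neural manifolds.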
Keith Kaufman, Rebecca Krall, Megan Arnold and Ross Williamson
Topic areas: correlates of behavior/perception neural coding
auditory cortex arousal pupil brain state corticofugal neurons frequency tuning
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Fluctuations in behavioral state, such as attention and arousal, persist during wakefulness and influence both behavior and sensory information processing. Strong correlations between pupil diameter, a biomarker of arousal state, and evoked neural activity in multiple sensory cortices are well-documented. Lin et al (2019) found that arousal modulates both the frequency tuning and response magnitude of L2/3 pyramidal neurons in the primary auditory cortex (ACtx). ACtx pyramidal neurons massively innervate many downstream targets, both within and outside of the central auditory pathway. Notably, classes of corticofugal neurons (i.e., intratelencephalic (IT), extratelencephalic (ET), and corticothalamic (CT)) differ regarding their anatomy, morphology, and intrinsic and synaptic properties. The distinct anatomy and connectivity profiles of these cells lead us to hypothesize that arousal states may differentially modulate their sensory tuning properties. To investigate this, we recorded neural activity from ACtx with two-photon calcium imaging in awake mice while simultaneously capturing facial movements and pupil dilations. We drove GCaMP8s expression in IT/ET/CT populations through Cre-dependent viral transfection. Similar to Lin et al. (2019), we found arousal-dependent changes in the tuning properties of L2/3 IT cells, evidenced by an increase in bandwidth and response magnitude coinciding with higher arousal states. Our preliminary data suggests that the effect of arousal differs in other projection classes, with response magnitudes peaking at intermediate levels of arousal, following the classic inverted-U dependence (Yerkes-Dodson curve). The characterization of these state-dependent effects on distinct excitatory populations provides valuable insight into how sensory information is shared brain-wide to guide perception and action.
Megan Arnold, Rebecca Krall and Ross Williamson
Topic areas: hierarchical organization subcortical processing
corticofugal extratelencephalic auditory projection
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Excitatory projection neurons in primary auditory cortex (ACtx) propagate sensory information brain-wide to inform emotion, attention, decision-making, and action. These neurons fall into three broad classes: intratelencephalic (IT), extratelencephalic (ET), and corticothalamic (CT). Of these classes, ET cells form the only direct connection between ACtx and myriad sub-cortical targets. Their distinct morphology, with prominent apical dendrites and diverse axonal targets, puts them in a privileged position to broadcast sensory signals to multiple downstream targets simultaneously. However, the extent of their axonal collateralization, the spatial organization of their projections, and whether these distinct organizational motifs receive differential synaptic input remain unknown. To address these questions, we characterized the input/output circuitry of ACtx ET cells and compared their anatomical organization to that of IT and CT populations. We drove selective viral expression of a fluorophore in distinct ET sub-populations, allowing us to quantify downstream projection densities and identify local and long-range synaptic input through monosynaptic rabies tracing. Our preliminary results indicate that many ET neurons collateralize to the non-lemniscal regions of the inferior colliculus and thalamus, confirming previous reports. Monosynaptic rabies tracing demonstrated widespread synaptic inputs to ET, IT, and CT cells from many cortical and subcortical areas, including the thalamus, contralateral ACtx, and ipsilateral visual, somatosensory, parietal, and retrosplenial cortices. Our ongoing experiments are focused on extending these findings to distinct ET organizational motifs. This work will provide a foundation for understanding how brain-wide interactions between distinct areas cooperate to orchestrate sensory perception and guide behavior.
Zsuzsanna Kocsis, Rick L. Jenison, Thomas E. Cope, Peter N. Taylor, Bob McMurray, Ariane E. Rhone, Mccall E. Sarrett, Yukiko Kikuchi, Phillip E. Gander, Christopher K. Kovach, Fabien Balezeau, Inyong Choi, Jeremy D. Greenlee, Hiroto Kawasaki, Timothy D. Griffiths, Matthew A. Howard and Christopher I. Petkov
Topic areas: speech and language correlates of behavior/perception neural coding
surgical disconnection of the ATL diaschisis speech prediction frontal-auditory neural signals
Thu, 11/4 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
The strongest level of causal evidence for the neural role of a brain hub is to measure the network-level effects of its disconnection. Here we present rare data from two patients who underwent surgical disconnection of the anterior temporal lobe (ATL) as part of a clinical procedure to treat intractable epilepsy. During the surgery, we obtained pre- and post-resection intraoperative electrocorticographic (ECoG) recordings while the patients were awake and performing a speech-sound perceptual prediction task. We also obtained pre- and post-operative magnetic resonance imaging (MRI), including T1 and T2 structural and diffusion-weighted scans. Diffusion MRI tractography from ATL seed regions confirmed disconnection of the temporal pole from other cortical areas. Post-disconnection neurophysiological responses to the speech sounds showed a striking dissociation from the pre-disconnection signal in the form of (1) magnified responses in auditory cortex (Heschl’s gyrus) across oscillatory frequency bands (3-150 Hz), and (2) disrupted oscillatory responses in prefrontal cortex (inferior frontal gyrus, IFG). Moreover, after the disconnection, auditory cortical mismatch responses and theta-gamma coupling to the speech sounds were disrupted, and neural responses to different speech sounds became less segregable (i.e., more similar). State-space conditional Granger causality analyses revealed substantial changes in neural information flow between auditory cortex and IFG post-disconnection. Overall, we demonstrate diaschisis, whereby the loss of ATL neural signals results in an immediate change in activity and connectivity in intact frontal and auditory cortical areas, potentially reflecting incomplete compensation for processing and predicting speech sounds.
Amy LeMessurier, Kathleen Martin, Cheyenne Oliver and Robert Froemke
Topic areas: memory and cognition correlates of behavior/perception neural coding
auditory cortex perceptual learning inhibition interneurons
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Auditory perceptual learning is associated with changes in the tonotopic organization of frequency tuning in auditory cortex, as well as modulation of tuning depending on behavioral context. This modulation depends on activity in inhibitory circuits. Stimulation of neuromodulatory centers projecting to cortex can speed perceptual learning and induce tonotopic map plasticity. Interneurons in layer 1 (L1) receive input from long-range neuromodulatory and intracortical inputs, and target the dendrites of pyramidal cells and layer 2/3 interneurons, positioning them to gate integration of neuromodulatory and sensory input. We hypothesize that plasticity in L1 is a crucial component of auditory perceptual learning. We trained 5 mice on an appetitive, 2-alternative forced-choice tone recognition task while measuring activity in NDNF interneurons on each day of training using chronic 2-photon calcium imaging. Each mouse initially achieved < 80% correct discriminating a center tone and a foil before the introduction of additional foils surrounding the target frequency. After 28 +/- 7 days of training each mouse correctly identified tones on 76 +/- 4% of trials. NDNF neurons displayed a variety of tuning profiles for tones used in the task, including suppression of responses to the target tone in some neurons and enhanced responses in others. Additionally, tuning in NDNF neurons was task modulated – tuning curves differed within most cells between the task context and passive tone presentation. These results suggest that tuning in NDNF interneurons may be plastic over the course of training, and that activity in these neurons may shape context-dependent activity in downstream neurons.
Hannah M. Oberle, Alexander F. Ford, Jordyn Czarny and Pierre F. Apostolides
Topic areas: hierarchical organization neural coding subcortical processing
Inferior Colliculus Top-down control Synaptic mechanisms Cortico-collicular
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Active listening, such as conversing in a noisy environment, relies on “top-down” cognitive resources. Descending projections from the auditory cortex to sub-cortical regions are thought to be a major source of such “top-down” signals, but the cellular mechanisms supporting descending transmission are unknown. To address this, we determined the biophysical properties of descending synapses from the auditory cortex to the inferior colliculus (IC) using in vitro and in vivo whole-cell electrophysiology and optogenetics. We found that brief auditory cortex stimulation reliably triggered EPSPs with a range of peak amplitudes, primarily in the superficial layers of the IC. Using a minimal fiber stimulation approach, we quantified the strength of single auditory cortico-collicular contacts and determined that multiple descending axons converge onto single IC neurons. Latency measurements in vivo suggest that descending cortical signals likely arrive in the IC rapidly following sound onset, such that the auditory cortex may influence the first-spike latencies of the shell IC. In vitro, dual-pathway stimulation that mimics the in vivo timing of ascending and descending pathways causes an NMDA receptor-dependent, supra-linear integration of excitatory inputs. Our work suggests that cortico-collicular input can reliably and substantially modulate shell IC activity, leading to the amplification of IC efferent signals when ascending and descending inputs arrive in quick succession.
Kathleen Martin, Eleni Papadoyannis, Jennifer Schiavo, Nesibe Temiz, Saba Fadaei, Matthew McGinley, David McCormick and Robert Froemke
Topic areas: correlates of behavior/perception
auditory perceptual learning vagus nerve stimulation cholinergic modulation
Thu, 11/4 12:30PM - 12:45PM | Short talk
Abstract
Input from the periphery to the central nervous system (CNS) can influence sensory perception. Previous work has demonstrated that the vagus nerve plays a key role in transmitting peripheral information to the CNS, in part by activating neuromodulatory areas, including the basal forebrain (BF). Here, we examined whether vagus nerve stimulation (VNS), and subsequent activation of cholinergic BF neurons, could be used to improve sensory discrimination. To study this, we sought to improve auditory perceptual decisions in well-trained mice. Mice were trained to classify tones as a single, center frequency (11-16 kHz) or non-center. In well-trained animals, perceptual decisions were variable across animals, but stable within an animal. We used a custom cuff electrode to stimulate the vagus nerve in blocks of trials in animals with stable behavior, to test whether VNS could improve performance. After six days of VNS, we observed significant improvements in performance (N=11 animals) in comparison to sham-implanted animals. We then dissected the contribution of cholinergic signaling to VNS-mediated perceptual improvement. We found increased activation of cholinergic BF axons in auditory cortex during VNS. Since VNS activated auditory cortex-projecting cholinergic neurons, we optogenetically activated these neurons to mimic VNS. Activation of cholinergic neurons alone was sufficient to improve behavior (N=5 animals). To test whether VNS-mediated improvements were dependent on the activity of cholinergic neurons, we optogenetically inhibited cholinergic BF neurons during VNS. This abolished the VNS-mediated improvement previously seen (N=6 animals). Taken together, these results indicate that VNS improves perceptual discrimination, at least in part, by activating auditory cortex-projecting cholinergic neurons in the BF. Informal discussion to follow at 1:15 pm EDT (GMT-4) on Zoom (link below).
Rebecca Norris, Stephen Town, Katherine Wood and Jennifer Bizley
Topic areas: multisensory processes
Multisensory Audiovisual Ferret Auditory cortex Suprasylvian cortex
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Multisensory integration has been demonstrated across cortical and subcortical areas. In the ferret, some neurons in auditory cortex (AC) respond to visual stimuli, and AC responses to sound can be modulated by visual stimuli. AC receives input from several potential sources of visual information: parietal cortex, the suprageniculate nucleus of the thalamus, and visual cortex. Here, we investigated the role of a sub-region of visual cortex (the suprasylvian cortex, SSY and adjacent area 21) in multisensory integration. SSY and area 21 send dense projections to AC, particularly to anterior regions. To assess the functional relevance of these connections, we recorded the responses of AC neurons under ketamine-medetomidine anaesthesia to auditory, visual and combined audiovisual stimuli before, during and after transient inactivation of SSY via cooling. Of the 312 stimulus responsive units identified, 224 (72%) were responsive to broadband noise, and 143 (45%) responded to a white light flash. Cooling SSY impacted 35% (51/143) of visually responsive units, most of which exhibited decreased responses. However, we also recorded 13/51 units in which visual activity emerged or increased during cooling. Intriguingly, some units showed a shift in the timing of their responses under cooled conditions, and we additionally observed that some units showed persistent changes in firing pattern after cooling, despite SSY returning to pre-cooling temperature and activity. These findings support a functional role for both excitatory and inhibitory effects of visual cortex on audiovisual integration in AC, while also implicating the involvement of additional pathways.
Dana Boebinger, Sam Norman-Haignere, Josh McDermott and Nancy Kanwisher
Topic areas: memory and cognition correlates of behavior/perception hierarchical organization neural coding
auditory cortex music perception fMRI
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Converging evidence suggests that neural populations within human non-primary auditory cortex respond selectively to music. These neural populations respond strongly to a wide range of music stimuli, and weakly to other natural sounds and to synthetic control stimuli matched to music in many acoustic properties, suggesting that they are driven by high-level musical features. What are these features? Here we used fMRI to test the extent to which musical structure in pitch and time contribute to music-selective neural responses. We used voxel decomposition to derive music-selective response components in each of 15 participants individually, and then measured the response of these components to synthetic music clips in which we selectively disrupted musical structure by scrambling either the note pitches and/or onset times. Both types of scrambling produced lower responses compared to when melodic or rhythmic structure was intact. This effect was specific to the music-selective component, and not present even in spatially overlapping response components with other selectivities. We further found no evidence for any cortical regions sensitive to pitch but not time structure, or vice versa. Our results suggest that the processing of melody and rhythm are intertwined within auditory cortex.
Beshoy Agayby, Yukiko Kikuchi, Ross Stephen Muers, Jennifer Soraya Nacef, Michael Christoph Schmid and Christopher I. Petkov
Topic areas: multisensory processes novel technologies
Optogenetics Laminar recordings Auditory Cortex VLPFC STS Theta Frontal-auditory Circuits
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Multisensory neural interactions often involve the upper bank of the superior temporal sulcus (STS) and ventro-lateral prefrontal cortex (VLPFC, e.g., areas 44/45). These regions contain audio-visual bimodal neurons and clusters, and VLPFC receives monosynaptic input from auditory belt areas (Romanski et al., 1999; Rocchi et al., 2021). However, how this projection contributes to VLPFC function, including audiovisual integration, is not clear. To address this, we tested whether optogenetic perturbation of auditory belt areas would influence VLPFC responses to voices, faces or both. We injected the optogenetic viral construct AAV9-CaMKII-ChR2-eYFP at multiple sites and depths (total volume = 36µl) in the anterior part of the auditory belt. Following expression, we confirmed strong spiking responses in auditory belt neurons in response to optogenetic stimulation modulated at 5Hz and 40Hz. Near injected locations, there was an increase in multiunit activity (MUA) across the cortical layers in response to blue light (473nm). We then investigated the effect of optogenetic stimulation of auditory belt neurons, with or without simultaneous sensory stimulation with voices, faces or both, on VLPFC area 44 neuronal and LFP responses across the cortical layers. Frontal current source densities (CSDs) differed markedly across the sensory stimulation conditions and appeared to be modulated particularly in deeper layers during auditory cortex optogenetic stimulation. Our preliminary results demonstrate the feasibility of using macaque optogenetics to manipulate neuronal circuit activity during audio-visual integration, and aim to shed light on the mechanistic role of the rapid monosynaptic auditory input to fronto-temporal regions involved in sensory integration.
Taku Banno, Jaejin Lee, Yonatan I. Fishman and Yale E. Cohen
Topic areas: correlates of behavior/perception hierarchical organization neural coding
auditory scene analysis auditory stream segregation cortical pathway cortical layers macaque monkey
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
A fundamental goal of the auditory system is to transform acoustic stimuli into discrete perceptual representations. One important process in this transformation is the generation of “auditory streams”, which are formed by segregating auditory stimuli with different spectrotemporal regularities. Although behavioral studies have identified key psychoacoustic determinants of auditory streaming, its underlying neural mechanisms are still unclear. Here, we recorded multiunit activity (MUA) by linear array multicontact electrodes penetrating orthogonally into the core and belt auditory cortices in two macaque monkeys. The monkeys performed a behavioral task designed to provide an objective report of auditory streaming, wherein the successful detection of a “target” stimulus could only occur if a sequence of tones was perceptually segregated. We found that the core and belt encoded both stimulus and behavioral variables: At the beginning of the trials, the MUA encoded stimulus frequency, but as the sequence unfolded, the MUA was modulated more by a monkey’s behavioral outcomes. Comparisons between cortical areas and layers further revealed a characteristic distribution of the neural encoding. In the core, frequency separation between the tones modulated MUA more in deeper layers than in superficial layers; however, we could not identify any consistent differences across cortical layers in MUA modulation by behavioral outcomes. In contrast, in the belt, frequency differences modulated MUA primarily in superficial layers, whereas behavioral outcomes modulated MUA primarily in deeper layers. These findings indicate that, during auditory streaming, task variables are dynamically and differentially encoded in different layers of the ventral auditory pathway.
Katarina Poole, Maria Chait and Jennifer Bizley
Topic areas: memory and cognition correlates of behavior/perception neural coding
auditory patterns regularity detection auditory cortex ferret electrophysiology behaviour
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Acoustic stimuli that transition from randomly presented tones to regular sequences of tones have been employed to investigate the involvement of regularity within auditory scene analysis and have highlighted a network of brain regions, such as auditory cortex and hippocampus, that are involved in this process (Barascud et al., 2016). However, how neurons in auditory cortex detect and respond to regularity is unknown. Here we seek to understand how neurons within auditory cortex detect acoustic patterns and encode the transition from random to regular tone sequences. We trained ferrets (n=6) on a go/no-go task to detect the transition from a random sequence of tones to a repeating pattern of 3 tones, with all animals performing significantly above chance (p < 0.001, d’ range across animals = 1.87 to 2.33). Performance decreased but was maintained above chance for the majority of animals across increasingly complex auditory patterns (p < 0.05, mean d’ across animals for pattern repeat lengths 3, 5 and 7 respectively: 2.12, 1.50, and 0.87) and with matched stimuli where both the random and regular sequence were generated from the same unique frequencies (p < 0.05, mean d’ across animals: 0.86). We are currently performing chronic neural recordings across primary and secondary fields of auditory cortex in trained ferrets (n=2) using multi-electrode arrays. Here we hope to elucidate the neural mechanisms involved in the behavioural detection of regularity by comparing behavioural performance and the elicited neural correlates.
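For readers unfamiliar with the d′ values quoted above, sensitivity in a go/no-go task follows directly from the hit and false-alarm rates via the standard signal-detection formula. A minimal sketch (the example rates and trial counts are invented for illustration, not the animals' data):

```python
import numpy as np
from scipy.stats import norm

def d_prime(hit_rate, false_alarm_rate, n_go, n_nogo):
    """Sensitivity index d' for a go/no-go task, with a standard
    correction to avoid infinite z-scores when rates reach 0 or 1."""
    hr = np.clip(hit_rate, 1 / (2 * n_go), 1 - 1 / (2 * n_go))
    fa = np.clip(false_alarm_rate, 1 / (2 * n_nogo), 1 - 1 / (2 * n_nogo))
    return norm.ppf(hr) - norm.ppf(fa)

# example: 85% hits and 20% false alarms over 200 go and 200 no-go trials
print(d_prime(0.85, 0.20, 200, 200))   # ~1.88
```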
Vinay Raghavan, James O'Sullivan, Stephan Bickel, Ashesh Mehta and Nima Mesgarani
Topic areas: speech and language correlates of behavior/perception
speech perception cocktail party glimpsing model ecog
Fri, 11/5 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
Speech perception in multitalker acoustic environments requires listeners to extract and group features of a target talker from those of non-target talkers. Because information in speech is both sparse and redundant, listeners are thought to rely upon spectrotemporal glimpses in which one talker contains more energy than the others. Conversely, the phoneme restoration effect suggests masked phonemes may be restored to support continuous speech perception. Despite computational and behavioral evidence showing glimpses are sufficient to support speech recognition, the neuroscientific evidence for glimpsing is lacking. Therefore, we investigated how attention to a talker changes the encoding of glimpsed and masked phonemes of target and non-target talkers in human auditory cortex. We obtained iEEG recordings in HG and STG while subjects attended to one talker in a two-talker mixture. We used linear encoding models to predict the high-gamma envelope of neural responses using acoustic, phonetic, and linguistic features of target and non-target talkers. We found linguistic encoding beyond phonetic features for only the target talker. In processing phonetic features, responses were more accurately predicted using separate glimpsed and masked phonetic representations. In particular, we found glimpsed phonetic encoding of target and non-target talkers in HG. We also found that STG represents both masked and glimpsed phonetic features for only the target talker, with masked phonetics encoded 100ms later. These findings provide neural evidence for the glimpsing model and suggest that auditory cortex continuously restores masked target speech, leading to a complete and invariant representation of target talker phonetics in STG.
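The glimpsing analysis hinges on labelling each time point (or spectrotemporal bin) as glimpsed or masked for the target talker according to which talker dominates the mixture, and then splitting the phonetic feature matrix accordingly before fitting the encoding models. A minimal sketch of one such operational definition (the threshold, function names, and frame-level simplification are assumptions, not the authors' exact criterion):

```python
import numpy as np

def glimpse_mask(spec_target, spec_other, margin_db=0.0):
    """Mark each spectrotemporal bin as 'glimpsed' for the target talker
    when the target's energy exceeds the competing talker's by at least
    `margin_db` (a common operational definition of a glimpse)."""
    eps = 1e-12
    target_db = 10 * np.log10(spec_target + eps)
    other_db = 10 * np.log10(spec_other + eps)
    return (target_db - other_db) >= margin_db   # True = glimpsed bin

def split_phoneme_features(phoneme_matrix, glimpsed_frames):
    """Split a (time x feature) phonetic matrix into 'glimpsed' and
    'masked' copies, given a boolean vector indicating whether each time
    frame is dominated by the target talker."""
    glimpsed = phoneme_matrix * glimpsed_frames[:, None]
    masked = phoneme_matrix * (~glimpsed_frames)[:, None]
    return glimpsed, masked
```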
Ana Sanchez Jimenez, Katherine Willard, Victoria Bajo Lorenzana, Andrew King and Fernando Rodriguez Nodal
Topic areas: correlates of behavior/perception
Sound localization Operant behavior Conductive hearing loss Plasticity
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Sound localization relies on the capacity of the brain to extract spatial information embedded in the auditory signal. Unilateral conductive hearing loss (UCHL) produces a binaural imbalance, which compromises the computations required for sound localization. However, previous studies have shown that training enables individuals with UCHL to partially recover their spatial accuracy. It remains unclear whether this adaptive process generalizes beyond the training stimuli. Seven ferrets trained to localize broadband noise bursts in a silent background showed a severe reduction in accuracy when one ear was plugged, which then recovered with training (response accuracy improved from 20% to 70-90% and error size decreased from 47.9±14.3 to 11.1±10.7 degrees). Similarly, the accuracy of head-orienting responses, which were initially biased toward the side of the non-occluded ear, recovered in two of the animals, suggesting the likely involvement of subcortical processing. Localization accuracy was further tested before and after UCHL adaptation using noise bursts of different bandwidths against a background of either constant or amplitude-modulated noise. While localization accuracy varied with stimulus type depending on the cues available and the masking effects of the background noise, the relative performance of the animals for different sounds was unaffected by UCHL. However, a positive correlation was observed between the degree of adaptation achieved with the training stimulus and generalization of the animals’ performance to other stimulus types and to noisy backgrounds. Overall, these results suggest that recovery in spatial hearing can generalize to other sounds and that adaptation and generalization may occur simultaneously.
Han Mu, Seung-Goo Kim and Tobias Overath
Topic areas: correlates of behavior/perception neural coding
fMRI imaging temporal response function acoustic energy sound-onset/-offset response
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
While the spatial patterns of information encoding in fMRI images are well established (e.g., Huth et al., 2016), the temporal response function (TRF; e.g., Ding & Simon, 2012) of fMRI time-series data remains less well understood. With respect to the latter, Harms et al. (2002, 2005) showed that auditory cortex exhibits a combination of “sustained” and “phasic” responses (positive peaks after sound onset and offset) depending on repetition rates and temporal densities. Motivated by this, we compared the predictive performance and the temporal response functions of various transformations of acoustic energy. In particular, we used (1) the cochleogram envelope, (2) the first-order derivative of the envelope and its nonlinear transforms: (3) positive and (4) negative half-wave rectification, and (5) absolute values. We analyzed a subset of an open-access fMRI data set from Sachs et al. (2019; openneuro-ds003085) in which 39 participants listened to a piece of 3-min instrumental pop music. A FIR model with delays up to 12 sec was fit using ridge regression and its prediction was evaluated via two-fold cross-validation. While regressors (4) and (5) showed significant prediction accuracies in the bilateral temporal cortices and the right inferior temporal gyrus (P < 0.025), only (5) showed physiologically plausible TRFs (positive peaks at around 6-8 sec). This result suggests that a considerable amount of variance of the fMRI responses to continuous sounds like music can be explained by offset responses. It also highlights the need for a careful choice of regressors to enhance the interpretability of linear predictive models.
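In outline, this kind of analysis builds lagged (FIR) design matrices from each envelope transformation and fits them to the voxel time series with ridge regression; the TRF is then the vector of delay weights. A minimal sketch, assuming a 1-s TR so that 12 s corresponds to 13 delays (the study's actual TR, ridge penalty, and cross-validation code are not reproduced here):

```python
import numpy as np
from sklearn.linear_model import Ridge

def make_regressors(envelope):
    """The five envelope transformations compared in the abstract:
    (1) cochleogram envelope, (2) its first derivative, and the
    (3) positive half-wave rectified, (4) negative half-wave rectified,
    and (5) absolute values of that derivative."""
    d = np.diff(envelope, prepend=envelope[0])
    return {
        "envelope": envelope,
        "derivative": d,
        "pos_rectified": np.maximum(d, 0),
        "neg_rectified": np.maximum(-d, 0),
        "abs_derivative": np.abs(d),
    }

def fir_design_matrix(x, n_delays):
    """Lagged copies of a regressor (delays 0 .. n_delays-1 samples),
    i.e. a finite-impulse-response design matrix."""
    X = np.zeros((len(x), n_delays))
    for k in range(n_delays):
        X[k:, k] = x[:len(x) - k]
    return X

def fit_fir(regressor, bold, n_delays=13, alpha=1.0):
    """Fit an FIR model to one voxel's time series with ridge regression;
    the returned coefficients are the estimated TRF (one weight per delay)."""
    X = fir_design_matrix(regressor, n_delays)
    model = Ridge(alpha=alpha).fit(X, bold)
    return model.coef_
```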
Galit Agmon, Martin G. Bleichner, Reut Tsarfaty and Elana Zion Golumbic
Topic areas: speech and language
Neural tracking Speech EEG Speech disfluencies Speech segmentation
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Neural speech-tracking experiments usually use idealized speech (e.g., audiobooks). However, this type of speech is dramatically different from the spontaneous speech produced in real life. Real-life speech contains frequent pauses and fillers (“um”, “er”), its rate varies over time, and it is also very associative, leading to highly complex sentences. The current study aims to extend speech-tracking research to explore how the brain encodes these unique properties of real-life speech. To this end, we recorded neural activity using EEG from 20 participants as they listened to an unscripted narrative in Hebrew. We estimated the temporal response function (TRF) to the acoustic envelope of the speech, after characterizing it along five different dimensions: lexicality, clause-boundaries, clause-duration, speech fluency, and syntactic complexity. We found robust TRFs in fronto-central electrodes with two prominent components: TRF-P2 and TRF-N350 (reflecting the polarity and latency of each component). The lexicality of utterances (proper words vs. fillers) affected both TRF-P2 and TRF-N350 components, which were mostly absent for non-lexical utterances. Words that constitute clause boundaries (opening vs. closing words) also showed modulation of these components. Clause duration affected the latency of the TRF-N350 response. Syntactic complexity affected the amplitude of the TRF-N350, which was larger for high-complexity clauses. Speech rate, however, did not seem to have a prominent effect on the TRF. In conclusion, the current work demonstrates the importance of acknowledging the complexity of real-life speech. We hope that this proof-of-concept study will provide the foundation for future research on the neural processing of real-life speech.
Jan Willem de Gee, Zakir Mridha, Marisa Hudson, Yanchen Shi, Hannah Ramsaywak, Spencer Smith, Nishad Karediya, Matthew Thompson, Kit Jaspe, Wenhao Zhang and Matthew Mcginley
Topic areas: correlates of behavior/perception
Decision-making Reward Arousal Pupil Acetylcholine
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Depending on the current estimate of a task’s utility, one’s goal might be to harvest rewards (exploitation) or disengage to seek alternatives (exploration). Prominent theories argue that increased neuromodulation promotes a shift from exploitation to exploration, or instead mediates attention. Experimental data constraining and arbitrating between such theories are limited. We developed an “attentional effort task” for head-fixed mice. Mice licked for sugar-water reward upon detection of temporally unpredictable coherence in a sustained tone-cloud. We manipulated task utility by changing the reward size in blocks of trials. Thus, mice should expend more attentional effort (exploit more) in high reward blocks and disengage more in low reward blocks. We simultaneously recorded pupil size, walking speed, and acetylcholine concentration in auditory cortex via two-photon imaging of GRABACh (a GPCR-based sensor). Here, we report behavioral and physiological signatures of adaptive shifts in behavior in >85 mice. Mice better detected the weak sensory signal in the high vs low reward context, indicating an increase in the efficacy of exploitation. Pre-trial baseline pupil size and walking speed were higher in the low reward blocks, indicating that mice are overly aroused and engage in exploratory behaviors when task utility is low. Finally, acetylcholine concentration phasically increased after correct and incorrect licks, in a reward-context dependent manner. In sum, we find that (pupil-linked) arousal and cholinergic signaling underlie adjustments of the exploration-exploitation trade-off in mice. In ongoing work, we are determining the roles of frontal-sensory interactions and other neuromodulatory systems in mediating these adaptive shifts in attentional state.
Keyvan Mahjoory, Andreas Bahmer and Molly J. Henry
Topic areas: neural coding
Decoding spatial attention Decoding Auditory perception Machine learning
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
In a multi-speaker scenario, human listeners are able to attend to one particular speaker of interest and ignore the others. Previous studies have shown that the spatial location of the attended speaker can be decoded with 80-90% accuracy from electroencephalography (EEG) recordings. However, for real-world applications such as hearing aids, finding the minimal EEG set-up that achieves similar decoding performance would be of great interest. In this study, we used publicly available 64-channel EEG data recorded from 18 participants attending to one of two spatially separated (left and right) speech audio streams and ignoring the other (Fuglsang et al. 2018). We trained a convolutional recurrent neural network on broadband (1-30 Hz) 10-s EEG time series and achieved a mean accuracy of 86.7% in decoding the locus of attention. Next, we tested the model on subsets of EEG channels and frequency bands. Our results showed that a selection of 8 channels covering the entire head and positioned over both hemispheres can achieve around 80% accuracy. Based on the literature, we expected that alpha lateralization would be important for predicting the locus of attention. Indeed, the decoding performance of our model decreased significantly after filtering out alpha activity or picking channels solely from a single hemisphere. Overall, this study exploited a data-driven approach, training a model on EEG time series rather than pre-selected features, and showed that alpha lateralization is the main predictor of spatial attention. In addition, our minimal EEG-setup recommendation could be beneficial for hearing-aid applications coupled to “wearable” EEG.
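A minimal sketch of a convolutional recurrent decoder of this kind is shown below (layer sizes, sampling rate, and window length are illustrative assumptions, not the authors' architecture):

```python
# Minimal sketch (illustrative only) of a convolutional-recurrent decoder for
# the locus of auditory attention from 10-s, 64-channel EEG windows.
import torch
import torch.nn as nn

class ConvGRUDecoder(nn.Module):
    def __init__(self, n_channels=64, n_classes=2):
        super().__init__()
        # temporal convolutions applied across all EEG channels
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=16, stride=4),
            nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=8, stride=2),
            nn.ReLU(),
        )
        self.gru = nn.GRU(input_size=32, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):              # x: (batch, channels, time)
        z = self.conv(x)               # (batch, 32, time')
        z = z.transpose(1, 2)          # (batch, time', 32) for the GRU
        _, h = self.gru(z)             # h: (1, batch, 32)
        return self.head(h[-1])        # logits for left vs right attention

# toy forward pass: batch of 8 windows, 64 channels, 10 s at an assumed 64 Hz
model = ConvGRUDecoder()
logits = model(torch.randn(8, 64, 640))
print(logits.shape)                    # torch.Size([8, 2])
```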
Ali Zare, Bahar Khalighinejad, Jose L. Herrero, Ashesh D. Mehta and Nima Mesgarani
Topic areas: speech and language correlates of behavior/perception subcortical processing
Top-down and Bottom-up Speech Processing Semantic Encoding Noisy Speech Deep Language Models GPT2 Word Embeddings neural representation of words Context in Speech Processing ECoG Neural Activity Prediction
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Speech processing in real-world situations involves processing the content of speech, which can be subject to changing acoustic conditions and backgrounds. The human auditory cortex (AC) processes speech at various levels that range from the acoustic to the semantic level. It has been shown that deep language models (DLMs) such as GPT2 contain semantic information that can be used to capture the neural representations of speech better than the conventionally used static semantic features. However, the extent to which the human auditory cortex relies on this high-level semantic information for the processing of speech in a noisy environment remains unclear. Here, we analyzed neural activity (ECoG) in the AC as subjects listened to speech with abruptly changing background noises. We find that the neural responses to noisy speech can be estimated with higher correlations when semantic features derived from GPT2 word embeddings are used together with the spectrogram information. This improvement is significant in lateral areas of the AC such as the STG, which is known for higher-level speech processing. For clean speech, however, the addition of these semantic features to the acoustic information did not improve the predictions as much as it did for noisy speech. This suggests that reliance on the information in these features that complements the acoustic information is enhanced when the acoustic information is degraded. These findings contribute to our understanding of the top-down versus bottom-up dynamics in the brain during speech processing in real-world conditions.
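A minimal sketch of such an encoding-model comparison is shown below (toy data; feature dimensions and the ridge penalty are assumptions, not the authors' pipeline):

```python
# Minimal sketch (assumptions throughout) of comparing an acoustic-only encoding
# model with an acoustic + word-embedding model for one ECoG electrode.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
T = 5000                                   # time samples (e.g., 100 Hz high-gamma)
spec = rng.random((T, 128))                # spectrogram features (time x frequency bins)
emb = rng.random((T, 768))                 # word embeddings aligned to word onsets
y = rng.standard_normal(T)                 # electrode high-gamma envelope (placeholder)

def encoding_r(X, y):
    """Cross-validated prediction accuracy (Pearson r) of a ridge encoding model."""
    pred = cross_val_predict(Ridge(alpha=10.0), X, y, cv=5)
    return np.corrcoef(pred, y)[0, 1]

r_acoustic = encoding_r(spec, y)
r_combined = encoding_r(np.hstack([spec, emb]), y)
# the gain of the combined over the acoustic-only model indexes how much the
# semantic (embedding) features add beyond the spectrogram
print(f"acoustic r = {r_acoustic:.3f}, acoustic+semantic r = {r_combined:.3f}")
```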
Prachi Patel, Stephan Bickel, Ashesh Mehta and Nima Mesgarani
Topic areas: correlates of behavior/perception
iEEG spatial hearing cocktail party auditory attention
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
The ability to track and attend to a moving talker is important for communication in naturalistic multi-talker speech perception. A switch in the target’s location has been shown in EEG studies to disrupt the target’s ERP components and alpha-band lateralization, and in fMRI studies to transiently activate parietal cortex. However, limitations in spatiotemporal resolution and stimulus complexity leave questions unanswered. What are the cortical signatures of a spatial attention switch throughout the auditory cortex as well as in the dorsal and ventral attentional networks? How do the neural dynamics enable successful tracking of a moving target’s speech? We recorded intracranially from humans while they performed a naturalistic spatial attention task of following a target in a multi-talker scene in which the target randomly switched location. We report a time-localized neural signature of increased high-gamma activity for spatial auditory attention switches throughout the auditory cortex, as well as the dorsal and ventral attention networks. This contrastive response is distinct for the two attention networks: the dorsal network shows a transient response that returns to baseline after 1 s, whereas the ventral network shows a sustained response to the stimulus. Furthermore, a switch in the target’s location momentarily disrupts the encoding of the target’s speech features until the attention switch is completed, highlighting the temporal dynamics of encoding the target speech. These findings provide insights into the temporal dynamics of spatial auditory attention control in the auditory cortex and the dorsal and ventral attention networks for tracking a listener’s focus on a moving talker in multi-talker acoustic scenes.
Vishal Choudhari, Prachi Patel, Stephan Bickel, Ashesh D Mehta and Nima Mesgarani
Topic areas: auditory disorders memory and cognition speech and language correlates of behavior/perception
AUDITORY ATTENTION HEARING AID SOUND LOCALIZATION
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Motivation: In most auditory attention decoding (AAD) algorithms, a representation (envelope or spectrogram) of the attended speech is reconstructed from neural signals and compared with the speech representations of the talkers in an acoustic scene. The decoded attended talker is chosen as the one whose speech representation yields the highest correlation with the reconstructed speech representation. As talkers in a cocktail party environment are often spatially separated, a supporting approach for an AAD algorithm could be decoding the location where attention is directed, i.e., the location of the attended talker. Recent deep learning-based multi-channel speech separation algorithms preserve location information (interaural cues) in the separated binaural speech streams. The attended location decoded from neural signals can then be compared with the locations of talkers (estimated from their binaural speech streams) to improve the attended talker decoding accuracy. Approach: We recorded from epilepsy patients using depth electrodes (sEEG) as they listened to spatial multi-talker speech stimuli. Two parallel speech streams (one male, one female) arrived at the subject from two locations at different azimuthal angles: -45 degrees (front-left) and +45 degrees (front-right). Subjects were asked to attend to a pre-specified talker. Results: Attended location can be decoded significantly above chance levels on a window-by-window basis, with a window duration as small as 500 milliseconds. Decoding accuracies improve with increasing window duration. Significance: The ability to reliably decode attended location suggests a potential for improving the performance of existing AAD algorithms by incorporating spatial information.
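A minimal sketch of the correlation-based, window-by-window decoding step is shown below (toy envelopes; the reconstruction itself is assumed to come from a separately trained linear decoder, which is not part of this sketch):

```python
# Minimal sketch (illustrative, not the authors' pipeline) of window-by-window
# attended-talker decoding by correlating a neurally reconstructed envelope
# with each talker's envelope.
import numpy as np

def decode_attention(reconstructed, env_a, env_b, fs, win_s=0.5):
    """Return, per window, which talker's envelope best matches the reconstruction."""
    win = int(win_s * fs)
    decisions = []
    for start in range(0, len(reconstructed) - win + 1, win):
        seg = slice(start, start + win)
        r_a = np.corrcoef(reconstructed[seg], env_a[seg])[0, 1]
        r_b = np.corrcoef(reconstructed[seg], env_b[seg])[0, 1]
        decisions.append("A" if r_a > r_b else "B")
    return decisions

# toy example at fs = 100 Hz: the reconstruction is a noisy copy of talker A
rng = np.random.default_rng(1)
fs = 100
env_a, env_b = rng.random(6000), rng.random(6000)
reconstructed = env_a + 0.5 * rng.standard_normal(6000)
decisions = decode_attention(reconstructed, env_a, env_b, fs, win_s=0.5)
print("fraction of windows decoded as talker A:",
      np.mean([d == "A" for d in decisions]))
```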
Sharlen Moore, Aaron Wang, Jennifer Lawlor, Kelly Fogelson, Andrea Santi and Kishore Kuchibhotla
Topic areas: memory and cognition
LEARNING MEMORY ALZHEIMER CONTEXT
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Alzheimer’s disease (AD) results in a slow deterioration of cognitive capacities due to neurodegeneration. Interestingly, AD patients can exhibit cognitive fluctuations and, in the presence of certain contextual factors, they are able to unlock their memories. Exploration of the neural basis of cognitive fluctuations has been hampered by the lack of a behavioral approach to dissociate memories from contextual-performance. Our previous work demonstrated that interleaving ‘reinforced’ with non-reinforced trials in an auditory go/no-go discrimination task allows us to make this distinction. We used this approach, with two-photon calcium imaging in AD-relevant mice (APP/PS1+), to determine whether amyloid accumulation impacts underlying sensorimotor memories and/or contextual-performance in an age-dependent manner. Importantly, peripheral auditory function, measured with ABRs, was similar between WT and APP/PS1+ mice. We found that while contextual-performance is significantly impaired in young-adult APP/PS1+ mice, these animals show only minor impairments in the underlying sensorimotor memories. However, middle-aged APP/PS1+ mice show deficits in both domains. The impairment found in the young adults was accompanied by a reduction in stimulus selectivity in the auditory cortex of APP/PS1+ mice, especially in reinforced trials. Ongoing analyses aim to identify whether this impairment is cortex-wide or is concentrated near Aβ plaques. Finally, these effects were recapitulated by a reinforcement learning model that accounts for changes in contextual signals. The main network-model parameters that differed between the control and the APP/PS1+ mice were those governing contextual scaling and behavioral inhibition. These results suggest that Aβ deposition impacts circuits involved in contextual computations before those involved in acquiring knowledge.
Bas Olthof, Fiona Lebeau, Gavin Clowry, Adrian Rees and Sarah Gartside
Topic areas: memory and cognition subcortical processing
auditory midbrain tract tracing retrograde labelling immunohistochemistry vesicular glutamate transporter 2
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
The hippocampus plays important roles in learning, memory, and emotion. Several studies have reported that some hippocampal neurons respond to sounds that evoke behavioural responses. Evidence indicates that sound information reaches the hippocampus via the auditory and entorhinal cortices, but hippocampal outputs to auditory structures have not been described. Given the involvement of the hippocampus in the interplay between auditory memory, spatial context, and behaviour, we hypothesised that the hippocampus innervates the inferior colliculus (IC), a centre associated with sound-evoked behaviour including escape responses. Following injection of a retrogradely transported viral vector encoding green fluorescent protein (GFP) into the IC, we found GFP-filled cells throughout the ipsilateral hippocampus, indicating direct projections from hippocampus to IC. Immunolabelling revealed that half of the GFP-filled cells contained calretinin, a quarter contained calbindin and a quarter parvalbumin, indicating that they are a mixed population of GABAergic neurones defined by these calcium binding proteins. In addition to GFP-filled cells, we observed many GFP-puncta surrounding nuclei in stratum pyramidale. Almost all GFP-puncta labelled for the vesicular glutamate transporter VGLUT2, indicating that they are terminals of glutamatergic neurons with cell bodies in a subcortical region. A small proportion of GFP puncta labelled for the vesicular transporter VGAT, indicating that they are likely terminals of GABAergic or glycinergic neurons. Our findings indicate an extensive, and hitherto unreported, direct GABAergic projection from the hippocampus to the inferior colliculus. Moreover, our data suggest that glutamatergic neurones originating in at least one subcortical region innervate both the IC and the hippocampus.
Lonike Faes, Luca Vizioli, Zidan Yu, Isma Zulfiqar, Jiyun Shin, Kamil Uludag, Lucia Melloni, Essa Yacoub and Federico De Martino
Topic areas: correlates of behavior/perception hierarchical organization
fMRI Predictive Coding Hierarchical processing Auditory perception
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Predictive processes take place throughout the (sub-) cortical hierarchy, with specific roles attributed to different cortical layers in the information flow between top-down predictions and bottom-up sensory evidence. However, in the auditory cortex, the particular role of cortical layers is still debated [1]. We use submillimeter fMRI (7T) to investigate the role of cortical layers in response to tones that either respect or deviate from contextual cues. Four tones are presented in either a predictable or unpredictable context. In the predictable context, in a small proportion of trials, the last tone could represent an oddball. Moreover, we evaluate the impact of fMRI denoising using Noise Reduction with Distribution Corrected (NORDIC) PCA [2] on submillimeter fMRI responses in temporal regions. Preliminary analyses revealed that, in Heschl’s gyrus, a high-low-high frequency gradient [3] is present in the tonotopic maps of both predictable and unpredictable conditions, but slightly altered in response to oddballs. The same gradient was visible in the data processed with NORDIC. As expected, NORDIC resulted in clearer responses to sound presentation due to the thermal noise suppression [2]. This suggests that NORDIC offers a comparative advantage to standard reconstruction, while also suggesting that oddball responses are differently represented. In follow-up analyses, we will quantify the spatial correlation between tonotopic maps. Additionally, we will investigate the depth-dependent responses to predictable, unpredictable and oddball stimuli and their relationship to the tonotopic representation. 1. Heilbron & Chait (2018) Neuroscience; 2. Vizioli et al. (2020) BioRxiv; 3. Moerel et al. (2014) Front. Neurosci.
Sarah Tune, Mohsen Alavash and Jonas Obleser
Topic areas: memory and cognition speech and language correlates of behavior/perception
neural filtering EEG selective auditory attention alpha power speech tracking
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Successful listening needs to separate relevant from irrelevant information. Recently (Tune et al., 2021), using a challenging spatial-attention listening task, we probed how two prominent auditory filtering strategies – lateralization of alpha power versus selective neural speech tracking – relate to listening success. In a longitudinal cohort of aging adults, at T1 (N = 155, 38-80 yrs), stronger speech tracking at the state-level and at the trait-level predicted better behavioral performance while alpha lateralization did not. These findings provide compelling evidence for speech tracking as neural marker of individual listening success. Yet, the reliability of both trait-level auditory filter strength and the link of auditory filter state to single-trial listening behavior is currently unknown. Here, we address these questions in a subsample who underwent measurements again after ~1.5 years (T2; N = 100, 42-82 yrs). Unexpectedly, the individual average strength of each neural filter was largely uncorrelated from T1 to T2, that is, neither one poses a reliable neural trait measure (tracking: β = .12, SE = .10, p =.23; alpha: β = .13, SE = .10, p =.20). However, lending credibility to the functional relevance of speech tracking as neural measure of attentional state, we replicated the effect of speech tracking state on accuracy in both its size and direction (OR = 1.09, SE = .03, p =.013). In sum, our results underscore the functional importance of state-level neural variability for behavior while calling into question the robustness of trait-level characterization of neural-attentional filters.
Federico Adolfi, Jeffrey Bowers and David Poeppel
Topic areas: memory and cognition
artificial and biological audition speech recognition auditory neural networks multi-scale robustness
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Natural and artificial audition can in principle evolve different solutions to a given problem. The constraints of the task, however, can nudge the cognitive science and engineering of audition to qualitatively converge, suggesting that a closer mutual examination would improve artificial hearing systems and process models of the mind and brain. Speech recognition, an area ripe for such exploration, is inherently robust in humans to a number of transformations at various spectrotemporal granularities. To what extent are these robustness profiles accounted for by high-performing neural network systems? We bring together experiments in speech recognition under a single synthesis framework to evaluate state-of-the-art neural networks as stimulus-computable, optimized observers. In a series of experiments, we (1) clarify how influential speech manipulations in the literature relate to each other and to natural speech, (2) show the granularities at which machines exhibit out-of-distribution robustness, reproducing classical perceptual phenomena in humans, (3) identify the specific conditions where model predictions of human performance differ, and (4) demonstrate a crucial failure of all artificial systems to perceptually recover where humans do, suggesting a key specification for theory and model building. These findings encourage a tighter synergy between the cognitive science and engineering of audition.
Ole Bialas and Marc Schönwiesner
Topic areas: correlates of behavior/perception neural coding
sound source localization spatial hearing head related transfer function sound source elevation auditory evoked response EEG decoding
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
The auditory system is organized tonotopically. As a consequence, the position of a sound source has to be inferred from a number of implicit sound localization cues. The most important ones are interaural time and intensity differences, which correspond to a sound's azimuth, and direction-dependent spectral filtering through the head and ears, from which a sound's elevation is inferred. While the psychophysics and brainstem physiology of sound localization are well understood, we know little about its cortical encoding, especially in relation to elevation. Using EEG, we recorded brain signals in response to sound bursts from loudspeakers at different elevations. We used signal space projections to filter the EEG signal for changes across elevations. A trial-by-trial pairwise decoding of elevations from the recorded signals revealed a linear relationship between the distance of the sound sources and the decoding accuracy. What's more, the accuracy of subjects in localizing sound sources was proportional to the accuracy of decoding their brain responses. Thus, a distinct neural representation of the position of a sound source may be necessary for accurate localization. In a control experiment, we recorded sounds from different elevations with in-ear microphones and played them back while switching recordings between subjects. Since subjects could not localize sounds recorded through foreign ears, this allowed us to disentangle the contributions of spectral content from correlates of sound source localization.
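A minimal sketch of trial-by-trial pairwise decoding of elevation is shown below (a generic linear classifier on toy EEG features stands in for the signal-space-projection pipeline described above; elevations and feature dimensions are assumptions):

```python
# Minimal sketch (assumed data layout) of pairwise decoding of source elevation
# from EEG, relating decoding accuracy to the separation between elevations.
import itertools
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
elevations = [-30, 0, 30, 60]                       # loudspeaker elevations (deg), assumed
# 80 toy trials per elevation, each flattened to 64 channels x 50 time points
epochs = {e: rng.standard_normal((80, 64 * 50)) for e in elevations}

for e1, e2 in itertools.combinations(elevations, 2):
    X = np.vstack([epochs[e1], epochs[e2]])
    y = np.array([0] * len(epochs[e1]) + [1] * len(epochs[e2]))
    acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
    # plotting acc against abs(e1 - e2) across pairs would expose the
    # distance-accuracy relationship described in the abstract
    print(f"{e1:+d} vs {e2:+d} deg  ->  accuracy {acc:.2f}")
```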
Malte Wöstmann, Burkhard Maess and Jonas Obleser
Topic areas: memory and cognition correlates of behavior/perception
auditory attention filter magnetoencephalography speech
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
The deployment of neural alpha (8–12 Hz) lateralization in service of spatial attention is well-established: Alpha power increases in the cortical hemisphere ipsilateral to the attended hemifield, and decreases in the contralateral hemisphere, respectively. Much less is known about humans’ ability to deploy such alpha lateralization in time, and to thus exploit alpha power as a spatio-temporal filter. Here we show that spatially lateralized alpha power does signify – beyond the direction of spatial attention – the distribution of attention in time and thereby qualifies as a spatio-temporal attentional filter. Participants (N = 20) selectively listened to spoken numbers presented on one side (left vs right), while competing numbers were presented on the other side. Key to our hypothesis, temporal foreknowledge was manipulated via a visual cue, which was either instructive and indicated the to-be-probed number position (70% valid) or neutral. Temporal foreknowledge did guide participants’ attention, as they recognized numbers from the to-be-attended side more accurately following valid cues. In the magnetoencephalogram (MEG), spatial attention to the left versus right side induced lateralization of alpha power in all temporal cueing conditions. Modulation of alpha lateralization at the 0.8 Hz presentation rate of spoken numbers was stronger following instructive compared to neutral temporal cues. Critically, we found stronger modulation of lateralized alpha power specifically at the onsets of temporally cued numbers. These results suggest that the precisely timed hemispheric lateralization of alpha power qualifies as a spatio-temporal attentional filter mechanism susceptible to top-down behavioural goals.
Troby Ka-Yan Lui, Jonas Obleser and Malte Wöstmann
Topic areas: memory and cognition
Distractibility Rhythmic sampling Auditory attention
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Attentional sampling of task-relevant sensory stimuli operates in a rhythmic manner. However, evidence on the temporal dynamics of distractibility is scarce. We have previously shown that the vulnerability of working memory to distracting speech items fluctuates rhythmically. Here, we investigate the underlying neural responses to auditory distraction and the correspondence of pre-distractor neural phase with behavior. In the present behavioral and electroencephalography (EEG) study, we probed the temporal dynamics of distraction using a pitch discrimination task. Human participants (N = 30) compared the pitch of two target pure tones and were instructed to ignore a task-irrelevant distractor tone sequence with a 25 Hz temporal structure. We systematically varied the onset time of the distractor in-between the two target tones. Distractibility was measured behaviorally via the ability to discriminate the target tones (perceptual sensitivity), and neurally via the amplitude of the distractor-evoked response at 25 Hz. Linear mixed-effect models with sine- and cosine-transformed distractor onset phase as predictor showed that distractor onset phase in the delta and theta frequency range (~3.5–5 Hz) modulated both perceptual sensitivity and 25 Hz amplitude. Furthermore, distractor onset phase at similar frequencies co-modulated the behavioral and neural measures of distraction. We also related the pre-distractor phase of 5 Hz source-projected neural oscillations to perceptual sensitivity, which was most prominent in the left inferior frontal cortex. These results suggest that ~5 Hz phasic modulation of neural activity acts as a rhythmically fluctuating attention filter, which alternates between states of higher versus lower distractibility at ~5 Hz.
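A minimal sketch of testing for phasic modulation with sine- and cosine-transformed phase predictors in a linear mixed model is shown below (toy data; column names are hypothetical, not the authors' variables):

```python
# Minimal sketch: does distractor-onset phase modulate perceptual sensitivity?
# Sine/cosine-transformed phase enters as fixed effects; subject is a random effect.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 30 * 100                                          # 30 subjects x 100 trials (toy)
df = pd.DataFrame({
    "subject": np.repeat(np.arange(30), 100),
    "phase": rng.uniform(0, 2 * np.pi, n),            # distractor onset phase at ~5 Hz
})
# toy sensitivity with a built-in phasic modulation plus noise
df["dprime"] = 0.2 * np.cos(df["phase"]) + rng.standard_normal(n)
df["sin_phase"], df["cos_phase"] = np.sin(df["phase"]), np.cos(df["phase"])

model = smf.mixedlm("dprime ~ sin_phase + cos_phase", df,
                    groups=df["subject"]).fit()
print(model.summary())   # joint significance of the sin/cos terms indexes phasic modulation
```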
Ronald DiTullio, Chetan Parthiban, Eugenio Piasini, Vijay Balasubramanian and Yale Cohen
Topic areas: memory and cognition correlates of behavior/perception hierarchical organization neuroethology/communication
Computational Neuroscience Perception Theoretical Natural Sounds
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
A fundamental goal of sensory systems is to extract sensory information from the environment and convert it into perceptual representations. Because the brain cannot simply transduce and represent all of the information present in the environment, sensory systems must select features of the stimuli to encode. One way a sensory system could perform this feature selection is by encoding particular statistical regularities in the environment. One statistical regularity of natural auditory stimuli is that they tend to have low temporal modulation; i.e., the powers of the frequencies that comprise natural stimuli tend to change slowly over time. It is unknown whether such slow temporal regularities are sufficient to enable learning and perception of auditory object classes. To test this idea, we adapted an unsupervised temporal learning algorithm, Slow Feature Analysis (SFA), to extract the auditory features that change most slowly over time. We then used this algorithm to evaluate the hypothesis that extracting these slowly varying features will capture both intra- and inter-class stimulus variance of rhesus macaque vocalizations. We found that (1) pairs of vocalizations in the SFA-generated feature space were linearly separable; (2) this feature space is robust to clutter (noise) in the training data set; and (3) this feature space captures enough variability for the classification of novel exemplars. Together, our results suggest that if the brain can extract these slow temporal features from auditory stimuli, doing so may be sufficient for, and may underlie, important components of perception.
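A minimal sketch of linear Slow Feature Analysis is shown below (toy signals; not the authors' implementation):

```python
# Minimal sketch of linear Slow Feature Analysis: find projections of the
# input whose outputs vary most slowly in time.
import numpy as np

def slow_feature_analysis(X, n_features=2):
    """X: (time, dims) input; returns the n slowest output signals and weights."""
    X = X - X.mean(axis=0)
    # whiten via eigendecomposition of the covariance matrix
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > 1e-10
    W_white = evecs[:, keep] / np.sqrt(evals[keep])
    Z = X @ W_white
    # slowness: minimize the variance of the temporal derivative
    dZ = np.diff(Z, axis=0)
    d_evals, d_evecs = np.linalg.eigh(np.cov(dZ, rowvar=False))
    W_slow = d_evecs[:, :n_features]          # smallest eigenvalues = slowest features
    return Z @ W_slow, W_white @ W_slow

# toy demo: a slow sinusoid hidden among faster, linearly mixed signals
t = np.linspace(0, 10, 2000)
latent = np.vstack([np.sin(0.5 * t), np.sin(7 * t), np.sin(13 * t)]).T
X = latent @ np.random.default_rng(0).standard_normal((3, 10))
slow_signals, weights = slow_feature_analysis(X, n_features=1)
# slow_signals[:, 0] should recover (up to sign and scale) the slowest component
```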
Katharina Bochtler, Andrew King and Kerry Walker
Topic areas: memory and cognition correlates of behavior/perception cross-species comparisons
attention psychophysics ferret
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Our ability to listen to a single sound source in a crowded room is thought to rely, in part, on directing attention to filtered features of sounds, such as frequency. Much has been learned about the neural signatures of this process by training ferrets to respond to a particular frequency of sound (e.g. Fritz et al., 2010). Here, we build on this literature by examining if ferrets also apply attentional frequency filters on tasks that do not require them to report on the frequency of sound directly. This frequency filtering may occur more automatically, based on the statistics of sounds in their environment. Mondor & Bregman (1994) demonstrated that when human listeners were asked to make tone duration discriminations, their reaction times were slower when the tone presented had an unexpected (i.e. low probability) frequency. We examined if ferrets also show evidence of this “frequency selectivity effect” on a 2-alternative forced choice duration discrimination task. We quantified how the width of the attentive window depends on the frequency statistics of sounds presented, and are investigating if the frequency selectivity effect generalizes to the pitch of complex sounds.
Andria Pelentritou and Marzia De Lucia
Topic areas: memory and cognition multisensory processes
Electroencephalography Auditory Regularity Processing Heartbeat Evoked Potentials Sleep
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Recent work has demonstrated an important contribution of bodily signals to auditory stimulus processing. Specifically, the comparison of neural responses to sounds occurring in synchrony or asynchrony with the ongoing heartbeat suggests that regularities established across interoceptive and exteroceptive signals can induce temporal prediction of incoming sounds. Here, we investigated whether heartbeat-based auditory regularity processing depends on conscious awareness of stimulus regularity. We recorded continuous electroencephalography and electrocardiography during wakefulness and sleep in healthy volunteers (N=26) presented with auditory sequences that were either isochronous or presented in synchrony or asynchrony with the ongoing heartbeat. First, auditory regularity processing was investigated by analysing whether interbeat intervals were affected by sound omissions and by the auditory regularity type. Repeated measures ANOVAs with factors ‘auditory condition’ (three levels) and ‘order’ (four levels: before, during, and two after the omission) revealed an interaction during wakefulness (F=14.1, p < 0.0005), NREM N2 sleep (F=11.7, p < 0.0005) and REM sleep (F=8.48, p < 0.0005). Post-hoc analyses revealed that only omissions in the synchronous condition produced a long-lasting heartbeat deceleration (p < 0.0005). Second, the EEG omission response was investigated using cluster-based permutation statistics. During wakefulness, we revealed a significant difference (p < 0.05) between 220-276 ms after heartbeat onset when comparing the omission response in the synchronous versus the asynchronous condition, and between 225-288 ms after omission onset in the isochronous compared to the asynchronous condition. Neural monitoring of cardiac signals thus induces an expectation of auditory signals under cardio-audio regularities and appears to strongly modulate the heartbeat and the neural response both during conscious wakefulness and during the reduced consciousness of sleep.
Mark Saddler and Josh McDermott
Topic areas: correlates of behavior/perception neural coding subcortical processing
deep learning auditory nerve phase-locking speech recognition voice recognition localization
Thu, 11/4 2:15PM - 2:30PM | Short talk
Abstract
The auditory nerve encodes sound with precise spike-timing that is phase-locked to the temporal fine structure of sound. The role of phase-locking in hearing remains controversial because physiological mechanisms for extracting this information (especially monaurally) have proven elusive. Here, we investigate the perceptual role of auditory nerve phase-locking with deep artificial neural networks. We used artificial neural networks in the spirit of ideal observer analysis, optimizing them for natural tasks and examining whether phase-locking in a network's cochlear input was necessary to obtain human-like behavior. We trained networks to recognize words and voices and to localize environmental sounds using simulated auditory nerve representations of naturalistic auditory scenes. We manipulated the upper limit of auditory nerve phase-locking in our networks' ears via the lowpass cutoff in the simulated inner hair cells. Networks whose input featured high-frequency phase-locking replicated key aspects of human auditory behavior: task performance was robust to sound level and remained good even in noisy conditions. Degrading phase-locking impaired performance, but much more so on some tasks than others. Reducing the upper frequency limit of phase-locking to 50 Hz (eliminating virtually all temporal fine structure in the peripheral representation) had little effect on network word recognition, but substantially impaired voice recognition. Network localization performance (in both azimuth and elevation) was even more reliant on spike-timing, benefiting from phase-locking upwards of 1000 Hz. The results suggest that auditory nerve phase-locking is critical for accurate sound localization and voice recognition, but less so for speech recognition, in natural environments. Informal discussion to follow at 4:00 pm EDT (GMT-4) in Gathertown, Discussion Area 1 (link below).
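A minimal sketch of how the phase-locking limit can be manipulated at the inner-hair-cell stage is shown below (a strongly simplified stand-in for a full auditory nerve model; the filter order and cutoff values are assumptions):

```python
# Minimal sketch: half-wave rectify a cochlear-channel signal, then lowpass
# filter at the chosen inner-hair-cell cutoff, which sets the upper limit of
# phase-locking passed on to downstream stages.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def ihc_transduction(band_signal, fs, cutoff_hz):
    """Half-wave rectification followed by a lowpass at the phase-locking limit."""
    rectified = np.clip(band_signal, 0, None)
    sos = butter(4, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, rectified)

fs = 20000
t = np.arange(0, 0.1, 1 / fs)
band = np.sin(2 * np.pi * 800 * t)                               # an 800-Hz cochlear channel
full_phase_locking = ihc_transduction(band, fs, cutoff_hz=3000)   # fine structure retained
no_phase_locking = ihc_transduction(band, fs, cutoff_hz=50)       # only the envelope survives
```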
Jean-Hugues Lestang, Clélia de Mulatier, Songhan Zhang, Lalitta Suriya-Arunroj, Vijay Balasubramanian and Yale E. Cohen
Topic areas: neural coding
neural coding auditory cortex rhesus monkey
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Perception and behavior are mediated by cortical circuits that exhibit coincident firing. Traditionally, these groups of coordinated neurons are detected through the use of dimensionality reduction techniques. More recently, progress on the use of maximum entropy models for binary data has led to the development of a new technique based on minimally complex models (MCM). Here, we show that this method is also capable of identifying groups of synchronous neurons. We then review the strengths and weaknesses of both approaches in detecting coincident neural groups through the use of both simulated neural datasets, in which we artificially introduced correlations between neurons, and real neural data that was obtained from the rhesus monkey auditory cortex. With this in mind, we designed a comprehensive roadmap to assist researchers in determining which technique best suits their needs.
Patricia Valerio, Mari Nakamura, Stitipragyan Bhumika, Magdalena Solyga and Tania Rinaldi Barkat
Topic areas: speech and language correlates of behavior/perception neural coding
Auditory cortex Pure tone Frequency modulated sweep Postnatal development Parvalbumin positive neurons Sensory processing
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Neuronal circuits are shaped by experience during time windows of increased plasticity in postnatal development. In the mouse auditory system, the critical period (CP) for pure tones (PT) is well defined, from postnatal days 12 to 15. The CP for frequency modulated sweeps (FMS) occurs weeks later, from postnatal days 31 to 38. Whether such CPs are timed by a temporally precise developmental program or are organized sequentially was not known. Our work aimed to unravel the underlying neuronal mechanisms and the dependency of these CPs on each other. In vivo electrophysiological recordings, immunohistochemistry, and molecular and sensory manipulations were performed in the mouse primary auditory cortex. We observed a decrease in parvalbumin (PV) expression in cortical layer 4 during the FMS CP, which was paralleled by a transient increase in responses to FMS. Despite this downregulation, the PV cell number was not altered. Continuous white noise (WN) exposure prevented the decrease in PV expression and delayed the FMS CP onset, suggesting a reduction in inhibition as a mechanism for this plasticity. Passive sound exposure further revealed that the cortical representations of the two sound features do not influence each other. Enhancing GABA function before the CP for PT accelerated it without changing the CP for FMS. Delaying the CP for PT with WN exposure also did not affect the CP for FMS. Together, these results reveal an unexpected picture of the independence of sound features and their related developmental plasticity, and add fundamental knowledge about how our central sensory system is built, organized, and functions.
Delaina Pedrick, Xiu Zhai, Ian Stevenson and Monty Escabi
Topic areas: speech and language correlates of behavior/perception neural coding subcortical processing
Sound Masking Energetic Masking Modulation Masking Inferior Colliculus Population Encoding Modulations Phase Randomized Spectrum Equalized Vocalizations Speech Cocktail Party
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
During sound masking, the low-order sound structure can reduce the strength of neural responses. However, the contribution of higher-order statistics in the modulation content of natural backgrounds to masking at the neural level is less understood. Here we manipulated the spectrum and modulation content of five natural backgrounds and white noise to determine their contribution to energetic and modulation masking. To dissociate masking attributed to the spectrum versus the modulation content, we generated phase-randomized (PR) backgrounds, which have the power spectrum of the original sounds but whitened modulation content. Furthermore, to compare masking effects of the modulation content of different background sounds, we used spectrum-equalized (SE) backgrounds that have a power spectrum identical to pink noise but retain the original modulation content. We demonstrate that, compared to the unaltered backgrounds, PR backgrounds reduce the neural population signal strength and synchrony of the encoded foreground vocalizations. Despite this, the neural output signal-to-noise ratio of the encoded foreground is higher for the PR sounds. This suggests that PR backgrounds produce sparser neural activity but allow for a higher-fidelity neural representation of the foreground vocalization. In contrast, SE backgrounds have little effect on neural responses when compared to unaltered backgrounds. Masking differences observed across the original background sounds remained for the SE condition. Collectively, these results suggest that modulation statistics in natural background sounds can have varied interfering effects and that the modulation content of the interfering background sounds can strongly influence the encoding of masked vocalizations.
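A minimal sketch of generating a phase-randomized (PR) background is shown below (toy signal; not the authors' code):

```python
# Minimal sketch: keep each frequency's magnitude (i.e., the power spectrum)
# but randomize its phase, which whitens the sound's modulation content.
import numpy as np

def phase_randomize(x, seed=None):
    rng = np.random.default_rng(seed)
    X = np.fft.rfft(x)
    random_phases = np.exp(1j * rng.uniform(0, 2 * np.pi, len(X)))
    random_phases[0] = 1.0                      # keep the DC component real
    if len(x) % 2 == 0:
        random_phases[-1] = 1.0                 # keep the Nyquist bin real
    return np.fft.irfft(np.abs(X) * random_phases, n=len(x))

rng = np.random.default_rng(0)
background = rng.standard_normal(44100)          # 1 s of a background sound (toy)
pr_background = phase_randomize(background, seed=1)
# the PR sound has (nearly) the same power spectrum as the original but
# whitened temporal modulations
```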
Mccall E. Sarrett, Ariane E. Rhone, Christine Shea, Kristi Hendrickson, John B. Muegge, Christopher K. Kovach, Brian J. Dlouhy and Bob McMurray
Topic areas: speech and language
multilingualism bilingualism spoken word recognition lexical competition intracranial EEG machine learning decoding
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Spoken word recognition proceeds by immediately activating items in the mental lexicon which match the incoming speech signal as it unfolds. These items then compete for recognition over time. In multilinguals, the array of possible lexical competitors comprises items across all their languages. Eye-tracking studies indicate that the strength of early competitor activation across languages may relate to listeners’ proficiency in each language: words from the more proficient language outcompete those from the other(s). However, the neural mechanisms subserving this effect are not well known. We present a case study of a pediatric neurosurgical patient who was bilingual in Spanish and English, but whose dominant language was Spanish. The participant passively listened to English and Spanish words. Words were cohort competitors across three conditions: within-Spanish (e.g. pato [duck] - papas [potatoes]), within-English (e.g. turkey - turtle), and cross-language (e.g. chalkboard - chanclas [flipflops]). We used machine learning to decode the dynamics of within-Spanish, within-English, and cross-language lexical competition from the pattern of activity on the superior temporal plane. Spanish words were more robustly decoded, regardless of their role as target word (the actual word heard on a given trial) or as lexical competitor (the phonologically related word competing for activation). English words were less reliably decodable. This is consistent with a Spanish-dominant system and corroborates evidence from eye-tracking. The differences in lexical competition dynamics between the two languages suggest that activation of cohorts may be in part due to perceptual tuning of early auditory areas for the sounds of a language.
Satyabrata Parida, Shi Tong Liu and Srivatsun Sadagopan
Topic areas: hierarchical organization neural coding thalamocortical circuitry/function
hierarchical processing noise invariance vocalization processing gain control
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
To facilitate robust vocal communication in the real-world, the auditory system has adapted sound-processing mechanisms to generalize over production variability in vocalizations (trial-to-trial and subject-to-subject variability) and environmental variability (e.g., noisy backgrounds). These mechanisms are mediated by a hierarchical processing strategy, which, broadly speaking, consists of a dense spectrotemporal representational stage followed by a sparse feature detection stage. We previously implemented this hierarchical architecture in a computational model, and showed that optimal performance in call-categorization tasks can be achieved by detecting a few maximally informative features, thereby generalizing over production variability. Here, we extend the model to generalize over environmental variability. Specifically, to achieve noise invariance in call categorization, we explore the effects of two biologically feasible gain control mechanisms, (1) adaptation to sound statistics in the spectrotemporal representational stage and (2) sensitivity adjustment at the feature detection stage. We found that one or both gain control mechanisms are required for model performance to approach the behavioral performance of guinea pigs engaged in the same call categorization task. In ongoing experiments, we are recording neurophysiological data from the primary auditory cortex in response to vocalizations presented in clean and noisy conditions to validate these model predictions. Overall, these results highlight the contributions of gain control mechanisms at both the representational and feature detection stages to achieve noise invariance in auditory categorization tasks.
Marianny Pernia, Manaswini Kar and Srivatsun Sadagopan
Topic areas: auditory disorders correlates of behavior/perception
Temporary threshold shifts Vocalizations Hearing impairment Sound-in-Noise perception Pupillometry
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Exposure to moderate-to-intense sounds can produce temporary threshold shifts (TTS) in the audiogram. Such TTS is hypothesized to contribute to lasting speech perception deficits in noisy listening conditions but not in clean conditions. In normal hearing animals, cortical neurons show selectivity for specific sound features. Preliminary data from our lab suggest that neurons retain this selectivity across different listening conditions, and that independent cortical mechanisms may underlie selectivity for acoustic features and invariance to noisy conditions. Because TTS may be associated with speech perception deficits in noise and since high-threshold auditory nerve fibers required for encoding sounds at loud levels are damaged in TTS, we hypothesized that at loud sound levels, only the invariance circuitry will be affected by TTS. We induced TTS in guinea pigs, a highly vocal rodent, using 4–8 kHz or 2–8 kHz noise at 106 dB SPL for 2 hours, and verified TTS using ABRs. We estimated thresholds for vocalization (call) categorization at low and high sound levels in quiet and noise before and after TTS using pupillometry. Consistent with the loss of high threshold auditory nerve fibers, we found that call-in-noise perception was affected at only loud sound levels. Call categorization was impaired only for calls with frequency content within or above the noise range used for TTS induction. In ongoing experiments, we are characterizing how TTS impacts feature selectivity and invariance in different laminae of primary auditory cortex using multichannel electrophysiological recordings combined with decoding models.
Matthew McGill, Ariel Hight, Dongqin Cai, Yurika Watanabe, Kameron Clayton, Aravindakshan Parthasarathy and Daniel Polley
Topic areas: auditory disorders correlates of behavior/perception neural coding
Two-photon Single Cell Tracking Hyperactivity Central Gain
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
In an ever-changing sensory environment, cortical circuits continuously adapt their activity to new situations. One such change is a deprivation of input from the sensory periphery, which is associated with chronic hyperactivity and perceptual hypersensitivity to sensory stimuli. While studies in the visual, somatosensory, and auditory systems have converged on the idea of increased excitability and decreased inhibition that culminates in a state of persistent hyperactivity, a thorough characterization of the dynamic, single-cell changes across topographic space and time is lacking. Such studies would inform both the manifestations of compensatory plasticity and the link between cortical hyperactivity and perceptual hypersensitivity. The auditory system is ideally suited for these questions given the precise control of peripheral damage and stimulus delivery. Here, we induced sensorineural hearing loss in mice by exposure to intense noise (n=16 noise, n=13 sham) and developed chronic two-photon imaging approaches and new Go/NoGo (GNG) acoustic and optogenetic operant behavioral tasks that demonstrate cortical involvement in perceptual hypersensitivity. Two-photon calcium imaging of pyramidal neurons in auditory cortex (n= 11,201 cells) allowed us to track population and single-cell responses over weeks across the tonotopic map. We found daily location-specific and stimulus-specific changes in neural gain that match the time course of perceptual hypersensitivity. Using a model trained on neural activity to decode stimulus information, the change in model performance mirrored the change of behavioral performance in the GNG task. Insight into the neural signatures of hyperactivity will prove valuable broadly for auditory and other sensory disorders, and related neurological conditions.
Xueying Fu, Karlotta Staiger and Lars Riecke
Topic areas: multisensory processes
Electroencephalography Multisensory Auditory Tactile SSR
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Neural and perceptual responses to rhythmic auditory stimuli can be enhanced by simultaneous rhythmic stimuli in a different sensory modality; e.g., lip reading can improve neural representation and intelligibility of auditory speech in noise. It is less clear whether auditory stimuli can also be suppressed by simultaneous rhythmic non-auditory stimuli. The present electroencephalography study investigates whether rhythmic tactile stimulation can suppress fluctuating auditory noise and thereby reduce the auditory masking potential of the noise. Auditory stimuli consist of continuous white noise with 4-Hz amplitude-modulation and a continuous pure tone with 37-Hz amplitude-modulation. Human listeners perform a forced-choice task requiring them to detect a temporary loudness decrease in the tone (target) while receiving 4-Hz transcutaneous median-nerve stimulation (MNS) either in-phase or anti-phase relative to the noise. Target-detection performance and cortical steady-state responses (SSRs) to noise and tone are measured at four noise levels. Preliminary results show a significantly stronger 4-Hz SSR in the anti-phase vs. in-phase condition over frontocentral cortical regions. We are currently analyzing whether this tactile phase effect on the noise-masker representation impacts the neural processing and perceptual detectability of the auditory target. Preliminary observations indicate that the tactile phase may influence neural responses to the target when the latter is presented near the masking threshold. In sum, the relative timing of rhythmic MNS may influence the cortical representation of fluctuating auditory noise and possibly the neural masking potential of the noise.
Manaswini Kar, Marianny Pernia, Nathan Schneider, Isha Kumbam, Madelyn McAndrew and Srivatsun Sadagopan
Topic areas: correlates of behavior/perception neuroethology/communication
Vocalizations Call categorization Appetitive conditioning Spectral modulation Temporal modulation Go/No-go task
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Vocal communication sounds (human speech or animal calls) are produced with a high degree of variability in diverse listening conditions. The auditory system can seamlessly discriminate and categorize sounds despite such variability. However, what vocalization features are perceptually important for categorization and how these features are represented in the brain remain poorly understood. As a first step towards exploring the impact of different spectral and temporal cues for call categorization, we developed an appetitive Go/No-Go call categorization task. We trained guinea pigs (GPs), highly vocal and social rodents that use complex calls in specific behavioral situations, to discriminate between two call categories with similar low-frequency content but different temporal modulations to the envelopes. To determine the call features necessary for categorization, we presented the animals with calls with systematically manipulated spectral and temporal features. GPs maintained robust categorization across a wide range of temporal modulations including changes to tempo, reversal, and changes to inter-syllable intervals. However, categorization performance was affected when the frequency content of the calls was shifted away from the natural range. These results suggest that spectral cues are dominant for the categorization of some GP calls. To determine if this result generalizes to other call types, in ongoing work, we are performing these experiments using a different pair of calls that are characterized by strong frequency modulations and have non-overlapping frequency content.
Xindong Song, Yueqi Guo, Chenggang Chen and Xiaoqin Wang
Topic areas: correlates of behavior/perception hierarchical organization neural coding novel technologies
Pitch Auditory cortex Marmoset
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
How the brain processes the pitch of complex sounds has been one of auditory neuroscience’s central questions due to the importance of pitch in music and speech. The cortical representation of pitch has been demonstrated in one pitch-sensitive region near the anterolateral border between A1 and R in the common marmoset. However, it is not clear whether other pitch-processing regions exist in the marmoset brain. Here, we performed optical imaging over the entire auditory cortex on the brain surface in awake marmosets. By contrasting responses to harmonic complex sounds with spectrally matched noises, we identified two discrete pitch-sensitive regions. One region is located anterolaterally to the A1 and R border and is consistent with the previously described “pitch-center”. The second region is newly identified at a location anterior to the “pitch-center” and functionally overlaps with the RT field; we refer to it as the “anterior-pitch-region”. When tested with synthetic tones composed of low-numbered harmonics, these two pitch-sensitive regions only appear when the fundamental frequency (F0) is close to or higher than 400 Hz, consistent with the estimated harmonic resolvability of the marmoset. The response contrasts in these two pitch-sensitive regions were also robust when tested with more natural sounds such as a female’s singing (F0 ~300-700 Hz). Furthermore, the ratio between the singing contrast and the synthetic-tone contrast is higher in the “anterior-pitch-region” than in the “pitch-center”. Together, our results suggest that cortical pitch processing in marmosets is organized into discrete regions with a functional hierarchy along the anterior direction for natural harmonic sounds.
Meenakshi Asokan, Yurika Watanabe and Daniel Polley
Topic areas: memory and cognition correlates of behavior/perception neural coding
Auditory cortex Amygdala Functional coupling Long-range communication Fear conditioning Electrophysiology Optogenetics
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Neutral sounds can elicit distress after learned associations with aversive stimuli. Although synaptic plasticity processes within the basolateral amygdala complex (BLA) and auditory cortex (ACtx) ultimately encode the long-term fear memory and support conditioned behavioral changes, less is known about how interconnected networks of ACtx and BLA neurons communicate before, during, and after the initial association of sound and mild electric shock. Here, we use a combination of anatomical tracing, electrophysiology, optogenetics, and quantitative facial videography to elucidate changes in the functional coupling between BLA and ACtx as an emotionally neutral sound is transformed into a distressing sound. During the conditioning session and at post-conditioning recall, pupil diameter and spontaneous facial movements both provided evidence for rapid associative learning (n=7 mice). Using an intersectional virus strategy, we selectively expressed channelrhodopsin in ACtx neurons that project to BLA, allowing us to 1) optogenetically activate corticoamygdalar neurons en masse to monitor changes in feedforward activation in BLA; 2) optogenetically isolate single units in a higher-order field of the ACtx, the temporal association area (TeA), that project to BLA. We recorded simultaneously from single-unit ensembles in BLA and TeA over three consecutive days surrounding aversive conditioning; our ongoing analyses of the cross-regional spike-triggered LFP suggest enhanced functional coupling from TeA -> BLA, but not BLA -> TeA, on the post-conditioning day following presentation of the sound paired with aversive reinforcement. These findings shed light on the long-range interactions that go beyond the classical auditory neural axis as sounds acquire emotional significance.
David O Sorensen, Kenneth E Hancock and Daniel B Polley
Topic areas: auditory disorders memory and cognition correlates of behavior/perception
Frequency Following Response Envelope Following Response Electroencephalography Temporal processing Hearing-in-noise
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Stimulus-related temporal processing is a key feature of the auditory nervous system that can be observed in EEG. Abnormal neural encoding of temporal modulation has been associated with aging, sensorineural hearing loss, hidden hearing disorders, and mild traumatic brain injury. Most studies focus on a single timescale, but many salient stimuli, e.g. speech, have temporal dynamics nested across multiple timescales. Here, we describe an approach to study multiplexed encoding of temporal features organized along four nested timescales, ranging from stimulus temporal fine structure (~500 Hz) to temporal context (~0.5 Hz), and everything in between. Stimuli were concatenated amplitude-modulated (AM) tones presented in quiet or in the presence of spectrally filtered dynamic moving ripple noise. AM rates were arranged in a random (RAND) or patterned (REG) stimulus context, allowing us to quantify the frequency following response (FFR), envelope following response (EFR), and envelope change following response (ECFR) from individual tokens, as well as a sustained pattern potential when comparing REG versus RAND trials. Analysis of scalp EEG from passively listening young adult subjects demonstrated that FFRs, EFRs, and ECFRs all decreased in amplitude when the stimuli were accompanied by informational masking noise compared to quiet. Stimulus context (RAND or REG) produced no effects: the sustained pattern potential was not observed, and the faster synchronization components were not systematically changed by context in young neurotypical listeners. The stimulus and analysis paradigm described here have potential for capturing objective temporal processing deficits across a wide range of underlying timescales and neurological conditions.
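A minimal sketch of quantifying a following response at a given modulation rate from epoched EEG is shown below (toy data; a generic illustration, not the authors' paradigm):

```python
# Minimal sketch: average epochs time-locked to an AM tone, then read off the
# spectral amplitude of the averaged response at the modulation rate (EFR).
import numpy as np

def following_response_amplitude(epochs, fs, target_hz):
    """epochs: (n_trials, n_samples) EEG; returns amplitude at target_hz."""
    evoked = epochs.mean(axis=0)                       # trial averaging
    spectrum = np.abs(np.fft.rfft(evoked)) / len(evoked)
    freqs = np.fft.rfftfreq(len(evoked), d=1 / fs)
    return spectrum[np.argmin(np.abs(freqs - target_hz))]

# toy example: a 40-Hz EFR buried in noise, 1-s epochs sampled at 1 kHz
fs, n_trials, n_samples = 1000, 200, 1000
t = np.arange(n_samples) / fs
rng = np.random.default_rng(0)
epochs = 0.1 * np.sin(2 * np.pi * 40 * t) + rng.standard_normal((n_trials, n_samples))
print(f"EFR amplitude at 40 Hz: {following_response_amplitude(epochs, fs, 40):.3f}")
```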
Chris Angeloni, Wiktor Młynarski, Eugenio Piasini, Aaron Williams, Katherine Wood, Linda Garami, Ann Hermundstad and Maria Geffen
Topic areas: correlates of behavior/perception neural coding
efficient coding adaptation behavior
Fri, 11/5 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
The efficient coding hypothesis postulates that neurons shape their response properties to match their dynamic range to the statistics of incoming signals. However, whether and how the dynamics of efficient neuronal adaptation inform behavior has not been directly shown. Here, we trained mice to detect a target presented in background noise shortly after a change in the background contrast. The observed changes in cortical gain and detection behavior followed the predictions of a normative model of efficient cortical sound processing; specifically, target detection and sensitivity to target volume improved in low contrast backgrounds relative to high contrast backgrounds. Additionally, the time course of target detectability adapted asymmetrically depending on contrast, decreasing rapidly after a transition to high contrast, and increasing more slowly after a transition to low contrast. Auditory cortex was required for detection of targets in background noise and cortical neuronal responses exhibited the patterns of target detectability observed during behavior and in the normative model. Furthermore, variability in cortical gain predicted behavioral performance beyond the effect of stimulus-driven gain control. Combined, our results demonstrate that efficient neural codes in auditory cortex directly influence perceptual behavior.
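As a toy illustration of contrast-dependent gain control of the sort invoked by efficient-coding accounts, the sketch below divisively normalizes a stimulus by a running contrast estimate so that responses rescale after a contrast switch; it is not the normative model described in the abstract, and all parameters are illustrative.

```python
import numpy as np

def adapted_response(stimulus, tau=50, eps=1e-6):
    """Divisively normalize a stimulus by a leaky running estimate of its contrast.

    stimulus : 1D array of stimulus levels over time
    tau      : time constant (in samples) of the exponential contrast estimator
    """
    contrast = np.std(stimulus[:tau]) + eps   # initialize from the first chunk
    alpha = 1.0 / tau
    out = np.zeros_like(stimulus)
    gain = np.zeros_like(stimulus)
    for i, x in enumerate(stimulus):
        contrast = (1 - alpha) * contrast + alpha * abs(x)  # leaky contrast estimate
        out[i] = x / (contrast + eps)                        # gain ~ 1 / contrast
        gain[i] = 1.0 / (contrast + eps)
    return out, gain

# Low-contrast background switching to high contrast halfway through.
rng = np.random.default_rng(0)
stim = np.concatenate([rng.normal(0, 0.2, 500), rng.normal(0, 1.0, 500)])
resp, gain = adapted_response(stim)
```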
Karolina Ignatiadis, Diane Baier, Brigitta Tóth and Robert Baumgartner
Topic areas: correlates of behavior/perception neural coding
looming bias EEG attention
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
In contrast to our eyes, our ears never rest. Our auditory system remains alert, monitoring our environment and conveying information to us about events in our surroundings. The auditory looming bias is an early perceptual effect, reflecting higher alertness of listeners to approaching auditory objects compared to receding ones. Behavioral studies link the emergence of the looming bias to evolutionary traits. Neural investigations, however, argue also for a top-down projection from the prefrontal cortex to the auditory cortex, prioritizing approaching over receding sonic motion. Here, we test the influence of attention on eliciting the looming bias. Twenty-eight listeners were first exposed to sounds simulated as approaching, static, or receding while watching a silent movie in the passive condition and then while having to discriminate across those three categories in the active condition. EEG scalp potentials revealed pre-attentive correlates of the looming bias occurring as early as 90 ms post motion onset. The bias is intensified through attention while its timing and central topography are maintained. This argues for early, bottom-up driven origins of the auditory looming bias that serve to protect us from harmful events.
Yuko Tamaoki, Michael S. Borland, Rimenz Rodrigues De Souza, Collin Chandler, Arjun Mehendale and Crystal T. Engineer
Topic areas: auditory disorders speech and language novel technologies subcortical processing
Inferior colliculus Autism Spectrum Disorder Vagus Nerve Stimulation
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Individuals with autism often exhibit delayed and weak neural responses to sounds. Prenatal exposure to valproic acid (VPA) alters the development of both subcortical and cortical auditory areas in both humans and animal models. These neural changes have been observed in VPA-exposed rats, which display significantly delayed and weak responses to sound in the auditory cortex. Recent studies in VPA-exposed animals have also observed brainstem alterations, specifically in early auditory processing areas such as the superior olivary complex and the inferior colliculus. Therefore, a method to remediate these neural deficits throughout the auditory pathway is needed. We have developed a new approach to drive robust, specific plasticity that substantially enhances recovery after neurological damage. This strategy uses brief bursts of vagus nerve stimulation (VNS) paired with sound presentation. In the current study, we test the hypothesis that VNS paired with speech sound presentation, 300 times per day for 20 days, will reverse maladaptive plasticity and restore neural responses to sounds in VPA-exposed rats. Following the last day of VNS-sound pairing, neural recordings in response to tones, speech sounds, and noise burst trains were collected from the inferior colliculus in each of the experimental groups. Our results suggest that VPA animals displayed weaker responses to speech sounds and that VNS-sound pairing is an effective method to enhance auditory processing in rat models of ASD with degraded neural processing of sounds.
Jessica Jacobs, Patrik Wikman, Pawel Kuśmierek and Josef Rauschecker
Topic areas: correlates of behavior/perception cross-species comparisons hierarchical organization
auditory-motor integration electrophysiology motor cortex Dorsal stream internal models
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Using functional MRI, we have shown that motor regions along the dorsal stream and in the putamen are activated when macaques listen to auditory sequences they had previously learned to produce by pressing levers with their hands (‘monkey piano’; Archakov et al, 2020). The hand/arm regions of premotor cortex (PMC) were activated when monkeys listened to self-produced sound sequences as opposed to unlearned sound sequences. We are now performing electrophysiological recordings from auditory and motor regions in awake behaving monkeys. During recordings, the animals either listen to learned auditory sequences or produce an 8-note tone sequence on the monkey piano. The results show that auditory responses of single neurons in AC were suppressed during sequence production compared to passive listening to the same learned sounds. When the piano produced incorrect notes, AC neurons responded to the sound more robustly, indicative of an error signal being generated. Similar error signals were also observed in PMC. Our findings suggest that an internal model of the expected auditory consequences of motor behavior is generated in PMC and sent to AC, where it is compared to the actual auditory input. Prediction errors generate a signal in AC, which is sent to motor regions. The results provide evidence for the existence of internal models in the auditory dorsal stream of Old World monkeys. Internal models are a prerequisite for speech. The fact that monkeys do not speak suggests that their internal models have evolved to support hand movements but not vocalizations.
David Meijer, Roberto Barumerli, Burcu Bayram, Michelle Spierings, Ulrich Pomper and Robert Baumgartner
Topic areas: memory and cognition correlates of behavior/perception
Spatial sound localization Bayesian inference Suboptimality in perception
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Uncertainty in perception is caused by an ever-changing environment and an abundance of sensory noise. To obtain statistically optimal perceptual estimates, Bayesian inference prescribes that the likelihood function from noisy sensory representations is integrated with continually updated prior beliefs. While there is evidence to suggest that the human brain is able to approximate such Bayesian inference in vision, this is less clear in audition. Moreover, recent reports propose that the brain frequently resorts to computationally simpler, heuristic methods, such as stochastically switching between likelihood and prior instead of integrating them. Here, we have designed an experiment to test whether auditory space perception is consistent with Bayesian inference. Participants hear a sound sequence of unpredictable length and indicate their perceived location of the last sound. Each sound has a 1/6 chance to be randomly sampled from within -60° to +60° azimuth and otherwise defaults to the previous sound’s location. This random change-point process introduces a variety of spatial prior reliabilities so that the relative weighting between prior and likelihood can be studied. We compare participants’ responses against model predictions for a Bayesian observer and a switching observer. Initial analyses of pilot data (N=5) confirm that both models fit the data qualitatively well, though small differences exist. Future analyses of a full dataset (N=32) should enable us to conclude whether human sound localization is consistent with Bayesian inference or is better described by heuristic solutions. Additionally, EEG analyses will be used to assess the neural implementation of the proposed algorithm.
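To make the model comparison concrete, here is a minimal sketch of the two candidate observers for a single trial under Gaussian assumptions: a Bayesian observer that combines prior and likelihood by precision weighting, and a switching observer that stochastically reports one or the other. Parameter values and names are illustrative, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def bayesian_estimate(prior_mu, prior_sd, sensory_sample, sensory_sd):
    """Precision-weighted combination of prior and likelihood (Gaussian case)."""
    w_prior = prior_sd ** -2
    w_like = sensory_sd ** -2
    return (w_prior * prior_mu + w_like * sensory_sample) / (w_prior + w_like)

def switching_estimate(prior_mu, prior_sd, sensory_sample, sensory_sd):
    """Report either the prior or the sensory sample, chosen by relative precision."""
    p_prior = prior_sd ** -2 / (prior_sd ** -2 + sensory_sd ** -2)
    return prior_mu if rng.random() < p_prior else sensory_sample

# One simulated trial: true source at +20 deg azimuth, noisy sensory sample,
# prior centered on the previous sound's location at +10 deg.
true_az, sensory_sd, prior_mu, prior_sd = 20.0, 8.0, 10.0, 5.0
sample = rng.normal(true_az, sensory_sd)
print(bayesian_estimate(prior_mu, prior_sd, sample, sensory_sd))
print(switching_estimate(prior_mu, prior_sd, sample, sensory_sd))
```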
Chase Mackey, Leslie Liberman, M. Charles Liberman, Troy Hackett and Ramnarayan Ramachandran
Topic areas: auditory disorders correlates of behavior/perception hierarchical organization subcortical processing
Nonhuman primate Evidence accumulation Probability summation Reaction-time Brainstem Hidden hearing loss
Fri, 11/5 2:15PM - 2:30PM | Short talk
Abstract
The process by which sensory evidence is accumulated over time to guide actions is characterized by different models of temporal integration across sensory modalities. Studies of the auditory system have reached substantially different conclusions regarding the hierarchical evolution of neuronal temporal integration and its relationship to perception. Additionally, few studies have investigated the process in realistic, noisy environments. To address this, we assessed psychophysical and neurometric measures of temporal integration in macaques detecting tones of different durations, in quiet and in noise. Detection theoretic analyses of single-unit responses in the inferior colliculus (IC) and cochlear nucleus (CN) of normal-hearing, behaving macaques revealed that IC and CN neuronal distributions of temporal integration rates in quiet were not significantly different, but were faster than behavior. In noise, a subset of IC neurons integrated more slowly, approximating behavior, while CN rates did not change. This suggests a subcortical transformation in temporal integration in noisy backgrounds. Hypothesizing from our models that changes in temporal integration may arise from cochlear synaptopathy (CS), we assessed the psychophysical measures in a macaque model of CS. CS reduced the slope of the psychometric function for short duration tones, which, interpreted in the context of a detection theoretic model, is consistent with integration from fewer auditory nerve fibers. Together, these data provide an account of auditory temporal integration at neuronal, computational, and perceptual levels, and the perceptual deficits due to CS point the way forward for future studies investigating the computational and neuronal changes that underlie subclinical hearing deficits. Informal discussion to follow at 3:00 pm EDT (GMT-4) in Gathertown, Discussion Area 2 (link below).
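A minimal sketch of a detection-theoretic (neurometric) analysis in this spirit: spike counts on tone and catch trials are compared with d' at each tone duration, and the growth of d' with duration gives a simple index of temporal integration. The toy rate model and all parameters are illustrative.

```python
import numpy as np

def neurometric_dprime(signal_counts, noise_counts):
    """d' between spike-count distributions on signal and catch trials."""
    mu_s, mu_n = np.mean(signal_counts), np.mean(noise_counts)
    pooled_sd = np.sqrt(0.5 * (np.var(signal_counts) + np.var(noise_counts)))
    return (mu_s - mu_n) / pooled_sd

# Simulate one neuron whose evoked count grows with tone duration.
rng = np.random.default_rng(2)
durations_ms = np.array([25, 50, 100, 200, 400])
dprimes = []
for dur in durations_ms:
    driven_rate = 20 + 0.05 * dur          # longer tones yield more spikes (toy model)
    signal = rng.poisson(driven_rate, 200)  # counts on tone trials
    noise = rng.poisson(20, 200)            # spontaneous counts on catch trials
    dprimes.append(neurometric_dprime(signal, noise))

# The growth of d' with duration is one simple index of temporal integration.
print(dict(zip(durations_ms.tolist(), np.round(dprimes, 2))))
```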
Agnes Landemard, Célian Bimbard, Sam Norman-Haigneré and Yves Boubenec
Topic areas: correlates of behavior/perception cross-species comparisons neural coding
auditory cortex natural sounds vocalizations functional Ultrasound imaging ferret cross-species
Thu, 11/4 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
Little is known about how neural representations of natural sounds differ across species. For example, speech and music play a unique role in human hearing, yet it is unclear how auditory representations of speech and music differ between humans and other animals. Using functional Ultrasound imaging, we measured responses in ferrets to a set of natural and spectrotemporally-matched synthetic sounds previously tested in humans. Ferrets showed similar lower-level frequency and modulation tuning to that observed in humans. But while humans showed substantially larger responses to natural vs. synthetic speech and music in non-primary regions, ferret responses to natural and synthetic sounds were closely matched throughout primary and non-primary auditory cortex, even when tested with ferret vocalizations. This finding reveals that auditory representations in humans and ferrets diverge sharply at late stages of cortical processing, potentially driven by higher-order processing demands in speech and music.
Emma Holmes and Timothy Griffiths
Topic areas: memory and cognition
Cognition Attention Cocktail party problem Psychophysics Ageing
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
We often face the challenge of understanding speech when competing speech is present. Listeners with normal hearing can deploy preparatory spatial attention to improve intelligibility in spatialised settings, but children who have hearing loss from a young age deploy preparatory spatial attention to a lesser extent than do children with normal hearing. It is currently unclear whether age-related hearing loss has similar detrimental effects on spatial attention, or whether older adults’ prior experience with sounds preserves (or perhaps increases reliance on) preparatory spatial attention despite hearing loss. Here, we investigated how age and audiometric thresholds relate to preparatory spatial attention. We recruited two groups of participants, aged 18-35 years and 60-80 years. We measured their audiometric thresholds and tested their ability to understand a target phrase in the presence of two competing phrases, which were spoken by different talkers and presented from different locations. The target talker was cued visually by an arrow (left or right), which was presented 100 or 2000 ms before the talkers started speaking—thus providing a short or longer interval for participants to prepare spatial attention. We measured intelligibility in both conditions at SNRs between -18 and +18 dB. Preliminary results suggest that intelligibility for both groups was better in the 2000 ms than the 100 ms condition, implying a benefit from preparatory spatial attention. The final analyses will compare the magnitude of benefit between the groups and examine how it relates to audiometric thresholds, thresholds for discriminating spatial location, and WAIS scores.
Guan-En Graham, Michael Chimenti, Kevin Knudtson, Devin Grenard, Liesl Co, Manuela Velez, Timothy Tchou and Kasia Bieszczad
Topic areas: memory and cognition
learning and memory epigenetics plasticity
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Auditory memory requires experience-dependent transcription within the auditory system for comprehension of sound and sound-guided action. Epigenetic mechanisms are powerful controllers of activity-dependent changes to gene expression that can have lasting effects on experience-dependent systems-level auditory neuroplasticity and ultimately on robust changes in sound-cued behavior. Blocking the epigenetic enzyme histone deacetylase 3 (HDAC3), which is thought to facilitate de novo gene expression, has been shown to regulate the precision of auditory memories by facilitating auditory cortical (ACx) plasticity (Bieszczad et al., 2015; Shang et al., 2020; Rotondo & Bieszczad, 2020; 2021). However, it is largely unknown which genes HDAC3 controls in ACx to influence auditory memory. RNA-sequencing on ACx samples taken from rats learning a two-tone sound frequency discrimination task shows that the main effect of blocking HDAC3 is to amplify the expression magnitude of learning-induced genes (vs. vehicle or naïve)—few genes are actually differentially expressed between trained groups (e.g., Adamts13, cabin1). Single molecule fluorescent in situ hybridization (smFISH) identified anatomical loci and cellular context for where specific genes of interest (identified from other memory systems, i.e., c-fos, egr1, and nr4a2) were increased in expression in ACx. Combining bulk RNA-seq with more sensitive and cell-type specific smFISH techniques has allowed determination of quantitative changes in HDAC3-mediated, learning-induced gene expression in the auditory system. Together, these results define a role for epigenetic action within ACx that may be key for neuroplasticity events supporting the consolidation of highly precise and lasting auditory memories, and that may offer novel molecular therapeutic targets.
Jiayue Liu, Josh Stohl, Enrique Lopez-Poveda and Tobias Overath
Topic areas: auditory disorders neural coding
auditory nerve model cochlear synaptopathy hearing loss auditory perception
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
‘Hidden hearing loss’ has inspired a wealth of research. The stochastic undersampling model (Lopez-Poveda & Barrios, 2013) suggests that auditory deafferentation introduces internal noise in subsequent, central auditory processes. In this model, auditory nerve fibers (ANFs) are modelled as samplers, which sample the input sound at individual stochastic rates, and ANF loss is mimicked by reducing the number of samplers. However, the model parameters do not capture the full complexity of physiological response characteristics, thus leaving unclear the type of information conveyed by the ANFs. We added half-wave rectification, refractoriness, and subtypes of ANFs to the original model to explicitly model ANF (type) loss within a more realistic physiological setting. In addition, we used an artificial-neural-network-based stimulus reconstruction to decode the modelled ANF responses back to sound (Akbari et al., 2019; Morise et al., 2016), which we then tested on participants. We conducted a pure tone in noise (PTiN) detection task and a modified version of the HINT (Nilsson et al., 1994) via MTurk. The behavioral stimuli were degraded using our model, with three levels of ANF survival (100, 10, and 5%). Preliminary results indicate that the PTiN threshold increases significantly with ANF loss, at a rate that aligns well with predictions from Oxenham (2016). For the HINT, the results only showed a significant threshold shift between the 10% and 5% ANF survival conditions. In conclusion, our model combines detailed physiological response properties with the stochastic undersampling model and thereby allows a quantification of the perceptual consequences of ANF loss.
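In the spirit of the stochastic undersampling idea (and far simpler than the extended model described), the sketch below reconstructs a half-wave rectified waveform from sparse random samples, so fewer surviving "fibers" yield a noisier reconstruction; all names and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def undersampled_reconstruction(sound, fs, n_fibers, samples_per_fiber=200):
    """Reconstruct a sound from sparse, stochastic per-fiber samples.

    Each 'fiber' contributes samples of the half-wave rectified waveform taken at
    random instants; pooled samples are linearly interpolated back onto the time axis.
    """
    rectified = np.maximum(sound, 0.0)                 # crude half-wave rectification
    n = len(sound)
    size = min(n, n_fibers * samples_per_fiber)        # total pooled samples, capped
    idx = np.sort(rng.choice(n, size=size, replace=False))
    t = np.arange(n) / fs
    return np.interp(t, t[idx], rectified[idx])

fs = 16000
t = np.arange(int(0.1 * fs)) / fs
tone_in_noise = np.sin(2 * np.pi * 1000 * t) + 0.5 * rng.standard_normal(len(t))
full = undersampled_reconstruction(tone_in_noise, fs, n_fibers=100)
sparse = undersampled_reconstruction(tone_in_noise, fs, n_fibers=5)   # heavy ANF loss
```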
Felix Bröhl, Anne Keitel and Christoph Kayser
Topic areas: speech and language multisensory processes
speech MEG cross-modal restoration cross-modality pitch mutual information
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
While multisensory speech is a core stimulus of everyday life, it remains unclear how the brain reflects and combines auditory and visual speech signals. Recent studies have indicated that even during unimodal speech (i.e., audio-only or video-only), cortical areas may engage in highly specific multisensory representations. One example is the restoration of a region's own modality-specific representation when this is lacking in the input, such as the auditory cortex reflecting the speech envelope during visual-only speech. Another example is the restoration of the not-presented signal that would normally be represented in the respective other cortex, such as the visual cortex representing the acoustic speech envelope during visual speech. However, evidence from previous studies investigating such cross-modal restoration of unisensory speech signatures is still sparse and inconclusive. We further this research by analyzing previously recorded MEG data in which participants listened to or watched speech in auditory- and visual-only conditions. By quantifying the neural representations of multiple acoustic and multiple visual speech features using a mutual information approach, we were able to show that 1) auditory speech features were tracked during visual-only speech in the auditory and visual cortex and 2) this tracking was mostly driven by a representation of pitch in visual but not auditory cortex. These findings suggest that while the auditory cortex during visual speech represents speech information only generically, e.g. by increased neural excitability in anticipation of sensory input, the visual system engages in more language-specific computations that could aid in speech perception.
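As a generic illustration of a mutual-information approach to stimulus tracking (not the authors' pipeline), the sketch below computes a plug-in, binned estimate of mutual information between a stimulus feature and a neural time course; the variable names and synthetic data are illustrative.

```python
import numpy as np

def binned_mutual_information(x, y, bins=8):
    """Plug-in estimate of mutual information (bits) between two signals.

    x, y : 1D arrays of equal length (e.g., an acoustic pitch contour and a
           source-localized MEG time course), discretized into equally populated bins.
    """
    def discretize(v):
        edges = np.quantile(v, np.linspace(0, 1, bins + 1))
        return np.clip(np.searchsorted(edges, v, side="right") - 1, 0, bins - 1)

    xd, yd = discretize(x), discretize(y)
    joint = np.histogram2d(xd, yd, bins=bins)[0]
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# Synthetic example: a 'neural' signal weakly tracking a 'pitch' contour.
rng = np.random.default_rng(4)
pitch = rng.normal(size=5000)
neural = 0.3 * pitch + rng.normal(size=5000)
print(binned_mutual_information(pitch, neural))
```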
Keshov Sharma, Mark Diltz and Lizabeth Romanski
Topic areas: multisensory processes neural coding neuroethology/communication
multisensory social communication monkey
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Facial gestures, mouth movement, and corresponding vocal stimuli are routinely integrated during social communication in animals and humans by a network of brain regions. The ventrolateral prefrontal cortex (VLPFC) is a part of this network and is highly responsive to social stimuli. Macaque VLPFC neurons selectively respond to and integrate species-specific vocalizations and faces. Investigations of other socially responsive brain regions indicate preferred processing of attributes including expression and identity; such a preference has not yet been established in VLPFC. In this study, we asked whether socioemotional expressions or caller identity could be decoded from VLPFC single neuron and population responses to social stimuli. Moreover, we asked whether decoding accuracy was enhanced when face and vocalization stimuli were presented simultaneously compared to visual information alone. We recorded single and multi-unit neurons in VLPFC, while macaques viewed movies of conspecifics vocalizing with accompanying facial gestures. Decoding of single neuron firing rates identified neurons with greater than chance decoding accuracy for expression and identity within the VLPFC. Combined as a pseudo-population, these cells exhibited increasing decoding accuracy for both variables as a function of population size, with decoding accuracy of identity increasing faster than expression. Our data show that sufficient information for the classification of different expressions and identities is present at the population level in the VLPFC, with identity as the potential driver of the population response. Our results encourage further investigation into the role of the VLPFC in social communication as well as the perception and interpretation of emotional expressions.
Mingyue Hu, Roberta Bianco, Antonio R Hidalgo and Maria Chait
Topic areas: memory and cognition correlates of behavior/perception
memory auditory scene analysis regularity MEG EEG attention
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Listeners are tuned to the emergence of auditory patterns. This sensitivity is hypothesized to rely on memory mechanisms that track the unfolding acoustic structure. However, the nature of these mnemonic mechanisms remains elusive. In particular, it is not known: (1) whether they are encapsulated or relate to individuals’ explicit memory abilities; (2) whether they are limited by duration or by information content; and (3) whether participants’ behavioral pattern detection ability correlates with automatic processing in auditory cortex. Building on Barascud et al (2016, PNAS), we use structured tone-pip sequences that contain transitions between random and regularly repeating tone-pip patterns. In the behavioral assays, listeners were instructed to detect such transitions (50% of trials). During EEG/MEG experiments, naïve subjects listened passively. Memory mechanisms were taxed by introducing increasingly long gaps (up to 500 ms) between successive tones and by manipulating information content (sequence complexity) vs. duration. Pattern detection ability was related to listeners’ cognitive profile by obtaining performance on a set of standard memory/attention tasks. In a series of large-N online experiments we found that (1) listeners can maintain stable pattern detection performance up to gaps of 200 ms, with a sharp drop in sensitivity thereafter; (2) performance is not linked to sustained attention abilities but to explicit auditory short-term memory; and (3) performance is largely limited by memory store duration, with minor effects of information content. Related EEG and MEG data are now being acquired. This work was supported by a BBSRC grant to MC.
Alexandria Lesicko, Christopher Angeloni, Jennifer Blackwell, Mariella De Biasi and Maria Geffen
Topic areas: hierarchical organization subcortical processing
inferior colliculus auditory cortex cortico-collicular predictive coding stimulus specific adaptation
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Sensory cues are differentially encoded depending on the contextual stream in which they are embedded. This adaptation can be understood through a predictive coding framework, in which responses to predictable stimuli become attenuated (repetition suppression), while unexpected cues elicit a prediction error response. In the auditory system, prediction error first appears at the level of the auditory midbrain, or inferior colliculus (IC), and is most prominent in the auditory cortex (AC). To determine the role of descending connections from AC to IC in predictive coding, we selectively suppressed the cortico-collicular (CC) pathway while recording responses to predictable and unpredictable stimuli in IC. To selectively target auditory CC cells, we made bilateral injections of a retro AAV-Cre construct in IC, while injecting an AAV-Flex-ArchT construct in AC. We performed extracellular recordings in IC in awake mice while playing tone sequences designed to parse prediction error and repetition suppression effects, suppressing CC cells on a fraction of trials. We found that suppression of the CC pathway led to a decrease in prediction error but did not affect repetition suppression. We also discovered populations of IC neurons that exhibit repetition enhancement, an increase in firing with stimulus repetition, and negative prediction error, a stronger response to a tone in a predictable rather than unpredictable context, both of which were suppressed during CC inactivation. Neurons in IC responded more similarly to each context in the absence of cortical input, suggesting that AC provides cues about the statistical context of sound to subcortical brain regions.
Xiu Zhai, Mina Sadeghi, Fengrong He, Heather Read, Ian Stevenson and Monty Escabi
Topic areas: neural coding
Sound texture Sound statistics Neural coding Neural correlations
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Time-averaged summary statistics in sound textures contribute to sound perception and are known to modulate neural activity in auditory midbrain. Yet, the role of different neural codes towards the neural representation of textures remains largely unknown. Here we recorded neural activity in the inferior colliculus (IC) of unanesthetized rabbits using linear multi-channel silicon probes during passive listening to natural sound textures and synthetic variants with perturbed statistics. Spike-sorted single-unit data and an analog representation of multi-unit activity (aMUA) were evaluated from the neural recordings. Single-unit response properties varied with changing sound summary statistics. Shuffled autocorrelograms exhibited a sharper and stronger peak upon adding higher-order statistics, indicating enhanced temporal coding, higher temporal precision and reliability. By comparison, firing rates were relatively constant and unaffected by the summary statistics. Furthermore, correlated firing between single-units showed diverse structures across sounds and increased upon adding statistics to the synthetic stimuli. Similar findings were demonstrated in the neural aMUA. Next, single-trial neural decoders were used to determine whether the neural correlations and spectrum in IC could allow sound textures to be identified and discriminated. The neural correlation-based classifier performance improved with added sound statistics, resembling trends observed in human psychoacoustics (Zhai et al 2020). By comparison, neural spectrum-based codes contributed much more towards texture discrimination. The findings suggest that neural response statistics in auditory midbrain are modulated in unique ways by statistical regularities in sounds and that rate-based and correlation-based neural codes may have dissociable roles for the recognition and discrimination of natural texture sounds.
Hiroaki Tsukano, Divya Narayanan, Amber Kline, Koun Onodera and Hiroyuki Kato
Topic areas: hierarchical organization
Auditory Cortex Mapping Optical Imaging Stereotaxic coordinates Variability Mice
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
A major goal of neuroscience is to understand the specific functions of individual brain regions. Stereotaxic coordinates based on brain atlases are widely used for experimentally targeting individual regions. However, in the mouse neocortex, due to the lack of cytoarchitectural landmarks that delineate areal borders, it remains unclear how variability across individual animals affects the precision of stereotaxic targeting. Moreover, it is becoming evident that the coarse delineation of auditory cortices in the mouse atlas as “A1, AuD, and AuV” is insufficient to describe the true complexity of their functional organization. Therefore, to build a solid understanding of area-specific functions, it is necessary to reevaluate the accuracy of area boundaries in brain atlases. Here, to evaluate the spatial variability of functionally identified auditory cortical areas across individual animals, we analyzed intrinsic signal imaging data from over two hundred mice. We quantified variability in both (1) inter-areal relative locations and (2) absolute stereotaxic areal locations to estimate the likelihood of failure in stereotaxic targeting of auditory areas. Finally, we will compare the stereotaxic coordinates of “AuV” in the atlas with our targeted marking of the functionally identified secondary auditory cortex (A2) to directly quantify the deviations. Our data suggest that stereotaxic targeting is prone to inaccuracy due to the marked spatial variability across individuals, indicating the necessity of functional mapping. Together, we hope our work sets a new standard and helps the field move forward in dissecting cortical circuits to identify the unique roles of individual auditory cortical areas.
Melanie Tobin, Janaki Sheth, Katherine Wood and Maria Geffen
Topic areas: neural coding
auditory cortex interneurons neural coding
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Cortical neuronal networks are composed of multiple types of excitatory and inhibitory neurons, which perform specific computations. In the auditory cortex (AC), different inhibitory neurons differentially affect sound processing in excitatory neurons, yet how they affect network dynamics remains poorly understood. To establish the role of different interneurons within the cortical network, we stimulated specific interneuron subpopulations while imaging the network responses to sounds in AC of awake, head-fixed mice. We monitored the activity of populations of hundreds of neurons using two-photon calcium imaging and simultaneously increased the activity of an interneuron subpopulation using optogenetic stimulation. We focused on vasoactive intestinal peptide-positive (VIP) and somatostatin-positive (SST) inhibitory neurons, which have been found to shape cortical responses in a context-dependent fashion (Natan et al, 2015; Pi et al, 2013). Responses of cells activated by sound were enhanced when VIPs were activated, but reduced and slowed when SSTs were activated. By fitting rate-level functions of single neurons with either a sigmoid (monotonic) or Gaussian (non-monotonic) function, we observed that SST or VIP activation affected different parameters of the rate-level functions of monotonic and non-monotonic cells. At the population level, we computed measures of detectability and discrimination across neural populations. SST activation decreased population detectability of sounds and increased discriminability between sounds. By contrast, VIP activation increased population sound detectability and decreased discriminability between sounds. We conclude that activation of VIP or SST interneurons leads to opposite effects on the response of neural networks to sounds in AC.
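To illustrate the kind of rate-level function fitting mentioned here, the sketch below fits a sigmoid and a Gaussian to one neuron's synthetic rate-level data with scipy's curve_fit and compares goodness of fit; the specific function forms and parameters are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(level, rate_max, slope, threshold, baseline):
    """Monotonic rate-level function."""
    return baseline + rate_max / (1.0 + np.exp(-slope * (level - threshold)))

def gaussian(level, rate_max, best_level, width, baseline):
    """Non-monotonic rate-level function."""
    return baseline + rate_max * np.exp(-0.5 * ((level - best_level) / width) ** 2)

# Synthetic non-monotonic neuron: rates peak near 50 dB SPL, then roll off.
levels = np.arange(0, 90, 10, dtype=float)
rng = np.random.default_rng(5)
rates = gaussian(levels, 40, 50, 15, 5) + rng.normal(0, 2, size=levels.size)

p_sig, _ = curve_fit(sigmoid, levels, rates, p0=[40, 0.1, 40, 5], maxfev=10000)
p_gau, _ = curve_fit(gaussian, levels, rates, p0=[40, 50, 15, 5], maxfev=10000)

# Compare residual error to decide whether the cell looks monotonic or non-monotonic.
sse_sig = np.sum((rates - sigmoid(levels, *p_sig)) ** 2)
sse_gau = np.sum((rates - gaussian(levels, *p_gau)) ** 2)
print("monotonic" if sse_sig < sse_gau else "non-monotonic")
```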
Christine Junhui Liu, Cathryn MacGregor, Benjamin Glickman, Lucas Vattino, Ana Castro and Anne Takesian
Topic areas: thalamocortical circuitry/function
inhibitory neurons long-range projections auditory cortex medial geniculate body neuron-derived neurotrophic factor (NDNF) vasoactive intestinal peptide (VIP)
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Identifying neural targets that control central auditory plasticity will have far-reaching impact, offering potential ways to restructure neural circuitry. Work from our lab and others has identified groups of GABAergic inhibitory neurons in the auditory cortex (ACtx) expressing neuron-derived neurotrophic factor (NDNF) and vasoactive intestinal peptide (VIP) as key regulators of auditory plasticity. How these neurons act on specific postsynaptic targets to control plasticity is not fully understood. Previous studies have focused on short-range, local outputs of these GABAergic neurons. However, our results show that a subset of these neurons send long-range projections to distant cortical and subcortical regions, which may gate sensory integration and feedback to earlier auditory processing centers. By combining advanced anatomical tracing and immunohistochemistry techniques, we reveal that subtypes of VIP and NDNF neurons send long-range projections that extend horizontally across cortical regions, vertically into deep layers of the cortical column, and subcortically to the medial geniculate body (MGB). Ongoing electrophysiological studies are evaluating the intrinsic and synaptic properties of these long-range GABAergic neurons. Together, these findings extend our understanding of NDNF and VIP inhibitory neurons beyond local cortical circuitry. The characterization of long-range GABAergic neurons in ACtx could provide insight into novel auditory plasticity mechanisms and potential therapeutic strategies to promote adult plasticity and learning.
Andrea Shang and Kasia Bieszczad
Topic areas: memory and cognition correlates of behavior/perception
auditory cortex plasticity memory electrophysiology
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Various forms of auditory cortical (ACx) plasticity (e.g., expansions in tonotopic area, shifts or narrowing of receptive field bandwidth) have been linked to key characteristics of auditory memory, such as its associative strength or its precision for specific sound cues. We used an epigenetic manipulation known to promote the formation of highly precise auditory memory for behaviorally important acoustic frequency cues (Bieszczad et al., 2015; Shang et al., 2019; Rotondo & Bieszczad, 2020a; 2020b; 2021) to investigate whether the same neural coding strategy supports frequency-specific memory in a more perceptually challenging frequency discrimination (<1 octave). Prior studies have found that highly frequency-specific memory occurs with frequency-specific decreases in ACx tuning bandwidth (thereby increasing neural selectivity for the trained frequency cues). We trained adult male rats (n = 12) in a frequency discrimination task that required animals to associate two frequency cues (~0.75 octaves apart) with two different associative outcomes (reward or error). Consistent with prior work, the epigenetic manipulation produced highly frequency-specific cued behavior for the rewarded frequency at memory test, yet ACx tuning bandwidth reductions were not detected, while other forms of plasticity were evident. Thus, highly precise behavioral memory for sound does not always occur with sound-specific reductions in ACx bandwidth; the auditory system must have redundant physiological substrates to support this behavioral effect. Stated broadly, different forms of learning-induced ACx plasticity may be redundantly and non-linearly linked to the same behavioral phenotypes. Results highlight the importance of task design in promoting desired cortical or behavioral function for individualized remediation.
Thaiz Sánchez-Costa, Alejandra Carboni and Francisco Cervantes Constantino
Topic areas: speech and language neural coding
selective attention prior knowledge cocktail party
Fri, 11/5 11:15AM - 12:15PM | Virtual poster + podium teaser
Abstract
Unlike vision, hearing is well equipped for top-down signals to reach early sensory processes. A question of theoretical interest is how early the enhancement effects of voluntary attention can be measured. Since repetition speeds up auditory cortex responses to speech, we systematically addressed the influence of prior knowledge on attention in a ‘cocktail-party’ setting. Because repetition also reduces the magnitude of cortical responses, we hypothesized that priming a listener with clean target versus clean masker speech just before hearing them again in a ‘cocktail party’ has opposing effects: limited neural enhancement during selection if the target was repeated (countering attention), versus greater enhancement if the masker was primed (assisting suppression). Using single-trial electroencephalography, indices of temporal attention allocated to continuous speech onset responses were measured by means of the temporal response function method. Participants (N=35) attended to a diverse pool of single speakers amongst brief (~8 s) spoken phrase mixture snippets presented once, to identify aspects such as topic, keywords, and target identity. While attentional enhancement modulations were found in both target- and masker-primed cases (and in no-priming controls), masker priming boosted differential encoding more effectively than target priming did. This was despite masker-primed trials being more difficult behaviorally. Moreover, an early (~50 ms) selective modulation was found for masker-primed trials, where the representation of the target exceeded that of the known masker. This indicates that, in ‘cocktail-party’ settings, learned expectations about a background can be incorporated pre-attentively for foreground segregation.
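A minimal sketch of a temporal response function estimate of the general sort used in such analyses: ridge-regularized regression from time-lagged copies of a stimulus feature onto a single EEG channel. The feature, lags, and regularization are illustrative, not the authors' settings.

```python
import numpy as np

def estimate_trf(stimulus, eeg, fs, max_lag_s=0.4, ridge=1.0):
    """Estimate a temporal response function by regularized least squares.

    stimulus : 1D stimulus feature (e.g., a speech onset signal), length n
    eeg      : 1D EEG channel, length n, at the same sampling rate
    Returns the TRF weights over lags from 0 up to max_lag_s.
    """
    n_lags = int(max_lag_s * fs)
    n = len(stimulus)
    X = np.zeros((n, n_lags))             # design matrix of lagged stimulus copies
    for lag in range(n_lags):
        X[lag:, lag] = stimulus[:n - lag]
    XtX = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(XtX, X.T @ eeg)

# Synthetic check: EEG is the stimulus convolved with a known kernel plus noise.
fs = 100
rng = np.random.default_rng(6)
stim = (rng.random(3000) < 0.05).astype(float)            # sparse onset-like events
kernel = np.exp(-np.arange(20) / 5.0)                      # "true" response, ~0-200 ms
eeg = np.convolve(stim, kernel)[:len(stim)] + 0.5 * rng.standard_normal(len(stim))
trf = estimate_trf(stim, eeg, fs)
```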
Rupesh Chillale, Shihab Shamma, Srdjan Ostojic and Yves Boubenec
Topic areas: memory and cognition correlates of behavior/perception hierarchical organization neural coding
Primary Auditory Cortex Categorization Sensory vs Category Disentangling Extracellular Recordings Population Decoding Linear Regression
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Grouping a set of stimuli into relevant categories is an important cognitive ability and a key process for assigning proper goal-directed behavioral responses. In perceptual tasks, the role of primary auditory cortex (A1) is to encode the sensory features of the stimulus. However, recent studies reveal that task-relevant features are also encoded by the A1 population, prompting a reevaluation of the role of A1. This study investigates to what extent A1 can encode sensory vs categorical variables, and how to disentangle the encoding of these features. We trained ferrets on a Go/No-Go delayed categorization task to discriminate click trains into relevant categories. We recorded neural activity from A1 while ferrets were passively listening to the stimulus and actively engaged in the task. Exploiting the task structure and using linear regression analysis, we disentangled sensory vs categorical features in A1 population activity. We revealed that population-level representations of the Go category became more contrasted against baseline activity than those of the No-Go category upon task engagement. We also isolated the emergence of categorical encoding during stimulus presentation, and how it was disrupted during incorrect trials. Finally, we show that categorical representations morphed at stimulus offset towards an anticipatory sustained activity with ramping dynamics culminating at response times.
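As a simplified illustration of using linear regression to disentangle sensory from categorical encoding, the sketch below regresses each neuron's trial-wise firing rate on a sensory regressor (click rate) and a category regressor (Go/No-Go); the synthetic population and regressor choices are illustrative assumptions.

```python
import numpy as np

def fit_encoding_weights(responses, click_rate, category):
    """Regress each neuron's firing rate on a sensory and a categorical regressor.

    responses  : (n_trials, n_neurons) firing rates
    click_rate : (n_trials,) sensory variable (e.g., click train rate)
    category   : (n_trials,) task variable (+1 Go, -1 No-Go)
    Returns per-neuron weights for [intercept, sensory, category], shape (3, n_neurons).
    """
    X = np.column_stack([np.ones_like(click_rate), click_rate, category])
    beta, *_ = np.linalg.lstsq(X, responses, rcond=None)
    return beta

# Synthetic population: half the neurons carry the sensory signal, half the category.
rng = np.random.default_rng(7)
n_trials, n_neurons = 400, 40
click_rate = rng.choice([4.0, 8.0, 16.0, 32.0], size=n_trials)
category = np.where(click_rate >= 16.0, 1.0, -1.0)      # boundary defines Go vs No-Go
true_w = np.zeros((2, n_neurons))
true_w[0, :20] = rng.normal(1, 0.2, 20)                  # sensory-weighted neurons
true_w[1, 20:] = rng.normal(3, 0.5, 20)                  # category-weighted neurons
responses = (np.column_stack([click_rate, category]) @ true_w
             + rng.normal(0, 1, (n_trials, n_neurons)))
weights = fit_encoding_weights(responses, click_rate, category)
```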
Mateo Lopez Espejo and Stephen David
Topic areas: correlates of behavior/perception neural coding
interneuron tuning pupillometry inhibitory gain-control spontaneous-rate
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Research in the mouse auditory cortex has shown differences in sound tuning properties between excitatory and inhibitory neurons. Inhibitory neurons are thought to sharpen spectral and temporal tuning and to contribute to gain control in excitatory pyramidal neurons. However, the generality of differential inhibitory and excitatory tuning across species remains largely unknown, particularly in non-rodent species. To explore this issue, we used the inhibitory interneuron-specific promoter mDlx to drive cell-type specific expression of channelrhodopsin (ChR2) in the auditory cortex of ferrets. We confirmed expression of ChR2 and colocalization with GABAergic neurons histologically. We then recorded the sound-evoked activity of neuronal ensembles using a silicon multielectrode array with an integrated optic fiber. Additionally, we performed pupillometry as a measure of arousal and internal state. During recordings, we identified putative inhibitory interneurons by short-latency responses to a brief blue-laser stimulus. We characterized differences between ChR2+ and ChR2- neurons in frequency tuning, response delay and duration, spontaneous firing rate, and pupil-related response modulation. Preliminary results suggest similar spectral tuning between neuronal types. However, we observe increased spontaneous firing rates in putative inhibitory interneurons and greater pupil-related modulation of gain and baseline firing rate for sound-evoked responses.
Gregory Hamersky, Luke Shaheen and Stephen David
Topic areas: correlates of behavior/perception hierarchical organization neural coding
Streaming Marmoset Encoding models
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Listeners often encounter auditory scenes containing complex, spectrally overlapping sound sources. Successful streaming, the ability to perceive a single sound source, requires identifying grouping cues: statistical regularities of natural sounds in the time and frequency domains. Though numerous psychoacoustic studies have described auditory streaming as a perceptual phenomenon, there has been little description of its underlying neural basis. The current study recorded single unit activity in the auditory cortex (AC) of awake marmosets. Passively listening marmosets were presented with pairs of overlapping natural sound excerpts from ethologically relevant categories of sound textures (backgrounds, BGs) and transients (foregrounds, FGs). Neural responses to overlapping pairs (BG+FG) were modeled as linear weighted combinations of the responses to BG and FG in isolation. Model weights showed that responses to combinations were frequently suppressed relative to individual sound responses. Effects varied across neurons and stimuli, though neural identity, rather than stimulus identity, more strongly predicted the extent of suppression. Surprisingly, the linear model also showed greater suppression of FG than BG responses. In addition, responses were often strongly nonlinear and could not be predicted by the linear model. These results suggest that both neural tuning and spectrotemporal cues contribute to nonlinear responses during overlapping sound presentation. These interactions can be explored using spectro-temporal encoding models and by relating selectivity to features of individual natural sounds.
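A minimal sketch of the linear weighted-combination idea: per neuron, find the weights on the isolated BG and FG responses that best predict the response to the BG+FG mixture, with weights below 1 indicating suppression. Data and names are synthetic and illustrative.

```python
import numpy as np

def fit_combination_weights(resp_bg, resp_fg, resp_both):
    """Fit weights so a weighted sum of the isolated BG and FG responses
    approximates the response to the BG+FG mixture.

    resp_bg, resp_fg, resp_both : 1D PSTHs (spikes/s over time bins) for one neuron
    Returns (w_bg, w_fg); weights below 1 indicate suppression of that stream.
    """
    X = np.column_stack([resp_bg, resp_fg])
    w, *_ = np.linalg.lstsq(X, resp_both, rcond=None)
    return w[0], w[1]

# Synthetic neuron whose foreground response is suppressed in the mixture.
rng = np.random.default_rng(8)
t_bins = 200
r_bg = np.abs(rng.normal(10, 3, t_bins))
r_fg = np.abs(rng.normal(15, 4, t_bins))
r_both = 0.9 * r_bg + 0.4 * r_fg + rng.normal(0, 1, t_bins)
print(fit_combination_weights(r_bg, r_fg, r_both))
```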
Matheus Macedo-Lima, Lashaka S. Jones, Rose Ying and Melissa Caras
Topic areas: memory and cognition correlates of behavior/perception hierarchical organization
learning perception rodent electrophysiology audition hearing
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
It is commonly accepted that, when it comes to learning to dance or ride a bike, “practice makes perfect.” However, our senses also benefit from training, a process termed perceptual learning. During auditory perceptual learning, practice is thought to strengthen the connection between a top-down brain network and auditory cortex, thereby enhancing cortical responses to sound, and improving perceptual detection capabilities; however, the top-down brain regions involved in this process are unknown. One promising candidate is the orbitofrontal cortex (OFC). The OFC sends direct projections to the auditory cortex and pairing sounds with OFC stimulation improves neural discrimination. To test OFC involvement in auditory perception and perceptual learning, we first determined whether OFC inactivation affected performance of Mongolian gerbils on an amplitude modulation (AM) detection task. We infused muscimol bilaterally into OFC before the task and found that AM detection was significantly impaired. Extracellular recordings from chronically implanted electrode arrays revealed that OFC inactivation abolished the top-down modulation of auditory cortical neurons during task performance. Finally, we recorded extracellular activity from the OFC of freely moving gerbils as they underwent auditory perceptual learning. We found that the activity of OFC neurons increased over the course of perceptual learning. Specifically, OFC firing rates significantly correlated with perceptual thresholds, but did not correlate with training day, which suggests that OFC activity is associated with learning and not task familiarity or repetition. We hypothesize that the OFC provides a top-down signal to facilitate practice-dependent improvements in auditory cortical and perceptual sensitivity.
Nathan Vogler, Tyler Ling, Alister Virkler, Violet Tu, Jay Gottfried and Maria Geffen
Topic areas: correlates of behavior/perception multisensory processes
Auditory Auditory cortex Multisensory Olfactory
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
In complex environments, the brain must integrate information from multiple sensory modalities, including the auditory and olfactory systems, for perception and behavior. However, despite the importance of multisensory integration, there is a critical gap in our understanding of how the brain integrates auditory and olfactory stimuli. Here, we investigated the mechanisms underlying auditory-olfactory integration using electrophysiology, anatomy, and behavior. First, we tested whether and how odor stimuli modulate auditory cortical neurons’ responses to sound in awake mice. We developed an experimental system for delivering combinations of auditory and olfactory stimuli, and we collected in vivo electrophysiological recordings from the auditory cortex (ACx) in response to sounds, in the presence or absence of an odor. Our results suggest that odor stimuli may modulate sound responses of ACx neurons in awake mice. Next, we used viral tracing strategies to investigate the anatomical circuits underlying auditory-olfactory integration. Our results demonstrate direct inputs to the ACx from the piriform cortex (PCx), suggesting a possible substrate for olfactory integration in ACx. Finally, behavioral experiments are investigating how odor stimuli influence sound detection. Combined, our results point to ACx as an important area for auditory-olfactory integration.
Alexander J Billig, Sukhbinder Kumar, William Sedley, Phillip E. Gander, Joel I. Berger, Maria Chait, Meher Lad, Hiroto Kawasaki, Christopher K. Kovach, Matthew A. Howard and Timothy D. Griffiths
Topic areas: memory and cognition correlates of behavior/perception hierarchical organization neural coding
Auditory cognition Memory Hippocampus
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
In addition to supporting declarative memory and navigation, the hippocampus can represent sensory and conceptual spaces (Behrens et al., Neuron 100:490-509, 2018). Rodent hippocampal place cells can also form discrete firing fields for sound frequencies (Aronov et al., Nature 543:719-722, 2017). To investigate whether the human hippocampus maps auditory features, we generated chords that are perceived on a continuum from "beepy" to "noisy", based on the number of simultaneous components. Unlike tone frequency, this "density" feature is not consistently associated with any physical spatial dimension. In an fMRI experiment, subjects held in mind a 2-s sound of fixed density and adjusted the density of an 8-s sound to match it. In other conditions we removed the memory component (adjustments were made freely with no target density), the adjustment component (button presses were made for odd/even judgments on spoken digits), or both. The design distinguished activity supporting auditory memory from that relating to sound adjustment, also controlling for sensory and motor factors. Auditory working memory was associated with activity in insula, inferior frontal gyrus and paracingulate cortex. Density adjustment elicited bilateral hippocampal activity, which did not depend on the subject navigating toward a fixed target density. Parallel intracranial recordings in epilepsy patients revealed sustained hippocampal 1-8 Hz power during a similar task. Ongoing unit recordings from these patients, together with multivariate analysis of the fMRI data, will establish whether human hippocampus "maps" density space as rodent hippocampus does for tone frequency, helping to clarify the role of this structure in auditory cognition.
James Bigelow, Ryan Morrill, Timothy Olsen, Stephanie Bazarini and Andrea Hasenstaub
Topic areas: auditory disorders memory and cognition correlates of behavior/perception multisensory processes
audiovisual integration multisensory integration cortical layers inhibitory interneurons mutual information
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Recent studies have established significant anatomical and functional connections between visual areas and primary auditory cortex (A1), which may be important for perceptual processes such as communication and spatial perception. However, much remains unknown about the microcircuit structure of these interactions, including how visual context may affect different cell types across cortical layers, each with diverse responses to sound. The present study examined activity in putative excitatory and inhibitory neurons across cortical layers of A1 in awake male and female mice during auditory, visual, and audiovisual stimulation. We observed a subpopulation of A1 neurons responsive to visual stimuli alone, which were overwhelmingly found in the deep cortical layers and included both excitatory and inhibitory cells. Other neurons, for which responses to sound were modulated by visual context, were similarly excitatory or inhibitory but were less concentrated within the deepest cortical layers. Important distinctions in visual context sensitivity were observed among different spike rate and timing responses to sound. Spike rate responses were themselves heterogeneous, with stronger responses evoked by sound alone at stimulus onset, but greater sensitivity to visual context in the sustained firing activity following transient onset responses. Minimal overlap was observed between units with visually modulated firing rate responses and units with visually modulated spectrotemporal receptive fields (STRFs), which are sensitive to both spike rate and timing changes. Together, our results suggest that visual information carried by infragranular inputs influences sound encoding across cortical layers in A1, and that these influences independently impact qualitatively distinct responses to sound.
Mahtab Attarhaie Tehrani, Sharad Shanbhag, Rahi Patel and Jeffrey Wenstrup
Topic areas: neuroethology/communication subcortical processing
communication calls auditory midbrain frequency tuning
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
We explored the representation within inferior colliculus (IC) subdivisions of both ultrasonic vocalizations (USVs) and several categories of non-USV, mostly broadband calls. We examined: 1) how responsiveness to different vocal categories relates to frequency tuning, 2) vocal selectivity beyond that related to frequency tuning, and 3) the distribution of vocal responses and frequency tuning across IC subdivisions. In head-fixed, urethane-anesthetized adult CBA/CaJ mice, we extracellularly recorded single- and multi-unit responses to pre-recorded syllables containing a broad range of spectral components. Generally, IC sites across subdivisions responded to vocal signals with energy corresponding to their frequency response areas. Sites tuned below 20 kHz responded mostly to non-USV calls: the LFH call (fundamental 4 kHz), MFV call (fundamental ~9-15 kHz), and Noisy call (broadband >15 kHz). Sites tuned to 30-40 kHz also responded to stepped USVs that contain frequencies below 50 kHz, but not to tonal USVs limited to frequencies above 55 kHz. Responses to tonal USVs were mostly recorded at high frequency sites (>55 kHz), but these sites also responded to the non-USVs, due either to signal energy near the peak of high frequency tuning curves or to energy within tuning curve tails. Further, the more complex spectrotemporal structures of stepped USVs and noisy calls were represented in the temporal response patterns of some IC neurons. These results indicate that information about a wide range of mouse vocalizations is present in both lemniscal and non-lemniscal projections through the auditory thalamus, including those through the medial auditory thalamus to structures such as the amygdala.
Cynthia King, Rachel Landrum, Stephanie Schlebusch, David Kaylie, Christopher Shera and Jennifer Groh
Topic areas: multisensory processes subcortical processing
EMREOs middle ear muscles outer hair cells
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Eye movements are critical to linking spatial hearing and vision – every eye movement shifts the relative relationship between the visual and auditory sense organs. Neurophysiological studies have previously shown evidence for eye movement-related modulation of auditory signals within the brain, and we recently discovered a unique type of otoacoustic emission that accompanies eye movements. These eye-movement related eardrum oscillations (EMREOs) occur in the absence of external sound and carry precise information about saccade magnitude, direction, and timing (Gruters et al 2018, Murphy et al 2020). However, how these eye movement-related effects in the auditory periphery contribute mechanistically to hearing is not yet well understood. Two auditory motor systems may be involved in generating EMREOs: the middle ear muscles and/or the cochlear outer hair cells. To gain insight into which systems are involved and how they contribute, we are presently investigating the EMREOs in human subjects with dysfunction involving these systems, compared to a normal hearing population. We find that EMREOs are abnormal in such clinical patients, most commonly with the EMREO being abnormally small in individuals who have impaired outer hair cell or stapedius function. These findings tie the EMREO’s properties to peripheral motor systems known to play a mechanistic role in auditory perception. Future work is needed to assess if patients with these types of hearing loss have specific impairments in the perceptual process of integrating visual and auditory spatial information.
Xiao-Ping Liu and Xiaoqin Wang
Topic areas: neural coding
Auditory cortex Temporal coding Bursting neurons
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Neurons recorded in auditory cortex of awake marmoset show diverse responses to sound, but the relationship between neuronal type and this diversity is not understood. We developed a method to classify units into three dominant classes (regular spiking, fast spiking, and bursting) based on extracellular waveform, spike timing, and spontaneous rate. A second unsupervised clustering method identified the same classes with high agreement (~94%). We find that the temporal dynamics of responses are strongly influenced by unit type. A subset of bursting units with high bursting frequency (“fast bursting”) showed short precise latencies and strong adaptation. Both fast spiking and fast bursting units were able to follow rapid stimulus fluctuations, but the strong adaptation of the latter group accentuated acoustic onsets and produced the best phase-locking. Fast bursting units responded to vocalizations with transient spiking at particular moments during the call. These unit type differences may contribute to parallel temporal and rate codes in cortical processing.
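As a rough illustration of classifying units from simple physiological features (not the authors' method, which also used extracellular waveform shape and a separate unsupervised check), the sketch below clusters synthetic per-unit features with k-means; the feature choices and values are illustrative.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, whiten

# Illustrative per-unit features analogous to those described: spike width (ms),
# a burst index (fraction of ISIs < 10 ms), and spontaneous firing rate (Hz).
rng = np.random.default_rng(9)
regular = np.column_stack([rng.normal(0.6, 0.05, 50),
                           rng.normal(0.05, 0.02, 50),
                           rng.normal(3, 1, 50)])
fast = np.column_stack([rng.normal(0.3, 0.04, 50),
                        rng.normal(0.05, 0.02, 50),
                        rng.normal(12, 3, 50)])
burst = np.column_stack([rng.normal(0.5, 0.05, 50),
                         rng.normal(0.40, 0.08, 50),
                         rng.normal(5, 2, 50)])
features = np.vstack([regular, fast, burst])

# Normalize each feature to unit variance, then cluster into three putative classes.
normalized = whiten(features)
centroids, labels = kmeans2(normalized, 3, minit='++')
print(np.bincount(labels))
```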
Sarah Verhulst, Sarineh Keshishzadeh, Hannah Keppler and Ingeborg Dhooge
Topic areas: auditory disorders correlates of behavior/perception subcortical processing
Sensorineural hearing loss and tinnitus Cochlear Synaptopathy Auditory physiology Speech intelligibility
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Tinnitus and age-related hearing difficulties can occur when audiometric hearing sensitivity is normal; cochlear synaptopathy (CS) has therefore been proposed as a possible trigger mechanism for either etiology. However, the exact mechanisms underpinning ascending neural pathway adaptation after CS and their functional consequences remain unclear. To better understand cause and effect, we investigated how markers of sensorineural hearing loss (SNHL) and speech intelligibility differ in two age-matched younger (22.7 ± 1.1 y/o, PTA = 4.2 ± 2.9 dB, N=31) and older (48.8 ± 5.6 y/o, PTA = 13.8 ± 4.7 dB, N=23) subject groups with or without sustained tinnitus. SNHL markers included RAM-EFR and ABR amplitudes and high-frequency thresholds (HFTs) up to 20 kHz. Speech reception thresholds (SRTs) were derived from a standardized 5-word-sentence test presented in quiet or in 70-dB stationary noise. RAM-EFR and ABR wave-I markers of CS were significantly smaller in the older than the younger group and did not differ between tinnitus and non-tinnitus subjects within the age groups. This supports the view that CS is associated with ageing rather than with tinnitus. Elevated SRTs were observed in older listeners, and in those with weaker RAM-EFRs, elevated PTAs or HFTs. Tinnitus did not affect these outcomes, demonstrating a stronger importance of SNHL than tinnitus in predicting speech perception difficulties. Possible alterations in brainstem processing after CS were evaluated by studying the effects of age, tinnitus, the RAM-EFR or HFTs on the ABR wave-I/V ratio. We found no systematic dependencies, and thus no straightforward connection to this brainstem-gain marker.
Noémie te Rietmolen, Benjamin Morillon and Daniele Schön
Topic areas: speech and language neural coding
speech and music perception sEEG functional connectivity
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
This project investigates the brain dynamics underlying the perception of speech and music. Both speech and music are uniquely human, complex auditory sounds; however, whether they rely on the same neural underpinnings remains hotly debated. While several studies have reported results in favor of shared neural resources, others provide evidence for specialized, neurally separable structures. One reason for the divergence in the literature comes from the methodological problems that are often faced when comparing the sounds. For instance, because speech and music (both highly complex) depend on multiple cognitive operations, which makes them difficult to compare directly, researchers attempt to match the sounds as closely as possible and, in doing so, significantly reduce their complexity. This effort can render the sounds unnatural and calls into question to what extent the reported brain responses reflect genuine speech and music processing. Another methodological limitation concerns the granularity of the analyses. For instance, while studies using fMRI may report overlapping neural activity in certain brain regions, indicating domain-general processing, the oscillatory dynamics in these areas may well be domain-specific. Here, we sought to address these limitations by recording sEEG data, which allow for high temporal and spatial resolution, in an ecological paradigm: 18 epileptic patients passively attended long stretches (~10 minutes) of natural and continuous speech and music. Functional connectivity analyses point towards largely shared neural substrates, while additionally revealing more localized patterns of selectivity and showcasing the value of fine-grained analyses when comparing speech and music.
Matthew McGinley, Su Chen and Jan Willem de Gee
Topic areas: correlates of behavior/perception neural coding novel technologies thalamocortical circuitry/function
attention listening effort pupil drift diffusion modeling signal detection theory generalized linear model survival analysis auditory cortex neuromodulation
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Auditory decision-making and its neural correlates are typically studied in laboratory tasks with rigid trial-based structures that facilitate the use of signal detection theory (SDT; Green and Swets, 1966) and drift diffusion modeling (DDM; Ratcliff and McKoon, 2008). However, during natural listening, important stimuli arrive at unpredictable and perhaps long intervals. The neural circuit implementation of factors like attentional effort may differ between rigid trial-based and natural tasks. Thus, we have developed a novel quasi-continuous sensory detection paradigm and an associated toolkit of modeling approaches based on SDT, DDM, and GLMs of choice and its hazard function. First, we developed real-time SDT (rt-SDT), for which sensitivity and decision criterion are analytically shown to be equivalent for a short sliding time-window and at the per-stimulus level, after incorporating the decision temporal prior with survival analysis. Second, we employ a DDM with a single bound, leak, and lapses, mirroring recent human work (Ossmy et al., 2013). Finally, we apply GLMs of choice as a binary variable (DeCarlo, 1998) or of the response hazard function using survival analysis. Using simulations, we show that rt-SDT provides simple interpretations that work on small data sets, while the DDM requires more extensive data but provides a richer and behaviorally mechanistic account of decision-making. The GLMs are intermediate between rt-SDT and DDM in terms of constrainability and can be interpretively opaque, but they powerfully incorporate physiological and behavioral covariates. Our ongoing work is establishing a solid foundation for modeling quasi-continuous, sustained listening. Ossmy, Moran, … Donner (2013). Current Biology, 23(11), 981-986. DeCarlo (1998). Psychological Methods, 3(2), 186.
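As an illustration of the sliding-window signal-detection idea described above (not the authors' rt-SDT, which additionally incorporates a decision temporal prior via survival analysis), a hedged Python sketch of windowed d' and criterion might look like the following; the window length, response window, and false-alarm bookkeeping are all assumptions.

```python
# Hypothetical sketch of sliding-window SDT for a quasi-continuous detection
# task: tally hits and false alarms in a moving window, convert to d'/criterion.
import numpy as np
from scipy.stats import norm

def sliding_sdt(stim_times, resp_times, t_grid, window=60.0, resp_window=1.0):
    """Return d' and criterion evaluated at each time in t_grid (seconds)."""
    dprime, criterion = [], []
    for t in t_grid:
        stims = stim_times[(stim_times >= t - window) & (stim_times < t)]
        resps = resp_times[(resp_times >= t - window) & (resp_times < t)]
        # A stimulus is a hit if any response follows it within resp_window.
        hits = sum(np.any((resps > s) & (resps <= s + resp_window)) for s in stims)
        # A response with no stimulus in the preceding resp_window is a false alarm.
        fas = sum(not np.any((stims < r) & (stims >= r - resp_window)) for r in resps)
        # Crude count of false-alarm "opportunities": non-stimulus seconds in the window.
        n_noise = max(window - len(stims) * resp_window, 1.0)
        ph = np.clip((hits + 0.5) / (len(stims) + 1), 1e-3, 1 - 1e-3)
        pf = np.clip((fas + 0.5) / (n_noise + 1), 1e-3, 1 - 1e-3)
        dprime.append(norm.ppf(ph) - norm.ppf(pf))
        criterion.append(-0.5 * (norm.ppf(ph) + norm.ppf(pf)))
    return np.array(dprime), np.array(criterion)

# Toy usage: targets every ~10 s, responses to most of them plus a few guesses.
rng = np.random.default_rng(0)
stims = np.cumsum(rng.exponential(10.0, 60))
resps = np.sort(np.concatenate([stims[rng.random(60) < 0.8] + 0.4,
                                rng.uniform(0, stims[-1], 10)]))
d, c = sliding_sdt(stims, resps, t_grid=np.arange(60, stims[-1], 30))
print(d.round(2))
```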
Garret Kurteff and Liberty Hamilton
Topic areas: speech and language
intracranial recordings speech production speech perception encoding computational neuroscience
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
We used intracranial recordings to understand speech-induced suppression and enhancement and its effect on phonological feature tuning. Patient participants overtly produced sentences, then listened to playback of themselves producing those sentences. Playback was either predictable (immediate playback) or unpredictable (random playback of previous trial). Sentence, word, and phoneme-level timing were transcribed. Multivariate temporal receptive fields were fit to high gamma (70-150 Hz) neural signals to examine phonological feature tuning and effects of production vs. playback and predictability of playback. We found suppression of auditory responses to self-produced speech in posterior superior temporal gyrus (STG) and sulcus (STS). Phonological tuning was similar during production and playback, but with lower amplitude production responses. In onset regions of pSTG, transient suppression of onsets was observed during production, but the later auditory response was mostly preserved. Onset-selective temporal lobe regions were not strongly modulated by predictability. Inferior frontal cortex responses demonstrated onset-selective encoding of playback, but showed mixed representation of playback predictability. In the insula, we observed sensory responses to speech that were enhanced during production relative to playback and occurred on a faster timescale. Phonological tuning in the insula was seen for a limited number of speech feature categories (primarily vowels). The predictability of playback did not have an effect on responses recorded from the insula. These results suggest that sensory and motor representations of speech are involved in both perception and production, and changes in high-gamma response between perception and production are not the result of differential encoding of phonological features.
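For readers unfamiliar with multivariate temporal receptive fields, the hedged Python sketch below shows the generic approach: time-lagged stimulus features regressed onto a high-gamma signal with ridge regularization. The lag range, sampling rate, feature set, and regularization strength are placeholders, not the parameters used in this study.

```python
# Hypothetical mTRF sketch: ridge regression from time-lagged features to a
# high-gamma trace. All data and hyperparameters below are illustrative.
import numpy as np
from sklearn.linear_model import Ridge

def lag_matrix(features, lags):
    """Stack time-lagged copies of a (time x features) matrix."""
    n_t, n_f = features.shape
    X = np.zeros((n_t, n_f * len(lags)))
    for i, lag in enumerate(lags):
        X[lag:, i * n_f:(i + 1) * n_f] = features[:n_t - lag]
    return X

rng = np.random.default_rng(1)
fs = 100                                  # sampling rate (Hz), assumed
lags = np.arange(0, 40)                   # 0-400 ms of lags at fs, assumed
phn_features = rng.integers(0, 2, size=(6000, 14)).astype(float)  # toy feature matrix
high_gamma = rng.normal(size=6000)        # toy high-gamma trace for one electrode

X = lag_matrix(phn_features, lags)
weights = Ridge(alpha=1.0).fit(X, high_gamma).coef_
trf = weights.reshape(len(lags), -1)      # lags x features receptive field
print(trf.shape)
```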
Roberta Bianco and Maria Chait
Topic areas: memory and cognition
long-term memory regular patterns attention task demands auditory scene analysis
Thu, 11/4 12:15PM - 12:30PM | Short talk
Abstract
Accumulating research is revealing a remarkable human ability to implicitly remember sporadically reoccurring sounds (Bianco et al, 2020 eLife; Agus et al, 2010 Neuron). However, a key issue – the extent to which such memories are formed automatically or rely on the availability of general computational capacity – remains largely unstudied. In this series of experiments, we investigate how attentional capture by irrelevant distractors, or a high-load decoy task, affects implicit memory formation for arbitrary reoccurring sounds. Following the paradigm in Bianco et al (2020), participants (N=70; recruited online) were presented with rapid tone sequences (50 ms tone duration, 0.2-2 kHz frequency range) containing transitions from random to regularly repeating patterns (REG). Unbeknownst to them, a few different patterns reoccurred every ~3 minutes (REGr). In experiment 1, participants quickly responded to the emergence of such patterns and ignored unexpected distractor sounds (3-6 kHz frequency range) presented concurrently (50% of trials). In a test block, distractors were removed. We found that memory for REGr patterns associated with distractors was predicted by listeners' ability to ignore the distractors, suggesting that memory formation is not automatic. Follow-up control experiments, directly manipulating the availability of processing resources and behavioural relevance, are currently being conducted. Overall, these experiments further our understanding of attention as a perceptual bias helping the brain to filter which sensory inputs are worth remembering. This research was supported by a BBSRC grant to MC. Informal discussion to follow at 1:15 pm EDT (GMT-4) on Zoom (link below).
Melissa Polonenko and Ross Maddox
Topic areas: speech and language correlates of behavior/perception subcortical processing
auditory brainstem response auditory processing of speech electroencephalography
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Paradigms assessing the response to naturalistic, continuous speech stimuli have seen increasing use for understanding the neural underpinnings of communication. We recently developed the “peaky speech” method for deriving canonical brainstem responses from speech to investigate early processing. Our first paper found that a male talker evoked larger and earlier responses than a female talker. Here we aimed to systematically evaluate how the fundamental frequency (f0) affects the morphology of peaky speech responses and then compare these changes to similar rate changes in click-evoked responses. Using the same narrators as before, we measured the response to each talker at three different frequency shifts: unshifted, the f0 of the other talker, and the f0 in between the two talkers, for a total of six conditions. Response amplitudes decreased and latencies increased with increasing mean f0 for both talkers and with increasing rate for clicks. When set to the same mean f0, responses evoked by the two talkers were highly similar to each other. Thus, the talker-dependence of responses seems to be largely driven by f0. Click responses followed a similar trend with average click rates matching the three speech f0s, but the click and speech response morphologies differed. Model-simulated ABRs mirrored the f0 and rate patterns of the measured ABRs. In summary, the peaky speech method is sensitive to changes in talker parameters, speech-evoked responses differ from those of clicks, and talkers with lower f0 evoke larger and faster responses. These are important considerations when planning experiments with peaky speech.
Lalitta Suriya-Arunroj, Jaejin Lee, Jean-Hugues Lestang, Monty Escabi and Yale Cohen
Topic areas: correlates of behavior/perception hierarchical organization neural coding
nonhuman primate auditory cortex neural coding Spectro-Temporal Receptive Field Dynamic Moving Ripple
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Although neural activity in auditory core and belt areas has been widely studied, there are still many unanswered questions regarding how these areas cooperate to process auditory information. Furthermore, the acoustic stimuli most often used are rather simple, lacking the spectral and temporal structure found in natural sounds. In addition, subjects are often not actively listening to the stimuli, making it unclear how the neuronal activity in different areas might otherwise have changed. We used a Dynamic Moving Ripple stimulus to cover a wide range of spectro-temporal features and to investigate how different areas of the auditory cortex of awake nonhuman primates process the stimulus, recording neuronal activity in the core and belt areas with a linear multi-electrode array. We derived Spectro-Temporal Receptive Field (STRF) parameters, calculated a Reliability Index for each neuron, and compared these parameters between the core and belt areas. Our results did not show a significant difference between the core and belt areas when STRF parameters were treated individually, except for the proportion of neurons with a significant Reliability Index, which was higher in the belt areas. However, there were significant differences between the core and belt areas in the correlations between several STRF parameter pairs, such as bandwidth and best frequency or the temporal cutoff frequency and integration time. This result indicates that although the individual spectro-temporal parameters of neurons are similar, the joint spectro-temporal sensitivity differs across the core and belt areas in awake nonhuman primates.
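A common way to estimate an STRF from responses to a dynamic moving ripple is spike-triggered averaging of the stimulus spectrogram; the hedged Python sketch below illustrates that generic estimator with toy data. The bin sizes, lag range, and spike train are placeholders, and the study's own STRF and Reliability Index computations are not reproduced here.

```python
# Hypothetical STRF sketch: spike-triggered average of a (toy) ripple spectrogram.
import numpy as np

rng = np.random.default_rng(2)
n_freq, n_time, n_lag = 32, 20000, 30       # spectrogram bins and STRF lags, assumed
spec = rng.normal(size=(n_freq, n_time))    # toy ripple spectrogram (freq x time)
spikes = rng.random(n_time) < 0.02          # toy spike train (one bin per time frame)

# Spike-triggered average: mean stimulus segment preceding each spike.
idx = np.flatnonzero(spikes)
idx = idx[idx >= n_lag]
strf = np.mean([spec[:, i - n_lag:i] for i in idx], axis=0)   # freq x lag
print(strf.shape)
```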
Samuel Smith, Mark Wallace, Ananthakrishna Chintanpalli, Michael Akeroyd, Michael Heinz and Christian Sumner
Topic areas: speech and language correlates of behavior/perception neural coding
speech-in-noise inferior colliculus psychoacoustics machine learning
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Speech-in-noise listening could be achieved by an active process of inference: what combination of sounds best reproduces the incoming auditory representation? This has previously been termed analysis-by-synthesis (Halle & Stevens, 1962). We tested a computational implementation of analysis-by-synthesis for recognizing phonemes and syllables in a variety of competing speech and noise backgrounds. Notably, machine recognition of multiunit recordings made from the auditory midbrain of guinea pigs, driven by a process of analysis-by-synthesis, produced human-like patterns of speech recognition. Predicted performance depended on low-level cues such as fundamental frequency, despite there being no explicit handling of these cues. The plausibility of analysis-by-synthesis would require that the brain predicts neural responses to combined sounds based on stored representations of individual sounds. An LSTM neural network was able to accurately approximate this nonlinear function across neural recordings, confirming that such a mechanism is tractable. This work suggests that if speech recognition in complex acoustic scenes were solved by a process of analysis-by-synthesis, the benefits of low-level acoustic cues might emerge automatically, without the need for any explicit segregation mechanisms that employ those cues.
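The core analysis-by-synthesis loop can be summarized as: predict the neural response to each candidate sound (or sound combination) from stored representations, and pick the candidate whose prediction best matches the observed response. The Python sketch below is a toy version of that loop; the lookup-table predictor stands in for the LSTM approximator mentioned above, and all names and data are illustrative assumptions.

```python
# Hypothetical analysis-by-synthesis sketch: choose the candidate whose predicted
# neural response best reproduces the observed one. Placeholder data throughout.
import numpy as np

rng = np.random.default_rng(3)
n_channels, n_time = 50, 200

# Stored ("remembered") responses to individual candidate phonemes, assumed known.
stored = {f"phoneme_{k}": rng.normal(size=(n_channels, n_time)) for k in range(10)}

def predict_combination(phoneme, noise):
    """Placeholder for a learned combiner (the abstract uses an LSTM)."""
    return stored[phoneme] + noise

observed = predict_combination("phoneme_3",
                               rng.normal(scale=0.5, size=(n_channels, n_time)))
noise_estimate = np.zeros((n_channels, n_time))   # simplistic noise model

# Pick the candidate minimizing reconstruction error against the observation.
errors = {p: np.mean((observed - predict_combination(p, noise_estimate)) ** 2)
          for p in stored}
print(min(errors, key=errors.get))
```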
Tobias Teichert
Topic areas: memory and cognition neural coding
Non-human primate primary auditory cortex EEG adaptation predictive coding
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
The reduction of neural responses to repeated stimuli (response suppression) is often attributed to neural adaptation or predictive suppression. While the two mechanisms differ dramatically in their theoretical underpinning and computational complexity, they have been difficult to separate experimentally. Predictive suppression is supported by findings that the degree of predictability can modulate the amount of response suppression. Adaptation is supported by strong and systematic suppression of neural responses to unpredictable stimuli. The present work measures the temporal specificity of response suppression to distinguish the two experimentally. Adaptation is a passive process that predicts a gradual and monotonic recovery of neural responses for increasingly longer inter-stimulus delays. In contrast, predictive suppression is an active process that has been suggested to be temporally specific, i.e., strongest for stimuli that occur at the expected time. To quantify the temporal specificity of response suppression, we studied auditory evoked EEG and multi-unit responses of two macaque monkeys to pure tone pips presented either in highly regular contexts with mostly predictable timing and identity, or in random contexts with mostly unpredictable timing and identity. Briefly, we found that neural responses were most strongly suppressed if a deviant occurred in close temporal proximity to the last standard, not at the expected time of the next standard. There was no compelling evidence of non-monotonic recovery functions in the EEG or multi-unit responses from primary auditory cortex. Instead, neural responses were well accounted for by a simple feed-forward neurocomputational model with depressing synapses.
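The feed-forward model with depressing synapses invoked above is not specified in the abstract; the minimal Python sketch below shows one standard formulation (Tsodyks-Markram-style resource depletion and recovery) and how it yields purely monotonic recovery with inter-stimulus interval, the signature of a pure-adaptation account. Parameter values are illustrative, not fitted to the data described here.

```python
# Hypothetical sketch of a depressing-synapse adaptation model: each stimulus
# consumes a fraction of a synaptic resource that recovers exponentially, so
# response amplitude depends only on the time since the last stimulus.
import numpy as np

def depressing_response(stim_times, U=0.5, tau_rec=1.5):
    """Response amplitude to each stimulus in a sequence (arbitrary units)."""
    R = 1.0                       # available synaptic resource
    last_t = None
    amplitudes = []
    for t in stim_times:
        if last_t is not None:
            # Resource recovers toward 1 with time constant tau_rec (s).
            R = 1.0 - (1.0 - R) * np.exp(-(t - last_t) / tau_rec)
        amplitudes.append(U * R)  # response proportional to released resource
        R -= U * R                # depression: part of the resource is used up
        last_t = t
    return np.array(amplitudes)

# Steady-state amplitude grows monotonically with inter-stimulus interval.
for isi in (0.2, 0.5, 1.0, 2.0):
    print(isi, depressing_response(np.arange(0, 10, isi))[-1].round(3))
```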
Ravinderjit Singh and Hari Bharadwaj
Topic areas: neural coding
Cortical temporal coding Systems Identification EEG
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Many studies have investigated how subcortical temporal processing, measured via brainstem evoked potentials (e.g., ABRs and FFRs), may be influenced by aging, hearing loss, musicianship, and other auditory processing disorders. However, human studies of cortical temporal processing are often restricted to the 40 Hz steady-state response. One possible reason for the limited investigation is the lack of a fast and easy method to characterize temporal processing noninvasively in humans over a range of modulation frequencies. Without a broadband characterization of cortical temporal processing, it is difficult to disentangle the different components that may contribute to the overall EEG response and to discover their respective functional correlates. Here, we use a system-identification approach in which white noise, modulated using a modified maximum length sequence (m-seq), is presented to quickly obtain a stereotypical and repeatable auditory cortical “impulse” response (ACR) capturing broadband cortical modulation coding (up to 75 Hz) with EEG. Using principal component analysis (PCA) across different EEG sensors, we found that the overall response is composed of five components that can be distinguished by virtue of latency and/or scalp topography. Furthermore, the components spanned different frequency ranges within the overall temporal modulation transfer function (tMTF) and differed in their sensitivities to manipulations of attention and/or task demands. Interestingly, we also find that the ACR shows nonlinear behavior, in that the relative magnitudes of the constituent components are different when measured using broadband modulations versus a series of sinusoidal modulations.
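The system-identification logic can be illustrated with a toy example: if the modulator is a (near-)white maximum length sequence, circular cross-correlation of the recorded signal with that sequence recovers the modulation "impulse" response. The Python sketch below uses a random ±1 sequence and a synthetic kernel as stand-ins; it is not the authors' pipeline.

```python
# Hypothetical m-sequence system-identification sketch: recover a modulation
# "impulse" response by circular cross-correlation with the modulator sequence.
import numpy as np

rng = np.random.default_rng(4)
n = 4096
mseq = rng.choice([-1.0, 1.0], size=n)        # stand-in for a true binary m-sequence

# Toy response: circular convolution of the sequence with a short "cortical" kernel.
kernel = np.zeros(n)
t = np.arange(64)
kernel[:64] = np.exp(-t / 10.0) * np.sin(2 * np.pi * t / 16.0)
eeg = np.real(np.fft.ifft(np.fft.fft(mseq) * np.fft.fft(kernel)))
eeg += rng.normal(scale=0.5, size=n)          # measurement noise

# Circular cross-correlation with the sequence recovers the kernel (up to scale).
recovered = np.real(np.fft.ifft(np.fft.fft(eeg) * np.conj(np.fft.fft(mseq)))) / n
print(np.corrcoef(recovered[:64], kernel[:64])[0, 1])
```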
Narayan Sankaran, Matthew Leonard, Frederic Theunissen and Edward Chang
Topic areas: speech and language correlates of behavior/perception hierarchical organization neural coding
electrocorticography superior temporal gyrus music perception
Fri, 11/5 2:30PM - 2:45PM | Short talk
Abstract
To appreciate melody, listeners must map acoustic fluctuations in pitch onto meaningful and emotion-inducing representations of melodic structure. How such mapping occurs remains unknown, and the extent to which melodic processing leverages general-purpose versus music-specialized mechanisms remains a topic of debate. We recorded intracranial activity from local temporal lobe populations in six neurosurgical patients as they listened to natural instrumental melodies. We examined cortical tuning to two melodic features: the interval of pitch-changes, which provide an acoustic cue to melody, and the expectation of these changes, which encapsulate the learnt structural regularities that music exploits to engender emotion. Within the superior temporal gyrus (STG), we found spatially distinct populations tuned to interval-based and expectation-based aspects of melody respectively, with the former encoded anterior to the latter. To test the music-specificity of tuning to each feature, the same patients listened to natural English sentences, and we compared how populations tuned to melodic features encoded the equivalent acoustic and statistical information in speech. Interval encoding was domain-independent, such that the tuning of a population to certain melodic intervals generalized to the equivalent fluctuations of pitch present in speech. In contrast, encoding of melodic expectation occurred in specialized populations that were music-selective and did not encode equivalent prosodic or phonetic regularities in speech. Together, these findings underscore the heterogeneity of STG as a region containing general auditory populations intermixed with those functionally specialized for representing high-level information in crucial channels of human communication such as music. Informal discussion to follow at 3:00 pm EDT (GMT-4) in Gathertown, Discussion Area 3 (link below).
Sharlen Moore, Zyan Wang and Kishore Kuchibhotla
Topic areas: memory and cognition correlates of behavior/perception
goal-directed habitual learning audiomotor task
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
During learning, it is challenging to identify the decision process used (i.e., goal-directed or habitual) by animals, since the decision itself is a hidden variable that is not behaviorally observable. One challenge is that performance outcomes (action-rate and accuracy) are often similar when using either decision process. Moreover, in water-restriction paradigms, an animal's drive is based on a survival-like biological need where animals exhibit consistently high action-rates (high motivation), then gradually improve accuracy (learning). We hypothesized that by driving an animal's motivation from a necessity (survival) to a preference (palatability), we could use action-rate variability as a behavioral indicator of goal-directed versus habit-like performance without impacting accuracy. We leveraged a recent protocol in which mice get ad libitum access to water with citric acid (CA), fulfilling hydration needs with a non-palatable substance. Controls were mice under water-restriction protocols. Mice were trained in an auditory go/no-go task in which they lick after a tone for a water reward and withhold licking after another tone to avoid a timeout. Our data show that mice acquired task contingencies at similar rates in all groups. Interestingly, most CA mice initially shifted between epochs of high and low action-rates. Later, CA mice exhibited an abrupt reduction in action-rate variability, suggesting a shift to habit-like performance. This shift was not evident in water-restricted mice. We are using biomarkers of arousal to understand changes in action-rate variability. These data suggest that the transition from goal-directed to habit-like performance during learning is abrupt and may result from a winner-take-all decision process.
Celine Drieu, Ziyi Zhu, Aaron Wang, Kylie Fuller, Sarah Elnozahy, Srdjan Ostojic and Kishore Kuchibhotla
Topic areas: memory and cognition correlates of behavior/perception neural coding
Learning Two-photon calcium imaging Auditory cortex Dimensionality reduction
Thu, 11/4 2:30PM - 2:45PM | Short talk
Abstract
Goal-directed learning can be dissociated into two behavioral phases: rapid ‘acquisition’ of stimulus-action contingencies, and slower ‘expression’ that reveals the learned content. To what extent is the auditory cortex (AC) involved in either learning phase? We address this question using behavior, probabilistic optogenetics, and two-photon calcium imaging over learning. We train mice to lick to a tone (S+) and withhold from licking to another (S-); contingency learning is assayed daily using probe trials when reinforcement is unavailable. Optogenetic inactivation of the AC significantly impairs both acquisition and expression in a dissociable manner. Surprisingly, this inactivation-induced deficit gradually wanes during expression arguing that the AC teaches itself out of task execution. To determine how the two learning phases are implemented by AC networks, we use longitudinal, two-photon calcium imaging of the same excitatory neurons in L2/3 over learning (1,050±124 neurons/mouse in 6 mice). We isolate learning-related dynamics by comparing mice learning the task to those passively listening to the same tones. We then use unsupervised low-rank tensor decomposition (TCA) to uncover low-dimensional network structure in our high-dimensional data. Interestingly, learning enhances S- stimulus representation more than the S+, suggesting a feedforward role in behavioral inhibition. TCA strikingly reveals that S+-responsive neurons shift to firing during reward consumption, suggesting a direct role in reward learning. Overall, our work argues for a default but temporary role for the AC in discrimination learning that is mediated by a network that shifts from being largely stimulus-driven to one that is optimized for behavioral needs. Informal discussion to follow at 4:00 pm EDT (GMT-4) in Gathertown, Discussion Area 2 (link below).
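Low-rank tensor component analysis (TCA) factorizes a neurons x time x trials activity tensor into per-neuron, per-timepoint, and per-trial components. The hedged NumPy sketch below implements a bare-bones CP/ALS decomposition on synthetic data to illustrate the idea; the rank, iteration count, and data are assumptions, and the study's own TCA implementation is not reproduced here.

```python
# Hypothetical TCA sketch: rank-R CP decomposition of a 3-way tensor via
# alternating least squares, in plain NumPy, on synthetic data.
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding consistent with C-order flattening of the other modes."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Kronecker product: (I*J) x R."""
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

def cp_als(X, rank=3, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((s, rank)) for s in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [f for m, f in enumerate(factors) if m != mode]
            kr = khatri_rao(others[0], others[1])
            factors[mode] = unfold(X, mode) @ kr @ np.linalg.pinv(kr.T @ kr)
    return factors

# Toy neurons x time x trials tensor with known rank-3 structure plus noise.
rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((d, 3)) for d in (100, 50, 40))
X = np.einsum('ir,jr,kr->ijk', A, B, C) + 0.1 * rng.standard_normal((100, 50, 40))

fA, fB, fC = cp_als(X, rank=3)
X_hat = np.einsum('ir,jr,kr->ijk', fA, fB, fC)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))   # relative reconstruction error
```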
Eli Bulger, Barbara Shinn-Cunningham and Abigail Noyce
Topic areas: memory and cognition correlates of behavior/perception
fMRI prefrontal cortex attention memory
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Prefrontal cortex (PFC) has been considered a general cognitive resource, recruited across tasks regardless of their specific computational demands. However, recent work has confirmed that human PFC can be parcellated into a number of smaller regions, some of which are specialized for auditory cognition (Michalka 2015, Glasser 2016, Noyce 2017, 2021). These regions are reliably recruited for auditory attention and working memory (WM), but their specific contributions are not understood. We collected fMRI (N=10) while subjects performed a PFC auditory localizer task (2-back memory for animal vocalizations contrasted against 2-back memory for faces), and while the same subjects performed a language processing task (Fedorenko 2013). Then, we developed an experiment contrasting auditory attention versus working memory. Each block (25 s) featured two spatialized competing streams of twelve 4-note melodies. Subjects were cued to attend to one stream, and either perform a challenging perceptual task or a WM task. To recruit auditory attention, subjects listened for occasional amplitude-modulated notes (“warbles”) and responded when one was detected; to recruit WM, subjects listened for repeated 4-note melodies (“repeats”) within the cued stream. All stimuli were matched between attention and working memory conditions. Using the auditory PFC localizer task, we labeled subject-specific regions of interest in transverse gyrus intersecting precentral sulcus (tgPCS), caudal inferior frontal sulcus/gyrus (cIFS/G), frontal operculum (FO), and central medial superior frontal gyrus (cmSFG). Preliminary results suggest that tgPCS, cIFS/G, and FO are recruited more strongly during WM than during attention.
Kate Gurnsey, Srivatsun Sadagopan and Tobias Teichert
Topic areas: speech and language
Non-human primate EEG Primary auditory cortex Frequency Following Response
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
The frequency-following response (FFR) is a scalp-recorded electrophysiological potential that closely follows the periodicity of complex sounds such as speech. The exact neural origin(s) of the FFR is (are) still a matter of debate. Initially thought to reflect mostly activity from the cochlear nucleus and inferior colliculus, it is now thought to arise from multiple sources in the brainstem, midbrain, and cortex. In line with this assumption, we recently showed that the responses to individual F0 cycles of the stimulus (F0-responses) feature several spectro-temporally and topographically distinct components that likely reflect the sequential activation of brainstem (< 5 ms; 200-1000 Hz), midbrain (5-15 ms; 100-250 Hz), and cortex (15-35 ms; ~90 Hz). To confirm the cortical origin of the 90 Hz component and to study more closely the properties of cortical FFRs, we recorded local field potentials in primary auditory cortex (A1) of three macaque monkeys using either large arrays of individually movable, semi-chronically implanted electrodes that covered the entire tonotopic map of A1, or laminar probes that covered the entire cortical depth of individual cortical columns. Our preliminary results clearly confirm the cortical origin of the 90 Hz component. Furthermore, we were able to quantify to what degree the reduced contribution of cortex to scalp-recorded FFRs at higher frequencies (beyond ~100 Hz) is caused by i) an inability of cortical neurons to follow higher frequencies, ii) destructive temporal superposition of F0 responses from adjacent F0 cycles, or iii) destructive spatial superposition of F0 responses from different tonotopic parts of A1.
Subong Kim and Hari Bharadwaj
Topic areas: neural coding subcortical processing
noise reduction hearing aid auditory brainstem response speech-in-noise subcortical neural coding auditory processing
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Listening to the speech sounds of interest in noisy environments can be tremendously challenging for people with sensorineural hearing loss, even with prescriptive amplification. Thus, current digital hearing aids commonly implement noise-reduction (NR) algorithms; however, NR processing inevitably distorts some speech cues while attenuating noise. Although it is known that there is much variability in hearing-aid users’ reactions to these conflicting effects, the physiological bases of that variance in perceptual outcomes with the addition of noise and the use of NR processing are understudied. Recently, we documented that cortical electroencephalographic measures of individual noise tolerance can predict individual speech-in-noise performance and NR outcomes. In this presentation, we will introduce novel objective measures of individual tolerance to noise and sensitivity to speech-cue distortions using speech-evoked auditory brainstem response. Preliminary results show that adding noise and introducing NR processing have considerably varying effects on brainstem responses across individuals. Given that the brainstem responses are relatively less affected by cognitive efficacy, these results suggest that precise peripheral characterization from each individual would help disentangle potential neural contributors to the variance in NR outcomes across individuals. Finally, this presentation also discusses how the subcortical measures of speech encoding relate to individual perceptual outcomes with NR processing.
Yufei Si, Shinya Ito, Alan Litke and David Feldheim
Topic areas: auditory disorders correlates of behavior/perception neural coding subcortical processing
superior colliculus electrophysiology spatial hearing hearing loss
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
The ability to locate the source of a sound in a complex environment is critical for survival. A topographic map of auditory space has been found in the superior colliculus (SC) of a range of species, including CBA/CaJ mice. Unfortunately, C57BL/6 mice, a strain widely used for transgenic manipulation, display an age-related hearing loss, due to an inbred mutation in the CDH23 gene, that limits their use in auditory research. To overcome this problem, researchers often study young (around 3-month-old) C57BL/6 mice, an age at which hearing thresholds have been shown to be normal. Here we show that even 2-month-old C57BL/6 mice of both sexes lack SC neurons with frontal auditory receptive fields (RFs) and therefore do not have a topographic representation of auditory space in the SC. Analysis of the spectrotemporal receptive fields (STRFs) of C57BL/6 mouse SC neurons reveals a deficit in the ability to detect high-frequency (> 40 kHz) sounds. High-frequency sounds have been postulated to provide the spectral cues required to compute frontal RFs in the mouse SC. We also show that either crossing C57BL/6 mice with CBA/CaJ mice or introducing one copy of the wildtype CDH23 gene into the C57BL/6 mice rescues both the high-frequency hearing deficit and the auditory map of space. Taken together, these results show that high-frequency hearing is required for mouse SC neurons to compute frontal auditory RFs.
Ling You, Kamun Tan, Meijie Li and Kexin Yuan
Topic areas: correlates of behavior/perception multisensory processes thalamocortical circuitry/function
multimodal sensory thalamus association cortex arousal circuit mice
Thu, 11/4 11:15AM - 12:15PM | Virtual poster
Abstract
Arousal induced by sensory stimuli is a phenomenon that has been widely observed both in the animal kingdom and in humans. However, how sensory thalamocortical systems contribute to this process remains unclear. By combining transgenic mice with optogenetics and EEG and EMG recording, we found that optogenetic stimulation of VGluT2+ neurons in the posterior paralaminar nuclei (PPL) of the thalamus, a multimodal sensory thalamic region, but not of neurons in the ventral division of the medial geniculate body (MGBv) or the dorsal division of the lateral geniculate nucleus (LGNd), which are unimodal, induced rapid sleep-to-wake transitions. The magnitude of tone- or blue light-evoked PPL VGluT2+ responses, but not of those in the unimodal sensory thalamus or intralaminar thalamus, was highly correlated with whether the mouse was awakened. Stimulation of PPL VGluT2+ neurons in awake mice induced anxiety- and fear-like behaviors. Interrogation at the circuit level revealed that the projection from PPL VGluT2+ neurons to L2/3/5 pyramidal neurons in the temporal association cortex (TeA) and ectorhinal cortex (ECT) made a causal contribution to both tone- and blue light-induced awakening, while the projections to the caudal basal ganglia and ventral medial hypothalamus (VMH) made significant contributions to tone- and blue light-induced awakening, respectively. We further demonstrated that selective activation of axonal terminals in the PPL originating from the superior colliculus (the visual midbrain) and the inferior colliculus (the auditory midbrain) was able to awaken mice. Our findings highlight a critical role of the multimodal thalamocortical system in mediating sensory-induced arousal, which promotes defensive behaviors in mice.
Ariel Edward Hight, Nicole H. Capach, Jonathan D. Neukam, Sean K. R. Lineaweaver, Mahan Azadpour, Elad Sagi, Robert C. Froemke and Mario A. Svirsky
Topic areas: auditory disorders speech and language neural coding novel technologies
Cochlear Implant Neuroplasticity Humans Perception Acoustic Models Genetic Algorithms
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
Cochlear implant (CI) stimulation is significantly distorted compared to acoustic stimulation in healthy ears, yet CI users attain good levels of speech comprehension, typically after several months of adaptation. We are investigating the perceptual changes that underlie this adaptation process by having CI users who have normal hearing in the nonimplanted ear select acoustic models that approximate what they hear through the implant. Specifically, the acoustic models are adjustable across center frequency, bandwidth, acoustic carrier, and modulation rate. Optimized models and their parameters are tracked longitudinally following CI activation. Such listener-driven exploration of the acoustic model parametric space becomes rapidly inefficient when using the method of adjustment over increasing parameter spaces. Here, we developed a genetic algorithm (GA) for optimizing user-selected acoustic models and examined its practical feasibility as well as its noninferiority with respect to the traditional method-of-adjustment procedure. As a first step, the GA has been successfully evaluated in CI users with contralateral normal hearing for modeling of single electrodes (n=7) and full arrays (n=7). Double-blind questionnaires were run to evaluate intelligibility, pleasantness, harshness, loudness, and overall similarity between the CI and acoustic models. We found that user-selected acoustic models of single channels were repeatable and sounded more similar to CI stimulation than standard models.
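A genetic algorithm over acoustic-model parameters can be sketched as selection, crossover, and mutation applied to parameter vectors. In the hedged Python illustration below the fitness function is a stand-in: in the actual procedure the "fitness" comes from the listener's own similarity judgments, which no closed-form function can replace. Parameter ranges and GA settings are assumptions.

```python
# Hypothetical GA sketch over acoustic-model parameters (center frequency,
# bandwidth, modulation rate). All ranges and settings are placeholders.
import numpy as np

rng = np.random.default_rng(5)
bounds = np.array([[200.0, 8000.0],   # center frequency (Hz), assumed range
                   [50.0, 2000.0],    # bandwidth (Hz), assumed range
                   [10.0, 300.0]])    # modulation rate (Hz), assumed range

def fitness(params):
    """Placeholder for a listener's similarity rating of this acoustic model."""
    target = np.array([1200.0, 400.0, 80.0])          # pretend "true" percept
    return -np.sum(((params - target) / bounds[:, 1]) ** 2)

pop = rng.uniform(bounds[:, 0], bounds[:, 1], size=(20, 3))
for generation in range(30):
    scores = np.array([fitness(p) for p in pop])
    parents = pop[np.argsort(scores)[-10:]]            # selection: keep top half
    children = []
    for _ in range(10):
        a, b = parents[rng.integers(10, size=2)]
        child = np.where(rng.random(3) < 0.5, a, b)    # uniform crossover
        child += rng.normal(0, 0.05 * (bounds[:, 1] - bounds[:, 0]))  # mutation
        children.append(np.clip(child, bounds[:, 0], bounds[:, 1]))
    pop = np.vstack([parents, children])

print(pop[np.argmax([fitness(p) for p in pop])])        # best parameter set found
```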
Xiaomin He, Ali Zare and Nima Mesgarani
Topic areas: speech and language
speech perception encoding EEG BERT GPT2
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
The mapping from audio features to the neural signal can be used to reconstruct either side of the connection. The range of applicable audio features is wide, from basic acoustic to higher-level semantic features. Previous studies show that features measuring semantic dissimilarity over the course of speech, derived from the classical word-embedding method word2vec, are reliably encoded in EEG during speech perception. In this study, we examine two newer word-embedding methods, BERT and GPT2, which are claimed to better capture the semantic continuity of speech. We recruited 22 native speakers for a one-hour auditory test with 64-channel EEG. Regularized linear regression models were then trained with leave-one-out cross-validation to predict the neural signal from the stimulus, and the prediction accuracy, which reflects the encoding strength of specific features, was examined in detail. The results show that (1) BERT/GPT2 features are encoded in the EEG signal (prediction accuracy is above chance level); (2) BERT/GPT2 provide better word embeddings than classical word2vec (the semantic features derived from them have higher predictive power than those from word2vec); and (3) BERT/GPT2 features, which are associated with speech semantics, provide additional information beyond the basic acoustic feature (envelope), indicating an interaction between top-down and bottom-up attention. Additionally, we compare results between L1 (native) and L2 (English-learner) groups; the encoding levels of BERT/GPT2 features reflect language proficiency and attention.
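The encoding analysis described above follows a familiar template: regularized (ridge) linear regression from stimulus features to each EEG channel, cross-validated by trial, with prediction accuracy taken as the correlation between predicted and measured signals. The Python sketch below illustrates that template on synthetic data (features are unlagged here for brevity); the feature construction, trial structure, and regularization are assumptions rather than the study's exact pipeline.

```python
# Hypothetical encoding-model sketch: ridge regression from stimulus features
# (e.g., envelope plus embedding-derived semantic features) to EEG channels,
# with leave-one-trial-out cross-validation. Synthetic placeholder data.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(6)
n_trials, n_samp, n_feat, n_chan = 20, 500, 3, 64
X = rng.normal(size=(n_trials * n_samp, n_feat))       # envelope + semantic features
Y = rng.normal(size=(n_trials * n_samp, n_chan))       # EEG channels
groups = np.repeat(np.arange(n_trials), n_samp)        # one group per trial

accs = []
for train, test in LeaveOneGroupOut().split(X, Y, groups):
    model = Ridge(alpha=10.0).fit(X[train], Y[train])
    pred = model.predict(X[test])
    # Prediction accuracy: per-channel correlation between predicted and measured EEG.
    r = [np.corrcoef(pred[:, c], Y[test, c])[0, 1] for c in range(n_chan)]
    accs.append(np.mean(r))

print(np.mean(accs))   # near zero here because the toy data carry no real signal
```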
Menoua Keshishian, Serdar Akkol, Jose L. Herrero, Stephan Bickel, Ashesh D. Mehta and Nima Mesgarani
Topic areas: speech and language neural coding
Auditory cortex Language Speech
Fri, 11/5 1:15PM - 2:15PM | Virtual poster
Abstract
How the human auditory cortex represents and transforms the sound pressure waveforms produced by a speaker to ultimately enable speech recognition remains unclear. It has been hypothesized that various levels of speech processing occur hierarchically in the human brain. Here, we used intracranial neural recordings from HG, PT, and STG electrodes implanted in the auditory cortex of fifteen neurosurgical patients as they listened to natural speech. We used a multivariate regression framework to predict neural responses from various acoustic and linguistic features of speech and measured the variance explained by each feature. We found an explicit and distinct neural encoding of multiple levels of linguistic processing, including acoustic, phonetic, phonotactic, lexical-phonemic and lexical-semantic features. Grouping neural sites according to their encoded linguistic features revealed several patterns of joint feature encoding, where higher-level features were represented simultaneously with lower-level features, forming a type of hierarchy. Anatomical and functional localization of linguistic encoding showed a gradual emergence of low- to high-level features from primary (i.e., medial HG) to non-primary (i.e., lateral STG) auditory cortical areas. Furthermore, we observed a temporal order in the appearance of these features: encoding of higher-level features was correlated with neural response latency, and the response time-lag for higher-level features was longer than that for lower-level features across the population. This joint, anatomically distributed, and temporally ordered appearance of various levels of linguistic features sheds light on the hierarchical processes that enable the human auditory cortex to extract meaning from speech sounds.
Kiki van der Heijden, Prachi Patel, Stephan Bickel, Jose Herrero, Ashesh Mehta and Nima Mesgarani
Topic areas: speech and language correlates of behavior/perception hierarchical organization neural coding
auditory object formation human auditory cortex intracranial measurements
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
In everyday life, listeners readily extract relevant auditory objects from complex, multi-source auditory scenes. Yet, it is unclear how neuronal representations of acoustic and spatial features interact during auditory object formation. Here, we address two central questions about the neuronal transformation of an acoustic-based auditory scene representation into perceived auditory objects. First, we assess where in the cortical auditory hierarchy neuronal representations of spatial and acoustic sound features interact during auditory object formation and in what way. Second, we evaluate the effects of attention and moment-by-moment fluctuations in energetic masking on the transient emergence and decay of – multi-dimensional – feature and object representations. We address these questions by analyzing invasive intracranial neuronal responses in neurosurgical patients measured with stereotactic electroencephalography. Participants listened to mixtures of two concurrent, spatially separated speakers (locations -45° and +45°). To reveal the transient emergence of neuronal representations of acoustic features (speaker spectral profile), spatial features (location) and auditory objects (speech), we extracted the envelope of the high-gamma band (70-150Hz). Our findings show that spatial and acoustic feature representations are transformed from an acoustic-based into an attention-dependent, acoustic-invariant representation at different stages of the cortical auditory hierarchy. Classifiers revealed the transient emergence and decay of spatial and acoustic feature representations in distributed neuronal activity patterns, modulated by the moment-by-moment strength of energetic masking in distinct ways. These findings emphasize the multi-dimensional character of auditory object encoding in the cortical auditory hierarchy and shed new light on the interaction between the ventral and dorsal auditory stream.
Christina der Nederlanden, Jessica Grahn, Marc Joanisse, Tineke Snijders and Jan-Mathijs Schoffelen
Topic areas: memory and cognition speech and language correlates of behavior/perception
neural tracking familiarity music and language comparisons
Thu, 11/4 1:15PM - 2:15PM | Virtual poster
Abstract
Several studies have shown that humans neurally phase-lock to the slow rhythms of speech and that the strength of this phase-locking is related to comprehension. However, other top-down factors such as attention or language background also have a significant impact on phase-locking that can be unrelated to comprehension. Our study examined how melody familiarity altered phase-locking to words set to familiar melodies. Sung utterances were paired with matched spoken versions to examine the difference between neural tracking of utterances when they are sung and spoken. Participants learned 12 novel melodies for four days leading up to MEG testing. During the MEG session, listeners heard 4 blocks of 12 trained and 12 additional novel melodies with 2 different text settings (48 trials) and with each text setting presented as speech (48 trials; 96 trials in total). Participants rated how familiar the melody of each utterance was during testing. Listeners showed greater neural tracking, measured as cerebro-acoustic phase coherence (the degree of phase consistency between the MEG signal and the stimulus amplitude envelope), for sung over spoken utterances overall. There was greater neural tracking of words sung to melodies that were rated as familiar compared to unfamiliar, but this effect extended to familiar spoken utterances as well. There was no difference in subsequent memory for words sung to familiar versus unfamiliar melodies. Familiarity and setting words to the melodic and rhythmic features of song are significant modulators of neural tracking, but their relationship to better processing outcomes, like comprehension or improved memory, is unclear.
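Cerebro-acoustic phase coherence, as used above, quantifies how consistently the phase of the neural signal tracks the phase of the speech (or song) amplitude envelope. The hedged Python sketch below shows one simple single-trial formulation (band-pass filtering, Hilbert phase, mean resultant length of the phase difference); the frequency band and this exact formulation are assumptions, not the study's pipeline.

```python
# Hypothetical cerebro-acoustic phase coherence sketch: band-pass both signals,
# take Hilbert phases, and measure the consistency of their phase difference.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def phase_coherence(neural, envelope, fs, band=(1.0, 8.0)):
    b, a = butter(4, band, btype='bandpass', fs=fs)
    ph_n = np.angle(hilbert(filtfilt(b, a, neural)))
    ph_e = np.angle(hilbert(filtfilt(b, a, envelope)))
    # Length of the mean resultant vector of the phase difference (0 = no locking).
    return np.abs(np.mean(np.exp(1j * (ph_n - ph_e))))

fs = 200
t = np.arange(0, 30, 1 / fs)
envelope = 1 + np.sin(2 * np.pi * 4 * t)               # toy 4 Hz "speech" envelope
neural = (np.sin(2 * np.pi * 4 * t - 0.8)
          + 0.5 * np.random.default_rng(7).normal(size=t.size))  # toy MEG channel
print(phase_coherence(neural, envelope, fs))
```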
Xiao Yang, Yixuan Li, Qiyuan Feng, Xinyi Gao, Megumi Hatori, Misako Komatsu and Joji Tsunada
Topic areas: neuroethology/communication
vocal communication ECoG marmoset monkey
Fri, 11/5 11:15AM - 12:15PM | Virtual poster
Abstract
Vocal interactions require a series of cognitive processes that include sensory processing of vocalizations, decision making about whether and how to respond to them, and generating vocal motor output. Neuroimaging and neurophysiology studies in humans and non-human primates have identified cortical areas involved individually in voice perception and vocal production: e.g., voice or speech selective regions in the anterior temporal lobe and distinct roles of Broca’s area and speech motor cortex in vocal production. However, it remains unknown how the brain forms a decision for a vocal response during vocal interactions. Here, we recorded cortical activity using epidural electrocorticography (ECoG) while marmoset monkeys vocally interacted with other monkeys (partners). We found broad cortical activation in the theta band activity while listening to partners’ vocalizations. Interestingly, sensory activation observed in the ventrolateral prefrontal, posterior parietal, and auditory cortices was different depending upon whether monkeys responded to partners’ vocalizations, allowing us to decode animals’ behavioral responses on a call-by-call basis with ~80% accuracy. Furthermore, parietal activation, but not activation in other cortical areas, predicted when monkeys vocally responded, suggesting the specific role of the parietal cortex in timing control of vocal production. These results suggest that monkeys make a decision for the vocal response while listening to others’ vocalizations, and distributed brain areas may differentially encode not only whether or not but also how animals respond. This study was supported by JST, Moonshot R&D, Grant Number JPMJMS2012 (M.K.).
Lalitta Suriya-Arunroj, Yale Cohen and Joshua Gold
Topic areas: correlates of behavior/perception
Top-down processing Bottom-up processing Auditory decision-making Non-human primates Rhesus macaques
Fri, 11/5 10:30AM - 10:45AM | Short talk
Abstract
Auditory perceptual decision-making is modulated by interactions between bottom-up (sensory-driven) and top-down (expectation-driven) processes. Despite the importance of these interactions, little is known about their underlying neural mechanisms. We investigated these mechanisms by recording neural activity in the auditory and prefrontal cortices of rhesus monkeys while they performed a challenging auditory decision-making task. The monkeys decided whether the last (“test tone”) in a sequence of tone bursts, embedded in broadband noise, was low or high frequency. Task difficulty was titrated by varying the sound level of the test tone relative to the noisy background. Bottom-up expectations were manipulated by presenting three identical low- or high-frequency tone bursts (“pre-tones”), establishing sequence regularity. Top-down processing was manipulated by presenting a visual cue that indicated the prior probability that the subsequent test-tone would be high or low frequency (“pre-cue”). The monkeys’ behavioral choices and response times were biased by both the pre-tones and the pre-cues, with stronger and more consistent effects by the pre-cues. Neural activity was modulated preferentially by the pre-tones in auditory cortex and by the pre-cues in prefrontal cortex. These findings imply functional segregation between bottom-up and top-down processing in the primate brain during auditory perceptual decision-making.