APAN Session Browser 2023
Lori Holt
Topic areas: neural coding, correlates of behavior/perception
Fri, 11/10 9:00AM - 10:00AM | Keynote Lecture
Abstract
Speech is an undeniably significant portion of our auditory diet. Yet, early theoretical approaches largely separated studies of speech perception from advances in auditory neuroscience. As a result, the reciprocal benefits of crosstalk are just beginning to be realized. Taking examples from auditory learning and plasticity, I will describe how the challenges of everyday speech communication can inspire new ways to think about how listeners use acoustic input regularities to build flexible new representations, to direct attention, and to adapt to changing soundscapes.
Nai Ding
Topic areas: neural coding, correlates of behavior/perception
Fri, 11/10 2:35PM - 3:00PM | Young Investigator Spotlight
Abstract
Human listeners are remarkably good at recognizing speech in noisy environments and grouping basic speech sounds, e.g., phonemes, into larger units such as words and sentences to extract meaning. How the auditory system builds an environment-invariant speech representation and sequentially groups basic speech units into larger units are central questions in auditory neuroscience and cognitive neuroscience. In the first part, I will present a series of experiments that investigate how the auditory system encodes speech features in the presence of an intense long-delay echo. The echo greatly attenuates temporal modulations that are critical for speech recognition, and both speech intelligibility indices and automatic speech recognition systems predict that the echo greatly reduces intelligibility. Young human listeners, however, recognize echoic speech with ceiling accuracy, and MEG experiments show that their auditory system can restore the temporal modulations eliminated by the echo. Computational modeling suggests that these modulations are restored by segregating speech and echo into different auditory streams. In the second part, I will present recent experiments testing whether low-frequency neural activity can serve as a marker for sequential auditory grouping. It is demonstrated that when the same word sequence is grouped differently, cortical activity recorded by MEG reliably tracks the perceived groups. These studies demonstrate that the large-scale cortical dynamics that can be recorded using MEG and EEG provide a powerful tool to characterize the neural computations underlying auditory perception.
Kameron Clayton, Matthew McGill, Bshara Awwad, Kamryn Stecyk, Divya Narayanan, Caroline Kremer, Yurika Watanabe, Kenneth Hancock and Daniel Polley
Topic areas: auditory disorders, correlates of behavior/perception, neural coding, thalamocortical circuitry/function
Fri, 11/10 12:00PM - 1:00PM | Podium Presentations 1
Abstract
The auditory periphery converts a million-million-fold change in acoustic signal energy (120 dB) into an electrochemical code for sound intensity. The central auditory pathway, in turn, converts this sound intensity code into the perception of loudness. The essential circuitry for intensity-to-loudness transformation has not been identified, though there is reason to believe that dysfunction in these circuits could produce hypersensitivity to sound. Here, we tested the hypothesis that local circuits formed between parvalbumin-expressing (PV) GABA neurons and excitatory (RS) neurons in the mouse primary auditory cortex (A1) mediate loudness perception and are a critical failure point in hyperacusis. A1 single unit recordings revealed heterogeneous intensity tuning that was aggregated at the level of the cortical column into a linear readout of sound level. Optogenetic inactivation or activation of A1 PV neurons imposed opposite shifts in sound intensity coding towards neural hyperacusis or hypoacusis, respectively (n=6/177 units). To directly relate PV activity to loudness perception, head-fixed mice were trained in a two-alternative forced choice categorization task. Bilateral optogenetic inactivation or activation of A1 PV neurons immediately and reversibly shifted the perceptual boundary between soft and loud sound reporting, respectively, suggesting that A1 PV neurons function as a volume knob for loudness perception (N=12). To test whether reduced PV-mediated inhibition is an underlying cause of hyperacusis, we induced high frequency hearing loss with noise. After cochlear injury, A1 RS units were hyper-responsive to spared mid-frequency tones and were less suppressed by optogenetic PV activation (N=6/484 units). Noise-exposed mice also exhibited hyperacusis in loudness categorization compared to sham-exposed controls (N=10). Importantly, optogenetic activation of PV neurons in hyperacusic mice (N=5) transiently shifted loudness categorization back to pre-exposure levels. Finally, we asked if high-frequency stimulation of PV neurons could reinvigorate PV-mediated inhibition and stably reverse hyperacusis. We found that activating PV neurons for 15 minutes at 40 Hz – but not 1 Hz - reduced sound-evoked spiking in A1 RS units up to 60 minutes later (N=13/242 units) and shifted behavioral loudness categorization towards softer sounds for up to 1 week (N=4). These findings identify new therapeutic targets for auditory hypersensitivity disorders.
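As a brief aside on the opening figure above (not part of the abstract), the correspondence between a 120 dB range and a million-million-fold (10^12) change in signal power follows directly from the definition of the decibel:

```latex
% 120 dB expressed as a power ratio
\mathrm{level\ (dB)} = 10\log_{10}\frac{P}{P_0}
\qquad\Longrightarrow\qquad
\frac{P}{P_0} = 10^{120/10} = 10^{12}
```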
Manaswini Kar, Kayla Williams and Srivatsun Sadagopan
Topic areas: correlates of behavior/perception, hierarchical organization, neural coding
Fri, 11/10 12:00PM - 1:00PM | Podium Presentations 1
Abstract
During active listening, we pay attention and expend effort to improve sound recognition. But how active listening enhances stimulus representations in the auditory pathway remains unclear. In the primary auditory cortex (A1), the reported effects of active listening on single neuron tuning and response magnitude vary widely. Here, we investigated this question using complex and ethologically relevant sounds (vocalizations, or calls) in the context of a call categorization task. First, to confirm the necessity of A1 for call processing, we trained guinea pigs (GPs) on a calls vs. noise discrimination task, and inactivated A1 pharmacologically. Unilateral A1 inactivation did not affect task performance, but bilateral A1 inactivation resulted in performance dropping to near-chance levels, demonstrating that A1 is critical for call processing. Next, we obtained large-scale A1 recordings using implanted Neuropixels probes while GPs performed a call categorization task. We observed lamina-specific effects, with the superficial (L2/3) and deep layers of A1 showing the highest modulation during active listening. Specifically, A1 L2/3 neurons retained high selectivity for call features across the active and passive conditions, but displayed greatly increased output nonlinearities during active listening. To explain these results, we extended our categorization model by allowing the output gains of model neurons to be nonlinearly modulated using power-law scaling. Increasing the nonlinearity of model L4 output gains did not improve model performance, but increasing the nonlinearity of model L2/3 output gains led to systematic performance increases. Using dimensionality reduction techniques, we attributed these performance increases to more compact and better separated representations of call categories. Taken together, our results demonstrate the necessity of A1 for call perception, and indicate that active listening modulates the activity of A1 L2/3 neurons in a manner that preserves their high feature selectivity but leads to enhanced call category decoding.
Xindong Song, Yueqi Guo, Chenggang Chen and Xiaoqin Wang
Topic areas: correlates of behavior/perception, cross-species comparisons, hierarchical organization, novel technologies
Fri, 11/10 12:00PM - 1:00PM | Podium Presentations 1
Abstract
How the brain processes the pitch of complex sounds has been one of auditory neuroscience's central questions due to the importance of pitch in music and speech perception. The cortical representation of pitch has been demonstrated by one pitch-sensitive region near the anterolateral border between A1 and R in the common marmoset (Bendor and Wang, 2005). However, it is not clear whether other pitch-processing regions exist in the marmoset brain. Here, we performed optical imaging over the entire auditory cortex on the brain surface in awake marmosets. By contrasting responses to harmonic complex sounds with spectrally matched noises, we identified two discrete pitch-sensitive regions. One region is located anterolaterally to the A1 and R border and is consistent with the previously described "pitch-center" identified by single-unit recording. The second region is newly found at a location more anterior to the "pitch-center" and functionally overlaps with the RT field; we refer to it as the "anterior pitch-region". When tested with synthetic tones composed of low-numbered harmonics, these two pitch-sensitive regions only appear when the fundamental frequency (F0) is close to or higher than 400 Hz, a phenomenon consistent with the estimated harmonic resolvability of the marmoset (Osmanski et al., 2013; Song et al., 2016). The response contrasts in these two pitch-sensitive regions were also robust when tested with more natural sounds such as a female singer's a cappella songs (F0 ~300-700 Hz). Furthermore, the ratio between the singing contrast and the synthetic-tone contrast is higher in the "anterior pitch-region" than in the anterolateral "pitch-center" in all tested subjects. Together, our results suggest that cortical pitch processing in the marmoset is organized into discrete regions with a functional hierarchy along the anterior direction for natural harmonic sounds.
Ilina Bhaya-Grossman, Matthew Leonard, Yizhen Zhang, Laura Gwilliams, Keith Johnson and Edward Chang
Topic areas: speech and language, correlates of behavior/perception
Fri, 11/10 12:00PM - 1:00PM | Podium Presentations 1
Abstract
Languages differ in the set of contrasting sounds (phonetic features), and the ways in which these sounds are sequenced (phonotactics) to produce units of meaning like words. By the time a speaker is proficient in a language, they have had extensive experience and exposure to both of these sources of information, which fundamentally alters how that signal is understood. It remains unknown, however, how such language experience affects the neural encoding and processing of this information at these phonological levels. Here, we used direct high-density electrocorticography (ECoG) to identify neural populations that respond to speech in native vs. unfamiliar languages, and to address the extent to which language-specific processing is associated with distinct levels of phonological representation. We performed ECoG recording while participants passively listened to natural speech in their native language and a language that was unfamiliar to them (either Spanish or English). We found that both languages elicited significant average responses to speech in nearly all cortical sites throughout the superior and middle temporal lobe, suggesting that the same neural substrates are active regardless of language familiarity. However, when we examined the encoding of specific phonological information, we found striking language-specific response patterns. Cortical sites that were sensitive to sequence-level phonological information like phonotactics showed significantly enhanced representations of these features in the native language. Similarly, neural populations showed more robust decoding of word boundaries when presented with the subject’s native language compared to an unfamiliar language. In contrast, the encoding of acoustic-phonetic features showed a lesser degree of difference as a function of native language experience. Critically, cortical sites that encoded enhanced sequence-level phonological information in the native language also encoded acoustic-phonetic features in both the native and unfamiliar language. Together, these results demonstrate that native language experience affects neural speech representations at the level of phoneme sequences and words, and further, that this effect occurs at cortical sites in the human temporal lobe where the acoustic-phonetic features of all speech sounds are represented.
Celine Drieu, Ziyi Zhu, Ziyun Wang, Kylie Fuller, Aaron Wang, Sarah Elnozahy and Kishore Kuchibhotla
Topic areas: memory and cognition, correlates of behavior/perception, neural coding
Fri, 11/10 3:00PM - 3:45PM | Podium Presentations 2
Abstract
Goal-directed learning is traditionally considered a slow and gradual process. An alternative view suggests that animals, including humans, experience insightful moments with rapid, step-like changes during learning. Recent research reconciles these views, arguing that latent task knowledge can emerge rapidly even though behavioral performance improves gradually. The neural mechanisms that drive these parallel learning processes, however, remain unknown. The sensory cortex, given its brain-wide inputs from feedforward sensory, ascending neuromodulatory, and top-down frontal and motor regions, is a promising candidate for identifying these neural dynamics. Here, we trained mice on an auditory go/no-go task, employing optogenetic suppression and two-photon calcium imaging to investigate the role of the auditory cortex (AC) during learning. We found that optogenetic suppression during either the stimulus or the reward period delayed learning. Complete-trial suppression led to even greater delays. These effects waned during learning until vanishing when the animals reached expert performance, indicating a transient, associative, and instructional role for the AC. We then longitudinally tracked the same large excitatory network in layer II/III (n=4,643 neurons) using two-photon calcium imaging in mice performing the task (n=5) or listening passively to the same pure tones (n=3) over 15 days. Using unsupervised low-rank tensor decomposition, we identified distinct neural ensembles that encoded task contingencies, including reward prediction and behavioral inhibition. Remarkably, these contingency-related signals emerged within just 20-40 trials, exhibiting 'insight-like' properties, strengthened during task acquisition, and eventually receded during extended training. The contingency-specific ensembles were organized into spatial domains separate from underlying stimulus representations, indicating a higher-order functional segregation within the AC. Therefore, latent task knowledge manifested early and rapidly in cortical networks in the form of contingency-related signals. These data challenge the classical view of goal-directed learning as slow and gradual and suggest that the sensory cortex serves as an associative engine, directly linking stimuli with actions that produce desirable outcomes.
Matheus Macedo-Lima, Lashaka Hamlette and Melissa Caras
Topic areas: correlates of behavior/perception
Fri, 11/10 3:00PM - 3:45PM | Podium Presentations 2
Abstract
Sensory acuity can benefit from practice, through which we can improve our ability to see, hear, smell, and taste – a process termed perceptual learning. In the auditory system, perceptual learning supports the development of speech and musicality and improves speech recognition in the hearing-impaired. Non-sensory processes, like attention or reward, make critical contributions to perceptual learning, but the neural circuits and mechanisms that mediate their involvement are poorly understood. The orbitofrontal cortex (OFC) has well-established roles in signaling reinforcement and in transmitting state-dependent feedback via direct projections to auditory cortex (AC). We hypothesized that OFC neurons provide non-sensory input to AC that supports auditory perception and perceptual learning. If OFC transmits a non-sensory signal to AC that shapes AC sensitivity and perception, then silencing OFC activity should disrupt AC sensitivity and impair behavioral sound detection. To test this prediction, we used muscimol (a GABAa agonist) to inactivate bilateral OFC, and simultaneously recorded extracellular responses from AC neurons in freely moving Mongolian gerbils of both sexes as they performed an amplitude modulation (AM) detection task. We found that inactivation of bilateral OFC significantly impaired both behavioral and AC neural AM detection. Next, we asked whether OFC neurons exhibited learning-related changes in activity by using chronically implanted electrode arrays to record from OFC neurons as gerbils trained on an auditory perceptual learning task with progressively more challenging AM stimuli. We found that the firing rates of OFC neurons gradually increased and correlated with the degree to which perceptual thresholds improved. To determine whether learning affected the specific subpopulation of OFC neurons that innervate the AC, we used fiber photometry to record calcium signals from just the OFC neurons that project to the AC as gerbils underwent auditory perceptual learning on the same task. We found that calcium signals in these cells grew larger as perceptual thresholds improved, suggesting that OFC neurons send progressively stronger signals to AC over the course of perceptual learning. Our results support the hypothesis that the OFC facilitates practice-dependent improvements in perception and AC sensitivity via a direct projection to AC.
Kaho Magami, Sijia Zhao, Claudia Contadini-Wright, Mert Huviyetli and Maria Chait
Topic areas: memory and cognition, correlates of behavior/perception
Fri, 11/10 3:00PM - 3:45PM | Podium Presentations 2
Abstract
Microsaccades (MS) are tiny, involuntary eye movements that occur during fixation. They are controlled by a network involving the frontal eye fields and the superior colliculus and are believed to represent the unconscious continuous exploration of the environment. Recent findings, predominantly in vision, suggest that this sampling is affected by the attentional state of the individual: MS incidence decreases during, and in anticipation of, task-relevant events and under high load. Despite the potential wealth of information conveyed by MS, our understanding of how auditory perceptual processes interact with the attentional mechanisms that regulate MS is limited. We report on a series of experiments (each N=~30) in which we investigated how sounds, and listener engagement with sound, affect MS dynamics. We employed various auditory tasks that captured different aspects of auditory attention, including listening effort, bottom-up attentional capture, and selective attention. Our results demonstrate that auditory attention modulates MS dynamics. Specifically, auditory-evoked microsaccade inhibition (MSI; a rapid reduction in MS rate) was influenced by bottom-up attentional capture, exhibiting more pronounced effects for perceptually salient events. MS dynamics were also modulated by top-down attention, such that sounds in an attended stream elicited larger MSI than sounds in an ignored stream. In experiments where participants performed a speech-in-noise task, MS rate was specifically modulated at critical points in the sentence (keywords) where attentional demands were highest. A comparison between concurrently recorded MS and pupil dilation (a common index of instantaneous and sustained arousal) indicates that MS rate specifically indexes the allocation of instantaneous auditory attention, a process distinct from the modulation of arousal indicated by pupil dilation. Overall, these findings uncover the intricate interplay between auditory attention and the attention network controlling MS, establishing microsaccades as a valuable tool for measuring auditory attentional allocation and related deficits.
Ghattas Bisharat, Ekaterina Kaganovski, Anita Temnogorod and Jennifer Resnik
Topic areas: correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Our emotional state is a potent driver of decision-making and multi-layered cognitive processes such as learning and memory. At a much more fundamental level, emotional states can interact with the way we perceive sensory inputs. For example, the doorbell will sound louder when we are stressed, and hills will seem steeper when we are despondent. So, to understand how emotional states shape behavior, we first need to understand how emotional states shape sensory processing and perception. How is emotional state-information represented in the sensory systems? How do changes in emotional state shape the perception of specific stimuli? For how long? To tackle these questions, we developed a rodent auditory behavioral approach that enables direct evaluation of perception. We coupled these behavioral paradigms to manipulations of the animal's emotional state (inducing chronic stress) and, using two-photon calcium imaging, detailed probing of neural activity modulations at the network and cell-specific levels in the auditory cortex. Our findings indicate that chronic stress decreases sound-evoked activity in the auditory cortex. These alterations are not caused by acute stress and occur gradually over time. Moreover, we found changes in perception as a result of stress-induced changes to cortical activity. Our findings could have clinical implications by explaining how stressful states distort the neural representation of sensory stimuli and alter our perception of the world.
Toshiaki Suzuki, Timothy Olsen and Andrea Hasenstaub
Topic areas: multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Recent studies have revealed that the auditory cortex (AC) not only processes auditory information but is also involved in behavioral output and is modulated by other sensory information. We have previously shown that the mouse AC contains a group of cells that are influenced by visual information (1). These neurons are mainly found in the deep layers of the auditory cortex. To identify likely sources of this visual information, we aimed to map the areas that project to deep AC using localized microinjection of retrograde tracers by iontophoresis. Four C57BL/6 mice (age at injection: P49-83) were injected with a viral mixture of AAV2retro-hsyn-EGFP and AAV9-hsyn-ChR2-mCherry in the right auditory cortex (depth: -0.75 mm) by iontophoresis. After 3-4 weeks, we perfused their brains and then sliced them. EGFP-positive cells were analyzed using Aligning Big Brains & Atlases (BioImaging And Optics Platform) with reference to the Allen Brain Atlas (2017 CCF v3). EGFP-positive cells were scattered throughout the whole brain, but 96.9% were found in the isocortex. The somatosensory cortex accounted for 15.2% and the visual cortex for 8.52%, while the auditory cortex accounted for 52.7% of the total signal. A closer look at the visual region revealed that while the primary visual cortex (VISp) accounted for 0.55%, EGFP-positive cells were also observed in higher visual cortices. VISa (0.93%) and VISal (2.54%), which lie adjacent to the auditory cortex, were well represented, but regions farther from the auditory cortex also contained labeled cells (VISam: 0.41%, VISpm: 0.15%, VISpor: 1.3%). These results provide evidence that the iontophoresis method can reveal the circuit structure of localized cells in the auditory cortex. The deep layers of the mouse auditory cortex, which have been thought to receive input from the medial higher visual cortex (2), in fact receive input, to varying degrees, from each of the visual cortical regions identified in this study. References: 1. Morrill and Hasenstaub. (2018). Visual Information Present in Infragranular Layers of Mouse Auditory Cortex. J. Neurosci. 2. Banks et al. (2011). Descending projections from extrastriate visual cortex modulate responses of cells in primary auditory cortex. Cereb. Cortex.
Meredith Schmehl, Surya Tokdar and Jennifer Groh
Topic areas: correlates of behavior/perception, multisensory processes, neural coding, subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
How the brain uses multisensory cues to process complex sensory environments remains a key question in neuroscience. Of particular interest is whether relatively early sensory areas, which are commonly considered to be unisensory in function, might take in information from other sensory modalities to inform the representation of the primary modality of interest (for review, see Schmehl & Groh, Annual Review of Vision Science 2021). We explored how visual cues might inform the representation of sounds in the macaque inferior colliculus, a subcortical auditory region that receives visual input and has visual and eye movement-related responses. We conducted in vivo single- and multi-unit extracellular recordings in the inferior colliculus while two monkeys (Macaca mulatta, one female age 15 years, one male age 7 years) performed a localization task involving both auditory and visual stimuli. We found that pairing a visual cue with a sound can change a neuron's response to that sound, even if the neuron is unresponsive to visual input alone. Visual cues also enhance localization behavior in both spatial precision and temporal latency. Finally, when two simultaneous sounds are present and one sound is accompanied by a visual cue, neurons are more likely to respond to the visually-paired sound on individual trials. Together, these results suggest that the inferior colliculus uses visual cues to alter its sound responsiveness and inform perceptual behavior, providing insight into how the brain combines multisensory information into a single perceptual object at a relatively early stage of the auditory pathway.
Or Yudco, Libi Feigin, Ido Maor and Adi Mizrahi
Topic areas: correlates of behavior/perception, neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Category learning is a fundamental brain process that enables quick and accurate responses to novel stimuli in complex sensory scenes. In the auditory modality, category learning underlies many processes of sound perception, such as the understanding of language in humans. While the learning process has been shown to be accompanied by changes in neuronal representation, its underlying mechanisms are not yet clear. Using an automated learning platform, we trained female mice (n = 22) to discriminate between two categories: rising frequency-modulated (FM) sweeps and falling FM sweeps. At the end of training, we presented the mice with novel stimuli in order to decipher the learned categorization rule and found that they used the frequency content of the sweep as the categorical boundary cue, rather than the slope of the sweep. Using Neuropixels multi-site silicon probes, we performed electrophysiological recordings from the auditory cortex of awake mice (n = 9 experts, n = 4 naïve) while they listened passively to FM sweeps and pure tones. We acquired data from the primary auditory cortex (AUDp, n = 389 neurons) and the auditory temporal association cortex (TeA, n = 91 neurons) and found that more neurons of expert mice prefer the frequency of the category boundary as compared to naïve mice. Furthermore, neurons of expert mice, as well as their population activity, have higher discriminability between sets of both FM sweeps and pure tones. Our results show that plastic changes in the discriminability of sounds by neurons in the auditory cortex correspond to the behavioural strategy used by the mice to categorize sounds.
Benni Praegel, Adria Dym, Feng Cheng, Shaul Druckmann and Adi Mizrahi
Topic areas: correlates of behavior/perception, neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Adolescence is known to be a period of uncertainty, exploration, and learning. Our understanding of the underlying neural correlates of adolescence remains limited. Here, we studied adolescence through the prism of auditory learning and the neural representations of learned sounds in the auditory cortex of mice. We asked whether adolescent and adult mice discriminate tone categories differently, and how these differences are expressed in auditory cortical responses in behaving mice. First, we trained freely behaving mice (n=15 adult mice and n=15 adolescent mice) to perform a go/no-go task on pure tone categories. We found weaker performance in adolescence compared to adulthood and that this difference was attributable to specific biases. Second, we trained head-restrained mice (n=6 adult mice and n=5 adolescent mice) on the same task and performed two separate experiments: 1) We manipulated auditory cortex on a trial-by-trial basis using optogenetic silencing (n = 4 mice injected with the GtACR2 opsin and n = 2 control mice injected with AAV-CAMKII-GFP). Inhibiting auditory cortex in adult mice decreased performance, indicating a causal relationship between auditory cortex and tone categorization. 2) We recorded single units in the auditory cortex during engaged behavior using Neuropixels probes (n=14 adult recordings and n = 13 adolescent recordings). We isolated units from the primary auditory cortex, secondary auditory cortex, and temporal association area (n = 1174 units in adult mice and n = 1134 units in adolescent mice). We are currently evaluating the task-, stimulus- and choice-related activity in single neurons, as well as in population dynamics. These data will allow us to reveal the neural correlates of behavior in adolescence as compared to adulthood.
Michellee M. Garcia, Amber M. Kline, Hiroaki Tsukano, Collin M. Graves, Pranathi R. Dandu and Hiroyuki K. Kato
Topic areas: hierarchical organization, subcortical processing, thalamocortical circuitry/function
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
How our brain integrates information across parallel sensory channels to achieve coherent perception remains a fundamental question in neuroscience. In the auditory system, sound information reaches the cortex via two parallel pathways. The “primary” lemniscal pathway relays fast and accurate sound information to layer 4 (L4) of the primary auditory cortex (A1). Conversely, the “secondary” non-lemniscal pathway is considered a slow integrator of multisensory information relayed indirectly from cortical areas. Recent anatomical and physiological findings, however, challenge this simple dichotomy. These include our discovery of a short-latency ( < 10ms) sound input onto layer 6 (L6) of the secondary auditory cortex (A2), comparable in speed to the “fast” lemniscal input to A1 L4. Here, we examined the hypothesis that this short-latency input is conveyed via non-lemniscal pathways by conducting cortical area- and layer-targeted retrograde tracing. We found that A2 L4 and L6 receive inputs from distinct medial geniculate nucleus (MGN) subdivisions; specifically, A2 L6 receives input from the medial division of MGN (MGm) while A2 L4 is innervated by the caudal part of the ventral division of MGN (MGv). Interestingly, further MGN subdivision-specific retrograde tracing revealed that MGm and caudal MGv receive inputs from overlapping but distinct domains of the shell of the inferior colliculus, which in turn receive direct input from the cochlear nucleus. These findings demonstrate a non-lemniscal origin of parallel ascending pathways that bypass A1 and directly reach both the superficial and deep layers of A2. Moreover, our results suggest that caudal MGv, but not rostral MGv, belongs to the non-lemniscal pathway, despite the conventional view of MGv as a homogeneous lemniscal structure. Ongoing electrophysiology and optogenetic manipulation studies aim to investigate the sound response properties of these non-lemniscal pathways and explore how parallel ascending pathways are integrated in the cortex to shape perception.
Marios Akritas, Alex G. Armstrong, Jules M. Lebert, Arne F. Meyer, Maneesh Sahani and Jennifer F. Linden
Topic areas: memory and cognition, neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The perceptual salience of a sound depends on the acoustic context in which it appears. Single-neuron correlates of this contextual sensitivity can be estimated from neuronal responses to complex sounds using the nonlinear-linear "context model" (Williamson et al. 2016 Neuron). Context models provide estimates of both the principal (spectrotemporal) receptive field of a neuron and a "contextual gain field" describing its nonlinear sensitivity to combinations of sound input. Previous studies of contextual gain fields in auditory cortex of anaesthetized mice revealed strong neuron-specific patterns of nonlinear sensitivity to sound combinations. However, the stability of these patterns over time, especially in awake animals, is unknown. We recorded electrophysiological activity of neurons in auditory cortex of awake mice over many days using chronically implanted tetrode arrays. Concurrently we recorded locomotor activity and pupil diameter to measure behavioural state. Repeated recordings were made at each recording site across at least five days, during presentations of prolonged complex sounds (dynamic random chord stimuli). We used spike-waveform matching to identify the same units recorded on different days, and the context model to estimate principal receptive fields and contextual gain fields for each neuron in each recording session. We then quantified the stability of these fields both within and across days. We also examined the dependence of context model fits on measures of behavioral state. Contextual gain fields of auditory cortical neurons in awake mice were remarkably stable across many days of recording. In more than 90% of the 69 neurons tracked for multiple days, neuron-specific patterns of sound combination sensitivity (and spectrotemporal sensitivity) remained stable on a timescale that matched or substantially exceeded the typical five-day range of our measurements. Interestingly, there were small but significant effects of changes in locomotion or pupil size on the ability of the context model to fit temporal fluctuations in the neuronal response. We conclude that contextual sensitivity is an integral and stable feature of the neural code in the awake auditory cortex, which may be modulated by behavioral state.
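For readers unfamiliar with the context model referenced above, the equation below sketches its general nonlinear-linear structure as described in Williamson et al. (2016): the contribution of each spectrotemporal stimulus bin, weighted by the principal receptive field (PRF), is multiplicatively modulated by a gain determined by the surrounding sound energy through the contextual gain field (CGF). The notation and normalization here are schematic, not taken from the abstract.

```latex
% Schematic context-model prediction: PRF-weighted input scaled by a CGF-defined context gain
r(t) \;\approx\; c \;+\; \sum_{f,\tau} w_{\mathrm{PRF}}(f,\tau)\, s(f,\, t-\tau)
\Big[\, 1 + \sum_{\phi,\tau'} w_{\mathrm{CGF}}(\phi,\tau')\, s(f+\phi,\; t-\tau-\tau') \Big]
```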
Corentin Puffay, Jonas Vanthornhout, Marlies Gillis, Bernd Accou, Hugo Van Hamme and Tom Francart
Topic areas: speech and language, correlates of behavior/perception, neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The extent to which the brain tracks a natural continuous speech stimulus can be measured by modeling the relationship between the stimulus features and the corresponding EEG. Typically, acoustic features are used, but neural tracking of lexical and linguistic features has also been shown. Lexical features (i.e., word and phoneme onsets) carry information about the prosody, while linguistic features (i.e., word surprisal, word frequency, phoneme surprisal, and cohort entropy) carry information about the value of a word or a phoneme given the semantic context. Such information can be used as a marker of speech understanding. Nonlinear deep learning models have recently been used to assess neural tracking of lexical and linguistic speech features. Here, we evaluate these models on a dataset with various speech rates to manipulate speech understanding and investigate how speech rate affects the neural tracking of linguistic features. We use the EEG of 18 participants who listened to stories at various speech rates. We developed a deep neural network, trained on a match-mismatch task, to measure the contribution of linguistic features to neural tracking on top of the contribution of lexical features. In this task, the model must decide whether an EEG segment matches the auditory stimulus that evoked it (matched) or another arbitrary segment (mismatched). Without re-training, we evaluate this model on different speech rates. To assess whether neural tracking is related to speech understanding, we compare model performance with behavioral measures. We hypothesize that neural tracking of linguistic features is affected by the speech rate, providing an objective measure of speech understanding. In contrast to subject-specific linear models, our deep learning model can capture nonlinearities in the brain response and does not require training on new subjects to perform the speech understanding assessment.
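As an illustration of the match-mismatch evaluation described above, the sketch below scores EEG/stimulus segment pairs with a trained model and counts how often the matched segment outscores a mismatched one. The `model_score` function, array shapes, and the toy scoring rule are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

def match_mismatch_accuracy(model_score, eeg_segments, matched_feats, mismatched_feats):
    """Fraction of segments where the model rates the true (matched) stimulus
    features higher than arbitrary (mismatched) features from the same recording.

    model_score(eeg, feats) -> similarity score (higher = better match); hypothetical API.
    eeg_segments, matched_feats, mismatched_feats: lists of equally long segments.
    """
    correct = 0
    for eeg, match, mismatch in zip(eeg_segments, matched_feats, mismatched_feats):
        correct += model_score(eeg, match) > model_score(eeg, mismatch)
    return correct / len(eeg_segments)

# Toy usage with a stand-in scoring function (correlation with the channel-mean EEG).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_score = lambda eeg, feats: float(np.corrcoef(eeg.mean(0), feats)[0, 1])
    eeg = [rng.standard_normal((64, 320)) for _ in range(10)]             # 64 channels x 5 s at 64 Hz
    matched = [e.mean(0) + 0.5 * rng.standard_normal(320) for e in eeg]   # correlated with its EEG
    mismatched = [rng.standard_normal(320) for _ in range(10)]            # unrelated segments
    print(match_mismatch_accuracy(toy_score, eeg, matched, mismatched))
```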
Michael Johns, Regina Calloway, I. M. Dushyanthi Karunathilake, Samira Anderson, Jonathan Simon and Stefanie Kuchinsky
Topic areas: correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Listening to speech in noise requires substantial effort, even for young normal-hearing adults. While many studies have examined listening effort in response to single words or sentences in noise, few have examined the trajectory of listening effort across longer stretches of speech or how sustained listening may interact with listeners’ expectations about upcoming difficulties—both of which may be more representative of real-world listening. In the present study, 17 younger normal-hearing adults listened to 60-s long audiobook passages while pupil size was recorded. Pupil size has been found to track changes in both attention mobilization—how listeners prepare their attention, reflected by baseline pupil size (BPS)—and effort allocation—how listeners deploy their cognitive resources, reflected by the task-evoked pupil response (TEPR). Participants were instructed to attend to one of two competing speakers, with the target speaker presented at either the same (0 dB SNR) or a quieter level (-6 dB SNR) than the non-target speaker. Passages were blocked by SNR and repeated three times in a row. Generalized additive mixed models were used to analyze the time course of the trial-level TEPR as a nonlinear function of BPS (median pupil size during a 2-s neutral pre-stimulus period), SNR, and presentation (first, second, or third). There was a significant modulation of the TEPR by BPS that differed by SNR and presentation. At lower BPS values, the TEPR was significantly larger in the harder -6 dB compared to the 0 dB SNR condition for all three presentations, suggesting that under-mobilization of attention led to increased listening effort despite listeners being able to anticipate the difficulty of subsequent presentations. At intermediate BPS values, reflective of more optimal attention mobilization, differences between the two SNR conditions were largely absent. Lastly, at higher BPS values, the TEPR was significantly reduced in the -6 dB compared to the 0 dB SNR condition for the second and third presentations, with a larger and more sustained difference in the second than in the third presentation. This suggests that, in an over-mobilized or more distractible state, listeners initially ‘gave up’ or under-allocated listening effort to the harder SNR condition. Together, these findings suggest that how listening effort unfolds over time depends on how individuals mobilize their attention in anticipation of difficult listening conditions.
Sophie Bagur, Jacques Bourg, Alexandre Kempf, Thibault Tarpin, Khalil Bergaoui, Yin Guo, Sebastian A Ceballo, Joanna Schwenkgrub, Antonin Verdier, Jean-Luc Puel, Jérôme Bourien and Brice Bathellier
Topic areas: correlates of behavior/perception, hierarchical organization, neural coding, subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The temporal structure of sounds is essential for their interpretation. Sensory cortex represents these temporal cues via two codes: the temporal sequences and the spatial patterns of neuronal activity. It is unknown which of these coexisting codes causally drives sensory decisions. To separate their contributions, we engineered optogenetically-driven activity in the mouse auditory cortex to elicit neural patterns differing exclusively along their temporal or spatial dimensions. We trained mice to discriminate these two types of patterns and found that they could learn to behaviourally discriminate spatial but not temporal patterns. It is well known that mice can successfully learn to discriminate sounds differing only by their temporal structure. Therefore, we asked how such temporal sensory information can be behaviourally discriminated given our experimental observation that downstream learning mechanisms fail to exploit temporal neuronal information. We performed large-scale neuronal recordings across the auditory system (inferior colliculus, thalamus and auditory cortex) combining calcium imaging and electrophysiology. These results revealed that the auditory cortex is the first region in which spatial patterns efficiently represent all temporal cues at a timescale of hundreds of milliseconds. Interestingly, this occurs without loss of neural temporal information, demonstrating a hybrid cortical coding scheme in which temporal and spatial codes carry redundant information. The emergence of a spatial code for temporal cues between thalamus and cortex can explain why cortex is necessary for the discrimination of temporally structured sounds but not of pure tones, as shown by multiple inactivation studies. A quantitative model of associative learning shows that fast learning requires spatial representations which are found in subcortical structures for pure tones but not for temporal auditory cues. Finally, this feature is shared by the deep layers of neural networks trained to categorise time-varying sounds but not on other tasks. It is therefore likely a general condition to associate temporally structured stimuli with categorical decisions. Overall, our results identify a division of labour for the processing of temporal information: the late auditory system transforms temporal information into spatial patterns and downstream decision networks associate these spatial patterns with behavioural output.
Jenna Blain, Monty Escabi and Ian Stevenson
Topic areas: neural coding, subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Spectrotemporal receptive fields (STRFs) are used to model the time-frequency sensitivity of auditory neurons. In many instances, STRFs are derived using unbiased synthetic stimuli, such as dynamic ripples or random chords, for which they can easily be estimated using spike-triggered averaging. When natural sounds are used, decorrelation and regularization techniques are needed to remove residual stimulus correlations that can distort the estimated STRFs. Furthermore, nonlinearities and non-stationarities make it difficult to predict neural responses to natural sounds. We obtained neural recordings from the inferior colliculus of unanesthetized rabbits in response to a sequence of natural sounds and dynamic moving ripples (DMR). We developed a model-based approach for deriving auditory STRFs and for predicting single-trial spike trains to either the DMR or the natural sounds. The model consists of a nine-parameter Gabor STRF (gSTRF; Qiu et al. 2003), which accounts for the neuron's spectro-temporal integration of the stimulus, and a four-parameter nonlinear integrate-and-fire compartment, which incorporates intrinsic noise, cell membrane integration, and nonlinear thresholding to generate simulated output spikes. We used Bayesian optimization to fit neural data and derive optimal model parameters by maximizing the model's log-likelihood. To validate our spiking gSTRF model, we compared the optimal gSTRFs to those obtained with other approaches such as regularized regression and a generalized linear model. We compared STRFs derived using DMR and natural sounds for each of these estimators, as well as spike train predictions obtained from each model. We also carried out these comparisons with simulated data where the "ground truth" STRF and spiking activity were known a priori. For these simulations, we demonstrate that the gSTRF converges to the original simulation parameters and replicates the spiking activity from the original simulations with millisecond precision. Furthermore, for real neural data the gSTRF allows us to estimate physiologically interpretable parameters, such as the neuron's best frequency, delay, and best temporal and spectral modulation frequency, from the optimized gSTRF parameters. Our approach allows one to derive auditory STRFs and predict neural spiking activity to natural sounds using functionally interpretable basis functions. The small number of parameters makes exploration of nonlinear and nonstationary effects due to natural sound statistics more feasible.
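To make the Gabor STRF idea concrete, here is one plausible nine-parameter Gabor parameterization (an overall amplitude plus a Gaussian-windowed sinusoid in each of the temporal and spectral dimensions), in the spirit of Qiu et al. (2003); the exact parameterization, units, and fitting procedure used in this work may differ.

```python
import numpy as np

def gabor_strf(t, x, amp, t0, sigma_t, f_t, phi_t, x0, sigma_x, f_x, phi_x):
    """Nine-parameter Gabor STRF sketch: separable product of a temporal and a
    spectral Gabor (Gaussian envelope times sinusoidal carrier).

    t: time lags (s), x: frequency axis (e.g., octaves re: best frequency);
    the remaining arguments are the nine free parameters (names chosen here for illustration).
    """
    T, X = np.meshgrid(t, x)                       # (freq, lag) grid
    g_t = np.exp(-0.5 * ((T - t0) / sigma_t) ** 2) * np.cos(2 * np.pi * f_t * (T - t0) + phi_t)
    g_x = np.exp(-0.5 * ((X - x0) / sigma_x) ** 2) * np.cos(2 * np.pi * f_x * (X - x0) + phi_x)
    return amp * g_t * g_x

# Example: a ~10 ms latency STRF tuned to ~20 Hz temporal and ~0.5 cyc/oct spectral modulation.
lags = np.arange(0, 0.1, 0.001)                    # 0-100 ms in 1 ms steps
freqs = np.linspace(-2, 2, 41)                     # octaves relative to best frequency
strf = gabor_strf(lags, freqs, amp=1.0, t0=0.01, sigma_t=0.015, f_t=20.0, phi_t=0.0,
                  x0=0.0, sigma_x=0.5, f_x=0.5, phi_x=0.0)
print(strf.shape)                                  # (41, 100)
```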
Brendan Williams, Tanya Danaphongse, Samantha Kroon, Yuko Tamaoki, Jonathan Riley and Crystal Engineer
Topic areas: auditory disorders, speech and language, correlates of behavior/perception, novel technologies
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The comprehension and identification of human speech is dependent on the health of the auditory pathway. In neurodevelopmental disorders like autism spectrum disorder (ASD), developmental alterations to the auditory pathway can disrupt typical function, causing a cascade of processing errors. These physiological deficits in how sound is processed are often linked to poor performance in clinical evaluations of language. Modest improvements are accessible through extensive speech therapy, but many individuals still report deficits following treatment. To improve therapeutic outcomes, adjunctive therapies are needed. One potential adjunct to traditional speech therapy is vagus nerve stimulation (VNS). When paired with a sound, VNS drives plasticity in the auditory cortex, improving neural response strength and latency to the paired sound. Utilizing in vivo multi-unit electrophysiology and go/no-go behavioral discrimination tasks, this study assesses the physiological and functional discrimination ability of rats prenatally exposed to valproic acid (VPA). A subset of VPA-exposed rats will receive sound-paired VNS to determine whether VNS-driven improvements in auditory processing can overcome the physiological and behavioral deficits previously reported. Preliminary data suggest that, despite having a significantly weaker mean response to the 40 ms onset of speech sounds in the anterior auditory field (AAF), VPA-exposed rats exhibit no behavioral deficit in discriminating these sounds. Furthermore, for VPA-exposed rats that received VNS, no significant behavioral changes were observed. Ongoing experiments include further characterizing the behavioral discrimination abilities of VPA-exposed rats and using VNS-sound pairing to restore AAF responses to sounds. The results of this research will contribute towards the characterization of the VPA-exposure model of autism and our understanding of the relationship between AAF physiology and behavioral sound discrimination. This research is one step towards our long-term goal of developing novel interventions which improve sound processing and language ability for individuals with neurodevelopmental disorders.
Kaho Magami, Marcus Pearce and Maria Chait
Topic areas: memory and cognition
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The auditory system possesses the remarkable ability to track environmental statistics (‘context’) and predict upcoming events, even when not consciously attending to sound. However, the effect of interruptions on the retention of old context in memory remains unknown. To address this question, thirty-one participants passively listened (while performing an incidental visual task) to regularly repeating tone sequences (50 ms tones; cycles of 10 tones) while EEG responses were measured, with 75% of scenes including brief interruptions (1, 3, or 5 random tones introduced partway). The results revealed that the learning trajectory of the sound context was reflected in the EEG sustained response, characterized by an increase and plateau of sustained power with exposure to the sound context, followed by a drop upon interruption onset and subsequent recovery as the original context re-emerged. The recovery slope varied across conditions, with shorter interruptions leading to steeper slopes. Moreover, the power in the 3- and 5-tone interruption conditions never fully recovered to the level observed in the no-interruption condition. Importantly, these dynamics were consistent with predictions from an ideal observer model (IDyOM; PPM; Pearce, M. T. (2005); Harrison et al., (2020)), which quantifies the entropy (uncertainty) of each tone-pip in the sound sequence. The alignment between model predictions and sustained EEG responses suggests that brief interruptions may alter the internal representation of ongoing regularity, leading to higher uncertainty - the brain does not disregard even brief interruptions as mere noise, altering the long-term representation of the ongoing context. References: Pearce, M. T. (2005). The Construction and Evaluation of Statistical Models of Melodic Structure in Music Perception and Composition. Doctoral Dissertation, Department of Computer Science, City University of London, UK. Harrison, P. M. C., Bianco, R., Chait, M., & Pearce, M. T. (2020). PPM-Decay: A computational model of auditory prediction with memory decay. PLoS computational biology, 16(11), e1008304. https://doi.org/10.1371/journal.pcbi.1008304
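The per-tone uncertainty referred to above is, in the ideal-observer framework the authors cite, the Shannon entropy of the model's predictive distribution over the next tone given the preceding context; written out (notation mine, not from the abstract):

```latex
% Entropy (uncertainty) of the prediction for tone n given the preceding sequence,
% where A is the alphabet of possible tone frequencies
H_n \;=\; -\sum_{x \in \mathcal{A}} p\!\left(x \mid x_1, \ldots, x_{n-1}\right)\,
\log_2 p\!\left(x \mid x_1, \ldots, x_{n-1}\right)
```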
Alexander J. Billig, Ariane E. Rhone, Kirill V. Nourski, Dominic Gray, Joel I. Berger, Christopher M. Garcia, Christopher K. Kovach, Christopher I. Petkov, Brian J. Dlouhy, Hiroto Kawasaki, Matthew A. Howard, Timothy D. Griffiths and Mitchell Steinschneider
Topic areas: correlates of behavior/perception, hierarchical organization, neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Music is present in all human cultures, can evoke powerful emotional responses, and has potential in a range of therapeutic settings. Neural transformations from acoustic to higher-order representations underlying these effects during naturalistic listening have rarely been studied with high spatiotemporal resolution. We used techniques from continuous speech modelling (Broderick et al, J Neurosci 2019 39:7564-75) to examine the contribution of different acoustic and musical features to responses throughout the human brain. Participants were 55 patients undergoing pre-surgical intracranial electroencephalography (iEEG) monitoring for epilepsy using cortical grids and depth electrodes. They listened passively to at least three pieces of Western classical and popular music. Following a study in mice (Martorell et al, Cell 2019 177:256-271), which found that 40 Hz click trains entrained hippocampal unit firing and reduced Alzheimer’s disease-like pathology, we also presented a subset of participants with a 40 Hz sinusoidally amplitude-modulated version of one musical piece. Multivariate temporal response functions were derived from the iEEG data, relating the 1-8 Hz bandpass-filtered neural signal at lags of 0-600 ms to acoustic and musical stimulus features (envelope, rectified envelope derivative, spectrogram, key clarity and stability). Cross-validated prediction accuracies were tested against results from a null distribution obtained by permuting stimulus information. Stimulus envelope most strongly predicted responses in core auditory cortex within posteromedial Heschl’s gyrus, with peaks in the temporal response function within 100 ms. In a subset of envelope-following sites at mostly superior temporal locations, the inclusion of onset and spectral information in the model explained additional response variance. Higher-level musical features (key clarity and stability) were most strongly represented at more anterior and medial temporal sites, including temporal pole, parahippocampal gyrus, and hippocampus. Encoding of these features was maximal at lags greater than 300 ms. For amplitude-modulated stimuli, 40 Hz iEEG power increased at widely distributed sites, and in the small number of participants with single unit data, firing in auditory cortex but not hippocampus aligned to particular phases of the 40 Hz cycle.
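As a rough sketch of the temporal response function approach used here (lagged regression of the band-limited neural signal onto stimulus features), the code below builds a 0-600 ms lagged design matrix and fits ridge-regularized weights; the feature set, sampling rate, and regularization value are illustrative assumptions, not the study's settings.

```python
import numpy as np

def fit_mtrf(features, neural, fs=100, max_lag_s=0.6, ridge=1.0):
    """Multivariate temporal response function via ridge regression.

    features: (n_samples, n_features) stimulus features (e.g., envelope, spectrogram bands)
    neural:   (n_samples,) neural signal (e.g., 1-8 Hz filtered iEEG at one site)
    Returns weights of shape (n_lags, n_features) for lags 0..max_lag_s.
    """
    n_lags = int(max_lag_s * fs) + 1
    n_samp, n_feat = features.shape
    # Design matrix: each column block is the feature matrix delayed by one extra sample
    X = np.zeros((n_samp, n_lags * n_feat))
    for lag in range(n_lags):
        X[lag:, lag * n_feat:(lag + 1) * n_feat] = features[:n_samp - lag]
    # Ridge solution: w = (X'X + lambda*I)^-1 X'y
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ neural)
    return w.reshape(n_lags, n_feat)

# Toy usage: recover a response to a single envelope feature peaking at ~100 ms lag.
rng = np.random.default_rng(1)
env = rng.standard_normal((3000, 1))                       # 30 s of one feature at 100 Hz
true_kernel = np.exp(-0.5 * ((np.arange(61) / 100 - 0.1) / 0.03) ** 2)
y = np.convolve(env[:, 0], true_kernel)[:3000] + 0.1 * rng.standard_normal(3000)
w = fit_mtrf(env, y)
print(int(np.argmax(w[:, 0])))                             # lag index near 10 (i.e., ~100 ms)
```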
Mohsen Alavash, Malte Woestmann and Jonas Obleser
Topic areas: auditory disorders, correlates of behavior/perception, novel technologies
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Complex auditory scenes come with perceptual challenges that render a listener slower and more uncertain in their perceptual decisions. Here, using a selective pitch discrimination task, we aim to explain such behaviors by the dynamics of cortical networks derived from functional magnetic resonance imaging (fMRI) and source-localized electroencephalography (EEG). During the task, a spatial cue prompted the listeners to attend to one of two concurrent tone streams and to judge the pitch difference (increase/decrease) in the target stream. Listeners additionally reported their confidence in their own decisions. Individual titration of the pitch difference throughout the task maintained the listeners' accuracy at ~70% but yielded considerable inter-individual variability in response speed and confidence. The fMRI study (N=40, young listeners) allowed us to characterize a significant modulation of interconnectivity between the cingulo-opercular network and each of the auditory and dorsal attention networks, the degree of which was predictive of individual metacognitive differences in listening, i.e., response speed and confidence. The EEG study (N=33/35, young/older listeners) revealed a significant increase in frontoparietal connectivity within the low-beta frequency range (16-24 Hz) during processing of the auditory spatial cue, an increase that was significantly stronger in younger than in older listeners. Our findings support the functional significance of large-scale brain networks beyond auditory cortex in attentional control during selective listening. The connectivity dynamics of these networks hold explanatory power to account for interindividual differences in objective and subjective measures of listening behavior.
Danyi Lu, Jeffrey Johnson, Kara Sisson, Melissa Stroud, Katie Neverkovec, Kendall Stewart, Jordan Roberts and Gregg Recanzone
Topic areas: correlates of behavior/perception, hierarchical organization, neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Auditory cortical processing in primates has been proposed to be divided into at least two parallel processing streams, a caudal spatial stream and a rostral non-spatial stream. Whereas functional imaging studies in humans have supported this hypothesis, few studies have investigated neural processing at the single-cell level in the auditory cortex of nonhuman primates. Therefore, we recorded single neurons from auditory cortex while an adult male macaque monkey was performing a two-alternative forced choice task to discriminate either the modulation frequency or the spatial location of a broadband amplitude-modulated noise on alternating blocks of trials. Stimuli were 500ms duration, 65 dB SPL, 100% modulation depth broadband noise presented from 90 – 170 degrees approximately 1 m from the center of the monkey’s head. The macaque was trained to initiate a trial by moving a joystick. The first stimulus (S1) was modulated at either 17 or 34 Hz, presented from 130 degrees, followed by the second stimulus (S2) 500 ms after the offset of the S1. In the temporal task, the modulation frequency of the S2 varied from the S1 by +/- 1 octave in 7 equal octave steps, and it was presented from the same speaker as S1. For the spatial task, the same stimulus as the S1 was presented from +/- 40 degrees in 8 degree steps. The monkey was required to move the joystick to indicate that it perceived the S2 as either at a higher or lower rate than the S1 in the temporal task, or to the left or right of the S1 in the spatial task. We recorded single neuron activity from the contralateral primary auditory cortex (A1; n = 113), the caudolateral field (CL; n = 19), the caudomedial field (CM; n = 49) and the rostral field (R; n = 113). We calculated the firing rate (FR) and the vector strength (VS) of each neuron from 70-500 ms from each S2 onset. We also calculated the linear regression for both FR and VS. Finally, we calculated the dynamic range in each task. In the temporal task, we found that neurons in all areas had equivalent FR tuning. However, based on the VS, neurons in R had the best tuning and those of CL the worst. In the spatial task, neurons in CL had the best FR tuning with the other areas having equivalently worse tuning. Tuning based on the VS was equivalent in all areas. These results provide good evidence in support of parallel processing of temporal and spatial information in the primate auditory cortex at the single neuron level.
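For reference, vector strength quantifies how tightly spikes lock to a particular phase of the amplitude-modulation cycle; a minimal computation from spike times and modulation frequency might look like the sketch below (a standard definition, not code from this study).

```python
import numpy as np

def vector_strength(spike_times, mod_freq):
    """Vector strength of phase locking to an amplitude-modulation frequency.

    spike_times: spike times in seconds (e.g., 0.070-0.500 s after S2 onset)
    mod_freq: modulation frequency in Hz (e.g., 17 or 34 Hz)
    Returns a value between 0 (no locking) and 1 (perfect locking).
    """
    phases = 2 * np.pi * mod_freq * np.asarray(spike_times)   # phase of each spike in the AM cycle
    return np.abs(np.mean(np.exp(1j * phases)))                # length of the mean resultant vector

# Example: spikes locked near one phase of a 17 Hz modulation give a value close to 1.
locked = np.arange(20) / 17 + 0.002 * np.random.default_rng(2).standard_normal(20)
print(round(vector_strength(locked, 17.0), 2))
```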
Matthieu Fuchs, Yunshan Cai, Keshov Sharma, Mark Diltz and Liz Romanski
Topic areas: memory and cognition, correlates of behavior/perception, multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
During audiovisual communication, attention may focus on one modality or switch between modalities in order to extract time-varying information accurately. Previous studies investigating feature-selective attention indicate that neuronal activity is often increased to attended features of a stimulus, thereby enhancing reliability. Furthermore, behavioral studies of modality-selective attention indicate that subjects respond more quickly and more accurately when a target appears in the expected modality. To examine modality-specific attentional modulation in the primate prefrontal cortex, we compared responses to a compound naturalistic audiovisual stimulus under different contexts. We trained macaque monkeys to perform an audiovisual nonmatch-to-sample task (NMTS) in which a face-vocalization movie was presented, and repeated, until either the face or the vocalization component changed and the change was detected with a button press. This audiovisual NMTS task was run in a randomized context, where either the face or the vocalization mismatched from trial to trial, or in a single-modality block design, where each trial in a block was always a change in the vocalization component or always a change in the face component of the audiovisual movie. Recordings were made in one subject with a 64-channel implanted micro-array (Microprobes) in the ventrolateral prefrontal cortex (VLPFC). Analysis of neural activity during the decision period of the task indicated a significant difference in the mean firing rate of 24/48 cells (paired t-test, p < 0.05) during the single-modality block design compared to responses during the randomized context, indicating attentional modulation by modality context. In 78% of cells, mean firing rate was increased during the predictable single-modality block compared to the randomized context. While there was no significant difference in performance accuracy for this well-learned task, increased difficulty with a longer delay period and novel stimuli may elicit performance differences in random versus predictable contexts. Analyses focusing on information-theoretic measures and the time course of neuronal responses will assess modality-specific contributions to audiovisual memory and attention.
Keith Kaufman, Rebecca Krall, Megan Arnold, Stephen David and Ross Williamson
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The interplay between external inputs and internal states, such as arousal, shapes the neural basis of perception. Arousal states, reflected by pupil size, continuously influence the neocortex, impacting membrane potentials, cortical state, and sensory processing. We hypothesized that pupil-linked arousal states would have a diverse influence on the cortex, as stimulus encoding recruits an array of distinct cell types that span the cortical laminae (i.e., intratelencephalic (IT), extratelencephalic (ET), and corticothalamic (CT) cells). To test this, we employed two-photon calcium imaging to record neural activity from these populations within the auditory cortex (ACtx), in conjunction with pupillometry to monitor changes in pupil size. We first inspected the response strength to pure tone stimuli of layer (L) 2/3, L5 IT, ET, and CT cells. Increases in arousal enhanced activity in all subpopulations except L5 IT cells. Reliability analysis revealed that ET and CT activity was most reliable (least variable) when the pupil was large, while L2/3 and L5 IT cells were consistent across states. To compare excitability at high and low arousal states, we calculated a modulation index for every cell. Values of this index closer to -1 imply greater activity at low states relative to high states, whereas values closer to +1 indicate the opposite. L5 IT cells had the highest variability and lowest mean modulation index, whereas ET cells displayed the lowest variability and highest mean modulation index. To explore the effects of arousal on spontaneous and evoked activity, we employed a multivariate regression model to predict a cell’s activity on a given trial based on pupil diameter and its neural response (Schwartz et al., 2020). Our analysis revealed heterogeneity in how pupil size correlates with baseline and evoked activity, encompassing both suppression and enhancement effects. Notably, L5 IT cells demonstrated a more uniform distribution of pupil effects compared to other populations. We also presented broadband noise to examine shared trial-to-trial response variability and found that higher arousal states coincided with a decrease in shared variability in these cell types. This suggests that wakefulness alters correlated neural activity uniformly, highlighting the role of arousal in shaping functional connectivity. Collectively, our findings provide key insights into the intricate relationship between arousal and cell-type-specific activity during sensory coding.
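A modulation index bounded by -1 and +1, as described above, is commonly computed as a normalized contrast between responses at high- and low-arousal states. The exact definition used in this study is not given in the abstract; the sketch below shows one standard formulation, with hypothetical inputs, purely to make the bounds and their interpretation concrete.

```python
import numpy as np

def arousal_modulation_index(resp_high, resp_low):
    """Contrast index in [-1, +1]: +1 means activity only at high-arousal (large-pupil)
    states, -1 means activity only at low-arousal states, 0 means no difference.
    resp_high / resp_low: a cell's responses on high- / low-arousal trials."""
    high = np.mean(resp_high)
    low = np.mean(resp_low)
    return (high - low) / (high + low)

# Example: a cell roughly twice as active at high arousal yields an index near +1/3.
print(arousal_modulation_index([2.0, 2.2, 1.8], [1.0, 0.9, 1.1]))
```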
Tomas Suarez Omedas and Ross Williamson
Topic areas: hierarchical organization neural coding novel technologies
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Discriminating relevant auditory signals from background noise poses a fundamental challenge to sensory systems. In rodents, the auditory cortex (ACtx) is known to play a key role in disentangling auditory signals from background noise; however, the specific contributions of distinct cortical subpopulations remain elusive. We investigated how subsets of excitatory neurons in ACtx layers (L) 2/3 and 5 process sensory signals in the presence and absence of noise. We focused on intratelencephalic (IT) neurons in L2/3 and both IT and extratelencephalic (ET) neurons in L5, each subset characterized by distinct functional and anatomical properties. Our goal is to elucidate the functional mechanisms these subpopulations use to disentangle signals from noise. Using genetically modified mouse lines and viral techniques, we conducted in vivo two-photon calcium imaging of L2/3 and L5 (IT and ET) neurons while presenting pure tones in the presence or absence of broadband noise (50 dB SPL). At the single-neuron level, L2/3 neurons exhibited a reduction in responses relative to their preferred frequency; regression analysis showed this reduction to be both subtractive and divisive. In contrast, L5 IT and ET neurons exhibited similar responses in both conditions. We analyzed the information content of cortical communication using binary decoding and population dimensionality analyses. Population dimensionality analysis demonstrated that the number of dimensions used by L2/3 and L5 neurons for inter-neuronal communication remained unchanged across conditions. The dimensionality of L5 IT-ET communication likewise remained constant regardless of noise condition, suggesting that disentangling tones from noise does not necessitate increased neural complexity. Finally, we trained a support vector machine to detect, from neuronal activity, whether a tone was present or absent. Despite their differences at the single-neuron level, both L2/3 and L5 populations exhibited improved discrimination in the absence of noise, with higher d-prime values for similar tone intensities. Our findings suggest that tone responses at the single-neuron and population levels differ in the presence of noise. Within the cortical column, L2/3 representations of tones in noise likely undergo local computations to disentangle tones from noise. This dissociated information is then transmitted to L5, facilitating its broadcast to relevant cortical and subcortical structures.
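The population decoding step described above (a support vector machine classifying tone-present versus tone-absent trials, summarized as d-prime) can be sketched in a few lines. The snippet below is a generic illustration under assumed data shapes (hypothetical arrays X_clean, X_noise, y_clean, y_noise), not the authors' pipeline.

```python
import numpy as np
from scipy.stats import norm
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_predict

def detection_dprime(X, y):
    """X: trials x neurons response matrix; y: 1 = tone present, 0 = tone absent.
    Returns d-prime of a cross-validated linear SVM tone-detection decoder."""
    pred = cross_val_predict(LinearSVC(dual=False), X, y, cv=5)
    hit = np.mean(pred[y == 1] == 1)
    fa = np.mean(pred[y == 0] == 1)
    hit, fa = np.clip([hit, fa], 1e-3, 1 - 1e-3)  # avoid infinite z-scores at 0 or 1
    return norm.ppf(hit) - norm.ppf(fa)

# Hypothetical usage: compare decoding with and without background noise.
# d_clean = detection_dprime(X_clean, y_clean)
# d_noise = detection_dprime(X_noise, y_noise)
```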
Marina Silveira, Audrey Drotos, Trinity Pirrone, Trevor Versalle, Amanda Bock and Michael Roberts
Topic areas: subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Neuropeptides play key roles in shaping the organization, function, and computations of neuronal circuits. We recently found that Neuropeptide Y (NPY), a powerful neuromodulator in many brain regions, is expressed in the inferior colliculus (IC) by a distinct class of GABAergic neurons. The IC occupies a central position in the auditory system, integrating information from numerous auditory nuclei and making it an important hub for sound processing. In the IC, NPY neurons project locally and send long-range inhibitory projections outside the IC. Previous studies showed that most neurons in the IC have local axon collaterals, suggesting that the IC is rich in local circuits. However, the organization and function of local circuits in the IC remain largely unknown. We previously found that neurons in the IC can express the NPY Y1 receptor (Y1R+) and that application of the Y1R agonist [Leu31, Pro34]-NPY (LP-NPY) decreases the excitability of Y1R+ neurons. However, how Y1R+ neurons and NPY signaling shape local IC circuits is unknown. Here we found that Y1R+ neurons represent nearly 80% of glutamatergic neurons in the IC, providing extensive opportunities for NPY signaling to regulate excitation in local IC circuits. Next, to investigate how Y1R+ neurons and NPY signaling contribute to local IC circuits, we used optogenetics to activate local Y1R+ neurons while recording from other Y1R+ neurons as well as neurons that do not express the NPY Y1 receptor (Y1R- neurons). We found that Y1R+ neurons provide excitatory input to most other Y1R+ and Y1R- neurons in the IC and therefore form highly interconnected networks within local IC circuits. Additionally, Y1R+ neuron synapses exhibit moderate short-term synaptic plasticity, suggesting that local excitatory circuits maintain their influence over computations during sustained stimuli. We further found that application of LP-NPY decreases recurrent excitation in the IC, suggesting that NPY signaling strongly regulates local circuit function in the IC. Together, our data show that Y1R+ neurons are highly interconnected in the local IC and that their influence over local circuits is tightly regulated by NPY signaling.
Jennifer Lawlor, Sarah Elnozahy, Fangchen Zhu, Fengtong Du, Aaron Wang, Tara Raam and Kishore Kuchibhotla
Topic areas: memory and cognition correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
During sensorimotor learning, animals link a sensory cue with actions that are separated in time, using circuits distributed throughout the brain. Learning thus requires neural mechanisms that can operate across a wide spatiotemporal scale and promote learning-related plasticity. Neuromodulatory systems, with their broad projections and diverse timescales of activity, meet these criteria and have the potential to link various sensory and motor components. Yet the extent to which this proposed model of plasticity operates in real time during behavioral learning remains unknown. The acquisition of sensorimotor learning in a go/no-go task has been found to be faster and more stereotyped than previously thought (Kuchibhotla et al., 2019). We trained mice to respond to one tone for a water reward (S+) and withhold from responding to another (S-). We interleaved reinforced trials with trials in which reinforcement was absent ("probe"). Early in learning, animals discriminated between S+ and S- in probe but not reinforced trials. This unmasked a rapid acquisition phase of learning followed by a slower phase for reinforcement, termed 'expression'. What role does cholinergic neuromodulation play in task acquisition? Here, we test the hypothesis that cholinergic neuromodulation provides a 'teaching signal' that drives primary auditory cortex (A1) and links stimuli with reinforcement. We exploit our behavioral approach and combine it with longitudinal two-photon calcium imaging of cholinergic axons in A1 during discrimination learning. We report robust stimulus-evoked cholinergic activity to both S+ and S-, as well as stable licking-related activity throughout learning at the level of the axon segment. While this activity mildly habituates in a passive control, in behaving animals the S+ and S- stimulus-evoked activity is enhanced (S+: duration; S-: amplitude and duration) on the timescale of acquisition. Additionally, we test the hypothesis that cholinergic neuromodulation impacts the rate of task acquisition. We expressed ChR2 bilaterally in cholinergic neurons within the basal forebrain of ChAT-cre mice and activated these neurons on both S+ and S- trials throughout learning. Test animals acquired the task faster than control groups. These results suggest that phasic bursts of acetylcholine, projecting widely to cortical regions, directly impact the rate of discrimination learning.
Jiayue Liu, Josh Stohl and Tobias Overath
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
This study aimed to investigate whether EEG could capture the internal noise associated with cochlear synaptopathy in humans and whether it correlates with hearing difficulties. Cochlear synaptopathy has been the subject of numerous studies in the past decade, but the reaction of the auditory cortex to it remains unclear. A recent mouse study suggested that with 90% synapse loss, the auditory cortex exhibits hyper-synchronized neuronal activity (‘internal noise’) only in missed tone detection trials in noise, which could contribute to degraded behavioral performance in noisy listening conditions (Resnik & Polley, 2021). In this study, 30 participants with near-normal hearing performed a monaural tone detection task in either quiet or noise while their EEG was recorded. They also underwent tasks that have been suggested to reveal cochlear synaptopathy. The analysis aimed to determine whether single-trial EEG could predict behavior (hit vs miss) and whether such EEG prediction correlated with other indicators of cochlear synaptopathy. Ongoing EEG analyses suggest that pre-stimulus EEG activity does not predict behavioral outcomes. In contrast, significant prediction performance of post-stimulus EEG likely reflects the presence of the P300 component for hit trials, but not earlier auditory processing stages. This prediction performance was correlated with the Speech, Spatial and Quality of Hearing questionnaire, but not with any of the other measures, such as speech perception thresholds or extended-high-frequency audiometric thresholds. More data is needed to determine the effect robustly.
Xiaoxuan Guo, Ester Benzaquen and Tim Griffiths
Topic areas: memory and cognition hierarchical organization
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Auditory figure-ground has been developed as a measure of central sound grouping in complex auditory scenes. Previous studies have shown that extracting a static, temporally coherent figure of varying frequencies from a tone cloud can successfully predict real-life listening. In this study, over 100 participants with varying age and hearing sensitivity were recruited. We added a dynamic component to the auditory figure that followed the pitch contour of natural human speech and investigated how well it further predicted speech-in-noise ability at both the sentence and word level, together with different frequency ranges (high- or low-frequency figure-ground), static figure-ground, peripheral hearing, and age, using hierarchical regression and structural equation modelling. The results demonstrated improved predictive value of the dynamic figure-ground (standardised coefficient: -0.28) compared to the static one (nonsignificant), as well as a significant contribution of auditory figure-ground as a latent variable to speech-in-noise perception (0.52), an effect size comparable to that of the pure-tone audiogram (0.32). Overall, this study shows how peripheral hearing sensitivity, central sound grouping (of both static and dynamic sounds), and age interact with each other, and quantifies their respective contributions to speech-in-noise perception.
Rashi Monga, Celine Drieu and Kishore Kuchibhotla
Topic areas: memory and cognition
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
How are associations between stimuli, actions, and outcomes learned in goal-directed sensorimotor behaviors? Most models addressing this question assume that a 'trial' (an individual reinforced pairing) is a fundamental unit of learning. Here we tested the validity of this assumption by comparing learning performance in four groups of mice trained with different numbers of trials per day. We exploited non-reinforced (probe) trials to dissociate 'acquisition' of task contingencies (measured in probe trials) from behavioral 'expression' (measured in reinforced trials). We reasoned that if cumulative reinforced pairings ('practice') dominate these processes, then halving the number of reinforced trials per session should double the number of days mice take to reach expert performance. Water-restricted mice were trained on an auditory Go/No-Go task in which they learned to lick to an S+ to receive a water reward and to withhold licking to an S- to avoid a timeout. We evaluated the performance of mice receiving either 140 (n=6) or 70 (n=12) reinforced trials interleaved with 20 probe trials per daily session. Mice in the 140-trial cohort took on average 4.3±2.2 sessions to acquire, and 11.6±3.0 sessions to express, the task contingencies. Surprisingly, mice receiving 70 trials acquired and expressed the task at the same rate (5.0±1.4 and 13±4.9 sessions, respectively; p>0.05). These results suggest that a trial-based account fails to explain learning trajectories. To further test the limits of this effect, we reduced the number of reinforced trials per session to 35 (n=10). All mice acquired the task contingencies, but in more sessions (7.8±1.9, p=0.002). Interestingly, these mice rarely expressed the contingencies even after 50 sessions (3/10 reached expert performance), and those that did took longer than all other cohorts (33.3±7.6, F(3, 22)=32.28, p
Christine Junhui Liu, Carolyn Sweeney, Cathryn Macgregor, Lucas Vattino, Kasey Smith and Anne Takesian
Topic areas: correlates of behavior/perception thalamocortical circuitry/function
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Identifying neural targets that control central auditory plasticity will have far-reaching impact, offering potential ways to restructure neural circuitry. Work from our lab and others has demonstrated that a group of cortical GABAergic neurons expressing vasoactive intestinal peptide (VIP) is important for sensory plasticity. Although many studies have leveraged the expression of VIP to genetically target this specific interneuron population, few have evaluated the function of the non-classical signaling molecule VIP itself in sensory processing and plasticity. We used a GPCR-Activation-Based (GRAB) peptide sensor and in vivo fiber photometry to study the release of VIP in mouse auditory cortex (ACtx) during passive sound presentation and associative auditory learning. Sound stimuli elicited VIP sensor responses in ACtx in a subset of mice. Furthermore, as mice learned to associate specific sounds with reward, VIP release was modulated by the behavioral relevance of the sound stimuli. These experiments represent the first effort to study the in vivo release of VIP within sensory cortex. Parallel experiments are evaluating the postsynaptic effects of VIP release in ACtx. We first used in situ hybridization to quantify the expression levels of mRNA encoding the VIP receptor 1, Vipr1, across ACtx. Consistent with previous studies in other sensory cortices, we found that Vipr1 is expressed within 77% of excitatory pyramidal cells, marked by expression of vesicular glutamate transporter 1 (Slc17a7), and within 18% of GABAergic neurons, marked by expression of the GABA-synthesizing enzymes Gad1 and Gad2. Ongoing studies are using in vitro electrophysiology to determine the functional postsynaptic effects of VIP receptor 1 activation. Together, these results may elucidate the effects of VIP within auditory cortical circuits, laying the necessary foundation for future loss- and gain-of-function experiments to evaluate the function of VIP release in auditory perception and learning.
Chenggang Chen, Xindong Song, Yueqi Guo and Xiaoqin Wang
Topic areas: multisensory processes neural coding novel technologies
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Despite four decades of research, the nature of the neural representation of sound location in the auditory cortex remains unclear. Previous studies have failed to identify any maps or patches of spatial representation in the mammalian auditory cortex (Middlebrooks and Pettigrew, 1981, J Neurosci; Middlebrooks, 2021, J Neurosci). A prevailing hypothesis of cortical spatial processing is distributed population coding, supported by evidence that neurons respond broadly to sound locations in the contralateral hemifield (Ortiz-Rios et al., 2017, Neuron; van der Heijden et al., 2019, Nat Rev Neurosci). However, electrophysiology and fMRI methods have limited spatial resolution for evaluating the cortical representation of sound locations. In the present study, we took advantage of the flat brain of the marmoset, a highly vocal New World monkey, and used wide-field calcium imaging to investigate the neural representations of sound location in the auditory cortex and the neighboring multisensory region (medial superior temporal area, MST) in the awake condition. Most cortical areas preferred contralateral sound locations, but regions tuned to frontal and ipsilateral locations formed five to ten patches. We found these patches in both primary and non-primary, rostral and caudal auditory cortex. Next, we investigated whether the spatial tuning of patches depends on interaural time and level difference (ITD and ILD) cues. Patches that preferred low frequencies were ITD-dependent, whereas patches that preferred high frequencies were cue-independent. Patches identified with spatial and binaural stimuli were relatively stable across sound levels. Furthermore, the neighboring multisensory area MST showed weak sound-driven responses and was not topographically organized by sound frequency. Surprisingly, MST was organized topographically by sound location, ranging from far-contralateral to frontal locations. Finally, a horizontal visual stimulus evoked strong responses in MST but not in the auditory cortex, and we also identified a retinotopic map in MST. Notably, the auditory and visual spatial maps in MST largely overlapped. In summary, we found that auditory space is represented in the cortex by both patches and maps.
Samuel Norman-Haignere, Menoua Keshishian, Orrin Devinsky, Werner Doyle, Guy McKhann, Catherine Schevon, Adeen Flinker and Nima Mesgarani
Topic areas: speech and language correlates of behavior/perception hierarchical organization neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The auditory system must integrate across many different temporal scales to derive meaning from complex natural sounds such as speech and music. A key challenge is that sound structures – such as phonemes, syllables, and words in speech – have highly variable durations. As a consequence, there is a fundamental difference between integrating across absolute time (e.g., a 100-millisecond window) vs. integrating across sound structure (e.g., a phoneme or word). Auditory models have typically assumed time-yoked integration, while cognitive models have often assumed structure-yoked integration, which implies that the integration time should scale with structure duration. Little empirical work has directly tested these important and divergent assumptions, in part due to the difficulty of measuring integration windows from nonlinear systems like the brain and the poor spatiotemporal resolution of noninvasive neuroimaging methods. To address this question, we measured neural integration windows for time-stretched and compressed speech (preserving pitch) using a novel method for estimating integration windows from nonlinear systems (the temporal context invariance paradigm) applied to spatiotemporally precise intracranial recordings from human neurosurgical patients. Stretching and compression rescale the duration of all sound structures and should thus scale the integration window if it is yoked to structure but not time. Across the auditory cortex, we observed significantly longer integration windows for stretched vs. compressed speech, demonstrating the existence of structure-yoked integration in the human auditory cortex. However, this effect was small relative to the difference in structure durations, even in non-primary regions of the superior temporal gyrus with long integration windows ( >200 milliseconds) that have been implicated in speech-specific processing. These findings suggest that the human auditory cortex encodes sound structure using integration windows that are mainly yoked to absolute time and weakly yoked to structure duration, presenting a challenge for existing models that assume purely time-yoked or structure-yoked integration.
Yang Zhang, Xingdong Song, Yueqi Guo, Chenggang Chen and Xiaoqin Wang
Topic areas: speech and language neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Species-specific vocalizations are behaviorally critical sounds; recognizing them is important for the survival and social interactions of vocal animals. In humans, a voice patch system selective for human voices has been identified on the lateral superior temporal gyrus (STG). In non-human primates, vocalization-selective regions have been found on the rostral portion of the temporal lobe, outside the auditory cortex, in both macaques and marmosets, using functional magnetic resonance imaging (fMRI). It remains unclear whether vocalizations are uniquely processed within the auditory cortex. Using wide-field calcium imaging, a technique with both high temporal and high spatial resolution, we discovered two voice patches in marmoset auditory cortex that prefer species-specific vocalizations over other vocalizations and sounds. One patch is located in posterior A1 (primary auditory cortex), and the other in anterior non-core regions. These voice patches are hierarchically organized based on latency and selectivity analyses. In addition, call type and identity information are carried by population responses from the voice patches. Furthermore, we found that the voice patches are functionally connected. Overall, these results reveal the existence of voice patches in the auditory cortex of marmosets and support the notion that, across primate species, similar cortical architectures are adapted for recognizing communication signals, whether vocalizations or faces.
Hiroaki Tsukano, Michellee M Garcia, Pranathi R Dandu, Collin M Graves and Hiroyuki K Kato
Topic areas: memory and cognition correlates of behavior/perception hierarchical organization
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Our brains continuously compare incoming sensory inputs against predictions based on previous experiences, assigning less salience to predictable stimuli. While sensory habituation to repetitive stimuli is considered the simplest manifestation of this predictive coding, the circuit mechanisms underlying long-term habituation over days remain unclear. We previously reported that daily passive sound exposure attenuates neural responses in the mouse primary auditory cortex (A1), a plasticity mediated by local inhibition from somatostatin-expressing neurons (SST neurons). In the current study, we further explored the source of top-down predictive signals regulating SST neurons to create a “negative image” that cancels out sound input. Retrograde tracing demonstrated that A1 SST neurons receive projections from frontal cortical areas, most prominently from the orbitofrontal cortex (OFC). Two-photon calcium imaging of OFC axon terminals in A1 revealed enhanced top-down input following daily passive tone exposure, supporting its role in encoding predictive signals. Muscimol infusion into the OFC reversed the pre-established habituation in A1 by enhancing pyramidal neuron activity while suppressing SST neuron activity. Importantly, this effect was absent in naïve animals, highlighting the specific involvement of the OFC in experience-dependent predictive filtering. We also found that the deletion of NMDA receptors in SST neurons reduces habituation, pointing to the role of synaptic plasticity at inputs onto SST neurons. Together, our findings suggest that the predictive filtering of sensory activity is realized by a global circuit mechanism recruiting the frontal top-down inputs.
Cathryn MacGregor, Lucas Vattino, Christine Junhui Liu, Carolyn Sweeney and Anne Takesian
Topic areas: thalamocortical circuitry/function
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
A sparse population of inhibitory interneurons within Layer 1 (L1) of the auditory cortex (ACtx) plays a critical role in auditory plasticity and learning. However, the map of direct inputs onto this population remains incomplete. The ACtx integrates inputs from diverse brain regions, including the medial geniculate body (MGB), the auditory thalamus, which is composed of distinct nuclei that relay both sensory and non-sensory information. Projections from MGB to ACtx primarily target layer 4 (L4), but also extend to cortical L1. Here, we examined the functional and anatomical connections from MGB to two major subtypes of L1 interneurons expressing either vasoactive intestinal peptide (VIP) or neuron-derived neurotrophic factor (NDNF). Using whole-cell electrophysiology recordings during optogenetic activation of MGB afferents, we show that both L1 VIP and NDNF interneurons receive monosynaptic MGB inputs. Furthermore, our results indicate that the MGB-evoked excitatory postsynaptic current (EPSC) amplitudes onto L1 VIP and NDNF interneurons are comparable to those recorded onto L4 excitatory pyramidal neurons, which are known to receive robust MGB input. To examine the distribution of presynaptic neurons amongst the distinct MGB nuclei that provide these monosynaptic inputs to L1 VIP and NDNF interneurons, we employed monosynaptic rabies tracing. The MGB can be subdivided into three major subregions: the primary ventral division (MGBv), and the higher-order medial (MGBm) and dorsal divisions (MGBd), which integrate multimodal sensory and non-sensory information. We found that both L1 VIP and NDNF interneuron populations receive inputs from all three MGB subregions. However, presynaptic neurons that synapse onto VIP interneurons predominantly arise from the MGBv, whereas neurons synapsing onto NDNF interneurons are more broadly distributed across the primary and higher-order MGB nuclei. These results show that both VIP and NDNF interneurons in superficial ACtx receive monosynaptic inputs from the MGB, but are differentially targeted by distinct nuclei. Together, these findings provide insight into the distinct thalamic inputs that synapse onto L1 interneuron subtypes and motivate future studies that will examine how these subtypes are differentially recruited in vivo during auditory perception and learning.
Alex Boyd, Virginia Best and Kamal Sen
Topic areas: auditory disorders speech and language neural coding novel technologies
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Listening in acoustically cluttered scenarios remains a difficult task for both humans and machines. For listeners with hearing loss, this difficulty is often extreme and can seriously impede communication in noisy everyday situations. It has long been recognized that spatial filtering can alleviate these difficulties, and many hearing devices now incorporate directional microphones or beamforming technology. Here we present a biologically inspired algorithm designed to isolate sounds based on spatial location, and consider its potential utility in hearing devices. The algorithm is based on a hierarchical network model of the auditory system, in which binaural sound inputs drive populations of neurons tuned to specific spatial locations and frequencies, and the spiking responses of neurons in the output layer are then reconstructed into audible waveforms. The algorithm has sharp spatial tuning, can be flexibly configured, and is well-suited to low-power real-time applications. We previously evaluated the algorithm in normal hearing listeners, by measuring speech intelligibility in a challenging mixture of five competing talkers, and found benefits similar to those provided by a multi-microphone beamforming array. In our current work we are extending this evaluation to listeners with sensorineural hearing loss and comparing the performance of the algorithm to current standard hearing-aid beamforming approaches. We will present those results and discuss the advantages of biologically inspired processing for hearing devices more broadly.
I.M Dushyanthi Karunathilake, Joshua P. Kulasingham and Jonathan Simon
Topic areas: speech and language correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and, ultimately, meaning. However, it remains unclear which aspects of the corresponding neural responses reflect speech intelligibility, which is only loosely coupled to the acoustics. Intelligibility-related neural markers derived from such responses would play a crucial role in advancing our understanding of the neurophysiology of speech understanding, in evaluating auditory function across diverse clinical populations, and in assessing hearing devices. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, making it difficult to cleanly distinguish effects of intelligibility from the underlying acoustic confounds. In this study, speech intelligibility is manipulated while keeping the acoustic structure unchanged, using degraded speech plus a priming paradigm. Acoustically identical three-band noise-vocoded (degraded) speech segments (~20 s duration) are presented twice, but the second presentation is preceded by the original (non-degraded) version of the same speech segment. This priming, which generates a 'pop-out' percept, substantially improves the intelligibility of the second presentation of the degraded speech passage while keeping the acoustics identical. We recorded magnetoencephalography (MEG) data from 25 younger adults and investigated how intelligibility affects auditory and linguistic neural tracking measures using multivariate temporal response functions (mTRFs). As expected, behavioral results confirmed that perceived speech clarity is improved by priming. mTRF analysis revealed that auditory (speech envelope and envelope onset) and phoneme-onset neural responses are influenced only by the acoustics of the sensory input (bottom-up driven mechanisms). Critically, our key findings suggest that neural measures associated with the segmentation of sounds into words emerge with better speech intelligibility, especially those time-locked at N400-like latencies in prefrontal cortex (PFC), in line with the engagement of top-down mechanisms associated with priming. Taken together, our results suggest that time-locked neural responses associated with lexical segmentation may serve as novel objective measures of speech intelligibility.
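For orientation, a temporal response function maps time-lagged stimulus features (e.g., the speech envelope, envelope onsets, or word onsets) onto the neural recording via regularized linear regression. The sketch below is a generic ridge-regression TRF estimator with hypothetical inputs; the study itself used multivariate TRFs, for which dedicated toolboxes exist, so this is only a conceptual illustration.

```python
import numpy as np
from numpy.linalg import solve

def estimate_trf(stimulus, response, fs, tmin=-0.1, tmax=0.5, lam=1e2):
    """Estimate a temporal response function (TRF) by ridge regression.
    stimulus: 1-D feature time series (e.g., speech envelope); response: 1-D neural
    time series at the same sampling rate fs (Hz). Returns lags (s) and TRF weights."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    # Design matrix of time-lagged stimulus copies (circular shifts; edge effects
    # are ignored in this sketch).
    X = np.column_stack([np.roll(stimulus, lag) for lag in lags])
    XtX = X.T @ X + lam * np.eye(len(lags))      # ridge-regularized covariance
    w = solve(XtX, X.T @ response)
    return lags / fs, w

# Hypothetical usage with a 100 Hz envelope and one MEG sensor time series:
# lag_s, trf = estimate_trf(envelope, meg_channel, fs=100)
```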
Jarrod Hicks and Josh McDermott
Topic areas: correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Human hearing is largely robust to noise, but the basis of this robustness is poorly understood. We explored whether internal models of real-world noise aid detection and recognition of foreground sounds in noise. One prediction of this hypothesis is that hearing should improve with exposure to a noise source, since noise properties can be better estimated with more samples. Consistent with this idea, we found that detection and recognition in background noise improved with exposure to the background. An observer model designed to detect outliers from a distribution of background noise accounted for this pattern of human behavioral performance. Detection performance was enhanced for recurring backgrounds and was robust to interruptions in the background, suggesting listeners build up and maintain representations of noise properties over time. The results suggest noise robustness is supported by internal models—“noise schemas”—that capture the structure of noise and facilitate the estimation of other concurrent sounds.
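To make the observer-model idea concrete, the toy sketch below flags foreground sounds as statistical outliers from a distribution fitted to background-noise spectrogram frames. It is a simplified stand-in (a per-channel Gaussian "schema" with hypothetical inputs), not the authors' model, but it illustrates why more exposure to the background yields better noise estimates and hence better detection.

```python
import numpy as np

def detect_foreground(noise_frames, mixture_frames, criterion=3.0):
    """Toy 'noise schema' observer: fit a Gaussian to background-noise spectrogram
    frames, then flag mixture frames that deviate from that distribution.
    Frames have shape (time, frequency); returns a boolean per mixture time frame."""
    mu = noise_frames.mean(axis=0)
    sd = noise_frames.std(axis=0) + 1e-6
    z = (mixture_frames - mu) / sd                # per-channel deviation from the schema
    outlier_score = np.sqrt((z ** 2).mean(axis=1))
    return outlier_score > criterion              # True where a foreground sound is likely

# With more noise-only frames, mu and sd are estimated more precisely, so detection
# improves, mirroring the behavioral benefit of background exposure described above.
```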
Abbey Manning, Philip T. R. Bender, Helen Boyd-Pratt, Martin Hruska and Charles T. Anderson
Topic areas: correlates of behavior/perception thalamocortical circuitry/function
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Mutations in the gene encoding Src homology 3 and multiple ankyrin repeat domains protein 3 (Shank3) result in altered synaptic function and morphology. Shank3 is a synaptic scaffolding protein that assists in tethering and organizing proteins and glutamatergic receptors in the postsynaptic density of excitatory synapses, thereby supporting normal synaptic function. The localization of Shank3 to excitatory synapses and the formation of stable Shank3 sheets are regulated by the binding of zinc to the C-terminal sterile-alpha-motif (SAM) domain of Shank3. Disruption of zinc at synapses enriched with Shank3 leads to a loss of postsynaptic proteins important for synaptic transmission, suggesting that zinc supports the localization of postsynaptic proteins via Shank3. The brain is highly enriched with free zinc inside glutamatergic vesicles at presynaptic terminals. Zinc transporter 3 (ZnT3) moves zinc into vesicles, where it is co-released with glutamate. Alterations in ZnT3 are implicated in multiple neurodevelopmental disorders, and ZnT3 knock-out (KO) mice – which lack synaptic zinc – show behavioral deficits associated with autism spectrum disorder and schizophrenia. Here we show that ZnT3 KO mice have smaller dendritic spines and smaller miniature excitatory postsynaptic current amplitudes than WT mice. Additionally, spines containing Shank3 are smaller in ZnT3 KO mice than in WT mice, and synapses containing both Shank3 and ZnT3 have larger spines in WT mice than synapses containing Shank3 alone. Together, these findings suggest a mechanism whereby presynaptic zinc supports normal postsynaptic structure and function via Shank3.
Estelle In T Zandt and Dan Sanes
Topic areas: correlates of behavior/perception neural coding neuroethology/communication
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Throughout development, we are exposed to a range of natural sounds, including vocalizations, that gain meaning through experience. In fact, neural remodeling remains sensitive to the environment throughout adolescence, a time during which many behavioral skills remain immature. An abundance of behavioral work shows that the natural acoustic environment, including speech sounds, has a long-term impact on perceptual skills later in life. However, at the neural level, developmental remodeling of auditory cortex (AC) has largely considered how early juvenile exposure to non-natural stimuli (i.e. tones, white noise) can permanently modify sound coding properties. It thus remains unknown whether vocalization encoding continues to mature through adolescence following early experience with the full vocalization repertoire. Here, we investigate the development of AC responses to vocalization sequences in awake, freely-moving adolescent and sexually mature Mongolian gerbils (Meriones unguiculatus). Gerbils are a highly social rodent species with a rich vocal repertoire and a prolonged adolescent period. We used chronically-implanted high density silicon probes to wirelessly record single neuron responses in the same animals across weeks of adolescence or adulthood. Initial analyses have focused on AC neuron firing rate, trial-to-trial variance, and dynamic range of response to vocalization sequences. Cross-sectional comparisons of adult and adolescent responses suggest that the percent of the AC single neurons that are modulated by vocalizations increases through late adolescence. Similarly, our preliminary analyses suggest that the AC neuron dynamic range also increases during this interval. These results suggest that AC responses to natural calls do continue to mature long after the canonical early critical period has closed. Our current analyses are focusing on whether specific vocalizations demonstrate unique developmental trajectories, and how neural responses to specific syllables in the vocalization sequence develop. The results of these experiments will provide insight into how responses to natural, behaviorally-relevant sounds mature in the cortex.
Rose Ying, Daniel Stolzberg and Melissa Caras
Topic areas: correlates of behavior/perception subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Sensory perception is highly dynamic, capable of both rapid context-dependent shifts and slower changes that emerge over time with extended training. Previous research has shown that these perceptual fluctuations are driven by corresponding changes in the sensitivity of auditory cortical neurons to sound. However, it is unclear whether these changes emerge in the ascending auditory pathway and are inherited by the auditory cortex, or arise in the cortex de novo. As a first step towards answering this question, we implanted Mongolian gerbils with chronic microelectrode arrays in either the central nucleus of the inferior colliculus (CIC) or the ventral medial geniculate nucleus (vMGN). We recorded single- and multi-unit activity as animals trained and improved on an aversive go/no-go amplitude modulation (AM) detection task, and during passive exposure to the same AM sounds. AM-evoked firing rates and vector strengths were calculated and transformed into the signal detection metric d’. Neural thresholds were obtained for each training day by fitting d’ values across AM depths and determining the depth at which d’ = 1. Thresholds were compared between periods of task performance and passive sound exposure across several days of training to determine whether there were learning-related and/or context-dependent changes in activity. Neural thresholds obtained from CIC and vMGN during both task performance and passive sound exposure improved across days of training, suggesting that both regions display learning-related plasticity independent of context. While both regions also exhibited a context-dependent change in coding strategy, such that AM stimuli were better encoded by vector strength during passive exposure and by firing rate during task performance, only the vMGN exhibited lower (better) AM thresholds during the task context. These findings raise the possibility that extended perceptual training improves neural sensitivity by acting at or below the level of the auditory midbrain, whereas rapid, context-dependent sensitivity enhancements first emerge in the auditory thalamus. Our results contribute to a deeper understanding of the circuits supporting perceptual flexibility, and may ultimately inform strategies for improving sound perception in hearing-impaired individuals.
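The neural-threshold procedure described above (compute d’ per AM depth, then find the depth at which d’ crosses 1) can be illustrated with a minimal sketch. The d’ formula below is the standard pooled-variance definition and the interpolation is one simple way to extract the criterion crossing; the variable names and fitting choice are assumptions, not the authors' exact analysis.

```python
import numpy as np

def neural_dprime(resp_am, resp_unmod):
    """d' between responses (e.g., firing rates) to one AM depth vs. unmodulated noise."""
    pooled_sd = np.sqrt(0.5 * (np.var(resp_am) + np.var(resp_unmod)))
    return (np.mean(resp_am) - np.mean(resp_unmod)) / pooled_sd

def neural_threshold(depths_db, dprimes, criterion=1.0):
    """AM depth at which d' reaches the criterion, by linear interpolation.
    Assumes d' grows monotonically with depth and depths_db is sorted ascending."""
    return float(np.interp(criterion, dprimes, depths_db))

# Hypothetical usage for one unit on one training day:
# dps = [neural_dprime(rates_by_depth[d], rates_unmod) for d in depths_db]
# threshold = neural_threshold(depths_db, dps)
```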
Chen Lu and Jennifer Linden
Topic areas: neural coding thalamocortical circuitry/function
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Hearing impairment has been identified as a longitudinal risk factor for schizophrenia in the general population (Linszen et al. Neurosci Biobehav Rev 2016). The 22q11.2 chromosomal microdeletion is one of the strongest known genetic risk factors for schizophrenia; ~30% of carriers develop schizophrenia in adulthood (Schneider et al. Am J Psychiatry 2014). Up to 60% of 22q11.2 deletion carriers have mild to moderate hearing impairment, primarily from chronic middle-ear problems that typically emerge in childhood and persist in adulthood (Verheij et al. Clin Otolaryngol 2017). The question of whether hearing impairment increases psychosis risk in patients with 22q11.2 Deletion Syndrome (22q11.2DS) has never been systematically addressed. Here we used the Df1/+ mouse model of 22q11.2DS to investigate how genetic risk for schizophrenia and experience of hearing impairment might interact to affect brain function. The Df1/+ mouse replicates the large inter-individual variation in hearing ability observed among 22q11.2DS patients and also exhibits auditory brain abnormalities consistent with disrupted cortical excitation/inhibition balance (Zinnamon et al. 2022 Biol Psychiatry Global Open Sci). We measured peripheral hearing sensitivity and cortical auditory evoked potentials (AEPs) in 29 Df1/+ mice and 22 WT littermates, exploiting the large inter-individual variation in hearing ability among Df1/+ mice to distinguish cortical effects of genetic background from those of experience with hearing impairment. To quantify alterations in cortical gain and adaptation, we analysed the growth of tone-evoked AEPs as loudness or inter-tone interval duration increased. These AEP growth measures were abnormal in Df1/+ mice with normal hearing, but were also affected by hearing impairment. Our results suggest that auditory cortical abnormalities in 22q11.2DS may depend not only on the genetic deletion but also on experience of hearing impairment. In ongoing work, we are investigating the abnormalities in auditory cortical function at the single-neuron and neuronal population levels as well as in cortical evoked potentials.
Sahil Luthra, Raha Razin, Chisom Obasih, Adam Tierney, Fred Dick and Lori Holt
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Theoretical accounts suggest that auditory categorization relies on selective attention to diagnostic auditory dimensions and that listeners may also suppress non-diagnostic dimensions. The current study leverages fMRI to examine cortical activation when categorization depends on diagnostic information conveyed in particular frequency bands. Prior to scanning, adults complete a five-day training regime in which they learn to categorize four novel nonspeech auditory categories. Each stimulus consists of three consecutive high-bandpass-filtered hums (~1000-9000 Hz), presented simultaneously with three low-bandpass-filtered hums (~80-750 Hz), where the hums are nonspeech fundamental frequency contours taken from Mandarin words varying in lexical tone; stimuli are derived from productions by multiple talkers, making tone patterns acoustically variable. Critically, in a category-diagnostic frequency band, the three hums follow a single tone pattern; in the other (non-diagnostic) frequency band, all hums have different tone patterns. Thus, categorization depends on recognizing acoustically variable hum patterns within a category-diagnostic frequency band; for two categories, the category-diagnostic band is high, and for the others, the category-diagnostic band is low. We test the hypothesis that successful categorization requires directing attention to a category-diagnostic spectral band and potentially suppressing the non-diagnostic band. We do so by comparing the activation evoked during auditory categorization within different tonotopically mapped regions as well as “attention-o-tonotopic” maps driven by explicitly cued attention to high or low spectral bands (e.g., “listen high”). We hypothesize that successful categorization will be linked to enhanced recruitment of cortical regions that prefer the diagnostic frequency band; baseline activity is indexed in a control task with identical stimuli but where categories are defined by stimulus amplitude. We further hypothesize that categorization involves suppression, indexed as reduced recruitment of cortical regions that prefer the non-diagnostic frequency band (relative to baseline activity). Preliminary results suggest that auditory categorization may drive selective attention to category-diagnostic dimensions.
Ekaterina Yukhnovich, Kai Alter and William Sedley
Topic areas: auditory disorders correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The search for a biomarker that would form part of a ‘final common pathway’ for various tinnitus subgroups, irrespective of specific contributory mechanisms, is ongoing (Adjamian et al, 2006; Eggermont & Roberts, 2004; Mohan et al, 2022). Through an experiment in which a roving paradigm was used, it was proposed that Intensity Mismatch Asymmetry, based on the predictive coding model of tinnitus, may be such a biomarker in humans (Sedley et al, 2016; Sedley et al, 2019). The roving paradigm involved two types of standard intensity stimuli, with deviants defined as pseudo-random transitions between one standard type and the other (Sedley et al, 2019). Specifically, a high intensity (loud) standard was alternated with a quieter (downward) deviant, while a low intensity (quiet) standard was alternated with a louder (upward) deviant. It was found that participants with tinnitus had larger MMN responses to upward deviants, but smaller MMN responses to downward deviants, compared to the control group. Further work showed distinct patterns of changes depending on stimulus frequency (tinnitus frequency vs 1 kHz control) and distinct correlates of tinnitus and hyperacusis (accepted for publication). Participants watched a subtitled movie while listening to the stimuli as a passive task. The present study investigates whether attentive state affects the MMN responses in either tinnitus or control groups. While MMN is argued to be pre-attentive, the evidence for this has been varied; the specific type of deviant may influence how the MMN amplitudes change during various attentive states (Alho et al, 1994; Fitzgerald & Todd, 2020; Naatanen et al, 1993; Woldorff et al, 1991). Methods: Tinnitus and control groups of participants were recruited (N=25 each), matched in hearing profiles, age, and gender. The stimuli were presented at the tinnitus frequency of each individual with tinnitus, with their control counterpart hearing the same frequency. While listening to the sounds for 1 hour, the participants randomly engaged in 3 types of tasks: 1) visual attention task, 2) auditory attention task, and 3) passive attention task (subtitled movie). The findings, currently under analysis, may inform the way attentive state is controlled in similar studies in the future. They will also show the extent to which attention is a mediator of the effects of tinnitus on MMN responses.
Veronica Tarka, Quentin Gaucher and Kerry Walker
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Pitch is our perception of the tonal quality of sound. It is the basis of musical melody and plays a key role in communication and sound segmentation. Pitch is perceived at a single fundamental frequency (F0), which can be derived for a complex sound from the regular spacing of harmonics in the frequency domain, the repetition rate of the sound’s waveform in time, or a combination of these features. Studies in marmosets have described specialized neurons located within a “pitch centre” in auditory cortex that encode F0 invariantly to other spectral changes, but there has not been clear evidence for such specialized pitch neurons in other species. We performed Neuropixels recordings of single neurons in the auditory cortex of 4 anaesthetized ferrets while presenting a variety of pitch-evoking sounds across a range of F0s (0.25-4 kHz). We found that some neurons derived F0 exclusively from resolved harmonics, while others derived it from temporal periodicity. A further subset of neurons encoded F0 invariantly across both classes of pitch cues, which may be the first evidence for specialized “pitch neurons” in non-primates. These neurons were not confined to a localized pitch centre, as in the marmoset, but were instead distributed throughout primary auditory cortex (A1 and AAF). F0 tuning in these neurons was robust across many complex sounds but usually failed to extend to the frequency of pure tones. This suggests that some auditory cortical neurons may be specialized to represent the pitch of complex sounds, and that these may differ from neurons that represent the frequency of pure tones.
Vinay Raghavan, James O'Sullivan, Stephan Bickel, Ashesh Mehta and Nima Mesgarani
Topic areas: speech and language correlates of behavior/perception novel technologies
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Objective. People suffering from hearing impairments often struggle to follow a conversation in a multitalker environment. Current hearing aids can suppress background noise; however, little can be done to help a user attend to a single conversation amongst many without first knowing which speaker a user is attending to. Cognitively-controlled hearing aids have been proposed using auditory attention decoding (AAD) methods; however, these methods have not been able to meet the demands of conversational speech or handle instances of divided attention or inattention. Here, we propose a novel framework that directly classifies auditory event-related potentials (ERPs) to glimpsed and masked speech events to determine whether the source of the event was attended. Approach. We present a system that (1) identifies auditory events using the local maxima in the envelope rate of change, (2) assesses the temporal masking of auditory events relative to competing speakers, and (3) utilizes masking-specific ERP classifiers to determine if the source of the event was attended. Main results. Using invasive electrophysiological recordings, we showed that ERPs from recording sites in auditory cortex can effectively decode the attention of subjects. This method of AAD provides higher accuracy, shorter switch times, and more stable decoding results compared with traditional CCA-based methods, permitting the quick and accurate detection of changes in a listener’s attentional focus. Significance. Our framework extends the scope of AAD algorithms by introducing a linear, direct-classification method for determining a listener’s attentional focus that leverages the latest research in multitalker speech perception. This work moves us closer to the development of effective and intuitive cognitively-controlled hearing assistive devices.
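Step (1) of the framework above identifies auditory events as local maxima in the rate of change of the speech envelope. The sketch below is one plausible realization of that step (Hilbert envelope, light smoothing, peak picking); the smoothing window, minimum peak separation, and function names are illustrative assumptions rather than the authors' parameters.

```python
import numpy as np
from scipy.signal import hilbert, find_peaks

def detect_envelope_events(audio, fs, smooth_ms=10, min_separation_ms=50):
    """Find candidate auditory events as local maxima in the rate of change of the
    amplitude envelope of one talker's waveform (sampled at fs Hz)."""
    envelope = np.abs(hilbert(audio))
    # Smooth the envelope with a short moving average before differentiating.
    win = max(1, int(fs * smooth_ms / 1000))
    envelope = np.convolve(envelope, np.ones(win) / win, mode="same")
    rate_of_change = np.gradient(envelope) * fs   # envelope slope in units per second
    peaks, _ = find_peaks(rate_of_change, distance=int(fs * min_separation_ms / 1000))
    return peaks / fs                             # event times in seconds

# Hypothetical usage: event times computed per talker could then be compared across
# talkers to label each event as glimpsed (unmasked) or masked before ERP classification.
```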
Tiange Hou, Blake Sidleck, Jack Toth, Olivia Lombardi, Priya Agarwal, Danyall Saeed, Dylan Leonard, Luz Andrino, Abraham Eldo, Madelyn Kerlin and Michele Insanally
Topic areas: memory and cognition correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The ability to flexibly respond to sensory cues in dynamic environments is essential to adaptive auditory-guided behaviors such as navigation and communication. How do neural circuits flexibly gate sensory information to select appropriate behavioral strategies based on sensory input and context? Auditory neural responses during behavior are diverse, ranging from highly reliable ‘classical’ responses (i.e., robust, frequency-tuned cells) to irregular or seemingly random ‘non-classically responsive’ firing patterns (i.e., nominally non-responsive cells) that fail to show any significant trial-averaged response to sensory inputs or other behavioral factors. While classically responsive cells have been extensively studied for decades, the contribution of non-classically responsive cells to behavior has remained underexplored despite their prevalence. Recent work has shown that non-classically responsive cells in auditory cortex (AC) and secondary motor cortex (M2) contain significant stimulus and choice information and encode flexible task rules. While it has been shown that both classically and non-classically responsive units are essential for asymptotic task performance, their role during learning is unknown. Here, we explore how diverse cortical responses emerge and evolve during flexible behavior. We recorded single-unit responses from AC while mice performed a reversal learning task. Cortical response profiles during learning were highly heterogeneous, spanning the continuum from classically to non-classically responsive. Strikingly, we found that the proportion of task-encoding non-classically responsive neurons significantly increased during late learning, when the largest behavioral improvements occur, demonstrating that non-classically responsive neurons are preferentially recruited during learning. To identify the role of top-down feedback on AC circuits during key learning phases, we optogenetically silenced M2→AC projection neurons while recording AC spiking responses. Remarkably, silencing M2 inputs selectively modulated non-classically responsive cells and impaired behavioral performance during post-reversal learning. Our findings demonstrate that task-encoding non-classically responsive cells are preferentially recruited during learning by top-down inputs, enabling neural and behavioral flexibility.
Noemi Gonçalves, Alexa Buck, Carolina Campos de Pina, Typhaine Dupont, Nicolas Michalski, Luc Arnal and Boris Gourevitch
Topic areas: auditory disorders cross-species comparisons neural coding subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Brain activity synchronizes with sensory input rhythms through neural entrainment. This seems to be particularly important for auditory perception and can even be enhanced when the stimulus amplitude modulation rate corresponds to the time constant of activated neural circuits. For nearly half a century we have known that human EEG and MEG responses in the auditory system are enhanced at a stimulation rate of 40Hz, a phenomenon dubbed the 40Hz auditory steady-state response (ASSR) in the literature. The underlying mechanisms remain unclear but may involve parvalbumin-positive interneurons. Importantly, the 40Hz ASSR has been identified as a marker of neurological health, with multiple studies demonstrating its decreased prevalence in many neurological disorders. Thus, the 40Hz ASSR gained importance and was quickly adopted in rodents, and especially mouse models, as a functional marker for brain disorders. However, does enhanced neural entrainment at 40Hz even exist in mice? To answer this, we revisited the compound and local neural activity that we have recorded over many years throughout the auditory pathways of anesthetized and awake mice. We show that there is only a small enhancement of response amplitudes at 40Hz rhythms in the inferior colliculus, auditory thalamus (MGB) and auditory cortices of mice. This enhancement is robust to anaesthesia but is more prevalent at subcortical levels. These results bring into question the specific role this frequency plays in the brain, and whether the mouse is a valuable animal model for relating the 40Hz ASSR to the presence of neurological pathologies.
Jiaxin Gao, Mingxuan Fang, Honghua Chen and Nai Ding
Topic areas: speech and language neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Speech recognition crucially relies on slow temporal modulations (
Amy LeMessurier, Ayat Agha and Robert Froemke
Topic areas: hierarchical organization neural coding neuroethology/communication subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Perception of vocalizations is crucial for social behavior. A conserved example of this is mothers responding to distress calls from infants. In mice, experienced mothers (dams) find and retrieve isolated pups into the nest when pups emit ultrasonic vocalizations (USVs). Virgin females generally don’t retrieve pups until they gain experience, for example by co-housing with a dam and litter. The onset of retrieval behavior is correlated with heightened sensitivity to USVs in left auditory cortex (AC). This plasticity may support learning via projections from cortex to early structures in the auditory pathway. The central auditory pathway is highly interconnected, with dense “corticofugal” projections from auditory cortex to earlier structures that may support vocal perception by filtering incoming auditory input. To test whether projections from left AC are required for retrieval, we chemogenetically silenced activity in layer 5 during retrieval. In expert retrievers, the fraction of pups retrieved was substantially reduced in the CNO condition relative to vehicle. Silencing only neurons projecting to inferior colliculus (corticocollicular) led to a similar decrease. However, silencing neurons projecting to striatum (corticostriatal) had no effect, suggesting that corticocollicular projections are particularly critical for linking perception to behavior. To test this, we used 2-photon Ca2+ imaging in awake mice to compare encoding of USVs in corticostriatal and corticocollicular neurons. Corticocollicular neurons in expert retrievers exhibited sustained increases in activity during USV playback compared to presentation of pure tones, while activity was equivalent during USV and pure tone presentation in corticostriatal neurons. This was corroborated by in vivo patch recordings in optotagged projection neurons. The sustained activity we observed in corticocollicular neurons may reflect increased excitability in a dedicated network of recurrently-linked cortical and subcortical areas. To examine whether this develops with experience, we tracked activity in corticocollicular neurons in an additional cohort of virgins over several days before and during co-housing as performance improved. This revealed robust population responses to USVs on each day; however, response durations in individual neurons changed over days. Overall, these results suggest that corticofugal projections are crucial for pup retrieval, and plasticity in these projections could drive learning.
Vrishab Commuri, Jonathan Simon and Joshua Kulasingham
Topic areas: memory and cognition speech and language correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Auditory cortical responses to speech obtained by magnetoencephalography (MEG) show robust speech tracking in the high-gamma band (70-200 Hz), but little is currently known about whether such responses depend at all on the focus of selective attention. In this study we investigate differences in high-gamma cortical responses to male and female speech, and we address whether these responses, thought to originate from primary auditory cortex, depend on selective attention. Twenty-two human subjects listened to concurrent speech from male and female speakers and selectively attended to one speaker at a time while their neural responses were recorded with MEG. The male speaker’s pitch range coincided with the lower range of the high-gamma band. In contrast, the female speaker’s pitch range was higher, and only overlapped the upper end of the high-gamma band. Neural responses were analyzed using the temporal response function (TRF) framework. As expected, the responses demonstrate robust speech tracking in the high-gamma band, but only to the male’s speech. Responses show a peak latency of approximately 40 ms, indicating an origin in primary auditory cortex. The response magnitude also depends on selective attention: the response to the male speaker is significantly greater when male speech is attended than when it is not attended. This is a clear demonstration that even very early cortical auditory responses are influenced by top-down, cognitive, neural processing mechanisms. Supported by the National Institutes of Health (R01-DC019394) and the National Science Foundation (SMA 1734892).
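As context for the analysis framework mentioned above, a minimal sketch of a temporal response function fit by ridge regression on a lagged design matrix; the lag range, regularization strength, and variable names are assumptions for illustration, not the settings used in the study.

```python
import numpy as np

def lagged_design(stim, lags):
    """Stack time-shifted copies of the stimulus feature as regressors."""
    X = np.zeros((len(stim), len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stim[:len(stim) - lag]
        else:
            X[:lag, j] = stim[-lag:]
    return X

def fit_trf(stim, resp, fs, tmin=-0.02, tmax=0.20, lam=1e2):
    """Estimate TRF weights mapping a stimulus feature to the neural response."""
    lags = np.arange(int(tmin * fs), int(tmax * fs))
    X = lagged_design(stim, lags)
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ resp)
    return lags / fs, w   # lag times (s) and TRF weights
```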
Ian Griffith and Josh McDermott
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Attentional selection allows biological organisms to successfully recognize communication signals amid concurrent sound sources (the “cocktail party problem”). Although attentional abilities have been characterized to some extent in humans, we lack quantitative models that can account for attention-mediated behavior, explain the conditions in which attentional selection should succeed or fail, and reveal how attention should influence neural representations to enable selective listening. Inspired by neurophysiological observations that attention often acts as a feature-specific multiplicative gain, we developed a model of auditory attention by equipping a neural network with learnable stimulus-dependent gains. We optimized the model to perform an attentional word recognition task on audio signals, reporting words spoken by a cued “target” talker in a multi-source mixture. We then evaluated model performance across a range of target-distractor conditions. In the presence of competing talkers, the model correctly reported the words of the cued talker and ignored the distractor talkers, on par with humans. The model also failed in conditions where humans fail, showing lower accuracy with multi-talker distractors, when target and distractor voices were the same sex, and with inharmonic speech. Analysis of the model’s internal representations revealed that attentional selection occurred exclusively at later model stages, suggesting a normative explanation for findings of “late” selection in the cocktail party problem. The results suggest that human-like attentional strategies emerge as an optimized solution to the cocktail party problem and show how task optimization and contemporary neural networks can provide a normative perspective on attention and its potential instantiation in the brain.
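A toy sketch of the core mechanism described above, a cue-dependent multiplicative gain applied channel-wise to an intermediate network representation; the array shapes, softplus nonlinearity, and variable names are assumptions for illustration, not the authors' architecture.

```python
import numpy as np

def softplus(x):
    """Smooth non-negative nonlinearity for the gains."""
    return np.log1p(np.exp(x))

def apply_attentional_gain(hidden, cue_embedding, W_gain):
    """
    hidden:        (time, channels) activations at some network stage
    cue_embedding: (cue_dim,) representation of the cued target talker
    W_gain:        (cue_dim, channels) learned mapping from cue to per-channel gains
    """
    gains = softplus(cue_embedding @ W_gain)   # non-negative, cue-dependent gains
    return hidden * gains[None, :]             # feature-specific multiplicative modulation
```

In the model described above, such gains would be learned jointly with the rest of the network during optimization on the attentional word recognition task.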
Derek Rosenzweig, Emma Ning, David Poeppel and Claire Pelofi
Topic areas: memory and cognition speech and language multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
How neuronal systems support invariant object classification and labeling is a fundamental and challenging problem in neuroscience. Absolute pitch (AP) is the rare ability to identify and label tones without access to an external reference, and as such represents an example of auditory object recognition. A simple model to account for this might consist of a classification task for a neural network retrieving labels corresponding to fundamental frequency, but it remains unknown how this computation is implemented in the brain. Altogether, AP serves as an identifiable behavioral marker for studying architecture in the left dorsal stream contributing to rapid identification and labeling of fundamental frequency. In this study, we examine audiovisual interference in musicians with AP and determine whether retrieval processes occur automatically and without instruction. We recruited a group of trained musicians and identified AP participants through accuracy performance on a pitch labeling task. In a behavioral audiovisual Stroop task, we found that AP musicians demonstrated interference as measured by increased response time (RT) on mismatch as compared to match trials of the task. In contrast, non-AP musicians did not demonstrate this audiovisual incongruence effect, as reflected in similar RTs across match and mismatch conditions. These behavioral findings support the hypothesis that invariant pitch classification and labeling occur automatically and without instruction in musicians with AP. In a follow-up MEG study, we investigate the hypothesis of automatic activation of phonetic representations when AP musicians perceive tones. We then analyze the time course of this interference through event related potentials to gain insight into the computational underpinnings of AP pitch labeling.
Marianny Pernia, Manaswini Kar, Kayla Williams and Srivatsun Sadagopan
Topic areas: auditory disorders correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Exposure to moderate-to-intense sounds can cause temporary threshold shifts (TTS) in the audiogram and subtle but lasting deficits in auditory nerve fiber recruitment. It has been widely postulated that these subtle deficits, as evidenced by reduced wave I amplitudes of the auditory brainstem response (ABR), lead to speech perception deficits in humans. However, the link between wave I amplitude reduction and speech perception deficits is still controversial. To gain insight into possible circuit pathologies underlying such deficits, we induced TTS in a guinea pig animal model to determine its effects on a complex sound recognition behavior (operant vocalization-in-noise categorization task). ABR recordings revealed elevated thresholds that returned to within ~10 dB of baseline 30 days after noise exposure, accompanied by permanently decreased wave I amplitudes in all animals. We then tested these animals on a vocalization categorization task in quiet and noisy listening conditions. Surprisingly, animals did not show overt performance deficits in this task. Deeper analyses of behavioral data, however, suggested that guinea pigs might adopt non-auditory behavioral strategies to compensate for underlying deficits. To dissociate non-auditory effects from bottom-up deficits in call representations, we estimated thresholds for call categorization-in-noise using pupillometry and an oddball paradigm, which largely reflects bottom-up processing. We found that categorization was impaired selectively at loud sound levels, consistent with the loss of high-threshold auditory nerve fibers. Taken together, these results suggest that top-down non-auditory strategies can mask TTS-linked deficits in the ascending auditory pathway. In ongoing experiments, we are performing neural recordings in the auditory cortex to probe the effects of TTS on the neural representation of vocalizations across cortical laminae.
Hannah Oberle, Clara Martinez-Voigt, Esther Choi and Pierre Apostolides
Topic areas: hierarchical organization neural coding subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The auditory cortex sends excitatory feedback (corticofugal) projections to the inferior colliculus (IC), a midbrain hub involved in complex sound coding. Corticofugal axons primarily target higher-order dorsomedial and lateral “shell” IC sub-nuclei. We previously characterized corticofugal transmission at the dorsomedial IC, revealing single-cell and network mechanisms that enable powerful non-linear computations in distinct IC cell classes (Oberle et al., 2022; 2023). However, whether corticofugal synaptic activity has similar or divergent effects in the lateral IC is unknown. Here, we combine transgenic mouse lines, optogenetics, and patch-clamp electrophysiology in acute brain slices to measure corticofugal transmission in the lateral IC and compare with data from the dorsomedial IC. We crossed VGAT-ires-cre and Ai14 fl/fl mice to identify GABAergic (VGAT+) and presumptive glutamatergic (VGAT-) neurons during recordings. We expressed the excitatory opsin Chronos in auditory cortex to optogenetically activate auditory corticofugal axons with trains of light flashes. All dorsomedial IC neurons tested (15/15) exhibited EPSPs during optogenetic stimulation of corticofugal axons; surprisingly, fewer lateral IC neurons in the same slices (13/37) showed EPSPs during the same stimulation. Corticofugal train EPSPs were much smaller in lateral compared to dorsomedial IC neurons, suggesting a sparser convergence of corticofugal axons onto lateral IC neurons. At the dorsomedial IC, we have shown that corticofugal signals drive polysynaptic excitation in VGAT+ but not VGAT- neurons (Oberle et al., 2023). Preliminary data suggest this circuit motif is absent in the lateral IC: Onset latencies of corticofugal EPSPs did not significantly differ in VGAT- and VGAT+ neurons. The lateral IC has a unique organization characterized by GABAergic modules: Dense clusters of VGAT+ neurons are targeted by somatosensory inputs, but sparse auditory cortex input (Lesicko et al., 2016). The surrounding matrix zones have a lower density of GABAergic neurons but are densely contacted by auditory corticofugal axons (Lesicko et al., 2016). Interestingly, we find that a number of VGAT+ (4/10) and VGAT- (2/10) cells in the GABA-rich modules responded to optogenetic stimulation of auditory corticofugal fibers despite apparently sparse corticofugal axons in the vicinity. Thus, we identify surprising intricacies of the auditory cortico-collicular pathway’s impact on distinct IC sub-regions.
Aysegul Gungor Aydin, Faiza Ramadan, Alex Lemenze and Kasia Bieszczad
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Temporal cues in speech are prominent perceptual cues for understanding language sounds. The ability to appreciate spectrotemporally rich speech sounds relies on learning-dependent processes in the cortex. Auditory cortical neuroplasticity can establish memories for the language-relevant acoustic cues that help the auditory system to “tune in” to salient timing cues in the auditory soundscape by enhancing sound-evoked temporal processing for learned salient cues. Recent work has demonstrated that blocking an epigenetic regulator of neuroplasticity, histone deacetylase 3 (HDAC3), promotes long-term memory formation for highly specific temporal features of acoustic cues in animals learning an amplitude-modulation rate discrimination (AMRD) task. The mechanism of HDAC3 action is thought to be on de novo gene expression events that are required for memory. Thus, we capitalized on an opportunity to determine genes and biological pathways that may be important for efficient and high-fidelity temporal cue learning and processing in the auditory cortex. We performed bulk RNA-sequencing (RNA-seq) on adult auditory cortical samples in trained rats treated with an HDAC3 inhibitor (vs. a group of vehicle-treated trained rats) learning the same established AMRD task. We further utilized single-nucleus RNA-seq to determine cell-type specific patterns of experience-induced transcription, with a particular focus on uncovering transcriptional contributions in non-neuronal cell types such as astrocytes, which are a critical component of the tripartite synapse and have been implicated in disorders that involve auditory processing difficulties, such as autism spectrum disorder. This work demonstrates that auditory associative learning results in changes to the adult auditory transcriptome that, combined with prior work in the lab, are now known to be task dependent. There are 28 unique genes that were differentially expressed (vs. vehicle). These gene sets include Homer1 and Shank2, which regulate excitatory synapse morphogenesis and play a critical role in synaptic plasticity. Bioinformatic analyses of gene networks and protein-protein interactions revealed three functional clusters in learning-induced auditory cortical genes: glutamate receptor binding, cellular response to interferon-α, and amine ligand-binding receptors. Revealing gene targets for temporal processing informs mechanisms of speech sound impairment or developmental language disorders.
Tingan Zhu, Cynthia King and Jennifer Groh
Topic areas: multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Sound location is inferred in a head-centered manner, based on binaural time difference, binaural loudness difference, and spectral content. However, visual stimulus localization is based on the retinal location of an image, i.e., in an eye-centered reference frame. In species with mobile eyes, these two reference frames are different, and information about eye movements/position must therefore be incorporated before the visual and auditory spatial scenes can be linked to each other. Our group’s initial discovery of eye-movement-related eardrum oscillations (EMREO; Gruters, Murphy et al. PNAS 2018) has suggested this process may start as early as in the ear canal. Subsequent studies further revealed that the EMREO conveys parametric information about eye movements (e.g. horizontal displacement and initial eye position) and can provide reliable reconstruction of the saccade target (Lovich et al. Phil Trans B 2023; Lovich et al. bioRxiv 2022; King et al. bioRxiv 2022; Brohl and Kayser bioRxiv 2022). However, it is still unknown whether the EMREO plays a role in linking visual and auditory space. Accordingly, here we ask if it is possible to alter the EMREO signal by artificially shifting the visual reference frame via training with prism goggles (20 diopters, 11.4 degrees rightward shift), thus disrupting the original relationship between the visual and auditory sensory inputs. EMREO signals were recorded in multiple daily sessions before and after the training with prism goggles in two human subjects. During each recording session, subjects were seated in a darkened anechoic room and asked to saccade to visual targets on the horizontal azimuth, spaced at -12, 0, and 12 degrees. For recording sessions following prism training, goggles were removed to allow successful eye tracking. Eye movements were tracked using the Eyelink 1000 and simultaneous ear canal recordings were obtained using ER10B+ microphones. We found that visual prism adaptation induced a consistent change in the EMREO signal, despite no changes in the associated eye movements. The EMREO change was larger in one subject than the other, but was present in both. These results indicate that the EMREO signal is plastic and that changes in visual input can induce this plasticity.
Alexandria Lesicko and Maria Geffen
Topic areas: correlates of behavior/perception multisensory processes subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The inferior colliculus (IC) is an obligatory relay station and massive convergence center for auditory information. In addition to its role in sound processing, the IC receives inputs from diverse multisensory and neuromodulatory structures and is implicated in acoustico-motor behavior. The lateral cortex of the IC, a multisensory region, contains a network of neurochemical modules that receive input from somatosensory brainstem and cortical regions. Previous studies have shown that inputs from the somatosensory cortex target inhibitory colliculo-thalamic projection neurons in the neurochemical modules, leading to suppression of auditory responses in the auditory thalamus. These results suggest that modular regions of the IC may serve as somatosensory-driven gating regions for auditory information. To test this hypothesis, we trained mice to perform a go/no-go task in which they lick for a water reward after presentation of a noise target, while selectively activating somatosensory inputs to the IC on a subset of behavioral trials. We also performed anterograde trans-synaptic labeling of somatosensory-recipient neurons in the IC. In addition to assessing the functional role of somatosensory inputs to the IC, we used two-photon imaging to determine the sound response properties of neurons in modular and extramodular regions of the IC. Movement behavior during imaging was recorded and analyzed using FaceMap and DeepLabCut. Preliminary data suggest that activation of somatosensory inputs to the IC decreased performance on the sound detection task. Axon fibers from trans-synaptically labeled somatosensory-recipient IC neurons were found in known targets of the lateral cortex, including the medial geniculate body, laterodorsal tegmental nucleus, contralateral lateral cortex, and the superior colliculus. Two-photon imaging of IC responses to pure tones, FM sweeps, noise, and vocalizations was successfully performed. Sound, motion, and sound/motion responsive units were parsed using a generalized linear model. Motion-responsive units outnumbered sound-responsive and sound/motion responsive units, suggesting that motion is robustly encoded in the IC. The results of these experiments will further determine what effect somatosensory input to the IC has on sound processing and target detection and will show whether modular and extramodular regions of the IC have distinct sound and movement processing features.
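A minimal sketch of the kind of GLM-based unit classification described above (parsing sound-, motion-, and sound/motion-responsive units); the Poisson regression, cross-validation scheme, regressor construction, and threshold are illustrative assumptions rather than the authors' exact model.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.model_selection import cross_val_score

def classify_unit(spikes, sound_feats, motion_feats, threshold=0.01):
    """spikes: (T,) binned counts; *_feats: (T, k) regressor matrices."""
    def cv_score(X):
        model = PoissonRegressor(alpha=1.0, max_iter=500)
        # default score for PoissonRegressor is D^2 (fraction of deviance explained)
        return cross_val_score(model, X, spikes, cv=5).mean()

    base = cv_score(np.ones((len(spikes), 1)))              # intercept-only baseline
    gain_sound = cv_score(sound_feats) - base
    gain_motion = cv_score(motion_feats) - base
    driven = [name for name, g in (("sound", gain_sound), ("motion", gain_motion))
              if g > threshold]
    return "+".join(driven) if driven else "untuned"
```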
Mark Saddler and Josh McDermott
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Neurons encode information in the timing of their spikes in addition to their firing rates. The fidelity of spike timing is arguably greatest in the auditory nerve, whose action potentials are phase-locked to the fine-grained temporal structure of sound with sub-millisecond precision. The role of this temporal coding in hearing remains controversial because physiological mechanisms for extracting information from spike timing remain unknown. We investigated the perceptual role of auditory nerve phase locking using machine learning in the spirit of ideal observer analysis. We optimized deep artificial neural network models to perform real-world hearing tasks using simulated auditory nerve representations of naturalistic auditory scenes as input. To investigate the perceptual role of auditory nerve phase locking, we separately optimized models with altered phase locking limits (by manipulating lowpass characteristics of simulated inner hair cells), asking whether high-fidelity temporal coding in a model’s cochlear input was necessary to obtain human-like behavior. We compared human and model behavior on different tasks, measuring their abilities to recognize and localize words, voices, and environmental sounds, and to make pitch judgments. Models with high-frequency phase locking replicated human behavior across all tasks and stimulus conditions. Degraded phase locking affected some tasks more than others. Voice recognition and sound localization were most susceptible, with degraded phase locking leading to impaired performance in noise and to responses to pitch and localization cue manipulations that no longer matched human behavior. By contrast, degraded phase locking left word recognition largely intact, with models exhibiting human-level performance in most real-world noise conditions. The results suggest that information conveyed by the precise, millisecond-level timing of auditory nerve spikes contributes to perception and therefore must be extracted by the human auditory system. Our modeling approach links neural coding to real-world perception and clarifies conditions in which prostheses that fail to restore high-fidelity temporal coding (e.g., contemporary cochlear implants) could in principle restore near-normal hearing.
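A minimal sketch of how a phase-locking limit can be imposed in a simplified auditory nerve front end, in the spirit of the manipulation described above: half-wave rectify the cochlear filter output, then low-pass filter it at the chosen cutoff before spike generation. The cutoff value, filter order, and Poisson spike stage are assumptions, not the specific inner-hair-cell model used in the study.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def ihc_with_phase_locking_limit(bm_signal, fs, cutoff_hz=3000.0, order=7):
    """Half-wave rectification followed by a low-pass 'membrane' filter.
    Lowering cutoff_hz degrades phase locking in the simulated nerve input."""
    rectified = np.maximum(bm_signal, 0.0)
    b, a = butter(order, cutoff_hz / (fs / 2), btype="low")
    return filtfilt(b, a, rectified)

def poisson_spikes(rate, fs, scale=200.0):
    """Sample spike counts per sample from the (non-negative) instantaneous rate."""
    return np.random.poisson(np.maximum(rate, 0) * scale / fs)
```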
Daniel C. Comstock, Kelsey M. Mankel, Brett M. Bormann, Soukhin Das and Lee M. Miller
Topic areas: speech and language correlates of behavior/perception subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Given the complexity of speech processing and the number of neural systems involved, adequately characterizing speech perception difficulties requires evaluation of neural, cognitive, and perceptual factors, in addition to basic hearing capabilities. To aid in this task, we use Cheech, continuous speech blended with synthetic frequency sweep chirps designed to elicit neural responses from brainstem to cortex simultaneously as measured with EEG (Backer et al., 2019). Cheech can provide unique insight into how neural factors relate to auditory processing and cognition, and potentially uncover the source(s) of underlying speech perception difficulties. We employed Cheech in a dynamic, multi-talker, spatial-attention-switching task to assess the neural contributions to speech processing. Cheech-modified short stories were played in silence or in the presence of a spatially separated competing talker to normal hearing, young adult listeners. Listening performance was assessed through behavioral metrics including narrative comprehension and embedded target word identification. Individual differences in cognitive factors (e.g., selective attention, working memory) and perceptual factors (e.g., pitch discrimination, temporal fine structure) were also evaluated as potential moderators of speech listening performance. Cheech-evoked neural potentials originating from points along the auditory pathway, including the Auditory Brainstem Response (ABR), Middle Latency Response (MLR), Long Latency Response (LLR), as well as the linguistic component, N400, were extracted. Listening to a single talker resulted in more robust neural responses in lower levels of the auditory pathway (i.e., ABR, MLR) relative to dual talker conditions. Attending to the target talker also elicited enhanced neural responses at higher-level stages of the auditory pathway (i.e., LLR, N400) compared to distractor talker responses. Additionally, both listening performance and selective attention cognitive test results were reflected in neural encoding in both early and late cortical responses, while working memory test results were reflected only in late cortical responses. These results indicate that the Cheech paradigm is capable of revealing the neural encoding differences between good and poor listeners amongst normal hearing individuals, and therefore may be sensitive enough to detect and characterize a broad range of speech perception difficulties within the auditory brain.
Jiajie Zou and Nai Ding
Topic areas: speech and language
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
When listening to speech, low-frequency cortical activity tracks the speech envelope. It remains controversial, however, whether such envelope-tracking neural activity reflects entrainment of neural oscillations or superposition of transient responses evoked by sound features. Recently, it has been suggested that the phase of envelope-tracking activity can potentially distinguish entrained oscillations from evoked responses. Here, we analyzed the phase of cortical activity synchronized to the speech envelope during passive listening, for both healthy individuals and patients with disorders of consciousness (DOC). We observed that the stimulus-response phase lag changes linearly with frequency between 3.5 and 8 Hz for all participants who show reliable cortical tracking of speech, regardless of the consciousness state. Furthermore, the response phase changes at a similar rate over frequency, i.e., shows a similar group delay, for these participants. In sum, these results show that theta-band cortical activity synchronized to the speech envelope shows a linear phase property, which is barely modulated by the consciousness state and can be modeled by evoked responses.
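A minimal sketch of the phase analysis described above: estimate the stimulus-response phase at each frequency from the cross-spectrum, unwrap it over 3.5-8 Hz, and take the slope of phase versus frequency as the group delay. The frequency grid and the absence of windowing or averaging are simplifying assumptions.

```python
import numpy as np

def group_delay(envelope, response, fs, fmin=3.5, fmax=8.0):
    """Group delay (s) of the response relative to the speech envelope."""
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / fs)
    cross = np.fft.rfft(response) * np.conj(np.fft.rfft(envelope))
    band = (freqs >= fmin) & (freqs <= fmax)
    phase = np.unwrap(np.angle(cross[band]))        # stimulus-response phase lag
    slope = np.polyfit(freqs[band], phase, 1)[0]    # radians per Hz
    return -slope / (2 * np.pi)                     # a constant delay gives a linear phase
```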
Jong Hoon Lee, Eunji Jung and Seung-Hee Lee
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Reversal learning tasks have been widely used in mammals to study cognitive flexibility, the ability to change one’s behavioral response to different circumstances. The task involves subjects rapidly adapting to changes in stimulus-outcome or response-outcome contingencies and therefore requires both sensory and motor regions of the brain to work in tandem. Previous research has demonstrated the importance of the posterior parietal cortex (PPC), the auditory cortex (AC), and the inferior colliculus (IC) in auditory tasks. In this study, we examined how these three regions encode task-relevant information, such as auditory stimuli identity, contingency (stimulus-outcome association), reward, reward history, and licking behavior, at the level of individual neurons using a statistical approach based on generalized linear models. In-vivo single-unit recordings before, during, and after stimulus-contingency reversal revealed key differences between the three regions in both which and when different task variables are encoded. We next explored how these regions worked together by conducting in-vivo calcium imaging of two unique top-down projections from the PPC to the AC (PPC→AC) and the IC (PPC→IC). Using the same approach as above, we showed that PPC→AC neurons encode stimulus contingency and update reward history in the next trial, whereas PPC→IC neurons encode "Go" stimuli and reward feedback, which may facilitate fast licking responses to the relevant stimuli. Taken together, our findings demonstrate that PPC plays a central role in flexible decision making, with cortico-cortical and cortico-collicular circuits to AC and IC, respectively, playing separate but crucial roles in encoding changes in stimulus-outcome associations.
Linda Garami, Chris Angeloni, Maria Geffen and József Fiser
Topic areas: memory and cognition hierarchical organization multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Sensory systems extract patterns at varying complexity from the input. Such extraction (chunking) is key to fast and efficient coding of environmental information, but it also biases sensitivity to changes and accuracy in recognition. Despite being a common feature, organizing principles of chunking are usually tested in a modality-specific way, preventing detection of the modality-independence of chunking principles and their behavioral consequences. We hypothesized that certain chunking principles and the resulting perceptual biases are similar across modalities, and that their neural correlates are present in the primary cortical areas. We focused on a well-known auditory chunking principle called the Iambic-trochaic law. In language processing, longer syllables tend to signal word ends, and a longer tone in a sequence without any linguistic content also tends to be interpreted by both adults and babies as the end of a chunk. People tend to interpret the stream of an auditory stimulus train consisting of short (S) and long (L) tones separated by silence (… S S L S S L S S L …) as a repeating pattern of SSL rather than any other alternative (e.g., SLS). Importantly, this chunking results in a decreased detection accuracy of randomly inserted gaps at a perceived chunk’s border compared to the inside of the chunk. To test the universality of this chunking principle, we tested whether the bias in human temporal change detection is comparable in the visual modality. We implemented the stream segregation go/no-go paradigm identically in vision and audition for human participants. Human participants had a lower d-prime if an unexpected gap was inserted after the long object (tone or square) in both modalities. We recorded neuronal activity in the auditory cortex (AC) of awake, head-fixed mice passively listening to acoustic stimuli. We tested whether and how neurons in the AC detect gap insertion in a repeating pattern of a similar continuous stream. We found that activity in the AC was significantly higher in response to stimuli with unexpected gap insertion when the gap was inserted prior to the long tone than when it was inserted prior to one of the short tones. We used identical paradigms across modalities (aud/vis) and models (human/mouse). Our results support the idea of domain-general non-linguistic grouping principles and raise well-testable further questions that can lead to a domain-independent sensory processing model.
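For reference, a minimal d-prime computation for the go/no-go gap-detection comparison described above (hit rate on gap trials versus false-alarm rate on no-gap trials); the correction applied when rates reach 0 or 1 is a common convention assumed here, not necessarily the authors'.

```python
import numpy as np
from scipy.stats import norm

def d_prime(hits, gap_trials, false_alarms, nogap_trials):
    """Sensitivity index from hit and false-alarm counts."""
    hr = np.clip(hits / gap_trials, 0.5 / gap_trials, 1 - 0.5 / gap_trials)
    far = np.clip(false_alarms / nogap_trials, 0.5 / nogap_trials, 1 - 0.5 / nogap_trials)
    return norm.ppf(hr) - norm.ppf(far)

# e.g., d_prime(hits=30, gap_trials=40, false_alarms=8, nogap_trials=40)
```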
Andrea Bae, Roland Ferger and Jose Luis Pena
Topic areas: neural coding subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
As sound localization specialists, barn owls provide a unique opportunity to examine the neural computations underlying spatial coding. In particular, their well-described midbrain stimulus selection network, a circuit containing a map of auditory space dedicated to localizing sounds, can be leveraged to investigate how this circuit prioritizes salient sounds in environments with competing sounds. Earlier in vivo recordings in the owl’s optic tectum (OT) have shown that neuronal responses and gamma oscillations are spatially tuned to both visual and auditory information, and may play a role in stimulus selection. However, these previous recordings have relied on single electrodes in single regions, and open questions remain regarding how network responses and brain oscillations facilitate information flow across regions for stimulus selection. Towards this end, we recorded spike responses and local field potentials in OT and one of its downstream forebrain regions simultaneously in awake, head-fixed owls while presenting competing auditory stimuli at different speaker positions from a free-field speaker array. Relative salience of the competing stimuli was controlled by altering intensity differences between the two stimuli. Additionally, competing stimuli consisted of two unfrozen broadband noises or two amplitude modulated broadband noises to examine the contribution of the envelope to stimulus selection. Preliminary findings show that spike responses from areas of the map representing less salient stimuli decrease as the intensity of the more salient competing stimulus increases. This was observed at all competing stimulus locations and stimulus conditions, consistent with previous findings that the midbrain stimulus selection network exhibits global inhibition to suppress activity across the topographic space map for non-salient locations for competing visual or bimodal (visual + auditory) stimuli. Our findings also demonstrate that global inhibition occurs across hemispheric spatial regions. In addition, spike patterning influenced by brain oscillations across midbrain and forebrain regions was analyzed.
Genesis Nunez, Chase Hintelmann, Divya Raskonda, Prisha Patel, Xingeng Zhang and Justin Yao
Topic areas: correlates of behavior/perception multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Sensory impairments, such as hearing loss, can lead to cognitive processing deficits. For example, hearing-impaired individuals exhibit reduced temporal integration when performing an auditory task (Florentine et al., 1988) and diminished audiovisual integration (Musacchia et al., 2009) compared to age-matched individuals with normal hearing. Here, we examined how hearing loss impairs the decision-making cognitive variables of temporal integration and multisensory enhancement. We trained normal hearing adult gerbils to perform a single-interval alternative forced-choice audiovisual decision-making task. Gerbils initiated trials by placing their nose in a nose port and were required to discriminate between slow (<6-Hz) and fast (>6-Hz) presentation rates of amplitude-modulated (AM) noise (“auditory-only” condition), light-emitting diode (LED) flashes (“visual-only” condition), or simultaneous AM and LED flashes (“audiovisual” condition) by approaching the left or right food tray located on the opposite side of the test cage. Temporal integration was quantified as the duration from trial onset to when animals departed the nose port area and approached one of the two food trays (i.e., “integration time”). Discrimination performance for audiovisual trials was the most accurate and displayed the fastest integration times compared to the single-modality trials. We induced hearing loss by exposing animals to loud noise (~120 dB SPL) during a single 2-hour session. Noise exposure significantly reduced hearing sensitivity, as auditory brainstem response thresholds increased by ~30-50 dB for clicks and tones. Hearing loss severely impaired discrimination performance for auditory-only trials and modestly altered performance for visual-only and audiovisual trials. Hearing loss significantly extended integration times for all sensory conditions. These findings suggest that hearing loss impairs sensory processing outside of the auditory domain. We propose this involves degraded neural representations of decision-making variables within the parietal cortex, a region involved in the temporal integration of auditory and visual information. To test this, we are recording extracellular responses from parietal cortex neurons while gerbils perform the audiovisual decision-making task pre- and post-hearing loss.
James Baldassano and Katrina MacLeod
Topic areas: cross-species comparisons neural coding neuroethology/communication subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Inhibition has multiple roles in the processing of sound at the level of the brain stem. The avian superior olivary nucleus (SON) is the main source of inhibition to auditory brain stem nuclei. SON neurons receive excitatory input from the intensity-coding cochlear nucleus angularis (NA) and the coincidence-detecting nucleus laminaris (NL). SON neurons provide feedback inhibition to the ipsilateral cochlear nucleus magnocellularis (NM), NA, and NL. A separate population of SON neurons projects to the contralateral SON, but whether these two distinct populations differ in their physiological properties is unknown. Previous studies revealed two physiological phenotypes: regular tonic firing (RT) and single spiking (SS). We describe here a third phenotype, a chattering tonic phenotype (CT). In vivo experiments showed that SON neurons have a range of temporal processing capabilities. To determine whether in vitro cell types correlate with postsynaptic target divergence or temporal processing roles, we investigated the circuitry and physiology of SON neurons in vitro using patch clamp electrophysiology. First, by electrically stimulating the presynaptic nuclei in specialized wedge slices that contained NA, NL, and SON, we determined that all SON phenotypes could receive excitatory synaptic input from NL and/or NA, suggesting convergent afferent connectivity. Second, we used naturalistic fluctuating current injections in vitro that mimicked in vivo inputs to assess temporal sensitivities. SS neurons were the most sensitive to temporally modulated input and had the highest reliability. RT neurons were the least sensitive to temporally modulated input and more closely resembled integrators, while CT neurons had moderate sensitivity and reliability in their firing. Third, intracellular labeling of recorded SON neurons allowed for reconstruction of axonal projections. These patterns were strongly related to the response types. SS neurons projected contralaterally. CT neurons projected ipsilaterally and dorsally in a fiber tract directed toward NM and NL. Finally, the RT neurons projected ipsilaterally via two fiber tracts, either toward NA or toward NL and NM. These results suggest SON neurons have physiological specializations that allow a range of temporal responsivity, consistent with the diversity of in vivo responses. The data suggest that circuit specializations route temporal information into functionally distinct pathways.
Iain DeWitt, Josef Rauschecker, Kareem Zaghloul and Barry Horwitz
Topic areas: speech and language
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
To recognize spoken words, cerebral cortex must learn the natural statistics of feature co‑occurrences in the native language. Computational models suggest this will impart parametric differences in word-form processing as a function of compositional regularity. Using whole-brain fMRI and intracranial electrocorticography (ECoG) in intractable epilepsy patients, we recorded neural responses to auditory word-form stimuli that varied in the predictability of their phoneme sequences with respect to the natural statistics of speech. In the fMRI experiment, to increase the likelihood of detecting bottom-up processing, non-word stimuli were presented in rapid serial streams while participants performed an orthogonal low-level target detection task. In the ECoG experiment, to increase the likelihood of detecting lexical processing, participants performed a lexical decision task while attending to the stimuli. For each experiment, three alternative neural processing models were considered: (i) a feedforward model, in which stimulus-induced activity was expected to propagate along the auditory ventral stream proportional to the probabilities of the stimulus’ constituent sub-sequences (i.e., diphone and triphone probabilities); (ii) a Bayesian model of phoneme transition (i.e., a Hidden Markov Model), where stimulus-induced activity is expected to scale with winner-take-all competitive inhibition (i.e., phone entropy); and (iii) a Bayesian word recognition model (such as Shortlist B), where stimulus-induced activity is likewise expected to scale with competitive inhibition (i.e., lexical entropy). To account for acoustical processing, single-subject analyses incorporated nuisance terms for a model of A1 processing. BOLD fMRI responses in left rostral parabelt were found to vary parametrically with the Bayesian model of phoneme prediction. This model is analogous to statistical learning in infants but reflects the learned transitional probabilities of the native language. High-gamma ECoG responses in superior temporal gyrus, at a site proximal to the anterior-lateral aspect of Heschl's gyrus, were found to vary parametrically with the Bayesian model of word prediction. This result was found both with an anatomically constrained searchlight analysis and with anatomically unconstrained unsupervised clustering. The models supporting each result are suggestive of locally recurrent processing in a winner-take-all ventral-stream object-recognition network.
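A minimal sketch of the phoneme-transition predictors referenced above: given learned transitional probabilities, compute the probability of each upcoming phone and the entropy over possible continuations at each position. The toy data structure and function name are assumptions for illustration, not the study's exact predictor definitions.

```python
import numpy as np

def sequence_predictors(phones, trans_prob):
    """
    phones:     list of phoneme labels for one word-form stimulus
    trans_prob: dict mapping previous phone -> {next phone: probability}
    Returns per-position (transition probability of the observed phone,
    entropy of the distribution over possible next phones).
    """
    out = []
    for prev, nxt in zip(phones[:-1], phones[1:]):
        dist = trans_prob[prev]
        p = np.array(list(dist.values()))
        entropy = -np.sum(p * np.log2(p))     # uncertainty about the upcoming phone
        out.append((dist.get(nxt, 0.0), entropy))
    return out
```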
Brooke Holey and David Schneider
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Behavior is a strong predictor of sensory input. Movements with unexpected sensory consequences drive strong responses in primary sensory cortex that reflect a mismatch between experience – sensory input from the periphery – and expectation. The motor cortex is thought to be an important component of the cortex’s prediction circuitry, specifically for its role in generating expectations via an internal model. However, it remains unknown whether or how motor cortical activity encodes the sensory consequences of movement or whether this encoding reflects expectation and experience. Here, we show that mouse motor cortex neurons are responsive to passive sounds, driven in part through long-range inputs from the auditory cortex. When a silent movement unexpectedly produces a sound, many individual motor cortical cells have rapid, sound-locked responses that cannot be accounted for by changes in behavior but instead can be accounted for by a model incorporating sound and movement responses. Through optogenetic tagging of auditory-projecting M2 cells, we find that activity evoked by the novel self-generated sound is relayed to auditory cortex, indicating that corollary discharge signals from motor cortex are not purely motor in nature. Interestingly, following several days of motor-sound coupling, motor cortical neurons become unresponsive to the same self-generated sound that previously drove a large response, suggesting a gating of responses to self-generated sounds based on expectation. Together, these findings reveal that the motor cortex transiently encodes the unexpected sensory consequences of a movement and reverts to stable, motor-centric dynamics once a sensory consequence becomes predictable, consistent with an important role for the motor cortex in the brain’s prediction circuitry.
Sagarika Alavilli, Andrew Francl, Preston Hess and Josh McDermott
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Background: The ability to localize sounds in the world is critical to perceiving our environment, but is not perfectly accurate. These performance limits remain incompletely documented and understood, in part because research on sound localization has tended to rely on synthetic stimuli (tones and noises). Measurements of everyday sound localization have the potential to reveal new insights and to provide benchmarks for models of sound localization, which ultimately must explain real-world competence. Methods: We used a speaker array to measure human localization of a wide variety of natural sounds in a real-world setting (a small classroom). We quantified overall localization accuracy, the accuracy variation across different sound positions, the incidence of front-back confusions, and the accuracy with which specific sounds were localized in azimuth and elevation. Lastly, we assessed whether a stimulus-computable neural network model of sound localization could reproduce human performance. Results: Human localization exhibited some known effects previously seen with synthetic stimuli, such as more accurate localization near the midline, and was overall fairly accurate apart from front-back confusions. However, some sounds were localized better than others. The qualitative effects seen in human localization (better localization near the midline, and front-back confusions) were also evident in the model. The model also predicted human localization accuracy for individual sounds well above chance, though predictions were below the noise ceiling, indicating room for modeling improvements. Conclusions: The results provide a benchmark for models of sound localization and a framework to enable quantitative predictions of human performance in this domain.
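As an illustration of one behavioral metric above, a minimal sketch for counting front-back confusions from judged versus true azimuths (in degrees, 0 = straight ahead, positive = rightward); the mirroring convention and tolerance are assumptions for illustration.

```python
import numpy as np

def front_back_confusions(true_az, judged_az, tol=15.0):
    """Fraction of trials where the response is clearly closer to the front-back
    mirror of the true source than to the source itself."""
    true_az, judged_az = np.asarray(true_az, float), np.asarray(judged_az, float)
    mirror = 180.0 - true_az                                   # reflect across the interaural axis
    err_true = np.abs((judged_az - true_az + 180) % 360 - 180)    # circular angular errors
    err_mirror = np.abs((judged_az - mirror + 180) % 360 - 180)
    return np.mean(err_mirror + tol < err_true)
```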
Aneesh Bal, Samantha Soto, Andrea Santi, Patricia Janak and Kishore Kuchibhotla
Topic areas: memory and cognition
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Animals possess a wide repertoire of cognitive abilities that enable learning of a diverse range of tasks and skills throughout the lifespan. However, this learning process is rarely isolated, and the ability to leverage past information significantly impacts the capacity to acquire new knowledge efficiently. Transfer learning, multi-task learning, and continual learning are three key mechanisms that facilitate fast acquisition and retention of new knowledge by capitalizing on prior knowledge and task similarities. To investigate the neural mechanisms underlying multi-task and continual learning, we sought to develop an approach that allows mice to learn a diverse and large range of tasks that vary along different dimensions, such as stimulus perceptual dimensions, task rules, and motor outputs. If successful, dissecting the neural substrates of continual learning becomes more tractable due to the wide range of physiological, optical, and molecular tools available in the mouse. But how can we teach mice to learn many tasks? Most rodent studies require animals to learn only one to three tasks. We reasoned that this low number of tasks is constrained by the method of training rather than a mouse’s true cognitive abilities. To overcome these limitations, we created a self-paced, multi-task mouse playground for automated training of cognitive tasks. Mice (n = 4-6) reside in a home cage connected to a behavioral arena by a gating mechanism that permits one mouse in at a time. Inside the behavioral arena, mice can perform tasks to receive water reward, their only source of water. As a result, mice complete a variety of experimenter-administered tasks in a fully volitional manner. To test whether mice could learn many tasks, we trained mice (n=4) on 5 auditory Go-NoGo tasks in which mice learned a discrimination problem along distinct perceptual dimensions. We found that mice learned all 5 tasks (
Yizhen Zhang, Laura Gwilliams, Ilina Bhaya-Grossman, Matthew Leonard and Edward Chang
Topic areas: speech and language neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Understanding spoken language requires extracting individual words from a continuous acoustic signal. Unlike written text, detecting word boundaries in spoken language is challenging because words are often not separated by silence and acoustic cues are not reliable. Thus, listeners may instead segment words using multiple sources of learned knowledge. An outstanding question is how the brain extracts words from the speech stream; specifically, it is unknown which brain areas encode word boundaries and whether those representations are independently or jointly encoded with lexical information. To address this, we recorded high-density electrocorticography (ECoG) responses while participants passively listened to spoken narratives, and investigated the process by which the brain segments words in natural speech. We first explored whether neural populations are sensitive to word boundaries in single trials. We found that neural populations throughout the lateral temporal cortex had evoked responses time-locked to word boundaries. Specific electrodes exhibit complex, multi-phasic evoked responses, consisting of 1-3 distinct response peaks around word boundaries. We used partial correlation to show that both acoustic cues and word-level features modulated the word boundary response. Specifically, we observed a sequence of feature encoding around word boundaries: sensitivity to envelope cues occurred immediately after the word onset, followed by sensitivity to lexical frequency and, finally, to the duration of the whole word. With regard to spatial localization, acoustic-phonetic features were primarily encoded in the middle superior temporal gyrus (STG), while word-level features were encoded in the middle STG as well as the surrounding cortex. A widely distributed STG neural population jointly encoded multiple levels of features, and that neural population also exhibited superior word segmentation performance compared to the electrodes that exclusively encoded acoustic-phonetic or word-level features. Together, these findings suggest that the human STG is sensitive to word boundaries, with acoustic (envelope) cues and lexical features (frequency and duration) jointly contributing to the word segmentation process. The acoustic-phonetic inputs are encoded in the core STG, whereas lexical encoding occurs both in the middle STG and in surrounding cortical regions. These results support a new model of distributed, integrative processing in the STG during spoken word processing.
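A minimal sketch of the partial-correlation logic described above: correlate the neural word-boundary response with one feature after regressing the other feature out of both variables; the variable names are illustrative assumptions.

```python
import numpy as np

def partial_corr(y, x, confound):
    """Correlation between y and x after removing the linear contribution of confound."""
    def residual(a, b):
        B = np.column_stack([np.ones_like(b), b])          # intercept + confound
        coef, *_ = np.linalg.lstsq(B, a, rcond=None)
        return a - B @ coef
    rx, ry = residual(x, confound), residual(y, confound)
    return np.corrcoef(rx, ry)[0, 1]

# e.g., partial_corr(boundary_response, lexical_frequency, envelope_cue)
```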
Hemant Kumar Srivastava, Kathy Liang, Zakir Mridha, Michelle Pei, Ariadna Cobo-Cuan, Hong Jiang, John Oghalai and Matthew McGinley
Topic areas: hierarchical organization neural coding subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Sound responses in auditory thalamus and cortex depend strongly on neuromodulatory brain state, as indexed by pupil size (McGinley et al., 2015). In particular, responses show an overall inverted-U shaped dependence on pupil size. However, at what stages, and with what pattern, arousal influences sensory processing remains largely unknown. Here, we performed pupil and sound response measurements in head-fixed awake mice, from the cochlea to the inferior colliculus (IC). Using optical coherence tomography (OCT) to measure the vibratory responses of the basilar membrane for the first time in awake animals, we observed that responses to pure tones showed no evidence of modulation due to pupil-indexed brain state changes. Similarly, in preliminary results of subcutaneous auditory brainstem responses (ABR; N=2 mice, n=4 sessions), we did not see obvious state-dependence. We also tested whether the inferior colliculus (IC) is influenced by pupil-indexed brain state. We used Neuropixels probes to record central IC (ICC) tone responses (N=7 mice, n=10 sessions, n=2131 neurons). Our results indicate strong state dependence in ICC. The population average showed an inverted-U shaped relationship. Moreover, using PCA followed by K-means clustering, we found subpopulations of units with distinct patterns of state-dependence. The largest group (49% of sound-responsive neurons) showed an inverted-U shaped relationship while the other major group (41%) showed monotonically reduced activity with increasing arousal. Interestingly, the inverted-U neurons had narrower bandwidth than the decreasing neurons (p
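A minimal sketch of the subpopulation analysis described above: PCA on each unit's normalized response-versus-pupil-size curve followed by K-means clustering; the number of components, number of clusters, and z-scoring are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def cluster_state_dependence(tuning, n_components=3, n_clusters=2, seed=0):
    """tuning: (units, pupil_bins) mean sound-evoked response per pupil-size bin."""
    z = (tuning - tuning.mean(axis=1, keepdims=True)) / tuning.std(axis=1, keepdims=True)
    scores = PCA(n_components=n_components).fit_transform(z)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(scores)
    return labels   # e.g., inverted-U vs. monotonically decreasing groups
```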
Ruoyi Chen, Nathan Vogler, Violet Y. Tu, Alister C. Virkler, Jay Gottfried and Maria N. Geffen
Topic areas: correlates of behavior/perception multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
In natural environments, animals process sensory stimuli of different modalities simultaneously. Odor and sound are crucial for mice in predator detection and mother-infant interaction. However, we know little about how the brain integrates information of these two modalities. To investigate auditory-olfactory integration, we used viral tracing, immunohistochemistry, and behavioral experiments, and started with a focus on how odors affect sound processing. Anatomically, we identified a direct projection from the piriform cortex (PC) to the auditory cortex (AC) using multiple viral tracers and mouse lines. We first injected retroAAV-hSyn-eGFP in the AC of Cdh23 mice and observed cell-body labeling throughout the PC. Injecting retroAAV-hSyn-Cre in the AC of Ai14 mice yielded consistent results. We quantified labeled cells in different PC subregions and found more cells in the posterior PC projecting to AC compared to the anterior PC. Injections of anterograde AAV1-hSyn-Cre in PC of Ai14 mice showed labeled neurons in all AC subregions, with more labeling in the ventral auditory field. Moreover, we observed cell-body labeling in the auditory thalamus (MGB). To examine the effect of odor on sound perception, we trained mice on a Go/No-Go sound detection task. We observed a decrease in the sound detection threshold on odor-present trials. We took a step further to make both odor and sound stimuli relevant, by training mice on a Go/No-Go task for sound and/or odor detection. We then tested them with combinations of sound and odor stimuli at different intensities. We found that odor modulated sound detection in an intensity-dependent manner. Together, our findings demonstrate integration of auditory and olfactory information and propose a novel pathway enabling such integration.
David Peita and Sung-Joo Lim
Topic areas: memory and cognition
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Humans can direct their attention to an object held in working memory (i.e., retrospective attention), which facilitates recall of the attended object and enhances the precision of its features. However, it is unclear whether attentional enhancement in perceptual precision can extend to recalling multiple features comprising the object. Using a change detection task with speech syllables, we examined the effect of object-based auditory retrospective attention on memory recall of two distinct auditory features. On each trial, listeners (N=30, 18-23 years) maintained two speech syllables in memory. During memory retention, either a valid or a neutral visual retro-cue was presented to direct listeners’ attention to one, to-be-probed syllable. After hearing a probe syllable, listeners were asked to judge the change that occurred in the pitch or spatial location of the syllable. Valid retro-cues led to faster detection of both pitch and spatial location changes in the attended syllable memory object. Psychophysical modeling results revealed that retrospective attention did not significantly enhance the perceptual precision of the attended syllable object in either the pitch or the spatial dimension. Regardless of cue type, participants were more precise at judging pitch changes on the syllable object than at judging changes along the spatial dimension. Overall, these results demonstrate that when listeners must maintain information across multiple features of objects held in memory, selective attention re-directed to a specific object facilitates faster memory recall by prioritizing the attended (vs. unattended) object, rather than enhancing its representational precision. In addition, these results indicate a potential bias in auditory working memory towards maintaining sensory-specific features, favoring acoustic spectral features over spatial information.
Michelle Moerel, Alina Schepers, Sonia Bălan and Omer Faruk Gulban
Topic areas: subcortical processing thalamocortical circuitry/function
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The thalamic reticular nucleus (TRN) is hypothesized to filter incoming sensory information, thereby acting as a gatekeeper across sensory modalities. In spite of this crucial role [1], the auditory part of the TRN remains virtually unexplored in the human brain because of its small size (~2 x 3 x 4 mm) and the limited spatial resolution of conventional non-invasive imaging methods. Here we used magnetic resonance imaging (MRI) at ultra-high field (UHF) strength (7 Tesla), yielding high spatial resolution, to characterize the auditory part of the human TRN anatomically and functionally in vivo. First, we manually segmented the TRN on publicly available in vivo quantitative T2* datasets (0.35 mm3 isotropic) [2]. Comparing these segmentations with a post mortem dataset [3] showed substantial overlap between the in vivo and post mortem TRN segmentations, supporting the feasibility of identifying the human TRN based on MRI data. However, this analysis also showed substantial inter-individual variability in TRN shape and size. Second, we used functional MRI data (i.e., responses to natural sounds; ~6 hours of functional data per participant) [4,5] to search for a sound-responsive region within the anatomically defined TRN. While a sound-responsive region was observed in the TRN, its location bordering the medial geniculate body (MGB) of the thalamus made interpretation ambiguous. Follow-up work, in which high-resolution anatomical and functional data are collected in the same participants, will be needed to answer the research question unequivocally. 1. Yu, X. J., Xu, X. X., He, S. & He, J. Change detection by thalamic reticular neurons. Nat. Neurosci. 12, 1165–1170 (2009). 2. Gulban, O. et al. Mesoscopic in vivo human T2* dataset acquired using quantitative MRI at 7 Tesla. Neuroimage 264, 1–40 (2022). 3. Alkemade, A. et al. A unified 3D map of microscopic architecture and MRI of the human brain. Sci. Adv. 8, (2022). 4. Sitek, K. et al. Mapping the human subcortical auditory system using histology, postmortem MRI and in vivo MRI at 7T. Elife 8, (2019). 5. Lage-Castellanos, A., De Martino, F., Ghose, G. M., Gulban, O. F. & Moerel, M. Selective attention sharpens population receptive fields in human auditory cortex. Cereb. Cortex (New York, NY) 33, 5395 (2023).
Victoria Figarola, Abigail Noyce, Adam Tierney, Ross Maddox, Fred Dick and Barbara Shinn-Cunningham
Topic areas: memory and cognition subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Individuals often need to attend to one signal within a crowded acoustic scene, a phenomenon known as the cocktail party effect. While attentional effects on auditory cortical responses are well-documented, it is still unclear how auditory subcortical nuclei are affected by attention. This study combines two previous electroencephalography (EEG) paradigms to simultaneously evaluate the effects of top-down attention on brainstem and cortical physiological responses. Competing, temporally interleaved low and high melodies were created by concatenating pitch-evoking 'pseudotones' generated by convolving a periodic impulse train (low pitch: 40-56 Hz; high pitch: 64-96 Hz) with a brief tone pip at a carrier frequency (low: 3500 Hz; high: 4500 Hz). Each melody had notes repeating at 2 Hz, with an overall note rate of 4 Hz. Participants attended to either the high- or low-pitched melody and responded whenever a 3-note pattern repeated in that stream. From past work, we expected to see strong cortical EEG responses at 2 Hz, but with a phase that shifted depending on which melody listeners attended. In addition, each pitch period of each note elicited an auditory brainstem response (ABR). This allowed us to test whether attention altered responses along the ascending pathway in the brainstem while at the same time evoking cortical responses to the attended and unattended streams. We have recruited 19 subjects thus far (12F/7M), all with normal hearing thresholds. We extracted both subcortical ABRs and cortical evoked responses to each note, as well as the phase and inter-trial phase coherence (ITPC) at 2 Hz. While we found robust ABRs evoked by each tone pip, we saw no evidence that attention modulated any of the ABR components when comparing attended versus ignored streams (Wave V latency difference: 0.061 ms; Wave V amplitude difference: 0.0018 µV). These results suggest that attentional effects in subcortical structures are weak, at best. On the other hand, each note onset evoked cortical activity, specifically enhancement of every other note depending on which stream was being attended. We have also observed that the best-attending listeners approach a 180-degree phase separation between conditions. By simultaneously recording cortical responses and ABRs, we are able to measure potential attention-mediated changes in the neural signals in the cortex and brainstem simultaneously.
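As an illustrative sketch of the 2 Hz phase-consistency measure described above, inter-trial phase coherence (ITPC) can be computed as the length of the mean unit phase vector across trials. The sampling rate, trial counts, and simulated EEG below are hypothetical placeholders, not the study's recordings.

```python
import numpy as np

rng = np.random.default_rng(3)
fs = 500                          # Hz, assumed EEG sampling rate
n_trials, n_samples = 100, fs * 2
t = np.arange(n_samples) / fs
# Hypothetical single-channel EEG: a weak 2 Hz component plus noise.
eeg = 0.3 * np.sin(2 * np.pi * 2 * t + 0.5) + rng.standard_normal((n_trials, n_samples))

# Inter-trial phase coherence at 2 Hz: magnitude of the mean unit phase vector across trials.
spectra = np.fft.rfft(eeg, axis=1)
freqs = np.fft.rfftfreq(n_samples, d=1 / fs)
bin_2hz = np.argmin(np.abs(freqs - 2.0))
phases = np.angle(spectra[:, bin_2hz])
itpc = np.abs(np.mean(np.exp(1j * phases)))
print(f"ITPC at 2 Hz: {itpc:.3f}")
```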
Zhipeng Zhang, Jennifer Nacef, Fenna Poletiek, Benjamin Wilson, Timothy Griffiths, Yukiko Kikuchi and Christopher Petkov
Topic areas: memory and cognition correlates of behavior/perception multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
How human speech and language evolved from an auditory system shared with the ancestors of living nonhuman animals remains an important open question, one with implications for the extent to which aspects of the human language system can be modelled in nonhuman animals. A key property of human language is combinatorial semantics, where information from a sequence of words is integrated to identify meaningful content. We designed a novel behavioural touchscreen task implemented with two rhesus monkeys (Macaca mulatta) in their home units. The task allowed us to study whether the monkeys could associate nonsense speech sounds with visual colours or shapes, prior to integrating the information contained in a sequence of two sounds identifying a specific object by its joint colour and shape properties. The paradigm was implemented in two key phases. In the first phase, the animals started by learning to associate the nonsense words with either specific colours or shapes. Learning the task was effortful and the two monkeys struggled to maintain high performance throughout the touchscreen testing sessions in the colony. However, they also showed regular bouts of high performance, and they met predefined criteria for progression (i.e., the majority of sessions in the testing week before progression showing above-chance performance based on permutation tests). One of the monkeys recently progressed to the final phase of the experiment, where sequences of two sounds identified objects by both colour and shape properties. The macaque consistently chose the correct combinations whose joint colour and shape properties matched the informative content in the two sounds, over foil objects that contained one or none of the features referred to by the sounds. This was evident in the high levels of performance for the combinations, with fewer choices going towards the foil objects (42 testing sessions; pairwise t-tests corrected for multiple comparisons: p < 0.001). Our next steps in this research program are to proceed with testing whether the first monkey can generalize learning to probe trials of novel combinations and to progress the second monkey to this final phase of testing. The results provide tentative support for a prototype of auditory combinatorial semantics in nonhuman primates. The results also demonstrate that such abstract auditory learning is difficult for monkeys, altogether providing novel insights into language evolution.
Kyle Rupp and Taylor Abel
Topic areas: hierarchical organization neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The auditory system seamlessly parses auditory scenes and sorts components into sound categories. Studies of the underlying neural representations of sound categorization have largely focused on human-defined stimulus features or matrix decomposition techniques, which both fall short in capturing the rich, complex stimulus transformations involved. Meanwhile, recent deep neural network (DNN) models have solved this problem with few constraints on the available stimulus transformations. Assuming the models have identified an optimal set of stimulus features that grow increasingly complex and abstract with increasing layer depth, we can view such a model as a data-driven feature extractor with representations ranging from low-level acoustics to abstract category-level descriptions. Guided by this framework, we built encoding models to predict neural responses in auditory cortex (ACx) using layer activations within a sound categorization DNN as input features, referred to as DNN-derived encoding models (DDEMs). Neural data were recorded via stereoelectroencephalography (sEEG) in 16 patient-participants while they listened to a set of 165 two-second clips of natural sounds from categories including speech, non-speech vocalizations, music, and environmental sounds. Neural responses were predicted with state-of-the-art accuracy, with supratemporal plane (STP) channels modeled best by shallow DNN layers, and channels in superior temporal gyrus/superior temporal sulcus (STG/S) modeled best by deeper layers. DDEMs consistently outperformed spectrotemporal receptive field models, suggesting more complex representations than simple spectrotemporal tuning throughout ACx. Furthermore, the category encoding strength for human vocalizations (as determined by a separate analysis) correlated strongly with the best DNN layer across channels: voice category-selective channels were most closely associated with deep DNN layers. We then used DDEMs to estimate integration windows by identifying the shortest stimulus inputs that did not appreciably change the predicted neural responses; integration windows segregated anatomically, with windows of ~85-185 ms in STP and ~245-500 ms in STG/S. These results elucidate the functional properties of ACx: STP encodes acoustic properties (albeit with higher complexity than spectrotemporal tuning) at short timescales, while STG/S integrates over longer timescales to encode higher-order stimulus transformations more akin to voice category selectivity.
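The core of a DNN-derived encoding model is a regularized linear mapping from layer activations to a channel's response, evaluated with cross-validation and compared against a lower-level acoustic feature set. The sketch below uses simulated arrays and ridge regression as stand-ins; the feature dimensions and names are assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_time = 3000
spectrogram = rng.standard_normal((n_time, 32))   # stand-in for low-level acoustic (STRF-like) features
dnn_layer = rng.standard_normal((n_time, 128))    # stand-in for activations from one DNN layer
response = rng.standard_normal(n_time)            # stand-in for one sEEG channel's response

def cv_score(X, y):
    # Cross-validated R^2 of a ridge mapping from features to the neural response.
    model = RidgeCV(alphas=np.logspace(-2, 3, 12))
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

print("Spectrotemporal feature model:", cv_score(spectrogram, response))
print("DNN-derived encoding model:   ", cv_score(dnn_layer, response))
```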
Muhammad Zeeshan, Fei Peng, Bruno Castellaro, Shiyi Fang, Nicole Rosskothen-Kuhl and Jan Schnupp
Topic areas: neural coding subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Cochlear implants (CIs) are an advanced treatment for patients with profound sensorineural hearing loss. Over the last few decades, bilateral CIs have been increasingly used to restore binaural hearing. However, human bilateral CI users usually exhibit poor sensitivity to interaural time differences (ITDs) in particular. The reasons for this poor ITD sensitivity are still only poorly understood. We suspect that the manner in which ITDs interact with interaural level difference (ILD) cues may play a role, given that conventional CI stimulation provides CI users with impoverished ITD cues, so that they become increasingly reliant on ILD and insensitive to ITD. To investigate this possibility, we first need to document ITD and ILD sensitivity and their interactions in the naive auditory pathway. We deafened rats neonatally with intraperitoneal kanamycin injections (postnatal days 9 to 20) and verified deafness by measuring auditory brainstem responses. Biphasic pulse train stimuli at rates of 1, 100, and 900 pps with different ITD (±0, ±0.04, ±0.08, ±0.12 ms) and ILD (±0, ±1, ±4 dB) combinations were delivered through bilaterally implanted CIs. Inferior colliculus (IC) multiunit responses were recorded extracellularly and analyzed for statistically significant ITD or ILD sensitivity using Kruskal-Wallis tests. At pulse rates of 1, 100, and 900 pps, 85.6%, 99.7%, and 97.2% of multiunits, respectively, were found to be ITD sensitive; 88.5%, 96.4%, and 88% were ILD sensitive; and 76.8%, 96.1%, and 85.5% were sensitive to both. We conclude that sensitivity to both ITDs and ILDs was very widespread in the IC of our naive, neonatally deafened rats, that it can be observed at a wide range of pulse rates, and that multiunits sensitive to one type of cue are invariably sensitive to the other type of cue too.
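Classifying a multiunit as ITD- or ILD-sensitive with a Kruskal-Wallis test amounts to asking whether its spike counts differ across cue conditions. A minimal sketch with simulated spike counts (the rates and trial numbers are hypothetical):

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(1)
itds_ms = [-0.12, -0.08, -0.04, 0.0, 0.04, 0.08, 0.12]
# Hypothetical spike counts for one multiunit: 30 trials per ITD condition.
counts_by_itd = [rng.poisson(lam=10 + 20 * abs(itd), size=30) for itd in itds_ms]

# The multiunit is called ITD-sensitive if counts differ significantly across ITD conditions.
stat, p = kruskal(*counts_by_itd)
print(f"H = {stat:.2f}, p = {p:.3g}, ITD sensitive: {p < 0.05}")
```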
Jared Collina, Gozde Erdil, May Xia, Janaki Sheth, Konrad Kording, Yale Cohen and Maria Geffen
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Understanding the factors that contribute to individual variability in perceptual decision-making behavior is critical for unraveling the complex processes underlying cognitive function. We investigated whether features of learning trajectories could predict individual variability in the categorization behavior of mice. We trained mice to categorize tones into two categories, high and low. Mice were rewarded for turning a wheel in a specific direction depending on the auditory stimulus. After the mice learned the stimulus-action association, we presented uncertain stimuli to study their categorization behavior. We recorded the mice's behavioral responses and clustered them into distinct learning trajectories based on a range of features, including learning speed, symmetry of stimulus association learning, and the stability of performance across training sessions. We hypothesized that mice with similar learning trajectories would exhibit similar categorization behaviors. We found substantial individual variability in the categorization behavior of the mice. We observed distinct learning trajectories characterized by differences in learning speed and error rates. Interestingly, mice with more symmetric learning rates for the two categories exhibited higher accuracy during the categorization task. Conversely, mice with asymmetric learning rates displayed more fluctuating performance patterns and higher error rates. In addition, we explored potential predictors of individual variability in categorization behavior, such as initial performance, exploratory behavior, and neural activity patterns. We leveraged advanced data analysis techniques to extract relevant features from these predictors and examine their relationship with categorization behavior. Our findings suggest that features of learning trajectories can predict individual variability in categorization behavior in mice. By identifying predictors of variability in learning trajectories, we hope to uncover novel insights into the mechanisms driving individual differences in cognitive processes.
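Grouping animals by learning-trajectory features can be done with standard clustering; the features and cluster count below are hypothetical examples, not the study's exact feature set.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(13)
n_mice = 24
# Hypothetical per-mouse learning-trajectory features.
features = np.column_stack([
    rng.normal(10, 3, n_mice),      # sessions to criterion (learning speed)
    rng.normal(0, 1, n_mice),       # asymmetry between low- and high-category learning rates
    rng.normal(0.1, 0.05, n_mice),  # session-to-session performance variability
])

# Cluster mice into groups with similar trajectories, then ask whether cluster
# membership predicts later categorization accuracy.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(features))
print("Trajectory cluster per mouse:", labels)
```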
Roland Ferger, Andrea Bae and José Luis Peña
Topic areas: neural coding subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The barn owl has been a model organism for sound localization for decades. Its outstanding abilities, including the distinct use of interaural time difference (ITD) and interaural level difference (ILD) for localizing sound sources in azimuth and elevation, respectively, as well as the topographical organization of the owl's midbrain, have elucidated many fundamental principles of sound localization. The external nucleus of the inferior colliculus (ICX) contains the part of the sound localization pathway where multiple frequency channels merge and neurons respond to distinct combinations of ITD and ILD in dichotic stimuli (delivered via earphones), or equivalently to sound source locations in the free field (more distant speakers). The ICX projects to the optic tectum (OT), the avian homologue of the superior colliculus, where neurons respond to auditory and visual stimuli. The OT is part of a global inhibition network that effectively suppresses responses to the less salient of two stimuli presented at different locations. This has been shown for visual-visual stimulus pairs as well as for responses to auditory stimuli in the presence of a competing visual stimulus outside of a neuron's receptive field. In this study, we compare the amount and effect of global inhibition and adaptation between the ICX and its direct downstream projection target, the OT. Previously, stimulus competition has been shown in the ICX and was explained by circuit-independent mechanisms such as binaural decorrelation. In a parallel study, we investigate responses in the OT, and here we build upon this to elucidate the relative contributions of peripheral mechanisms and global inhibition to auditory-auditory competition.
Menoua Keshishian, Samuel Thomas, Brian Kingsbury, Serdar Akkol, Stephan Bickel, Ashesh D. Mehta and Nima Mesgarani
Topic areas: speech and language
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The linguistic information in speech is represented hierarchically in the auditory pathway, such that higher-order representations emerge as we move further from the primary auditory cortex. But how and why these different levels of representation emerge in the brain remains a matter of debate. An unbiased (data-driven) computational model of speech processing can be used to answer this question. Developing data-driven computational models directly from neural data is next to impossible, given the scarcity of neural data compared to other data modalities (text, image, audio, etc.). A common approach to overcome this limitation is to train a deep neural network on a large corpus to perform a human-relevant task and to gain insights about brain processes by comparing the representations of the artificial model and the brain; for example, a large language model (LLM) trained to predict the next word in a sequence can be used to predict the brain activity of a listener. A crucial difference between LLMs and the speech processing neural pathway is that the input to the auditory system is a highly variable sound signal, which such models ignore by operating on the transcript. In this work, we ask: if we train a data-driven model to perform the task of speech recognition from start to finish (spectrogram to words), can it (1) explain the extraction of linguistic information observed in the brain, and (2) reveal the anatomical organization of speech processing steps? We use an RNN-Transducer as a computational model of speech perception. Our model consists of 6 bottom-up LSTM layers that process sound and one feedback LSTM layer. We map the model layer activations to intracranial activity recorded from human subjects and label each electrode by its best predictive layer of the model. We observe that deeper layers of the model better predict the downstream areas in the auditory pathway. To identify the cause of this improved prediction accuracy, we determine the degree of linguistic feature encoding in RNN-T layers, from acoustic to semantic. This analysis reveals the emergence of a hierarchy of language representation in the model, such that earlier layers are best at predicting phoneme-level information and later ones at word-level information. Together, these two levels of analysis show a progressive encoding of linguistic information across different layers of the network and how these computations map to different regions of the auditory cortex.
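Labeling each electrode by its best-predicting model layer typically reduces to fitting one cross-validated linear (e.g., ridge) mapping per layer and taking the argmax over prediction accuracy. The sketch below uses simulated activations and responses; the layer count and dimensions are assumptions, not the actual RNN-T.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)
n_time, n_layers, n_electrodes = 2000, 6, 5
layers = [rng.standard_normal((n_time, 64)) for _ in range(n_layers)]  # stand-ins for LSTM layer activations
ecog = rng.standard_normal((n_time, n_electrodes))                     # stand-in for intracranial responses

best_layer = []
for e in range(n_electrodes):
    corrs = []
    for X in layers:
        # Cross-validated prediction of one electrode from one layer's activations.
        pred = cross_val_predict(RidgeCV(alphas=np.logspace(-2, 3, 10)), X, ecog[:, e], cv=5)
        corrs.append(np.corrcoef(pred, ecog[:, e])[0, 1])
    best_layer.append(int(np.argmax(corrs)))  # label electrode by its best-predicting layer
print("Best layer per electrode:", best_layer)
```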
Ryan Calmus, Zsuzsanna Kocsis, Yukiko Kikuchi, Hiroto Kawasaki, Timothy D. Griffiths, Matthew A. Howard and Christopher I. Petkov
Topic areas: memory and cognition speech and language neural coding
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
To understand how the auditory-cognitive system establishes internal models of the world, there is substantial interest in identifying neural signals that carry traces of the auditory sensory past or an expectancy about the future. To date, memory traces of sensory sequences containing regularities have been decoded in auditory cortex and the hippocampus in animal models, or in humans using non-invasive neuroimaging. However, outside of animal models we lack insights into how site-specific neurophysiological signals from the auditory cortical mnemonic system may carry information reflecting maintenance activity to sounds over delays, and on the signals that may characterize retrospective replay of regularities in a sensory sequence. We conducted an auditory statistical learning task containing non-adjacent dependencies with 12 neurosurgery patients during presurgical intracranial monitoring of refractory epilepsy. Patients listened to sequences of 3 nonsense words drawn from sets (X, A and B), with regularities between relevant pairs of sounds (A-B) often separated in time by uninformative (X) words. We first analyzed site-specific local field potentials from auditory cortex, hippocampus and frontal cortex using traditional methods, demonstrating engagement of the fronto-temporal network in the processing of the sequence regularities. We also present time-resolved multivariate decoding results that reveal the latencies and timescales of sequence item representations in regions across this network. This represents to our knowledge the first human intracranial electrophysiological evidence of auditory hippocampal replay, showing that time-compressed replay occurs before and after key sounds in the sequence. Furthermore, the results revealed that maintenance of sound representations within non-primary auditory cortex (superior temporal gyrus) appears prolonged relative to primary auditory cortex (Heschl’s gyrus). Finally, in support of a developing computational model of sequence processing and binding (Calmus et al., 2019), prefrontal areas including precentral gyrus appear to maintain an ordered buffer of auditory item representations. These results elucidate critical roles for the auditory mnemonic system in transforming sensory events into mental structures, providing insight into the mechanistic contributions of medial temporal, prefrontal and superior temporal regions in the maintenance, prediction and ordinal manipulation of sequential information.
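Time-resolved multivariate decoding of sequence-item identity can be sketched as training a classifier independently at each time point of the trial. The simulated data and the injected decodable window below are placeholders, not the patient recordings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(12)
n_trials, n_channels, n_timepoints = 120, 30, 50
item_identity = rng.integers(0, 3, n_trials)            # which nonsense word (e.g., X, A, or B) occurred
neural = rng.standard_normal((n_trials, n_channels, n_timepoints))
neural[:, :5, 20:30] += item_identity[:, None, None]    # inject decodable structure mid-trial

# Time-resolved decoding: train and test a classifier independently at each time bin.
accuracy = np.array([
    cross_val_score(LogisticRegression(max_iter=1000), neural[:, :, t], item_identity, cv=5).mean()
    for t in range(n_timepoints)
])
print("Peak decoding accuracy:", accuracy.max(), "at time bin", int(accuracy.argmax()))
```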
Ashlan Reid, Demetrios Neophytou and Hysell Oviedo
Topic areas: thalamocortical circuitry/function
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Sensory cortical circuits mature during developmental critical periods (CP), epochs of time during which neural connectivity is acutely responsive to the statistics of the sensory environment. Here, we find hemispheric differences in the timing of the CP in mouse auditory cortical circuits (ACx), with the right ACx demonstrating an earlier emergence of mature circuit structure. Voltage imaging of animal-matched acute auditory thalamocortical (TC) slices indicates that the laminar distribution and spatiotemporal dynamics of the TC response matures to adult patterns at earlier postnatal ages in the right ACx. In addition, in vitro intracellular recordings indicate that the increase in GABAergic synaptic inhibitory tone, linked to the establishment and termination of the CP, occurs earlier in the right ACx. Finally, using multi-channel silicon probe recordings in vivo, we find that the tonotopy exhibited in adult ACx is altered by manipulation of the early sensory environment, with right and left ACx showing sensitivity to acoustic statistics during disparate developmental time windows. Together, our data indicate a temporal shift in the maturational trajectory of left and right auditory cortical circuits. Given the abundance of temporally-linked changes taking place in early postnatal development, including those physically intrinsic to the animal (e.g. ear canal opening) and in the external environment (e.g. littermate vocalizations), a shift in the CP time window could dramatically influence the nature of the acoustic inputs co-occurring with the molecular events driving circuit maturation, precipitating the lateralized functionality found in adult cortical circuits and disrupted in various human brain disorders.
Kamryn Stecyk, Kameron Clayton, Anna Guo, Anna Chambers, Ke Chen, Kenneth Hancock and Daniel Polley
Topic areas: auditory disorders correlates of behavior/perception hierarchical organization subcortical processing
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Auditory inputs rapidly reach efferents innervating the head to increase the tension of the tympanic membrane, dampen basilar membrane motion, dilate the pupil, and elicit twitches of the ears and eyelid. These audiomotor reflexes are elicited by high-intensity sound and are understood to protect the ear from injury and facilitate defensive behaviors. Here, we performed quantitative videography of the mouse face during presentation of a wide class of sounds and documented a different type of audiomotor behavior that provides novel insights into normal and disordered hearing. Adult mice were head-fixed atop a piezoelectric force plate during high-speed (150 Hz) videography of the face. Facial movements (FaMs) occurred slightly later than startle reflexes (30 vs 15 ms post-sound onset) but were 40 dB more sensitive than the startle. FaMs were insensitive to narrowband sounds (< 1 octave) but faithfully tracked the low-frequency sound envelope of broadband sounds, providing an involuntary behavioral readout for temporal processing of speech tokens presented in varying levels of background noise. Recordings from the primary auditory cortex (ACtx) identified regular-spiking neurons concentrated in layers 5 and 6 that were modulated by spontaneous FaMs, suggesting that ACtx could either enable or modulate sound-evoked FaMs (438/5 units/mice). To test that hypothesis, we bilaterally inactivated ACtx and noted that sound-evoked FaMs were enhanced, suggesting corticofugal neurons may impose constitutive inhibition of this subcortical reflex pathway (N = 10 PV-Cre x ChR2 reporter mice). Following noise-induced damage to the high-frequency cochlear base, audiomotor responses to high-frequency noise bands were reduced, while responses to low-frequency noise bands were enhanced, consistent with central gain identified in later waves of the ABR measured in the same mice (N = 8). Mice with mutations in the autism risk gene Ptchd-1 also exhibited hyperreactive sound-evoked FaMs (N = 12), demonstrating that this reflex pathway is sensitive to both acquired and inherited forms of hypersensitivity. Our findings provide a new platform for behavioral phenotyping of complex sound processing without training. These findings demonstrate that a broad class of low-intensity sounds elicit rapid movements of the face and whisker array, thereby identifying a peripheral source (and perhaps a confound) for many examples of cross-modal multisensory interactions.
Christina der Nederlanden, Marc Joanisse, Laurel Trainor and Jessica Grahn
Topic areas: speech and language
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Neural tracking of the speech envelope has been widely used to index attention and comprehension of speech in adults. Neural tracking has also been used to illustrate differences in neural processing for speech and song, with greater tracking of song than speech that is largely dependent on the degree of rhythmic regularity. Few studies have examined neural processing of speech in infancy, which makes it unclear whether neural tracking can also index auditory processing in infancy. Here we compare how 32 four-month-olds and 31 adults neurally track the rhythms of infant-directed (ID) and monotone speech and song, as measured by cerebro-acoustic phase coherence. At four months, infants showed robust neural tracking of ID and monotone speech and song, much like adults. However, four-month-olds showed increased neural tracking of the envelope for speech and song at both the delta and theta bands, while adults only showed greater coherence in the theta band for speech and song. Infants also tracked the acoustic signal in a more stimulus-driven manner, showing greater neural tracking of the wildly exaggerated features of ID speech over ID song and greater neural tracking of the stronger rhythmic content of monotone song than monotone speech. In contrast, adults showed greater tracking of ID song than ID speech and of monotone speech than monotone song. Neural processing is mature enough in infancy to show similar patterns of envelope tracking as adults. Future work should examine whether neural tracking can also index attention and comprehension in infancy and how the development of domain-specific knowledge contributes to neural processing of communicative signals.
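Cerebro-acoustic phase coherence quantifies how consistently the EEG phase tracks the phase of the stimulus envelope in a given band. A minimal sketch, assuming a simple Hilbert-based phase estimate and simulated epochs (the band edges and epoch counts are illustrative, not the study's parameters):

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

rng = np.random.default_rng(4)
fs = 250
n_epochs, n_samples = 40, fs * 4
envelope = rng.standard_normal((n_epochs, n_samples))               # stand-in for speech/song amplitude envelopes
eeg = 0.2 * envelope + rng.standard_normal((n_epochs, n_samples))   # hypothetical EEG channel

def band_phase(x, lo, hi):
    # Band-pass filter, then take the instantaneous phase of the analytic signal.
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, x, axis=-1), axis=-1))

# Coherence = consistency of the EEG-envelope phase lag across epochs and time (theta band).
dphi = band_phase(eeg, 4, 8) - band_phase(envelope, 4, 8)
coherence = np.abs(np.mean(np.exp(1j * dphi)))
print(f"Theta-band cerebro-acoustic phase coherence: {coherence:.3f}")
```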
Edmund Lalor and Samuel Norman-Haignere
Topic areas: speech and language
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Speech is central to human life. Yet how the human brain converts patterns of acoustic speech energy into meaning remains unclear. This is particularly true for natural, continuous speech, which requires us to efficiently parse and process speech at multiple timescales in the context of our ongoing conversation and situational knowledge. Much progress has been made on this problem in recent years through the realization that the dynamics of cortical activity track those of natural speech. This has led to the development of new methods to study the neurophysiology of speech processing in more naturalistic paradigms. However, the field still lacks consensus regarding the precise physiological mechanisms and neurostructural origins of this tracking. In particular, two contrasting theories have been advanced that attempt to explain the genesis of this phenomenon. The first proposes that the quasi-rhythmic nature of continuous speech “entrains” intrinsic, endogenous oscillations in the brain as a way to parse that continuous speech into smaller units for further (linguistic) processing. Meanwhile, the second proposes that the cortical tracking of speech reflects the summation of a series of transient evoked responses from hierarchically organized neural networks that are tuned to the different acoustic and linguistic features of speech. In this study, we aim to introduce a framework for critically examining these two theories side by side. We demonstrate this framework by modeling EEG datasets (N = 33) recorded while healthy, neurotypical adults listened to natural (e.g., audiobook) speech stimuli. Specifically, we model these data using an approach that (implicitly) assumes that the EEG tracking of speech derives from evoked activity, and using three models of oscillatory entrainment that have been proposed in the literature. Initial analysis suggests a dominant role for evoked activity in EEG responses to natural speech. However, future fine-tuning of the oscillatory models, and indeed testing of alternative models, remains necessary to reach consensus on this fundamental question. We hope that the framework introduced here will facilitate progress towards that consensus.
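The evoked-response account is commonly operationalized as a temporal response function (TRF): a lagged linear regression from stimulus features (e.g., the envelope) to EEG. A minimal sketch with simulated data (the lag range and regularization are arbitrary choices, not the study's settings):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
fs = 128
n_samples = fs * 60
envelope = np.abs(rng.standard_normal(n_samples))   # stand-in for the speech envelope
eeg = np.convolve(envelope, np.hanning(20), mode="same") + rng.standard_normal(n_samples)

# Build a lagged design matrix (0-400 ms) so a linear temporal response function
# captures the summed evoked response to ongoing envelope fluctuations.
lags = np.arange(0, int(0.4 * fs))
X = np.column_stack([np.roll(envelope, lag) for lag in lags])
X[:lags.max(), :] = 0                                # discard wrap-around samples

trf = Ridge(alpha=1.0).fit(X, eeg)
r = np.corrcoef(trf.predict(X), eeg)[0, 1]
print(f"TRF weights shape: {trf.coef_.shape}, in-sample r = {r:.2f}")
```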
Jessica Winne, Rebecca Schrader and Melissa Caras
Topic areas: memory and cognition
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
The perineuronal net (PNN) is an extracellular matrix structure that envelops parvalbumin-positive (PV+) interneurons. Reduced PV and PNN expression has been associated with decreased GABAergic signaling and elevated cortical plasticity. We hypothesize that both PV and PNN expression is dynamically regulated during auditory learning. Method: Mongolian gerbils were trained on an aversive go/nogo amplitude modulation (AM) detection task, and the expression of PV and PNNs surrounding PV+ cells in the primary auditory cortex was assessed by immunohistochemistry. Training was divided into two stages. During procedural training, animals learned to drink water during non-AM noise and to stop drinking during a highly salient (0 dB re: 100% depth) AM noise to avoid a shock. After animals achieved a d’ > 2 for two consecutive sessions, the animals progressed to the perceptual training stage, during which weaker AM depths were presented over several days. PV and PNN expression was assessed in untrained animals, animals that completed procedural training, and animals that completed two or seven days of perceptual training. Results: Auditory training significantly affected PV expression in layer (L) 2/3 and L5. PV expression was lower after procedural training than in untrained controls (L2/3: p=0.0002; L5: p=0.0001). In contrast, PV expression was higher after two days of perceptual training than after procedural training (L2/3: p=0.0001; L5: p=0.0001) or after seven days of perceptual training (L2/3: p=0.0002; L5: p=0.0002). Auditory training also significantly affected the proportion of PV+ cells surrounded by PNNs in L2/3 and L5. A smaller proportion of PV+ cells were surrounded by PNNs after procedural training compared to untrained controls (L2/3: p=0.03; L5: p=0.01). In contrast, the proportion of PV+/PNN+ cells was significantly higher after seven days of perceptual training compared to procedural training (L2/3: p=0.0077, L5: p=0.0022), or two days of perceptual training in L2/3 (p=0.0274). These results appear to be specific to auditory cortex, as training had no significant effect on PV or PNN expression in somatosensory cortex. Our results suggest that PV expression (and shortly thereafter, PNN expression) decreases initially, opening a window for neural plasticity. Overexpression and renormalization of PV and PNN levels after extended training may stabilize network modifications and newly-acquired auditory expertise.
Cody Zhewei Cao, G Karthik, Areti Majumbar, Andrew Jahn and David Brang
Topic areas: speech and language multisensory processes
Fri, 11/10 10:15AM - 12:15PM | Posters 1
Abstract
Seeing a speaker’s face facilitates accurate speech perception. Research has shown that listeners use lipreading to restore degraded auditory speech information, but these influences fail to account for the full benefits imparted by visual speech. Prior work has hypothesized that visual speech can restore two additional features of speech: spectral and temporal information. For example, listeners can recover spectral information from the speaker’s mouth width, and the speaker’s lip closure helps listeners parse the temporal boundaries between words. However, it remains poorly understood whether, and how, spectral and temporal information is restored in the auditory cortex. In the current study, we asked two questions: first, is visual speech integration regionally specific, where temporal and spectral information is restored in separate areas, or regionally nonspecific, where the same region restores both kinds of information? Second, how does visual speech alter the spatial pattern of auditory system activity to improve audibility of speech? We hypothesized that visual speech restores the spatial pattern of auditory activity evoked by degraded auditory speech, making it more similar to the pattern of clear speech. We collected fMRI data from 64 subjects who listened to 200 sentences presented across five conditions: auditory-alone unfiltered, auditory-alone temporally degraded, auditory-alone spectrally degraded, audiovisual (AV) temporally degraded, and AV spectrally degraded. Univariate contrasts reveal that the same visual signal has different effects on auditory processing depending on the degraded auditory feature. Visual speech increased BOLD in the STG for both types of degraded speech, but differed across conditions in other regions: AV spectral recovery increased BOLD in Heschl's gyrus whereas AV temporal recovery increased BOLD in anterior STG. Second, we used Representational Similarity Analysis (RSA) to compare fMRI data from the audiovisual conditions with the auditory unfiltered condition. Early RSA results suggest that visual speech restoration uses distinctly different mechanisms from auditory speech perception. We plan to use a single-trial-based GLM regressor to examine whether the representational distance between AV degraded speech and original auditory speech is smaller than that between audio-alone degraded speech and original auditory speech. Together, we show that auditory cortex uses visual speech signals to selectively recover features of the degraded auditory signal.
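Representational similarity analysis compares conditions through their representational dissimilarity matrices (RDMs) rather than raw activity. The sketch below illustrates the restoration logic with simulated patterns; the condition names and pattern sizes are hypothetical stand-ins for the fMRI data.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(6)
n_sentences, n_voxels = 50, 200
# Hypothetical sentence-wise activity patterns from an auditory ROI in three conditions.
clear = rng.standard_normal((n_sentences, n_voxels))
av_degraded = clear + 0.5 * rng.standard_normal((n_sentences, n_voxels))
audio_degraded = rng.standard_normal((n_sentences, n_voxels))

def rdm(patterns):
    # Representational dissimilarity matrix: correlation distance between sentence patterns.
    return pdist(patterns, metric="correlation")

# If visual speech "restores" the clear-speech representation, the AV-degraded RDM
# should resemble the clear-speech RDM more than the audio-alone degraded RDM does.
print("AV degraded vs clear:   ", spearmanr(rdm(av_degraded), rdm(clear))[0])
print("Audio degraded vs clear:", spearmanr(rdm(audio_degraded), rdm(clear))[0])
```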
Frauke Kraus, Jonas Obleser and Björn Herrmann
Topic areas: memory and cognition correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Pupil size and neural alpha oscillatory power are often used to indicate cognitive demand, but it is unclear how much these metrics covary with an individual’s motivational state. Here we tested whether pupillometry and alpha power are sensitive to both listening demand and motivational state. Participants performed an auditory gap-detection task while pupil size or the magnetoencephalogram (MEG) was recorded. Task difficulty and a listener’s motivational state were orthogonally manipulated through changes in gap duration and monetary-reward prospect, respectively. While participants’ performance decreased with task difficulty, reward prospect enhanced performance under hard listening conditions. Pupil size increased with both task difficulty and higher reward prospect. Importantly, the reward-prospect effect was largest under difficult listening conditions. Moreover, larger pre-gap pupil size was associated with faster response times on a within-participant level. In contrast, neural alpha power showed no effects of reward prospect. Of relevance to the utility of pupillometry in audiology and translational neuroscience, pupil size indexed a listener's motivational state, especially under demanding listening conditions. However, we found no comparable effect on neural alpha power. These results add to the mounting evidence that pupil size and alpha power are not two interchangeable physiological indices of cognitive investment.
Stephanie Rosemann and Christiane M. Thiel
Topic areas: auditory disorders correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Age-related hearing loss affects a large part of the older population and commonly affects the higher frequencies. Research using magnetic resonance imaging has provided evidence for neuroanatomical changes in grey and white matter in age-related hearing loss. However, studies using diffusion-weighted magnetic resonance imaging (DWI) and a diffusion tensor imaging model to compute measures such as fractional anisotropy (FA) or mean diffusivity (MD) have shown inconsistent results concerning differences between normal-hearing and hard of hearing participants. Further, measures of FA do not account for the crossing-fibers problem (multiple fiber populations within one voxel). Recently, more advanced diffusion models have been developed that can resolve these multiple fibers within a voxel. Hence, the aim of the current study was to investigate changes in white matter morphology in age-related hearing loss by employing a fixel-based approach to study the number of axons and the axon diameter within a voxel. Data from 28 hard of hearing and 31 normal-hearing participants (aged 50-75 years) were included in the analysis. The hard of hearing participants showed a mild to moderate and symmetrical age-related hearing loss. We observed a significant decrease in fiber density (FD, an indicator of changes in tissue microstructure) for fiber bundles originating in the body of the corpus callosum (fronto-medial fibers) and in the cerebellum in hard of hearing compared to normal-hearing participants. Further, we found a significant decrease in the combined measure of fiber density and fiber bundle cross section (FDC, indicating microscopic and macroscopic changes) in a small part of frontally projecting fibers from the corpus callosum in hard of hearing participants. Our data provide evidence of reduced FD and FDC in age-related hearing loss, possibly indicating a loss of axons as well as decreased axon diameter in fibers that originate in the corpus callosum and project to the medial part of the precentral gyrus. This is the first study assessing white matter morphology with an advanced diffusion model resolving fiber crossings. These results suggest that age-related hearing loss is associated with a loss of tissue microstructure in the main interhemispheric commissure.
Fenghua Xie, Yixiao Gao, Tao Wang and Kexin Yuan
Topic areas: subcortical processing thalamocortical circuitry/function
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Largely topographical projections from different subdivisions of the thalamus, such as the primary, secondary and association sensory thalamus, to hierarchically defined cortical areas have been recognized across sensory systems. However, how corticothalamic projections, which are believed to be crucial for the remarkable flexibility and precision exhibited by our sensory systems, are organized remains poorly understood compared with their thalamocortical counterparts. Here we report that, first, the primary sensory thalamus receives direct inputs from cortical layer 5 (L5) neurons. Second, in contrast to the robust thalamocortical topography, L5 neurons in each of the primary, secondary and association auditory cortical areas project to all the subdivisions of the auditory thalamus. Third, the association cortex uniformly provided the most L5 input to each individual thalamic subdivision, followed by the secondary auditory cortex. Lastly, L5 axon terminals were mainly varicosity-type and evenly distributed across thalamic subdivisions, but those in the polymodal association subdivisions were the largest. Our data suggest that different subdivisions of the auditory thalamus may be under the modulation of common rather than topographically organized L5 inputs. The corticothalamic anatomical framework we revealed calls for a revisiting of current cortico-thalamo-cortical circuit models in the sensory systems.
Mengting Liu, Yixiao Gao, Tao Wang, Fengyuan Xin, Ying Hu and Kexin Yuan
Topic areas: hierarchical organization subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
In the ascending auditory pathway, the inferior colliculus (IC) serves as a critical relay between the auditory periphery and the auditory thalamus (AT). The projections from different subdivisions of the IC to the AT, the so-called tectothalamic pathways, have been thought to be organized in a topographical manner, assuming that neurons in an individual IC subdivision are functionally homogeneous. However, observations that are discrepant from this traditional view have been made, suggesting heterogeneous neuronal functions within an individual IC subdivision. Here, by taking advantage of transgenic mice and various viral tools, we report that the distribution and projection patterns of IC neurons distinctly expressing different molecular markers, parvalbumin (PV) and somatostatin (SOM) in particular, can likely reconcile this discrepancy. Specifically, we identified ICPV+ and ICSOM+ neurons, which are present in all IC subdivisions, as two of the major cell types in the IC. They predominantly project to the ventral division of the medial geniculate body (MGBv), which is the lemniscal AT, and the posterior limiting nucleus of the thalamus (POL), respectively. The POL is a subdivision of the extralemniscal AT and predominantly projects to the posterior tail of the striatum, which plays a critical role in modulating defensive decision-making. Interestingly, this projection pattern was independent of the specific IC subdivision in which ICPV+ or ICSOM+ neurons were locally labeled. We further revealed that ICPV+ neurons mainly receive inputs from the auditory brainstem, whereas ICSOM+ neurons likely integrate processed sensory information of different modalities from various sources. We also demonstrated that ICPV+ neurons are more heterogeneous than ICSOM+ neurons in terms of their electrophysiological properties, terminal size and neurotransmitter type, likely supporting the robustness of ICPV+ neurons in processing acoustic features with great complexity. Our findings provide an explanation for existing discrepancies regarding tectothalamic projections and suggest a molecular approach for defining tectothalamic pathways in the auditory system.
Wenxi Zhou, Aishwarya Balwani, Sueyeon Chung and David Schneider
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Neural activity in auditory cortex (ACtx) is influenced by an animal’s motion, expressed as a marked suppression of sound-evoked responses during movement compared to rest. This suppression of sound responses can be shaped by learning to reflect the acoustic features of self-generated sounds. Such predictive processing is thought to involve the comparison between what an animal expects and what it hears. Yet it remains unknown how expectations are learned and represented in ACtx, or how expectations alter the population dynamics of auditory cortical ensembles. Here, we show that ACtx is necessary for developing acoustic expectations and that representations of auditory predictions reshape population geometry following motor-sensory experience. We trained mice to push a lever for water rewards. Once mice mastered the task, we paired each lever press with a predictable tone and recorded in ACtx on the first and last days of pairing. We found that two thirds of ACtx neurons significantly changed their activity during movement hundreds of milliseconds prior to the self-generated tone, resulting in an overall ramping-up of baseline firing rates in ACtx. When averaged across all recorded neurons, movement-related changes were similar prior to and after motor-sensory experience. However, after motor-sensory experience movement-related activity became concentrated in neurons that are responsive to the expected tone frequency. Using a data-driven matrix factorization approach, we found that these movement-related changes were concentrated in neurons with rapid but prolonged sound responses and were correlated with the neurons’ relative tuning towards the expected frequency. PCA analyses showed that movement-related activity in ACtx was originally orthogonal to the neural dimension that encodes sound frequency, but this was altered by motor-sensory experience, resulting in a significant overlap between motor and acoustic dimensions, consistent with movement signals encoding an expectation of the paired sound. Predictive suppression, reallocation of movement signals, and reshaping of the cortical manifold could all be eliminated by selectively silencing the ACtx during motor-sensory experience. Together, these findings reveal that expectations about self-generated sounds emerge in ACtx, are encoded by movement-related activity, and reshape the population geometry of auditory cortex, which may facilitate predictive processing during behavior.
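One way to quantify the overlap between movement-related and sound-encoding dimensions reported above is to compare the leading principal components of the two activity sets. The sketch below is a simplified stand-in using simulated population data; it is not the authors' exact analysis.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(10)
n_neurons = 100
pop_movement = rng.standard_normal((300, n_neurons))  # stand-in for activity around lever presses
pop_tones = rng.standard_normal((300, n_neurons))     # stand-in for tone-evoked activity

# Leading dimensions of movement-related and tone-evoked population activity.
move_axes = PCA(n_components=3).fit(pop_movement).components_
tone_axes = PCA(n_components=3).fit(pop_tones).components_

# Overlap: how much movement-dimension variance projects onto the tone subspace
# (0 = orthogonal subspaces, 1 = identical subspaces).
overlap = np.linalg.norm(move_axes @ tone_axes.T) ** 2 / move_axes.shape[0]
print(f"Motor-acoustic subspace overlap: {overlap:.2f}")
```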
Vishal Choudhari, Kiki van der Heijden, Prachi Patel, Stephan Bickel, Ashesh Mehta and Nima Mesgarani
Topic areas: speech and language correlates of behavior/perception novel technologies
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Auditory attention decoding (AAD) is the process of using a listener's brain waves to determine which talker they are focusing on in a multi-talker environment. Previous studies have shown that neural responses in the auditory cortex are modulated by top-down attention, using both invasive and non-invasive recording techniques. However, these recording techniques vary in their spatial and temporal resolution, and the frequency bands they can reliably analyze. Non-invasive studies mainly focus on low-frequency neural activity (< 70 Hz) across different brain areas. Different frequency bands reflect distinct underlying mechanisms related to auditory attention. Therefore, it is crucial to have a standardized dataset that enables a fair comparison of neural correlates of auditory attention across different frequency bands and regions of the auditory cortex. To address this, we conducted intracranial electroencephalography (iEEG) recordings from the auditory cortex of neurosurgical patients. During the experiments, the neurosurgical patients (1) listened to spatialized single-talker speech stimuli and (2) selectively attended to a single talker in a spatially separated multi-talker speech setting. Frequency analysis of the neural activity recorded during the single-talker setting reveals consistent patterns of location and speaker selectivity across different subjects. We also examined the AAD performance based on talker location and spectrogram, across various frequency bands and anatomical regions of the auditory cortex. Our findings provide valuable insights into the potential development of a brain-controlled hearing device that uses AAD to selectively enhance the talker of interest. Depending on the recording method and its invasiveness level, our results guide the selection of specific anatomical regions and neural frequency bands that should be targeted for such a device.
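A common AAD baseline is a backward stimulus-reconstruction decoder: reconstruct a speech envelope from neural activity and attribute attention to whichever talker's envelope the reconstruction matches best. The sketch below uses simulated envelopes and neural channels; the regularization and train/test split are arbitrary choices, not the study's method.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(7)
fs, n_samples, n_channels = 100, 100 * 60, 16
env_attended = np.abs(rng.standard_normal(n_samples))
env_ignored = np.abs(rng.standard_normal(n_samples))
# Hypothetical neural data dominated by the attended talker's envelope.
neural = np.outer(env_attended, rng.standard_normal(n_channels)) \
         + rng.standard_normal((n_samples, n_channels))

# Train a backward decoder on half the data, then decode attention on the held-out half.
half = n_samples // 2
decoder = Ridge(alpha=10.0).fit(neural[:half], env_attended[:half])
reconstructed = decoder.predict(neural[half:])
r_attended = np.corrcoef(reconstructed, env_attended[half:])[0, 1]
r_ignored = np.corrcoef(reconstructed, env_ignored[half:])[0, 1]
print("Decoded attended talker:", "talker A" if r_attended > r_ignored else "talker B")
```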
Marios Akritas, Sonali S. Sriranga, Alex G. Armstrong, Jules M. Lebert, Arne F. Meyer and Jennifer F. Linden
Topic areas: memory and cognition correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The auditory system has often been described as an "early warning" system for the brain, optimised for fast detection of sound events occurring far away or outside the focus of attention. A possible behavioural correlate of this “early warning” in awake mice is sound-evoked whisker twitches (Meyer et al. 2018 Neuron) and other sound-evoked body movements (Bimbard et al. 2023 Nat Neurosci). What is the nature of these sound-evoked movements, and how do they relate to auditory cortical activity? We used sequences of noise bursts varying in sound intensity, predictability, and duration to analyse sound-evoked whisker, nose, and pinna movements in 7 awake head-fixed mice. In 4 of the animals, we also investigated the relationship between the sound-evoked movements and simultaneously recorded single-unit and multi-unit activity in the auditory cortex. Sound-evoked whisker, nose, and pinna movements did not resemble startle responses. The movements occurred even for quiet sounds and grew gradually with increasing sound intensity (25-70 dB SPL). Moreover, the movements were minimally affected by stimulus predictability or expectation. In fact, facial movements evoked by 65 dB SPL noise bursts were either similar for regularly and irregularly timed noise bursts (1s versus jittered 0.8-1.2s inter-onset intervals), or slightly but significantly larger for the rhythmic sounds. Further experiments involving long, variable-duration noise bursts (0.4-1.6s) showed that while sound onsets evoked robust increases in whisker, nose, and pinna movements, sound offsets only evoked increases in pinna movement. Finally, analysis of auditory cortical recordings revealed a small but significant subset of units (15% for whisker and pinna, 7% for nose) in which trial-to-trial variation in sound-evoked response magnitudes correlated (or anti-correlated) with trial-to-trial variation in sound-evoked movement magnitude. We conclude that sound onsets reliably evoke whisker, nose, and pinna movements in awake head-fixed mice, and sound offsets can also evoke pinna movements. Moreover, trial-to-trial variation in the magnitude of sound-evoked movements is related to trial-to-trial variation in sound-evoked firing rates for a small but significant fraction of neurons in the auditory cortex.
Stephanie Lovich, David Kaylie, Cynthia King, Christopher Shera and Jennifer Groh
Topic areas: correlates of behavior/perception cross-species comparisons hierarchical organization multisensory processes
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The visual and auditory systems work together to ensure surrounding stimuli are perceived accurately. Information about eye movements is necessary to this process because eye movements shift the relative positions of the visual and auditory sense organs. We have recently reported an oscillation of the eardrums time-locked to the onset of a saccade and in the absence of outside auditory stimuli, suggesting that such information is available as early as the auditory periphery (eye movement-related eardrum oscillations, or EMREOs; Gruters, Murphy et al. PNAS 2018; Lovich et al. in press 2023). However, the underlying mechanical causes of EMREOs are still unknown. Here, we sought to determine the role of the middle ear muscles in producing these saccade-associated eardrum oscillations. We recorded EMREOs in two rhesus monkeys during spontaneous saccades before and after surgical transection of the middle ear muscles. The monkeys were head-restrained in a dark room, and eye movements were tracked with a video eye tracker (1000 Hz sampling rate) while eardrum oscillations were recorded using microphones placed in the ear canals of both ears. We report that surgical transections of either the stapedius muscle or the tensor tympani muscle cause changes in the EMREO. Transection of the stapedius causes the EMREO to be significantly diminished in amplitude, but it is not eliminated. Transection of the tensor tympani muscle, in contrast, causes the EMREO to be significantly *increased* compared to the pre-surgery baseline data. These data corroborate parallel findings collected in our lab in humans with stapedius and/or tensor tympani dysfunction. Together, these findings suggest that there may be multiple contributors to the EMREO in a complex system within the ear, and that the middle ear muscles may work as a push-pull mechanism with opposing actions to stabilize the system.
Dimitri Brunelle, Timothy Fawcett and Joseph Walton
Topic areas: subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The inferior colliculus (IC) is a major midbrain convergence site critical for processing complex sounds such as speech and undergoes fundamental changes with age-related hearing loss (ARHL). The loss of peripheral inputs and senescence-related alterations in neurotransmission lead to decreased activity driving neurons in the IC, causing spectral-temporal auditory processing deficits. Local field potentials (LFPs) represent the electrical potential surrounding neurons in the extracellular space, reflecting pre-synaptic activity and the integration of excitatory and inhibitory signals from neuronal inputs generating action potentials. In the current study, age-related changes in sound-evoked LFPs were assessed in the CBA/CaJ mouse model of ARHL by deconstructing various LFP components evoked by wideband noise bursts. To assess the effects of age on pre-synaptic sound-evoked activity in the IC, multi-channel arrays were placed in the IC central nucleus and neural activity was acquired from 11 young (4-6 mo.), 10 middle-aged (8-14 mo.), 20 old (24-25 mo.), and 6 very old (27-30 mo.) mice. Only one session and recording location was selected per animal. LFPs were recorded from linear 16-channel NeuroNexus probes sampled at 1017 Hz and filtered to 2-300 Hz. 589 recording sites were classified into low-, mid-, or high-frequency regions based on the tonotopic arrangement of the IC. LFPs were temporally decomposed into several components based on the amplitude and latency of the component of interest. Statistically significant age effects for 80 dB SPL stimuli were found in both the magnitude and latency of temporal LFP components across all frequency regions. Old mice exhibited significantly larger N1 peaks than young mice in the low- (130 µV) and mid-frequency (98 µV) regions. Young mice exhibited the largest P1 peak magnitudes at mid and high frequencies, with P1 amplitudes 27 µV higher than middle-aged animals at mid frequencies and 22-30 µV higher than older mice at high frequencies. The N2 was 51.6 to 54.3 µV deeper in young animals than in both old and very old animals at low frequencies, and only 41.8 µV deeper at mid frequencies. The results of this study indicate a significant age-related modulation of both excitatory and inhibitory components of the LFP in aged auditory midbrain neurons. These changes vary as a function of tonotopy and may be related to known age-related alterations in auditory midbrain neurochemistry.
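Decomposing an evoked LFP into components by amplitude and latency can be sketched as peak-picking within canonical post-onset windows. The waveform, window boundaries, and component names below are simulated placeholders, not the recorded data.

```python
import numpy as np

rng = np.random.default_rng(14)
fs = 1000                                 # Hz
t = np.arange(0, 0.2, 1 / fs)             # 200 ms post-stimulus window
# Hypothetical noise-burst-evoked LFP: P1 (positive), N1 and N2 (negative) deflections plus noise.
lfp = (30 * np.exp(-((t - 0.015) ** 2) / 2e-5)
       - 100 * np.exp(-((t - 0.030) ** 2) / 5e-5)
       - 50 * np.exp(-((t - 0.080) ** 2) / 4e-4)
       + 5 * rng.standard_normal(t.size))

def component(lfp, t, window, polarity):
    # Amplitude (uV) and latency (ms) of the largest deflection of a given polarity in a window.
    mask = (t >= window[0]) & (t < window[1])
    idx = np.argmax(polarity * lfp[mask])
    return lfp[mask][idx], 1000 * t[mask][idx]

for name, window, polarity in [("P1", (0.005, 0.025), +1), ("N1", (0.02, 0.05), -1), ("N2", (0.05, 0.12), -1)]:
    amp, lat = component(lfp, t, window, polarity)
    print(f"{name}: {amp:.1f} uV at {lat:.1f} ms")
```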
Jimmy Dion, Monty Escabí and Ian Stevenson
Topic areas: neural coding subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Real-world listening poses significant challenges for humans and other animals when communicative sounds occur in competing background noise. These same acoustic scenarios are often the most challenging for the hearing impaired and artificial speech recognition systems. Although perceptual studies have shown that both the spectrum and modulation statistics of a background sound can influence the perception of a foreground target, how the brain separates sound mixtures and solves this computational problem is poorly understood. Here, we recorded neural activity from populations in the auditory midbrain in order to assess how the statistics of natural background sounds alter the neural representation of a foreground vocalization. Multi-unit population activity was obtained from the inferior colliculus of head-fixed unanesthetized rabbits listening to natural sound mixtures using linear 64-channel recording arrays (Neuronexus). Speech sentences or zebra finch song motifs were presented as foregrounds in the presence of seven competing natural background sounds and perturbed variants at multiple signal-to-noise ratios. These backgrounds encompassed a wide range of modulation statistics and included speech babble, bird babble, running water, and construction noise. The backgrounds were delivered in the original unmodified (ORIG), phase-randomized (PR), or spectrum-equalized (SE) configurations. The PR perturbation preserves the original sound spectrum but distorts (whitens) the original sound modulations, whereas the SE perturbation distorts (whitens) the spectrum and preserves the original sound modulations. Using shuffled correlation methods, we separated the foreground- and background-driven neural response components for each of the sound mixtures and conditions (ORIG, PR and SE). Preliminary results show that the distance between the foreground-driven population activity with noise and without noise strongly depends on the background sound statistics. For some background sounds, the spectrum dominates and distorts the foreground sound encoding, while for other backgrounds the modulation statistics more strongly interfere with the encoding of the foreground. Collectively, the findings demonstrate how both the spectrum and modulation statistics of natural backgrounds influence and interfere with the representation of vocalization foreground sounds, suggesting that both are critical features underlying masking of real-world natural sounds.
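The shuffled-correlation idea, correlating responses across repeated trials of the same stimulus so that trial-specific noise averages out and the stimulus-locked component remains, can be sketched as follows (the simulated foreground-locked waveform and trial counts are placeholders for the recorded population activity):

```python
import numpy as np

rng = np.random.default_rng(11)
n_trials, n_bins = 20, 500
signal = np.sin(np.linspace(0, 20 * np.pi, n_bins))            # stand-in for a foreground-locked response
responses = signal + rng.standard_normal((n_trials, n_bins))   # repeated trials of the same mixture

# Shuffled (across-trial) correlation: correlate responses from different trials of the
# same stimulus, so only the stimulus-locked component contributes on average.
pairs = [(i, j) for i in range(n_trials) for j in range(i + 1, n_trials)]
shuffled_corr = np.mean([np.corrcoef(responses[i], responses[j])[0, 1] for i, j in pairs])
print(f"Stimulus-locked response reliability: {shuffled_corr:.2f}")
```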
Gregory Hamersky, Luke Shaheen, Jereme Wingert and Stephen David
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
In everyday hearing, listeners encounter complex auditory scenes containing spectrally overlapping sounds. Accurate perception requires streaming, i.e., the grouping of sound features into meaningful sources based on statistical regularities in the time and frequency domains. Psychoacoustic studies have described auditory streaming as a perceptual phenomenon often using artificial stimuli, but less is known about its underlying neural basis with natural sounds. To understand mechanisms underlying stream segregation, we recorded single unit activity in primary (A1) and secondary (PEG) fields of auditory cortex in awake, passively listening ferrets. Animals (n=4) were presented with natural sounds from two broad, ethologically relevant categories: textures (background, BGs) and transients (foregrounds, FGs). BG and FG stimuli were presented individually and concurrently. Neural responses to overlapping pairs (BG+FG) were modeled as linear weighted combinations of responses to the isolated BG and FG. Model weights showed BG+FG responses were consistently suppressed relative to responses to the individual sounds. Perceptually, FG stimuli pop out from BG. However, the linear model showed surprisingly strong suppression of FG relative to BG responses. The suppression was observed throughout the 1s duration of the stimulus in A1 but only during the first half of the stimulus duration in PEG. The dominant activation by the BG stimulus at early stages of the auditory hierarchy suggests that strong activation by the BG stimulus is required for its segmentation and subsequent subtraction and perceptual popout of the FG. To study the neural computations supporting these responses, we trained two encoding models, a traditional LN model and a convolutional neural network (CNN), to predict time-varying neural responses using a separate natural sound dataset. We then evaluated the ability of the models to account for the patterns of FG suppression observed in the data. The CNN model had greater prediction accuracy and was able to account for the degree of FG suppression across different neurons and different stimuli. Analysis of the tuning space captured by the CNN can reveal spectrotemporal features that drive the relative suppression of the sound streams.
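A compact way to picture the linear weighting analysis described above: the response to the BG+FG mixture is regressed onto the responses to the isolated BG and FG, and the fitted weights quantify how strongly each stream is represented, with weights below 1 indicating suppression. The sketch below is a generic least-squares version under that assumption, not the authors' exact fitting procedure.

import numpy as np

def fit_bg_fg_weights(r_bg, r_fg, r_mix):
    """Fit r_mix ~= w_bg * r_bg + w_fg * r_fg by least squares.

    r_bg, r_fg, r_mix : 1-D trial-averaged response vectors (e.g. PSTHs)
    to the background alone, the foreground alone, and the BG+FG mixture.
    """
    X = np.column_stack([r_bg, r_fg])
    w, *_ = np.linalg.lstsq(X, r_mix, rcond=None)
    return {'w_bg': float(w[0]), 'w_fg': float(w[1])}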
Alex Clonan, Xiu Zhai, Ian Stevenson and Monty Escabi
Topic areas: memory and cognition speech and language correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Recognizing speech in noisy environments is a critical task of the human auditory system. While spectrum and modulation statistics both influence speech recognition in noise, current modeling approaches are unable to predict human recognition sensitivity in distinct, real-world backgrounds. Here we assess how the spectrum and modulation statistics of natural sounds mask the recognition of spoken digits (0 to 9). We enrolled participants in a psychoacoustic study where digits were presented in various natural background sounds (e.g., water, construction noise, speech babble; tested at SNRs from -18 to 0 dB) and their perturbed variants. We perturbed the backgrounds by either 1) phase randomizing (PR) or 2) spectrum equalizing (SE). PR retains the power spectrum but distorts the modulation statistics, while SE distorts the power spectrum and retains the modulation statistics. Even at a constant noise level, the ability to recognize foreground digits was substantially helped or harmed by these background perturbations, depending on the original background sound. To explore this interference, we used texture synthesis (McDermott & Simoncelli 2011) to manipulate individual background modulation statistics. We found that adding texture statistics decreased accuracy in speech babble. Interestingly, however, adding statistics increased accuracy in construction noise with strong comodulations. We next developed a physiologically inspired model of the auditory system to predict perceptual trends. Sounds were decomposed through a cochlear filter bank (peripheral stage) and a subsequent set of spectrotemporal receptive fields that model modulation selectivity in the auditory midbrain (mid-level stage). Logistic regression was performed on these features to estimate perceptual transfer functions and predict human accuracy. The peripheral model (with spectrum cues alone) accounted for 67% of the perceptual variance, while the mid-level model (spectrum and modulation cues) accounted for 91%. Stimulus SNR has a substantial (log-linear) influence on recognition accuracy but appears to act independently of background statistics. These findings show how the diverse spectrum and modulation content of environmental background sounds has complex effects that can either help or harm speech recognition. However, an interpretable model of mid-level auditory computations predicts perceptual sensitivity and identifies the specific acoustic cues contributing to listening in noise.
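The modeling logic described above (filter-bank features followed by logistic regression on behavioral outcomes) can be sketched roughly as follows. The filter shapes, band count, and feature choices here are placeholders standing in for the study's cochlear and mid-level stages, and the variable names are hypothetical.

import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert
from sklearn.linear_model import LogisticRegression

def peripheral_features(x, fs, n_bands=24, fmin=100.0, fmax=8000.0):
    """Stand-in peripheral stage: a log-spaced band-pass filter bank whose
    per-band envelope energy serves as a crude spectrum-only feature set
    (fmax must stay below fs/2)."""
    edges = np.geomspace(fmin, fmax, n_bands + 1)
    feats = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype='band', fs=fs, output='sos')
        envelope = np.abs(hilbert(sosfiltfilt(sos, x)))
        feats.append(np.log(envelope.mean() + 1e-12))
    return np.array(feats)

# Hypothetical usage: X holds features for each digit-plus-background trial,
# y holds 1 if the listener identified the digit correctly, else 0.
# clf = LogisticRegression(max_iter=1000).fit(X, y)
# predicted_accuracy = clf.predict_proba(X)[:, 1]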
Jian Carlo Nocon, Jake Witter, Howard Gritton, Xue Han, Conor Houghton and Kamal Sen
Topic areas: neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Cortical circuits that encode sensory information contain populations of neurons, yet little is known about the strategies through which information is aggregated from single units. Such aggregation may be necessary in environments in which single-unit information is degraded due to competing stimulus sources. One example of such an environment is the cocktail party phenomenon, in which the auditory system can spatially separate sounds from single sources within a complex auditory scene. In this study, we apply a novel information-theoretic approach to estimate the mutual information underlying target stimulus discriminability in populations within mouse auditory cortex. We investigate multiple coding schemes to determine how target stimulus information is maintained within a complex auditory scene: a summed population (SP) code, where response origins are irrelevant to coding; and a vectorized labeled line (LL) code, where response origin is maintained. Our results show that a small subset of neurons is sufficient to nearly maximize the mutual information of target discriminability over various configurations of target and competing masker stimulus locations, with the LL code outperforming the SP code and information approaching levels obtained without noise. Finally, we find that information in both coding schemes increases with spatial separation of target and masker, in correspondence with psychometric studies on spatial release from masking in humans and animals. Altogether, we reveal that a compact population of neurons in auditory cortex provides a robust code for sounds in the presence of competing stimuli.
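To make the SP versus LL contrast concrete: under an SP code the neuron of origin is discarded by summing responses across the population, while under an LL code each neuron's response is kept as a separate dimension. The sketch below builds both representations and uses cross-validated decoding accuracy as a stand-in for the mutual information estimate; it is not the authors' information-theoretic estimator.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def code_vectors(spikes, scheme):
    """Build trial-wise response vectors under two population coding schemes.

    spikes : (n_trials, n_neurons, n_timebins) binned spike counts
    scheme : 'SP' sums across neurons (origin discarded);
             'LL' concatenates neurons (origin kept as a 'labeled line').
    """
    if scheme == 'SP':
        return spikes.sum(axis=1)
    return spikes.reshape(spikes.shape[0], -1)

def target_discrimination(spikes, target_labels, scheme):
    """Cross-validated target-discrimination accuracy for a given scheme."""
    X = code_vectors(spikes, scheme)
    return cross_val_score(LinearSVC(dual=False), X, target_labels, cv=5).mean()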
Yaser Merrikhi, Alireza Khadir, Charlotte Kruger, Sajad Jafari and Stephen G. Lomber
Topic areas: auditory disorders correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Electrical stimulation of the auditory nerve with a cochlear implant (CI) is the method of choice for treatment of severe-to-profound hearing loss. Auditory cortical function and plasticity are major contributing factors to the variability in speech perception outcomes. Spectrally degraded stimuli, presented to normal-hearing individuals, can serve as a model of cortical processing of speech by CI users. This study used intracranial electroencephalography (iEEG) to examine processing of spectrally degraded speech throughout the cortical hierarchy, test for hemispheric asymmetries, and determine the relationship of cortical activity to speech perception. Participants were normal-hearing adult neurosurgical epilepsy patients. Stimuli were utterances /aba/ and /ada/, degraded using a noise vocoder (1-4 bands) and presented in a one-interval discrimination task. Cortical activity was recorded using depth and subdural iEEG electrodes (>2000 contacts). Recording sites were assigned to regions of interest, organized into several groups: auditory core in posteromedial Heschl’s gyrus (HGPM), superior temporal plane, superior temporal gyrus (STG), ventral and dorsal auditory-related cortex, and prefrontal and sensorimotor cortex. Event-related band power was examined in the broadband gamma (30-150 Hz) and alpha (8-14 Hz) bands. Stimuli yielded chance identification performance when degraded to 1-2 spectral bands. Performance was variable in the 3-4 band conditions and near-ceiling in the clear condition. Cortical activation featured regional differences with respect to stimulus spectral complexity and intelligibility. HGPM was characterized by strong bihemispheric activation regardless of task performance. A progressive preference for clear speech emerged along both the ventral and the dorsal auditory processing pathways. Better task performance at 3-4 bands was associated with gamma activation on the STG and alpha suppression along the dorsal pathway (supramarginal gyrus) in response to all vocoded stimuli. Within sensorimotor areas, differences in task performance were paralleled by different patterns of cortical activity. Direct recordings reveal a hierarchical organization of degraded speech processing. Examination of responses to noise-vocoded speech provides insights into the neural bases of variability in speech perception in CI users. This work will aid in the development of novel objective measures of CI performance and of neuromodulation-based rehabilitation strategies.
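Event-related band power of the kind reported above is commonly computed by band-pass filtering each stimulus-locked trial, taking the analytic amplitude, and normalizing to a pre-stimulus baseline. The sketch below follows that generic recipe; the filter order, baseline definition, and dB normalization are assumptions, not details taken from the study.

import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def event_related_band_power(ieeg, fs, band, baseline_idx):
    """Event-related band power for one iEEG recording site.

    ieeg         : (n_trials, n_samples) stimulus-locked voltage traces
    fs           : sampling rate in Hz
    band         : (low, high) in Hz, e.g. (30, 150) for broadband gamma
                   or (8, 14) for alpha, as in the abstract
    baseline_idx : boolean mask of pre-stimulus samples used for normalization
    Returns power in dB relative to the mean baseline power per trial.
    """
    sos = butter(4, band, btype='band', fs=fs, output='sos')
    analytic = hilbert(sosfiltfilt(sos, ieeg, axis=1), axis=1)
    power = np.abs(analytic) ** 2
    baseline = power[:, baseline_idx].mean(axis=1, keepdims=True)
    return 10 * np.log10(power / baseline)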
Alexa Buck, Typhaine Dupont, Rupert Andrews, Olivier Postal, Jerome Bourien and Boris Gourevitch
Topic areas: neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
There has been a long debate as to whether the auditory system uses a rate or a temporal code. But what if this problem has been ill-defined as an 'either/or' rather than as combinatorial? Here we recorded responses to a long random dynamic complex sound from a large sample of neurons in the inferior colliculus (IC), auditory thalamus (MGB) and auditory cortex (AC) of awake mice, in addition to simulating responses in a biophysical model of the auditory nerve. We then quantified the amount of stimulus information carried by the firing rate, temporal patterns and neural silence (the resting state of the neuron). We confirmed that stimulus information carried by individual neurons, as well as information redundancy within populations of neurons, decreases along the ascending auditory pathway regardless of the encoding strategy used. We observed that the maximum information reached by neurons progressively transitioned from temporal encoding to encoding in the firing rate along the ascending auditory pathway. We showed that periods of neural silence contain a significant amount of stimulus information, especially in subcortical areas, and therefore should be considered as part of neural encoding. Importantly, our observations from all auditory areas show that the amount of information carried by a given code depends heavily on a given neuron’s firing characteristics. These results suggest that both silent and active patterns of neural responses are relevant for information encoding, further implying that a multitude of encoding strategies co-exist at each level of the auditory system.
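One simple way to compare candidate codes like those above is to discretize each response variable (spike count per bin for a rate code, a binary spike-pattern "word" for a temporal code, a silence indicator) and estimate the mutual information between it and the stimulus segment. The plug-in estimator below illustrates the idea only; the study's estimator and any bias corrections are not specified here.

import numpy as np

def plugin_mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits for discrete-valued variables.

    x, y : 1-D non-negative integer arrays of equal length, e.g. a stimulus
    segment label and a discretized neural response, one entry per time bin.
    """
    x, y = np.asarray(x), np.asarray(y)
    joint = np.zeros((x.max() + 1, y.max() + 1))
    for xi, yi in zip(x, y):
        joint[xi, yi] += 1
    joint /= joint.sum()
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# Candidate codes for the same neuron, one value per stimulus bin:
#   rate code     -> spike count in the bin
#   temporal code -> index of a binary spike-pattern "word" within the bin
#   silence code  -> 1 if the bin contains no spikes, else 0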
Joel I. Berger, Alexander J. Billig, Phillip E. Gander, Meher Lad, Sukhbinder Kumar, Kirill V. Nourski, Christopher K. Kovach, Ariane R. Rhone, Christopher M. Garcia, Hiroto Kawasaki, Brian J. Dlouhy, Matthew A. Howard and Timothy D. Griffiths
Topic areas: memory and cognition correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Working memory is the capacity to hold and manipulate behaviorally relevant information in mind. Previous work (Kumar et al., 2021) examined auditory working memory (AWM) during maintenance of a tone using intracranial EEG and described oscillatory local field potential (LFP) correlates of AWM in auditory and frontal cortices and the hippocampus. The present study sought to identify correlates of the precision of working memory, a model-based measure reflecting the cognitive resources available. Neural correlates of precision were hypothesized to emerge in the hippocampus (HC), in both LFPs and single units, during maintenance and retrieval. Behavioral responses to the task and LFPs from the HC were recorded in four adult neurosurgical patients undergoing invasive monitoring for presurgical localization of epileptic foci [in whom the hippocampus was subsequently found not to be a seizure focus]. In three patients, single units were also recorded in the HC and Heschl’s gyrus (HG). For the AWM task, participants were presented with short target tones, each followed by a 3s retention period. The task was to adjust a test tone to the target within 5s. Working memory precision was calculated over all trials as the reciprocal of the standard deviation of the response error. LFP data were analyzed using time-frequency analysis based on wavelet transforms, and single units were isolated with an automated spike-sorting procedure and examined with trial raster plots and peri-stimulus time histograms. Participants performed the task with similar precision to previously studied healthy controls. Low-frequency LFP activity (
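The precision measure defined above is straightforward to compute; a minimal sketch, assuming target and response are expressed on the same scale (e.g., log-frequency), is:

import numpy as np

def wm_precision(target, response):
    """Working-memory precision as the reciprocal of the standard deviation
    of the response error across trials, following the abstract.

    target, response : 1-D arrays of target and adjusted test-tone values
    (the unit, e.g. semitones or log-frequency, is an assumption here).
    """
    error = np.asarray(response) - np.asarray(target)
    return 1.0 / np.std(error)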
Huaizhen Cai and Yale Cohen
Topic areas: multisensory processes
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Because our environment is inherently multisensory, it is reasonable to speculate that our brain has evolved to preferentially process such multisensory information. Indeed, multisensory activity has been found throughout the entire auditory hierarchy: from the middle ear to the prefrontal cortex. However, despite the large literature on multisensory processing, we do not have a full picture of the relationship between multisensory behavior and neural activity, especially in primate models. Here, we recorded auditory-cortex neural activity at different spatiotemporal scales in order to evaluate its contributions to multisensory behavior. Specifically, we recorded EEG, LFP, and single-unit activity while a monkey performed an audiovisual detection task, in which an ecologically relevant ‘coo’ call embedded in a chorus was delivered with or without a corresponding ‘coo’ video. The signal-to-noise ratio was varied from -15 to 10 dB in 5-dB steps. We report how auditory neurons encode multisensory signals and how well we can decode stimulus and task parameters from individual neural signals (e.g., LFP or single units).
Nathan Schneider, Michael Malina, Rebecca Krall and Ross Williamson
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Auditory-guided behavior is a fundamental aspect of our daily lives, as we rely on sounds to guide our decisions and actions. The primary route for auditory information to propagate out of the auditory cortex (ACtx) is through intratelencephalic (IT) and extratelencephalic (ET) neurons in layer (L) 5. These neurons form the major output of the ACtx and, as a result, are in a privileged position to influence auditory-guided behavior. To investigate the behavioral role of IT and ET neurons, we devised a head-fixed choice task in which mice categorized the rate of sinusoidal amplitude-modulated (sAM) noise bursts as either fast or slow to receive a water reward. To test the necessity of ACtx, we conducted bilateral optogenetic inhibition with GtACR2 and observed a significant decrease in hit rate during inhibition trials. We then used two-photon calcium imaging with selective GCaMP8s expression to monitor the activity of L5 IT and ET populations. Clustering analyses of these populations revealed heterogeneous responses that correlated with various stimulus and task variables. Of particular interest was a distinct motif present in ET neurons, characterized by “categorical” firing patterns that indicated a preference for either slow or fast sAM rates. This categorical selectivity was not initially present, but emerged across learning, as revealed through longitudinal recordings of ET neuron responses. Critically, this categorical selectivity in ET neurons did not manifest during passive exposure to identical stimuli. This suggests that learned categorical selectivity is shaped via top-down inputs that act as a flexible, task-dependent filter. Moreover, ET activity reflected behavioral choices independently of stimulus identity or reward outcome, with choice selectivity increasing throughout learning. In contrast, L5 IT neurons initially exhibited category information which then degraded as mice acquired task proficiency. Furthermore, the ability to decode both stimulus identity and behavioral choice from IT activity decreased across learning. This suggests a tradeoff of information between these two distinct populations within L5, with IT projections playing a role in initial task acquisition, while ET projections are recruited and reinforced throughout learning. Collectively, these findings underscore the differential roles of L5 neurons and contribute to our understanding of how auditory information is used to guide decision-making and action.
Grant Zempolich and David Schneider
Topic areas: correlates of behavior/perception hierarchical organization neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The abilities to detect errors and improve performance following mistakes are paramount to behaviors such as speech and musicianship. Although hearing is instrumental for monitoring and adapting these behaviors, the neural circuits that integrate motor, acoustic, and goal-related signals to detect errors and guide learning in mammals remain unidentified. Here, we show that the mouse auditory cortex encodes error- and learning-related signals during a skilled sound-generating behavior, and that auditory-cortical activity is necessary for learning from mistakes. We developed a closed-loop, sound-guided behavior that requires mice to use real-time acoustic feedback to guide their ongoing forelimb movements. Large-scale electrophysiology recordings from auditory cortex during behavior revealed that individual neurons encode rich information about sound, movement, and goal. Functional clusters of auditory cortex neurons signal errors and predict within-trial and across-trial changes in behavior. Brief, behavior-triggered optogenetic suppression of the auditory cortex hinders behavioral corrections on both rapid and long time scales, indicating that cortical error signals are necessary for learning. Together, these experiments identify a cortical role for detecting errors and learning from mistakes, and suggest that the auditory cortex may subserve skilled sound-generating behavior in mammals.
Sharlen Moore, Zyan Wang, Ziyi Zhu, Ruolan Sun, Angel Lee, Adam Charles and Kishore V. Kuchibhotla
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
A fundamental tenet of animal behavior is that decision-making involves multiple 'controllers.' Initially, behavior is goal-directed, driven by desired outcomes, shifting later to habitual control, where cues trigger actions independent of motivational state. Clark Hull's question from 1943 still resonates today: "Is this transition [to habit] abrupt, or is it gradual […]?" Despite a century-long belief in gradual transitions, this question remains unanswered as current methods cannot disambiguate goal-directed vs habitual control in real-time. Motivation is the basis of goal-directed behaviors, while not the main driver of habitual performance, and thus we sought to study habit expression en passant, in individual mice. To do so, we introduce a novel ‘volitional engagement’ approach, motivating animals by palatability rather than biological need. Offering less palatable water in the home cage reduced motivation for plain water in an auditory discrimination task when compared to water-restricted animals. Using quantitative behavior and computational modeling, we found that palatability-driven animals learned to discriminate as quickly as water-restricted animals but exhibited state-like fluctuations when responding to the reward-predicting cue—reflecting goal-directed behavior. These fluctuations abruptly ceased after thousands of trials, with animals now always responding to the reward-predicting cue. In line with habitual control, post-transition behavior displayed automaticity, decreased error sensitivity (assessed via pupillary responses), and insensitivity to outcome devaluation. Surprisingly, some animals reverted to goal-directed behavior after several sessions showing habitual behavior, suggesting that transitions to habitual decision-making are not permanent. Bilateral lesions of the habit-related dorsolateral striatum (DLS) blocked transitions to habitual behavior. Preliminary recordings taken simultaneously in the dorsomedial striatum (DMS) and DLS are used to understand the interplay between these two areas around the transition point. Thus, 'volitional engagement' reveals abrupt transitions from goal-directed to habitual behavior, suggesting the involvement of a higher-level process that arbitrates between the two. Understanding the exact time course of habit formation opens new avenues to further explore its neural correlates, develop predictive models of habitual behavior and implement timed interventional strategies to manipulate it.
Samuel Smith, Jenna Sugai, Kenneth Hancock and Daniel Polley
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Our perception of sound is related to - but not determined by - the acoustic waveform entering the ear. The same fixed sound can be misheard as another (error of confusion), not heard (error of omission), or heard when no sound was presented (error of commission). An observer can produce each of these error classes despite static auditory evidence and task context. In animals, this variability has been causally linked to fluctuations in global brain state - endogenous patterns of neuromodulator activity linked to arousal and attention. Here, we hypothesized that neural and physiological indices of brain state are predictive of canonical forms of auditory perceptual errors in human subjects. To investigate, we developed a sustained vigilance task in which subjects were asked to monitor a stream of tone clouds for strings of repeating tones. The task was configured such that participants could misclassify the length of strings (error of confusion), fail to report detection of a string (error of omission), and/or report a string where there was none (error of commission). Simultaneously, 64-channel EEG, pupil diameter, eye gaze, and blinks were recorded. Measurements were made in a cohort of 41 normal hearing, neurotypical, young/middle-aged adults (≤50 years, 11 male). The occurrence of a target response string elicited a stereotyped neuroelectric, oculomotor, and autonomic pupillary response. These responses were suppressed for errors of omission, yet partially present for errors of commission. Misclassifications of string length (i.e., errors of confusion) were characterized by sustained power at the counting frequency (8 Hz), which was instead reduced during correct trials. Importantly, neuroelectric and autonomic signatures for particular forms of listening errors were identifiable seconds before the string initiation. For errors of confusion, pupil diameter and global EEG activity were reduced in size several seconds before target onset (p < 0.01). This reduction was largest during later blocks when task performance was highest. Overall, these results show that, in neurotypical human subjects, auditory errors can be accounted for by dynamic neural and physiological indices of brain state. These findings may highlight biomarkers in neurodivergent populations who remain fixed in states of heightened commission errors (e.g., auditory hallucinations and tinnitus), omission errors (e.g., catatonia), or classification errors (e.g., dementia).
Koun Onodera, Amber Kline and Hiroyuki Kato
Topic areas: hierarchical organization neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The auditory cortex consists of primary and higher-order regions interconnected to form hierarchical streams for sound information processing. Despite ample evidence suggesting the critical roles of higher-order auditory cortices in extracting complex acoustic features, how they interact with primary areas for their unique computations remains unclear. Here, we used area-specific optogenetics in mice to investigate the role of inter-areal circuits in sound processing within individual regions. We began by re-evaluating the hierarchy among the primary auditory cortex (A1), anterior auditory field (AAF), and secondary auditory cortex (A2), given recent studies questioning A2’s classification as a higher-order cortex. Optogenetic silencing of one region while simultaneously recording from the other two revealed reciprocal excitation primarily in the superficial layers of all area pairs. A1→A2 showed the most robust excitation, followed by A2→A1 and A1↔AAF connectivities. Surprisingly, minimal interaction was observed between AAF and A2, despite their geographical proximity. Additional analyses of noise correlation and Granger causality between area pairs validated these observations. Our results thus support a hierarchy between A1 and A2, while AAF appears to form a distinct information processing stream. Building on the significant mutual interaction between A1 and A2, our ongoing research investigates how these two areas cooperate in extracting complex acoustic features. Our prior studies identified preferential representations of frequency-modulated sweeps in A1 and coincident harmonics in A2. Leveraging these sound features, we examine how A1 and A2 cooperate to construct unified representations for sounds with combined features, such as frequency-modulated harmonics. By elucidating the dynamic interplay between primary and secondary cortices, our study aims to decipher the principles underlying the extraction of complex features within hierarchical sensory streams with recurrent connections.
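Noise correlations of the kind mentioned above, used alongside Granger causality to validate the inter-areal interactions, are typically computed on trial-to-trial residuals after removing each stimulus's mean response. A generic sketch of that computation, not the authors' exact analysis, is shown below.

import numpy as np

def noise_correlation(resp_a, resp_b):
    """Trial-to-trial (noise) correlation between two simultaneously
    recorded units or areas.

    resp_a, resp_b : (n_stimuli, n_trials) evoked response amplitudes
    The stimulus-locked mean is removed per stimulus before correlating
    the residual fluctuations.
    """
    resid_a = resp_a - resp_a.mean(axis=1, keepdims=True)
    resid_b = resp_b - resp_b.mean(axis=1, keepdims=True)
    return np.corrcoef(resid_a.ravel(), resid_b.ravel())[0, 1]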
Yuko Tamaoki, Samantha Kroon, Jonathan Riley and Crystal Engineer
Topic areas: speech and language hierarchical organization subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Rett syndrome is a genetic disorder caused by a mutation in the X-linked Mecp2 gene. Individuals with Rett syndrome often exhibit seizures, impaired sociability, and difficulty with cognition, motor movements, and speech-language perception and production. These children initially appear to develop typically, with regression symptoms emerging between the ages of 6 and 18 months. In the rodent model of Rett syndrome, similar regression symptoms, including impairments in sensory processing, become apparent starting from 4 months of age. Behaviorally, these heterozygous Mecp2 rodents perform poorly on auditory discrimination tasks when background noise of varying intensities is present. These behavioral impairments are accompanied by degraded cortical activity patterns. In the primary auditory cortex (A1), the tonotopic map that is normally organized from low to high frequencies is disrupted in Mecp2 heterozygous rats. There is a shift in the tonotopic organization in which a greater representation of higher frequencies is observed. These findings have been documented in post-regression animals, but nothing is known about auditory processing in pre-regression animals. Additionally, no studies have documented subcortical physiology in Mecp2 animals. Therefore, the aims of this study are to 1) document multi-unit primary auditory cortex responses to sounds in heterozygous Mecp2 rats before and after signs of regression are apparent and 2) investigate responses from the inferior colliculus in both pre-regression and post-regression rats. Neural responses evoked by tones, speech sounds, and click sounds were recorded from the primary auditory cortex and the inferior colliculus in heterozygous Mecp2 rats and age-matched littermate control wild-type rats. Our preliminary results suggest that responses to sounds in the inferior colliculus are also degraded in post-regression Mecp2 rats. Surprisingly, pre-regression Mecp2 rats also exhibit degraded responses to sounds in the primary auditory cortex. Insights derived from this study may expand the current understanding of auditory processing in Rett syndrome and other neurodevelopmental disorders.
Audrey Drotos, Marina Silveira, Sarah Wajdi, Joelle Chiu and Michael Roberts
Topic areas: subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Rapid changes in frequency known as frequency-modulated (FM) sweeps are common components of many complex sounds including conspecific vocalizations. FM sweeps are encoded in the auditory system through neuron selectivity for upward or downward sweeps. In particular, the inferior colliculus (IC) is a midbrain hub of auditory integration and has been implicated as the site where FM sweep selectivity is generated via the convergence of excitatory and inhibitory inputs tuned to different sound frequencies. However, the specific mechanisms underlying FM sweep selectivity remain poorly understood. We performed in vivo juxtacellular recordings from neurons in the mouse IC while presenting FM sweeps of different speeds, directions, and frequency ranges to examine the factors that shape direction selectivity in IC neurons. In line with previous literature, we find that direction selectivity indexes (DSIs) for 4-64 kHz sweeps are correlated with a neuron’s best frequency. Additionally, we show that the FM selectivity is dependent not only on the direction but also the frequency range of the FM sweep, with neurons exhibiting greater DSIs to sweeps of two octaves compared to four octaves. Surprisingly, we also find that a large proportion of IC neurons were strongly inhibited by FM sweeps and that direction selectivity in some cells was generated after the sound via direction-specific offset spiking. Overall, these results highlight diverse mechanisms underlying FM selectivity in the awake mouse IC, which are likely important for understanding how IC neurons process complex sounds, such as conspecific vocalizations, that contain FM sweeps.
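Direction selectivity indexes like those above are usually defined as a normalized contrast between responses to upward and downward sweeps; the abstract does not give its exact formula, so the version below uses one common convention purely as an illustration.

import numpy as np

def direction_selectivity_index(rate_up, rate_down):
    """One common direction selectivity index for FM sweeps:
        DSI = (R_up - R_down) / (R_up + R_down)

    rate_up, rate_down : mean evoked firing rates (spikes/s) for upward and
    downward sweeps over a matched frequency range and speed.
    DSI > 0 favors upward sweeps, DSI < 0 downward, and 0 means no preference.
    """
    rate_up, rate_down = float(rate_up), float(rate_down)
    total = rate_up + rate_down
    return np.nan if total == 0 else (rate_up - rate_down) / total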
Arthur Lefevre, Vikram Pal Singh, Jingwen Li, Jean-René Duhamel and Cory Miller
Topic areas: neural coding neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Acoustic communication is of critical importance for primates. It is involved in every type of social interaction, and individuals thus need to adapt their vocalizations to different social contexts. However, research has mostly focused on vocalization production and perception separately, uncovering distinct networks of brain regions for these functions. Here we investigated the neural basis of conversation, i.e., the production and perception of vocalizations consistently emitted by two interacting subjects. Using microwire brush arrays, we wirelessly recorded neurons from the anterior cingulate cortex (ACC), a region involved in vocalization production but also receiving inputs from auditory areas, in marmoset monkeys as the animals engaged in their natural vocal behaviors inside the colony room. We simultaneously recorded the calls (8 distinct types) from a pair of bonded individuals using wearable microphones and videotaped them. We recorded isolated single units from dorsal ACC (area 24) and quantified the relationship between different properties of neural activity and marmoset vocal behavior, including during conversational exchanges. Analyses revealed that 60 percent of ACC neurons displayed activity related to vocalization production and/or perception. Most of these neurons (35% of the total) displayed a significant increase in firing rate up to 3 seconds before call production. More critically, a subset of the population (15%) responded to hearing calls and even showed anticipatory activity before call perception, suggesting that ACC neurons may predict the occurrence of vocalizations from the partner during conversations. Interestingly, activity was modulated by the type of call, irrespective of the sound intensity. Finally, we found that some ACC neurons responded to both vocal production and vocal perception. These results indicate that in primates, the ACC is a key region for natural and dynamic acoustic communication, which could adapt vocalization production depending on the social context and the type of calls emitted by a partner.
Matthew Ning, Sudan Duwadi, Meryem Yucel, Alexander Von Luhmann, David Boas and Kamal Sen
Topic areas: novel technologies
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
When analyzing complex scenes, humans often focus their attention on an object at a particular spatial location. The ability to decode the attended spatial location would facilitate brain-computer interfaces for complex scene analysis. Here, we investigated the capability of functional near-infrared spectroscopy (fNIRS) to decode audio-visual spatial attention in the presence of competing stimuli from multiple locations. We targeted the dorsal frontoparietal network, including the frontal eye field (FEF) and intraparietal sulcus (IPS), as well as the superior temporal gyrus/planum temporale (STG/PT). All of these regions have been shown in previous functional magnetic resonance imaging (fMRI) studies to be activated by auditory, visual, or audio-visual spatial tasks. We found that fNIRS provides robust decoding of attended spatial locations for most participants and that decoding performance correlates with behavioral performance. Moreover, we found that FEF makes a large contribution to decoding performance. Surprisingly, the performance was significantly above chance level 1 s after cue onset, which is well before the peak of the fNIRS response. Our results demonstrate that fNIRS is a promising platform for a compact, wearable technology that could be applied to decode attended spatial locations and to reveal contributions of specific brain regions during complex scene analysis.
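Decoding analyses of this kind typically flatten the channel-by-time hemodynamic responses into a feature vector and train a cross-validated linear classifier. A generic sketch along those lines, not the authors' pipeline, is shown below; the variable names and window choices are hypothetical.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def decode_attended_location(hbo, labels):
    """Cross-validated decoding of attended location from fNIRS signals.

    hbo    : (n_trials, n_channels, n_timepoints) HbO (or HbR) traces,
             e.g. restricted to a window after cue onset
    labels : (n_trials,) attended-location codes
    Returns mean cross-validated classification accuracy.
    """
    X = hbo.reshape(hbo.shape[0], -1)
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, X, labels, cv=5).mean()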
Zahra Ghasemahmad, Maryse Thomas, Carolyn Sweeney, Jeffrey Wenstrup and Anne Takesian
Topic areas: neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Mice emit a repertoire of complex vocalizations during different behavioral contexts, including courtship and aggressive social interactions. Playback of these vocalizations can elicit specific, stereotyped behavioral responses in the listening mice. Auditory cortex is thought to provide information about the vocalization identity to the motor and emotion centers of the brain involved in shaping these behavioral reactions. However, the representation of these specific vocalizations within neuronal subpopulations across auditory cortical regions is not well understood. Using a transgenic mouse line that expresses the calcium indicator, GCaMP6s, in a subset of cortical pyramidal neurons (Thy1-GCaMP6s mice), we performed widefield imaging from the auditory cortex in awake head-fixed mice during playback of mouse vocalizations emitted in distinct affective contexts. Playback of a range of mouse vocalizations induced robust activity across the primary auditory cortex (A1), anterior auditory field (AAF), and multiple higher-order auditory cortical fields, identified by mapping best frequency spatial gradients. Our ongoing studies are using two-photon calcium imaging to examine the responses of L2/3 pyramidal neurons in A1 to vocal stimuli. We observe that these pyramidal neurons show diverse responses to vocalizations, with subsets of neurons that show a fast, transient activation, neurons that show a prolonged increase in activity and neurons that are suppressed. Moreover, subsets of these neurons show robust responses to a range of vocalizations, whereas others show reliable and selective responses to specific vocalizations. Our future experiments will further examine the responses to these vocal stimuli within various auditory cortical subregions and across several distinct excitatory and inhibitory cell types. Together, these studies will provide insight into the processing of salient vocalizations within specific circuits across auditory cortical fields and the possible role of these circuits in shaping sensory-driven behaviors.
Sarah Tune and Jonas Obleser
Topic areas: memory and cognition speech and language correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Preserved communication abilities promote social well-being and healthy aging. Although sensory acuity deteriorates with age, preserved attention-guided neural filtering of relevant sensory information in auditory cortex can provide an age-independent support mechanism for communication. Yet, how longitudinally stable is such a compensatory brain–behaviour link? More generally, can neural filtering predict inter-individual differences in future changes in behavioural functioning? Here we tracked N=105 individuals neurally and behaviourally over approximately two years (age-varying cohort of 39–82 yrs). First, despite the expected decline in sensory acuity, listening-task performance proved remarkably stable. Second, when looking at each measurement time point separately (T1, T2), neural filtering and behavioural metrics were correlated with each other. However, neither neural filtering at T1 nor its T1–T2 change was predictive of individuals’ two-year change in listening behaviour, under a combination of modelling strategies. Our results cast doubt on the translational potential of attention-guided neural filtering metrics as predictors of longitudinal change in listening behaviour over middle to older adulthood. Our data support the conjecture that audiology-typical listening behaviour and neural filtering ability follow largely independent developmental trajectories associated with significant inter-individual variability.
Thomas Harmon, Seth Madlon-Kay, John Pearson and Rich Mooney
Topic areas: auditory disorders correlates of behavior/perception multisensory processes neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
An influential idea is that corollary discharge from vocal motor signals suppresses auditory cortical responses to auditory feedback, ultimately helping the brain to distinguish self-generated vocal sounds from other sounds. Support for this idea stems mostly from studies in humans and monkeys, and whether it is a general principle of mammalian auditory cortical function remains unclear. Male mice vocalize extensively during courtship, but mouse courtship is a complex behavior that also involves both vocal and non-vocal movements, odor cues from the female, and heightened arousal, any or all of which could potentially modulate auditory cortical activity. We developed a protocol for studying courtship interactions between female and head-fixed male mice to systematically search for evidence of vocal motor modulation in the mouse auditory cortex. We found that the male’s ultrasonic vocalizations (USVs), as well as its arousal and locomotion, increased when he interacted with a female. We used two photon calcium imaging in the head-fixed male to monitor the activity of auditory cortical neurons during vocalizations and playback of the same vocal bouts. We found neurons that responded strongly to playback stimuli but only weakly responded during vocalization, consistent with a suppressive corollary discharge mechanism. Comparing vocal and non-vocal courtship interactions allowed us to control for effects of arousal, odor, and locomotion, revealing a specific influence of vocalization on a subset of auditory cortical neurons. Many of these vocalization-modulated neurons responded to vocal playback and ultrasonic tones, and their activity positively scaled to the amplitude of vocal production, consistent with the influence of vocal feedback. Furthermore, many responded prior to vocal onset, pointing to a motor influence. To further isolate and characterize the influence of a vocal motor signal from vocal feedback, we imaged from the auditory cortex of congenitally deaf (TMC1-/-) male mice. Notably, vocal modulation was inverted in deafness, with the majority of neurons showing negative modulation which scaled to the amplitude of the vocal bout. Taken together, these results show that the auditory cortex of the hearing mouse integrates vocal motor signals and vocal feedback to suppress responses to self-generated vocalizations. Therefore, vocal corollary discharge mechanisms are likely to be a general feature of the mammalian auditory cortex.
Cynthia King, Stephanie Lovich, David Murphy, Rachel Landrum, David Kaylie, Christopher Shera and Jennifer Groh
Topic areas: multisensory processes
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
We recently discovered a unique type of low-frequency otoacoustic emission (OAE), time-locked to the onset and offset of saccadic eye movements and occurring in the absence of external sound (Gruters et al., 2018). These eye movement-related eardrum oscillations (EMREOs) contain parametric information about horizontal and vertical eye displacement and position, revealing that information about the position of the visual spatial map (i.e. the retina) with respect to the head is available to the auditory system at the level of the periphery (Lovich et al., 2023). EMREOs are likely to be produced by some combination of middle ear muscles and outer hair cells, and thus abnormalities in this signal may ultimately have clinical relevance for diagnosing efferent causes of hearing dysfunction. However, before this promise can be realized, normative data from participants with normal hearing are needed. By identifying attributes of EMREOs that are similar across normal participants, we can set the stage for future comparisons with EMREOs in individuals with abnormalities that affect various motor components of the ear. We find that in subjects with normal hearing thresholds and normal middle ear function, all ears exhibit measurable EMREOs characterized by a phase reversal for contralaterally versus ipsilaterally directed horizontal saccades. There is a large peak in the signal occurring soon after saccade onset, and an additional large peak time-locked to saccade offset. We find that waveforms are less variable for horizontally versus vertically directed saccades, and we report evidence that saccade duration is encoded in the waveform. Components of EMREOs that are most consistent across subjects, such as the phase reversal for contraversive vs. ipsiversive saccades, are the ones that are most likely to play an essential role in their function. In contrast, response differences between subjects are likely to reflect normal variation in individuals’ auditory system anatomy and physiology, similar to traditional measures of auditory function such as auditory-evoked OAEs, tympanometry and auditory-evoked potentials. In future analyses, focusing on the most consistent EMREO characteristics identified here in a normal population will provide the best strategy for pinpointing differences in abnormal systems.
Philip Bender, Mason McCollum, Benjamin Mendelson, Helen Boyd-Pratt and Charles Anderson
Topic areas: correlates of behavior/perception thalamocortical circuitry/function
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Synaptic zinc signaling modulates synaptic activity and is present in specific populations of cortical neurons, suggesting that synaptic zinc contributes to the diversity of intracortical synaptic microcircuits and their functional specificity. To understand the role of zinc signaling in the cortex, we performed whole-cell patch-clamp recordings from intratelencephalic (IT)-type neurons and pyramidal tract (PT)-type neurons in layer 5 of the mouse auditory cortex during optogenetic stimulation of specific classes of presynaptic neurons. Our results reveal that synaptic zinc potentiates AMPAR function in a synapse-specific manner. We performed in vivo 2-photon calcium imaging of the same classes of neurons in awake mice and found that changes in synaptic zinc can widen or narrow the sound-frequency tuning bandwidth of IT-type neurons, but only widen the tuning bandwidth of PT-type neurons. These results expand the known functions of synaptic zinc and reveal synapse- and cell-type specific actions of synaptic zinc in the cortex.
Kelvin T.Y. Wong, Y. Teresa Shi, Lin Zhou, Jeffrey Yang, Molly Han, Audrey Lu, Kai Lu and Robert C. Liu
Topic areas: memory and cognition neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
While socially-significant sounds that evoke natural behavioral responses in animals are usually assumed to be fixed and species-specific, they often require experience and learning to become meaningful for a listener. One example of social sound learning comes from the mouse maternal model of communication, in which mothers or virgin co-carers use distal cues to locate and retrieve their pups back to the nest. Traditionally, ultrasonic vocalizations emitted by pups serve as a localizing signal, but mothers can also learn to associate synthetic sounds with pup retrieval. While much research has implicated auditory cortex (ACx) in associative sound learning that guides behavioral actions in a variety of paradigms, not all auditory tasks have been found to require ACx - either to learn or to express the behavior. Here, we tested how silencing ACx alters an animal's ability both to learn a new sound that reliably predicts where pups will be found, and to express the auditory behavior after it has been learned. We first asked whether ACx is necessary for expression of the learned behavior using chemogenetic inhibition. The ACx of naïve female virgin mice (N=11) was bilaterally injected with an adeno-associated virus carrying a silencing DREADD. After three weeks of expression, those animals were trained in a T-maze to enter one of the two arms cued by an amplitude-modulated band-pass noise and rewarded with pups, which were then retrieved back to the nest in the main stem. Within 8 days, most animals learned to use the sound to locate pups. After the task was learned, we temporarily inactivated the ACx by injecting clozapine-n-oxide (CNO) and found that performance significantly decreased (p < 0.05). Next, we tested the effect of ACx inactivation during learning. CNO or saline was injected 30 min prior to each daily training session. Our data show that after eight sessions, the CNO animal group (N=14) showed a significant impairment in performance (p < 0.05) compared to the saline group (N=15) and the control virus group (N=10). Together, our results suggest that ACx activity is necessary both for learning a socially rewarded acoustic cue and for expressing recognition of the sound in behavior soon after learning. Continuing studies are examining how this ACx activity is dynamically used for learning and for guiding approach to a pup during behavior, ultimately helping to inform us about sensory cortical function during communication behaviors.
Nicholas Audette and David Schneider
Topic areas: memory and cognition correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Many of the sensations experienced by an organism are caused by their own actions, and accurately anticipating both the sensory features and timing of self-generated stimuli is crucial to a variety of behaviors. In the auditory cortex, neural responses to self-generated sounds exhibit frequency-specific suppression, suggesting that movement-based predictions may be implemented early in sensory processing. By training mice to make sound-generating forelimb movements, we recorded detailed neural responses while mice produced and experienced sounds that met or violated their expectations. We identified suppression of responses to self-generated sounds that was specific across multiple acoustic dimensions and to a precise position within the trained movement. Prediction-based suppression was concentrated in L2/3 and L5, where deviations from expectation also recruited a population of prediction-error neurons. Prediction error responses were short latency, stimulus-specific, and dependent on a learned sensory-motor expectation. Recording when expected sounds were omitted revealed expectation signals that were present across the cortical depth and peaked at the time of expected auditory feedback. Building on these findings, we are pursuing two new experimental directions. First, we are implementing simultaneous recordings from identified subregions of the auditory cortex and thalamus to understand how prediction signals traverse the distributed auditory circuit with millisecond resolution. Second, we are developing acoustic augmented reality home cage behaviors and freely moving recording techniques to understand how neural responses to a given external stimulus can shift dynamically to suit the needs of divergent behavioral contexts. Together, these experiments can identify circuit mechanisms that enable predictive processing across multiple interacting areas, and can provide insight into how these predictive processing circuits and other modes of contextual modulation are multiplexed across a fixed neural population.
Helen A. Boyd-Pratt, Philip T.R. Bender and Charles T. Anderson
Topic areas: correlates of behavior/perception thalamocortical circuitry/function
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Auditory processing in the auditory cortex relies on precisely organized circuits consisting of discrete inhibitory and excitatory neurons in different cortical layers. Parvalbumin (PV)-positive interneurons, the most common type of inhibitory neuron in the sensory cortex, provide inhibitory control of cortical function through their synaptic connections with local excitatory neurons. ZnT3 is a zinc transporter protein that loads free zinc into glutamatergic vesicles, where it is coreleased with glutamate and shapes the function of NMDA and AMPA receptors. This synaptic zinc is a powerful modulator of synaptic signaling and supports the processing of acoustic stimuli. Recently, research has uncovered cell-type- and synapse-specific roles of synaptic zinc signaling, but the synaptic mechanisms by which vesicular zinc shapes the activity of inhibitory interneurons are not well understood. Understanding the role of synaptic zinc and ZnT3 in specific excitatory-inhibitory circuits addresses a significant gap in our understanding of the mechanisms underlying the synaptic basis of inhibitory control crucial for precise acoustic encoding. We used optogenetic activation and simultaneous paired whole-cell patch-clamp electrophysiological recordings in acute brain slices from PV-Cre mice to determine the synapse-specific effects of synaptic zinc on excitatory-inhibitory microcircuits in the auditory cortex of mice.
Mason McCollum, Abbey Manning and Charles T. Anderson
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
A fundamental feature of sensory processing is the ability of animals to adapt to the current conditions in their environment while maintaining their ability to detect novelty in the same environment. A correlate of this ability is a robust neuronal phenomenon called stimulus-specific adaptation, in which a repeated stimulus results in adaptation with smaller and smaller neuronal responses over time, but a deviant stimulus still elicits large, robust responses from the same neurons. Recent work has established that synaptically released zinc is an endogenous mechanism that shapes neuronal responses to sounds in the auditory cortex. Here, to understand the contributions of cortical synaptic zinc to deviance detection in specific cortical neurons, we performed widefield and 2-photon calcium imaging of multiple classes of cortical glutamatergic neurons. We find that intratelencephalic (IT) neurons in both layer 2/3 and layer 5, as well as extratelencephalic (ET) neurons in layer 5, all demonstrate deviance detection; however, we find a specific enhancement of this deviance detection in ET neurons that arises from ZnT3-dependent synaptic zinc from layer 2/3 IT neurons. Genetic deletion of ZnT3 from layer 2/3 IT neurons removes the enhancing effects of synaptic zinc on ET neuron deviance detection and also results in poorer acuity of detecting deviant sounds by behaving mice.
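Deviance detection in imaging data like this is often summarized per neuron with a normalized contrast between responses to the same sound presented as a deviant versus as a repeated standard; the abstract does not state which index was used, so the sketch below shows one common convention purely as an illustration.

import numpy as np

def deviance_detection_index(resp_deviant, resp_standard):
    """A commonly used contrast for deviance detection:
        DDI = (R_dev - R_std) / (R_dev + R_std)

    resp_deviant, resp_standard : mean evoked responses (e.g. dF/F) of one
    neuron to the same sound presented as a deviant vs. as a repeated standard.
    Values near +1 indicate strong deviance detection; 0 indicates none.
    """
    num = resp_deviant - resp_standard
    den = resp_deviant + resp_standard
    return np.nan if den == 0 else num / den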
Ralph Peterson, Alessandro La Chioma, Jessica Guevara, Per-Niklas Barth, Deanna Garcia and David Schneider
Topic areas: correlates of behavior/perception neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
While moving on natural surfaces, mice produce sounds that can alert potential predators to their location. Studies in wild mice show that mice actively adapt their locomotor behavior to minimize self-generated acoustic cues. Although laboratory mice have been bred in captivity for generations, they retain innate predator avoidance behaviors such as preferring the periphery of an open arena and running for shelter in response to a looming visual cue. It remains unknown whether laboratory mice, like their wild counterparts, adjust their walking behavior on noisy surfaces to help avoid dangerous situations. Here, we made audio and video recordings of mice in a controlled laboratory setting as they walked on natural surfaces including sand, pebbles, and dry leaves. Different natural surfaces produce distinct sounds in frequency bands that are detectable by mouse predators. Deep quantification of open field behavior reveals that mice uniquely adapt their behavior to each surface. When a mouse's ears are plugged, it behaves as if it were on a quieter surface than it actually is, indicating the importance of acoustic self-monitoring for driving these surface-dependent behaviors. These experiments identify an innate behavior through which mice adjust their locomotion using acoustic self-monitoring, and which may facilitate predator avoidance.
Dakshitha Anandakumar, Cristina Besosa and Robert Liu
Topic areas: memory and cognition correlates of behavior/perception neural coding neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Animals evolved neural mechanisms for ethological signals like vocalizations to robustly trigger social behavioral responses that help organisms live and reproduce. But these mechanisms can be made more efficient through experience, by learning that other cues reliably predict the need for the same critical social behaviors. Thus, a baby’s cry and an alarm from a baby monitor can both elicit an urgent response from a new mother to her infant. Does the neural coding of such behaviorally synonymous signals remain independent during sensory processing, to separately elicit responses via downstream motivational and motor areas, or does their convergent meaning emerge earlier within the sensory system itself? We answer this for acoustic cues linked to a natural reproductive behavior in mice: sound-guided retrieval of pups. Pup ultrasonic vocalizations (USVs) elicit search and retrieval by mothers, but mothers can also learn in a T-maze (Dunlap et al., 2020) to be guided by a new synthetic Target whose source location predicts pup delivery, like USVs. Auditory cortex (ACx) is needed for trained mice to express the learned association. However, contrary to usual expectations that learning would increase cue-evoked excitation in ACx, noncanonical forms of ACx plasticity dominate, particularly in neurons responding to both cues. We recorded 992 neurons from 5 Trained and 5 Yoked (i.e., sound-exposed in the T-maze with pups but no retrieval) mothers while awake and passively listening to the Target. In Core ACx of Trained mice, neurons that jointly responded to both cues and belonged to the Target’s “lateral band” (i.e., BF outside the Target’s spectrum) were tuned in their selective Target-evoked suppression rather than excitation, unlike in Yoked mice. This result reinforces the hypothesis that behavioral meaning impacts sound encoding through selective ACx Core suppression (Galindo-Leon et al., 2009). Meanwhile, in secondary ACx (A2), we saw an increased prevalence of OFF firing, beyond chance occurrence, specifically in neurons that exhibited OFF firing to both Target and USV, and their OFF excitation was tuned around the Target. This broadens prior findings of A2 OFF plasticity for USVs in mothers (Chong et al., 2020) by revealing individual A2 neurons as a de novo site of convergent excitation by behaviorally synonymous sounds that are acoustically distinct. Hence, ACx plays an active role in constructing semantic categories from sounds. Funded by NIH R01DC008343.
Jonathan Peelle, Maria Crusey and Molly Henry
Topic areas: memory and cognition speech and language
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Acoustic rhythms generate expectations that influence our ability to attend to future stimuli. In the present study, we examine auditory entrainment using a target detection task in which a tone is presented in steady-state broadband noise following an amplitude-modulated entraining rhythm (Hickok, Farahbod, and Saberi, 2015). Prior work has shown that target detection accuracy fluctuates at the same frequency as the entraining stimulus. Two broad categories of possible mechanisms have been suggested to account for this observation. In the first, bottom-up entrainment of neural oscillations (relying primarily on sensory input) rhythmically alters excitability in auditory areas. Alternatively, attention-based modulation relying on top-down processing may influence perception. In adult listeners with normal self-reported hearing, we tested whether auditory rhythmic entrainment might be mediated by top-down attentional processing by explicitly cuing proactive attention on a subset of trials. We used two types of cued trials. On general cue trials, a set of two dashes (“- -”) appeared prior to the auditory stimulus. On specific cue trials, participants saw a single word on the screen, “peak” or “trough”, indicating the phase of the rhythmic modulation at which a target (if present) would occur. If rhythmic modulation of target detection depends on top-down processes, we would expect explicit cues to improve accuracy. Conversely, if rhythmic modulation of target detection is primarily reliant on sensory information, we would expect little difference between cued and uncued conditions. In preliminary analyses (N = 8) we found robust effects of attention, such that target detection was significantly more accurate on cued trials than uncued trials (BF = 243). Because the acoustic stimuli were identical across these trial types, we attribute performance differences to attention. These results suggest attention plays a key role in rhythmic entrainment.
Alessandro La Chioma and David Schneider
Topic areas: multisensory processes neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Virtual reality (VR) is an important tool in modern systems neuroscience, allowing researchers to couple experimentally controlled sensory feedback to an animal’s real-time behavior. Most VR systems yoke sensory feedback to an abstract measure of behavior (e.g. running speed) or to a contrived, non-natural behavior (e.g. operating a lever). Yet in nature, the consequences of one’s action are often reproducibly coupled to specific behavioral kinematics (e.g. the moment one’s foot hits the ground). Moreover, while many extant VR approaches focus on a single sensory modality, most natural behaviors have multimodal sensory consequences. Here, we developed a more naturalistic visual-acoustic VR system. We performed real-time locomotion tracking and gait kinematics analysis to provide artificial footstep sounds that were tightly yoked to a precise phase of the step cycle, creating an ethological and experimentally manipulable form of auditory reafference. While running on the treadmill and hearing footstep sounds, head-fixed mice repeatedly traversed two different contextual environments, each consisting of a distinct visual corridor accompanied by distinct footstep sounds. Using this system, we asked whether neural activity in the auditory cortex (AC) reflects predictions about the sound that footsteps are expected to produce, and whether prediction-related processing is modulated by context. Following behavioral acclimation, we made high-density neuronal recordings from primary AC as mice traversed the two VR environments and experienced either expected or deviant footsteps. We identified subsets of neurons with significant contextual modulation, responding differently to the same sound heard in the expected versus the unexpected context. These expectation violation-like signals emerge almost immediately after a mouse enters a new context, suggesting a rapid updating of predictions in parallel with behavior. We noted the presence of neurons with context-dependent modulation in infragranular cortex, consistent with other forms of predictive processing. Preliminary population-level analysis suggests that context information is embedded in AC population activity. Ongoing analyses are assessing how real-time footstep VR compares to more traditional speed-based VR systems. Overall, our results suggest that the AC combines auditory and motor signals with visual cues for flexible, context-dependent processing of self-generated sounds.
Conor Lane, Veronica Tarka, Edith Hamel and Etienne de Villers-Sidani
Topic areas: memory and cognition neural coding subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Psilocybin - a psychoactive compound found mostly in the Psilocybe genus of mushrooms - has long been employed for spiritual and therapeutic purposes. Promising clinical studies have created a surge of interest in its potential as a treatment for neuropsychiatric conditions, such as addiction and treatment-resistant depression. It is increasingly believed that the serotonin 2A receptor agonist’s therapeutic properties are exerted by facilitating structural and functional neuroplasticity, such as increasing dendritic spine formation and changes in whole-brain functional connectivity. Knowledge of the direct effects of psilocybin on neuronal activity remains sparse. Moreover, despite observed changes in human auditory perception, no study of psilocybin’s effects on auditory cortex (ACx) sensory processing has been undertaken. We hypothesised that psilocybin may exert acute effects on ACx information processing through changes in stimulus-specific adaptation (SSA), the mechanism by which neurons become less sensitive to repeated stimuli over time whilst remaining sensitive to novel or unexpected stimuli. This is vital for encoding behaviourally relevant information, and helps to fine-tune neuronal circuitry through changes to synaptic structure and function. We used in vivo two-photon microscopy combined with a fluorescent neuronal indicator mouse line (Thy1-GCaMP6s) to examine the acute effects of 1 mg/kg psilocybin administration on ACx responses to pure tone stimulation. Awake mice (N = 7) were exposed to randomly presented 100 ms pure tones from 4-64 kHz before and after drug administration. Saline administration produced a 12.4% ± 5.2 (SE) mean reduction in the number of sound-responsive cells between recordings, whereas psilocybin produced only a 1.2% ± 4.2 reduction (paired t-test, p = 0.0036). The mean relative spiking activity of responsive cells was reduced by 21.2% ± 3.3 following saline but by only 10.6% ± 3.6 following psilocybin (paired t-test, p = 0.037). The cumulative probability of a cell responding to the lowest intensity of sound (35 dB) was 35.5% for saline and 40.0% for psilocybin. These results indicate that psilocybin impairs SSA in ACx neurons, reducing the characteristic decrease in sensitivity to repeated stimuli. Interference with SSA suggests that psilocybin may enhance the brain’s ability to adapt to new stimuli and facilitate neuroplastic changes. With further research, this could be used to support recovery and rehabilitation from brain injuries or in neuropsychiatric conditions.
Jack Toth, Badr Albanna, Brian DePasquale, Saba Fadaei, Olivia Lombardi, Trisha Gupta, Kishore Kuchibhotla, Kanaka Rajan, Robert Froemke and Michele Insanally
Topic areas: correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Neuronal responses during behavior are diverse, ranging from highly reliable ‘classical’ responses to irregular or seemingly random ‘non-classically responsive’ firing. While a continuum of response properties is frequently observed across neural systems, little is known about the synaptic origins and contributions of diverse response profiles to network function, perception, and behavior. Here, we combined in vivo cell-attached, extracellular, and whole-cell recordings during behavior with a novel task-performing spiking recurrent neural network (RNN). We recorded from the auditory cortex (AC) of rats and mice during a go/no-go auditory recognition task (rats: d’ = 2.8±0.1, N = 15; mice: d’ = 2.5±0.1, N = 7). In both species, we observed a wide range of single-unit response types, from classically responsive cells that were highly modulated relative to pre-trial baseline to non-classically responsive cells with relatively unmodulated firing rates. To relate synaptic structure to spiking patterns over the response-type continuum, we developed a spiking RNN model incorporating both excitatory and inhibitory spike-timing-dependent plasticity (STDP), trained to perform a go/no-go stimulus classification task similar to that of the behaving animals. This model captured the distribution of heterogeneous responses observed in the AC of behaving rodents. Detailed inactivation experiments revealed that classically responsive and non-classically responsive model units contributed to task performance via output and recurrent connections, respectively. Excitatory and inhibitory plasticity independently shaped spiking responses to increase the number of non-classically responsive units while keeping the full network (all units) engaged in performance. Local patterns of synaptic inputs predicted the spiking responses of network units as well as the responses of auditory cortical neurons from in vivo whole-cell recordings during behavior. Strikingly, only non-classically responsive units altered network dimensionality by inducing correlations, suggesting that these units play a privileged role in determining the scale of network dynamics. While excitatory and inhibitory STDP independently increased the fraction of non-classically responsive units, both rules must be active to preserve high-dimensional activity. Thus, a diversity of neural response profiles emerges from synaptic plasticity rules with distinctly important functions for network performance.
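A note for readers less familiar with STDP: the plasticity rules referenced above are not specified in detail here, but a generic pairwise exponential STDP update, of the kind commonly used in spiking network models, can be sketched as follows (the learning rates and time constant below are placeholder values, not parameters of the authors' model).

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Generic pairwise exponential STDP: potentiate when the presynaptic spike
    precedes the postsynaptic spike, depress otherwise. Times are in ms."""
    dt = t_post - t_pre
    if dt > 0:                               # pre leads post -> potentiation
        dw = a_plus * np.exp(-dt / tau)
    else:                                    # post leads pre -> depression
        dw = -a_minus * np.exp(dt / tau)
    return np.clip(w + dw, 0.0, 1.0)

# Example: a pre->post pairing separated by 5 ms slightly strengthens the synapse.
print(stdp_update(0.5, t_pre=100.0, t_post=105.0))
```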
Kaiwen Shi, Gunnar Quass, Meike Rogalla, Alexander Ford, Jordyn Czarny and Pierre Apostolides
Topic areas: neural coding subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The inferior colliculus (IC) is an evolutionarily conserved midbrain structure that plays pivotal roles in processing amplitude modulation (AM) of the sound envelope, a key feature of conspecific vocalizations and human speech. The IC comprises several sub-regions: a primary central region receives ascending auditory inputs from the brainstem and projects to the primary auditory thalamus, while non-primary “shell” regions integrate intra-collicular inputs and project to behaviorally relevant, higher-order thalamic regions interfacing with the amygdala, striatum, and nonprimary auditory cortex. While decades of studies on AM coding have focused on central IC neurons, whether and how AM sounds are encoded by shell IC neurons remains underexplored, owing to the challenges of recording from these neurons located near the tectal surface. Here, we used 2-photon calcium imaging to study how shell IC neurons of awake, head-fixed mice respond to sinusoidal amplitude modulated (sAM) narrow-band noise (65-70 dB SPL, 5-200 Hz sAM rate, 0-100% sAM depth, carrier bandwidth: 16 ± 2 kHz). The calcium indicator GCaMP6f/6s/8s was expressed in shell IC neurons of 5- to 8-week-old mice; 2-photon microscopy was used to record neural activity in the shell IC while the mice passively listened to sAM sounds with varying sAM depths and rates. We analyzed responses from 1213 sound-responsive neurons recorded from 13 mice. We find that all the major sAM tuning properties previously described in the central IC (low-pass, high-pass, band-pass, and band-reject) were similarly identified in both excitatory and inhibitory shell IC neurons. Overwhelmingly, increasing the depth of sAM sounds enhanced shell IC neuron responses to preferred and non-preferred sAM rates, indicating a monotonic encoding of sAM depth and limited mixed selectivity for specific combinations of sAM rate and depth. Although most individual shell IC neurons displayed low neurometric discriminability and selectivity for sAM sounds, we find that sAM information is accurately represented in the shell IC by a neural population code. Specifically, sAM rate is well represented in the shell IC, and this representation is heavily dependent on sAM depth. Altogether, our data uncover a substantial population-level sAM representation in the non-lemniscal regions of the IC, thus shedding light on the building blocks of complex sound perception.
Yuran Zhang, Jiajie Zou and Nai Ding
Topic areas: speech and language correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The syllable is a perceptually salient unit in speech. Since both the syllable and its acoustic correlate, i.e., the speech envelope, have a preferred range of rhythmicity between 4 and 8 Hz, it is hypothesized that theta-band neural oscillations play a major role in extracting syllables based on the envelope. A literature survey, however, reveals inconsistent evidence about the relationship between the speech envelope and syllables, and the current study revisits this question by analyzing large speech corpora. It is shown that the center frequency of the speech envelope, characterized by the modulation spectrum, reliably correlates with the syllable rate only when the analysis is pooled over minutes of speech recordings. In contrast, in the time domain, a component of the speech envelope is reliably phase-locked to syllable onsets. Based on a speaker-independent model, the timing of syllable onsets explains about 24% of the variance in the speech envelope. These results indicate that local features in the speech envelope, rather than the modulation spectrum, are the more reliable acoustic correlate of syllables.
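For illustration, the two acoustic measures contrasted above can be computed along the following lines; this Python sketch uses a standard Hilbert-envelope approach and a toy amplitude-modulated stimulus, not the corpora or analysis pipeline of the study, and the filter settings are illustrative assumptions.

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

def speech_envelope(x, fs, lp_cutoff=10.0):
    """Broadband amplitude envelope: magnitude of the analytic signal,
    low-pass filtered to retain syllabic-rate (< ~10 Hz) modulations."""
    env = np.abs(hilbert(x))
    sos = butter(4, lp_cutoff, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, env)

def modulation_spectrum(env, fs):
    """Power spectrum of the mean-removed envelope; its peak defines the
    envelope 'center frequency' referred to above."""
    env = env - env.mean()
    spec = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
    return freqs, spec

# Toy check: noise amplitude-modulated at 5 Hz yields a modulation-spectrum
# peak near 5 Hz, within the canonical syllabic range.
fs = 16000
t = np.arange(fs) / fs
x = np.random.randn(fs) * (1 + np.sin(2 * np.pi * 5 * t))
freqs, spec = modulation_spectrum(speech_envelope(x, fs), fs)
print(freqs[np.argmax(spec[1:]) + 1])
```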
Sara Jamali, Stanislas Dehaene, Sophie Bagur, Timo van Kerkoerle and Brice Bathellier
Topic areas: memory and cognition correlates of behavior/perception neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The brain can detect violations of temporal regularities in incoming stimuli, an ability that may reflect the predictions it generates. This ability is often studied based on single-tone violations and with short intervals between stimuli, which may be explained, at least in part, by stimulus-specific adaptation mechanisms. Here, we show that the mouse auditory cortex represents surprise responses to local regularity violations of short sound sequences with a very long inter-sequence interval of 30s. Although, at the level of a single trial, we find no effect of adaptation to commonly played sequences, responses to rare stimuli are still greater than responses to common stimuli. This effect is suppressed by eliminating sequence structure. By contrast, we were unable to observe any violation response to pure global sequence regularities using short or long inter-sequence intervals. Thus the global prediction effects observed so far only in awake monkeys and humans do not seem to exist in the mouse auditory cortex. VIP neurons were found to mainly implement a sequence termination code independent of sequence identity. These results provide new insights into an underlying circuit logic of stimulus prediction that does not only rely on adaptation.
Yulia Oganian, Katsuaki Kojima, Assaf Breska, Srikantan Nagarajan and Edward Chang
Topic areas: speech and language neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The amplitude envelope of speech is crucial for accurate comprehension. In what is considered a key stage of speech processing, the phase of neural activity in the theta-delta bands (1-10 Hz) tracks the phase of the speech amplitude envelope during listening. However, the mechanisms underlying this envelope representation have been heavily debated. A dominant model posits that envelope tracking reflects entrainment of endogenous low-frequency oscillations to the speech envelope. Alternatively, envelope tracking may reflect a series of evoked responses to acoustic landmarks within the envelope. It has proven challenging to distinguish these two mechanisms. To address this, we recorded magnetoencephalography while participants (n=12, 6 female) listened to natural speech, and compared the neural phase patterns to the predictions of two computational models: an oscillatory entrainment model, and a model of evoked responses to peaks in the rate of envelope change. Critically, we also presented speech at slowed rates, where the spectro-temporal predictions of the two models diverge. Our analyses revealed transient theta phase-locking in regular speech, as predicted by both models. However, for slow speech we found transient theta and delta phase-locking, a pattern that was fully compatible with the evoked response model but could not be explained by the oscillatory entrainment model. Furthermore, we compared neural responses to acoustic edges of different magnitudes between the slowed and regular speech conditions. We found that while responses were overall scaled with edge magnitude, they were also normalized relative to the contextual distribution of edge magnitudes. That is, while edges were overall shallower in slowed speech than in regular speech, response magnitudes did not differ between the two conditions. Taken together, our results suggest that neural phase-locking to the speech envelope reflects a discrete representation of transient information rather than oscillatory entrainment. This representation is normalized for contextual speech rate, making it a flexible and versatile representation of syllable structure in speech.
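The evoked-response account contrasted above can be caricatured in a few lines: find peaks in the rate of envelope change ("acoustic edges") and convolve them, scaled by their magnitude, with a stereotyped response kernel. The kernel shape, threshold, and toy envelope below are hypothetical choices for illustration, not the authors' fitted model.

```python
import numpy as np
from scipy.signal import find_peaks

def evoked_response_model(envelope, fs, kernel):
    """Predicted neural signal = sum of kernel responses launched at acoustic
    edges (peaks in the positive rate of envelope change), scaled by edge size."""
    d_env = np.clip(np.gradient(envelope) * fs, 0, None)
    peaks, props = find_peaks(d_env, height=np.percentile(d_env, 90))
    drive = np.zeros_like(envelope)
    drive[peaks] = props["peak_heights"]               # one impulse per edge
    return np.convolve(drive, kernel, mode="full")[: len(envelope)]

fs = 100                                               # envelope sampling rate (Hz)
t_k = np.arange(0, 0.3, 1 / fs)
kernel = np.exp(-t_k / 0.1) * np.sin(2 * np.pi * 5 * t_k)   # damped ~theta response

t = np.arange(0, 3, 1 / fs)
env = 0.5 * (1 + np.sin(2 * np.pi * 4 * t - np.pi / 2))     # 4 Hz pseudo-syllabic envelope
prediction = evoked_response_model(env, fs, kernel)
```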
Baher Ibrahim, Yoshitaka Shinagawa, Austin Douglas, Gang Xiao, Alexander Asilador and Daniel Llano
Topic areas: subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The inferior colliculus (IC) is an information processing hub that receives widespread convergent auditory projections. While the dorsal cortex (DC) - the non-lemniscal division of the IC - receives major auditory cortical projections, some reports have shown that the DC is a tonotopic structure, indicating an ability to integrate basic spectral features of sound when processing complex auditory information. However, it is unclear whether the DC has another level of mapping for integrating the different spectral and temporal features of complex sounds across different sound levels. Therefore, two-photon calcium imaging was used to track the responses of DC neurons to sounds of differing spectral and temporal complexity: pure tones (PT), unmodulated broadband noise (UN), and amplitude-modulated broadband noise (MN). In addition to the tonotopic map, the DC showed a periodotopic organization in which cells of a medial rostrocaudal area were best tuned to UN, separating medial and lateral regions whose cells were best tuned to MN. The response of each neuron to each tested sound was used to generate spectral and temporal indices, which were then used to map the DC based on the dynamics of neuronal responses across different sound amplitudes. This cellular organization divided the DC surface into two main regions: the dorsomedial (DMC) and dorsolateral (DLC) cortices. At the lowest tested sound level (40 dB SPL), the DMC was more responsive to simple tones (i.e., PT) and less responsive to complex sounds (i.e., UN and MN) compared to the DLC. Although increasing the sound level increased the percentage of responsive cells in both DMC and DLC, it dynamically shifted DMC cells to respond mostly to UN without changing the response profile of the DLC. These data suggest that the DC is mapped to process the different spectrotemporal features of sound in a level-dependent manner, likely enhancing the segregation of different sound sources.
Shang Ma and Karl Kandler
Topic areas: subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Sensory experience is essential for the proper development and maintenance of brain circuits. In the auditory system, alterations in sensory experience greatly impact the precise network organization through various forms of structural and functional plasticity. We have previously shown that the intrinsic networks within the central nucleus of the inferior colliculus (CNIC) undergo significant changes around the beginning of auditory experience. In this study, we addressed the role of auditory experience in the maturation of excitatory and inhibitory intrinsic CNIC connections. To this end, we performed laser-scanning photostimulation with caged glutamate in acute brain slices from both normal hearing and acoustically deprived mice (malleus removal at hearing onset, ~P12). Our results indicate that auditory experience promotes intrinsic network diversification by 1) tonotopic location-dependent strengthening and weakening of local synaptic inputs onto CNIC neurons, 2) spatial de-clustering of both excitatory and inhibitory local inputs, and 3) spatial segregation of input origin of local excitation and inhibition. Auditory deprivation primarily affected the reorganization of intrinsic inputs onto GABAergic CNIC neurons, which retained immature input characteristics (Surgery: N=24 cells/16 animals; Sham: N=25/17). In contrast, intrinsic inputs onto glutamatergic CNIC neurons remained largely unaffected by auditory deprivation (Surgery: N=23/15; Sham: N=19/13). Our results reveal input diversification as a major developmental trajectory of the intrinsic CNIC network and highlight a cell-type specific role of early auditory experience in this process. Supported by NIDCD 5R01DC019814.
Kirill Nourski, Mitchell Steinschneider, Ariane Rhone, Joel Berger, Emily Dappen, Hiroto Kawasaki and Matthew Howard
Topic areas: speech and language correlates of behavior/perception hierarchical organization
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Electrical stimulation of the auditory nerve with a cochlear implant (CI) is the method of choice for treatment of severe-to-profound hearing loss. Auditory cortical function and plasticity are major contributing factors to the variability in speech perception outcomes. Spectrally degraded stimuli, presented to normal-hearing individuals, can serve as a model of cortical processing of speech by CI users. This study utilized intracranial electroencephalography (iEEG) to study processing of spectrally degraded speech throughout the cortical hierarchy, test for hemispheric asymmetries and determine the relationship of cortical activity to speech perception. Participants were normal hearing adult neurosurgical epilepsy patients. Stimuli were utterances /aba/ and /ada/, degraded using a noise vocoder (1-4 bands) and presented in a one-interval discrimination task. Cortical activity was recorded using depth and subdural iEEG electrodes ( >2000 contacts). Recording sites were assigned to regions of interest, organized into several groups: auditory core in posteromedial Heschl’s gyrus (HGPM), superior temporal plane, superior temporal gyrus (STG), ventral and dorsal auditory-related, prefrontal and sensorimotor cortex. Event-related band power was examined in broadband gamma (30-150 Hz) and alpha (8-14 Hz) bands. Stimuli yielded chance identification performance when degraded to 1-2 spectral bands. Performance was variable in the 3-4 band conditions and near-ceiling in the clear condition. Cortical activation featured regional differences with respect to stimulus spectral complexity and intelligibility. HGPM was characterized by strong bihemispheric activation regardless of task performance. A progressive preference for clear speech emerged along both the ventral and the dorsal auditory processing pathways. Better task performance at 3-4 bands was associated with gamma activation on the STG and alpha suppression along the dorsal pathway (supramarginal gyrus) in response to all vocoded stimuli. Within sensorimotor areas, differences in task performance were paralleled by different patterns of cortical activity. Direct recordings reveal a hierarchical organization of degraded speech processing. Examination of responses to noise-vocoded speech provides insights into the neural bases of variability in speech perception in CI users. This work will aid in the development of novel objective measures of CI performance and of neuromodulation-based rehabilitation strategies.
Jesse Herche, Meredith Schmehl, David Bulkin, Gelana Tostaeva, Justine Griego and Jennifer Groh
Topic areas: cross-species comparisons multisensory processes neural coding subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Audiovisual integration involves coordinate translation between the auditory system’s head-centered and the visual system’s eye-centered reference frames. In primates, saccades cause frequent shifts between the visual and auditory scenes, so robust cross-referencing is critical to audiovisual processing. Eye movements generate a phenomenon in the auditory periphery called Eye-Movement Related Eardrum Oscillations (EMREOs) (Gruters et al., PNAS 2018). The EMREO encodes precise, parametric information about saccade direction and amplitude (Lovich et al., Biorxiv 2022) in both humans and monkeys (Gruters et al., 2018; Lovich et al., Phil Trans B in press), but its origin and impact on central auditory processing remain unknown. Here, we search for signals similar to the EMREO in the central auditory system. Specifically, we analyzed local field potential recordings throughout one rhesus macaque’s inferior colliculus (IC). The IC helps localize sounds and modulates its response to them based on eye position. It has substantial connections to oculomotor areas and to the auditory effector structures (middle ear muscles and cochlear outer hair cells) hypothesized to generate EMREOs. Analysis of >45,000 free saccades across 760 recording locations showed an event-related potential response (latency about 20 ms after saccade onset) as well as a continuing oscillation, in the frequency range of the EMREO (30-40 Hz), discernible as early as >100 ms before saccade onset. Consistent with past EMREO findings, multiple regression analysis of the IC data shows the most robust encoding of a saccade’s horizontal displacement. Future work should clarify the temporal relationship between the EMREO and the IC’s saccadic signature, as well as its variability across subjects.
Ralph Peterson, Aramis Tanelus, Aman Choudhri, Aaditya Prasad, Megan Kirchkessner, Jeremy Magland, Jeff Soules, Robert Froemke, David Schneider, Dan Sanes and Alex Williams
Topic areas: neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Identifying the emitter of a vocalization among a group of animals is a persistent challenge to studying naturalistic social behavior. Invasive surgical procedures – such as affixing custom-built miniature sensors to each animal in a group – are often needed to obtain ground truth and temporally precise measurements of whether an individual animal is vocalizing. In addition to being labor intensive and species specific, these surgeries are often not tractable in very small or young animals and may alter or restrict the natural behavioral repertoire of an animal. Thus, there is considerable interest in developing non-invasive sound source localization and vocal call attribution methods that work off-the-shelf in typical laboratory settings. Here, we demonstrate that deep learning frameworks for sound-source localization display favorable performance as compared to existing state-of-the-art methods, work very well in reverberant environments, and produce calibrated measures of uncertainty. Given the lack of existing data to train and evaluate such models, we acquired a diverse and publicly available benchmark dataset consisting of ground truth microphone array data from known sound sources.
Yingjia Yu, Anastasia Lado, Yue Zhang, John Magnotti and Michael S. Beauchamp
Topic areas: speech and language multisensory processes
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
It has long been known that seeing the face of the talker improves the intelligibility of noisy auditory speech. Advances in computer graphics have made encountering synthetic faces an everyday occurrence, but little is known about whether synthetic faces improve speech intelligibility. To address this knowledge gap, audiovisual recordings were created from two human talkers (one male, one female), each speaking 30 different words. Pink noise was added to the voices at a signal-to-noise ratio of -12 dB. Commercial software from JALI Research was used to animate two rigged 3D facial models (one male, one female) to create synthetic faces lip-synced to the gender-matched voice. Three formats of each word were created: noisy auditory-only (An), noisy audiovisual with a real face (AnR), and noisy audiovisual with a synthetic face (AnS). Thirteen participants were presented with noisy words using the Amazon Mechanical Turk online testing service. Participants viewed a video introduction to the study and then answered the prompt “Type the word” following presentation of each word. The phonetic composition of each typed response was scored for accuracy based on the Carnegie-Mellon pronunciation dictionary. Each participant was presented with each word only once, in a single format, to prevent any learning effects. Across participants, every word was presented in every format. Across words and participants, AnR stimuli resulted in the highest accuracy (58%), followed by AnS (37%) and An (24%). A linear mixed-effects model was constructed with phoneme accuracy as the dependent variable, format as a fixed effect, and participant and batch as random effects. There was a significant effect of format (χ2(2) = 108, p < 10^-16) driven by significant differences between all condition pairs (AnR vs. An, t(63) = -10.4, p = 10^-11; AnS vs. An, t(63) = -4.5, p = 10^-4; AnR vs. AnS, t(63) = -5.8, p = 10^-6). These data show that both real and synthetic faces improve the intelligibility of noisy speech, with an advantage for real faces. Improvements in synthetic face animation could eventually reduce this advantage, allowing synthetic faces to provide maximal benefit to listeners when human faces are unavailable. From a scientific perspective, synthetic faces are a valuable experimental tool because they allow different facets of talker identity, facial emotions, and speech mouth movements to be parametrically varied, manipulations that are difficult to achieve with real faces.
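As a sketch of the kind of analysis reported above (phoneme accuracy as the dependent variable, format as a fixed effect, participant as a random effect), the following Python/statsmodels example fits a linear mixed-effects model to simulated data. The column names, simulated accuracies, and the omission of the batch random effect are illustrative assumptions, not the authors' code or data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated long-format data standing in for the real trials: one row per
# presentation, with phoneme accuracy, stimulus format, and participant ID.
rng = np.random.default_rng(0)
formats = ["An", "AnS", "AnR"]
base = {"An": 0.24, "AnS": 0.37, "AnR": 0.58}          # mean accuracies from the abstract
rows = []
for subj in range(13):
    subj_offset = rng.normal(0, 0.05)                  # participant random intercept
    for word in range(30):
        fmt = formats[(subj + word) % 3]               # each word seen once, format rotated
        acc = np.clip(base[fmt] + subj_offset + rng.normal(0, 0.1), 0, 1)
        rows.append({"accuracy": acc, "fmt": fmt, "subject": f"s{subj}"})
df = pd.DataFrame(rows)

# Fixed effect of format (An as reference), random intercept per participant.
model = smf.mixedlm("accuracy ~ C(fmt, Treatment('An'))", df, groups=df["subject"])
print(model.fit().summary())
```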
John Magnotti, Yue Zhang, Xiang Zhang, Aayushi Sangani, Zhengjia Wang and Michael Beauchamp
Topic areas: speech and language correlates of behavior/perception multisensory processes
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Humans have the unique ability to decode the rapid stream of language elements that constitute speech. Although auditory noise in the environment interferes with speech perception, visual speech from the face of the talker may compensate. However, people vary in their use of visual speech, and the amount of visual information varies with speech content. We combined behavior, BOLD fMRI, and intracranial EEG to examine the neural correlates of these person-level and word-level differences. Fifty-two participants were presented with speech in 4 formats: audiovisual (AV) speech with or without auditory noise and auditory-only (AO) speech with or without noise. In line with past research, clear speech was highly intelligible, noisy speech was sometimes intelligible, and AV speech was more intelligible than AO speech. Trials were sorted by intelligibility, then multivariate analyses were used to compare the patterns of activity in superior temporal cortex for clear speech (without noise) and intelligible vs. unintelligible noisy speech. For BOLD fMRI participants (n=37), we used all voxels in superior temporal cortex. For iEEG patients (n=15), we used 140 temporal electrodes that showed a significant response to clear AO speech (mean 70-150 Hz broadband high-frequency activity from 0-1 s after auditory onset vs. baseline, p < 0.001, Bonferroni corrected). For both BOLD fMRI and iEEG, pairwise pattern correlations between clear speech and intelligible noisy speech were higher than correlations between clear speech and unintelligible noisy speech. Using the fMRI data, we found that differences in multivariate pattern similarity (using multidimensional scaling) corresponded to the intelligibility of noisy speech for both words and sentences, for both AO and AV speech (r = 0.55). Using the greater temporal resolution of the iEEG data, we found that word-level differences in intelligibility were predicted by neural pattern similarity: when the response to a noisy word was more similar to the response to the clear version of that word, perceptual intelligibility was higher (r = 0.43). Additionally, we found that the neural pattern improvement for AV vs. AO speech reliably predicted the degree of audiovisual improvement for that word, presumably because of the neural integration of visemes and phonemes. Understanding the neural substrates of individual- and word-level differences in noisy speech perception may help in developing strategies to assist those with impaired speech perception.
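The pairwise pattern-correlation logic described above reduces to correlating the multichannel response evoked by a noisy word with the response evoked by its clear version, with higher similarity expected on intelligible trials. A minimal sketch with simulated patterns (not data from the study) is shown below.

```python
import numpy as np

def pattern_similarity(clear, noisy):
    """Pearson correlation between two response patterns (e.g., voxels or
    electrodes): one for the clear word, one for its noisy counterpart."""
    return np.corrcoef(clear, noisy)[0, 1]

# Simulated example: a noisy-word pattern that preserves the clear-word pattern
# corresponds to the 'intelligible' case; an unrelated pattern does not.
rng = np.random.default_rng(1)
clear = rng.normal(size=140)                        # e.g., 140 temporal electrodes
intelligible = clear + rng.normal(scale=0.5, size=140)
unintelligible = rng.normal(size=140)
print(pattern_similarity(clear, intelligible))      # high correlation
print(pattern_similarity(clear, unintelligible))    # near zero
```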
Yufei Si, Brian Mullen, Alan Litke and David Feldheim
Topic areas: multisensory processes subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Determining the location of an object in a complex environment and instantaneously evaluating its saliency is a fundamental and necessary brain function. To achieve this, the brain needs to receive, process, and integrate sensory information from multiple sensory modalities. A model to study spatial sensory integration is the superior colliculus (SC), the only structure in the brain that contains both auditory and visual maps of space, and contains multimodal neurons with aligned spatial receptive fields (RFs). The SC then integrates these inputs to promote an appropriate motor response. We and others have shown that a combination of graded molecular cues and activity-dependent refinement is used to create the point-to-point map of visual space in the superficial layers of the SC; however, it remains unknown how the computed auditory map of space in the mammalian SC forms and how this map becomes aligned and integrated with the visual map. Here we describe a set of experiments designed to test the hypothesis that sensory experience is required to align the visual and auditory maps of azimuth in the mouse SC. To test this hypothesis, we have manipulated visual and auditory experience in CBA/CaJ mice and determined the slopes of the visual and auditory SC maps (RF azimuth vs. anteroposterior SC position) using large-scale in vivo physiological recordings of SC neurons in response to spatially restricted visual and auditory stimuli. Our data show that neither retinal input nor visual experience is required for the formation of the auditory topographic map of space in the mouse SC. We find that disrupting normal auditory experience via plugging one or both ears during development alters the auditory map, but not the visual map, leading to their misalignment. Taken together, these results suggest that the visual map is not used as a template to form an auditory map of space and that auditory but not visual experience is required to form the auditory map and align it with the visual map in the mouse SC.
Jessica Mai, Valentina Esho, Rowan Gargiullo, Eliana Pollay, Megan Zheng, Cedric Bowe, Abigail McElroy, Lucas Williamson, Osama Hussein, Carrissa Morgan, Nia Walker, Kaitlyn A. Brooks and Chris Rodgers
Topic areas: auditory disorders correlates of behavior/perception neural coding novel technologies
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
In natural behavior, we actively move our heads, eyes, hands, and bodies to collect sensory information. For instance, people are better able to localize sounds when they move their head while listening. This active strategy is especially important for people with cochlear implants or single-sided hearing loss. However, our understanding of the neural circuitry that enables active sound-seeking is limited, because in most auditory studies the head is held still. Therefore we have developed a new behavioral model of active sound-seeking in mice and assessed the corresponding computations in auditory cortex with large-scale wireless recording. Neurons in auditory cortex encoded sound and movement. Surgical induction of conductive hearing loss impaired sound-seeking. Mice robustly recovered from unilateral but not bilateral hearing loss, suggesting a role for plasticity in central auditory pathways. We also developed new hearing loss assessments based on the acoustic startle response using machine learning and videography, and the auditory brainstem response (ABR) using modern digital hardware. In ongoing work, we plan to identify the motor strategies freely moving mice use to localize sound, how this is directed by a network of interacting brain regions, and how this enables recovery from hearing loss.
Andrea Santi, Sharlen Moore, Jennifer Lawlor, Aaron Wang, Kelly Fogelson and Kishore Kuchibhotla
Topic areas: memory and cognition
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Memories must be accessible for them to be useful. Alzheimer’s disease (AD) is a progressive form of dementia in which cognitive capacities slowly deteriorate due to underlying neurodegeneration. Interestingly, anecdotal observations suggest that Alzheimer’s patients can exhibit cognitive fluctuations during all stages of the disease. In particular, it is thought that contextual factors are critical for unlocking these hidden memories. To date, however, exploration of the neural basis of cognitive fluctuations has been hampered by the lack of a behavioral approach in mouse models to dissociate memories from contextual performance. Our previous work demonstrated that interleaving ‘reinforced’ trials with trials without reinforcement (‘probe’ trials) in an auditory go/no-go discrimination task allows us to distinguish between acquired sensorimotor memories and their contextual expression. Here, we used this approach, together with two-photon calcium imaging in behaving AD-relevant mice (APP/PS1+), to determine whether amyloid accumulation impacts underlying sensorimotor memories (measured using ‘probe’ trials) and/or contextual performance (measured using ‘reinforced’ trials) in an age-dependent manner. We found that, while contextual performance was significantly impaired in 6-8-month-old APP/PS1+ mice compared to age-matched controls, sensorimotor memories were surprisingly intact. At later ages (12 months), however, APP/PS1+ mice began to show deficits in both domains, suggesting a sequence in which contextual performance degrades before the sensorimotor memories. Using two-photon imaging in the auditory cortex of 6-8-month-old APP/PS1+ mice, we found that the poor contextual performance was accompanied by network suppression, altered stimulus selectivity, and aberrant behavioral encoding. Impairments were not due to peripheral hearing deficits (measured by auditory brainstem response) and were concentrated near Aβ plaques. Strikingly, these deficits were less apparent in probe trials, suggesting the sensorimotor memory trace remains intact. These effects were recapitulated with a reinforcement learning model in which deficits in contextual scaling and inhibition explain the observed effects. Taken together, these results suggest that Aβ deposition impacts the integration of behavioral signals that enable contextual performance before degrading the underlying sensorimotor memory, and that modulating these circuits may hold promise for revealing hidden memories.
Nilay Atesyakar, Andrea Shang and Kasia Bieszczad
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Comprehension of sound in noise is a remarkable feat of auditory systems. Learning experiences with salient sounds are known to enhance signal-in-noise detection performance. For example, music training has been found to improve speech-in-noise performance. Learning experiences, moreover, involve physiological changes to receptive fields in the auditory cortex (ACx). As such, cortical neurophysiological plasticity induced by learning about sound cues may facilitate hearing those salient signals in noisy backgrounds, i.e., relative to novel or insignificant sounds. Given the well-established association between experience-dependent changes in ACx receptive fields and memory formation, the contribution of experience-dependent physiological changes in ACx to signal processing merits research to understand successful signal-in-noise detection. Several ACx mechanisms have emerged as key candidates for modulation, including changes to receptive field properties like sound-evoked threshold and bandwidth. In this regard, learning-induced changes to receptive fields that mimic ACx function in noisy backgrounds may effectively promote detection of sound signals in noise and facilitate adaptive behavior. Recent investigations have also shown that learning-induced ACx plasticity can be facilitated by treating subjects with an HDAC inhibitor (HDAC3i) while they learn an association between a signal acoustic frequency and reward. However, the extent to which the effect of HDAC3i persists in novel backgrounds of noise remains unknown. Thus, the current study uses a rodent (rat) model of sound-reward learning for a 5.0 kHz (60 dB) pure tone frequency cue, combined with in vivo auditory cortical multiunit electrophysiological recordings. We assessed behavioral and A1 responses to signal and non-signal frequency cues presented under different signal-to-noise ratios (SNR). The results revealed that both high and low noise have a suppressive effect on the selectivity of frequency tuning in trained animals. Yet, HDAC3i protected against the suppressive effect of low, but not high, noise on frequency tuning bandwidth. Comparing A1 responses of naïve animals with those of trained animals will reveal the potential role of learning-induced cortical plasticity in signal-in-noise detection. The findings will enable us to build a comprehensive model of the representational plasticity underlying long-term memory for sounds, which will improve the ability to achieve successful hearing-related therapeutics in real-world environments.
Annesya Banerjee, Mark Saddler and Josh McDermott
Topic areas: auditory disorders correlates of behavior/perception novel technologies
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Current cochlear implants (CIs) fail to restore fully normal auditory perception in individuals with sensorineural deafness. Several factors may limit CI outcomes, including suboptimal algorithms for converting sound into electrical stimulation, plasticity limitations of the central auditory system, and auditory nerve degeneration. Models that can predict the information that can be derived from CI stimulation could help clarify the role of these different factors and guide development of better stimulation strategies. We investigated models of CI-mediated hearing based on deep artificial neural networks, which have recently been shown to reproduce aspects of normal hearing behavior and hierarchical organization in the auditory system. To model normal auditory perception, we trained a deep neural network to perform real-world auditory tasks (word recognition and sound localization) using simulated auditory nerve input from an intact cochlea. We modeled CI hearing by testing this same trained network on simulated auditory nerve responses to CI stimulation. To simulate the possible consequences of learning to hear through a CI, we retrained this network on CI input. Further, to model the possibility that only part of the auditory system exhibits this plasticity, in some models we retrained only the late stages of the network. When the entire network was reoptimized for CI input, the model exhibited speech intelligibility scores significantly better than typical CI users. Speech recognition on par with typical CI users was achieved only when just the late stages of the models were reoptimized. However, for sound localization, model performance remained abnormal relative to normal hearing even when the entire network was reoptimized for CI input. Overall, this work provides initial validation of machine-learning-based models of CI-mediated perception. Our results help clarify the interplay of impoverished peripheral representation from CI stimulation and incomplete central plasticity in limiting CI user performance of realistic auditory tasks.
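A minimal sketch of the partial-plasticity manipulation described above, assuming a hypothetical PyTorch network: early stages (standing in for peripheral and early central processing) are frozen, and only the late stages are reoptimized on CI-simulated input. The architecture, dimensions, and data below are placeholders, not the authors' model.

```python
import torch
import torch.nn as nn

class AuditoryNet(nn.Module):
    """Toy stand-in for a word-recognition network with 'early' and 'late' stages."""
    def __init__(self, n_classes=800):
        super().__init__()
        self.early = nn.Sequential(nn.Conv1d(1, 32, 9, stride=4), nn.ReLU(),
                                   nn.Conv1d(32, 64, 9, stride=4), nn.ReLU())
        self.late = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                                  nn.Linear(64, n_classes))

    def forward(self, x):
        return self.late(self.early(x))

net = AuditoryNet()
# Model limited central plasticity: freeze early stages, reoptimize late stages only.
for p in net.early.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=1e-4)

# One training step on a (hypothetical) batch of CI-simulated inputs.
x = torch.randn(8, 1, 4000)                    # placeholder for simulated nerve responses
labels = torch.randint(0, 800, (8,))
optimizer.zero_grad()
loss = nn.functional.cross_entropy(net(x), labels)
loss.backward()
optimizer.step()
```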
Alice Milne and Maria Chait
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The brain is highly sensitive to auditory regularities, and this sensitivity is exploited in many scenarios, from parsing complex auditory scenes to language acquisition. To understand the impact of stimulus predictability on perception, it is important to determine how the detection of a predictable structure influences processing and attention. Here we probed how the brain response differs based on the predictability of an auditory sequence. Using an EEG paradigm, we tested the neural response to sequences of 50-ms tones arranged in a random order, a deterministic pattern, or a probabilistic structure in which the transitional probabilities between tones allowed them to be segmented into triplets. In addition, we introduced deviant tones that were outside the spectral range of the main sequence, predicting, based on previous evidence, that there would be a stronger deviant response in more predictable sequences. We found that the brain rapidly detects the underlying structure, locking to the rate of the triplets. Furthermore, the sustained neural response is modulated by different forms of predictability. Finally, we demonstrate that the event-related response to deviant tones is influenced by sequence type and the position of the deviant in the triplet structure. We discuss our findings in relation to cognitive resource allocation and the predictive coding framework.
Behrad Soleimani, I.M. Dushyanthi Karunathilake, Proloy Das, Stefanie E. Kuchinsky, Behtash Babadi and Jonathan Z. Simon
Topic areas: speech and language neural coding subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Examining changes in cortical connectivity can shed light on speech comprehension under difficult listening conditions. While functional magnetic resonance imaging (fMRI) studies have utilized Granger causality to uncover important mechanisms in speech and language processing, the limited temporal resolution of fMRI restricts the capture of higher-frequency neural interactions crucial for complex speech processing. On the other hand, although magnetoencephalography (MEG) can capture neural interactions at the millisecond scale, its limited spatial resolution poses challenges for conventional connectivity analyses. Our recently proposed methodology, called network localized Granger causality (NLGC), can extract Granger causal interactions in MEG data without the need for an intermediate source-localization step. This one-shot approach effectively addresses challenges related to false alarms and localization errors, providing a robust assessment of cortical connectivity. In this study, NLGC is applied to MEG recordings from younger and older adults performing a speech listening task with varying background noise conditions. The analysis focuses on directional cortical connectivity patterns within and between the frontal, temporal, and parietal lobes, specifically in the delta and theta frequency bands. The findings demonstrate significant age- and condition-related connectivity alterations, particularly in the theta band. In younger adults, increasing background noise leads to a shift from predominantly temporal-to-frontal (bottom-up) connections for clean speech to predominantly frontal-to-temporal (top-down) connections in noisy conditions. In contrast, older adults exhibit bidirectional information flow between frontal and temporal cortices regardless of the background noise. Furthermore, the study introduces a classification of connections as either excitatory or inhibitory based on their temporal relationships, enabling a more nuanced understanding of the neural mechanisms involved in speech perception. While delta band connection types show no significant age-related changes, theta band connection types exhibit substantial changes in excitation/inhibition balance across age and condition.
Ziyi Zhu and Kishore Kuchibhotla
Topic areas: memory and cognition
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Humans, even as infants, use purposeful cognitive strategies such as exploration and hypothesis testing to learn about causal interactions in the environment. In animal learning studies, however, it has been challenging to disentangle potential purposeful strategies from errors arising from imperfect knowledge or inherent biases. Here, we trained head-fixed mice on a wheel-based auditory two-choice task and exploited the intra- and inter-animal variability to understand the drivers of errors during learning. Early in learning, rather than choosing randomly or based on immediate trial history, mice displayed a strong bias towards a given choice (left or right). This choice bias was dynamic – continuing for tens to hundreds of trials, before switching abruptly to an unbiased state or to the other side, ruling out inherent motor biases. Moreover, biased states coincided with rapid motor kinematics, reflecting less deliberation and more directed choice exploration. Finally, throughout learning we introduced ‘catch’ trials (correct choices that are not reinforced) followed by a block of ten non-reinforced trials. During these blocks, animals performed significantly better with less bias, abruptly changing strategies to exploit their acquired cue-response learning to test for potential changes in outcome contingencies. These findings argue that rodents actively probe their environment in a directed manner, potentially engaging in a rudimentary form of hypothesis testing to refine their decision-making and maintain long-term flexibility.
Su Kim and Kishore Kuchibhotla
Topic areas: memory and cognition correlates of behavior/perception thalamocortical circuitry/function
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Sensorimotor learning requires linking sensory input with an action and an outcome. How a new sensorimotor pairing engages and evolves along a sensory hierarchy, from the midbrain to thalamus to cortex, remains unknown. Interestingly, recent evidence suggests that the auditory cortex (AC) is the default pathway for audiomotor discrimination learning but then becomes dispensable at expert levels. The mammalian auditory system is organized in a feedforward fashion, with auditory stimuli being progressively processed by the cochlea, brainstem, midbrain (IC, inferior colliculus), thalamus (MGB, medial geniculate body), and finally the AC. In particular, the IC-MGB-AC circuit exhibits rich feedforward and feedback projections that, to date, have not been monitored simultaneously during learning. To do this, we trained mice to lick to a pure tone for a water reward (S+) and to withhold licking to another tone (S-) to avoid a timeout. We developed a new surgical preparation that allows the implantation of a single cranial window over both the IC and the AC. We then used two-photon mesoscopic calcium imaging throughout goal-directed learning to monitor cell bodies (jRGECO1a) in the AC and the IC, and the feedforward projections (axon-GCaMP6s) from the MGB to the AC (n=4). By tracking the same cell bodies and axons across a month of training, we examine the sequence of stimulus-related and non-stimulus-related plasticity and how learning and overtraining impact these processes across the sensory hierarchy.
Sunreeta Bhattacharya, Lori Holt, Fernando de la Torre, Kevin Joo, Alvaro Fernandez, Cheng Ma and Yiming Fang
Topic areas: speech and language correlates of behavior/perception multisensory processes neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Most of what we understand about human communication has come from laboratory studies in highly controlled contexts. Advances in sensor hardware are now – for the first time – supporting the possibility of moving the study of human communication out of the laboratory and into natural environments. Here, we target engagement, a reliable indicator of the success of social interactions. In dyadic conversations, engagement is reflected in behaviors such as turn-taking, acoustic and gestural mirroring, and facial expressions, but there are few models that use multimodal information from speech, gaze, facial expressions, and gestures to track communicative engagement, particularly from paralinguistic features that can be extracted from a conversation. We asked pairs of participants to have a semi-guided conversation on their COVID lockdown experience and their views on vaccine mandates. Each participant wore a pair of deep learning-powered eye-tracking glasses that recorded speech, the wearer’s gaze, and video of the conversation partner. We studied talker acoustics and turn-taking dynamics, mirroring behavior, and acoustic coupling by extracting spectrotemporal acoustic information (pitch and energy modulations) from the diarized speech signal. Participants also filled out questionnaires regarding engagement, personality, and political leaning. We found robust markers of communicative engagement in a number of non-verbal features of speech. In particular, features that predicted individual-level engagement included: activity, measured as the proportion of time spent in solo speech; effort, measured as the energy of utterances; voice modulation, measured as the variance of the pitch distribution; and extent of coupling, measured as the frequency of acoustic mirroring events. We also found alignment in gaze and speech turn-taking events. The gaze information was combined with landmarks detected in the partner video to determine when gaze was directed at the partner’s face, body, or the background, allowing us to study gaze dynamics alongside the dynamic turn-taking behavior extracted from the acoustic signal. With this wealth of multimodal data, we replicate classic work on naturalistic speech in dyads (e.g., using hand coding of features via the ‘sociometer’), extend results from constrained environments to more realistic scenarios, and lay the foundation for real-time analyses of human communication in natural environments, presenting real-world challenges that can be aligned with neural measures.
Justin Buck and Guillermo Horga
Topic areas: memory and cognition auditory disorders speech and language subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Auditory hallucinations are debilitating false perceptions common in patients with schizophrenia. Excess striatal dopamine has been strongly implicated in the development of hallucinations, but the precise circuits and cognitive processes that link this neurochemical alteration to false perception remain unclear. Given recent findings in rodents suggesting that striatal dopamine plays a direct role in auditory processing, we sought to test whether the striatum plays a role in auditory learning in healthy humans (HC) and patients with schizophrenia (SCZ). To do so, we designed a behavioral task in which HCs (N=47) and medication-free SCZs (N=56) were presented with auditory stimuli (speech and natural sounds) in the fMRI scanner. Stimulus probability was varied in a blocked structure and participants were asked to report when they heard speech. In HCs, striatal responses on stimulus trials were elevated relative to no-stimulus trials (signed-rank: p = 1.73 × 10^-3) and activation patterns were sensitive to stimulus history (mixed-effects linear regression: p = 7.98 × 10^-3). In contrast, SCZs did not show distinguishable responses by trial type (signed-rank: p = 0.401), and this effect scaled with clinical hallucination severity (Spearman: rho = -0.36, p = 8.36 × 10^-3). To clarify these results, we built a computational learning model that updates trial-by-trial perceptual expectations via weighted prediction errors. Fitting the model to neural data suggested that trial-by-trial activations were well described as stimulus prediction errors for all participants, although fitted parameters varied with clinical hallucination severity (Spearman: rho = -0.3, p = 0.025). To explore the role of dopamine in this alteration, we collected neuromelanin-sensitive MRI (NM-MRI), a validated proxy measure of nigrostriatal dopamine function, in a subset of patients (N=29). We found that the degree of stimulus sensitivity (Spearman: rho = -0.414, p = 0.039) and fitted model parameters (Spearman: rho = -0.46, p = 0.021) negatively correlated with NM-MRI signal in the substantia nigra. Our results suggest that the human striatum plays a role in auditory learning and that excess nigrostriatal dopamine disrupts learning signals in patients with schizophrenia.
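The trial-by-trial learning model described above updates perceptual expectations with weighted prediction errors; a minimal delta-rule sketch of that idea follows. The learning rate and example stimulus sequence are illustrative, not fitted values from the study.

```python
def prediction_error_model(stimuli, alpha=0.2, prior=0.5):
    """Delta-rule learner: the expected stimulus probability is updated each trial
    by a weighted prediction error (PE); the PE series is the quantity that
    trial-by-trial striatal responses are modeled as tracking.
    stimuli: sequence of 0/1 indicating stimulus absent/present."""
    expectation, errors = prior, []
    for s in stimuli:
        pe = s - expectation               # prediction error on this trial
        errors.append(pe)
        expectation += alpha * pe          # weighted update of the expectation
    return errors

# Example: in a run of stimulus trials the PE shrinks, while an unexpected
# omission produces a large negative PE.
print(prediction_error_model([1, 1, 1, 0, 1]))
```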
Melissa Ryan, Siddhartha Joshi, Hemant Srivastava, Hong Jiang and Matthew McGinley
Topic areas: correlates of behavior/perception subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Rapid acoustic onsets are prominent in natural sounds, including in speech, such as after gaps, and in adventitious sounds, such as rustling noises. Octopus cells (OCs) in the cochlear nucleus are highly responsive to these rapid onsets and drive precisely timed inhibition in the inferior colliculus (IC) via the ventral nucleus of the lateral lemniscus [1]. However, the function of OC-driven onset inhibition in the IC remains unknown. There are several divergent models for the role of this onset inhibition in the circuit function of the IC, including: suppressing spectral splatter; preserving excitatory/inhibitory balance during onsets; and facilitating feature binding [2,3]. To begin to arbitrate between these models, we examined the impact of onset gate duration on neural responses to single tones in the IC of mice, using computational models and Neuropixels recordings. We have characterized the theoretical limit, and the simulated cochlear extent, of spectral splatter in natural sounds and in tones with varying onset gate durations, and find that changes in spectral splatter due to sub-millisecond differences in onset gate duration are recapitulated by the cochlea. During presentation of single tones with manipulated onset gate durations, we record neural response properties in the IC of head-fixed, awake mice. We find that sub-millisecond gate duration affects tone-evoked firing by up to 2-fold (N=3), particularly at tone carrier frequencies away from best frequencies, indicating that population responses in the IC are strongly impacted by spectral splatter. Future optogenetic work will test the potential role of OCs in spectral splatter suppression. Ongoing analyses are determining the implications of our results for models of the function of rapid onset inhibition in the IC. [1] Oertel, D., Cao, X. J., Ison, J. R., & Allen, P. D. (2017). Cellular computations underlying detection of gaps in sounds and lateralizing sound sources. Trends in Neurosciences, 40(10), 613-624. [2] Spencer, M. J., Nayagam, D. A., Clarey, J. C., Paolini, A. G., Meffin, H., Burkitt, A. N., & Grayden, D. B. (2015). Broadband onset inhibition can suppress spectral splatter in the auditory brainstem. PLoS ONE, 10(5), e0126500. [3] McGinley, M. J. (2014). Rapid integration across tonotopy by individual auditory brainstem octopus cells. In The Computing Dendrite (pp. 223-243). Springer, New York, NY.
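To make the spectral-splatter manipulation concrete, the sketch below (illustrative sampling rate, carrier, and gate durations; not the study's stimulus code) applies raised-cosine onset gates of different durations to a pure tone and measures how much spectral energy spreads away from the carrier:

```python
import numpy as np

fs = 200_000                                  # sampling rate in Hz (illustrative)
carrier, dur = 8_000, 0.05                    # 8 kHz tone, 50 ms long
t = np.arange(int(fs * dur)) / fs
tone = np.sin(2 * np.pi * carrier * t)

def gate_onset(signal, gate_ms):
    """Apply a raised-cosine onset ramp of the given duration."""
    n = int(fs * gate_ms / 1000)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n) / n))
    gated = signal.copy()
    gated[:n] *= ramp
    return gated

freqs = np.fft.rfftfreq(len(tone), 1 / fs)
for gate_ms in (0.2, 1.0, 5.0):               # sub-millisecond vs. longer onset gates
    spectrum = np.abs(np.fft.rfft(gate_onset(tone, gate_ms))) ** 2
    splatter = spectrum[np.abs(freqs - carrier) > 1_000].sum() / spectrum.sum()
    print(f"{gate_ms:>4} ms gate: {splatter:.4f} of energy >1 kHz from the carrier")
```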
Matthew McGinley and Jan Willem de Gee
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The brain has a remarkable capacity to adaptively shift processing to support a diverse array of behavioral goals and constraints. In sensory domains, the brain can enhance the processing of difficult-to-perceive stimuli when they are important (Kahneman, 1973; Miller & Cohen, 2001), such as when correct discrimination leads to a large reward. Matching attentional effort to potential gains must be continuously re-calibrated. To study how mice adapt attention to value, we have developed a "sustained attention-value (SAV) task" for head-fixed mice (de Gee et al., 2022). The SAV task is a quasi-continuous listening task in which mice learn to lick for sugar-water reward to report detection of the unpredictable emergence of temporal coherence in an ongoing tone-cloud. To detect all weak-coherence signals, mice would need to sustain an infeasibly high level of attentional effort across the 90-minute sessions. Therefore, to probe adaptive effort allocation, we switched the sugar-water reward size (droplet volume) between high and low values in blocks of 60 trials; mice should thus expend more attentional effort in high-reward blocks. Here, we have developed a novel signal detection theory (SDT) framework for the SAV task, based on signal start time-matched sampling of a survival function-based estimate of false alarm rate. We refer to this approach as real-time SDT (rt-SDT). rt-SDT is applicable to a broad range of quasi-continuous perceptual tasks and provides simple, data-efficient, and statistically grounded estimates of sensitivity (d') and criterion (c, also called bias) for such tasks. We report rt-SDT signatures of rapid and adaptive shifts in performance in 88 mice. In high- versus low-reward blocks, mice were both more liberal (lower c) and more sensitive (higher d'). Furthermore, mice were also more liberal and more sensitive after correctly licking (hit) on the previous trial. Ongoing analysis is determining the interaction of these two reward-history effects on performance. In sum, we find that application of rt-SDT reveals that mice adapt their allocation of cognitive resources to inferred changes in task utility on multiple time scales. Kahneman (1973). Attention and effort. Englewood Cliffs, NJ: Prentice-Hall. Miller & Cohen (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24(1). de Gee ... & McGinley, M. J. (2022). Strategic self-control of arousal boosts sustained attention. bioRxiv, 2022-03.
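The rt-SDT estimator itself is not spelled out here; as a minimal sketch, once a hit rate and a survival-function-based false-alarm rate are in hand, the conventional equal-variance Gaussian SDT quantities they would feed into are computed as follows:

```python
from scipy.stats import norm

def sdt_measures(hit_rate, false_alarm_rate, eps=1e-3):
    """Sensitivity (d') and criterion (c) under the equal-variance Gaussian SDT model.
    Rates are clipped away from 0 and 1 so the z-transform stays finite."""
    h = min(max(hit_rate, eps), 1 - eps)
    fa = min(max(false_alarm_rate, eps), 1 - eps)
    d_prime = norm.ppf(h) - norm.ppf(fa)
    criterion = -0.5 * (norm.ppf(h) + norm.ppf(fa))   # lower c = more liberal responding
    return d_prime, criterion

# Hypothetical block-wise rates: the first yields a higher d' and a lower (more liberal) c
print(sdt_measures(0.85, 0.20))   # e.g. high-reward block
print(sdt_measures(0.70, 0.10))   # e.g. low-reward block
```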
Fangchen Zhu and Kishore Kuchibhotla
Topic areas: memory and cognition
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The ability to reverse the contingency of a set of previously learned associations is a common example of rule learning in both animals and artificial intelligence: agents can solve novel tasks by applying a simple reversal rule instead of relearning the response-outcome associations when the contingencies are swapped. Consequently, the neural basis of reversal learning has been studied in humans, primates, and rodents. Rodent models offer the advantage of advanced optogenetic tools for probing the causal role of neuronal activity in the application of a reversal rule. Currently, most head-fixed rodent reversal paradigms use the Go/No-Go (GNG) task structure, in which the animal has to learn to inhibit action towards a previously rewarded cue and initiate action for a previously unrewarded cue. However, several confounds arise when interpreting incorrect responses in a GNG task, such as disengagement and purposeful lapses, making it challenging to determine when the animal applies a rule during reversal trials. Here, we present a novel two-choice auditory task in which mice learn a cue that signals the reversal of previously learned actions. We show that mice can learn this task in a few weeks, and that there is a surprising asymmetry in learning to apply the reversal rule. This paradigm provides the basis for future optical control of rule learning in mouse models.
Ole Bialas, Sahil Luthra, Erin Smith, Lori Holt, Fred Dick and Edmund Lalor
Topic areas: correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Selectively listening to acoustic signals is crucial in acoustically complex environments. Traditionally, attention is understood as an interplay between goal-directed top-down modulation and bottom-up stimulus salience. However, attention can also be guided by the observer's prior expectations. A potentially important driver of such expectations is the presence of statistical regularities in the acoustic environment. Temporal statistical regularities encountered as syllabic transitional probabilities have long been shown to affect perception and learning. Notably, acoustic regularities can affect performance and learning even if they occur along a dimension that is irrelevant for the task at hand. The venerable 'probe signal' psychoacoustic paradigm has shown that listeners asked to detect pure tones in noise will show markedly different perceptual thresholds for tone frequencies whose probability of occurrence differs - even though tone frequency is task-irrelevant. The listener's expectation - based on the global probability of each tone frequency - can dramatically affect low-level perception. While this may suggest an exaggerated response to the expected signal, error-based accounts predict the opposite pattern: enhancement of unexpected, rarely occurring stimuli. Here, we present preliminary data from a human EEG experiment that asks how cortical electrical responses are modulated when tone-in-noise detection is changed by learning about the probability structure of a task-irrelevant auditory dimension. We use a simple go/no-go task, in which listeners are asked to detect a single tone in noise, presented at their detection threshold. These tones are presented at two different but neighboring frequencies, one highly probable and the other rarely occurring. Simultaneously, we acquire behavioral data from a large number of subjects in an online experiment to ensure that the effects of stimulus statistics on perception are robust. Importantly, our task is one that can be performed without overt verbal instruction, such that non-human animals can learn it. This allows us to investigate the question of statistically-driven selective attention across multiple levels of neural and behavioral explanation.
Weizhe Guo, Wenkang An, Abigail Noyce and Barbara Shinn-Cunningham
Topic areas: memory and cognition neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Representational similarity analysis (RSA; Kriegeskorte 2008) has allowed researchers to characterize a number of cognitive processes, including visual object recognition (e.g., Cichy 2014, Kaneshiro 2015) and audiovisual integration (Cecere 2017). Here, we tested whether RSA could be applied to the similarity structures of internal cognitive control states rather than those of external stimuli, using auditory selective attention as our experimental paradigm. On each trial, subjects were presented with a mixture of 4 overlapping syllables. Depending on the condition (21 in total), subjects (N=19) were cued to use spatial attention, non-spatial attention, or passive listening, and, on attention trials, to report the identity of one target syllable from the mixture. fMRI data (TE = 3.48 ms, TR = 650 ms, SMS8, 2.3 mm isotropic) were collected and preprocessed. Each subject's data were registered into the MNI152 standard space. Trial-wise activation maps were generated using the least-squares separate approach (Turner 2012), fitting a separate general linear model for each subject and each trial. For each subject and region of interest (ROI), we measured dissimilarity between pairs of conditions by training a support vector machine (SVM) with 12-fold leave-one-run-out cross-validation. Anatomical ROIs were drawn from Destrieux (2010); searchlight ROIs had a 4 mm radius. Within auditory processing regions in superior temporal gyrus (STG), we primarily observed separation of attention trials from passive listening, with minimal difference between spatial and non-spatial attention. In the parietal lobe, along the intraparietal sulcus (IPS) and superior parietal lobule (SPL), we observed much stronger separation between spatial and non-spatial attention, and spatial attention was separated more strongly from passive listening than was non-spatial attention. Searchlight-level analyses localized these findings more precisely, showing the strongest results in posterior regions of STG and inferior portions of SPL, but consistent effects throughout the IPS. These results are consistent with previous findings that traditionally construed auditory and visual processing networks respond differently to spatial versus non-spatial auditory tasks (Michalka 2015, 2017, Deng 2019). The RSA approach allows us to characterize the neural representation of internal states even in the absence of stimulus-level differences between conditions.
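A compact sketch of the pairwise decoding step described above (array names are hypothetical; a linear SVM with leave-one-run-out folds, with mean accuracy taken as the dissimilarity between two conditions):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

def decoding_rdm(patterns, condition, run):
    """Condition-by-condition dissimilarity matrix from cross-validated SVM decoding.

    patterns  : (n_trials, n_voxels) trial-wise activation estimates within one ROI
    condition : condition label per trial
    run       : run index per trial, used as leave-one-run-out folds
    """
    conds = np.unique(condition)
    rdm = np.zeros((len(conds), len(conds)))
    for i, a in enumerate(conds):
        for j in range(i + 1, len(conds)):
            mask = np.isin(condition, [a, conds[j]])
            acc = cross_val_score(SVC(kernel="linear"),
                                  patterns[mask], condition[mask],
                                  groups=run[mask], cv=LeaveOneGroupOut()).mean()
            rdm[i, j] = rdm[j, i] = acc          # higher accuracy = more dissimilar states
    return rdm
```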
Lonike Faes, Isma Zulfiqar, Luca Vizioli, Zidan Yu, Yuan-Hao Wu, Jiyun Shin, Lucia Melloni, Ryszard Auksztulewic, Kamil Uludag, Essa Yacoub and Federico De Martino
Topic areas: correlates of behavior/perception hierarchical organization neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Predictive coding postulates that our brains build an internal representation of the sensory world through the comparison of predictions cast by an internal model with the sensory input. Such a process takes place throughout the (sub-)cortical hierarchy. A specific role is attributed to the different cortical layers in the information flow between top-down predictions and bottom-up sensory evidence [1,2]. While ultra-high field fMRI (7 Tesla) can be used to observe cortical depth-dependent brain activity non-invasively in humans, the effect of draining veins contaminates laminar gradient-echo BOLD activity, making it difficult to disentangle neuronal from vascular dynamics [3]. To investigate the role of cortical layers in responses to tones that either respect or deviate from contextual cues using BOLD fMRI, we use a biophysical model [4,5] that combines neuronal dynamics and laminar vascular physiology within a dynamic causal modeling (DCM) framework. Using this approach, we account for draining effects and reveal the laminar distribution of responses to unpredictable and mispredicted tones (compared to predictable ones) across the bilateral auditory cortex. In accordance with the predictive coding hypothesis [1,2,6], our results indicate a distinct role of deep and superficial cortical layers in the contextual processing of auditory stimuli. 1. Bastos A, Usrey M, Adams R, et al. Canonical microcircuits for predictive coding. Neuron. 2012;76(4). doi: 10.1016/j.neuron.2012.10.038. 2. De Lange F, Heilbron M, Kok P. How do expectations shape perception? Trends in Cogn Sci. 2018;22(9). https://doi.org/10.1016/j.tics.2018.06.002 3. Turner R. How much cortex can a vein drain? Downstream dilution of activation-related cerebral blood oxygenation changes. NeuroImage. 2002;16. https://doi.org/10.1006/nimg.2002.1082 4. Havlicek M & Uludag K. A dynamical model of the laminar BOLD response. NeuroImage. 2020;204. https://doi.org/10.1016/j.neuroimage.2019.116209 5. Uludag K & Havlicek M. Determining laminar neuronal activity from BOLD fMRI using a generative model. Progress in Neurobiology. 2021;207. https://doi.org/10.1016/j.pneurobio.2021.102055 6. Heilbron M & Chait M. Great Expectations: Is there Evidence for Predictive Coding in Auditory Cortex? Neuroscience. 2018;389. doi: 10.1016/j.neuroscience.2017.07.061.
Guan-En Graham, Liesl Co, Abhiram Kandukuri, Madelyn Sumner, Michael Chimenti, Kevin Knudtson, Devin Grenard and Kasia Bieszczad
Topic areas: memory and cognition
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Long-term auditory memories require experience-dependent neurophysiological plasticity in the adult auditory system. Epigenetic mechanisms are powerful regulators of learning-induced gene expression that can potentiate lasting effects on neuronal function to strengthen learned behaviors. One such epigenetic regulator, histone deacetylase 3 (HDAC3), works with transcriptional machinery to gate the activity-dependent de novo DNA transcription events that are required for long-term memory formation. Prior auditory studies in rodents show that pharmacologically blocking HDAC3 during discrimination learning increases cortical plasticity, memory, and sound-specific behavior. Here, we provide molecular-level insight into auditory cortical mechanisms that may underlie sound-specific memory and discriminative behavior. Genome-wide RNA-sequencing results reveal that inhibiting HDAC3 in adult male rats during the early acquisition phase of a two-tone frequency discrimination task produces large changes in learning-dependent transcription by further up- or down-regulating unique subsets of induced genes (relative to vehicle and sound-naïve groups). Gene-targeted studies (using qRT-PCR) in a separate cohort of identically trained and treated animals confirmed several identified genes of interest (GOIs), e.g., Egr1, Chrna7 and Per2, that may mediate the enhancing effects of HDAC3 inhibition on auditory memory and cortical plasticity. We utilized single-molecule fluorescent in situ hybridization (smFISH) to visualize GOI mRNA transcripts in the anatomical context of the auditory cortex. Egr1, Per2, and Chrna7 were characterized within horizontal slices of auditory cortical tissue collected from naïve, or trained and treated (HDAC3-inhibited vs. vehicle), rats, and co-localized with CamkIIa and Rorb transcripts to specify excitatory pyramidal cells and cortical layers, respectively. We report transcriptional changes within subpopulations of auditory cortical cells that are prime candidates for altering local microcircuitry in ways that support long-lasting sound-specific memories and discriminative behaviors.
Bradley E. White
Topic areas: memory and cognition speech and language hierarchical organization
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Decades of research have implicated a wide range of cortical areas in degraded speech processing and listening effort, but how these areas are functionally organized to perform such complex tasks is not well understood. We investigated how degraded speech impacts neural networks between attention and language areas in the human brain. Optical fNIRS brain imaging data were analyzed from 29 young adults (18 females; mean age = 30.66 ± 6.12). Participants were right-handed, screened for no hearing loss, and assessed for comparable speech recognition, English language understanding, and nonverbal intelligence. Participants were monolingual native English speakers with little-to-no experience with another language. Participants listened to sentences and performed a plausibility judgment task. Sentences were blocked and randomized in a 2x2x3 design that varied in syntax (simple subject-object order and complex object-subject order), speed (typical and fast), and clarity (clear undegraded, 3-channel band-pass filtered, and 8-channel noise vocoded). We predicted that cognitive load from increasingly degraded speech would trade off with executive resources for attention in the prefrontal cortex (PFC) and language in the left hemisphere (LH). Data were preprocessed and analyzed using the NIRS Brain AnalyzIR Toolbox. Subject-level functional connectivity (FC) was modeled between all possible channel pairs for each condition using robust general linear regression with pre-whitening. Group-level FC was modeled per channel pair for each condition using robust linear mixed-effects models, with subject treated as a random effect. FC was compared between conditions using t-tests and corrected for multiple comparisons. For the significant FC in each contrast, a weighted graph model was constructed with nodal degree and PageRank. We observed three main findings. First, band-pass filtered and noise-vocoded speech differently impacted FC for syntax. Second, these FC networks were sensitive to interactions with additional listening challenges (speech rate). Third, the PFC had dual roles: during low cognitive demand and disengagement, the PFC operated independently of the LH (no FC); when listening was more degraded, the PFC operated in synchrony with the LH (FC). These findings inform us of the increased computational demands that may be required for successful processing of degraded speech, and of how the PFC and LH operate as a network to overcome listening challenges.
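As a small illustration of the graph step (channel names and weights below are hypothetical; networkx stands in for whatever graph tooling was actually used), the surviving channel pairs can be assembled into a weighted graph and summarized with weighted nodal degree and PageRank:

```python
import networkx as nx

def connectivity_graph(significant_pairs):
    """Weighted graph from channel pairs whose FC survived multiple-comparison correction.

    significant_pairs : iterable of (channel_i, channel_j, fc_weight)
    Returns weighted nodal degree and PageRank centrality per channel.
    """
    g = nx.Graph()
    for ch_i, ch_j, weight in significant_pairs:
        g.add_edge(ch_i, ch_j, weight=abs(weight))
    degree = dict(g.degree(weight="weight"))        # weighted nodal degree
    pagerank = nx.pagerank(g, weight="weight")      # PageRank centrality
    return degree, pagerank

# Hypothetical contrast result: PFC coupling with left-hemisphere language channels
degree, pagerank = connectivity_graph([("PFC_ch1", "LH_ch4", 0.42),
                                       ("LH_ch4", "LH_ch7", 0.35),
                                       ("PFC_ch1", "LH_ch7", 0.21)])
```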
Nasiru Gill, Jason Putnam and Nikolas Francis
Topic areas: memory and cognition neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Our ability to monitor the recurrence of sensory events across different timescales is essential for adapting to dynamic environments, optimizing neural resource management, and identifying behaviorally-meaningful stimuli. One way that the central nervous system accomplishes this task is by repetition plasticity, in which neural activity is modulated up or down by repetitive sensation. Timescales of repetition plasticity typically span milliseconds to tens of seconds, with longer durations for cortical vs subcortical regions. Here, we used 2-photon (2P) imaging to study repetition plasticity in mouse primary auditory cortex (A1) layer 2/3 (L2/3) during the presentation of spectrotemporally randomized pure-tones. Our study revealed neurons with both repetition enhancement and suppression for equiprobable pure-tone frequencies spaced minutes apart, over a period lasting tens of minutes. Each neuron showed repetition plasticity for 1-2 frequencies near the neuron’s best frequency. Our results highlight cortical specialization for pattern recognition over long timescales in complex acoustic sequences.
Jasmine Hect, Kyle Rupp, Frederic Dick, Lori Holt and Taylor Abel
Topic areas: speech and language correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Voice perception engages characteristic regions of auditory cortex. However, it remains unclear to what extent these regions rely on shared or unique mechanisms for processing voice and non-voice sounds and for assessing sound patterns specific to voice. Direct intracerebral recordings were obtained from seven patient-participants with epilepsy undergoing chronic monitoring. Participants listened to 1) voice and non-voice stimuli (Voice Localizer) and 2) synthetic sounds generated from modulated noise, called Gaussian Sound Patterns (GSPs). We tested the hypothesis that temporal voice areas (TVAs) rely on shared representations of voice and other natural sounds across human auditory cortex, including the supratemporal plane (STP), superior temporal gyrus (STG) and superior temporal sulcus (STS). The GSP stimuli mirror the spectrotemporal features of natural sounds while remaining perceptually distinct. A convolutional neural network (CNN) classified GSP stimuli across various sound categories and was used to identify the GSPs most and least like voice (250 total). Broadband high-gamma activity (HGA; 70-150 Hz) was extracted from epochs of auditory stimuli and silence and was used to identify sound-responsive channels (two-sample t-test, FDR-corrected, q
Katharina Bochtler, Fred Dick, Lori Holt, Andrew King and Kerry M M Walker
Topic areas: memory and cognition
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The ability to direct our attention towards a single sound source, such as a friend's voice in a crowded room, is necessary in our acoustical world. This process is thought to rely, in part, on directing attention to different sound dimensions, such as frequency. Previous investigations have shown task-dependent changes in the frequency tuning of auditory cortical neurons when ferrets actively detect or discriminate a particular frequency of sound (e.g. Fritz et al. 2010). However, questions remain about how attentional gain can arise based on sound statistics. Specifically, to what extent can this modulation occur even if frequency is not a necessary component of the task demands? Mondor & Bregman (1994) demonstrated that human listeners' reaction times on a tone duration task were slower when the presented tone frequency was unexpected (i.e. low probability). Here, we test the hypothesis that the statistical likelihood of sound frequencies alone can also affect animals' behavioural decisions on orthogonal dimensions of sounds. We trained ferrets on either a go/no-go threshold detection task or a 2-alternative forced choice tone duration discrimination probe signal task in which we manipulated the statistical likelihood of tone frequencies. Our results show that, similar to humans, ferrets' reaction times increased for low-probability frequencies, but only when pure tones were presented near individual thresholds in the go/no-go variant. As with humans, accuracy remained stable across frequencies in the duration discrimination variant, while greater accuracy was seen for the expected frequency in the threshold detection variant. These results suggest that attentional filters are employed during listening, even for an acoustical dimension (frequency) that is orthogonal to the task demands (duration). Our future experiments will use this task in combination with microelectrode recordings to investigate the neurophysiological basis of statistically based attentional filtering in the auditory cortex.
Aditya Vaidya, Liberty Hamilton and Alexander Huth
Topic areas: speech and language
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
High spatial and temporal resolution intracranial recordings (iEEG, including sEEG and ECoG) have rapidly advanced our understanding of how human brains process speech at fine timescales. However, these methods have limited anatomical coverage and opportunities to acquire data are rare. In contrast, non-invasive fMRI offers whole-brain coverage and is easier to obtain, but is considered unsuitable for many questions due to its low temporal resolution. Here we overcome this limitation using computational models and large fMRI datasets collected while subjects listened to up to 20 hours of natural speech. Our technique replicates two discoveries made with iEEG: the presence of onset-selective speech areas in STG (Hamilton et al. 2018), and a hierarchy of temporal integration windows (Norman-Haignere et al. 2022). Both results rely on differentiating responses at the scale of ~100 ms, something previously thought impossible with fMRI. We fit voxelwise encoding models using activations from WavLM, an artificial neural network trained to model speech sounds that captures many aspects of speech and provides unsurpassed brain prediction performance. WavLM features are computed at 100 Hz, but are then downsampled to 0.5 Hz and convolved with a hemodynamic response function (HRF) before being linearly combined in the encoding model. We can simulate underlying neural responses by applying the same linear transform—excluding the downsampling and HRF—directly to the 100 Hz features. To map onset-selective speech areas we fit WavLM-based encoding models to predict “onset” and “sustained” ECoG response components from Hamilton et al. We correlated the models’ weights with those obtained from fMRI. Despite the onset response’s short timescale, we replicated the finding of onset-like responses in posterior superior temporal gyrus (pSTG). To map integration windows we directly replicated Norman-Haignere et al. in silico. In their temporal context invariance (TCI) paradigm, stimuli of various durations are presented in different contexts. A brain area’s integration window is then the smallest duration that produces a context-invariant response. We used our WavLM-based models to simulate 100 Hz responses to these stimuli for each voxel, and applied the TCI procedure to estimate integration windows. Consistent with iEEG findings, this showed a gradient of integration windows in primary auditory cortex ranging from
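A minimal sketch of the feature-to-fMRI step described above (illustrative HRF shape and TR; not the exact kernel or code used in the study): fast feature time courses are HRF-convolved and downsampled to fit the encoding model, while the simulated fast neural response simply passes the raw 100 Hz features through the same fitted weights.

```python
import numpy as np
from scipy.stats import gamma

def simple_hrf(t, peak=5.0, undershoot=15.0):
    """A toy double-gamma-style hemodynamic response function, for illustration only."""
    return gamma.pdf(t, peak) - 0.35 * gamma.pdf(t, undershoot)

def features_to_fmri_rate(features_100hz, tr=2.0):
    """Convolve 100 Hz feature time courses with the HRF and downsample to the fMRI rate."""
    kernel = simple_hrf(np.arange(0, 30, 0.01))                    # HRF sampled at 100 Hz
    convolved = np.apply_along_axis(
        lambda f: np.convolve(f, kernel)[: len(f)], 0, features_100hz)
    return convolved[:: int(tr * 100)]                             # (n_TRs, n_features)

# After fitting weights w on the downsampled regressors against BOLD data,
# the simulated 100 Hz neural response is simply features_100hz @ w
# (the same linear transform, but without the HRF convolution or downsampling).
```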
Mark Presker, Gary Aston-Jones and Kasia Bieszczad
Topic areas: memory and cognition correlates of behavior/perception subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Sensory plasticity may underlie altered reactivity to drug cues in cocaine addiction. Altered behavioral control and neural encoding of drug cues are main features of addiction underlying relapse propensity, and reward system plasticity is central to addiction. Historically, the role of sensory processing in addiction has received little attention. Recently, the centrality of sensory systems in learning and memory has emerged, and the effects of drug taking on sensory systems may lie at the heart of altered reactivity to drug cues in addiction. Experience-dependent plasticity in the auditory brainstem occurs over a lifetime. These early processing changes are prime candidate mechanisms for adaptive processes, like focused attention, or maladaptive disease states, like addiction. We focus on the auditory brainstem response (ABR) to test whether basic sensory processing is affected by learning to associate sound cues with the effects of cocaine. The ABR is a transient, sound-evoked neural potential arising from synchronous activity in brainstem neurons and characterized by five distinctive peaks arising from sequential nodes in the early ascending auditory pathway. We hypothesized that repeated tone-cocaine pairings induce sound-specific auditory brainstem plasticity, revealed in ABR changes. Experiment 1: An auditory-cocaine conditioning paradigm for rats was used to obtain click- and tone-evoked ABRs pre- and post-conditioning (N=19). Rats (n=14) underwent 6 days of conditioning to associate a pure tone stimulus with cocaine (20 mg/kg; intraperitoneal). To control for cocaine exposure, a subset of rats was conditioned instead with saline (n=5). Experiment 2: To test the behavioral effects of prior drug-cue exposure, we employed operant procedures in addition to ABR recordings pre- and post-cocaine (n=14) or saline (n=8) conditioning (N=22). ABRs were analyzed for changes in established measures: sound level thresholds and peak latencies. We also employ a simple approach to identify and characterize broad ABR changes to complement traditional analyses. We observe both sound-specific changes in the ABR related to the spectral identity of the cocaine-paired tone cue and global non-specific changes. We also report sound-specific ABR changes that correlate with behavioral measures of sound-cue stimulus control. These data support the idea that drug addiction transforms sensory systems, which are thus prime candidate targets of drug addiction processes and recovery.
Tobias Teichert
Topic areas: speech and language hierarchical organization novel technologies subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Background. Key insights into the function of the human auditory system stem from electrical or magnetic recordings of auditory evoked responses (AEPs) at the scalp or brain surface. A fundamental limitation of surface-based AEPs is the inherent inability to uniquely identify the underlying 3-dimensional electric fields and thus their neural generators. Despite their importance, no systematic effort has been made to directly measure, characterize, and analyze the underlying electric fields in their native 3-dimensional glory. Here we report recent success in recording 3-dimensional auditory evoked fields (AEFs) from one entire hemisphere of one macaque monkey using our novel approach termed "whole-brain mesoscopic electrophysiology," or MePhys. We use MePhys to dissect the generators of the frequency following response (FFR), a spectro-temporally complex scalp-recorded AEP in response to speech sounds, believed to reflect multiple sources in brainstem, midbrain and cortex. Methods. Our MePhys platform consists of 62 multi-contact T-Probes (Plexon) arranged in a regular grid on the axial plane with shafts spaced 4 and 5 mm apart (medio-lateral and antero-posterior, respectively). The 16 contacts along the dorso-ventral extent of each shaft were spaced out to span the underlying tissue of the tel-, met-, and diencephalon, as well as parts of the mesencephalon, but excluding the pons and medulla. Depending on the underlying anatomy, inter-electrode distance along a shaft ranged between 2.5 and 0.35 mm. In addition, we implanted 32 EEG skull electrodes for a total of 1024 electrode contacts. Results. We were able to simultaneously record AEFs from brainstem, inferior colliculus, thalamus, a dense cluster of regions in auditory cortex, as well as cerebellum, posterior parietal and motor cortex. FFRs were identified in brainstem, inferior colliculus, thalamus and the superior temporal plane. FFRs in each region had their own characteristic latency and filter properties. Brainstem and cortex clearly contributed to the scalp-recorded FFR. However, despite vigorous responses in the inferior colliculus and thalamus, it was less clear whether these propagate to and contribute to the scalp FFR. Conclusion. Whole-brain mesoscopic electrophysiology has immense promise to simultaneously record and study communication between (auditory) regions across the entire brain. It thus builds a crucial bridge between macroscopic and microscopic electrophysiological recordings.
Shovan Bhatia, Kyle Rupp, Jasmine Hect, Sreekrishna Ramakrishnapillai and Taylor Abel
Topic areas: speech and language neural coding
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
INTRODUCTION Spectrotemporal receptive fields (STRFs) are a time-frequency measure of an auditory neuron's response to stimuli delivered at various frequencies. They generate linear approximations of neural responses from sound spectrograms and can be used to study the auditory pathway. We hypothesize that a STRF model can be leveraged to understand and predict neural responses in the auditory cortex from the spectrotemporal features of natural sound stimuli. METHODS We queried our database of 24 patients aged 4-19 undergoing stereoelectroencephalography (sEEG) for drug-resistant epilepsy between 2019 and 2023. Electrodes in the auditory cortex, specifically within the superior temporal gyrus (STG), superior temporal sulcus (STS), and Heschl's gyrus (HG), were chosen for analysis. Experimental sessions consisted of patients listening to 165 natural sounds (Natural Sounds Stimulus Set, McDermott Lab) while neural activity was recorded. High-gamma neural activity and spectrograms of the sound stimuli were analyzed through STRF models. Three STRF models with varied fitting methods were generated using 5-fold cross-validation and a time window of 300 ms: Direct Fit, Gradient Descent, and Coordinate Descent (STRFPak, Theunissen Lab). Real and STRF-predicted neural responses were compared across key regions of interest (ROIs): STG, STS, and HG. Correlations between the real and STRF-predicted neural responses were evaluated using Pearson correlation, with R2 as the primary metric. RESULTS Two patients, a 14-year-old male and a 15-year-old female, were included in the preliminary analysis. A total of 43 channels were analyzed for the first patient: STG (n=22), STS (n=19), HG (n=2), and 16 channels for the second patient: STG (n=5), STS (n=9), HG (n=2). For both patients, the Direct Fit and Coordinate Descent models performed best within HG (Patient 1: R2 = 0.40, 0.45; Patient 2: R2 = 0.59, 0.60) and the Gradient Descent model performed best within the STG (Patient 1: R2 = 0.39; Patient 2: R2 = 0.51). CONCLUSION This preliminary study demonstrates the utility of STRF models for predicting neural responses to natural sounds. This can be leveraged to further understand how neuronal populations within primary auditory cortex respond to natural sounds and stimuli in everyday life.
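For readers unfamiliar with the model class, a bare-bones STRF fit (array names are hypothetical; ridge regression stands in for the specific STRFPak fitting methods) predicts one channel's high-gamma response from lagged spectrogram frames:

```python
import numpy as np
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr

def fit_strf(spectrogram, high_gamma, n_lags=30):
    """Linear STRF relating a stimulus spectrogram to one channel's high-gamma response.

    spectrogram : (n_frames, n_freq_bins) stimulus spectrogram
    high_gamma  : (n_frames,) high-gamma amplitude on one electrode
    n_lags      : number of time lags, e.g. 30 x 10 ms frames = 300 ms window
    """
    lagged = np.hstack([np.roll(spectrogram, lag, axis=0) for lag in range(n_lags)])
    lagged[:n_lags] = 0                              # discard samples that wrapped around
    model = Ridge(alpha=1.0).fit(lagged, high_gamma)
    predicted = model.predict(lagged)                # in practice, predict on held-out folds
    r, _ = pearsonr(predicted, high_gamma)
    strf = model.coef_.reshape(n_lags, -1)           # receptive field: lags x frequency bins
    return strf, r ** 2
```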
Chaoqun Cheng, Zijian Huang, Ruiming Zhang, Likai Tang and Xiaoqin Wang
Topic areas: correlates of behavior/perception cross-species comparisons neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The common marmoset has become an important experimental animal model in scientific research, including neuroscience, biomedicine and ethology. The ability to capture and quantify behaviors of marmosets in natural environments and social scenarios is highly desired by the marmoset research community. Existing pose tracking methods like DeepLabCut and SLEAP enable multi-animal 2D pose tracking, and with particular extensions, DeepLabCut can further estimate an animal's 3D pose. Some other methods, such as DANNCE and OpenMonkeyStudio, provide 3D pose tracking for a single animal. Multi-marmoset 3D pose tracking, especially in real time, has remained a challenge in this field. We aim to develop an efficient and user-friendly 3D pose tracking system that can be used by a wide range of researchers to study the marmoset's natural behaviors. Here, we introduce MarmoPose: a deep learning-based pose tracking system for estimating 3D poses of multiple marmosets. MarmoPose is designed to automatically track the poses of multiple marmosets freely roaming in their home-cage environment using four video cameras mounted on the cage corners. In our system, multi-view images captured by the cameras are first processed by deep neural networks to predict the 2D locations of 16 body parts for each marmoset. Subsequently, triangulation and optimization are performed on these 2D locations, leveraging a marmoset skeleton model, to estimate the 3D poses. MarmoPose offers several advantages over existing systems: (1) The system is designed for user-friendly deployment in a typical marmoset family cage without additional modifications and can therefore be easily transferred to other housing or experimental environments. (2) The system employs a pre-established marmoset skeleton model for 3D coordinate optimization, thereby improving the precision of the reconstructed 3D poses and making it possible to estimate body parts hidden in the cameras' blind spots. (3) We provide a comprehensive dataset of annotated marmoset body parts, using 16 points of interest to capture diverse body postures of a marmoset in motion. Researchers can select any subset of these body points for tracking interactions between multiple marmosets, offering a tradeoff between precision and computing speed. (4) The system supports real-time closed-loop experimental control based on the captured 3D poses and positions of the marmosets and can therefore be used to coordinate other experimental functions.
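The triangulation step of such a pipeline can be sketched with the standard direct linear transform (this is generic multi-view geometry, not MarmoPose's actual implementation, which additionally optimizes against the skeleton model):

```python
import numpy as np

def triangulate_point(projection_matrices, points_2d):
    """DLT triangulation of one body part seen by several calibrated cameras.

    projection_matrices : list of 3x4 camera projection matrices
    points_2d           : list of (x, y) detections of the same body part, one per camera
    Returns the body part's 3D position in world coordinates.
    """
    rows = []
    for P, (x, y) in zip(projection_matrices, points_2d):
        rows.append(x * P[2] - P[0])          # each view contributes two linear constraints
        rows.append(y * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]                                # least-squares solution in homogeneous coords
    return X[:3] / X[3]
```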
Frederic Dick, Austin Luor, Sahil Luthra and Lori L. Holt
Topic areas: memory and cognition correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
How are perception and decision making shaped by experience of the statistics of the world, even when those statistics seem to have little or no bearing on the task at hand? And what cognitive and neural mechanisms might mediate this statistical shaping? Here we address these broad questions in audition by building on two venerable psychoacoustics paradigms that we have adapted for efficient online experimentation. By manipulating the relative likelihood of tone frequencies across a variety of tone pool sizes and distributions, we show that behavior does not strictly mirror input statistics. Rather, the shape of the influence of stimulus probability is consistent with an attentional filter directed at high-probability stimuli, with graded suppression of low-probability stimuli in a manner that reflects perceptual distance from the high-probability stimulus. Quite counterintuitively, statistical learning makes detection of low-probability stimuli poorer and decision-making on low-probability stimuli slower relative to baseline performance. Importantly, these effects emerge even for input statistics present across a task-irrelevant perceptual dimension. We observe that this statistical learning emerges quickly and tracks closely with volatile changes in input statistics, even when the probability distributions are bimodal. Moreover, we observe the fingerprints of adult participants' long-term perceptual priors on the course of statistical learning. Across experiments, we find that the character of statistically mediated perceptual plasticity is most consistent with the emergence of suppressive processes applied in a filter-like fashion across an auditory dimension.
Jiaoting Li, Juan Huang and Xiaoqin Wang
Topic areas: correlates of behavior/perception
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Pitch perception plays a crucial role in auditory processing of real-world sounds, which oftentimes do not have perfect harmonic structure. In the acoustic engineering field, the harmonic sieve model has been proposed to simulate human pitch perception of sounds with imperfect harmonic structure. However, direct evidence for the harmonic sieve model is lacking. Here, we systematically examined the harmonic sieve hypothesis. In a series of psychophysical tests, participants were asked to discriminate the pitches of perfect harmonic tones and inharmonic tones with jittered harmonic frequencies. Results showed that pitch perception was robust when harmonic frequencies were jittered by up to 5%, whereas jitter above 5% significantly impaired pitch perception. These results suggest that 5% tolerance may represent a property of the 'harmonic sieve'. Furthermore, we recorded electroencephalography (EEG) from participants while they listened through earphones to sounds with perfect harmonic structure (80% of trials) and with jittered harmonic frequencies (5, 10, 20, and 30% jitter; 5% of trials each). The amplitude of the event-related potential (ERP) at 100-220 ms recorded from frontal-central electrodes decreased as jitter level increased. We found that 5% jitter did not result in a significant change in ERP amplitude, which, together with the behavioral results, suggests that a 5% jitter tolerance may represent a computational characteristic of the harmonic sieve model. Finally, we developed a computational model to simulate human pitch perception by implementing a specific Gaussian sieve-filter with a biologically constrained cochlear model. Our results provide new evidence supporting the harmonic sieve hypothesis and its role in pitch perception.
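As a toy version of the sieve idea (not the authors' model, which couples the sieve to a cochlear front end), candidate fundamentals can be scored by how well resolved spectral components fall through Gaussian slots whose width is roughly 5% of each component's frequency:

```python
import numpy as np

def harmonic_sieve_pitch(component_freqs, f0_candidates, tolerance=0.05):
    """Pick the candidate F0 whose harmonic sieve best matches the spectral components.

    component_freqs : frequencies (Hz) of resolved spectral components
    f0_candidates   : candidate fundamental frequencies (Hz) to evaluate
    tolerance       : Gaussian sieve width as a fraction of component frequency (~5%)
    """
    scores = []
    for f0 in f0_candidates:
        slots = f0 * np.arange(1, 11)                               # first ten sieve slots
        dist = np.abs(component_freqs[:, None] - slots[None, :]).min(axis=1)
        sigma = tolerance * component_freqs
        scores.append(np.exp(-0.5 * (dist / sigma) ** 2).sum())     # Gaussian pass-through
    return f0_candidates[int(np.argmax(scores))]

# Components jittered by a few percent still yield a pitch estimate near 200 Hz
components = np.array([202.0, 396.0, 610.0, 805.0])
print(harmonic_sieve_pitch(components, np.arange(150.0, 300.0, 1.0)))
```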
Jongwon Lee and Karl Kandler
Topic areas: subcortical processing
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
Before hearing onset, the glycinergic pathway from the medial nucleus of the trapezoid body (MNTB) to the lateral superior olive (LSO) undergoes functional and tonotopic refinement, which involves the silencing of most connections and the strengthening of maintained connections. During this developmental period, glycinergic MNTB-LSO synapses transiently co-release GABA. In this study, we addressed the question of whether this GABA co-release plays a role in the maturation and/or refinement of the MNTB-LSO pathway. Using slice recordings in a genetic mouse model in which GABA co-release is disrupted by the conditional and global deletion of the GABA-synthesizing enzymes GAD67 and GAD65, we found that loss of GABA co-release had no effect on the elimination or strengthening of connections, suggesting that topographic refinement does not require synaptic GABA signaling. However, loss of GABA co-release changed the functional synaptic architecture of MNTB-LSO connections, characterized by a 27% decrease in the number of release sites and a 67% increase in quantal size. Although these changes did not affect synaptic transmission at low-frequency activation, they decreased the reliability and fidelity of synaptic transmission at high activity levels. We propose that GABA co-release promotes the development of MNTB-LSO connections with many release sites and large quantal content, which supports reliable synaptic transmission at high, in vivo-like, presynaptic activity levels. Supported by NIDCD grants R01DC019814 and R01DC004199.
Brendan Prendergast, Jingwen Li and Cory Miller
Topic areas: memory and cognition correlates of behavior/perception neural coding neuroethology/communication
Fri, 11/10 4:00PM - 6:00PM | Posters 2
Abstract
The capacity to form internal representations of space is fundamental to all animals. While a large body of literature shows that neurons in the hippocampus of mammalian species encode an individual's self-position while locomoting through space (i.e. place cells), species that rely on distal active sensing are also able to represent space without the need to physically enter a scene. Primates, for example, use both vision and audition to encode spatial scenes. But while previous experiments have shown that neurons in primate hippocampus encode visual space (e.g. spatial view cells), no previous studies have shown whether similar spatial mechanisms exist to encode auditory space in this substrate. Here we sought to address this question by recording the activity of single neurons in the hippocampus of freely-moving marmosets (Callithrix jacchus) while subjects participated in a multi-speaker, naturalistic communication network paradigm. Preliminary analyses from two monkeys revealed that a subpopulation of neurons in marmoset hippocampus (31%) was spatially selective; that is, these neurons exhibited significantly higher call-evoked activity for calls coming from a single speaker location or adjacent speaker locations. Importantly, analyses indicated that the subject's relative location and head orientation at the time a stimulus was broadcast had no effect on neural activity, indicating that the selectivity of auditory space neurons was allocentric. Furthermore, we found that 39% of these neurons were selective for the interaction of an individual caller and their spatial location. Further analyses showed that both single-unit and population-level linear models can accurately decode the speaker location of call events, suggesting that a broader ensemble code for representing allocentric space may be present in primate hippocampus. These findings are the first evidence that neurons in primate hippocampus encode allocentric locations in space and are selective for an individual caller's position in an acoustic scene, and they have significant implications for conceptions of the neural mechanisms underlying the cocktail party problem.