H. Asari and A. M. Zador, Long-Lasting Context Dependence Constrains Neural Encoding Models in Rodent Auditory Cortex, Journal of Neurophysiology, vol.102, issue.5, pp.2638-2656, 2009.
DOI : 10.1152/jn.00577.2009

C. Atencio and C. Schreiner, Columnar Connectivity and Laminar Processing in Cat Primary Auditory Cortex, PLoS ONE, vol.5, issue.3, 2010.
DOI : 10.1371/journal.pone.0009521

J. Aucouturier and F. Pachet, Improving timbre similarity: how high's the sky?, J. Negat. Results Speech Audio Sci, vol.1, pp.1-13, 2004.

J. Aucouturier and F. Pachet, The influence of polyphony on the dynamical modelling of musical timbre, Pattern Recognition Letters, vol.28, issue.5, pp.654-661, 2007.
DOI : 10.1016/j.patrec.2006.11.004

S. Baumann, T. Griffiths, L. Sun, C. Petkov, A. Thiele et al., Orthogonal representation of sound dimensions in the primate midbrain, Nature Neuroscience, vol.14, issue.4, pp.423-425, 2011.
DOI : 10.1038/nn.2771
URL : https://hal.archives-ouvertes.fr/hal-00619270

C. M. Bishop, Pattern Recognition and Machine Learning, 2006.

Y. Boureau, J. Ponce, and Y. LeCun, A theoretical analysis of feature pooling in visual recognition, Proceedings of the 27th International Conference on Machine Learning (ICML-10) (Haifa), pp.111-118, 2010.

T. Chi, P. Ru, and S. Shamma, Multiresolution spectrotemporal analysis of complex sounds, The Journal of the Acoustical Society of America, vol.118, issue.2, pp.887-906, 2005.
DOI : 10.1121/1.1945807

G. Christianson, M. Sahani, and J. Linden, The Consequences of Response Nonlinearities for Interpretation of Spectrotemporal Receptive Fields, Journal of Neuroscience, vol.28, issue.2, pp.446-455, 2008.
DOI : 10.1523/JNEUROSCI.1775-07.2007

S. V. David and S. A. Shamma, Integration over Multiple Timescales in Primary Auditory Cortex, Journal of Neuroscience, vol.33, issue.49, 2013.
DOI : 10.1523/JNEUROSCI.2270-13.2013

J. J. Eggermont, The auditory cortex: the final frontier, in Computational Models of the Auditory System, R. Meddis, E. Lopez-Poveda, A. Popper, and R. Fay (eds.), Springer Handbook of Auditory Research, vol.35, pp.97-127, 2010.

M. A. Escabí, R. Nassiri, L. M. Miller, C. E. Schreiner, and H. L. Read, The Contribution of Spike Threshold to Acoustic Feature Selectivity, Spike Information Content, and Information Throughput, Journal of Neuroscience, vol.25, issue.41, pp.9524-9534, 2005.
DOI : 10.1523/JNEUROSCI.1804-05.2005

A. Fishbach, Y. Yeshurun, and I. Nelken, Neural Model for Physiological Responses to Frequency and Amplitude Transitions Uncovers Topographical Order in the Auditory Cortex, Journal of Neurophysiology, vol.90, issue.6, pp.2303-2323, 2003.
DOI : 10.1152/jn.00654.2003

A. Flexer, E. Pampalk, and G. Widmer, Hidden Markov models for spectral similarity of songs, Proceedings of the 8th International Conference on Digital Audio Effects (DAFx), 2005.

D. Gehr, H. Komiya, and J. Eggermont, Neuronal responses in cat primary auditory cortex to natural and altered species-specific calls, Hearing Research, vol.150, issue.1-2, pp.27-42, 2000.
DOI : 10.1016/S0378-5955(00)00170-2

D. Giannoulis, E. Benetos, D. Stowell, M. Rossignol, M. Lagrange et al., Detection and classification of acoustic scenes and events: An IEEE AASP challenge, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013.
DOI : 10.1109/WASPAA.2013.6701819
URL : https://hal.archives-ouvertes.fr/hal-01123765

B. Gourévitch, A. Noreña, G. Shaw, and J. Eggermont, Spectrotemporal Receptive Fields in Anesthetized Cat Primary Auditory Cortex Are Context Dependent, Cerebral Cortex, vol.19, issue.6, pp.1448-1461, 2009.
DOI : 10.1093/cercor/bhn184

N. Halko, P. Martinsson, and J. A. Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Review, vol.53, issue.2, pp.217-288, 2011.
DOI : 10.1137/090771806

P. Hamel, S. Lemieux, Y. Bengio, and D. Eck, Temporal pooling and multiscale learning for automatic annotation and ranking of music audio, Proceedings International Conference on Music Information Retrieval, pp.729-734, 2011.

G. Kim and A. Doupe, Organized Representation of Spectrotemporal Features in Songbird Auditory Forebrain, Journal of Neuroscience, vol.31, issue.47, 2011.
DOI : 10.1523/JNEUROSCI.2003-11.2011

D. Klein, P. König, and K. Körding, Sparse Spectrotemporal Coding of Sounds, EURASIP Journal on Advances in Signal Processing, vol.2003, issue.7, pp.659-667, 2003.
DOI : 10.1155/S1110865703303051

B. Kollmeier, M. R. Schädler, A. Meyer, J. Anemüller et al., Do We Need STRFs for Cocktail Parties? On the Relevance of Physiologically Motivated Features for Human Speech Perception Derived from Automatic Speech Recognition, Basic Aspects of Hearing, pp.333-341, 2013.
DOI : 10.1007/978-1-4614-1590-9_37

M. Lagrange, Explicit modeling of temporal dynamics within musical signals for acoustical unit similarity, Pattern Recognition Letters, vol.31, issue.12, pp.1498-1506, 2010.
DOI : 10.1016/j.patrec.2009.09.008
URL : https://hal.archives-ouvertes.fr/hal-00945198

M. Lagrange, G. Lafay, B. Defreville, and J. J. Aucouturier, The bag-of-frames approach: a not so sufficient model for urban soundscapes, after all, J. Acoust. Soc. Am. Express Lett, 2015.

I. Lampl, D. Ferster, T. Poggio, and M. Riesenhuber, Intracellular Measurements of Spatial Integration and the MAX Operation in Complex Cells of the Cat Primary Visual Cortex, Journal of Neurophysiology, vol.92, issue.5, pp.2704-2713, 2004.
DOI : 10.1152/jn.00060.2004

B. Logan and A. Salomon, A music similarity function based on signal analysis, IEEE International Conference on Multimedia and Expo (ICME 2001), 2001.
DOI : 10.1109/ICME.2001.1237829

T. Lu, L. Liang, and X. Wang, Temporal and rate representations of time-varying signals in the auditory cortex of awake primates, Nature Neuroscience, vol.4, issue.11, pp.1131-1138, 2001.
DOI : 10.1038/nn737

J. McDermott, M. Schemitsch, and E. Simoncelli, Summary statistics in auditory perception, Nature Neuroscience, vol.16, issue.4, pp.493-498, 2013.

N. Mesgarani, C. Cheung, K. Johnson, and E. F. Chang, Phonetic Feature Encoding in Human Superior Temporal Gyrus, Science, vol.343, issue.6174, 2014.
DOI : 10.1126/science.1245994

N. Mesgarani, S. David, and S. Shamma, Representation of Phonemes in Primary Auditory Cortex: How the Brain Analyzes Speech, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07, pp.765-768, 2007.
DOI : 10.1109/ICASSP.2007.367025

N. Misdariis, A. Minard, P. Susini, G. Lemaitre, S. McAdams et al., Environmental sound perception: Metadescription and modeling based on independent primary studies, EURASIP J. Audio Speech Music Process, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00560335

M. Müller, D. P. Ellis, A. Klapuri, and G. Richard, Signal Processing for Music Analysis, IEEE Journal of Selected Topics in Signal Processing, vol.5, issue.6, pp.1088-1110, 2011.
DOI : 10.1109/JSTSP.2011.2112333

I. Nelken and A. de Cheveigné, An ear for statistics, Nature Neuroscience, vol.16, issue.4, pp.381-382, 2013.
DOI : 10.1038/nn.3360

N. Orio, Music retrieval: a tutorial and review, Found. Trends Inf. Retr, 2006.

E. Pampalk, Audio-based music similarity and retrieval: combining a spectral similarity model with information extracted from fluctuation patterns, Proceedings of the ISMIR International Conference on Music Information Retrieval (ISMIR'06), 2006.

K. Patil, D. Pressnitzer, S. Shamma, and M. Elhilali, Music in our ears: the biological bases of musical timbre perception, PLOS Comput. Biol, 2012.

G. Peeters, A. La Burthe, and X. Rodet, Toward automatic music audio summary generation from signal analysis, Proceedings International Conference on Music Information Retrieval, pp.94-100, 2002.
URL : https://hal.archives-ouvertes.fr/hal-01161322

D. Ress and B. Chandrasekaran, Tonotopic Organization in the Depth of Human Inferior Colliculus, Frontiers in Human Neuroscience, vol.7, 2013.
DOI : 10.3389/fnhum.2013.00586

A. Schirmer and S. Kotz, Beyond the right hemisphere: brain mechanisms mediating vocal emotional processing, Trends in Cognitive Sciences, vol.10, issue.1, 2006.
DOI : 10.1016/j.tics.2005.11.009

C. Schreiner, H. Read, and M. Sutter, Modular Organization of Frequency Integration in Primary Auditory Cortex, Annual Review of Neuroscience, vol.23, issue.1, 2000.
DOI : 10.1146/annurev.neuro.23.1.501

W. A. Sethares and T. Staley, Periodicity transforms, IEEE Transactions on Signal Processing, vol.47, issue.11, pp.2953-2964, 1999.
DOI : 10.1109/78.796431
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.8853

T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio, Robust Object Recognition with Cortex-Like Mechanisms, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.3, pp.411-426, 2007.
DOI : 10.1109/TPAMI.2007.56