Publications

[My website has moved to https://serrjoa.github.io/]

Ongoing Work

Upsampling layers for music source separation.
J. Pons, J. Serrà, S. Pascual, G. Cengarle, D. Arteaga, & D. Scaini.
[arxiv] [demo]

2022

Self-supervised perceptual audio encoding by mixing discriminative and reconstructive tasks.
S. Pascual, J. Serrà, & J. Pons.
Patent Application No. ES-202230230 (Mar 18, 2022).

On loss functions and evaluation metrics for music source separation.
E. Gusó, J. Pons, S. Pascual, & J. Serrà.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). In press.
[arxiv] [DOI]

Assessing algorithmic biases for musical version identification.
F. Yesiler, M. Miron, J. Serrà, & E. Gómez.
Proc. of the ACM Int. Conf. on Web Search and Data Mining (WSDM), pp. 1284-1290. Feb 2022.
[arxiv] [DOI] [data+code]

Lognormals, power laws and double power laws in the distribution of frequencies of harmonic codewords from classical music.
M. Serra-Peralta, J. Serrà, & A. Corral.
Scientific Reports 12, 2615. Feb 2022.
[arxiv] [DOI] [code]

2021

Audio-based musical version identification: elements and challenges.
F. Yesiler, G. Doras, R.M. Bittner, C. Tralie, & J. Serrà.
IEEE Signal Processing Magazine 38(6): 115-136. Nov 2021.
[arxiv] [DOI] [web]

Adversarial auto-encoding for packet loss concealment.
S. Pascual, J. Serrà, & J. Pons.
Proc. of the IEEE Workshop on Appl. of Signal Proc. to Audio and Acoustics (WASPAA), pp. 71-75. Oct 2021.
[arxiv] [DOI]

Universal speech enhancement with generative neural networks.
J. Serrà, S. Pascual, & J. Pons.
Patent Application No. ES-P202130914 (Sep 29, 2021).

Heaps' law and vocabulary richness in the history of classical music harmony.
M. Serra-Peralta, J. Serrà, & A. Corral.
EPJ Data Science 10: 40. Aug 2021.
[arxiv] [DOI] [code]

Upsampling layers for audio synthesis.
J. Pons, J. Serrà, S. Pascual, G. Cengarle, D. Arteaga, & D. Scaini.
Patent Application No. ES-P202130417 (May 7, 2021), US-63/220279 (Jul 9, 2021).

On tuning consistent annealed sampling for denoising score matching.
J. Serrà, S. Pascual, & J. Pons.
Technical report. ArXiv: 2104.03725. Apr 2021.
[arxiv]

Investigating the efficacy of music version retrieval systems for setlist identification.
F. Yesiler, E. Molina, J. Serrà, & E. Gómez.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 541-545. Jun 2021.
[arxiv] [DOI] [data+code]

Upsampling artifacts in neural audio synthesis.
J. Pons, S. Pascual, G. Cengarle, & J. Serrà.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 3005-3009. Jun 2021.
[arxiv] [DOI] [code]

Automatic multitrack mixing with a differentiable mixing console of neural audio effects.
C.J. Steinmetz, J. Pons, S. Pascual, & J. Serrà.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 71-75. Jun 2021.
[arxiv] [DOI] [samples+scripts]

SESQA: semi-supervised learning for speech quality assessment.
J. Serrà, J. Pons, & S. Pascual.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 381-385. Jun 2021.
[arxiv] [DOI]

2020

Real-time packet loss concealment using deep generative networks.
S. Pascual, J. Serrà, & J. Pons.
Patent Application No. ES-P202031040 (Oct 15, 2020), US-63/195831 (Jun 2, 2021).

Less is more: faster and better music version identification with embedding distillation.
F. Yesiler, J. Serrà, & E. Gómez.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR). Oct 2020.
[arxiv] [ISMIR]

Combining musical features for cover detection.
G. Doras, F. Yesiler, J. Serrà, E. Gómez, & G. Peeters.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR). Oct 2020.
[zenodo] [ISMIR]

Experience: advanced network operations in (un-)connected remote communities.
D. Perino, X. Yang, J. Serrà, A. Lutu, & I. Leontiadis.
Proc. of the ACM Int. Conf. on Mobile Computing and Networking (MobiCom), num. 1. Sep 2020.
[ACM] [DOI]

Method for learning an audio quality metric combining labeled and unlabeled data.
J. Serrà, J. Pons, & S. Pascual.
Patent Application No. ES-P202030605 (Jun 22, 2020), US-63/072787 (Aug 31, 2020), EP2021/066786 (Jun 21, 2021).

System for automated multitrack mixing in the waveform domain with a learned differentiable mixing console and controller network.
C.J. Steinmetz & J. Serrà.
Patent Application No. ES-P202030604 (Jun 22, 2020), US-63/072762 (Aug 31, 2020), EP2021/066206 (Jun 16, 2021).

Accurate and scalable version identification using musically-motivated embeddings.
F. Yesiler, J. Serrà, & E. Gómez.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 21-25. May 2020.
[arxiv] [DOI] [Code+model+eval]

Input complexity and out-of-distribution detection with likelihood-based generative models.
J. Serrà, D. Álvarez, V. Gómez, O. Slizovskaia, J.F. Núñez, & J. Luque.
Proc. of the Int. Conf. on Learning Representations (ICLR). Apr 2020.
[arxiv] [OpenReview] [Presentation]

2019

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion.
J. Serrà, S. Pascual, & C. Segura.
In Advances in Neural Information Processing Systems (NeurIPS) 32: 6790-6800. Dec 2019.
[arXiv] [NeurIPS] [Code] [Examples]

Towards generalized speech enhancement with generative adversarial networks.
S. Pascual, J. Serrà, & A. Bonafonte.
Proc. of the Conf. of the Int. Speech Communication Assoc. (INTERSPEECH), pp. 161-165. Sep 2019.
[arXiv] [DOI] [Code] [Samples]

Learning problem-agnostic speech representations from multiple self-supervised tasks.
S. Pascual, M. Ravanelli, J. Serrà, A. Bonafonte, & Y. Bengio.
Proc. of the Conf. of the Int. Speech Communication Assoc. (INTERSPEECH), pp. 1791-1795. Sep 2019.
[arXiv] [DOI] [Code+model]

Time-domain speech enhancement using generative adversarial networks.
S. Pascual, J. Serrà, & A. Bonafonte.
Speech Communication 114: 10-21. Sep 2019.
[DOI] [Code] [Samples1/Samples2]

Exploring efficient neural architectures for linguistic-acoustic mapping in text-to-speech.
S. Pascual, J. Serrà, & A. Bonafonte.
Applied Sciences 9(16): 3391. Aug 2019.
[DOI] [Code]

Training neural audio classifiers with few data.
J. Pons, J. Serrà, & X. Serra.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 16-20. May 2019.
[arXiv] [DOI] [Code]

2018

When the state of the art is ahead of the state of understanding: unintuitive properties of deep neural networks.
J. Serrà.
Métode Science Studies Journal 99: 13-17. Dec 2018.
[UV] [DOI]

There goes Wally: anonymously sharing your location gives you away.
A. Pyrgelis, N. Kourtellis, I. Leontiadis, J. Serrà, & C. Soriente.
Proc. of the IEEE Int. Conf. on Big Data (BigData), pp. 1218-1227. Dec 2018.
[arXiv] [DOI]

Real non-volume preserving voice conversion.
S. Pascual, J. Serrà, & A. Bonafonte.
LXAI Research Workshop (NeurIPS-LXAI). Dec 2018.
[TALP] [LXAI]

Self-attention linguistic-acoustic decoder.
S. Pascual, A. Bonafonte, & J. Serrà.
Proc. of the IberSPEECH Conf., pp. 152-156. Nov 2018.
[arXiv] [ISCA]

Whispered-to-voiced alaryngeal speech conversion with generative adversarial networks.
S. Pascual, A. Bonafonte, J. Serrà, & J.A. Gonzalez.
Proc. of the IberSPEECH Conf., pp. 117-121. Nov 2018.
[arXiv] [ISCA] [Code]

Towards a universal neural network encoder for time series.
J. Serrà, S. Pascual, & A. Karatzoglou.
Proc. of the Int. Conf. of the Catalan Association for Artificial Intelligence (CCIA), Frontiers in Artificial Intelligence and Applications 308, pp. 120-129. Oct 2018.
[arXiv] [IOS]

MobInsight: a framework using semantic neighborhood features for localized interpretations of urban mobility.
S. Park, J. Serrà, E. Frias-Martinez, & N. Oliver.
ACM Trans. on Interactive Intelligent Systems 8(3): 23. Jul 2018.
[arXiv] [DOI] [Demo]

Overcoming catastrophic forgetting with hard attention to the task.
J. Serrà, D. Surís, M. Miron, & A. Karatzoglou.
Proc. of the Int. Conf. on Machine Learning (ICML) 80: 4555-4564. Jul 2018.
[arXiv] [PMLR] [Code]

Empirical evidence on daily cash flow time series and its implications for forecasting.
F. Salas-Molina, J.A. Rodríguez-Aguilar, J. Serrà, M. Guillen, & F.J. Martín.
Statistics and Operations Research Transactions 42(1): 73-98. Jun 2018.
[arXiv] [DOI] [Data]

Language and noise transfer in speech enhancement generative adversarial network.
S. Pascual, M. Park, J. Serrà, A. Bonafonte, & K.-H. Ahn.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5019-5023. Apr 2018.
[arXiv] [DOI]

Unintuitive properties of deep neural networks.
J. Serrà.
Proc. of the EC Workshop on Human Behaviour and Machine Intelligence (HUMAINT), pp. 11-12. Mar 2018.
[EC]

2017

Continual prediction of notification attendance with classical and deep network approaches.
K. Katevas, I. Leontiadis, M. Pielot, & J. Serrà.
Technical report. Dec 2017.
[arXiv]

Beyond interruptibility: predicting opportune moments to engage mobile phone users.
M. Pielot, B. Cardoso, K. Katevas, J. Serrà, A. Matic, & N. Oliver.
Proc. of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1(3): 91. Sep 2017. Presented at UbiComp 2017.
[pielot] [DOI]

Getting deep recommenders fit: Bloom embeddings for sparse binary input/output networks.
J. Serrà & A. Karatzoglou.
Proc. of the ACM Conf. on Recommender Systems (RECSYS), pp. 279-287. Aug 2017.
[arXiv] [DOI]

SEGAN: speech enhancement generative adversarial network.
S. Pascual, A. Bonafonte, & J. Serrà.
Proc. of the Conf. of the Int. Speech Communication Assoc. (INTERSPEECH), pp. 3642-3646. Aug 2017.
[arXiv] [DOI] [Code] [Examples]

Class-based prediction errors to detect hate speech with out-of-vocabulary words.
J. Serrà, I. Leontiadis, D. Spathis, G. Stringhini, J. Blackburn, & A. Vakali.
Proc. of the Conf. of the Association for Computational Linguistics (ACL), Workshop on Abusive Language Online (ALW), pp. 36-40. Aug 2017.
[OpenReview] [ACL]

Practical processing of mobile sensor data for continual deep learning predictions.
K. Katevas, I. Leontiadis, M. Pielot, & J. Serrà.
Proc. of the ACM Int. Conf. on Mobile Systems, Applications and Services (MOBISYS), Workshop on Deep Learning for Mobile Systems and Applications (DeepMobile), pp. 19-24. Jun 2017.
[arXiv] [DOI]

Compact embedding of binary-coded inputs and outputs using Bloom filters.
J. Serrà & A. Karatzoglou.
Int. Conf. on Learning Representations (ICLR) Workshop. Apr 2017.
[OpenReview]

The good, the bad, and the KPIs: how to combine performance metrics to better capture under-performing sectors in mobile networks.
I. Leontiadis, J. Serrà, A. Finamore, G. Dimopoulos, & K. Papagiannaki.
Proc. of the IEEE Int. Conf. on Data Engineering (ICDE), pp. 297-308. Apr 2017.
[IEEE] [DOI]

Hot or not? Forecasting cellular network hot spots using sector performance indicators.
J. Serrà, I. Leontiadis, A. Karatzoglou, & K. Papagiannaki.
Proc. of the IEEE Int. Conf. on Data Engineering (ICDE), pp. 259-270. Apr 2017.
[arXiv] [DOI]

Empowering cash managers to achieve cost savings by improving predictive accuracy.
F. Salas-Molina, F.J. Martín, J.A. Rodríguez-Aguilar, J. Serrà, & J.L. Arcos.
International Journal of Forecasting 33(2): 403-415. Apr 2017.
[arXiv] [DOI]

Performance metrics using KPI combinations to better capture underperforming sectors in mobile networks.
I. Leontiadis, J. Serrà, & A. Finamore.
Patent EP17382164.6, filed on 31/03/2017.

Forecast of cellular network hot spots using sector performance indicators.
J. Serrà & I. Leontiadis.
Patent EP17382163.8, filed on 31/03/2017.

Effect of acoustic conditions on algorithms to detect Parkinson's disease from speech.
J.C. Vásquez-Correa, J. Serrà, J.R. Orozco-Arroyave, J.F. Vargas-Bonilla, & E. Nöth.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5065-5069. Mar 2017.
[IEEE] [DOI]

2016

A genetic algorithm to discover flexible motifs with support.
J. Serrà, A. Matic, J.L. Arcos, & A. Karatzoglou.
Proc. of the IEEE Int. Conf. on Data Mining (ICDM), Workshop on Spatial and Spatiotemporal Data Mining (SSTDM), pp. 1153-1158. Dec 2016.
[arXiv] [DOI] [Code]

Time-delayed melody surfaces for raga recognition.
S. Gulati, J. Serrà, K.K. Ganguli, S. Senturk, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 751-757. Aug 2016.
[MTG] [ISMIR]

Ranking and significance of variable-length similarity-based time series motifs.
J. Serrà, I. Serra, A. Corral, & J.L. Arcos.
Expert Systems with Applications 55: 452-460. Aug 2016.
[arXiv] [DOI] [Code]

What makes a city vital and safe: Bogotá case study.
A. Bogomolov, A. Clavijo, M. De Nadai, R. Lara Molina, B. Lepri, E. Letouzé, N. Oliver, G. Pestre, J. Serrà, N. Shoup, & A. Ramirez Suarez.
Proc. of the Annual Bank Conf. on Development Economics (ABCDE): Data and Development Economics, session 2D: Crime, Civil Wars, and Hotspots. Jun 2016.
[ABCDE1] [ABCDE2]

Phrase-based raga recognition using vector space modeling.
S. Gulati, J. Serrà, V. Ishwar, S. Senturk, & X. Serra.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 66-70. Mar 2016.
[MTG] [DOI] [Code/Data]

Discovering raga motifs by characterizing communities in networks of melodic patterns.
S. Gulati, J. Serrà, V. Ishwar, & X. Serra.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 286-290. Mar 2016.
[MTG] [DOI] [Code/Data]

Particle swarm optimization for time series motif discovery.
J. Serrà & J.L. Arcos.
Knowledge-Based Systems 92: 127-137. Jan 2016.
[arXiv] [DOI] [Code]

2015

Improving melodic similarity in Indian art music using culture specific melodic characteristics.
S. Gulati, J. Serrà, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 680-686. Oct 2015.
[MTG] [ISMIR]

Analysis of the impact of a tag recommendation system in a real-world folksonomy.
F. Font, J. Serrà, & X. Serra.
ACM Trans. on Intelligent Systems and Technology 7(1): 6. Oct 2015.
[IIIA] [DOI]

Zipf-like distributions in language and music.
I. Moreno, F. Font-Clos, J. Serrà, & A. Corral.
Complexitat.cat Workshop. May 2015.
[complexitat.cat]

An evaluation of methodologies for melodic similarity in audio recordings of Indian art music.
S. Gulati, J. Serrà, & X. Serra.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 678-682. Apr 2015.
[IIIA] [DOI]

2014

Mining melodic patterns in large audio collections of Indian art music.
S. Gulati, J. Serrà, V. Ishwar, & X. Serra.
Proc. of the Int. Conf. on Signal Image Technology and Internet Based Systems (SITIS), pp. 264-271. Nov 2014.
[IIIA] [DOI] [Code] [Data]

Melodic pattern extraction in large collections of music recordings using time series mining techniques.
S. Gulati, J. Serrà, V. Ishwar, & X. Serra.
Demo Session at the Int. Soc. for Music Information Retrieval Conf. (ISMIR). Oct 2014.
[IIIA] [ISMIR]

An empirical evaluation of similarity measures for time series classification.
J. Serrà & J.L. Arcos.
Knowledge-Based Systems 67: 305-314. Sep 2014.
[IIIA] [DOI]

Landmark detection in Hindustani music melodies.
S. Gulati, J. Serrà, K.K. Ganguli, & X. Serra.
Proc. of the Int. Computer Music Conf. / Sound and Music Computing Conf. (ICMC/SMC), vol. 2, pp. 1062-1068. Sep 2014.
[IIIA] [ICMC/SMC] [Data]

Class-based tag recommendation and user-based evaluation in online audio clip sharing.
F. Font, J. Serrà, & X. Serra.
Knowledge-Based Systems 67: 131-142. Sep 2014.
[IIIA] [DOI]

Unsupervised music structure annotation by time series structure features and segment similarity.
J. Serrà, M. Müller, P. Grosche, & J.L. Arcos.
IEEE Trans. on Multimedia, Special Issue on Music Data Mining 16(5): 1229-1240. Aug 2014.
[IIIA] [DOI] [Code]

Intonation analysis of ragas in Carnatic music.
G.K. Koduri, V. Ishwar, J. Serrà, & X. Serra.
Journal of New Music Research, Special Issue on Computational Approaches to the Art Music Traditions of India and Turkey 43(1): 72-93. Mar 2014.
[IIIA] [DOI]

Audio clip classification using social tags and the effect of tag expansion.
F. Font, J. Serrà, & X. Serra.
Proc. of the AES Int. Conf. on Semantic Audio, paper num. 26. Jan 2014.
[IIIA] [AES]

2013

Folksonomy-based tag recommendation for collaborative tagging systems.
F. Font, J. Serrà, & X. Serra.
Int. Journal on Semantic Web and Information Systems 9(2): 1-30. Nov 2013.
[IIIA] [DOI]

What can we learn from massive music archives?
J. Serrà.
Dagstuhl Seminar 13451: Computational Audio Analysis. M. Müller, S. Narayanan, and B. Schuller, eds. Wadern, Germany. Nov 2013.
[IIIA] [Dagstuhl]

Learning of units and knowledge representation.
F. Metze, X. Anguera, S. Ewert, J. Gemmeke, D. Kolossa, E. Mower Provost, B. Schuller, & J. Serrà.
Dagstuhl Seminar 13451: Computational Audio Analysis. M. Müller, S. Narayanan, and B. Schuller, eds. Wadern, Germany. Nov 2013.
[IIIA] [Dagstuhl]

Source separation.
C. Uhle, J. Driedger, B. Edler, S. Ewert, F. Graf, G. Kubin, M. Müller, N. Ono, B. Pardo, & J. Serrà.
Dagstuhl Seminar 13451: Computational Audio Analysis. M. Müller, S. Narayanan, and B. Schuller, eds. Wadern, Germany. Nov 2013.
[IIIA] [Dagstuhl]

Towards cover group thumbnailing.
P. Grosche, M. Müller, & J. Serrà.
Proc. of the ACM Int. Conf. on Multimedia (ACM-MM), pp. 613-616. Oct 2013.
[IIIA] [DOI]

Sample identification in hip-hop music.
J. Van Balen, J. Serrà, & M. Haro.
In From Sounds to Music and Emotions, M. Aramaki, M. Barthet, R. Kronland-Martinet, and S. Ystad eds., Lecture Notes in Computer Science, vol. 7900, ch. 5, pp. 301-312. Sep 2013.
[IIIA] [DOI]

Note onset deviations as musical piece signatures.
J. Serrà, T.H. Özaslan, & J.L. Arcos.
PLoS ONE 8(7): e69268. Jul 2013.
[PLoS] [DOI]

Cognitive prognosis of acquired brain injury patients using machine learning techniques.
J. Serrà, J.L. Arcos, A. García-Rudolph, A. García-Molina, T. Roig, & J.M. Tormos.
Proc. of the Int. Conf. on Advanced Cognitive Technologies and Applications (COGNITIVE), pp. 108-113. May 2013.
[IIIA] [CSIC]

Measuring quantitative trends in western popular music.
J. Serrà, A. Corral, M. Boguñá, M. Haro, & J.L. Arcos.
CRM-Imperial College Workshop on Complex Systems. Barcelona, Spain. Apr 2013.
[IIIA] [CRM]

Tonal representations for music retrieval: from version identification to query-by-humming.
J. Salamon, J. Serrà, & E. Gómez.
Int. Journal of Multimedia Information Retrieval 2(1): 45-58. Feb 2013.
[IIIA] [DOI]

2012

Structure-based audio fingerprinting for music retrieval.
P. Grosche, J. Serrà, M. Müller, & J.L. Arcos.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 55-60. Oct 2012.
[IIIA] [ISMIR]

Folksonomy-based tag recommendation for online audio clip sharing.
F. Font, J. Serrà, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 73-78. Oct 2012.
[IIIA] [ISMIR]

Characterization of intonation in Carnatic music by parametrizing pitch histograms.
G.K. Koduri, J. Serrà, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 199-204. Oct 2012.
[IIIA] [ISMIR]

Extracting semantic information from an on-line Carnatic music forum.
M. Sordo, J. Serrà, G.K. Koduri, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 355-360. Oct 2012.
[IIIA] [ISMIR]

The importance of detecting boundaries in music structure annotation.
J. Serrà, M. Müller, P. Grosche, & J.L. Arcos.
Music Information Retrieval Evaluation eXchange (MIREX). Oct 2012.
[IIIA] [MIREX]

A competitive measure to assess the similarity between two time series.
J. Serrà & J.L. Arcos.
Proc. of the Int. Conf. on Case-Based Reasoning (ICCBR), Lecture Notes in Artificial Intelligence 7466, pp. 414-427. Sep 2012.
[IIIA] [DOI] [Code]

The computer as music critic.
J. Serrà & J.L. Arcos.
The New York Times, p. SR12. Sep 15, 2012.
[IIIA] [NYTimes]

Measuring the evolution of contemporary western popular music.
J. Serrà, A. Corral, M. Boguñá, M. Haro, & J.L. Arcos.
Scientific Reports 2: 521. Jul 2012.
[IIIA] [DOI]

Characterization and exploitation of community structure in cover song networks.
J. Serrà, M. Zanin, P. Herrera, & X. Serra.
Pattern Recognition Letters 33(9): 1032-1041. Jul 2012.
[arXiv] [DOI]

Unsupervised detection of music boundaries by time series structure features.
J. Serrà, M. Müller, P. Grosche, & J.L. Arcos.
Proc. of the AAAI Int. Conf. on Artificial Intelligence (AAAI), pp. 1613-1619. Jul 2012.
[IIIA] [AAAI]

Extracting semantic information from on-line art music discussion forums.
M. Sordo, J. Serrà, G.K. Koduri, & X. Serra.
CompMusic Workshop. Jul 2012.
[IIIA] [CompMusic]

Computational analysis of intonation in Indian art music.
G.K. Koduri, J. Serrà, & X. Serra.
CompMusic Workshop. Jul 2012.
[IIIA] [CompMusic]

Automatic identification of samples in hip hop music.
J. Van Balen, M. Haro, & J. Serrà.
Proc. of the Int. Symp. on Computer Music Modeling and Retrieval (CMMR), pp. 544-551. Jun 2012.
[IIIA] [CMMR]

Quantifying the evolution of popular music.
J. Serrà, A. Corral, M. Boguñá, M. Haro, & J.L. Arcos.
No Lineal Conf. Jun 2012.
[IIIA] [NoLineal]

Patterns, regularities, and evolution of contemporary popular music.
J. Serrà, A. Corral, M. Boguñá, M. Haro, & J.L. Arcos.
Complexitat.cat Workshop. May 2012.
[IIIA] [complexitat.cat]

Power-law distribution in encoded MFCC frames of speech, music, and environmental sound signals.
M. Haro, J. Serrà, A. Corral, & P. Herrera.
Proc. of the Int. World Wide Web Conf. (WWW), Workshop on Advances in Music Information Research (AdMIRe), pp. 895-902. Apr 2012.
[IIIA] [WWW]

Melody, bassline, and harmony representations for music version identification.
J. Salamon, J. Serrà, & E. Gómez.
Proc. of the Int. World Wide Web Conf. (WWW), Workshop on Advances in Music Information Research (AdMIRe), pp. 887-894. Apr 2012.
[IIIA] [WWW]

Audio content-based music retrieval.
P. Grosche, M. Müller, & J. Serrà.
In Multimodal Music Processing, M. Müller, M. Goto, and M. Schedl eds., Dagstuhl Follow-Ups, Dagstuhl Publishing, Wadern, Germany, vol. 3, ch. 9, pp. 157-174. Apr 2012.
[IIIA] [Dagstuhl]

Zipf's law in short-time timbral codings of speech, music, and environmental sound signals.
M. Haro, J. Serrà, P. Herrera, & A. Corral.
PLoS ONE 7(3): e33993. Mar 2012.
[IIIA] [DOI]

Predictability of music descriptor time series and its application to cover song detection.
J. Serrà, H. Kantz, X. Serra, & R.G. Andrzejak.
IEEE Trans. on Audio, Speech and Language Processing 20(2): 514-525. Feb 2012.
[MTG] [DOI]

2011

Identification of versions of the same musical composition: audio content-based approaches and post-processing steps.
J. Serrà.
LAP Lambert Academic Publishing, Saarbrücken, Germany. ISBN 978-3-8473-2785-1. Dec 2011.
[Amazon] [BN]

Assessing the tuning of sung Indian classical music.
J. Serrà, G.K. Koduri, M. Miron, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 263-268. Oct 2011.
[MTG] [ISMIR]

Computational approaches for the understanding of melody and rhythm in Carnatic music.
G.K. Koduri, M. Miron, J. Serrà, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 157-162. Oct 2011.
[MTG] [ISMIR]

Unifying low-level and high-level music similarity measures.
D. Bogdanov, J. Serrà, N. Wack, P. Herrera, & X. Serra.
IEEE Trans. on Multimedia 13(4): 687-701. Aug 2011.
[MTG] [DOI]

Method for calculating measures of similarity between time signals.
J. Serrà.
Patent US 2011/0178615, published July 21, 2011. Priority num. ES20090001057-20090423. Also published as ES 2354330 (Método para calcular medidas de similitud entre señales temporales).
[FreePatentsOnline] [EspaceNet]

Nonlinear audio recurrence analysis with application to genre classification.
J. Serrà, C.A. De Los Santos, & R.G. Andrzejak.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 169-172. May 2011.
[MTG] [DOI]

Identification of versions of the same musical composition by processing audio descriptions.
J. Serrà.
PhD Thesis. Universitat Pompeu Fabra, Barcelona, Spain. Mar 2011.
[MTG] [TDX]

Cover song networks: analysis and accuracy increase.
J. Serrà, M. Zanin, & P. Herrera.
Int. Journal of Complex Systems in Science 1: 55-59. Jan 2011.
[MTG]

2010 & Before...

Model-based cover song detection via threshold autoregressive forecasts.
J. Serrà, H. Kantz, & R.G. Andrzejak.
Proc. of the ACM Int. Conf. on Multimedia (ACM-MM), Workshop on Music and Machine Learning (MML), pp. 13-16. Oct 2010.
[MTG] [DOI]

Unsupervised accuracy improvement for cover song detection using spectral connectivity network.
M. Lagrange & J. Serrà.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 595-600. Aug 2010.
[MTG] [ISMIR]

Hybrid music similarity measure.
D. Bogdanov, J. Serrà, N. Wack, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Aug 2010.
[MTG] [MIREX]

Music classification using high-level models.
N. Wack, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, E. Gómez, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Aug 2010.
[MTG] [MIREX]

Cover song networks: analysis and accuracy increase.
J. Serrà, M. Zanin, & P. Herrera.
Net-Works Int. Conf. Jun 2010.
[MTG] [Net-Works]

Indexing music by mood: design and integration of an automatic content-based annotator.
C. Laurier, O. Meyers, J. Serrà, M. Blech, P. Herrera, & X. Serra.
Multimedia Tools and Applications 48(1): 161-184. May 2010.
[MTG] [DOI]

Audio cover song identification and similarity: background, approaches, evaluation, and beyond.
J. Serrà, E. Gómez, & P. Herrera.
In Advances in Music Information Retrieval, Z. W. Ras and A. A. Wieczorkowska eds., Studies in Computational Intelligence series, Springer, Berlin, Germany, vol. 274, ch. 14, pp. 307-332. Mar 2010.
[MTG] [DOI]

From low-level to high-level: comparative study of music similarity measures.
D. Bogdanov, J. Serrà, N. Wack, & P. Herrera.
Proc. of the IEEE Int. Symp. on Multimedia, Workshop on Advances in Music Information Research (AdMIRe), pp. 453-458. Dec 2009.
[MTG] [DOI]

Unsupervised detection of cover song sets: accuracy improvement and original identification.
J. Serrà, M. Zanin, C. Laurier, & M. Sordo.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 225-230. Oct 2009.
[MTG] [ISMIR]

Music mood representations from social tags.
C. Laurier, M. Sordo, J. Serrà, & P. Herrera.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 381-386. Oct 2009.
[MTG] [ISMIR]

The discipline formerly known as MIR.
P. Herrera, J. Serrà, C. Laurier, E. Guaus, E. Gómez, & X. Serra.
Int. Soc. for Music Information Retrieval Conf. (ISMIR), special session on the Future of MIR (fMIR). Oct 2009.
[MTG] [fMIR]

Cover song retrieval by cross recurrence quantification and unsupervised set detection.
J. Serrà, M. Zanin, & R.G. Andrzejak.
Music Information Retrieval Evaluation eXchange (MIREX). Oct 2009.
[MTG] [MIREX]

Music type groupers (MTG): generic music classification algorithms.
N. Wack, E. Guaus, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Oct 2009.
[MTG] [MIREX]

Hybrid similarity measures for music recommendation.
D. Bogdanov, J. Serrà, N. Wack, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Oct 2009.
[MTG] [MIREX]

Assessing the results of a cover song identification system with coverSSSSearch.
J. Serrà.
Demo Session at the Int. Soc. for Music Information Retrieval Conf. (ISMIR). Oct 2009.
[MTG]

Cross recurrence quantification for cover song identification.
J. Serrà, X. Serra, & R.G. Andrzejak.
New Journal of Physics 11: 093017. Sep 2009.
[MTG] [DOI] [Code]

Shape-based spectral contrast descriptor.
V. Akkermans, J. Serrà, & P. Herrera.
Proc. of the Sound and Music Computing Conf. (SMC), pp. 143-148. Jul 2009.
[MTG] [SMC]

Music mood annotator design and integration.
C. Laurier, O. Meyers, J. Serrà, M. Blech, & P. Herrera.
Proc. of the Int. Workshop on Content-Based Multimedia Indexing (CBMI), pp. 156-161. Jun 2009.
[MTG] [DOI]

Music similarity systems and methods using descriptors.
E. Gómez, P. Herrera, P. Cano, J. Janer, J. Serrà, J. Bonada, S. El-Hajj, T. Aussenac, & G. Holmberg.
Patent US 2008/300702, published December 31, 2008. Priority nums. US20070946860P-20070628, US20070970109P-20070905, and US20070988714P-20071116. Also published as WO 2009/001202.
[FreePatentsOnline] [EspaceNet]

Statistical analysis of chroma features in western music predicts human judgments of tonality.
J. Serrà, E. Gómez, P. Herrera, & X. Serra.
Journal of New Music Research 37(4): 299-309. Dec 2008.
[MTG] [DOI]

Transposing chroma representations to a common key.
J. Serrà, E. Gómez, & P. Herrera.
Proc. of the Int. Conf. on The Use of Symbols to Represent Music and Multimedia Objects, pp. 45-48. Oct 2008.
[MTG] [UniMi]

Improving binary similarity and local alignment for cover song detection.
J. Serrà, E. Gómez, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Sep 2008.
[MTG] [MIREX]

Chroma binary similarity and local alignment applied to cover song identification.
J. Serrà, E. Gómez, P. Herrera, & X. Serra.
IEEE Trans. on Audio, Speech and Language Processing 16(6): 1138-1152. Aug 2008.
[MTG] [DOI]

Audio cover song identification based on tonal sequence alignment.
J. Serrà & E. Gómez.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 61-64. Apr 2008.
[MTG] [DOI]

A qualitative assessment of measures for the evaluation of a cover song identification system.
J. Serrà.
Proc. of the Int. Conf. on Music Information Retrieval (ISMIR), pp. 319-322. Sep 2007.
[MTG] [ISMIR]

A cover song identification system based on sequences of tonal descriptors.
J. Serrà & E. Gómez.
Music Information Retrieval Evaluation eXchange (MIREX). Sep 2007.
[MTG] [MIREX]

Music similarity based on sequences of descriptors: tonal features applied to cover song identification.
J. Serrà.
MSc Thesis. Universitat Pompeu Fabra, Barcelona, Spain. Sep 2007.
[MTG]