Joan Serrà

PUBLICATIONS

Another list of publications (with citations) is available from my Google Scholar profile.

Ongoing Work

Upsampling layers for music source separation.
J. Pons, J. Serrà, S. Pascual, G. Cengarle, D. Arteaga, & D. Scaini.
[arxiv] [demo]

2022

Self-supervised perceptual audio encoding by mixing discriminative and reconstructive tasks.
S. Pascual, J. Serrà, & J. Pons.
Patent Application No. ES-202230230 (Mar 18, 2022).

On loss functions and evaluation metrics for music source separation.
E. Gusó, J. Pons, S. Pascual, & J. Serrà.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). In press.
[arxiv] [DOI]

Assessing algorithmic biases for musical version identification.
F. Yesiler, M. Miron, J. Serrà, & E. Gómez.
Proc. of the ACM Int. Conf. on Web Search and Data Mining (WSDM), pp. 1284-1290. Feb 2022.
[arxiv] [DOI] [data+code]

Lognormals, power laws and double power laws in the distribution of frequencies of harmonic codewords from classical music.
M. Serra-Peralta, J. Serrà, & A. Corral.
Scientific Reports 12, 2615. Feb 2022.
[arxiv] [DOI] [code]

2021

Audio-based musical version identification: elements and challenges.
F. Yesiler, G. Doras, R.M. Bittner, C. Tralie, & J. Serrà.
IEEE Signal Processing Magazine 38(6): 115-136. Nov 2021.
[arxiv] [DOI] [web]

Adversarial auto-encoding for packet loss concealment.
S. Pascual, J. Serrà, & J. Pons.
Proc. of the IEEE Workshop on Appl. of Signal Proc. to Audio and Acoustics (WASPAA), pp. 71-75. Oct 2021.
[arxiv] [DOI]

Universal speech enhancement with generative neural networks.
J. Serrà, S. Pascual, & J. Pons.
Patent Application No. ES-P202130914 (Sep 29, 2021).

Heaps' law and vocabulary richness in the history of classical music harmony.
M. Serra-Peralta, J. Serrà, & A. Corral.
EPJ Data Science 10: 40. Aug 2021.
[arxiv] [DOI] [code]

Upsampling layers for audio synthesis.
J. Pons, J. Serrà, S. Pascual, G. Cengarle, D. Arteaga, & D. Scaini.
Patent Application No. ES-P202130417 (May 7, 2021), US-63/220279 (Jul 9, 2021).

On tuning consistent annealed sampling for denoising score matching.
J. Serrà, S. Pascual, & J. Pons.
Technical report. ArXiv: 2104.03725. Apr 2021.
[arxiv]

Investigating the efficacy of music version retrieval systems for setlist identification.
F. Yesiler, E. Molina, J. Serrà, & E. Gómez.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 541-545. Jun 2021.
[arxiv] [DOI] [data+code]

Upsampling artifacts in neural audio synthesis.
J. Pons, S. Pascual, G. Cengarle, & J. Serrà.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 3005-3009. Jun 2021.
[arxiv] [DOI] [code]

Automatic multitrack mixing with a differentiable mixing console of neural audio effects.
C.J. Steinmetz, J. Pons, S. Pascual, & J. Serrà.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 71-75. Jun 2021.
[arxiv] [DOI] [samples+scripts]

SESQA: semi-supervised learning for speech quality assessment.
J. Serrà, J. Pons, & S. Pascual.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 381-385. Jun 2021.
[arxiv] [DOI]

2020

Real-time packet loss concealment using deep generative networks.
S. Pascual, J. Serrà, & J. Pons.
Patent Application No. ES-P202031040 (Oct 15, 2020), US-63/195831 (Jun 2, 2021).

Less is more: faster and better music version identification with embedding distillation.
F. Yesiler, J. Serrà, & E. Gómez.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR). Oct 2020.
[arxiv] [ISMIR]

Combining musical features for cover detection.
G. Doras, F. Yesiler, J. Serrà, E. Gómez, & G. Peeters.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR). Oct 2020.
[zenodo] [ISMIR]

Experience: advanced network operations in (un-)connected remote communities.
D. Perino, X. Yang, J. Serrà, A. Lutu, & I. Leontiadis.
Proc. of the ACM Int. Conf. on Mobile Computing and Networking (MobiCom), num. 1. Sep 2020.
[ACM] [DOI]

Method for learning an audio quality metric combining labeled and unlabeled data.
J. Serrà, J. Pons, & S. Pascual.
Patent Application No. ES-P202030605 (Jun 22, 2020), US-63/072787 (Aug 31, 2020), EP2021/066786 (Jun 21, 2021).


System for automated multitrack mixing in the waveform domain with a learned differentiable mixing console and controller network.
C.J. Steinmetz & J. Serrà.
Patent Application No. ES-P202030604 (Jun 22, 2020), US-63/072762 (Aug 31, 2020), EP2021/066206 (Jun 16, 2021).

Accurate and scalable version identification using musically-motivated embeddings.
F. Yesiler, J. Serrà, & E. Gómez.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 21-25. May 2020.
[arxiv] [DOI] [Code+model+eval]

Input complexity and out-of-distribution detection with likelihood-based generative models.
J. Serrà, D. Álvarez, V. Gómez, O. Slizovskaia, J.F. Núñez, & J. Luque.
Proc. of the Int. Conf. on Learning Representations (ICLR). Apr 2020.
[arxiv] [OpenReview] [Presentation]

2019

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion.
J. Serrà, S. Pascual, & C. Segura.
In Advances in Neural Information Processing Systems (NeurIPS) 32: 6790-6800. Dec 2019.
[arXiv] [NeurIPS] [Code] [Examples]

Towards generalized speech enhancement with generative adversarial networks.
S. Pascual, J. Serrà, & A. Bonafonte.
Proc. of the Conf. of the Int. Speech Communication Assoc. (INTERSPEECH), pp. 161-165. Sep 2019.
[arXiv] [DOI] [Code] [Samples]

Learning problem-agnostic speech representations from multiple self-supervised tasks.
S. Pascual, M. Ravanelli, J. Serrà, A. Bonafonte, & Y. Bengio.
Proc. of the Conf. of the Int. Speech Communication Assoc. (INTERSPEECH), pp. 1791-1795. Sep 2019.
[arXiv] [DOI] [Code+model]

Time-domain speech enhancement using generative adversarial networks.
S. Pascual, J. Serrà, & A. Bonafonte.
Speech Communication 114: 10-21. Sep 2019.
[DOI] [Code] [Samples1/Samples2]

Exploring efficient neural architectures for linguistic-acoustic mapping in text-to-speech.
S. Pascual, J. Serrà, & A. Bonafonte.
Applied Sciences 9(16): 3391. Aug 2019.
[DOI] [Code]

Training neural audio classifiers with few data.
J. Pons, J. Serrà, & X. Serra.
Proc. of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 16-20. May 2019.
[arXiv] [DOI] [Code]

2018

When the state of the art is ahead of the state of understanding: unintuitive properties of deep neural networks.
J. Serrà.
Métode Science Studies Journal 99: 13-17. Dec 2018.
[UV] [DOI]

There goes Wally: anonymously sharing your location gives you away.
A. Pyrgelis, N. Kourtellis, I. Leontiadis, J. Serrà, & C. Soriente.
Proc. of the IEEE Int. Conf. on Big Data (BigData), pp. 1218-1227. Dec 2018.
[arXiv] [DOI]

Real non-volume preserving voice conversion.
S. Pascual, J. Serrà, & A. Bonafonte.
LXAI Research Workshop (NeurIPS-LXAI). Dec 2018.
[TALP] [LXAI]

Self-attention linguistic-acoustic decoder.
S. Pascual, A. Bonafonte, & J. Serrà.
Proc. of the IberSPEECH Conf., pp. 152-156. Nov 2018.
[arXiv] [ISCA]

Whispered-to-voiced alaryngeal speech conversion with generative adversarial networks.
S. Pascual, A. Bonafonte, J. Serrà, & J.A. Gonzalez.
Proc. of the IberSPEECH Conf., pp. 117-121. Nov 2018.
[arXiv] [ISCA] [Code]

Towards a universal neural network encoder for time series.
J. Serrà, S. Pascual, & A. Karatzoglou.
Proc. of the Int. Conf. of the Catalan Association for Artificial Intelligence (CCIA), Frontiers in Artificial Intelligence and Applications 308, pp. 120-129. Oct 2018.
[arXiv] [IOS]

MobInsight: a framework using semantic neighborhood features for localized interpretations of urban mobility.
S. Park, J. Serrà, E. Frias-Martinez, & N. Oliver.
ACM Trans. on Interactive Intelligent Systems 8(3): 23. Jul 2018.
[arXiv] [DOI] [Demo]

Overcoming catastrophic forgetting with hard attention to the task.
J. Serrà, D. Surís, M. Miron, & A. Karatzoglou.
Proc. of the Int. Conf. on Machine Learning (ICML) 80: 4555-4564. Jul 2018.
[arXiv] [PMLR] [Code]

Empirical evidence on daily cash flow time series and its implications for forecasting.
F. Salas-Molina, J.A. Rodríguez-Aguilar, J. Serrà, M. Guillen, & F.J. Martín.
Statistics and Operations Research Transactions 42(1): 73-98. Jun 2018.
[arXiv] [DOI] [Data]

Language and noise transfer in speech enhancement generative adversarial network.
S. Pascual, M. Park, J. Serrà, A. Bonafonte, & K.-H. Ahn.
Proc. of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 5019-5023. Apr 2018.
[arXiv] [DOI]

Unintuitive properties of deep neural networks.
J. Serrà.
Proc. of the EC Workshop on Human Behaviour and Machine Intelligence (HUMAINT), pp. 11-12. Mar 2018.
[EC]

2017

Continual prediction of notification attendance with classical and deep network approaches.
K. Katevas, I. Leontiadis, M. Pielot, & J. Serrà.
Technical report. Dec 2017.
[arXiv]

Beyond interruptibility: predicting opportune moments to engage mobile phone users.
M. Pielot, B. Cardoso, K. Katevas, J. Serrà, A. Matic, & N. Oliver.
Proc. of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1(3): 91. Sep 2017. Presented at UbiComp 2017.
[pielot] [DOI]

Getting deep recommenders fit: Bloom embeddings for sparse binary input/output networks.
J. Serrà & A. Karatzoglou.
Proc. of the ACM Conf. on Recommender Systems (RECSYS), pp. 279-287. Aug 2017.
[arXiv] [DOI]

SEGAN: speech enhancement generative adversarial network.
S. Pascual, A. Bonafonte, & J. Serrà.
Proc. of the Conf. of the Int. Speech Communication Assoc. (INTERSPEECH), pp. 3642-3646. Aug 2017.
[arXiv] [DOI] [Code] [Examples]

Class-based prediction errors to detect hate speech with out-of-vocabulary words.
J. Serrà, I. Leontiadis, D. Spathis, G. Stringhini, J. Blackburn, & A. Vakali.
Proc. of the Conf. of the Association for Computational Linguistics (ACL), Workshop on Abusive Language Online (ALW), pp. 36-40. Aug 2017.
[OpenReview] [ACL]

Practical processing of mobile sensor data for continual deep learning predictions.
K. Katevas, I. Leontiadis, M. Pielot, & J. Serrà.
Proc. of the ACM Int. Conf. on Mobile Systems, Applications and Services (MOBISYS), Workshop on Deep Learning for Mobile Systems and Applications (DeepMobile), pp. 19-24. Jun 2017.
[arXiv] [DOI]

Compact embedding of binary-coded inputs and outputs using Bloom filters.
J. Serrà & A. Karatzoglou.
Int. Conf. on Learning Representations (ICLR) Workshop. Apr 2017.
[OpenReview]

The good, the bad, and the KPIs: how to combine performance metrics to better capture under-performing sectors in mobile networks.
I. Leontiadis, J. Serrà, A. Finamore, G. Dimopoulos, & K. Papagiannaki.
Proc. of the IEEE Int. Conf. on Data Engineering (ICDE), pp. 297-308. Apr 2017.
[IEEE] [DOI]

Hot or not? Forecasting cellular network hot spots using sector performance indicators.
J. Serrà, I. Leontiadis, A. Karatzoglou, & K. Papagiannaki.
Proc. of the IEEE Int. Conf. on Data Engineering (ICDE), pp. 259-270. Apr 2017.
[arXiv] [DOI]

Empowering cash managers to achieve cost savings by improving predictive accuracy.
F. Salas-Molina, F.J. Martín, J.A. Rodríguez-Aguilar, J. Serrà, & J.L. Arcos.
International Journal of Forecasting 33(2): 403-415. Apr 2017.
[arXiv] [DOI]

Performance metrics using KPI combinations to better capture underperforming sectors in mobile networks.
I. Leontiadis, J. Serrà, & A. Finamore.
Patent EP17382164.6, filed on 31/03/2017.

Forecast of cellular network hot spots using sector performance indicators.
J. Serrà & I. Leontiadis.
Patent EP17382163.8, filed on 31/03/2017.

Effect of acoustic conditions on algorithms to detect Parkinson's disease from speech.
J.C. Vásquez-Correa, J. Serrà, J.R. Orozco-Arroyave, J.F. Vargas-Bonilla, & E. Nöth.
Proc. of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 5065-5069. Mar 2017.
[IEEE] [DOI]

2016

A genetic algorithm to discover flexible motifs with support.
J. Serrà, A. Matic, J.L. Arcos, & A. Karatzoglou.
Proc. of the IEEE Int. Conf. on Data Mining (ICDM), Workshop on Spatial and Spatiotemporal Data Mining (SSTDM), pp. 1153-1158. Dec 2016.
[arXiv] [DOI] [Code]

Time-delayed melody surfaces for raga recognition.
S. Gulati, J. Serrà, K.K. Ganguli, S. Senturk, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 751-757. Aug 2016.
[MTG] [ISMIR]

Ranking and significance of variable-length similarity-based time series motifs.
J. Serrà, I. Serra, A. Corral, & J.L. Arcos.
Expert Systems with Applications 55: 452-460. Aug 2016.
[arXiv] [DOI] [Code]

What makes a city vital and safe: Bogotá case study.
A. Bogomolov, A. Clavijo, M. De Nadai, R. Lara Molina, B. Lepri, E. Letouzé, N. Oliver, G. Pestre, J. Serrà, N. Shoup, & A. Ramirez Suarez.
Proc. of the Annual Bank Conf. on Development Economics (ABCDE): Data and Development Economics, session 2D: Crime, Civil Wars, and Hotspots. Jun 2016.
[ABCDE1] [ABCDE2]

Phrase-based raga recognition using vector space modeling.
S. Gulati, J. Serrà, V. Ishwar, S. Senturk, & X. Serra.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 66-70. Mar 2016.
[MTG] [DOI] [Code/Data]

Discovering raga motifs by characterizing communities in networks of melodic patterns.
S. Gulati, J. Serrà, V. Ishwar, & X. Serra.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 286-290. Mar 2016.
[MTG] [DOI] [Code/Data]

Particle swarm optimization for time series motif discovery.
J. Serrà & J.L. Arcos.
Knowledge-Based Systems 92: 127-137. Jan 2016.
[arXiv] [DOI] [Code]

2015

Improving melodic similarity in Indian art music using culture specific melodic characteristics.
S. Gulati, J. Serrà, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 680-686. Oct 2015.
[MTG] [ISMIR]

Analysis of the impact of a tag recommendation system in a real-world folksonomy.
F. Font, J. Serrà, & X. Serra.
ACM Trans. on Intelligent Systems and Technology 7(1): 6. Oct 2015.
[IIIA] [DOI]

Zipf-like distributions in language and music.
I. Moreno, F. Font-Clos, J. Serrà, & A. Corral.
Complexitat.cat Workshop. May 2015.
[complexitat.cat]

An evaluation of methodologies for melodic similarity in audio recordings of Indian art music.
S. Gulati, J. Serrà, & X. Serra.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 678-682. Apr 2015.
[IIIA] [DOI]

2014

Mining melodic patterns in large audio collections of Indian art music.
S. Gulati, J. Serrà, V. Ishwar, & X. Serra.
Proc. of the Int. Conf. on Signal Image Technology and Internet Based Systems (SITIS), pp. 264-271. Nov 2014.
[IIIA] [DOI] [Code] [Data]

Melodic pattern extraction in large collections of music recordings using time series mining techniques.
S. Gulati, J. Serrà, V. Ishwar, & X. Serra.
Demo Session at the Int. Soc. for Music Information Retrieval Conf. (ISMIR). Oct 2014.
[IIIA] [ISMIR]

An empirical evaluation of similarity measures for time series classification.
J. Serrà & J.L. Arcos.
Knowledge-Based Systems 67: 305-314. Sep 2014.
[IIIA] [DOI]

Landmark detection in Hindustani music melodies.
S. Gulati, J. Serrà, K.K. Ganguli, & X. Serra.
Proc. of the Int. Computer Music Conf. / Sound and Music Computing Conf. (ICMC/SMC), vol. 2, pp. 1062-1068. Sep 2014.
[IIIA] [ICMC/SMC] [Data]

Class-based tag recommendation and user-based evaluation in online audio clip sharing.
F. Font, J. Serrà, & X. Serra.
Knowledge-Based Systems 67: 131-142. Sep 2014.
[IIIA] [DOI]

Unsupervised music structure annotation by time series structure features and segment similarity.
J. Serrà, M. Müller, P. Grosche, & J.L. Arcos.
IEEE Trans. on Multimedia, Special Issue on Music Data Mining 16(5): 1229-1240. Aug 2014.
[IIIA] [DOI] [Code]

Intonation analysis of ragas in Carnatic music.
G.K. Koduri, V. Ishwar, J. Serrà, & X. Serra.
Journal of New Music Research, Special Issue on Computational Approaches to the Art Music Traditions of India and Turkey 43(1): 72-93. Mar 2014.
[IIIA] [DOI]

Audio clip classification using social tags and the effect of tag expansion.
F. Font, J. Serrà, & X. Serra.
Proc. of the AES Int. Conf. on Semantic Audio, paper num. 26. Jan 2014.
[IIIA] [AES]

2013

Folksonomy-based tag recommendation for collaborative tagging systems.
F. Font, J. Serrà, & X. Serra.
Int. Journal on Semantic Web and Information Systems 9(2): 1-30. Nov 2013.
[IIIA] [DOI]

What can we learn from massive music archives?
J. Serrà.
Dagstuhl Seminar 13451: Computational Audio Analysis. M. Müller, S. Narayanan, and B. Schuller, eds. Wadern, Germany. Nov 2013.
[IIIA] [Dagstuhl]

Learning of units and knowledge representation.
F. Metze, X. Anguera, S. Ewert, J. Gemmeke, D. Kolossa, E. Mower Provost, B. Schuller, & J. Serrà.
Dagstuhl Seminar 13451: Computational Audio Analysis. M. Müller, S. Narayanan, and B. Schuller, eds. Wadern, Germany. Nov 2013.
[IIIA] [Dagstuhl]

Source separation.
C. Uhle, J. Driedger, B. Edler, S. Ewert, F. Graf, G. Kubin, M. Müller, N. Ono, B. Pardo, & J. Serrà.
Dagstuhl Seminar 13451: Computational Audio Analysis. M. Müller, S. Narayanan, and B. Schuller, eds. Wadern, Germany. Nov 2013.
[IIIA] [Dagstuhl]

Towards cover group thumbnailing.
P. Grosche, M. Müller, & J. Serrà.
Proc. of the ACM Int. Conf. on Multimedia (ACM-MM), pp. 613-616. Oct 2013.
[IIIA] [DOI]

Sample identification in hip-hop music.
J. Van Balen, J. Serrà, & M. Haro.
In From Sounds to Music and Emotions, M. Aramaki, M. Barthet, R. Kronland-Martinet, and S. Ystad eds., Lecture Notes in Computer Science, vol. 7900, ch. 5, pp. 301-312. Sep 2013.
[IIIA] [DOI]

Note onset deviations as musical piece signatures.
J. Serrà, T.H. Özaslan, & J.L. Arcos.
PLoS ONE 8(7): e69268. Jul 2013.
[PLoS] [DOI]

Cognitive prognosis of acquired brain injury patients using machine learning techniques.
J. Serrà, J.L. Arcos, A. García-Rudolph, A. García-Molina, T. Roig, & J.M. Tormos.
Proc. of the Int. Conf. on Advanced Cognitive Technologies and Applications (COGNITIVE), pp. 108-113. May 2013.
[IIIA] [CSIC]

Measuring quantitative trends in western popular music.
J. Serrà, A. Corral, M. Boguñá, M. Haro, & J.L. Arcos.
CRM-Imperial College Workshop on Complex Systems. Barcelona, Spain. Apr 2013.
[IIIA] [CRM]

Tonal representations for music retrieval: from version identification to query-by-humming.
J. Salamon, J. Serrà, & E. Gómez. 
Int. Journal of Multimedia Information Retrieval 2(1): 45-58. Feb 2013.
[IIIA] [DOI]

2012

Structure-based audio fingerprinting for music retrieval.
P. Grosche, J. Serrà, M. Müller, & J.L. Arcos.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 55-60. Oct 2012.
[IIIA] [ISMIR]

Folksonomy-based tag recommendation for online audio clip sharing.
F. Font, J. Serrà, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 73-78. Oct 2012.
[IIIA] [ISMIR]

Characterization of intonation in Carnatic music by parametrizing pitch histograms.
G.K. Koduri, J. Serrà, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 199-204. Oct 2012.
[IIIA] [ISMIR]

Extracting semantic information from an on-line Carnatic music forum.
M. Sordo, J. Serrà, G.K. Koduri, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 355-360. Oct 2012.
[IIIA] [ISMIR]

The importance of detecting boundaries in music structure annotation.
J. Serrà, M. Müller, P. Grosche, & J.L. Arcos.
Music Information Retrieval Evaluation eXchange (MIREX). Oct 2012.
[IIIA] [MIREX]

A competitive measure to assess the similarity between two time series.
J. Serrà & J.L. Arcos.
Proc. of the Int. Conf. on Case-Based Reasoning (ICCBR), Lecture Notes in Artificial Intelligence 7466, pp. 414-427. Sep 2012.
[IIIA] [DOI] [Code]

The computer as music critic.
J. Serrà & J.L. Arcos.
The New York Times, p. SR12. Sep 15, 2012.
[IIIA] [NYTimes]

Measuring the evolution of contemporary western popular music.
J. Serrà, A. Corral, M. Boguñá, M. Haro, & J.L. Arcos.
Scientific Reports 2: 521. Jul 2012.
[IIIA] [DOI]

Characterization and exploitation of community structure in cover song networks.
J. Serrà, M. Zanin, P. Herrera, & X. Serra.
Pattern Recognition Letters 33(9): 1032-1041. Jul 2012.
[arXiv] [DOI]

Unsupervised detection of music boundaries by time series structure features.
J. Serrà, M. Müller, P. Grosche, & J.L. Arcos.
Proc. of the AAAI Int. Conf. on Artificial Intelligence (AAAI), pp. 1613-1619. Jul 2012.
[IIIA] [AAAI]

Extracting semantic information from on-line art music discussion forums.
M. Sordo, J. Serrà, G.K. Koduri, & X. Serra.
CompMusic Workshop. Jul 2012.
[IIIA] [CompMusic]

Computational analysis of intonation in Indian art music. 
G.K. Koduri, J. Serrà, & X. Serra.
CompMusic Workshop. Jul 2012.
[IIIA] [CompMusic]

Automatic identification of samples in hip hop music.
J. Van Balen, M. Haro, & J. Serrà.
Proc. of the Int. Symp. on Computer Music Modeling and Retrieval (CMMR), pp. 544-551. Jun 2012.
[IIIA] [CMMR]

Quantifying the evolution of popular music.
J. Serrà, A. Corral, M. Boguñá, M. Haro, & J.L. Arcos.
No Lineal Conf. Jun 2012.
[IIIA] [NoLineal]

Patterns, regularities, and evolution of contemporary popular music.
J. Serrà, A. Corral, M. Boguñá, M. Haro, & J.L. Arcos.
Complexitat.Cat Workshop. May 2012.
[IIIA] [complexitat.cat]

Power-law distribution in encoded MFCC frames of speech, music, and environmental sound signals.
M. Haro, J. Serrà, A. Corral, & P. Herrera.
Proc. of the Int. World Wide Web Conf. (WWW), Workshop on Advances in Music Information Research (AdMIRe), pp. 895-902. Apr 2012.
[IIIA] [WWW]

Melody, bassline, and harmony representations for music version identification.
J. Salamon, J. Serrà, & E. Gómez.
Proc. of the Int. World Wide Web Conf. (WWW), Workshop on Advances in Music Information Research (AdMIRe), pp. 887-894. Apr 2012.
[IIIA] [WWW]

Audio content-based music retrieval.
P. Grosche, M. Müller, & J. Serrà.
In Multimodal Music Processing, M. Müller, M. Goto, and M. Schedl eds., Dagstuhl Follow-Ups, Dagstuhl Publishing, Wadern, Germany, vol. 3, ch. 9, pp. 157-174. Apr 2012.
[IIIA] [Dagstuhl]

Zipf's law in short-time timbral codings of speech, music, and environmental sound signals.
M. Haro, J. Serrà, P. Herrera, & A. Corral.
PLoS ONE 7(3): e33993. Mar 2012.
[IIIA] [DOI]

Predictability of music descriptor time series and its application to cover song detection.
J. Serrà, H. Kantz, X. Serra, & R.G. Andrzejak.
IEEE Trans. on Audio, Speech and Language Processing 20(2): 514-525. Feb 2012.
[MTG] [DOI]

2011

Identification of versions of the same musical composition: audio content-based approaches and post-processing steps.
J. Serrà.
LAP Lambert Academic Publishing, Saarbrücken, Germany. ISBN 978-3-8473-2785-1. Dec 2011.
[Amazon] [BN]

Assessing the tuning of sung Indian classical music.
J. Serrà, G.K. Koduri, M. Miron, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 263-268. Oct 2011.
[MTG] [ISMIR]

Computational approaches for the understanding of melody and rhythm in Carnatic music.
G.K. Koduri, M. Miron, J. Serrà, & X. Serra.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 157-162. Oct 2011.
[MTG] [ISMIR]

Unifying low-level and high-level music similarity measures.
D. Bogdanov, J. Serrà, N. Wack, P. Herrera, & X. Serra.
IEEE Trans. on Multimedia 13(4): 687-701. Aug 2011.
[MTG] [DOI]

Method for calculating measures of similarity between time signals.
J. Serrà.
Patent US 2011/0178615, published July 21, 2011. Priority num. ES20090001057-20090423. Also published as ES 2354330 (Método para calcular medidas de similitud entre señales temporales).
[FreePatentsOnline] [EspaceNet]

Nonlinear audio recurrence analysis with application to genre classification.
J. Serrà, C.A. De Los Santos, & R.G. Andrzejak.
Proc. of the IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 169-172. May 2011.
[MTG] [DOI]

Identification of versions of the same musical composition by processing audio descriptions.
J. Serrà.
PhD Thesis. Universitat Pompeu Fabra, Barcelona, Spain. Mar 2011.
[MTG] [TDX]

Cover song networks: analysis and accuracy increase.
J. Serrà, M. Zanin, & P. Herrera.
Int. Journal of Complex Systems in Science 1: 55-59. Jan 2011.
[MTG]

2010 & Before...

Model-based cover song detection via threshold autoregressive forecasts.
J. Serrà, H. Kantz, & R.G. Andrzejak.
Proc. of the ACM Int. Conf. on Multimedia (ACM-MM), Workshop on Music and Machine Learning (MML), pp. 13-16. Oct 2010.
[MTG] [DOI]

Unsupervised accuracy improvement for cover song detection using spectral connectivity network.
M. Lagrange & J. Serrà.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 595-600. Aug 2010.
[MTG] [ISMIR]

Hybrid music similarity measure.
D. Bogdanov, J. Serrà, N. Wack, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Aug 2010.
[MTG] [MIREX]

Music classification using high-level models.
N. Wack, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, E. Gómez, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Aug 2010.
[MTG] [MIREX]

Cover song networks: analysis and accuracy increase.
J. Serrà, M. Zanin, & P. Herrera.
Net-Works Int. Conf. Jun 2010.
[MTG] [Net-Works]

Indexing music by mood: design and integration of an automatic content-based annotator.
C. Laurier, O. Meyers, J. Serrà, M. Blech, P. Herrera, & X. Serra.
Multimedia Tools and Applications 48(1): 161-184. May 2010.
[MTG] [DOI]

Audio cover song identification and similarity: background, approaches, evaluation, and beyond.
J. Serrà, E. Gómez, & P. Herrera.
In Advances in Music Information Retrieval, Z. W. Ras and A. A. Wieczorkowska eds., Studies in Computational Intelligence series, Springer, Berlin, Germany, vol. 274, ch. 14, pp. 307-332. Mar 2010.
[MTG] [DOI]

From low-level to high-level: comparative study of music similarity measures.
D. Bogdanov, J. Serrà, N. Wack, & P. Herrera.
Proc. of the IEEE Int. Symp. on Multimedia, Workshop on Advances in Music Information Research (AdMIRe), pp. 453-458. Dec 2009.
[MTG] [DOI]

Unsupervised detection of cover song sets: accuracy improvement and original identification.
J. Serrà, M. Zanin, C. Laurier, & M. Sordo.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 225-230. Oct 2009.
[MTG] [ISMIR]

Music mood representations from social tags.
C. Laurier, M. Sordo, J. Serrà, & P. Herrera.
Proc. of the Int. Soc. for Music Information Retrieval Conf. (ISMIR), pp. 381-386. Oct 2009.
[MTG] [ISMIR]

The discipline formerly known as MIR.
P. Herrera, J. Serrà, C. Laurier, E. Guaus, E. Gómez, & X. Serra.
Int. Society for Music Information Retrieval Conf. (ISMIR), special session on the Future of MIR (fMIR). Oct 2009.
[MTG] [fMIR]

Cover song retrieval by cross recurrence quantification and unsupervised set detection.
J. Serrà, M. Zanin, & R.G. Andrzejak.
Music Information Retrieval Evaluation eXchange (MIREX). Oct 2009.
[MTG] [MIREX]

Music type groupers (MTG): generic music classification algorithms.
N. Wack, E. Guaus, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Oct 2009.
[MTG] [MIREX]

Hybrid similarity measures for music recommendation.
D. Bogdanov, J. Serrà, N. Wack, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Oct 2009.
[MTG] [MIREX]

Assessing the results of a cover song identification system with coverSSSSearch.
J. Serrà.
Demo Session at the Int. Soc. for Music Information Retrieval Conf. (ISMIR). Oct 2009.
[MTG]

Cross recurrence quantification for cover song identification.
J. Serrà, X. Serra, & R.G. Andrzejak.
New Journal of Physics 11: 093017. Sep 2009.
[MTG] [DOI] [Code]

Shape-based spectral contrast descriptor.
V. Akkermans, J. Serrà, & P. Herrera.
Proc. of the Sound and Music Computing Conf. (SMC), pp. 143-148. Jul 2009.
[MTG] [SMC]

Music mood annotator design and integration.
C. Laurier, O. Meyers, J. Serrà, M. Blech, & P. Herrera.
Proc. of the Int. Workshop on Content-Based Multimedia Indexing (CBMI), pp. 156-161. Jun 2009.
[MTG] [DOI]

Music similarity systems and methods using descriptors.
E. Gómez, P. Herrera, P. Cano, J. Janer, J. Serrà, J. Bonada, S. El-Hajj, T. Aussenac, & G. Holmberg.
Patent US 2008/300702, published December 31, 2008. Priority nums. US20070946860P-20070628, US20070970109P-20070905, and US20070988714P-20071116. Also published as WO 2009/001202.
[FreePatentsOnline] [EspaceNet]

Statistical analysis of chroma features in western music predicts human judgments of tonality.
J. Serrà, E. Gómez, P. Herrera, & X. Serra.
Journal of New Music Research 37(4): 299-309. Dec 2008.
[MTG] [DOI]

Transposing chroma representations to a common key.
J. Serrà, E. Gómez, & P. Herrera.
Proc. of the Int. Conf. on The Use of Symbols to Represent Music and Multimedia Objects, pp. 45-48. Oct 2008.
[MTG] [UniMi]

Improving binary similarity and local alignment for cover song detection.
J. Serrà, E. Gómez, & P. Herrera.
Music Information Retrieval Evaluation eXchange (MIREX). Sep 2008.
[MTG] [MIREX]

Chroma binary similarity and local alignment applied to cover song identification.
J. Serrà, E. Gómez, P. Herrera, & X. Serra.
IEEE Trans. on Audio, Speech and Language Processing 16(6): 1138-1152. Aug 2008.
[MTG] [DOI]

Audio cover song identification based on tonal sequence alignment.
J. Serrà & E. Gómez.
Proc. of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 61-64. Apr 2008.
[MTG] [DOI]

A qualitative assessment of measures for the evaluation of a cover song identification system.
J. Serrà.
Proc. of the Int. Conf. on Music Information Retrieval (ISMIR), pp. 319-322. Sep 2007.
[MTG] [ISMIR]

A cover song identification system based on sequences of tonal descriptors.
J. Serrà & E. Gómez.
Music Information Retrieval Evaluation eXchange (MIREX). Sep 2007.
[MTG] [MIREX]

Music similarity based on sequences of descriptors: tonal features applied to cover song identification.
J. Serrà.
MSc Thesis. Universitat Pompeu Fabra, Barcelona, Spain. Sep 2007.
[MTG]