Potential uses of Brazilian Portuguese data repositories in Forensic Phonetics
DOI:
https://doi.org/10.14393/DLv19a2025-16
Keywords:
Phonetics, Forensic Phonetics, Criminalistics, Data repositories
Abstract
This study aims to survey existing data repositories of Brazilian Portuguese language and speech and to evaluate their potential applications in forensic phonetics, with a particular focus on speaker comparison tasks. A total of 45 speech corpora from various Brazilian regions were identified through consultations with specialists in sociolinguistics and dialectology, as well as through bibliographic research. To assess the potential use of these corpora for forensic purposes, a questionnaire was developed, addressing a range of aspects including general corpus characteristics, types of materials available, participant profiles, geographic coverage, and conditions for access and use of the collected data. Although the identified repositories fulfill their primary goal of documenting linguistic variation—especially at the lexical, syntactic, and conversational levels—they are not always suitable for the specific demands of speaker comparison within forensic phonetics. This task requires repositories that include high-quality audio recordings, metadata on speakers and recording conditions, and broad regional and demographic coverage to support the estimation of relevant linguistic and phonetic feature distributions. Among the 45 corpora surveyed, only seven were found to be adequately suited for use in speaker comparison analyses, particularly when employing likelihood ratio-based approaches, which require representative reference data. The results underscore a significant gap in the availability of appropriate and accessible resources for forensic phonetic applications in Brazil. In particular, there is a notable lack of corpora providing audio materials from the Central-West and North regions, and only one corpus includes such materials for the South and Northeast regions. This regional imbalance limits the ability to conduct robust speaker comparison analyses nationwide. 
Given these findings, it is essential that current and future corpus development initiatives consider including high-quality audio data, detailed metadata, and broad geographic representation. The outcomes of this research are relevant not only for linguists and forensic phonetics specialists but also for researchers working in related areas of language study, who may benefit from greater access to phonetically and regionally representative data.
License
Copyright (c) 2025 Daniel Fonseca Vieira, Renata Regina Passetti, Pablo Arantes

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors who publish in this journal agree to the following terms:
Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives License (CC BY-NC-ND 4.0), which allows the work to be shared with acknowledgment of authorship and prohibits its commercial use.
Authors may enter into separate, additional contracts for the non-exclusive distribution of the version of the work published in this journal (for example, depositing it in an institutional repository or publishing it as a book chapter), with acknowledgment of authorship and of initial publication in this journal.