Ana Pereira (UVigo)
Lourdes Lorenzo (UVigo)
María Rico Vázquez (UVigo)
Jesús Meiriño Gómez (UVigo)
Frederic Chaume Varela (UJI)
Irene de Higes Andino (UJI)
Ana Tamayo Masero (UPV/EHU)
Nazaret Fresno Cañada (UTRGV)
Laura Feyto Álvarez (RTVE)
Proyectos de I+D+i (Ministry of Science and Innovation, Government of Spain)
Media accessibility (subtitling for the deaf and audio description for the blind) plays a key role in ensuring that audiovisual media and the scenic arts are accessible to all. Subtitling is particularly relevant, as it benefits not only people with hearing loss, but also language learners and anyone who needs to access audiovisual content in settings where it is played with no sound (hospitals, pubs, public transport, etc.).
This application deals with live subtitling, which provides access to live TV programmes and live events for millions of users worldwide. Live subtitling is a very challenging task, as it requires producing intralingual subtitles while the sound of the TV programme is being heard. It is normally produced through a technique known as respeaking, whereby a person known as a respeaker paraphrases what is being said in a TV programme to speech recognition software, which turns the recognised utterances into subtitles on the screen. Given the complexity of this task, live subtitles often feature delay and errors. This is particularly pertinent now that some broadcasters are beginning to introduce fully automatic live subtitling, without any human intervention, where the potential for errors is considerably greater. Because automatic live subtitling is much more affordable than respeaking, broadcasters worldwide are beginning to use it, often without any quality monitoring. Until now, research in live subtitling has been concerned with the cognitive processes involved in this task and with viewer reception, but the assessment of live subtitling quality has been largely ignored. The team behind this application has developed a metrics-based quality assessment method, the NER model, that is being used by broadcasters, governments and research centres worldwide. It has also carried out the first three (pilot) studies in this area: analysing the quality of automatic live subtitles in Galician, comparing a set of respoken and automatic live subtitles in Spanish, and training US stakeholders in the use of the NER model.
The project, an example of inter- and multidisciplinary use of the humanities to tackle a real-life challenge of the digital era, will provide a significant contribution to scientific knowledge in the area of media access, as the largest study on live subtitling quality ever conducted. More widely, it is expected to improve access and inclusion in society for the millions of viewers in Spain and the US who watch live TV programmes and attend live events on a daily basis.
At a regional level, the project intends to assess the accuracy of speech recognition in Galician, Basque and Catalan/Valencian and to produce a white paper setting out the requirements for providing the first live subtitles in these three languages. At a national level, the plan is to compare the quality of the human respoken live subtitles produced by the main Spanish TV broadcasters with the automatic live subtitles used by TVE, and to produce a white paper with recommendations on which method is most effective for each type of programme. At an international level, the team will produce the first NER-based assessment of live subtitling quality in the US and compare it with the results obtained in Spain. It will also explore the development of a technological solution to automate the NER model.
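As a rough illustration of what automating the NER model involves, the sketch below computes the accuracy rate as it is given in published descriptions of the model: accuracy = (N − E − R) / N × 100, where N is the number of words in the subtitles, E the weighted edition errors and R the weighted recognition errors. The severity weights (0.25, 0.5 and 1) follow those same published descriptions and are not stated in this summary. The names `Error`, `WEIGHTS` and `ner_accuracy` are hypothetical, and the sketch assumes that errors have already been identified and classified by a human evaluator, which is the part of the assessment that is hardest to automate.

```python
from dataclasses import dataclass

# Severity weights as commonly described for the NER model
# (an assumption here; this summary does not state them).
WEIGHTS = {"minor": 0.25, "standard": 0.5, "serious": 1.0}


@dataclass
class Error:
    kind: str      # "edition" or "recognition"
    severity: str  # "minor", "standard" or "serious"


def ner_accuracy(n_words: int, errors: list[Error]) -> float:
    """Return the NER accuracy rate (%) for a subtitled sample.

    n_words is N, the number of words in the subtitles; errors are
    the already-classified edition (E) and recognition (R) errors.
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    for err in errors:
        if err.kind not in ("edition", "recognition"):
            raise ValueError(f"unknown error kind: {err.kind}")
    # Sum the weighted penalties separately for E and R.
    e = sum(WEIGHTS[x.severity] for x in errors if x.kind == "edition")
    r = sum(WEIGHTS[x.severity] for x in errors if x.kind == "recognition")
    return (n_words - e - r) / n_words * 100


# Hypothetical sample: 200 subtitle words and three classified errors.
sample = [
    Error("edition", "minor"),
    Error("recognition", "serious"),
    Error("recognition", "standard"),
]
print(f"{ner_accuracy(200, sample):.2f}%")
```

A tool of this kind would only cover the arithmetic; detecting and grading the errors themselves would still need either human evaluators or a separate, validated classification step.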