BTA Archives Are National Treasure, Digital Archive Project Head Todorova Says at Language Data Seminar


The archives of the Bulgarian News Agency (BTA) are a national treasure. Their digitisation and management will turn them into an active platform for research, innovation and cultural memory, said Svoboda Todorova, head of the Digital Archive project implemented by BTA under the National Recovery and Resilience Plan. She presented a report on the creation and management of language data at BTA during a panel discussion on production, management and development of the language data market.
The event was held within a seminar on the role of language data in the development of language technologies and AI-based tools, organised by the European Language Data Space and the Institute of Bulgarian Language at the Bulgarian Academy of Sciences. The panel was moderated by Prof. Silvia Ilieva, Director of the GATE Centre of Excellence. Hristo Dochev (Wisertech), Ivan Vankov (Iris.ai) and Trayan Kosev (intellectual property lawyer) also participated in the discussion.
Todorova noted that BTA, founded in 1898, is the oldest and most authoritative information institution in the country, with unique linguistic and visual resources - a photo archive of about 1.8 million images and a journalistic archive of over five million pages. In her words, these resources represent linguistic and cultural data of strategic importance to Bulgarian society.
Todorova emphasised the need to build a unified digital infrastructure, centralised databases, standardised metadata, and long-term policies for storage and access. "We select popular topics that students search for in the archives because the materials are so fragile - just before a copy is examined, it is already at risk. We cannot just scan all these materials because they literally disappear in our hands," said Todorova.
She explained that processing archival data requires significant human resources. "We have ongoing controls and that takes time. It takes 5 to 6 minutes to prepare one page," she said. According to her, the archival data is accessible, but its use requires prior request and preparation. "The documents must be prepared, they can be signed and read on site," she explained.
Todorova noted that access to the BTA archives must be provided both internally, for the needs of journalists, and externally, through public portals and APIs for developers. User convenience requires multilingual search engines and combined searches in text and photo archives, she said.
In conclusion, Todorova pointed out that artificial intelligence will play a key role in the future development of BTA.
/RY/