Financial services company Creditinfo has handed over a large body of text data to the Árni Magnússon Institute for Icelandic Studies.
The data transfer contained eight million sentences in spoken and written Icelandic, which will form the basis of the digital language database the institute is setting up to support Icelandic in the digital technology age.
Smart devices that can “talk” and “understand” spoken language are being developed and adopted quickly, yet Icelandic is currently one of the most poorly digitized languages in Europe. The Árni Magnússon Institute is working to make Icelandic compatible with the latest technology and, ultimately, to stop the language from becoming obsolete.
The future of Icelandic depends, at least in part, on its usefulness in IT, but there is a lot of work ahead: the new database will eventually contain a billion words, searchable and available in XML format for technical applications.
“The database will be built on a collection of public texts and the data which Creditinfo has handed Árnastofnun are substantial and diverse. Hopefully it will make it easier for the institute to get the database up and running, as it is important for Icelanders to be able to develop technical solutions in Icelandic,” Creditinfo CEO Brynja Baldursdóttir told Vísir.