Enhanced User Experience with Cookiebot Technology
4 000 000 Pages in 20 Languages – ABBYY FineReader Engine Preserves the National Library of Latvia
Customer Overview
Name | National Library of Latvia |
---|---|
Headquarters | Riga, Latvia |
Industry | Government, Education |
Products and Services | Free and inventive usage of Latvia’s cultural and scientific heritage |
Web | www.lnb.lv/en |
Partner Overview
Name | Content Conversion Specialists (CCS) |
---|---|
Industry | Document conversion solutions and services |
Web | www.content-conversion.com |
CHALLENGE
Turn the texts of the National Library of Latvia into searchable archives
SOLUTION
Implementation of a solution based on ABBYY FineReader Engine
RESULTS
- 4 million pages of books and periodicals processed in less than a year
- Library materials are now accessible online
As gateways to knowledge and culture, libraries shape the new ideas and perspectives that are central to a creative and innovative society as well as ensure an authentic archive of knowledge created and accumulated by past generations.
The National Library of Latvia (NLL) has amassed 4.5 million paper units, including special collections: rare books, manuscripts, Letonica (i.e. books on the history of Latvia and Latvians), the Baltic Central Library, maps, scores, sound recordings, graphic documents, small prints and periodicals. Since the library's establishment in 1919, some of the oldest editions in its care have started to deteriorate, while its holdings of valuable and popular materials have kept growing. The library therefore needed to preserve these materials for the future and make them more accessible to the public now, a task it addressed by creating a digital archive.
- 4,000,000 pages of ancient and modern books and periodicals
- 20 different languages
- 1 year to digitize the library
Mass Digitization Opens New Opportunities
The Internet has created tremendous opportunities for accessing the collections of the world’s greatest libraries. Large-scale digitization of NLL, however, had yet to be realized. The first phase of the project covered scanning and the creation of image-only PDFs, which fell short because the text in them could not be searched or otherwise worked with.
To convert the materials into searchable formats, the library needed OCR technology. Here another pitfall awaited: few OCR solutions could recognize Latvian scripts with high quality, to say nothing of supporting ancient Latvian and European fonts. Eventually a solution was found, and the second phase of archive digitization included a small pilot project using ABBYY OCR technology. This project was conducted by Content Conversion Specialists (CCS).
To provide some background, CCS has been developing special software solutions for the Cultural Heritage community since 2000. One result was docWorks, a software tool for structured digitization based on ABBYY FineReader Engine technologies, which was introduced in 2003 and later used for the NLL project.
ABBYY Fine-tuned Art of Recognition
At the beginning the library chose materials that were either physically damaged and thus had to be “saved” at least in a digital form, or that were popular among readers or were considered historically important. The approximate scope of work included 2.5 million pages of periodicals (equal to about 1000 titles of full sets of periodicals) and 1.5 million pages of books (equal to about 7000 books).
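To put the quoted scope in perspective, the short calculation below derives the total page count and the average size of a periodical title and a book. It uses only the approximate figures from the paragraph above; the variable names are illustrative, and the averages are rough estimates rather than numbers reported by the library.

```python
# Approximate figures quoted in the case study.
periodical_pages = 2_500_000
periodical_titles = 1_000
book_pages = 1_500_000
books = 7_000

# Total scope and rough averages derived from those figures.
total_pages = periodical_pages + book_pages
pages_per_title = periodical_pages / periodical_titles
pages_per_book = book_pages / books

print(f"Total pages: {total_pages:,}")
print(f"Average pages per periodical title: {pages_per_title:,.0f}")
print(f"Average pages per book: {pages_per_book:,.0f}")
```

On these figures, a full set of a periodical is on average roughly ten times larger than a single book, which suggests why periodicals made up the larger share of the work.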
ABBYY FineReader Engine, an integral part of the CCS docWorks solution, performed optical character recognition of historic texts in as many as 20 different languages. Its near-perfect support of Latvian and Russian scripts, with up to 100% accuracy, played a special role in the choice of OCR provider for the project.
It should be noted that the texts contained rare Gothic fonts that have fallen out of use and are not supported by most modern optical character recognition solutions. However, ABBYY FineReader Engine technology easily handled both the Antiqua and Fraktur groups of fonts, with their special ornamental design.
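The case study does not say how recognition accuracy was measured. One common convention is character accuracy: one minus the edit distance between the OCR output and a hand-checked reference, divided by the reference length. The sketch below illustrates that convention only; the function names and the sample strings are hypothetical and are not taken from the project or from any ABBYY API.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Share of reference characters recognized correctly, in [0, 1]."""
    # Normalize so that accented letters compare as single characters.
    reference = unicodedata.normalize("NFC", reference)
    ocr_output = unicodedata.normalize("NFC", ocr_output)
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical example: the OCR output lost two Latvian macrons.
reference = "Latvijas Nacionālā bibliotēka"
ocr_output = "Latvijas Nacionala bibliotēka"
print(f"Character accuracy: {character_accuracy(reference, ocr_output):.1%}")
```

The example shows why diacritics matter for a language like Latvian: an engine that drops macrons still produces readable text, but every dropped mark counts as an error and can break an exact-match search.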
Treasures Unveiled for the Public
It took a little more than a year to process 4 million pages of ancient books and modern periodicals. Driven by the enthusiasm of a noble goal, 60 operators worked daily in three 8-hour shifts during the project’s peak.
After processing, the documents were exported into various formats (PDF, JPEG, XML) and imported into the periodicals portal www.periodika.lv, where they became available to scientists, researchers, professors, students and the general public. Due to copyright protection, most materials are accessible only from the network of Latvian libraries. However, all periodicals published before 1941 are available without restrictions, and public domain books (i.e. those with expired copyright) are also available to all internet users.
“National Library of Latvia has been involved in a large-scale digitization project with the aim to process and make available on-line about 4 million pages of historic books and periodicals. ABBYY FineReader Engine has been an integral part in the project, providing very high accuracy OCR results. Most of the texts in the project were processed with a precision close to 100%. This result allows our users to both make use of high quality OCRed text and do full-text searches in the periodicals portal: www.periodika.lv.”
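Recognized text is what makes full-text search over scanned pages possible. As a rough illustration, the sketch below builds a small inverted index that maps each word to the pages containing it. This is a generic technique, not a description of how www.periodika.lv is implemented; the page identifiers and sample texts are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map every word to the set of page IDs whose OCR text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical OCR output for three scanned pages.
pages = {
    "page-1": "Rīgas avīze 1925",
    "page-2": "Latvijas avīze",
    "page-3": "Rīgas ziņas",
}
index = build_index(pages)
print(sorted(search(index, "rīgas avīze")))
```

An image-only PDF offers nothing to feed into `build_index`, which is why the scanning phase alone could not deliver a searchable archive.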
Joachim Bauer, Head of docWorks Group at CCS
- Title: Enhanced User Experience with Cookiebot Technology
- Author: Richard
- Created at : 2024-10-05 00:00:47
- Updated at : 2024-10-11 20:22:56
- Link: https://solve-news.techidaily.com/enhanced-user-experience-with-cookiebot-technology/
- License: This work is licensed under CC BY-NC-SA 4.0.