Old European languages and Gothic fonts
Digitizing old texts: Gothic/Fraktur OCR
ABBYY has been developing OCR for digitizing old books since 2003. It now supports Black Letter, Schwabacher, and most other Gothic fonts for texts in English, German, French, Italian, Spanish, and Latvian.
The challenge: digitizing old texts
Black letter fonts
Black Letter fonts, also known as “Gebrochene Schriften,” or broken scripts, emerged as early as the 12th century and evolved over the years into a variety of derived styles and typefaces.
The Fraktur typeface, dominant in Germany, was created for Holy Roman Emperor Maximilian I and soon became popular in many parts of Europe.
Characteristics and peculiarities
Distinctive features of these typefaces include the long s (ſ) and ligatures, that is, joined glyphs for certain letter combinations. Because Fraktur was so widely used, understanding it is essential for studying texts from the period between 1800 and 1938 and for developing recognition technologies for them.
Now that the worldwide flow of information is becoming digital and digital library collections are being created, it is important to start making historic documents available online.
Scanning is only the first step: Optical Character Recognition (OCR) is just as important to “open” the content to human readers, to search, and to other analysis technologies.
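One practical consequence for search: recognized Fraktur text may still contain the long s and ligature characters, so a query for "Wasser" would not match "Waſſer". The following is a minimal Python sketch of a normalization step that could run on OCR output before indexing. It assumes the OCR output preserves these characters as Unicode code points. The mapping table and the function name normalize_for_search are illustrative assumptions, not part of any ABBYY product, and the table is not exhaustive.

```python
# Illustrative mapping from historic glyphs to modern letters.
# Assumption: the OCR output keeps these characters as Unicode code points.
HISTORIC_GLYPHS = {
    "\u017f": "s",    # long s (ſ)
    "\ufb00": "ff",   # ff ligature
    "\ufb01": "fi",   # fi ligature
    "\ufb02": "fl",   # fl ligature
    "\ufb03": "ffi",  # ffi ligature
    "\ufb04": "ffl",  # ffl ligature
    "\ufb05": "st",   # long s + t ligature
    "\ufb06": "st",   # st ligature
}

# str.maketrans accepts single-character keys mapped to replacement strings.
_TRANSLATION_TABLE = str.maketrans(HISTORIC_GLYPHS)


def normalize_for_search(text: str) -> str:
    """Return a copy of text with long s and ligatures replaced by modern letters."""
    return text.translate(_TRANSLATION_TABLE)


if __name__ == "__main__":
    print(normalize_for_search("Wa\u017f\u017fer"))
```

A library would typically keep the original recognized text for display and use the normalized copy only for the search index, so that the historic spelling is not lost.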
Recognition of old European documents and Gothic fonts in books printed in the 18th to 20th centuries is available in the software products ABBYY FineReader PDF and ABBYY FineReader Server.
If you are developing your own software for recognizing text in historical documents, you can integrate the ABBYY FineReader Engine software development kit or the web-based recognition service ABBYY Cloud OCR SDK to add the required Gothic font recognition.