HKR Handwritten Text Recognition Dataset
Date
3 years ago
License
Other

HKR stands for Handwritten Kazakh and Russian. It is a dataset of handwritten forms in Russian and Kazakh (about 95% Russian and 5% Kazakh) for offline handwritten text recognition. It contains more than 1,400 filled-in forms, 63,000 sentences, and more than 715,699 characters, contributed by 200 writers. The forms were generated with LaTeX and then filled in by hand by the writers. Both languages are written in Cyrillic script and share 33 characters; the Kazakh alphabet contains 9 additional characters of its own.
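The 33 + 9 character split described above can be sketched as a label alphabet for a recognizer. This is a minimal illustration, not part of the dataset's own tooling: the letter lists are the standard Russian and Kazakh Cyrillic alphabets rather than anything read from the HKR files, and the names `ALPHABET`, `CHAR_TO_INDEX`, and `has_kazakh_letters` are hypothetical. Real annotations will likely also contain digits, punctuation, and spaces, which are left out here.

```python
# Assumption: standard alphabets, not extracted from the HKR annotation files.
# The 33 letters shared by Russian and Kazakh (the Russian alphabet).
RUSSIAN_UPPER = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"
# The 9 letters specific to the Kazakh Cyrillic alphabet.
KAZAKH_EXTRA_UPPER = "ӘҒҚҢӨҰҮҺІ"

# Full label alphabet: upper-case and lower-case forms of all 42 letters.
UPPER = RUSSIAN_UPPER + KAZAKH_EXTRA_UPPER
ALPHABET = UPPER + UPPER.lower()

# Hypothetical mapping from character to class index for a recognizer.
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

# Set used to spot text that can only be Kazakh.
_KAZAKH_ONLY = set(KAZAKH_EXTRA_UPPER + KAZAKH_EXTRA_UPPER.lower())


def has_kazakh_letters(text: str) -> bool:
    """Return True if the text contains at least one Kazakh-specific letter."""
    return any(ch in _KAZAKH_ONLY for ch in text)
```

A check like `has_kazakh_letters` only gives a rough hint: a Kazakh sentence that happens to use none of the nine extra letters would look identical to Russian at the character level.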
The dataset includes:
- Handwritten samples (forms) of keywords in Kazakh and Russian (region, city, village, etc.)
- Handwritten Russian and Kazakh samples in Cyrillic script
- Handwritten samples (forms) of Russian poetry