HyperAI

HKR Handwritten Text Recognition Dataset

HKR (Handwritten Kazakh and Russian) is a dataset of handwritten forms in Russian and Kazakh (about 95% Russian and 5% Kazakh) for offline handwritten text recognition. It contains more than 1,400 filled-in forms, about 63,000 sentences, and more than 715,699 characters, produced by about 200 writers. The forms were generated with LaTeX and filled in by hand by the writers. Both languages are written in Cyrillic script and share 33 characters; the Kazakh alphabet contains 9 additional characters of its own.
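The alphabet relationship described above can be sketched in a few lines of Python. This is only an illustration, not part of the dataset or its tooling: the constant names and the helper `uses_kazakh_specific_letters` are hypothetical, and the letter lists come from the standard Russian and Kazakh Cyrillic alphabets rather than from the dataset files.

```python
# Illustrative sketch: the names below are assumptions, not part of HKR.

# The 33 letters shared by the Russian and Kazakh Cyrillic alphabets.
RUSSIAN_LETTERS = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

# The 9 Kazakh-specific letters: ә ғ қ ң ө ұ ү һ і
# (written as escapes to avoid confusion with look-alike Latin letters).
KAZAKH_EXTRA_LETTERS = "\u04d9\u0493\u049b\u04a3\u04e9\u04b1\u04af\u04bb\u0456"

# Full 42-letter Kazakh Cyrillic alphabet (lowercase).
KAZAKH_LETTERS = RUSSIAN_LETTERS + KAZAKH_EXTRA_LETTERS


def uses_kazakh_specific_letters(text: str) -> bool:
    """Return True if the text contains any of the 9 Kazakh-only letters."""
    return any(ch in KAZAKH_EXTRA_LETTERS for ch in text.lower())


if __name__ == "__main__":
    print(len(RUSSIAN_LETTERS), len(KAZAKH_EXTRA_LETTERS), len(KAZAKH_LETTERS))
    print(uses_kazakh_specific_letters("Қазақстан"))
    print(uses_kazakh_specific_letters("Москва"))
```

A check like this can only flag text that is certainly Kazakh. A sample without any of the 9 extra letters may still be Kazakh, so it is not a language classifier.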

The dataset includes:

  • Handwritten samples (forms) of keywords in Kazakh and Russian (regions, cities, villages, etc.)
  • Handwritten Russian and Kazakh samples in Cyrillic script
  • Handwritten samples (forms) of Russian poetry