Fourth International Conference


Computer Vision & Image Processing

Organized by

Malaviya National Institute of Technology Jaipur

27 - 29 September 2019

CALAM Dataset

This handwritten Devanagari database of vowels and consonants (with modifiers) was developed at the Department of Computer Science and Engineering of the Malaviya National Institute of Technology as part of research project grant P.7.S.T/RD/2013/4400 (Urdu Corpus Development and HTR), sanctioned by DST, Government of Rajasthan. It is a database of off-line Hindi handwritten characters with matras (modifiers). The data were collected from writers of different ages, genders, professions and educational qualifications, and from different geographical locations of India. The character images are stored in PNG format for efficient use. The dataset consists of more than 23000 images at their original size, covering programmatically segmented consonants, numerals and vowels.


  • Hindi handwritten vowels and consonants with matras (modifiers), written by around 1200 writers from geographically diverse places.
  • More than 23000 handwritten character images of consonants, numerals and vowels, scanned at 300 dpi.
  • Devanagari handwritten and Unicode corpus of 1600 handwritten text pages, written by 1600 different writers, containing 8,800 Hindi handwritten text lines and 1,20,000 Hindi handwritten text words. Each form contains approximately 4.53 text lines and 68.91 text words. In addition, the database contains 4,160 Hindi printed text lines.
  • XML files (structural and Unicode ground truths) organized hierarchically, carrying the complete information needed for research.
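
As a rough illustration of how hierarchical XML ground truth of this kind might be read, the sketch below walks a page, line, word hierarchy with Python's standard xml.etree.ElementTree module. The element and attribute names (page, line, word, id, writer) and the sample fragment are assumptions made for the example only; the actual CALAM schema may differ and should be checked against the distributed files.

```python
import xml.etree.ElementTree as ET

# Hypothetical ground-truth fragment. The tag and attribute names here
# (page, line, word, id, writer) are illustrative assumptions, not the
# actual CALAM schema.
SAMPLE = """<page id="p0001" writer="w0001">
  <line id="l1">
    <word id="w1">नमस्ते</word>
    <word id="w2">भारत</word>
  </line>
  <line id="l2">
    <word id="w1">हिंदी</word>
  </line>
</page>"""


def read_ground_truth(xml_text):
    """Return a list of (line_id, [word_text, ...]) tuples for one page."""
    root = ET.fromstring(xml_text)
    lines = []
    for line in root.findall("line"):
        # Collect the Unicode transcription of every word in this line.
        words = [word.text or "" for word in line.findall("word")]
        lines.append((line.get("id"), words))
    return lines


page_lines = read_ground_truth(SAMPLE)
word_count = sum(len(words) for _, words in page_lines)
```

With a real file, the same function could be fed the file's text content once the tag names are adjusted to the actual schema.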


Datasets (Updated Consonants, Vowels and Numerals)

Related work

Handwritten Devnagari Script Database Development for Off-Line Hindi Character with Matra (Modifiers)


© Copyright 2018-19 MNIT JAIPUR