Physical structure extraction of Algerian baccalaureate transcripts
Abderrahmane KEFALİ, Ahlem OBEİZİ, Chokri FERKOUS
Journal Title: International Journal of Informatics and Applied Mathematics
In recent years, Algerian universities have recognized the value of electronic archiving and digitization for managing their documents more effectively. Systems that can analyze and understand archival documents have therefore become a necessity. This paper follows that trend: it proposes a system for analyzing the physical structure of Algerian baccalaureate transcripts stored in university archives. The proposed system proceeds in two phases: 1) preprocessing, in which several operations are applied to reduce the noise present in the input images; and 2) segmentation, which starts by removing the transcript border and then extracts the text lines and blocks using the run-length smoothing algorithm (RLSA) and projection profile analysis. The system then classifies each block into one of three types: textual block, table, or graphic. Finally, it recovers the textual content from the textual blocks and tables.
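The combination of RLSA and projection profiles mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`rlsa_horizontal`, `horizontal_profile`, `find_line_bands`), the convention that 1 marks an ink pixel, the choice to fill only gaps that lie between two ink pixels, and the threshold value are all assumptions made for illustration.

```python
def rlsa_horizontal(image, threshold):
    """Horizontal run-length smoothing on a binary image (1 = ink, 0 = background).

    Assumption: a background run is filled only when it lies between two ink
    pixels on the same row and its length is at most `threshold`.
    """
    smeared = []
    for row in image:
        out = list(row)
        last_ink = None  # column of the most recent ink pixel on this row
        for x, pixel in enumerate(row):
            if pixel == 1:
                gap = x - last_ink - 1 if last_ink is not None else 0
                if 0 < gap <= threshold:
                    for k in range(last_ink + 1, x):
                        out[k] = 1  # merge nearby ink into one component
                last_ink = x
        smeared.append(out)
    return smeared


def horizontal_profile(image):
    """Horizontal projection profile: number of ink pixels on each row."""
    return [sum(row) for row in image]


def find_line_bands(profile, min_ink=1):
    """Return (top, bottom) row indices of maximal runs with at least `min_ink` ink pixels."""
    bands = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y
        elif count < min_ink and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))
    return bands


# Toy binary page (hypothetical data): two text bands separated by an empty row.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0],
]
smeared = rlsa_horizontal(page, threshold=2)
bands = find_line_bands(horizontal_profile(smeared))
print(bands)
```

In this sketch the smoothing merges characters that are close together into solid runs, and the empty rows in the profile of the smoothed image separate candidate text lines. Applying the same idea along columns would give a vertical smoothing and a vertical profile; how the two are combined to form blocks is not specified in the abstract.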