Algorithm for Reading Nutritional Labels by Optical Character Recognition (OCR) and Generating a Type of Front-of-Package Seal

Colombia does not use front-of-package warning seals, and the nutritional tables on processed food products are difficult to interpret without specific knowledge of nutrition. Optical Character Recognition (OCR) is a process for digitally reading an image of text to obtain the symbols and characters of a given alphabet (ABBY, 2019). This work proposes an algorithm that generates front-of-package seals relevant to Colombia from nutritional label information obtained with the Tesseract OCR Engine. All algorithms developed in the project were implemented in Python. The methodology begins with pre-processing the images of the nutritional tables, followed by their detection and recognition. The regions of interest (ROI) are then obtained, the information needed for the seals is extracted, and finally the GDA and octagonal front seals are generated. The algorithm achieved an accuracy of 49% in generating the seals. The most frequent errors are reading the "g" for grams as the digit 9 and failing to recognize the word of interest.
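The information-extraction step and the two reported error types can be illustrated with a minimal sketch. It is not the project's implementation: it assumes the OCR text has already been produced (for instance by Tesseract through a wrapper such as pytesseract), and the keyword list, the function names `normalize` and `extract_nutrients`, and the trailing-9 heuristic are all hypothetical choices made for illustration.

```python
import re
import unicodedata

# Hypothetical keywords of interest (Spanish label terms); not from the source.
KEYWORDS = {
    "sodium": ("sodio",),
    "sugars": ("azucares", "azucar"),
    "saturated_fat": ("grasa saturada",),
}

# A number (dot or comma decimal) optionally followed by a unit.
VALUE_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(mg|g)?", re.IGNORECASE)


def normalize(text):
    """Lowercase and strip accents so 'Azúcares' matches 'azucares'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def extract_nutrients(ocr_text):
    """Return {nutrient: (value, unit, suspicious)} for each keyword found."""
    found = {}
    for line in ocr_text.splitlines():
        clean = normalize(line)
        for nutrient, terms in KEYWORDS.items():
            term = next((t for t in terms if t in clean), None)
            if term is None or nutrient in found:
                continue  # word of interest not recognized on this line
            match = VALUE_RE.search(clean.split(term, 1)[1])
            if match is None:
                continue
            raw, unit = match.group(1), match.group(2)
            suspicious = False
            if unit is None and raw.isdigit() and len(raw) > 1 and raw.endswith("9"):
                # Assumed heuristic: "12g" misread as "129"; drop the trailing 9
                # and flag the value so it can be reviewed.
                raw, unit, suspicious = raw[:-1], "g", True
            found[nutrient] = (float(raw.replace(",", ".")), unit, suspicious)
    return found
```

Called on the text `"Sodio 150 mg\nAzúcares 129\nGrasa saturada 2,5 g"`, the sketch reads sodium and saturated fat directly and flags the sugar value as a likely g/9 confusion. The heuristic would also alter a genuine unitless value ending in 9, which is why the result carries a `suspicious` flag rather than silently correcting it.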

Keywords

Optical Character Recognition (OCR)
Tesseract
Detection
Recognition
Nutrition Labels
GDA Front Label
Octagonal Front Label

URI

http://repositorio.uan.edu.co/handle/123456789/3154

Collections

Ingeniería electrónica
