Syntax Literate: Jurnal Ilmiah Indonesia p–ISSN: 2541-0849 e-ISSN: 2548-1398

Vol. 9, No. 8, Agustus 2024

 

Braille Character Recognition with Histogram of Oriented Gradients (HOG) and SVM-Based Image Processing

 

Rizka Noviyanti1, Estu Sinduningrum2*

Universitas Muhammadiyah Prof. Dr. Hamka, Jakarta, Indonesia1,2

Email: [email protected]1, [email protected]2*

 

Abstract

This research explores the application of the Support Vector Machine (SVM) to Braille letter recognition through image processing, a technology aimed at improving information accessibility for the general public. Braille is a vital tactile writing system for the blind community, in which letters are represented by patterns of dots interpreted through touch. Using an SVM optimized with Grid Search, the study achieved an 82% accuracy in classifying Braille letter images from a test dataset of 1560 images. These results indicate that SVM is effective in recognizing and classifying Braille letters. The implications of this research include contributions to technology supporting inclusivity and information accessibility, and they emphasize the importance of structured and optimized systems in addressing the challenges of recognizing different characters in the Braille writing system.

Keywords: Accessibility, Braille letter recognition, Image processing, Support Vector Machine (SVM)

 

Introduction


Blind people, that is, people who are visually impaired, face special challenges in reading and writing. Braille is a tactile writing system created specifically to enable blind or visually impaired people to read and write (Nini & Harum, 2023; Sumarni, 2019). This system uses regular combinations of raised dots on the surface of paper or other materials, which can be felt with the fingertips and represent letters, numbers, and other signs. The system is named after Louis Braille, a French pioneer in the education of the blind, who developed it in the 19th century.

Braille plays an important role in facilitating access to written information for those who cannot use their eyesight effectively. By using touch, individuals can feel and recognize the pattern of dots that make up Braille characters, which consist of 6 or 8 dots arranged in a certain layout. In its development, Braille not only included the alphabet, but also mathematical symbols, musical notation, and even codes for other languages (De Luna, 2020; Wulandari et al., 2022).

In the era of modern information technology, advances in image processing and artificial intelligence open up opportunities to improve the automatic recognition of Braille letters (Meneses-Claudio et al., 2020; Shokat et al., 2020). Image processing techniques make it possible to analyze images in order to recognize Braille characters. Techniques such as image segmentation, feature extraction, and classification algorithms have paved the way for systems that can recognize Braille text and transcribe it into a more easily accessible format (Aulia & Satriani, 2020; Kausar et al., 2021).

A number of previous studies have tried various image processing methods and techniques to overcome this challenge, including color conversion, image segmentation, feature extraction, and the use of classification algorithms to recognize braille letters.

Some studies apply the Radially Average Power Spectrum (RAPSV) method to Braille recognition. The RAPSV method alone provides an accuracy of 93.91%. When frequency feature extraction is combined with geometry, the accuracy increases to 94.04%, and the combination of all features produces the highest accuracy of 97.18% (Sari, 2023). Raspberry Pi has also been used as a platform for a Braille recognition application based on computer vision, with accuracy that varied across tests: the first test, using 70 gsm HVS paper, reached 100% accuracy on all samples; the second test, using drawing paper and binding paper, misrecognized only a few samples and reached 90.38%; and the third test, using 70 gsm HVS paper with different pixel sizes, reached 96.15% (Ramiati et al., 2020). Braille recognition using a CNN reached an accuracy of 81.54% for Braille character images acquired with a smartphone (Herlambang et al., 2021).

SVM is a powerful classification method that is widely used in image processing and pattern recognition. Previous research has shown its effectiveness in various applications. In the classification of habitable houses in Kidal Village, Tumpang District, Malang Regency, SVM separated habitable from uninhabitable houses based on predetermined criteria using 160 data points and produced an average accuracy of 98.75% under K-fold cross-validation (Agustina et al., 2018). In personality detection through handwriting analysis, SVM was used to learn the unique patterns and characteristics in an individual's handwriting that reflect their personality, and the accuracy reached 99.9% with the linear kernel function (Safitri & Wulanningrum, 2020). In object motion detection based on image processing, the binary-image comparison method combined with SVM successfully detected object movement in images and made it possible to classify various object movements; the SVM model, which used a combination of color and motion features, reached an accuracy of 98.4% (Matalangi & Jalil, 2020; Shokat et al., 2022).

SVM has also been applied to Braille recognition. A previous study used SVM with Gabor wavelet feature extraction to identify Braille letters and reached an accuracy of 98.15% on 758 data items, including uppercase Braille letters, lowercase Braille letters, punctuation marks, and numbers (Florestiyanto & Prapcoyo, 2021).

In this research, the feature extraction method used is the Histogram of Oriented Gradients (HOG). While HOG has been widely used and is effective in several applications, there is room for further research in improving the spatial representation of Braille images. Future research can explore other feature extraction techniques to obtain a more detailed description of the structure of Braille dots.

This research is expected to contribute to image processing-based Braille letter recognition systems by providing an easy-to-use approach to Braille identification with the SVM method. By implementing SVM, it is hoped that Braille letters can be recognized efficiently.

Thus, Braille is not only a historically important script, but also a foundation for technological innovations that could change the way visually impaired people and the general public interact with an increasingly connected world. This research aims to explore the application of Support Vector Machine (SVM) to enhance Braille letter recognition through image processing, a critical technology aimed at improving information accessibility for the general public.

 

Research Methods

In this research, Braille letters are identified through several stages designed to produce an effective and efficient system. The stages are described below and shown in Figure 1.

Data Pre-processing

Data pre-processing is a crucial initial stage for organizing and preparing a Braille image dataset. The initial dataset consisted of images representing Braille letters from 'a' to 'z', but was initially mixed with no clear grouping. The goal of this pre-processing is to reorganize these images into folders corresponding to each letter. This process makes it possible to prepare the dataset so that it is ready to be used for further analysis, such as image classification and model training.
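The regrouping step described above can be sketched as follows. This is a minimal illustration rather than the authors' script: the helper name organize_by_letter, the folder names, and the assumption that each filename begins with the letter it represents are all hypothetical and would need to be adapted to the actual dataset.

```python
import shutil
import string
import tempfile
from pathlib import Path


def organize_by_letter(source_dir: Path, target_dir: Path) -> dict:
    """Copy each image into target_dir/<letter>/ and return per-letter counts."""
    counts = {}
    for image_path in sorted(source_dir.iterdir()):
        if not image_path.is_file():
            continue
        # ASSUMPTION: the first character of the filename is the Braille letter.
        letter = image_path.name[0].lower()
        if letter not in string.ascii_lowercase:
            continue  # skip files that do not follow the assumed naming scheme
        letter_dir = target_dir / letter
        letter_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image_path, letter_dir / image_path.name)
        counts[letter] = counts.get(letter, 0) + 1
    return counts


# Demo on empty placeholder files with hypothetical names.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    sorted_root = Path(tmp) / "sorted"
    raw.mkdir()
    for name in ["a1.jpg", "a2.jpg", "b1.jpg", "z1.jpg"]:
        (raw / name).touch()
    counts = organize_by_letter(raw, sorted_root)
    folders = sorted(p.name for p in sorted_root.iterdir())

print(counts)
print(folders)
```

Copying rather than moving keeps the original mixed folder intact, which makes the step easy to repeat if the naming assumption turns out to be wrong.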

Dataset Splitting

The next step is to divide the dataset into training data (train set) and test data (test set). This division is important because it allows an objective evaluation of the performance of the developed model. Training data is used to train the model to recognize common patterns in Braille images. Meanwhile, test data is used to measure how well the model can predict or recognize Braille letters in data that has never been seen before. This dataset is usually divided in a certain ratio, for example 80% for training data and 20% for test data.
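An 80:20 split of this kind could be sketched with Scikit-Learn as shown below. The feature matrix here is random placeholder data, and the stratify and random_state settings are assumptions added for illustration; the source does not state whether the split was stratified.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 26 letter classes, 10 samples each, 144 features per sample.
rng = np.random.default_rng(0)
n_classes, per_class, n_features = 26, 10, 144
X = rng.normal(size=(n_classes * per_class, n_features))
y = np.repeat(np.arange(n_classes), per_class)

# test_size=0.2 gives the 80:20 ratio.
# ASSUMPTION: stratify=y keeps every letter represented in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```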

Feature Extraction

The feature extraction process is an important step in an image-based Braille recognition system. Feature extraction aims to capture relevant information from images so that it can be used to differentiate one class of letters from another class. This research uses the Histogram of Oriented Gradients (HOG) feature to capture edge and shape information in Braille images.

To increase the variety of training data and improve model performance, we apply image augmentation. Augmentation techniques used include rotation (15, -15, 30, -30 degrees) and noise addition. This augmentation is performed for each original image to produce several new image variations.
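The augmentation described above might look like the following sketch. The rotation angles come from the text; the use of scipy.ndimage.rotate, the Gaussian noise model, the noise_std value, and the helper name augment are assumptions, since the source does not specify how rotation and noise were implemented.

```python
import numpy as np
from scipy.ndimage import rotate


def augment(image, angles=(15, -15, 30, -30), noise_std=0.05, seed=0):
    """Return rotated copies of `image` plus one noisy copy."""
    rng = np.random.default_rng(seed)
    # reshape=False keeps the original image size after rotation.
    variants = [rotate(image, angle, reshape=False, mode="nearest") for angle in angles]
    # ASSUMPTION: additive Gaussian noise on pixel values scaled to [0, 1].
    noisy = image + rng.normal(0.0, noise_std, size=image.shape)
    variants.append(np.clip(noisy, 0.0, 1.0))
    return variants


# Synthetic 28x28 image with a single bright square standing in for a Braille dot.
image = np.zeros((28, 28), dtype=float)
image[8:12, 8:12] = 1.0

augmented = augment(image)
print(len(augmented), augmented[0].shape)
```

With four angles and one noisy copy, each original image yields five new variations of the same size.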

HOG features are extracted from each augmented image. HOG is a useful technique for capturing edge and shape orientation information in images. The parameters used in HOG extraction are as follows:

1)    orientations = 9: Splits the gradient orientation range into 9 bins.

2)    pixels_per_cell = (8, 8): Splits the image into cells of 8x8 pixels.

3)    cells_per_block = (2, 2): Forms a block of 2x2 cells.

4)    transform_sqrt = True: Applies square root transformation for contrast normalization.

5)    block_norm = 'L2-Hys': Uses L2-Hys normalization for blocks.
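To make the effect of these parameters concrete, the length of the resulting HOG vector can be worked out by hand. The sketch below assumes that blocks slide across the image one cell at a time, which is the usual HOG layout, and that HOG is computed on 28x28 images; the helper name hog_feature_length is hypothetical.

```python
def hog_feature_length(image_shape, orientations=9,
                       pixels_per_cell=(8, 8), cells_per_block=(2, 2)):
    """Number of HOG features when blocks slide one cell at a time."""
    cells_y = image_shape[0] // pixels_per_cell[0]
    cells_x = image_shape[1] // pixels_per_cell[1]
    blocks_y = cells_y - cells_per_block[0] + 1
    blocks_x = cells_x - cells_per_block[1] + 1
    cells_in_block = cells_per_block[0] * cells_per_block[1]
    return blocks_y * blocks_x * cells_in_block * orientations


# ASSUMPTION: 28x28 input, giving 3x3 cells and 2x2 overlapping blocks.
length_28 = hog_feature_length((28, 28))
print(length_28)  # 2 * 2 blocks * 4 cells * 9 bins = 144
```

Under these assumptions, each image is described by 144 values instead of 784 raw pixels.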

After HOG feature extraction, these features are standardized using StandardScaler. This standardization process ensures that all features have the same scale, which is important for optimal performance of the classification algorithm used.
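A minimal sketch of this standardization step is shown below, using random placeholder features. Fitting the scaler on the training features only and reusing it for the test features is a common practice and an assumption here; the source does not say on which subset the scaler was fitted.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Placeholder HOG-like features (values are random, for illustration only).
rng = np.random.default_rng(0)
train_features = rng.uniform(0.0, 1.0, size=(100, 144))
test_features = rng.uniform(0.0, 1.0, size=(25, 144))

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_features)  # learn mean and std on training data
test_scaled = scaler.transform(test_features)        # reuse the training statistics

print(train_scaled.mean(axis=0)[:3], train_scaled.std(axis=0)[:3])
```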

HOG feature extraction provides a robust representation of Braille images by capturing detailed edge and shape information. These features are then used to train a classification model that will identify Braille letters based on the given image.

 

 

Image Classification

The image classification process uses a Support Vector Machine (SVM) whose parameters are optimized for the Braille classification task. SVM is a very effective algorithm for classification, especially for data with high dimensionality and complex distributions. In this research, the Support Vector Classifier (SVC), an implementation of SVM for classification, is used to classify Braille images into the appropriate letter categories.

SVC works based on the principle of finding an optimal hyperplane that can separate two classes of data in feature space. This hyperplane is a decision boundary that maximizes the margin (distance) between different classes. This margin is measured as the distance from the hyperplane to the closest points of each class. The main goal of SVC is to find the hyperplane that has the maximum margin, thereby increasing the generalization ability of the model to new, never-before-seen data.
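The maximum-margin idea can be illustrated on a toy problem. In the sketch below, two classes lie on the vertical lines x = 0 and x = 2, so the widest possible separation is 2 by construction. The data and the very large C value (used to approximate a hard margin) are illustrative assumptions and are unrelated to the Braille dataset.

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2D data: class 0 on the line x = 0, class 1 on the line x = 2.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear", C=1e6)  # very large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)  # distance between the two margin boundaries

print(w, b, margin_width)
```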

One of the main strengths of SVC is its ability to handle data that cannot be linearly separated in the original feature space. This is achieved using a kernel function, which implicitly maps data from the original space to a higher-dimensional feature space, where the data can be linearly separated by a hyperplane.
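The XOR pattern is a standard small example of data that no straight line can separate. The sketch below compares a linear kernel with an RBF kernel on it; the C and gamma values are arbitrary choices for illustration, not the parameters used in this study.

```python
import numpy as np
from sklearn.svm import SVC

# XOR layout: opposite corners of the unit square share a class.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([0, 0, 1, 1])

# A linear decision boundary cannot classify all four points correctly.
linear_acc = SVC(kernel="linear", C=10.0).fit(X, y).score(X, y)

# The RBF kernel separates them in its implicit feature space.
rbf_acc = SVC(kernel="rbf", C=10.0, gamma=2.0).fit(X, y).score(X, y)

print(linear_acc, rbf_acc)
```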

To optimize the SVC parameters, the Grid Search technique is used. Grid Search evaluates every combination in a predefined set of parameter values and selects the combination that produces the best model performance. The implementation uses the Scikit-Learn library, which provides tools for performing grid searches with cross-validation.
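A grid search over SVC parameters could be set up as in the sketch below. The parameter grid, the number of cross-validation folds, and the synthetic data are assumptions for illustration; the source does not list the values that were actually searched.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the standardized feature vectors.
X, y = make_blobs(n_samples=120, centers=4, n_features=10, random_state=0)

# ASSUMPTION: hypothetical grid; the study's actual grid is not given.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", 0.01],
}

# Every combination is scored with 3-fold cross-validation.
search = GridSearchCV(SVC(), param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

After fitting, search.best_estimator_ holds an SVC refitted on all the data passed to fit, using the best parameter combination found.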

 

Prediction

The final stage is prediction, where the trained SVC model is used to predict the class of a new Braille image. This process involves reading the image and resizing it to the same dimensions as the training images (for example, 28x28 pixels), converting it to a one-dimensional (flattened) vector, and then using the SVC model to predict the Braille letter that the image represents. These prediction results can be used for practical applications such as helping blind users identify Braille letters from the images they obtain.

 

Figure 1. Braille research flow using the SVM method

 

Results and Discussion

Braille identification is the process of recognizing Braille letters from images so that the public can read them. Recognition systems can help improve accessibility for the whole public. In this research, the researchers used a Support Vector Machine (SVM) to classify Braille images.

 

Datasets and Preprocessing

The initial dataset was extracted from a zip file and organized into folders corresponding to the Braille letter categories (a to z). Once extracted, the image files are grouped into directories based on the letters they represent, as can be seen in Figure 3. This makes subsequent processing, such as labeling and feature extraction, easier.

Figure 2. Braille dataset (Braille Character Dataset | Kaggle)

 

Figure 3. Classification of the braille dataset that has been moved to folders according to the letters

 

Each image from the dataset is resized to consistent dimensions. In this study, each image was resized to 28x28 pixels. The purpose of this resizing is to ensure that all images have the same dimensions, so they can be compared and processed consistently by the model. Once the image is resized, the next step is to convert it into a feature vector. This transformation is carried out by "flattening" the 2D image into a single 1D array. Each pixel in the image is represented by one value in the vector, so a 28x28 pixel image produces a vector that is 784 elements long. In addition to the feature data, a label is collected for each image. These labels are used to train the model to recognize Braille letters. Each letter category (a to z) is assigned a unique label, which is then stored in the 'labels' array.
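The resize, flatten, and label steps can be sketched with NumPy alone. The nearest-neighbour resize helper below is a simplified stand-in for whatever image library was actually used, and the two random arrays are placeholders for real Braille images; both are assumptions for illustration.

```python
import numpy as np


def resize_nearest(image, size=(28, 28)):
    """Nearest-neighbour resize of a 2D array (simplified, hypothetical helper)."""
    rows = np.arange(size[0]) * image.shape[0] // size[0]
    cols = np.arange(size[1]) * image.shape[1] // size[1]
    return image[np.ix_(rows, cols)]


def to_feature_vector(image, size=(28, 28)):
    """Resize to a fixed shape, then flatten the 2D image into a 1D vector."""
    return resize_nearest(image, size).astype(np.float64).ravel()


# Placeholder grayscale images of different sizes, keyed by their letter.
rng = np.random.default_rng(0)
samples = {
    "a": rng.integers(0, 256, size=(56, 56)),
    "b": rng.integers(0, 256, size=(40, 30)),
}

features = np.stack([to_feature_vector(img) for img in samples.values()])
labels = np.array(list(samples.keys()))

print(features.shape, labels)
```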

 

Feature Extraction

In this research, HOG feature extraction succeeded in capturing important information from Braille images. The HOG feature breaks the image into small cells and calculates a gradient orientation histogram within each cell. The result is a feature vector that represents the distribution of edge orientations in the image.

1)    Gradient Orientation: The gradient orientation range is divided into 9 bins, which means that edge orientations from 0 to 180 degrees are split into 9 intervals of 20 degrees each. This helps in capturing various edge directions in the image.

2)    Cells and Blocks: The image is split into cells of 8x8 pixels, and each block consists of 2x2 cells. This provides sufficient spatial resolution to capture important details while preserving local context.

3)    Contrast Normalization: Square root transformation and L2-Hys normalization help in reducing the effects of lighting and contrast variations, so that the extracted features are more robust to different lighting conditions.
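The per-cell histogram at the heart of HOG can be illustrated with a simplified NumPy sketch. It computes a magnitude-weighted histogram of unsigned gradient orientations for one 8x8 cell and deliberately omits the bin interpolation and block normalization that full HOG implementations apply, so it is an illustration of the idea and not a replacement for a library routine. The helper name is hypothetical.

```python
import numpy as np


def cell_orientation_histogram(cell, orientations=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0 to 180 degrees)."""
    gy, gx = np.gradient(cell.astype(np.float64))  # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0  # fold direction into [0, 180)
    hist, _ = np.histogram(
        angle, bins=orientations, range=(0.0, 180.0), weights=magnitude
    )
    return hist


# One 8x8 cell containing a vertical edge: dark left half, bright right half.
cell = np.zeros((8, 8))
cell[:, 4:] = 1.0

hist = cell_orientation_histogram(cell)
print(hist)
```

For this cell the gradient points horizontally everywhere it is non-zero, so all of the weight falls into the first bin (0 to 20 degrees).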

Standardizing features using StandardScaler ensures that all features are at the same scale. This is important because non-standardized features can cause some features to dominate others, which can reduce model performance.

 

Image Classification

In this step, the dataset is divided into training data and test data in a ratio of 80:20. Training data is used to train the SVM model, while test data is used to test the model's performance and measure its ability to generalize knowledge to new data. The use of Grid Search in model training helps to find the optimal combination of SVM parameters, to improve classification performance.

 

Figure 4. SVM model optimization using GridSearchCV shows 'SVC' as the best estimator for Braille classification

 

The Support Vector Machine (SVM) model is trained to recognize braille letters based on previously labeled feature data. Once the SVM model is trained using the processed dataset, the model is tested to evaluate its performance. Testing is carried out using test data that is separate from the training data to measure how well the model can recognize braille letters.

After training and optimizing the SVM model using GridSearchCV, the researchers evaluated the model's performance on the test data by calculating its accuracy, defined as the ratio of the number of correct predictions to the total number of predictions:

\[ \text{Accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}} \times 100\% \]
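The accuracy ratio described above can be computed directly, as in this small sketch. The labels are made-up toy values and do not come from the study's test set.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy example: 4 of 5 predictions are correct.
y_true = ["a", "b", "c", "d", "e"]
y_pred = ["a", "b", "c", "x", "e"]

score = accuracy(y_true, y_pred)
print(f"{score:.0%}")
```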

 

 

Figure 5. Prediction results using GridSearchCV show SVC as the best estimator with an accuracy of 79%

 

The Support Vector Machine (SVM) model trained with HOG features shows good performance in classifying Braille letters. The best model produced by Grid Search achieves an accuracy of 79%. This indicates that the extracted HOG features are effective in distinguishing between different Braille letters.

 

Image Prediction

To improve the practical usability of the system, researchers developed a prediction function that accepts a new image as input and returns a Braille prediction. This function allows the use of the model to identify Braille letters from images that the model has never seen before, increasing the system's applicability in real use.
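A prediction function of this kind might be structured as in the sketch below. It follows the resize-and-flatten description of the prediction stage and trains a toy two-letter model on synthetic images so that the example is self-contained. The function names, the synthetic images, and the linear kernel are all assumptions. If the deployed model was trained on HOG features, the same feature extraction and scaling would have to be applied inside the function before calling predict.

```python
import numpy as np
from sklearn.svm import SVC


def resize_nearest(image, size=(28, 28)):
    """Nearest-neighbour resize of a 2D array (simplified, hypothetical helper)."""
    rows = np.arange(size[0]) * image.shape[0] // size[0]
    cols = np.arange(size[1]) * image.shape[1] // size[1]
    return image[np.ix_(rows, cols)]


def predict_letter(model, image, size=(28, 28)):
    """Resize, flatten, and classify a single image with a fitted model."""
    vector = resize_nearest(image, size).astype(np.float64).ravel()
    return model.predict(vector.reshape(1, -1))[0]


def make_image(letter, rng):
    """Synthetic 28x28 stand-in: 'a' has a bright patch top-left, 'b' bottom-right."""
    image = rng.normal(0.0, 0.01, size=(28, 28))
    if letter == "a":
        image[4:10, 4:10] += 1.0
    else:
        image[18:24, 18:24] += 1.0
    return image


# Train a toy model on 10 synthetic images per letter.
rng = np.random.default_rng(0)
letters = ["a", "b"] * 10
X = np.stack([make_image(letter, rng).ravel() for letter in letters])
model = SVC(kernel="linear").fit(X, letters)

# A new image at a different resolution (56x56), never seen during training.
large_b = np.kron(make_image("b", rng), np.ones((2, 2)))
predicted = predict_letter(model, large_b)
print(predicted)
```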

 

Figure 6. Prediction results for braille letter identification

 

Previous research using various techniques such as color conversion, image segmentation, feature extraction, and classification algorithms has shown success in Braille recognition. The methods used include the Radially Average Power Spectrum (RAPSV) combined with geometric feature extraction, which achieved an accuracy of 97.18% (Sari, 2023), and a CNN, which achieved an accuracy of 81.54% for images acquired by smartphones (Herlambang et al., 2021).

This research also shows that SVM with Histogram of Oriented Gradients (HOG) feature extraction can be used to identify Braille letters with an accuracy of 79%, which opens up opportunities to make it easier for the general public to recognize Braille letters.

 

Conclusion

Support Vector Machine (SVM) has been proven effective in image processing and pattern recognition, including Braille recognition. This research uses SVM with the Grid Search optimization technique to find the best parameters, producing a model that is able to recognize Braille letters with sufficient accuracy. The dataset used consists of Braille images that are processed and resized into a consistent format before being classified. In this study, the SVM model with Histogram of Oriented Gradients (HOG) feature extraction achieved 79% accuracy in recognizing Braille letters in the test dataset. The identification process involves several stages: data pre-processing, dataset splitting, feature extraction, image classification, and Braille letter prediction.

The results of feature extraction and model performance show that the approach used is effective for Braille recognition. HOG feature extraction successfully captures relevant spatial information from images, and data augmentation helps the model to generalize better. Feature standardization ensures consistency in feature scale, which is important for optimal performance of classification algorithms. These results demonstrate the potential of SVM and Histogram of Oriented Gradients (HOG) feature extraction in developing an efficient Braille recognition system to help the general public. This research opens up further opportunities for technological innovation that can improve the general public's accessibility and interaction with the blind.

 

 

BIBLIOGRAPHY

 

Agustina, W., Furqon, M. T., & Rahayudi, B. (2018). Implementasi Metode Support Vector Machine (SVM) Untuk Klasifikasi Rumah Layak Huni (Studi Kasus: Desa Kidal Kecamatan Tumpang Kabupaten Malang). Jurnal Pengembangan Teknologi Informasi Dan Ilmu Komputer, 2(10), 3366–3372.

Aulia, S., & Satriani, S. N. N. (2020). Recognition of Image Pattern To Identification Of Braille Characters To Be Audio Signals For Blind Communication Tools. IOP Conference Series: Materials Science and Engineering, 846(1), 12008.

De Luna, R. G. (2020). A Tesseract-based optical character recognition for a text-to-braille code conversion. International Journal on Advanced Science, Engineering and Information Technology, 10(1), 128–136.

Florestiyanto, M. Y., & Prapcoyo, H. (2021). Braille Detection Application Using Gabor Wavelet and Support Vector Machine. RSF Conference Series: Engineering and Technology, 1(1), 160–169.

Herlambang, M. F., Hermana, A. N., & Putra, K. R. (2021). Pengenalan Karakter Huruf Braille dengan Metode Convolutional Neural Network. Systemic: Information System and Informatics Journal, 6(2), 20–26. https://doi.org/10.29080/systemic.v6i2.969

Kausar, T., Manzoor, S., Kausar, A., Lu, Y., Wasif, M., & Ashraf, M. A. (2021). Deep learning strategy for braille character recognition. IEEE Access, 9, 169357–169371.

Matalangi, M., & Jalil, A. (2020). Deteksi Gerak Objek Berbasis Pengolahan Citra Menggunakan Metode Binary-Image Comparison. Electro Luceat, 6(1), 109–116.

Meneses-Claudio, B., Alvarado-Diaz, W., & Roman-Gonzalez, A. (2020). Classification System for the Interpretation of the Braille Alphabet through Image Processing.

Nini, K., & Harum, R. (2023). Metode Struktural Analitik Sintetik (SAS) untuk Meningkatkan Kemampuan Membaca Huruf Braille pada Anak Tunanetra di Bhakti Luhur Malang. Jurnal Pelayanan Pastoral, 44–54.

Ramiati, R., Aulia, S., & Lifwarda, L. (2020). Aplikasi Identifikasi Huruf Braille Menggunakan Computer Vision Berbasis Raspberry Pi. Jurnal Nasional Teknik Elektro, 9(1), 12. https://doi.org/10.25077/jnte.v9n1.707.2020

Safitri, K. A., & Wulanningrum, R. (2020). Aplikasi Pengenalan Pola Tulisan Tangan Menggunakan Metode Support Vector Machine. Prosiding SEMNAS INOTEK (Seminar Nasional Inovasi Teknologi), 4(1), 201–206.

Sari, A. (2023). Pengenalan Huruf Braille Menggunakan Radially Average Power Spectrum Dan Geometri. Jurnal Inovtek Polbeng-Seri Informatika, 8(1), 25–36.

Shokat, S., Riaz, R., Rizvi, S. S., Abbasi, A. M., Abbasi, A. A., & Kwon, S. J. (2020). Deep learning scheme for character prediction with position-free touch screen-based Braille input method. Human-Centric Computing and Information Sciences, 10, 1–24.

Shokat, S., Riaz, R., Rizvi, S. S., Khan, I., & Paul, A. (2022). Characterization of English braille patterns using automated tools and RICA based feature extraction methods. Sensors, 22(5), 1836.

Sumarni, S. (2019). Implementasi Braille Berbasis Media Card Huruf Hijaiyyah Dalam Meningkatkan Kemampuan Mengenal Huruf Pada Tunanetra Siswa Sekolah Luar Biasa Negeri 1 Makassar. Al-Maraji’: Jurnal Pendidikan Bahasa Arab, 3(2), 17–34.

Wulandari, P. L., Kurniawan, D., Ihsan, R., Safaruddin, S., & Damri, D. (2022). Printer Penerjemahan Teks ke Audio-Braille Menggunakan Sistem Arduino Uno Untuk Tunanetra. Jurnal Aplikasi IPTEK Indonesia, 5(3), 99–106.

 

 

Copyright holder:

Rizka Noviyanti, Estu Sinduningrum (2024)

 

First publication right:

Syntax Literate: Jurnal Ilmiah Indonesia

 
