Syntax Literate: Jurnal Ilmiah Indonesia
p-ISSN: 2541-0849 e-ISSN: 2548-1398
Vol. 9, No. 8, August 2024
Braille Character Recognition with Histogram of Oriented Gradients
(HOG) and SVM-Based Image Processing
Rizka Noviyanti1, Estu Sinduningrum2*
Universitas Muhammadiyah Prof. Dr. Hamka, Jakarta, Indonesia1,2
Abstract
This research explores the application of a Support
Vector Machine (SVM) to Braille letter recognition through image
processing, a technology aimed at improving information accessibility
for the general public. Braille is a vital tactile writing system whose
letters are represented by patterns of dots interpreted through touch.
Using an SVM optimized with Grid Search, the study achieved an 82%
accuracy in classifying Braille letter images from a test dataset of 1560
images. These results confirm SVM's effectiveness in recognizing and
classifying Braille letters, validating the efficiency of this approach in
Braille image classification. The implications of this research include
significant contributions to technology supporting inclusivity and information
accessibility, emphasizing the importance of structured and optimized systems
in addressing the challenges of recognizing different characters in the Braille
writing system.
Keywords:
Accessibility, Braille letter recognition, Image processing, Support Vector
Machine (SVM)
Introduction
People who are blind or visually impaired face particular
challenges in reading and writing. Braille is a tactile writing system created
specifically to enable them to read and write (Nini &
Harum, 2023; Sumarni, 2019). This system uses a regular
combination of dots on the surface of paper or other materials, which can be
touched with the fingertips to represent letters, numbers and other signs. This
method is named after Louis Braille, a French pioneer in the education of the
blind, who developed this system in the 19th century.
Braille plays an important
role in facilitating access to written information for those who cannot use
their eyesight effectively. By using touch, individuals can feel and recognize
the pattern of dots that make up Braille characters, which consist of 6 or 8
dots arranged in a certain layout. In its development, Braille not only
included the alphabet, but also mathematical symbols, musical notation, and
even codes for other languages (De Luna,
2020; Wulandari et al., 2022).
In the era of modern
information technology, advances in image processing and artificial
intelligence open up opportunities to improve the ability to automatically
recognize Braille letters (Meneses-Claudio
et al., 2020; Shokat et al., 2020). Image processing techniques
make it possible to analyze images and recognize Braille characters. Techniques
such as image segmentation, feature extraction, and classification algorithms
have paved the way for systems that can recognize and
transcribe Braille text into a more easily accessible format (Aulia
& Satriani, 2020; Kausar et al., 2021).
A number of previous
studies have tried various image processing methods and techniques to overcome
this challenge, including color conversion, image segmentation, feature
extraction, and the use of classification algorithms to recognize braille
letters.
Some studies apply Braille recognition using the Radially Average Power
Spectrum (RAPSV). The RAPSV method on its own gives an accuracy of 93.91%.
When frequency feature extraction is combined with geometry, the accuracy
increases to 94.04%, and the combination of all features produces the highest
accuracy of 97.18% (Sari, 2023). Raspberry Pi has also been used as a platform
for Braille recognition applications based on computer vision, with accuracy
varying by test condition. In the first test, using 70 gsm HVS paper, all
samples were recognized (100% accuracy). In the second test, using drawing
paper and binding paper, a few samples were misrecognized, for an accuracy of
90.38%. In the third test, using 70 gsm HVS paper with different pixel sizes,
the accuracy was 96.15% (Ramiati et al., 2020). Braille recognition with a CNN
achieved an accuracy of 81.54% on Braille character images acquired with a
smartphone (Herlambang et al., 2021).
SVM is a powerful classification method that is widely used in image
processing and pattern recognition. Previous research has shown its
effectiveness in various applications. In the classification of habitable
houses in Kidal Village, Tumpang District, Malang Regency, SVM separated
habitable from non-habitable houses based on predetermined criteria; using 160
records, it produced an average accuracy of 98.75% under K-fold
cross-validation (Agustina et al., 2018). In personality detection through
handwriting analysis, SVM was used to learn the unique patterns and
characteristics of an individual's handwriting that reflect personality,
reaching an accuracy of 99.9% with the linear kernel function (Safitri &
Wulanningrum, 2020). In object motion detection based on image processing, the
binary-image comparison method combined with SVM detected object movement in
images and classified various object movements with a high success rate; the
SVM model using a combination of color and motion features reached an accuracy
of 98.4% (Matalangi & Jalil, 2020; Shokat et al., 2022).
SVM has also been applied to Braille recognition. One previous study used SVM
with Gabor wavelet feature extraction to identify Braille letters and achieved
an accuracy of 98.15% on 758 samples covering uppercase Braille letters,
lowercase Braille letters, punctuation marks and numbers (Florestiyanto &
Prapcoyo, 2021).
In this research, the feature extraction method used is the Histogram of
Oriented Gradients (HOG). While HOG has been widely used and is effective in
several applications, there is room for further research in improving the
spatial representation of Braille images. Research can explore several
approaches to feature extraction techniques to obtain a more detailed
description of the structure of Braille dots.
This research is expected to contribute to image processing-based Braille
letter recognition systems by making Braille identification with the SVM
method easier to use. By implementing SVM, it is hoped that Braille letters
can be recognized efficiently.
Thus, Braille is not only a historically
important script, but also a foundation for technological innovations that
could change the way the visually impaired public interacts with an
increasingly connected world. This research aims to explore the
application of Support Vector Machine (SVM) to enhance Braille letter
recognition through image processing, a critical technology aimed at improving
information accessibility for the general public.
Research Methods
In this research, Braille letter identification is carried out in several
stages designed to produce an effective and efficient system. The stages are
described below and shown in Figure 1.
Data pre-processing
is a crucial initial stage for organizing and preparing a Braille image
dataset. The initial dataset consisted of images representing Braille letters
from 'a' to 'z', but was initially mixed with no clear grouping. The goal of
this pre-processing is to reorganize these images into folders corresponding to
each letter. This process makes it possible to prepare the dataset so that it
is ready to be used for further analysis, such as image classification and
model training.
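The regrouping step can be sketched as follows. This is only an illustration under stated assumptions: the function name `organize_by_letter`, the folder names, and the rule that the first character of each file name encodes the letter are all hypothetical and are not taken from the study.

```python
import shutil
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def organize_by_letter(source_dir: Path, target_dir: Path) -> dict:
    """Copy each image into target_dir/<letter>/.

    Assumption: the first character of the file name is the letter label.
    """
    counts = {}
    for image_path in sorted(source_dir.iterdir()):
        if not image_path.is_file() or image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # ignore sub-folders and non-image files
        letter = image_path.name[0].lower()
        if not ("a" <= letter <= "z"):
            continue  # skip files that do not follow the assumed naming scheme
        letter_dir = target_dir / letter
        letter_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image_path, letter_dir / image_path.name)
        counts[letter] = counts.get(letter, 0) + 1
    return counts


# Small demonstration with empty placeholder files in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "mixed", Path(tmp) / "by_letter"
    src.mkdir()
    for name in ["a1.jpg", "a2.jpg", "b1.jpg", "z1.jpg", "notes.txt"]:
        (src / name).touch()
    demo_counts = organize_by_letter(src, dst)
    demo_folders = sorted(p.name for p in dst.iterdir())

print(demo_counts, demo_folders)
```

Copying rather than moving keeps the original mixed folder intact, which makes the step safe to rerun.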
The next step is to divide the dataset into
training data (train set) and test data (test set). This division is important
because it allows an objective evaluation of the performance of the developed
model. Training data is used to train the model to recognize common patterns in
Braille images. Meanwhile, test data is used to measure how well the model can
predict or recognize Braille letters in data that has never been seen before.
The dataset is divided in a fixed ratio; in this study, 80% is used for
training data and 20% for test data.
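An 80:20 split of this kind could look like the sketch below, which uses `train_test_split` from scikit-learn. The placeholder data, the `stratify` option, and the `random_state` value are assumptions for illustration; the study does not say whether the split was stratified.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 26 letters x 10 images, each a 784-element feature vector
X = rng.random((260, 784))
y = np.repeat(np.arange(26), 10)

# stratify=y keeps the same share of every letter in both subsets (assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```

Stratifying is a reasonable default when every letter has the same number of images, because it prevents a letter from being missing in the test set by chance.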
Feature Extraction
The
feature extraction process is an important step in an image-based Braille
recognition system. Feature extraction aims to capture relevant information
from images so that it can be used to differentiate one class of letters from
another class. This research uses the Histogram of Oriented Gradients (HOG)
feature to capture edge and shape information in Braille images.
To
increase the variety of training data and improve model performance, we apply
image augmentation. Augmentation techniques used include rotation (15, -15, 30,
-30 degrees) and noise addition. This augmentation is performed for each
original image to produce several new image variations.
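A minimal sketch of such an augmentation step is shown below. The rotation angles come from the text; the use of `scipy.ndimage.rotate`, Gaussian noise, the noise level `noise_std`, and the function name `augment` are assumptions, since the study does not specify them.

```python
import numpy as np
from scipy.ndimage import rotate


def augment(image, angles=(15, -15, 30, -30), noise_std=0.05, seed=0):
    """Return one rotated copy per angle plus one noisy copy of `image`.

    `image` is expected to be a 2D float array with values in [0, 1].
    """
    rng = np.random.default_rng(seed)
    variants = []
    for angle in angles:
        # reshape=False keeps the original height and width
        rotated = rotate(image, angle, reshape=False, mode="nearest")
        variants.append(np.clip(rotated, 0.0, 1.0))
    # Gaussian noise is an assumption; the text only says "noise addition"
    noisy = image + rng.normal(0.0, noise_std, image.shape)
    variants.append(np.clip(noisy, 0.0, 1.0))
    return variants


# Placeholder 28x28 image with a single bright square standing in for a dot
image = np.zeros((28, 28))
image[8:12, 8:12] = 1.0
augmented = augment(image)
print(len(augmented), augmented[0].shape)
```

With four rotations and one noisy copy, each original image yields five additional variants in this sketch.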
HOG features are extracted
from each augmented image. HOG is a useful technique for capturing edge and
shape orientation information in images. The parameters used in HOG extraction
are as follows:
1) orientations = 9: Splits the
gradient orientation into 9 bins.
2) pixels_per_cell = (8, 8): Splits the image
into cells of 8x8 pixels.
3) cells_per_block = (2, 2): Forms a block of
2x2 cells.
4) transform_sqrt = True: Applies square root
transformation for contrast normalization.
5) block_norm = 'L2-Hys': Uses L2-Hys
normalization for blocks.
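The parameter names above match the `hog` function in scikit-image, so the extraction can be sketched with that library. This is an assumption, as the study does not name the library it used; the random 28x28 input is a placeholder and `extract_hog` is a hypothetical helper name.

```python
import numpy as np
from skimage.feature import hog


def extract_hog(image):
    """Compute a HOG feature vector with the parameters listed above."""
    return hog(
        image,
        orientations=9,
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        transform_sqrt=True,   # requires non-negative pixel values
        block_norm="L2-Hys",
    )


rng = np.random.default_rng(0)
image = rng.random((28, 28))  # placeholder grayscale image in [0, 1)
features = extract_hog(image)
print(features.shape)
```

For a 28x28 image, 8x8 cells give 3x3 complete cells, 2x2 blocks slide over them in 2x2 positions, and each block holds 2 x 2 x 9 values, so the vector has 2 x 2 x 36 = 144 elements.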
After HOG feature extraction, these
features are standardized using StandardScaler. This
standardization process ensures that all features have the same scale, which is
important for optimal performance of the classification algorithm used.
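A short sketch of this standardization with scikit-learn's `StandardScaler` follows. The random feature matrices are placeholders, and fitting the scaler on the training data only is a common practice assumed here rather than something the study states.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder feature matrices standing in for extracted HOG vectors
train_features = rng.random((100, 144)) * 5.0
test_features = rng.random((20, 144)) * 5.0

scaler = StandardScaler()
# Learn mean and standard deviation from the training data only ...
train_scaled = scaler.fit_transform(train_features)
# ... and reuse the same statistics for the test data
test_scaled = scaler.transform(test_features)

print(train_scaled.mean(axis=0)[:3], train_scaled.std(axis=0)[:3])
```

After scaling, every training feature has mean 0 and standard deviation 1, so no single feature dominates the distance computations inside the SVM.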
HOG feature extraction provides a robust
representation of Braille images by capturing detailed edge and shape
information. These features are then used to train a classification model that
will identify Braille letters based on the given image.
Image Classification
The image classification stage
uses a Support Vector Machine (SVM) whose parameters are optimized for
the Braille classification task. SVM is a very effective algorithm for
classification tasks, especially for data that has high dimensions and complex
distributions. In this research, the Support Vector Classifier (SVC), which is
an implementation of SVM for classification, is used to classify Braille images
into appropriate letter categories.
SVC works based on the principle of finding
an optimal hyperplane that can separate two classes of data in feature space.
This hyperplane is a decision boundary that maximizes the margin (distance)
between different classes. This margin is measured as the distance from the
hyperplane to the closest points of each class. The main goal of SVC is to find
the hyperplane that has the maximum margin, thereby increasing the
generalization ability of the model to new, never-before-seen data.
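The maximum-margin idea can be written as the standard soft-margin optimization problem below. This is the usual textbook formulation, not a formula taken from the study; the symbols (weight vector w, bias b, slack variables xi_i, labels y_i in {-1, +1}, and penalty parameter C) are introduced here for illustration.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
\frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\qquad \text{subject to} \qquad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1 - \xi_i,
\quad \xi_i \ge 0, \quad i = 1, \dots, n
```

In the separable case the distance from the hyperplane to the closest points is 1 / ||w||, so minimizing ||w|| maximizes the margin, while C controls how strongly margin violations are penalized. The formulation is binary; for the 26 letter classes, scikit-learn's SVC combines several such binary classifiers (one-vs-one).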
One of the main strengths of SVC is its
ability to handle data that cannot be linearly separated in the original
feature space. This is achieved using a kernel function, which implicitly maps data from
the original space to a higher-dimensional feature space (which is often not directly
visible), where the data can be linearly separated by a hyperplane.
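The effect of a kernel can be illustrated with a toy example that is unrelated to the Braille data: the four XOR points cannot be separated by any straight line, but an RBF kernel separates them. The values of `C` and `gamma` below are arbitrary choices for the illustration.

```python
import numpy as np
from sklearn.svm import SVC

# XOR layout: opposite corners share a class, so no line separates the classes
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([0, 0, 1, 1])

# Training accuracy with a linear decision boundary
linear_acc = SVC(kernel="linear", C=100.0).fit(X, y).score(X, y)

# Training accuracy with an RBF kernel (implicit higher-dimensional mapping)
rbf_acc = SVC(kernel="rbf", gamma=2.0, C=100.0).fit(X, y).score(X, y)

print(linear_acc, rbf_acc)
```

A linear boundary can classify at most three of the four points correctly, whereas the RBF model fits all four.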
To optimize SVC parameters, the Grid Search
technique is used. Grid Search evaluates every combination of parameter values
in a predefined grid and selects the combination that
produces the best model performance. The implementation of this technique uses
the Scikit-Learn library, which provides tools for performing grid searches and
cross-validation, making it easier to focus on improving model performance.
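A grid search of this kind might be set up as in the sketch below. The parameter grid, the 5-fold cross-validation, the pipeline with `StandardScaler`, and the synthetic `make_blobs` data are all assumptions chosen for illustration; the study does not report the grid it searched.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the feature matrix and letter labels
X, y = make_blobs(n_samples=300, centers=4, n_features=20,
                  cluster_std=2.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling inside the pipeline is refit on each training fold
pipeline = make_pipeline(StandardScaler(), SVC())

# Hypothetical grid; the values are examples only
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__kernel": ["linear", "rbf"],
    "svc__gamma": ["scale", 0.01],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, test_accuracy)
```

Placing the scaler inside the pipeline keeps the validation folds out of the scaling statistics, which gives a fairer cross-validation estimate.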
The final stage is prediction, where the
trained SVC model is used to predict the class of the new Braille image. This prediction
process involves steps such as reading and resizing the image to the same
dimensions (for example, 28x28 pixels), converting it to a one-dimensional
(flatten) vector, and then using the SVC model to predict the Braille letters
that represent the image. These prediction results can be used for practical
applications such as helping blind users identify Braille letters from the
images they get.
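The prediction steps described above (resize, flatten, predict) can be sketched as follows. The function name `predict_letter`, the toy two-class model, and the use of `skimage.transform.resize` are assumptions; reading the image from disk is left out, and if the model was trained on HOG features, the same extraction and scaling would have to be applied before calling `predict`.

```python
import string

import numpy as np
from skimage.transform import resize
from sklearn.svm import SVC

LETTERS = string.ascii_lowercase  # label 0 -> 'a', ..., label 25 -> 'z'


def predict_letter(model, image, size=(28, 28)):
    """Resize a grayscale image, flatten it, and return the predicted letter."""
    resized = resize(image, size, anti_aliasing=True)
    vector = resized.reshape(1, -1)  # one sample with 784 features
    label = int(model.predict(vector)[0])
    return LETTERS[label]


# Toy model for the demonstration: dark images are 'a', bright images are 'b'
rng = np.random.default_rng(0)
dark = rng.random((20, 784)) * 0.2
bright = 0.8 + rng.random((20, 784)) * 0.2
model = SVC(kernel="linear").fit(np.vstack([dark, bright]), [0] * 20 + [1] * 20)

new_image = np.full((56, 56), 0.9)  # placeholder "unseen" image
predicted = predict_letter(model, new_image)
print(predicted)
```

The key requirement is that a new image goes through exactly the same preparation as the training images, otherwise the feature vectors are not comparable.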
Figure 1. Braille research flow using
the SVM method
Results and Discussion
Braille identification is the process of recognizing Braille letters from
images so that the public can read them. Recognition systems can help
improve accessibility for the general public.
In this research, researchers used Support Vector Machine (SVM) to classify
Braille images.
The initial dataset
was extracted from a zip file and organized into folders corresponding to
Braille letter categories (a to z). Once extracted, image files
are grouped into directories based on the letters they represent, as can be
seen in Figure 3. This step makes subsequent data processing, such as labeling
and feature extraction, easier.
Figure 2. Braille dataset (Braille Character Dataset | Kaggle)
Figure 3. Classification of the braille
dataset that has been moved to folders according to the letters
Each image from the dataset is resized to
consistent dimensions. In this study, each image was resized to 28x28 pixels.
The purpose of this resizing is to ensure that all images have the same
dimensions, so they can be compared and processed consistently by the model.
Once the image is resized, the next step is to convert the image into a feature
vector. This transformation is carried out by "flattening" the 2D
image into one 1D array. Each pixel in the image is represented by one value in
the vector, so a 28x28 pixel image will produce a vector that is 784 elements
long. In addition to collecting feature data, labels for each image are also
collected. These labels are used to train the model to recognize Braille
letters. Each letter category (a to z) is assigned a
unique label, which is then stored in the 'labels' array.
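The flattening and labeling step can be illustrated with a few placeholder images. The mapping of 'a' to 0 through 'z' to 25 is an assumption; the study only states that each letter receives a unique label.

```python
import string

import numpy as np

rng = np.random.default_rng(0)

# Three placeholder 28x28 images and the letters they are supposed to show
images = [rng.random((28, 28)) for _ in range(3)]
letters = ["a", "c", "z"]

# Assumed label scheme: 'a' -> 0, 'b' -> 1, ..., 'z' -> 25
letter_to_label = {letter: index for index, letter in enumerate(string.ascii_lowercase)}

# Flatten each 2D image into a 1D vector of 28 * 28 = 784 values
features = np.array([image.flatten() for image in images])
labels = np.array([letter_to_label[letter] for letter in letters])

print(features.shape, labels)
```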
In this research, HOG feature extraction succeeded in capturing
important information from Braille images. The HOG feature breaks the image
into small cells and calculates a gradient orientation histogram within each
cell. The result is a feature vector that represents the distribution of edge
orientations in the image.
1) Gradient Orientation: The gradient orientations are divided into 9 bins,
which means that the edge orientation range from 0 to 180 degrees is broken
into 9 parts. This helps in capturing various edge directions in the image.
2) Cells and Blocks: The image is split into cells of 8x8 pixels, and each
block consists of 2x2 cells. This provides sufficient spatial resolution to
capture important details while preserving local context.
3) Contrast Normalization: Square root transformation and L2-Hys normalization
help in reducing the effects of lighting and contrast variations, so that the
extracted features are more robust to different lighting conditions.
Standardizing features using StandardScaler
ensures that all features are at the same scale. This is important because
non-standardized features can cause some features to dominate others, which can
reduce model performance.
In this step, the dataset is divided into
training data and test data in a ratio of 80:20. Training data is used to train
the SVM model, while test data is used to test the model's performance and
measure its ability to generalize knowledge to new data. The use of Grid Search
in model training helps to find the optimal combination of SVM parameters, to
improve classification performance.
Figure 4. SVM model optimization using GridSearchCV shows 'SVC' as the best estimator for Braille classification
The Support Vector Machine (SVM) model is
trained to recognize braille letters based on previously labeled feature data.
Once the SVM model is trained using the processed dataset, the model is tested
to evaluate its performance. Testing is carried out using test data that is
separate from the training data to measure how well the model can recognize
braille letters.
After training and optimizing the SVM model
using GridSearchCV, researchers evaluated the model's
performance on test data. This evaluation is carried out by calculating the
model accuracy, which is the ratio of the number of correct
predictions to the total number of predictions.
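Following the definition in the preceding paragraph, accuracy can be written as below. The notation is introduced here for illustration and is not taken from the original figure.

```latex
\text{Accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}} \times 100\%
```

As a purely hypothetical example, 79 correct predictions out of 100 test images would give an accuracy of 79%.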
Figure 5. Prediction results using GridSearchCV show SVC as the best estimator with an accuracy of 79%
The Support Vector Machine (SVM) model
trained with HOG features shows good performance in classifying Braille
letters. The best model produced by Grid Search
achieves an accuracy of 79%. This shows that the extracted HOG features are
effective in distinguishing between different Braille letters.
Image Prediction
To improve the practical usability of the
system, researchers developed a prediction function that accepts a new image as
input and returns a Braille prediction. This function allows the use of the
model to identify Braille letters from images that the model has never seen
before, increasing the system's applicability in real use.
Figure 6. Prediction results for braille
letter identification
Previous research using various techniques
such as color conversion, image segmentation, feature extraction, and
classification algorithms has shown success in Braille recognition. The
methods used include Radially Average Power Spectrum (RAPS) combined with
geometric feature extraction, which achieved an accuracy of 97.18%, and a CNN,
which achieved an accuracy of 81.54% for images acquired by smartphones.
This research also shows that SVM with
Histogram of Oriented Gradients (HOG) feature extraction can be used effectively
to identify Braille letters, with
an accuracy of 79%, which opens up opportunities to make it easier for the
general public to recognize Braille letters.
Conclusion
Support Vector Machine (SVM) has
been proven effective in image processing and pattern recognition, including
Braille recognition. This research uses SVM with Grid Search optimization
techniques to find the best parameters, producing a model that is able to
recognize Braille letters with sufficient accuracy. The dataset used consists
of Braille images that are processed and resized into a consistent format
before being classified. In this study, the SVM model and Histogram of Oriented
Gradients (HOG) feature extraction achieved 79% accuracy in recognizing Braille
letters in the test dataset. The identification process involves several
stages, starting from data pre-processing, dataset splitting, feature extraction,
image classification, to Braille letter prediction.
The results of feature extraction and model
performance show that the approach used is effective for Braille recognition.
HOG feature extraction successfully captures relevant spatial information from
images, and data augmentation helps the model to generalize better. Feature
standardization ensures consistency in feature scale, which is important for
optimal performance of classification algorithms. These results demonstrate the
potential of SVM and Histogram of Oriented Gradients (HOG) feature extraction in
developing an efficient Braille recognition system to help the general public.
This research opens up further opportunities for technological innovation that
can improve the general public's accessibility and interaction with the blind.
BIBLIOGRAPHY
Agustina, W., Furqon, M. T., & Rahayudi, B. (2018). Implementasi Metode Support Vector Machine (SVM) Untuk Klasifikasi Rumah Layak Huni (Studi Kasus: Desa Kidal Kecamatan Tumpang Kabupaten Malang). Jurnal Pengembangan Teknologi Informasi Dan Ilmu Komputer, 2(10), 3366–3372.
Aulia, S., & Satriani, S. N. N. (2020). Recognition of Image Pattern To Identification Of Braille Characters To Be Audio Signals For Blind Communication Tools. IOP Conference Series: Materials Science and Engineering, 846(1), 12008.
De Luna, R. G. (2020). A Tesseract-based optical character recognition for a text-to-braille code conversion. International Journal on Advanced Science, Engineering and Information Technology, 10(1), 128–136.
Florestiyanto, M. Y., & Prapcoyo, H. (2021). Braille Detection Application Using Gabor Wavelet and Support Vector Machine. RSF Conference Series: Engineering and Technology, 1(1), 160–169.
Herlambang, M. F., Hermana, A. N., & Putra, K. R. (2021). Pengenalan Karakter Huruf Braille dengan Metode Convolutional Neural Network. Systemic: Information System and Informatics Journal, 6(2), 20–26. https://doi.org/10.29080/systemic.v6i2.969
Kausar, T., Manzoor, S., Kausar, A., Lu, Y., Wasif, M., & Ashraf, M. A. (2021). Deep learning strategy for braille character recognition. IEEE Access, 9, 169357–169371.
Matalangi, M., & Jalil, A. (2020). Deteksi Gerak Objek Berbasis Pengolahan Citra Menggunakan Metode Binary-Image Comparison. Electro Luceat, 6(1), 109–116.
Meneses-Claudio, B., Alvarado-Diaz, W., & Roman-Gonzalez, A. (2020). Classification System for the Interpretation of the Braille Alphabet through Image Processing.
Nini, K., & Harum, R. (2023). Metode Struktural Analitik Sintetik (SAS) untuk Meningkatkan Kemampuan Membaca Huruf Braille pada Anak Tunanetra di Bhakti Luhur Malang. Jurnal Pelayanan Pastoral, 44–54.
Ramiati, R., Aulia, S., & Lifwarda, L. (2020). Aplikasi Identifikasi Huruf Braille Menggunakan Computer Vision Berbasis Raspberry Pi. Jurnal Nasional Teknik Elektro, 9(1), 12. https://doi.org/10.25077/jnte.v9n1.707.2020
Safitri, K. A., & Wulanningrum, R. (2020). Aplikasi Pengenalan Pola Tulisan Tangan Menggunakan Metode Support Vector Machine. Prosiding SEMNAS INOTEK (Seminar Nasional Inovasi Teknologi), 4(1), 201–206.
Sari, A. (2023). Pengenalan Huruf Braille Menggunakan Radially Average Power Spectrum Dan Geometri. Jurnal Inovtek Polbeng-Seri Informatika, 8(1), 25–36.
Shokat, S., Riaz, R., Rizvi, S. S., Abbasi, A. M., Abbasi, A. A., & Kwon, S. J. (2020). Deep learning scheme for character prediction with position-free touch screen-based Braille input method. Human-Centric Computing and Information Sciences, 10, 1–24.
Shokat, S., Riaz, R., Rizvi, S. S., Khan, I., & Paul, A. (2022). Characterization of English braille patterns using automated tools and RICA based feature extraction methods. Sensors, 22(5), 1836.
Sumarni, S. (2019). Implementasi Braille Berbasis Media Card Huruf Hijaiyyah Dalam Meningkatkan Kemampuan Mengenal Huruf Pada Tunanetra Siswa Sekolah Luar Biasa Negeri 1 Makassar. Al-Maraji’: Jurnal Pendidikan Bahasa Arab, 3(2), 17–34.
Wulandari, P. L., Kurniawan, D., Ihsan, R., Safaruddin, S., & Damri, D. (2022). Printer Penerjemahan Teks ke Audio-Braille Menggunakan Sistem Arduino Uno Untuk Tunanetra. Jurnal Aplikasi IPTEK Indonesia, 5(3), 99–106.
Copyright holder: Rizka Noviyanti, Estu Sinduningrum (2024)
First publication right: Syntax Literate: Jurnal Ilmiah Indonesia