Syntax Literate: Jurnal Ilmiah Indonesia p–ISSN: 2541-0849 e-ISSN: 2548-1398
Vol. 9, No. 9, September 2024
DEVELOPMENT OF RECOMMENDATION BUNDLING SYSTEM FOR FOOD RETAILER
BASED ON DATA MINING
Universitas Islam
Indonesia, Yogyakarta, Indonesia1.2
Email: [email protected]1, [email protected]2*
Abstract
In retail, a major challenge is quickly selling
perishable food items like cakes or pastries within a limited time. If unsold,
these products lead to losses, even with discount promotions. Retailers must
forecast stock accurately and develop more effective strategies to meet
consumer demand within the required timeframe. One suggested strategy is
bundling, where high-demand products are combined with less popular ones.
However, when done manually, this approach often fails to align with consumer
preferences. This study aims to
develop an automated bundling system using data mining techniques. Market
Basket Analysis is used to understand consumer purchasing patterns, while
Association Rules with the Apriori Algorithm help identify relationships
between different products. These methods reveal which items are frequently
bought together, making bundling strategies more effective. The system will be designed with usability
and ergonomic principles, ensuring it is user-friendly. The implications of
this system include improved stock management, more accurate bundling, and
better alignment with customer preferences, ultimately increasing satisfaction.
Additionally, automation reduces errors and inconsistencies that occur with
manual bundling. The expected outcome is a more efficient, effective, and
comfortable system for food retailers, leading to higher sales, reduced losses,
and greater overall customer satisfaction.
Keywords: Retailer, usable, Customer behavior,
Market Basket Analysis, Association rules, Ergonomics
Introduction
Retail business is a sector that involves selling goods and services directly to
consumers for personal use.
Alongside understanding consumer habits, other issues affect retail
performance. Food retailers share many problems with other retail sectors, but
their most common problem is how to sell various food products within a given
period of time because of the products' limited shelf life.
Applying digital technology in the retail business is important, particularly to
address challenges in conventional processes such as data inaccuracy,
inefficient inventory management, and customer dissatisfaction. In this
context, consumer behavior becomes a key element that must be understood as
input in implementing data mining methods like Association Rules and Market
Basket Analysis (AR-MBA), which have proven effective in analyzing product
package recommendations in the food retail sector. However, one of the main
problems faced by food retailers is how to sell food products with a limited
shelf life. Failure to sell these products within a certain period will result
in losses. Although promotional strategies like discounts have been
implemented, losses still frequently occur for various reasons. Therefore,
alternative sales strategies such as the bundling system, where multiple
products are combined, are proposed as a solution.
While
manual bundling is still often used, this method is considered ineffective.
Thus, it is crucial to develop an automated bundling system that is more
efficient and aligns with consumer preferences.
Previous
research provides a valuable foundation for developing an automated bundling
system in retail. Several researchers, such as
Research Methods
CRISP-DM (Cross-Industry Standard Process for Data Mining), as described by
IBM, is a widely used methodology for guiding data mining projects. It consists
of six main phases:
a) Business Understanding: This phase involves
defining project goals, objectives, and success criteria from a business
perspective.
b) Data Understanding: This phase focuses on
data collection and exploration, identifying data sources, collecting relevant
data, and understanding its structure, quality, and relationships.
c) Data Preparation: In this phase, data is
cleaned, missing values are handled, variables are transformed, and data from
multiple sources is integrated for analysis.
d) Modeling: This phase involves applying
modeling techniques to the prepared data to build predictive or descriptive
models, selecting algorithms, building and evaluating models, and tuning
parameters.
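e) Evaluation: Models are evaluated to assess
their performance and determine how well they meet the project objectives,
involving validation with testing data and performance comparison against
requirements.
f) Deployment: The evaluated model is put into
operational use so that its results can support business decisions.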
Usability Analysis
Usability is the level of usefulness of a system and indicates the quality
of interaction between the user and the system or computer device being accessed.
ISO 9241-11:1998 defines usability as the extent to which a product can be used
by specified users to achieve specified goals with effectiveness, efficiency,
and satisfaction in a specified context of use. The relationship between
usability and a system or software is studied in Human-Computer Interaction,
where usability is commonly described by five criteria: Learnability,
Efficiency, Memorability, Errors, and Satisfaction.
Survey
Data is collected through direct questioning methods, including written
(paper-based) questionnaires, telephone interviews, and face-to-face
interviews. To enhance the efficiency of data collection, observations are
recorded on a PC, online platform, tablet, or smartphone.
1) Market Basket Analysis: secondary data
consisting of the food retailer's transaction data.
2) Usability: in this study, a practical
application survey was applied using a checklist on an online platform to
record data on the usability level of the Ergo-Bundling System, using the System
Usability Scale (SUS) by John Brooke (1986) and performance testing following
Jakob Nielsen.
Design of Survey and Experiment
a) Subject: 5 participants, selected as experts based on
their knowledge of business, manufacturing, development, and design.
b) Apparatus: Performance testing was carried out
using the ErgoBundling Application, illustrated in Figure 1, and the SUS
questionnaire.
Figure 1. ErgoBundling Application
c) Procedures: Participants were given a
presentation and a Focus Group Discussion before the performance testing and the SUS questionnaire.
d) Scenario Performance Testing: Performance
testing was applied to measure Time-Based Efficiency. The scenario, described in
Table 1, consists of 4 tasks to be completed by participants.
Table 1. Scenario Performance Testing
Task | Description
Task 1. Performing Login | Performing login into the system as the ErgoBundling user, then logging out and logging in again
Task 2. Selecting Bundling Menu and Uploading Bundling File and Reading Bundling Results | In the first step, the user finds the Bundling menu and uploads the bundling file
Task 3. Selecting Bundling Menu and Uploading Forecast File and Reading Forecast Results | The user finds the Bundling menu and uploads the forecast file then processes the file by selecting the "process" button.
Task 4. Reading History | The user can search for the date of the process that has been performed by finding the corresponding date, for example, January 23, 2024.
a. System Usability Scale Questionnaire
The System Usability Scale (SUS) by Brooke used in this study consists of 10
statements covering Learnability, Memorability, Errors, and Satisfaction:
(SUS1). I think that I would like to use this system frequently.
(SUS2). I found the system unnecessarily complex.
(SUS3). I thought the system was easy to use.
(SUS4). I think that I would need the support of a technical person to be able to use this system.
(SUS5). I found the various functions in this system were well integrated.
(SUS6). I thought there was too much inconsistency in this system.
(SUS7). I would imagine that most people would learn to use this system very quickly.
(SUS8). I found the system very cumbersome to use.
(SUS9). I felt very confident using the system.
(SUS10). I needed to learn a lot of things before I could get going with this system.
Result and Discussion
Result of Design System
In this study, a Pareto diagram reveals that around 20% of the problems have a significant influence on the overall business processes. These issues include a lack of insight among human resources, reliance on manual labor for planning, inaccuracies in planning data for each branch, discrepancies between actual outcomes and forecasts, and limitations in data tracking and analysis. These challenges highlight the need for a more efficient system to address productivity and profitability issues in food retail, supporting the conclusion that automation and data-driven approaches are essential.
Previous research has similarly identified
the importance of automating retail processes to minimize human error and
improve accuracy. Studies by
In the Data Understanding phase,
the first step is data selection, followed by data cleansing, where
non-relevant items like cardboard are removed from the transaction data. This
step is crucial to ensure the data used for analysis is accurate and
meaningful, as supported by studies like
Result of Market Basket Analysis
From the support counts of the association rules, it was found that a
support level of 50% yields a sufficient number of rules to be used as
recommendations, because these rules have a confidence level above 40%, which is
higher than at other support levels. The support value indicates the percentage
of all transactions in which Item1 is purchased together with Item2, while the
confidence value indicates the percentage of transactions containing Item1 that
also contain Item2. Minimum support and confidence values of 40-50% are set to
ensure that the generated association rules only encompass significant and
relevant purchase patterns. This helps avoid generating too many rules that may
be useless or meaningless. The results of association rule processing are shown
in Table 2 below:
Table 2. Association rules result of Ergo-Bundling System
No. | Premise | Conclusion | Confidence
1 | Donat, Maffin | Arem_arem | 0.4
2 | Arem_arem, Maffin | Apem (conditional rules) | 0.45
3 | Arem_arem, KunirAsem_botol | Donat | 0.45
4 | Donat, KunirAsem_botol | Lemper (conditional rules) | 0.45
5 | Misoa, Donat | Arem_arem | 0.50
From these 2-itemsets, a second data mining process was conducted to identify
combined association rules that could be recommended to the company. To avoid
rules containing the same items, conditional rules are added to the system,
namely for wet bread products, which fall under the category of fast-expiring
products that need to be sold immediately.
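The support and confidence measures used for these rules can be illustrated with a minimal Python sketch. It is not the Ergo-Bundling implementation: the baskets are invented for illustration, the function names are hypothetical, and the search is a brute-force scan over three-item sets rather than a full Apriori implementation with candidate pruning.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)

def confidence(premise, conclusion, transactions):
    """support(premise plus conclusion) divided by support(premise)."""
    premise_support = support(premise, transactions)
    if premise_support == 0:
        return 0.0
    return support(set(premise) | set(conclusion), transactions) / premise_support

def mine_rules(transactions, min_support, min_confidence):
    """Brute-force search for rules {a, b} -> c (no Apriori pruning)."""
    items = sorted(set().union(*transactions))
    rules = []
    for triple in combinations(items, 3):
        if support(triple, transactions) < min_support:
            continue  # the three items are not bought together often enough
        for conclusion in triple:
            premise = tuple(i for i in triple if i != conclusion)
            conf = confidence(premise, [conclusion], transactions)
            if conf >= min_confidence:
                rules.append((premise, conclusion, conf))
    return rules

# Hypothetical baskets, invented for illustration only.
baskets = [
    frozenset({"Donat", "Maffin", "Arem_arem"}),
    frozenset({"Donat", "Maffin"}),
    frozenset({"Donat", "Arem_arem", "Misoa"}),
    frozenset({"Maffin", "Arem_arem", "Donat"}),
]

for premise, conclusion, conf in mine_rules(baskets, 0.5, 0.4):
    print(premise, "->", conclusion, round(conf, 2))
```

Raising `min_support` or `min_confidence` shrinks the rule list, which is the trade-off the thresholds in this section are meant to control.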
In the association rules produced by Rapid Miner for comparison, the lift
values show that every rule has a relationship between its Premise and
Conclusion, so the rules are suitable as bundling packages for the food retailer.
Table 3. Association rules result of Rapid Miner
No. | Premise | Conclusion | Confidence | Lift
1 | Donat, Maffin | Arem_arem | 0.7799331103678929 | 2,4302E+15
2 | Arem_arem, Maffin | Donat | 0.855465884079237 | 2,2755E+16
3 | Arem_arem, KunirAsem_botol | Donat | 0.8868479059515062 | 2,3590E+16
4 | Donat, Misoa | Arem_arem | 0.7335109926715523 | 2,2856E+15
5 | Arem_arem, Misoa | Donat | 0.816160118606375 | 2,1710E+16
Table 3 presents the association rules from Rapid Miner, a data analysis
software. A comparison shows no significant differences between the rules
generated by the Ergo-Bundling System and those generated by Rapid Miner. The
minimum confidence was set at 0.8, and the lift values of all rules are greater
than 1. A lift value greater than 1 indicates that the items are purchased
together more often than would be expected if they were independent, that is,
a positive association. In this way, lift provides insight into how
strong the relationship between items is and how relevant the association
rules are in the context of Market Basket Analysis. The rules are therefore
strong enough to represent the customer behavior of the retailer.
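As a rough illustration of how lift is read, the sketch below computes it from raw counts. Lift divides the observed joint support by the product of the individual supports, so a value of 1 corresponds to independence and values above 1 to a positive association. The baskets and the function name are hypothetical and are not taken from the retailer's data or from Rapid Miner.

```python
def lift(premise, conclusion, transactions):
    """support(premise and conclusion) / (support(premise) * support(conclusion))."""
    n = len(transactions)
    premise, conclusion = frozenset(premise), frozenset(conclusion)
    count_premise = sum(1 for b in transactions if premise <= b)
    count_conclusion = sum(1 for b in transactions if conclusion <= b)
    count_both = sum(1 for b in transactions if (premise | conclusion) <= b)
    if count_premise == 0 or count_conclusion == 0:
        return 0.0  # lift is undefined here; 0.0 is a convention of this sketch
    return (count_both * n) / (count_premise * count_conclusion)

# Hypothetical baskets, invented for illustration only.
example_baskets = [
    frozenset({"Donat", "Misoa", "Arem_arem"}),
    frozenset({"Donat", "Misoa", "Arem_arem"}),
    frozenset({"Donat", "Apem"}),
    frozenset({"Apem"}),
    frozenset({"Maffin", "Apem"}),
]

# Bought together more often than chance would suggest: lift above 1.
print(lift({"Donat", "Misoa"}, {"Arem_arem"}, example_baskets))
# Bought together less often than chance would suggest: lift below 1.
print(lift({"Donat"}, {"Apem"}, example_baskets))
```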
This is
supported by several previous studies that have stated the implementation of
the Apriori Algorithm for data mining in enhancing business strategies provides
faster analysis compared to other algorithms (Harshali, 2023). Additionally,
another study
Modeling Phase
1) Design Process
The Auto-Bundling System consists of 6 processes: Login, Menu Selection,
Bundling Data Mining Process, Forecasting Data Mining Process, Storage
Management, and Pricing Recommendation. There are two storage locations for the
database, namely Localhost and Server. In the Auto-Bundling System, the
database is stored on Localhost and operated by the server. The database server
consists of both a Database Frontend and a Database Backend. All incoming data is
automatically stored by the system in a specific path. For example, in
process 5, Storage Management, when a user wants to review the bundling results
from January 15, 2023, the user selects that date, and the server retrieves the
bundling history data for January 15, 2023, from the database backend and
presents it as the bundling results.
2) Result of Requirements Analysis
Requirements analysis is a part of system analysis
that identifies the components needed for system design. It covers
four main steps: analyzing inputs, processes, outputs, and interfaces. First, the
system requires input data such as Registration Email, Username, Password,
Sales ID, Time and Date, Item-set, Category Item-set, and Frequency/Quantity.
Second, the processes require that the system register new users with their
name, email, username, and password; allow users to log in with their username
or email and password; enable users to retrieve forgotten passwords via email;
let users select actions (prediction or bundling); allow users to upload and
re-upload sales data (CSV files); and visualize predictions and bundling results
through graphs or tables. Third, the system outputs Sales Prediction
Information for production decisions, to optimize restocking and reduce losses,
and Bundling Package Information for promotional strategies, to increase profit
by selling products that need to be sold quickly. Last, the user interface
design includes interfaces for the Homepage, Registration, Login,
Forgot Password, Upload Data Sales, Auto-Bundling Recommendations, and
Forecasting Results.
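To make the upload requirement more concrete, the following sketch shows one possible way an uploaded sales CSV could be grouped into baskets before mining. The column names (`sales_id`, `date`, `item`, `category`, `quantity`) and the function name are assumptions chosen to mirror the input fields listed above; the actual file layout used by the system is not specified in this paper.

```python
import csv
import io
from collections import defaultdict

def load_baskets(csv_file, id_column="sales_id", item_column="item"):
    """Group CSV rows by sales ID into a dict of {sales_id: set of items}."""
    baskets = defaultdict(set)
    for row in csv.DictReader(csv_file):
        item = row[item_column].strip()
        if item:  # skip rows with an empty item field
            baskets[row[id_column].strip()].add(item)
    return dict(baskets)

# Hypothetical upload, invented for illustration only.
sample = io.StringIO(
    "sales_id,date,item,category,quantity\n"
    "T001,2024-01-23,Donat,bread,2\n"
    "T001,2024-01-23,Arem_arem,snack,1\n"
    "T002,2024-01-23,Maffin,bread,1\n"
)
uploaded = load_baskets(sample)
print(uploaded)
```

Grouping by sales ID first keeps the mining step independent of the file layout, so a re-uploaded file with extra columns would still produce the same baskets.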
Usability Analysis
1) Result of Efficiency
Effectiveness is measured as the average success percentage of all respondents
in the testing phase, with a total of 5 respondents from all specified user
types. Data on the success rate of respondents, categorized by expertise such as
Business Owners, Employees, Developers, and Designers, were collected using
performance measurement techniques and calculated using the completion rate
equation.
The time needed to finish a task is measured as Time-Based Efficiency. In this
study, participants finished the tasks with 73% efficiency. P3 and P5 achieved
the highest score, 80%, and the results show that Task 4 (searching the date
history) took the longest, at 9.6 seconds per task. This is because no
filter function is available on the Search page. Based on the Time-Based
Efficiency result, the Ergo-Bundling System is categorized as "Quite effective".
The application of the Data Mining method is supported by several previous
studies which state that system design using Data Mining can increase the
productivity of a business.
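The completion rate and Time-Based Efficiency measures can be sketched as follows. The formulas are one common formulation from the usability literature (completed attempts divided by all attempts, and the mean of completion divided by time over all user-task attempts); the paper does not show the exact equations it used, so this is an assumption, and the timing data are invented.

```python
def completion_rate(results):
    """Share of task attempts completed successfully (0.0 to 1.0)."""
    attempts = [done for user in results for done, _ in user]
    return sum(attempts) / len(attempts)

def time_based_efficiency(results):
    """Mean of (completed / seconds) over all user-task attempts, in goals per second."""
    ratios = [
        (1 if done else 0) / seconds
        for user in results
        for done, seconds in user
    ]
    return sum(ratios) / len(ratios)

# Hypothetical data: one list per participant, one (completed, seconds) pair per task.
results = [
    [(True, 5.0), (True, 10.0)],
    [(True, 5.0), (False, 10.0)],
]

print(completion_rate(results))        # share of successful attempts
print(time_based_efficiency(results))  # goals per second
```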
2) Result of System Usability Scale Questionnaire
Usability testing with SUS was selected based on the following statements by
Sauro (2011):
a) SUS is reliable. Users respond consistently
to the scale items, and SUS has proven capable of detecting differences in
smaller samples compared to other questionnaires.
b) SUS is valid. This means the tool measures
what it is intended to measure.
c) SUS is not a diagnostic tool. It does not
tell you what makes a system usable or not. It cannot diagnose specific
elements that make the system usable.
d) SUS measures both the learnability and
usability (5 elements) of a system.
e) SUS scores have a moderate correlation with
performance testing of efficiency analysis.
The usability test started with respondents rating each statement on a scale
from 1 (Strongly Disagree) to 5 (Strongly Agree). The SUS score is calculated
from the responses to these questions to evaluate the overall usability of the
system. The score is calculated using the following formula:
SUS Score = ((R1 - 1) + (5 - R2) + (R3 - 1) + (5 - R4) + (R5 - 1) + (5 - R6) + (R7 - 1) + (5 - R8) + (R9 - 1) + (5 - R10)) * 2.5 ... (Equation 1)
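Equation 1 can be expressed as a short Python function. This is a minimal sketch: the function name and the example responses are hypothetical, and the input checks are an addition for safety rather than part of the questionnaire procedure.

```python
def sus_score(responses):
    """Compute the SUS score (0 to 100) from ten responses R1..R10 on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for index, r in enumerate(responses, start=1):
        # Odd-numbered items contribute (R - 1); even-numbered items contribute (5 - R).
        total += (r - 1) if index % 2 == 1 else (5 - r)
    return total * 2.5

# Hypothetical responses from one participant, in the order SUS1..SUS10.
example = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
print(sus_score(example))
```

A study-level score such as the one reported here would typically be the mean of the per-participant scores.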
Based on the percentile interpretation of the SUS score of 74, the
Ergo-Bundling system is categorized as "Good" with an "Acceptable" usability
level, which means that the usability test results are acceptable in
representing the system as "Good" in Learnability, Memorability, Error,
Efficiency, and Satisfaction.
Conclusion
This research paper discusses the challenges faced by retail
entrepreneurs in selling various food products within a specific period of time
and proposes an automated bundling system to overcome the problem. The system
uses data mining techniques, specifically market basket analysis, to understand
consumer behavior and identify patterns in purchasing products. The result of
the Market Basket Analysis is a set of recommended product packages with a
minimum confidence level of 40%: Package 1, Maffin, Donat, and Arem_arem, priced
at IDR 9,000; Package 2, Arem_arem, Tamarin Juice, and Lemper, priced at
IDR 10,000; Package 3, Arem_arem, Maffin, and Apem, priced at IDR 8,500;
Package 4, Misoa, Donut, and Apem, priced at IDR 8,000; and Package 5, Donut,
Tamarin Juice, and Arem-arem, priced at IDR 10,500. The study
also incorporates usability and ergonomics concepts to design a user-friendly
and efficient bundling system. The results show that the automated system is
“Quite effective” in predicting stock levels, generating bundling
recommendations, and improving efficiency. In addition, the SUS results indicate
that the Ergo-Bundling system is “Good” and “Acceptable” in terms of the
usability elements Learnability, Memorability, Error, and Satisfaction. This
research provides valuable insights for retail businesses in optimizing product
sales strategies and management.
BIBLIOGRAPHY
Braha, D. (2013). Data mining for
design and manufacturing: methods and applications. 3. Springer Science
& Business Media.
Fijriani, M., Hayati, U., Dwilestari, G.,
Rizki Rinaldi, A., & Faturrohman, F. (2023). Implementasi Market Basket
Analysis Pada Toko Retail Menggunakan Algoritma Apriori. Kopertip : Jurnal
Ilmiah Manajemen Informatika Dan Komputer, 7(1).
https://doi.org/10.32485/kopertip.v7i1.252
Gridach, M. (2020). Hybrid deep neural
networks for recommender systems. Neurocomputing, 413.
https://doi.org/10.1016/j.neucom.2020.06.025
Griva, A., Bardaki, C., Pramatari, K.,
& Papakiriakopoulos, D. (2018). Retail business analytics: Customer visit
segmentation using market basket data. Expert Systems with Applications,
100, 1–16.
Guo, X., Zheng, S., Yu, Y., & Zhang,
F. (2021). Optimal bundling strategy for a retail platform under agency
selling. Production and Operations Management, 30(7), 2273–2284.
Guo, Y., Wang, N., Xu, Z.-Y., & Wu, K.
(2020). The internet of things-based decision support system for information
processing in intelligent manufacturing using data mining technology. Mechanical
Systems and Signal Processing, 142, 106630.
Han, J., Kamber, M., & Pei, J. (2011).
Data Mining: Concepts and Techniques. In Data Mining: Concepts and
Techniques. https://doi.org/10.1016/C2009-0-61819-5
Halim, S., Octavia, T., & Alianto, C.
(2019). Designing facility layout of an amusement arcade using market basket
analysis. Procedia Computer Science, 161.
https://doi.org/10.1016/j.procs.2019.11.165
Hameli, MSc. K. (2018). A Literature
Review of Retailing Sector and Business Retailing Types. ILIRIA
International Review, 8(1). https://doi.org/10.21113/iir.v8i1.386
Harshali, P. (2023). Enhancing Retail Strategies
through Apriori, ECLAT & FP Growth Algorithms in Market Basket
Analysis. International Journal on Recent and Innovation Trends in
Computing and Communication, 11(9), 3831–3838. https://doi.org/10.17762/ijritcc.v11i9.9637
Korfiatis, N., Stamolampros, P.,
Kourouthanassis, P., & Sagiadinos, V. (2019). Measuring service quality
from unstructured data: A topic modeling application on airline passengers’
online reviews. Expert Systems with Applications, 116.
https://doi.org/10.1016/j.eswa.2018.09.037
Kurniawan, A., & Suwaryo, N. (2023).
Analysis of the Apriori Algorithm for Enhancing Retail Product Staple Sales
Recommendations. International Journal Software Engineering and Computer
Science (IJSECS), 3(3). https://doi.org/10.35870/ijsecs.v3i3.1877
Kutuzova, T., & Melnik, M. (2018).
Market basket analysis of heterogeneous data sources for recommendation system
improvement. Procedia Computer Science, 136.
https://doi.org/10.1016/j.procs.2018.08.263
Lagorio, A., & Pinto, R. (2021). Food
and grocery retail logistics issues: A systematic literature review. Research
in Transportation Economics, 87.
https://doi.org/10.1016/j.retrec.2020.100841
Leedy, P. D., & Ormrod, J. E. (2018).
Practical research. Planning and design (11th ed.). Journal
of Applied Learning & Teaching, 1(2).
Lessmann, S., Haupt, J., Coussement, K.,
& De Bock, K. W. (2021). Targeting customers for profit: An ensemble
learning framework to support marketing decision-making. Information
Sciences, 557. https://doi.org/10.1016/j.ins.2019.05.027
Ngai, E. W. T., Xiu, L., & Chau, D. C.
K. (2009). Application of data mining techniques in customer relationship
management: A literature review and classification. Expert Systems with
Applications, 36(2), 2592–2602.
Nurmayanti, W. P., Sastriana, H. M.,
Rahim, A., Gazali, M., Hirzi, R. H., Ramdani, Z., & Malthuf, M. (2021).
Market Basket Analysis with Apriori Algorithm and Frequent Pattern Growth
(Fp-Growth) on Outdoor Product Sales Data. International Journal of
Educational Research & Social Sciences, 2(1).
https://doi.org/10.51601/ijersc.v2i1.45
Pearlmutter, D., Theochari, D., Nehls, T.,
Pinho, P., Piro, P., Korolova, A., Papaefthimiou, S., Mateo, M. C. G.,
Calheiros, C., Zluwa, I., Pitha, U., Schosseler, P., Florentin, Y., Ouannou,
S., Gal, E., Aicher, A., Arnold, K., Igondová, E., & Pucher, B. (2020).
Enhancing the circular economy with nature-based solutions in the built urban
environment: Green building materials, systems and sites. Blue-Green
Systems, 2(1). https://doi.org/10.2166/bgs.2019.928
Qisman, M., Rosadi, R., & Abdullah, A.
S. (2021). Market basket analysis using apriori algorithm to find consumer
patterns in buying goods through transaction data (case study of Mizan
computer retail stores). Journal of Physics: Conference Series, 1722(1).
https://doi.org/10.1088/1742-6596/1722/1/012020
Roggeveen, A. L., Nordfält, J., &
Grewal, D. (2016). Do Digital Displays Enhance Sales? Role of Retail Format
and Message Content. Journal of Retailing, 92(1).
https://doi.org/10.1016/j.jretai.2015.08.001
Roggeveen, A. L., & Sethuraman, R.
(2020). Customer-Interfacing Retail Technologies in 2020 & Beyond: An
Integrative Framework and Research Directions. In Journal of Retailing
(Vol. 96, Issue 3). https://doi.org/10.1016/j.jretai.2020.08.001
Sarker, I. H., Kayes, A. S. M., &
Watters, P. (2019). Effectiveness analysis of machine learning classification
models for predicting personalized context-aware smartphone usage. Journal
of Big Data, 6(1), 1–28.
Schneider, F., & Eriksson, M. (2020).
Food waste (and loss) at the retail level. In Routledge Handbook of Food
Waste. https://doi.org/10.4324/9780429462795-10
Sulianta, F., Madsu, Y. M., Syukriyah, Y.,
& Fahrezi, M. M. (2023). Konsumen Sebagai Co-Creation untuk Menentukan
Strategi Bisnis Menggunakan Algoritma Apriori pada Industri Retail Skala
Internasional. Jurnal Sistem Dan Teknologi Informasi (JustIN), 11(3).
https://doi.org/10.26418/justin.v11i3.67377
Vidhya, R. V., Tushar, S., & Swapnil,
W. (2019). A Review on Online Super-market Models And Customer
Interpretations. Journal of Emerging Technologies and Innovative Research.
First publication right: Syntax Literate: Jurnal Ilmiah Indonesia