1. Introduction
Soft set theory provides a robust and flexible framework for addressing problems characterized by uncertainty, ambiguity, or incomplete information. Unlike conventional set theories, which often struggle with imprecise data, soft set theory offers a more nuanced approach by associating elements of a universal set with a parameter set. This association creates a collection of sets that effectively captures the variability and imprecision inherent in real-world problems. First proposed by Molodtsov in 1999 [1] as a generalization and alternative to fuzzy set theory and fuzzy logic, soft set theory has since been extensively refined and expanded. Its capacity to model and manage complex decision-making scenarios where information is partially known or inherently vague has established it as a powerful tool across a wide range of disciplines, including artificial intelligence, economics, engineering, and data science.
Molodtsov developed soft set theory to fill the gaps left by classical and fuzzy set theories, which often struggled to handle complex uncertainty and ambiguity in practical situations. Soft set theory offered a pioneering approach to problems involving imprecise or incomplete information by introducing a more flexible and generalized framework. Following its introduction, the theory garnered widespread attention, leading to an acceleration of mathematical research and the exploration of its applications across numerous disciplines. In computer science, soft sets proved helpful for handling data uncertainty, while in decision theory, they provided enhanced models for complex, multi-criteria decision-making processes. Similarly, fields like data mining and artificial intelligence benefited from their capacity to process ambiguous information more effectively than traditional methods. Economists and biologists also found value in the theory, using it to model uncertainty in economic systems and biological data. Consequently, soft set theory has evolved into a versatile tool for addressing the intricate challenges of uncertain environments.
In the 2000s, soft set theory saw significant growth and refinement as researchers expanded its foundational concepts and explored new applications. Early on, Maji et al. [2] showcased practical applications for soft set theory, and the following year [3], they introduced essential concepts such as equality and subset relations within this framework. Chen et al. [4] worked on simplifying soft sets by reducing the number of parameters needed, aiming to streamline the computational aspects of soft sets. Ali et al. [5] added new operations between soft sets, further expanding the theory's scope. Kong et al. [6] introduced the concept of normal parameter reduction, along with algorithms that made this approach more applicable in decision-making processes using soft sets. In 2010, Cagman and Enginoglu [7] reexamined key operations between soft sets, working on the product of soft sets and introducing the uni-int decision-making method, which enhanced decision processes under uncertain conditions. The theory continued to evolve in the following years: Peng and Yang [8] developed algorithms based on interval-valued fuzzy soft sets for multi-criteria decision-making in stochastic environments. By 2019, their work extended into the use of inverse fuzzy soft sets in decision-making [9], further solidifying soft set theory as a valuable tool for tackling complex, real-world problems.
Soft sets provide an effective mathematical model for decision-making processes focused on selecting the best option, owing to their ability to express, within a single structure, which objects satisfy which parameters. This structure has been successfully utilized in numerous studies aimed at eliminating uncertainties. Recent works [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25] have shown the importance of soft sets as a tool in managing uncertainty and decision-making processes.
Since 2020, studies on soft set theory have made significant progress, particularly in uncertainty management and multi-criteria decision-making (MCDM) problems. During this period, soft set theory has been expanded, and hybrid models have been developed. Structures like fuzzy soft sets and interval-valued fuzzy soft sets have been effective in optimizing decision-making processes.
Additionally, research on filling missing data and parameter reduction has allowed soft sets to be used more efficiently with large datasets. These studies have developed various algorithms for predicting missing data, enabling decision-making processes to proceed despite this missing information.
These developments demonstrate that soft set theory is a flexible and powerful tool in decision sciences and uncertainty management.
While foundational methods, such as the algorithm proposed by Maji and Roy [3], provide a viable approach to decision-making with soft sets, they can suffer from a loss of sensitivity due to the summation process, potentially masking nuanced differences between alternatives. The inherent limitations of binary membership values in classical soft set decision-making have been noted in the literature. Some approaches, for instance, address this issue by redefining the membership concept to be relational [14]. Our work also addresses this fundamental challenge but introduces a distinct mathematical formulation through the relational membership function, $\mu$. This function not only captures relational aspects but is embedded within a framework that allows for its quantitative validation. Unlike conceptual models, we empirically demonstrate the effectiveness of our approach in enhancing decision granularity using the Gini Index and provide a comprehensive computational and sensitivity analysis, offering a robust and testable solution to the problem.
We develop and present the Relational Membership Value Calculation (RMVC), a novel algorithm that offers a more granular and sensitive scoring mechanism. The primary contribution of this work lies not only in the enhanced theoretical framework but also in its practical implementation as a fully automated, open-source computational tool, thereby increasing the transparency, reproducibility, and applicability of soft set theory in real-world decision-making scenarios. The enhanced granularity and robustness of the RMVC algorithm make it a promising component for integration into larger intelligent systems. Specifically, its ability to provide nuanced rankings could significantly improve the reliability of AI-driven applications such as multi-criteria expert systems, medical diagnostic tools, and personalized recommender systems, where ambiguity in decision-making can lead to suboptimal outcomes.
The remainder of this paper is organized as follows:
Section 2 reviews the fundamental concepts of soft set theory. Section 3 introduces the RMVC algorithm in detail, presents its core relational membership function, $\mu$, and provides a rigorous analysis of its theoretical properties, including computational complexity and robustness to data sparsity. Section 4 is dedicated to the validation of our algorithmic framework, where we first conduct a sensitivity analysis to test the model's stability under perturbation. Following this, Section 5 grounds our theoretical claims by applying the RMVC framework to an illustrative decision-making scenario; within the context of this real-world problem, we use the Gini Index to provide quantitative evidence that our algorithm achieves higher granularity compared to established methods. Finally, Section 6 concludes this paper with a summary of our findings and suggests directions for future research.
2. Preliminaries
This section presents the basic definitions and results of fuzzy sets, soft sets, and inverse soft sets required in the following sections. Detailed explanations of soft sets and inverse soft sets can be found in [5,7,26]. To improve clarity and ensure consistency in mathematical notation, we have made some adjustments to the definitions and representations found in the literature.
Definition 1.
Let $U$ be a universal set. A fuzzy set $F$ on $U$ is characterized by a membership function $\mu_F : U \to [0, 1]$, where for each $x \in U$, the value $\mu_F(x)$ indicates the degree to which $x$ belongs to $F$. This fuzzy set is denoted as follows: $F = \{(x, \mu_F(x)) : x \in U\}$. Here, $\mu_F$ is called the membership function of $F$, and the value $\mu_F(x)$ is called the grade of membership of $x \in U$.
Definition 2.
Assume that $U$ is a set of elements and $E$ is a set of parameters. An ordered pair $(\Phi, A)$ is called a soft set over $U$, where $A \subseteq E$ and $\Phi$ is a mapping given by $\Phi : A \to \mathcal{P}(U)$, where $\mathcal{P}(U)$ is the power set of $U$. In this study, we specifically focus on the case where $A = E$, and the decision process involves only a single decision-maker.
A soft set $(\Phi, E)$ over $U$ can be represented as the set of ordered pairs $(\Phi, E) = \{(e, \Phi(e)) : e \in E\}$.
Definition 3.
Assume that $U$ is a set of elements and $E$ is a set of parameters. An ordered pair $(U, \psi)$ is called an inverse soft set (ISS) over $E$, where $\psi$ is a mapping given by $\psi : U \to \mathcal{P}(E)$.
An inverse soft set $(U, \psi)$ over $E$ can be represented by the set of ordered pairs $(U, \psi) = \{(u, \psi(u)) : u \in U\}$.
As established by Cetkin [26], each soft set can be uniquely represented as an inverse soft set, and furthermore, each fuzzy set generates an inverse soft set.
3. A Different Approach to Decision-Making Under Uncertainty via Python Code
In this section, we redefine the relational and inverse relational membership functions to enhance sensitivity and accuracy in decision-making. Furthermore, to obtain more exact and reliable results in soft set theory, we developed a program called the “Relational Membership Value Calculator” (RMVC) using Python 3.13.3.
Definition 4.
Let $U$ be a universal set and let $E = \{e_1, e_2, \ldots, e_m\}$ be a set of parameters. Let $(\Phi, E)$ be a soft set over $U$. For each $e_i \in E$, and for any $k \in U \setminus \Phi(e_i)$, the relational membership value of $k$ with respect to $\Phi(e_i)$ is defined via the mapping $\mu_{\Phi(e_i)} : U \setminus \Phi(e_i) \to [0, 1]$. This mapping, called the relational membership function, assigns a membership value to each element $k$ not belonging to $\Phi(e_i)$ and given by
$$\mu_{\Phi(e_i)}(k) = \frac{\sum_{j = 1,\, j \neq i}^{m} \; \sum_{u \in \Phi(e_i)} \delta_{e_j}(k, u)}{|\Phi(e_i)| \, (m - 1)},$$
where $k \in U \setminus \Phi(e_i)$, $u \in \Phi(e_i)$, and $m \geq 2$. Here, for $k, u \in U$, the Kronecker delta function is defined as
$$\delta_{e_j}(k, u) = \begin{cases} 1, & \text{if } k \in \Phi(e_j) \text{ and } u \in \Phi(e_j), \\ 0, & \text{otherwise}. \end{cases}$$
Thus, the relational membership function measures how strongly an element $k$ (not originally in $\Phi(e_i)$) is connected to the elements of $\Phi(e_i)$ through other parameters.
Theorem 1.
The proposed relational membership function, $\mu_{\Phi(e_i)}$, is consistent and monotonic.
Proof. The proof is structured in two parts, demonstrating the properties of consistency and monotonicity based on the function’s definition.
1. Consistency: This property requires the function to produce values within a predictable, well-defined range. We will prove that the function is always bounded within the interval $[0, 1]$.
The function is defined as
$$\mu_{\Phi(e_i)}(k) = \frac{\sum_{j \neq i} \sum_{u \in \Phi(e_i)} \delta_{e_j}(k, u)}{|\Phi(e_i)| \, (m - 1)}.$$
Lower bound: The Kronecker delta function, $\delta_{e_j}(k, u)$, can only take values of 0 or 1. Therefore, the double summation in the numerator must be greater than or equal to zero. The denominator, which consists of the cardinality of a set ($|\Phi(e_i)|$) and the number of other parameters ($m - 1$), is always a positive integer (since $m \geq 2$). Thus, the entire expression is non-negative: $\mu_{\Phi(e_i)}(k) \geq 0$.
Upper bound: To find the maximum possible value, consider the numerator. The outer sum runs over the $m - 1$ parameters in $E \setminus \{e_i\}$, and the inner sum runs over the $|\Phi(e_i)|$ elements of $\Phi(e_i)$. In the most extreme case, $\delta_{e_j}(k, u)$ would be 1 for every single term in the sum. The maximum value of the double summation is therefore the total number of terms, which is $(m - 1) \, |\Phi(e_i)|$. Substituting this maximum value into the function gives
$$\mu_{\Phi(e_i)}(k) \leq \frac{(m - 1) \, |\Phi(e_i)|}{|\Phi(e_i)| \, (m - 1)} = 1.$$
Since the numerator can never exceed this value, we have $\mu_{\Phi(e_i)}(k) \leq 1$.
Combining these bounds confirms that $0 \leq \mu_{\Phi(e_i)}(k) \leq 1$, proving that the function is consistent.
2. Monotonicity: In the context of this function, monotonicity relates specifically to the co-occurrence relationship. The property implies that as the relational connection of an element $k$ to the set $\Phi(e_i)$ increases, its membership value must also increase. The relational connection is measured by the sum of the $\delta_{e_j}(k, u)$ values.
Let the total co-occurrence count be $C(k) = \sum_{j \neq i} \sum_{u \in \Phi(e_i)} \delta_{e_j}(k, u)$. The function can be written as
$$\mu_{\Phi(e_i)}(k) = \frac{C(k)}{|\Phi(e_i)| \, (m - 1)}.$$
The denominator is a positive constant for a given $e_i$. Therefore, $\mu_{\Phi(e_i)}(k)$ is directly proportional to the co-occurrence count $C(k)$. If any event occurs that increases the relational connection (e.g., for a previously non-matching pair under a parameter $e_j$, a change causes them to co-occur, so $\delta_{e_j}(k, u)$ changes from 0 to 1), the value of $C(k)$ increases. Consequently, the value of $\mu_{\Phi(e_i)}(k)$ also increases.
This direct proportionality demonstrates that the function is monotonic with respect to the strength of the relational connections, which is the core concept of the proposed function. □
Example 1.
Let $U = \{1, 2, 3, 4, 5\}$ be a set of elements and $E$ be a set of parameters. Given the parameter sets $\Phi(e_i)$ assigned to the parameters in $E$, the corresponding soft set is $(\Phi, E) = \{(e_i, \Phi(e_i)) : e_i \in E\}$. We now calculate all the relational membership values for this soft set.
First, we consider the parameter $e_1$ (for $i = 1$). The only element in $U$ but not in $\Phi(e_1)$ receives its value directly from the formula in Definition 4. The remaining relational values were calculated in a similar manner, and the complete set of results can be compiled into a matrix for a more comprehensive representation, as shown in Table 1. The discriminative power of the RMVC algorithm is inherently linked to the relational richness of the input data. The core of the method, the relational membership value $\mu_{\Phi(e_i)}(k)$, quantifies the strength of an alternative's relationship with other alternatives for each parameter set. This calculation relies on the co-occurrence of alternatives as measured by the $\delta$ function.
In highly sparse data environments, where parameter sets $\Phi(e_i)$ contain few members and have minimal overlap, the opportunities for such co-occurrences diminish significantly. Consequently, the numerator of the $\mu$ formula, representing the sum of relational interactions, tends toward smaller values. For any non-member of a given parameter set, its calculated relational score will therefore converge towards zero.
This leads to a critical boundary condition. As data becomes sparser, the resulting matrix becomes increasingly polarized and dominated by binary-like values: “1” for members and values approaching “0” for non-members. In such a state, the nuanced, relational membership values that give the RMVC method its depth are reduced, and the algorithm’s behavior begins to approximate a simple inclusion–exclusion check rather than a rich relational analysis. While the algorithm remains structurally sound, its ability to differentiate finely between non-members is consequently reduced. This highlights an important dependency: the full potential of the RMVC method is realized when the input soft set is sufficiently dense to generate a strong relational signal.
To facilitate the practical application of our method (algorithmic framework), we provide a reference implementation in Python. The following code snippets illustrate the core of the RMVC algorithm, especially demonstrating how user input for the universal set $U$ is processed. The complete open-source program is publicly available for researchers and practitioners around the globe who are interested in further details or direct application. The repository can be accessed at https://github.com/DrDayioglu/RMVC.git (accessed on 20 August 2025).
The initial step involves defining the universal set $U$. The following Python function is used to capture this input from the user:
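A minimal sketch of such an input routine is given below; the function name and prompt wording are illustrative, and the authoritative listing is the one in the repository linked above.

```python
def read_universal_set():
    """Read the universal set U as a comma-separated line, e.g. '1,2,3,4,5'."""
    raw = input("Enter the elements of the universal set U (comma-separated): ")
    # Strip whitespace and discard empty tokens.
    return [token.strip() for token in raw.split(",") if token.strip()]
```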
Next, the parameter set $E$ and the corresponding subsets of $U$ must be defined. Since the soft set mapping, $\Phi : E \to \mathcal{P}(U)$, assigns a subset of $U$ (an element of the power set, $\mathcal{P}(U)$) to each parameter in $E$, the program must first be capable of generating the power set of $U$ to construct these relationships.
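One way to sketch this step, using the standard itertools recipe for power sets (again illustrative rather than the repository's exact code):

```python
from itertools import chain, combinations

def power_set(universe):
    """Return all subsets of U, i.e. the power set P(U), as tuples."""
    return list(chain.from_iterable(
        combinations(universe, r) for r in range(len(universe) + 1)))
```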
Next, we define the summation part of the delta function, which is crucial for this newly presented method.
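A direct transcription of the double summation over the Kronecker delta might look as follows (a sketch; names are illustrative):

```python
def delta_sum(k, phi_ei, other_sets):
    """Double summation of delta_{e_j}(k, u): counts, over every other
    parameter set, how often k co-occurs with each u in Phi(e_i)."""
    total = 0
    for phi_ej in other_sets:      # parameter sets Phi(e_j), j != i
        for u in phi_ei:           # elements of Phi(e_i)
            if k in phi_ej and u in phi_ej:
                total += 1         # delta_{e_j}(k, u) = 1
    return total
```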
To ensure data consistency and improve readability, the input data is first converted into a list of sets using the following code snippet:
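A sketch of this conversion (names illustrative):

```python
def to_sets(raw_subsets):
    """Normalize user-supplied parameter subsets (lists/tuples) into sets."""
    return [set(subset) for subset in raw_subsets]
```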
Next, each parameter set is mapped to a unique identifier to facilitate programming access. This is achieved with the following code.
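For instance, each set can be keyed by an identifier of the form e1, e2, ... (an illustrative sketch):

```python
def label_parameter_sets(subsets):
    """Map each parameter set Phi(e_i) to the identifier 'e{i}'."""
    return {f"e{i}": set(s) for i, s in enumerate(subsets, start=1)}

# Example: label_parameter_sets([[1, 2], [2, 3, 4]]) -> {'e1': {1, 2}, 'e2': {2, 3, 4}}
```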
The core calculation is then executed. The following loop iterates through each parameter set to compute the final relational membership values.
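Putting the pieces together, a sketch of this main loop, building on the illustrative helpers above, is:

```python
def relational_memberships(universe, parameter_sets):
    """Compute mu_{Phi(e_i)}(k) for every parameter set and every k outside it.
    `parameter_sets` is a dict such as the one built by label_parameter_sets()."""
    m = len(parameter_sets)
    names = list(parameter_sets)
    mu = {}
    for i, ei in enumerate(names):
        phi_ei = parameter_sets[ei]
        others = [parameter_sets[ej] for j, ej in enumerate(names) if j != i]
        denom = len(phi_ei) * (m - 1)
        for k in universe:
            if k in phi_ei:
                continue  # mu is defined only for non-members of Phi(e_i)
            mu[(ei, k)] = delta_sum(k, phi_ei, others) / denom if denom else 0.0
    return mu
```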
The complete output generated by the "Relational Membership Value Calculator" (RMVC) for this example is given in Figure 1.
3.1. Computational Complexity Analysis
To evaluate the scalability of the RMVC algorithm for large-scale soft sets, we analyze its theoretical time complexity. Let $n$ be the number of elements (candidates), $m$ be the number of parameters (criteria), and $c$ be the average cardinality of a parameter set $\Phi(e_i)$.
The algorithm’s complexity is determined by three primary computational steps:
Calculation of the membership matrix ($M$): This is the most computationally intensive step. To calculate a single value, the algorithm iterates through all $m - 1$ other parameters and, for each, iterates through the (on average $c$) elements of $\Phi(e_i)$. This results in a complexity of approximately $O(m \cdot c)$ for a single entry. Since there are $n \cdot m$ entries to compute, the total complexity for this step is $O(n \cdot m^2 \cdot c)$.
Calculation of final scores: This step involves summing the n columns of the matrix, which has a complexity of $O(n \cdot m)$.
Selection of the optimal candidate: Finding the maximum value among n scores takes $O(n)$ time.
The overall time complexity is determined by the most demanding step, which is the calculation of the membership matrix. Therefore, the computational complexity of the RMVC algorithm is approximately $O(n \cdot m^2 \cdot c)$.
This analysis indicates that the algorithm’s performance scales linearly with the number of candidates (n) but quadratically with the number of parameters (m). This suggests that the RMVC is well-suited for problems with a very large number of candidates but may become computationally expensive for applications involving an extremely high number of evaluation criteria.
3.2. Sensitivity Analysis
A crucial attribute of any clustering algorithm is its robustness against small perturbations in the input data. For the RMVC, the input consists of the parameter sets $\Phi(e_1), \ldots, \Phi(e_m)$. Sensitivity analysis for our model, therefore, involves examining how the final clustering output reacts to minor changes within these sets, such as the addition or removal of a single element.
3.2.1. Relational Similarity Matrix
The relational membership function, $\mu$, produces a relational membership value that indicates the association of an element $u$ with a specific parameter set $\Phi(e_i)$. To obtain a complete relational profile of an element, we collect these values across all parameter sets. We define this profile as the Relational Membership Vector.
For any element $u \in U$, its membership vector $V(u)$ is an $m$-dimensional vector where each component is the element's relational membership value with respect to one parameter set:
$$V(u) = \big(\mu_{\Phi(e_1)}(u), \mu_{\Phi(e_2)}(u), \ldots, \mu_{\Phi(e_m)}(u)\big).$$
Clustering requires us to measure the pairwise similarity between any two elements, $u$ and $v$. A natural way to achieve this is by comparing their respective membership vectors, $V(u)$ and $V(v)$. We define a similarity matrix $S$, where the entry $S(u, v)$ is computed by measuring the average L1-distance between these vectors. A smaller distance implies higher similarity. The formula is defined as
$$S(u, v) = 1 - \frac{1}{m} \sum_{i=1}^{m} \big| \mu_{\Phi(e_i)}(u) - \mu_{\Phi(e_i)}(v) \big|.$$
Here, $S(u, v) = 1$ indicates that the elements have identical membership vectors (perfect similarity), while a value approaching 0 indicates high dissimilarity. This matrix $S$ provides the foundational input for the final graph-partitioning step and for the sensitivity analysis that follows.
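Assuming the membership vectors $V(u)$ are stacked as the rows of an $n \times m$ NumPy array, a minimal sketch of this computation is:

```python
import numpy as np

def similarity_matrix(V):
    """Pairwise similarity S(u, v) = 1 - mean(|V(u) - V(v)|) from the
    (n x m) array V of relational membership vectors."""
    V = np.asarray(V, dtype=float)
    diff = np.abs(V[:, None, :] - V[None, :, :])   # (n, n, m) componentwise |.|
    return 1.0 - diff.mean(axis=2)
```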
Let us consider a small perturbation in a single parameter set $\Phi(e_p)$, resulting in a new set $\Phi'(e_p)$. This change propagates through our model in a clear sequence:
1. Change in membership values: A perturbation in a single parameter set, say from $\Phi(e_p)$ to $\Phi'(e_p)$, propagates its effect to all relational membership values across the system, not just one. This happens in two ways:
Direct effect (on $\mu_{\Phi(e_p)}$): The membership function associated directly with the changed parameter, $\mu_{\Phi(e_p)}$, is the most impacted. Both its denominator, which contains the term $|\Phi(e_p)|$, and its numerator, which sums over $\Phi(e_p)$, are altered. This requires a full recalculation of $\mu_{\Phi(e_p)}(k)$ for all elements $k$.
Indirect effect (on $\mu_{\Phi(e_i)}$ for all $i \neq p$): The relational membership value for any other parameter, $e_i$, is also affected. Its calculation involves a summation over all parameters $e_j \in E \setminus \{e_i\}$. Since $i \neq p$, this collection of parameters necessarily includes $e_p$. Therefore, the term $\delta_{e_p}(k, u)$ within the sum will now use the perturbed set $\Phi'(e_p)$. This change in a single delta term will cause a small but non-zero change in the value of $\mu_{\Phi(e_i)}(k)$, rippling the effect of the initial perturbation throughout the entire system.
2. Change in relational similarity matrix: Since the relational similarity matrix $S$ is constructed from all $\mu$ values, the changes described above will lead to a modified similarity matrix $S'$.
3. Potential change in final partition: The final clustering partition $P$ is derived from $S$. A change from $S$ to $S'$ may or may not be significant enough to alter the final partition $P$.
The relational nature of the $\mu$ function provides a degree of inherent robustness. The membership value is an average of co-occurrence information calculated over all other parameters ($m - 1$ of them) and all elements within a set. Therefore, a single change (one $\delta$ value flipping from 0 to 1, or the addition/removal of one element) has a diluted effect on the final value rather than causing a catastrophic shift.
To quantitatively measure this sensitivity, one can perform the following analysis:
Let $S$ be the original similarity matrix computed with the initial parameter sets $\Phi(e_i)$.
Introduce a single perturbation, for instance, by adding one element to a single set, to get $\Phi'(e_p)$.
Compute the new similarity matrix $S'$.
The magnitude of the change can be measured by the Frobenius norm of the difference between the matrices: $\Delta = \|S - S'\|_F$. A lower sensitivity score indicates higher robustness.
Furthermore, one can compare the final clustering partitions $P$ (from $S$) and $P'$ (from $S'$) using an external validation index like the Adjusted Rand Index (ARI). An ARI score of 1 would signify that the small perturbation had no effect on the final clustering result, demonstrating perfect stability for that specific change.
This analytical framework allows for a systematic evaluation of the algorithm’s stability, confirming that the RMVC is not overly sensitive to minor variations in its input parameterization, a desirable trait for real-world applications.
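This evaluation pipeline is straightforward to script. A sketch using NumPy and scikit-learn (the latter is an assumed dependency here, used only for the ARI) follows; `labels` encodes a partition as one cluster label per element, with outliers as singleton labels.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score

def sensitivity_report(S, S_perturbed, labels, labels_perturbed):
    """Quantify a perturbation's effect: Frobenius norm ||S - S'||_F for the
    similarity structure, Adjusted Rand Index for the final partition."""
    frobenius = np.linalg.norm(np.asarray(S) - np.asarray(S_perturbed))
    ari = adjusted_rand_score(labels, labels_perturbed)
    return frobenius, ari
```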
3.2.2. A Practical Case Study of Sensitivity
To provide a concrete illustration of the theoretical sensitivity framework, we apply it step-by-step to the setup from Example 1. This practical analysis demonstrates how to compute the similarity matrix, introduce a change to the system, and quantitatively measure the impact on the final clustering outcome.
Step 1: Establishing the Baseline with the Original Data
Our first task is to establish a baseline result using the original, unperturbed data from Example 1. This involves computing the relational similarity matrix, S, using the formula defined in the previous section.
To ensure complete transparency and reproducibility, this section provides a detailed, step-by-step calculation of each unique entry in the similarity matrix S for the setup in Example 1. This detailed demonstration is intended to showcase the mechanics of the methodology once. For the sake of abbreviation, in subsequent sections, similar intermediate calculation steps will be omitted and only the final results will be presented.
Step 1: Constructing Relational Membership Vectors.
First, we construct the relational membership vectors, $V(u)$, for every element $u \in U$, from the membership values computed in Example 1. All subsequent calculations are based on these vectors.
Step 2: Pairwise similarity calculations.
We now calculate the similarity $S(u, v)$ for each pair of elements using the formula defined in Section 3.2.1. To illustrate the process, a few representative entries were computed in detail; the remaining entries of the similarity matrix $S$ were calculated using the same procedure, repeated for all pairs. The completed results form the final similarity matrix, which is presented in the next step.
Step 3: Assembling the final similarity matrix.
Compiling all the calculated similarity values yields the final similarity matrix, $S$:
S | 1 | 2 | 3 | 4 | 5
1 | 1 | 0.750 | 0.917 | 0.555 | 0.750
2 | 0.750 | 1 | 0.667 | 0.528 | 1
3 | 0.917 | 0.667 | 1 | 0.639 | 0.667
4 | 0.555 | 0.528 | 0.639 | 1 | 0.528
5 | 0.750 | 1 | 0.667 | 0.528 | 1
Applying a graph-partitioning method to this matrix, we can derive the initial clustering partition, $P$. For this analysis, we define a "strong tie" as any connection with a similarity score greater than $0.85$. An analysis of the strong ties in matrix $S$ reveals the following:
The similarity $S(1, 3)$ is 0.917 (>0.85), indicating a strong tie. This forms the core cluster: $\{1, 3\}$.
The similarity $S(2, 5)$ is 1 (>0.85), also indicating a strong tie. This forms a second distinct cluster: $\{2, 5\}$.
No other similarity values exceed the threshold. Element "4" does not form a strong tie with any other element and is therefore considered an outlier.
This analysis yields the following initial partition $P$:
$$P = \big\{ \{1, 3\}, \{2, 5\}, \{4\} \big\}.$$
This partition consists of two distinct, cohesive clusters and a single outlier element.
Step 2: Defining and Applying the Perturbation
Next, a minimal, localized perturbation is introduced to the system to test its stability. We modify one parameter set, $\Phi(e_p)$, by removing the element "4". The new, perturbed set is denoted as $\Phi'(e_p)$. All other parameter sets remain unchanged.
Step 3: Calculating the Post-Perturbation Outcome
Following this change, the relational membership vectors ($V(u)$) and, consequently, the entire similarity matrix must be recalculated. After performing all calculations with the perturbed data, we compute the new similarity matrix, $S'$:
S' | 1 | 2 | 3 | 4 | 5
1 | 1 | 0.750 | 1 | 0.472 | 0.750
2 | 0.750 | 1 | 0.750 | 0.528 | 1
3 | 1 | 0.750 | 1 | 0.472 | 0.750
4 | 0.472 | 0.528 | 0.472 | 1 | 0.528
5 | 0.750 | 1 | 0.750 | 0.528 | 1
Step 4: Quantitative Analysis and Interpretation
The final step is to quantitatively compare the pre- and post-perturbation results to measure the algorithm’s stability.
Frobenius norm: To measure the overall change in the similarity structure, we calculate the Frobenius norm of the difference between the original matrix $S$ and the perturbed matrix $S'$; from the two matrices above, $\|S - S'\|_F \approx 0.333$. This low value indicates that the perturbation did not cause a drastic shift; the overall relational structure between elements remained highly stable.
Adjusted Rand Index (ARI): To measure the effect on the final partition, we compare the original partition, $P$, with the new partition, $P'$, derived from $S'$. Applying the same partitioning logic (a similarity threshold of $0.85$) to $S'$ yields an identical partition:
$$P' = \big\{ \{1, 3\}, \{2, 5\}, \{4\} \big\}.$$
The new partition is identical to the original. The ARI score, which measures the agreement between two data clusterings, confirms this perfect stability: $\mathrm{ARI}(P, P') = 1.0$.
An ARI score of 1.0 signifies that the final output was completely unaffected by the perturbation. This practical demonstration validates the robustness of the RMVC algorithm, showcasing its resilience against minor variations in the input data.
4. A New Decision-Making Approach
In this section, building on the new technical formulation for soft sets presented in Section 3, we introduce a new algorithmic method for decision-making in uncertain environments, which essentially outlines our Python code for the RMVC. We develop Algorithm 1, which applies our new formulation to this context.
Algorithm 1: Determine the best choice in a given soft set.
Input: Let $U = \{u_1, \ldots, u_n\}$ be the set of elements and let $E = \{e_1, \ldots, e_m\}$ be the set of parameters. Given a soft set $(\Phi, E)$, where $\Phi : E \to \mathcal{P}(U)$.
Step 1: Computing all relational membership values using the function $\mu$ as defined in the new formula.
Step 2: Constructing the matrix $M = [m_{ij}]$, which contains the membership values derived from $\mu$, where $1 \leq i \leq m$ and $1 \leq j \leq n$, and $m_{ij}$ is defined as
$$m_{ij} = \begin{cases} 1, & u_j \in \Phi(e_i), \\ \mu_{\Phi(e_i)}(u_j), & u_j \notin \Phi(e_i). \end{cases}$$
Here, $\mu$ represents the relational membership function.
Step 3: Calculating the score $s_j$ of elements using the following formula: $s_j = \sum_{i=1}^{m} m_{ij}$.
Step 4: Selecting the element $r$ with the highest score, namely, the one for which $s_r = \max_{1 \leq j \leq n} s_j$.
A fundamental aspect of the proposed framework is the choice of using a raw sum for the final score aggregation ($s_j = \sum_i m_{ij}$). This decision was made deliberately, prioritizing interpretability and parameter-free objectivity in this foundational presentation of the RMVC model while acknowledging that more complex aggregation methods exist.
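For concreteness, a sketch of Steps 2 through 4, reusing the illustrative relational_memberships helper from Section 3, might read:

```python
import numpy as np

def rmvc_decision(universe, parameter_sets):
    """Algorithm 1, Steps 2-4: unified membership matrix M, column-sum
    scores s_j, and the arg-max choice."""
    mu = relational_memberships(universe, parameter_sets)  # Step 1 (see Section 3)
    names = list(parameter_sets)
    M = np.array([[1.0 if k in parameter_sets[ei] else mu[(ei, k)]
                   for k in universe] for ei in names])    # Step 2
    scores = M.sum(axis=0)                                 # Step 3: s_j
    best = universe[int(np.argmax(scores))]                # Step 4
    return M, scores, best
```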
4.1. Score Resolution and Tie-Breaking Mechanism
A critical attribute of any decision-making algorithm is its ability to produce a clear and decisive ranking and to minimize the occurrence of ties. The proposed RMVC scoring mechanism holds a significant theoretical advantage in this regard compared to traditional soft set approaches.
Many conventional methods in soft set theory, such as the one proposed by Maji and Roy, rely on scoring systems based on integer counts. In such systems, a candidate's score is typically the total number of criteria they satisfy. This restricts the set of possible scores to a small, discrete set of integers, $\{0, 1, 2, \ldots, m\}$. When the range of possible outcomes is limited, the probability of different candidates achieving the exact same score is inherently higher.
In contrast, the RMVC methodology is based on the relational membership function, $\mu$, which produces rational numbers as values. The final score for a candidate, $s_j$, is the sum of these granular values. This process maps the candidates to a much larger and denser set of possible scores. Because the image set of the scoring function is significantly larger and denser, the probability of a coincidental tie between two distinct candidates is substantially reduced. While an absolute guarantee against ties is not possible, the structural nature of the RMVC renders them a rare occurrence.
To ensure the algorithm is robust and deterministic even in the rare case of a tie, the final selection is formally defined as a two-tiered process:
Primary criterion: The primary selection criterion is the relational score $s_j$. The optimal candidate, $r$, is the one that maximizes the score: $s_r = \max_{1 \leq j \leq n} s_j$.
Secondary (tie-breaking) criterion: If two or more candidates share an identical maximal score, a secondary tie-breaking criterion is applied. For this, we use a simpler choice value, $c_j$, representing the total count of parameter sets to which candidate $u_j$ belongs. The candidate with the higher $c_j$ value among the tied candidates is chosen. If the tie persists even after this secondary criterion, the candidates are considered equally optimal and can be presented as a set of co-optimal choices.
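The two-tiered rule can be sketched as follows (scores given as a dict from candidate to $s_j$; names illustrative):

```python
def select_with_tiebreak(scores, parameter_sets):
    """Primary: maximal relational score s_j. Secondary: choice value c_j,
    the number of parameter sets containing the candidate."""
    top = max(scores.values())
    tied = [u for u, s in scores.items() if s == top]
    if len(tied) == 1:
        return tied                                  # unique optimum
    c = {u: sum(u in ps for ps in parameter_sets.values()) for u in tied}
    best_c = max(c.values())
    return [u for u in tied if c[u] == best_c]       # possibly co-optimal set
```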
Example 2.
Let $U = \{1, 2, \ldots, 7\}$ be a set of candidates and $E$ be a set of evaluation criteria for a specific job position. For each $i$, the element $e_i \in E$ represents a required attribute, such as "higher education", "experience", "foreign language proficiency", etc.
The decision-making problem is to select the optimal candidate for the position.
The soft set is defined by the parameter sets $\Phi(e_i)$ and can be expressed as the set of ordered pairs $(\Phi, E) = \{(e_i, \Phi(e_i)) : e_i \in E\}$. We proceed by calculating all relational membership values for $(\Phi, E)$, which will then be compared with the output of the RMVC software implementation.
Since some candidates do not satisfy the first criterion $e_1$, their corresponding relational membership values are calculated from the formula in Definition 4. Similarly, the candidates that do not satisfy the second criterion $e_2$ obtain their relational membership values in the same way, and the complete set of remaining relational membership values follows analogously. This completes Step 1.
Following Step 2, we prepare the matrix $M$. In Step 3, the final scores $s_j$ are calculated for each candidate. In Step 4, the optimal choice is determined by selecting the candidate with the maximum score. In this case, $s_3$ is the maximum value, indicating that candidate 3 is the optimal choice.
These results were cross-validated using our software implementation. Feeding the same input data into the RMVC tool yielded identical scores, as shown in Figure 2. Although the general sensitivity analysis framework discussed in Section 3 already validated the robustness of the RMVC algorithm, we will now perform a similar perturbation on the decision-making problem of Example 2 to reiterate and reinforce this principle. The objective is to validate whether the optimal choice, candidate 3, maintains its rank under a minor perturbation. To demonstrate a case of stability, we introduce a specific perturbation and analyze its consequences.
As a test case, we assume that candidate 5 now satisfies the criterion $e_1$ ("higher education"), modifying the parameter set to $\Phi'(e_1) = \Phi(e_1) \cup \{5\}$. This change requires a full recalculation of relational scores. Table 2 compares the original scores with the new scores generated by the RMVC software following this perturbation.
As the results demonstrate, the perturbation had a significant and appropriate impact on the score of the directly targeted candidate (candidate 5), increasing it from 2.55 to 3.87. Furthermore, due to the model’s relational nature, the scores of most other candidates were also slightly adjusted.
Despite these adjustments, the ranking of the top candidates remained stable. Candidate 3 retained the highest score, and thus the optimal decision was unchanged. This analysis confirms that while the model is appropriately responsive to minor changes, selecting the optimal candidate exhibits strong robustness against this type of data variation.
4.1.1. Comparison Analysis
To contextualize the performance of our method, we compare its results with those obtained from the traditional algorithm proposed by Maji and Roy [3], using the decision-making problem from Example 2. The algorithm from [3] produces a candidate ranking that contains several ties, whereas our RMVC algorithm yields a fully resolved ranking with no ties. This immediate result demonstrates that the RMVC method provides a more granular and decisive ordering of candidates.
In this study, we define granularity as the ability of a scoring method to discriminate between alternatives by reducing ties and producing a more finely distributed score set. To further deepen the analysis of score granularity, we now compare the structural characteristics of the score distributions generated by the RMVC method and the traditional Maji–Roy approach. For this purpose, we treat the set of scores from each method as a discrete probability distribution by normalizing them (i.e., dividing each score by the total sum). We then compute the Gini Index for each resulting distribution to provide a standardized measure of concentration. Unlike the conventional interpretation where higher Gini values are associated with greater inequality, our framework interprets a lower Gini Index as an indicator of higher granularity, since it reflects a less concentrated and more evenly spread distribution of scores.
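A small helper that performs this normalization and computes the Gini Index is sketched below (assuming the standard discrete Gini formulation over a sorted distribution):

```python
import numpy as np

def gini_index(scores):
    """Gini Index of a set of scores, treated as a normalized distribution.
    Lower values are read here as a more evenly spread (more granular) score set."""
    p = np.sort(np.asarray(scores, dtype=float))
    p = p / p.sum()                     # normalize to a probability distribution
    n = len(p)
    i = np.arange(1, n + 1)             # 1-based ranks of the sorted values
    return (2 * np.sum(i * p) - (n + 1)) / n
```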
This analysis is conducted on the decision-making problem of Example 2, as it provides a set of scores for each candidate.
4.1.2. Analysis of the Maji–Roy (Choice Value) Distribution
The traditional method produces scores based on the count of criteria satisfied by each candidate:
Raw scores ($c_j$): The scores of the seven candidates are their choice values, i.e., the number of criteria each candidate satisfies.
Normalization: The sum of these scores is 19. We normalize each score by dividing by 19 to get a probability distribution.
Gini Index calculation: We first sort the normalized probabilities in ascending order; applying the Gini Index formula to this distribution yields $G_{\text{MR}} = 0.210$.
4.1.3. Analysis of the RMVC Distribution
Next, we perform the same analysis on the scores generated by our proposed RMVC method:
Raw scores ($s_j$): As presented in Example 2, the scores are the relational scores of the seven candidates.
Normalization: We normalize each score by dividing by the total sum to get the probability distribution.
Gini Index calculation: We first sort the normalized probabilities in ascending order; applying the Gini Index formula to this distribution yields $G_{\text{RMVC}} = 0.192$.
4.1.4. Conclusion of Comparative Analysis
The comparative results are summarized below in Table 3.
The results provide consistent quantitative evidence supporting our central claim. The Gini Index for the RMVC scores (0.192) is lower than that of the traditional count-based method (0.210). Even in this small-scale example, this represents a relative reduction in score concentration of approximately 9%. This suggests that the relational membership function maps candidates to a more distributed score space, structurally reducing the probability of ties and enabling a more discriminative basis for decision-making. While the effect size is modest in absolute terms, its presence even in a minimal test case indicates the method’s potential for enhanced granularity in broader applications.
Example 3.
To demonstrate the flexibility of the RMVC method, we now analyze the inverse problem from Example 2. In this inverse soft set (ISS) formulation, the roles of the sets are reversed: the parameter role is now played by the candidates, while the element role is played by the criteria (e.g., "higher education", "experience", "foreign language", etc.). The decision-making problem is now to determine which candidate is most strongly supported by the available pool of criteria.
Given the corresponding inverse mapping $\psi$, the ISS is written as the set of ordered pairs $\{(u, \psi(u))\}$. Utilizing the RMVC allows us to derive the complete set of relational membership values. When we organize these values into a matrix format as outlined in our algorithm, we subsequently obtain the corresponding unified membership matrix. As expected from the model's asymmetric nature, this matrix is not the transpose of the matrix from the forward problem. The final scores for each candidate (now parameters) are calculated in the same manner, and the maximum score indicates the optimal choice in the inverse formulation. This result suggests that, from the perspective of the available criteria, the selected candidate has the strongest overall profile.
4.2. A Small Note on the Asymmetry of the Relational Model
In classical soft set theory, which uses a binary incidence matrix (where entries are 1 or 0), the matrix of the inverse soft set is indeed the transpose of the forward soft set’s matrix. This is because the underlying relationship is a simple, symmetric “belongs to” check.
However, the RMVC method does not operate on this simple incidence matrix. Instead, it generates a relational membership matrix, $M$, where each entry is a calculated, real-valued score. The calculation of this score is fundamentally asymmetric. The formula computes the membership of a candidate $k$ with respect to a parameter $e_i$ by analyzing the co-occurrence patterns of $k$ across the entire universe of other parameters ($E \setminus \{e_i\}$).
In an "inverse" relational problem, i.e., calculating the membership of a parameter $e$ with respect to a candidate $u$, the roles of $U$ and $E$ would be interchanged in the formula. The calculation would then be based on the co-occurrence patterns of $e$ across the universe of other candidates ($U \setminus \{u\}$). This would lead to a completely different set of calculations and a resulting matrix that is not a simple transpose of the former relational matrix.
This asymmetry does not impact model consistency; on the contrary, it is the source of the model’s novelty and power. It ensures that the “view” from the perspective of parameters (how candidates relate to a criterion) and the “view” from the perspective of candidates (how criteria relate to a candidate) are distinct and rich with contextual information. This is a deliberate and core feature of the relational model, not an inconsistency.
We now compare the ranking produced by our inverse RMVC analysis with the ranking from a traditional count-based approach [3] for the same inverse problem. The count-based method, which relies on integer scores, results in a ranking with extensive ties; in sharp contrast, the RMVC algorithm produces a fully resolved and granular ranking.
This comparison again highlights the discriminative power of the RMVC method. By moving beyond simple integer counts to a relational, real-valued scoring system, the algorithm effectively breaks ties and provides a more detailed and useful ordering. This enhanced granularity is a direct result of the model’s design and is critical for nuanced decision-making in complex scenarios.
5. Application to a Real-World Benchmark Case
To validate the practical efficiency of the proposed RMVC method, we apply it to a well-established benchmark problem from the MCDM literature. This approach not only demonstrates the algorithm’s applicability to a real-world scenario but also allows for a direct comparison of its performance against established methods.
5.1. Benchmark Dataset: The Automotive Selection Problem
The dataset used is the classic automotive selection problem presented by Yoon [27] in his seminal paper "A Reconciliation Among Discrete Compromise Solutions"; he sourced it from Fleischer's book [28]. This problem has since become a standard benchmark, used by numerous researchers to test and compare new decision-making methods. The problem involves ranking eight car models based on 16 different criteria.
The original decision matrix, as presented in Yoon [27] (p. 284, Table 2), is shown in Table 4 below. This table includes the raw performance values for each car across all criteria, as well as the criterion weights assigned in the original study.
5.2. Data Preprocessing: From Raw Data to a Comprehensive Soft Set
The original dataset is extensive, comprising 16 distinct criteria. To provide a robust and challenging test for our RMVC method, we select a comprehensive subset of 12 criteria. This expanded set moves beyond a minimal selection and allows for a more nuanced analysis, testing our method against a rich and diverse set of attributes.
The goal of this selection is to create a balanced model covering the primary aspects of vehicle evaluation: long- and short-term economics, performance and efficiency, safety, comfort, and practicality. As our RMVC method does not use the criterion weights, demonstrating its effectiveness on a larger, more complex set of unweighted criteria is a key validation step.
The selected criteria, now grouped by their evaluation category, are as follows:
Economic factors (three criteria): We include the primary “List price” (C1) and the long-term “Operating cost” (C2) to cover different cost aspects. To complete the economic picture, we also include “Resale value” (C3), a low-weight benefit criterion.
Performance and efficiency (four criteria): We select both “City mileage” (C5) and “Highway mileage” (C6) for a full efficiency profile. For performance, we include the opposing metrics of “Acceleration” (C7) and “Braking” (C8), both of which are cost-type criteria requiring transformation.
Driving dynamics and comfort (three criteria): We include “Ride” (C9) for comfort and “Handling” (C10) for vehicle dynamics. To represent interior comfort, we add “Front-seat room” (C13).
Practicality and maintenance (two criteria): We include “Maintainability” (C4) and “Trunk space” (C12) as key indicators of a vehicle’s day-to-day usability.
This curated subset of 12 parameters provides a rigorous test case for the RMVC method. We now denote these parameters as $e_1, e_2, \ldots, e_{12}$ for consistency with our soft set notation. The raw data for this comprehensive subset is shown below in Table 5.
To apply our RMVC method, all parameters must be of the benefit type (i.e., higher values are better). The four cost-type parameters (list price, operating cost, acceleration, and braking) are converted into benefit values using a scaled reciprocal transformation. To improve the readability of the transformed data and ensure the values are of a comparable magnitude to the other criteria, we use scaling factors of 1000 or 100 in the transformation (i.e., $v' = 1000/v$ or $v' = 100/v$ for a raw cost value $v$). This linear scaling does not change the internal ranking of the alternatives within a criterion but makes the resulting table more intuitive. The benefit-type parameters remain unchanged. The resulting "benefit table," which serves as the basis for our analysis, is presented in Table 6.
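This transformation is a one-liner; a sketch (the scale is chosen per criterion as described above):

```python
def to_benefit(values, scale=1000.0):
    """Scaled reciprocal transformation for a cost-type criterion: v -> scale / v,
    so that higher transformed values are better."""
    return [scale / v for v in values]
```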
5.3. Constructing the Soft Set
With all 12 parameters converted to a uniform benefit format as shown in Table 6, we can now define the soft set parameter sets, $\Phi(e_i)$. To ensure maximum objectivity and methodological consistency, we employ a single, data-driven thresholding rule across all parameters.
The "Above Average" rule: For each of the 12 parameters, we calculate the arithmetic mean of the benefit values for all eight alternatives. An alternative is considered to "possess" the parameter (i.e., belong to the set $\Phi(e_i)$) if its benefit value is strictly greater than the mean for that parameter. This unified approach eliminates subjective threshold-setting and grounds the entire preprocessing phase in the internal structure of the data; a code sketch of this rule appears at the end of this subsection.
For example, for the first parameter $e_1$ ("Economical Price"), we calculate the arithmetic mean of the corresponding benefit values in Table 6 and retain exactly those alternatives whose score is strictly greater than this mean; these alternatives constitute the resulting parameter set $\Phi(e_1)$. Applying this same data-driven rule to the remaining 11 parameters yields the complete soft set. The results are summarized in Table 7 below.
This transparent and objective preprocessing yields a well-defined soft set, which is represented in a binary matrix format in Table 8. This matrix now serves as the final input for the RMVC algorithm.
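A compact sketch of this preprocessing, producing the membership pattern behind Table 7 and the binary matrix of Table 8 (array layout and names are illustrative):

```python
import numpy as np

def above_average_soft_set(benefit, alternatives):
    """'Above Average' rule: alternative a belongs to Phi(e_i) iff its benefit
    value for criterion i strictly exceeds that criterion's mean.
    `benefit` is an (n_criteria x n_alternatives) array."""
    benefit = np.asarray(benefit, dtype=float)
    means = benefit.mean(axis=1, keepdims=True)   # per-criterion means
    member = benefit > means                      # boolean matrix (Table 8 pattern)
    return [{a for a, flag in zip(alternatives, row) if flag} for row in member]
```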
5.4. Application of the RMVC Algorithm and Results
With the soft set representation of the problem established in Table 8, we now apply the RMVC algorithm. The core of this method is the construction of the relational membership matrix, $M$.
Unlike a simple binary value, each entry in this matrix, $m_{ij}$, is a normalized score between 0 and 1. It quantifies how strongly an alternative is associated with a parameter $e_i$, considering its relationship with all other alternatives. For alternatives that are members of the parameter set $\Phi(e_i)$, this value is 1.0. For non-members, the value is the calculated relational score.
The resulting relational membership matrix, $M$, based on the RMVC results, is presented in Table 9.
The final score, $s_j$, for each alternative is calculated by summing the values in its corresponding column of the matrix, $s_j = \sum_{i=1}^{12} m_{ij}$. A higher score indicates a more preferable alternative.
These calculations yield the final scores and the resulting RMVC ranking. To provide a comprehensive benchmark, we compare these results with two different TOPSIS rankings:
(1) a "focused TOPSIS" analysis using the exact same 12 criteria analyzed above; and
(2) the "original TOPSIS" ranking from Yoon (1987), which used all 16 criteria.
This multi-faceted comparison is presented in Table 10.
5.5. Discussion of Final Results
The comprehensive comparison presented in Table 10 provides a powerful validation of the RMVC method's unique approach, revealing three different optimal choices depending on the methodology and the scope of the criteria.
TOPSIS’s sensitivity to scope: The TOPSIS method proves to be highly sensitive to the set of criteria. With all 16 criteria, the original study identifies A8 (Peugeot 504) as the best choice, likely due to its strengths in several high-weight criteria that were excluded from our 12-item set. However, when focused on our 12-criteria set, the optimal choice dramatically shifts to A6 (Cadillac Seville). This demonstrates how criterion selection and weighting can fundamentally alter the outcome of utility-based methods.
RMVC’s robustness and consistency: On the other hand, the most compelling finding is the stability of the result produced by the RMVC method. A key strength of the proposed methodology is its consistency across varying levels of analytical scope. Preliminary analyses using a more focused 6-criteria subset, the 16-criteria set itself, and the comprehensive 12-criteria analysis presented here, all identify A3 (Plymouth Duster) as the clear and unwavering optimal choice. This remarkable stability stems from the RMVC’s weight-agnostic and relational nature. The method does not seek a “utility champion” based on explicit weights but rather identifies a “consensus champion” based on overall profile balance. A3 consistently emerges as the most “well-connected” alternative that satisfies the broadest range of criteria, proving its robustness regardless of the analytical scope.
In conclusion, this final analysis powerfully illustrates the value of the RMVC method. While TOPSIS provides valuable insights based on pre-defined utility and weights, its results can be volatile. RMVC offers a more stable and robust alternative, identifying the most balanced and consensual choice by leveraging the intrinsic relationships within the data. This demonstrates its strength as a new and reliable decision-making tool.
6. Conclusions and Future Work
This paper addressed a fundamental challenge in decision-making under uncertainty: the information loss and ranking ambiguity inherent in classical soft set theory. We have put forth the Relational Membership Value Calculator (RMVC), an algorithmic framework designed to overcome these limitations by introducing a more nuanced, fine-grained approach to evaluating candidates. Our primary contributions and findings are as follows:
We introduced a unique function, $\mu$, that evaluates the implicit connections between a candidate's existing attributes and those it lacks, thereby preserving critical information that is otherwise discarded.
We provided a rigorous analysis of the algorithm’s theoretical properties, establishing its polynomial-time complexity and its robustness against data sparsity.
Through a comprehensive sensitivity analysis, we demonstrated the stability of the RMVC’s output rankings under data perturbations.
Using the Gini Index, we provided quantitative evidence that the RMVC framework achieves significantly higher granularity, leading to highly differentiated rankings with a drastically reduced likelihood of ties.
The implications of these findings are significant. The RMVC provides a more trustworthy foundation for automated and semi-automated decision-making by producing more reliable rankings with minimal ambiguity. This makes it a valuable tool for integration into next-generation AI-driven decision support and expert systems, where decision ambiguity can lead to critical failures. The provision of the algorithm as an open-source Python program further enhances its value to the scientific community by ensuring transparency and facilitating adoption.
Despite these promising results, we acknowledge several limitations that open avenues for future research. Our quantitative validation was based on illustrative case studies; future work should involve applying the RMVC to diverse, large-scale, real-world datasets to further assess its generalizability.
This work opens up several promising avenues for future investigation, which can be categorized into theoretical extensions, advanced aggregation methods, and enhancements for practical application.
A primary direction for future work involves extending the RMVC model from its current crisp framework to operate in more complex environments. Future work could focus on the following:
Fuzzy framework: The crisp $\delta$ function could be replaced by a fuzzy counterpart (e.g., using a t-norm). The development of a Fuzzy Relational Membership Value Calculation (F-RMVC) algorithm is a compelling next step.
Advanced uncertainty models: Furthermore, the model can be adapted for even more sophisticated frameworks such as interval-valued fuzzy sets, intuitionistic fuzzy sets, or neutrosophic sets. Each extension would allow the algorithm to handle not just uncertainty but also concepts like indeterminacy and contradictory information, significantly broadening its applicability.
The current model uses a raw sum for score aggregation, a deliberate choice to establish a parameter-free baseline. Future work could investigate more advanced aggregation techniques to create a more flexible and adaptive model, including the following:
Weighted aggregation: Incorporating criterion weights would allow the model to reflect the varying importance of different parameters in real-world scenarios.
Entropy-based weighting: Information theory could be used to automatically assign higher weights to more “discriminative” criteria, providing a data-driven approach to weighting.
Future research can also focus on strengthening the model’s practical and theoretical reliability through the following:
Handling parameter redundancy: The current model assumes parameters are distinct. An advanced extension could involve an automated, in-built mechanism to account for highly correlated or redundant criteria, for example, by down-weighting the contribution of similar parameters.
Formalizing stability guarantees: While traditional error bounds are not directly applicable, the concept of robustness can be formalized. Future work could involve defining a distance metric between rankings (e.g., Kendall’s Tau) and then deriving probabilistic bounds on the stability of the output, further solidifying the model’s theoretical foundations.
An important objective for future work is to perform the formal comparative analysis theorized in this paper. This would involve using metrics to rigorously validate the enhanced granularity of the RMVC method, including the following:
Gini Index: A lower Gini Index for RMVC scores compared to traditional methods would provide quantitative proof of a more distributed score space and a reduced probability of ties.
Kullback–Leibler (KL) divergence: A high KL divergence value when comparing distributions would further support the argument that the RMVC provides a richer and more informative ranking.
Such an investigation would provide a standardized framework for comparing the resolution power of various decision-making algorithms. In conclusion, this work contributes a validated, transparent, and robust algorithmic tool to the field, paving the way for more precise and reliable data-driven decision-making in complex environments.