Solving the Max-Diversity Orthogonal Regrouping Problem by an Integer Linear Programming Model and a GRASP/VND with Path-Relinking Approach

Canale, Eduardo; Robledo, Franco; Sartor, Pablo; Stábile, Luis

doi:10.3390/sym14010018

Open AccessArticle

Solving the Max-Diversity Orthogonal Regrouping Problem by an Integer Linear Programming Model and a GRASP/VND with Path-Relinking Approach

¹

Facultad de Ingeniería, Universidad de la República, Montevideo 11300, Uruguay

²

IEEM Business School, Universidad de Montevideo, Montevideo 16000, Uruguay

^*

Author to whom correspondence should be addressed.

Symmetry 2022, 14(1), 18; https://doi.org/10.3390/sym14010018

Submission received: 15 October 2021 / Revised: 2 December 2021 / Accepted: 14 December 2021 / Published: 23 December 2021

(This article belongs to the Special Issue Variable Neighborhood Search for Symmetrical and Asymmetrical Optimization Problems)

Download Versions Notes

Abstract

:

Students from Master of Business Administration (MBA) programs are usually split into teams. In light of the generalistic nature of MBA programs, diversity within every team is desirable in terms of gender, major, age and other criteria. Many schools rotate the teams at the beginning of every term so that each student works with a different set of peers during every term, thus training his or her adaptation skills and expanding the peer network. Achieving diverse teams while avoiding–or minimizing—the repetition of student pairs is a complex and time-consuming task for MBA Directors. We introduce the Max-Diversity Orthogonal Regrouping (MDOR) problem to manage the challenge of splitting a group of people into teams several times, pursuing the goals of high diversity and few repetitions. We propose a hybrid Greedy Randomized Adaptive Search Procedure/Variable Neighborhood Descent (GRASP/VND) heuristic combined with tabu search and path relinking for its resolution, as well as an Integer Linear Programming (ILP) formulation. We compare both approaches through a set of real MBA cohorts, and the results show that, in all cases, the heuristic approach significantly outperforms the ILP and manually formed teams in terms of both diversity and repetition levels.

Keywords:

MBA teams; orthogonal regrouping; diversity; GRASP; VND; path relinking

1. Motivation

There is empirical evidence showing that diversity among team members plays an important role in the success of organizations; see [1,2,3] for studies on the impacts of gender diversity on revenue and profits, female executive ratios on profit, and educational and work ratios on innovation. Collaborative, multidisciplinary team-formation and staffing/scheduling problems in workforce management are of paramount importance in project deployment and large-scale corporations. Diversity within teams is also increasingly important in contexts such as squadrons in “agile organizations” and study groups in executive education programs. In fact, the case that motivated the work presented in this article on MBA team formation and rotation is a good example. Experience shows that student skills and the learning process benefit significantly from highly diverse teams with respect to prior experience, age, gender, major, years of work experience and other attributes. The main goal of MBA programs is to develop generalistic professionals able to cope with the complexity of managing organizations. They are generalistic by nature; students benefit much more from addressing their weaknesses regarding skills, experience and discipline knowledge than from further developing those that they have already mastered. An important part of learning achieved by every student comes from his or her peers through teamwork. In addition, developing abilities to work in diverse teams and quickly adapt oneself to them is also a goal in itself of most current MBA programs. The problem of splitting a group of people into a given number of teams in a way that maximizes the intra-team diversity is challenging, as discussed in Section 2. Moreover, in many contexts, it can be desirable to rotate the teams several times so that every team member meets as many peers as possible, trains his or her adaptation skills, etc. Continuing with the case of MBA programs, they are usually split into 4–6 terms. Many business schools rotate the groups every term so that students develop their ability to adapt to different groups, benefit from new points of view and expand their peer network. Creating highly diverse teams while minimizing the repetition of peer pairs between terms is a very challenging problem faced by program directors when launching a new cohort. For instance, at IEEM Business School (Universidad de Montevideo), MBA directors typically spend 6–8 hours coping with this problem, as many times as cohorts are launched every year, yet achieving team partitions has been barely satisfactory in meeting the goals of high diversity and few repetitions.

Given the intrinsic difficulty of the underlying partitioning and clustering problems involved, it is highly convenient to develop algorithms to automate this task as much as possible. In this work, we focused on maximum-diversity regrouping assignments of MBA students; nevertheless, the reader can find potential applications to similar partitioning problems that involve rotating the partitions several times.

The contributions of this article can be summarized as follows:

1.: We introduce the Max-Diversity Orthogonal Regrouping (MDOR) problem.
2.: A feasibility condition and an upper bound for the optimum are derived in Section 3.3 for some ranges of the parameters.
3.: An exact Integer Linear Programming (ILP) formulation for the MDOR problem
is proposed (Section 3.2).
4.: A GRASP/VND methodology combined with tabu search and path relinking
is proposed.
5.: We compare the performance of both approaches against each other and against a real case of manual team splitting using data from five MBA cohorts of IEEM Business School, Universidad de Montevideo, Uruguay, who graduated between 2017 and 2019.

This document is organized as follows. The related work is presented in Section 2. A mathematical programming formulation for the MDOR is introduced in Section 3. A full GRASP/VND heuristic combined with tabu search and path relinking as a post-optimization technique is presented in Section 4. Computational results based on real-life students are presented in Section 5. Section 6 contains concluding remarks and directions for future work.

2. Related Work

Based on our scientific literature review, the works that are closest to ours are [4,5,6]. A simplified model with a high similarity in team formation was presented in [4], which considers the dining philosophers problem for the assignment of students into groups. In [5], the problem was modeled using integer linear programming. This work considered a centroid for each cluster. Two approaches were studied: the min-sum approach tries to minimize the distances with respect to the centroid; the second is a min-max approach whose goal is to minimize the maximum (i.e., the worst) distance. The case study in [6] consisted of the assignment of 8 advisors to 235 students. This work applied integer linear programming, and it is equivalent to the min-sum approach given by [5]. The problem belongs to the

NP

-hard computational complexity class, and heuristics are available to tackle it [7]. A hybrid genetic algorithm was proposed in [8]. There, the authors suggested Tabu Search combined with strategic oscillations. Independently, [9] proposed an artificial bee-workers approach. In [10], a competitive General Variable Neighborhood Search (GVNS) was also proposed. An extension of this GVNS was developed in [11], with a skewed VNS combined with a shaking process to better explore the search space. The goal in the Orthogonal Regrouping Problem is to repeatedly partition a given set in such a way that every pair is included only once in a cluster. Well-known instances have been extensively treated, e.g., the Kirkman’s Schoolgirl Problem and the Social Golfer Problem.

This article is an extended version of [12], where we introduced the MDOR problem. It is worth remarking that, although this work was motivated by the assignment of MBA students to teams that are reconstructed every term, it has potential applications to other scenarios, such as staffing and scheduling in workforce management [13], team formation models for collaboration [14] and team-formation algorithms for faultline minimization [15], among others.

3. Problem

In this section, we describe the main features of the problem, and then we present a mathematical programming formulation. A brief discussion covers particular cases, which are considered to address the problem heuristically.

3.1. Problem Description

Our problem formulation requires a definition of distance between any two items. In the context of grouping MBA students, the distance between two students represents how different they are in terms of a set of criteria (age, type of major, gender, work experience, admission test score, etc.) that the MBA Director chooses. In the case of the real-life sets used in our test, the criteria are:

Career (subdivided into percentages of Social, Natural and Exact Sciences content);
Admission test score;
Residence (urban or countryside);
Gender;
Age.

Career is split into three attributes in

[0, 1]

that account for the relative levels of Social, Natural and Exact Sciences. The score on the admission test and age are natural numbers, while the remaining attributes assume a binary domain. Once the attributes are selected, a distance function between the different individuals

d_{i j}

must be specified. In what follows, the normalized Euclidean distance is considered:

d_{i j} = d (x^{i}, x^{j}) = \frac{{∥ x^{i} - x^{j} ∥}_{2}}{{max}_{u \neq v} {∥ u - v ∥}_{2}},

(1)

where the distance between each pair of students is found by a numerical assignment to the different attributes (i.e., different coordinates). Note that this normalization implies that

0 \leq d_{i j} \leq 1

for all pairs of students i and j with corresponding attributes

x^{i}

and

x^{j}

.

3.2. Problem Formulation

Consider the following variables:

N: the number of students;
G: the number of teams (clusters);
K: the number of attributes;
M: the number of students per team: $M = \frac{N}{G}$ (if integer);
S: the number of terms (clusterings);
$d_{i j}$ : the distance between students i and j;
R: the maximum number of terms that any pair of students can share (R = 1 for an SGP instance).

Consider the set of binary decision variables

x_{i g s}

, such that

x_{i g s} = 1

if and only if student i is assigned to group g in term s, and

x_{i g s} = 0

otherwise. Additionally, we consider the set of binary decision variables

y_{i j}^{g s}

such that

y_{i j}^{g s} = 1

if and only if

x_{i g s} = 1

and

x_{j g s} = 1

, where i and j are students, g is a group and s is a term. We introduce the MDOR problem as the following integer linear programming model:

\begin{matrix} max_{x_{i g s}} & \sum_{s = 1}^{S} \sum_{g = 1}^{G} \sum_{i = 1}^{N - 1} \sum_{j = i + 1}^{N} d_{i j} y_{i j}^{g s}, \end{matrix}

(2)

\begin{matrix} s . t . \sum_{g = 1}^{G} x_{i g s} & = 1, \forall (i, s) \in {1, \dots, N} \times {1, \dots, S} \end{matrix}

(3)

\begin{matrix} \sum_{i = 1}^{N} x_{i g s} & = M, \forall (g, s) \in {1, \dots, G} \times {1, \dots, S} \end{matrix}

(4)

\begin{matrix} 2 y_{i j}^{g s} & \leq x_{i g s} + x_{j g s}, \forall (i, j, g, s) \in \times {1, \dots, N}^{2} \times {1, \dots, G} \times {1, \dots, S} \end{matrix}

(5)

\begin{matrix} x_{i g s} + x_{j g s} & \leq y_{i j}^{g s} + 1, \forall (i, j, g, s) \in {1, \dots, N}^{2} \times {1, \dots, G} \times {1, \dots, S} \end{matrix}

(6)

\begin{matrix} \sum_{g = 1}^{G} \sum_{s = 1}^{S} y_{i j}^{g s} & \leq R, \forall (i, j) \in {1, \dots, N} \times {1, \dots, N} \end{matrix}

(7)

\begin{matrix} x_{i g s} \in {0, 1}, & \forall (i, g, s) \in {1, \dots, N} \times {1, \dots, G} \times {1, \dots, S} \end{matrix}

(8)

\begin{matrix} y_{i j}^{g s} \in {0, 1}, & \forall (i, j, g, s) \in {1, \dots, N}^{2} \times {1, \dots, G} \times {1, \dots, S} \end{matrix}

(9)

The goal (objective function (2)) is to maximize the diversity-sum of all student peers

i, j

among all clusters (second summation) and terms (first summation), where the intracluster diversity is precisely the distance-sum among all pairs of that cluster. Constraints (3) state that each student is included in one and just one team for all clusters and terms. Constraints (4) state that the teams have exactly M students for all teams and terms. Constraints (5) and (6) enforce the coherence between the x and y sets of variables as follows. Constraints (5) state that if two students i and j share group g in a certain term s, i.e.,

y_{i j^{g s}} = 1

, then both flag variables

x_{i g s}

and

x_{j g s}

must be equal to one. Conversely, Constraints (6) state that if

x_{i g s}

and

x_{j g s}

are both one, meaning that they share team g in term s, then

y_{i j^{g s}}

must be equal to one. Constraints (7) limit the number of times that any pair of students

i, j

can meet throughout the different terms (outer summation). Finally, Constraints (8) and (9) define the binary domain for the decision variable sets x and y.

3.3. Discussion

Note that the previous MDOR model is adequate when

M = \frac{N}{G}

is an integer. Next, we discuss how to overcome this limitation and to minimize the number of repetitions as well.

3.3.1. Number of Students per Group

If

M = \frac{N}{G}

is not an integer, we can introduce a slight change in Constraints (4). In fact, consider the Euclidean division:

N = G \times M + r

for some remainder

r : 0 \leq r < G

. We can arrange

M + 1

students in r groups and M students in the remaining

G - r

groups.

As a more general setting, pick two vectors

\vec{a}

and

\vec{b}

representing lower and upper bounds on the number of students per group. Replace Constraints (4) with:

\begin{matrix} \sum_{g = 1}^{G} x_{i g s} & \geq a_{g}, \forall (g, s) \in {1, \dots, G} \times {1, \dots, S} \\ \sum_{g = 1}^{G} x_{i g s} & \leq b_{g}, \forall (g, s) \in {1, \dots, G} \times {1, \dots, S} . \end{matrix}

3.3.2. Avoiding Repetitions

Completely avoiding repetitions is not always possible, depending on the parameters

G, M

and S of an MDOR instance (see Table 1). Even when it is possible, no polynomial-complexity algorithm is known for the general case; variations such as the SGP-completion problem are known to be NP-complete [16,17], so no feasible solution is guaranteed to be computed in polynomial time unless

S = 1

. In the last case, any distribution is a feasible solution, but the optimization problem is equivalent to the MIN CUT INTO BOUNDED SET problem [18], where sets have bounds equal to M and edge weights equal to

d_{i, j}

, which is NP-complete on N and G, since maximizing the edge weights inside the groups is the same as minimizing the edge weights between these groups. It is then plausible to assume that MDOR is NP-hard, but we were not able to prove it. If so, it would be NP-complete since checking if a solution is feasible and if its cost is greater than a given constant takes polynomial time.

Let us consider a certain student. If

R = 1

in each term s, he/she will meet

M - 1

different students, so, among the S terms, he/she will meet

S (M - 1)

students. In order to fulfill this restriction, the total number of students should be greater than or equal to

1 + S (M - 1)

, i.e.,

N \geq 1 + S (M - 1)

.

Let us also upper bound the value of the objective function given in Equation (2), assuming that

N / G

is an integer and

R = 1

, by sorting the distances

d_{i j}

. More concretely, let

j_{i, h}

be a matrix, where every row i is a vector that contains the indexes of

N - 1

students (excluding i) sorted by descending distance to i, i.e,

d_{i, j_{i, h}} \geq d_{i, j_{i, h + 1}} h = 1, \dots, N - 1 .

Then,

\begin{matrix} F & = \sum_{s = 1}^{S} \sum_{g = 1}^{G} \sum_{i = 1}^{N - 1} \sum_{j = i + 1}^{N} d_{i j} y_{i j}^{g s} = \frac{1}{2} \sum_{s = 1}^{S} \sum_{g = 1}^{G} \sum_{i = 1}^{N} \sum_{j = 1}^{N} d_{i j} y_{i j}^{g s} = \frac{1}{2} \sum_{i = 1}^{N} \sum_{j : y_{i j}^{g s} = 1} d_{i j} \\ \leq \frac{1}{2} \sum_{i = 1}^{N} \sum_{h = 1}^{S (M - 1)} d_{i j_{i, h}}, \end{matrix}

which is an upper bound that can be easily precomputed, which can help to estimate the gap to the optimum. If

R > 1

and

S (M - 1) = q R + R^{'}

with

0 \leq R^{'} < R

, then

F \leq \frac{1}{2} \sum_{i = 1}^{N} (R^{'} d_{i j_{i, q + 1}} + R \sum_{h = 1}^{q} d_{i j_{i, h}}) .

Two possible heuristic approaches arise to cope with the repetition problem. One might build high-diversity solutions while controlling the repetition level. Alternatively, one might generate repetition-free solutions and then choose and/or modify them to seek for improved diversity. In this paper, we introduce an algorithm that follows the first approach. A parameter

G L O B A L_R E P

is set; if the algorithm generates solutions that include a repetition for a certain pair more than

G L O B A L_R E P

times, then it accepts the algorithm.

4. Solution

GRASP and VND are well-known metaheuristics that have been successfully used to solve many hard combinatorial optimization problems. GRASP is a powerful multistart metaheuristic that operates in two phases. A feasible solution is generated in the first phase, whose neighborhood is then explored in the local search phase. The second phase is usually enriched by means of different variable neighborhood structures. For instance, VND explores several neighborhood structures in a deterministic order. Its success is based on the simple fact that different neighborhood structures do not usually have the same local minimum. Thus, the resulting solution is simultaneously a locally optimum solution under all neighborhood structures. Additionally, the metaheuristic is enriched with path relinking, which is an enhancement of the GRASP procedure, leading to significant improvements in solution quality. Path relinking was first introduced in the context of tabu sarch [19]. It was suggested as an approach to integrate intensification and diversification into the search for solutions. In the context of GRASP, path relinking was first introduced by [20]. This approach generates new solutions by exploring trajectories between high-quality solutions; it starts from one of these solutions, called the initiating solution, and generates a path in the neighborhood space that leads towards other solutions, the guiding solutions. The reader is invited to consult the comprehensive Handbook of Heuristics [21] and GRASP with Path-Relinking [22] for further information. Here, we develop a GRASP/VND with path-relinking methodology.

4.1. GRASP/VND with Path-Relinking Methodology for the MDOR

We followed a traditional VND flow diagram, which consists of three local searches:

Insert: moves a student to another group;
Swap: swaps two students from different groups;
3-Chain: exchanges three students from three different groups.

The most simple local searches appear at the beginning. Therefore, the order is respectively Insert, Swap and 3-Chain. A greedy randomized Construction phase is run first.

To speed up the evaluation of the objective function, the internal structures in the main algorithm consider two vectors:

$x^{c} [i]$ : the current group for student i, and
$s d^{c} [i] [g]$ : current sum-diversity between student i and his/her peers in group g.

Note that

s d^{c} [i] [g] = \sum_{j : x^{[} j] = g} d_{i, j}

, and if we link the students in a graph with link weights

d_{i, j}

, by virtue of the Handshaking Lemma, we obtain the following objective:

f (x^{c}) = \frac{1}{2} \sum_{i = 1}^{N} s d^{c} [i] [x^{c} [i]] .

(10)

Next, the details of the construction and local searches are presented.

4.2. Construction Phase

The search space is the set of all student assignments to the groups, where each student belongs to exactly one group. A feasible solution also satisfies the respective lower and upper bounds

a_{g}

and

b_{g}

. In our Construction phase, an iterative student insertion into groups takes effect, meeting the lower bounds

a_{g}

. Finally, in order to realize feasibility, all students are assigned to some group, satisfying the upper bound

b_{g}

. Two factors are considered for these group insertions: diversity and repetitions. In this construction phase, the priority is given to repetitions. Therefore, a memory of the previous terms is used, and if two assignment have an identical number of repetitions, the assignment with maximum diversity is chosen. During the process, the diversity per group g for some student x is computed using the following expression:

d^{'} (x, g) = \sum_{y \in g} \frac{d (x, y)}{| g |} .

Note the division by the cardinality

| g |

; this is meant to avoid preferring groups with larger numbers of students. Algorithm 1 illustrates the pseudocode of the construction phase.

Algorithm 1

C o n s t r u c t i o n (s t u d e n t G r o u p, a, b, a t r s S t a n d a r d, r e p M a t r i x)

1:: $studentVector \leftarrow {1, 2, \dots, N}$
2:: $groupVector \leftarrow {1, 2, \dots, N}$
3:: $a s s i g n O n e R a n d o m S t u d e n t T o E a c h G r o u p (studentGroup, repMatrix)$
4:: while $g r o u p V e c t o r \neq {}$ do
5:: $selGroup \leftarrow a s s i g n G r o u p T o S t u d F o r M i n R e p e t i t i o n s ($
6:: $studentGroup, repMatrix)$
7:: if $groupCount [selGroup] = a [s e l G r o u p]$ then
8:: $groupVector \leftarrow groupVector - selGroup$
9:: end if
10:: end while
11:: for $g \leftarrow$ 1 $t o$ G do
12:: if $groupCount [g] = b [g]$ then
13:: $groupVector \leftarrow groupVector - g$
14:: end if
15:: end for
16:: while $g r o u p V e c t o r \neq {}$ do
17:: $selGroup \leftarrow a s s i g n G r o u p T o S t u d e n t F o r M i n R e p e t i t i o n s ($
18:: $studentGroup, repMatrix)$
19:: if $groupCount [selGroup] = b [s e l G r o u p]$ then
20:: $groupVector \leftarrow groupVector - selGroup$
21:: end if
22:: end while

The following variables are considered during the Construction phase:

$s t u d e n t G r o u p [s]$ : the group assigned to student $s \in {1, \dots, N}$ ;
$a t r s S t a n d a r d [i, j]$ : the value of attribute $j \in {1, \dots, K}$ for student i;
$g r o u p C o u n t [g]$ : the number of students in group $g \in {1, \dots, G}$ .

The following functions are also considered:

$a s s i g n O n e R a n d o m S t u d e n t T o E a c h G r o u p ()$ : it assigns one random student uniformly picked at random to each group.
$a s s i g n G r o u p T o S t u d F o r M i n R e p e t i t i o n s ()$ : it picks a random student and assigns him/her to the group that leads to the least number of repetitions. Ties are resolved based on maximum diversity.

4.3. Insertion

In this local search, student i is moved from one group to a different one. We remark that a local search takes place whenever the resulting solution is both better and feasible. To test feasibility, we simply check the lower and upper bounds for the old and new groups, respectively. The difference in the objective is the change in the diversity:

f (x^{n}) - f (x^{c}) = s d^{c} [i] [g_{2}] - s d^{c} [i] [g_{1}],

where

x^{n}

is the new solution, and

x^{c}

is the current solution.

4.4. Swap

In this local search, two students i and j, originally belonging to different groups

g_{i} \neq g_{j}

, are exchanged, and the difference in the objective function is:

f (x^{n}) - f (x^{c}) = (s d^{c} [i] [g_{j}] - s d^{c} [i] [g_{i}]) + (s d^{c} [j] [g_{j}] - s d^{c} [j] [g_{i}]) - 2 d_{i j}

4.5. Three-Chain

Consider three different students i, j and k belonging to three different groups

g_{i}

,

g_{j}

and

g_{k}

. Student i is moved to

g_{j}

, j is moved to

g_{k}

, and k is moved to

g_{i}

:

\begin{matrix} f (x^{n}) - f (x^{c}) & = (s d^{c} [i] [g_{j}] - s d^{c} [i] [g_{i}]) + (s d^{c} [j] [g_{k}] - s d^{c} [j] [g_{j}]) + (s d^{c} [k] [g_{i}] - s d^{c} [k] [g_{k}]) \\ - (d_{i j} + d_{j k} + d_{k i}) \end{matrix}

4.6. Shake

In order to increase the diversity in the search space, a shake process takes place. Consider a k-neighborhood of a Swap operation, i.e., an arbitrary application of k swaps.

S h a k e

picks a k-neighbor, and the VND phase is restarted with the obtained solution, provided that the tabu list allows for the shake to be carried out (i.e., controlling the repetition threshold). In the general algorithm, k starts as equal to a parameter

K_M I N

and is increased by a second parameter

K_S T E P

until the solution is improved or until it equals a third parameter

K_M A X

.

4.7. Path Relinking

To perform the post-optimization path-relinking process, a neighborhood structure is defined. Two solutions

x_{1}

and

x_{2}

are neighbors if there exists a sequence of swaps between students that keeps the number of global repetitions under the threshold R. Which students to take in each swap is determined based on the symmetric difference between the two solutions. The paths are developed using this neighborhood structure. Starting from one or more elite solutions, paths in the solution space leading toward other elite solutions are generated and explored in the search for better solutions.

Algorithm 2 illustrates the pseudocode of the path-relinking procedure applied to a pair of solutions

x_{s}

(starting solution) and

x_{t}

(target solution).

Algorithm 2

P a t h_R e l i n k i n g (x_{s}, x_{t})

1:: $Compute symmetric difference Δ (x_{s}; x_{t});$
2:: $f^{*} \leftarrow \max {f (x_{s}), f (x_{t})};$
3:: $x^{*} \leftarrow argmax {f (x_{s}), f (x_{t})};$
4:: $x \leftarrow x_{s};$
5:: while $Δ (x; x_{t}) \neq \emptyset$ do
6:: $m^{*} \leftarrow a r g m a x {f (x \oplus m) : m \in Δ (x, x_{t})};$
7:: while $m a x N u m b e r O f R e p e t i t i o n s (x \oplus m^{*}) > R$ do
8:: $m \leftarrow g e t R a n d o m (Δ (x, x_{t}))$
9:: $Δ (x \oplus m, x_{t}) \leftarrow Δ (x, x_{t}) ∖ {m}$
10:: $x \leftarrow x \oplus m$
11:: $m^{*} \leftarrow a r g m a x {f (x \oplus m) : m \in Δ (x, x_{t})};$
12:: end while
13:: $Δ (x \oplus m^{*}, x_{t}) \leftarrow Δ (x, x_{t}) ∖ {m^{*}}$
14:: $x \leftarrow x \oplus m^{*}$
15:: if $f (x) > f^{*}$ then
16:: $f^{*} \leftarrow f (x)$
17:: $x^{*} \leftarrow x$
18:: end if
19:: end while
20:: return $x^{*}$

The procedure starts by computing the symmetric difference

Δ (x_{s}, x_{t})

between the two solutions; in this case, this represents the set of moves (swaps) needed to reach

x_{t}

(target solution) from

x_{s}

(initial solution). A path of solutions is generated linking

x_{s}

and

x_{t}

. The best solution

x^{*}

in this path is returned by the algorithm. At each step, the procedure examines all moves

m \in Δ (x; x_{t})

from the current solution x and selects the one that results in the maximum diversity under the threshold R. If there is no swap that keeps the number of repetitions within the threshold, it applies one swap from

Δ (x, x_{t})

at random and repeats the procedure. If necessary, the best solution

x^{*}

is updated. The procedure terminates when

x_{t}

is reached, which is when

Δ (x, x_{t}) = \emptyset

.

4.8. Main Algorithm

The main algorithm iterates over all terms. For each one, it starts by invoking Construction for a number of times equal to

M A X_T R I E S

, which acts as a parameter. The most diverse solution is passed to the following step, where the following cycle is repeated for a number of times equal to

T_M A X

(another parameter):

S h a k e

, Insertion, Swap and 3-Chain. Then, the algorithm selects a set

Y

of instances, chosen randomly from the pool of elite solutions, for the post-optimization path-relinking process [19]. The best solution found is chosen for the term, and the process moves to the next one.

Algorithm 3 illustrates the pseudocode of the main algorithm.

Algorithm 3

H e u r i s t i c (M A X_T R I E S, T_M A X, K_M I N, K_S T E P, K_M A X)

1:: $f_{h}^{*} \leftarrow 0$
2:: for $i t e r \leftarrow 1 to M A X_T R I E S$ do
3:: $P \leftarrow \emptyset$ {Elite solution set}
4:: $T \leftarrow \emptyset$ {Tabu List}
5:: for $i \leftarrow 1 to T_M A X$ do
6:: $x \leftarrow G r e e d y R a n d o m i z e d C o n s t r u c t i o n P h a s e ()$
7:: $x \leftarrow L o c a l S e a r c h P h a s e (x, T, K_M I N, K_S T E P, K_M A X)$
8:: $T \leftarrow U p d a t e T a b u L i s t (x, T)$
9:: if $i \geq 2$ then
10:: Choose, at random, pool solutions $Y \subseteq P$ to relink with x
11:: for $y \in Y$ do
12:: Determine which (x or y) is the initial $x_{s}$ and which is the target $x_{t}$
13:: $x_{p} \leftarrow P a t h R e l i n k i n g (x_{s}, x_{t})$
14:: Update the elite set P with $x_{p}$
15:: if $f (x_{p}) > f_{h}^{*}$ then
16:: $f_{h}^{*} \leftarrow f (x_{p})$
17:: $x^{*} \leftarrow x_{p}$
18:: end if
19:: end for
20:: end if
21:: end for
22:: end for
23:: return $x^{*}$

5. Computational Results

We carried out a comparison between the algorithm (coded in Python 3.9) introduced herein and the ILP model implemented using the optimization engine IBM CPLEX version 12.8. Both were executed on a home PC (Intel-core i7 2.2 GHz, 8 GB RAM). One hundred independent iterations were run (since GRASP is a multistart metaheuristic), and the best solution was finally returned. As a preliminary stage, an adjustment of all parameters was performed by running several experiments. Some instances were included even though there were no feasible solutions in order to show that the existence depends on the parameters, as stated in Section 3.3.

Table 1 reports the performance of the GRASP/VND with path relinking and ILP for each instance. The heuristic was tested with the following parameters:

M A X_T R I E S = 100

,

T_M A X = 10^{6}

,

K_M I N = K_S T E P = 1

and

K_M A X = 3

.

Following the terminology,

f_{h} (x)

and

f_{i} (x)

represent the values of the best solutions found using the heuristic and the integer linear programming model, respectively. The column

T i m e

indicates the CPU time in seconds that the algorithms runs. The column

T i m e

under

I L P

gives the time required to reach either the optimum value or the best lower bound when the optimum is not attained. The elapsed time is 24 h. Column

{GAP}_{i}

indicates the gap calculated by CPLEX. In essence, it is the best bound, as the best objective value, that an integer solution could potentially have based on information that the solver has discovered so far. The best bound is the best relaxed-but-region-constrained solution for any region that has not yet been eliminated from the search space. The column under GAP shows the result of computing the gap with respect to the best solution found with ILP (

GAP = \frac{f_{h} (x) - f_{i} (x)}{f_{i} (x)} \times 100

).

In the GAP column, the reader can observe that our heuristic finds better solutions than the ILP in almost all cases. Even though a global optimum is not formally proved for some instances, the null gap between ILP and our solution reinforces the evidence of optimality. The average GAP obtained is

4.28 %

.

To understand the global effectiveness of our VND scheme with path relinking, a midpoint test was performed. Table 2 shows the performance of the algorithm for the MDOR. The columns Insertion, Swap, 3-Chain and Path Relinking show the percentage of each kind of movement applied over 100 executions of the VND local search phase and path relinking, respectively. The column #Moves states the average number of moves applied during these iterations. The column entitled

m p

shows the relative improvement in the objective function between the best solution found in each local search phase with the post-optimization path-relinking process compared to the feasible solution obtained from the construction phase over 100 iterations. The reader can see that the VND with the path-relinking effect is noticeable, since the diversity is roughly half the optimum in most cases using only the construction phase. Note that each movement is applied only if an improvement is achieved. It is clear that Insertion, Swap and Path Relinking are the most effective movements, while 3-Chain seldom has an effect. The effectiveness of path relinking is evident and reaches over 25% of the total number of applied movements in the post-optimization process.

The six test cohorts were manually split into teams by the corresponding program directors at IEEM Business School when each one was launched (between 2017 and 2020). They claim to have spent between 6 and 10 h working on a spreadsheet for every cohort. Table 3 compares what they achieved to the output of the heuristic algorithm. Columns f and R stand for the objective function and the maximum repetition parameter for both the heuristic and manually developed solutions. The comparison shows that, in all cases, the heuristic generated superior solutions in terms of objective function (the global diversity score) and equal or better solutions in terms of repetition levels; furthermore, by using the algorithm, directors can save almost all of the mentioned time that they spent on this task.

6. Conclusions and Directions for Future Work

A combinatorial optimization problem named Max-Diversity Orthogonal Regrouping (MDOR) is introduced. It was conceived to cope with the problem of partitioning MBA cohorts into high-diversity teams, rotating the teams every term and keeping repetitions under a given (low) threshold. Nevertheless, the MDOR has potential applications in workforce management or team formation models for collaboration.

An exact integer linear programming method and a GRASP/VND methodology enriched with path relinking are proposed in order to address the MDOR. The tests presented, based on six real MBA cohorts, show that the heuristic algorithm produces partitionings faster and with fewer repetitions and higher diversities than the best solution found for the ILP exact method in almost all cases. The heuristic algorithm produced solutions that were significantly better than those manually generated by the program directors in terms of maximum repetitions and global diversity score. In fact, the partitionings obtained via the GRASP algorithm were presented to the MBA Directors, who agreed that they would have been far superior to those that they had manually generated, which led them to start applying the GRASP/VND algorithm from 2021 onward.

Future work includes:

Weighing the diversity differently in different terms. In the MBA example, diversity is more important in earlier terms than in later ones, since as long as the program advances, students tend to increase their field-specific knowledge and experience; towards the end of the program, it should be much more difficult to guess what their major or previous experience is based only on the attendance of a class;
Considering the fact that a student can join the MBA in a term other than the first one (e.g., deferrals from previous cohorts) or that a student can leave the program before completing it;
Allowing the algorithm to run starting at a term other than the first one, taking into account the previous teams for further repetitions and allowing a different number of teams to be set;
Allowing rules such as “student A and B cannot (or must) be together in terms X, Y, Z”;
Testing a different approach to define the problem from the one presented herein (maximizing diversity while controlling for repetitions); this could be (a) minimizing repetitions while controlling for diversity or (b) maximizing an objective function that simultaneously considers diversity and repetitions through a well-tuned relative weighting for both factors;
Exploring the potential benefits of other heuristic approaches, e.g., adaptive VND [23].

Author Contributions

Conceptualization, P.S.; methodology, F.R. and E.C.; software, L.S.; validation, P.S.; formal analysis, F.R. and E.C.; data curation, P.S.; writing, P.S. and L.S.; visualization, L.S.; project administration, F.R.; funding acquisition, P.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The MBA cohort datasets used in our tests are available at: http://www2.um.edu.uy/ieem-papers/mba-groups.zip (accessed on 17 December 2021).

Acknowledgments

This work is partially supported by Project ANII FCE_1_2019_1_156693 Teoría y Construcción de Redes de Máxima Confiabilidad, MATHAMSUD 19-MATH-03 Rare events analysis in multi-component systems with dependent components, STIC-AMSUD ACCON Algorithms for the capacity crunch problem in optical networks and the Grant-Thornton Research Center of IEEM Business School.

Conflicts of Interest

The authors declare no conflict of interest.

References

Herring, C. Does Diversity Pay?: Race, Gender, and the Business Case for Diversity. Am. Sociol. Rev. 2009, 74, 208–224. [Google Scholar] [CrossRef] [Green Version]
Ellemers, N.; Rink, F. Diversity in work groups. Curr. Opin. Psychol. 2016, 11, 49–53. [Google Scholar] [CrossRef]
Talke, K.; Salomo, S.; Kock, A. Top Management Team Diversity and Strategic Innovation Orientation: The Relationship and Consequences for Innovativeness and Performance. J. Prod. Innov. Manag. 2011, 28, 819–832. [Google Scholar] [CrossRef]
Bhadurya, J.; Mightyb, E.J.; Damar, H. Maximizing workforce diversity in project teams: A network flow approach. Omega 2000, 28, 143–153. [Google Scholar] [CrossRef] [Green Version]
Desrosiers, J.; Mladenović, N.; Villeneuve, D. Design of balanced MBA student teams. J. Oper. Res. Soc. 2005, 56, 60–66. [Google Scholar] [CrossRef]
Baker, B.M.; Benn, C. Assigning pupils to tutor groups in a comprehensive school. J. Oper. Res. Soc. 2001, 62, 623–629. [Google Scholar] [CrossRef]
Feo, T.A.; Khellaf, M. A class of bounded approximation algorithms for graph partitioning. Networks 1990, 20, 181–195. [Google Scholar] [CrossRef]
Fan, Z.P.; Chen, Y.; Ma, J.; Zeng, S. A hybrid genetic algorithmic approach to the maximally diverse grouping problem. J. Oper. Res. Soc. 2011, 62, 1423–1430. [Google Scholar] [CrossRef]
Rodriguez, F.J.; Lozano, M.; García-Martínez, C.; González, J.D. An artificial bee colony algorithm for the maximally diverse grouping problem. Inf. Sci. 2013, 230, 183–196. [Google Scholar] [CrossRef]
Dragan, U. Variable neighborhood search for maximum diverse grouping problem. Yugosl. J. Oper. Res. 2014, 24, 21–23. [Google Scholar]
Brimberg, J.; Mladenovic, N.; Uroševic, D. Solving the maximally diverse grouping problem by skewed general variable neighborhood search. Inf. Sci. 2015, 295, 650–675. [Google Scholar] [CrossRef]
Banchero, M.; Robledo, F.; Romero, P.; Sartor, P.; Servetti, C. Max-Diversity Orthogonal Regrouping of MBA Students Using a GRASP/VND Heuristic. In Proceedings of the ICVNS 2021, Abu Dhabi, United Arab Emirates, 21–25 March 2021; pp. 58–70. [Google Scholar]
Bruecker, P.D.; den Bergh, J.V.; Beliën, J.; Demeulemeester, E. Workforce planning incorporating skills: State of the art. Eur. J. Oper. Res. 2015, 243, 1–16. [Google Scholar] [CrossRef] [Green Version]
Wi, H.; Oh, S.; Mun, J.; Jung, M. A team formation model based on knowledge and collaboration. Expert Syst. Appl. 2009, 36, 9121–9134. [Google Scholar] [CrossRef]
Bahargam, S.; Golshan, B.; Lappas, T.; Terzi, E. A team-formation algorithm for faultline minimization. Expert Syst. Appl. 2019, 119, 441–455. [Google Scholar] [CrossRef] [Green Version]
Triska, M. Solution Methods for the Social Golfer Problem. Master’s Thesis, Technische Universität Wien, Vienna, Austria, 2008. [Google Scholar]
Colbourn, C.J. The complexity of completing partial Latin squares. Discret. Appl. Math. 1984, 8, 25–30. [Google Scholar] [CrossRef] [Green Version]
Garey, M.R.; Johnson, D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness (Series of Books in the Mathematical Sciences), 1st ed.; W. H. Freeman: New York, NY, USA, 1979. [Google Scholar]
Glover, F. Tabu search and adaptive memory programming advances, applications and challenges. In Interfaces in Computer Science and Operations Research; Springer: Berlin/Heidelberg, Germany, 1997; pp. 1–75. [Google Scholar]
Laguna, M.; Marti, R. GRASP and path relinking for 2-layer straight line crossing minimization. INFORMS J. Comput. 1999, 11, 44–52. [Google Scholar] [CrossRef] [Green Version]
Mart, R.; Pardalos, P.M.; Resende, M.G.C. Handbook of Heuristics, 1st ed.; Springer Publishing Company: Berlin/Heidelberg, Germany, 2018. [Google Scholar]
Resendel, M.G.; Ribeiro, C.C. GRASP with path-relinking: Recent advances and applications. In Metaheuristics: Progress as Real Problem Solvers; Ibaraki, T., Nonobe, K., Yagiura, M., Eds.; Operations Research/Computer Science Interfaces Series; Springer: Boston, MA, USA, 2005; Volume 32, pp. 29–63. [Google Scholar]
Karakostas, P.; Sifaleras, A.; Georgiadis, M.C. Variable neighborhood search-based solution methods for the pollution location-inventory-routing problem. Optim. Lett. 2020. [Google Scholar] [CrossRef]

Table 1. GRASP/VND with path relinking versus ILP for MDOR.

Instances					Heuristic		ILP			GAP
Name	N	S	G	R	$f_{h} (x)$	Time (h)	GAP $_{i}$ (%)	$f_{i} (x)$	Time (h)	%
2017-1	31	1	5	1	43.5	11.7	202	42.5	24	2.43
2017-1	31	6	5	3	256.2	24	171	245.2	24	4.45
2017-1	31	6	6	2	198.4	24	123	198.4	24	0
2017-1	31	6	6	3	209.3	24	249	190.0	24	10.17
2017-1	31	6	7	2	170.4	24	176	162.1	24	5.12
2017-1	31	6	7	3	179.7	24	315	160.2	24	12.21
2018-1	47	1	5	1	107.3	24	293	104.6	24	2.58
2018-1	47	6	6	3	514.6	24	234	479.8	24	7.24
2018-1	47	6	7	2	408.1	24	162	408.0	24	0.02
2018-1	47	6	7	3	437.6	24	288	413.1	24	5.92
2018-1	47	6	8	2	351.1	24	209	346.2	24	1.39
2018-1	47	6	8	3	377.5	24	356	352.2	24	7.16
2018-2	35	1	5	1	52.6	21.8	223	52.1	24	0.99
2018-2	35	6	6	2	242.7	24	128	239.4	24	1.40
2018-2	35	6	6	3	260.8	24	230	247.9	24	5.23
2019-1	40	1	5	1	77.8	19.1	267	77.0	24	0.99
2019-1	40	6	5	3	452.2	24	167	447.8	24	0.99
2019-1	40	6	6	2	357.3	24	123	357.2	24	0.02
2019-1	40	6	6	3	381.8	24	237	356.0	24	7.24
2019-1	40	6	7	2	301.7	24	167	298.7	24	0.98
2019-1	40	6	7	3	324.0	24	304	296.8	24	9.16
2019-2	30	1	5	1	41.1	11.3	226	40.9	24	0.50
2019-2	30	6	5	3	241.2	24	188	223.9	24	7.70
2019-2	30	6	6	2	No feasible solution
2019-2	30	6	6	3	195.9	24	258	180.1	24	8.79
2020-1	42	1	7	1	58.2	24	474	57.3	24	1.57
2020-1	42	6	7	2	Time limit exceeded
2020-1	42	6	7	3	337.5	24	651	323.62	24	4.29
Average										4.17

Table 2. Performance of the local search phase and path relinking. Abbreviations: I = Insertion, S = Swap, 3C = 3-Chain, PR = Path Relinking, M = #Moves.

Instances					Heuristic, all in %
Name	N	S	G	R	I	S	3C	PR	M	$mp$
2017-1	31	1	5	1	53	15	0	32	21	73
2017-1	31	6	5	3	25	29	2	44	137	43
2017-1	31	6	6	2	61	23	0	16	119	52
2017-1	31	6	6	3	29	13	0	68	91	36
2017-1	31	6	7	2	13	46	1	40	97	58
2017-1	31	6	7	3	45	32	0	23	114	42
2018-1	47	1	5	1	76	12	0	12	51	23
2018-1	47	6	6	3	61	22	0	17	276	74
2018-1	47	6	7	2	22	65	1	12	204	32
2018-1	47	6	7	3	21	14	4	61	197	88
2018-1	47	6	8	2	12	52	1	35	186	21
2018-1	47	6	8	3	11	57	0	32	202	62
2018-2	35	1	5	1	61	22	0	27	31	52
2018-2	35	6	6	2	19	57	1	23	131	78
2018-2	35	6	6	3	29	58	2	11	169	85
2019-1	40	1	5	1	63	32	0	5	32	17
2019-1	40	6	5	3	68	9	3	20	178	46
2019-1	40	6	6	2	61	35	0	4	177	59
2019-1	40	6	6	3	45	27	0	28	132	87
2019-1	40	6	7	2	12	79	0	9	253	86
2019-1	40	6	7	3	31	45	0	24	101	49
2019-2	30	1	5	1	78	19	0	3	24	62
2019-2	30	6	5	3	23	61	1	15	117	59
2019-2	30	6	6	3	46	12	0	52	99	64
2020-1	42	1	7	1	23	38	2	37	44	64
2020-1	42	6	7	3	57	16	0	27	208	46
Average					40.1	34.2	0.7	26	130.4	56.1

Table 3. Performance comparison: heuristic vs. manually generated solutions.

Instances			Manually Generated		Heuristic
Name	G	N	$f (x)$	$R$	$f (x)$	$R$
2017-1	5	31	229.8	4	256.2	3
2018-1	8	47	361.1	3	377.5	3
2018-2	6	35	243.7	3	260.8	3
2019-1	6	40	354.9	4	381.8	3
2019-2	5	30	227.0	3	241.2	3
2020-1	7	42	301.2	3	337.5	3

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Canale, E.; Robledo, F.; Sartor, P.; Stábile, L. Solving the Max-Diversity Orthogonal Regrouping Problem by an Integer Linear Programming Model and a GRASP/VND with Path-Relinking Approach. Symmetry 2022, 14, 18. https://doi.org/10.3390/sym14010018

AMA Style

Canale E, Robledo F, Sartor P, Stábile L. Solving the Max-Diversity Orthogonal Regrouping Problem by an Integer Linear Programming Model and a GRASP/VND with Path-Relinking Approach. Symmetry. 2022; 14(1):18. https://doi.org/10.3390/sym14010018

Chicago/Turabian Style

Canale, Eduardo, Franco Robledo, Pablo Sartor, and Luis Stábile. 2022. "Solving the Max-Diversity Orthogonal Regrouping Problem by an Integer Linear Programming Model and a GRASP/VND with Path-Relinking Approach" Symmetry 14, no. 1: 18. https://doi.org/10.3390/sym14010018

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Solving the Max-Diversity Orthogonal Regrouping Problem by an Integer Linear Programming Model and a GRASP/VND with Path-Relinking Approach

Abstract

1. Motivation

2. Related Work

3. Problem

3.1. Problem Description

3.2. Problem Formulation

3.3. Discussion

3.3.1. Number of Students per Group

3.3.2. Avoiding Repetitions

4. Solution

4.1. GRASP/VND with Path-Relinking Methodology for the MDOR

4.2. Construction Phase

4.3. Insertion

4.4. Swap

4.5. Three-Chain

4.6. Shake

4.7. Path Relinking

4.8. Main Algorithm

5. Computational Results

6. Conclusions and Directions for Future Work

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI