Next Article in Journal
PCa-Clf: A Classifier of Prostate Cancer Patients into Patients with Indolent and Aggressive Tumors Using Machine Learning
Previous Article in Journal
Unraveling COVID-19 Dynamics via Machine Learning and XAI: Investigating Variant Influence and Prognostic Classification
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Brainstorming Will Never Be the Same Again—A Human Group Supported by Artificial Intelligence

Cybernetics & Decision Support Systems Laboratory, Faculty of Organizational Sciences, University of Maribor, Kidričeva cesta 55a, 4000 Kranj, Slovenia
*
Author to whom correspondence should be addressed.
Mach. Learn. Knowl. Extr. 2023, 5(4), 1282-1301; https://doi.org/10.3390/make5040065
Submission received: 27 July 2023 / Revised: 6 September 2023 / Accepted: 19 September 2023 / Published: 25 September 2023
(This article belongs to the Section Learning)

Abstract

:
A modification of the brainstorming process by the application of artificial intelligence (AI) was proposed. Here, we describe the design of the software system “kresilnik”, which enables hybrid work between a human group and AI. The proposed system integrates the Open AI-GPT-3.5–turbo model with the server side providing the results to clients. The proposed architecture provides the possibility to not only generate ideas but also categorize them and set priorities. With the developed prototype, 760 ideas were generated on the topic of the design of the Gorenjska region’s development plan with eight different temperatures with the OpenAI-GPT-3.5-turbo algorithm. For the set of generated ideas, the entropy was determined, as well as the time needed for their generation. The distributions of the entropy of the ideas generated by the human-generated and the AI-generated sets of ideas of the OpenAI-GPT-3.5–turbo algorithm at different temperatures are provided in the form of histograms. Ideas are presented as word clouds and histograms for the human group and the AI-generated sets. A comparison of the process of generating ideas between the human group and AI was conducted. The statistical Mann-Whitney U-test was performed, which confirmed the significant differences in the average entropy of the generated ideas. Correlations between the length of the generated ideas and the time needed were determined for the human group and AI. The distributions for the time needed and the length of the ideas were determined, which are possible indicators to distinguish between human and artificial processes of generating ideas.

1. Introduction

In the design of the regional development plan of the Gorenjska region, the brainstorming process plays an important role in the initial steps, with the goal of providing innovative ideas that would benefit the community [1,2,3,4]. New tools such as ChatGPT [2] present the opportunity to enhance group brainstorming sessions with the aid of AI. In the present study, we did not use the ChatGPT interface directly but considered the integration of the existing software tools for brainstorming (the “kresilnik” tool in our case) with the OpenAI application programming interface (API), which enables advanced use of the generative pretrained transformer (GPT) models. In the previous cycle of designing the regional development plan of the Gorenjska region [3], the tool “kresilnik” was used to gather, categorize, and prioritize the innovative ideas in the initial stage of designing the regional development plan. In the present research, we propose the concept of hybrid brainstorming sessions as well as the concept of realizing hybrid brainstorming tools.
Classical brainstorming sessions will not be the same since the invention of ChatGPT [5,6,7,8,9,10]. This invention has had a profound impact on the methodology of brainstorming [11,12]; therefore, it was our intention to provide a novel hybrid framework for the process of generating ideas.
Involving a group of experts in the design of the regional development plan is important in order to apply collective intelligence [13] and innovative ideas [14,15,16]. There are cases where collective intelligence outperforms a single expert, for example, in the field of radiology [17]. Novel collaborative information systems should, therefore, leverage the potential of collective intelligence [18,19]. However, in the application of AI, one should consider the ethical aspects [20] that might occur, as well as the technical issues; for example, if unethical ideas generated by AI would be presented for consideration by the expert group. The proposed methodology, which will be described in this article, provides a framework for incorporating the Large Language Models (LLM) of AI into the process of group brainstorming. We expect that the expert group will evaluate the generated ideas and that the decision-making process will still be under the control of the expert group. Nonetheless, we anticipate that the proposed methodology will harness the capabilities of AI to generate innovative ideas and enhance the overall ideation process during the brainstorming phase.
To test the feasibility of augmenting the “kresilnik” brainstorming tool, a software system design was defined that enabled the integration of the tool with the OpenAI–GPT-3.5–turbo model via a web socket over a secure API-key encrypted connection.
In our previous research [2], we estimated the usefulness of ChatGPT without direct access to API as well as the Ayoa [21] tool, which enables AI-supported brainstorming. Preliminary research showed that AI tools can generate useful ideas in the complex topic of regional developmental planning. However, in order to have better control, integration within the custom-made “kresilnik” tool was needed in order to fine-tune the output of the model with the variation in the temperature parameter, in our case, as well as to provide an appropriate prompt within the API call.
An important developmental aspect was the integration of several different LLM such as CLAUDE, Bard, and ChatGPT [22]. As shown in our previous research [23], the integration of multiple cloud-based systems, which conform to the Koložvari–Škraba condition [23], could provide better results than if one would apply only a single cloud-based LLM AI system. Regarding the general approach to the hybrid brainstorming process, the proposed methodology could be applied with the integration of other LLM and AI systems in order to boost the classical brainstorming process.
The emphasis of the study was on the process of generating innovative ideas rather than on decision-making. Nevertheless, in our previous research [2], we have also considered the application of AI in the field of decision-making with promising results, which might be an interesting topic for further research.
In each brainstorming session, we started with the initial question or call for ideas. When using OpenAI-GPT-3.5-turbo API, an appropriate prompt should be formed.
The initial question that was posed to the OpenAI-GPT-3.5-turbo model was the following: “In the period up to 2034, with what activities will we take advantage of the strengths and eliminate the weaknesses of the Gorenjska region in the field of human resources development? Please give one idea according to the principles of brainstorming”. This initial question was similar to the one posed to the group of human participants in 2018, except the year was changed from 2027 to 2034, and the statement “Please give one idea according to the principles of brainstorming” was added, which was also in the instructions given to the participants of the real committee in 2018. Therefore, one could consider that the same question was posed here.
The question above, in English, was analyzed by the OpenAI tokenizer [24] of the original Slovenian language, yielding 84 tokens and 203 characters. The tokens were marked with different colors, as shown in Figure 1.
“The GPT family of models process text using tokens, which are common sequences of characters found in text. The models understand the statistical relationships between these tokens and excel at producing the next token in a sequence of tokens” [24].
We should also mention that this initial question was stated in the Slovenian language, which is exceptional. The real (i.e., human) group included eight members, who generated 95 ideas in 24 min. In our experiments, sets of 95 ideas were generated with different GPT temperature parameters.
Ideas generated by the OpenAI-GPT-3.5-turbo model were examined by the generation of the word cloud at different model temperatures. The temperature governs the randomness and, thus, the creativity of the responses. LLM predicts the next best word when the initial prompt is provided, one word at a time. The model assigns a probability to each word in the model vocabulary and picks among these words. With a temperature of 0, the variation in the selection of words is small; the algorithm tries to pick the word with the highest probability. A higher temperature would result in the selection of words with a slightly lower probability, which would lead to more variation, randomness, and creativity [25]. If one wants to experiment and create many variations quickly, a high temperature is better.
With the “kresilnik” system, 8 × 95 = 760 ideas were generated at eight different temperatures of the GPT-3.5-turbo model (0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, and 1.75) in order to examine the functioning of the GPT-3.5-turbo model and the appropriateness of the results generated at different temperatures. We examined how different temperatures influenced the innovativeness of the ideas and whether the ideas might be applicable to the design of Gorenjska’s regional development plan.
In order to distinguish between the process of generating ideas by the human group and by AI, the entropy of the generated ideas can be used. In our previous research, the time needed by participants to generate ideas was recorded, enabling us to compare the generation process of humans versusAI. We also observed the frequency distributions of the entropy H as well as the correlation between the length of the generated ideas and the time needed, and the corresponding distributions.
The present research made a unique comparison between the human group and the AI system in the process of generating innovative ideas. The results will enable the development of novel information systems, enhance the methodology of brainstorming, and contribute to a means of detecting human-generated and artificially generated ideas.
The main original contributions of the study are the following: (a) a definition of the modified hybrid brainstorming process, where the human expert group is supported by AI; (b) the technical specifications of the novel design of the hybrid human-OpenAI system for generating ideas; (c) an analysis of the effects of variations in the temperature parameter of the OpenAI-GPT-3.5-turbo algorithm on the generated ideas in terms of entropy, the number of characters, their latency, and distribution; (d) a comparison of the ideas generated by the group of human experts and the ideas generated by AI; (e) a confirmation of the significant differences between the group of human experts and AI in terms of the entropy.

2. Methodology and Tools

Figure 2 shows part of the flowchart of the brainstorming process that has been modified by the addition of the possibility of generating new ideas via the OpenAI API, exploiting the GPT-3.5-turbo model. The initial steps of brainstorming stay the same, i.e., inviting the participants, stating the rules, etc. However, after the call for ideas, one could generate ideas with the aid of the OpenAI GPT model. These ideas can, later on, be checked, augmented, and corrected by the participants. When an idea is edited, one should check whether it is unique before passing it on to the next steps of classic brainstorming.
An important addition to the classic brainstorming process is the possibility of acquiring innovative ideas with the help of the OpenAI ChatGPT API [26], providing a methodological framework for the hybrid brainstorming process. One should note that in the real world, the participants in professional committees might generate only one idea in the time of 30 min [1,4]. The hybrid application of AI in the brainstorming phase provides the possibility of enhancing the process of regional developmental planning, which is based on the work of the Regional Development Committee’s expert group. The expert group is expected to evaluate each idea after the brainstorming phase, leveraging the principles of collective intelligence.
The uniqueness of a specific idea is determined by the expert group members participating in the brainstorming session. Once a new idea has been generated and approved by a member of the expert group, it is shared with all participants and displayed on the projector canvas. Typically, regional development committees conduct their work offline in a meeting room. Each participant closely monitors the ideas contributed to the group. Members often build upon the ideas provided by their peers, striving to offer their own unique insights. After gathering a collection of ideas, the categorization phase ensues, grouping similar ideas together. During the review of ideas within a particular category, redundant ideas may be excluded, although this decision falls within the purview of the expert group. In order to enhance the uniqueness of the gathered ideas, the measures of cosine or Jaccard similarity might be used [27,28], but this would be a topic for possible further implementation of the “kresilnik” tool.
Figure 3 shows the design of the hybrid [29,30,31,32,33] human–OpenAI tool for generating ideas, which is named “kresilnik”. The system consists of the server-side infrastructure, the OpenAI API, and the user interface. The server runs on Ubuntu Linux with node.js. The user can generate ideas with the help of OpenAI by pressing the corresponding button (1). The request is transmitted via a web socket to the server. The server then issues an asynchronous call to the OpenAI API function (2). The OpenAI API provides a response via the secure connection encrypted by the secret OpenAI API key (3). After the results have been received, i.e., an idea has been generated, the idea is transmitted to the client that has requested it via the web socket (4). The newly generated idea is displayed in the user interface (5), where the user can edit the generated idea and add or remove a particular part of it. After the idea has been checked by the user, the user can send the idea to the server (6). The server then distributes the generated and approved idea to all participants in the brainstorming group. Users can also provide their own ideas or work in hybrid mode, merging the artificially generated ideas with their own.
In order to compare the set of ideas generated by the members of the real regional development committee and the ideas generated with the aid of the GPT model at different values of the temperature parameter, the entropy of a particular idea was determined by the following equation:
H = i f i l l o g 2 f i l ,
where f i represents the frequency of the ith character’s occurrence within the idea’s text and l is the length of the idea.
The connection to the OpenAI GPT system was realized by the asynchronous API call to the OpenAI-GPT-3.5-turbo model. The code of the function createChatCompletion with the defined model and messages is shown in Appendix A.
More details about the conceptual architecture of the applied GPT model can be found in [34]. In the case of the API call, we specified the model in the json structure in key-value form as model: ’gpt-3.5-turbo’ and a message with the proper prompt, where the condition of brainstorming was also added. The temperature parameter was also one of the key value pairs, such as temperature: 1.0. The key-value pair of the messages was defined as messages: [{ role: ’user’, content: promptString }].
Figure 4 shows a mockup of the user interface of “kresilnik” for an iPhone with a resolution of 375 × 667 px. This is a prototype of a fully functional version of the system with added OpenAI GPT functionality. One can observe the number “2029” in front of the generated idea on the top of the list, which is the number of milliseconds that passed from pressing the button “Generate idea” until the idea was displayed in the text box; it was similar for other ideas in the list.
The user interface was developed with html and JavaScript for mobile and stationary devices, which is convenient for the users.

3. Results

The left part of Figure 5 shows the word cloud for 95 ideas generated by Gorenjska’s regional development committee in 2018. On the right, the corresponding histogram of the top 20 words is shown with frequency on the x-axis. One can observe that the real committee proposed considering the “elderly”, while the ideas generated by the GPT API did not emphasize this topic. One should also consider that the topic of “human resources” could be understood specifically to be for the Gorenjska region in a particular timespan, i.e., 2018. The word cloud and corresponding histogram for the top 20 keywords presented in Figure 5 can be applied as a reference point to compare the ideas generated by the OpenAI GPT API.
Figure 6 and Figure 7 show the word clouds for 95 ideas generated by the OpenAI GPT API at temperatures of 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, and 1.75. Word clouds are shown in the first column, with corresponding word histograms showing the frequencies of particular words in the right column. The image in the top left in Figure 6 is for a temperature of 0; the figure on the right shows the corresponding word histogram for the top 20 words at a temperature of 0 etc., and the subfigures layout is similar in Figure 7. At the start, when the temperature was low, several keywords were emphasized. The larger the text, the greater the frequency of those specific keywords. With increasing temperatures, the importance of the keywords became more evenly distributed, which might be better observed in corresponding word histograms. Interestingly, the word cloud and corresponding word histogram extracted meaningful keywords even at a temperature of 1.75, where a significant proportion of the generated text was somewhat random. Here, a word cloud with corresponding word histogram might be considered as an appropriate filtering method to extract helpful suggestions from the OpenAI GPT API.
One could observe that at a temperature of 1.5, the GPT API sometimes returns “hallucinational” results [34], providing some English text phrases such as “Our instrumentalization however, may call for introducing mechanisms that bolostequally sitintparenthood and We”, some Chinese simplified text such as “此公海冬请OA原图帝劣因 不只有斢果”, fragments of computer code such as “if(this.href.indexOf(’#ghost’)>-1){bref=location.href.split(’#’)[0];anchor=csv”, and apparently random characters such as “ingmethodamaupzerðrrazmislekitoliko”.
Nevertheless, some ideas generated at the temperature of 1.5 were still impressively innovative and original, such as “Creating events and study programs at Gorenjska public schools that focus on the 21st century with the aim of creating a new generation force that can stay in the region and promote the sustainability and development of the wider community.”
At a temperature of 1.75, some of the generated ideas were barely useful. Here, the level of hallucination was very high, mixing different languages and apparently random strings of characters. An example of such ideas is: “Opening a local training center or support system for small and medium-sized enterprises. A joint program would be developed as IQSEJA” (“Odpiranje lokalnega izobraževalnega centra ali podpornega sistema za mala in srednja podjetja.Razviti bi bil skupni program kot IQSEJA-lta karton kurslarī bu verstandrevalktu Karivgenlassalirdizilani na ovo prodručb iz voulksi teridas tyukeućih. Toringg treneniiisisk…”).
The following case was similar: “We should take advantage of Gorenjska mountains because of winte123” (“Izkoristiti bi morali vrline gorenjsKIH gor zaradi zims123_k_o nivojsanju griDen motena na lv2cjvh rjugin vsxt-g mq523 kvula.zros bo232 @hm5276 zag917ki sod zb GAIM327-LKO.(Naše verige besedic ”---ccc---, “---rjuga(r/m/i.P)###imetnik bokeškrb --- z (#mrzlice)).EN: COVID KILLS". Lemma=GIBREL 326Q|#Y;;ISO-Sr(FSK/T31089/O2384ISR, osalz4M332%)”).
One could observe that the starting text of the idea, printed in italic, could be useful; however, the following text is somewhat random. The system could also generate emojis such as Make 05 00065 i001,Make 05 00065 i002, and Make 05 00065 i003 at a temperature of 1.75, e.g., “(TES(DSV29tkzaapmaEnMake 05 00065 i001zypoBenBAcrahslAdSh | Chelah!Make 05 00065 i002 | Neveljavana IdMake 05 00065 i003)”.
However, the randomness of the proposed ideas might trigger positive associations for the interacting subjects, providing new, innovative ideas.
At a temperature of 1.75, also some useful ideas might be generated, such as “Establishment of ‘mondene’ Gorenjska market place”, “Digital knowledge for Gorenjska competitiveness. To enable the introduction of informatics as a more comprehensive subject already in primary schools and the planned orientation of educated computer scientists…” followed once more by random text.
One could apply a filter after the generation of ideas that would extract useful text. Here, the real human subjects in the brainstorming session might filter out the good parts of the text. Certainly, the algorithm could be useful here, but it would be somewhat challenging to extract the good parts. Here, the possibility of the application of AI with the aid of human actors could be usefully exploited.
At a temperature of 0, the system returned ideas that were partly or even completely the same (printed in italic in the example below), although some parts of the idea could still be different: “The establishment of a regional centre for career development, which will provide free counseling and education for various target groups, from school and university students to the unemployed and employed, and connect employers with job seekers.
The establishment of a regional centre for career development and education, which will connect educational institutions, companies and other organizations and provide training, mentoring and opportunities to gain experience for young and experienced workers”.
However, at a temperature of 0, all 95 ideas started with the “The establishment of a regional center…” This might have been suggested by the previous regional development plans, which may have included such ideas.
One could observe that the ideas are somewhat repetitive, and their innovativeness might be questionable. All members of the real committees might cover all needed expert areas. However, if one compared the ideas produced by the smaller expert group to the set of ideas generated by GPT, it is possible that a particular important expert area would lack proper coverage. Therefore, GPT could be perceived as a useful tool to help participants consider the wider scope of the initial brainstorming question.
The usefulness of the generated ideas is always subjective, since the final decision regarding the acceptability of a particular idea is determined by the regional development committee’s expert group. However, in order to estimate whether certain ideas generated by “kresilnik” would be worth considering, we analyzed the 95 ideas generated by the regional development committee’s expert group and the 95 ideas generated by “kresilnik” with the temperature set to T = 1. The following keywords from the newly generated ideas were not present within the set of 95 ideas generated by the regional development committee’s expert group (here, we present only the keywords so as not to burden the text with the complete ideas):
-
Digital literacy;
-
E-learning;
-
Sustainable tourism;
-
Green industry;
-
Education on artificial intelligence, blockchain, and autonomous vehicles;
-
Establish connections with foreign experts and institutions;
-
Digital skills;
-
Soft skills;
-
Educational camps for youth.
Whether these topics should be included in the regional development plan could be considered by each individual citizen living in the region. Some of them, such as “green industry”, are within the EU’s strategic “Green Deal” development plan. This might be something that each citizen of the EU might be interested in including in the regional development plan of his/her region. The quality of the ideas is, therefore, hard to determine and it is in the domain of the regional development committee’s expert group. However, if we look at the suggestions, even the proposed keywords provide some meaningful input into the brainstorming session.
In order to illustrate the potential usefulness of the generated ideas, the following idea might be considered, which was generated at a temperature of T = 1.5 and was not included, even partially, by the real regional development committee’s expert group: “Inclusion of the population with different ethnic and cultural backgrounds in decision-making about projects and their implementation with the aim of increasing community, cooperation, tolerance, understanding and greater equality in opportunities for career development”.
The judgment as to whether such an idea might be considered within a regional development plan is up to each reader.
For the set of generated ideas, the entropy of each idea was determined by Equation (1). The image on the left in Figure 8 shows the increase in the average entropy H computed by Equation (1). One can observe an increase in entropy at temperatures of 1.5 and 1.75. This was mostly due to the addition of languages other than Slovenian in the results as well as the lengthier text generated. The right-hand side of Figure 8 shows the corresponding average time (in seconds) needed to generate a particular idea. At the temperature of 0.75, one could observe a steady and somewhat exponential increase in the time needed to generate a particular idea.
The average entropy of the ideas generated by the GPT API ranged from H = 4.247 bits up to H = 4.579 bits, while the entropy of a human committee was H = 4.567 bits. In general, one could conclude that the entropy H increased with the temperature, which was more apparent at temperatures higher than 1. With a correlation coefficient of r2 = 0.97, one could conclude that the ideas with higher entropy (i.e., those that are more innovative) take up more time to generate.
Figure 9 shows histograms of the entropy H of the human group and the OpenAI-generated sets of ideas from temperatures of 0 (temp0) to 1.75 (temp1.75). On the y-axis, the absolute frequency of the H bins is shown. One can see the different shapes of the distributions.
Table 1 shows the results of the Shapiro–Wilk test for the human group and the results of OpenAI for temperatures from 0 to 1.75 with a step of 0.25. The W statistics are a measure of how well the ordered and standardized sample quantiles fit the standard normal quantiles. One can observe that distribution of entropy at temperatures of 0.5, 1, and 1.25 fulfils the criterion of normality.
While the distribution of the entropy of ideas generated by the human group was not Gaussian (normal), this might be an indicator that the set of ideas was artificially generated.
The nonparametric Mann–Whitney U-test was conducted to assess the similarity between the entropy of the ideas (H) of the human group and the ideas generated by OpenAI. At the level of p = 0.001, the entropy of none of the AI-generated sets matched the entropy of the human group’s ideas (U-stat: temp0 = 2353.0, temp0.25 = 2341.0, temp0.5 = 2114.0, temp0.75 = 2228.0, temp1 = 1851.0, temp1.25 = 1809.0, temp1.5 = 1253.0, temp1.75 = 708.0).
Figure 10 shows the correlation between the length of the generated ideas as measured by the number of characters and the time needed to generate the ideas in seconds. The linear trendline is also shown, along with the value of r2 and p-value. The first correlation subplot is for the human group (r2 = 0.11, p = 0.001). The r2 for T0 was 0.01 with p = 0.330, which might be attributed to the fact that the results of the OpenAI API were the most deterministic and that the time needed to generate the idea was partly dependent on the network’s latency. The time taken to generate the ideas at T0 was also the shortest. A similar situation can be seen at T0.25; however, here, a clear trend with r2 = 0.47 could be observed where, at p = 0.000, approximately 47% of the variation in times needed to generate ideas can be explained by the length of ideas. For other temperatures from T0.5 to T1.75, higher correlation coefficients could be observed with the value of p = 0.000.
If we observe the correlation plots, a distinction can be made between the process where the human group was involved and that of OpenAI. If we look further into the differences in the process of generating ideas, we can inspect the distribution of the time needed for generating ideas, which is shown in Figure 11. The x-axis of the graphs shows the time needed to generate ideas, while the y-axis presents the absolute frequency. One can observe that the distribution of the time needed was not symmetrical for the human group. One could expect an exponential distribution of the times needed to generate ideas.
We have performed the Lilliefors test [35,36,37] for the distribution of times needed to generate a particular idea of the human group with results: h = 0, p = 0.0636, k = 0.1068, c = 0.1101. A value of h = 0 indicates that the exponential distribution is a reasonable model for the data at the significance level p = 0.05 since the obtained p-value of the null hypothesis test is 0.0636. Here k and c represent aspects of the Kolmogorov–Smirnov (KS) statistics, which are commonly used in goodness-of-fit tests.
Only the distribution of times needed to generate ideas by the human group are exponentially distributed. All other distributions could not be considered as exponential according to performed Lilliefors tests.
Table 2 shows the mean, standard deviation, skewness, and kurtosis for nine distributions shown in Figure 11 for the human group and GPT algorithm at temperatures between 0 and 1.75 with a step of 0.25. In the GPT algorithm, a temperature value of approximately 1.2 might be considered the threshold at which the GPT algorithm goes from “talking sense to talking nonsense” [38]. Skewness represents the measure of the asymmetry of distribution, while kurtosis represents the measure of the distribution’s tail thickness. Up to the temperature 1.25, the absolute value of skewness of the distribution of the times needed to generate a particular idea by the GPT algorithm is smaller than the one of the human group, which is S = 2.862. At temperatures 1.5 and 1.75, the distributions become more asymmetrical again with the skewness value of 3.186 and 1.806, respectively. Kurtosis is the highest for the human group, indicating that a significant number of times needed to generate ideas are in the tail of exponential distribution. This means that in a few cases longer times are needed to generate ideas. This might also be confirmed from practice, where participants might generate one idea in approximately 10 min.
With a combination of the correlations of the length of the ideas and the time taken with the distribution, the distinction between human and artificial processes might become more precise.
A similar situation might be observed if we consider the distributions of the number of characters in the generated ideas, which are shown in Figure 12. Again, the distribution of the human group was distinctively asymmetrical; however, according to the Lilliefors test, it is not exponential.
Table 3 shows the mean, standard deviation, skewness, and kurtosis for nine distributions shown in Figure 12 for the human group and the GPT algorithm at different temperatures.
Here, the number of characters is considered. Again, up to the temperature 1.25, the absolute value of skewness of the distribution of the number of characters in a particular idea generated by the GPT algorithm is smaller than the one of the human group, which is S = 1.751. At temperatures 1.5 and 1.75, the distributions become more asymmetrical again, with the skewness value of 2.745 and 1.842, respectively.
Considering the distributions in the normal range of temperature for the GPT algorithm (0–1.25), one could expect more symmetry in times and character distributions in GPT-generated ideas than in the human group.
Through a combined analysis of the correlations of time needed to generate ideas and the length of the ideas in addition to an analysis of the distribution of the time, length, and entropy, we may be able to distinguish whether the underlying process is human or artificial.
The distribution of entropy is one of the proposed metrics that could enable us to compare human-generated sets of ideas and the ideas generated by the OpenAI-GPT-3.5-turbo model. This is important for the identification of ideas generated by the human groups versus the ideas generated by AI algorithms. Entropy by itself is a proposed metric that can be used to estimate the variability of the ideas generated by the OpenAI-GPT-3.5-turbo model. It has been shown that entropy is positively correlated with temperature values greater than 1.
The length of the generated ideas and the time needed for generating the ideas are technical factors that determine the performance of the OpenAI-GPT-3.5-turbo model. Nevertheless, these two metrics could be used to identify the ideas or responses generated by the human group versus the ideas generated by the AI algorithms.
These metrics are not meant to evaluate the ideas themselves. Ideas are, in the first phase, evaluated by the human actors, who in this case were members of the regional development committee. The ideas were realized in the real world; for example, the “Kovačnica Business Incubator” (https://kovacnica.si/en/, accessed on 21 September 2023) was built in 2023. However, the quality of the proposed idea will ultimately be estimated within the next few years by the citizens of the Gorenjska region as taxpayers. We should mention that the proposed hybrid framework, the development of which has been described here, was prepared for the next round of the regional development plan of the Gorenjska region, which will be supported by the “kresilnik” hybrid brainstorming tool.
With the rise of AI, identifying the process of generating ideas is important not only for determining the ideas’ origin [39,40,41,42,43,44,45] but also for better understanding the process of human innovation.

4. Research Limitations and Bias

Since the OpenAI-GPT-3.5-turbo model is based on inherently random algorithms, one should be aware that there might be potential confounding factors present with a certain hidden causality. In the present statistical analysis, we have only indicated the primary variables of interest; however, an analysis of the confounding factors might be a topic for further research, where the causal relationships between variables might be identified.
The human group, consisting of eight participants, collectively generated a total of 95 ideas. Although the sample size is relatively modest, it exhibits a notable characteristic—the exponential distribution of the time required to generate ideas. To enhance the robustness of our findings, it is important to expand the sample by incorporating new groups and conducting further experiments. In accordance with queuing theory, we can reasonably anticipate that the exponential distribution of the time required to generate ideas will persist in other groups, provided that the flow of ideas remains unobstructed by potential interference, such as that arising from the facilitator or leader during brainstorming sessions.
Due to the novelty of the OpenAI-GPT-3.5-turbo algorithm, there are only a few studies available that might be comparable. Even at the range of the temperature parameter variation, there is no definite specification; for example, according to [46], the variation could only be possible in the range of 0–1; however, recent implementations consider the range of 0–2 [47]. Because of the lack of “explainability” of the LLM and artificial neural networks in general, the true impact of the temperature is somewhat speculative; nevertheless, it is important for the innovation aspect [38], which is desired in the brainstorming process.
The development of the “kresilnik” tool and the novel methodology have been described in the case of a regional development plan. In our experience, the complexity of the brainstorming process of such a problem is similar to that of other areas, particularly brainstorming sessions for the Gorenjska region on topics other than human resources, such as industry, infrastructure, tourism, and the environment.
The applicability of the proposed hybrid approach in generating ideas in brainstorming sessions in other areas besides regional development is the subject of further research. However, according to the promising results obtained for the rather complex topic of regional development in the Gorenjska region, one could expect that the methodology might provide acceptable results and provide an important boost to the classical approach to the brainstorming process in all areas of human endeavor.

5. Discussion

The proposed system design was successfully tested through the generation of 760 ideas on the topic of the regional development plan. An examination of the ideas by content indicated that these ideas might be useful for the development of the real regional plan. In further research, the hybrid process of generating ideas should be tested by real committees, where the OpenAI GPT API would be applied in combination with human agents. This might contribute to a better set of innovative ideas [48,49] for the creation of the regional development plan. Some ideas appear to be quite useful and, even more surprisingly, focused on the Gorenjska region.
Generating ideas at different temperatures might be useful for focusing the committee at lower temperatures. At lower temperatures, the major keywords with higher relative frequencies could be extracted. On the other hand, higher temperatures provide a greater diversity of keywords and might be used as the seed for the generation of further innovative ideas by the human participants.
The explainability of the ideas generated by the GPT models is a challenging task even for OpenAI as the creator of the algorithm [22], which may encourage future research in the area of: “Interpretability, explainability, and calibration, to address the current nature of ‘black-box’ AI models” [22]. Explainability should, therefore, be included as part of the functionality in the next versions of the “kresilnik” brainstorming tool.
One should be aware that AI tools might generate ethically questionable ideas, such as “automated attendance with facial recognition” [2]. This might be an important challenge for the application of the proposed methodology; however, it is expected that all the generated ideas will be evaluated by the expert group of the regional development committee. One could expect that ethically questionable ideas would be ranked last by the committee members, who are expected to follow high ethical standards. In the hybrid form of brainstorming, each generated idea is reviewed by a human expert and pushed to the group only if it is acceptable by the human expert; certainly, it could also be edited and augmented by the expert and later pushed to the group in its edited form. Hybrid work should, therefore, also address the ethical aspects of the proposed ideas as expected by the members of regional development committees.
Word clouds with the corresponding word histograms might be used as filters for extracting the major topics addressed by the initial brainstorming question.
With an increase in the temperature of the GPT model, one could observe a distinct exponential increase of entropy as well as the time needed to generate ideas. This is important for the further development of computational capabilities; creativity comes at a computational cost.
The distribution of the entropy of the ideas generated by the human group was not Gaussian (normal), which might be a useful indicator showing that a set of ideas was artificially generated.
At a temperature of T = 0, the ideas were very narrow; in most cases, parts of the ideas were repeated. On the other hand, at T = 1.75, the ideas might generate random text. However, even at T = 1.75, useful ideas might be generated.
A comparison of the characteristics of the idea generation processes of the real committee and AI provides the possibility of distinguishing between human and AI, which might be useful in many areas.
Since the results of the study are based on statistical methods, further analysis and a repetition of the experiments should be conducted. The final test of the proposed hybrid human–AI system’s architecture would be the evaluation of the ideas that are proposed, at least partially, by AI; to implement them in the real world and for them to be recognized by the public as positive. This would be a topic for further research. Nevertheless, the proposed methodology should be applied and tested in similar situations. The development of the field of LLM of AI is extremely fast [22]. We were able to apply and analyze the OpenAI-GPT-3.5-turbo algorithm. By the time of writing, new versions of the GPT algorithms have been provided, which should also be explored and compared. The proposed architecture could nonetheless be applied to newly developed models. It is important that the architecture is flexible and universal. In our case, the JavaScript API calls provide possible future compatibility with the new cloud-based LLM AI algorithms.
With the proposed technical framework, novel decision support systems [50,51] could be developed, providing the possibility of integrating human and AI in the process of generating innovative ideas by leveraging the rapid progress in the field of AI [21].

Author Contributions

Conceptualization, F.L. and A.Š.; funding acquisition, A.Š.; investigation, F.L.; methodology, F.L.; resources, A.Š.; software, A.Š.; supervision, A.Š.; validation, F.L. and A.Š.; visualization, F.L. and A.Š.; writing—original draft, F.L. and A.Š.; writing—review and editing, A.Š. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the financial support from the Slovenian Research and Innovation Agency (research core funding No. P5-0018), the Ministry of Higher Education, Science, and Innovation of the Republic of Slovenia as part of the Next Generation EU National Recovery and Resilience Plan (Grants No. 3330-22-3515; NOO No: C3330-22-953012), and the Erasmus+ program of the European Union (Grant No. 2021-1-MK01-KA220-HED-000027646.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data available on request due to privacy and ethical restrictions, subject to institutional approval.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analysis, or interpretation of the data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A

Figure A1 shows the OpenAI-GPT-3.5-turbo API call code where the message with the prompt is shown as the part of the json structure.
Figure A1. OpenAI-GPT-3.5-turbo API call code.
Figure A1. OpenAI-GPT-3.5-turbo API call code.
Make 05 00065 g0a1

References

  1. Lavrič, F.; Semenkin, E.; Stanovov, V.; Škraba, A. Review of tools and methodology for efficient group idea generation and decision making. In Proceedings of the 41st International Conference on Organizational Science Development, Portorož, Slovenia, 23–25 March 2022; University of Maribor Press: Maribor, Slovenia, 2022; pp. 489–501. [Google Scholar] [CrossRef]
  2. Lavrič, F.; Škraba, A. Group Brainstorming Support by ChatGPT & Ayoa at the Design of Regional Development Plan. In Proceedings of the 42th International Conference on Organizational Science Development, Portorož, Slovenia, 22–24 March 2023; University of Maribor Press: Maribor, Slovenia, 2023; pp. 555–567. [Google Scholar] [CrossRef]
  3. RRA Gorenjske. Regional Gorenjska Development Plan 2021–2027|Regionalni Razvojni Program Gorenjske 2021–2027. Regionalna Razvojna Agencija Gorenjske BSC, Kranj, 22 June 2022. Available online: https://www.bsc-kranj.si/library/files/upload/RRP%20GORENJSKE%2020212027_sprejet.pdf (accessed on 18 July 2023).
  4. Škraba, A.; Filipič, B. Z informacijsko tehnologijo podprta izvedba sestankov regionalnih razvojnih odborov v fazi zbiranja idej. In Razvojni Izzivi Slovenije; Janez, N., Perko, D., Eds.; Geografski inštitut Antona Melika ZRC SAZU: Ljubljana, Slovenia, 2009; pp. 241–250. [Google Scholar]
  5. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models are Unsupervised Multitask Learners. 2019. Available online: https://github.com/codelucas/newspaper (accessed on 6 September 2023).
  6. Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2015, arXiv:1511.06434. [Google Scholar]
  7. Kung, T.H.; Cheatham, M.; Medenilla, A.; Sillos, C.; De Leon, L.; Elepaño, C.; Madriaga, M.; Aggabao, R.; Diaz-Candido, G.; Maningo, J.; et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLoS Digit. Health 2023, 2, e0000198. [Google Scholar] [CrossRef] [PubMed]
  8. Haque, M.U.; Dharmadasa, I.; Sworna, Z.T.; Rajapakse, R.N.; Ahmad, H. ‘I think this is the most disruptive technology’: Exploring Sentiments of ChatGPT Early Adopters using Twitter Data. Dec. arXiv 2022, arXiv:2212.05856. [Google Scholar]
  9. Li, Z.; Zhou, J.; An, Z.; Cheng, W.; Hu, B. Deep Hierarchical Ensemble Model for Suicide Detection on Imbalanced Social Media Data. Entropy 2022, 24, 442. [Google Scholar] [CrossRef]
  10. Burger, B.; Kanbach, D.K.; Kraus, S.; Breier, M.; Corvello, V. On the use of AI-based tools like ChatGPT to support management research. Eur. J. Innov. Manag. 2023, 26, 233–241. [Google Scholar] [CrossRef]
  11. Maaravi, Y.; Heller, B.; Shoham, Y.; Mohar, S.; Deutsch, B. Ideation in the digital age: Literature review and integrative model for electronic brainstorming. Rev. Manag. Sci. 2021, 15, 1431–1464. [Google Scholar] [CrossRef]
  12. Memmert, L.; Tavanapour, N. Towards Human-AI-Collaboration in Brainstorming: Empirical Insights into the Perception of Working with a Generative AI. ECIS 2023 Proceedings. Kristiansand, Norway, 2023. Available online: https://aisel.aisnet.org/ecis2023_rp (accessed on 6 September 2023).
  13. UNDP. Collective Intelligence in Action Using Machine Data and Insights to Improve UNDP Sensemaking Report on the Portfolio Analytics for Strategic Insights (PASI) Project; UNDP. 2022. Available online: https://www.undp.org/sites/g/files/zskgke326/files/2022-10/Collective%20intelligence%20in%20action%20-%20Using%20machine%20data%20and%20insights%20to%20improve%20UNDP%20Sensemaking_Report_1.pdf (accessed on 27 July 2023).
  14. Morooka, F.E.; Junior, A.M.; Sigahi, T.F.A.C.; Pinto, J.d.S.; Rampasso, I.S.; Anholon, R. Deep Learning and Autonomous Vehicles: Strategic Themes, Applications, and Research Agenda Using SciMAT and Content-Centric Analysis, a Systematic Review. Mach. Learn. Knowl. Extr. 2023, 5, 763–781. [Google Scholar] [CrossRef]
  15. Demertzis, K.; Demertzis, S.; Iliadis, L. A Selective Survey Review of Computational Intelligence Applications in the Primary Subdomains of Civil Engineering Specializations. Appl. Sci. 2023, 13, 3380. [Google Scholar] [CrossRef]
  16. Suran, S.; Pattanaik, V.; Draheim, D. The Generic Collective Intelligence Framework A Qualitative View. 2023. Available online: https://ssrn.com/abstract=4468723 (accessed on 6 September 2023).
  17. Wolf, M.; Krause, J.; Carney, P.A.; Bogart, A.; Kurvers, R.H.J.M. Collective intelligence meets medical decision-making: The collective outperforms the best radiologist. PLoS ONE 2015, 10, e0134269. [Google Scholar] [CrossRef]
  18. Lykourentzou, I.; Vergados, D.J.; Kapetanios, E.; Loumos, V. Collective intelligence systems: Classification and modeling. J. Emerg. Technol. Web Intell. 2011, 3, 217–226. [Google Scholar] [CrossRef]
  19. Duan, Y.; Edwards, J.S.; Dwivedi, Y.K. Artificial intelligence for decision making in the era of Big Data—Evolution, challenges and research agenda. Int. J. Inf. Manag. 2019, 48, 63–71. [Google Scholar] [CrossRef]
  20. Amini, M.M.; Jesus, M.; Sheikholeslami, D.F.; Alves, P.; Benam, A.H.; Hariri, F. Artificial Intelligence Ethics and Challenges in Healthcare Applications: A Comprehensive Review in the Context of the European GDPR Mandate. Mach. Learn. Knowl. Extr. 2023, 5, 1023–1035. [Google Scholar] [CrossRef]
  21. Ayoa. Ayoa Mind Mapping, Whiteboards & Tasks. 2023. Available online: https://www.ayoa.com/ (accessed on 18 August 2023).
  22. OpenAI. GPT-4 Technical Report; OpenAI: 2023. arXiv 2023, arXiv:2303.08774. [Google Scholar]
  23. Koložvari, A.; Stojanović, R.; Zupan, A.; Semenkin, E.; Stanovov, V.; Kofjač, D.; Škraba, A. Speech-recognition cloud harvesting for improving the navigation of cyber-physical wheelchairs for disabled persons. Microprocess Microsyst 2019, 69, 179–187. [Google Scholar] [CrossRef]
  24. OpenAI. Tokenizer—OpenAI Platform. Available online: https://platform.openai.com/tokenizer (accessed on 18 July 2023).
  25. Marion, S. How to Use OpenAI Model Temperature? Available online: https://gptforwork.com/guides/openai-gpt3-temperature (accessed on 15 March 2023).
  26. Cahan, P.; Treutlein, B. A conversation with ChatGPT on the role of computational systems biology in stem cell research. Stem Cell Rep. 2023, 18, 1–2. [Google Scholar] [CrossRef]
  27. Lahitani, A.R.; Permanasari, A.E.; Setiawan, N.A. Cosine Similarity to Determine Similarity Measure: Study Case in Online Essay Assessment. In Proceedings of the 4th International Conference on Cyber and IT Service Management, Bandung, Indonesia, 26–27 April 2016; pp. 1–6. [Google Scholar] [CrossRef]
  28. Wahyuningsih, T. Text Mining an Automatic Short Answer Grading (ASAG), Comparison of Three Methods of Cosine Similarity, Jaccard Similarity and Dice’s Coefficient. J. Appl. Data Sci. 2021, 2, 45–54. [Google Scholar] [CrossRef]
  29. Karakose, T.; Demirkol, M.; Yirci, R.; Polat, H.; Ozdemir, T.Y.; Tülübaş, T. A Conversation with ChatGPT about Digital Leadership and Technology Integration: Comparative Analysis Based on Human–AI Collaboration. Adm. Sci. 2023, 13, 157. [Google Scholar] [CrossRef]
  30. Alshami, A.; Elsayed, M.; Ali, E.; Eltoukhy, A.E.E.; Zayed, T. Harnessing the Power of ChatGPT for Automating Systematic Review Process: Methodology, Case Study, Limitations, and Future Directions. Systems 2023, 11, 351. [Google Scholar] [CrossRef]
  31. Pavlik, J.V. Collaborating with ChatGPT: Considering the Implications of Generative Artificial Intelligence for Journalism and Media Education. J. Mass Commun. Educ. 2023, 78, 84–93. [Google Scholar] [CrossRef]
  32. Dwivedi, Y.K.; Kshetri, N.; Hughes, L.; Slade, E.L.; Jeyaraj, A.; Kar, A.K.; Baabdullah, A.M.; Koohang, A.; Raghavan, V.; Ahuja, M.; et al. Opinion Paper: “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. Int. J. Inf. Manag. 2023, 71, 102642. [Google Scholar] [CrossRef]
  33. Temsah, O.; Khan, S.A.; Chaiah, Y.; Senjab, A.; Alhasan, K.; Jamal, A.; Aljamaan, F.; Malki, K.H.; Halwani, R.; Al-Tawfiq, J.A.; et al. Overview of Early ChatGPT’s Presence in Medical Literature: Insights from a Hybrid Literature Review by ChatGPT and Human Experts. Cureus 2023, 15, e37281. [Google Scholar] [CrossRef]
  34. Lee, M. A Mathematical Investigation of Hallucination and Creativity in GPT Models. Mathematics 2023, 11, 2320. [Google Scholar] [CrossRef]
  35. Lilliefors, H.W. On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown. J. Am. Stat. Assoc. 1967, 62, 399–402. [Google Scholar] [CrossRef]
  36. Henze, N.; Meintanis, S.G. Recent and classical tests for exponentiality: A partial review with comparisons. Metrika 2005, 61, 29–45. [Google Scholar] [CrossRef]
  37. MATLAB. MATLAB Version 9.10.0.1602886 (R2021a); MATLAB: Natick, MA, USA, 2022. [Google Scholar]
  38. Wolfram, S. Mystery of Entropy FINALLY Solved after 50 Years? 15 August 2023. Available online: https://youtu.be/dkpDjd2nHgo?t=241 (accessed on 1 September 2023).
  39. Kumar, Y.; Morreale, P.; Sorial, P.; Delgado, J.; Li, J.J.; Martins, P. A Testing Framework for AI Linguistic Systems (testFAILS). Electronics 2023, 12, 3095. [Google Scholar] [CrossRef]
  40. Uddin, S.M.J.; Albert, A.; Ovid, A.; Alsharef, A. Leveraging ChatGPT to Aid Construction Hazard Recognition and Support Safety Education and Training. Sustainability 2023, 15, 7121. [Google Scholar] [CrossRef]
  41. Liao, W.; Liu, Z.; Dai, H.; Xu, S.; Wu, Z.; Zhang, Y.; Huang, X.; Zhu, D.; Cai, H.; Liu, T.; et al. Differentiate ChatGPT-generated and Human-Written Medical Texts. arXiv 2023, arXiv:2304.11567. [Google Scholar]
  42. Mitrović, S.; Andreoletti, D.; Ayoub, O. ChatGPT or Human? Detect and Explain. Explaining Decisions of Machine Learning Model for Detecting Short ChatGPT-generated Text. arXiv 2023, arXiv:2301.13852. [Google Scholar]
  43. Islam, N.; Sutradhar, D.; Noor, H.; Raya, J.T.; Maisha, M.T.; Farid, D.M. Distinguishing Human Generated Text From ChatGPT Generated Text Using Machine Learning. arXiv 2023, arXiv:2306.01761. [Google Scholar]
  44. Gao, C.A.; Howard, F.M.; Markov, N.S.; Dyer, E.C.; Ramesh, S.; Luo, Y.; Pearson, A.T. Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers. NPJ Digit. Med. 2023, 6, 75. [Google Scholar] [CrossRef] [PubMed]
  45. Joy, R.; Ventayen, M. OpenAI ChatGPT Generated Results: Similarity Index of Artificial Intelligence (AI) Based Model. 2023. Available online: https://ssrn.com/abstract=4332664 (accessed on 6 September 2023).
  46. OpenAI. API Reference—OpenAI API. 2023. Available online: https://platform.openai.com/docs/api-reference/audio (accessed on 6 September 2023).
  47. OpenAI. Does Temperature Go to 1 or 2? 2023. Available online: https://community.openai.com/t/does-temperature-go-to-1-or-2/174095 (accessed on 6 September 2023).
  48. Hung, J.; Chen, J. The Benefits, Risks and Regulation of Using ChatGPT in Chinese Academia: A Content Analysis. Soc. Sci. 2023, 12, 380. [Google Scholar] [CrossRef]
  49. Taecharungroj, V. ‘What Can ChatGPT Do?’ Analyzing Early Reactions to the Innovative AI Chatbot on Twitter. Big Data Cogn. Comput. 2023, 7, 35. [Google Scholar] [CrossRef]
  50. Almazyad, M.; Aljofan, F.; A Abouammoh, N.; Muaygil, R.; Malki, K.H.; Aljamaan, F.; Alturki, A.; Alayed, T.; Alshehri, S.S.; Alsatrawi, M.; et al. Enhancing Expert Panel Discussions in Pediatric Palliative Care: Innovative Scenario Development and Summarization With ChatGPT-4. Cureus 2023, 15, e38249. [Google Scholar] [CrossRef] [PubMed]
  51. Liu, S.; Wright, A.P.; Patterson, B.L.; Wanderer, J.P.; Turer, R.W.; Nelson, S.D.; McCoy, A.B.; Sittig, D.F.; Wright, A. Using AI-generated suggestions from ChatGPT to optimize clinical decision support. J. Am. Med. Inform. Assoc. 2023, 30, 1237–1245. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Tokenization of the initial brainstorming question.
Figure 1. Tokenization of the initial brainstorming question.
Make 05 00065 g001
Figure 2. A flowchart for conducting a brainstorming session with the aid of OpenAI. Only a fragment of an entire brainstorming session is shown, which includes the application of OpenAI.
Figure 2. A flowchart for conducting a brainstorming session with the aid of OpenAI. Only a fragment of an entire brainstorming session is shown, which includes the application of OpenAI.
Make 05 00065 g002
Figure 3. Design of the hybrid human–OpenAI tool for generating ideas.
Figure 3. Design of the hybrid human–OpenAI tool for generating ideas.
Make 05 00065 g003
Figure 4. Mockup of the user interface on an iPhone (375 × 667 px).
Figure 4. Mockup of the user interface on an iPhone (375 × 667 px).
Make 05 00065 g004
Figure 5. On the left: word cloud for 95 ideas generated by Gorenjska’s regional development committee in the 2018, on the right: corresponding histogram of the top 20 words with frequency on the x-axis.
Figure 5. On the left: word cloud for 95 ideas generated by Gorenjska’s regional development committee in the 2018, on the right: corresponding histogram of the top 20 words with frequency on the x-axis.
Make 05 00065 g005
Figure 6. In the left column: word clouds for 95 ideas generated by the GPT API at temperatures of 0, 0.25, 0.5, and 0.75. In the right column: corresponding word histogram with frequency on the x-axis for the top 20 most frequent words.
Figure 6. In the left column: word clouds for 95 ideas generated by the GPT API at temperatures of 0, 0.25, 0.5, and 0.75. In the right column: corresponding word histogram with frequency on the x-axis for the top 20 most frequent words.
Make 05 00065 g006
Figure 7. In the left column: word cloud for 95 ideas generated by the GPT API at temperatures of 1, 1.25, 1.5, and 1.75. In the right column: corresponding word histogram with frequency on the x-axis for the top 20 most frequent words.
Figure 7. In the left column: word cloud for 95 ideas generated by the GPT API at temperatures of 1, 1.25, 1.5, and 1.75. In the right column: corresponding word histogram with frequency on the x-axis for the top 20 most frequent words.
Make 05 00065 g007
Figure 8. (Left) Entropy H at different temperatures of the GPT-3 model. (Right) Average time taken to generate ideas, in seconds [s] (N = 95).
Figure 8. (Left) Entropy H at different temperatures of the GPT-3 model. (Right) Average time taken to generate ideas, in seconds [s] (N = 95).
Make 05 00065 g008
Figure 9. Histograms of the entropy H for the human group (humanGroup) and the OpenAI-generated sets of ideas at temperatures from 0 (temp0) to 1.75 (temp1.75).
Figure 9. Histograms of the entropy H for the human group (humanGroup) and the OpenAI-generated sets of ideas at temperatures from 0 (temp0) to 1.75 (temp1.75).
Make 05 00065 g009
Figure 10. Correlation between the length of the generated ideas (number of characters) and the time taken to generate ideas in seconds. r2 represents the squared value of correlation coefficient r and p statistical significance.
Figure 10. Correlation between the length of the generated ideas (number of characters) and the time taken to generate ideas in seconds. r2 represents the squared value of correlation coefficient r and p statistical significance.
Make 05 00065 g010
Figure 11. Distribution of the time needed to generate ideas in seconds.
Figure 11. Distribution of the time needed to generate ideas in seconds.
Make 05 00065 g011
Figure 12. Distribution of the number of characters in the generated ideas.
Figure 12. Distribution of the number of characters in the generated ideas.
Make 05 00065 g012
Table 1. Results of the Shapiro–Wilk test for the human group and the results of OpenAI at temperatures from 0 to 1.75 with a step of 0.25.
Table 1. Results of the Shapiro–Wilk test for the human group and the results of OpenAI at temperatures from 0 to 1.75 with a step of 0.25.
Group|Temp.Wp-Value
Human group0.9580.004
temp00.7040.000
temp0.250.9730.048
temp0.50.9960.995
temp0.750.9340.000
temp10.9850.364
temp1.250.9920.824
temp1.50.8100.000
temp1.750.9020.000
Table 2. Mean, standard deviation, skewness, and kurtosis for times in milliseconds needed to generate a particular idea.
Table 2. Mean, standard deviation, skewness, and kurtosis for times in milliseconds needed to generate a particular idea.
hGroupT0T0.25T0.5T0.75T1T1.25T1.5T1.75
Mean1009547524884482917887835996221402420627
Stdev1005572047150023352100235536191346524511
Skew2.862−1.3191.8991.1250.3221.0690.973.1861.806
Kurt14.8935.0388.7967.0223.6575.2074.24714.4215.842
Table 3. Mean, standard deviation, skewness, and kurtosis for the number of characters in a particular idea.
Table 3. Mean, standard deviation, skewness, and kurtosis for the number of characters in a particular idea.
hGroupT0T0.25T0.5T0.75T1T1.25T1.5T1.75
Mean78246233223219228244414762
Stdev53112964617191391877
Skew1.751−0.2330.40.9690.5420.9890.9472.7451.842
Kurt5.9112.7193.6346.0724.3974.6364.04811.126.203
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lavrič, F.; Škraba, A. Brainstorming Will Never Be the Same Again—A Human Group Supported by Artificial Intelligence. Mach. Learn. Knowl. Extr. 2023, 5, 1282-1301. https://doi.org/10.3390/make5040065

AMA Style

Lavrič F, Škraba A. Brainstorming Will Never Be the Same Again—A Human Group Supported by Artificial Intelligence. Machine Learning and Knowledge Extraction. 2023; 5(4):1282-1301. https://doi.org/10.3390/make5040065

Chicago/Turabian Style

Lavrič, Franc, and Andrej Škraba. 2023. "Brainstorming Will Never Be the Same Again—A Human Group Supported by Artificial Intelligence" Machine Learning and Knowledge Extraction 5, no. 4: 1282-1301. https://doi.org/10.3390/make5040065

Article Metrics

Back to TopTop