Future Internet

15 pages, 6409 KiB

Open Access Article

A Hybrid Deep Learning Model with Self-Improved Optimization Algorithm for Detection of Security Attacks in IoT Environment

by Amit Sagu, Nasib Singh Gill, Preeti Gulia, Jyotir Moy Chatterjee and Ishaani Priyadarshini

Future Internet 2022, 14(10), 301; https://doi.org/10.3390/fi14100301 - 19 Oct 2022

Cited by 12 | Viewed by 2158

Abstract

With the growth of the Internet of Things (IoT), security attacks are also rising gradually. Numerous centralized mechanisms have recently been introduced for detecting attacks in IoT, in which an attack recognition scheme deployed at a vital point of the network gathers network data and categorizes it as “Attack” or “Normal”. Nevertheless, these schemes have not achieved noteworthy results because of the diverse requirements of IoT devices, such as distribution, scalability, lower latency, and resource limits. The present paper proposes a hybrid model for the detection of attacks in an IoT environment that involves three stages. Initially, the higher-order statistical features (kurtosis, variance, moments), mutual information (MI), symmetric uncertainty, information gain ratio (IGR), and relief-based features are extracted. Then, detection takes place using a Gated Recurrent Unit (GRU) and Bidirectional Long Short-Term Memory (Bi-LSTM) to recognize the existence of network attacks. To improve the classification accuracy, the weights of the Bi-LSTM are optimally tuned via a self-upgraded Cat and Mouse Optimizer (SU-CMO). The improvement achieved by the proposed scheme is established on two distinct datasets using a variety of metrics, including classification accuracy, Rand index, F-measure, and MCC. In terms of all performance measures, the proposed model outperforms both traditional and state-of-the-art techniques.

(This article belongs to the Special Issue Privacy and Cybersecurity in the Artificial Intelligence Age)
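The higher-order statistical features named in the abstract above (variance, moments, kurtosis) can be illustrated with a minimal sketch. This is not the authors' feature extractor: the function names, the population (biased) form of the moments, the zero-variance convention, and the sample values are all assumptions made for illustration.

```python
import math


def central_moment(values, order):
    """Population (biased) central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def higher_order_features(values):
    """Variance, skewness, and (non-excess) kurtosis of one feature column."""
    variance = central_moment(values, 2)
    if variance == 0:
        # Assumed convention: a constant column carries no shape information.
        return {"variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    std = math.sqrt(variance)
    return {
        "variance": variance,
        "skewness": central_moment(values, 3) / std ** 3,
        "kurtosis": central_moment(values, 4) / variance ** 2,
    }


# Hypothetical packet sizes (bytes) from one traffic window.
packet_sizes = [60, 60, 64, 1500, 60, 72, 60, 1500]
print(higher_order_features(packet_sizes))
```

Each traffic record or window would contribute one such feature vector, which could then be combined with the information-theoretic features before classification.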

23 pages, 1292 KiB

Open Access Article

Towards Reliable Baselines for Document-Level Sentiment Analysis in the Czech and Slovak Languages

by Ján Mojžiš, Peter Krammer, Marcel Kvassay, Lenka Skovajsová and Ladislav Hluchý

Future Internet 2022, 14(10), 300; https://doi.org/10.3390/fi14100300 - 19 Oct 2022

Cited by 3 | Viewed by 1759

Abstract

This article helps establish reliable baselines for document-level sentiment analysis in highly inflected languages like Czech and Slovak. We revisit an earlier study representing the first comprehensive formulation of such baselines in Czech and show that some of its reported results need to be significantly revised. More specifically, we show that its online product review dataset contained more than 18% non-trivial duplicates, which incorrectly inflated its macro F1-measure results by more than 19 percentage points. We also establish that part-of-speech-related features have no damaging effect on machine learning algorithms (contrary to the claim made in the study) and rehabilitate the Chi-squared metric for feature selection as being on par with the best-performing metrics such as Information Gain. We demonstrate that in feature selection experiments with Information Gain and Chi-squared metrics, the top 10% of ranked unigram and bigram features suffice for the best results regarding online product and movie reviews, while the top 5% of ranked unigram and bigram features are optimal for the Facebook dataset. Finally, we reiterate an important but often ignored warning by George Forman and Martin Scholz that different possible ways of averaging the F1-measure in cross-validation studies of highly unbalanced datasets can lead to results differing by more than 10 percentage points. This can invalidate the comparisons of F1-measure results across different studies if incompatible ways of averaging F1 are used.

(This article belongs to the Special Issue Trends of Data Science and Knowledge Discovery)
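The warning about averaging the F1-measure across cross-validation folds can be made concrete with a small sketch. It contrasts two common conventions: averaging the per-fold F1 scores, and computing a single F1 from the true-positive, false-positive, and false-negative counts pooled over all folds. The fold counts are invented for illustration, and returning 0.0 for an undefined F1 is an assumed convention, not something taken from the article.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; 0.0 when undefined (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def f1_mean_of_folds(folds):
    """Average the F1 score computed separately on each fold."""
    return sum(f1(tp, fp, fn) for tp, fp, fn in folds) / len(folds)


def f1_pooled(folds):
    """Sum TP/FP/FN over all folds first, then compute one F1."""
    tp = sum(fold[0] for fold in folds)
    fp = sum(fold[1] for fold in folds)
    fn = sum(fold[2] for fold in folds)
    return f1(tp, fp, fn)


# Hypothetical (tp, fp, fn) counts for a rare positive class in 5 folds.
folds = [(0, 1, 1), (1, 0, 1), (3, 1, 0), (0, 0, 2), (2, 2, 0)]
print("mean of per-fold F1:", f1_mean_of_folds(folds))
print("F1 from pooled counts:", f1_pooled(folds))
```

With few positives per fold, folds that score zero drag the per-fold average down, so the two numbers can diverge noticeably even though they describe the same predictions.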

16 pages, 457 KiB

Open Access Article

Modeling and Analyzing Preemption-Based Service Prioritization in 5G Networks Slicing Framework

by Yves Adou, Ekaterina Markova and Yuliya Gaidamaka

Future Internet 2022, 14(10), 299; https://doi.org/10.3390/fi14100299 - 18 Oct 2022

Cited by 5 | Viewed by 3494

Abstract

The Network Slicing (NS) technology, recognized as one of the key enabling features of Fifth Generation (5G) wireless systems, provides very flexible ways to efficiently accommodate, on common physical infrastructures such as Base Stations (BSs), multiple logical networks referred to as Network Slice Instances (NSIs). To ensure the required Quality of Service (QoS) levels, the NS technology relies on classical Resource Reservation (RR) or Service Prioritization schemes. The current paper proposes a Preemption-based Prioritization (PP) scheme “merging” the classical RR and Service Prioritization schemes. The efficiency of the proposed PP scheme is evaluated using a Queueing System (QS) model that analyzes the operation of multiple NSIs with various requirements at common 5G BSs. As a key result, the proposed PP scheme can provide up to 100% gain in terms of blocking probabilities of arriving requests with respect to some baseline.

(This article belongs to the Special Issue AI and Security in 5G Cooperative Cognitive Radio Networks)

12 pages, 962 KiB

Open Access Article

A Unified PUF and Crypto Core Exploiting the Metastability in Latches

by Ronaldo Serrano, Ckristian Duran, Marco Sarmiento, Tuan-Kiet Dang, Trong-Thuc Hoang and Cong-Kha Pham

Future Internet 2022, 14(10), 298; https://doi.org/10.3390/fi14100298 - 17 Oct 2022

Cited by 3 | Viewed by 1659

Abstract

Hardware acceleration of cryptography algorithms represents an emerging approach to obtain benefits in terms of speed and side-channel resistance compared to software implementations. In addition, a hardware implementation can provide the possibility of unifying the functionality with some secure primitive, for example, a true random number generator (TRNG) or a physical unclonable function (PUF). This paper presents a unified PUF-ChaCha20 in a field-programmable gate array (FPGA) implementation. The problems and solutions of the PUF implementation are described, exploiting the metastability in latches. The Xilinx Artix-7 XC7A100TCSG324-1 FPGA implementation occupies 2416 look-up tables (LUTs) and 1026 flip-flops (FFs), reporting a 3.11% area overhead. The PUF exhibits values of 49.15%, 47.52%, and 99.25% for the average uniformity, uniqueness, and reliability, respectively. Finally, ChaCha20 reports a speed of 0.343 cycles per bit with the unified implementation.

(This article belongs to the Special Issue Cyber Security Challenges in the New Smart Worlds)
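The three PUF quality figures quoted in the abstract above (uniformity, uniqueness, reliability) are usually defined through Hamming weights and Hamming distances of response bit strings. The sketch below follows those commonly used definitions; the paper's exact formulas may differ, and the function names and tiny bit vectors are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def uniformity(response):
    """Percentage of 1 bits in one response (ideal: 50%)."""
    return 100.0 * sum(response) / len(response)


def uniqueness(responses):
    """Mean pairwise inter-device Hamming distance in percent (ideal: 50%)."""
    pairs = list(combinations(responses, 2))
    bits = len(responses[0])
    return 100.0 * sum(hamming(a, b) for a, b in pairs) / (len(pairs) * bits)


def reliability(reference, remeasured):
    """100% minus the mean intra-device Hamming distance (ideal: 100%)."""
    bits = len(reference)
    mean_hd = sum(hamming(reference, r) for r in remeasured) / (len(remeasured) * bits)
    return 100.0 * (1.0 - mean_hd)


# Hypothetical 4-bit responses; real PUF responses are much longer.
devices = [[1, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]]
print(uniformity(devices[0]), uniqueness(devices))
print(reliability(devices[0], [[1, 0, 1, 0], [1, 0, 1, 1]]))
```

Under these definitions, the reported values of 49.15% and 47.52% sit close to the 50% ideal for uniformity and uniqueness, and 99.25% sits close to the 100% ideal for reliability.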

16 pages, 3431 KiB

Open Access Article

Improved Dragonfly Optimization Algorithm for Detecting IoT Outlier Sensors

by Maytham N. Meqdad, Seifedine Kadry and Hafiz Tayyab Rauf

Future Internet 2022, 14(10), 297; https://doi.org/10.3390/fi14100297 - 17 Oct 2022

Cited by 2 | Viewed by 1878

Abstract

Things receive digital intelligence by being connected to the Internet and by adding sensors. With the use of real-time data and this intelligence, things may communicate with one another autonomously. The environment surrounding us will become more intelligent and reactive, merging the digital and physical worlds thanks to the Internet of Things (IoT). In this paper, an optimal methodology is proposed for distinguishing outlier sensors in the Internet of Things based on a modified design of the dragonfly optimization technique. Here, a modified structure of the dragonfly optimization algorithm is utilized for optimal area coverage and energy consumption reduction. This paper uses four parameters to evaluate its efficiency: the minimum number of nodes in the coverage area, the lifetime of the network, including the time interval from the start of the first node to the shutdown time of the first node, and the network power. The results of the suggested method are compared with those of some other published methods. The results show that, as the number of steps increases, the live nodes eventually exhaust their energy and shut down. Nodes start shutting down after 350 steps with the LEACH method, after 750 steps with the RED-LEACH method, and after 915 steps with the GSA-based method, compared with 1227 steps for the proposed method, which means that its nodes stay alive longer. Simulations indicate that the suggested method achieves better results than the other examined techniques according to the provided performance parameters.

(This article belongs to the Special Issue Internet of Things (IoT) for Industry 4.0)

14 pages, 950 KiB

Open Access Article

Smart Preliminary Channel Access to Support Real-Time Traffic in Wi-Fi Networks

by Kirill Chemrov, Dmitry Bankov, Evgeny Khorov and Andrey Lyakhov

Future Internet 2022, 14(10), 296; https://doi.org/10.3390/fi14100296 - 16 Oct 2022

Cited by 4 | Viewed by 1885

Abstract

Real-time applications (RTA) are an important use case for IEEE 802.11be, a new amendment to the Wi-Fi standard. This amendment introduces new, complicated mechanisms to provide low delay and high reliability for RTA, but many of them are not supported by legacy devices that may be present in future Wi-Fi networks. In contrast, the preliminary channel access (PCA) method is designed to satisfy strict RTA requirements even in the presence of legacy devices and does not require significant changes to the Wi-Fi protocol. However, it significantly reduces the capacity for non-RTA traffic. This paper introduces a Smart PCA method, which improves the performance of all the stations in scenarios with multiple RTA stations. Extensive simulation shows that the Smart PCA method guarantees low delays for intensive RTA traffic in these scenarios. Moreover, it doubles the network capacity for the stations with non-RTA traffic.

(This article belongs to the Collection 5G/6G Networks for the Internet of Things: Communication Technologies and Challenges)

20 pages, 1837 KiB

Open Access Article

Study of the Organization and Implementation of E-Learning in Wartime Inside Ukraine

by Liudmyla Matviichuk, Stefano Ferilli and Nataliia Hnedko

Future Internet 2022, 14(10), 295; https://doi.org/10.3390/fi14100295 - 15 Oct 2022

Cited by 8 | Viewed by 2343

Abstract

The article provides a factual foundation for the possibility of organizing and implementing e-learning in Ukrainian higher educational institutions during the war. The topicality of the research is supported by the urgent need for experience in organizing and implementing training during wartime, since neither the educational process nor the opportunity to obtain an education should be halted. The study’s goal is to assess the current state of e-learning organization and implementation, as well as to examine students’ attitudes towards the educational process during wartime. Methods such as scientific source analysis and the generalization and systematization of the e-learning experience and its practical application were used to achieve the goal. Furthermore, empirical methods such as interviewing and observation were used. Questionnaires have been proposed as important research tools for this purpose. Based on the findings, four structured groups for the use of e-learning have been formed and identified. We created an e-learning organization and support model based on them. Furthermore, we identified ten poignant factors as the sources of difficulties for teachers when implementing innovations, with limited resources and a lack of time being among the most significant.

(This article belongs to the Special Issue E-Learning and Technology Enhanced Learning II)

21 pages, 516 KiB

Open Access Article

A Comparative Study on Traffic Modeling Techniques for Predicting and Simulating Traffic Behavior

by Taghreed Alghamdi, Sifatul Mostafi, Ghadeer Abdelkader and Khalid Elgazzar

Future Internet 2022, 14(10), 294; https://doi.org/10.3390/fi14100294 - 15 Oct 2022

Cited by 7 | Viewed by 3954

Abstract

The significant advancements in intelligent transportation systems (ITS) have contributed to the increased development of traffic modeling. These advancements include prediction and simulation models that are used to simulate and predict traffic behaviors on highways and urban networks. These models are capable of precise modeling of the current traffic status and accurate predictions of the future status based on varying traffic conditions. However, selecting the appropriate traffic model for a specific environmental setting is challenging and expensive due to the different requirements that need to be considered, such as accuracy, performance, and efficiency. In this research, we present a comprehensive literature review of the research related to traffic prediction and simulation models. We start by highlighting the challenges in the long-term and short-term prediction of traffic modeling. Then, we review the most common nonparametric prediction models. Lastly, we look into the existing literature on traffic simulation tools and traffic simulation algorithms. We summarize the available traffic models, define the required parameters, and discuss the limitations of each model. We hope that this survey serves as a useful resource for traffic management engineers, researchers, and practitioners in this domain.

(This article belongs to the Special Issue IoT in Intelligent Transportation Systems)

52 pages, 1149 KiB

Open Access Article

A Survey of Wi-Fi 6: Technologies, Advances, and Challenges

by Erfan Mozaffariahrar, Fabrice Theoleyre and Michael Menth

Future Internet 2022, 14(10), 293; https://doi.org/10.3390/fi14100293 - 14 Oct 2022

Cited by 16 | Viewed by 10557

Abstract

Wi-Fi is a popular wireless technology and is continuously extended to keep pace with requirements such as high throughput, real-time communication, dense networks, or resource and energy efficiency. The IEEE 802.11ax standard, also known as Wi-Fi 6, promises to provide data rates of up to almost 10 Gb/s, lower energy consumption, and higher reliability. Its capabilities go far beyond Wi-Fi 5 (802.11ac), and novel technical concepts have been introduced for this purpose. As such, the Wi-Fi 6 standard includes Multi-User Orthogonal Frequency Division Multiple Access (MU OFDMA), Multi-User Multiple-Input Multiple-Output (MU MIMO), new mechanisms for Spatial Reuse (SR), new mechanisms for power saving, higher-order modulation, and additional minor improvements. In this paper, we provide a survey of Wi-Fi 6. Initially, we provide a compact technological summary of Wi-Fi 5 and its predecessors. Then, we discuss the potential application domains of Wi-Fi 6, which are enabled through its novel features. Subsequently, we explain these features and review the related works in these areas. Finally, performance evaluation tools for Wi-Fi 6 and future roadmaps are discussed.

(This article belongs to the Special Issue 5G Wireless Communication Networks)

28 pages, 796 KiB

Open Access Article

Experimenting with Routing Protocols in the Data Center: An ns-3 Simulation Approach

by Leonardo Alberro, Felipe Velázquez, Sara Azpiroz, Eduardo Grampin and Matías Richart

Future Internet 2022, 14(10), 292; https://doi.org/10.3390/fi14100292 - 14 Oct 2022

Cited by 3 | Viewed by 1920

Abstract

Massive scale data centers (MSDC) have become a key component of the current content-centric Internet architecture. With scales of up to hundreds of thousands of servers, conveying traffic inside these infrastructures requires much greater connectivity resources than traditional broadband Internet transit networks. MSDCs use Fat-Tree type topologies, which ensure multipath connectivity and constant bisection bandwidth between servers. To properly use the potential advantages of these topologies, specific routing protocols are needed, with multipath support and low control messaging load. These infrastructures are enormously expensive, and therefore it is not possible to use them to experiment with new protocols; that is why scalable and realistic emulation/simulation environments are needed. Based on previous experiences, in this paper we present extensions to the ns-3 network simulator that allow executing the Free Range Routing (FRR) protocol suite, which supports some of the specific MSDC routing protocols. Focusing on the Border Gateway Protocol (BGP), we run a comprehensive set of control plane experiments over Fat-Tree topologies, achieving competitive scalability in a single-host environment, which demonstrates that the modified ns-3 simulator can be effectively used for experimenting in the MSDC. Moreover, the validation was complemented with a theoretical analysis of BGP behavior over selected scenarios. The whole project is available to the community and fully reproducible.

(This article belongs to the Section Smart System Infrastructure and Applications)

18 pages, 1346 KiB

Open Access Article

Natural Language Processing and Cognitive Networks Identify UK Insurers’ Trends in Investor Day Transcripts

by Stefan Claus and Massimo Stella

Future Internet 2022, 14(10), 291; https://doi.org/10.3390/fi14100291 - 12 Oct 2022

Cited by 5 | Viewed by 2403

Abstract

The ability to spot key ideas, trends, and relationships between them in documents is key to financial services, such as banks and insurers. Identifying patterns across vast amounts of domain-specific reports is crucial for devising efficient and targeted supervisory plans, subsequently allocating limited resources where most needed. Today, insurance supervisory planning primarily relies on quantitative metrics based on numerical data (e.g., solvency financial returns). The purpose of this work is to assess whether Natural Language Processing (NLP) and cognitive networks can highlight events and relationships of relevance for regulators that supervise the insurance market, replacing human coding of information with automatic text analysis. To this aim, this work introduces a dataset of $N_{IDT} = 829$ investor transcripts from Bloomberg and explores/tunes 3 NLP techniques: (1) keyword extraction enhanced by cognitive network analysis; (2) valence/sentiment analysis; and (3) topic modelling. Results highlight that keyword analysis, enriched by term frequency-inverse document frequency scores and semantic framing through cognitive networks, could detect events of relevance for the insurance system like cyber-attacks or the COVID-19 pandemic. Cognitive networks were found to highlight events that related to specific financial transitions: The semantic frame of “climate” grew in size by +538% between 2018 and 2020 and outlined an increased awareness that agents and insurers expressed towards climate change. A lexicon-based sentiment analysis achieved a Pearson’s correlation of $\rho = 0.16$ ($p < 0.001$, $N = 829$) between sentiment levels and daily share prices. Although relatively weak, this finding indicates that insurance jargon is insightful to support risk supervision. Topic modelling is considered less amenable to support supervision, because of a lack of results’ stability and an intrinsic difficulty to interpret risk patterns. We discuss how these automatic methods could complement existing supervisory tools in supporting effective oversight of the insurance market.

(This article belongs to the Special Issue Information Networks with Human-Centric AI)
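The Pearson correlation reported in the abstract above relates a per-transcript sentiment score to a share price. A minimal sketch of that statistic follows; the function name and the sample series are hypothetical, and the code does not reproduce the authors' sentiment lexicon, data, or significance test.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


# Hypothetical sentiment scores and same-day share prices.
sentiment = [0.10, -0.05, 0.20, 0.00, 0.15]
share_price = [101.0, 99.5, 100.2, 100.9, 102.3]
print(pearson(sentiment, share_price))
```

A coefficient near 0.16 indicates a weak positive linear association, which matches the abstract's own description of the result as relatively weak.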

34 pages, 1275 KiB

Open Access Article

Unreachable Peers Communication Scheme in Decentralized Networks Based on Peer-to-Peer Overlay Approaches

by Gengxian Li, Chundong Wang and Huaibin Wang

Future Internet 2022, 14(10), 290; https://doi.org/10.3390/fi14100290 - 12 Oct 2022

Cited by 1 | Viewed by 2281

Abstract

Decentralized networks bring us many benefits, but as networks evolve, many nodes either actively or passively become unreachable behind a NAT or a firewall. This has become a hindrance to the development of decentralized networks, where peer-to-peer data transfer between unreachable nodes cannot be accomplished, whether in decentralized file systems, decentralized social networks, or decentralized IoT. Existing schemes require a series of centralized servers or network-wide flooding for consensus data, which can cause the network to lose its decentralized nature and create flooding bottlenecks, contrary to the design concept of decentralization. In this paper, our proposed scheme uses a structured P2P overlay network to store the indexes of unreachable nodes in the whole network, so that the characteristics of a decentralized network are still maintained while ensuring the efficiency of lookup. When nodes communicate, a transmission channel is established so that both nodes continuously transmit data streams peer-to-peer without relying on a central server. Moreover, the scheme guarantees the security and privacy of nodes’ data transmission and of the P2P overlay network without relying on centralized trusted institutions. Finally, we deploy a real cluster environment to verify the effectiveness of each module at different network sizes and prove the overall feasibility of the scheme. The scheme has certain advantages over existing solutions in terms of security, privacy, communication efficiency, device democracy, etc.

(This article belongs to the Special Issue Future Communication Networks for the Internet of Things (IoT))

16 pages, 1058 KiB

Open Access Article

A Self-Supervised Learning Model for Unknown Internet Traffic Identification Based on Surge Period

by Dawei Wei, Feifei Shi and Sahraoui Dhelim

Future Internet 2022, 14(10), 289; https://doi.org/10.3390/fi14100289 - 10 Oct 2022

Cited by 5 | Viewed by 2019

Abstract

The identification of Internet protocols provides a significant basis for maintaining Internet security and improving Internet Quality of Service (QoS). However, the rapid development and updating of Internet technologies and protocols have led to large volumes of unknown Internet traffic, which seriously threaten the safety of the network environment. Since most of the unknown Internet traffic does not have any labels, it is difficult to adopt deep learning directly. Additionally, the feature accuracy and the identification model also strongly affect the identification accuracy. In this paper, we propose a surge period-based feature extraction method that helps remove the negative influence of background traffic in network sessions and acquire as many traffic flow features as possible. In addition, we establish an identification model of unknown Internet traffic based on JigClu, a self-supervised learning approach for training on unlabeled datasets. The model is then combined with a clustering method to further identify unknown Internet traffic. The model has been demonstrated to achieve an accuracy of no less than 74% in identifying unknown Internet traffic with the public dataset ISCXVPN2016 under different scenarios. The work provides a novel solution for unknown Internet traffic identification, which is the most difficult task in identifying Internet traffic. We believe it is a great leap in Internet traffic identification and is of great significance to maintaining the security of the network environment.

(This article belongs to the Special Issue Anomaly Detection in Modern Networks)

13 pages, 2365 KiB

Open Access Article

Playful Meaning-Making as Prosocial Fun

by John M. Carroll, Fanlu Gui, Srishti Gupta and Tiffany Knearem

Future Internet 2022, 14(10), 288; https://doi.org/10.3390/fi14100288 - 30 Sep 2022

Viewed by 1851

Abstract

Smart city infrastructures enable the routine interleaving and integration of diverse activities, including new ways to play, to be playful, and to participate. We discuss three examples: (1) citizen-based water quality monitoring, which combines outdoor exercise and social interaction with safeguarding public water supplies, (2) a digital scavenger hunt, which combines the experiences of a community arts festival with shared reflections about significant community places and events, and (3) public thanking, which encourages people to acknowledge neighbors and local groups that serve and strengthen the community. Each of these interaction possibilities in itself alters lived experience modestly. We argue that lightweight and playful meaning-making activities can be prosocial fun; that is to say, they can be playful and fun while also making substantive contributions to the coherence and richness of a community.

(This article belongs to the Special Issue Information Networks with Human-Centric AI)

20 pages, 2244 KiB

Open Access Article

Complex Cases of Source Code Authorship Identification Using a Hybrid Deep Neural Network

by Anna Kurtukova, Aleksandr Romanov, Alexander Shelupanov and Anastasia Fedotova

Future Internet 2022, 14(10), 287; https://doi.org/10.3390/fi14100287 - 30 Sep 2022

Cited by 1 | Viewed by 2260

Abstract

This paper is a continuation of our previous work on solving source code authorship identification problems. The analysis of heterogeneous source code is a relevant issue for copyright protection in commercial software development. This is related to the specificity of development processes and the usage of collaborative development tools (version control systems). As a result, there are source codes written according to different programming standards by a team of programmers with different skill levels. Another application field is information security, in particular identifying the authors of computer viruses. We apply our technique, based on a hybrid of Inception-v1 and Bidirectional Gated Recurrent Unit architectures, to heterogeneous source codes and consider the complex cases most common in commercial development that negatively affect the authorship identification process. The paper is devoted to the possibilities and limitations of the authors’ technique in various complex cases. For situations where a programmer was proficient in two programming languages, the average accuracy was 87%; for proficiency in three or more, it was 76%. For the artificially generated source code case, the average accuracy was 81.5%. Finally, the average accuracy for source codes generated from commits was 84%. The comparison with state-of-the-art approaches showed that the proposed method has no full-functionality analogs covering actual practical cases.

(This article belongs to the Special Issue Trends of Data Science and Knowledge Discovery)

11 pages, 337 KiB

Open Access Article

Minimization of nth Order Rate Matching in Satellite Networks with One to Many Pairings

by Anargyros J. Roumeliotis, Christos N. Efrem and Athanasios D. Panagopoulos

Future Internet 2022, 14(10), 286; https://doi.org/10.3390/fi14100286 - 30 Sep 2022

Cited by 1 | Viewed by 1196

Abstract

This paper studies, for the first time in the literature, the minimization of nth (positive integer) order rate matching in high-throughput multi-beam satellite systems, based on one-to-many capacity allocation pairings. The offered capacities of gateways and the requested capacities of users’ beams are exploited. Due to the high complexity of the binary optimization problem, its solution is approached with a two-step heuristic scheme. First, the corresponding continuous pairing problem, with variables in [0, 1], is solved by applying difference-of-convex optimization theory; then, a transformation from the continuous to a binary feasible allocation is provided to extract the pairings among gateways and users’ beams. Compared with the exponential-time optimal exhaustive mechanism, which investigates all possible pairs to extract the best matching, extensive simulations show that the presented approximation for the non-convex optimization problem converges quickly and achieves a generally low relative error for lower values of n. Finally, the simulation results show the importance of n in the examined problem. Specifically, pairings obtained by minimizing rate matching with larger n result in fairer rate matching among users’ beams, which is a valuable result for operators of satellite and, more generally, wireless systems.

(This article belongs to the Special Issue Featured Papers in the Section Internet of Things)

12 pages, 1612 KiB

Open Access Article

An Efficient Location-Based Forwarding Strategy for Named Data Networking and LEO Satellite Communications

by Pablo Iglesias-Sanuy, José Carlos López-Ardao, Miguel Rodríguez-Pérez, Sergio Herrería-Alonso, Andrés Suárez-González and Raúl F. Rodríguez-Rubio

Future Internet 2022, 14(10), 285; https://doi.org/10.3390/fi14100285 - 29 Sep 2022

Cited by 5 | Viewed by 2381

Abstract

Low Earth orbit (LEO) satellite constellations are increasingly gaining attention as future global Internet providers. At the same time, named data networking (NDN) is a new data-centric architecture that has recently been proposed to replace the classic TCP/IP architecture, since it is particularly well suited to the most common usage of the Internet nowadays as a content delivery network. Certainly, the use of NDN is especially convenient in highly dynamic network environments, such as those of upcoming LEO constellations incorporating inter-satellite links (ISLs). Among other native facilities, such as inbuilt security, NDN readily supports the mobility of clients, thus helping to overcome one of the main problems raised in LEO satellite networks. Moreover, thanks to a stateful forwarding plane with support for multicast transmission and inbuilt data caches, NDN is also able to make more efficient use of the installed transmission capacity. In this paper, we propose a new location-based forwarding strategy for LEO satellite networks that takes advantage of knowledge of the relative position of the satellites and the grid structure formed by the ISLs to forward NDN packets. Forwarding at each node is thus done using only local information (node and destination locations), without the need to exchange information between nodes, as is the case with conventional routing protocols. Using simulation, we show that the proposed forwarding strategy is a good candidate to promote the efficient and effective future use of the NDN architecture in LEO satellite networks.

(This article belongs to the Special Issue Recent Advances in Information-Centric Networks (ICNs))

21 pages, 2834 KiB

Open Access Article

Modeling and Validating a News Recommender Algorithm in a Mainstream Medium-Sized News Organization: An Experimental Approach

by Paschalia (Lia) Spyridou, Constantinos Djouvas and Dimitra Milioni

Future Internet 2022, 14(10), 284; https://doi.org/10.3390/fi14100284 - 29 Sep 2022

Cited by 2 | Viewed by 1903

Abstract

News recommending systems (NRSs) are algorithmic tools that filter incoming streams of information according to the users’ preferences or point them to additional items of interest. In today’s high-choice media environment, attention shifts easily between platforms and news sites and is greatly affected by algorithmic technologies; news personalization is increasingly used by news media to woo and retain users’ attention and loyalty. The present study examines the implementation of a news recommender algorithm in a leading news media organization on the basis of observation of the recommender system’s outputs. Drawing on an experimental design employing the ‘algorithmic audit’ method, and more specifically the ‘collaborative audit’, which entails utilizing users as testers of algorithmic systems, we analyze the composition of the personalized MyNews area in terms of accuracy and user engagement. Premised on the idea of algorithms being black boxes, the study has a two-fold aim: first, to identify the design parameters involved, shedding light on the underlying functionality of the algorithm, and second, to evaluate the NRS in practice through the deployed experimentation. Results suggest that although the recommender algorithm manages to discriminate between different users on the basis of their past behavior, overall, it underperforms. We find that this is related to flawed design decisions rather than technical deficiencies. The study offers insights to guide improvements to NRS design that both consider the production capabilities of the news organization and support business goals, user demands and journalism’s civic values.

(This article belongs to the Special Issue Theory and Applications of Web 3.0 in the Media Sector)

24 pages, 2162 KiB

Open Access Article

FakeNewsLab: Experimental Study on Biases and Pitfalls Preventing Us from Distinguishing True from False News

by Giancarlo Ruffo and Alfonso Semeraro

Future Internet 2022, 14(10), 283; https://doi.org/10.3390/fi14100283 - 29 Sep 2022

Cited by 3 | Viewed by 2326

Abstract

Misinformation posting and spreading in social media is ignited by personal decisions on the truthfulness of news that may cause wide and deep cascades at a large scale within minutes. When individuals are exposed to information, they usually take a few seconds to decide if the content (or the source) is reliable and whether to share it. Although the opportunity to verify the rumour is often just one click away, many users fail to make a correct evaluation. We studied this phenomenon with a web-based questionnaire that was completed by 7298 different volunteers, where the participants were asked to mark 20 news items as true or false. Interestingly, false news is correctly identified more frequently than true news, but showing the full article instead of just the title, surprisingly, does not increase general accuracy. Additionally, displaying the original source of the news may contribute to misleading the user in some cases, while the genuine wisdom of the crowd can positively assist individuals’ ability to classify news correctly. Finally, participants whose browsing activity suggests a parallel fact-checking activity perform better and declare themselves young adults. This work highlights a series of pitfalls that can influence human annotators when building false news datasets, which in turn can fuel research on automated fake news detection; furthermore, these findings challenge the common rationale of AI that suggests users read the full article before re-sharing.

(This article belongs to the Special Issue Information Networks with Human-Centric AI)

13 pages, 1058 KiB

Open Access Article

Latency Analysis of Blockchain-Based SSI Applications

by Tamas Pflanzner, Hamza Baniata and Attila Kertesz

Future Internet 2022, 14(10), 282; https://doi.org/10.3390/fi14100282 - 29 Sep 2022

Cited by 5 | Viewed by 2270

Abstract

Several revolutionary applications have been built on the distributed ledgers of blockchain (BC) technology. Besides cryptocurrencies, many other application fields can be found in smart systems exploiting smart contracts and Self Sovereign Identity (SSI) management. The Hyperledger Indy platform is a suitable open-source solution for realizing permissioned BC systems for SSI projects. SSI applications usually require short response times from the underlying BC network, which may vary highly depending on the application type, the BC software used, and the actual BC deployment parameters. To support the developers and users of SSI applications, we present a detailed latency analysis of a permissioned BC system built with Indy and Aries. To streamline our experiments, we developed a Python application using containerized Indy and Aries components from official Hyperledger repositories. We deployed our experimental application on multiple virtual machines in the public Google Cloud Platform and on our local, private cloud using a Docker platform with Kubernetes. We evaluated and compared their performance, benchmarked by Read and Write latencies. We found that the local Indy ledger reads 30–50% faster and writes 65–85% faster than the Indy ledger running on the Google Cloud Platform.

(This article belongs to the Special Issue Distributed Systems for Emerging Computing: Platform and Application)

13 pages, 1945 KiB

Open Access Article

Improving Quality Indicators of the Cloud-Based IoT Networks Using an Improved Form of Seagull Optimization Algorithm

by Hamza Mohammed Ridha Al-Khafaji

Future Internet 2022, 14(10), 281; https://doi.org/10.3390/fi14100281 - 29 Sep 2022

Cited by 6 | Viewed by 1817

Abstract

The Internet of things (IoT) refers to billions of devices located worldwide which are connected and share their data over the Internet. Due to new technologies that provide cheap computer chips and universal wireless networks, it is feasible that everything from a small tablet to a very large airplane will be connected to the Internet and will be a part of the IoT. In most applications, IoT network nodes face limitations in terms of energy source and cost. Therefore, the need for innovative methods to improve quality indicators that increase the lifespan of networks is evident. Here, a novel technique is presented to increase the quality of service (QoS) in IoT using an improved meta-heuristic algorithm, called the improved seagull optimization algorithm (ISOA), along with traffic management in these networks. On this basis, the traffic-aware algorithm can manage the sending of packets and increase the QoS provision in terms of time to a great extent. The performance evaluation of the proposed method and comparison with the previous methods demonstrated the accuracy and efficiency of this method and its superiority over the previous works.

(This article belongs to the Special Issue Emerging Technologies, Research Opportunities and Experimentation for Network Virtualization and Cloud Computing)

17 pages, 7457 KiB

Open Access Article

Information Technologies for Real-Time Mapping of Human Well-Being Indicators in an Urban Historical Garden

by Francesco Pirotti, Marco Piragnolo, Marika D’Agostini and Raffaele Cavalli

Future Internet 2022, 14(10), 280; https://doi.org/10.3390/fi14100280 - 29 Sep 2022

Cited by 4 | Viewed by 1720

Abstract

The post-pandemic era has raised awareness of the importance of physical and psychological well-being for decreasing the vulnerability of both individuals and populations. Citizens in urban areas are subject to numerous stress factors which can be mitigated by green spaces such as parks and gardens. Sensor and internet technologies support nature-based solutions in various ways. In this paper, we show the results of ongoing research on the use of spatially distributed IoT sensors that collect climate data in an ~8 ha urban garden. The novelty resides in the method for merging the IoT data with a detailed 3D model created by a laser scan survey from a drone flight. The end products are 1 m resolution thermal comfort maps of user-defined scenarios, e.g., at specific times or aggregated in daily/monthly/yearly statistics that represent a thermal comfort distribution. For full replicability, the code is open source and available as an R package on GitHub.

(This article belongs to the Section Techno-Social Smart Systems)

16 pages, 3659 KiB

Open Access Article

Modelling Analysis of a Novel Frameless Slotted-ALOHA Protocol Based on the Number of Detectable Conflicting Users

by Sa Yang, Suoping Li, Nana Yang and Ying Lin

Future Internet 2022, 14(10), 279; https://doi.org/10.3390/fi14100279 - 28 Sep 2022

Cited by 1 | Viewed by 1435

Abstract

To solve the conflict when multi-user packets are transmitted in a shared wireless link, a novel frameless slotted-ALOHA protocol is proposed. Signature codes are used to help the receiver identify the set of transmitting users, and successive interference cancellation technology is employed to recover conflicting packets. Thus, the information in the conflicting slot can be reused to reduce the number of retransmissions. Taking the number of backlogged users in each slot as a system state, a Markov chain model is established to analyze the protocol, in which the state transition probabilities are obtained based on the binomial distribution of packets sent in a slot. Under the maximum number of detectable conflicting users, the best value is taken, traffic balance equations are obtained, and the expressions of throughput, average number of backlogged users, average successful transmission probability and average memory size are derived. Finally, a numerical simulation is carried out to accurately analyze the influence of the first transmission probability of the packets on various performance indexes and the effectiveness of the theoretical analysis is further verified by the simulation results.

(This article belongs to the Section Internet of Things)
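
The abstract above states that the state transition probabilities follow from the binomial distribution of packets sent in a slot. The sketch below illustrates only that building block and is not the authors' model. It assumes that each backlogged user transmits independently with the same probability; the function names, the parameter `max_detectable`, and the example values are hypothetical.

```python
from math import comb


def slot_transmission_pmf(n_backlogged: int, p_transmit: float) -> list[float]:
    """Return P(k users transmit in one slot) for k = 0..n_backlogged.

    Assumption (illustrative, not taken from the paper): every backlogged
    user transmits independently with the same probability p_transmit.
    """
    if not 0.0 <= p_transmit <= 1.0:
        raise ValueError("p_transmit must be in [0, 1]")
    n = n_backlogged
    return [
        comb(n, k) * p_transmit**k * (1.0 - p_transmit) ** (n - k)
        for k in range(n + 1)
    ]


def prob_resolvable(n_backlogged: int, p_transmit: float, max_detectable: int) -> float:
    """Probability that between 1 and max_detectable users transmit.

    Under the hypothetical reading used here, such a slot is non-empty and
    the receiver can still separate the colliding packets.
    """
    pmf = slot_transmission_pmf(n_backlogged, p_transmit)
    return sum(pmf[1 : max_detectable + 1])


if __name__ == "__main__":
    # Example values are placeholders, not results from the paper.
    print(slot_transmission_pmf(4, 0.5))
    print(prob_resolvable(4, 0.5, max_detectable=2))
```

Probabilities of this kind would populate the rows of a Markov transition matrix over the number of backlogged users; how the paper maps them to transitions is not specified in the abstract.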

20 pages, 42672 KiB

Open Access Article

ReFuse: Generating Imperviousness Maps from Multi-Spectral Sentinel-2 Satellite Imagery

by Giovanni Giacco, Stefano Marrone, Giuliano Langella and Carlo Sansone

Future Internet 2022, 14(10), 278; https://doi.org/10.3390/fi14100278 - 28 Sep 2022

Cited by 2 | Viewed by 1847

Abstract

Continual mapping and monitoring of impervious surfaces are crucial activities to support sustainable urban management strategies and to plan effective actions for environmental changes. In this context, impervious surface coverage is increasingly becoming an essential indicator for assessing urbanization and environmental quality, with several works relying on satellite imagery to determine it. However, although satellite imagery is typically available with a frequency of 3–10 days worldwide, imperviousness maps are released at most annually as they require a huge human effort to be produced and validated. Attempts have been made to extract imperviousness maps from satellite images using machine learning, but their diffusion is limited by (i) the scarcity of reliable and detailed ground truth, (ii) the need to manage different spectral bands, and (iii) the need to make the resulting system easily accessible to end users. To tackle these problems, in this work we introduce a deep-learning-based approach to extract imperviousness maps from multi-spectral Sentinel-2 images, leveraging a very detailed imperviousness map realised by the Italian department for environment protection as ground truth. We also propose a scalable and portable inference pipeline designed to easily scale the approach, integrating it into a web-based Geographic Information System (GIS) application. As a result, even non-expert GIS users can quickly and easily calculate impervious surfaces for any place on Earth (accuracy > 95%), with a frequency limited only by the availability of new satellite images.

(This article belongs to the Special Issue Novel Sources of Geographical Data and Old Planning Problems: New Challenges and Novel Approaches)

14 pages, 3771 KiB

Open Access Article

Deep Learning Based Semantic Image Segmentation Methods for Classification of Web Page Imagery

by Ramya Krishna Manugunta, Rytis Maskeliūnas and Robertas Damaševičius

Future Internet 2022, 14(10), 277; https://doi.org/10.3390/fi14100277 - 27 Sep 2022

Cited by 3 | Viewed by 2252

Abstract

Semantic segmentation is the task of clustering together parts of an image that belong to the same object class. Semantic segmentation of webpages is important for inferring contextual information from the webpage. This study examines and compares deep learning methods for classifying webpages based on imagery that is obscured by semantic segmentation. Fully convolutional neural network architectures (UNet and FCN-8) with defined hyperparameters and loss functions are used to demonstrate how they can support efficient classification in this scenario on custom-prepared webpage imagery data labeled with multi-class, semantically segmented masks for HTML elements such as paragraph text, images, logos, and menus. The proposed Seg-UNet model achieved the best accuracy of 95%. A comparison with various optimizer functions demonstrates the overall efficacy of the proposed semantic segmentation approach.

(This article belongs to the Special Issue Trends of Data Science and Knowledge Discovery)

18 pages, 2718 KiB

Open Access Article

Automated Penetration Testing Framework for Smart-Home-Based IoT Devices

by Rohit Akhilesh, Oliver Bills, Naveen Chilamkurti and Mohammad Jabed Morshed Chowdhury

Future Internet 2022, 14(10), 276; https://doi.org/10.3390/fi14100276 - 27 Sep 2022

Cited by 13 | Viewed by 5581

Abstract

Security testing is fundamental to identifying security vulnerabilities in smart-home-based IoT devices. For this, penetration testing is the most prominent and effective solution. However, testing IoT devices manually is cumbersome and time-consuming. In addition, penetration testing requires deep knowledge of the possible attacks and the available hacking tools. Therefore, this study focuses on building an automated penetration testing framework to discover the most common vulnerabilities in smart-home-based IoT devices. The research involves studying different IoT devices to select five for testing. Then, the common vulnerabilities of the five selected devices are examined, and the penetration testing tools required to detect them are identified. The top five vulnerabilities are selected from the most common ones, and the corresponding tools are combined using a script, which is then implemented in a framework written in Python 3.6. The selected IoT devices are tested individually for known vulnerabilities using the proposed framework. For each vulnerability discovered in a device, the Common Vulnerability Scoring System (CVSS) Base score is calculated, and these scores are summed to give a total score for each device. In our experiment, we found that the Tp-Link Smart Bulb and the Tp-Link Smart Camera had the highest scores and were the most vulnerable, while the Google Home Mini had the lowest score and was the most secure of all the devices. Finally, we conclude that our framework does not require technical expertise and thus can be used by non-experts. This will help improve the field of IoT security and ensure the security of smart homes to build a safe and secure future.

(This article belongs to the Special Issue Privacy and Cybersecurity in the Artificial Intelligence Age)
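
The per-device scoring described in the abstract above (sum the CVSS Base scores of all vulnerabilities found on a device) can be sketched as follows. This is an illustration under stated assumptions, not the authors' framework: the function names, the input shape, and the device labels and scores in the example are all hypothetical placeholders rather than results from the paper.

```python
from collections import defaultdict
from typing import Iterable


def total_scores(findings: Iterable[tuple[str, float]]) -> dict[str, float]:
    """Sum CVSS Base scores per device.

    `findings` is a hypothetical input shape: (device_name, base_score)
    pairs, one per vulnerability discovered on that device.
    """
    totals: dict[str, float] = defaultdict(float)
    for device, score in findings:
        # CVSS Base scores are defined on the range 0.0 to 10.0.
        if not 0.0 <= score <= 10.0:
            raise ValueError(f"invalid CVSS base score for {device}: {score}")
        totals[device] += score
    return dict(totals)


def rank_devices(totals: dict[str, float]) -> list[tuple[str, float]]:
    """Order devices from highest total (most vulnerable) to lowest."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder data for illustration only.
    example = [("device_a", 7.5), ("device_a", 5.0), ("device_b", 2.5)]
    print(rank_devices(total_scores(example)))
```

A plain sum rewards devices with fewer findings, so a device with many low-severity issues can outrank one with a single critical issue; that trade-off follows from the summation rule stated in the abstract.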

20 pages, 4426 KiB

Open Access Article

The Combined Use of UAV-Based RGB and DEM Images for the Detection and Delineation of Orange Tree Crowns with Mask R-CNN: An Approach of Labeling and Unified Framework

by Felipe Lucena, Fabio Marcelo Breunig and Hermann Kux

Future Internet 2022, 14(10), 275; https://doi.org/10.3390/fi14100275 - 27 Sep 2022

Cited by 6 | Viewed by 2150

Abstract

In this study, we used images obtained by Unmanned Aerial Vehicles (UAV) and an instance segmentation model based on deep learning (Mask R-CNN) to evaluate the ability to detect and delineate canopies in high-density orange plantations. The main objective of the work was to evaluate the improvement gained by the segmentation model when integrating the Canopy Height Model (CHM) as a fourth band in the images. Two models were evaluated, one with RGB images and the other with RGB + CHM images, and the results indicated that the model with combined images performs better (overall accuracy increased from 90.42% to 97.01%). In addition to the comparison, this work suggests a more efficient ground truth mapping method and proposes a methodology for mosaicking the results produced by Mask R-CNN on remotely sensed images.

(This article belongs to the Special Issue Advances in Agriculture 4.0)

23 pages, 1368 KiB

Open Access Article

Cloud-Native Observability: The Many-Faceted Benefits of Structured and Unified Logging—A Multi-Case Study

by Nane Kratzke

Future Internet 2022, 14(10), 274; https://doi.org/10.3390/fi14100274 - 26 Sep 2022

Cited by 4 | Viewed by 2960

Abstract

Background: Cloud-native software systems often have a much more decentralized structure and many independently deployable and (horizontally) scalable components, making it more complicated to create a shared and consolidated picture of the overall decentralized system state. Today, observability is often understood as a triad of collecting and processing metrics, distributed tracing data, and logging. The result is often a complex observability system composed of three stovepipes whose data are difficult to correlate. Objective: This study analyzes whether these three historically emerged observability stovepipes of logs, metrics and distributed traces could be handled in a more integrated way and with a more straightforward instrumentation approach. Method: This study applied an action research methodology used mainly in industry–academia collaboration and common in software engineering. The research design utilized iterative action research cycles, including one long-term use case. Results: This study presents a unified logging library for Python and a unified logging architecture that uses the structured logging approach. The evaluation shows that several thousand events per minute are easily processable. Conclusions: The results indicate that a unification of the current observability triad is possible without the need to develop entirely new toolchains.

(This article belongs to the Special Issue Cloud-Native Observability)
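
To make the structured logging approach mentioned in the abstract above concrete, here is a minimal sketch that uses only the Python standard library. It is not the unified logging library the study presents, whose API is not described in the abstract. The `JsonFormatter` class, the `make_logger` helper, the `fields` key passed through `extra`, and the `kind` field used to tell log, metric and trace events apart are all assumptions made for illustration.

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "timestamp": time.strftime(
                "%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)
            ),
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Hypothetical convention: structured key/value pairs are passed
        # via extra={"fields": {...}} and merged into the JSON event.
        event.update(getattr(record, "fields", {}))
        return json.dumps(event, sort_keys=True)


def make_logger(name: str, stream=None) -> logging.Logger:
    """Create a logger that emits JSON lines (stderr when stream is None)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = make_logger("demo")
    # One event stream carries what would otherwise be a metric data point.
    log.info(
        "request_handled",
        extra={"fields": {"kind": "metric", "latency_ms": 12}},
    )
```

Because every event is a flat JSON object, a downstream store can filter on fields such as `kind` instead of maintaining separate pipelines, which is the kind of unification the abstract argues for.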

13 pages, 858 KiB

Open Access Article

Distributed Big Data Storage Infrastructure for Biomedical Research Featuring High-Performance and Rich-Features

by Xingjian Xu, Lijun Sun and Fanjun Meng

Future Internet 2022, 14(10), 273; https://doi.org/10.3390/fi14100273 - 24 Sep 2022

Cited by 1 | Viewed by 1575

Abstract

The biomedical field entered the era of “big data” years ago, and a lot of software is being developed to tackle the analysis problems brought on by big data. However, very few programs focus on providing a solid foundation for file systems of biomedical big data. Since file systems are a key prerequisite for efficient big data utilization, the absence of specialized biomedical big data file systems makes it difficult to optimize storage, accelerate analysis, and enrich functionality, resulting in inefficiency. Here we present F3BFS, a functional, fundamental, and future-oriented distributed file system, specially designed for various kinds of biomedical data. F3BFS makes it possible to boost existing software’s performance without modifying its main algorithms by transmitting raw datasets from generic file systems. Further, F3BFS has various built-in features to help researchers manage biology datasets more efficiently and productively, including metadata management, fuzzy search, automatic backup, transparent compression, etc.

(This article belongs to the Special Issue Software Engineering and Data Science II)

20 pages, 1618 KiB

Open Access Article

Author Identification from Literary Articles with Visual Features: A Case Study with Bangla Documents

by Ankita Dhar, Himadri Mukherjee, Shibaprasad Sen, Md Obaidullah Sk, Amitabha Biswas, Teresa Gonçalves and Kaushik Roy

Future Internet 2022, 14(10), 272; https://doi.org/10.3390/fi14100272 - 23 Sep 2022

Cited by 3 | Viewed by 1840

Abstract

Author identification is an important aspect of literary analysis, studied in natural language processing (NLP). It helps identify the most probable author of articles, news texts or social media comments and tweets, for example. It can be applied to other domains such as criminal and civil cases, cybersecurity, forensics, identification of plagiarizers, and many more. An automated system in this context can thus be very beneficial for society. In this paper, we propose a convolutional neural network (CNN)-based system for identifying the authors of literary articles. This system uses visual features along with a five-layer convolutional neural network for the identification of authors. The prime motivation behind this approach was the feasibility of identifying distinct writing styles through a visualization of the writing patterns. Experiments were performed on 1200 articles from 50 authors, achieving a maximum accuracy of 93.58%. Furthermore, to see how the system performed on different volumes of data, the experiments were performed on partitions of the dataset. The system outperformed standard handcrafted feature-based techniques as well as established works on publicly available datasets.

(This article belongs to the Special Issue Deep Learning and Natural Language Processing)

Future Internet, Volume 14, Issue 10 (October 2022) – 32 articles