Objective Quality of Experience (QoE) for Dynamic Adaptive Streaming over HTTP (DASH) video streaming has received considerable attention in recent years. While a number of objective QoE models exist, they share two limitations: the QoE score is only available after the entire video has been delivered, and the models operate on a per-client basis. Content service providers, however, need to monitor QoE during streaming to understand ensemble performance, such as for live events or when multiple clients stream concurrently. For this purpose, we propose Moving QoE (MQoE, in short) models that measure QoE periodically during video streaming for multiple simultaneous clients. Our first model, MQoE_RF, is a nonlinear model that considers the bitrate gain and the sensitivity to bitrate switching frequency. Our second model, MQoE_SD, is a linear model that captures the standard deviation of the bitrate switching magnitude among segments along with the bitrate gain. We then study the effectiveness of both models in a multi-user mobile client environment, with mobility patterns based on traces from a train, a car, and a ferry, implemented on the GENI testbed. Our study shows that the MQoE models capture QoE behavior during transmission more accurately than static QoE models. Furthermore, MQoE_RF captures the sensitivity to bitrate switching frequency more effectively, while MQoE_SD better captures the sensitivity to the magnitude of bitrate switching. Either model is suitable for content service providers to monitor video streaming, depending on their preference.
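The moving-window idea behind the MQoE_SD model can be illustrated with a minimal sketch: a linear score per sliding window combining average bitrate (gain) with the spread of switching magnitudes. The weights and window size below are illustrative assumptions, not the paper's calibrated values.

```python
# Hedged sketch in the spirit of MQoE_SD: linear in bitrate gain and in the
# standard deviation of switching magnitudes. Weights are illustrative.
from statistics import mean, pstdev

def moving_qoe_sd(bitrates, window=5, w_gain=1.0, w_sd=0.5):
    """Return one QoE score per sliding window of segment bitrates."""
    scores = []
    for i in range(len(bitrates) - window + 1):
        seg = bitrates[i:i + window]
        gain = mean(seg)                                 # bitrate gain term
        switches = [abs(b - a) for a, b in zip(seg, seg[1:])]
        sd = pstdev(switches) if switches else 0.0       # switching-magnitude spread
        scores.append(w_gain * gain - w_sd * sd)
    return scores
```

With this sketch, a steady stream scores higher than an oscillating one with the same mean bitrate, which is the behavior the model family is designed to capture.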
Internet eXchange Points (IXPs) are Internet infrastructures composed of high-performance networks that allow multiple autonomous systems to exchange traffic. Given the challenges of managing the flows that cross an IXP, identifying elephant flows can help improve the quality of the services provided to its participants. In this context, we leverage the flexibility and resources of programmable data planes to identify elephant flows in IXP networks adaptively, via the dynamic adjustment of thresholds. Our mechanism uses information reported by the data plane to monitor network utilization in the control plane, calculating new thresholds from percentiles of previous flow sizes and durations and configuring them back into the switches to support the local classification of flows. The thresholds are thus updated to keep the identification process aligned with the network behavior. The experimental results show that it is possible to identify and react to elephant flows quickly (in less than 0.4 ms) and efficiently (with only 98.4 KB of data inserted into the network by the mechanism). In addition, the threshold updating mechanism achieved an accuracy of up to 90% in our evaluation scenarios.
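The control-plane side of this mechanism can be sketched as follows: recompute thresholds from percentiles of recently observed flow statistics, then classify flows locally against them. The 95th percentile and the field names are assumptions for illustration, not the paper's exact parameters.

```python
# Hedged sketch of percentile-based threshold updating for elephant-flow
# identification. Percentile choice and field names are illustrative.

def percentile(values, p):
    """Nearest-rank percentile of a non-empty list."""
    s = sorted(values)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

def update_thresholds(flow_sizes, flow_durations, p=95):
    """Derive new size/duration thresholds to configure back into switches."""
    return {
        "size_bytes": percentile(flow_sizes, p),
        "duration_s": percentile(flow_durations, p),
    }

def is_elephant(flow, thresholds):
    """Local classification, as it could run against installed thresholds."""
    return (flow["size_bytes"] >= thresholds["size_bytes"]
            and flow["duration_s"] >= thresholds["duration_s"])
```

As traffic shifts, re-running `update_thresholds` keeps the classification aligned with the observed distribution rather than a fixed cutoff.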
The federated identity model provides a solution for user authentication across multiple administrative domains. Academic federations, such as the Brazilian federation, are examples of this model in practice. Most institutions that participate in academic federations employ password-based authentication for their users, so an attacker only needs to discover one password to impersonate the user in all federated service providers. Multi-factor authentication emerges as a solution to increase the robustness of the authentication process. This article introduces a comprehensive, open-source solution to offer multi-factor authentication for Shibboleth Identity Providers. Based on the Multi-factor Authentication Profile standard, our solution provides three second factors (One-Time Password, FIDO2, and Phone Prompt). The solution has been deployed in the Brazilian academic federation, where it was evaluated through functional and integration testing, as well as security and case study analyses.
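Of the three second factors mentioned, the One-Time Password factor is the most self-contained to illustrate: the standard time-based variant (TOTP, RFC 6238) derives a short code from a shared secret and the current time. The sketch below shows only the code derivation; the Shibboleth-side integration described in the article is not shown.

```python
# Hedged sketch of RFC 6238 TOTP code generation (the One-Time Password
# second factor). Verification server-side would compare the submitted code
# against this value for the current (and adjacent) time steps.
import hashlib
import hmac
import struct
import time

def totp(secret, at=None, step=30, digits=6):
    """Return the TOTP code for a shared secret at a given Unix time."""
    counter = int((time.time() if at is None else at) // step)
    msg = struct.pack(">Q", counter)                   # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                         # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

The values produced match the published RFC 4226/6238 test vectors for the ASCII secret `12345678901234567890`.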
Priority-based scheduling policies are commonly used to guarantee that requests submitted to the different service classes offered by cloud providers achieve the desired Quality of Service (QoS). However, the QoS delivered during resource contention periods may be unfair to certain requests. In particular, lower-priority requests may have their resources preempted to accommodate higher-priority ones, even when the QoS delivered to the latter is above the desired level while the former is underserved. Also, competing requests with the same priority may experience quite different QoS, since some of them may have their resources preempted while others do not. In this paper, we present a new scheduling policy driven by the QoS promised to individual requests. The benefits of the QoS-driven policy are twofold: it keeps the QoS of each request as high as possible, given the QoS targets and available resources, and it minimizes the variance of the QoS delivered to requests of the same class, promoting fairness. We used simulation experiments fed with traces from a production system to compare the QoS-driven policy with a state-of-the-practice priority-based one. In general, the QoS-driven policy delivers better service than the priority-based one. Moreover, the equity of the QoS delivered to requests of the same class is much higher under the QoS-driven policy, particularly when not all requests get the promised QoS, which is the most important scenario. Finally, based on the current practice of large public cloud providers, our results show that the penalties incurred by the priority-based scheduler in the scenarios studied can be, on average, as much as 193% higher than those incurred by the QoS-driven one.
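The core idea, preempting based on delivered-versus-promised QoS rather than static priority, can be sketched in a few lines. The field names and the choice of "headroom" as the metric are illustrative assumptions, not the paper's exact policy.

```python
# Hedged sketch of the QoS-driven preemption idea: when resources must be
# reclaimed, prefer the request whose delivered QoS exceeds its target by
# the widest margin, instead of always preempting the lowest priority.
# Field names and the headroom metric are illustrative.

def qos_headroom(req):
    """How far above (or below) its promised QoS a request currently is."""
    return req["delivered_qos"] - req["target_qos"]

def pick_preemption_victim(requests):
    """Choose the request farthest above its promised QoS as the victim."""
    return max(requests, key=qos_headroom)
```

Under a priority-based policy, the victim would instead be the lowest-priority request regardless of how underserved it already is; this difference is what drives the fairness gains reported above.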
The ubiquitous connectivity of Location-Based Systems (LBS) allows people to share individual location-related data at any time. Location-Based Social Networks (LBSN), in particular, make valuable information available at a scale and cost that traditional data collection methods cannot match. Moreover, this data contains spatial, temporal, and social features of user activity, enabling a system to predict user mobility. Mobility prediction plays a crucial role in urban planning, traffic forecasting, advertising, and recommendations, and has thus attracted considerable attention in the past decade. In this article, we introduce the Ensemble Random Forest-Markov (ERFM) mobility prediction model, a two-layer ensemble learner in which the base learners are themselves ensemble learning models. In the inner layer, ERFM exploits the Markovian (memoryless) property to build trajectories of different lengths, and uses the Random Forest algorithm to predict the user's next location for each trajectory set. In the outer layer, the outputs of the first layer are aggregated based on the classification performance of each weak learner. Experimental results on a real user trajectory dataset show higher accuracy and F1-score for ERFM compared to five state-of-the-art predictors.
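The two-layer structure can be sketched compactly: one next-location predictor per trajectory length in the inner layer, and a performance-weighted vote in the outer layer. Below, a simple order-k frequency model stands in for the paper's Random Forests, so this is a structural sketch, not ERFM itself.

```python
# Hedged sketch of a two-layer ensemble in the style of ERFM. An order-k
# most-frequent-successor table stands in for the inner Random Forests;
# the outer layer weights each inner model (e.g., by validation accuracy).
from collections import Counter, defaultdict

def train_order_k(visits, k):
    """Map each length-k context to its most frequent next location."""
    table = defaultdict(Counter)
    for i in range(len(visits) - k):
        table[tuple(visits[i:i + k])][visits[i + k]] += 1
    return {ctx: c.most_common(1)[0][0] for ctx, c in table.items()}

def ensemble_predict(models_and_weights, history):
    """Outer layer: weighted vote over the inner predictors."""
    votes = Counter()
    for (k, model), weight in models_and_weights:
        ctx = tuple(history[-k:])
        if ctx in model:
            votes[model[ctx]] += weight
    return votes.most_common(1)[0][0] if votes else None
```

The outer layer lets longer-context models contribute when their context has been seen, while shorter-context models cover the remaining cases.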
In this paper, we focus on knowledge extraction from large-scale wireless networks through stream processing. We present the primary methods for sampling, data collection, and monitoring of wireless networks, and we characterize knowledge extraction as a machine learning problem on big data stream processing. We show the main trends in big data stream processing frameworks. Additionally, we explore data preprocessing, feature engineering, and the machine learning algorithms applied to wireless network analytics. We address challenges and present research projects in wireless network monitoring and stream processing. Finally, we anticipate future perspectives, such as deep learning and reinforcement learning in stream processing.
Elastic optical networks are network infrastructures capable of sustaining the high data traffic demand of high-speed networks. One of the problems that must be solved to ensure the smooth functioning of the network is Routing, Modulation Level and Spectrum Assignment (RMLSA). This work proposes a new approach to this problem based on algorithms that select the guard band adaptively. Two algorithms for adaptive guard band selection are presented: Guard Band according to Use of the Network (GBUN) and Guard Band by OSNR Margin (GBOM). GBUN selects the guard band based on the network usage level, whereas GBOM uses an Optical Signal-to-Noise Ratio (OSNR) margin for the selection. The performance of the proposed algorithms is compared with algorithms that use fixed guard band values and with the adaptive proposal AGBA. The results show that GBOM achieves a lower bandwidth blocking probability in the studied scenarios. In general, GBOM also presents better energy efficiency than the other algorithms.
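The usage-adaptive principle behind GBUN can be sketched as a mapping from current spectrum utilization to a guard-band width: the busier the network, the narrower the guard band, down to a floor that still protects signal quality. The slot counts and the linear mapping below are illustrative assumptions, not GBUN's actual rule.

```python
# Hedged sketch of usage-adaptive guard-band selection in the spirit of GBUN.
# The linear interpolation and slot bounds are illustrative choices.

def guard_band_slots(used_slots, total_slots, gb_max=3, gb_min=1):
    """Pick a guard band (in spectrum slots) from current utilization."""
    utilization = used_slots / total_slots
    # Low load -> generous guard band; high load -> minimal guard band.
    return max(gb_min, round(gb_max - (gb_max - gb_min) * utilization))
```

An OSNR-margin variant in the spirit of GBOM would replace the utilization input with a measured signal-quality margin, shrinking the guard band only while the margin stays positive.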
Wireless sensor networks (WSNs) are an important means of collecting data in a variety of situations, such as the monitoring of large or hazardous areas. Retrieving WSN data with unmanned aerial vehicles (UAVs) can yield better results, for example by increasing the amount of data collected and decreasing the time between the collection and use of the data. In particular, disaster areas may be left without communication resources and with high residual risk to humans, in which case a WSN can be quickly deployed by air to collect relevant data until other measures can be established. The rules governing each component of the problem (e.g., number of UAVs, UAV movement control, sensors, communication) define the approaches used to solve it. In this sense, some studies present approaches that use UAVs to collect WSN data, but they focus primarily on optimizing the path covered by a single UAV and rely on long-range communication that is always available; they do not explore the possibility of using several UAVs or the limitations on communication range. This work describes DADCA, a distributed, scalable approach capable of coordinating groups of UAVs in WSN data collection under a restricted communication range and without the use of optimization techniques. The results reveal that the amount of data collected by DADCA is similar or superior to that of path optimization approaches by up to 1%. In our proposed approach, the delay in receiving sensor messages is up to 46% shorter than in other approaches, and the processing required onboard UAVs can be less than 75% of that of optimization-based algorithms. The results indicate that DADCA can match and even surpass the other approaches presented, even though path optimization is not its focus, while also retaining the advantages of a distributed approach.
Adaptive middleware is essential for developing distributed systems in several application domains. The design and implementation of this kind of middleware, however, is still a challenge due to general adaptation issues, such as: When to adapt? Where to include the adaptation code? What to adapt? How to guarantee safe adaptations? Current solutions commonly face these challenges at the implementation level and do not focus on the safety aspects of the adaptation. This paper proposes a holistic solution, implemented in the Go programming language, for developing adaptive middleware centred on the adoption of software architecture principles combined with a lightweight use of formalisms. Software architecture concepts serve as an enabling approach for structuring and adapting the middleware, while the formalisation helps provide guarantees before and during middleware execution. The proposed solution is evaluated by implementing an adaptive middleware and comparing its performance against existing middleware systems. As shown in the experimental evaluation, the proposed solution enables us to design and implement safe adaptive middleware systems without compromising their performance.
A key challenge posed by the Next Generation Internet landscape is that modern service-based applications need to cope with open and continuously evolving environments and to operate under dynamic circumstances (e.g., changes in user requirements or in the availability of resources). Indeed, dynamically discovering, selecting, and composing the appropriate services in such environments is a challenging task. Self-adaptation approaches are effective instruments to tackle this issue, because they allow applications to adapt their behaviour to their execution environment. Unfortunately, although existing approaches support run-time adaptation, they tend to foresee the adaptation requirements and related solutions at design time, working under a "closed-world" assumption. In this article, we provide a new way of approaching the design, operation, and run-time adaptation of service-based applications, treating adaptivity as an intrinsic characteristic of applications from the earliest stages of their development. We propose a novel design-for-adaptation approach implementing a complete lifecycle for the continuous development and deployment of service-based applications, by facilitating (i) the continuous integration of new services that can easily join the application, and (ii) the operation of applications under dynamic circumstances, to face the openness and dynamicity of the environment. The proposed approach has been implemented and evaluated in a real-world case study in the mobility domain. Experimental results demonstrate the effectiveness of our approach and its practical applicability.
Cloud computing is a general term for delivering hosted services over the Internet. With the accelerated growth of the volume of data used by applications, many organizations have moved their data into cloud servers to provide scalable, reliable, and highly available services. A particularly challenging issue that arises in the context of cloud storage systems with geographically-distributed data replication is how to reach a consistent state across all replicas. This survey reviews the major aspects of consistency in cloud data storage systems, grouping recently proposed methods into three categories: (1) fixed consistency methods, (2) configurable consistency methods, and (3) consistency monitoring methods.
Online discussion forums are asynchronous communication tools that are widely used in Learning Management Systems. However, instructors and students face various difficulties, and instructors lack a guide to the strategies they can use to achieve a more participatory forum environment. This work aims to identify the benefits and difficulties of using online discussion forums from the instructors' point of view, and to provide a list of strategies and improvements that can mitigate the challenges and lead to a more participatory forum. We used coding procedures to analyze data collected through semi-structured interviews. The results of our exploratory analysis are relevant to the distance learning community and can inform instructors, developers, and researchers, helping them improve the quality of mediation and use of forums.
The competitive dynamics of the globalized market demand information on the internal and external reality of corporations. Information is a precious asset, responsible for establishing the key advantages that enable companies to maintain their leadership. However, reliable, rich information is no longer the only goal: the time frame to extract information from data determines its usefulness. This work proposes DOD-ETL, a tool that addresses, in an innovative manner, the main bottleneck in Business Intelligence solutions, the Extract Transform Load (ETL) process, providing it in near real-time. DOD-ETL achieves this by combining an on-demand data stream pipeline with a distributed, parallel, and technology-independent architecture featuring in-memory caching and efficient data partitioning. We compared DOD-ETL with other stream processing frameworks used to perform near real-time ETL and found that DOD-ETL executes workloads up to 10 times faster. We have deployed it in a large steelworks as a replacement for its previous ETL solution, enabling near real-time reports that were previously unavailable.
The huge number of content names available in Named-Data Networking (NDN) challenges both the required routing table size and the techniques for locating and forwarding information. Content copies and content mobility exacerbate the scalability challenge of reaching content in new locations. We present and analyze the performance of a proposed controller-based routing scheme, named CRoS-NDN, which preserves NDN features using the same Interest and Data packets. CRoS-NDN supports content mobility and provides fast content recovery from copies that do not belong to the consumer-producer path, because it splits identity from location without incurring FIB size explosion or assuming prefix aggregation. It provides features similar to Content Distribution Networks (CDN) in NDN and improves routing efficiency. We compare our proposal with similar routing protocols and derive analytical expressions for lower-bound efficiency and upper-bound latency. We also conduct extensive simulations to evaluate data delivery efficiency and delay. The results show the robust behavior of the proposed scheme, which achieves the best efficiency and delay performance over a wide range of scenarios. Furthermore, CRoS-NDN requires little processing time and memory as the number of prefixes grows.
High-performance computing (HPC) and massive data processing (Big Data) are two trends that are beginning to converge. In that process, aspects of hardware architectures, systems support, and programming paradigms are being revisited from both perspectives. This paper presents our experience on this path of convergence, proposing a framework that addresses some of the programming issues derived from such integration. Our contribution is an integrated environment that combines (i) COMPSs, a programming framework for the development and execution of parallel applications on distributed infrastructures; (ii) Lemonade, a data mining and analysis tool; and (iii) HDFS, the most widely used distributed file system for Big Data systems. To validate our framework, we used Lemonade to create COMPSs applications that access data through HDFS, and compared them with equivalent applications built with Spark, a popular Big Data framework. The results show that the HDFS integration benefits COMPSs by simplifying data access and by rearranging data transfers, reducing execution time. The integration with Lemonade facilitates the use of COMPSs and may help its popularization in the Data Science community, by providing efficient algorithm implementations for experts from the data domain who want to develop applications at a higher level of abstraction.
Vehicular traffic re-routing is key to providing better traffic mobility. However, taking into account only traffic-related information to recommend better routes for each vehicle falls far short of the requirements of proper transportation management. Context-aware, multi-objective re-routing approaches will therefore play an important role in traffic management. Yet most procedures are deterministic and cannot support the strict requirements of traffic management applications, since many vehicles will potentially take the same route, degrading overall traffic efficiency. We thus propose an efficient algorithm, named Better Safe Than Sorry (BSTS), based on Pareto-efficiency. Simulation results show that our proposal provides a better trade-off between mobility and safety than state-of-the-art approaches while avoiding the creation of new congestion spots.
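The Pareto-efficiency principle underlying BSTS can be sketched as filtering candidate routes to the non-dominated set over two minimized objectives. The objectives below (travel time and a risk score) are illustrative stand-ins for the mobility and safety metrics the algorithm actually balances.

```python
# Hedged sketch of Pareto-front filtering over two minimized objectives,
# the principle behind multi-objective re-routing approaches like BSTS.

def pareto_front(routes):
    """routes: list of (time, risk) tuples; return the non-dominated set."""
    front = []
    for r in routes:
        # r is dominated if some other route is at least as good in both
        # objectives and differs from r.
        dominated = any(o[0] <= r[0] and o[1] <= r[1] and o != r for o in routes)
        if not dominated:
            front.append(r)
    return front
```

Offering each vehicle a choice from the front (rather than a single deterministic "best" route) is one way such approaches spread traffic and avoid creating new congestion spots.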
As an integral component of 5G communications, massive Internet of Things (IoT) deployments are vulnerable to various routing attacks due to their dynamic infrastructure, constrained computing resources, and the heterogeneity of mobile objects. Sinkhole and selective forwarding attacks stand out among the most destructive ones for infrastructure-less networks. Despite the countermeasures introduced by legacy intrusion detection systems (IDS), the massive IoT requires novel solutions that address its unique requirements. This paper introduces THATACHI (DeTection of SinkHole And SelecTive ForwArding for Supporting SeCure routing for Internet of THIngs), a new IDS against sinkhole and selective forwarding attacks that target the routing mechanism in massive and mobile IoT networks. To cope with the density and mobility challenges of detecting attackers reliably, THATACHI exploits watchdog, reputation, and trust strategies. Our performance evaluation in an urban scenario shows that THATACHI achieves a 99% detection rate with 6% false negative and false positive rates. Moreover, compared to its closest predecessor against sinkhole attacks in IoT, THATACHI consumes at least 50% less energy.
Web applications are popular targets for cyber-attacks because they are network-accessible and often contain vulnerabilities. An intrusion detection system monitors web applications and issues alerts when an attack attempt is detected. Existing implementations of intrusion detection systems usually extract features from network packets or from string characteristics of input that are manually selected as relevant to attack analysis. Manually selecting features, however, is time-consuming and requires in-depth security domain knowledge. Moreover, supervised learning algorithms need large amounts of labeled legitimate and attack request data to classify normal and abnormal behaviors, and such data is often expensive and impractical to obtain for production web applications.
This paper provides three contributions to the study of autonomic intrusion detection systems. First, we evaluate the feasibility of an unsupervised/semi-supervised approach for web attack detection based on the Robust Software Modeling Tool (RSMT), which autonomically monitors and characterizes the runtime behavior of web applications. Second, we describe how RSMT trains a stacked denoising autoencoder to encode and reconstruct the call graph for end-to-end deep learning, where a low-dimensional representation of the raw features with unlabeled request data is used to recognize anomalies by computing the reconstruction error of the request data. Third, we analyze the results of empirically testing RSMT on both synthetic datasets and production applications with intentional vulnerabilities. Our results show that the proposed approach can efficiently and accurately detect attacks, including SQL injection, cross-site scripting, and deserialization, with minimal domain knowledge and little labeled training data.
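The detection rule described above can be sketched independently of the deep learning machinery: a model trained only on legitimate traffic reconstructs normal requests well, so a large reconstruction error flags an anomaly. Below, a trivial per-dimension mean model stands in for RSMT's stacked denoising autoencoder; only the thresholding logic is the point.

```python
# Hedged sketch of reconstruction-error anomaly detection. A mean-vector
# "reconstructor" stands in for the stacked denoising autoencoder; the
# decision rule (error above threshold => attack) is the same shape.

def fit_mean_reconstructor(normal_vectors):
    """Learn a per-dimension mean over legitimate feature vectors."""
    n = len(normal_vectors)
    dims = len(normal_vectors[0])
    return [sum(v[d] for v in normal_vectors) / n for d in range(dims)]

def reconstruction_error(model, vector):
    """Squared error between a request's features and their reconstruction."""
    return sum((x - m) ** 2 for x, m in zip(vector, model))

def is_attack(model, vector, threshold):
    """Flag requests whose features the model cannot reconstruct well."""
    return reconstruction_error(model, vector) > threshold
```

In the actual approach, the low-dimensional encoding learned from unlabeled traffic makes this error sensitive to structurally abnormal call graphs rather than raw feature deviations.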
Traffic Management Systems (TMS) comprise services that aim to make the current transportation system more secure, sustainable, and efficient. Vehicular Ad hoc Networks (VANETs) strongly influence TMS applications, because TMS services require data, communication, and processing to operate. Moreover, VANETs allow direct communication between vehicles, so data can be exchanged and processed among them. Several TMS services require information to be disseminated among the decision-making vehicles. However, such dissemination is a challenging task due to the specific characteristics of VANETs, such as short-range communication and high node mobility, which cause frequent topology changes. In this article, we present an extensive analysis of our proposed data dissemination protocol for urban VANET scenarios, called DDRX, which is based on complex network metrics. Each vehicle builds a subgraph of its neighborhood to identify the relay node that continues the dissemination process; based on this local graph, relay nodes are selected using complex network metrics. Simulation results show that DDRX offers high efficiency in terms of coverage, number of transmitted packets, delay, and packet collisions compared to well-known data dissemination protocols. DDRX also provides significant improvements to a TMS that needs efficient data dissemination.
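The local relay-selection step can be sketched as follows: build the neighborhood subgraph, score each neighbor with a complex-network metric, and pick the best-scoring one as the relay. Degree centrality is used here for brevity; it is an illustrative stand-in for the richer metrics DDRX evaluates on the local graph.

```python
# Hedged sketch of metric-based relay selection on a local subgraph.
# Degree centrality stands in for the complex-network metrics used by DDRX.

def degree_centrality(edges):
    """Count incident edges per node in the local subgraph."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

def pick_relay(edges, neighbors):
    """Choose the neighbor that is best connected in the local subgraph."""
    deg = degree_centrality(edges)
    return max(neighbors, key=lambda n: deg.get(n, 0))
```

Because the subgraph is built from locally available neighborhood information, each vehicle can run this selection without global knowledge of the topology.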
Social networks were first investigated in the social, educational, and business areas. Academic interest in this field has been growing since the mid-twentieth century, given the increasing interaction among people, data dissemination, and exchange of information. As such, the development and evaluation of new techniques for social network analysis and mining (SNAM) is a key current research area for Internet services and applications. Key topics include contextualized analysis of social and information networks, crowdsourcing and crowdfunding, economics in networks, extraction and treatment of social data, mining techniques, modeling of user behavior and social networks, and software ecosystems. These topics have important applications in a wide range of fields, such as academia, politics, security, business, marketing, and science.
Participatory sensing networks rely on gathering personal data from mobile devices to infer global knowledge. Participatory sensing has been used for real-time traffic monitoring, where global traffic conditions are inferred from information provided by individual devices. However, fewer initiatives address asphalt quality, an essential aspect of the route decision process. This article proposes Streetcheck, a framework to classify road surface quality through participatory sensing. Streetcheck gathers data from mobile device sensors, such as the Global Positioning System (GPS) receiver and accelerometer, as well as users' ratings of road surface quality. A classification system aggregates the data, filters them, and extracts a set of features as input for supervised learning algorithms. Twenty volunteers tested Streetcheck on 1,200 km of urban roads in Minas Gerais (Brazil). Streetcheck reached up to 90.64% accuracy in classifying road surface quality.
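The feature-extraction step of such a pipeline can be sketched by summarizing an accelerometer window into statistics a supervised classifier can consume. The feature set below is a minimal illustration; the features actually used by Streetcheck are richer than this.

```python
# Hedged sketch of accelerometer feature extraction for road-quality
# classification. The three features shown are illustrative choices.
from statistics import mean, pstdev

def accel_features(z_axis_window):
    """Summary features for one road segment's vertical-axis readings."""
    diffs = [abs(b - a) for a, b in zip(z_axis_window, z_axis_window[1:])]
    return {
        "mean": mean(z_axis_window),          # baseline gravity component
        "std": pstdev(z_axis_window),         # overall vibration level
        "max_jerk": max(diffs) if diffs else 0.0,  # sharpest bump/pothole hit
    }
```

A smooth segment yields near-zero spread and jerk, while potholes and degraded asphalt inflate both, which is what lets a supervised learner separate surface-quality classes.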
In recent years, as a result of the proliferation of non-elastic services and the adoption of novel paradigms, monitoring networks with a high level of detail has become crucial to correctly identify and characterize situations related to faults, performance, and security. In-band Network Telemetry (INT) emerges in this context as a promising approach to meet this demand, enabling production packets to directly report their experience inside a network. This type of telemetry enables unprecedented monitoring accuracy and precision, but degrades performance if applied indiscriminately to all network traffic. One alternative is to orchestrate telemetry tasks so that only a portion of the traffic monitors the network via INT. The general problem, in this context, consists of assigning subsets of traffic to carry INT data so as to provide full monitoring coverage while minimizing overhead. In this paper, we introduce and formalize two variations of the In-band Network Telemetry Orchestration (INTO) problem, prove that both are NP-Complete, and propose polynomial-time heuristics to solve them. In our evaluation using real WAN topologies, we observe that the heuristics produce near-optimal solutions for any network in under one second, that networks can be covered by assigning a number of flows linear in the number of interfaces, and that the telemetry load can be minimized to one interface per flow in most networks.
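The coverage objective has the flavor of a covering problem, so a greedy sketch conveys it well: repeatedly pick the flow whose path covers the most not-yet-monitored interfaces. This illustrates the shape of the problem, not the paper's actual heuristics.

```python
# Hedged sketch of a greedy covering heuristic for the INTO setting:
# choose flows until every interface is covered by some INT-carrying flow.
# This is an illustrative greedy baseline, not the paper's algorithms.

def greedy_int_orchestration(flows, interfaces):
    """flows: {flow_id: set of interfaces on its path}; return chosen ids."""
    uncovered = set(interfaces)
    chosen = []
    while uncovered:
        best = max(flows, key=lambda f: len(flows[f] & uncovered))
        if not flows[best] & uncovered:
            break  # remaining interfaces are unreachable by any flow
        chosen.append(best)
        uncovered -= flows[best]
    return chosen
```

Greedy set cover gives a logarithmic approximation guarantee in general, which is consistent with the problem being NP-Complete yet well served by fast heuristics in practice.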
Identifying strategic business partnerships can provide competitive advantages for businesses; however, the dynamics and uncertainty of business environments make this task challenging. To help businesses in this task, this study presents a similarity model between businesses that considers the opinions of users on content shared by businesses on social media. The model thus captures significant virtual relationships among businesses that are generated by users in the virtual world. In addition, we propose an algorithm for detecting business communities in the considered model, as well as an algorithm to identify possible business outliers in the detected communities, which offers an automatic way to identify non-obvious relations that may deserve the particular attention of business owners. By exploring approximately 280 million user reactions on Facebook, we show that our results could support the development of, for example, a new strategic business partnership recommendation service.
With the Internet of Things (IoT), applications must interact with a huge number of devices and retrieve context data produced by those objects, which have to be discovered and selected beforehand. Due to the number, heterogeneity, and dynamicity of resources, discovery services must consider many selection criteria, e.g., device capabilities, location, context data type, contextual situations, and quality. In this paper, we describe QoDisco, a semantic-based discovery service that addresses this requirement in the IoT. QoDisco comprises a set of repositories storing resource descriptions according to an ontology-based information model, and it provides multi-attribute and range querying capabilities. We have evaluated different approaches to reduce the inherent cost of semantic search, namely parallel interactions with multiple repositories and publish-subscribe interactions. This paper also reports the results of performance experiments on how QoDisco handles resource discovery requests under these approaches.
Future wireless networks are expected to provide adequate support for distinct kinds of applications, their diverse requirements, and the scenarios of future Internet systems, such as the Internet of Things (IoT) based on multimedia and sensor data, while offering low-cost solutions to offload the mobile communication core. In this context, Low-cost Wireless Backhauls (LWBs) can be useful, since they are based on cheap WLAN technologies, such as Wireless Mesh Networks (WMNs), that provide capacity for future IoT applications with mixed traffic. Routing is a fundamental process for communication in these multi-hop networks, and multi-objective routing optimization algorithms based on Integer Linear Programming (ILP) models have been studied in the literature to address this problem, but solutions for mixed traffic are lacking. For this reason, we propose a novel ILP multi-objective approach, called Multi-objective routing Aware of miXed traffIc (MAXI), which employs three weighted objectives to guide routing in WMNs with different applications and requirements. In addition, we provide a simulation-based comparative analysis, using NS-3, against other relevant routing approaches, taking into account different types and levels of interference (e.g., co-channel and external interference) and focusing on mixed IoT traffic in an elderly healthcare scenario. Finally, we demonstrate the effectiveness of the proposed approach in supporting the requirements of each application through the appropriate combination of objective functions, mainly in dense scenarios with high interference levels.