A recent public cooperation between the Federal University of Technology – Parana (UTFPR) and the Toledo Municipality plans to implement the concept of smart cities in this city. In this context, one of the applications under development intends to track the recyclable garbage collection trucks in real time over the Internet. Indeed, fleet vehicle tracking is one of the main applications for smart cities. LoRaWAN stands out among network technologies for smart cities because it operates in an open frequency range and covers long distances with low power consumption and low equipment cost. However, the coverage and performance of LoRaWAN are directly affected by both the environment and configuration parameters. In addition, tracking devices must be able to send their coordinates to the Internet even when the vehicle goes through zones where there are obstacles to electromagnetic waves, such as tall buildings or valleys. In this paper, we perform experimental investigations to evaluate four LoRaWAN tracking devices, two available out of the box and two assembled and programmed. The behavior of each tracking device is analyzed when moving at a constant speed through three representative urban areas totaling 10.71 km². The two most efficient tracking devices are analyzed over a stretch of 3.5 km with speeds ranging from 0 to 30 km/h, 0 to 50 km/h and 0 to 100 km/h. Results include quantitative and qualitative aspects, including the received signal strength indication (RSSI), signal-to-noise ratio (SNR), packet delivery ratio (PDR), and spreading factor (SF) for the received geographic coordinates. As the devices depend on the quality of the signal offered by the network, we also present the results of the development and evaluation of the LoRaWAN network, by planning its coverage throughout the city.
Blockchain technology has enabled a new kind of distributed system. Beyond its early applications in finance, it has also allowed the emergence of novel ways of governance and coordination. The most relevant of these are the so-called Decentralized Autonomous Organizations (DAOs). DAOs typically implement decision-making systems to make it possible for their online community to reach agreements. As a result of these agreements, the DAO operates automatically by executing the appropriate portion of code on the blockchain network (e.g., hiring people, delivering payments, investing in financial products, etc.). In the last few years, several platforms, such as Aragon, DAOstack and DAOhaus, have emerged to facilitate the creation of DAOs. As a result, hundreds of these new organizations have appeared, with their community interactions mediated by blockchain. However, the literature has yet to explore this phenomenon empirically. In this paper, we aim to shed light on the current state of the DAO ecosystem. We review the three main platforms that currently facilitate the creation and management of DAOs (Aragon, DAOstack, DAOhaus). We introduce their main differences and compare them using quantitative metrics. For this comparison, we retrieve data from both the main Ethereum network (mainnet) and a parallel Ethereum network (xDai). We analyze data from 72,320 users and 2,353 DAO communities in order to study the three ecosystems across four dimensions: growth, activity, voting system and funds. Our results show that there are notable differences among the DAO platforms in terms of growth and activity, and also in terms of voting results. Still, we consider that our work is only a first step and that further research is needed to better understand these communities and evaluate their level of accomplishment in reaching decentralized governance.
Mentoring is a well-known way to help newcomers to Open Source Software (OSS) projects overcome initial contribution barriers. Through mentoring, newcomers learn to acquire essential technical, social, and organizational skills. Despite the importance of OSS mentors, they are understudied in the literature. Understanding who OSS project mentors are, the challenges they face, and the strategies they use can help OSS projects better support mentors’ work. In this paper, we employ a two-stage study to comprehensively investigate mentors in OSS. First, we identify the characteristics of mentors in the Apache Software Foundation, a large OSS community, using an online survey. We found that less experienced volunteer contributors are less likely to take on the mentorship role. Second, through interviews with OSS mentors (n=18), we identify the challenges that mentors face and how they mitigate them. In total, we identified 25 general mentorship challenges and 7 sub-categories of challenges regarding task recommendation. We also identified 13 strategies to overcome the challenges related to task recommendation. Our results provide insights for OSS communities, formal mentorship programs, and tool builders who design automated support for task assignment and internship.
Internet-based technologies such as IoT, GPS-based systems, and cellular networks enable the collection of geolocated mobility data of millions of people in large metropolitan areas. In addition, large public datasets are made available on the Internet by open government programs, providing ways for citizens, NGOs, scientists, and public managers to perform a multitude of data analyses with the goal of better understanding the city dynamics and supporting evidence-based public policymaking. However, it is challenging to visualize huge amounts of data from mobility datasets. Plotting raw trajectories on a map often causes data occlusion, impairing the visual analysis. Displaying the multiple attributes that these trajectories come with is an even larger challenge. One approach to solve this problem is trail bundling, which groups motion trails that are spatially close in a simplified representation. In this paper, we augment a recent bundling technique to support multi-attribute trail datasets for the visual analysis of urban mobility. Our case study is based on the travel survey from the São Paulo Metropolitan Area, which is one of the most intense traffic areas in the world. The results show that bundling helps the identification and analysis of various mobility patterns for different data attributes, such as peak hours, social strata, and transportation modes.
Qualitative science methods have largely been omitted from discussions of open science. Platforms focused on qualitative science that support open science data and method sharing are rare. Sharing and exchanging coding schemas has great potential for supporting traceability in qualitative research as well as for facilitating the reuse of coding schemas. In this study, we present and evaluate QualiCO, an ontology to describe qualitative coding schemas. Twenty qualitative researchers used QualiCO to complete two coding tasks. In our findings, we present task performance and interview data that focus participants’ attention on the ontology. Participants used QualiCO to complete the coding tasks, decreasing time on task, while improving accuracy, signifying that QualiCO enabled the reuse of qualitative coding schemas. Our discussion elaborates some issues that participants had and highlights how conceptual and prior practice frames their interpretation of how QualiCO can be used.
Carsharing is an alternative for urban mobility that has been widely adopted recently. This service presents three main business models: two of these models base their services on stations, while the third, the free-floating service, is free of fixed stations. Despite the notable advantages of carsharing, this service is prone to several problems, such as fleet imbalance due to the variance of the daily demand in large urban centers. Forecasting the demand for the service is a key task to deal with this issue. In this work, we analyze the use of well-known techniques to forecast carsharing service demand. In more depth, we evaluate the use of the Long Short-Term Memory (LSTM) and Prophet techniques to predict the demand of three real carsharing services. Moreover, we also evaluate seven state-of-the-art forecasting models on a given free-floating carsharing service, highlighting the potential of each technique. In addition to historical carsharing service data, we have also used climatic series to enhance the forecasting. Indeed, the results of our analysis have shown that the addition of meteorological data improved the models’ performance. In this case, the mean absolute error of LSTM may fall by half when using the climate data. When considering the free-floating carsharing service and short-term prediction (i.e., 12 hours), the boosting algorithms (e.g., XGBoost, CatBoost, and LightGBM) present superior performance, with less than 20% of mean absolute error when compared to the next best-ranked model (Prophet). On the other hand, Prophet performed better for predictions conducted on long-term periods.
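The mean absolute error (MAE) used above to compare forecasts is the average of the absolute differences between observed and predicted demand. The following is a minimal sketch of that metric only; the demand values and the two forecast series are hypothetical numbers chosen for illustration and are not taken from the paper.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| over all time steps."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly demand and two hypothetical forecasts (illustrative only).
observed = [10, 12, 8, 15]
forecast_history_only = [14, 8, 12, 11]   # assumed forecast without weather data
forecast_with_weather = [12, 10, 10, 13]  # assumed forecast with weather data

mae_history_only = mean_absolute_error(observed, forecast_history_only)
mae_with_weather = mean_absolute_error(observed, forecast_with_weather)
print(mae_history_only, mae_with_weather)
```

With these made-up series the second forecast has half the MAE of the first, which is the kind of comparison the abstract reports for LSTM with and without climate data.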
Mobile platforms are rapidly and continuously changing, with support for new sensors, APIs, and programming abstractions. Static analysis is gaining growing interest, allowing developers to predict properties about the run-time behavior of mobile apps without executing them. Over the years, literally hundreds of static analysis techniques have been proposed, ranging from structural and control-flow analysis to state-based analysis. In this paper, we present a systematic mapping study aimed at identifying, evaluating and classifying characteristics, trends and potential for industrial adoption of existing research in static analysis of mobile apps. Starting from over 12,000 potentially relevant studies, we applied a rigorous selection procedure resulting in 261 primary studies along a time span of 9 years. We analyzed each primary study according to a rigorously defined classification framework. The results of this study give a solid foundation for assessing existing and future approaches for static analysis of mobile apps, especially in terms of their industrial adoptability. Researchers and practitioners can use the results of this study to (i) identify existing research/technical gaps to target, (ii) understand how approaches developed in academia can be successfully transferred to industry, and (iii) better position their (past and future) approaches for static analysis of mobile apps.
Software systems are complicated, and the scientific and engineering methodologies for software development are relatively young. Cyber-physical systems are now in every corner of our lives, and we need robust methods for handling the ever-increasing complexity of their software systems. Model-Driven Development is a promising approach to tackle the complexity of systems through the concept of abstraction, enabling analysis at earlier phases of development. In this paper, we propose a model-driven approach with a focus on guaranteeing safety using formal verification. Cyber-physical systems are distributed, concurrent, asynchronous and event-based reactive systems with timing constraints. The actor-based textual modeling language Rebeca, which has model checking support, is used for formal verification. Starting from structured requirements and system architecture design, the behavioral models, including Rebeca models, are built. Properties of interest are also derived from the structured requirements, and then model checking is used to formally verify the properties. This process can be performed in iterations until satisfaction of the desired properties is ensured, and possible ambiguities and inconsistencies in requirements are resolved. The formally verified models can then be used to develop the executable code. The Rebeca models include the details of the signals and messages that are passed at the network level, including the timing, and this facilitates the generation of executable code. The natural mappings among the models for requirements, the formal models, and the executable code improve the effectiveness and efficiency of the approach.
Objective Quality of Experience (QoE) for Dynamic Adaptive Streaming over HTTP (DASH) video streaming has received considerable attention in recent years. While there are a number of objective QoE models, a limitation of the current models is that the QoE is provided after the entire video is delivered; also, the models are on a per-client basis. For content service providers, it is important to monitor the observed QoE to understand ensemble performance during streaming, such as for live events or concurrent streaming when multiple clients are streaming. For this purpose, we propose Moving QoE (MQoE, in short) models to measure QoE periodically during video streaming for multiple simultaneous clients. Our first model, MQoE_RF, is a nonlinear model considering the bitrate gain and sensitivity from bitrate switching frequency. Our second model, MQoE_SD, is a linear model that focuses on capturing the standard deviation in the bitrate switching magnitude among segments along with the bitrate gain. We then study the effectiveness of both models in a multi-user mobile client environment, with the mobility patterns being based on traces from a train, a car, or a ferry. We implemented the study on the GENI testbed. Our study shows that our MQoE models are more accurate in capturing the QoE behavior during transmission than static QoE models. Furthermore, our MQoE_RF model captures the sensitivity due to bitrate switching frequency more effectively, while MQoE_SD captures the sensitivity due to the magnitude of the bitrate switching. Either model is suitable for content service providers for monitoring video streaming, based on their preference.
Internet eXchange Points (IXPs) are Internet infrastructures composed of high-performance networks that allow multiple autonomous systems to exchange traffic. Given the challenges of managing the flows that cross an IXP, identifying elephant flows may help improve the quality of services provided to its participants. In this context, we leverage the new flexibility and resources of programmable data planes to identify elephant flows in IXP networks adaptively via the dynamic adjustment of thresholds. Our mechanism uses the information reported by the data plane to monitor network utilization in the control plane, calculating new thresholds based on percentiles of previous flow sizes and durations and configuring them back into switches to support the local classification of flows. Thus, the thresholds are updated to keep the identification process aligned with the network behavior. The experimental results show that it is possible to identify and react to elephant flows quickly, in less than 0.4 ms, and efficiently, with only 98.4 KB of data inserted into the network by the mechanism. In addition, the threshold updating mechanism achieved an accuracy of up to 90% in our evaluation scenarios.
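The threshold update described above can be illustrated with a small sketch. It is not the authors' mechanism: the function names, the choice of the 90th percentile, the nearest-rank percentile method, and the rule that a flow must exceed both thresholds are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Flow = Tuple[float, float]  # (size in bytes, duration in seconds)


def percentile(values: List[float], q: float) -> float:
    """Nearest-rank percentile of a non-empty list, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def update_thresholds(previous_flows: List[Flow], q: float = 90.0) -> Tuple[float, float]:
    """Control-plane step: derive new size and duration thresholds
    from the flows observed in the previous monitoring interval."""
    sizes = [size for size, _ in previous_flows]
    durations = [duration for _, duration in previous_flows]
    return percentile(sizes, q), percentile(durations, q)


def is_elephant(flow: Flow, thresholds: Tuple[float, float]) -> bool:
    """Data-plane step (assumed rule): a flow is an elephant when it
    reaches both the size threshold and the duration threshold."""
    size, duration = flow
    size_threshold, duration_threshold = thresholds
    return size >= size_threshold and duration >= duration_threshold


# Hypothetical flows from the previous interval: sizes 100..1000, durations 1..10.
history = [(i * 100.0, float(i)) for i in range(1, 11)]
thresholds = update_thresholds(history)
print(thresholds, is_elephant((950.0, 9.5), thresholds), is_elephant((950.0, 2.0), thresholds))
```

In a real deployment the thresholds computed in the control plane would be written back into switch tables; here they are simply passed to a classification function.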
The federated identity model provides a solution for user authentication across multiple administrative domains. Academic federations, such as the Brazilian federation, are examples of this model in practice. The majority of institutions that participate in academic federations employ password-based authentication for their users, so an attacker only needs to find out one password in order to impersonate the user in all federated service providers. Multi-factor authentication emerges as a solution to increase the robustness of the authentication process. This article introduces a comprehensive and open source solution that offers multi-factor authentication for Shibboleth Identity Providers. Based on the Multi-factor Authentication Profile standard, our solution provides three additional second factors (One-Time Password, FIDO2 and Phone Prompt). The solution has been deployed in the Brazilian academic federation, where it was evaluated using functional and integration testing, as well as security and case study analysis.
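To make the One-Time Password factor concrete, the sketch below computes time-based one-time passwords following the generic RFC 4226 (HOTP) and RFC 6238 (TOTP) algorithms with HMAC-SHA1. It is an assumption that the solution's OTP factor follows these RFCs; the code is an illustration only, not the code of the solution described above or of Shibboleth.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time-step counter."""
    return hotp(secret, unix_time // step, digits)


# Shared secret from the RFC test vectors (ASCII "12345678901234567890").
rfc_secret = b"12345678901234567890"
print(totp(rfc_secret, 59, digits=8))  # RFC 6238 test vector for T = 59 s
```

A verifier holding the same secret recomputes the code for the current time step (usually allowing a small window of adjacent steps) and compares it with the value the user submits.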
Priority-based scheduling policies are commonly used to guarantee that requests submitted to the different service classes offered by cloud providers achieve the desired Quality of Service (QoS). However, the QoS delivered during resource contention periods may be unfair to certain requests. In particular, lower priority requests may have their resources preempted to accommodate resources associated with higher priority ones, even if the actual QoS delivered to the latter is above the desired level, while the former are underserved. Also, competing requests with the same priority may experience quite different QoS, since some of them may have their resources preempted, while others do not. In this paper, we present a new scheduling policy that is driven by the QoS promised to individual requests. The benefits of using the QoS-driven policy are twofold: it maintains the QoS of each request as high as possible, considering their QoS targets and available resources; and it minimizes the variance of the QoS delivered to requests of the same class, promoting fairness. We used simulation experiments fed with traces from a production system to compare the QoS-driven policy with a state-of-the-practice priority-based one. In general, the QoS-driven policy delivers a better service than the priority-based one. Moreover, the equity of the QoS delivered to requests of the same class is much higher when the QoS-driven policy is used, particularly when not all requests get the promised QoS, which is the most important scenario. Finally, based on the current practice of large public cloud providers, our results show that penalties incurred by the priority-based scheduler in the scenarios studied can be, on average, as much as 193% higher than those incurred by the QoS-driven one.
The ubiquitous connectivity of Location-Based Systems (LBS) allows people to share individual location-related data anytime. In this sense, Location-Based Social Networks (LBSN) provide valuable information that is available in a large-scale and low-cost fashion via traditional data collection methods. Moreover, this data contains spatial, temporal, and social features of user activity, enabling a system to predict user mobility. Mobility prediction plays crucial roles in urban planning, traffic forecasting, advertising, and recommendations, and has thus attracted a lot of attention in the past decade. In this article, we introduce the Ensemble Random Forest-Markov (ERFM) mobility prediction model, a two-layer ensemble learner approach in which the base learners are also ensemble learning models. In the inner layer, ERFM considers the Markovian property (memoryless) to build trajectories of different lengths, and the Random Forest algorithm to predict the user’s next location for each trajectory set. In the outer layer, the outputs from the inner layer are aggregated based on the classification performance of each weak learner. The experimental results on a real user trajectory dataset show that ERFM achieves higher accuracy and F1-score than five state-of-the-art predictors.
In this paper we focus on knowledge extraction from large-scale wireless networks through stream processing. We present the primary methods for sampling, data collection, and monitoring of wireless networks and we characterize knowledge extraction as a machine learning problem on big data stream processing. We show the main trends in big data stream processing frameworks. Additionally, we explore the data preprocessing, feature engineering, and the machine learning algorithms applied to the scenario of wireless network analytics. We address challenges and present research projects in wireless network monitoring and stream processing. Finally, future perspectives, such as deep learning and reinforcement learning in stream processing, are anticipated.
Elastic optical networks are a network infrastructure capable of withstanding the high demand for data traffic from high-speed networks. One of the problems that must be solved to ensure the smooth functioning of the network is called Routing, Modulation Level and Spectrum Assignment (RMLSA). This work proposes a new approach to this problem with an algorithm that selects the guard band in an adaptive way. Two algorithms for the adaptive selection of the guard band, called Guard Band according to Use of the Network (GBUN) and Guard Band by OSNR Margin (GBOM), are presented. The GBUN algorithm performs the guard band selection based on the usage level of the network. The GBOM algorithm, on the other hand, uses an Optical Signal to Noise Ratio (OSNR) margin for the selection of the guard band. The performance of the proposed algorithms is compared with algorithms that use fixed guard band values and with the adaptive proposal AGBA. The results showed that the GBOM algorithm presented a better performance in terms of bandwidth blocking probability for the studied scenarios. In general, GBOM also presents a better energy efficiency when compared to the other algorithms.
Wireless sensor networks (WSNs) are an important means of collecting data in a variety of situations, such as the monitoring of large or hazardous areas. The retrieval of WSN data can yield better results through the use of unmanned aerial vehicles (UAVs), for example, by increasing the amount of data collected and decreasing the time between the collection and use of the data. In particular, disaster areas may be left without communication resources and with high residual risk to humans, at which point a WSN can be quickly launched by air to collect relevant data until other measures can be established. The set of rules for each component of the problem (e.g., number of UAVs, UAV movement control, sensors, communication) constitutes an approach to solving the problem. In this sense, some studies present approaches for the use of UAVs for the collection of WSN data, focusing primarily on optimizing the path to be covered by a single UAV and relying on long-range communication that is always available; these studies do not explore the possibility of using several UAVs or the limitations on the range of communication. This work describes DADCA, a distributed scalable approach capable of coordinating groups of UAVs in WSN data collection with restricted communication range and without the use of optimization techniques. The results reveal that the amount of data collected by DADCA is similar or superior to that of path optimization approaches by up to 1%. In our proposed approach, the delay in receiving sensor messages is up to 46% shorter than in other approaches, and the required processing onboard UAVs can reach less than 75% of that required by optimization-based algorithms. The results indicate that DADCA can match and even surpass the other presented approaches, even though path optimization is not a focus, while also incorporating the advantages of a distributed approach.
Adaptive middleware is essential for developing distributed systems in several application domains. The design and implementation of this kind of middleware, however, is still a challenge due to general adaptation issues, such as when to adapt, where to include the adaptation code, what to adapt, and how to guarantee safe adaptations. Current solutions commonly face these challenges at the implementation level and do not focus on the safety aspects of the adaptation. This paper proposes a holistic solution, implemented in the Go programming language, for developing adaptive middleware centred on the adoption of software architecture principles combined with lightweight use of formalisms. Software architecture concepts work as an enabling approach for structuring and adapting the middleware. Meanwhile, the formalisation helps in providing some guarantees before and during the middleware execution. The proposed solution is evaluated by implementing an adaptive middleware and comparing its performance against existing middleware systems. As shown in the experimental evaluation, the proposed solution enables us to design and implement safe adaptive middleware systems without compromising their performance.
A key challenge posed by the Next Generation Internet landscape is that modern service-based applications need to cope with open and continuously evolving environments and to operate under dynamic circumstances (e.g., changes in the users’ requirements, changes in the availability of resources). Indeed, dynamically discovering, selecting and composing the appropriate services in such an environment is a challenging task. Self-adaptation approaches represent effective instruments to tackle this issue, because they allow applications to adapt their behaviours based on their execution environment. Unfortunately, although existing approaches support run-time adaptation, they tend to foresee the adaptation requirements and related solutions at design-time, while working under a "closed-world" assumption. In this article, our objective is to provide a new way of approaching the design, operation and run-time adaptation of service-based applications, by considering adaptivity as an intrinsic characteristic of applications from the earliest stages of their development. We propose a novel design-for-adaptation approach implementing a complete lifecycle for the continuous development and deployment of service-based applications, by facilitating (i) the continuous integration of new services that can easily join the application, and (ii) the operation of applications under dynamic circumstances, to face the openness and dynamicity of the environment. The proposed approach has been implemented and evaluated in a real-world case study in the mobility domain. Experimental results demonstrate the effectiveness of our approach and its practical applicability.
Cloud computing is a general term that involves delivering hosted services over the Internet. With the accelerated growth of the volume of data used by applications, many organizations have moved their data into cloud servers to provide scalable, reliable and highly available services. A particularly challenging issue that arises in the context of cloud storage systems with geographically-distributed data replication is how to reach a consistent state for all replicas. This survey reviews major aspects related to consistency issues in cloud data storage systems, categorizing recently proposed methods into three categories: (1) fixed consistency methods, (2) configurable consistency methods and (3) consistency monitoring methods.
Online discussion forums are asynchronous communication tools that are widely used in Learning Management Systems. However, instructors and students face various difficulties, and instructors lack a guide on what strategies they can use to achieve a more participatory forum environment. This work aims to identify benefits and difficulties of using online discussion forums from the instructors’ point of view, and to provide a list of strategies and improvements that can mitigate the challenges and lead to a more participatory forum. We used coding procedures to analyze data collected through semi-structured interviews. The results of our exploratory analysis are relevant to the distance learning community and can inform instructors, developers, and researchers to help them improve the quality of mediation and use of forums.
The competitive dynamics of the globalized market demand information on the internal and external reality of corporations. Information is a precious asset and is responsible for establishing key advantages to enable companies to maintain their leadership. However, reliable, rich information is no longer the only goal. The time frame to extract information from data determines its usefulness. This work proposes DOD-ETL, a tool that addresses, in an innovative manner, the main bottleneck in Business Intelligence solutions, the Extract Transform Load process (ETL), providing it in near real-time. DOD-ETL achieves this by combining an on-demand data stream pipeline with a distributed, parallel and technology-independent architecture with in-memory caching and efficient data partitioning. We compared DOD-ETL with other Stream Processing frameworks used to perform near real-time ETL and found DOD-ETL executes workloads up to 10 times faster. We have deployed it in a large steelworks as a replacement for its previous ETL solution, enabling near real-time reports previously unavailable.
The huge amount of content names available in Named-Data Networking (NDN) challenges both the required routing table size and the techniques for locating and forwarding information. Content copies and content mobility exacerbate the scalability challenge to reach content in the new locations. We present and analyze the performance of a proposed Controller-based Routing Scheme, named CRoS-NDN, which preserves NDN features using the same interest and data packets. CRoS-NDN supports content mobility and provides fast content recovery from copies that do not belong to the consumer-producer path because it splits identity from location without incurring FIB size explosion or supposing prefix aggregation. It provides features similar to Content Distribution Networks (CDN) in NDN, and improves the routing efficiency. We compare our proposal with similar routing protocols and derive analytical expressions for lower-bound efficiency and upper-bound latency. We also conduct extensive simulations to evaluate results in data delivery efficiency and delay. The results show the robust behavior of the proposed scheme achieving the best efficiency and delay performance for a wide range of scenarios. Furthermore, CRoS-NDN results in low use of processing time and memory for a growing number of prefixes.
High-performance computing (HPC) and massive data processing (Big Data) are two trends that are beginning to converge. In that process, aspects of hardware architectures, systems support and programming paradigms are being revisited from both perspectives. This paper presents our experience on this path of convergence with the proposal of a framework that addresses some of the programming issues derived from such integration. Our contribution is the development of an integrated environment that combines (i) COMPSs, a programming framework for the development and execution of parallel applications for distributed infrastructures; (ii) Lemonade, a data mining and analysis tool; and (iii) HDFS, the most widely used distributed file system for Big Data systems. To validate our framework, we used Lemonade to create COMPSs applications that access data through HDFS, and compared them with equivalent applications built with Spark, a popular Big Data framework. The results show that the HDFS integration benefits COMPSs by simplifying data access and by rearranging data transfer, reducing execution time. The integration with Lemonade facilitates the use of COMPSs and may help its popularization in the Data Science community, by providing efficient algorithm implementations for experts from the data domain who want to develop applications with a higher-level abstraction.
Vehicular traffic re-routing is the key to providing better traffic mobility. However, taking into account just traffic-related information to recommend better routes for each vehicle is far from achieving the desired requirements of proper transportation management. For this reason, context-aware and multi-objective re-routing approaches will play an important role in traffic management. Yet, most procedures are deterministic and cannot support the strict requirements of traffic management applications, since many vehicles will potentially take the same route, consequently degrading overall traffic efficiency. We therefore propose an efficient algorithm named Better Safe Than Sorry (BSTS), based on Pareto efficiency. Simulation results have shown that our proposal provides a better trade-off between mobility and safety than state-of-the-art approaches and also avoids the problem of potentially creating different congestion spots.
As an integral component of 5G communications, the massive Internet of Things (IoT) is vulnerable to various routing attacks due to its dynamic infrastructure, distinct computing resources, and heterogeneity of mobile objects. The sinkhole and selective forwarding attacks stand out among the most destructive ones for infrastructureless networks. Despite the countermeasures introduced by legacy intrusion detection systems (IDS), the massive IoT seeks novel solutions to address its unique requirements. This paper introduces DeTection of SinkHole And SelecTive ForwArding for Supporting SeCure routing for Internet of THIngs (THATACHI), a new IDS against sinkhole and selective forwarding attacks that target the routing mechanism in massive and mobile IoT networks. To cope with the density and mobility challenges in detecting attackers and ensuring reliability, THATACHI exploits watchdog, reputation and trust strategies. Our performance evaluation under an urban scenario shows that THATACHI can perform with a 99% detection rate and 6% false negative and false positive rates. Moreover, when compared to its closest predecessor against sinkhole attacks for IoT, THATACHI runs with at least 50% less energy consumption.
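Pareto efficiency, on which the re-routing proposal above is based, can be illustrated with a small sketch that filters candidate routes down to those not dominated on any objective. This is not the BSTS algorithm itself: the route names, the two objectives (travel time and a risk score, both to be minimized), and their values are hypothetical.

```python
from typing import Dict, Tuple

Cost = Tuple[float, float]  # (travel time in minutes, risk score); lower is better


def dominates(a: Cost, b: Cost) -> bool:
    """a dominates b if a is no worse on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(routes: Dict[str, Cost]) -> Dict[str, Cost]:
    """Keep only the routes that no other route dominates."""
    return {
        name: cost
        for name, cost in routes.items()
        if not any(dominates(other, cost)
                   for other_name, other in routes.items() if other_name != name)
    }


# Hypothetical candidate routes for one vehicle.
candidates = {
    "A": (10.0, 0.5),
    "B": (12.0, 0.2),
    "C": (15.0, 0.6),  # slower and riskier than A
    "D": (11.0, 0.5),  # slower than A with equal risk
}
front = pareto_front(candidates)
print(sorted(front))
```

Because several routes usually remain on the front, a re-routing service can spread vehicles across them instead of sending every vehicle down a single best route; how BSTS makes that final choice is not specified in the abstract.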