Sunday, August 5, 2012

2012, Java IEEE Project Abstracts - Part 4

JAVA IEEE 2012 PROJECT ABSTRACTS

DOMAIN - SOFTWARE ENGINEERING
QOS ASSURANCE FOR DYNAMIC RECONFIGURATION OF COMPONENT-BASED SOFTWARE SYSTEMS
A major challenge of dynamic reconfiguration is Quality of Service (QoS) assurance, which is meant to reduce application disruption to the minimum for the system’s transformation. However, this problem has not been well studied.
This paper investigates the problem for component-based software systems from three points of view.
First, the whole spectrum of QoS characteristics is defined. Second, the logical and physical requirements for QoS characteristics are analyzed and solutions to achieve them are proposed. Third, prior work is classified by QoS characteristics and then realized by abstract reconfiguration strategies.
On this basis, quantitative evaluation of the QoS assurance abilities of existing work and our own approach is conducted through three steps. First, a proof-of-concept prototype called the reconfigurable component model is implemented to support the representation and testing of the reconfiguration strategies.
Second, a reconfiguration benchmark is proposed to expose the whole spectrum of QoS problems. Third, each reconfiguration strategy is tested against the benchmark and the testing results are evaluated. The most important conclusion from our investigation is that the classified QoS characteristics can be fully achieved under some acceptable constraints


*------------*------------*------------*------------*------------*------------*

DOMAIN - SOFTWARE ENGINEERING
EXPLOITING DYNAMIC INFORMATION IN IDES IMPROVES SPEED AND CORRECTNESS OF SOFTWARE MAINTENANCE TASKS
Modern IDEs such as Eclipse offer static views of the source code, but such views ignore information about the runtime behavior of software systems. Since typical object-oriented systems make heavy use of polymorphism and dynamic binding, static views will miss key information about the runtime architecture.
In this paper, we present an approach to gather and integrate dynamic information in the Eclipse IDE with the goal of better supporting typical software maintenance activities. By means of a controlled experiment with 30 professional developers, we show that for typical software maintenance tasks, integrating dynamic information into the Eclipse IDE yields a significant 17.5 percent decrease of time spent while significantly increasing the correctness of the solutions by 33.5 percent. We also provide a comprehensive performance evaluation of our approach


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - SOFTWARE ENGINEERING
COMPARING THE DEFECT REDUCTION BENEFITS OF CODE INSPECTION AND TEST-DRIVEN DEVELOPMENT
This study is a quasi experiment comparing the software defect rates and implementation costs of two methods of software defect reduction: code inspection and test-driven development.
We divided participants, consisting of junior and senior computer science students at a large Southwestern university, into four groups using a two-by-two, between-subjects, factorial design and asked them to complete the same programming assignment using either test-driven development, code inspection, both, or neither.
We compared resulting defect counts and implementation costs across groups. We found that code inspection is more effective than test-driven development at reducing defects, but that code inspection is also more expensive. We also found that test-driven development was no more effective at reducing defects than traditional programming methods.


*------------*------------*------------*------------*------------*------------*

DOMAIN - SOFTWARE ENGINEERING
AN AUTONOMOUS ENGINE FOR SERVICES CONFIGURATION AND DEPLOYMENT
The runtime management of the infrastructure providing service-based systems is a complex task, up to the point where manual operation struggles to be cost effective. As the functionality is provided by a set of dynamically composed distributed services, in order to achieve a management objective multiple operations have to be applied over the distributed elements of the managed infrastructure.
Moreover, the manager must cope with the highly heterogeneous characteristics and management interfaces of the runtime resources. With this in mind, this paper proposes to support the configuration and deployment of services with an automated closed control loop.
The automation is enabled by the definition of a generic information model, which captures all the information relevant to the management of the services with the same abstractions, describing the runtime elements, service dependencies, and business objectives.
On top of that, a technique based on satisfiability is described which automatically diagnoses the state of the managed environment and obtains the required changes for correcting it (e.g., installation, service binding, update, or configuration). The results from a set of case studies extracted from the banking domain are provided to validate the feasibility of this proposal.


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - SOFTWARE ENGINEERING
STAKERARE: USING SOCIAL NETWORKS AND COLLABORATIVE FILTERING FOR LARGE-SCALE REQUIREMENTS ELICITATION
Requirements elicitation is the software engineering activity in which stakeholder needs are understood. It involves identifying and prioritizing requirements—a process difficult to scale to large software projects with many stakeholders.
This paper proposes StakeRare, a novel method that uses social networks and collaborative filtering to identify and prioritize requirements in large software projects. StakeRare identifies stakeholders and asks them to recommend other stakeholders and stakeholder roles, builds a social network with stakeholders as nodes and their recommendations as links, and prioritizes stakeholders using a variety of social network measures to determine their project influence.
It then asks the stakeholders to rate an initial list of requirements, recommends other relevant requirements to them using collaborative filtering, and prioritizes their requirements using their ratings weighted by their project influence. StakeRare was evaluated by applying it to a software project for a 30,000-user system, and a substantial empirical study of requirements elicitation was conducted.
Using the data collected from surveying and interviewing 87 stakeholders, the study demonstrated that StakeRare predicts stakeholder needs accurately and arrives at a more complete and accurately prioritized list of requirements compared to the existing method used in the project, taking only a fraction of the time.


*------------*------------*------------*------------*------------*------------*

DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
ON THE HOP COUNT STATISTICS IN WIRELESS MULTIHOP NETWORKS SUBJECT TO FADING
Consider a wireless multihop network where nodes are randomly distributed in a given area following a homogeneous Poisson process. The hop count statistics, viz. the probabilities related to the number of hops between two nodes, are important for performance analysis of the multihop networks.
In this paper, we provide analytical results on the probability that two nodes separated by a known euclidean distance are k hops apart in networks subject to both shadowing and small-scale fading. Some interesting results are derived which have generic significance. For example, it is shown that the locations of nodes three or more hops away provide little information in determining the relationship of a node with other nodes in the network.
This observation is useful for the design of distributed routing, localization, and network security algorithms. As an illustration of the application of our results, we derive the effective energy consumption per successfully transmitted packet in end-to-end packet transmissions.
We show that there exists an optimum transmission range which minimizes the effective energy consumption. The results provide useful guidelines on the design of a randomly deployed network in a more realistic radio environment.


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
ON MAXIMIZING THE LIFETIME OF WIRELESS SENSOR NETWORKS USING VIRTUAL BACKBONE SCHEDULING
Wireless Sensor Networks (WSNs) are key for various applications that involve long-term and low-cost monitoring and actuating. In these applications, sensor nodes use batteries as the sole energy source.
Therefore, energy efficiency becomes critical. We observe that many WSN applications require redundant sensor nodes to achieve fault tolerance and Quality of Service (QoS) of the sensing.
However, the same redundancy may not be necessary for multihop communication because of the light traffic load and the stable wireless links. In this paper, we present a novel sleep-scheduling technique called Virtual Backbone Scheduling (VBS). VBS is designed for WSNs has redundant sensor nodes.
VBS forms multiple overlapped backbones which work alternatively to prolong the network lifetime. In VBS, traffic is only forwarded by backbone sensor nodes, and the rest of the sensor nodes turn off their radios to save energy.
The rotation of multiple backbones makes sure that the energy consumption of all sensor nodes is balanced, which fully utilizes the energy and achieves a longer network lifetime compared to the existing techniques.
The scheduling problem of VBS is formulated as the Maximum Lifetime Backbone Scheduling(MLBS) problem. Since the MLBS problem is NP-hard, we propose approximation algorithms based on the Schedule Transition Graph (STG) and Virtual Scheduling Graph(VSG).
We also present an Iterative Local Replacement (ILR) scheme as a distributed implementation. Theoretical analyses and simulation studies verify that VBS is superior to the existing techniques


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
FLASH CROWD IN P2P LIVE STREAMING SYSTEMS: FUNDAMENTAL CHARACTERISTICS AND DESIGN IMPLICATIONS
Peer-to-peer (P2P) live video streaming systems have recently received substantial attention, with commercial deployment gaining increased popularity in the internet.
It is evident from our practical experiences with real-world systems that, it is not uncommon for hundreds of thousands of users to choose to join a program in the first few minutes of a live broadcast.
Such a severe flash crowd phenomenon in live streaming poses significant challenges in the system design.
In this paper, for the first time, we develop a mathematical model to: 1) capture the fundamental relationship between time and scale in P2P live streaming systems under a flash crowd, and 2) explore the design principle of population control to alleviate the impact of the flash crowd.
We carry out rigorous analysis that brings forth an in-depth understanding on effects of the gossip protocol and peer dynamics. In particular, we demonstrate that there exists an upper bound on the system scale with respect to a time constraint.
By trading peer startup delays in the initial stage of a flash crowd for system scale, we design a simple and flexible population control framework that can alleviate the flash crowd without the requirement of otherwise costly server deployment


*------------*------------*------------*------------*------------*------------*

DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
EXPLORING THE OPTIMAL REPLICATION STRATEGY IN P2P-VOD SYSTEMS: CHARACTERIZATION AND EVALUATION
P2P-Video-on-Demand (P2P-VoD) is a popular Internet service which aims to provide a scalable and high-quality service to users. At the same time, content providers of P2P-VoD services also need to make sure that the service is operated with a manageable operating cost.
Given the volume-based charging model by ISPs, P2P-VoD content providers would like to reduce peers’ access to the content server so as to reduce the operating cost.
In this paper, we address an important open problem: what is the “ optimal replication ratio” in a P2P-VoD system such that peers will receive service from each other and at the same time, reduce the access to the content server? We address two fundamental issues: 1) what is the optimal replication ratio of a movie if w e know its popularity, and 2) how to achieve these optimal ratios in a distributed and dynamic fashion.
We first formally show how movie popularities can impact server’s workload, and formulate the video replication as an optimization problem. We show that the conventional wisdom of using the proportional replication strategy is “ suboptimal,” and expand the design space to both “ passive replacement policy” and “ active push policy ” to achieve the optimal replication ratios.
We consider practical implementation issues, evaluate the performance of P2P-VoD systems and show how to greatly reduce server’s workload and improve streaming quality via our distributed algorithms


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
ENERGY-EFFICIENT TOPOLOGY CONTROL IN COOPERATIVE AD HOC NETWORKS
Cooperative communication (CC) exploits space diversity through allowing multiple nodes cooperatively relay signals to the receiver so that the combined signal at the receiver can be correctly decoded. Since CC can reduce the transmission power and extend the transmission coverage, it has been considered in topology control protocols
However, prior research on topology control with CC only focuses on maintaining the network connectivity, minimizing the transmission power of each node, whereas ignores the energy efficiency of paths in constructed topologies.
This may cause inefficient routes and hurt the overall network performance in cooperative ad hoc networks. In this paper, to address this problem, we introduce a new topology control problem: energy-efficient topology control problem with cooperative communication, and propose two topology control algorithms to build cooperative energy spanners in which the energy efficiency of individual paths are guaranteed.
Both proposed algorithms can be performed in distributed and localized fashion while maintaining the globally efficient paths. Simulation results confirm the nice performance of all proposed algorithms


*------------*------------*------------*------------*------------*------------*

DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
EMBEDDED TRANSITIVE CLOSURE NETWORK FOR RUNTIME DEADLOCK DETECTION IN NETWORKS-ON-CHIP
Interconnection networks with adaptive routing are susceptible to deadlock, which could lead to performance degradation or system failure. Detecting deadlocks at runtime is challenging because of their highly distributed characteristics.
In this paper, we present a deadlock detection method that utilizes runtime transitive closure (TC) computation to discover the existence of deadlock-equivalence sets, which imply loops of requests in networks-on-chip (NoCs). This detection scheme guarantees the discovery of all true deadlocks without false alarms in contrast with state-of-the-art approximation and heuristic approaches.
A distributed TC-network architecture, which couples with the NoC infrastructure, is also presented to realize the detection mechanism efficiently. Detailed hardware realization architectures and schematics are also discussed.
Our results based on a cycle-accurate simulator demonstrate the effectiveness of the proposed method. It drastically outperforms timing-based deadlock detection mechanisms by eliminating false detections and, thus, reducing energy wastage in retransmission for various traffic scenarios including real-world application.
We found that timing-based methods may produce two orders of magnitude more deadlock alarms than the TC-network method. Moreover, the implementations presented in this paper demonstrate that the hardware overhead of TC-networks is insignificant.


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
EFFICIENT HARDWARE BARRIER SYNCHRONIZATION IN MANY-CORE CMPS
Traditional software-based barrier implementations for shared memory parallel machines tend to produce hotspots in terms of memory and network contention as the number of processors increases. This could limit their applicability to future many-core CMPs in which possibly several dozens of cores would need to be synchronized efficiently.
In this work, we develop GBarrier, a hardware-based barrier mechanism especially aimed at providing efficient barriers in future many-core CMPs.
Our proposal deploys a dedicated G-line-based network to allow for fast and efficient signaling of barrier arrival and departure. Since GBarrier does not have any influence on the memory system, we avoid all coherence activity and barrier-related network traffic that traditional approaches introduce and that restrict scalability.
Through detailed simulations of a 32-core CMP, we compare GBarrier against one of the most efficient software-based barrier implementations for a set of kernels and scientific applications. Evaluation results show average reductions of 54 and 21 percent in execution time, 53 and 18 percent in network traffic, and also 76 and 31 percent in the energy-delay 2 product metric for the full CMP when the kernels and scientific applications, respectively, are considered


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
DYNAMIC BEACON MOBILITY SCHEDULING FOR SENSOR LOCALIZATION
In mobile-beacon assisted sensor localization, beacon mobility scheduling aims to determine the best beacon trajectory so that each sensor receives sufficient beacon signals and becomes localized with minimum delay.
We propose a novel DeteRministic dynamic bEAcon Mobility Scheduling (DREAMS) algorithm, without requiring any prior knowledge of the sensory field. In this algorithm, the beacon trajectory is defined as the track of Depth-First Traversal (DFT) of the network graph, thus deterministic.
The mobile beacon performs DFT dynamically, under the instruction of nearby sensors on the fly. It moves from sensor to sensor in an intelligent heuristic manner according to Received Signal Strength (RSS)-based distance measurements. We prove that DREAMS guarantees full localization (every sensor is localized) when the measurements are noise-free, and derive the upper bound of beacon total moving distance in this case.
Then, we suggest to apply node elimination and Local Minimum Spanning Tree (LMST) to shorten beacon tour and reduce delay. Further, we extend DREAMS to multibeacon scenarios. Beacons with different coordinate systems compete for localizing sensors. Loser beacons agree on winner beacons’ coordinate system, and become cooperative in subsequent localization.
All sensors are finally localized in a commonly agreed coordinate systems. Through simulation we show that DREAMS guarantees full localization even with noisy distance measurements. We evaluate its performance on localization delay and communication overhead in comparison with a previously proposed static path-based scheduling method


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
DRAGON: DETECTION AND TRACKING OF DYNAMIC AMORPHOUS EVENTS IN WIRELESS SENSOR NETWORKS
Wireless sensor networks may be deployed in many applications to detect and track events of interest. Events can be either point events with an exact location and constant shape, or region events which cover a large area and have dynamic shapes.
While both types of events have received attention, no event detection and tracking protocol in existing wireless sensor network research is able to identify and track region events with dynamic identities, which arise when events are created or destroyed through splitting and merging. In this paper, we propose DRAGON, an event detection and tracking protocol which is able to handle all types of events including region events with dynamic identities.
DRAGON employs two physics metaphors: event center of mass, to give an approximate location to the event; andnode momentum, to guide the detection of event merges and splits.
Both detailed theoretical analysis and extensive performance studies of DRAGON’s properties demonstrate that DRAGON’s execution is distributed among the sensor nodes, has low latency, is energy efficient, is able to run on a wide array of physical deployments, and has performance which scales well with event size, speed, and count.


*------------*------------*------------*------------*------------*------------*

DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
DISTRIBUTED DIAGNOSIS OF DYNAMIC EVENTS IN PARTITIONABLE ARBITRARY TOPOLOGY NETWORKS
This work introduces the Distributed Network Reachability(DNR) algorithm, a distributed system-level diagnosis algorithm that allows every node of a partitionable arbitrary topology network to determine which portions of the network are reachable and unreachable.
DNR is the first distributed diagnosis algorithm that works in the presence of network partitions and healings caused by dynamic fault and repair events. Both crash and timing faults are assumed, and a faulty node is indistinguishable of a network partition. Every link is alternately tested by one of its adjacent nodes at subsequent testing intervals. Upon the detection of a new event, the new diagnostic information is disseminated to reachable nodes.
New events can occur before the dissemination completes. Any time a new event is detected or informed, a working node may compute the network reachability using local diagnostic information. The bounded correctness of DNR is proved, including the bounded diagnostic latency, bounded startup and accuracy.
Simulation results are presented for several random and regular topologies, showing the performance of the algorithm under highly dynamic fault situations


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
COST-DRIVEN SCHEDULING OF GRID WORKFLOWS USING PARTIAL CRITICAL PATHS
Recently, utility Grids have emerged as a new model of service provisioning in heterogeneous distributed systems. In this model, users negotiate with service providers on their required Quality of Service and on the corresponding price to reach a Service Level Agreement.
One of the most challenging problems in utility Grids is workflow scheduling, i.e., the problem of satisfying the QoS of the users as well as minimizing the cost of workflow execution. In this paper, we propose a new QoS-based workflow scheduling algorithm based on a novel concept called Partial Critical Paths (PCP), that tries to minimize the cost of workflow execution while meeting a user-defined deadline.
The PCP algorithm has two phases: in the deadline distribution phase it recursively assigns subdeadlines to the tasks on the partial critical paths ending at previously assigned tasks, and in the planning phase it assigns the cheapest service to each task while meeting its subdeadline. The simulation results show that the performance of the PCP algorithm is very promising


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
COMPARISON-BASED SYSTEM-LEVEL FAULT DIAGNOSIS: A NEURAL NETWORK APPROACH
We consider the fault identification problem, also known as the system-level self-diagnosis, in multiprocessor and multicomputer systems using the comparison approach. In this diagnosis model, a set of tasks is assigned to pairs of nodes and their outcomes are compared by neighboring nodes.
Given that comparisons are performed by the nodes themselves, faulty nodes can incorrectly claim that fault-free nodes are faulty or that faulty ones are fault-free. The collections of all agreements and disagreements, i.e., the comparison outcomes, among the nodes are used to identify the set of permanently faulty nodes.
Since the introduction of the comparison model, significant progress has been made in both theory and practice associated with the original model and its offshoots. Nevertheless, the problem of efficiently identifying the set of faulty nodes when not all the comparison outcomes are available to the diagnosis algorithm at the beginning of the diagnosis phase, i.e., partial syndromes, remains an outstanding research issue.
In this paper, we introduce a novel diagnosis approach using neural networks to solve this fault identification problem using partial syndromes. Results from a thorough simulation study demonstrate the effectiveness of the neural-network-based self-diagnosis algorithm for randomly generated diagnosable systems of different sizes and under various fault scenarios.
We have then conducted extensive simulations using partial syndromes and nondiagnosable systems. Simulations showed that the neural-network-based diagnosis approach provided good results making it a viable addition or alternative to existing diagnosis algorithms


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
CATCHING PACKET DROPPERS AND MODIFIERS IN WIRELESS SENSOR NETWORKS
Packet dropping and modification are common attacks that can be launched by an adversary to disrupt communication in wireless multihop sensor networks. Many schemes have been proposed to mitigate or tolerate such attacks, but very few can effectively and efficiently identify the intruders.
To address this problem, we propose a simple yet effective scheme, which can identify misbehaving forwarders that drop or modify packets. Extensive analysis and simulations have been conducted to verify the effectiveness and efficiency of the scheme


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
CASHING IN ON THE CACHE IN THE CLOUD
Over the past decades, caching has become the key technology used for bridging the performance gap across memory hierarchies via temporal or spatial localities; in particular, the effect is prominent in disk storage systems. Applications that involve heavy I/O activities, which are common in the cloud, probably benefit the most from caching.
The use of local volatile memory as cache might be a natural alternative, but many well-known restrictions, such as capacity and the utilization of host machines, hinder its effective use. In addition to technical challenges, providing cache services in clouds encounters a major practical issue (quality of service or service level agreement issue) of pricing. Currently, (public) cloud users are limited to a small set of uniform and coarse-grained service offerings, such as High-Memory and High-CPU in Amazon EC2.
In this paper, we present the cache as a service (CaaS) model as an optional service to typical infrastructure service offerings. Specifically, the cloud provider sets aside a large pool of memory that can be dynamically partitioned and allocated to standard infrastructure services as disk cache.
We first investigate the feasibility of providing CaaS with the proof-of-concept elastic cache system (using dedicated remote memory servers) built and validated on the actual system, and practical benefits of CaaS for both users and providers (i.e., performance and profit, respectively) are thoroughly studied with a novel pricing scheme.
Our CaaS model helps to leverage the cloud economy greatly in that 1) the extra user cost for I/O performance gain is minimal if ever exists, and 2) the provider’s profit increases due to improvements in server consolidation resulting from that performance gain. Through extensive experiments with eight resource allocation strategies, we demonstrate that our CaaS model can be a promising cost-efficient solution for both users and providers


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
AN ONLINE DATA ACCESS PREDICTION AND OPTIMIZATION APPROACH FOR DISTRIBUTED SYSTEMS
Current scientific applications have been producing large amounts of data. The processing, handling and analysis of such data require large-scale computing infrastructures such as clusters and grids.
In this area, studies aim at improving the performance of data-intensive applications by optimizing data accesses. In order to achieve this goal, distributed storage systems have been considering techniques of data replication, migration, distribution, and access parallelism.
However, the main drawback of those studies is that they do not take into account application behavior to perform data access optimization. This limitation motivated this paper which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. In order to accomplish such a goal, this approach organizes application behaviors as time series and, then, analyzes and classifies those series according to their properties.
By knowing properties, the approach selects modeling techniques to represent series and perform predictions, which are, later on, used to optimize data access operations. This new approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community.
Experiments confirm this new approach reduces application execution time in about 50 percent, specially when handling large amounts of data


*------------*------------*------------*------------*------------*------------*

DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
A SURVEY OF PARALLEL PROGRAMMING MODELS AND TOOLS IN THE MULTI AND MANY-CORE ERA
In this work, we present a survey of the different parallel programming models and tools available today with special consideration to their suitability for high-performance computing. Thus, we review the shared and distributed memory approaches, as well as the current heterogeneous parallel programming model.
In addition, we analyze how the partitioned global address space (PGAS) and hybrid parallel programming models are used to combine the advantages of shared and distributed memory systems. The work is completed by considering languages with specific parallel support and the distributed programming paradigm. In all cases, we present characteristics, strengths, and weaknesses.
The study shows that the availability of multi-core CPUs has given new impulse to the shared memory parallel programming approach. In addition, we find that hybrid parallel programming is the current way of harnessing the capabilities of computer clusters with multi-core nodes.
On the other hand, heterogeneous programming is found to be an increasingly popular paradigm, as a consequence of the availability of multi-core CPUs+GPUs systems. The use of open industry standards like OpenMP, MPI, or OpenCL, as opposed to proprietary solutions, seems to be the way to uniformize and extend the use of parallel programming models


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
A SEQUENTIALLY CONSISTENT MULTIPROCESSOR ARCHITECTURE FOR OUT-OF-ORDER RETIREMENT OF INSTRUCTIONS
Out-of-order retirement of instructions has been shown to be an effective technique to increase the number of in-flight instructions. This form of runtime scheduling can reduce pipeline stalls caused by head-of-line blocking effects in the reorder buffer (ROB).
Expanding the width of the instruction window can be highly beneficial to multiprocessors that implement a strict memory model, especially when both loads and stores encounter long latencies due to cache misses, and whose stalls must be overlapped with instruction execution to overcome the memory latencies.
Based on the Validation Buffer (VB) architecture (a previously proposed out-of-order retirement, checkpoint-free architecture for single processors), this paper proposes a cost-effective, scalable, out-of-order retirement multiprocessor, capable of enforcing sequential consistency without impacting the design of the memory hierarchy or interconnect.
Our simulation results indicate that utilizing a VB can speed up both relaxed and sequentially consistent in-order retirement in future multiprocessor systems by between 3 and 20 percent, depending on the ROB size


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
A SECURE ERASURE CODE-BASED CLOUD STORAGE SYSTEM WITH SECURE DATA FORWARDING
A cloud storage system, consisting of a collection of storage servers, provides long-term storage services over the Internet. Storing data in a third party’s cloud system causes serious concern over data confidentiality.
General encryption schemes protect data confidentiality, but also limit the functionality of the storage system because a few operations are supported over encrypted data. Constructing a secure storage system that supports multiple functions is challenging when the storage system is distributed and has no central authority.
We propose a threshold proxy re-encryption scheme and integrate it with a decentralized erasure code such that a secure distributed storage system is formulated. The distributed storage system not only supports secure and robust data storage and retrieval, but also lets a user forward his data in the storage servers to another user without retrieving the data back.
The main technical contribution is that the proxy re-encryption scheme supports encoding operations over encrypted messages as well as forwarding operations over encoded and encrypted messages. Our method fully integrates encrypting, encoding, and forwarding.
We analyze and suggest suitable parameters for the number of copies of a message dispatched to storage servers and the number of storage servers queried by a key server. These parameters allow more flexible adjustment between the number of storage servers and robustness


*------------*------------*------------*------------*------------*------------*

DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
TRUSTWORTHY COORDINATION OF WEB SERVICES ATOMIC TRANSACTIONS
The Web Services Atomic Transactions (WS-AT) specification makes it possible for businesses to engage in standard distributed transaction processing over the Internet using Web Services technology. For such business applications, trustworthy coordination of WS-AT is crucial.
In this paper, we explain how to render WS-AT coordination trustworthy by applying Byzantine Fault Tolerance (BFT) techniques. More specifically, we show how to protect the core services described in the WS-AT specification, namely, the Activation service, the Registration service, the Completion service and the Coordinator service, against Byzantine faults.
The main contribution of this work is that it exploits the semantics of the WS-AT services to minimize the use of yzantine Agreement (BA), instead of applying BFT techniques naively, which would be prohibitively expensive.
We have incorporated our BFT protocols and mechanisms into an open-source framework that implements the WS-AT specification. The resulting BFT framework for WS-AT is useful for business applications that are based on WS-AT and that require a high degree of dependability, security, and trust


*------------*------------*------------*------------*------------*------------*
 
DOMAIN - PARALLEL AND DISTRIBUTED SYSTEMS
A NETWORK CODING EQUIVALENT CONTENT DISTRIBUTION SCHEME FOR EFFICIENT PEER-TO-PEER INTERACTIVE VOD STREAMING
Although random access operations are desirable for on-demand video streaming in peer-to-peer systems, they are difficult to efficiently achieve due to the asynchronous interactive behaviors of users and the dynamic nature of peers.
In this paper, we propose a network coding equivalent content distribution (NCECD) scheme to efficiently handle interactive video-on-demand (VoD) operations in peer-to-peer systems. In NCECD, videos are divided into segments that are then further divided into blocks. These blocks are encoded into independent blocks that are distributed to different peers for local storage.
With NCECD, a new client only needs to connect to a sufficient number of parent peers to be able to view the whole video and rarely needs to find new parents when performing random access operations. In most existing methods, a new client must search for parent peers containing specific segments; however, NCECD uses the properties of network coding to cache equivalent content in peers, so that one can pick any parent without additional searches.
Experimental results show that the proposed scheme achieves low startup and jump searching delays and requires fewer server resources. In addition, we present the analysis of system parameters to achieve reasonable block loss rates for the proposed scheme