Tutorials are included in the conference fee.
However, so that we can arrange proper logistics for the tutorials, we kindly
ask you, after registering, to send an email to europar05@di.fct.unl.pt with the subject
"Tutorial Attendance" and, in the message body, your name and the
tutorial(s) you want to attend (#1, #2 or #3). All tutorials will take
place on Tuesday, August 30, 2005.
Tutorials:
- Testing Multithreaded and Distributed Applications
- Kerrighed, a Single System Image Cluster Operating System
- Creating and Managing Distributed Scientific Workflows
Outline of the Tuesday Schedule:
9:00 - Bus takes participants from Hotel Costa da Caparica
9:30 - Start of Tutorials 1 and 2
11:00 - Coffee Break
11:30 - Continuation of the Tutorials
12:30 - Lunch
13:00 - Bus takes Tutorial 3 participants from Hotel Costa da Caparica
14:00 - Continuation of Tutorials 1 and 2 / Start of Tutorial 3
15:30 - Coffee Break
17:30 - End
(Buses take participants back to Hotel Costa da Caparica)
Tutorial 1 - Testing Multithreaded and Distributed Applications
Eitan Farchi and Shmuel Ur
IBM, Haifa, Israel
Full-day tutorial with hands-on session
Location: Room 1B
The industry's newest proven techniques and
tools for reviewing and testing multithreaded applications will be presented
in this tutorial, along with hands-on experience in their use. We will begin
with an overview of why multithreaded code is so difficult to develop, and
present, in some detail, the bug patterns commonly found in such code. We
will then introduce the interleaving review technique (IRT), a very effective,
scenario-based review technique for exposing scheduling problems. A review
exercise in which participants will use concurrent bug patterns to find deadlocks
and race conditions will conclude the first part of the tutorial.
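To give a flavour of why scheduling matters, the following minimal Python sketch enumerates every interleaving of two threads that each read a shared counter and then write back the value plus one. It is a toy illustration only, not the interleaving review technique itself; the step names, the `interleavings` and `run` helpers, and the simulated "threads" are assumptions made for this example.

```python
from itertools import combinations

def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        chosen = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in chosen else next(ib) for i in range(n)]

def run(schedule):
    """Execute one schedule of (thread, op) steps against a simulated shared counter."""
    shared = 0
    local = {}  # per-thread private copy of the value last read
    for thread, op in schedule:
        if op == "read":
            local[thread] = shared
        else:  # "write": store the stale local value plus one
            shared = local[thread] + 1
    return shared

# Hypothetical unsynchronised increment, split into its two steps per thread.
t1 = [("T1", "read"), ("T1", "write")]
t2 = [("T2", "read"), ("T2", "write")]

schedules = list(interleavings(t1, t2))
outcomes = sorted({run(s) for s in schedules})
lost_updates = [s for s in schedules if run(s) != 2]

print(len(schedules), outcomes, len(lost_updates))  # 6 [1, 2] 4
```

Only the two schedules in which one thread finishes before the other starts produce the intended result; the other four lose an update, which is the kind of scenario a reviewer or a test has to provoke deliberately.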
The second part of the tutorial will discuss
effective unit testing that exposes concurrency bugs in multithreaded programs,
such as deadlocks and race conditions. This is of utmost importance: bugs
in multithreaded code are usually found during system or stress testing,
or at the customer site, often with catastrophic consequences. We will further
explain how function and system testing of multithreaded programs can be improved.
An exercise in test execution techniques and analysis will conclude the
tutorial. In the course of the session, we will also touch on aspects of
debugging and coverage that are unique to multithreaded and distributed applications.
The techniques taught in the tutorial are being
applied in industry to multithreaded, distributed, and concurrent code.
They are used on all forms of software, from embedded microcode all the way
to huge middleware applications.
- Typical problems in designing, coding, inspecting and testing multithreaded code
- Design-level test scenario selection and representation
- Bug-pattern based review
- The Interleaving Review Technique (IRT)
- Roles in review: program counter, devil's advocate, and stenographer
- Devil's advocate guidelines
- Bug patterns
- How to represent the system state and the system changes when multi-threading, several CPUs, and asynchronous calls are used
- Testing tools
- Interleaving generation tools such as ConTest (ConTest can be downloaded from http://www.alphaworks.ibm.com/tech/contest)
- Race detection tools such as Intel's Thread Checker (available from http://www.intel.com/software/products/threading/tcwin/overview.htm)
- Coverage, debugging, and replay options
- Unit testing of concurrent programs
- Writing unit level multithreaded tests
- Fast creation of multithreaded testing harness
- A method for stressing units and evaluating multithreaded tests
- Function and system testing
- Writing system level multithreaded tests
- Making tests more effective
- Evaluating test quality
Tutorial 2 - Kerrighed, a Single System Image Cluster Operating System
Christine Morin and Renaud Lottiaux
IRISA/INRIA, France
Full-day tutorial with hands-on session
Location: Room 1A
Single System Image (SSI) systems for clusters
have recently gained a lot of interest, in particular in the area of high
performance computing. A single system image system provides the illusion
that a cluster is a single machine. Such a system eases cluster use and
programming for parallel computing. An SSI system globally manages all the cluster
resources to hide their distribution across the cluster nodes. It is made
up of a set of distributed services to manage processes, memory, data streams
and files cluster-wide.
The goal of this tutorial is to provide a detailed
understanding of the SSI technology available today. The Kerrighed system, one
of the leading SSI technologies for clusters, will be presented. Kerrighed
is a distributed operating system based on Linux giving the illusion of a
virtual SMP machine. Results from a qualitative and quantitative comparative
study with openMosix and OpenSSI, two other Linux-based SSI systems, will
be analyzed. Future research directions in the design of SSI systems will
be discussed.
- Introduction to clusters
- Presentation of the SSI concept
- State of the art
- Overview of Kerrighed
- Kerrighed architecture
- Global process scheduling (process placement, dynamic load balancing, configurability of the scheduling policy)
- Kerrighed key concepts: ghosts, containers, dynamic streams
- Global process management in Kerrighed (process naming, process migration, process cloning, global ps and top)
- Global data stream management (socket, pipe)
- Global memory management in Kerrighed (distributed shared memory, remote paging)
- Distributed synchronization (barriers, semaphores, ...)
- KerFS: Kerrighed distributed file system (virtual global disk, cooperative caches, data replication)
- Kerrighed high performance communication system
- Support for parallel application checkpoint/restart in Kerrighed
- Capabilities to selectively enable/disable SSI mechanisms
- Running applications on Kerrighed
- Deploying Kerrighed with OSCAR using the SSI-OSCAR package
- Experience with Kerrighed
- Comparison with OpenSSI and openMosix (SSI properties coverage, performance comparison)
- Future research directions
Christine Morin received an engineering degree
from the Institut National des Sciences Appliquées (INSA) of Rennes
(France) in 1987, and master's and PhD degrees in computer science from the
University of Rennes I in 1987 and 1990, respectively. In March 1998, she
received the Habilitation à diriger des recherches in computer science
from the Université de Rennes
Since 1991, she has held a researcher position
at INRIA and has carried out her research activities at IRISA/INRIA-Rennes.
From October 2000 to August 2002, she held a temporary assistant professor
position at IFSIC (University of Rennes I). She now holds a senior researcher
position at INRIA.
Research Interests
Her research interests are in operating systems, distributed systems, fault
tolerance and cluster computing. She leads the Kerrighed research activities, which aim
at the design and implementation of a single system image cluster operating
system for high performance computing (http://www.kerrighed.org). The Kerrighed
software, based on Linux, is available as open source software under
the GPL licence.
Tutorial 3 - Creating and Managing Distributed Scientific Workflows
Techniques and Tools
Omer F. Rana, Ian Taylor, Matthew Shields and David W. Walker
Cardiff University, UK
Half-day tutorial with demonstrations
Location: Room 2A
Viewing an application as a coordinated execution
of one or more services has become an important undertaking recently. A variety
of approaches have been introduced which enable distributed services to be
combined across different administrative domains. Each service in this context
may be independently managed, and may be made available at different
times. Workflow is a concept commonly used to coordinate the execution
of such services; it is adapted from its use in automating business and information
processes within an organisation. The notion of workflow has existed for many
years, and workflow enactment generally refers to the automated execution
of some activities in a pre-defined order.
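As a rough illustration of enactment in a pre-defined order, the minimal Python sketch below runs a small workflow whose steps are ordered by their declared dependencies, using the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). The step names, the `enact` helper, and the local functions standing in for distributed services are all hypothetical; this is not how any particular workflow engine is implemented.

```python
from graphlib import TopologicalSorter

def enact(workflow, services):
    """Run each service after all services it depends on; return results by name."""
    results = {}
    for name in TopologicalSorter(workflow).static_order():
        # Outputs of the predecessors become this step's inputs, in declared order.
        inputs = [results[dep] for dep in workflow.get(name, ())]
        results[name] = services[name](*inputs)
    return results

# Hypothetical workflow: each step maps to the steps it depends on.
workflow = {
    "load": [],
    "clean": ["load"],
    "count": ["clean"],
    "total": ["clean"],
    "report": ["count", "total"],
}

# Local callables stand in for independently managed remote services.
services = {
    "load": lambda: [3, None, 4, 5],
    "clean": lambda rows: [r for r in rows if r is not None],
    "count": lambda rows: len(rows),
    "total": lambda rows: sum(rows),
    "report": lambda n, s: f"{n} values, sum {s}",
}

results = enact(workflow, services)
print(results["report"])  # 3 values, sum 12
```

Here the dependency table plays the role of the workflow description and the loop plays the role of the enactment engine; a real engine would additionally have to cope with remote invocation, failures, and services that become available at different times.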
Separating the "what", which
specifies the "knowledge to be used in solving problems", from the "how",
the "problem solving strategies (process) by which that knowledge is used",
is an important step to enable the re-use of services within a workflow.
This division between control and logic helps components
developed by a variety of vendors to interoperate more efficiently. Treating
workflow as the "how" gives us a good handle on why the problem solving
strategies may be usefully shared between different scientific communities.
Additionally, workflow allows application developers to use problem solving
features that would otherwise be too expensive to handcraft. For instance,
the ability to directly manage the execution ordering of a set of services
at runtime allows one to support advanced features like computational steering.
Recent advances in Grid computing, for instance, often aim to provide suitable
infrastructure to enable such services to be deployed and used by a variety
of workflow enactment engines.
The aim of this tutorial is to introduce the
general notion of distributed workflows, and then demonstrate techniques that
may be used to support such workflows. A tool that may be used to enact
distributed workflows will also be introduced. A significant portion of
this tutorial will assume that a user has access to a graphical interface
to compose the workflow. Techniques which are not based on such an interface,
but rely on the use of planning techniques, will also be briefly introduced.
The tutorial will be between 2.5 and 3 hours
long.
- Introduction to workflow techniques (this section constitutes 20% of the tutorial time)
  - What is workflow and its relevance to scientific computing
  - Workflow enactment techniques
  - Survey of workflow engines for distributed scientific computing
  - Workflow and Problem Solving Environments
  - Workflow and Grid Computing
  - Standardisation efforts
  - When workflow techniques should NOT be used
- Constructing and managing workflows (this section constitutes 25% of the tutorial time)
  - Managing a workflow session
  - Data type checking
  - Data staging in a workflow
  - Management of conditions
  - Support for loops and recursive structures
  - Workflow enactment infrastructure (event-based vs. dataflow-based)
- Application example: Distributed Data Mining via the FAEHIM Toolkit (this section constitutes 25% of the tutorial time)
  - The FAEHIM toolkit and its use
  - Support of Triana within FAEHIM
  - Types of services supported
  - Example of a service + its description
  - Combining services from FAEHIM
  - Combining services from FAEHIM with third party services
  - Writing your own services
- Adaptive workflows (this section constitutes 25% of the tutorial time)
  - Static vs. Dynamic workflows
  - Service discovery support
  - Types of discovery mechanisms
  - Managing reference handles
  - Workflow generation via planning techniques
  - Workflow embedding techniques
  - Design patterns for workflow
  - Adaptive design patterns through workflow operators
- Workflow-related research themes (this section constitutes 5% of the tutorial time)
  - Workflow and Provenance issues
  - Workflow interoperability
This tutorial will appeal to the following
individuals:
Omer F. Rana is a Senior Lecturer in the Department
of Computer Science at Cardiff University, and the Deputy Director of the
Welsh eScience Centre. He also acts as a technical advisor to "Grid Technology
Partners" (www.gridpartners.com), a US-based company specialising in
Grid technology transfer to industry. He holds a PhD in Computing from Imperial
College, London, and works in the areas of high performance distributed computing,
multi-agent systems and data mining. Dr Rana has been involved in the programme
committees for various conferences and workshops in the area of Grid Computing,
and also serves on the editorial boards of the "Concurrency and Computation:
Practice and Experience", "Scientific Programming", and "ACM Transactions
on Autonomous and Adaptive Systems" journals.