Tutorials

Tutorial registration

Tutorials are included in the conference fee. However, so that we can arrange the logistics, we kindly ask you, after registering, to send an email to europar05@di.fct.unl.pt with the subject "Tutorial Attendance", giving your name and the tutorial(s) you want to attend (#1, #2 or #3) in the message body. All tutorials will take place on Tuesday, August 30, 2005.

Tutorials:
  1. Testing Multithreaded and Distributed Applications
  2. Kerrighed, a Single System Image Cluster Operating System
  3. Creating and Managing Distributed Scientific Workflows

Outline of the Tuesday Schedule:

9:00 -  Bus takes participants from Hotel Costa da Caparica
9:30 -  Start of Tutorials 1 and 2
11:00 - Coffee Break
11:30 - Continuation of the Tutorials
12:30 - Lunch
13:00 - Bus takes Tutorial 3 participants from Hotel Costa da Caparica
14:00 - Continuation of Tutorials 1 and 2 / Start of Tutorial 3
15:30 - Coffee Break
17:30 - End
(Buses take participants back to Hotel Costa da Caparica)


Tutorial 1 - Testing Multithreaded and Distributed Applications

Eitan Farchi and Shmuel Ur
IBM, Haifa, Israel
Full-day tutorial with hands-on session
Location: Room 1B


The industry's newest proven techniques and tools for reviewing and testing multithreaded applications will be presented in this tutorial, along with hands-on experience in their use. We will begin with an overview of why multithreaded code is so difficult to develop, and present, in some detail, the bug patterns commonly found in such code. We will then introduce the interleaving review technique (IRT), a very effective, scenario-based review technique for exposing scheduling problems. A review exercise in which participants will use concurrent bug patterns to find deadlocks and race conditions will conclude the first part of the tutorial.

The second part of the tutorial will discuss effective unit testing of multithreaded programs, aimed at exposing concurrency bugs such as deadlocks and race conditions. This is of utmost importance: bugs in multithreaded code are usually found during system or stress testing, or at the customer site, often with catastrophic consequences. We will further explain how function and system testing of multithreaded programs can be improved. An exercise in test execution techniques and analysis will conclude the tutorial. In the course of the session, we will also touch on aspects of debugging and coverage that are unique to multithreaded and distributed applications.

The techniques taught in the tutorial are being applied in industry to multithreaded, distributed, and concurrent code. They are used on all forms of software, from embedded microcode all the way to huge middleware applications.

The following topics are covered:

- Typical problems in designing, coding, inspecting and testing multithreaded code
- Design-level test scenario selection and representation
- Bug-pattern based review
- The Interleaving Review Technique (IRT)
    - Roles in review: program counter, devil's advocate, and stenographer
    - Devil's advocate guidelines
- Bug patterns
- How to represent the system state and the system changes when multithreading, several CPUs, and asynchronous calls are used
- Testing tools
    - Interleaving generation tools such as ConTest (ConTest can be downloaded from http://www.alphaworks.ibm.com/tech/contest)
    - Race detection tools such as Intel's Thread Checker (available from http://www.intel.com/software/products/threading/tcwin/overview.htm)
    - Coverage, debugging, and replay options
- Unit testing of concurrent programs
    - Writing unit level multithreaded tests
    - Fast creation of multithreaded testing harness
    - A method for stressing units and evaluating multithreaded tests
- Function and system testing
    - Writing system level multithreaded tests
    - Making tests more effective
    - Evaluating test quality
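
As a rough illustration of the kind of interleaving reasoning these topics refer to, the Python sketch below enumerates every schedule of two hypothetical threads that each increment a shared counter using a separate read step and write step. It is not taken from the tutorial material, from IRT, or from ConTest; the names run_schedule and all_schedules are invented for this example, and the "threads" are simulated rather than real.

```python
from itertools import combinations

# Assumption: each simulated thread performs exactly two atomic steps,
# first reading the shared counter, then writing back (value read + 1).
STEPS_PER_THREAD = 2

def run_schedule(schedule):
    """Execute one interleaving, e.g. (0, 1, 0, 1), and return the final counter."""
    counter = 0
    local = {}  # value each thread read in its first step
    for tid in schedule:
        if tid not in local:
            local[tid] = counter      # step 1: read the shared counter
        else:
            counter = local[tid] + 1  # step 2: write back the stale value + 1
    return counter

def all_schedules():
    """Yield every interleaving of two threads with two steps each."""
    total = 2 * STEPS_PER_THREAD
    for slots in combinations(range(total), STEPS_PER_THREAD):
        yield tuple(0 if i in slots else 1 for i in range(total))

results = {s: run_schedule(s) for s in all_schedules()}
# A schedule loses an update when the counter does not reach 2.
lost_updates = [s for s, final in results.items() if final != 2]

for schedule, final in sorted(results.items()):
    print(schedule, "->", final)
```

In this toy model only the two schedules in which one thread finishes before the other starts reach the expected value of 2; the other four lose an update. This is why a test that happens to exercise a single schedule can easily miss the bug.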


Tutorial 2 - Kerrighed, a Single System Image Cluster Operating System

Christine Morin and Renaud Lottiaux
IRISA/INRIA, France
Full-day tutorial with hands-on session
Location: Room 1A

Single System Image (SSI) systems for clusters have recently gained a lot of interest, in particular in the area of high performance computing. A single system image system provides the illusion that a cluster is a single machine. Such a system eases cluster use and programming for parallel computing. An SSI globally manages all the cluster resources to hide resource distribution in the cluster nodes. It is made up of a set of distributed services to manage processes, memory, data streams and files cluster-wide.

The goal of this tutorial is to provide a detailed understanding of the SSI technology available today. The Kerrighed system, one of the leading SSI technologies for clusters, will be presented. Kerrighed is a distributed operating system based on Linux that gives the illusion of a virtual SMP machine. Results from a qualitative and quantitative comparative study with openMosix and OpenSSI, two other Linux-based SSI systems, will be analyzed. Future research directions in the design of SSI systems will be discussed.

The outline of the tutorial is the following (with demos associated with the presentation of Kerrighed features):

Presenters Bio

Christine Morin received an engineering degree from the Institut National des Sciences Appliquées (INSA) of Rennes (France) in 1987, and master and PhD degrees in computer science from the University of Rennes I in 1987 and 1990, respectively. In March 1998, she received the Habilitation à diriger des recherches in computer science from the Université de Rennes

Since 1991, she has held a researcher position at INRIA and has carried out her research activities at IRISA/INRIA-Rennes. From October 2000 to August 2002, she held a temporary assistant professor position at IFSIC (University of Rennes I). She now holds a senior researcher position at INRIA.

Research Interests
Her research interests are in operating systems, distributed systems, fault tolerance and cluster computing. She leads the Kerrighed research activities, which aim at the design and implementation of a single system image cluster operating system for high performance computing (http://www.kerrighed.org). The Kerrighed software, based on Linux, is available as open source software under the GPL licence.


Tutorial 3 - Creating and Managing Distributed Scientific Workflows

Techniques and Tools
Omer F. Rana, Ian Taylor, Matthew Shields and David W. Walker
Cardiff University, UK
Half-day tutorial with demonstrations
Location: Room 2A

Viewing an application as a coordinated execution of one or more services has recently become an important approach. A variety of techniques have been introduced that enable distributed services to be combined across different administrative domains. Each service in this context may be independently managed, and may be made available at different times. Workflow is a concept commonly used to coordinate the execution of such services, adapted from its use in automating business and information processes within an organisation. The notion of workflow has existed for many years, and workflow enactment generally refers to the automated execution of some activities in a pre-defined order.

The need to separate the "what" -- which specifies the "knowledge to be used in solving problems", from the "how" -- the "problem solving strategies (process) by which that knowledge is used", is an important step to enable the re-use of services within a workflow. This division between the control and logic is useful to enable components developed by a variety of vendors to interoperate more efficiently. Treating workflow as the "how" gives us a good handle on why the problem solving strategies may be usefully shared between different scientific communities. Additionally, workflow allows application developers to use problem solving features that would otherwise be too expensive to handcraft. For instance, the ability to directly manage the execution ordering of a set of services at runtime allows one to support advanced features like computational steering. Recent advances in Grid computing, for instance, often aim to provide suitable infrastructure to enable such services to be deployed and used by a variety of workflow enactment engines.

The aim of this tutorial is to introduce the general notion of distributed workflows, and then demonstrate techniques that may be used to support such workflows. A tool that may be used to enact distributed workflows will also be introduced. A significant portion of this tutorial will assume that a user has access to a graphical interface to compose the workflow. Techniques which are not based on such an interface, but rely on the use of planning techniques, will also be briefly introduced.
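
To make the idea of enactment and of separating the services from their coordination a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the activity names (load, clean, classify, report) and the enact function are hypothetical, the "services" are plain local functions rather than distributed ones, and nothing here reflects Triana, FAEHIM, or any other tool covered in the tutorial.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def enact(workflow, activities):
    """Run each activity once, after all of the activities it depends on.

    workflow   maps an activity name to a tuple of its predecessors (the ordering)
    activities maps an activity name to a callable (the services themselves)
    """
    order = list(TopologicalSorter(workflow).static_order())
    results = {}
    for name in order:
        # Feed each activity the outputs of its predecessors, in declared order.
        inputs = [results[dep] for dep in workflow.get(name, ())]
        results[name] = activities[name](*inputs)
    return order, results

# Hypothetical services: each one knows nothing about the others.
activities = {
    "load": lambda: [3, 1, 2],
    "clean": lambda xs: sorted(xs),
    "classify": lambda xs: ["high" if x > 1 else "low" for x in xs],
    "report": lambda labels: ", ".join(labels),
}

# Hypothetical workflow: only the dependencies between services are stated here.
workflow = {
    "clean": ("load",),
    "classify": ("clean",),
    "report": ("classify",),
}

order, results = enact(workflow, activities)
print(order)
print(results["report"])
```

Because the ordering lives in the workflow mapping and not in the functions, the same services can be recombined into a different workflow without being modified.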

Tutorial Structure and Organisation

The tutorial will be between 2.5 and 3 hours long.

  1. Introduction to workflow techniques 
    (this section constitutes 20% of the tutorial time)

    - What workflow is and its relevance to scientific computing
    - Workflow enactment techniques
    - Survey of workflow engines for distributed scientific computing
    - Workflow and Problem Solving Environments
    - Workflow and Grid Computing
    - Standardisation efforts
    - When workflow techniques should NOT be used
  2. Constructing and managing workflows
    (this section constitutes 25% of the tutorial time)

    - Managing a workflow session
    - Data type checking
    - Data staging in a workflow
    - Management of conditions
    - Support for loops and recursive structures
    - Workflow enactment infrastructure (event-based vs. dataflow-based)
  3. Application example: Distributed Data Mining via the FAEHIM Toolkit
    (this section constitutes 25% of the tutorial time)

    - The FAEHIM toolkit and its use
    - Support of Triana within FAEHIM
    - Types of services supported
    - Example of a service + its description
    - Combining services from FAEHIM
    - Combining services from FAEHIM with third party services
    - Writing your own services
  4. Adaptive workflows
    (this section constitutes 25% of the tutorial time)

    - Static vs. Dynamic workflows
    - Service discovery support
    - Types of discovery mechanisms
    - Managing reference handles
    - Workflow generation via planning techniques
    - Workflow embedding techniques
    - Design patterns for workflow
    - Adaptive design patterns through workflow operators
  5. Workflow-related research themes
    (this section constitutes 5% of the tutorial time)

    - Workflow and Provenance issues
    - Workflow interoperability

Who should attend?

This tutorial will appeal to the following individuals:

Presenters Bio

Omer F. Rana is a Senior Lecturer in the Department of Computer Science at Cardiff University, and the Deputy Director of the Welsh eScience Centre. He also acts as a technical advisor to "Grid Technology Partners" (www.gridpartners.com) -- a US-based company specialising in Grid technology transfer to industry. He holds a PhD in Computing from Imperial College, London, and works in the areas of high performance distributed computing, multi-agent systems and data mining. Dr Rana has served on the programme committees of various conferences and workshops in the area of Grid Computing, and also serves on the editorial boards of the "Concurrency and Computation: Practice and Experience", "Scientific Programming", and "ACM Transactions on Autonomous and Adaptive Systems" journals.


 
Departamento de Informática - Faculdade de Ciências e Tecnologia - Universidade Nova de Lisboa - 2005 - Jorge F. Custódio