Computer Centre 2000 project
VIDEOCONFERENCING FOR HEP RESEARCHERS
INTRODUCTION
In recent years packet-mode videoconferencing has found a very important place in the academic world and has been adopted by the major HEP (High Energy Physics) sites and collaborations. Both ATLAS and CMS clearly indicate in their Computing Model Proposals that videoconferencing is an essential part of the tools needed to collaborate worldwide. This is confirmed by the intensive use of the CERN videoconference facilities by the LHC experiments and by the success of the CERN VRVS project.
This project complements the project presented last year, which was not completely implemented due to a lack of proper funding.
There are presently two major forms of
video-conferencing:
1. Packet video-conferencing
For several years a number of initiatives have been undertaken to overcome the many limitations of CODEC video-conferencing by using workstations, packet networks and software applications that enable cooperative distance working. This is what is now called Packet Video-Conferencing. From the user's point of view, a key feature of Packet video-conferencing is that it allows software application sharing and a shared whiteboard. Another key feature is that it may come at no direct cost to users, who can run Packet VC over an existing network.
2. CODEC video-conferencing
This is the best-known and longest-established form of video-conferencing. Closed systems are available from industry and can be installed in conference rooms as dedicated equipment. CODEC systems communicate over ISDN links.
Videoconference sessions among several sites require an additional, expensive piece of equipment called a Multi-point Control Unit (MCU). Telephone companies sell MCU services together with a scheduling system. This scheduling system imposes a major constraint: it may require booking as much as two weeks in advance, which is a very stringent limitation.
OBJECTIVES
The objective is to
complement and develop the work initiated in 1999 and further investigate all
the topics of modern video conferencing tools. The main goals are:
· Consolidate the installed infrastructure and expand it to other LIP sites.
· Deploy videoconference systems and further consolidate them as a service.
· Validate prototypes with our partners CERN, FCCN and other remote research and academic institutions.
· Improve existing tools and develop new ones where needed.
· Further improve quality of service.
PACKET-MODE VIDEOCONFERENCE
Packet-mode video-conferencing makes use of workstations equipped with audio and video capabilities and connected over regular IP packet networks (Internet/MBONE). It can be cheap, flexible and well integrated in the user's computing environment. The applications can be supported on a wide range of UNIX workstations, PCs and in some cases Macintoshes. Public-domain software is available and ensures good interoperability.
Packet-mode videoconferencing is a technology that can reduce the need to travel to meetings and conferences, hence decreasing costs and increasing productivity. This kind of technology is also helpful for geographically scattered organizations, since it allows cooperative work between remote researchers.
Another potential use of this technology is remote intervention to solve problems, for instance in data acquisition systems, allowing engineers and experts to watch equipment and troubleshoot problems without being physically present.
Most packet-mode videoconference applications can run in two modes:
· UNICAST MODE - A unicast packet is addressed to a particular, single system. Only the recipient will "see" this packet, since its network interface knows its own address. All the other stations on the subnet will ignore the packet because its destination address differs from their own. Unicasts cross bridges transparently: since the bridges know the network topology, only the segments that have to carry the traffic will receive the packet.
· MULTICAST MODE - A multicast packet is addressed to a group of nodes. The destination address identifies the group of systems to be reached. A network interface listens only to the groups that the applications on the node have requested. Bridges forward multicasts, and since they cannot know where the potential destinations are located, multicast packets are sent to all interfaces (flooding the whole network) unless special switches are deployed.
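The difference between the two modes comes down to the receive filter applied by each network interface. The following is a minimal sketch of that filter in Python. It is not part of any videoconferencing tool: the `Interface` class and the addresses used are hypothetical and chosen only for illustration.

```python
import ipaddress


class Interface:
    """Toy model of the receive filter of a network interface (hypothetical)."""

    def __init__(self, address):
        self.address = ipaddress.ip_address(address)
        self.groups = set()  # multicast groups requested by local applications

    def join(self, group):
        """Start listening to a multicast group on behalf of an application."""
        group = ipaddress.ip_address(group)
        if not group.is_multicast:
            raise ValueError(f"{group} is not a multicast address")
        self.groups.add(group)

    def accepts(self, destination):
        """Return True if a packet sent to `destination` is read by this interface."""
        destination = ipaddress.ip_address(destination)
        if destination.is_multicast:
            return destination in self.groups
        return destination == self.address


# Example addresses (assumptions): 192.0.2.x is a documentation range,
# 224.2.0.1 lies in the IPv4 multicast range 224.0.0.0/4.
station = Interface("192.0.2.10")
station.join("224.2.0.1")

print(station.accepts("192.0.2.10"))  # unicast to this station
print(station.accepts("192.0.2.11"))  # unicast to another station
print(station.accepts("224.2.0.1"))   # multicast group that was joined
print(station.accepts("224.2.0.2"))   # multicast group that was not joined
```

Only the first and third packets are accepted. The sketch models the end station only; how bridges and switches decide where to forward a packet is outside its scope.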
PARTNERS
Collaboration already exists with the Internet Applications group of the CERN IT division, which is in charge of the CERN pilot packet video and video-conferencing activities. The Internet Applications group is conducting the video-conferencing project for LHC collaborations, officially approved by the LHC Computing Board (LCB). LIP remains particularly interested in this project, which covers a wide range of technologies.
FCCN (Fundação para a Computação Científica Nacional), the Portuguese foundation that coordinates the Portuguese academic and research network (RCCN), is officially researching and supporting packet-mode videoconference activities within the Portuguese network. LIP is collaborating with FCCN to develop and promote the RCCN videoconference infrastructure and its usage.
THE VIRTUAL ROOM VIDEOCONFERENCING SYSTEM (VRVS)
The Virtual Room Videoconferencing System (VRVS) is a packet-mode video-conferencing system based on the LBNL and UCL applications, which are well adapted to HEP needs. It is being developed by CERN and Caltech.
The VRVS system is based on the concept of a “Virtual Videoconference Room”. A series of IP servers (reflectors) connects users within a virtual room via a set of interconnected IP tunnels, so that they form a private video group. Each participant sees the others in the “virtual room” through a series of windows. A web user interface provides secure, on-demand access to each virtual room from anywhere in the world. The “Virtual Rooms” concept makes conference access and scheduling easier and enables effective bandwidth management on critical links.

Figure 1
The video Reflectors run on Unix platforms, and
interconnect the users joining a virtual room by permanent IP tunnels, forming
a set of virtual video sub-networks. Participants at any location can join
videoconferences by contacting their "closest" reflector. In order to
make efficient use of the bandwidth, packets (video, audio and data streams)
are sent through the tunnel between two reflectors only if there are
participants on both sides.
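The forwarding rule described above (a tunnel carries a room's packets only when both ends have participants in that room) can be sketched as follows. This is an illustrative model only and not the VRVS implementation; the `Reflector` class, the `tunnel_carries` function and the room and user names are all hypothetical.

```python
class Reflector:
    """Toy reflector that tracks which local users are in which virtual room."""

    def __init__(self, name):
        self.name = name
        self.rooms = {}  # room name -> set of local participants

    def join(self, room, user):
        self.rooms.setdefault(room, set()).add(user)

    def leave(self, room, user):
        self.rooms.get(room, set()).discard(user)

    def has_participants(self, room):
        return bool(self.rooms.get(room))


def tunnel_carries(room, reflector_a, reflector_b):
    """A tunnel forwards a room's streams only if both ends have participants."""
    return reflector_a.has_participants(room) and reflector_b.has_participants(room)


cern = Reflector("cern")
lip = Reflector("lip")

cern.join("weekly-meeting", "alice")
print(tunnel_carries("weekly-meeting", cern, lip))  # nobody at LIP yet

lip.join("weekly-meeting", "bob")
print(tunnel_carries("weekly-meeting", cern, lip))  # participants on both sides

lip.leave("weekly-meeting", "bob")
print(tunnel_carries("weekly-meeting", cern, lip))  # LIP side is empty again
```

The tunnel is reported as active only in the second case, which is the bandwidth saving the paragraph describes: an idle side of the tunnel receives no audio, video or data streams.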
In addition, the network reflector topology is
chosen taking into account both geography and the bandwidth available on each
network link, in order to optimize the network-connectivity paths. The
extension of the virtual video sub-networks has progressed by installing
several reflectors in Universities and HEP laboratories:
| Country | Reflector sites |
|---|---|
| Switzerland | CERN |
| Italy | CNAF Bologna |
| UK | Rutherford Lab |
| Germany | Heidelberg University |
| France | IN2P3 Lyon, CPPM Marseille |
| Spain | IFCA-University Cantabria |
| Finland | FUNET Helsinki |
| Venezuela | CeCalULA |
| Taiwan | Academia Sinica |
| Portugal | LIP |
| Russia | Moscow State University, Tyer University |
| USA | Caltech, LBNL, SLAC, FNAL, ANL, Jefferson Lab, BNL, DoE HQ Germantown |
The use of Web technology allows any authorized user, at any location, to access a wide range of services for packet-based videoconferencing. The Web-based user interface supports centralized conference scheduling, coordination and access control.
Since going into trial, the system has been deployed and expanded to 1131 registered hosts running the VRVS software.
The following graph shows the evolution of the number of machines registered in the VRVS system since 1 January 1997.

The following table shows the number of real-work meetings (excluding test sessions) from 1 January 1998 to 20 May 1999. Each meeting involved at least one site from Europe (UK, Italy, France, Spain, etc.) or one site from the USA.

At LIP several videoconference tests were conducted in 1999, showing the feasibility of using the system with the currently available bandwidth. The situation improved with the bandwidth upgrade (from 8 to 15 Mb/s) between Portugal and the TEN-155 (Trans European Network) backbone. Many CERN and non-CERN broadcast events have been attended at LIP using the VRVS and MBONE systems.
Events such as the VIII International Conference on Calorimetry in High Energy Physics, organized in Lisbon by LIP, have been broadcast worldwide using packet-mode videoconferencing.
The first VRVS reflector in Portugal was installed in September 1999, simultaneously with the installation of a videoconference infrastructure in the main meeting room, which is now being tested and is expected to become fully operational in October.
Strong interest in this project has been shown by the LIP community: many physicists are interested in using the facilities as soon as they become operational, and others are already using them as beta testers.
TASKS
The main tasks identified are:
· Packet video-conferencing:
  · Extend the existing VC infrastructure to other LIP sites using PCs equipped with audio and video capture boards and low-cost video cameras.
  · Develop recording/playback facilities for packet videoconferences.
  · Implement a videoconference recording system.
  · Implement a video-on-demand server for videoconference playback.
· CODEC and Packet video-conferencing integration:
  · In a mixed environment there is a clear need for inter-working between CODEC systems and packet-based systems. Via a gateway, CODEC and Packet participants will be able to share the same teleconference.
· Improve the installation at the LIP main conference room:
  · Add a second data projector to display local video, such as presentations, simultaneously with the remote video.
  · Add a pair of loudspeakers and a corresponding amplifier to cover the whole room.
COMMODITY COMPUTING COMPONENTS
INTRODUCTION
In past years the High Energy Physics community has satisfied its computing needs by deploying RISC/UNIX machines. The new generation of experiments, based on the LHC (Large Hadron Collider), will demand computing capacities at least three orders of magnitude higher, making the use of traditional RISC farms unaffordable.
At CERN the IT division, through the IP section of the PDP (Physics Data Processing) group, has established a pilot project to construct and evaluate PC farms. Several PC farms are under study and in production:
· LXPLUS - Before the end of the current year CERN will introduce a new central Linux-based public service named LXPLUS.
· PCSF - The PCSF farm has thirty-five client systems and one server machine interconnected with Fast Ethernet. The PCSF service is targeted as an event simulation facility for the LHC experiments and is based on Intel Pentium II and Pentium Pro processors running Windows NT.
· PCRD - The PC Performance Research & Development Project is a pilot farm for many research projects such as NA45, NA48 and COMPASS. It aims at demonstrating that PC farms can be used in a cost-effective manner to process physics jobs. The PCRD farm currently consists of 7 dual Pentium PCs plus 2 low-end RISC machines. The PCRD PCs currently run Windows NT 4.0 and Red Hat Linux in order to understand the power and limitations of both operating systems.
· NA49PC - The NA49PC farm is dedicated to NA49's PROOF visualization tool. It currently consists of 5 dual Pentium II PCs at 300 MHz. The operating system is Windows NT 4.0.
· PC-NOMAD - A PC-based production environment for NOMAD's reconstruction.
LIP has been active in this area since 1997. These activities resulted in the decision to use low-cost PCs as a replacement for desktop X terminals. These PCs run Linux in a cluster configuration, sharing the operating system and configurations. Tools have been developed to ease cluster management and the integration of new desktop systems. Work is also being done on the evaluation of PCs for high-performance computing with Linux, Windows NT and FreeBSD.
This work is being done in the belief that only by
using commodity computing components will the HEP community be sure to align
itself with the best price/performance possible and reach the LHC computing
requirements at an affordable cost.
OBJECTIVES
The objective is to build a small production farm of
PCs running Linux.
THE FARM
The farm systems should be connected through a Fast Ethernet network. The architecture will be composed of one main server with SCSI disks, dedicated mostly to data storage and retrieval, and several (at least two) stripped-down systems dedicated to CPU-intensive tasks. The know-how obtained with the Linux desktop cluster will be applied and enhanced in this project.
The stripped-down systems will be equipped with the essential minimum: motherboard, CPU, memory, network card, graphics card and a small disk for swapping. The keyboard, mouse and monitor ports will be connected to one data switch, allowing the management tasks for all systems to be performed from one console. With this configuration, increasing the computing power of the cluster should be fairly inexpensive.
The choice of Linux is based on previously obtained know-how, which indicates that Linux is better suited to running physics jobs than Windows NT. Linux is also easily integrated into the LIP Computer Centre infrastructure, which is now based mostly on UNIX systems. Users are familiar with UNIX systems and some are already running jobs on their Linux desktops, so this choice will also facilitate the porting of applications to the farm. Finally, Linux is widely available, well supported by the academic community, open source and free.
STORAGE SYSTEMS
INTRODUCTION
Disk failures are common in computer centers: the mechanical nature of disks makes them extremely prone to malfunctions. Hard disks are the only continuously moving devices in a computer, and environmental factors such as dust, temperature and power spikes can considerably reduce the mean time between failures (MTBF). Failures in disks storing critical information usually mean downtime that affects all users of either the broken disk or the system hosting it. The system to which the disk is attached must be brought offline in order to diagnose the problem; if the problem can’t be corrected, the disk must be replaced by a new one and the data restored from backups. Generally this means that something will be lost: even when backups are made frequently, any change made after the last backup is lost. In computer centers where jobs run for long periods (days or even weeks), open files can’t be completely backed up while the corresponding job is running, which can lead to further data loss.
RAID systems overcome these problems through redundancy, keeping the data available when a disk fails. RAID technology comes in different configurations called levels. The level defines how data is written to the drives and the minimum number of drives required.
In RAID level 5 systems the data and parity information are striped across all drives. When one of the disks is not available, the redundant information stored on the other disks is sufficient to continue working, and since the disks are hot-swappable the failed one can be replaced and rebuilt without stopping the system. In practice this means that no data is lost from the broken disk and users can continue to work without interruption.
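The redundancy behind RAID level 5 is XOR parity: the parity block of a stripe is the XOR of its data blocks, so any single missing block equals the XOR of all the remaining ones. The sketch below shows this for one stripe only. The block contents are made-up examples, and a real controller also rotates the position of the parity block from stripe to stripe, which is not modelled here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three data drives (example contents are assumptions).
data_blocks = [b"ATLAS...", b"CMS.....", b"LIP....."]
parity_block = xor_blocks(data_blocks)

# Simulate the loss of the second drive: only the other data blocks and the
# parity block remain readable.
surviving = [data_blocks[0], data_blocks[2], parity_block]
rebuilt = xor_blocks(surviving)

print(rebuilt == data_blocks[1])
```

The same computation serves both degraded reads (answering requests while a disk is missing) and the rebuild of a replacement disk.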
Since the data is spread across all the disks of the RAID system, throughput is also better than with traditional disks. Throughput is further improved by the large memory caches inside the RAID controllers.
OBJECTIVES
Implement a high-performance, fault-tolerant disk storage system for critical information such as user accounts and software shared by several systems, in order to improve the reliability and uptime of all central services.
RAID SYSTEM
The system should be built around a RAID (Redundant Array of Inexpensive Disks) level 5 storage system connected to two UNIX servers through a dual-path SCSI controller (RAID side) and SCSI RAID controllers (server side). Load balancing and failover between the two servers should be supported. Network access to the RAID volumes will be provided through NFS and Samba. For better network access performance, each server should be equipped with two Fast Ethernet controllers. Due to the large storage capacity of the RAID system, a DLT 4000 tape drive should be dedicated to RAID backup operations.
In the same framework, distributed file systems should be studied in order to provide better network access to the RAID systems. Two options will be considered:
· DFS (Distributed File System), a standard from the Open Software Foundation, will be evaluated as an alternative to AFS. DFS is included in DCE (Distributed Computing Environment), which is already supported by many major vendors, and presents several advantages including lower cost than AFS.
· CODA is a distributed file system developed at CMU and descended from AFS2, the last non-commercial release of AFS. CODA supports persistent client caching of whole files, so that client activity can continue even when the server becomes disconnected.
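The idea of whole-file client caching with disconnected operation can be illustrated with a toy model. This is not CODA's protocol or API: the `FileServer` and `CachingClient` classes and the file path are hypothetical, and the cache here lives in memory, whereas a persistent cache would be kept on the client's local disk.

```python
class FileServer:
    """Toy file server that can be switched offline (hypothetical)."""

    def __init__(self, files):
        self.files = dict(files)
        self.online = True

    def fetch(self, path):
        if not self.online:
            raise ConnectionError("server unreachable")
        return self.files[path]


class CachingClient:
    """Toy client that keeps whole files in a local cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> whole file contents

    def read(self, path):
        try:
            # Connected: fetch the whole file and remember it locally.
            self.cache[path] = self.server.fetch(path)
        except ConnectionError:
            # Disconnected: fall back to the cached copy, if there is one.
            if path not in self.cache:
                raise
        return self.cache[path]


server = FileServer({"/home/user/job.cfg": b"events=1000\n"})
client = CachingClient(server)

client.read("/home/user/job.cfg")         # populates the cache while connected
server.online = False                     # the server becomes unreachable
print(client.read("/home/user/job.cfg"))  # still served from the local cache
```

A file that was never read while connected cannot be served after the disconnection, so in this model `read` re-raises the `ConnectionError` for such paths.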