Jack Dongarra's NetSolve Aims to Create Virtual SW Library
Source: HPCwire, April 4, 1997
By Alan Beck
Editor in Chief
Knoxville, Tenn. -- HPCwire was recently contacted by Jack Dongarra about
a newly developed application called NetSolve, designed to enable
significantly more efficient utilization of a wide range of network
resources. To learn more, HPCwire interviewed Dongarra, who is professor of
computer science at the University of Tennessee and researcher at Oak Ridge
National Laboratory. Following are selected excerpts from that discussion.
---
HPCwire: What is NetSolve?
DONGARRA: "NetSolve is a network-enabled solver that allows users access to
computational resources -- both hardware and software -- distributed across
a network. Its development was motivated by a need for easy-to-use, efficient
mechanisms for remotely accessing computational resources. Ease-of-use is
obtained via four different interfaces -- Fortran, C, MATLAB and Java.
"A strong point is that it enables users to get access to hardware
platforms through their own programs by making calls through NetSolve to
various software components. Thus, there's locational transparency. A
software library that a person can access remotely, a virtual library, is
thereby created although it does not actually exist on their machine.
Therefore, there can be central management of library resources, where the
most up-to-date version is always available and systems administrators no
longer have to maintain software packages on a variety of different machines.
"The NetSolve system has three components: the client, which can be either
a user program or a user interacting with one of the NetSolve interfaces;
the NetSolve agent; and the pool of NetSolve resources. The entry point into
the NetSolve system is the client sending a problem request to the agent.
The agent analyzes this request and chooses a computational resource. The
problem and its input data are then sent to the chosen NetSolve resource.
The problem is solved by the appropriate scientific package on some hardware
platform and the result is sent back to the client. This system can be
deployed on the Internet or on a local intranet. For example, here at the
University we have many workstations and can set up a server with a specific
type of software. Through NetSolve, users send their problem to that machine,
although the user wouldn't have to have any knowledge of the details of the
servers or agent involved."
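The request flow described above (client to agent, agent to a chosen resource, result back to the client) can be sketched roughly as follows. This is an illustration only, not NetSolve code: the Server and Agent classes, the load field, the client_request function and the "least-loaded server" policy are all assumptions made for the example, and everything runs in one process rather than over a network.

```python
# Illustrative sketch only: these classes are hypothetical, not NetSolve's API.
from dataclasses import dataclass, field


def dot(x, y):
    """A stand-in for a routine from a scientific package."""
    return sum(a * b for a, b in zip(x, y))


@dataclass
class Server:
    """A computational resource that offers some named problems."""
    name: str
    problems: dict          # problem name -> callable that solves it
    load: float = 0.0       # assumed load metric; lower means less busy

    def solve(self, problem, *args):
        return self.problems[problem](*args)


@dataclass
class Agent:
    """Tracks known servers and picks one for each request."""
    servers: list = field(default_factory=list)

    def choose(self, problem):
        candidates = [s for s in self.servers if problem in s.problems]
        if not candidates:
            raise LookupError(f"no server offers {problem!r}")
        # Assumed policy: pick the least-loaded capable server.
        return min(candidates, key=lambda s: s.load)


def client_request(agent, problem, *args):
    """Client side: ask the agent for a server, send input, get the result."""
    server = agent.choose(problem)
    return server.solve(problem, *args)


agent = Agent([
    Server("ws1", {"dot": dot}, load=0.7),
    Server("ws2", {"dot": dot}, load=0.2),
])
result = client_request(agent, "dot", [1, 2, 3], [4, 5, 6])
print(result)  # 32
```

The caller names a problem and passes data; it never refers to "ws1" or "ws2" directly, which is the locational transparency the interview describes.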
HPCwire: Please detail NetSolve's operation and appropriate platform.
DONGARRA: "Currently, NetSolve can be enabled on any Unix-based machine.
Its mechanism will manage and exploit full heterogeneity throughout without
the user being aware of the complexities and hassles of network programming.
Traditionally, if a user wanted to gain access to a given subroutine or
function they would write a call to it, passing the input and output
arguments. With NetSolve, you still call a routine and pass the parameters,
but the executable software can be anywhere on the net. Thus, you simply call
NetSolve, pass the arguments to it, and it figures out the most suitable
computational device. It then sends your problem to that device for
computation, retries if necessary for fault tolerance, solves the
problem and returns the answers to the user's program. A load-balancing
policy is used by the NetSolve system to ensure good performance by enabling
the system to use the available computational resources more efficiently."
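The combination of load balancing and retry mentioned above might look something like the sketch below. It is a hedged illustration, not NetSolve's mechanism: make_server, remote_call, the ServerDown exception, the numeric load values and the "try least-loaded first, then move to the next" policy are all hypothetical.

```python
# Illustrative sketch only: the names and the retry policy are assumptions.

class ServerDown(Exception):
    """Raised by a hypothetical server that cannot complete a request."""


def make_server(name, load, healthy=True):
    """Build a toy server record; 'healthy=False' simulates a failed machine."""
    def solve(problem, *args):
        if not healthy:
            raise ServerDown(name)
        if problem == "sum":
            return sum(args[0])
        raise ValueError(f"unknown problem {problem!r}")
    return {"name": name, "load": load, "solve": solve}


def remote_call(servers, problem, *args):
    """Try servers from least to most loaded; move on when one fails."""
    tried = []
    for server in sorted(servers, key=lambda s: s["load"]):
        tried.append(server["name"])
        try:
            return server["solve"](problem, *args), tried
        except ServerDown:
            continue  # fault tolerance: retry on the next candidate
    raise RuntimeError(f"all servers failed: {tried}")


servers = [
    make_server("a", load=0.1, healthy=False),  # least loaded, but down
    make_server("b", load=0.5),
    make_server("c", load=0.9),
]
answer, tried = remote_call(servers, "sum", [1.5, 2.5, 3.0])
print(answer, tried)  # 7.0 ['a', 'b']
```

From the caller's side there is a single call and a single answer; the failed attempt on server "a" is visible here only because the sketch returns the list of servers it tried.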
HPCwire: How large is NetSolve?
DONGARRA: "The part that the client sees is quite small, on the order of
tens of kilobytes."
HPCwire: What degree of programming expertise is required to utilize it
effectively?
DONGARRA: "Minimal -- that's one of its principal virtues. The user does
not have to be experienced in network computing. In fact the user does not
even have to know that the computation will be networked. He or she just
calls a provided subroutine. That subroutine contains all the mechanisms
necessary to contact the agents, transfer inputs over the network to an
appropriate server, and then have the results return to the user. It's
intended as a very simple interface allowing for rapid prototyping and even
detailed analysis.
"In addition, a number of built-in features provide significant assistance.
Load balancing and fault tolerance are examples of such features. If one
hardware platform goes down, the system itself will restart the problem on
a new server, without the user becoming involved.
"Parallel processing can be achieved on the server side by invoking a
solver that can run on a parallel computer. For example, if the request
to NetSolve is for the solution to a system of linear equations, NetSolve
may choose to solve the problem on a parallel computer without the user being
aware of or involved in parallel programming.
"In the user's program, NetSolve can be invoked through blocking or
nonblocking calls, thus enabling another level of parallel processing.
"The system can be enhanced and grow in a number of ways. Users can add
computational servers and software components dynamically to the system."
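The difference between blocking and nonblocking calls can be illustrated with a short sketch. It uses Python's standard concurrent.futures module rather than any NetSolve interface, and solve_remote is a hypothetical stand-in for a request that would run on a remote server.

```python
# Illustrative sketch only: solve_remote stands in for a remote request.
from concurrent.futures import ThreadPoolExecutor


def solve_remote(n):
    """Hypothetical stand-in for a request that runs on a remote server."""
    return sum(i * i for i in range(n))


# Blocking style: the caller waits for each answer before continuing.
blocking_result = solve_remote(4)

# Nonblocking style: submit several requests, keep working, collect later.
with ThreadPoolExecutor(max_workers=2) as pool:
    pending = [pool.submit(solve_remote, n) for n in (3, 4, 5)]
    local_work = "done while requests are in flight"
    nonblocking_results = [f.result() for f in pending]

print(blocking_result)      # 14
print(nonblocking_results)  # [5, 14, 30]
```

With the nonblocking style, several requests can be outstanding at once, so independent problems may be solved on different servers at the same time while the user's program does other work.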
HPCwire: Who developed NetSolve?
DONGARRA: "Henri Casanova, a UT graduate student, and I designed and
developed the system at the University of Tennessee and Oak Ridge National
Lab. Funding for the project comes from the NSF's Science and Technology
Center for Research on Parallel Computing and the Department of Energy."
HPCwire: At what stage of development is it?
DONGARRA: "Currently, a free beta version is available on the Internet for
experimentation. Although fully functional, it is intended for 'friendly'
users at this point. You can download the code and find documentation at
http://www.cs.utk.edu/netsolve/"
HPCwire: Do you ultimately intend to market NetSolve?
DONGARRA: "For me, it will always remain strictly a research project. All
our other projects -- PVM, MPI or the linear algebra projects -- were never
intended for the commercial market. They are intended to further our research
interests and, almost as a by-product, produce software that computational
scientists may find useful. NetSolve was developed in the same spirit. We
want to gain insight into heterogeneous distributed computing and allow our
colleagues to use the system and comment on the approach and techniques used
to solve problems in this fashion."
HPCwire: Where do you intend to concentrate improvements?
DONGARRA: "In the future, we will enhance various security aspects, the
ability to migrate tasks, load-balancing capabilities, and interfaces --
especially with respect to Java. We will also port the system to NT and
Windows-based platforms and enable a much greater range of software
components. After this initial beta release of NetSolve, we plan to release
a Version 1.0, where we will take into account the feedback from users of
this version."
--------------------
Alan Beck is editor in chief of HPCwire. Comments are always welcome and
should be directed to editor@hpcwire.tgc.com