Remote Procedure Call Programming Guide

This document is intended for programmers who wish to  write
network applications using remote procedure calls (explained
below), thus avoiding low-level system primitives  based  on
sockets.  The reader must be familiar with the C programming
language, and should have a  working  knowledge  of  network
theory.

NOTE: Before attempting to write a network  application,  or
to  convert  an existing non-network application to run over
the network, you should be familiar  with  the  material  in
this  chapter.  However, for most applications, you can cir-
cumvent the need to cope with the kinds of details presented
here  by  using  the  rpcgen  protocol  compiler,  which  is
described in detail in the next chapter, the rpcgen Program-
ming  Guide.   The  Generating  XDR Routines section of that
chapter contains the complete source for a working RPC
service: a remote directory listing service which uses rpcgen
to generate XDR routines as well as client and server stubs.


What are remote procedure calls?  Simply put, they  are  the
high-level  communications  paradigm  used  in the operating
system.  RPC presumes the existence of low-level  networking
mechanisms  (such  as  TCP/IP  and UDP/IP), and upon them it
implements a logical client to server communications  system
designed  specifically  for  the support of network applica-
tions.  With RPC, the client makes a procedure call to  send
a  data  packet to the server.  When the packet arrives, the
server calls a dispatch routine, performs  whatever  service
is  requested,  sends back the reply, and the procedure call
returns to the client.

1.  Layers of RPC

The RPC interface can be seen as being  divided  into  three
layers.[1]

The Highest Layer: The highest layer is totally transparent
to the operating system, machine and network upon which it
is run.  It's probably best to think of this level as a way
of using RPC, rather than as a part of RPC proper.  Programmers
who write RPC routines should (almost) always make this
layer available to others by way of a simple C front end
that entirely hides the networking.

To illustrate, at this level a program can simply make a
call to rnusers, a C routine which returns the number of
users on a remote machine.  The user is not explicitly aware
of using RPC; they simply call a procedure, just as they
would call malloc.
_________________________
  [1] For a complete specification of the routines in
the remote procedure call Library, see the rpc(3N)
manual page.

The Middle Layer: The middle layer is really "RPC proper."
Here, the user doesn't need to consider details about sockets,
the UNIX system, or other low-level implementation
mechanisms.  They simply make remote procedure calls to routines
on other machines.  The selling point here is simplicity.
It's this layer that allows RPC to pass the "hello
world" test: simple things should be simple.  The middle-layer
routines are used for most applications.

RPC calls are made with the system routines registerrpc,
callrpc, and svc_run.  The first two of these are the most
fundamental:  registerrpc  obtains  a   unique   system-wide
procedure-identification  number,  and callrpc actually exe-
cutes a remote procedure call.  At the middle level, a  call
to rnusers is implemented by way of these two routines.

The middle layer is unfortunately  rarely  used  in  serious
programming  due to its inflexibility (simplicity).  It does
not allow timeout specifications or the choice of transport.
It  allows no UNIX process control or flexibility in case of
errors.  It doesn't support multiple kinds of call authentication.
The programmer rarely needs all these kinds of control,
but one or two of them are often necessary.

The Lowest Layer: The lowest layer does allow these  details
to  be  controlled by the programmer, and for that reason it
is often necessary.  Programs written at this level are also
most  efficient, but this is rarely a real issue - since RPC
clients and servers rarely generate heavy network loads.

Although this document only discusses the  interface  to  C,
remote  procedure calls can be made from any language.  Even
though this document discusses RPC when it is used  to  com-
municate  between  processes on different machines, it works
just as well for communication between  different  processes
on the same machine.

1.1.  The RPC Paradigm


[Figure: the RPC paradigm.  Labels surviving from the original
diagram: "Machine A", "Machine B", "service daemon".]

2.  Higher Layers of RPC

2.1.  Highest Layer

Imagine you're writing a program that needs to know how many
users  are logged into a remote machine.  You can do this by
calling the RPC  library  routine  rnusers,  as  illustrated
below:

#include <stdio.h>

main(argc, argv)
        int argc;
        char **argv;
{
        int num;

        if (argc < 2) {
                fprintf(stderr, "usage: rnusers hostname\n");
                exit(1);
        }
        if ((num = rnusers(argv[1])) < 0) {
                fprintf(stderr, "error: rnusers\n");
                exit(-1);
        }
        printf("%d users on %s\n", num, argv[1]);
        exit(0);
}

RPC library routines such as rnusers are in the RPC services
library librpcsvc.a.  Thus, the program above should be compiled
with

        % cc program.c -lrpcsvc

rnusers, like the other RPC library routines, is  documented
in  section  3R  of  the System Interface Manual for the Sun
Workstation, the same section which documents  the  standard
Sun  RPC  services.  See  the  intro(3R)  manual page for an
explanation of the documentation strategy for these services
and their RPC protocols.

Here are some of the RPC service library routines  available
to the C programmer:


_____________________________________________________________
 Routine                      Description
_____________________________________________________________
 rnusers    Return number of users on remote machine
 rusers     Return information about users on remote machine
 havedisk   Determine if remote machine has disk
 rstats     Get performance data from remote kernel
 rwall      Write to specified remote machines
 yppasswd   Update user password in Yellow Pages
_____________________________________________________________

Other RPC services (for example ether, mount, rquota, and
spray) are not available to the C programmer as library
routines.  They do, however, have RPC program numbers so
they can be invoked with callrpc, which will be discussed in
the next section.  Most of them also have compilable
rpcgen(1) protocol description files.  (The rpcgen protocol
compiler radically simplifies the process of developing network
applications.  See the rpcgen Programming Guide chapter
for detailed information about rpcgen and rpcgen protocol
description files.)


2.2.  Intermediate Layer

The simplest interface, which explicitly makes RPC calls,
uses the functions callrpc and registerrpc.  Using this
method, the number of remote users can be gotten as follows:

#include <stdio.h>
#include <rpc/rpc.h>
#include <utmp.h>
#include <rpcsvc/rusers.h>

main(argc, argv)
        int argc;
        char **argv;
{
        unsigned long nusers;
        int stat;

        if (argc < 2) {
                fprintf(stderr, "usage: nusers hostname\n");
                exit(-1);
        }
        if ((stat = callrpc(argv[1],
          RUSERSPROG, RUSERSVERS, RUSERSPROC_NUM,
          xdr_void, 0, xdr_u_long, &nusers)) != 0) {
                clnt_perrno(stat);
                exit(1);
        }
        printf("%lu users on %s\n", nusers, argv[1]);
        exit(0);
}

Each RPC procedure is uniquely defined by a program  number,
version  number,  and  procedure number.  The program number
specifies a group of  related  remote  procedures,  each  of
which  has  a different procedure number.  Each program also
has a version number, so when a minor change is  made  to  a
remote  service (adding a new procedure, for example), a new
program number doesn't have to be assigned.  When  you  want
to  call a procedure to find the number of remote users, you
look up  the  appropriate  program,  version  and  procedure
numbers  in  a  manual,  just  as  you look up the name of a
memory allocator when you want to allocate memory.

The simplest way of making remote procedure calls is with
the RPC library routine callrpc.  It has eight parameters.
The first is the name of the remote server machine.  The
next three parameters are the program, version, and procedure
numbers; together they identify the procedure to be
called.  The fifth and sixth parameters are an XDR filter
and an argument to be encoded and passed to the remote procedure.
The final two parameters are a filter for decoding
the results returned by the remote procedure and a pointer
to the place where the procedure's results are to be stored.
Multiple arguments and results are handled by embedding them
in structures.  If callrpc completes successfully, it
returns zero; else it returns a nonzero value.  The return
codes (of type enum clnt_stat cast into an integer) are found in
<rpc/clnt.h>.

Since data types may be represented differently on different
machines,  callrpc  needs both the type of the RPC argument,
as well as a pointer to the argument itself  (and  similarly
for the result).  For RUSERSPROC_NUM, the return value is an
unsigned long, so callrpc has xdr_u_long as its first return
parameter, which says that the result is of type unsigned
long, and &nusers as its second return parameter, which is a
pointer to where the long result will be placed.  Since
RUSERSPROC_NUM takes no argument, the argument parameter  of
callrpc is xdr_void.

After trying several times to deliver a message, if  callrpc
gets no answer, it returns with an error code.  The delivery
mechanism is UDP, which stands for User  Datagram  Protocol.
Methods  for  adjusting the number of retries or for using a
different protocol require you to use the lower layer of the
RPC  library,  discussed later in this document.  The remote
server procedure corresponding to the above might look  like
this:

char *
nuser(indata)
        char *indata;
{
        static int nusers;

        /*
         * Code here to compute the number of users
         * and place result in variable nusers.
         */
        return((char *)&nusers);
}


It takes one argument, which is a pointer to  the  input  of
the  remote  procedure call (ignored in our example), and it
returns a pointer to the result.  In the current version  of
C,  character pointers are the generic pointers, so both the
input argument and the return value are cast to char *.

Normally, a server registers all of the RPC calls  it  plans
to  handle,  and  then goes into an infinite loop waiting to
service requests.  In this example, there is only  a  single
procedure  to register, so the main body of the server would
look like this:


#include <stdio.h>
#include <rpc/rpc.h>
#include <utmp.h>
#include <rpcsvc/rusers.h>

char *nuser();

main()
{
        registerrpc(RUSERSPROG, RUSERSVERS, RUSERSPROC_NUM,
                nuser, xdr_void, xdr_u_long);
        svc_run();              /* Never returns */
        fprintf(stderr, "Error: svc_run returned!\n");
        exit(1);
}


The  registerrpc  routine  registers  a   C   procedure   as
corresponding  to  a  given RPC procedure number.  The first
three parameters, RUSERSPROG, RUSERSVERS, and RUSERSPROC_NUM,
are  the  program,  version,  and  procedure  numbers of the
remote procedure to be registered; nuser is the name of  the
local  procedure  that  implements the remote procedure; and
xdr_void and xdr_u_long are the XDR filters for  the  remote
procedure's  arguments and results, respectively.  (Multiple
arguments or multiple results are passed as structures).

Only the UDP transport mechanism can use registerrpc; thus,
it is always safe in conjunction with calls generated by
callrpc.

WARNING: The UDP transport  mechanism  can  only  deal  with
arguments and results less than 8K bytes in length.


After registering the local procedure, the server  program's
main  procedure calls svc_run, the RPC library's remote pro-
cedure dispatcher.  It  is  this  function  that  calls  the
remote  procedures  in  response to RPC call messages.  Note
that the dispatcher takes care of decoding remote  procedure
arguments and encoding results, using the XDR filters speci-
fied when the remote procedure was registered.

2.3.  Assigning Program Numbers

Program numbers are assigned in groups of 0x20000000 accord-
ing to the following chart:


               0x0 - 0x1fffffff Defined by Sun
        0x20000000 - 0x3fffffff Defined by user
        0x40000000 - 0x5fffffff Transient
        0x60000000 - 0x7fffffff Reserved
        0x80000000 - 0x9fffffff Reserved
        0xa0000000 - 0xbfffffff Reserved
        0xc0000000 - 0xdfffffff Reserved
        0xe0000000 - 0xffffffff Reserved

Sun Microsystems administers the  first  group  of  numbers,
which  should be identical for all Sun customers.  If a cus-
tomer develops an  application  that  might  be  of  general
interest,  that  application  should  be  given  an assigned
number in the first range.  The second group of  numbers  is
reserved  for specific customer applications.  This range is
intended primarily for debugging new  programs.   The  third
group  is  reserved  for  applications that generate program
numbers dynamically.  The  final  groups  are  reserved  for
future use, and should not be used.

To register a protocol specification, send a request by net-
work mail to rpc@sun or write to:

        RPC Administrator
        Sun Microsystems
        2550 Garcia Ave.
        Mountain View, CA 94043

Please include a compilable rpcgen ".x" file describing
your protocol.  You will be given a unique program number in
return.

The RPC program numbers and protocol specifications of stan-
dard  Sun  RPC services can be found in the include files in
/usr/include/rpcsvc.  These  services,  however,  constitute
only  a  small  subset  of those which have been registered.
The complete list of registered programs,  as  of  the  time
when this manual was printed, is:

____________________________________________________________
 RPC Number   Program              Description
____________________________________________________________
 100000       PMAPPROG             portmapper
 100001       RSTATPROG            remote stats
 100002       RUSERSPROG           remote users
 100003       NFSPROG              nfs
 100004       YPPROG               Yellow Pages
 100005       MOUNTPROG            mount daemon
 100006       DBXPROG              remote dbx
 100007       YPBINDPROG           yp binder
 100008       WALLPROG             shutdown msg
 100009       YPPASSWDPROG         yppasswd server
 100010       ETHERSTATPROG        ether stats
 100011       RQUOTAPROG           disk quotas
 100012       SPRAYPROG            spray packets
 100013       IBM3270PROG          3270 mapper
 100014       IBMRJEPROG           RJE mapper
 100015       SELNSVCPROG          selection service
 100016       RDATABASEPROG        remote database access
 100017       REXECPROG            remote execution
 100018       ALICEPROG            Alice Office Automation
 100019       SCHEDPROG            scheduling service
 100020       LOCKPROG             local lock manager
 100021       NETLOCKPROG          network lock manager
 100022       X25PROG              x.25 inr protocol
 100023       STATMON1PROG         status monitor 1
 100024       STATMON2PROG         status monitor 2
 100025       SELNLIBPROG          selection library
 100026       BOOTPARAMPROG        boot parameters service
 100027       MAZEPROG             mazewars game
 100028       YPUPDATEPROG         yp update
 100029       KEYSERVEPROG         key server
 100030       SECURECMDPROG        secure login
 100031       NETFWDIPROG          nfs net forwarder init
 100032       NETFWDTPROG          nfs net forwarder trans
 100033       SUNLINKMAP_PROG      sunlink MAP
 100034       NETMONPROG           network monitor
 100035       DBASEPROG            lightweight database
 100036       PWDAUTHPROG          password authorization
 100037       TFSPROG              translucent file svc
 100038       NSEPROG              nse server
 100039       NSE_ACTIVATE_PROG    nse activate daemon

 150001       PCNFSDPROG           pc passwd authorization

 200000       PYRAMIDLOCKINGPROG   Pyramid-locking
 200001       PYRAMIDSYS5          Pyramid-sys5
 200002       CADDS_IMAGE          CV cadds_image

 300001       ADT_RFLOCKPROG       ADT file locking
____________________________________________________________

2.4.  Passing Arbitrary Data Types

In the previous example, the RPC call passes a single
unsigned long.  RPC can handle arbitrary data structures,
regardless of different machines' byte orders or structure
layout conventions, by always converting them to a network
standard called eXternal Data Representation (XDR) before
sending them over the wire.  The process of converting from
a particular machine representation to XDR format is called
serializing, and the reverse process is called deserializing.
The type field parameters of callrpc and registerrpc
can be a built-in procedure like xdr_u_long in the previous
example, or a user supplied one.  XDR has these built-in type
routines:

        xdr_int()      xdr_u_int()      xdr_enum()
        xdr_long()     xdr_u_long()     xdr_bool()
        xdr_short()    xdr_u_short()    xdr_wrapstring()
        xdr_char()     xdr_u_char()

Note that the routine xdr_string exists, but cannot be  used
with callrpc and registerrpc, which only pass two parameters
to their XDR routines.  xdr_wrapstring has only two  parame-
ters, and is thus OK.  It calls xdr_string.

As an example of a user-defined type routine, if you  wanted
to send the structure

        struct simple {
                int a;
                short b;
        } simple;

then you would call callrpc as

        callrpc(hostname, PROGNUM, VERSNUM, PROCNUM,
                xdr_simple, &simple ...);

where xdr_simple is written as:

#include <rpc/rpc.h>

xdr_simple(xdrsp, simplep)
        XDR *xdrsp;
        struct simple *simplep;
{
        if (!xdr_int(xdrsp, &simplep->a))
                return (0);
        if (!xdr_short(xdrsp, &simplep->b))
                return (0);
        return (1);
}


An XDR routine returns nonzero (true in the sense of C) if
it completes successfully, and zero otherwise.  A complete
description of XDR is in the XDR Protocol Specification section
of this manual; only a few implementation examples are
given here.

In addition to the built-in primitives, there are  also  the
prefabricated building blocks:


        xdr_array()       xdr_bytes()     xdr_reference()
        xdr_vector()      xdr_union()     xdr_pointer()
        xdr_string()      xdr_opaque()

To send a variable array of integers, you might package them
up as a structure like this

        struct varintarr {
                int *data;
                int arrlnth;
        } arr;

and make an RPC call such as

        callrpc(hostname, PROGNUM, VERSNUM, PROCNUM,
                xdr_varintarr, &arr...);

with xdr_varintarr defined as:

xdr_varintarr(xdrsp, arrp)
        XDR *xdrsp;
        struct varintarr *arrp;
{
        return (xdr_array(xdrsp, &arrp->data, &arrp->arrlnth,
                MAXLEN, sizeof(int), xdr_int));
}

This routine takes as parameters the XDR handle,  a  pointer
to  the  array, a pointer to the size of the array, the max-
imum allowable array size, the size of each  array  element,
and an XDR routine for handling each array element.

If the size of the array is known in advance,  one  can  use
xdr_vector, which serializes fixed-length arrays.

int intarr[SIZE];

xdr_intarr(xdrsp, intarr)
        XDR *xdrsp;
        int intarr[];
{
        return (xdr_vector(xdrsp, intarr, SIZE, sizeof(int),
                xdr_int));
}


XDR always converts quantities to 4-byte multiples when
serializing.  Thus, if either of the examples above
involved characters instead of integers, each character
would occupy 32 bits.  That is the reason for the XDR routine
xdr_bytes, which is like xdr_array except that it packs
characters; xdr_bytes has four parameters, similar to the
first four parameters of xdr_array.  For null-terminated
strings, there is also the xdr_string routine, which is the
same as xdr_bytes without the length parameter.  On serializing
it gets the string length from strlen, and on deserializing
it creates a null-terminated string.

Here is a final example that calls  the  previously  written
xdr_simple  as well as the built-in functions xdr_string and
xdr_reference, which chases pointers:

struct finalexample {
        char *string;
        struct simple *simplep;
} finalexample;

xdr_finalexample(xdrsp, finalp)
        XDR *xdrsp;
        struct finalexample *finalp;
{

        if (!xdr_string(xdrsp, &finalp->string, MAXSTRLEN))
                return (0);
        if (!xdr_reference(xdrsp, &finalp->simplep,
          sizeof(struct simple), xdr_simple))
                return (0);
        return (1);
}


3.  Lowest Layer of RPC

In the examples given so far, RPC takes care of many details
automatically  for you.  In this section, we'll show you how
you can change the defaults by using lower layers of the RPC
library.   It  is assumed that you are familiar with sockets
and the system calls for dealing with them.  If not, consult
the IPC Primer section of this manual.

There are several occasions when you may need to  use  lower
layers  of  RPC.   First, you may need to use TCP, since the
higher layer uses UDP, which restricts RPC calls to 8K bytes
of  data.   Using  TCP permits calls to send long streams of
data.  For an example, see the TCP section  below.   Second,
you  may  want to allocate and free memory while serializing
or deserializing with XDR routines.  There is no call at the
higher  level  to  let you free memory explicitly.  For more
explanation, see the  Memory  Allocation  with  XDR  section
below.   Third,  you  may  need to perform authentication on
either the client or server side, by  supplying  credentials
or verifying them. See the explanation in the Authentication
section below.


3.1.  More on the Server Side

The server for the nusers program shown below does the  same
thing  as  the  one  using registerrpc above, but is written
using a lower layer of the RPC package:


#include <stdio.h>
#include <rpc/rpc.h>
#include <utmp.h>
#include <rpcsvc/rusers.h>

main()
{
        SVCXPRT *transp;
        int nuser();

        transp = svcudp_create(RPC_ANYSOCK);
        if (transp == NULL){
                fprintf(stderr, "can't create an RPC server\n");
                exit(1);
        }
        pmap_unset(RUSERSPROG, RUSERSVERS);
        if (!svc_register(transp, RUSERSPROG, RUSERSVERS,
                          nuser, IPPROTO_UDP)) {
                fprintf(stderr, "can't register RUSER service\n");
                exit(1);
        }
        svc_run();  /* Never returns */
        fprintf(stderr, "should never reach this point\n");
}

nuser(rqstp, transp)
        struct svc_req *rqstp;
        SVCXPRT *transp;
{
        unsigned long nusers;

        switch (rqstp->rq_proc) {
        case NULLPROC:
                if (!svc_sendreply(transp, xdr_void, 0))
                        fprintf(stderr, "can't reply to RPC call\n");
                return;
        case RUSERSPROC_NUM:
                /*
                 * Code here to compute the number of users
                 * and put in variable nusers
                 */
                if (!svc_sendreply(transp, xdr_u_long, &nusers))
                        fprintf(stderr, "can't reply to RPC call\n");
                return;
        default:
                svcerr_noproc(transp);
                return;
        }
}


First, the server gets a transport handle, which is used for
receiving and replying to RPC messages.  registerrpc uses
svcudp_create to get a UDP handle.  If you require a more
reliable protocol, call svctcp_create instead.  If the argument
to svcudp_create is RPC_ANYSOCK, the RPC library creates
a socket on which to receive and reply to RPC calls.  Otherwise,
svcudp_create expects its argument to be a valid
socket number.  If you specify your own socket, it can be
bound or unbound.  If it is bound to a port by the user, the
port numbers of svcudp_create and clnttcp_create (the low-level
client routine) must match.

If the user specifies  the  RPC_ANYSOCK  argument,  the  RPC
library  routines  will  open  sockets.  Otherwise they will
expect the user to do so.  The  routines  svcudp_create  and
clntudp_create  will  cause the RPC library routines to bind
their socket if it is not bound already.

A service may choose to register its port number with the
local portmapper service.  This is done by specifying
a non-zero protocol number in svc_register.  Incidentally,
a client can discover the server's port number by consulting
the portmapper on the server's machine.  This can be done
automatically by specifying a zero port number in
clntudp_create or clnttcp_create.

After  creating  an  SVCXPRT,  the  next  step  is  to  call
pmap_unset so that if the nusers server crashed earlier, any
previous trace of it is erased before restarting.  More pre-
cisely,  pmap_unset erases the entry for RUSERSPROG from the
port mapper's tables.

Finally, we associate the program number for nusers with the
procedure nuser.  The final argument to svc_register is normally
the protocol being used, which, in this case, is
IPPROTO_UDP.  Notice that unlike registerrpc, there are no XDR
routines involved in the registration process.  Also, registration
is done on the program, rather than procedure,
level.

The user routine nuser must call and dispatch the  appropri-
ate  XDR  routines based on the procedure number.  Note that
two things are handled by  nuser  that  registerrpc  handles
automatically.    The   first  is  that  procedure  NULLPROC
(currently zero) returns with no results.  This can be  used
as  a  simple test for detecting if a remote program is run-
ning.  Second,  there  is  a  check  for  invalid  procedure
numbers.   If  one  is  detected, svcerr_noproc is called to
handle the error.


The user service routine serializes the results and returns
them to the RPC caller via svc_sendreply.  Its first parameter
is the SVCXPRT handle, the second is the XDR routine, and
the third is a pointer to the data to be returned.  Not illustrated
above is how a server handles an RPC program that
receives data.  As an example, we can add a procedure
RUSERSPROC_BOOL which has an argument nusers, and returns
TRUE or FALSE depending on whether there are nusers logged
on.  It would look like this:

case RUSERSPROC_BOOL: {
        int bool;
        unsigned nuserquery;

        if (!svc_getargs(transp, xdr_u_int, &nuserquery)) {
                svcerr_decode(transp);
                return;
        }
        /*
         * Code to set nusers = number of users
         */
        if (nuserquery == nusers)
                bool = TRUE;
        else
                bool = FALSE;
        if (!svc_sendreply(transp, xdr_bool, &bool)) {
                 fprintf(stderr, "can't reply to RPC call\n");
                 exit(1);
        }
        return;
}


The relevant routine is svc_getargs, which takes an SVCXPRT
handle, the XDR routine, and a pointer to where the input is
to be placed as arguments.

3.2.  Memory Allocation with XDR

XDR routines not only do input and output, they also do
memory allocation.  This is why the second parameter of
xdr_array is a pointer to an array, rather than the array
itself.  If it is NULL, then xdr_array allocates space for
the array and returns a pointer to it, putting the size of
the array in the third argument.  As an example, consider
the following XDR routine xdr_chararr1, which deals with a
fixed array of bytes with length SIZE:


xdr_chararr1(xdrsp, chararr)
        XDR *xdrsp;
        char chararr[];
{
        char *p;
        int len;

        p = chararr;
        len = SIZE;
        return (xdr_bytes(xdrsp, &p, &len, SIZE));
}

It might be called from a server like this:

char chararr[SIZE];

svc_getargs(transp, xdr_chararr1, chararr);

Here, space has already been allocated in chararr.  If you want
XDR to do the allocation, you would have to rewrite this
routine in the following way:

xdr_chararr2(xdrsp, chararrp)
        XDR *xdrsp;
        char **chararrp;
{
        int len;

        len = SIZE;
        return (xdr_bytes(xdrsp, chararrp, &len, SIZE));
}

Then the RPC call might look like this:

char *arrptr;

arrptr = NULL;
svc_getargs(transp, xdr_chararr2, &arrptr);
/*
 * Use the result here
 */
svc_freeargs(transp, xdr_chararr2, &arrptr);

Note that, after being used, the character array can be
freed with svc_freeargs.  svc_freeargs will not attempt to
free any memory if the variable indicating it is NULL.  For
example, in the routine xdr_finalexample, given earlier,
if finalp->string was NULL, then it would not be freed.  The
same is true for finalp->simplep.

To summarize, each XDR routine is responsible for serializing,
deserializing, and freeing memory.  When an XDR routine is
called from callrpc, the serializing part is used.  When
called from svc_getargs, the deserializer is used.  And when
called from svc_freeargs, the memory deallocator is used.
When building simple examples like those in this section, a
user doesn't have to worry about the three modes.  See the
eXternal Data Representation: Sun Technical Notes chapter
for examples of more sophisticated XDR routines that determine
which of the three modes they are in and adjust their
behavior accordingly.
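
As a rough illustration of the three modes, the sketch below
shows one routine that serializes, deserializes, or frees
depending on a mode field.  It is a toy model only: sketch_op,
sketch_xdr and sketch_string are hypothetical stand-ins for the
library's XDR handle and its x_op field, so that the block
compiles without the RPC headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the library's operation enum and XDR handle. */
enum sketch_op { SKETCH_ENCODE, SKETCH_DECODE, SKETCH_FREE };

struct sketch_xdr {
        enum sketch_op op;      /* which of the three modes is active */
        char wire[64];          /* pretend network buffer */
};

/*
 * One routine, three behaviors, selected by the handle's mode.
 * Returns 1 on success and 0 on failure, like an XDR routine.
 */
int
sketch_string(struct sketch_xdr *xdrs, char **sp)
{
        assert(xdrs != NULL && sp != NULL);
        switch (xdrs->op) {
        case SKETCH_ENCODE:             /* memory -> wire */
                if (*sp == NULL || strlen(*sp) >= sizeof(xdrs->wire))
                        return 0;
                strcpy(xdrs->wire, *sp);
                return 1;
        case SKETCH_DECODE:             /* wire -> memory */
                /*
                 * A NULL pointer asks the routine to allocate; a
                 * caller-supplied buffer must hold sizeof(wire) bytes.
                 */
                if (*sp == NULL &&
                    (*sp = malloc(sizeof(xdrs->wire))) == NULL)
                        return 0;
                strcpy(*sp, xdrs->wire);
                return 1;
        case SKETCH_FREE:               /* release what decode allocated */
                free(*sp);              /* free(NULL) is a no-op */
                *sp = NULL;
                return 1;
        }
        return 0;
}
```

In this model the encode call plays the part of callrpc, the
decode call that of svc_getargs, and the free call that of
svc_freeargs.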


3.3.  The Calling Side

When you use callrpc, you have no control over the RPC
delivery mechanism or the socket used to transport the data.
To illustrate the layer of RPC that lets you adjust these
parameters, consider the following code to call the nusers
service:

#include <stdio.h>
#include <rpc/rpc.h>
#include <utmp.h>
#include <rpcsvc/rusers.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netdb.h>

main(argc, argv)
        int argc;
        char **argv;
{
        struct hostent *hp;
        struct timeval pertry_timeout, total_timeout;
        struct sockaddr_in server_addr;
        int sock = RPC_ANYSOCK;
        register CLIENT *client;
        enum clnt_stat clnt_stat;
        unsigned long nusers;

        if (argc < 2) {
                fprintf(stderr, "usage: nusers hostname\n");
                exit(-1);
        }
        if ((hp = gethostbyname(argv[1])) == NULL) {
                fprintf(stderr, "can't get addr for %s\n",argv[1]);
                exit(-1);
        }
        pertry_timeout.tv_sec = 3;
        pertry_timeout.tv_usec = 0;
        bcopy(hp->h_addr, (caddr_t)&server_addr.sin_addr,
                hp->h_length);
        server_addr.sin_family = AF_INET;
        server_addr.sin_port =  0;
        if ((client = clntudp_create(&server_addr, RUSERSPROG,
          RUSERSVERS, pertry_timeout, &sock)) == NULL) {
                clnt_pcreateerror("clntudp_create");
                exit(-1);
        }
        total_timeout.tv_sec = 20;
        total_timeout.tv_usec = 0;
        clnt_stat = clnt_call(client, RUSERSPROC_NUM, xdr_void,
                0, xdr_u_long, &nusers, total_timeout);
        if (clnt_stat != RPC_SUCCESS) {
                clnt_perror(client, "rpc");
                exit(-1);
        }
        clnt_destroy(client);
        close(sock);
}

The low-level version of callrpc is clnt_call, which takes a
CLIENT pointer rather than a host name.  The parameters to
clnt_call are a CLIENT pointer, the procedure number, the
XDR routine for serializing the argument, a pointer to the
argument, the XDR routine for deserializing the return
value, a pointer to where the return value will be placed,
and the total time to wait for a reply, given as a struct
timeval.

The CLIENT pointer is encoded with the transport mechanism.
callrpc uses UDP, thus it calls clntudp_create to get a
CLIENT pointer.  To get TCP (Transmission Control Protocol),
you would use clnttcp_create.

The parameters to clntudp_create are the server address, the
program number, the version number, a timeout value (between
tries), and a pointer to a socket.  The  final  argument  to
clnt_call  is  the total time to wait for a response.  Thus,
the number of tries is the clnt_call timeout divided by  the
clntudp_create timeout.
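
As a rough illustration of that division, the sketch below
computes the approximate number of tries from the two timeouts.
The helpers tv_to_usec and approx_tries are hypothetical and
not part of the RPC library; they only restate the arithmetic,
and the exact retransmission schedule is up to the library.

```c
#include <assert.h>
#include <sys/time.h>

/* Convert a timeval to microseconds. */
static long long
tv_to_usec(struct timeval tv)
{
        return (long long)tv.tv_sec * 1000000LL + tv.tv_usec;
}

/*
 * approx_tries is a hypothetical helper, not a library routine.
 * It divides the total (clnt_call) timeout by the per-try
 * (clntudp_create) timeout, rounding down.
 */
long long
approx_tries(struct timeval total, struct timeval pertry)
{
        assert(tv_to_usec(pertry) > 0);  /* avoid dividing by zero */
        return tv_to_usec(total) / tv_to_usec(pertry);
}
```

With the values used in the nusers client (20 seconds in
total, 3 seconds per try), approx_tries returns 6.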

There is one thing to note when using the clnt_destroy call.
It  deallocates any space associated with the CLIENT handle,
but it does not close the socket associated with  it,  which
was  passed as an argument to clntudp_create.  This makes it
possible, in cases where there are multiple  client  handles
using the same socket, to destroy one handle without closing
the socket that other handles are using.

To make a stream connection, the call to  clntudp_create  is
replaced with a call to clnttcp_create.

        clnttcp_create(&server_addr, prognum, versnum, &sock,
                       inputsize, outputsize);

There is no timeout argument; instead, the receive and  send
buffer  sizes  must  be  specified.  When the clnttcp_create
call is made, a TCP  connection  is  established.   All  RPC
calls  using  that  CLIENT handle would use this connection.
The server side of an RPC call using TCP has svcudp_create
replaced by svctcp_create:

        transp = svctcp_create(RPC_ANYSOCK, 0, 0);

The last two arguments to svctcp_create are send and receive
sizes  respectively.   If  `0'  is  specified  for either of
these, the system chooses a reasonable default.


4.  Other RPC Features

This section discusses some other aspects of  RPC  that  are
occasionally useful.

4.1.  Select on the Server Side

Suppose a process is processing RPC requests while performing
some other activity.  If the other activity involves
periodically updating a data structure, the process can set
an alarm signal before calling svc_run.  But if the other
activity involves waiting on a file descriptor, the svc_run
call won't work.  The code for svc_run is as follows:

void
svc_run()
{
        fd_set readfds;
        int dtbsz = getdtablesize();

        for (;;) {
                readfds = svc_fds;
                switch (select(dtbsz, &readfds, NULL,NULL,NULL)) {

                case -1:
                        if (errno == EINTR)
                                continue;
                        perror("select");
                        return;
                case 0:
                        break;
                default:
                        svc_getreqset(&readfds);
                }
        }
}


You can bypass svc_run and call svc_getreqset yourself.  All
you need to know are the file descriptors of the socket(s)
associated with the programs you are waiting on.  Thus you
can have your own select that waits on both the RPC sockets
and your own descriptors.  Note that svc_fds is a bit mask
of all the file descriptors that RPC is using for services.
It can change every time any RPC library routine is
called, because descriptors are constantly being opened and
closed, for example for TCP connections.

4.2.  Broadcast RPC

The portmapper is a daemon that converts RPC program numbers
into  DARPA protocol port numbers; see the portmap man page.
You can't do broadcast RPC without the portmapper.  Here are
the main differences between broadcast RPC and normal RPC
calls:

1.   Normal RPC expects one answer, whereas broadcast RPC
     expects many answers (one or more answers from each
     responding machine).

2.   Broadcast RPC can only be supported by packet-oriented
     (connectionless) transport protocols like UDP/IP.

3.   The implementation of broadcast RPC treats  all  unsuc-
     cessful  responses  as  garbage  by filtering them out.
     Thus, if there is a version mismatch between the broad-
     caster  and a remote service, the user of broadcast RPC
     never knows.

4.   All broadcast messages are sent to  the  portmap  port.
     Thus, only services that register themselves with their
     portmapper are accessible via the broadcast RPC mechan-
     ism.

4.2.1.  Broadcast RPC Synopsis

#include <rpc/pmap_clnt.h>
        . . .
enum clnt_stat  clnt_stat;
        . . .
clnt_stat = clnt_broadcast(prognum, versnum, procnum,
  inproc, in, outproc, out, eachresult)
        u_long    prognum;        /* program number */
        u_long    versnum;        /* version number */
        u_long    procnum;        /* procedure number */
        xdrproc_t inproc;         /* xdr routine for args */
        caddr_t   in;             /* pointer to args */
        xdrproc_t outproc;        /* xdr routine for results */
        caddr_t   out;            /* pointer to results */
        bool_t    (*eachresult)();/* call with each result gotten */


The procedure eachresult is called each time a valid  result
is obtained.  It returns a boolean that indicates whether or
not the user wants more responses.

bool_t done;
        . . .
done = eachresult(resultsp, raddr)
        caddr_t resultsp;
        struct sockaddr_in *raddr; /* Addr of responding machine */

If done is TRUE, then broadcasting stops and  clnt_broadcast
returns  successfully.   Otherwise,  the  routine  waits for
another response.  The request is rebroadcast  after  a  few
seconds of waiting.  If no responses come back, the routine
returns with RPC_TIMEDOUT.


4.3.  Batching

The RPC architecture is designed so that clients send a call
message,  and  wait  for servers to reply that the call suc-
ceeded.  This implies that  clients  do  not  compute  while
servers  are  processing a call.  This is inefficient if the
client does not want or need an  acknowledgement  for  every
message  sent.   It is possible for clients to continue com-
puting while waiting for a response, using RPC batch facili-
ties.

RPC messages can be placed in a ``pipeline'' of calls  to  a
desired  server;  this is called batching.  Batching assumes
that: 1) each RPC call in the pipeline requires no  response
from  the  server,  and  the server does not send a response
message; and 2) the pipeline of calls is  transported  on  a
reliable  byte  stream  transport such as TCP/IP.  Since the
server does not respond to every call, the client  can  gen-
erate new calls in parallel with the server executing previ-
ous  calls.   Furthermore,  the  TCP/IP  implementation  can
buffer up many call messages, and send them to the server in
one write system call.  This  overlapped  execution  greatly
decreases  the  interprocess  communication  overhead of the
client and server processes, and the total elapsed time of a
series of calls.

Since the batched calls  are  buffered,  the  client  should
eventually  do a legitimate call in order to flush the pipe-
line.

A contrived example of batching follows.   Assume  a  string
rendering  service  (like  a  window system) has two similar
calls: one renders a string and returns void results,  while
the  other renders a string and remains silent.  The service
(using the TCP/IP transport) may look like:


#include <stdio.h>
#include <rpc/rpc.h>
#include <suntool/windows.h>

void windowdispatch();

main()
{
        SVCXPRT *transp;

        transp = svctcp_create(RPC_ANYSOCK, 0, 0);
        if (transp == NULL){
                fprintf(stderr, "can't create an RPC server\n");
                exit(1);
        }
        pmap_unset(WINDOWPROG, WINDOWVERS);
        if (!svc_register(transp, WINDOWPROG, WINDOWVERS,
          windowdispatch, IPPROTO_TCP)) {
                fprintf(stderr, "can't register WINDOW service\n");
                exit(1);
        }
        svc_run();  /* Never returns */
        fprintf(stderr, "should never reach this point\n");
}

void
windowdispatch(rqstp, transp)
        struct svc_req *rqstp;
        SVCXPRT *transp;
{
        char *s = NULL;

        switch (rqstp->rq_proc) {
        case NULLPROC:
                if (!svc_sendreply(transp, xdr_void, 0))
                        fprintf(stderr, "can't reply to RPC call\n");
                return;
        case RENDERSTRING:
                if (!svc_getargs(transp, xdr_wrapstring, &s)) {
                        fprintf(stderr, "can't decode arguments\n");
                        /*
                         * Tell caller he screwed up
                         */
                        svcerr_decode(transp);
                        break;
                }
                /*
                 * Call here to render the string s
                 */
                if (!svc_sendreply(transp, xdr_void, NULL))
                        fprintf(stderr, "can't reply to RPC call\n");
                break;
        case RENDERSTRING_BATCHED:
                if (!svc_getargs(transp, xdr_wrapstring, &s)) {
                        fprintf(stderr, "can't decode arguments\n");
                        /*
                         * We are silent in the face of protocol errors
                         */
                        break;
                }
                /*
                 * Call here to render string s, but send no reply!
                 */
                break;
        default:
                svcerr_noproc(transp);
                return;
        }
        /*
         * Now free string allocated while decoding arguments
         */
        svc_freeargs(transp, xdr_wrapstring, &s);
}

Of course the service could have one  procedure  that  takes
the string and a boolean to indicate whether or not the pro-
cedure should respond.

In order for a client to take advantage of batching, the
client must perform RPC calls on a TCP-based transport and
the actual calls must have the following attributes: 1) the
result's XDR routine must be zero (NULL), and 2) the RPC
call's timeout must be zero.


Here is an example of a client that uses batching to render
a bunch of strings; the pipeline is flushed when the client
reaches the end of its input:

#include <stdio.h>
#include <rpc/rpc.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netdb.h>
#include <suntool/windows.h>

main(argc, argv)
        int argc;
        char **argv;
{
        struct hostent *hp;
        struct timeval pertry_timeout, total_timeout;
        struct sockaddr_in server_addr;
        int sock = RPC_ANYSOCK;
        register CLIENT *client;
        enum clnt_stat clnt_stat;
        char buf[1000], *s = buf;

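        if (argc < 2) {
                fprintf(stderr, "usage: %s hostname\n", argv[0]);
                exit(-1);
        }
        /* Look up the server and fill in its address */
        if ((hp = gethostbyname(argv[1])) == NULL) {
                fprintf(stderr, "can't get addr for %s\n", argv[1]);
                exit(-1);
        }
        bcopy(hp->h_addr, (caddr_t)&server_addr.sin_addr,
                hp->h_length);
        server_addr.sin_family = AF_INET;
        server_addr.sin_port = 0;
        if ((client = clnttcp_create(&server_addr,
          WINDOWPROG, WINDOWVERS, &sock, 0, 0)) == NULL) {
                clnt_pcreateerror("clnttcp_create");
                exit(-1);
        }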
        total_timeout.tv_sec = 0;
        total_timeout.tv_usec = 0;
        while (scanf("%999s", s) != EOF) {
                clnt_stat = clnt_call(client, RENDERSTRING_BATCHED,
                        xdr_wrapstring, &s, NULL, NULL, total_timeout);
                if (clnt_stat != RPC_SUCCESS) {
                        clnt_perror(client, "batched rpc");
                        exit(-1);
                }
        }

        /* Now flush the pipeline */

        total_timeout.tv_sec = 20;
        clnt_stat = clnt_call(client, NULLPROC, xdr_void, NULL,
                xdr_void, NULL, total_timeout);
        if (clnt_stat != RPC_SUCCESS) {
                clnt_perror(client, "rpc");
                exit(-1);
        }
        clnt_destroy(client);
}

Since the server sends no message, the clients cannot be
notified of any of the failures that may occur.  Therefore,
clients are on their own when it comes to handling errors.

The above example was completed to render all of the  (2000)
lines  in  the file /etc/termcap.  The rendering service did
nothing but throw the lines away.  The example  was  run  in
the  following  four  configurations:  1) machine to itself,
regular RPC; 2) machine to itself, batched RPC;  3)  machine
to  another, regular RPC; and 4) machine to another, batched
RPC.  The results are as  follows:  1)  50  seconds;  2)  16
seconds;  3)  52  seconds; 4) 10 seconds.  Running fscanf on
/etc/termcap only requires six seconds.  These timings  show
the  advantage of protocols that allow for overlapped execu-
tion, though these protocols are often hard to design.

4.4.  Authentication

In the examples presented so far, the caller  never  identi-
fied  itself to the server, and the server never required an
ID from the caller.  Clearly, some network services, such as
a  network  filesystem,  require stronger security than what
has been presented so far.

In reality, every RPC call is authenticated by the RPC package
on the server, and similarly, the RPC client package
generates and sends authentication parameters.  Just as
different transports (TCP/IP or UDP/IP) can be used when
creating RPC clients and servers, different forms of
authentication can be associated with RPC clients; the
default authentication type is none.

The authentication subsystem of  the  RPC  package  is  open
ended.   That  is, numerous types of authentication are easy
to support.

4.4.1.  UNIX Authentication

The Client Side

When a caller creates a new RPC client handle as in:

        clnt = clntudp_create(address, prognum, versnum,
                              wait, sockp)

the appropriate transport instance sets the associated
authentication handle, by default, to

        clnt->cl_auth = authnone_create();

The RPC client can choose to use UNIX  style  authentication
by  setting clnt->cl_auth after creating the RPC client han-
dle:

        clnt->cl_auth = authunix_create_default();


This causes each RPC call associated with clnt to carry with
it the following authentication credentials structure:

/*
 * UNIX style credentials.
 */
struct authunix_parms {
    u_long  aup_time;       /* credentials creation time */
    char    *aup_machname;  /* host name where client is */
    int     aup_uid;        /* client's UNIX effective uid */
    int     aup_gid;        /* client's current group id */
    u_int   aup_len;        /* element length of aup_gids */
    int     *aup_gids;      /* array of groups user is in */
};

These fields are set by authunix_create_default by  invoking
the  appropriate  system  calls.  Since the RPC user created
this new style of authentication, the  user  is  responsible
for destroying it with:

        auth_destroy(clnt->cl_auth);

This should be done in all cases, to conserve memory.


The Server Side

Service implementors have a harder time dealing with authen-
tication  issues  since  the  RPC package passes the service
dispatch routine a request that has an arbitrary authentica-
tion  style  associated  with  it.  Consider the fields of a
request handle passed to a service dispatch routine:

/*
 * An RPC Service request
 */
struct svc_req {
    u_long    rq_prog;          /* service program number */
    u_long    rq_vers;          /* service protocol vers num */
    u_long    rq_proc;          /* desired procedure number */
    struct opaque_auth rq_cred; /* raw credentials from wire */
    caddr_t   rq_clntcred;  /* credentials (read only) */
};

The rq_cred is  mostly  opaque,  except  for  one  field  of
interest: the style or flavor of authentication credentials:


/*
 * Authentication info.  Mostly opaque to the programmer.
 */
struct opaque_auth {
    enum_t  oa_flavor;  /* style of credentials */
    caddr_t oa_base;    /* address of more auth stuff */
    u_int   oa_length;  /* not to exceed MAX_AUTH_BYTES */
};

The RPC package guarantees  the  following  to  the  service
dispatch routine:

1.   That the request's rq_cred is well  formed.   Thus  the
     service   implementor   may   inspect   the   request's
     rq_cred.oa_flavor to determine which style of authenti-
     cation  the  caller  used.  The service implementor may
     also wish to inspect the other fields of rq_cred if the
     style  is  not  one  of the styles supported by the RPC
     package.

2.   That the request's rq_clntcred field is either NULL  or
     points to a well formed structure that corresponds to a
     supported   style   of   authentication    credentials.
     Remember  that  only unix style is currently supported,
     so (currently) rq_clntcred could be cast to  a  pointer
     to  an  authunix_parms  structure.   If  rq_clntcred is
     NULL, the service implementor may wish to  inspect  the
     other  (opaque)  fields  of rq_cred in case the service
     knows about a new type of authentication that  the  RPC
     package does not know about.

Our remote users service example can be extended so that  it
computes results for all users except UID 16:


nuser(rqstp, transp)
        struct svc_req *rqstp;
        SVCXPRT *transp;
{
        struct authunix_parms *unix_cred;
        int uid;
        unsigned long nusers;

        /*
         * we don't care about authentication for null proc
         */
        if (rqstp->rq_proc == NULLPROC) {
                if (!svc_sendreply(transp, xdr_void, 0)) {
                        fprintf(stderr, "can't reply to RPC call\n");
                        exit(1);
                }
                return;
        }
        /*
         * now get the uid
         */
        switch (rqstp->rq_cred.oa_flavor) {
        case AUTH_UNIX:
                unix_cred =
                        (struct authunix_parms *)rqstp->rq_clntcred;
                uid = unix_cred->aup_uid;
                break;
        case AUTH_NULL:
        default:
                svcerr_weakauth(transp);
                return;
        }
        switch (rqstp->rq_proc) {
        case RUSERSPROC_NUM:
                /*
                 * make sure caller is allowed to call this proc
                 */
                if (uid == 16) {
                        svcerr_systemerr(transp);
                        return;
                }
                /*
                 * code here to compute the number of users
                 * and put in variable nusers
                 */
                if (!svc_sendreply(transp, xdr_u_long, &nusers)) {
                        fprintf(stderr, "can't reply to RPC call\n");
                        exit(1);
                }
                return;
        default:
                svcerr_noproc(transp);
                return;
        }
}

A few things should be noted here.  First, it is customary
not to check the authentication parameters associated with
the NULLPROC (procedure number zero).  Second, if the
authentication parameter's type is not suitable for your
service, you should call svcerr_weakauth.  And finally, the
service protocol itself should return status for access
denied; in the case of our example, the protocol does not
have such a status, so we call the service primitive
svcerr_systemerr instead.

The last point underscores  the  relation  between  the  RPC
authentication package and the services; RPC deals only with
authentication and not with individual services' access con-
trol.   The  services  themselves  must  implement their own
access control policies and reflect these policies as return
statuses in their protocols.

4.5.  Using Inetd

An RPC server can be started from inetd.  The only difference
from the usual code is that the service creation routine
should be called in the following form:

transp = svcudp_create(0);     /* For UDP */
transp = svctcp_create(0,0,0); /* For listener TCP sockets */
transp = svcfd_create(0,0,0);  /* For connected TCP sockets */

since inetd passes a socket as file descriptor 0.  Also,
svc_register should be called as

svc_register(transp, PROGNUM, VERSNUM, service, 0);

with the final flag as 0, since the program would already be
registered by inetd.  Remember that if you want to exit from
the server process and return control to inetd, you need to
explicitly exit, since svc_run never returns.

The format of entries in /etc/inetd.conf for RPC services is
in one of the following two forms:

p_name/version dgram  rpc/udp wait/nowait user server args
p_name/version stream rpc/tcp wait/nowait user server args

where p_name is the symbolic name of the program as it
appears in rpc(5), version is the version number of the
service, and server is the program implementing the
server.  For more information, see inetd.conf(5).

If the same program handles multiple versions, then the ver-
sion number can be a range, as in this example:


rstatd/1-2 dgram rpc/udp wait root /usr/etc/rpc.rstatd


5.  More Examples


5.1.  Versions

By convention, the first version number of program PROG is
PROGVERS_ORIG and the most recent version is PROGVERS.
Suppose there is a new version of the user program that returns
an unsigned short rather than a long.  If we name this version
RUSERSVERS_SHORT, then a server that wants to support both
versions would do a double register.

if (!svc_register(transp, RUSERSPROG, RUSERSVERS_ORIG,
  nuser, IPPROTO_TCP)) {
        fprintf(stderr, "can't register RUSER service\n");
        exit(1);
}
if (!svc_register(transp, RUSERSPROG, RUSERSVERS_SHORT,
  nuser, IPPROTO_TCP)) {
        fprintf(stderr, "can't register RUSER service\n");
        exit(1);
}

Both versions can be handled by the same C procedure:


nuser(rqstp, transp)
        struct svc_req *rqstp;
        SVCXPRT *transp;
{
        unsigned long nusers;
        unsigned short nusers2;

        switch (rqstp->rq_proc) {
        case NULLPROC:
                if (!svc_sendreply(transp, xdr_void, 0)) {
                        fprintf(stderr, "can't reply to RPC call\n");
                        exit(1);
                }
                return;
        case RUSERSPROC_NUM:
                /*
                 * code here to compute the number of users
                 * and put in variable nusers
                 */
                nusers2 = nusers;
                switch (rqstp->rq_vers) {
                case RUSERSVERS_ORIG:
                        if (!svc_sendreply(transp, xdr_u_long, &nusers)) {
                                fprintf(stderr, "can't reply to RPC call\n");
                        }
                        break;
                case RUSERSVERS_SHORT:
                        if (!svc_sendreply(transp, xdr_u_short, &nusers2)) {
                                fprintf(stderr, "can't reply to RPC call\n");
                        }
                        break;
                }
                return;
        default:
                svcerr_noproc(transp);
                return;
        }
}


5.2.  TCP

Here is an example that is essentially rcp.  The initiator of
the RPC snd call takes its standard input and sends it to
the server rcv, which prints it on standard output.  The RPC
call uses TCP.  This also illustrates an XDR procedure that
behaves differently on serialization than on deserialization.

/*
 * The xdr routine:
 *              on decode, read from wire, write onto fp
 *              on encode, read from fp, write onto wire
 */
#include <stdio.h>
#include <rpc/rpc.h>

xdr_rcp(xdrs, fp)
        XDR *xdrs;
        FILE *fp;
{
        u_int size;
        char buf[BUFSIZ], *p;

        if (xdrs->x_op == XDR_FREE)/* nothing to free */
                return 1;
        while (1) {
                if (xdrs->x_op == XDR_ENCODE) {
                        if ((size = fread(buf, sizeof(char), BUFSIZ,
                          fp)) == 0 && ferror(fp)) {
                                fprintf(stderr, "can't fread\n");
                                exit(1);
                        }
                }
                p = buf;
                if (!xdr_bytes(xdrs, &p, &size, BUFSIZ))
                        return 0;
                if (size == 0)
                        return 1;
                if (xdrs->x_op == XDR_DECODE) {
                        if (fwrite(buf, sizeof(char), size,
                          fp) != size) {
                                fprintf(stderr, "can't fwrite\n");
                                exit(1);
                        }
                }
        }
}


/*
 * The sender routines
 */
#include <stdio.h>
#include <netdb.h>
#include <rpc/rpc.h>
#include <sys/socket.h>
#include <sys/time.h>

main(argc, argv)
        int argc;
        char **argv;
{
        int xdr_rcp();
        int err;

        if (argc < 2) {
                fprintf(stderr, "usage: %s servername\n", argv[0]);
                exit(-1);
        }
        if ((err = callrpctcp(argv[1], RCPPROG, RCPPROC,
          RCPVERS, xdr_rcp, stdin, xdr_void, 0)) != 0) {
                clnt_perrno(err);
                fprintf(stderr, "can't make RPC call\n");
                exit(1);
        }
}

callrpctcp(host, prognum, procnum, versnum,
           inproc, in, outproc, out)
        char *host, *in, *out;
        xdrproc_t inproc, outproc;
{
        struct sockaddr_in server_addr;
        int socket = RPC_ANYSOCK;
        enum clnt_stat clnt_stat;
        struct hostent *hp;
        register CLIENT *client;
        struct timeval total_timeout;

        if ((hp = gethostbyname(host)) == NULL) {
                fprintf(stderr, "can't get addr for '%s'\n", host);
                exit(-1);
        }
        bcopy(hp->h_addr, (caddr_t)&server_addr.sin_addr,
                hp->h_length);
        server_addr.sin_family = AF_INET;
        server_addr.sin_port =  0;
        if ((client = clnttcp_create(&server_addr, prognum,
          versnum, &socket, BUFSIZ, BUFSIZ)) == NULL) {
                perror("rpctcp_create");
                exit(-1);
        }
        total_timeout.tv_sec = 20;
        total_timeout.tv_usec = 0;
        clnt_stat = clnt_call(client, procnum,
                inproc, in, outproc, out, total_timeout);
        clnt_destroy(client);
        return (int)clnt_stat;
}


/*
 * The receiving routines
 */
#include <stdio.h>
#include <rpc/rpc.h>

main()
{
        register SVCXPRT *transp;
        int rcp_service(), xdr_rcp();

        if ((transp = svctcp_create(RPC_ANYSOCK,
          BUFSIZ, BUFSIZ)) == NULL) {
                fprintf(stderr, "svctcp_create: error\n");
                exit(1);
        }
        pmap_unset(RCPPROG, RCPVERS);
        if (!svc_register(transp,
          RCPPROG, RCPVERS, rcp_service, IPPROTO_TCP)) {
                fprintf(stderr, "svc_register: error\n");
                exit(1);
        }
        svc_run();  /* never returns */
        fprintf(stderr, "svc_run should never return\n");
}

rcp_service(rqstp, transp)
        register struct svc_req *rqstp;
        register SVCXPRT *transp;
{
        switch (rqstp->rq_proc) {
        case NULLPROC:
                if (svc_sendreply(transp, xdr_void, 0) == 0) {
                        fprintf(stderr, "err: rcp_service\n");
                        exit(1);
                }
                return;
        case RCPPROC_FP:
                if (!svc_getargs(transp, xdr_rcp, stdout)) {
                        svcerr_decode(transp);
                        return;
                }
                if (!svc_sendreply(transp, xdr_void, 0)) {
                        fprintf(stderr, "can't reply\n");
                        return;
                }
                exit(0);
        default:
                svcerr_noproc(transp);
                return;
        }
}


5.3.  Callback Procedures

Occasionally, it is useful to have a server become a client,
and make an RPC call back to the process which is its client.
An example is remote debugging, where the client is a window
system program, and the server is a debugger running on the
remote machine.  Most of the time, the user clicks a mouse
button at the debugging window, which converts this to a
debugger command, and then makes an RPC call to the server
(where the debugger is actually running), telling it to
execute that command.  However, when the debugger hits a
breakpoint, the roles are reversed, and the debugger wants to
make an RPC call to the window program, so that it can
inform the user that a breakpoint has been reached.

In order to do an RPC callback, you need a program number to
make the RPC call on.  Since this will be a dynamically gen-
erated program number, it should be in the transient  range,
0x40000000 - 0x5fffffff.  The routine gettransient returns a
valid program number in the transient range,  and  registers
it  with  the  portmapper.   It only talks to the portmapper
running on the same  machine  as  the  gettransient  routine
itself.   The  call to pmap_set is a test and set operation,
in that it indivisibly tests whether a  program  number  has
already  been  registered,  and if it has not, then reserves
it.  On return, the sockp argument  will  contain  a  socket
that  can  be  used  as  the argument to an svcudp_create or
svctcp_create call.


#include <stdio.h>
#include <rpc/rpc.h>
#include <sys/socket.h>

gettransient(proto, vers, sockp)
        int proto, vers, *sockp;
{
        static int prognum = 0x40000000;
        int s, len, socktype;
        struct sockaddr_in addr;

        switch(proto) {
                case IPPROTO_UDP:
                        socktype = SOCK_DGRAM;
                        break;
                case IPPROTO_TCP:
                        socktype = SOCK_STREAM;
                        break;
                default:
                        fprintf(stderr, "unknown protocol type\n");
                        return 0;
        }
        if (*sockp == RPC_ANYSOCK) {
                if ((s = socket(AF_INET, socktype, 0)) < 0) {
                        perror("socket");
                        return (0);
                }
                *sockp = s;
        }
        else
                s = *sockp;
        addr.sin_addr.s_addr = 0;
        addr.sin_family = AF_INET;
        addr.sin_port = 0;
        len = sizeof(addr);
        /*
         * may be already bound, so don't check for error
         */
        bind(s, (struct sockaddr *)&addr, len);
        if (getsockname(s, (struct sockaddr *)&addr, &len) < 0) {
                perror("getsockname");
                return (0);
        }
        while (!pmap_set(prognum++, vers, proto,
                ntohs(addr.sin_port))) continue;
        return (prognum-1);
}

NOTE: The call to ntohs is necessary to ensure that the port
number  in addr.sin_port, which is in network byte order, is
passed in host byte order (as pmap_set expects).  This works
on  all  Sun  machines.   See the byteorder(3N) man page for
more details on the conversion  of  network  addresses  from
network to host byte order.


The following pair of programs illustrate  how  to  use  the
gettransient  routine.   The client makes an RPC call to the
server, passing it a transient  program  number.   Then  the
client waits around to receive a callback from the server at
that program number.  The server registers the program EXAM-
PLEPROG  so that it can receive the RPC call informing it of
the callback program number.  Then at some random  time  (on
receiving  an ALRM signal in this example), it sends a call-
back RPC call, using the program number it received earlier.

/*
 * client
 */
#include <stdio.h>
#include <rpc/rpc.h>

int callback();
char hostname[256];

main()
{
        int x, ans, s;
        SVCXPRT *xprt;

        gethostname(hostname, sizeof(hostname));
        s = RPC_ANYSOCK;
        x = gettransient(IPPROTO_UDP, 1, &s);
        fprintf(stderr, "client gets prognum %d\n", x);
        if ((xprt = svcudp_create(s)) == NULL) {
          fprintf(stderr, "rpc_server: svcudp_create\n");
                exit(1);
        }
        /* protocol is 0 - gettransient() does registering
         */
        (void)svc_register(xprt, x, 1, callback, 0);
        ans = callrpc(hostname, EXAMPLEPROG, EXAMPLEVERS,
                EXAMPLEPROC_CALLBACK, xdr_int, &x, xdr_void, 0);
        if ((enum clnt_stat) ans != RPC_SUCCESS) {
                fprintf(stderr, "call: ");
                clnt_perrno(ans);
                fprintf(stderr, "\n");
        }
        svc_run();
        fprintf(stderr, "Error: svc_run shouldn't return\n");
}

callback(rqstp, transp)
        register struct svc_req *rqstp;
        register SVCXPRT *transp;
{
        switch (rqstp->rq_proc) {
                case 0:
                        if (!svc_sendreply(transp, xdr_void, 0)) {
                                fprintf(stderr, "err: callback\n");
                                exit(1);
                        }
                        exit(0);
                case 1:
                        if (!svc_getargs(transp, xdr_void, 0)) {
                                svcerr_decode(transp);
                                exit(1);
                        }
                        fprintf(stderr, "client got callback\n");
                        if (!svc_sendreply(transp, xdr_void, 0)) {
                                fprintf(stderr, "err: callback\n");
                                exit(1);
                        }
        }
}


/*
 * server
 */
#include <stdio.h>
#include <rpc/rpc.h>
#include <sys/signal.h>

char *getnewprog();
char hostname[256];
int docallback();
int pnum;               /* program number for callback routine */

main()
{
        gethostname(hostname, sizeof(hostname));
        registerrpc(EXAMPLEPROG, EXAMPLEVERS,
          EXAMPLEPROC_CALLBACK, getnewprog, xdr_int, xdr_void);
        fprintf(stderr, "server going into svc_run\n");
        signal(SIGALRM, docallback);
        alarm(10);
        svc_run();
        fprintf(stderr, "Error: svc_run shouldn't return\n");
}

char *
getnewprog(pnump)
        char *pnump;
{
        pnum = *(int *)pnump;
        return NULL;
}

docallback()
{
        int ans;

        ans = callrpc(hostname, pnum, 1, 1, xdr_void, 0,
                xdr_void, 0);
        if (ans != 0) {
                fprintf(stderr, "server: ");
                clnt_perrno(ans);
                fprintf(stderr, "\n");
        }
}


rpcgen Programming Guide

1.  The rpcgen Protocol Compiler

The details of programming applications to use  Remote  Pro-
cedure  Calls can be overwhelming.  Perhaps most daunting is
the  writing  of  the  XDR  routines  necessary  to  convert
procedure  arguments  and  results into their network format
and vice-versa.

Fortunately, rpcgen exists to  help  programmers  write  RPC
applications  simply  and directly.  rpcgen does most of the
dirty  work,  allowing  programmers  to  debug   the    main
features  of their application, instead of requiring them to
spend most of their time debugging their  network  interface
code.

rpcgen is a  compiler.  It accepts a remote  program  inter-
face  definition written in a language, called RPC Language,
which is similar to C.  It  produces  a  C  language  output
which  includes  stub  versions  of  the  client routines, a
server skeleton, XDR filter routines for both parameters and
results, and a header file that contains common definitions.
The client stubs interface with the RPC library  and  effec-
tively hide the network from their callers.  The server stub
similarly hides the network from the server procedures  that
are  to be invoked by remote clients.  rpcgen's output files
can be compiled and linked in the usual way.  The  developer
writes  server  procedures-in any language that observes Sun
calling conventions-and links them with the server  skeleton
produced  by rpcgen to get an executable server program.  To
use a remote program, a programmer writes an  ordinary  main
program that makes local procedure calls to the client stubs
produced by rpcgen.   Linking  this  program  with  rpcgen's
stubs  creates  an executable program.  (At present the main
program must be written in C).  rpcgen options can  be  used
to  suppress stub generation and to specify the transport to
be used by the server stub.

Like all compilers, rpcgen  reduces  development  time  that
would otherwise be spent coding and debugging low-level rou-
tines.  All compilers, including rpcgen, do this at a  small
cost  in  efficiency  and flexibility.  However,   many com-
pilers allow  escape  hatches for programmers to   mix  low-
level  code  with   high-level code. rpcgen is no exception.
In speed-critical applications, hand-written routines can be
linked with the rpcgen output without any difficulty.  Also,
one may proceed by using rpcgen output as a starting  point,
and rewriting it as necessary.

2.  Converting Local Procedures into Remote Procedures

Assume an application that runs on  a  single  machine,  one
which  we  want to convert to run over the network.  Here we
will demonstrate such  a  conversion  by  way  of  a  simple
example-a program that prints a message to the console:


/*
 * printmsg.c: print a message on the console
 */
#include <stdio.h>

main(argc, argv)
        int argc;
        char *argv[];
{
        char *message;

        if (argc < 2) {
                fprintf(stderr, "usage: %s <message>\n", argv[0]);
                exit(1);
        }
        message = argv[1];

        if (!printmessage(message)) {
                fprintf(stderr, "%s: couldn't print your message\n",
                        argv[0]);
                exit(1);
        }
        printf("Message delivered!\n");
}
/*
 * Print a message to the console.
 * Return a boolean indicating whether the message was actually printed.
 */
printmessage(msg)
        char *msg;
{
        FILE *f;

        f = fopen("/dev/console", "w");
        if (f == NULL) {
                return (0);
        }
        fprintf(f, "%s\n", msg);
        fclose(f);
        return(1);
}


And then, of course:

example%  cc printmsg.c -o printmsg
example%  printmsg "Hello, there."
Message delivered!
example%


If printmessage was turned into  a remote procedure, then it
could  be   called  from anywhere in   the network. Ideally,
one would just  like to stick   a  keyword  like  remote  in
front   of  a procedure to turn it into a  remote procedure.
Unfortunately, we  have to live  within the  constraints  of
the    C  language, since it existed   long before  RPC did.
But   even without language support, it's not very difficult
to make a procedure remote.

In  general, it's necessary to figure  out  what  the  types
are  for  all  procedure inputs and outputs.  In  this case,
we  have a procedure printmessage which takes a   string  as
input, and returns  an integer as output.  Knowing  this, we
can write a  protocol specification  in  RPC  language  that
describes the remote  version of printmessage.  Here it is:

/*
 * msg.x: Remote message printing protocol
 */

program MESSAGEPROG {
        version MESSAGEVERS {
                int PRINTMESSAGE(string) = 1;
        } = 1;
} = 99;


Remote procedures are part of remote programs, so  we  actu-
ally  declared  an  entire  remote program  here  which con-
tains  the single procedure  PRINTMESSAGE.   This  procedure
was declared to be  in version  1 of the remote program.  No
null procedure (procedure 0)  is  necessary  because  rpcgen
generates it automatically.

Notice  that  everything  is  declared  with   all   capital
letters.   This is not required, but is a good convention to
follow.

Notice also that the argument type is "string" and not "char
*".   This is because a "char *" in C is ambiguous. Program-
mers usually intend it to  mean   a  null-terminated  string
of  characters, but  it  could also represent a pointer to a
single character or a  pointer to an  array  of  characters.
In   RPC  language,  a  null-terminated  string is unambigu-
ously called a "string".

There are  just two more things to  write.  First, there  is
the  remote  procedure  itself.   Here's the definition of a
remote procedure to implement the PRINTMESSAGE procedure  we
declared above:


/*
 * msg_proc.c: implementation of the remote procedure "printmessage"
 */

#include <stdio.h>
#include <rpc/rpc.h>    /* always needed  */
#include "msg.h"        /* need this too: msg.h will be generated by rpcgen */

/*
 * Remote version of "printmessage"
 */
int *
printmessage_1(msg)
        char **msg;
{
        static int result;  /* must be static! */
        FILE *f;

        f = fopen("/dev/console", "w");
        if (f == NULL) {
                result = 0;
                return (&result);
        }
        fprintf(f, "%s\n", *msg);
        fclose(f);
        result = 1;
        return (&result);
}


Notice here that the declaration  of  the  remote  procedure
printmessage_1  differs  from  that  of  the local procedure
printmessage in three ways:

1.   It takes a pointer to a  string  instead  of  a  string
     itself.   This is true of all  remote procedures:  they
     always take pointers to  their  arguments  rather  than
     the arguments themselves.

2.   It returns a pointer to  an   integer  instead  of   an
     integer  itself.  This is also generally true of remote
     procedures: they  always  return  a  pointer  to  their
     results.

3.   It has  an "_1" appended to  its name.    In   general,
     all   remote  procedures  called by rpcgen are named by
     the following rule: the name in the program  definition
     (here  PRINTMESSAGE)  is  converted   to all lower-case
     letters, an underbar ("_")   is appended   to  it,  and
     finally the version number (here 1) is appended.

The last thing to do is declare the main client program that
will call the remote procedure. Here it is:


/*
 * rprintmsg.c: remote version of "printmsg.c"
 */
#include <stdio.h>
#include <rpc/rpc.h>     /* always needed  */
#include "msg.h"         /* need this too: msg.h will be generated by rpcgen */

main(argc, argv)
        int argc;
        char *argv[];
{
        CLIENT *cl;
        int *result;
        char *server;
        char *message;

        if (argc < 3) {
                fprintf(stderr, "usage: %s host message\n", argv[0]);
                exit(1);
        }

        /*
         * Save values of command line arguments
         */
        server = argv[1];
        message = argv[2];

        /*
         * Create client "handle" used for calling MESSAGEPROG on the
         * server designated on the command line. We tell the RPC package
         * to use the "tcp" protocol when contacting the server.
         */
        cl = clnt_create(server, MESSAGEPROG, MESSAGEVERS, "tcp");
        if (cl == NULL) {
                /*
                 * Couldn't establish connection with server.
                 * Print error message and die.
                 */
                clnt_pcreateerror(server);
                exit(1);
        }

        /*
         * Call the remote procedure "printmessage" on the server
         */
        result = printmessage_1(&message, cl);
        if (result == NULL) {
                /*
                 * An error occurred while calling the server.
                 * Print error message and die.
                 */
                clnt_perror(cl, server);
                exit(1);
        }

        /*
         * Okay, we successfully called the remote procedure.
         */
        if (*result == 0) {
                /*
                 * Server was unable to print our message.
                 * Print error message and die.
                 */
                fprintf(stderr, "%s: %s couldn't print your message\n",
                        argv[0], server);
                exit(1);
        }

        /*
         * The message got printed on the server's console
         */
        printf("Message delivered to %s!\n", server);
}

There are two things to note here:

1.   First a client  "handle"  is  created   using  the  RPC
     library  routine  clnt_create.  This client handle will
     be passed  to the stub routines which call  the  remote
     procedure.

2.   The remote procedure printmessage_1 is called exactly
     the same way as it is declared in msg_proc.c except
     for the inserted client handle as the second argument.

Here's how to put all of the pieces together:

example%  rpcgen msg.x
example%  cc rprintmsg.c msg_clnt.c -o rprintmsg
example%  cc msg_proc.c msg_svc.c -o msg_server

Two programs were compiled here: the client program rprintmsg
and the server program msg_server.  Before doing this
though, rpcgen was used to fill in the missing pieces.

Here is what rpcgen did with the input file msg.x:

1.   It created a header file called  msg.h  that  contained
     #define's for MESSAGEPROG, MESSAGEVERS and PRINTMESSAGE
     for use in  the  other modules.

2.   It created client "stub" routines in the msg_clnt.c
     file.  In this case there is only one, the
     printmessage_1 that was referred to from the rprintmsg
     client program.  The name of the output file for client
     stub routines is always formed in this way: if the name
     of the input file is FOO.x, the client stubs output
     file is called FOO_clnt.c.

3.   It created  the  server   program which calls printmes-
     sage_1  in  msg_proc.c.   This  server program is named
     msg_svc.c.  The rule for naming the server output  file
     is  similar   to  the previous one:  for an input  file
     called FOO.x, the   output   server    file  is   named
     FOO_svc.c.

Now we're ready to have some fun.  First, copy the server to
a  remote  machine  and  run  it.   For  this  example,  the
machine is called "moon".  Server processes are run  in  the
background, because they never exit.

moon% msg_server &

Then on our local machine ("sun") we can print a message on
"moon"'s console.

sun% rprintmsg moon "Hello, moon."

The message will get printed to "moon"'s console.  You
can  print   a  message on anybody's console (including your
own) with this program if you are able to copy the server to
their machine and run it.

3.  Generating XDR Routines

The previous example  only demonstrated  the  automatic gen-
eration of client  and server RPC  code. rpcgen may also  be
used to generate  XDR  routines,  that   is,   the  routines
necessary  to  convert   local  data structures into network
format and vice-versa.  This example presents a complete RPC
service-a  remote  directory  listing  service,  which  uses
rpcgen not  only  to generate stub  routines,  but  also  to
generate   the  XDR routines.  Here is the protocol descrip-
tion file:


/*
 * dir.x: Remote directory listing protocol
 */
const MAXNAMELEN = 255;         /* maximum length of a directory entry */

typedef string nametype<MAXNAMELEN>;    /* a directory entry */

typedef struct namenode *namelist;              /* a link in the listing */

/*
 * A node in the directory listing
 */
struct namenode {
        nametype name;          /* name of directory entry */
        namelist next;          /* next entry */
};

/*
 * The result of a READDIR operation.
 */
union readdir_res switch (int errno) {
case 0:
        namelist list;  /* no error: return directory listing */
default:
        void;           /* error occurred: nothing else to return */
};

/*
 * The directory program definition
 */
program DIRPROG {
        version DIRVERS {
                readdir_res
                READDIR(nametype) = 1;
        } = 1;
} = 76;

Running rpcgen on dir.x creates four output files. Three are
the same as before: header file, client stub routines and
server skeleton. The fourth contains the XDR routines
necessary for converting the data types we declared into XDR
format and vice-versa. These are output in the file dir_xdr.c.

Here is the implementation of the "READDIR" procedure:


/*
 * dir_proc.c: remote readdir implementation
 */
#include <rpc/rpc.h>
#include <sys/dir.h>
#include "dir.h"

extern int errno;
extern char *malloc();
extern char *strdup();

readdir_res *
readdir_1(dirname)
        nametype *dirname;
{
        DIR *dirp;
        struct direct *d;
        namelist nl;
        namelist *nlp;
        static readdir_res res; /* must be static! */

        /*
         * Open directory
         */
        dirp = opendir(*dirname);
        if (dirp == NULL) {
                res.errno = errno;
                return (&res);
        }

        /*
         * Free previous result
         */
        xdr_free(xdr_readdir_res, &res);

        /*
         * Collect directory entries
         */
        nlp = &res.readdir_res_u.list;
        while (d = readdir(dirp)) {
                nl = *nlp = (namenode *) malloc(sizeof(namenode));
                nl->name = strdup(d->d_name);
                nlp = &nl->next;
        }
        *nlp = NULL;

        /*
         * Return the result
         */
        res.errno = 0;
        closedir(dirp);
        return (&res);
}


Finally, there is  the  client  side  program  to  call  the
server:


/*
 * rls.c: Remote directory listing client
 */
#include <stdio.h>
#include <rpc/rpc.h>    /* always need this */
#include "dir.h"                /* need this too: will be generated by rpcgen */

extern int errno;

main(argc, argv)
        int argc;
        char *argv[];
{
        CLIENT *cl;
        char *server;
        char *dir;
        readdir_res *result;
        namelist nl;


        if (argc != 3) {
                fprintf(stderr, "usage: %s host directory\n", argv[0]);
                exit(1);
        }

        /*
         * Remember what our command line arguments refer to
         */
        server = argv[1];
        dir = argv[2];

        /*
         * Create client "handle" used for calling DIRPROG on the
         * server designated on the command line. We tell the RPC package
         * to use the "tcp" protocol when contacting the server.
         */
        cl = clnt_create(server, DIRPROG, DIRVERS, "tcp");
        if (cl == NULL) {
                /*
                 * Couldn't establish connection with server.
                 * Print error message and die.
                 */
                clnt_pcreateerror(server);
                exit(1);
        }

        /*
         * Call the remote procedure readdir on the server
         */
        result = readdir_1(&dir, cl);
        if (result == NULL) {
                /*
                 * An error occurred while calling the server.
                 * Print error message and die.
                 */
                clnt_perror(cl, server);
                exit(1);
        }

        /*
         * Okay, we successfully called the remote procedure.
         */
        if (result->errno != 0) {
                /*
                 * A remote system error occurred.
                 * Print error message and die.
                 */
                errno = result->errno;
                perror(dir);
                exit(1);
        }

        /*
         * Successfully got a directory listing.
         * Print it out.
         */
        for (nl = result->readdir_res_u.list; nl != NULL;
          nl = nl->next) {
                printf("%s\n", nl->name);
        }
}

Compile everything, and run.

        sun%  rpcgen dir.x
        sun%  cc rls.c dir_clnt.c dir_xdr.c -o rls
        sun%  cc dir_svc.c dir_proc.c dir_xdr.c -o dir_svc

        sun%  dir_svc &

        moon%  rls sun /usr/pub
        .
        ..
        ascii
        eqnchar
        greek
        kbd
        marg8
        tabclr
        tabs
        tabs4
        moon%


A final note about rpcgen: The client program and the server
procedure can be tested together as a single program by sim-
ply linking them with each other rather than with the client
and  server  stubs.  The procedure calls will be executed as
ordinary local  procedure  calls  and  the  program  can  be
debugged  with  a local debugger such as dbx.  When the pro-
gram is working, the client program can  be  linked  to  the
client stub produced by rpcgen and the server procedures can
be linked to the server stub produced by rpcgen.

NOTE: If you do this, you may want to comment out  calls  to
RPC  library  routines,  and  have client-side routines call
server routines directly.


4.  The C-Preprocessor

The C-preprocessor is  run on all input  files  before  they
are  compiled, so all the preprocessor  directives are legal
within a  ".x" file. Four symbols may be defined,  depending
upon  which  output  file  is getting generated. The symbols
are:

_______________________________________
 Symbol     Usage
_______________________________________
 RPC_HDR    for header-file output
 RPC_XDR    for XDR routine output
 RPC_SVC    for server-skeleton output
 RPC_CLNT   for client stub output
_______________________________________







|


                                      |


Also, rpcgen does  a little preprocessing   of its own.  Any
line  that  begins  with  a percent sign is passed  directly
into the output file,  without  any  interpretation  of  the
line.   Here  is  a  simple  example  that  demonstrates the
preprocessing features.


/*
 * time.x: Remote time protocol
 */
program TIMEPROG {
        version TIMEVERS {
                unsigned int TIMEGET(void) = 1;
        } = 1;
} = 44;

#ifdef RPC_SVC
%int *
%timeget_1()
%{
%        static int thetime;
%
%        thetime = time(0);
%        return (&thetime);
%}
#endif

The '%' feature is not generally recommended, as there is no
guarantee  that the compiler will stick the output where you
intended.

5.  RPC Language

RPC language is an extension of XDR   language.    The  sole
extension  is  the addition of the program type.  For a com-
plete description of the XDR language syntax, see the eXter-
nal  Data  Representation  Standard:  Protocol Specification
chapter.  For a description of the RPC extensions to the XDR
language,  see the Remote Procedure Calls: Protocol Specifi-
cation chapter.

However, XDR language is so close to C that if you  know  C,
you  know  most of it already.  We describe here  the syntax
of the RPC language, showing a  few examples along the  way.
We  also  show how  the various RPC and XDR type definitions
get  compiled into C  type definitions in the output  header
file.

5.1.  Definitions

An RPC language file consists of a series of definitions.

    definition-list:
        definition ";"
        definition ";" definition-list

It recognizes six types of definitions.


    definition:
        enum-definition
        struct-definition
        union-definition
        typedef-definition
        const-definition
        program-definition


5.2.  Structures

An XDR struct  is declared almost exactly like  its C  coun-
terpart.  It looks like the following:

    struct-definition:
        "struct" struct-ident "{"
            declaration-list
        "}"

    declaration-list:
        declaration ";"
        declaration ";" declaration-list

As  an example, here is an  XDR structure to define  a  two-
dimensional  coordinate,  and the C structure  that it  gets
compiled into  in the output header file.

           struct coord {             struct coord {
                int x;       -->           int x;
                int y;                     int y;
           };                         };
                                      typedef struct coord coord;

The output is identical to the  input, except  for the added
typedef  at  the end of  the output. This allows  one to use
"coord" instead  of "struct coord" when declaring items.

5.3.  Unions

XDR unions are discriminated unions, and look quite different
from  C  unions.  They are more analogous to  Pascal variant
records than they are to C unions.

    union-definition:
        "union" union-ident "switch" "(" declaration ")" "{"
            case-list
        "}"

    case-list:
        "case" value ":" declaration ";"
        "default" ":" declaration ";"
        "case" value ":" declaration ";" case-list


Here is an example of a type that might be returned  as  the
result  of  a "read data" operation.  If there is no  error,
return a block of data.  Otherwise, don't return anything.

    union read_result switch (int errno) {
    case 0:
        opaque data[1024];
    default:
        void;
    };

It gets compiled into the following:

    struct read_result {
        int errno;
        union {
            char data[1024];
        } read_result_u;
    };
    typedef struct read_result read_result;

Notice that the union component of the output struct has
the same name as the type name, except for the trailing "_u".

5.4.  Enumerations

XDR enumerations have the same syntax as C enumerations.

    enum-definition:
        "enum" enum-ident "{"
            enum-value-list
        "}"

    enum-value-list:
        enum-value
        enum-value "," enum-value-list

    enum-value:
        enum-value-ident
        enum-value-ident "=" value

Here is a short example of  an XDR enum,   and  the  C  enum
that  it gets compiled into.

     enum colortype {      enum colortype {
          RED = 0,              RED = 0,
          GREEN = 1,   -->      GREEN = 1,
          BLUE = 2              BLUE = 2,
     };                    };
                           typedef enum colortype colortype;


5.5.  Typedef

XDR typedefs have the same syntax as C typedefs.

    typedef-definition:
        "typedef" declaration

Here  is an example  that defines  a  fname_type  used   for
declaring  file  name  strings that have a maximum length of
255 characters.

    typedef string fname_type<255>; --> typedef char *fname_type;


5.6.  Constants

XDR constants are symbolic constants that may be used wherever
an integer constant is used, for example, in array size
specifications.

    const-definition:
        "const" const-ident "=" integer

For example, the following defines a constant DOZEN equal to
12.

    const DOZEN = 12;  -->  #define DOZEN 12


5.7.  Programs

RPC programs are declared using the following syntax:

    program-definition:
        "program" program-ident "{"
            version-list
        "}" "=" value

    version-list:
        version ";"
        version ";" version-list

    version:
        "version" version-ident "{"
            procedure-list
        "}" "=" value

    procedure-list:
        procedure ";"
        procedure ";" procedure-list

    procedure:
        type-ident procedure-ident "(" type-ident ")" "=" value


For example, here is the time protocol, revisited:

/*
 * time.x: Get or set the time. Time is represented as number of seconds
 * since 0:00, January 1, 1970.
 */
program TIMEPROG {
    version TIMEVERS {
        unsigned int TIMEGET(void) = 1;
        void TIMESET(unsigned) = 2;
    } = 1;
} = 44;

This file compiles into #defines in the output header file:

#define TIMEPROG 44
#define TIMEVERS 1
#define TIMEGET 1
#define TIMESET 2


5.8.  Declarations

In XDR, there are only four kinds of declarations.

    declaration:
        simple-declaration
        fixed-array-declaration
        variable-array-declaration
        pointer-declaration

1) Simple declarations are just like simple C declarations.

    simple-declaration:
        type-ident variable-ident

Example:

    colortype color;    --> colortype color;

2) Fixed-length Array Declarations are  just  like  C  array
declarations:

    fixed-array-declaration:
        type-ident variable-ident "[" value "]"

Example:

    colortype palette[8];    --> colortype palette[8];

3) Variable-Length Array Declarations have no explicit  syn-
tax in C, so XDR invents its own using angle-brackets.


    variable-array-declaration:
        type-ident variable-ident "<" value ">"
        type-ident variable-ident "<" ">"

The maximum size is specified between  the  angle  brackets.
The size may be omitted, indicating that the array may be of
any size.

    int heights<12>;    /* at most 12 items */
    int widths<>;       /* any number of items */

Since  variable-length  arrays have no  explicit  syntax  in
C,    these   declarations   are   actually   compiled  into
"struct"s.  For example,   the  "heights"  declaration  gets
compiled into the following struct:

    struct {
        u_int heights_len;  /* # of items in array */
        int *heights_val;   /* pointer to array */
    } heights;

Note that the number of items in the array is stored in
the "_len" component and the pointer to the array is
stored in the "_val" component.  The first part of each of
these components' names is the same as the name of the
declared XDR variable.
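
As a rough illustration of how a program might fill in such a
struct, the sketch below populates the "heights" declaration in
plain C.  The struct tag heights_t, the helper set_heights and
the macro HEIGHTS_MAX are assumptions made for this example;
they are not part of rpcgen's output.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HEIGHTS_MAX 12          /* hypothetical: the bound in heights<12> */

/* Same layout as the struct shown above, with a tag added. */
struct heights_t {
    unsigned int heights_len;   /* # of items in array (u_int) */
    int *heights_val;           /* pointer to array */
};

/*
 * Hypothetical helper: copy n items into h.
 * Returns 1 on success, 0 if n exceeds the declared maximum
 * or memory cannot be allocated.
 */
int set_heights(struct heights_t *h, const int *items, unsigned int n)
{
    assert(h != NULL);
    if (n > HEIGHTS_MAX)
        return 0;
    h->heights_val = NULL;
    h->heights_len = 0;
    if (n > 0) {
        h->heights_val = malloc(n * sizeof(int));
        if (h->heights_val == NULL)
            return 0;
        memcpy(h->heights_val, items, n * sizeof(int));
    }
    h->heights_len = n;
    return 1;
}
```

The caller owns heights_val and releases it with free() when the
data is no longer needed.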

4) Pointer Declarations are made in XDR exactly as they are
in  C.   You   can't  really send pointers over the network,
but  you  can use XDR pointers for  sending  recursive  data
types  such as lists and trees.  The type is actually called
"optional-data", not "pointer", in XDR language.

    pointer-declaration:
        type-ident "*" variable-ident

Example:

    listitem *next;  -->  listitem *next;
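
The idea behind optional-data can be sketched without the XDR
library: each item is preceded by a flag saying that an item
follows, and a final flag of zero takes the place of the NULL
pointer.  The function flatten_list below is a hypothetical
helper written for this illustration.  It writes plain ints into
an array rather than real XDR bytes, so it shows only the shape
of the encoding.

```c
#include <assert.h>
#include <stddef.h>

struct listitem {
    int value;
    struct listitem *next;      /* "listitem *next;" in XDR language */
};

/*
 * Flatten a list into out[] as the sequence
 *     1, value, 1, value, ..., 0
 * A 1 means "an item follows"; the final 0 stands for NULL.
 * Returns the number of ints written, or 0 if out[] is too small.
 */
size_t flatten_list(const struct listitem *head, int *out, size_t cap)
{
    size_t n = 0;
    const struct listitem *p;

    assert(out != NULL);
    for (p = head; p != NULL; p = p->next) {
        if (n + 2 > cap)
            return 0;
        out[n++] = 1;           /* item present */
        out[n++] = p->value;
    }
    if (n + 1 > cap)
        return 0;
    out[n++] = 0;               /* end of list */
    return n;
}
```

A three-item list therefore occupies seven ints, and an empty
list occupies one.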


5.9.  Special Cases

There are a few exceptions to the rules described above.

Booleans: C has no built-in boolean type. However,  the  RPC
library does provide a boolean type called bool_t that is
either TRUE or FALSE.  Things declared as  type bool in  XDR
language   are   compiled   into bool_t in the output header
file.

Example:


    bool married;  -->  bool_t married;

Strings: C has  no built-in string  type, but  instead  uses
the  null-terminated "char *" convention.  In XDR  language,
strings are declared using the "string" keyword,  and   com-
piled  into "char  *"s in the  output header file. The  max-
imum size contained  in the  angle  brackets  specifies  the
maximum  number  of  characters allowed in the  strings (not
counting the null character). The maximum size may be left
off, indicating a string of arbitrary length.

Examples:

    string name<32>;    -->  char *name;
    string longname<>;  -->  char *longname;

Opaque  Data: Opaque data is used in RPC and XDR to describe
untyped  data, that is, just  sequences of arbitrary  bytes.
It may be  declared  either as a fixed  or  variable  length
array.

Examples:
    opaque diskblock[512];  -->  char diskblock[512];

    opaque filedata<1024>;  -->  struct {
                                    u_int filedata_len;
                                    char *filedata_val;
                                 } filedata;

Voids: In a void declaration, the variable  is   not  named.
The  declaration  is  just  "void"  and  nothing else.  Void
declarations can only occur in two places: union definitions
and  program  definitions  (as  the  argument or result of a
remote procedure).

eXternal Data Representation: Sun Technical Notes

This chapter contains technical notes on  Sun's  implementa-
tion  of  the eXternal Data Representation (XDR) standard, a
set of  library  routines  that  allow  a  C  programmer  to
describe  arbitrary data structures in a machine-independent
fashion.  For a formal specification of  the  XDR  standard,
see the eXternal Data Representation Standard.  XDR is the
backbone of Sun's Remote  Procedure  Call  package,  in  the
sense  that  data  for remote procedure calls is transmitted
using the standard.  XDR library routines should be used  to
transmit  data  that  is  accessed (read or written) by more
than one type of machine.[2]
_________________________
  [2] For a complete specification of the system eXternal
Data Representation routines, see the xdr(3N) manual
page.


This chapter contains a short tutorial overview of  the  XDR
library  routines,  a guide to accessing currently available
XDR streams, and information on  defining  new  streams  and
data  types.   XDR  was  designed  to  work across different
languages, operating  systems,  and  machine  architectures.
Most  users  (particularly  RPC  users)  will  need only the
information in sections 1, 2 and 3 of this  document.   Pro-
grammers  wishing  to  implement RPC and XDR on new machines
will need the information in the rest of this document,  and
especially the eXternal Data Representation Standard.

NOTE: rpcgen can be used to write XDR routines even in cases
where no RPC calls are being made.

On Sun systems, C programs that want  to  use  XDR  routines
must  include  the  file <rpc/rpc.h>, which contains all the
necessary interfaces to the XDR system.  Since the C library
libc.a contains all the XDR routines, compile as normal.

        % cc program.c


1.  Justification

Consider the following two programs, writer:

#include <stdio.h>
#include <stdlib.h>     /* exit() */
int
main()                  /* writer.c */
{
        long i;
        for (i = 0; i < 8; i++) {
                if (fwrite((char *)&i, sizeof(i), 1, stdout) != 1) {
                        fprintf(stderr, "failed!\n");
                        exit(1);
                }
        }
}

and reader

#include <stdio.h>
#include <stdlib.h>     /* exit() */
int
main()                  /* reader.c */
{
        long i, j;
        for (j = 0; j < 8; j++) {
                if (fread((char *)&i, sizeof (i), 1, stdin) != 1) {
                        fprintf(stderr, "failed!\n");
                        exit(1);
                }
                printf("%ld ", i);
        }
        printf("\n");
}


The two programs appear to be  portable,  because  (a)  they
pass  lint  checking, and (b) they exhibit the same behavior
when executed on two different hardware architectures, a Sun
and a VAX.

Piping the output of the writer program to the  reader  pro-
gram gives identical results on a Sun or a VAX.

        sun% writer | reader
        0 1 2 3 4 5 6 7
        sun%

        vax% writer | reader
        0 1 2 3 4 5 6 7
        vax%

With the advent of local area networks and 4.2BSD  came  the
concept  of  ``network  pipes'' - a process produces data on
one machine, and a second process consumes data  on  another
machine.   A network pipe can be constructed with writer and
reader.  Here are the results if the first produces data  on
a Sun, and the second consumes data on a VAX.

        sun% writer | rsh vax reader
        0 16777216 33554432 50331648 67108864 83886080 100663296
        117440512
        sun%

Identical results can be obtained by executing writer on the
VAX  and reader on the Sun.  These results occur because the
byte ordering of long integers differs between the  VAX  and
the  Sun,  even  though  word  size  is the same.  Note that
16777216 is 2^24: when four bytes are reversed, the
1 winds up in the 24th bit.
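
The arithmetic can be checked with a few lines of C.  The sketch
below assumes 32-bit long integers, as on the Sun and the VAX of
the example; swap32 and check_swap32 are names chosen for this
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

/* The values the reader printed in the Sun-to-VAX run above. */
void check_swap32(void)
{
    assert(swap32(1) == 16777216u);     /* 2^24 */
    assert(swap32(2) == 33554432u);
    assert(swap32(7) == 117440512u);
}
```

Applying swap32 twice returns the original value, which is why
running writer on the VAX and reader on the Sun gives the same
numbers.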

Whenever data is shared by two or more machine types,  there
is  a  need  for  portable data.  Programs can be made data-
portable by replacing the read and write calls with calls to
an XDR library routine, xdr_long, a filter that knows the
standard representation of a long integer  in  its  external
form.  Here are the revised versions of writer:


#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <rpc/rpc.h>    /* xdr is a sub-library of rpc */
int
main()          /* writer.c */
{
        XDR xdrs;
        long i;
        xdrstdio_create(&xdrs, stdout, XDR_ENCODE);
        for (i = 0; i < 8; i++) {
                if (!xdr_long(&xdrs, &i)) {
                        fprintf(stderr, "failed!\n");
                        exit(1);
                }
        }
}

and reader:

#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <rpc/rpc.h>    /* xdr is a sub-library of rpc */
int
main()          /* reader.c */
{
        XDR xdrs;
        long i, j;
        xdrstdio_create(&xdrs, stdin, XDR_DECODE);
        for (j = 0; j < 8; j++) {
                if (!xdr_long(&xdrs, &i)) {
                        fprintf(stderr, "failed!\n");
                        exit(1);
                }
                printf("%ld ", i);
        }
        printf("\n");
}

The new programs were executed on a Sun, on a VAX, and  from
a Sun to a VAX; the results are shown below.

        sun% writer | reader
        0 1 2 3 4 5 6 7
        sun%

        vax% writer | reader
        0 1 2 3 4 5 6 7
        vax%

        sun% writer | rsh vax reader
        0 1 2 3 4 5 6 7
        sun%


NOTE: Integers are just the tip of the portable-data iceberg.
Arbitrary data structures present portability problems,
particularly with respect to alignment and pointers.
Alignment on word boundaries may cause the size of a
structure to vary from machine to machine.  And pointers,
which are very convenient to use, have no meaning outside
the machine where they are defined.


2.  A Canonical Standard

XDR's approach to standardizing data representations is
canonical.  That is, XDR defines a single byte order (Big
Endian), a single floating-point representation (IEEE), and
so on.  Any program running on any machine can use XDR to
create portable data by translating its local representation
to the XDR standard representations; similarly, any program
running on any machine can read portable data by translating
the XDR standard representations to its local equivalents.
The single standard completely decouples programs that
create or send portable data from those that use or receive
portable data.  The advent of a new machine or a new
language has no effect on the community of existing
portable data creators and users.  A new machine joins this
community by being ``taught'' how to convert between the
standard representations and its local representations; the
local representations of other machines are irrelevant.
Conversely, to existing programs running on other machines,
the local representations of the new machine are also
irrelevant; such programs can immediately read portable data
produced by the new machine because such data conforms to
the canonical standards that they already understand.
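
What ``teaching'' a machine the canonical byte order amounts to
can be sketched in a few lines.  The helpers put_be32 and
get_be32 below are hypothetical names, not XDR library routines.
Because they work with shifts on values rather than by copying
memory, the same source produces the same four bytes on any
host, whatever its native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v in buf[0..3], most significant byte first. */
void put_be32(unsigned char *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Rebuild the value from four bytes stored most significant first. */
uint32_t get_be32(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

A sender calls put_be32 and a receiver calls get_be32; neither
needs to know anything about the other machine.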

There are strong precedents for  XDR's  canonical  approach.
For example, TCP/IP, UDP/IP, XNS, Ethernet, and, indeed, all
protocols below layer five of the ISO model,  are  canonical
protocols.   The advantage of any canonical approach is sim-
plicity; in the case of XDR, a single set of conversion rou-
tines  is  written  once  and  is  never touched again.  The
canonical approach has a disadvantage, but it is unimportant
in  real-world  data  transfer  applications.   Suppose  two
Little-Endian machines are transferring  integers  according
to  the  XDR  standard.   The  sending  machine converts the
integers from Little-Endian byte order to  XDR  (Big-Endian)
byte  order;  the  receiving  machine  performs  the reverse
conversion.  Because both machines  observe  the  same  byte
order,  their  conversions are unnecessary.  The point, how-
ever, is not necessity, but cost as compared to the alterna-
tive.

The time spent converting to and from a canonical
representation is insignificant, especially in networking
applications.  Most of the time required to prepare a data
structure for transfer is not spent in conversion but in
traversing the elements of the data structure.  To transmit a
tree, for example, each leaf must be visited and each element
in a leaf record must be copied to a buffer and aligned there;
storage for the leaf may have to be deallocated as well.
Similarly, to receive a tree, storage must be allocated for
each leaf, data must be moved from the buffer to the leaf
and properly aligned, and pointers must be constructed to
link the leaves together.  Every machine pays the cost of
traversing and copying data structures whether or not
conversion is required.  In networking applications,
communications overhead (the time required to move the data
down through the sender's protocol layers, across the network
and up through the receiver's protocol layers) dwarfs
conversion overhead.

3.  The XDR Library

The XDR library not only solves data  portability  problems,
it  also allows you to write and read arbitrary C constructs
in a consistent, specified, well-documented  manner.   Thus,
it  can  make sense to use the library even when the data is
not shared among machines on a network.

The XDR library  has  filter  routines  for  strings  (null-
terminated arrays of bytes), structures, unions, and arrays,
to name a few.  Using more primitive routines, you can write
your  own  specific  XDR routines to describe arbitrary data
structures, including elements of arrays, arms of unions, or
objects  pointed  at  from other structures.  The structures
themselves may contain  arrays  of  arbitrary  elements,  or
pointers to other structures.

Let's examine the two programs more  closely.   There  is  a
family  of XDR stream creation routines in which each member
treats the stream of bits differently.  In our example, data
is  manipulated  using  standard  I/O  routines,  so  we use
xdrstdio_create.  The parameters to XDR stream creation rou-
tines  vary  according  to  their function.  In our example,
xdrstdio_create takes a pointer to an XDR structure that  it
initializes, a pointer to a FILE that the input or output is
performed on, and  the  operation.   The  operation  may  be
XDR_ENCODE   for  serializing  in  the  writer  program,  or
XDR_DECODE for deserializing in the reader program.

Note: RPC users never need to create XDR  streams;  the  RPC
system  itself  creates these streams, which are then passed
to the users.

The xdr_long primitive is characteristic of most XDR library
primitives  and all client XDR routines.  First, the routine
returns FALSE (0) if it fails, and TRUE (1) if it  succeeds.
Second,  for each data type, xxx, there is an associated XDR
routine of the form:


        xdr_xxx(xdrs, xp)
                XDR *xdrs;
                xxx *xp;
        {
        }

In our case, xxx is long, and the corresponding XDR routine
is a primitive, xdr_long.  The client could also define an
arbitrary structure xxx, in which case the client would also
supply the routine xdr_xxx, describing each field by calling
XDR routines of the appropriate type.  In all cases the
first parameter, xdrs, can be treated as an opaque handle
and passed to the primitive routines.

XDR routines are direction independent; that is, the same
routines  are called to serialize or deserialize data.  This
feature is critical  to  software  engineering  of  portable
data.   The  idea  is  to  call  the same routine for either
operation - this almost guarantees that serialized data  can
also  be deserialized.  One routine is used by both producer
and consumer of networked  data.   This  is  implemented  by
always  passing  the  address  of  an object rather than the
object itself - only in the case of deserialization  is  the
object  modified.   This feature is not shown in our trivial
example, but its value becomes obvious when nontrivial  data
structures  are  passed among machines.  If needed, the user
can obtain the direction of the XDR operation.  See the  XDR
Operation Directions section of this chapter for details.
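
Direction independence can be illustrated with a toy stream
that is unrelated to the real XDR implementation.  All names
below (mini_op, mini_stream, mini_long, mini_pair, struct pair)
are hypothetical, and the stream holds whole 32-bit words so
that byte order can be left out of the picture.  The point is
only that a single routine, handed the address of an object,
can serve both producer and consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mini_op { MINI_ENCODE, MINI_DECODE };

struct mini_stream {
    enum mini_op op;    /* direction of this stream */
    int32_t *words;     /* caller-supplied storage */
    size_t pos, cap;    /* next word and capacity, in words */
};

/* One routine for both directions; 1 on success, 0 on failure. */
int mini_long(struct mini_stream *s, int32_t *lp)
{
    assert(s != NULL && lp != NULL);
    if (s->pos >= s->cap)
        return 0;                   /* stream is full or exhausted */
    if (s->op == MINI_ENCODE)
        s->words[s->pos] = *lp;     /* object is only read */
    else
        *lp = s->words[s->pos];     /* object is modified */
    s->pos++;
    return 1;
}

struct pair {
    int32_t a, b;
};

/* A composite routine never looks at the direction itself. */
int mini_pair(struct mini_stream *s, struct pair *pp)
{
    return mini_long(s, &pp->a) && mini_long(s, &pp->b);
}
```

Calling mini_pair on an encoding stream and then on a decoding
stream over the same storage reproduces the original pair.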

Let's look at a slightly more complicated  example.   Assume
that  a  person's  gross  assets  and  liabilities are to be
exchanged among processes.  Also assume  that  these  values
are important enough to warrant their own data type:

struct gnumbers {
        long g_assets;
        long g_liabilities;
};

The corresponding  XDR  routine  describing  this  structure
would be:

bool_t                  /* TRUE is success, FALSE is failure */
xdr_gnumbers(xdrs, gp)
        XDR *xdrs;
        struct gnumbers *gp;
{
        if (xdr_long(xdrs, &gp->g_assets) &&
            xdr_long(xdrs, &gp->g_liabilities))
                return(TRUE);
        return(FALSE);
}


Note that the parameter xdrs is never inspected or modified;
it  is  only  passed on to the subcomponent routines.  It is
imperative to inspect the return value of each  XDR  routine
call,  and  to  give  up immediately and return FALSE if the
subroutine fails.

This example also shows that the type bool_t is declared  as
an  integer  whose  only  values are TRUE (1) and FALSE (0).
This document uses the following definitions:

#define bool_t  int
#define TRUE    1
#define FALSE   0
#define enum_t int      /* enum_t used for generic enums */


Keeping these  conventions  in  mind,  xdr_gnumbers  can  be
rewritten as follows:

xdr_gnumbers(xdrs, gp)
        XDR *xdrs;
        struct gnumbers *gp;
{
        return(xdr_long(xdrs, &gp->g_assets) &&
                xdr_long(xdrs, &gp->g_liabilities));
}

This document uses both coding styles.

4.  XDR Library Primitives

This section gives a synopsis of  each  XDR  primitive.   It
starts  with  basic  data  types and moves on to constructed
data types.  Finally,  XDR  utilities  are  discussed.   The
interface  to  these  primitives and utilities is defined in
the include  file  <rpc/xdr.h>,  automatically  included  by
<rpc/rpc.h>.

4.1.  Number Filters

The XDR library provides  primitives  to  translate  between
numbers  and  their  corresponding external representations.
Primitives cover the set of numbers in:

             [signed,unsigned]*[char,short,int,long]

Specifically, the eight primitives are:


        bool_t xdr_char(xdrs, cp)
                XDR *xdrs;
                char *cp;
        bool_t xdr_u_char(xdrs, ucp)
                XDR *xdrs;
                unsigned char *ucp;
        bool_t xdr_int(xdrs, ip)
                XDR *xdrs;
                int *ip;
        bool_t xdr_u_int(xdrs, up)
                XDR *xdrs;
                unsigned *up;
        bool_t xdr_long(xdrs, lip)
                XDR *xdrs;
                long *lip;
        bool_t xdr_u_long(xdrs, lup)
                XDR *xdrs;
                u_long *lup;
        bool_t xdr_short(xdrs, sip)
                XDR *xdrs;
                short *sip;
        bool_t xdr_u_short(xdrs, sup)
                XDR *xdrs;
                u_short *sup;

The first parameter, xdrs, is an  XDR  stream  handle.   The
second  parameter is the address of the number that provides
data to the stream or receives data from it.   All  routines
return  TRUE if they complete successfully, and FALSE other-
wise.

4.2.  Floating Point Filters

The XDR library also provides  primitive  routines  for  C's
floating point types:

        bool_t xdr_float(xdrs, fp)
                XDR *xdrs;
                float *fp;
        bool_t xdr_double(xdrs, dp)
                XDR *xdrs;
                double *dp;

The first parameter, xdrs, is an XDR stream handle.  The
second parameter is the address of the floating point number
that provides data to the stream or receives data  from  it.
All  routines return TRUE if they complete successfully, and
FALSE otherwise.

Note: Since the numbers are  represented  in  IEEE  floating
point,   routines  may  fail  when  decoding  a  valid  IEEE
representation into a  machine-specific  representation,  or
vice-versa.


4.3.  Enumeration Filters

The XDR library provides a primitive  for  generic  enumera-
tions.   The  primitive  assumes  that a C enum has the same
representation inside the  machine  as  a  C  integer.   The
boolean  type  is  an  important  instance of the enum.  The
external representation of a boolean is always TRUE  (1)  or
FALSE (0).

        #define bool_t  int
        #define FALSE   0
        #define TRUE    1
        #define enum_t int
        bool_t xdr_enum(xdrs, ep)
                XDR *xdrs;
                enum_t *ep;
        bool_t xdr_bool(xdrs, bp)
                XDR *xdrs;
                bool_t *bp;

The second parameters, ep and bp, are addresses of the
associated type that provides data to, or receives data
from, the stream xdrs.  The routines return TRUE if they
complete successfully, and FALSE otherwise.

4.4.  No Data

Occasionally, an XDR routine must be  supplied  to  the  RPC
system,  even  when  no  data  is  passed  or required.  The
library provides such a routine:

        bool_t xdr_void();  /* always returns TRUE */


4.5.  Constructed Data Type Filters

Constructed or compound data type  primitives  require  more
parameters and perform more complicated functions than the
primitives discussed above.  This  section  includes  primi-
tives  for  strings,  arrays, unions, and pointers to struc-
tures.

Constructed data type primitives may use memory management.
In many cases, memory is allocated when deserializing data
with XDR_DECODE.  Therefore, the XDR package must provide
means to deallocate memory.  This is done by an XDR
operation, XDR_FREE.  To review, the three XDR directional
operations are XDR_ENCODE, XDR_DECODE and XDR_FREE.

4.5.1.  Strings

In C, a string is defined as a sequence of bytes terminated
by a null byte, which is not considered when calculating
string length.  However, when a string is passed or
manipulated, a pointer to it is employed.  Therefore, the
XDR library defines a string to be a "char *" and not a
sequence of characters.  The external representation of a
string is drastically different from its internal
representation.  Externally, strings are represented as
sequences of ASCII characters, while internally, they are
represented with character pointers.  Conversion between the
two representations is accomplished with the routine
xdr_string.

        bool_t xdr_string(xdrs, sp, maxlength)
                XDR *xdrs;
                char **sp;
                u_int maxlength;

The first parameter xdrs is  the  XDR  stream  handle.   The
second parameter sp is a pointer to a string (type char **).  The
third parameter maxlength specifies the  maximum  number  of
bytes allowed during encoding or decoding; its value is usu-
ally specified by  a  protocol.   For  example,  a  protocol
specification may say that a file name may be no longer than
255 characters.  The routine returns FALSE if the number  of
characters exceeds maxlength, and TRUE if it doesn't.
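
For readers curious about the external form, the XDR standard
represents a string as a four-byte length followed by the
characters, padded with zero bytes to a multiple of four.  The
sketch below builds that layout by hand.  The function
encode_string is a hypothetical helper written for this
illustration; it is not how xdr_string is implemented, and
programs should call xdr_string rather than code like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write s into out[] as: 4-byte big-endian length, the characters,
 * then zero bytes up to the next multiple of four.
 * Returns the number of bytes written, or 0 on failure.
 */
size_t encode_string(const char *s, unsigned int maxlength,
                     unsigned char *out, size_t cap)
{
    size_t len, padded;

    assert(s != NULL && out != NULL);
    len = strlen(s);
    if (len > maxlength)
        return 0;                       /* exceeds the protocol limit */
    padded = (len + 3) & ~(size_t)3;    /* round up to a multiple of 4 */
    if (cap < 4 + padded)
        return 0;                       /* output buffer too small */
    out[0] = (unsigned char)(len >> 24);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;
    memset(out + 4, 0, padded);         /* zero fill, including padding */
    memcpy(out + 4, s, len);
    return 4 + padded;
}
```

The three-character string "abc" thus takes eight bytes: the
length 3, the three characters, and one byte of padding.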

The behavior of xdr_string is similar  to  the  behavior  of
other  routines  discussed  in  this section.  The direction
XDR_ENCODE is  easiest  to  understand.   The  parameter  sp
points  to  a string of a certain length; if the string does
not exceed maxlength, the bytes are serialized.

The effect of deserializing a string is subtle.   First  the
length  of  the  incoming  string is determined; it must not
exceed maxlength.  Next sp is dereferenced; if the value
is  NULL,  then  a string of the appropriate length is allo-
cated and *sp is set to this string.  If the original  value
of *sp is non-null, then the XDR package assumes that a tar-
get area has been  allocated,  which  can  hold  strings  no
longer  than  maxlength.   In  either  case,  the  string is
decoded into the target area.  The routine  then  appends  a
null character to the string.
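
The allocate-or-reuse rule can be sketched as follows.  The
function decode_string is a hypothetical stand-in that takes the
incoming characters from memory instead of an XDR stream; only
its treatment of *sp and maxlength is meant to mirror the
description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * src/len describe the incoming characters (no terminating null).
 * Returns 1 on success, 0 on failure, in the manner of TRUE/FALSE.
 */
int decode_string(const char *src, unsigned int len,
                  char **sp, unsigned int maxlength)
{
    assert(src != NULL && sp != NULL);
    if (len > maxlength)
        return 0;                       /* incoming string is too long */
    if (*sp == NULL) {
        *sp = malloc((size_t)len + 1);  /* room for the null character */
        if (*sp == NULL)
            return 0;
    }
    /* Otherwise the caller's area is assumed to be large enough. */
    memcpy(*sp, src, len);
    (*sp)[len] = '\0';
    return 1;
}
```

A caller that passes a pointer set to NULL must later release
the string; a caller that passes its own area keeps ownership
of it.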

In the XDR_FREE operation, the string is obtained  by  dere-
ferencing  sp.   If  the string is not NULL, it is freed and
*sp is set to NULL.  In this operation,  xdr_string  ignores
the maxlength parameter.

4.5.2.  Byte Arrays

Often variable-length arrays of bytes are preferable to
strings.  Byte arrays differ from strings in the following
three ways: 1) the length of the array (the byte count) is
explicitly located in an unsigned integer, 2) the byte
sequence is not terminated by a null character, and 3) the
external representation of the bytes is the same as their
internal representation.  The primitive xdr_bytes converts
between the internal and external representations of byte
arrays:

        bool_t xdr_bytes(xdrs, bpp, lp, maxlength)
            XDR *xdrs;
            char **bpp;
            u_int *lp;
            u_int maxlength;

The usage of the first, second and fourth parameters is
identical  to  the  first,  second  and  third parameters of
xdr_string, respectively.  The length of the  byte  area  is
obtained by dereferencing lp when serializing; *lp is set to
the byte length when deserializing.

4.5.3.  Arrays

The XDR library package provides a  primitive  for  handling
arrays  of arbitrary elements.  The xdr_bytes routine treats
a subset of generic arrays, in which the size of array  ele-
ments is known to be 1, and the external description of each
element is built-in.  The generic array primitive, xdr_array,
requires parameters identical to those of xdr_bytes plus two
more: the size of array elements, and an XDR routine to han-
dle  each of the elements.  This routine is called to encode
or decode each element of the array.

        bool_t
        xdr_array(xdrs, ap, lp, maxlength, elementsiz, xdr_element)
            XDR *xdrs;
            char **ap;
            u_int *lp;
            u_int maxlength;
            u_int elementsiz;
            bool_t (*xdr_element)();

The parameter ap is the address of the pointer to the array.
If *ap is NULL when the array is being deserialized, XDR
allocates an array of the appropriate size and sets *ap to
that array.  The element count of the array is obtained from
*lp when the array is serialized; *lp is set to the array
length when the array is deserialized.  The parameter
maxlength is the maximum number of elements that the array is
allowed to have; elementsiz is the byte size of each element
of the array (the C operator sizeof can be used to obtain
this value).  The routine xdr_element is called to
serialize, deserialize, or free each element of the array.
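
The role of the element size and the per-element routine can be
seen in a small sketch.  The names elem_proc, for_each_element
and double_int are hypothetical; the code does no encoding at
all and is not taken from xdr_array.  It shows only how a
generic routine can step through elements of unknown type when
it is given their size.

```c
#include <assert.h>
#include <stddef.h>

/* Element routine: returns 1 on success, 0 on failure. */
typedef int (*elem_proc)(void *elem);

/*
 * Visit count elements of elsize bytes each, starting at base,
 * and give up at the first element whose routine fails.
 */
int for_each_element(char *base, unsigned int count,
                     unsigned int elsize, elem_proc proc)
{
    unsigned int i;

    assert(proc != NULL);
    for (i = 0; i < count; i++) {
        if (!proc(base + (size_t)i * elsize))
            return 0;
    }
    return 1;
}

/* Example element routine: double an int, rejecting negative values. */
int double_int(void *elem)
{
    int *ip = elem;

    if (*ip < 0)
        return 0;
    *ip *= 2;
    return 1;
}
```

A call such as for_each_element((char *)a, 3, sizeof (int),
double_int) visits each int of a three-element array a in turn.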

Before defining more constructed data types, it is appropri-
ate to present three examples.

Example A:
A user on a networked machine can be identified by (a) the
machine name, such as krypton: see the gethostname man page;
(b) the user's UID: see the geteuid man page; and (c) the
group numbers to which the user belongs: see the getgroups
man page.  A structure with this information and its
associated XDR routine could be coded like this:

struct netuser {
    char    *nu_machinename;
    int     nu_uid;
    u_int   nu_glen;
    int     *nu_gids;
};
#define NLEN 255    /* machine names < 256 chars */
#define NGRPS 20    /* user can't be in > 20 groups */
bool_t
xdr_netuser(xdrs, nup)
    XDR *xdrs;
    struct netuser *nup;
{
    return(xdr_string(xdrs, &nup->nu_machinename, NLEN) &&
        xdr_int(xdrs, &nup->nu_uid) &&
        xdr_array(xdrs, &nup->nu_gids, &nup->nu_glen,
        NGRPS, sizeof (int), xdr_int));
}


Example B:
A party of network users could be implemented as an array of
netuser structures.  The declaration and its associated XDR
routines are as follows:

struct party {
    u_int p_len;
    struct netuser *p_nusers;
};
#define PLEN 500    /* max number of users in a party */
bool_t
xdr_party(xdrs, pp)
    XDR *xdrs;
    struct party *pp;
{
    return(xdr_array(xdrs, &pp->p_nusers, &pp->p_len, PLEN,
        sizeof (struct netuser), xdr_netuser));
}


Example C:
The well-known parameters to main, argc and argv, can be
combined into a structure.  An array of these structures can
make up a history of commands.  The declarations and XDR
routines might look like:


struct cmd {
    u_int c_argc;
    char **c_argv;
};
#define ALEN 1000   /* args cannot be > 1000 chars */
#define NARGC 100   /* commands cannot have > 100 args */

struct history {
    u_int h_len;
    struct cmd *h_cmds;
};
#define NCMDS 75    /* history is no more than 75 commands */

bool_t
xdr_wrap_string(xdrs, sp)
    XDR *xdrs;
    char **sp;
{
    return(xdr_string(xdrs, sp, ALEN));
}


bool_t
xdr_cmd(xdrs, cp)
    XDR *xdrs;
    struct cmd *cp;
{
    return(xdr_array(xdrs, &cp->c_argv, &cp->c_argc, NARGC,
        sizeof (char *), xdr_wrap_string));
}


bool_t
xdr_history(xdrs, hp)
    XDR *xdrs;
    struct history *hp;
{
    return(xdr_array(xdrs, &hp->h_cmds, &hp->h_len, NCMDS,
        sizeof (struct cmd), xdr_cmd));
}

The most confusing part of this example is that the  routine
xdr_wrap_string is needed to package the xdr_string routine,
because the implementation  of  xdr_array  only  passes  two
parameters   to   the  array  element  description  routine;
xdr_wrap_string supplies the third parameter to xdr_string.

By now the recursive nature of the  XDR  library  should  be
obvious.  Let's continue with more constructed data types.

4.5.4.  Opaque Data

In some protocols, handles are passed from a server to a
client.  The client passes the handle back to the server at
some later time.  Handles are never inspected by clients;
they are obtained and submitted.  That is to say, handles
are opaque.  The primitive xdr_opaque is used for describing
fixed-size, opaque bytes.

        bool_t xdr_opaque(xdrs, p, len)
            XDR *xdrs;
            char *p;
            u_int len;

The parameter p is the location of the  bytes;  len  is  the
number  of  bytes  in the opaque object.  By definition, the
actual data contained in the opaque object are  not  machine
portable.

4.5.5.  Fixed Sized Arrays

The  XDR  library  provides  a  primitive,  xdr_vector,  for
fixed-length arrays.

#define NLEN 255    /* machine names must be < 256 chars */
#define NGRPS 20    /* user belongs to exactly 20 groups */
struct netuser {
    char *nu_machinename;
    int nu_uid;
    int nu_gids[NGRPS];
};
bool_t
xdr_netuser(xdrs, nup)
    XDR *xdrs;
    struct netuser *nup;
{
    if (!xdr_string(xdrs, &nup->nu_machinename, NLEN))
        return(FALSE);
    if (!xdr_int(xdrs, &nup->nu_uid))
        return(FALSE);
    if (!xdr_vector(xdrs, nup->nu_gids, NGRPS, sizeof(int),
        xdr_int)) {
            return(FALSE);
    }
    return(TRUE);
}


4.5.6.  Discriminated Unions

The XDR library supports discriminated unions.   A  discrim-
inated  union  is a C union and an enum_t value that selects
an ``arm'' of the union.


        struct xdr_discrim {
            enum_t value;
            bool_t (*proc)();
        };
        bool_t xdr_union(xdrs, dscmp, unp, arms, defaultarm)
            XDR *xdrs;
            enum_t *dscmp;
            char *unp;
            struct xdr_discrim *arms;
            bool_t (*defaultarm)();  /* may equal NULL */

First the routine translates the discriminant of the union
located at *dscmp.  The discriminant is always an enum_t.
Next the union located at *unp is translated.  The parameter
arms is a pointer to an array of xdr_discrim structures.
Each structure contains an ordered pair of [value,proc].  If
the union's discriminant is equal to the associated value,
then the proc is called to translate the union.  The end of
the xdr_discrim structure array is denoted by an entry whose
proc is NULL (0).  If the discriminant is not found in the
arms array, then the defaultarm procedure is called if it is
non-null; otherwise the routine returns FALSE.
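
The arm lookup described above can be sketched as follows.  This
is not the library's xdr_union: it covers only the search through
the arms array and the fall-back to the default arm, and the names
struct arm, arm_proc, dispatch_arm, arm_int and arm_string are
hypothetical.

```c
#include <stddef.h>

typedef int (*arm_proc)(void *unp);

/* Hypothetical analogue of struct xdr_discrim. */
struct arm {
    int value;
    arm_proc proc;
};

/*
 * Find the arm whose value equals discrim and call its proc.
 * The arms array ends at the first entry whose proc is NULL.
 * Returns 0 (FALSE) when no arm matches and there is no default.
 */
int dispatch_arm(int discrim, void *unp, const struct arm *arms,
                 arm_proc defaultarm)
{
    for (; arms->proc != NULL; arms++)
        if (arms->value == discrim)
            return arms->proc(unp);
    return defaultarm != NULL ? defaultarm(unp) : 0;
}

/* Two toy arm routines that record which arm ran. */
int arm_int(void *unp)    { *(int *)unp = 1; return 1; }
int arm_string(void *unp) { *(int *)unp = 2; return 1; }
```

Since the search is linear and stops at the first match, the
arms need not be sorted, and the discriminant values may be
sparse.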

Example D: Suppose the type of a union may be integer, char-
acter  pointer  (a  string), or a gnumbers structure.  Also,
assume the union and its current  type  are  declared  in  a
structure.  The declaration is:

enum utype { INTEGER=1, STRING=2, GNUMBERS=3 };
struct u_tag {
    enum utype utype;   /* the union's discriminant */
    union {
        int ival;
        char *pval;
        struct gnumbers gn;
    } uval;
};

The following constructs and XDR procedure (de)serialize the
discriminated union:


struct xdr_discrim u_tag_arms[4] = {
    { INTEGER, xdr_int },
    { GNUMBERS, xdr_gnumbers },
    { STRING, xdr_wrap_string },
    { __dontcare__, NULL }
    /* always terminate arms with a NULL xdr_proc */
};
bool_t
xdr_u_tag(xdrs, utp)
    XDR *xdrs;
    struct u_tag *utp;
{
    return(xdr_union(xdrs, &utp->utype, &utp->uval,
        u_tag_arms, NULL));
}

The routine xdr_gnumbers was presented above in The XDR
Library section.  xdr_wrap_string was presented in example
C.  The default arm parameter to xdr_union (the last parameter)
is NULL in this example.  Therefore the value of the
union's discriminant may legally take on only values listed
in the u_tag_arms array.  This example also demonstrates
that the elements of the arms array do not need to be
sorted.

It is worth pointing out that the values of the discriminant
may be sparse, though in this example they are not.  It is
always good practice to explicitly assign integer values to
each element of the discriminant's type.  This practice both
documents the external representation of the discriminant
and guarantees that different C compilers emit identical
discriminant values.

Exercise: Implement xdr_union using the other primitives  in
this section.

4.5.7.  Pointers

In C it is often  convenient  to  put  pointers  to  another
structure  within  a structure.  The primitive xdr_reference
makes it easy to  serialize,  deserialize,  and  free  these
referenced structures.

        bool_t xdr_reference(xdrs, pp, ssize, proc)
            XDR *xdrs;
            char **pp;
            u_int ssize;
            bool_t (*proc)();


Parameter pp is the address of the pointer to the structure;
parameter ssize is the size in bytes of the structure (use
the C operator sizeof to obtain this value); and proc is the
XDR routine that describes the structure.  When decoding
data, storage is allocated if *pp is NULL.

There is no need for  a  primitive  xdr_struct  to  describe
structures  within  structures,  because pointers are always
sufficient.

Exercise: Implement xdr_reference using xdr_array.  Warning:
xdr_reference and xdr_array are NOT interchangeable external
representations of data.

Example  E:  Suppose  there  is  a  structure  containing  a
person's name and a pointer to a gnumbers structure contain-
ing the person's gross assets  and  liabilities.   The  con-
struct is:

        struct pgn {
            char *name;
            struct gnumbers *gnp;
        };

The corresponding XDR routine for this structure is:

        bool_t
        xdr_pgn(xdrs, pp)
            XDR *xdrs;
            struct pgn *pp;
        {
            if (xdr_string(xdrs, &pp->name, NLEN) &&
              xdr_reference(xdrs, &pp->gnp,
              sizeof(struct gnumbers), xdr_gnumbers))
                return(TRUE);
            return(FALSE);
        }

Pointer Semantics and XDR

In many applications, C programmers attach double meaning to
the values of a pointer.  Typically the value NULL (or zero)
means data is  not  needed,  yet  some  application-specific
interpretation  applies.   In  essence,  the C programmer is
encoding a discriminated union  efficiently  by  overloading
the interpretation of the value of a pointer.  For instance,
in example E a NULL pointer value  for  gnp  could  indicate
that  the person's assets and liabilities are unknown.  That
is, the pointer value encodes two things: whether or not the
data  is  known;  and if it is known, where it is located in
memory.  Linked lists are an extreme example of the  use  of
application-specific pointer interpretation.

The primitive xdr_reference cannot and does not attach any
special meaning to a null-value pointer during serialization.
That is, passing an address of a pointer whose value
is NULL to xdr_reference when serializing data will most
likely cause a memory fault and, on the UNIX system, a core
dump.
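
One way to carry the "is the data present?" meaning explicitly,
rather than in the pointer value, is to send a boolean word ahead
of the data.  The following is a minimal sketch under that
assumption; it works on a plain byte buffer, does not use the XDR
library, and the names put_u32 and encode_optional_long are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Store v at p as 4 bytes, most significant byte first. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Encode an optional 32-bit value: a boolean word (1 = present,
 * 0 = absent) followed, only when present, by the value itself.
 * A NULL vp is legal and means "absent".
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t encode_optional_long(unsigned char *buf, size_t len,
                            const int32_t *vp)
{
    size_t need = (vp != NULL) ? 8 : 4;

    if (len < need)
        return 0;
    put_u32(buf, vp != NULL);
    if (vp != NULL)
        put_u32(buf + 4, (uint32_t)*vp);
    return need;
}
```

The pointer is never dereferenced when it is NULL, so the
memory fault described above cannot occur in this sketch.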

xdr_pointer  correctly  handles  NULL  pointers.   For  more
information about its use, see Linked Lists.

Exercise: After reading the section on Linked Lists,  return
here and extend example E so that it can correctly deal with
NULL pointer values.

Exercise: Using the xdr_union, xdr_reference, and xdr_void
primitives, implement a generic pointer handling primitive
that implicitly deals with NULL pointers.  That is, implement
xdr_pointer.

4.6.  Non-filter Primitives

XDR streams can be manipulated with the primitives discussed
in this section.

        u_int xdr_getpos(xdrs)
            XDR *xdrs;
        bool_t xdr_setpos(xdrs, pos)
            XDR *xdrs;
            u_int pos;
        void xdr_destroy(xdrs)
            XDR *xdrs;

The routine xdr_getpos  returns  an  unsigned  integer  that
describes the current position in the data stream.  Warning:
In some XDR streams, the returned  value  of  xdr_getpos  is
meaningless;  the  routine returns a -1 in this case (though
-1 should be a legitimate value).

The routine xdr_setpos sets a stream position to pos.  Warning:
In some XDR streams, setting a position is impossible;
in such cases, xdr_setpos will return FALSE.  This routine
will also fail if the requested position is out-of-bounds.
The definition of bounds varies from stream to stream.

The xdr_destroy primitive destroys the XDR stream.  Usage of
the stream after calling this routine is undefined.
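
A common use of position primitives is to reserve room for a
value that is not known until later, then go back and fill it in.
The sketch below illustrates the idea on a hypothetical memory
stream; struct mem_stream and the ms_ routines are illustrative
names and are not part of the XDR library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory stream; not the library's XDR structure. */
struct mem_stream {
    unsigned char *base;
    size_t size;
    size_t pos;
};

size_t ms_getpos(const struct mem_stream *s) { return s->pos; }

/* Fails (returns 0) when pos lies outside the buffer. */
int ms_setpos(struct mem_stream *s, size_t pos)
{
    if (pos > s->size)
        return 0;
    s->pos = pos;
    return 1;
}

int ms_put_u32(struct mem_stream *s, uint32_t v)
{
    if (s->size - s->pos < 4)
        return 0;
    s->base[s->pos++] = (unsigned char)(v >> 24);
    s->base[s->pos++] = (unsigned char)(v >> 16);
    s->base[s->pos++] = (unsigned char)(v >> 8);
    s->base[s->pos++] = (unsigned char)v;
    return 1;
}

/*
 * Write a byte-length word followed by n data words, filling in
 * the length only after the data has been written.
 */
int ms_put_counted(struct mem_stream *s, const uint32_t *w, uint32_t n)
{
    size_t lenpos = ms_getpos(s), endpos;
    uint32_t i;

    if (!ms_put_u32(s, 0))              /* placeholder for the length */
        return 0;
    for (i = 0; i < n; i++)
        if (!ms_put_u32(s, w[i]))
            return 0;
    endpos = ms_getpos(s);
    if (!ms_setpos(s, lenpos) ||
        !ms_put_u32(s, (uint32_t)(endpos - lenpos - 4)))
        return 0;
    return ms_setpos(s, endpos);        /* resume after the data */
}
```

A stream that cannot seek would have to reject ms_setpos, which
is why callers should always check the return value.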

4.7.  XDR Operation Directions

At times you may wish to optimize XDR routines by taking
advantage of the direction of the operation: XDR_ENCODE,
XDR_DECODE, or XDR_FREE.  The value xdrs->x_op always contains
the direction of the XDR operation.  Programmers are not
encouraged to take advantage of this information.  Therefore,
no example is presented here.  However, an example in
the Linked Lists section demonstrates the usefulness of the
xdrs->x_op field.


4.8.  XDR Stream Access

An XDR stream is obtained by calling the  appropriate  crea-
tion  routine.   These creation routines take arguments that
are tailored to the specific properties of the stream.

Streams currently exist for (de)serialization of data to or
from standard I/O FILE streams, TCP/IP connections and UNIX
files, and memory.  The XDR Stream Implementation section
documents the XDR object and how to make new XDR streams when
they are required.

4.8.1.  Standard I/O Streams

XDR streams can be interfaced to standard I/O using the
xdrstdio_create routine as follows:

        #include <stdio.h>
        #include <rpc/rpc.h>    /* xdr streams part of rpc */
        void
        xdrstdio_create(xdrs, fp, x_op)
            XDR *xdrs;
            FILE *fp;
            enum xdr_op x_op;

The routine xdrstdio_create initializes an XDR stream
pointed to by xdrs.  The XDR stream interfaces to the standard
I/O library.  Parameter fp is an open file, and x_op is an
XDR direction.

4.8.2.  Memory Streams

Memory streams allow the streaming of data into or out of  a
specified area of memory:

        #include <rpc/rpc.h>
        void
        xdrmem_create(xdrs, addr, len, x_op)
            XDR *xdrs;
            char *addr;
            u_int len;
            enum xdr_op x_op;

The routine xdrmem_create initializes an XDR stream in local
memory.  The memory is pointed to by parameter addr; parameter
len is the length in bytes of the memory.  The parameters
xdrs and x_op are identical to the corresponding parameters
of xdrstdio_create.  Currently, the UDP/IP implementation
of RPC uses xdrmem_create.  Complete call or result messages
are built in memory before calling the sendto system
routine.

4.8.3.  Record (TCP/IP) Streams

A record stream is an XDR stream built on top  of  a  record
marking  standard  that  is built on top of the UNIX file or
4.2 BSD connection interface.

        #include <rpc/rpc.h>    /* xdr streams part of rpc */
        void xdrrec_create(xdrs,
          sendsize, recvsize, iohandle, readproc, writeproc)
            XDR *xdrs;
            u_int sendsize, recvsize;
            char *iohandle;
            int (*readproc)(), (*writeproc)();

The routine xdrrec_create provides an XDR  stream  interface
that  allows  for a bidirectional, arbitrarily long sequence
of records.  The contents of the records  are  meant  to  be
data in XDR form.  The stream's primary use is for interfac-
ing RPC to TCP connections.  However,  it  can  be  used  to
stream data into or out of normal UNIX files.

The parameter xdrs is similar to the corresponding parameter
described above.  The stream does its own data buffering
similar to that of standard I/O.  The parameters sendsize
and recvsize determine the size in bytes of the output and
input buffers, respectively; if their values are zero (0),
then predetermined defaults are used.  When a buffer needs
to be filled or flushed, the routine readproc or writeproc
is called, respectively.  The usage and behavior of these
routines are similar to the UNIX system calls read and
write.  However, the first parameter to each of these routines
is the opaque parameter iohandle.  The other two parameters,
buf and nbytes, and the results (byte count) are identical
to the system routines.  If xxx is readproc or writeproc,
then it has the following form:

        /*
         * returns the actual number of bytes transferred.
         * -1 is an error
         */
        int
        xxx(iohandle, buf, nbytes)
            char *iohandle;
            char *buf;
            int nbytes;
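
As an illustration of the form above, here is a sketch of a
routine with the shape of writeproc.  The I/O handle is an
assumption: instead of a file descriptor it points at a
hypothetical struct mem_sink, which makes the example easy to run
in isolation.  The name sink_write is likewise hypothetical.

```c
#include <string.h>

/* Hypothetical I/O handle: an in-memory sink, not a descriptor. */
struct mem_sink {
    char data[64];
    int used;
};

/*
 * Has the shape described above for writeproc: returns the
 * number of bytes actually transferred, or -1 on error.
 */
int sink_write(char *iohandle, char *buf, int nbytes)
{
    struct mem_sink *sink = (struct mem_sink *)iohandle;
    int room = (int)sizeof(sink->data) - sink->used;

    if (nbytes < 0 || nbytes > room)
        return -1;              /* treat lack of room as an error */
    memcpy(sink->data + sink->used, buf, (size_t)nbytes);
    sink->used += nbytes;
    return nbytes;
}
```

The handle is passed through as a char * and cast back by the
routine, which is what lets the stream remain ignorant of what
the handle really is.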

The XDR stream provides means for delimiting records in  the
byte  stream.   The  implementation  details  of  delimiting
records in a stream are discussed in appendix 1.  The primi-
tives that are specific to record streams are as follows:


        bool_t
        xdrrec_endofrecord(xdrs, flushnow)
            XDR *xdrs;
            bool_t flushnow;
        bool_t
        xdrrec_skiprecord(xdrs)
            XDR *xdrs;
        bool_t
        xdrrec_eof(xdrs)
            XDR *xdrs;

The routine xdrrec_endofrecord causes the  current  outgoing
data to be marked as a record.  If the parameter flushnow is
TRUE, then the stream's writeproc will be called; otherwise,
writeproc  will  be  called  when the output buffer has been
filled.

The routine xdrrec_skiprecord causes an input stream's posi-
tion  to  be moved past the current record boundary and onto
the beginning of the next record in the stream.

If there is no more data in the stream's input buffer,  then
the  routine  xdrrec_eof  returns  TRUE.  That is not to say
that there is no more data in the underlying  file  descrip-
tor.
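
For orientation, the following sketch shows how a record
boundary can be represented in a byte stream.  It assumes the
fragment header layout of the RPC record marking standard: a
4-byte header, most significant byte first, whose high-order bit
flags the last fragment of a record and whose remaining 31 bits
give the fragment length in bytes.  The names RM_LAST_FRAG,
rm_encode_header and rm_decode_header are hypothetical and are
not library routines.

```c
#include <stdint.h>

#define RM_LAST_FRAG 0x80000000u   /* high bit: last fragment of record */

/* Build a fragment header into hdr[4], most significant byte first. */
void rm_encode_header(unsigned char hdr[4], uint32_t length, int last)
{
    uint32_t h = (length & ~RM_LAST_FRAG) | (last ? RM_LAST_FRAG : 0u);

    hdr[0] = (unsigned char)(h >> 24);
    hdr[1] = (unsigned char)(h >> 16);
    hdr[2] = (unsigned char)(h >> 8);
    hdr[3] = (unsigned char)h;
}

/* Split a header into its length and its last-fragment flag. */
uint32_t rm_decode_header(const unsigned char hdr[4], int *last)
{
    uint32_t h = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                 ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];

    *last = (h & RM_LAST_FRAG) != 0;
    return h & ~RM_LAST_FRAG;
}
```

Under this layout, marking the end of a record amounts to
setting the flag in the header of the fragment being flushed,
and skipping a record amounts to discarding fragments up to and
including one whose flag is set.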

4.9.  XDR Stream Implementation

This section provides the  abstract  data  types  needed  to
implement new instances of XDR streams.

4.9.1.  The XDR Object

The following structure defines  the  interface  to  an  XDR
stream:


enum xdr_op { XDR_ENCODE=0, XDR_DECODE=1, XDR_FREE=2 };
typedef struct {
    enum xdr_op x_op;            /* operation; fast added param */
    struct xdr_ops {
        bool_t  (*x_getlong)();  /* get long from stream */
        bool_t  (*x_putlong)();  /* put long to stream */
        bool_t  (*x_getbytes)(); /* get bytes from stream */
        bool_t  (*x_putbytes)(); /* put bytes to stream */
        u_int   (*x_getpostn)(); /* return stream offset */
        bool_t  (*x_setpostn)(); /* reposition offset */
        caddr_t (*x_inline)();   /* ptr to buffered data */
        VOID    (*x_destroy)();  /* free private area */
    } *x_ops;
    caddr_t     x_public;        /* users' data */
    caddr_t     x_private;       /* pointer to private data */
    caddr_t     x_base;          /* private for position info */
    int         x_handy;         /* extra private word */
} XDR;

The x_op field is the current operation being performed on
the stream.  This field is important to the XDR primitives,
but should not affect a stream's implementation.  That is, a
stream's implementation should not depend on this value.
The fields x_private, x_base, and x_handy are private to the
particular stream's implementation.  The field x_public is
for the XDR client and should never be used by the XDR
stream implementations or the XDR primitives.

Macros for accessing operations x_getpostn, x_setpostn, and
x_destroy were defined in the Non-filter Primitives section.
The operation x_inline takes two parameters: an XDR *, and an
unsigned integer, which is a byte count.  The routine returns a
pointer to a piece of the stream's internal buffer.  The
caller can then use the buffer segment for any purpose.
From the stream's point of view, the bytes in the buffer
segment have been consumed or put.  The routine may return
NULL if it cannot return a buffer segment of the requested
size.  (The x_inline routine is for cycle squeezers.  Use of
the resulting buffer is not data-portable.  Users are
encouraged not to use this feature.)

The operations x_getbytes and x_putbytes blindly get and put
sequences  of  bytes  from or to the underlying stream; they
return TRUE if they are  successful,  and  FALSE  otherwise.
The routines have identical parameters (replace xxx):

        bool_t
        xxxbytes(xdrs, buf, bytecount)
                XDR *xdrs;
                char *buf;
                u_int bytecount;

The operations x_getlong and x_putlong receive and put long
numbers from and to the data stream.  It is the responsibility
of these routines to translate the numbers between the
machine representation and the (standard) external representation.
The UNIX primitives htonl and ntohl can be helpful
in accomplishing this.  The eXternal Data Representation
Standard chapter defines the standard representation of
numbers.  The higher-level XDR implementation assumes that
signed and unsigned long integers contain the same number of
bits, and that nonnegative integers have the same bit
representations as unsigned integers.  The routines return
TRUE if they succeed, and FALSE otherwise.  They have
identical parameters:

        bool_t
        xxxlong(xdrs, lp)
                XDR *xdrs;
                long *lp;
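
A minimal sketch of such a pair of routines, built on htonl and
ntohl, is given here.  It is written against a hypothetical
buffer-backed structure, struct long_stream, rather than the XDR
object, and it uses int32_t where the form above uses long, on
the assumption that the host's long may be wider than the 32
bits that travel on the wire.  The names ls_putlong and
ls_getlong are illustrative.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer-backed stream state. */
struct long_stream {
    unsigned char *next;  /* next byte to read or write */
    size_t left;          /* bytes remaining in the buffer */
};

/* Put one 32-bit quantity in external (big-endian) form. */
int ls_putlong(struct long_stream *s, const int32_t *lp)
{
    uint32_t net = htonl((uint32_t)*lp);

    if (s->left < sizeof net)
        return 0;
    memcpy(s->next, &net, sizeof net);
    s->next += sizeof net;
    s->left -= sizeof net;
    return 1;
}

/* Get one 32-bit quantity and convert it to host form. */
int ls_getlong(struct long_stream *s, int32_t *lp)
{
    uint32_t net;

    if (s->left < sizeof net)
        return 0;
    memcpy(&net, s->next, sizeof net);
    *lp = (int32_t)ntohl(net);
    s->next += sizeof net;
    s->left -= sizeof net;
    return 1;
}
```

Copying through memcpy avoids any assumption about the
alignment of the buffer.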

Implementors of new XDR streams must make an  XDR  structure
(with  new  operation  routines) available to clients, using
some kind of create routine.
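
The pattern of an operations vector plus a create routine can be
sketched with a deliberately tiny stream that only counts the
bytes that would be written.  Everything here is hypothetical:
struct cstream and struct cstream_ops are simplified analogues of
the XDR object with just two operations, and countstream_create
is an illustrative name.

```c
#include <stddef.h>

struct cstream;

/* Simplified operations vector with two of the operations. */
struct cstream_ops {
    int (*putlong)(struct cstream *, const long *);
    int (*putbytes)(struct cstream *, const char *, unsigned);
};

struct cstream {
    const struct cstream_ops *ops;
    size_t count;          /* private: bytes "written" so far */
};

/* A long always occupies four bytes in external form. */
static int count_putlong(struct cstream *s, const long *lp)
{
    (void)lp;
    s->count += 4;
    return 1;
}

static int count_putbytes(struct cstream *s, const char *buf, unsigned n)
{
    (void)buf;
    s->count += n;
    return 1;
}

static const struct cstream_ops count_ops = {
    count_putlong, count_putbytes
};

/* The create routine: hands the client an initialized stream. */
void countstream_create(struct cstream *s)
{
    s->ops = &count_ops;
    s->count = 0;
}
```

Clients reach the stream only through the operations vector, so
a different stream type can be substituted by supplying another
create routine.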

5.  Advanced Topics

This section describes techniques for  passing  data  struc-
tures  that are not covered in the preceding sections.  Such
structures include  linked  lists  (of  arbitrary  lengths).
Unlike the simpler examples covered in the earlier sections,
the following examples are written  using  both  the  XDR  C
library routines and the XDR data description language.  The
eXternal Data Representation Standard chapter of  this  Net-
working  Programming  manual describes this language in com-
plete detail.

5.1.  Linked Lists

The last example in the Pointers section presented a C data
structure and its associated XDR routines for an individual's
gross assets and liabilities.  The example is duplicated
below:

struct gnumbers {
        long g_assets;
        long g_liabilities;
};
bool_t
xdr_gnumbers(xdrs, gp)
        XDR *xdrs;
        struct gnumbers *gp;
{
        if (xdr_long(xdrs, &(gp->g_assets)))
                return(xdr_long(xdrs, &(gp->g_liabilities)));
        return(FALSE);
}


Now assume that we wish to implement a linked list  of  such
information.  A  data structure could be constructed as fol-
lows:

struct gnumbers_node {
        struct gnumbers gn_numbers;
        struct gnumbers_node *gn_next;
};
typedef struct gnumbers_node *gnumbers_list;


The head of the linked list can be thought of as the data
object; that is, the head is not merely a convenient shorthand
for a structure.  Similarly the gn_next field is used
to indicate whether or not the object has terminated.
Unfortunately, if the object continues, the gn_next field is also
the address of where it continues.  The link addresses carry
no useful information when the object is serialized.

The XDR data description of this linked list is described by the
recursive declaration of gnumbers_list:

struct gnumbers {
        int g_assets;
        int g_liabilities;
};
struct gnumbers_node {
        gnumbers gn_numbers;
        gnumbers_list gn_next;
};
union gnumbers_list switch (bool more_data) {
case TRUE:
        gnumbers_node node;
case FALSE:
        void;
};


In this description, the boolean indicates whether there  is
more  data following it. If the boolean is FALSE, then it is
the last data field of the structure. If it is TRUE, then it
is  followed  by a gnumbers structure and (recursively) by a
gnumbers_list.  Note that the C declaration has  no  boolean
explicitly  declared  in it (though the gn_next field impli-
citly carries the information), while the XDR data  descrip-
tion has no pointer explicitly declared in it.
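
To make the external form concrete, the sketch below lays a list
out in a byte buffer exactly as the description reads: a TRUE
word before each node's data and a final FALSE word.  It does not
use the XDR library; int32_t replaces long on the assumption that
each field occupies 32 bits on the wire, and put_word and
encode_list are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct gnumbers {
    int32_t g_assets;
    int32_t g_liabilities;
};
struct gnumbers_node {
    struct gnumbers gn_numbers;
    struct gnumbers_node *gn_next;
};

/* Store v at p as 4 bytes, most significant byte first. */
static void put_word(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Returns the bytes written, or 0 if the buffer is too small. */
size_t encode_list(unsigned char *buf, size_t len,
                   const struct gnumbers_node *n)
{
    size_t off = 0;

    for (; n != NULL; n = n->gn_next) {
        if (len - off < 12)
            return 0;
        put_word(buf + off, 1);                 /* more_data = TRUE */
        put_word(buf + off + 4, (uint32_t)n->gn_numbers.g_assets);
        put_word(buf + off + 8, (uint32_t)n->gn_numbers.g_liabilities);
        off += 12;
    }
    if (len - off < 4)
        return 0;
    put_word(buf + off, 0);                     /* more_data = FALSE */
    return off + 4;
}
```

A list of two nodes therefore occupies 28 bytes, and the empty
list occupies 4.  No link address appears anywhere in the output.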

Hints for writing the XDR routines for a gnumbers_list  fol-
low  easily  from  the  XDR  description above. Note how the
primitive xdr_pointer is used to  implement  the  XDR  union
above.


bool_t
xdr_gnumbers_node(xdrs, gn)
        XDR *xdrs;
        struct gnumbers_node *gn;
{
        return(xdr_gnumbers(xdrs, &gn->gn_numbers) &&
                xdr_gnumbers_list(xdrs, &gn->gn_next));
}
bool_t
xdr_gnumbers_list(xdrs, gnp)
        XDR *xdrs;
        gnumbers_list *gnp;
{
        return(xdr_pointer(xdrs, gnp,
                sizeof(struct gnumbers_node),
                xdr_gnumbers_node));
}


The unfortunate side effect of XDR'ing a list with these
routines is that the C stack grows linearly with respect to
the number of nodes in the list.  This is due to the recursion.
The following routine collapses the above two mutually
recursive routines into a single, non-recursive one.


bool_t
xdr_gnumbers_list(xdrs, gnp)
        XDR *xdrs;
        gnumbers_list *gnp;
{
        bool_t more_data;
        gnumbers_list *nextp;
        for (;;) {
                more_data = (*gnp != NULL);
                if (!xdr_bool(xdrs, &more_data)) {
                        return(FALSE);
                }
                if (! more_data) {
                        break;
                }
                if (xdrs->x_op == XDR_FREE) {
                        nextp = &(*gnp)->gn_next;
                }
                if (!xdr_reference(xdrs, gnp,
                        sizeof(struct gnumbers_node), xdr_gnumbers)) {
                        return(FALSE);
                }
                gnp = (xdrs->x_op == XDR_FREE) ?
                        nextp : &(*gnp)->gn_next;
        }
        *gnp = NULL;
        return(TRUE);
}


The first task is to find out whether there is more data  or
not,  so  that  this  boolean information can be serialized.
Notice that this statement is unnecessary in the  XDR_DECODE
case,  since  the  value  of more_data is not known until we
deserialize it in the next statement.

The next statement XDR's the more_data field of the XDR
union.  Then if there is truly no more data, we set this last
pointer to NULL to indicate the end of the list, and return
TRUE because we are done.  Note that setting the pointer to
NULL is only important in the XDR_DECODE case, since it is
already NULL in the XDR_ENCODE and XDR_FREE cases.

Next, if the direction is XDR_FREE, the value  of  nextp  is
set  to  indicate  the  location  of the next pointer in the
list. We do this now because we need to dereference  gnp  to
find  the  location  of the next item in the list, and after
the next statement the pointer gnp will be freed up  and  no
longer  valid.   We can't do this for all directions though,
because in the XDR_DECODE direction the value of  gnp  won't
be set until the next statement.


Next, we XDR the  data  in  the  node  using  the  primitive
xdr_reference.   xdr_reference  is like xdr_pointer which we
used before, but it does not send over the boolean  indicat-
ing  whether  there  is  more  data.  We  use  it instead of
xdr_pointer because we have already XDR'd  this  information
ourselves.  Notice  that  the  xdr routine passed is not the
same type as an element in the list. The routine  passed  is
xdr_gnumbers,  for XDR'ing gnumbers, but each element in the
list is actually  of  type  gnumbers_node.   We  don't  pass
xdr_gnumbers_node  because  it is recursive, and instead use
xdr_gnumbers which XDR's  all  of  the  non-recursive  part.
Note  that this trick will work only if the gn_numbers field
is the first item in each element, so that  their  addresses
are identical when passed to xdr_reference.

Finally, we update gnp to point to  the  next  item  in  the
list.  If the direction is XDR_FREE, we set it to the previ-
ously saved value, otherwise we can dereference gnp  to  get
the  proper  value.   Though  harder  to understand than the
recursive version, this  non-recursive  routine  will  never
cause  the  C  stack to blow up. It will also run more effi-
ciently since a lot of  procedure  call  overhead  has  been
removed.  Most  lists  are  small though (in the hundreds of
items or less) and the recursive version  should  be  suffi-
cient for them.

eXternal Data Representation Standard: Protocol Specification


1.  Status of this Standard

Note: This chapter specifies a protocol that  Sun  Microsys-
tems,  Inc.,  and  others are using.  It has been designated
RFC1014 by the ARPA Network Information Center.

2.  Introduction

XDR is a standard for the description and encoding of data.
It is useful for transferring data between different computer
architectures, and has been used to communicate data
between such diverse machines as the Sun Workstation, VAX,
IBM-PC, and Cray.  XDR fits into the ISO presentation layer,
and is roughly analogous in purpose to X.409, ISO Abstract
Syntax Notation.  The major difference between these two is
that XDR uses implicit typing, while X.409 uses explicit
typing.

XDR uses a language to describe data formats.  The language
can be used only to describe data; it is not a programming
language.  This language allows one to describe intricate
data formats in a concise manner.  The alternative of
using graphical representations (itself an informal
language) quickly becomes incomprehensible when faced with
complexity.  The XDR language itself is similar to the C
language [1], just as Courier [4] is similar to Mesa.
Protocols such as Sun RPC (Remote Procedure Call) and the NFS
(Network File System) use XDR to describe the format of
their data.

The XDR standard makes the following assumption: that  bytes
(or  octets)  are  portable, where a byte is defined to be 8
bits of data.  A given hardware  device  should  encode  the
bytes  onto  the  various  media  in  such  a way that other
hardware devices may decode the bytes without loss of  mean-
ing.  For example, the Ethernet standard suggests that bytes
be encoded in "little-endian" style [2], or  least  signifi-
cant bit first.

2.1.  Basic Block Size

The representation of all items requires a multiple of  four
bytes  (or  32  bits)  of  data.   The  bytes are numbered 0
through n-1.  The bytes are read or  written  to  some  byte
stream  such that byte m always precedes byte m+1.  If the n
bytes needed to contain the data are not a multiple of four,
then  the  n  bytes are followed by enough (0 to 3) residual
zero bytes, r, to make the total byte count a multiple of 4.

We include the familiar graphic box notation  for  illustra-
tion  and comparison.  In most illustrations, each box (del-
imited by a plus sign at the 4 corners and vertical bars and
dashes)  depicts  a byte.  Ellipses (...) between boxes show
zero or more additional bytes where required.

A Block

+--------+--------+...+--------+--------+...+--------+
| byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
+--------+--------+...+--------+--------+...+--------+
|<-----------n bytes---------->|<------r bytes------>|
|<-----------n+r (where (n+r) mod 4 = 0)>----------->|
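
The padding rule reduces to a small piece of arithmetic, which
may be sketched as follows.  The function names pad_residual and
padded_size are hypothetical.

```c
#include <stddef.h>

/* Number of residual zero bytes r (0 to 3) that follow n data bytes. */
size_t pad_residual(size_t n)
{
    return (4 - n % 4) % 4;
}

/* Total encoded size n + r, always a multiple of four. */
size_t padded_size(size_t n)
{
    return n + pad_residual(n);
}
```

For example, 5 data bytes are followed by 3 zero bytes for a
total of 8, while 4 data bytes need no padding at all.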


3.  XDR Data Types

Each of the sections that follow describes a data type
defined in the XDR standard, shows how it is declared in the
language, and includes a graphic illustration of its encoding.

For each data type in the language we show a  general  para-
digm declaration.  Note that angle brackets (< and >) denote
variable length sequences of data and square brackets ([ and
])  denote fixed-length sequences of data.  "n", "m" and "r"
denote integers.  For the full  language  specification  and
more  formal  definitions  of terms such as "identifier" and
"declaration", refer  to  The  XDR  Language  Specification,
below.

For some data types, more specific examples are included.  A
more  extensive example of a data description is in An Exam-
ple of an XDR Data Description, below.

3.1.  Integer

An XDR signed integer is a  32-bit  datum  that  encodes  an
integer  in the range [-2147483648,2147483647].  The integer
is represented in two's complement notation.  The  most  and
least significant bytes are 0 and 3, respectively.  Integers
are declared as follows:

        int identifier;

Integer

(MSB)                   (LSB)
+-------+-------+-------+-------+
|byte 0 |byte 1 |byte 2 |byte 3 |
+-------+-------+-------+-------+
<------------32 bits------------>
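
The byte order in the illustration can be sketched in C as
follows.  The routines encode_int and decode_int are hypothetical
helpers working on a 4-byte array; they are not XDR library
calls.

```c
#include <stdint.h>

/* Write v as 4 bytes: byte 0 is the most significant. */
void encode_int(unsigned char out[4], int32_t v)
{
    uint32_t u = (uint32_t)v;      /* two's complement bit pattern */

    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)(u >> 16);
    out[2] = (unsigned char)(u >> 8);
    out[3] = (unsigned char)u;
}

/* Read 4 bytes back into a signed 32-bit integer. */
int32_t decode_int(const unsigned char in[4])
{
    uint32_t u = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8) | (uint32_t)in[3];

    if (u < 0x80000000u)
        return (int32_t)u;
    /* negative: undo two's complement without overflowing */
    return (int32_t)(u - 0x80000000u) - 2147483647 - 1;
}
```

Because the bytes are produced by shifting rather than by
copying memory, the result does not depend on the byte order of
the host.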


3.2.  Unsigned Integer

An XDR unsigned integer is a 32-bit  datum  that  encodes  a
nonnegative  integer  in  the  range  [0,4294967295].  It is
represented by an unsigned  binary  number  whose  most  and
least  significant  bytes  are  0  and  3, respectively.  An
unsigned integer is declared as follows:

        unsigned int identifier;

Unsigned Integer

(MSB)                   (LSB)
+-------+-------+-------+-------+
|byte 0 |byte 1 |byte 2 |byte 3 |
+-------+-------+-------+-------+
<------------32 bits------------>


3.3.  Enumeration

Enumerations  have  the  same   representation   as   signed
integers.   Enumerations are handy for describing subsets of
the integers.  Enumerated data is declared as follows:

        enum { name-identifier = constant, ... } identifier;

For example, the three colors red, yellow, and blue could be
described by an enumerated type:

        enum { RED = 2, YELLOW = 3, BLUE = 5 } colors;


It is an error to encode as an enum any other  integer  than
those  that have been given assignments in the enum declara-
tion.
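
A receiver can enforce this rule with a simple membership check.
The sketch below uses the colors example from above; the function
name colors_valid is hypothetical.

```c
enum colors { RED = 2, YELLOW = 3, BLUE = 5 };

/* Returns 1 if v is one of the values assigned in the declaration. */
int colors_valid(int v)
{
    switch (v) {
    case RED:
    case YELLOW:
    case BLUE:
        return 1;
    default:
        return 0;   /* e.g. 4 lies between members but is not one */
    }
}
```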

3.4.  Boolean

Booleans are important enough and occur frequently enough to
warrant  their  own explicit type in the standard.  Booleans
are declared as follows:

        bool identifier;

This is equivalent to:

        enum { FALSE = 0, TRUE = 1 } identifier;


3.5.  Hyper Integer and Unsigned Hyper Integer

The standard also defines  64-bit  (8-byte)  numbers  called
hyper integer and unsigned hyper integer.  Their representa-
tions are the obvious extensions  of  integer  and  unsigned
integer  defined  above.  They are represented in two's com-
plement notation.  The most and least significant bytes  are
0 and 7, respectively.  Their declarations:

        hyper identifier;
        unsigned hyper identifier;

Hyper Integer
Unsigned Hyper Integer

(MSB)                                                   (LSB)
+-------+-------+-------+-------+-------+-------+-------+-------+
|byte 0 |byte 1 |byte 2 |byte 3 |byte 4 |byte 5 |byte 6 |byte 7 |
+-------+-------+-------+-------+-------+-------+-------+-------+
<----------------------------64 bits---------------------------->
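
The same layout extended to eight bytes can be sketched as
follows.  The names encode_uhyper and decode_uhyper are
hypothetical; a signed hyper can be passed through the same
routines after conversion to uint64_t.

```c
#include <stdint.h>

/* Write v as 8 bytes: byte 0 is the most significant. */
void encode_uhyper(unsigned char out[8], uint64_t v)
{
    int i;

    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Read 8 bytes back into an unsigned 64-bit integer. */
uint64_t decode_uhyper(const unsigned char in[8])
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}
```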


3.6.  Floating-point

The standard defines the floating-point  data  type  "float"
(32  bits  or 4 bytes).  The encoding used is the IEEE stan-
dard for normalized single-precision floating-point  numbers
[3].   The  following  three  fields  describe  the  single-
precision floating-point number:

     S:   The sign of the number.  Values 0 and  1 represent
          positive and negative, respectively.  One bit.

     E:   The exponent of the number, base 2.  8   bits  are
          devoted  to this field.  The exponent is biased by
          127.

     F:   The fractional  part  of  the  number's  mantissa,
          base 2.   23 bits are devoted to this field.


Therefore, the floating-point number is described by:

        (-1)**S * 2**(E-Bias) * 1.F

It is declared as follows:

        float identifier;

Single-Precision Floating-Point

+-------+-------+-------+-------+
|byte 0 |byte 1 |byte 2 |byte 3 |
S|   E   |           F          |
+-------+-------+-------+-------+
1|<- 8 ->|<-------23 bits------>|
<------------32 bits------------>

Just as the most and least significant bytes of a number are
0 and 3, the most and least significant bits of a single-precision
floating-point number are 0 and 31.  The beginning
bit (and most significant bit) offsets of S, E, and F
are 0, 1, and 9, respectively.  Note that these numbers
refer to the mathematical positions of the bits, and NOT to
their actual physical locations (which vary from medium to
medium).
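
The three fields can be pulled out of a value with shifts and
masks.  The sketch below assumes that the host's float is itself
an IEEE single-precision number, which is common but not
guaranteed by C; struct float_fields and split_float are
hypothetical names.

```c
#include <stdint.h>
#include <string.h>

struct float_fields {
    unsigned s;     /* sign, 1 bit */
    unsigned e;     /* biased exponent, 8 bits */
    uint32_t f;     /* fraction, 23 bits */
};

struct float_fields split_float(float x)
{
    struct float_fields ff;
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);   /* assumes IEEE single, 32 bits */
    ff.s = bits >> 31;                /* bit offset 0 from the MSB */
    ff.e = (bits >> 23) & 0xff;       /* bit offsets 1 through 8 */
    ff.f = bits & 0x7fffff;           /* bit offsets 9 through 31 */
    return ff;
}
```

For instance, 1.0 has S = 0, E = 127 and F = 0, because the
exponent 0 is stored with the bias of 127 added and the leading 1
of the mantissa is implied.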

The IEEE specifications should be consulted  concerning  the
encoding  for  signed  zero, signed infinity (overflow), and
denormalized numbers (underflow)  [3].   According  to  IEEE
specifications, the "NaN" (not a number) is system dependent
and should not be used externally.

3.7.  Double-precision Floating-point

The standard defines the encoding for the double-precision
floating-point data type "double" (64 bits or 8 bytes).
The encoding used is the IEEE standard for normalized
double-precision floating-point numbers [3].  The standard
encodes the following three fields, which describe the
double-precision floating-point number:

     S:   The  sign  of  the  number.   Values   0   and   1
          represent  positive  and  negative,  respectively.
          One bit.

     E:   The exponent of the number, base 2.  11  bits  are
          devoted  to this field.  The exponent is biased by
          1023.

     F:   The fractional part  of  the  number's   mantissa,
          base 2.   52 bits are devoted to this field.

Therefore, the floating-point number is described by:

        (-1)**S * 2**(E-Bias) * 1.F


It is declared as follows:

        double identifier;

Double-Precision Floating-Point

+------+------+------+------+------+------+------+------+
|byte 0|byte 1|byte 2|byte 3|byte 4|byte 5|byte 6|byte 7|
S|    E   |                    F                        |
+------+------+------+------+------+------+------+------+
1|<--11-->|<-----------------52 bits------------------->|
<-----------------------64 bits------------------------->

Just as the most and least significant bytes of a number are
0 and 3, the most and least significant bits of a
double-precision floating-point number are 0 and 63.  The
beginning bit (and most significant bit) offsets of S, E,
and F are 0, 1, and 12, respectively.  Note that these
numbers refer to the mathematical positions of the bits, and
NOT to their actual physical locations (which vary from
medium to medium).

The IEEE specifications should be consulted  concerning  the
encoding  for  signed  zero, signed infinity (overflow), and
denormalized numbers (underflow)  [3].   According  to  IEEE
specifications, the "NaN" (not a number) is system dependent
and should not be used externally.
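
As a rough illustration (not part of the standard), the
following C sketch writes a double in the byte order shown
above and then recovers S, E and F from the encoded bytes.
The names xdr_put_double, xdr_split_double and struct
xdr_double_fields are invented for this example, and the
code assumes that the host's "double" is already an IEEE
64-bit value.

```c
#include <stdint.h>
#include <string.h>

/* Fields of an encoded double, named as in the text above. */
struct xdr_double_fields {
    unsigned sign;      /* S: 1 bit                    */
    unsigned exponent;  /* E: 11 bits, biased by 1023  */
    uint64_t fraction;  /* F: 52 bits                  */
};

/*
 * Hypothetical helper: write d into out[0..7], most significant
 * byte first.  Assumes the host "double" is IEEE 754 binary64.
 */
void xdr_put_double(double d, unsigned char out[8])
{
    uint64_t bits;
    int i;

    memcpy(&bits, &d, sizeof bits);     /* reinterpret, no conversion */
    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)(bits >> (56 - 8 * i));
}

/* Hypothetical helper: split the 8 encoded bytes into S, E and F. */
struct xdr_double_fields xdr_split_double(const unsigned char in[8])
{
    struct xdr_double_fields f;
    uint64_t bits = 0;
    int i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];
    f.sign     = (unsigned)(bits >> 63);               /* bit 0      */
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);     /* bits 1-11  */
    f.fraction = bits & (((uint64_t)1 << 52) - 1);     /* bits 12-63 */
    return f;
}
```

For instance, -2.5 is (-1)**1 * 2**(1024-1023) * 1.25, so
the sketch yields S = 1, E = 1024 and an F whose leading
bits are 01.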

3.8.  Fixed-length Opaque Data

At times, fixed-length uninterpreted data needs to be passed
among  machines.   This  data  is  called  "opaque"  and  is
declared as follows:

        opaque identifier[n];

where the constant n is the (static) number of bytes  neces-
sary  to contain the opaque data.  If n is not a multiple of
four, then the n bytes are followed by enough (0 to 3) resi-
dual  zero  bytes,  r,  to  make the total byte count of the
opaque object a multiple of four.

Fixed-Length Opaque

0        1     ...
+--------+--------+...+--------+--------+...+--------+
| byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
+--------+--------+...+--------+--------+...+--------+
|<-----------n bytes---------->|<------r bytes------>|
|<-----------n+r (where (n+r) mod 4 = 0)------------>|
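
The residual byte count r follows directly from n.  The
sketch below, which assumes nothing beyond the rule stated
above, computes r and writes the padded object; the names
xdr_pad and xdr_put_opaque_fixed are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Number of residual zero bytes, r (0 to 3), that follow n data bytes. */
size_t xdr_pad(size_t n)
{
    return (4 - n % 4) % 4;
}

/*
 * Hypothetical helper: encode n opaque bytes into out, which must
 * have room for n + xdr_pad(n) bytes.  Returns the encoded size.
 */
size_t xdr_put_opaque_fixed(const void *data, size_t n, unsigned char *out)
{
    size_t r = xdr_pad(n);

    memcpy(out, data, n);
    memset(out + n, 0, r);      /* residual bytes are always zero */
    return n + r;
}
```

With n = 5, for example, r is 3 and the encoded object
occupies 8 bytes.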


3.9.  Variable-length Opaque Data

The standard also provides for variable-length (counted)
opaque data, defined as a sequence of n (numbered 0 through
n-1) arbitrary bytes to be the number n encoded as an
unsigned integer (as described above), and followed by the n
bytes of the sequence.

Byte m of the sequence always precedes byte m+1 of the
sequence, and byte 0 of the sequence always follows the
sequence's length (count).  If n is not a multiple of four,
then the n bytes are followed by enough (0 to 3) residual
zero bytes, r, to make the total byte count a multiple of
four.  Variable-length opaque data is declared in the
following way:

        opaque identifier<m>;

or

        opaque identifier<>;

The constant m denotes an upper bound of the number of bytes
that the sequence may contain.  If m is not specified, as in
the second declaration, it is assumed to be (2**32) - 1, the
maximum length.  The constant m would normally be found in a
protocol specification.  For example, a filing protocol  may
state  that the maximum data transfer size is 8192 bytes, as
follows:

        opaque filedata<8192>;

This can be illustrated as follows:

Variable-Length Opaque

   0     1     2     3     4     5   ...
+-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
|        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
+-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
|<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
                        |<----n+r (where (n+r) mod 4 = 0)---->|


It   is  an error  to  encode  a  length  greater  than  the
maximum described in the specification.
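
A minimal encoder for counted opaque data might look like
the sketch below.  The function names are invented for this
example; the caller is assumed to supply an output buffer of
at least 4 + n + r bytes, and the declared maximum m is
passed as the parameter max.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit unsigned integer, most significant byte first. */
static void put_u32(uint32_t v, unsigned char *out)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/*
 * Hypothetical helper: encode "opaque data<max>".  Returns the
 * encoded size, or 0 if n exceeds the declared maximum (which is
 * an encoding error).
 */
size_t xdr_put_opaque_var(const void *data, uint32_t n, uint32_t max,
                          unsigned char *out)
{
    size_t r = (4 - n % 4) % 4;         /* residual zero bytes */

    if (n > max)
        return 0;
    put_u32(n, out);                    /* length n            */
    memcpy(out + 4, data, n);           /* bytes 0 .. n-1      */
    memset(out + 4 + n, 0, r);          /* zero fill           */
    return 4 + (size_t)n + r;
}
```

A return value of 0 is unambiguous as an error indication
here, since even an empty sequence encodes to 4 bytes.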

3.10.  String

The standard defines a string of n (numbered 0 through  n-1)
ASCII  bytes  to  be  the  number  n  encoded as an unsigned
integer (as described above), and followed by the n bytes of
the  string.   Byte m of the string always precedes byte m+1
of the string, and byte 0 of the string always  follows  the
string's length.  If n is not a multiple of four, then the n
bytes are followed by enough (0 to 3) residual  zero  bytes,
r, to make the total byte count a multiple of four.  Counted
byte strings are declared as follows:


        string object<m>;

or

        string object<>;

The constant m denotes an upper bound of the number of bytes
that a string may contain.  If m is not specified, as in the
second declaration, it is assumed to be  (2**32)  -  1,  the
maximum length.  The constant m would normally be found in a
protocol specification.  For example, a filing protocol  may
state  that  a file name can be no longer than 255 bytes, as
follows:

        string filename<255>;

This can be illustrated as follows:

A String

   0     1     2     3     4     5   ...
+-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
|        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
+-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
|<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
                        |<----n+r (where (n+r) mod 4 = 0)---->|


It   is an  error  to  encode  a length greater  than    the
maximum described in the specification.
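
Decoding is the mirror image of encoding.  The sketch below
shows one way a receiver might unpack a string while
enforcing the declared maximum.  The name xdr_get_string and
its parameters are hypothetical, and rejecting non-zero fill
bytes is a choice made for this example rather than a
requirement stated above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: decode "string object<max>" from
 * in[0..len-1] into dst (capacity dstcap; NUL-terminated on
 * success).  Returns the number of encoded bytes consumed, or 0
 * on error.
 */
size_t xdr_get_string(const unsigned char *in, size_t len, uint32_t max,
                      char *dst, size_t dstcap)
{
    uint32_t n;
    size_t r, i;

    if (len < 4)
        return 0;                       /* no room for the length */
    n = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
        ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    r = (4 - n % 4) % 4;
    if (n > max || n >= dstcap || len - 4 < (size_t)n + r)
        return 0;                       /* too long, or input truncated */
    for (i = 0; i < r; i++)
        if (in[4 + n + i] != 0)
            return 0;                   /* this sketch rejects non-zero fill */
    memcpy(dst, in + 4, n);
    dst[n] = '\0';
    return 4 + (size_t)n + r;
}
```

The return value tells the caller where the next encoded
item begins.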

3.11.  Fixed-length Array

Declarations for fixed-length arrays of homogeneous elements
are in the following form:

        type-name identifier[n];

Fixed-length arrays of elements numbered 0 through  n-1  are
encoded  by  individually encoding the elements of the array
in their natural order, 0 through n-1.  Each element's  size
is  a multiple of four bytes. Though all elements are of the
same type, the elements may have different sizes.  For exam-
ple, in a fixed-length array of strings, all elements are of
type "string", yet each element will vary in its length.

Fixed-Length Array

+---+---+---+---+---+---+---+---+...+---+---+---+---+
|   element 0   |   element 1   |...|  element n-1  |
+---+---+---+---+---+---+---+---+...+---+---+---+---+
|<--------------------n elements------------------->|


3.12.  Variable-length Array

Counted arrays provide the ability to encode variable-length
arrays of homogeneous elements.  The array is encoded as the
element count n (an unsigned integer) followed by the
encoding of each of the array's elements, starting with
element 0 and progressing through element n-1.  The
declaration for variable-length arrays follows this form:

        type-name identifier<m>;

or

        type-name identifier<>;

The constant m  specifies  the  maximum  acceptable  element
count of an array; if  m is not specified, as  in the second
declaration, it is assumed to be (2**32) - 1.

Counted Array

0  1  2  3
+--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
|     n     | element 0 | element 1 |...|element n-1|
+--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
|<-4 bytes->|<--------------n elements------------->|

It is  an error to  encode  a  value of n that   is  greater
than the maximum described in the specification.
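
For a concrete case, consider a declaration such as
"int values<m>".  The sketch below encodes such an array;
the name xdr_put_int_array is hypothetical, and the output
buffer is assumed to hold at least 4 + 4*n bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Write a 32-bit unsigned integer, most significant byte first. */
static void put_u32(uint32_t v, unsigned char *out)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/*
 * Hypothetical helper: encode "int values<max>".  Returns the
 * encoded size, or 0 if n exceeds the declared maximum.
 */
size_t xdr_put_int_array(const int32_t *v, uint32_t n, uint32_t max,
                         unsigned char *out)
{
    uint32_t i;

    if (n > max)
        return 0;
    put_u32(n, out);                            /* element count n   */
    for (i = 0; i < n; i++)                     /* elements 0 .. n-1 */
        put_u32((uint32_t)v[i], out + 4 + 4 * (size_t)i);
    return 4 + 4 * (size_t)n;
}
```

Arrays of other element types follow the same shape: the
count first, then each element encoded in turn by that
type's own encoder.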

3.13.  Structure

Structures are declared as follows:

        struct {
                component-declaration-A;
                component-declaration-B;
                ...
        } identifier;

The components of the structure are encoded in the order  of
their  declaration  in the structure.  Each component's size
is a multiple of four bytes, though the  components  may  be
different sizes.

Structure

+-------------+-------------+...
| component A | component B |...
+-------------+-------------+...


3.14.  Discriminated Union

A discriminated union is a type composed of  a  discriminant
followed  by a type selected from a set of prearranged types
according to the value of the  discriminant.   The  type  of
discriminant   is   either  "int",  "unsigned  int",  or  an
enumerated type, such as "bool".  The  component  types  are
called "arms" of the union, and are preceded by the value of
the discriminant which  implies  their  encoding.   Discrim-
inated unions are declared as follows:

        union switch (discriminant-declaration) {
                case discriminant-value-A:
                        arm-declaration-A;
                case discriminant-value-B:
                        arm-declaration-B;
                ...
                default:
                        default-declaration;
        } identifier;

Each "case" keyword is followed by  a  legal  value  of  the
discriminant.   The  default  arm is optional.  If it is not
specified, then a valid encoding of the union cannot take on
unspecified  discriminant  values.   The size of the implied
arm is always a multiple of four bytes.

The discriminated union is encoded as its discriminant  fol-
lowed by the encoding of the implied arm.

Discriminated Union

0   1   2   3
+---+---+---+---+---+---+---+---+
|  discriminant |  implied arm  |
+---+---+---+---+---+---+---+---+
|<---4 bytes--->|
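
To make the encoding rule concrete, suppose a protocol
declared the (hypothetical) union "union result switch (int
status) { case 0: int value; default: void; };".  A C
encoder for it could be sketched as follows; struct result
and xdr_put_result are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Write a 32-bit unsigned integer, most significant byte first. */
static void put_u32(uint32_t v, unsigned char *out)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* C view of the hypothetical union described in the text. */
struct result {
    int32_t status;     /* discriminant                           */
    int32_t value;      /* arm for status == 0; otherwise unused  */
};

/* Encode the discriminant, then only the arm that it implies. */
size_t xdr_put_result(const struct result *r, unsigned char *out)
{
    size_t n = 0;

    put_u32((uint32_t)r->status, out + n);
    n += 4;
    if (r->status == 0) {               /* case 0: int value */
        put_u32((uint32_t)r->value, out + n);
        n += 4;
    }                                   /* default: void, nothing follows */
    return n;
}
```

The encoded size therefore depends on the discriminant: 8
bytes when status is 0, and 4 bytes otherwise.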


3.15.  Void

An XDR void is a 0-byte  quantity.   Voids  are  useful  for
describing  operations that take no data as input or no data
as output. They are also useful in unions, where  some  arms
may contain data and others do not.  The declaration is sim-
ply as follows:

        void;

Voids are illustrated as follows:


Void

  ++
  ||
  ++
--><-- 0 bytes


3.16.  Constant

The data declaration for a constant follows this form:

        const name-identifier = n;

"const" is used to define a symbolic name for a constant; it
does  not  declare  any  data.  The symbolic constant may be
used anywhere a regular constant may be used.  For  example,
the  following  defines  a symbolic constant DOZEN, equal to
12.

        const DOZEN = 12;


3.17.  Typedef

"typedef" does not declare any data either,  but  serves  to
define new identifiers for declaring data. The syntax is:

        typedef declaration;

The new type name is  actually  the  variable  name  in  the
declaration part of the typedef.  For example, the following
defines a new type called "eggbox" using  an  existing  type
called "egg":

        typedef egg eggbox[DOZEN];

Variables declared using the new type  name  have  the  same
type  as  the new type name would have in the typedef, if it
were considered a variable.  For example, the following  two
declarations   are  equivalent  in  declaring  the  variable
"fresheggs":

        eggbox  fresheggs;
        egg     fresheggs[DOZEN];

When a typedef involves a struct, enum, or union definition,
there  is  another  (preferred)  syntax  that may be used to
define the same type.  In general, a typedef of the  follow-
ing form:

        typedef <<struct, union, or enum definition>> identifier;


may be converted to the alternative  form  by  removing  the
"typedef"   part   and  placing  the  identifier  after  the
"struct", "union", or "enum" keyword, instead of at the end.
For  example,  here  are  the  two  ways  to define the type
"bool":

        typedef enum {    /* using typedef */
                FALSE = 0,
                TRUE = 1
                } bool;

        enum bool {       /* preferred alternative */
                FALSE = 0,
                TRUE = 1
                };

This syntax is preferred because one does not have to wait
until the end of a declaration to figure out the name of the
new type.

3.18.  Optional-data

Optional-data is one kind of union that occurs so frequently
that it is given a special syntax of its own.  It is declared
as follows:

        type-name *identifier;

This is equivalent to the following union:

        union switch (bool opted) {
                case TRUE:
                type-name element;
                case FALSE:
                void;
        } identifier;

It is also equivalent to the following variable-length array
declaration, since the boolean "opted" can be interpreted as
the length of the array:

        type-name identifier<1>;

Optional-data is not so interesting in  itself,  but  it  is
very useful for describing recursive data-structures such as
linked-lists and trees.  For example, the following  defines
a  type  "stringlist" that encodes lists of arbitrary length
strings:

        struct *stringlist {
                string item<>;
                stringlist next;
        };


It could have been equivalently declared  as  the  following
union:

        union stringlist switch (bool opted) {
                case TRUE:
                        struct {
                                string item<>;
                                stringlist next;
                        } element;
                case FALSE:
                        void;
        };

or as a variable-length array:

        struct stringlist<1> {
                string item<>;
                stringlist next;
        };

Both of these declarations  obscure  the  intention  of  the
stringlist  type,  so  the optional-data declaration is pre-
ferred over both of them.  The optional-data type also has a
close  correlation  to  how  recursive  data  structures are
represented in high-level languages such as Pascal or  C  by
use  of pointers. In fact, the syntax is the same as that of
the C language for pointers.
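
The correspondence with C pointers can be sketched as
follows.  Each list node is encoded as the boolean TRUE
followed by its string, and the end of the list (a null
pointer in C) is encoded as the boolean FALSE.  The names
struct stringnode and xdr_put_stringlist are invented for
this example, and the output buffer is assumed to be large
enough.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit unsigned integer, most significant byte first. */
static void put_u32(uint32_t v, unsigned char *out)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* C view of a "stringlist": NULL marks the end of the list. */
struct stringnode {
    const char *item;
    struct stringnode *next;
};

/* Hypothetical helper: encode the whole list; returns the size. */
size_t xdr_put_stringlist(const struct stringnode *list, unsigned char *out)
{
    size_t off = 0;

    for (; list != NULL; list = list->next) {
        size_t n = strlen(list->item);
        size_t r = (4 - n % 4) % 4;

        put_u32(1, out + off);                  /* opted = TRUE  */
        off += 4;
        put_u32((uint32_t)n, out + off);        /* string length */
        off += 4;
        memcpy(out + off, list->item, n);       /* string bytes  */
        off += n;
        memset(out + off, 0, r);                /* zero fill     */
        off += r;
    }
    put_u32(0, out + off);                      /* opted = FALSE */
    off += 4;
    return off;
}
```

An empty list thus encodes to a single 4-byte FALSE.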

3.19.  Areas for Future Enhancement

The XDR standard lacks representations for  bit  fields  and
bitmaps, since the standard is based on bytes.  Also missing
are packed (or binary-coded) decimals.

The intent of the XDR standard was  not  to  describe  every
kind of data that people have ever sent or will ever want to
send from machine to machine. Rather, it only describes  the
most  commonly  used data-types of high-level languages such
as Pascal  or  C  so  that  applications  written  in  these
languages  will  be  able  to  communicate  easily over some
medium.

One could imagine extensions to XDR that would let it
describe almost any existing protocol, such as TCP.  The
minimum necessary for this is support for different block
sizes and byte-orders.  The XDR discussed here could then be
considered the 4-byte big-endian member of a larger XDR
family.

4.  Discussion


4.1.  Why a Language for Describing Data?

There are many advantages in using a data-description
language such as XDR versus using diagrams.  Languages are
more formal than diagrams and lead to less ambiguous
descriptions of data.  Languages are also easier to
understand and allow one to think of other issues instead of
the low-level details of bit-encoding.  Also, there is a
close analogy between the types of XDR and a high-level
language such as C or Pascal.  This makes the implementation
of XDR encoding and decoding modules an easier task.
Finally, the language specification itself is an ASCII
string that can be passed from machine to machine to perform
on-the-fly data interpretation.

4.2.  Why Only one Byte-Order for an XDR Unit?

Supporting two byte-orderings requires a higher level proto-
col for determining in which byte-order the data is encoded.
Since XDR is not a protocol, this can't be done.  The advan-
tage  of  this,  though,  is  that data in XDR format can be
written to a magnetic tape, for  example,  and  any  machine
will be able to interpret it, since no higher level protocol
is necessary for determining the byte-order.

4.3.  Why does XDR use Big-Endian Byte-Order?

Yes, it is unfair, but having only one byte-order means  you
have  to be unfair to somebody.  Many architectures, such as
the Motorola 68000  and  IBM  370,  support  the  big-endian
byte-order.

4.4.  Why is the XDR Unit Four Bytes Wide?

There is a tradeoff in choosing the XDR unit size.  Choosing
a  small  size such as two makes the encoded data small, but
causes alignment problems for machines that  aren't  aligned
on  these  boundaries.  A large size such as eight means the
data will be aligned on virtually every machine, but  causes
the  encoded  data  to  grow  too  big.   We chose four as a
compromise.  Four is big enough to  support  most  architec-
tures  efficiently,  except  for  rare  machines such as the
eight-byte aligned Cray.  Four is also small enough to  keep
the encoded data restricted to a reasonable size.

4.5.  Why must Variable-Length Data be Padded with Zeros?

It is desirable that the same  data  encode  into  the  same
thing  on all machines, so that encoded data can be meaning-
fully compared or checksummed.  Forcing the padded bytes  to
be zero ensures this.


4.6.  Why is there No Explicit Data-Typing?

Data-typing has a relatively high cost for what small advan-
tages it may have.  One cost is the expansion of data due to
the inserted type fields.  Another  is  the  added  cost  of
interpreting  these type fields and acting accordingly.  And
most protocols already know what type they expect, so  data-
typing  supplies  only  redundant information.  However, one
can still get the benefits of data-typing using XDR. One way
is  to  encode  two  things: first a string which is the XDR
data description of the encoded data, and then  the  encoded
data  itself.   Another  way is to assign a value to all the
types in XDR, and then define a universal type  which  takes
this value as its discriminant and for each value, describes
the corresponding data type.

5.  The XDR Language Specification


5.1.  Notational Conventions

This specification  uses an extended Backus-Naur Form  nota-
tion  for  describing  the  XDR language.   Here is  a brief
description  of the notation:

1.   The characters |, (, ), [, ], ", and * are special.

2.   Terminal symbols are  strings of any   characters  sur-
     rounded by double quotes.

3.   Non-terminal symbols are strings of non-special charac-
     ters.

4.   Alternative items  are  separated  by  a  vertical  bar
     ("|").

5.   Optional items are enclosed in brackets.

6.   Items  are  grouped  together  by  enclosing  them   in
     parentheses.

7.   A * following an item means  0 or more  occurrences  of
     that item.

For example, consider the following pattern:

"a " "very" (", " "very")* [" cold " "and"] " rainy " ("day" | "night")


An infinite number of strings match this pattern.  A few of
them are:

        "a very rainy day"
        "a very, very rainy day"
        "a very cold and rainy day"
        "a very, very, very cold and rainy night"


5.2.  Lexical Notes

1.   Comments begin with '/*' and terminate with '*/'.

2.   White space serves to separate items and  is  otherwise
     ignored.

3.   An identifier is a  letter  followed  by   an  optional
     sequence  of  letters,  digits  or underbar ('_').  The
     case of identifiers is not ignored.

4.   A  constant is  a  sequence  of  one  or  more  decimal
     digits, optionally preceded by a minus-sign ('-').

5.3.  Syntax Information

        declaration:
                type-specifier identifier
                | type-specifier identifier "[" value "]"
                | type-specifier identifier "<" [ value ] ">"
                | "opaque" identifier "[" value "]"
                | "opaque" identifier "<" [ value ] ">"
                | "string" identifier "<" [ value ] ">"
                | type-specifier "*" identifier
                | "void"


        value:
                constant
                | identifier

        type-specifier:
                  [ "unsigned" ] "int"
                | [ "unsigned" ] "hyper"
                | "float"
                | "double"
                | "bool"
                | enum-type-spec
                | struct-type-spec
                | union-type-spec
                | identifier


        enum-type-spec:
                "enum" enum-body

        enum-body:
                "{"
                ( identifier "=" value )
                ( "," identifier "=" value )*
                "}"


        struct-type-spec:
                "struct" struct-body

        struct-body:
                "{"
                ( declaration ";" )
                ( declaration ";" )*
                "}"


        union-type-spec:
                "union" union-body

        union-body:
                "switch" "(" declaration ")" "{"
                ( "case" value ":" declaration ";" )
                ( "case" value ":" declaration ";" )*
                [ "default" ":" declaration ";" ]
                "}"

        constant-def:
                "const" identifier "=" constant ";"


        type-def:
                "typedef" declaration ";"
                | "enum" identifier enum-body ";"
                | "struct" identifier struct-body ";"
                | "union" identifier union-body ";"

        definition:
                type-def
                | constant-def

        specification:
                definition *


5.3.1.  Syntax Notes


1.   The following are keywords and cannot be used as  iden-
     tifiers:  "bool", "case", "const", "default", "double",
     "enum", "float", "hyper", "opaque", "string", "struct",
     "switch", "typedef", "union", "unsigned" and "void".

2.   Only unsigned constants may be used as size  specifica-
     tions  for  arrays.   If an identifier is used, it must
     have been declared previously as an  unsigned  constant
     in a "const" definition.

3.   Constant and type identifiers within  the  scope  of  a
     specification  are  in  the same name space and must be
     declared uniquely within this scope.

4.   Similarly, variable names must  be unique  within   the
     scope   of struct and union declarations. Nested struct
     and union declarations create new scopes.

5.   The discriminant of a union must  be  of  a  type  that
     evaluates  to  an  integer.  That  is, "int", "unsigned
     int", "bool", an enumerated type or any typedefed  type
     that  evaluates  to  one  of these is legal.  Also, the
     case values must be one of  the  legal  values  of  the
     discriminant.   Finally, a case value may not be speci-
     fied more  than  once  within  the  scope  of  a  union
     declaration.

6.  An Example of an XDR Data Description

Here is a short XDR data description of  a  thing  called  a
"file",  which  might  be  used  to  transfer files from one
machine to another.


const MAXUSERNAME = 32;     /* max length of a user name */
const MAXFILELEN = 65535;   /* max length of a file      */
const MAXNAMELEN = 255;     /* max length of a file name */

/*
 * Types of files:
 */

enum filekind {
        TEXT = 0,       /* ascii data */
        DATA = 1,       /* raw data   */
        EXEC = 2        /* executable */
};

/*
 * File information, per kind of file:
 */

union filetype switch (filekind kind) {
        case TEXT:
                void;                           /* no extra information */
        case DATA:
                string creator<MAXNAMELEN>;     /* data creator         */
        case EXEC:
                string interpretor<MAXNAMELEN>; /* program interpretor  */
};

/*
 * A complete file:
 */

struct file {
        string filename<MAXNAMELEN>; /* name of file */
        filetype type;               /* info about file */
        string owner<MAXUSERNAME>;   /* owner of file   */
        opaque data<MAXFILELEN>;     /* file data       */
};


Suppose now that there is  a user named  "john" who wants to
store  his  lisp program "sillyprog" that contains just  the
data "(quit)".  His file would be encoded as follows:


______________________________________________________________
 Offset   Hex Bytes     ASCII   Description
______________________________________________________________
      0   00 00 00 09    ....   Length of filename = 9
      4   73 69 6c 6c    sill   Filename characters
      8   79 70 72 6f    ypro    ... and more characters ...
     12   67 00 00 00    g...    ... and 3 zero-bytes of fill
     16   00 00 00 02    ....   Filekind is EXEC = 2
     20   00 00 00 04    ....   Length of interpretor = 4
     24   6c 69 73 70    lisp   Interpretor characters
     28   00 00 00 04    ....   Length of owner = 4
     32   6a 6f 68 6e    john   Owner characters
     36   00 00 00 06    ....   Length of file data = 6
     40   28 71 75 69    (qui   File data bytes ...
     44   74 29 00 00    t)..    ... and 2 zero-bytes of fill
______________________________________________________________
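
The table can be reproduced mechanically.  The following C
sketch encodes a "file" as described above; the helper names
put_u32, put_bytes and encode_file are invented for this
example, the checks against MAXNAMELEN, MAXUSERNAME and
MAXFILELEN are omitted for brevity, and the output buffer is
assumed to be large enough.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum filekind { TEXT = 0, DATA = 1, EXEC = 2 };

/* Write a 32-bit unsigned integer, big-endian; returns 4. */
static size_t put_u32(uint32_t v, unsigned char *out)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
    return 4;
}

/* Length, bytes, zero fill: serves both string<> and opaque<>. */
static size_t put_bytes(const void *p, size_t n, unsigned char *out)
{
    size_t r = (4 - n % 4) % 4;

    put_u32((uint32_t)n, out);
    memcpy(out + 4, p, n);
    memset(out + 4 + n, 0, r);
    return 4 + n + r;
}

/*
 * Hypothetical helper: encode a "struct file".  arm is the
 * creator (DATA) or interpretor (EXEC) and is ignored for TEXT.
 */
size_t encode_file(const char *filename, enum filekind kind,
                   const char *arm, const char *owner,
                   const void *data, size_t datalen,
                   unsigned char *out)
{
    size_t off = 0;

    off += put_bytes(filename, strlen(filename), out + off);
    off += put_u32((uint32_t)kind, out + off);  /* discriminant    */
    if (kind != TEXT)                           /* TEXT arm is void */
        off += put_bytes(arm, strlen(arm), out + off);
    off += put_bytes(owner, strlen(owner), out + off);
    off += put_bytes(data, datalen, out + off);
    return off;
}
```

Calling encode_file("sillyprog", EXEC, "lisp", "john",
"(quit)", 6, buf) produces the 48 bytes listed in the table.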

|


                                                             |


7.  References

[1]  Brian W. Kernighan & Dennis M. Ritchie, "The C Program-
ming  Language", Bell Laboratories, Murray Hill, New Jersey,
1978.

[2]  Danny Cohen, "On Holy Wars and a Plea for Peace",  IEEE
Computer, October 1981.

[3]  "IEEE Standard for Binary  Floating-Point  Arithmetic",
ANSI/IEEE  Standard  754-1985,  Institute  of Electrical and
Electronics Engineers, August 1985.

[4]  "Courier: The Remote Procedure  Call  Protocol",  XEROX
Corporation, XSIS 038112, December 1981.

Remote Procedure Calls: Protocol Specification


1.  Status of this Memo

Note: This chapter specifies a protocol that  Sun  Microsys-
tems,  Inc., and others are using.  It has been submitted to
the ARPA-Internet for  consideration  as  an  RFC.   Certain
details  may  change as a result of comments made during the
review of this draft standard.


2.  Introduction

This chapter specifies a message protocol used in
implementing Sun's Remote Procedure Call (RPC) package.  (The
message protocol is specified with the eXternal Data
Representation (XDR) language.  See the eXternal Data
Representation Standard for the details.  Here, we assume
that the reader is familiar with XDR and do not attempt to
justify RPC or its uses.)  The paper by Birrell and Nelson
[1] is recommended as an excellent background to and
justification of RPC.

2.1.  Terminology

This chapter discusses  servers,  services,  programs,  pro-
cedures,  clients,  and  versions.   A  server is a piece of
software where network services are implemented.  A  network
service  is  a collection of one or more remote programs.  A
remote program implements one or more remote procedures; the
procedures,  their parameters, and results are documented in
the specific program's protocol specification (see the  Port
Mapper  Program  Protocol,  below, for an example).  Network
clients are pieces of software  that  initiate  remote  pro-
cedure  calls  to  services.  A server may support more than
one version of a remote program in order to be forward  com-
patible with changing protocols.

For example, a network file service may be composed  of  two
programs.  One program may deal with high-level applications
such as file system access control and locking.   The  other
may  deal  with  low-level  file IO and have procedures like
"read" and "write".  A client machine of  the  network  file
service  would  call  the procedures associated with the two
programs of the service on behalf of some user on the client
machine.

2.2.  The RPC Model

The remote procedure call model is similar to the local
procedure call model.  In the local case, the caller places
arguments to a procedure in some well-specified location
(such as a register).  It then transfers control to the
procedure, and eventually regains control.  At that point,
the results of the procedure are extracted from the
well-specified location, and the caller continues execution.

The remote procedure call is similar, in that one thread of
control logically winds through two processes: one is the
caller's process, the other is a server's process.  That is,
the caller process sends a call message to the server
process and waits (blocks) for a reply message.  The call
message contains the procedure's parameters, among other
things.  The reply message contains the procedure's results,
among other things.  Once the reply message is received, the
results of the procedure are extracted, and the caller's
execution is resumed.

On the server  side,  a  process  is  dormant  awaiting  the
arrival  of  a  call  message.  When one arrives, the server
process extracts the procedure's  parameters,  computes  the
results,  sends  a  reply  message, and then awaits the next
call message.


Note that in this model, only one of the two processes is
active at any given time.  However, this model is only given
as an example.  The RPC protocol places no restrictions on
the concurrency model implemented, and others are possible.
For example, an implementation may choose to have RPC calls
be asynchronous, so that the client may do useful work while
waiting for the reply from the server.  Another possibility
is to have the server create a task to process an incoming
request, so that the server is free to receive other
requests.

2.3.  Transports and Semantics

The RPC protocol  is  independent  of  transport  protocols.
That  is, RPC does not care how a message is passed from one
process to another.  The protocol deals only with specifica-
tion and interpretation of messages.

It is important to point out that RPC does not try to
implement any kind of reliability and that the application
must be aware of the type of transport protocol underneath
RPC.  If it knows it is running on top of a reliable
transport such as TCP/IP[6], then most of the work is
already done for it.  On the other hand, if it is running on
top of an unreliable transport such as UDP/IP[7], it must
implement its own retransmission and time-out policy, as the
RPC layer does not provide this service.

Because of transport independence, the RPC protocol does not
attach  specific semantics to the remote procedures or their
execution.  Semantics can be inferred from  (but  should  be
explicitly  specified by) the underlying transport protocol.
For example, consider RPC running on top  of  an  unreliable
transport such as UDP/IP.  If an application retransmits RPC
messages after short time-outs, the only thing it can  infer
if  it  receives no reply is that the procedure was executed
zero or more times.  If it does receive a reply, then it can
infer that the procedure was executed at least once.
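
The retransmission policy described above might be sketched
in C as follows.  This is only an illustration: the callback
types, the function call_with_retry, and the "lossy"
stand-in transport are all invented for this example, and a
real application would send and receive datagrams instead.
Doubling the time-out after each attempt is likewise just
one possible policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport hooks.  send_req returns 0 on success.
 * recv_reply returns 1 and stores the reply's transaction ID if a
 * reply arrives within timeout_ms, or 0 on time-out. */
typedef int (*send_fn)(void *ctx, uint32_t xid);
typedef int (*recv_fn)(void *ctx, int timeout_ms, uint32_t *reply_xid);

/*
 * Returns 1 if a matching reply arrived (the procedure was executed
 * at least once), or 0 if every attempt timed out (the procedure
 * was executed zero or more times).
 */
int call_with_retry(send_fn send_req, recv_fn recv_reply, void *ctx,
                    uint32_t xid, int tries, int timeout_ms)
{
    uint32_t got;
    int i;

    for (i = 0; i < tries; i++) {
        if (send_req(ctx, xid) != 0)
            continue;                   /* send failed; try again */
        while (recv_reply(ctx, timeout_ms, &got))
            if (got == xid)
                return 1;               /* reply matches request  */
        timeout_ms *= 2;                /* back off, then resend  */
    }
    return 0;
}

/* Stand-in transport for demonstration: loses the first `drop`
 * requests and answers every later one immediately. */
struct lossy {
    int drop;           /* requests still to be lost         */
    int sent;           /* total requests sent so far        */
    int pending;        /* 1 if a reply is waiting           */
    uint32_t xid;       /* ID echoed in the pending reply    */
};

int lossy_send(void *ctx, uint32_t xid)
{
    struct lossy *t = ctx;

    t->sent++;
    if (t->drop > 0) {
        t->drop--;                      /* silently lost */
        return 0;
    }
    t->pending = 1;
    t->xid = xid;
    return 0;
}

int lossy_recv(void *ctx, int timeout_ms, uint32_t *reply_xid)
{
    struct lossy *t = ctx;

    (void)timeout_ms;                   /* the fake never waits */
    if (!t->pending)
        return 0;
    t->pending = 0;
    *reply_xid = t->xid;
    return 1;
}
```

Note that the same transaction ID is reused for every
retransmission, and that replies carrying any other ID are
discarded.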

A server may wish to remember previously granted requests
from a client and not regrant them in order to ensure some
degree of execute-at-most-once semantics.  A server can do
this by taking advantage of the transaction ID that is
packaged with every RPC request.  The main use of this
transaction ID is by the client RPC layer in matching
replies to requests.  However, a client application may
choose to reuse its previous transaction ID when
retransmitting a request.  The server application, knowing
this fact, may choose to remember this ID after granting a
request and not regrant requests with the same ID in order
to achieve some degree of execute-at-most-once semantics.
The server is not allowed to examine this ID in any other
way except as a test for equality.
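
One way a server application might remember granted requests
is sketched below.  The fixed-size table, the 32-bit client
identifier and the name already_granted are assumptions made
for this example.  Note that the transaction ID is only ever
compared for equality.

```c
#include <stdint.h>

#define CACHE_SLOTS 64          /* assumed size, for illustration only */

struct seen {
    uint32_t client;            /* hypothetical client identifier */
    uint32_t xid;               /* transaction ID of the request  */
    int valid;                  /* 1 if this slot is in use       */
};

static struct seen cache[CACHE_SLOTS];
static unsigned next_slot;

/*
 * Returns 1 if (client, xid) was already granted.  Otherwise
 * records the pair, overwriting the oldest entry, and returns 0.
 */
int already_granted(uint32_t client, uint32_t xid)
{
    unsigned i;

    for (i = 0; i < CACHE_SLOTS; i++)
        if (cache[i].valid &&
            cache[i].client == client && cache[i].xid == xid)
            return 1;

    cache[next_slot].client = client;
    cache[next_slot].xid = xid;
    cache[next_slot].valid = 1;
    next_slot = (next_slot + 1) % CACHE_SLOTS;
    return 0;
}
```

Because old entries are eventually overwritten, such a table
provides only "some degree" of execute-at-most-once
semantics, as the text above puts it.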


On the other hand, if using a  reliable  transport  such  as
TCP/IP,  the application can infer from a reply message that
the procedure was executed exactly once, but if it  receives
no  reply message, it cannot assume the remote procedure was
not executed.  Note that even if a connection-oriented  pro-
tocol like TCP is used, an application still needs time-outs
and reconnection to handle server crashes.

There  are  other  possibilities  for   transports   besides
datagram-  or connection-oriented protocols.  For example, a
request-reply protocol such as VMTP[2] is perhaps  the  most
natural transport for RPC.

NOTE:  At Sun, RPC is currently implemented on top  of  both
TCP/IP and UDP/IP transports.


2.4.  Binding and Rendezvous Independence

The act of binding a client to a service is NOT part of  the
remote  procedure  call  specification.   This important and
necessary function is left up to some higher-level software.
(The software may use RPC itself-see the Port Mapper Program
Protocol, below).

Implementors should think of the RPC protocol as  the  jump-
subroutine  instruction  ("JSR")  of  a  network; the loader
(binder) makes JSR useful, and the loader itself uses JSR to
accomplish  its  task.  Likewise, the network makes RPC use-
ful, using RPC to accomplish this task.

2.5.  Authentication

The RPC protocol provides the fields necessary for a  client
to  identify  itself  to a service and vice-versa.  Security
and access control mechanisms can be built  on  top  of  the
message  authentication.   Several  different authentication
protocols can be supported.  A field in the RPC header indi-
cates  which  protocol  is  being used.  More information on
specific  authentication  protocols  can  be  found  in  the
Authentication Protocols section, below.

3.  RPC Protocol Requirements

The RPC protocol must provide for the following:

1.   Unique specification of a procedure to be called.

2.   Provisions for matching response  messages  to  request
     messages.

3.   Provisions for authenticating the caller to service and
     vice-versa.


Besides these requirements, features that detect the follow-
ing  are  worth  supporting  because  of  protocol roll-over
errors,  implementation  bugs,  user  error,   and   network
administration:

1.   RPC protocol mismatches.

2.   Remote program protocol version mismatches.

3.   Protocol  errors  (such  as   misspecification   of   a
     procedure's parameters).

4.   Reasons why remote authentication failed.

5.   Any other reasons why the  desired  procedure  was  not
     called.

3.1.  Programs and Procedures

The RPC call message has three unsigned fields:  remote pro-
gram  number, remote program version number, and remote pro-
cedure number.  The three fields uniquely identify the  pro-
cedure  to  be  called.  Program numbers are administered by
some central authority (like Sun).  Once an implementor  has
a  program  number, he can implement his remote program; the
first implementation would  most  likely  have  the  version
number of 1.  Because most new protocols evolve into better,
stable, and mature protocols, a version field  of  the  call
message  identifies which version of the protocol the caller
is using.  Version numbers make speaking old and new  proto-
cols through the same server process possible.

The procedure number identifies the procedure to be  called.
These  numbers are documented in the specific program's pro-
tocol specification.  For example, a file service's protocol
specification  may  state  that  its  procedure  number 5 is
"read" and procedure number 12 is "write".

Just as remote program protocols  may  change  over  several
versions, the actual RPC message protocol could also change.
Therefore, the call message also has in it the  RPC  version
number,  which is always equal to two for the version of RPC
described here.

The reply message to a request  message  has enough   infor-
mation to distinguish the following error conditions:

1.   The remote implementation of RPC does not speak protocol
     version  2.   The lowest and highest supported RPC ver-
     sion numbers are returned.

2.   The remote program is not available on the remote  sys-
     tem.


3.   The remote program does not support the requested  ver-
     sion  number.   The lowest and highest supported remote
     program version numbers are returned.

4.   The requested procedure number does not  exist.   (This
     is  usually  a  caller  side  protocol  or  programming
     error.)

5.   The parameters to the remote  procedure  appear  to  be
     garbage  from the server's point of view.  (Again, this
     is usually caused by a disagreement about the  protocol
     between client and service.)
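
The first four conditions above can be checked from the call
header alone.  The following C sketch shows one possible order
of checks; the service structure, the enum names, and the
simplification that every supported version has the same
number of procedures are assumptions for illustration only.
The fifth condition (garbage parameters) can only be detected
while decoding the procedure's arguments, so it is not shown.

```c
#include <stdint.h>

enum call_check {
    CALL_OK,
    CALL_RPC_MISMATCH,      /* condition 1 */
    CALL_PROG_UNAVAIL,      /* condition 2 */
    CALL_PROG_MISMATCH,     /* condition 3 */
    CALL_PROC_UNAVAIL       /* condition 4 */
};

struct service {            /* hypothetical: the one program served */
    uint32_t prog;
    uint32_t vers_low;      /* lowest supported program version  */
    uint32_t vers_high;     /* highest supported program version */
    uint32_t nproc;         /* procedures 0 .. nproc-1 exist     */
};

static enum call_check check_call(const struct service *s, uint32_t rpcvers,
                                  uint32_t prog, uint32_t vers, uint32_t proc)
{
    if (rpcvers != 2)
        return CALL_RPC_MISMATCH;
    if (prog != s->prog)
        return CALL_PROG_UNAVAIL;
    if (vers < s->vers_low || vers > s->vers_high)
        return CALL_PROG_MISMATCH;  /* reply carries vers_low, vers_high */
    if (proc >= s->nproc)
        return CALL_PROC_UNAVAIL;
    return CALL_OK;
}
```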

3.2.  Authentication

Provisions for  authentication  of  caller  to  service  and
vice-versa  are provided as a part of the RPC protocol.  The
call message has two authentication fields, the  credentials
and  verifier.   The  reply  message  has one authentication
field, the response verifier.  The RPC  protocol  specifica-
tion  defines  all  three  fields to be the following opaque
type:

        enum auth_flavor {
            AUTH_NULL        = 0,
            AUTH_UNIX        = 1,
            AUTH_SHORT       = 2,
            /* and more to be defined */
        };

        struct opaque_auth {
            auth_flavor flavor;
            opaque body<400>;
        };


In  simple  English,  any  opaque_auth   structure   is   an
auth_flavor  enumeration followed by bytes which are  opaque
to the RPC protocol implementation.
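
A minimal C sketch of how an opaque_auth value might be laid
out on the wire, assuming the usual XDR encoding of a
variable-length opaque (a four-byte length, the bytes, then
zero padding to a multiple of four).  The function names and
the buffer-based interface are assumptions for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AUTH_BODY_MAX 400       /* from "opaque body<400>" */

/* Store a 32-bit value in big-endian (XDR) byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Writes flavor, body length, body bytes, and zero padding.
 * Returns the number of bytes written, or 0 if the body exceeds
 * 400 bytes or the output buffer is too small.
 */
static size_t encode_opaque_auth(unsigned char *out, size_t cap,
                                 uint32_t flavor,
                                 const unsigned char *body, size_t len)
{
    size_t padded = (len + 3) & ~(size_t)3;

    if (len > AUTH_BODY_MAX || cap < 8 + padded)
        return 0;
    put_u32(out, flavor);
    put_u32(out + 4, (uint32_t)len);
    if (len > 0)
        memcpy(out + 8, body, len);
    memset(out + 8 + len, 0, padded - len);
    return 8 + padded;
}
```

With flavor 0 (AUTH_NULL) and an empty body, the result is
eight zero bytes.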

The interpretation and  semantics   of  the  data  contained
within  the authentication   fields  is specified  by  indi-
vidual,   independent  authentication   protocol  specifica-
tions.    (See  Authentication Protocols, below, for defini-
tions of the various authentication protocols.)

If authentication parameters were   rejected, the   response
message contains information stating why they were rejected.

3.3.  Program Number Assignment

Program numbers  are  given  out  in  groups  of  0x20000000
(decimal 536870912) according to the following chart:


_______________________________________
 Program Numbers (hex)   Description
_______________________________________
        0 - 1fffffff     Defined by Sun
 20000000 - 3fffffff     Defined by user
 40000000 - 5fffffff     Transient
 60000000 - 7fffffff     Reserved
 80000000 - 9fffffff     Reserved
 a0000000 - bfffffff     Reserved
 c0000000 - dfffffff     Reserved
 e0000000 - ffffffff     Reserved
_______________________________________

The first group is a range of numbers  administered  by  Sun
Microsystems  and  should  be  identical for all sites.  The
second range is for applications peculiar  to  a  particular
site.   This  range  is intended primarily for debugging new
programs.  When a site develops an application that might be
of  general  interest,  that  application should be given an
assigned number in the first range.  The third group is  for
applications that generate program numbers dynamically.  The
final groups are reserved for future use, and should not  be
used.
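
Since each group spans 0x20000000 numbers (1 << 29), the top
three bits of a program number select its group.  The C
sketch below illustrates this; the enum and function names
are invented for the example.

```c
#include <stdint.h>

enum prog_range {
    PROG_SUN,           /* 00000000 - 1fffffff */
    PROG_USER,          /* 20000000 - 3fffffff */
    PROG_TRANSIENT,     /* 40000000 - 5fffffff */
    PROG_RESERVED       /* 60000000 - ffffffff */
};

/* Classify a program number by the group it falls into. */
static enum prog_range classify_prog(uint32_t prog)
{
    switch (prog >> 29) {       /* group index 0..7 */
    case 0:  return PROG_SUN;
    case 1:  return PROG_USER;
    case 2:  return PROG_TRANSIENT;
    default: return PROG_RESERVED;
    }
}
```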

3.4.  Other Uses of the RPC Protocol

The intended use of this protocol is for calling remote pro-
cedures.   That  is,  each  call  message  is matched with a
response  message.   However,  the  protocol  itself  is   a
message-passing  protocol  with which other (non-RPC) proto-
cols can be implemented.  Sun  currently  uses,  or  perhaps
abuses,  the  RPC  message  protocol  for  the following two
(non-RPC) protocols:  batching (or pipelining) and broadcast
RPC.   These  two  protocols  are  discussed but not defined
below.

3.4.1.  Batching

Batching allows  a  client  to  send  an  arbitrarily  large
sequence  of  call  messages to a server; batching typically
uses reliable byte stream protocols (like  TCP/IP)  for  its
transport.   In the case of batching, the client never waits
for a reply from the server, and the server  does  not  send
replies  to  batch  requests.   A sequence of batch calls is
usually terminated by a legitimate RPC in order to flush the
pipeline (with positive acknowledgement).

3.4.2.  Broadcast RPC

In broadcast RPC-based protocols, the client sends a  broad-
cast  packet  to the network and waits for numerous replies.
Broadcast RPC uses unreliable, packet-based protocols  (like
UDP/IP)  as  its transports.  Servers that support broadcast
protocols only respond when the request is successfully pro-
cessed, and are silent in the face of errors.  Broadcast RPC
uses the Port Mapper RPC service to achieve  its  semantics.
See the Port Mapper Program Protocol, below, for more infor-
mation.


4.  The RPC Message Protocol

This section defines the RPC message  protocol  in  the  XDR
data  description  language.   The  message  is defined in a
top-down style.

enum msg_type {
        CALL  = 0,
        REPLY = 1
};

/*
* A reply to a call message can take on two forms:
* The message was either accepted or rejected.
*/
enum reply_stat {
        MSG_ACCEPTED = 0,
        MSG_DENIED   = 1
};

/*
* Given that a call message was accepted,  the following is the
* status of an attempt to call a remote procedure.
*/
enum accept_stat {
        SUCCESS       = 0, /* RPC executed successfully       */
        PROG_UNAVAIL  = 1, /* remote hasn't exported program  */
        PROG_MISMATCH = 2, /* remote can't support version #  */
        PROC_UNAVAIL  = 3, /* program can't support procedure */
        GARBAGE_ARGS  = 4  /* procedure can't decode params   */
};


/*
* Reasons why a call message was rejected:
*/
enum reject_stat {
        RPC_MISMATCH = 0, /* RPC version number != 2          */
        AUTH_ERROR = 1    /* remote can't authenticate caller */
};

/*
* Why authentication failed:
*/
enum auth_stat {
        AUTH_BADCRED      = 1,  /* bad credentials (seal broken) */
        AUTH_REJECTEDCRED = 2,  /* client must begin new session */
        AUTH_BADVERF      = 3,  /* bad verifier (seal broken)    */
        AUTH_REJECTEDVERF = 4,  /* verifier expired or replayed  */
        AUTH_TOOWEAK      = 5   /* rejected for security reasons */
};


/*
* The  RPC  message:
* All   messages  start with   a transaction  identifier,  xid,
* followed  by a  two-armed  discriminated union.   The union's
* discriminant is a  msg_type which switches to  one of the two
* types   of the message.   The xid  of a REPLY  message always
* matches  that of the initiating CALL   message.   NB: The xid
* field is only  used for clients  matching reply messages with
* call messages  or for servers detecting  retransmissions; the
* service side  cannot treat this id  as any type   of sequence
* number.
*/
struct rpc_msg {
        unsigned int xid;
        union switch (msg_type mtype) {
                case CALL:
                        call_body cbody;
                case REPLY:
                        reply_body rbody;
        } body;
};


/*
* Body of an RPC request call:
* In version 2 of the  RPC protocol specification, rpcvers must
* be equal to 2.  The  fields prog,  vers, and proc specify the
* remote program, its version number, and the  procedure within
* the remote program to be called.  After these  fields are two
* authentication  parameters: cred (authentication credentials)
* and verf  (authentication verifier).  The  two authentication
* parameters are   followed by  the  parameters  to  the remote
* procedure,  which  are specified  by  the  specific   program
* protocol.
*/
struct call_body {
        unsigned int rpcvers;  /* must be equal to two (2) */
        unsigned int prog;
        unsigned int vers;
        unsigned int proc;
        opaque_auth cred;
        opaque_auth verf;
        /* procedure specific parameters start here */
};
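
To make the layout of rpc_msg and call_body concrete, the C
sketch below encodes the fixed part of a call message that
uses null authentication for both cred and verf.  It assumes
standard XDR encoding (each unsigned int is four bytes, most
significant byte first); the function name and buffer
interface are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CALL_HDR_LEN 40     /* ten 4-byte XDR words */

/* Store a 32-bit value in big-endian (XDR) byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Encode xid, mtype, and call_body with AUTH_NULL credentials and
 * verifier.  Procedure parameters would follow at out + 40.
 * Returns 40, or 0 if the buffer is too small.
 */
static size_t encode_null_call(unsigned char *out, size_t cap, uint32_t xid,
                               uint32_t prog, uint32_t vers, uint32_t proc)
{
    const uint32_t words[10] = {
        xid,
        0,              /* mtype = CALL            */
        2,              /* rpcvers                 */
        prog, vers, proc,
        0, 0,           /* cred: AUTH_NULL, length 0 */
        0, 0            /* verf: AUTH_NULL, length 0 */
    };

    if (cap < CALL_HDR_LEN)
        return 0;
    for (size_t i = 0; i < 10; i++)
        put_u32(out + 4 * i, words[i]);
    return CALL_HDR_LEN;
}
```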


/*
* Body of a reply to an RPC request:
* The call message was either accepted or rejected.
*/
union reply_body switch (reply_stat stat) {
        case MSG_ACCEPTED:
                accepted_reply areply;
        case MSG_DENIED:
                rejected_reply rreply;
} reply;


/*
* Reply to   an RPC request  that  was accepted  by the server:
* there could be an error even though the request was accepted.
* The first field is an authentication verifier that the server
* generates in order to  validate itself  to the caller.  It is
* followed by    a  union whose     discriminant  is   an  enum
* accept_stat.  The  SUCCESS  arm of    the union  is  protocol
* specific.  The PROG_UNAVAIL, PROC_UNAVAIL, and GARBAGE_ARGS
* arms of the union are void.   The PROG_MISMATCH arm specifies
* the lowest and highest version numbers of the  remote program
* supported by the server.
*/
struct accepted_reply {
        opaque_auth verf;
        union switch (accept_stat stat) {
                case SUCCESS:
                        opaque results[0];
                        /* procedure-specific results start here */
                case PROG_MISMATCH:
                        struct {
                                unsigned int low;
                                unsigned int high;
                        } mismatch_info;
                default:
                        /*
                        * Void.  Cases include PROG_UNAVAIL, PROC_UNAVAIL,
                        * and GARBAGE_ARGS.
                        */
                        void;
        } reply_data;
};


/*
* Reply to an RPC request that was rejected by the server:
* The request  can   be rejected for   two reasons:  either the
* server   is not  running a   compatible  version  of the  RPC
* protocol    (RPC_MISMATCH), or    the  server   refuses    to
* authenticate the  caller  (AUTH_ERROR).  In  case of  an  RPC
* version mismatch,  the server returns the  lowest and highest
* supported    RPC  version    numbers.  In   case   of refused
* authentication, failure status is returned.
*/
union rejected_reply switch (reject_stat stat) {
        case RPC_MISMATCH:
                struct {
                        unsigned int low;
                        unsigned int high;
                } mismatch_info;
        case AUTH_ERROR:
                auth_stat stat;
};
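
The reply definitions above can be walked with a small
decoder.  The following C sketch is one way a client might
classify a reply; the reply_info structure, the cursor
helper, and the function names are assumptions for this
example, and the numeric constants are the enum values
defined in this section.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct reply_info {
    uint32_t xid;
    uint32_t reply_stat;    /* 0 = MSG_ACCEPTED, 1 = MSG_DENIED      */
    uint32_t verf_flavor;   /* response verifier flavor (accepted)   */
    uint32_t stat;          /* accept_stat or reject_stat            */
    uint32_t low, high;     /* PROG_MISMATCH / RPC_MISMATCH only     */
    uint32_t auth_stat;     /* AUTH_ERROR only                       */
    size_t   results_off;   /* offset of results (accepted, no mismatch) */
};

struct cursor { const unsigned char *p; size_t len, off; };

/* Read one big-endian 32-bit word; false if the buffer is exhausted. */
static bool get_u32(struct cursor *c, uint32_t *v)
{
    if (c->len - c->off < 4)
        return false;
    const unsigned char *p = c->p + c->off;
    *v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    c->off += 4;
    return true;
}

/* Returns false if the buffer is not a well-formed REPLY message. */
static bool parse_reply(const unsigned char *buf, size_t len,
                        struct reply_info *r)
{
    struct cursor c = { buf, len, 0 };
    uint32_t mtype, blen;

    *r = (struct reply_info){0};
    if (!get_u32(&c, &r->xid) || !get_u32(&c, &mtype) || mtype != 1)
        return false;
    if (!get_u32(&c, &r->reply_stat))
        return false;

    if (r->reply_stat == 0) {                   /* MSG_ACCEPTED */
        if (!get_u32(&c, &r->verf_flavor) || !get_u32(&c, &blen) ||
            blen > 400)
            return false;
        size_t padded = ((size_t)blen + 3) & ~(size_t)3;
        if (c.len - c.off < padded)
            return false;
        c.off += padded;                        /* skip verifier body */
        if (!get_u32(&c, &r->stat))
            return false;
        if (r->stat == 2)                       /* PROG_MISMATCH */
            return get_u32(&c, &r->low) && get_u32(&c, &r->high);
        r->results_off = c.off;
        return r->stat <= 4;
    }
    if (r->reply_stat == 1) {                   /* MSG_DENIED */
        if (!get_u32(&c, &r->stat))
            return false;
        if (r->stat == 0)                       /* RPC_MISMATCH */
            return get_u32(&c, &r->low) && get_u32(&c, &r->high);
        if (r->stat == 1)                       /* AUTH_ERROR */
            return get_u32(&c, &r->auth_stat);
    }
    return false;
}
```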


5.  Authentication Protocols

As previously stated, authentication parameters are  opaque,
but  open-ended  to the rest of the RPC protocol.  This sec-
tion defines some "flavors" of authentication implemented at
(and  supported by) Sun.  Other sites are free to invent new
authentication types, with the same rules of  flavor  number
assignment as there are for program number assignment.

5.1.  Null Authentication

Often calls must be made where the caller does not know  who
he  is  or  the  server does not care who the caller is.  In
this  case,  the  flavor  value  (the  discriminant  of  the
opaque_auth's  union)  of  the  RPC  message's  credentials,
verifier, and response verifier is AUTH_NULL.  The  bytes of
the  opaque_auth's  body   are undefined.  It is recommended
that the opaque length be zero.

5.2.  UNIX Authentication

The caller of a remote procedure may wish to  identify  him-
self  as  he  is identified on a UNIX system.  The  value of
the credential's discriminant of an  RPC  call   message  is
AUTH_UNIX.  The credential's opaque body encodes the fol-
lowing structure:


        struct auth_unix {
                unsigned int stamp;
                string machinename<255>;
                unsigned int uid;
                unsigned int gid;
                unsigned int gids<10>;
        };

The stamp is an  arbitrary    ID which the   caller  machine
may generate.  The machinename is the  name of the  caller's
machine (like  "krypton").  The uid is  the caller's  effec-
tive  user   ID.   The gid is  the caller's effective  group
ID.  The gids is  a counted array of  groups  which  contain
the  caller  as   a  member.   The verifier accompanying the
credentials  should  be  of AUTH_NULL (defined above).
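
As an illustration, the C sketch below encodes an auth_unix
structure into the bytes that would form the credential's
opaque body.  It assumes standard XDR encoding for strings
and counted arrays (a four-byte length or count, then the
data, with strings zero-padded to a multiple of four).  The
function name and interface are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in big-endian (XDR) byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Returns the encoded length, or 0 if machinename exceeds 255
 * bytes, more than 10 gids are given, or the buffer is too small.
 */
static size_t encode_auth_unix(unsigned char *out, size_t cap, uint32_t stamp,
                               const char *machinename,
                               uint32_t uid, uint32_t gid,
                               const uint32_t *gids, size_t ngids)
{
    size_t nlen = strlen(machinename);
    size_t npad = (nlen + 3) & ~(size_t)3;
    size_t off = 0;

    if (nlen > 255 || ngids > 10)
        return 0;
    if (cap < 4 + 4 + npad + 4 + 4 + 4 + 4 * ngids)
        return 0;

    put_u32(out + off, stamp);              off += 4;
    put_u32(out + off, (uint32_t)nlen);     off += 4;
    memcpy(out + off, machinename, nlen);
    memset(out + off + nlen, 0, npad - nlen);
    off += npad;
    put_u32(out + off, uid);                off += 4;
    put_u32(out + off, gid);                off += 4;
    put_u32(out + off, (uint32_t)ngids);    off += 4;
    for (size_t i = 0; i < ngids; i++, off += 4)
        put_u32(out + off, gids[i]);
    return off;
}
```

The largest possible encoding is 316 bytes, which fits within
the 400-byte limit on an opaque_auth body.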

The value of the  discriminant  of   the  response  verifier
received  in  the   reply  message  from  the    server  may
be AUTH_NULL or AUTH_SHORT.  In  the  case   of  AUTH_SHORT,
the bytes of the response verifier's string encode an opaque
structure.  This new opaque structure may now be  passed  to
the  server instead of the original AUTH_UNIX flavor creden-
tials.  The server keeps a cache which maps shorthand opaque
structures  (passed  back  by  way  of  an  AUTH_SHORT style
response  verifier)  to  the  original  credentials  of  the
caller.   The  caller  can save network bandwidth and server
cpu cycles by using the new credentials.

The server may flush the shorthand opaque structure  at  any
time.   If  this  happens, the remote procedure call message
will be rejected due to an authentication error.  The reason
for  the  failure will be AUTH_REJECTEDCRED.  At this point,
the caller may wish to try the original AUTH_UNIX  style  of
credentials.


6.  Record Marking Standard

When RPC messages are passed on top of a byte stream  proto-
col  (like  TCP/IP), it is necessary, or at least desirable,
to delimit one message from another in order to  detect  and
possibly  recover from user protocol errors.  This is called
record marking (RM).  Sun uses this RM/TCP/IP transport  for
passing  RPC  messages on TCP streams.  One RPC message fits
into one RM record.

A record is composed of one or  more  record  fragments.   A
record  fragment  is  a  four-byte  header  followed by 0 to
(2**31) - 1 bytes of fragment data.  The header bytes encode an
unsigned binary number; as with XDR integers, the byte order is
from highest to lowest.  The  number  encodes  two  values-a
boolean  which  indicates  whether  the fragment is the last
fragment of the record (bit value 1 implies the fragment  is
the  last fragment) and a 31-bit unsigned binary value which
is the length in bytes of the fragment's data.  The  boolean
value  is the highest-order bit of the header; the length is
the 31 low-order bits.  (Note that this record specification
is NOT in XDR standard form!)
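
The fragment header described above can be packed and
unpacked with a few shifts and masks.  The C sketch below is
illustrative only; the macro and function names are invented
for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define RM_LAST_FRAG 0x80000000u    /* highest-order bit: last fragment */
#define RM_LEN_MASK  0x7fffffffu    /* 31 low-order bits: data length   */

/* Build the four-byte fragment header, most significant byte first. */
static void rm_encode(unsigned char hdr[4], uint32_t len, bool last)
{
    uint32_t v = (len & RM_LEN_MASK) | (last ? RM_LAST_FRAG : 0u);

    hdr[0] = (unsigned char)(v >> 24);
    hdr[1] = (unsigned char)(v >> 16);
    hdr[2] = (unsigned char)(v >> 8);
    hdr[3] = (unsigned char)v;
}

/* Return the fragment length and report the last-fragment flag. */
static uint32_t rm_decode(const unsigned char hdr[4], bool *last)
{
    uint32_t v = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                 ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];

    *last = (v & RM_LAST_FRAG) != 0;
    return v & RM_LEN_MASK;
}
```

For example, a 40-byte message sent as a single fragment has
the header bytes 80 00 00 28 (hexadecimal).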


7.  The RPC Language

Just as there was a need to describe the XDR data-types in a
formal language, there is also a need to describe the pro-
cedures that operate on these XDR data-types in a formal
language as well.  We use the RPC Language for this purpose.
It is an extension to the XDR language.  The following exam-
ple is used to describe the essence of the language.

7.1.  An Example Service Described in the RPC Language

Here is an example of the specification  of  a  simple  ping
program.

/*
* Simple ping program
*/
program PING_PROG {
        /* Latest and greatest version */
        version PING_VERS_PINGBACK {
                void
                PINGPROC_NULL(void) = 0;

                /*
                * Ping the caller, return the round-trip time
                * (in microseconds). Returns -1 if the operation
                * timed out.
                */
                int
                PINGPROC_PINGBACK(void) = 1;
        } = 2;

        /*
        * Original version
        */
        version PING_VERS_ORIG {
                void
                PINGPROC_NULL(void) = 0;
        } = 1;
} = 1;

const PING_VERS = 2;      /* latest version */


The first version described is PING_VERS_PINGBACK with   two
procedures,     PINGPROC_NULL     and     PINGPROC_PINGBACK.
PINGPROC_NULL takes no arguments and returns no results, but
it  is useful for computing round-trip times from the client
to the server and back again.  By convention, procedure 0 of
any  RPC  protocol should have the same semantics, and never
require any kind of authentication.  The second procedure is
used  for  the  client  to have the server do a reverse ping
operation back to the client, and it returns the  amount  of
time  (in  microseconds)  that the operation used.  The next
version, PING_VERS_ORIG, is the original version of the pro-
tocol and it does not contain the PINGPROC_PINGBACK procedure.
It  is useful for compatibility  with old client   programs,
and as  this program matures it may be dropped from the pro-
tocol entirely.

7.2.  The RPC Language Specification

The  RPC language is identical to  the XDR language,  except
for the added definition of a program-def described below.

        program-def:
                "program" identifier "{"
                        version-def
                        version-def *
                "}" "=" constant ";"

        version-def:
                "version" identifier "{"
                        procedure-def
                        procedure-def *
                "}" "=" constant ";"

        procedure-def:
                type-specifier identifier "(" type-specifier ")"
                "=" constant ";"


7.3.  Syntax Notes

1.   The following keywords  are  added   and    cannot   be
     used   as identifiers: "program" and "version";

2.   A version name cannot occur more than once  within  the
     scope of a program definition. Nor can a version number
     occur more than once within  the  scope  of  a  program
     definition.

3.   A procedure name cannot occur  more  than  once  within
     the  scope of a version definition. Nor can a procedure
     number occur more than once within the scope of a version
     definition.

4.   Program identifiers are in the same name space as  con-
     stant and type identifiers.

5.   Only unsigned constants can  be assigned  to  programs,
     versions and procedures.

8.  Port Mapper Program Protocol

The port mapper program maps RPC program and version numbers
to  transport-specific  port  numbers.   This  program makes
dynamic binding of remote programs possible.

This is desirable because the range of reserved port numbers
is very small and the number of potential remote programs is
very large.  By running only the port mapper on  a  reserved
port,  the  port  numbers  of  other  remote programs can be
ascertained by querying the port mapper.

The port mapper also aids in broadcast  RPC.   A  given  RPC
program  will usually have different port number bindings on
different machines, so there is no way to directly broadcast
to  all  of  these programs.  The port mapper, however, does
have a fixed port number.  So, to broadcast to a given  pro-
gram,  the  client  actually  sends  its message to the port
mapper located at the broadcast address.  Each  port  mapper
that  picks  up  the  broadcast then calls the local service
specified by the client.  When  the  port  mapper  gets  the
reply  from the local service, it sends the reply on back to
the client.


8.1.  Port Mapper Protocol Specification (in RPC Language)

const PMAP_PORT = 111;      /* portmapper port number */

/*
* A mapping of (program, version, protocol) to port number
*/
struct mapping {
        unsigned int prog;
        unsigned int vers;
        unsigned int prot;
        unsigned int port;
};

/*
* Supported values for the "prot" field
*/
const IPPROTO_TCP = 6;      /* protocol number for TCP/IP */
const IPPROTO_UDP = 17;     /* protocol number for UDP/IP */

/*
* A list of mappings
*/
struct *pmaplist {
        mapping map;
        pmaplist next;
};


/*
* Arguments to callit
*/
struct call_args {
        unsigned int prog;
        unsigned int vers;
        unsigned int proc;
        opaque args<>;
};

/*
* Results of callit
*/
struct call_result {
        unsigned int port;
        opaque res<>;
};


/*
* Port mapper procedures
*/
program PMAP_PROG {
        version PMAP_VERS {
                void
                PMAPPROC_NULL(void)         = 0;

                bool
                PMAPPROC_SET(mapping)       = 1;

                bool
                PMAPPROC_UNSET(mapping)     = 2;

                unsigned int
                PMAPPROC_GETPORT(mapping)   = 3;

                pmaplist
                PMAPPROC_DUMP(void)         = 4;

                call_result
                PMAPPROC_CALLIT(call_args)  = 5;
        } = 2;
} = 100000;
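
As a small illustration of the definitions above, the C
sketch below encodes a mapping structure as the argument of
a call such as PMAPPROC_GETPORT and decodes the unsigned int
result.  It assumes standard XDR encoding (four bytes per
unsigned int, most significant byte first).  The function
names and the PMAP_IPPROTO_* constant names are hypothetical;
the numeric values are those given in the specification.

```c
#include <stdint.h>

enum {
    PMAP_PORT        = 111,
    PMAP_PROG        = 100000,
    PMAP_VERS        = 2,
    PMAPPROC_GETPORT = 3,
    PMAP_IPPROTO_TCP = 6,
    PMAP_IPPROTO_UDP = 17
};

/* Encode (prog, vers, prot, port) as four big-endian 32-bit words. */
static void encode_mapping(unsigned char out[16], uint32_t prog,
                           uint32_t vers, uint32_t prot, uint32_t port)
{
    const uint32_t w[4] = { prog, vers, prot, port };

    for (int i = 0; i < 4; i++) {
        out[4 * i]     = (unsigned char)(w[i] >> 24);
        out[4 * i + 1] = (unsigned char)(w[i] >> 16);
        out[4 * i + 2] = (unsigned char)(w[i] >> 8);
        out[4 * i + 3] = (unsigned char)w[i];
    }
}

/* Decode the four-byte unsigned int result of PMAPPROC_GETPORT. */
static uint32_t decode_port(const unsigned char res[4])
{
    return ((uint32_t)res[0] << 24) | ((uint32_t)res[1] << 16) |
           ((uint32_t)res[2] << 8) | (uint32_t)res[3];
}
```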


8.2.  Port Mapper Operation

The portmapper  program  currently  supports  two  protocols
(UDP/IP and TCP/IP).  The portmapper is contacted by talking
to it on assigned port number 111 (SUNRPC [8]) on either  of
these  protocols.  The following is a description of each of
the portmapper procedures:

PMAPPROC_NULL:
     This procedure does no work.  By convention,  procedure
     zero of any protocol takes no parameters and returns no
     results.

PMAPPROC_SET:
     When a program first becomes available on a machine, it
     registers  itself  with  the port mapper program on the
     same machine.  The program passes  its  program  number
     "prog",   version  number  "vers",  transport  protocol
     number "prot", and the port "port" on which  it  awaits
     service  requests.  The  procedure  returns  a  boolean
     response whose value is TRUE if the procedure  success-
     fully established the mapping and FALSE otherwise.  The
     procedure refuses to establish a mapping if one already
     exists for the tuple "(prog, vers, prot)".

PMAPPROC_UNSET:
     When  a  program   becomes   unavailable,   it   should
     unregister  itself  with the port mapper program on the
     same machine.  The parameters and results have meanings
     identical  to  those of PMAPPROC_SET.  The protocol and
     port number fields of the argument are ignored.

PMAPPROC_GETPORT:
     Given a program number "prog", version  number  "vers",
     and  transport  protocol  number "prot", this procedure
     returns the port number on which the program is  await-
     ing  call  requests.   A  port value of zero means the
     program has not been registered.  The "port"  field  of
     the argument is ignored.

PMAPPROC_DUMP:
     This procedure  enumerates  all  entries  in  the  port
     mapper's  database.   The procedure takes no parameters
     and returns a list of program, version,  protocol,  and
     port values.

PMAPPROC_CALLIT:
     This procedure allows a caller to call  another  remote
     procedure  on  the  same  machine  without  knowing the
     remote procedure's port number.   It  is  intended  for
     supporting  broadcasts to arbitrary remote programs via
     the well-known  port  mapper's  port.   The  parameters
     "prog", "vers", "proc", and the bytes of "args" are the
     program number, version number, procedure  number,  and
     parameters of the remote procedure.

Note:

     1.   This procedure only sends a response if  the  pro-
          cedure was successfully executed and is silent (no
          response) otherwise.

     2.   The port mapper communicates with the remote  pro-
          gram using UDP/IP only.

The procedure returns the remote program's port number,  and
the  bytes  of  results  are  the results of the remote pro-
cedure.
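
The semantics of PMAPPROC_SET, PMAPPROC_UNSET, and
PMAPPROC_GETPORT described above can be modeled with a small
in-memory table.  The C sketch below is only a model, not the
port mapper's actual implementation: the table size and
function names are invented, and it assumes that UNSET, which
ignores the protocol and port fields, removes every mapping
for the given (prog, vers) pair.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PMAP_MAX 32             /* hypothetical table capacity */

struct mapping { uint32_t prog, vers, prot, port; };

struct pmap_table {
    struct mapping map[PMAP_MAX];
    size_t count;
};

static struct mapping *pmap_find(struct pmap_table *t, uint32_t prog,
                                 uint32_t vers, uint32_t prot)
{
    for (size_t i = 0; i < t->count; i++) {
        struct mapping *m = &t->map[i];
        if (m->prog == prog && m->vers == vers && m->prot == prot)
            return m;
    }
    return NULL;
}

/* SET: refuse if a mapping for (prog, vers, prot) already exists. */
static bool pmap_set(struct pmap_table *t, struct mapping m)
{
    if (pmap_find(t, m.prog, m.vers, m.prot) != NULL || t->count == PMAP_MAX)
        return false;
    t->map[t->count++] = m;
    return true;
}

/* UNSET: prot and port are ignored; drop all entries for (prog, vers). */
static bool pmap_unset(struct pmap_table *t, struct mapping m)
{
    bool removed = false;
    size_t i = 0;

    while (i < t->count) {
        if (t->map[i].prog == m.prog && t->map[i].vers == m.vers) {
            t->map[i] = t->map[--t->count];   /* fill hole with last entry */
            removed = true;
        } else {
            i++;
        }
    }
    return removed;
}

/* GETPORT: the port field of the argument is ignored; 0 = unregistered. */
static uint32_t pmap_getport(struct pmap_table *t, struct mapping m)
{
    struct mapping *e = pmap_find(t, m.prog, m.vers, m.prot);
    return e != NULL ? e->port : 0;
}
```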


9.  References

[1]  Birrell, Andrew D. & Nelson, Bruce  Jay;  "Implementing
Remote Procedure Calls"; XEROX CSL-83-7, October 1983.

[2]  Cheriton, D.;  "VMTP:   Versatile  Message  Transaction
Protocol",  Preliminary  Version  0.3;  Stanford University,
January 1987.

[3]  Diffie & Hellman;  "New  Directions  in  Cryptography";
IEEE  Transactions  on  Information  Theory  IT-22, November
1976.

[4]  Harrenstien, K.; "Time Server",  RFC  738;  Information
Sciences Institute, October 1977.

[5]  National Bureau of Standards;  "Data  Encryption  Stan-
dard";  Federal Information Processing Standards Publication
46, January 1977.

[6]  Postel, J.;  "Transmission  Control  Protocol  -  DARPA
Internet  Program Protocol Specification", RFC 793; Informa-
tion Sciences Institute, September 1981.

[7]  Postel, J.; "User Datagram Protocol", RFC 768; Informa-
tion Sciences Institute, August 1980.

[8]  Reynolds, J.  & Postel,  J.;  "Assigned  Numbers",  RFC
923; Information Sciences Institute, October 1984.

Network File System: Version 2 Protocol Specification


1.  Status of this Standard

Note: This chapter specifies a protocol that  Sun  Microsys-
tems,  Inc., and others are using.  It specifies it in stan-
dard ARPA RFC form.

2.  Introduction

The Sun Network Filesystem  (NFS)  protocol  provides  tran-
sparent  remote access to shared filesystems over local area
networks.  The NFS  protocol  is  designed  to  be  machine,
operating system, network architecture, and transport proto-
col independent.  This independence is achieved through  the
use  of  Remote Procedure Call (RPC) primitives built on top
of an eXternal Data Representation  (XDR).   Implementations
exist  for a variety of machines, from personal computers to
supercomputers.

The supporting mount protocol allows the server to hand  out
remote access privileges to a restricted set of clients.  It
performs the operating system-specific functions that allow,
for example, remote directory trees to be attached to some
local file system.

2.1.  Remote Procedure Call

Sun's  remote  procedure  call  specification   provides   a
procedure-oriented  interface  to  remote  services.   Each
server supplies a program that is a set of procedures.   NFS
is  one  such  "program".   The combination of host address,
program number, and procedure number  specifies  one  remote
service procedure.  RPC does not depend on services provided
by specific protocols, so it can be used with any underlying
transport  protocol.  See the Remote Procedure Calls: Proto-
col Specification chapter of this manual.

2.2.  External Data Representation

The eXternal Data Representation (XDR) standard  provides  a
common  way  of representing a set of data types over a net-
work.  The NFS Protocol Specification is written  using  the
RPC data description language. For more information, see the
eXternal Data Representation Standard:  Protocol  Specifica-
tion  chapter  of this manual.  Sun provides implementations
of XDR and RPC,  but NFS does not require  their  use.   Any
software that provides equivalent functionality can be used,
and if the encoding is exactly the same it can  interoperate
with other implementations of NFS.

2.3.  Stateless Servers

The NFS protocol is stateless.  That is, a server  does  not
need  to  maintain  any extra state information about any of
its clients  in  order  to  function  correctly.   Stateless
servers  have  a distinct advantage over stateful servers in
the event of a failure.  With stateless  servers,  a  client
need only retry a request until the server responds; it does
not even need to know that the server has  crashed,  or  the
network  temporarily  went  down.   The client of a stateful
server, on the other hand, needs to either detect  a  server
crash  and rebuild the server's state when it comes back up,
or cause client operations to fail.

This may not sound like an important issue, but  it  affects
the  protocol  in  some unexpected ways.  We feel that it is
worth a bit of extra complexity in the protocol to  be  able
to write very simple servers that do not require fancy crash
recovery.

On the other hand, NFS deals with objects such as files  and
directories  that inherently have state -- what good would a
file be if it did not keep its contents intact?  The goal is
to  not  introduce  any  extra state in the protocol itself.
Another way to simplify recovery  is  by  making  operations
"idempotent" whenever possible (so that they can potentially
be repeated).

3.  NFS Protocol Definition

Servers have been known to change over time, and so can  the
protocol  that  they  use.  So RPC provides a version number
with each RPC request. This RFC describes version two of the
NFS protocol.  Even in the second version, there are various
obsolete procedures and parameters, which will be removed in
later versions. An RFC for version three of the NFS protocol
is currently under preparation.

3.1.  File System Model

NFS assumes a file system that is hierarchical, with  direc-
tories  as  all but the bottom-level files.  Each entry in a
directory (file, directory,  device,  etc.)   has  a  string
name.   Different operating systems may have restrictions on
the depth of the tree or the names used, as  well  as  using
different  syntax  to represent the "pathname", which is the
concatenation of all the "components"  (directory  and  file
names)  in  the name.  A "file system" is a tree on a single
server (usually a single disk or physical partition) with  a
specified  "root".  Some operating systems provide a "mount"
operation to make all file systems appear as a single  tree,
while others maintain a "forest" of file systems.  Files are
unstructured streams of uninterpreted bytes.  Version  3  of
NFS uses a slightly more general file system model.

NFS looks up one component of a pathname at a time.  It  may
not be obvious why it does not just take the whole pathname,
traipse down the directories, and return a file handle  when
it  is done.  There are several good reasons not to do this.
First, pathnames need separators between the directory  com-
ponents,  and  different  operating  systems  use  different
separators.  We could define  a  Network  Standard  Pathname
Representation,  but  then  every  pathname would have to be
parsed and converted at each end.   Other  issues  are  dis-
cussed in NFS Implementation Issues below.
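
The component-at-a-time lookup described above can be sketched
in C as follows.  This is only an illustration under stated
assumptions: walk_path, lookup_fn, and toy_lookup are
hypothetical names, the '/' separator is what a UNIX-style
client would use, and a real client would issue one
NFSPROC_LOOKUP call where the callback is invoked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXNAMLEN 255   /* from "Sizes of XDR Structures" */
#define FHSIZE    32

typedef struct { unsigned char data[FHSIZE]; } fhandle;

/* Hypothetical callback standing in for one NFSPROC_LOOKUP call:
 * looks up "name" in directory "dir", stores the result in "out",
 * and returns 0 (NFS_OK) or a nonzero stat value. */
typedef int (*lookup_fn)(const fhandle *dir, const char *name, fhandle *out);

/* Walk "path" starting at "root", one component per lookup.  Only the
 * client knows that '/' is the separator.  Returns 0 on success. */
int walk_path(const fhandle *root, const char *path, lookup_fn lookup,
              fhandle *result)
{
    fhandle cur;
    const char *p;

    assert(root != NULL && path != NULL && lookup != NULL && result != NULL);
    cur = *root;
    p = path;
    while (*p != '\0') {
        char name[MAXNAMLEN + 1];
        size_t len;
        fhandle next;
        int status;

        while (*p == '/')               /* skip separators */
            p++;
        if (*p == '\0')
            break;
        len = strcspn(p, "/");          /* length of this component */
        if (len > MAXNAMLEN)
            return 63;                  /* NFSERR_NAMETOOLONG */
        memcpy(name, p, len);
        name[len] = '\0';
        status = lookup(&cur, name, &next);
        if (status != 0)
            return status;
        cur = next;
        p += len;
    }
    *result = cur;
    return 0;
}

/* Toy lookup used only for demonstration: it accepts any name and
 * records the depth reached in the first byte of the handle. */
int toy_lookup(const fhandle *dir, const char *name, fhandle *out)
{
    (void)name;
    *out = *dir;
    out->data[0]++;
    return 0;
}
```

With the toy callback, walking "/usr/local/bin" from a zeroed
root handle performs three lookups, one per component.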

Although files and directories are similar objects  in  many
ways,  different procedures are used to read directories and
files.   This  provides  a  network  standard   format   for
representing  directories.  The same argument as above could
have been used to justify a procedure that returns only  one
directory  entry  per  call.   The  problem  is  efficiency.
Directories can contain many entries, and a remote  call  to
return each would be just too slow.

3.2.  RPC Information

Authentication
     The    NFS   service  uses  AUTH_UNIX,   AUTH_DES,   or
     AUTH_SHORT  style  authentication, except in  the  NULL
     procedure where AUTH_NONE is also allowed.

Transport Protocols
     NFS currently is supported on UDP/IP only.

Port Number
     The NFS protocol currently uses  the  UDP  port  number
     2049.   This  is  not  an  officially assigned port, so
     later versions of the protocol use  the  "Portmapping"
     facility of RPC.

3.3.  Sizes of XDR Structures

These are the sizes, given in decimal bytes, of various  XDR
structures used in the protocol:

        /* The maximum number of bytes of data in a READ or WRITE request  */
        const MAXDATA = 8192;

        /* The maximum number of bytes in a pathname argument */
        const MAXPATHLEN = 1024;

        /* The maximum number of bytes in a file name argument */
        const MAXNAMLEN = 255;

        /* The size in bytes of the opaque "cookie" passed by READDIR */
        const COOKIESIZE  = 4;

        /* The size in bytes of the opaque file handle */
        const FHSIZE = 32;


3.4.  Basic Data Types

The following XDR  definitions  are  basic   structures  and
types used in other structures described further on.


3.4.1.  stat

        enum stat {
                NFS_OK = 0,
                NFSERR_PERM=1,
                NFSERR_NOENT=2,
                NFSERR_IO=5,
                NFSERR_NXIO=6,
                NFSERR_ACCES=13,
                NFSERR_EXIST=17,
                NFSERR_NODEV=19,
                NFSERR_NOTDIR=20,
                NFSERR_ISDIR=21,
                NFSERR_FBIG=27,
                NFSERR_NOSPC=28,
                NFSERR_ROFS=30,
                NFSERR_NAMETOOLONG=63,
                NFSERR_NOTEMPTY=66,
                NFSERR_DQUOT=69,
                NFSERR_STALE=70,
                NFSERR_WFLUSH=99
        };


The stat type is returned with  every  procedure's  results.
A  value  of  NFS_OK indicates that the call completed suc-
cessfully and the results are valid.  The other values indi-
cate that some kind of error occurred on the server side
during the servicing of the procedure.  The error values are
derived from UNIX error numbers.

NFSERR_PERM:
     Not owner.  The caller does not have correct  ownership
     to perform the requested operation.

NFSERR_NOENT:
     No such file or directory.     The  file  or  directory
     specified does not exist.

NFSERR_IO:
     Some sort of hard  error occurred  when  the  operation
     was in progress.  This could be a disk error, for exam-
     ple.

NFSERR_NXIO:
     No such device or address.

NFSERR_ACCES:
     Permission  denied.  The  caller does   not   have  the
     correct permission to perform the requested operation.

NFSERR_EXIST:
     File exists.  The file specified already exists.


NFSERR_NODEV:
     No such device.

NFSERR_NOTDIR:
     Not   a  directory.    The  caller  specified   a  non-
     directory in a directory operation.

NFSERR_ISDIR:
     Is a directory.  The caller specified a directory in  a
     non-directory operation.

NFSERR_FBIG:
     File too large.   The  operation caused a file to  grow
     beyond the server's limit.

NFSERR_NOSPC:
     No space left on  device.   The  operation  caused  the
     server's filesystem to reach its limit.

NFSERR_ROFS:
     Read-only filesystem.  Write attempted on  a  read-only
     filesystem.

NFSERR_NAMETOOLONG:
     File name   too   long.  The file  name  in  an  opera-
     tion was too long.

NFSERR_NOTEMPTY:
     Directory    not  empty.   Attempted   to    remove   a
     directory that was not empty.

NFSERR_DQUOT:
     Disk quota exceeded.  The client's disk  quota  on  the
     server has been exceeded.

NFSERR_STALE:
     The  "fhandle" given in   the  arguments  was  invalid.
     That  is,  the  file referred to by that file handle no
     longer exists, or access to it has been revoked.

NFSERR_WFLUSH:
     The server's  write cache  used  in the WRITECACHE call
     got flushed to disk.
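
Because the values on the wire are fixed while local error
numbers can differ from one system to another, a client may
want to translate each stat value by name rather than by
number.  A minimal sketch in C follows; the names nfsstat and
nfsstat_to_errno are assumptions (nfsstat is used to avoid a
clash with stat(2)), and mapping NFSERR_WFLUSH and unknown
values to EIO is an arbitrary choice made for this example.

```c
#include <assert.h>
#include <errno.h>

enum nfsstat {
    NFS_OK = 0, NFSERR_PERM = 1, NFSERR_NOENT = 2, NFSERR_IO = 5,
    NFSERR_NXIO = 6, NFSERR_ACCES = 13, NFSERR_EXIST = 17,
    NFSERR_NODEV = 19, NFSERR_NOTDIR = 20, NFSERR_ISDIR = 21,
    NFSERR_FBIG = 27, NFSERR_NOSPC = 28, NFSERR_ROFS = 30,
    NFSERR_NAMETOOLONG = 63, NFSERR_NOTEMPTY = 66, NFSERR_DQUOT = 69,
    NFSERR_STALE = 70, NFSERR_WFLUSH = 99
};

/* Translate a protocol status into the local errno value. */
int nfsstat_to_errno(enum nfsstat s)
{
    switch (s) {
    case NFS_OK:             return 0;
    case NFSERR_PERM:        return EPERM;
    case NFSERR_NOENT:       return ENOENT;
    case NFSERR_IO:          return EIO;
    case NFSERR_NXIO:        return ENXIO;
    case NFSERR_ACCES:       return EACCES;
    case NFSERR_EXIST:       return EEXIST;
    case NFSERR_NODEV:       return ENODEV;
    case NFSERR_NOTDIR:      return ENOTDIR;
    case NFSERR_ISDIR:       return EISDIR;
    case NFSERR_FBIG:        return EFBIG;
    case NFSERR_NOSPC:       return ENOSPC;
    case NFSERR_ROFS:        return EROFS;
    case NFSERR_NAMETOOLONG: return ENAMETOOLONG;
    case NFSERR_NOTEMPTY:    return ENOTEMPTY;
    case NFSERR_DQUOT:       return EDQUOT;
    case NFSERR_STALE:       return ESTALE;
    case NFSERR_WFLUSH:      return EIO;   /* no local equivalent (assumption) */
    }
    return EIO;                            /* unknown value from the wire */
}
```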


3.4.2.  ftype

        enum ftype {
                NFNON = 0,
                NFREG = 1,
                NFDIR = 2,
                NFBLK = 3,
                NFCHR = 4,
                NFLNK = 5
        };

The enumeration ftype gives the type of a  file.   The  type
NFNON  indicates  a non-file, NFREG is a regular file, NFDIR
is a directory, NFBLK is a block-special device, NFCHR is  a
character-special device, and NFLNK is a symbolic link.

3.4.3.  fhandle

        typedef opaque fhandle[FHSIZE];

The fhandle is the file handle passed between the server and
the  client. All file operations are done using file handles
to refer to a file or directory.  The file handle  can  con-
tain whatever information the server needs to distinguish an
individual file.

3.4.4.  timeval

        struct timeval {
                unsigned int seconds;
                unsigned int useconds;
        };

The  timeval  structure  is  the  number  of   seconds   and
microseconds  since midnight January 1, 1970, Greenwich Mean
Time.  It is used to pass time and date information.


3.4.5.  fattr

        struct fattr {
                ftype        type;
                unsigned int mode;
                unsigned int nlink;
                unsigned int uid;
                unsigned int gid;
                unsigned int size;
                unsigned int blocksize;
                unsigned int rdev;
                unsigned int blocks;
                unsigned int fsid;
                unsigned int fileid;
                timeval      atime;
                timeval      mtime;
                timeval      ctime;
        };

The fattr structure  contains  the  attributes  of  a  file;
"type"  is  the  type  of the file; "nlink" is the number of
hard links to the file (the number of  different  names  for
the  same  file); "uid" is the user identification number of
the owner of the file; "gid"  is  the  group  identification
number of the group of the file; "size" is the size in bytes
of the file; "blocksize" is the size in bytes of a block  of
the  file;  "rdev" is the device number of the file if it is
type NFCHR or NFBLK; "blocks" is the number  of  blocks  the
file  takes up on disk; "fsid" is the file system identifier
for the filesystem containing the file; "fileid" is a number
that  uniquely  identifies  the  file within its filesystem;
"atime" is the time when the  file  was  last  accessed  for
either read or write; "mtime" is the time when the file data
was last modified (written); and "ctime" is  the  time  when
the  status  of  the  file was last changed.  Writing to the
file also changes "ctime" if the size of the file changes.

"mode" is the access mode encoded as a set of bits.   Notice
that the file type is specified both in the mode bits and in
the file type.  This is really a bug  in  the  protocol  and
will  be  fixed  in future versions.  The descriptions given
below specify the bit positions using octal numbers.


____________________________________________________________________________
   Bit                               Description
____________________________________________________________________________
 0040000   This is a directory; "type" field should be NFDIR.
 0020000   This is a character special file; "type" field should be NFCHR.
 0060000   This is a block special file; "type" field should be NFBLK.
 0100000   This is a regular file; "type" field should be NFREG.
 0120000   This is a symbolic link file;  "type" field should be NFLNK.
 0140000   This is a named socket; "type" field should be NFNON.
 0004000   Set user id on execution.
 0002000   Set group id on execution.
 0001000   Save swapped text even after use.
 0000400   Read permission for owner.
 0000200   Write permission for owner.
 0000100   Execute and search permission for owner.
 0000040   Read permission for group.
 0000020   Write permission for group.
 0000010   Execute and search permission for group.
 0000004   Read permission for others.
 0000002   Write permission for others.
 0000001   Execute and search permission for others.
____________________________________________________________________________

Notes:

     The bits are  the same as the mode   bits returned   by
     the  stat(2)  system call in the UNIX system.  The file
     type is  specified  both in the mode  bits  and in  the
     file type.   This   is fixed  in future versions.

     The "rdev" field in  the  attributes  structure  is  an
     operating  system  specific device specifier.  It  will
     be  removed and generalized in the next revision of the
     protocol.
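
Since the file type appears both in "mode" and in "type", an
implementation can check that the two agree.  The sketch
below decodes the type bits from the table above.  The mask
0170000 (the UNIX S_IFMT value) and the function names are
assumptions of this example and are not defined by the
protocol text.

```c
#include <assert.h>

enum ftype { NFNON = 0, NFREG = 1, NFDIR = 2, NFBLK = 3, NFCHR = 4, NFLNK = 5 };

#define NFSMODE_FMT 0170000u    /* assumed mask covering the file type bits */

/* Derive the ftype implied by the type bits of "mode". */
enum ftype mode_to_ftype(unsigned int mode)
{
    switch (mode & NFSMODE_FMT) {
    case 0040000: return NFDIR;
    case 0020000: return NFCHR;
    case 0060000: return NFBLK;
    case 0100000: return NFREG;
    case 0120000: return NFLNK;
    default:      return NFNON;   /* includes 0140000, a named socket */
    }
}

/* Nonzero if the "mode" and "type" fields of a fattr agree. */
int mode_matches_type(unsigned int mode, enum ftype type)
{
    return mode_to_ftype(mode) == type;
}
```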


3.4.6.  sattr

        struct sattr {
                unsigned int mode;
                unsigned int uid;
                unsigned int gid;
                unsigned int size;
                timeval      atime;
                timeval      mtime;
        };

The sattr structure contains the file attributes  which  can
be  set  from  the  client.   The fields are the same as for
fattr above.  A "size" of zero  means  the  file  should  be
truncated.   A  value of -1 indicates a field that should be
ignored.


3.4.7.  filename

        typedef string filename<MAXNAMLEN>;

The type filename is used for  passing file names  or  path-
name components.


3.4.8.  path

        typedef string path<MAXPATHLEN>;

The type path is a pathname.  The server considers it  as  a
string with no internal structure,  but to the client  it is
the name of a node in a filesystem tree.


3.4.9.  attrstat

        union attrstat switch (stat status) {
                case NFS_OK:
                        fattr attributes;
                default:
                        void;
        };

The attrstat structure is a  common  procedure  result.   It
contains  a   "status"  and,  if  the call   succeeded,   it
also contains  the attributes  of  the  file  on  which  the
operation was done.


3.4.10.  diropargs

        struct diropargs {
                fhandle  dir;
                filename name;
        };

The diropargs structure is used  in  directory   operations.
The  "fhandle"  "dir" is the directory in  which to find the
file "name".  A directory operation  is  one  in  which  the
directory is affected.


3.4.11.  diropres

        union diropres switch (stat status) {
                case NFS_OK:
                        struct {
                                fhandle file;
                                fattr   attributes;
                        } diropok;
                default:
                        void;
        };

The results of a directory operation   are  returned   in  a
diropres  structure.  If the call succeeded, a new file han-
dle "file" and the "attributes" associated  with  that  file
are  returned along with the "status".

3.5.  Server Procedures

The  protocol definition  is given as    a   set   of   pro-
cedures  with arguments  and results defined using the   RPC
language.   A brief description of the function of each pro-
cedure  should provide enough information to allow implemen-
tation.

All of  the procedures  in   the NFS  protocol  are  assumed
to   be  synchronous.    When  a  procedure   returns to the
client, the client can assume that the  operation  has  com-
pleted  and  any data associated with the request is  now on
stable storage.  For  example, a client WRITE request    may
cause   the    server   to  update  data  blocks, filesystem
information blocks (such as  indirect   blocks),   and  file
attribute   information  (size   and   modify  times).  When
the WRITE returns to the client, it  can  assume   that  the
write   is  safe,  even in case of  a server  crash, and  it
can discard the  data written.  This  is  a  very  important
part   of  the  statelessness  of the server.  If the server
waited to flush data from remote requests, the client  would
have  to   save those requests so that  it could resend them
in case of a server crash.


/*
* Remote file service routines
*/
program NFS_PROGRAM {
        version NFS_VERSION {
                void        NFSPROC_NULL(void)              = 0;
                attrstat    NFSPROC_GETATTR(fhandle)        = 1;
                attrstat    NFSPROC_SETATTR(sattrargs)      = 2;
                void        NFSPROC_ROOT(void)              = 3;
                diropres    NFSPROC_LOOKUP(diropargs)       = 4;
                readlinkres NFSPROC_READLINK(fhandle)       = 5;
                readres     NFSPROC_READ(readargs)          = 6;
                void        NFSPROC_WRITECACHE(void)        = 7;
                attrstat    NFSPROC_WRITE(writeargs)        = 8;
                diropres    NFSPROC_CREATE(createargs)      = 9;
                stat        NFSPROC_REMOVE(diropargs)       = 10;
                stat        NFSPROC_RENAME(renameargs)      = 11;
                stat        NFSPROC_LINK(linkargs)          = 12;
                stat        NFSPROC_SYMLINK(symlinkargs)    = 13;
                diropres    NFSPROC_MKDIR(createargs)       = 14;
                stat        NFSPROC_RMDIR(diropargs)        = 15;
                readdirres  NFSPROC_READDIR(readdirargs)    = 16;
                statfsres   NFSPROC_STATFS(fhandle)         = 17;
        } = 2;
} = 100003;


3.5.1.  Do Nothing

        void
        NFSPROC_NULL(void) = 0;

This procedure does no work.   It is made available  in  all
RPC services to allow server response testing and timing.

3.5.2.  Get File Attributes

        attrstat
        NFSPROC_GETATTR (fhandle) = 1;

If the reply  status is NFS_OK, then  the  reply  attributes
contains  the  attributes  for  the  file given by the input
fhandle.


3.5.3.  Set File Attributes

        struct sattrargs {
                fhandle file;
                sattr attributes;
        };

        attrstat
        NFSPROC_SETATTR (sattrargs) = 2;

The  "attributes" argument  contains fields which are either
-1  or  are  the  new value for  the  attributes of  "file".
If the reply status is NFS_OK, then  the   reply  attributes
have  the  attributes of the file after the "SETATTR" opera-
tion has completed.

Note: The use of -1 to indicate an unused field  in  "attri-
butes" is changed in the next version of the protocol.

3.5.4.  Get Filesystem Root

        void
        NFSPROC_ROOT(void) = 3;

Obsolete.  This procedure is no longer used because  finding
the root file handle of a filesystem requires moving
pathnames between client and server.  To do this  right,  we
would have to define a network standard representation of
pathnames.  Instead, the function of looking up the root file
handle is done by the MNTPROC_MNT procedure.  (See the Mount
Protocol Definition below for details.)

3.5.5.  Look Up File Name

        diropres
        NFSPROC_LOOKUP(diropargs) = 4;

If  the reply "status"  is NFS_OK, then  the  reply   "file"
and  reply  "attributes"  are the file handle and attributes
for the file "name" in the directory given by "dir"  in  the
argument.


3.5.6.  Read From Symbolic Link

        union readlinkres switch (stat status) {
                case NFS_OK:
                        path data;
                default:
                        void;
        };

        readlinkres
        NFSPROC_READLINK(fhandle) = 5;

If "status" has the value NFS_OK, then the reply  "data"  is
the  data in the symbolic link given by the file referred to
by the fhandle argument.

Note:   since    NFS  always   parses  pathnames     on  the
client,  the  pathname  in  a symbolic  link may  mean some-
thing  different (or be meaningless) on a  different  client
or on the server if  a different pathname syntax is used.

3.5.7.  Read From File

        struct readargs {
                fhandle file;
                unsigned offset;
                unsigned count;
                unsigned totalcount;
        };

        union readres switch (stat status) {
                case NFS_OK:
                        fattr attributes;
                        nfsdata data;
                default:
                        void;
        };

        readres
        NFSPROC_READ(readargs) = 6;

Returns  up  to  "count" bytes of   "data"  from   the  file
given by "file", starting at "offset" bytes from  the begin-
ning of the file.  The first byte of the file is  at  offset
zero.   The  file  attributes after the read takes place are
returned in "attributes".

Note: The  argument "totalcount" is  unused, and is  removed
in the next protocol revision.


3.5.8.  Write to Cache

        void
        NFSPROC_WRITECACHE(void) = 7;

To be used in the next protocol revision.

3.5.9.  Write to File

        struct writeargs {
                fhandle file;
                unsigned beginoffset;
                unsigned offset;
                unsigned totalcount;
                nfsdata data;
        };

        attrstat
        NFSPROC_WRITE(writeargs) = 8;

Writes   "data" beginning  "offset"  bytes  from the  begin-
ning  of "file".  The first byte  of  the file is at  offset
zero.  If  the reply "status" is  NFS_OK,  then   the  reply
"attributes"  contains the attributes  of the file after the
write has  completed.  The write operation is atomic.   Data
from  this   call  to WRITE will not be mixed with data from
another client's calls.

Note:  The  arguments  "beginoffset"  and  "totalcount"  are
ignored and are removed in the next protocol revision.

3.5.10.  Create File

        struct createargs {
                diropargs where;
                sattr attributes;
        };

        diropres
        NFSPROC_CREATE(createargs) = 9;

The file "where.name" is created in the directory  given  by
"where.dir".  The initial attributes of the new file are given
by "attributes".  A reply "status"  of NFS_OK indicates that
the   file  was  created,  and  reply  "file"    and   reply
"attributes"  are    its file handle and  attributes.    Any
other  reply  "status"  means that  the operation failed and
no file was created.

Note: This  routine should pass  an exclusive  create  flag,
meaning "create the file only if it is not already there".


3.5.11.  Remove File

        stat
        NFSPROC_REMOVE(diropargs) = 10;

The file "name" is  removed from  the  directory   given  by
"dir".    A  reply  of  NFS_OK means the directory entry was
removed.

Note: possibly non-idempotent operation.

3.5.12.  Rename File

        struct renameargs {
                diropargs from;
                diropargs to;
        };

        stat
        NFSPROC_RENAME(renameargs) = 11;

The existing file "from.name" in   the  directory  given  by
"from.dir" is renamed to "to.name" in the directory given by
"to.dir".  If the reply  is NFS_OK, the file  was   renamed.
The  RENAME  operation is atomic on the server; it cannot be
interrupted in the middle.

Note: possibly non-idempotent operation.

3.5.13.  Create Link to File

        struct linkargs {
                fhandle from;
                diropargs to;
        };

        stat
        NFSPROC_LINK(linkargs) = 12;

Creates the  file "to.name"  in the  directory   given    by
"to.dir",  which  is  a hard link to the existing file given
by "from".  If the  return  value  is  NFS_OK,  a  link  was
created.  Any other return value indicates an error, and the
link was not created.

A hard link should have the property that changes  to either
of  the  linked  files  are reflected in both files.  When a
hard link is made to a  file, the attributes  for  the  file
should   have   a value for "nlink" that is one greater than
the value before the link.

Note: possibly non-idempotent operation.


3.5.14.  Create Symbolic Link

        struct symlinkargs {
                diropargs from;
                path to;
                sattr attributes;
        };

        stat
        NFSPROC_SYMLINK(symlinkargs) = 13;

Creates the  file "from.name"  with   ftype  NFLNK  in   the
directory  given by "from.dir".   The new file contains  the
pathname "to" and has initial attributes  given  by  "attri-
butes".  If  the return value is NFS_OK, a link was created.
Any other return value indicates an error, and the link  was
not created.

A symbolic  link is  a pointer to another file.    The  name
given  in  "to"  is   not  interpreted  by  the server, only
stored in  the  newly created file.  When the client  refer-
ences  a  file  that is a symbolic link, the contents of the
symbolic  link are normally transparently reinterpreted   as
a  pathname   to  substitute.   A READLINK operation returns
the data to the client for interpretation.

Note:  On UNIX servers the attributes are never used,  since
symbolic links always have mode 0777.

3.5.15.  Create Directory

        diropres
        NFSPROC_MKDIR (createargs) = 14;

The new directory "where.name" is created in  the  directory
given  by  "where.dir".   The  initial attributes of the new
directory are given by "attributes".  A  reply  "status"  of
NFS_OK  indicates  that  the  new directory was created, and
reply "file" and  reply "attributes" are  its  file   handle
and  attributes.  Any  other  reply "status"  means that the
operation failed and no directory was created.

Note: possibly non-idempotent operation.

3.5.16.  Remove Directory

        stat
        NFSPROC_RMDIR(diropargs) = 15;

The existing empty directory "name" in the  directory  given
by  "dir" is removed.  If the reply is NFS_OK, the directory
was removed.


Note: possibly non-idempotent operation.

3.5.17.  Read From Directory

        struct readdirargs {
                fhandle dir;
                nfscookie cookie;
                unsigned count;
        };

        struct entry {
                unsigned fileid;
                filename name;
                nfscookie cookie;
                entry *nextentry;
        };

        union readdirres switch (stat status) {
                case NFS_OK:
                        struct {
                                entry *entries;
                                bool eof;
                        } readdirok;
                default:
                        void;
        };

        readdirres
        NFSPROC_READDIR (readdirargs) = 16;

Returns a variable number of   directory  entries,   with  a
total  size of up to "count" bytes, from the directory given
by "dir".  If the returned  value of  "status"   is  NFS_OK,
then   it   is followed  by a variable  number  of "entry"s.
Each "entry"  contains   a  "fileid"  which  consists  of  a
unique  number   to identify the  file within  a filesystem,
the  "name" of the  file,  and  a  "cookie"  which    is  an
opaque  pointer  to  the next entry in  the  directory.  The
cookie is used   in  the  next  READDIR  call  to  get  more
entries   starting  at a given point in  the directory.  The
special cookie zero (all  bits zero) can be used to get  the
entries  starting   at the  beginning of the directory.  The
"fileid" field should be the same number as the "fileid"  in
the attributes of the file.  (See the  Basic  Data  Types
section.) The "eof" flag  has a value of TRUE if  there  are
no more entries in the directory.
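
The cookie-driven loop a client runs can be sketched as
follows.  Everything here is a simplified stand-in: toy_dir
and toy_readdir play the part of the server, the cookie is
modelled as a 32-bit index (to the client it is opaque), and
the reply limit is counted in entries, whereas the real
"count" argument is in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct entry {
    uint32_t    fileid;
    const char *name;
    uint32_t    cookie;     /* opaque to the client; an index in this toy */
};

/* Toy stand-in for a directory held by the server. */
static const char *toy_dir[] = { ".", "..", "a", "b", "c" };
#define TOY_DIR_LEN (sizeof toy_dir / sizeof toy_dir[0])

/* Toy READDIR: return up to "max" entries starting at "cookie" and
 * set *eof when the end of the directory has been reached. */
size_t toy_readdir(uint32_t cookie, size_t max, struct entry *out, int *eof)
{
    size_t n = 0;

    while (n < max && cookie < TOY_DIR_LEN) {
        out[n].fileid = 100 + cookie;
        out[n].name   = toy_dir[cookie];
        out[n].cookie = cookie + 1;     /* where the next call resumes */
        cookie++;
        n++;
    }
    *eof = (cookie >= TOY_DIR_LEN);
    return n;
}

/* Client side: start with cookie zero and feed the cookie of the last
 * entry received into the next call until eof is TRUE. */
size_t count_entries(size_t batch)
{
    struct entry buf[8];
    uint32_t cookie = 0;
    size_t total = 0;
    int eof = 0;

    assert(batch >= 1 && batch <= 8);
    while (!eof) {
        size_t n = toy_readdir(cookie, batch, buf, &eof);
        if (n == 0)
            break;                      /* nothing returned: stop */
        total += n;
        cookie = buf[n - 1].cookie;
    }
    return total;
}
```

Whatever batch size is used, the client sees each of the five
entries exactly once.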


3.5.18.  Get Filesystem Attributes

        union statfsres switch (stat status) {
                case NFS_OK:
                        struct {
                                unsigned tsize;
                                unsigned bsize;
                                unsigned blocks;
                                unsigned bfree;
                                unsigned bavail;
                        } info;
                default:
                        void;
        };

        statfsres
        NFSPROC_STATFS(fhandle) = 17;

If the reply "status" is NFS_OK, then the reply "info" gives
the attributes for the filesystem that contains the file
referred to by the input fhandle.  The attribute fields con-
tain the following values:

tsize:
     The optimum transfer size of the server in bytes.  This
     is the number of bytes the server would like to have in
     the data part of READ and WRITE requests.

bsize:
     The block size in bytes of the filesystem.

blocks:
     The total number of "bsize" blocks on the filesystem.

bfree:
     The number of free "bsize" blocks on the filesystem.

bavail:
     The number of "bsize" blocks available to non-privileged
     users.

Note: This call does not  work well  if  a   filesystem  has
variable size blocks.

4.  NFS Implementation Issues

The NFS protocol is designed to be operating system indepen-
dent, but since this version was designed in a UNIX environ-
ment, many operations have semantics similar to  the  opera-
tions  of the UNIX file system.  This section discusses some
of the implementation-specific semantic issues.

4.1.  Server/Client Relationship

The NFS protocol is designed to allow servers to be as  sim-
ple  and  general  as possible.  Sometimes the simplicity of
the server can be a problem, if the client wants  to  imple-
ment complicated filesystem semantics.

For example, some operating systems allow  removal  of  open
files.   A  process  can  open a file and, while it is open,
remove it from the directory.  The  file  can  be  read  and
written  as  long  as the process keeps it open, even though
the file has no name in the filesystem.   It  is  impossible
for  a  stateless  server to implement these semantics.  The
client can do some tricks  such  as  renaming  the  file  on
remove,  and only removing it on close.  We believe that the
server provides enough functionality to implement most  file
system semantics on the client.

Every NFS client can  also  potentially  be  a  server,  and
remote  and  local  mounted filesystems can be freely inter-
mixed.  This leads  to  some  interesting  problems  when  a
client  travels down the directory tree of a remote filesys-
tem and reaches the mount point on the  server  for  another
remote filesystem.  Allowing the server to follow the second
remote mount would require loop  detection,  server  lookup,
and  user  revalidation.   Instead,  we  decided  not to let
clients cross a server's mount point.  When a client does  a
LOOKUP  on  a  directory  on  which the server has mounted a
filesystem, the client sees the underlying directory instead
of  the  mounted  directory.   A client can do remote mounts
that  match  the  server's  mount  points  to  maintain  the
server's view.


4.2.  Pathname Interpretation

There are a few complications to the rule that pathnames are
always  parsed  on  the client.  For example, symbolic links
could have different interpretations on  different  clients.
Another  common  problem for non-UNIX implementations is the
special interpretation of the pathname  ".."   to  mean  the
parent  of a given directory.  The next revision of the pro-
tocol uses an explicit flag to indicate the parent instead.

4.3.  Permission Issues

The NFS protocol, strictly speaking,  does  not  define  the
permission  checking  used   by  servers.   However,   it is
expected that a server will do normal operating system  per-
mission checking using AUTH_UNIX style authentication as the
basis of its protection  mechanism.   The  server  gets  the
client's  effective  "uid",  effective  "gid", and groups on
each call and uses them  to  check  permission.   There  are
various problems with this method that can be  resolved  in
interesting ways.

Using "uid" and "gid" implies that  the  client  and  server
share  the  same  "uid"  list.  Every server and client pair
must have the same mapping from user to "uid" and from group
to  "gid".   Since  every  client can also be a server, this
tends to imply  that  the  whole  network  shares  the  same
"uid/gid"  space.   AUTH_DES  (and the  next revision of the
NFS protocol) uses string  names  instead  of  numbers,  but
there are still complex problems to be solved.

Another problem arises due  to  the  usually  stateful  open
operation.   Most operating systems check permission at open
time, and then check that the file is open on each read  and
write  request.   With  stateless servers, the server has no
idea that the file is open and must do  permission  checking
on  each read and write call.  On a local filesystem, a user
can open a file and then change the permissions so  that  no
one  is allowed to touch it, but will still be able to write
to the file because it is open.  On a remote filesystem,  by
contrast, the write would fail.  To get around this problem,
the server's permission checking algorithm should allow  the
owner  of  a  file to access it regardless of the permission
setting.

A similar problem has to do with paging in from a file  over
the  network.   The operating system usually checks for exe-
cute permission before opening a file for demand paging, and
then reads blocks from the open file.  The file may not have
read permission, but after it is opened it  doesn't  matter.
An  NFS  server  cannot tell the difference between a normal
file read and a demand page-in read.  To make this work, the
server  allows  reading  of  files if the "uid" given in the
call has execute or read permission on the file.
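
The two relaxations described above (the owner may always access the
file, and execute permission is sufficient for reading) can be sketched
in C.  This is only an illustration under stated assumptions: the types
file_attr and cred and the functions may_read and may_write are
hypothetical names, not part of the NFS protocol, and the credentials
here carry a single group rather than the full AUTH_UNIX group list.

```c
#include <assert.h>
#include <stdbool.h>

/* Classic UNIX permission bits within one owner/group/other triplet. */
enum { PERM_R = 4, PERM_W = 2, PERM_X = 1 };

/* Hypothetical, simplified file attributes (not a protocol structure). */
struct file_attr {
    unsigned uid;   /* owner of the file */
    unsigned gid;   /* owning group */
    unsigned mode;  /* low nine permission bits, e.g. 0644 */
};

/* Hypothetical credentials taken from a call (one group only). */
struct cred {
    unsigned uid;
    unsigned gid;
};

/* Select the permission triplet that applies to this caller. */
static unsigned perm_bits(const struct file_attr *f, const struct cred *c)
{
    if (c->uid == f->uid) return (f->mode >> 6) & 7;
    if (c->gid == f->gid) return (f->mode >> 3) & 7;
    return f->mode & 7;
}

/* Read check: the owner always passes; others need read OR execute. */
static bool may_read(const struct file_attr *f, const struct cred *c)
{
    if (c->uid == f->uid) return true;
    return (perm_bits(f, c) & (PERM_R | PERM_X)) != 0;
}

/* Write check: the owner always passes; others need write permission. */
static bool may_write(const struct file_attr *f, const struct cred *c)
{
    if (c->uid == f->uid) return true;
    return (perm_bits(f, c) & PERM_W) != 0;
}
```

With a file of mode 0010, for instance, the owner may still read and
write, a member of the owning group may read (because of the execute
bit) but not write, and everyone else is refused.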

In most operating systems, a particular user (with user  ID
zero)  has access to all files no matter what permission and
ownership they have.  This "super-user" permission  may  not
be  allowed  on  the  server,  since  anyone  who can become
super-user on their workstation could  gain  access  to  all
remote  files.  The UNIX server by default maps user id 0 to
-2 before doing its access checking.  This works except  for
NFS  root  filesystems,  where  super-user  access cannot be
avoided.
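
The mapping of user id 0 can be sketched as a small C function applied
to the caller's "uid" before any access check.  The name squash_uid and
the allow_root flag are hypothetical; the flag stands in for the NFS
root filesystem case mentioned above.  Representing -2 as an unsigned
value is an assumption of this sketch.

```c
#include <assert.h>

#define UID_ROOT   0u
#define UID_NOBODY ((unsigned)-2)   /* "-2" stored in an unsigned uid */

/*
 * Return the uid to use for access checking.  User id 0 is replaced
 * by -2 unless the caller is explicitly allowed super-user access.
 */
static unsigned squash_uid(unsigned uid, int allow_root)
{
    if (uid == UID_ROOT && !allow_root)
        return UID_NOBODY;
    return uid;
}
```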

4.4.  Setting RPC Parameters

Various file system parameters and options should be set  at
mount time.  The mount protocol is described in the appendix
below.  For example, both "soft" and "hard" mounts are  usually
provided.   Soft-mounted  file  systems return
errors when RPC operations fail (after  a  given  number  of
optional  retransmissions),  while hard mounted file systems
continue to retransmit forever.   Clients  and  servers  may
need to keep caches of recent operations to help avoid prob-
lems with non-idempotent operations.
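
The difference between the two mount styles can be sketched as a retry
loop in C.  Everything here is hypothetical: send_fn stands for
whatever routine transmits the request and waits for a reply, and
call_with_retry and flaky_send are illustrative names only.  Timeouts
and backoff are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical transport hook: returns true when a reply arrived. */
typedef bool (*send_fn)(void *ctx);

enum mount_kind { MOUNT_SOFT, MOUNT_HARD };

/*
 * Send a request, retransmitting on failure.  Returns 0 on success,
 * or -1 when a soft mount has used up max_retrans retransmissions.
 * A hard mount never returns -1; it loops until a reply arrives.
 */
static int call_with_retry(enum mount_kind kind, unsigned max_retrans,
                           send_fn send, void *ctx)
{
    unsigned failures = 0;

    for (;;) {
        if (send(ctx))
            return 0;
        failures++;
        if (kind == MOUNT_SOFT && failures > max_retrans)
            return -1;
    }
}

/* Demonstration transport: fails until the counter reaches zero. */
static bool flaky_send(void *ctx)
{
    unsigned *remaining = ctx;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

With a transport that fails three times, a soft mount allowing two
retransmissions reports an error, while a hard mount keeps trying and
eventually succeeds.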


5.  Mount Protocol Definition


5.1.  Introduction

The mount protocol is separate from, but related to, the NFS
protocol.  It provides operating system specific services to
get the NFS off the ground -- looking up server path  names,
validating  user  identity, and checking access permissions.
Clients use the mount protocol to get the first file handle,
which allows them entry into a remote filesystem.

The mount protocol is kept separate from the NFS protocol to
make  it  easy to plug in new access checking and validation
methods without changing the NFS server protocol.

Notice that the protocol definition implies stateful servers
because  the  server  maintains  a  list  of clients' mount
requests.  The mount list information is  not  critical  for
the  correct functioning of either the client or the server.
It is intended for advisory use only, for example,  to  warn
possible clients when a server is going down.

Version one of the mount protocol is used with  version  two
of the NFS protocol.  The only connecting point is the fhan-
dle structure, which is the same for both protocols.

5.2.  RPC Information

Authentication
     The mount service uses  AUTH_UNIX  and  AUTH_DES  style
     authentication only.

Transport Protocols
     The mount service  is  currently  supported  on  UDP/IP
     only.

Port Number
     Consult the server's    portmapper, described  in   the
     Remote  Procedure  Calls:  Protocol  Specification,  to
     find  the  port number on which the  mount  service  is
     registered.

5.3.  Sizes of XDR Structures

These  are  the sizes,    given   in   decimal    bytes,  of
various XDR structures used in the protocol:


        /* The maximum number of bytes in a pathname argument */
        const MNTPATHLEN = 1024;

        /* The maximum number of bytes in a name argument */
        const MNTNAMLEN = 255;

        /* The size in bytes of the opaque file handle */
        const FHSIZE = 32;


5.4.  Basic Data Types

This section presents the data  types used  by   the   mount
protocol.   In many cases they are similar to the types used
in NFS.

5.4.1.  fhandle

        typedef opaque fhandle[FHSIZE];

The type fhandle is the file handle that the  server  passes
to  the  client.   All  file operations are done  using file
handles  to refer to a  file  or directory.   The  file han-
dle   can   contain whatever information the server needs to
distinguish an individual file.

This  is the  same as the "fhandle" XDR definition  in  ver-
sion  2  of  the  NFS protocol;  see Basic Data Types in the
definition of the NFS protocol, above.

5.4.2.  fhstatus

        union fhstatus switch (unsigned status) {
                case 0:
                        fhandle directory;
                default:
                        void;
        };

The type fhstatus is a union.  If  a  "status"  of  zero  is
returned,  the   call completed   successfully, and  a  file
handle    for    the  "directory"   follows.   A    non-zero
status  indicates   some   sort  of error.  In this case the
status is a UNIX error number.
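
As an illustration of how this union appears on the wire, the following
C sketch decodes an fhstatus from a byte buffer by hand.  It relies on
the standard XDR encoding (a four-byte, big-endian discriminant followed
by FHSIZE opaque bytes in the zero case).  The C struct layout and the
function names xdr_get_u32 and decode_fhstatus are assumptions made for
this example; generated XDR routines would normally do this work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FHSIZE 32

/* Hypothetical C view of the fhstatus union. */
struct fhstatus {
    uint32_t status;                 /* 0 on success, else an error number */
    unsigned char directory[FHSIZE]; /* valid only when status == 0 */
};

/* XDR integers are four bytes, most significant byte first. */
static uint32_t xdr_get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Decode an fhstatus from buf.  Returns the number of bytes consumed,
 * or 0 if the buffer is too short.
 */
static size_t decode_fhstatus(const unsigned char *buf, size_t len,
                              struct fhstatus *out)
{
    if (len < 4)
        return 0;
    out->status = xdr_get_u32(buf);
    if (out->status != 0)
        return 4;                    /* default arm: void, nothing follows */
    if (len < 4 + FHSIZE)
        return 0;
    memcpy(out->directory, buf + 4, FHSIZE);
    return 4 + FHSIZE;
}
```

A successful reply therefore occupies 36 bytes, and an error reply only
the 4 bytes of the status.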

5.4.3.  dirpath

        typedef string dirpath<MNTPATHLEN>;

The type dirpath is a server pathname of a directory.


5.4.4.  name

        typedef string name<MNTNAMLEN>;

The type name is an arbitrary string used for various names.
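
Both dirpath and name are bounded XDR strings.  As a sketch of what that
means on the wire, the C function below encodes a string as a four-byte
big-endian length followed by the bytes themselves, zero-padded to a
multiple of four, and refuses strings longer than the declared bound.
The function name xdr_put_string is hypothetical; it is not a routine
of the RPC library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MNTPATHLEN 1024
#define MNTNAMLEN  255

/*
 * Encode s as an XDR string<maxlen> into out.  Returns the number of
 * bytes written, or 0 if s is longer than maxlen or out is too small.
 */
static size_t xdr_put_string(const char *s, size_t maxlen,
                             unsigned char *out, size_t cap)
{
    size_t n = strlen(s);
    size_t padded = (n + 3) & ~(size_t)3;   /* round up to a multiple of 4 */

    if (n > maxlen || cap < 4 + padded)
        return 0;
    out[0] = (unsigned char)(n >> 24);      /* length, big-endian */
    out[1] = (unsigned char)(n >> 16);
    out[2] = (unsigned char)(n >> 8);
    out[3] = (unsigned char)n;
    memcpy(out + 4, s, n);
    memset(out + 4 + n, 0, padded - n);     /* zero padding */
    return 4 + padded;
}
```

For example, the dirpath "/usr" occupies 8 bytes when encoded, and
"/home" occupies 12 bytes (4 for the length, 5 for the text, 3 of
padding).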

5.5.  Server Procedures

The following sections define the RPC  procedures   supplied
by a mount server.

/*
* Protocol description for the mount program
*/

program MOUNTPROG {
/*
* Version 1 of the mount protocol used with
* version 2 of the NFS protocol.
*/
        version MOUNTVERS {
                void        MOUNTPROC_NULL(void)    = 0;
                fhstatus    MOUNTPROC_MNT(dirpath)  = 1;
                mountlist   MOUNTPROC_DUMP(void)    = 2;
                void        MOUNTPROC_UMNT(dirpath) = 3;
                void        MOUNTPROC_UMNTALL(void) = 4;
                exportlist  MOUNTPROC_EXPORT(void)  = 5;
        } = 1;
} = 100005;


5.5.1.  Do Nothing

        void
        MOUNTPROC_NULL(void) = 0;

This  procedure does no work.  It   is  made   available  in
all   RPC services to allow server response testing and tim-
ing.

5.5.2.  Add Mount Entry

        fhstatus
        MOUNTPROC_MNT(dirpath) = 1;

If the reply "status" is 0, then the reply "directory"  contains
the  file  handle  for  the  directory named by the "dirpath"
argument.  This file handle may be used in the NFS  protocol.
This  procedure also adds a new entry to the mount list for this
client mounting that directory.


5.5.3.  Return Mount Entries

        struct *mountlist {
                name      hostname;
                dirpath   directory;
                mountlist nextentry;
        };

        mountlist
        MOUNTPROC_DUMP(void) = 2;

Returns  the list  of   remote  mounted  filesystems.    The
"mountlist"  contains  one  entry  for  each  "hostname" and
"directory" pair.

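The linked list above travels on the wire as XDR optional data: each
entry is preceded by a four-byte boolean that is 1 when an entry
follows and 0 at the end of the list.  The C sketch below decodes such
a reply into a fixed array.  The struct mount_entry layout and the
function names are assumptions made for this example, and real code
would normally use generated XDR routines instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MNTNAMLEN  255
#define MNTPATHLEN 1024

/* Hypothetical flattened form of one mountlist entry. */
struct mount_entry {
    char hostname[MNTNAMLEN + 1];
    char directory[MNTPATHLEN + 1];
};

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read one XDR string at buf[*pos]; returns 0 on success, -1 on error. */
static int get_string(const unsigned char *buf, size_t len, size_t *pos,
                      char *dst, size_t maxlen)
{
    uint32_t n;
    size_t padded;

    if (len - *pos < 4)
        return -1;
    n = get_u32(buf + *pos);
    *pos += 4;
    padded = ((size_t)n + 3) & ~(size_t)3;
    if (n > maxlen || len - *pos < padded)
        return -1;
    memcpy(dst, buf + *pos, n);
    dst[n] = '\0';
    *pos += padded;
    return 0;
}

/* Decode up to max entries; returns the count, or -1 on bad input. */
static int decode_mountlist(const unsigned char *buf, size_t len,
                            struct mount_entry *out, int max)
{
    size_t pos = 0;
    int count = 0;

    for (;;) {
        if (len - pos < 4)
            return -1;
        if (get_u32(buf + pos) == 0)
            return count;               /* boolean 0: end of list */
        pos += 4;
        if (count == max)
            return -1;
        if (get_string(buf, len, &pos, out[count].hostname, MNTNAMLEN) ||
            get_string(buf, len, &pos, out[count].directory, MNTPATHLEN))
            return -1;
        count++;
    }
}
```

A reply listing the single pair ("foo", "/usr") is thus 24 bytes long:
a boolean, two encoded strings of 8 bytes each, and the terminating
boolean.
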
5.5.4.  Remove Mount Entry

        void
        MOUNTPROC_UMNT(dirpath) = 3;

Removes the mount list entry for the input "dirpath".

5.5.5.  Remove All Mount Entries

        void
        MOUNTPROC_UMNTALL(void) = 4;

Removes all of the mount list entries for this client.

5.5.6.  Return Export List

        struct *groups {
                name grname;
                groups grnext;
        };

        struct *exportlist {
                dirpath filesys;
                groups groups;
                exportlist next;
        };

        exportlist
        MOUNTPROC_EXPORT(void) = 5;

Returns a variable number  of  export  list  entries.   Each
entry  contains  a filesystem name and a list of groups that
are allowed  to  import  it.   The  filesystem  name  is  in
"filesys", and the group names are in the list "groups".

Note:  The exportlist should contain more information  about
the status of the filesystem, such as a read-only flag.
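
Once decoded, an export list is a list of filesystems, each carrying
its own list of group names.  The C sketch below mirrors the two
structures and walks them to test whether a group may import a
filesystem.  The function may_import is a hypothetical helper, and the
treatment of an empty group list (no match here) is an assumption of
this sketch; the text above does not define it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* In-memory mirrors of the "groups" and "exportlist" structures. */
struct groups {
    const char    *grname;
    struct groups *grnext;
};

struct exportlist {
    const char        *filesys;
    struct groups     *groups;
    struct exportlist *next;
};

/* Return true if "group" is listed for the exported "filesys". */
static bool may_import(const struct exportlist *list,
                       const char *filesys, const char *group)
{
    const struct exportlist *e;
    const struct groups *g;

    for (e = list; e != NULL; e = e->next) {
        if (strcmp(e->filesys, filesys) != 0)
            continue;
        for (g = e->groups; g != NULL; g = g->grnext)
            if (strcmp(g->grname, group) == 0)
                return true;
    }
    return false;
}
```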