Implementing Remote Procedure Calls

Notes by Adrian Bozdog (3/17/1999)

Previous notes by Lili Qiu

Goals

simplify building distributed systems
balance between powerful semantics and efficiency
secure communications
keep the semantics of RPCs as close as possible to those of local procedure calls (LPCs)

Major issues

precise semantics
integration of remote calls into existing programming systems
binding
suitable transport protocols
data integrity and security in an open communication network

Features of Environment

Cedar programming environment
Most computers are Dorados
Most communication is on the lightly loaded Ethernet (< 40%)
PUP transport protocol
Absence of shared addresses

Structure

user
user-stub: packs a specification of the target procedure and the arguments into a packet
RPCRuntime: transmission, acknowledgements, packet routing, encryption
server-stub: unpacks the data and invokes the corresponding procedure
server
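
The five pieces above can be sketched in a few lines. This is a minimal illustration, not the Cedar implementation: the transport is replaced by a direct function call, JSON stands in for the packet format, and the names `add`, `PROCEDURES`, `rpc_runtime`, and `user_stub_add` are assumptions made for the example.

```python
import json

# "server": a hypothetical remote procedure, used only for illustration.
def add(a, b):
    return a + b

# Procedures the server stub is willing to dispatch to (assumed name).
PROCEDURES = {"add": add}

def server_stub(packet):
    """Unpack the call packet, invoke the target procedure, pack the result."""
    request = json.loads(packet.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

def rpc_runtime(packet):
    """Stand-in for RPCRuntime: a real one would transmit, acknowledge,
    route, and encrypt; here the packet is handed straight to the server stub."""
    return server_stub(packet)

def user_stub_add(a, b):
    """User stub: to the caller it looks like an ordinary local procedure."""
    packet = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply = rpc_runtime(packet)
    return json.loads(reply.decode("utf-8"))["result"]
```

The user code simply calls `user_stub_add(2, 3)`; everything between the two stubs is hidden from it, which is the sense in which the RPC looks like a local call.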

Binding

Naming: type and instance
Alternatives for locating an exporter:
- broadcast protocol
- hard-coding the exporter's network address in the application programs
- Grapevine distributed database (the approach used):
  - Entry
    - Group entry: {RName, list of members' RNames}
    - Individual entry: {RName, connect-site}
  - Each exporting machine keeps a table of all its exported interfaces, with entries of the form {interface name, dispatcher, ID}
  - The ID is made unique by taking it from a 32-bit counter that is initialized to the real-time clock and constrained to stay below the current value of the clock. This limits the average rate of exports on one machine (calls to the export operation, not ordinary remote calls) to less than 1/sec.
  - If binding succeeds, the user-stub remembers:
    - exporter network address
    - ID
    - table index of the interface
  - On each call, the call packet contains:
    - ID
    - table index of the desired interface
    - entry point of the desired procedure
Effects:
- Importing an interface has no effect on the data structures of the exporting machine
- A binding breaks if the exporter crashes and restarts, because the ID held by the importer no longer matches any ID in the exporter's table
- Calls can be made only to procedures that have been explicitly exported through RPC
- The set of users allowed to export a particular interface can be restricted through access controls on updates to the database
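
The export table and the stale-binding effect can be illustrated with a small sketch. It is a simplification under stated assumptions: the class `Exporter` and its methods are hypothetical names, Grapevine lookup is omitted, the clock is injectable so the example is deterministic, and the rule that the counter must stay below the current clock value is not modeled.

```python
import itertools
import time

class Exporter:
    """Hypothetical model of one exporting machine."""

    def __init__(self, address, clock=time.time):
        self.address = address
        # The ID counter starts at the clock reading, so IDs issued after
        # a restart differ from those issued before it (assumed behavior).
        self._ids = itertools.count(int(clock()))
        self.table = []  # entries: (interface name, dispatcher, ID)

    def export(self, name, dispatcher):
        self.table.append((name, dispatcher, next(self._ids)))

    def bind(self, name):
        """What a successful import leaves with the user stub."""
        for index, (entry_name, _, uid) in enumerate(self.table):
            if entry_name == name:
                return self.address, uid, index
        raise LookupError(name)

    def handle_call(self, uid, index, entry_point, args):
        """Check the ID carried in the call packet before dispatching."""
        if index >= len(self.table) or self.table[index][2] != uid:
            raise ConnectionError("stale binding: exporter has restarted")
        dispatcher = self.table[index][1]
        return dispatcher[entry_point](*args)
```

A binding obtained before a restart carries an old ID, so the restarted exporter rejects the call instead of silently dispatching it to whatever now occupies that table slot.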

Transport protocol

Features and Goals
- There are substantial gains available if one designs and implements a transport protocol specifically for RPC
- Minimize the elapsed real-time between initiating a call and getting results
- Minimize the load imposed on a server by a substantial number of users
- Minimize state information and the time to set up a connection
- Semantics: if the call returns, the procedure has been executed exactly once; if an exception is reported instead, it has been executed at most once
- No timeout on how long a call may take (Is it acceptable?)
Call features
- Call identifier {machine ID, calling process ID, monotonic sequence no}
- The callee machine maintains the sequence number of the last call invoked by each activity {machine identifier, process}
- Call packet {call identifier, data specifying the procedure, arguments}
- Result packet { call identifier, results }
- There is no special connection establishment protocol (receiving a call packet from an unknown activity creates the connection implicitly)
- The server discards its state information for a connection that has been idle for a long time
- To keep call identifiers unique even after a crash, a conversation ID based on a 32-bit clock is passed along with the call sequence number on every call. (Any better alternative?)
- Probe packets are used to detect failure of communications and of the server process
- When a call needs several packets, every packet except the last one requires an explicit acknowledgement (Too inefficient! Any better solution?)
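
How the callee uses the call identifier to suppress duplicates can be sketched as follows. This is an illustrative model only: the class `Callee` and the return strings are assumptions, and a real callee would also retransmit the saved result for a repeated call and eventually discard idle state.

```python
class Callee:
    """Hypothetical callee-side duplicate filter."""

    def __init__(self):
        # activity (machine ID, process ID) -> (conversation ID, sequence number)
        self.last_call = {}
        self.executed = []

    def receive(self, conversation_id, machine_id, process_id, seq, proc, args):
        activity = (machine_id, process_id)
        last = self.last_call.get(activity)
        # A packet from an unknown activity creates the connection implicitly.
        # Tuples compare left to right, so a newer conversation ID (caller
        # restarted) is accepted even though its sequence numbers start over.
        if last is not None and (conversation_id, seq) <= last:
            return "duplicate"
        self.last_call[activity] = (conversation_id, seq)
        self.executed.append((proc, args))
        return "executed"
```

Because the conversation ID comes first in the comparison, a caller that crashes and restarts with sequence number 1 is not mistaken for a retransmission of an old call.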
Optimizations
- maintain a stock of idle server processes to avoid process creation
- reduce the number of process switches involved in a call
- use the subsequent packet to implicitly acknowledge the previous packet
- design a transport protocol that suits RPC best (bypass the software layers that correspond to the normal layers of a protocol hierarchy)
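
The saving from implicit acknowledgement can be made concrete with a simplified packet trace for a run of back-to-back single-packet calls. The model is an assumption made for illustration: the result packet acknowledges the call packet, the next call packet acknowledges the previous result, and only the final result needs an explicit acknowledgement. The function names are hypothetical.

```python
def explicit_ack_trace(n_calls):
    """Every call packet and every result packet gets its own acknowledgement."""
    trace = []
    for i in range(n_calls):
        trace += [f"call {i}", f"ack call {i}", f"result {i}", f"ack result {i}"]
    return trace

def implicit_ack_trace(n_calls):
    """Result i acknowledges call i; call i+1 acknowledges result i."""
    trace = []
    for i in range(n_calls):
        trace += [f"call {i}", f"result {i}"]
    if n_calls:
        # No further call follows, so the last result is acknowledged explicitly.
        trace.append(f"ack result {n_calls - 1}")
    return trace
```

Under this model, n calls cost 4n packets with explicit acknowledgements and 2n + 1 packets with implicit ones.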

Questions

What are the differences and similarities between RPC and LPC? How transparent should RPC be?
According to Table I, there is considerable overhead in RPC: the transmission time and the execution time of the procedure account for only 10% of the total time. Why is this the case?
What are the advantages and disadvantages of using probe packets as compared to using a timeout?
The authors investigate whether a sufficient level of performance for RPC can be achieved by a general purpose transport protocol whose implementation adopts strategies suitable for RPC as well as bulk data transfer. What do you think? What kind of transport protocol is suitable for RPC and bulk data respectively?