Implementing Remote Procedure Calls
Notes by Lili Qiu (3/10/1998)
Goals
- simplify building distributed systems
- balance between powerful semantics and efficiency
- secure communications
Major issues
- precise semantics
- integration of remote calls into existing programming systems
- binding
- suitable transport protocols
- data integrity and security in an open communication network
Environment
- Cedar distributed operating system
- Most computers are Dorados
- Most communication is on the lightly loaded Ethernet (< 40%)
- PUP transport protocol
Structure
- user
- user-stub: packs a specification of the target procedure and the arguments into a packet
- RPCRuntime: transmission, packet routing, encryption
- server-stub: unpacks the data and invokes the corresponding procedure
- server
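
The five-part structure above can be sketched in a few lines. This is only an illustration, not the Cedar implementation: JSON stands in for the packet format, the runtime is an in-process function call rather than a network, and the names `EXPORTED`, `runtime_transmit`, `user_stub_add`, and `server_stub` are hypothetical.

```python
import json


def add(a, b):
    """The 'server' part: the procedure that actually does the work."""
    return a + b


# Hypothetical table of exported procedures: name -> implementation.
EXPORTED = {"add": add}


def server_stub(packet: bytes) -> bytes:
    """Server-stub: unpack the call packet, invoke the procedure, pack the result."""
    request = json.loads(packet.decode("utf-8"))
    result = EXPORTED[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def runtime_transmit(packet: bytes) -> bytes:
    """Stand-in for RPCRuntime: hands the packet to the server side in-process.

    A real runtime would handle transmission, routing, and encryption here.
    """
    return server_stub(packet)


def user_stub_add(a, b):
    """User-stub: pack the target procedure and arguments, then unpack the reply."""
    packet = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply = runtime_transmit(packet)
    return json.loads(reply.decode("utf-8"))["result"]
```

The "user" part is then just an ordinary call such as `user_stub_add(2, 3)`, which is the point of the design: the caller cannot tell from the call site that packets were involved.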
Binding
- Naming: type and instance
- Locating an appropriate exporter:
- Entry
- Group entry: {RName, list of members' RNames}
- Individual entry: {RName,connect-site}
- Interface name
- Type entry (group entry): {Type, Group of exported instances of this type}
- Instance entry (individual entry): {Instance, Individual that last exported this
instance}
- Each machine maintains a table of its exported interfaces, with entries of the form
{interface name, dispatcher, ID}
- The ID is kept unique by taking successive values of a 32-bit counter that is initialized
to the real-time clock at restart and constrained to stay below the current value of the
clock. This limits exports on one machine to an average rate of < 1/sec. Any better solution?
- Effects:
- Importing an interface does not affect the exporter
- Binding breaks if the exporter crashes and restarts (Is this desirable?)
- Calls can be made only on procedures that have been explicitly exported through RPC
- The set of users allowed to export a particular interface can be restricted
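
A minimal sketch of the export table and its unique IDs, to make the "binding breaks on restart" effect concrete. The class name `ExportTable`, the injectable `clock`, and the choice to raise an error when the counter catches up with the clock are assumptions for illustration; the real counter is 32 bits wide, which this sketch does not enforce.

```python
class ExportTable:
    """Per-machine table of exported interfaces: ID -> (interface name, dispatcher)."""

    def __init__(self, clock):
        # clock is a zero-argument callable returning the time in whole seconds.
        self._clock = clock
        # On (re)start the counter is initialized to the real-time clock.
        self._counter = int(clock())
        self._entries = {}

    def export(self, interface_name, dispatcher):
        # The counter must stay below the current clock value. Assumption: this
        # sketch raises an error when that constraint would be violated.
        if self._counter >= int(self._clock()):
            raise RuntimeError("exporting faster than one interface per second on average")
        uid = self._counter
        self._counter += 1
        self._entries[uid] = (interface_name, dispatcher)
        return uid

    def lookup(self, uid):
        # A call carries the ID obtained at bind time; None means a stale binding.
        return self._entries.get(uid)
```

Because every issued ID is below the clock at the time of issue, a restarted machine (whose counter starts at the later clock value) can never reissue an old ID. An importer still holding an ID from before the crash therefore fails the lookup instead of silently reaching a different instance.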
Transport protocol
- Requirements
- Minimize state information and the time to set up a connection
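- Exactly once if the call returns; at most once if an exception is reported (Why not
include at least once for idempotent operations? Can we guarantee exactly once? At most
once? At least once?)
- No timeout on how long a call may take while the server keeps responding (Is it
acceptable?)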
- Calls
- Call identifier {machine ID, calling process ID, monotonic sequence number}
- The server discards its state information for a connection that has been idle for a long
time (Better alternative?)
- To keep identifiers unique even after a crash, a conversation ID based on a 32-bit clock
is passed along with the call sequence number on every call. (Any better alternative?)
- Probe packets are used to detect failure of communications or of the server process (How
does this compare with a timeout?)
- When arguments or results span several packets, one ack. is required for each packet
except the last one (Too inefficient! Any better solution? How about using sliding-window
error control?)
- Optimizations
- maintain a stock of idle server processes to avoid process creation
- use a subsequent packet to implicitly acknowledge the previous packet
- design a transport protocol that suits RPC best (Is this feasible now that there are so
many application-layer protocols?)
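
A rough sketch of how a server can use the call identifier to suppress duplicates, which is what gives at-most-once execution when a call packet is retransmitted. The names `CallId`, `Server`, and `STALE` are hypothetical, and two details are assumptions of this sketch rather than statements about the paper's protocol: the conversation ID is folded into the lookup key, and the last result is cached so a retransmitted call can be answered without running the procedure again.

```python
from dataclasses import dataclass

STALE = object()  # sentinel returned for packets older than the last call seen


@dataclass(frozen=True)
class CallId:
    machine_id: str
    conversation_id: int  # clock-based; changes when the caller restarts
    process_id: int
    seq: int              # monotonic per calling process


class Server:
    def __init__(self, procedures):
        self._procedures = procedures  # name -> callable
        self._last = {}                # activity -> (last seq, last result)

    def handle(self, call_id, proc, args):
        activity = (call_id.machine_id, call_id.conversation_id, call_id.process_id)
        last = self._last.get(activity)
        if last is not None:
            last_seq, last_result = last
            if call_id.seq == last_seq:
                return last_result     # retransmission: reply again, do not re-execute
            if call_id.seq < last_seq:
                return STALE           # delayed old packet: ignore
        result = self._procedures[proc](*args)
        self._last[activity] = (call_id.seq, result)
        return result
```

The only per-caller state is one sequence number and one result, which fits the goal of minimizing state. Dropping an entry for a long-idle caller is safe only as long as a late duplicate cannot arrive afterwards, which is one way to think about the "Better alternative?" question above.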
Discussion
- What is the difference between RPC and LPC (local procedure call)? How transparent should
RPC be?
- There are many tradeoffs in the RPC implementation: among efficiency, powerful semantics,
resemblance to local procedure calls, and insulation from communication details. What do
you think of the solutions the authors provide for these tradeoffs?
- According to Table I, there is considerable overhead in RPC: the transmission time and
the execution time of the procedure together account for only 10% of the total time. Why?
Any possible improvement?
- The authors ponder whether a sufficient level of performance for RPC can be achieved by
a general-purpose transport protocol whose implementation adopts strategies suitable for
RPC as well as bulk data transfer. What do you think? What kind of transport protocol is
suitable for RPC and for bulk data, respectively? Is TCP or UDP a good candidate?