CS4410/4411: Operating Systems

Fall 2013, Profs. Sirer and van Renesse

MiniProject 3: SMTP Server FAQ

More information

Ok, I read the miniproject description but spell it all out for me one more time.

The client and the server code provided in the handout are skeletal -- they do not implement the complete protocol or even correctly implement part of the protocol (if they did, there'd be little for you to do!). In particular, the server just closes the socket immediately when it receives a new connection, and the client blindly sends data to the server without checking to see if the server sent "200 OK" responses. You are expected to not only change both the server and the client to correctly follow the amended protocol described in the project, but also to develop additional clients to test corner cases in your server implementation. In particular, you might need test clients that misbehave.

The communication between the client and server occurs over TCP. While TCP guarantees reliable delivery, it does not guarantee that data sent in one piece by the client will be received in one piece by the server (and vice versa). For instance, if the client sends the string "HELO abc12\r\n", the string can be chopped into three packets delivered separately at the server, containing "HE", "LO ", and "abc12\r\n". You will most likely want to write a "collect_input" function that performs recv() operations in a loop until a whole message has been received, instead of trying to parse the first packet received (which may contain only "HE").
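One possible shape for such a function is sketched below. It is only an illustration, assuming Python 3 and byte strings; the (line, rest) return convention and the 4096-byte read size are choices made for this example, not requirements of the project.

```python
import socket

def collect_input(sock, buffer=b""):
    """Read from sock until buffer holds one complete CRLF-terminated line.

    Returns (line, rest): the line without its CRLF, plus any bytes that
    arrived after it. Returns (None, buffer) if the peer closed the
    connection before a full line arrived.
    """
    while b"\r\n" not in buffer:
        chunk = sock.recv(4096)      # may return any number of bytes >= 1
        if not chunk:                # an empty read means the peer closed
            return None, buffer
        buffer += chunk
    line, _, rest = buffer.partition(b"\r\n")
    return line, rest
```

The caller passes `rest` back in as `buffer` on the next call, so bytes that arrived early are not lost. Note that this sketch has no timeout at all, so it is not sufficient on its own.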

A misbehaving client may send a bad command, and your server should deal with this by returning a response with a "500" status code. Your server should not close the socket, but instead resume the conversation from where it left off, thus providing the client a chance to recover from its mistake. So the following is a legal conversation that should be allowed by your implementation:

    HELO foo
        250 HELO abc123
    MAIL FROM: notice_the_space   @foo.com
        504 5.5.2 <notice_the_space   @foo.com>: Sender address rejected
        502 5.5.2 Error: command not recognized
    MAIL FROM: abc@example.com
        250 2.1.0 OK
        502 5.5.2 Error: command not recognized
    RCPT TO: xyz@example.com
        250 2.1.5 OK

In essence, you are implementing a very simple state machine whose job is to first collect the sender email, then the receiver email, and then the message.
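A minimal sketch of that state machine follows. It tracks only which command the server expects next. The function name `step` and the (next_expected, error) return convention are hypothetical, the error strings are borrowed from examples elsewhere in this FAQ, and success replies are deliberately left out.

```python
# Maps each accepted command to the command expected after it.
# After DATA the server expects MAIL FROM again, since HELO carries over.
NEXT = {
    "HELO": "MAIL FROM",
    "MAIL FROM": "RCPT TO",
    "RCPT TO": "DATA",
    "DATA": "MAIL FROM",
}

def step(expected, command):
    """Return (next_expected, error); error is None if command is accepted."""
    if command not in NEXT:
        return expected, "502 5.5.2 Error: command not recognized"
    if command == "RCPT TO" and expected == "DATA":
        return "DATA", None          # additional recipients are allowed
    if command != expected:
        return expected, "503 Error: need %s command" % expected
    return NEXT[command], None
```

On an error the state does not change, which is what lets the client recover and resume the conversation where it left off.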

As with all network protocols, you must code defensively and do something reasonable even in the presence of bad inputs. Part of the assignment is to decide what is reasonable. If you find yourself stuck, feel free to consult course staff.

Because of the way the filesystem is structured and because of the sector-based atomicity of file writes, you may not easily notice an interleaving of messages, even if you perform your file operations without any synchronization. If you write a proper test client that writes large messages, however, you should be able to see that concurrent accesses to a file without a lock can cause messages to be interleaved. You should perform your file operations as part of a critical section to avoid such interleavings.
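As a rough illustration, the critical section can be as small as the sketch below. The names `mailbox_lock` and `deliver` are made up for this example, and a single process-wide lock is assumed.

```python
import threading

mailbox_lock = threading.Lock()   # one lock shared by all worker threads

def deliver(path, message):
    """Append one complete message to the mailbox file.

    The lock is held for the whole open/write/close sequence, so two
    threads can never have their messages interleaved in the file.
    """
    with mailbox_lock:
        with open(path, "a") as mailbox:
            mailbox.write(message)
```

Anything else that must be consistent with the file contents, such as assigning the message number, would belong inside the same critical section.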

Evaluation Criteria

How will my code be evaluated?

Your code will be evaluated not only for correctness but for elegance, maintainability, and the thoroughness of accompanying test code. Please read the preceding sentence carefully, and make sure your submission excels at all cited criteria.


When should the timeout begin? From the time we start expecting the next item, or the time from the last input received from the client?

Your goal is to write a robust server in general, and the specific use of the timeout is to make sure that a bad client cannot tie up server resources indefinitely. So, as the system implementer, you get to decide how the timeout should be implemented, keeping in mind that only one of the two choices posted above is the right choice given your goals.

By the way, the timeout should not start when the connection is established, but from the time the server starts expecting the next valid line (transaction) in the protocol. So the client should get 10 seconds to provide each part (HELO, MAIL FROM, RCPT TO, DATA) of the protocol, for a maximum possible connection time of (10 * 4 =) 40 seconds, 10 seconds per line.
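One way to express "10 seconds from the moment the server starts expecting the next line" is to compute a deadline once and give each recv() only the time that remains. The sketch below assumes Python 3 (for time.monotonic); the function name and return convention are hypothetical.

```python
import socket
import time

TIMEOUT = 10.0   # seconds allowed for each item of the protocol

def recv_line_with_deadline(sock, buffer=b"", timeout=TIMEOUT):
    """Collect one CRLF-terminated line within `timeout` seconds in total.

    Returns (line, rest), or (None, buffer) on timeout or disconnect.
    """
    deadline = time.monotonic() + timeout    # fixed when we start expecting
    while b"\r\n" not in buffer:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None, buffer              # the client used up its time
        sock.settimeout(remaining)           # never wait past the deadline
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            return None, buffer
        if not chunk:                        # peer closed the connection
            return None, buffer
        buffer += chunk
    line, _, rest = buffer.partition(b"\r\n")
    return line, rest
```

Because the deadline is not reset when bytes arrive, a client that trickles one byte at a time gets no more total time than a client that stays silent.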

Timeout Implementation

Ok, since the instructions call for some kind of a timeout, I did some background reading and decided to use Python's "socket receive with timeout" operation. I'll specify a timeout of 10 seconds and be done. Is this a good idea?

So, if you were to do that, which of the two timeout strategies mentioned in the previous FAQ entry would you be endorsing? Given your goal of not allowing a bad client to tie up resources indefinitely, does that implementation strategy match your goal?

Timeout Precision

A. My timeout implementation might disconnect a client in way under 10 seconds. Is this ok?

B. My timeout implementation might disconnect a client in ever-so-slightly under 10 seconds. Is this ok?

C. My timeout implementation might disconnect a client in slightly over 10 seconds. Is this ok?

D. My timeout implementation might disconnect a client in way over 10 seconds (i.e. 11 or more). Is this ok?

E. If the client is clever enough and knows the internal workings of my code, it could occupy my server indefinitely and never get disconnected, even though I'm catching and disconnecting dumb clients that simply remain idle for a while. Is this ok?

A. No, not OK. The spec says that the client can count on having 10 seconds of time to converse with the server.

B. No, not OK, same reason as A. Would you like to be connected to a heart defibrillator that was as cavalier about its timing requirements?

C. Your implementation provides the agreed minimum amount of time to the client. It also ties up server resources for a little bit longer than it strictly could, but strict time enforcement is often so difficult that this kind of tradeoff (of occupying server resources versus doing a very very precise job of time accounting) is acceptable. What's an acceptable level of imprecision? See the next question, though you should strive to be as accurate as possible.

D. No, that implementation overshoots and ties up the resources for 10% longer than it should. You can do better (though we will accept anything that provides more than 10 seconds to the client and terminates in under 11 seconds in the worst case).

E. No. Server code needs to be bulletproof against malicious clients, and security should never be predicated on your code remaining obscure. A good implementation can be simultaneously robust and public (and often the better implementations are).

Invisible \r\n

Is there a \r\n after DATA? Is there a \r\n after 200 OK?

Yes and yes.

DATA and Message Timeout

Is the timeout 10 seconds for DATA and the message, or 10 seconds for DATA, and another 10 seconds for the message?

The former.

Message Length

Is there a limit on the length of a message?

The 10-second timeout is an effective limit on the length of a message. The spec does not call for any other limit on messages.

Client "Pre-Sending" Commands

What if the client does not wait for the server, and sends everything all at once?

What if? What do you think ought to happen? We know that the spec is written to describe the normal-case scenario, where the client waits for the server, but how would a server ever hope to ensure that the client got what it sent? Clearly, in this instance, the client did not wait, but from the perspective of the server, the client sent a command on the stream, then sent another command on the stream, then sent another. Since you know what a stream is, and how it differs from datagrams, you know that streams do not have a concept of message boundaries. They are like the end of a hose out of which data pours, a little bit at a time, regardless of how it was packed into the other end of the hose.

As far as the operation of the server is concerned, the stream just has to contain legal commands, so what the client did is perfectly legal. So your server should be able to deal with legal commands appearing on the stream, even though you, with your God's eye view, know that the client "did not follow the protocol." If you think about it, only you know that the wrong thing happened; your server has no way of figuring out when precisely the client sent what it sent. What if the server had sent its response at time x, to be received by the client at time x+rtt/2, but the client sent its next command at time x+rtt/2-epsilon? How would a server detect that the client is "disobeying the protocol"?

Ahhh, impossible, right.

And guess what? If you wrote your server in the natural style that is simplest to implement (i.e. read, process command, make state change, etc), it will already deal with the case of an overeager client that sends commands early just fine. So do the simple thing that the current spec is asking for, not the crazy thing (causality tracking with a protocol that does not capture causality) that is impossible to do.
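To see why the natural style already copes, consider what the server's input buffer looks like when an eager client sends several commands at once. The helper below is a hypothetical sketch (Python 3) that separates complete lines from whatever partial line follows them.

```python
def split_lines(buffer):
    """Split buffer into its complete CRLF-terminated lines and the tail.

    The tail is the (possibly empty) unfinished line at the end.
    """
    *lines, tail = buffer.split(b"\r\n")
    return lines, tail

# Three commands sent back to back, the last one still incomplete.
received = b"HELO foo\r\nMAIL FROM: abc@example.com\r\nRCPT TO: xy"
lines, tail = split_lines(received)
# lines holds the two complete commands; tail holds b"RCPT TO: xy"
```

A server that processes `lines` one at a time and keeps `tail` for the next read treats eager and patient clients identically.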

HELO netid

The spec says that the server should respond with HELO yournetid. Whose netid is it? Is it the client's? How would I even discover the client's netid? Do you all smoke crack at TA meetings? This is IMPOSSIBLE!!!

Please replace "yournetid" with your NetID. Take your NetID, hardcode it in a string in the server, return it in response to HELO commands. We will use it to identify your server during testing. It's like the part of the test where you're supposed to write your name. It is not supposed to be a tricky question. ;-)


What should we turn in?

You should turn in a ZIP file containing MP3/QUESTIONS.txt and MP3/*.py, where the .py files hold your Python code for the server, the client, and all of the test cases you have developed in conjunction with the SMTP server.

Submission Errors

I failed to follow the simple instructions above. Can I write to you and ask for special treatment? My 10 second mistake will only take 3 minutes of your time to fix up.

Actually, it breaks our automated flow, so it takes much more than 3 minutes of someone's time. But assuming the 3 minutes estimate is correct, 171 students * 3 minutes = 513 minutes = 8.5 hours of doing nothing but this kind of special treatment. We'd rather spend those hours talking about course material and the subtle content therein. Please follow instructions carefully.

Frequently Asked Questions

Can we use Python's thread-safe queue module or its multiprocessing modules?

Your thread pool implementation must only rely on synchronization primitives from Python's threading library (Thread, Lock, Semaphore, Condition). Notably, you *must not* use Python's thread-safe queue module or its multiprocessing module.
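For orientation, here is a rough sketch of a work queue built from nothing but a list and a threading.Condition. The class and method names are invented for this example, and it omits things a real server would want, such as bounding the queue or handling exceptions raised by tasks.

```python
import threading

class ThreadPool:
    def __init__(self, num_threads):
        self._tasks = []                     # plain list, guarded by _cond
        self._cond = threading.Condition()
        self._closed = False
        self._threads = [threading.Thread(target=self._worker)
                         for _ in range(num_threads)]
        for t in self._threads:
            t.start()

    def submit(self, func, *args):
        """Queue func(*args) and wake one waiting worker."""
        with self._cond:
            self._tasks.append((func, args))
            self._cond.notify()

    def shutdown(self):
        """Let workers finish the queued tasks, then wait for them to exit."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()
        for t in self._threads:
            t.join()

    def _worker(self):
        while True:
            with self._cond:
                while not self._tasks and not self._closed:
                    self._cond.wait()        # releases the lock while asleep
                if not self._tasks:
                    return                   # closed and nothing left to do
                func, args = self._tasks.pop(0)
            func(*args)                      # run the task outside the lock
```

The inner `while` loop re-checks the condition after every wakeup, which is the standard way to use a condition variable.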

Are commands case-sensitive?

No. The HELO command may be provided as helo, HELO, or hElO (or any of the other possible ways to capitalize it).

What happens if the client never sends a HELO?

The client must send a HELO in order to send other commands. More heavyweight SMTP servers often verify the HELO using DNS and reverse DNS to check the identity of a client.

What happens if the client sends two HELOs?

You may optionally send a 503 error to the client. You're not required to send this message, but if you want to detect this case, you can use a message like:

503 Error: duplicate HELO

Does HELO need to come in any specific order relative to MAIL FROM and RCPT TO? Do the other commands need to come in any specific order?

Your commands should come in the order HELO, MAIL FROM, RCPT TO, and DATA. If the commands come out of order, you can generate error messages like:

503 Error: need XXX command

Does DATA need to come in any specific order relative to other commands?

DATA must come after HELO, MAIL FROM, and RCPT TO.

How do we handle extra whitespace in the HELO, MAIL FROM, and RCPT TO commands?

Commands may contain extra whitespace around valid tokens. For instance, there may be two or more spaces after the colon in MAIL FROM or RCPT TO. There may also be trailing spaces after the argument. It is not acceptable to allow whitespace in email addresses or hostnames.

Leading spaces, or spaces before the colon are acceptable, but will not be tested. Do not worry about adding extra logic to handle them.
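A regular expression is one compact way to apply these whitespace rules together with case-insensitive command names. The sketch below is an assumption-laden example: `parse_mail_from` is a made-up name, accepting zero spaces after the colon is a guess, and it does not check that the argument is a well-formed address.

```python
import re

# "MAIL FROM:" in any capitalization, optional spaces after the colon,
# one whitespace-free argument, optional trailing spaces.
MAIL_FROM = re.compile(r"^MAIL FROM:\s*(\S+)\s*$", re.IGNORECASE)

def parse_mail_from(line):
    """Return the sender argument, or None if the line is malformed.

    `line` is the command text with its trailing CRLF already removed.
    """
    match = MAIL_FROM.match(line)
    return match.group(1) if match else None
```

Because the argument must be a single run of non-whitespace characters, an address with a space inside it fails to match and can be rejected as a syntax error.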

Are multiple RCPT TO commands allowed? If so, is email delivered multiple times, or once?

The RCPT TO command may be specified multiple times. Doing so will result in multiple "To:" headers in the delivered mail, but the mail will only be delivered once, and will be assigned one number.

When a client sends two emails in one session, which commands are for both emails?

The HELO command will carry over between messages. No other command persists after a message is delivered.

Where do the "From:" and "To:" lines in the mailbox example in the README come from? I see that client.py sends these lines as well. Where do these lines come from?

In a standards-compliant environment, these lines are provided by the client. In your implementation, we ask that you output the "From:" line with the address corresponding to the address sent in MAIL FROM, and one "To:" line for each RCPT TO command.

If the client provides these lines, they should not be treated specially. In the above mailbox example, the message body is separated from the server-specified headers by one blank line. If the client provides these lines, they simply appear in the message body.
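The header rule can be sketched as follows. This covers only the "From:" and "To:" lines and the blank separator described here; any other lines in the README's mailbox format are omitted, and the function name and the use of "\n" as the line ending are assumptions.

```python
def format_entry(sender, recipients, body):
    """Build the text of one delivered message.

    sender comes from MAIL FROM, recipients from each RCPT TO, and body
    is whatever the client sent after DATA, copied through unchanged.
    """
    headers = ["From: " + sender]
    headers += ["To: " + rcpt for rcpt in recipients]
    # One blank line separates the server-written headers from the body.
    return "\n".join(headers) + "\n\n" + body
```

If the client's body itself begins with "From:" or "To:" lines, they land after the blank line and are therefore just part of the message body.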

Can hostnames (in HELO) contain whitespace?

That is not a legal hostname. If the client were to try to send such a hostname, that can be treated as a syntax error.

© 2013, Cornell University