MiniProject 3: SMTP Server FAQ

More information

Ok, I read the miniproject description but spell it all out for me one more time.

The client and the server code provided in the handout are skeletal -- they do not implement the complete protocol or even correctly implement part of the protocol (if they did, there'd be little for you to do!). In particular, the server just closes the socket immediately when it receives a new connection, and the client blindly sends data to the server without checking to see if the server sent "200 OK" responses. You are expected to not only change both the server and the client to correctly follow the amended protocol described in the project, but also to develop additional clients to test corner cases in your server implementation. In particular, you might need test clients that misbehave.

The communication between the client and server occurs over TCP. While TCP guarantees reliable delivery, it does not necessarily guarantee that data sent at the same time on the client will be received at the same time on the server (and vice versa). For instance, if the client sends the string "HELO abc12\n", the string can be chopped into three packets delivered separately at the server, containing "HE", "LO ", and "abc12\n". You will most likely want to write a "collect_input" function that performs recv() operations in a loop until a whole message has been received, instead of trying to parse the first packet (containing only "HE") received.
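A minimal sketch of such a collect_input function, assuming you have a connected socket object and that every protocol line ends in "\n". The buffer parameter and the (line, leftover) return value are illustrative choices of ours, not something the spec requires, and the sketch deliberately ignores timeouts.

```python
import socket


def collect_input(conn, buffer=b"", terminator=b"\n"):
    """Call recv() in a loop until `buffer` holds one complete line.

    Returns (line, leftover). `line` excludes the terminator and is None
    if the peer closed the connection before a full line arrived.
    `leftover` holds any bytes received after the terminator.
    """
    while terminator not in buffer:
        chunk = conn.recv(4096)
        if not chunk:  # empty read: the peer closed the connection
            return None, buffer
        buffer += chunk
    line, _, leftover = buffer.partition(terminator)
    return line, leftover


if __name__ == "__main__":
    # Demo on a local socket pair: the command arrives in three pieces.
    client, server = socket.socketpair()
    for piece in (b"HE", b"LO ", b"abc12\n"):
        client.sendall(piece)
    print(collect_input(server))
    client.close()
    server.close()
```

Pass the returned leftover back in as buffer on the next call so that bytes that arrived early are not thrown away.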

A misbehaving client may send a bad command, and your server should deal with this by returning a response with a "500" status code. Your server should not close the socket, but instead resume the conversation from where it left off, thus providing the client a chance to recover from its mistake. So the following is a legal conversation that should be allowed by your implementation:

    HELO foo
    200 HELO abc123
    MAIL FROM: notice_the_space @foo.com
    500 ERROR in senderemail
    XXXXXXX
    500 BAD COMMAND
    MAIL FROM: abc@example.com
    200 OK
    YYYYYZZZZ
    500 BAD COMMAND
    RCPT TO: xyz@example.com
    200 OK

In essence, you are implementing a very simple state machine whose job is to first collect the sender email, then the receiver email, and then the message.
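One way to picture that state machine is the sketch below. It is not a reference solution: the Session class, the state names, the exact response strings, and the valid_email rule (no whitespace, exactly one "@") are all assumptions made for illustration, and the DATA step and message collection are left out.

```python
NETID = "abc123"  # placeholder; a real server would hardcode its author's NetID


def valid_email(addr):
    # Assumed rule for this sketch only: no whitespace, exactly one "@",
    # and non-empty text on both sides of it.
    if any(ch.isspace() for ch in addr) or addr.count("@") != 1:
        return False
    local, domain = addr.split("@")
    return bool(local) and bool(domain)


class Session:
    """Tracks which command the server expects next on one connection."""

    def __init__(self):
        self.state = "HELO"  # then "MAIL", "RCPT", "DATA"
        self.sender = None
        self.recipient = None

    def handle(self, line):
        """Take one command line (no trailing newline), return the reply."""
        if self.state == "HELO" and line.startswith("HELO "):
            self.state = "MAIL"
            return "200 HELO " + NETID
        if self.state == "MAIL" and line.startswith("MAIL FROM:"):
            addr = line[len("MAIL FROM:"):].strip()
            if not valid_email(addr):
                return "500 ERROR in sender email"  # state is unchanged
            self.sender = addr
            self.state = "RCPT"
            return "200 OK"
        if self.state == "RCPT" and line.startswith("RCPT TO:"):
            addr = line[len("RCPT TO:"):].strip()
            if not valid_email(addr):
                return "500 ERROR in recipient email"
            self.recipient = addr
            self.state = "DATA"
            return "200 OK"
        # Anything else is an error, but the session carries on from
        # the same state so the client can recover.
        return "500 BAD COMMAND"
```

The point to notice is that an error reply never changes self.state, which is what lets the client retry the same step.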

As with all network protocols, you must code defensively and do something reasonable even in the presence of bad inputs. Part of the assignment is to decide what is reasonable. If you find yourself stuck, feel free to consult course staff.

Because of the way the filesystem is structured and because of the sector-based atomicity of file writes, you may not easily notice an interleaving of messages, even if you perform your file operations without any synchronization. If you write a proper test client that writes large messages, however, you should be able to see that concurrent accesses to a file without a lock can cause messages to be interleaved. You should perform your file operations as part of a critical section to avoid such interleavings.
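A minimal sketch of such a critical section, assuming a server that handles each connection in its own thread within a single process. The names mailbox_lock and append_message are hypothetical, and the mailbox path is whatever your server uses.

```python
import threading

# One lock shared by every connection-handling thread in the process.
mailbox_lock = threading.Lock()


def append_message(path, message):
    """Append one complete message to the mailbox file atomically with
    respect to other threads that also go through this function."""
    with mailbox_lock:
        # Only one thread at a time can be inside this block, so the
        # open/write/close sequence of two messages cannot interleave.
        with open(path, "a") as mailbox:
            mailbox.write(message)
```

A lock like this only protects against threads in the same process. If your design uses several processes, you would need a different mechanism.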

Evaluation Criteria

How will my code be evaluated?

Your code will be evaluated not only for correctness but for elegance, maintainability, and the thoroughness of accompanying test code. Please read the preceding sentence carefully, and make sure your submission excels at all cited criteria.

Timeouts

When should the timeout begin? From the time we start expecting the next item, or from the time the last input was received from the client?

Your goal is to write a robust server in general, and the specific use of the timeout is to make sure that a bad client cannot tie up server resources indefinitely. So, as the system implementer, you get to decide how the timeout should be implemented, keeping in mind that only one of the two choices posted above is the right choice given your goals.

By the way, the timeout should not start when the connection is established, but from the time the server starts expecting the next valid line (transaction) in the protocol. So the client should get 10 seconds to provide each part (HELO, MAIL FROM, RCPT TO, DATA) of the protocol, for a maximum possible connection time of (10 * 4 =) 40 seconds, 10 seconds per line.

Timeout Implementation

Ok, since the instructions call for some kind of a timeout, I did some background reading and decided to use Python's "socket receive with timeout" operation. I'll specify a timeout of 10 seconds and be done. Is this a good idea?

So, if you were to do that, which of the two timeout strategies mentioned in the previous FAQ entry would you be endorsing? Given your goal of not allowing a bad client to tie up resources indefinitely, does that implementation strategy match your goal?

Timeout Precision

A. My timeout implementation might disconnect a client in way under 10 seconds. Is this ok?

B. My timeout implementation might disconnect a client in ever-so-slightly under 10 seconds. Is this ok?

C. My timeout implementation might disconnect a client in slightly over 10 seconds. Is this ok?

D. My timeout implementation might disconnect a client in way over 10 seconds (i.e. 11 or more). Is this ok?

E. If the client is clever enough and knows the internal workings of my code, it could occupy my server indefinitely and never get disconnected, even though I'm catching and disconnecting dumb clients that simply remain idle for a while. Is this ok?

A. No, not OK. The spec says that the client can count on having 10 seconds of time to converse with the server.

B. No, not OK, same reason as A. Would you like to be connected to a heart defibrillator that was as cavalier about its timing requirements?

C. Your implementation provides the agreed minimum amount of time to the client. It also ties up server resources for a little bit longer than it strictly could, but strict time enforcement is often so difficult that this kind of tradeoff (of occupying server resources versus doing a very very precise job of time accounting) is acceptable. What's an acceptable level of imprecision? See the next question, though you should strive to be as accurate as possible.

D. No, that implementation overshoots and ties up the resources for at least 10% longer than it should. You can do better (though we will accept anything that provides more than 10 seconds to the client and terminates in under 11 seconds in the worst case).

E. No. Server code needs to be bulletproof against malicious clients, and security should never be predicated on your code remaining obscure. A good implementation can be simultaneously robust and public (and often the better implementations are).
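To make answers A through E concrete, here is one possible sketch, assuming each protocol line ends in "\n" and the clock starts when the server begins waiting for that line. It fixes a single deadline per line and shrinks the socket timeout to whatever time remains before each recv(), so a client that trickles in one byte at a time cannot keep resetting the clock. The names recv_line_with_deadline, LineTimeout, and LINE_TIMEOUT are hypothetical.

```python
import socket
import time

LINE_TIMEOUT = 10.0  # seconds allowed per protocol line


class LineTimeout(Exception):
    """The client did not finish its line before the deadline."""


def recv_line_with_deadline(conn, buffer=b"", timeout=LINE_TIMEOUT):
    """Return (line, leftover), or raise LineTimeout once `timeout`
    seconds have passed since this call started."""
    deadline = time.monotonic() + timeout  # fixed once, never extended
    while b"\n" not in buffer:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise LineTimeout()
        conn.settimeout(remaining)  # wait only for the time still left
        try:
            chunk = conn.recv(4096)
        except socket.timeout:
            continue  # loop back and re-check the deadline
        if not chunk:
            raise ConnectionError("client closed the connection")
        buffer += chunk
    line, _, leftover = buffer.partition(b"\n")
    return line, leftover
```

Because the only place that raises LineTimeout is the check against the monotonic clock, the client is never cut off before the deadline, and the overshoot is limited to scheduling delay.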

Invisible \n

Is there a \n after DATA? Is there a \n after 200 OK?

Yes and yes.

Windows and \r\n

I am using telnet on Windows to test my server, and the Windows terminal driver insists on generating "\r\n" when I press the return key. What should I do?

If you want to be able to perform ad hoc testing from the Windows terminal and telnet, you should treat "\r\n" as if it is a "\n". In fact, you should have done this without asking us, as it is easier to do it than for you to type an email and for us to write a FAQ entry. Keep in mind that our spec talks only about regular "\n"s, so your code need only recognize "\n"s, and our test code will only issue "\n"s.
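If you do choose to accept "\r\n", a small sketch like the following is enough, assuming your code has already split the input on "\n" and is holding one line as bytes. The function name strip_cr is made up for this example.

```python
def strip_cr(line):
    """Drop one trailing carriage return, so that a line that arrived as
    b"HELO foo\\r\\n" and was split on b"\\n" becomes b"HELO foo"."""
    if line.endswith(b"\r"):
        return line[:-1]
    return line
```

Lines that arrived with a plain "\n" pass through unchanged.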

Keep in mind that you should be developing and turning in an automated test suite, as opposed to relying on ad hoc manual testing.

Also keep in mind that every time someone reads a clarifying answer, like this one, which says "you can do X AND Y, but you need only do X", then sends an email asking "wait, am I required to also do Y?", a baby seal dies.

DATA and Message Timeout

Is the timeout 10 seconds for DATA and the message, or 10 seconds for DATA, and another 10 seconds for the message?

The FAQ asked you to implement the former, but we will accept the latter as well.

Message Length

Is there a limit on the length of a message?

The 10-second timeout is an effective limit on the length of a message. The spec does not call for any other limit on messages.

Client "Pre-Sending" Commands

What if the client does not wait for the server, and sends everything all at once?

What if? What do you think ought to happen? We know that the spec is written to describe the normal-case scenario, where the client waits for the server, but how would a server ever hope to ensure that the client got what it sent? Clearly, in this instance, the client did not wait, but from the perspective of the server, the client sent a command on the stream, then sent another command on the stream, then sent another.

Since you know what a stream is, and how it differs from datagrams, you know that streams do not have a concept of message boundaries. They are like the end of a hose out of which data pours, a little bit at a time, regardless of how it was packed into the other end of the hose. As far as the operation of the server is concerned, the stream just has to contain legal commands, so what the client did is perfectly legal. So your server should be able to deal with legal commands appearing on the stream, even though you, with your God's-eye view, know that the client "did not follow the protocol."

If you think about it, only you know that the wrong thing happened; your server has no way of figuring out when precisely the client sent what it sent. What if the server had sent its response at time x, to be received by the client at time x+rtt/2, but the client sent its next command at time x+rtt/2-epsilon? How would a server detect that the client is "disobeying the protocol"? Ahhh, impossible, right.

And guess what? If you wrote your server in the natural style that is simplest to implement (i.e. read, process command, make state change, etc.), it will already deal with the case of an overeager client that sends commands early just fine. So do the simple thing that the current spec is asking for, not the crazy thing (causality tracking with a protocol that does not capture causality) that is impossible to do.
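As an illustration of that natural style, the sketch below treats whatever has arrived so far as one byte buffer and peels complete command lines off the front, keeping any unfinished tail for later. It assumes "\n"-terminated commands; the name split_lines is ours, and the message body after DATA may well need different handling.

```python
def split_lines(buffer):
    """Split received bytes into (complete_lines, leftover).

    `complete_lines` are the "\\n"-terminated commands seen so far, in
    order and without their terminators. `leftover` is the unfinished
    tail, to be kept and prepended to the next recv() result.
    """
    *lines, leftover = buffer.split(b"\n")
    return lines, leftover


if __name__ == "__main__":
    # An overeager client sent everything at once; the last command is
    # still incomplete.
    received = b"HELO foo\nMAIL FROM: abc@example.com\nRCPT TO: xy"
    lines, leftover = split_lines(received)
    for command in lines:
        print("process:", command)
    print("keep for later:", leftover)
```

The server processes the commands one by one and replies to each, exactly as it would if the client had politely waited.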

HELO netid

The spec says that the server should respond with HELO yournetid. Whose netid is it? Is it the client's? How would I even discover the client's netid? Do you all smoke crack at TA meetings? This is IMPOSSIBLE!!!

Please replace "yournetid" with your NetID. Take your NetID, hardcode it in a string in the server, return it in response to HELO commands. We will use it to identify your server during testing. It's like the part of the test where you're supposed to write your name. It is not supposed to be a tricky question. ;-)

Submission

What should we turn in?

You should turn in a ZIP file containing MP3/QUESTIONS.txt and MP3/*.py, where the .py files hold your Python code for the server, the client, and all of the test cases you have developed in conjunction with the SMTP server.

Submission Errors

I failed to follow the simple instructions above. Can I write to you and ask for special treatment? My 10 second mistake will only take 3 minutes of your time to fix up.

Actually, it breaks our automated flow, so it takes much more than 3 minutes of someone's time. But assuming the 3 minutes estimate is correct, 171 students * 3 minutes = 513 minutes = 8.5 hours of doing nothing but this kind of special treatment. We'd rather spend those hours talking about course material and the subtle content therein. Please follow instructions carefully.

Frequently Asked Questions

Does HELO need to come in any specific order relative to MAIL FROM and RCPT TO? Do the other commands need to come in any specific order?

How do we handle extra whitespace in the HELO, MAIL FROM, and RCPT TO commands?

Are multiple RCPT TO commands allowed? If so, is email delivered multiple times, or once?

When a client sends two emails in one session, which commands are for both emails?

Where do the "From:" and "To:" lines in the mailbox example in the README come from? I see that client.py sends these lines as well. Where do these lines come from?