CS312 SML Style Guide

Our primary goal in this class is to teach you how to program elegantly.  You have most likely spent many years in secondary school learning style with respect to the English language -- programming should be no different.  Every programming language demands a particular style of programming, and forcing one language's style upon another can have disastrous results.  Of course there are some elements of writing a computer program that are shared between all languages.  You should be able to pick up these elements through experience.  In CS211 you learned a great deal about object-oriented programming; in this class you will learn about value-oriented programming.

As you will soon realize, this class takes style seriously.  Listed below are the style rules we expect you to follow.  The numbered rules are mandatory -- breaking them will result in point deductions.  Other comments and suggestions are simply bulleted.  These are not mandatory.  Lastly, you should note that our rules come nowhere near the style mandates you will likely come across in industry.  Many companies go so far as to dictate exactly where spaces can go.  You can rejoice that you do not have to learn Hungarian notation.

Mandatory Rules:

  1. 80 column limit
  2. No tab characters
  3. Begin with a comment
  4. Code must compile
  5. Comments go above the code they reference
  6. Avoid useless comments
  7. Avoid over-commenting
  8. Line breaks
  9. Proper multi-line commenting
  10. Use meaningful names
  11. Type annotations
  12. Avoid global mutable variables
  13. When to rename variables
  14. Order of declarations in a structure
  15. Indent two spaces at a time
  16. Over parenthesizing
  17. Indenting case expressions
  18. Indenting if expressions
  19. Indenting comments
  20. No incomplete pattern matches
  21. Pattern match in the function arguments when possible
  22. Function arguments should not use values for patterns
  23. Avoid using too many projections
  24. Pattern match with as few case expressions as necessary
  25. Don't use valOf, hd, or tl
  26. Don't let expressions take up multiple lines
  27. Break up large functions into smaller functions
  28. Over-factoring code
  29. Don't rewrite existing code
  30. Misusing if expressions
  31. Misusing case expressions
  32. Other common misuses
  33. Don't rewrap functions
  34. Don't needlessly nest let expressions
  35. Avoid computing values twice

Suggestions:


File Submission

  1. 80 Column Limit: No line of code can have more than 80 columns.  Using more than 80 columns causes your code to wrap around to the next line, which is devastating to the readability of your code.  This is so important that we will not allow you to submit code that has any line with more than 80 columns.  With the Emacs installation from this website, saving a file with more than 80 columns will result in a warning.  Keeping your lines within the 80 column limit is something to do as you write, not something to leave until you have finished programming.  The course staff reserves the right to refuse assistance with code that violates this rule.
  1. No Tab Characters: Do not use the tab character (0x09).  Instead, use spaces to control indenting.  The Emacs package from this website avoids using tabs (with the exception of pasting text from the clipboard or kill ring).  When in sml-mode, Emacs uses the TAB key to control indenting instead of inserting the tab character.
  1. Begin With a Comment: All submitted files must begin with a comment.  In other words, the first two characters of the file must be (*.  This requirement is for the submission system, and really has nothing to do with style.
  1. Code Must Compile: Any code you submit must compile.  If it does not compile, we won't grade the problem and you will lose all the points for the problem.  There is no excuse for it to not compile.  You should treat any compiler warning as an error.

Commenting

  1. Comments Go Above the Code They Reference: Consider the following:
    val sum = foldl (op +) 0
    (* Sums a list of integers. *)
    
    (* Sums a list of integers. *)
    val sum = foldl (op +) 0
    The latter is the better style, although you may find some of the source code for the SML/NJ library uses the first.  We require that you use the latter.
  1. Avoid Useless Comments: Comments that merely repeat the code they reference or state the obvious are a travesty to programmers.  Comments should state the invariants, the non-obvious, or any references that have more information about the code.
  1. Avoid Over-commenting: Incredibly long comments are not very useful.  Long comments should only appear at the top of a file -- here you should explain the overall design of the code and reference any sources that have more information about the algorithms or data structures.  All other comments in the file should be as short as possible; after all, brevity is the soul of wit.  Most often the best place for any comment is just before a function declaration.  Rarely should you need to comment within a function -- variable naming should be enough.
  1. Line Breaks: The best way to stay within the 80 column limit imposed by the rule above is to press the enter key every once in a while.  Empty lines should only appear between value declarations within a struct block, especially between function declarations.  Often it is not necessary to have empty lines between other declarations unless you are separating the different types of declarations (such as structures, types, exceptions and values).  Unless function declarations within a let block are long, there should be no empty lines within a let block.  There should absolutely never be an empty line within an expression.
  1. Proper Multi-line Commenting: When comments are printed on paper, the reader lacks the advantage of color highlighting performed by an editor such as Emacs.  This makes it important for you to distinguish comments from code.  When a comment extends beyond one line, each line after the first should begin with a *, similar to the following:
    (* This is one of those rare but long comments
     * that need to span multiple lines because
     * the code is unusually complex and requires
     * extra explanation. *)
    fun complicatedFunction () = ...
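
As a rough illustration of the difference between a useless comment and a useful one, consider the sketch below.  The functions increment and isqrt are hypothetical names made up for this example; they are not part of any course code.

```sml
(* Adds 1 to x.  (Useless: this merely repeats the code.) *)
fun increment (x:int):int = x + 1

(* Requires: n >= 0.
 * Returns: the largest r such that r*r <= n.
 * (Useful: states an invariant the code does not show.) *)
fun isqrt (n:int):int =
  let
    fun search (r:int):int =
      if (r+1)*(r+1) > n then r else search (r+1)
  in
    search 0
  end
```

The second comment is worth keeping because the precondition cannot be read off the function's type.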

Naming and Declarations

  1. Use Meaningful Names: Variable names should describe what they are for.  Distinguishing what a variable references is best done by following a particular naming convention (see suggestion below).  Variable names should be words or combinations of words.  One-letter variable names are acceptable in short let blocks.  A function used in a fold, filter, or map is often bound to the name f.  Here is an example of short variable names:
    let
      val d = Date.fromTimeLocal(Time.now())
      val m = Date.minute d
      val s = Date.second d
      fun f n = (n mod 3) = 0
    in
      List.filter f [m,s]
    end
  1. Type Annotations: Top-level functions and values should always be declared with types.  Consider the following:
    fun foo x = x+1
    
    fun foo(x:int):int = x+1
    The latter is considered better.
  1. Avoid Global Mutable Variables: Mutable values should be local to closures and almost never declared as a structure's value.  Making a mutable value global causes many problems.  First, code that mutates the value cannot be sure that the value is consistent with the algorithm, as it might be modified outside the function or by a previous execution of the algorithm.  Second, and more importantly, having global mutable values makes it more likely that your code is nonreentrant.  Without proper knowledge of the ramifications, declaring global mutable values can extend beyond bad style to incorrect code.  We will not allow you to declare mutable values in a structure.
  1. When to Rename Variables: You should rarely need to rename values; in fact, this is a sure way to obfuscate code.  Renaming a value should be backed up with a very good reason.  One instance where renaming is common and encouraged is aliasing structures.  In these cases, other structures used by functions within the current structure are aliased to one or two letter names at the top of the struct block.  This serves two purposes: it shortens the name of the structure and it documents the structures you use.  Here is an example:
    struct
      structure H = HashTable
      structure T = TextIO
      structure A = Array
      ...
    end
  1. Order of Declarations in a Structure: When declaring elements in a structure you first alias the structures you intend to use, followed by the types, followed by exceptions, and lastly list all the value declarations for the structure. Here is an example:
    struct
      structure L = List
      type foo = unit
      exception InternalError
      fun first list = L.nth(list,0)
    end
    Note that every declaration within the structure should be indented the same amount.
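
One way to keep mutable state local to a closure, as the rule on global mutable variables asks, is sketched below.  The name makeCounter is hypothetical and chosen only for illustration.

```sml
(* makeCounter () returns a function that yields 1, 2, 3, ... on
 * successive calls.  The ref cell is reachable only through the
 * returned closure, so no other code can modify it. *)
fun makeCounter () : unit -> int =
  let
    val count = ref 0
  in
    fn () => (count := !count + 1; !count)
  end
```

Each call to makeCounter creates a fresh ref cell, so two counters never interfere with each other, which would not hold if count were declared as a value of the structure.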

Indenting

  1. Indent Two Spaces at a Time: Most lines that indent code should only indent by two spaces more than the previous line of code.
  1. Over Parenthesizing: Parentheses have many semantic purposes in ML, including constructing tuples, grouping sequences of side-effect expressions, forcing higher precedence on an expression for parsing, and grouping structures for functor arguments.  Clearly, parentheses must be used with care.  You may only use parentheses when necessary or when they improve readability.  Consider the following two function applications:
    val x = function1 (arg1) (arg2) (function2 (arg3)) (arg4)
    
    val x = function1 arg1 arg2 (function2 arg3) arg4
    The latter is considered better style. Parentheses should never appear on a line by themselves, nor should they be the first graphical character -- parentheses do not serve the same purpose as brackets do in C or Java.
  1. Indenting case Expressions: Indent similar to the following.
    case expr of
      pat1 => ...
    | pat2 => ...
  1. Indenting if Expressions: Indent similar to the following.
    if exp1 then exp2              if exp1 then
    else if exp3 then exp4           exp2
    else if exp5 then exp6         else exp3
         else exp8
    
    if exp1 then exp2 else exp3    if exp1 then exp2
                                   else exp3
  1. Indenting Comments: Comments should be indented to the level of the line of code that follows the comment.

Pattern Matching

  1. No Incomplete Pattern Matches: Incomplete pattern matches are flagged with compiler warnings. We do not allow any compiler warnings when grading; thus, if there is a compiler warning, the problem will get no points.
  1. Pattern Match in the Function Arguments When Possible: Tuples, records and datatypes can be deconstructed using pattern matching.  If you simply deconstruct the function argument before you do anything useful, it is better to pattern match in the function argument. Consider these examples:
    Bad
         fun f arg1 arg2 = let
           val x = #1 arg1
           val y = #2 arg1
           val z = #1 arg2
         in
           ...
         end
    Good
         fun f (x,y) (z,_) = ...
    Bad
         fun f arg1 = let
           val x = #foo arg1
           val y = #bar arg1
           val baz = #baz arg1
         in
           ...
         end
    Good
         fun f {foo=x, bar=y, baz} = ...
  1. Function Arguments Should Not Use Values for Patterns: You should only deconstruct values with variable names and/or wildcards in function arguments.  If you want to pattern match against a specific value, use a case expression or an if expression.  We include this rule because there are too many errors that can occur when you don't do this exactly right.  Consider the following:
    fun fact 0 = 1
      | fact n = n * fact(n-1)
        
    fun fact n =
      if n=0 then 1
      else n * fact(n-1)
    The latter is considered better style.
  1. Avoid Using Too Many Projections: Frequently projecting a value from a record or tuple causes your code to become unreadable.  This is especially a problem with tuple projection because the value is not documented by a variable name.  To prevent projections, you should use pattern matching with a function argument or a value declaration.  Of course, using projections is okay as long as it is infrequent and the meaning is clearly understood from the context.  The above rule shows how to pattern match in the function arguments.  Here is an example for pattern matching with value declarations.
    Bad
         let
           val v = someFunction()
           val x = #1 v
           val y = #2 v
         in
           x+y
         end
    Good
         let
           val (x,y) = someFunction()
         in
           x+y
         end
  1. Pattern Match with as Few case Expressions as Necessary: Rather than nest case expressions, you can combine them by pattern matching against a tuple.  Of course, this doesn't work if one of the nested case expressions matches against a value obtained from a branch in another case expression.  Nevertheless, if all the values are independent of each other you should combine the values in a tuple and match against that.  Here is an example:
    Bad
         let
           val d = Date.fromTimeLocal(Time.now())
         in
           case Date.month d of
             Date.Jan => (case Date.day d of
                            1 => print "Happy New Year"
                          | _ => ())
           | Date.Jul => (case Date.day d of
                            4 => print "Happy Independence Day"
                          | _ => ())
           | Date.Oct => (case Date.day d of
                            10 => print "Happy Metric Day"
                          | _ => ())
           | _ => ()
         end
    Good
         let
           val d = Date.fromTimeLocal(Time.now())
         in
           case (Date.month d, Date.day d) of
             (Date.Jan, 1) => print "Happy New Year"
           | (Date.Jul, 4) => print "Happy Independence Day"
           | (Date.Oct, 10) => print "Happy Metric Day"
           | _ => ()
         end
  1. Don't use valOf, hd, or tl: The functions valOf, hd, and tl are used to deconstruct option types and list types; however, they raise exceptions on certain inputs.  You should never use these functions.  In the case that you find it absolutely necessary to use these (something that probably won't ever happen), you should handle any exceptions that can be raised by these functions.
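
A minimal sketch of how a complete case expression can replace hd and valOf is shown below.  The functions firstOr and addOpt are hypothetical names used only for this illustration.

```sml
(* Returns the first element of xs, or default if xs is empty. *)
fun firstOr (default:int, xs:int list):int =
  case xs of
    [] => default
  | x::_ => x

(* Adds two optional ints; the result is NONE if either is missing. *)
fun addOpt (a:int option, b:int option):int option =
  case (a, b) of
    (SOME x, SOME y) => SOME (x + y)
  | _ => NONE
```

Both matches cover every possible input, so they compile without an incomplete-match warning and cannot raise an exception at run time.  The second function also matches against a tuple rather than nesting two case expressions.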

Factoring

  1. Don't Let Expressions Take Up Multiple Lines: If a tuple consists of more than two or three elements, you should consider using a record instead of a tuple.  Records have the advantage of placing each name on a separate line and still looking good.  Constructing a tuple over multiple lines makes your code look hideous -- the expressions within the tuple construction should be extraordinarily simple.  Other expressions that take up multiple lines should be done with a lot of thought.  The best way to transform code that constructs expressions over multiple lines to something that has good style is to factor the code using a let expression.  Consider the following:
    Bad
         fun euclid (m:int,n:int) : (int * int * int) =
           if n=0
             then (1, 0, m)
           else (#2 (euclid (n, m mod n)),
                 #1 (euclid (n, m mod n)) - (m div n) *
                 #2 (euclid (n, m mod n)), #3 (euclid (n, m mod n)))
    Good
         fun euclid (m:int,n:int) : (int * int * int) =
           if n=0
             then (1, 0, m)
           else let
             val q = m div n
             val r = m mod n
             val (u,v,g) = euclid (n,r)
           in
             (v, u-(q*v), g)
           end
  1. Break Up Large Functions into Smaller Functions: One of the greatest advantages of functional programming is that it encourages writing smaller functions and combining them to solve bigger problems.  Just how and when to break up functions is something that comes with experience.
  1. Over-factoring code: In some situations, it's not necessary to bind the results of an expression to a variable.  Consider the following:
    Bad
         let
           val x = TextIO.inputLine TextIO.stdIn
         in
           case x of
             ...
         end
    Good
         case TextIO.inputLine TextIO.stdIn of
           ...
    Here is another example of over-factoring (provided y is not a large expression):
    let val x = y*y in x+z end
          
    y*y + z
    The latter is considered better.
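
As one possible illustration of breaking a problem into smaller functions, the sketch below computes the average of the even numbers in a list.  The names isEven, sum, and averageOfEvens are hypothetical; the example assumes only the top-level and List functions of the basis library.

```sml
(* True when n is divisible by 2. *)
fun isEven (n:int):bool = n mod 2 = 0

(* Sums a list of integers. *)
fun sum (xs:int list):int = foldl (op +) 0 xs

(* Integer average of the even elements of xs, or NONE if xs
 * contains no even elements. *)
fun averageOfEvens (xs:int list):int option =
  let
    val evens = List.filter isEven xs
  in
    if null evens then NONE
    else SOME (sum evens div length evens)
  end
```

Each helper is short enough to check at a glance.  Here evens is bound to a name because it is used three times, so this is not a case of over-factoring.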

Verbosity

  1. Don't Rewrite Existing Code: The basis library and the SML/NJ library have a great number of functions and data structures -- use them!  Often students will recode List.filter, List.map, and similar functions.  A more subtle case of recoding involves the fold functions.  Rather than writing a function that recursively walks down a list, make vigorous use of List.foldl or List.foldr.  Other data structures often have a folding function; use it whenever it is available.
  1. Misusing if Expressions: Remember that the type of the condition in an if expression is bool. In general, the type of an if expression is 'a, but in the case that the type is bool, you should not be using if at all. Consider the following:
    Bad                             Good
    if e then true else false       e
    if e then false else true       not e
    if beta then beta else false    beta
    if not e then x else y          if e then y else x
    if x then true else y           x orelse y
    if x then y else false          x andalso y
    if x then false else y          not x andalso y
    if x then y else true           not x orelse y
  1. Misusing case Expressions: The case expression is misused in two common situations.  First, case should never be used in place of an if expression (that's why if exists).  Note the following:
    case e of
      true => x 
    | false => y
    
    if e then x else y
    The latter expression is much better.  Another situation where if expressions are preferred over case expressions is as follows:
    case e of
      c => x   (* c is a constant value *)
    | _ => y
    
    if e=c then x else y
    The latter expression is definitely better.  The other misuse is using case when pattern matching with a val declaration is enough. Consider the following:
    val x = case expr of (y,z) => y
    
    val (x,_) = expr
    The latter is considered better.
  1. Other Common Misuses: Here is a bunch of other common mistakes to watch out for:
    Bad                           Good
    l::nil                        [l]
    l::[]                         [l]
    length + 0                    length
    length * 1                    length
    big exp * same big exp        let val x = big exp in x*x end
    if x then f a b c1            f a b (if x then c1 else c2)
    else f a b c2
    String.compare(x,y)=EQUAL     x=y
    String.compare(x,y)=LESS      x<y
    String.compare(x,y)=GREATER   x>y
    Int.compare(x,y)=EQUAL        x=y
    Int.compare(x,y)=LESS         x<y
    Int.compare(x,y)=GREATER      x>y
    Int.sign(x)=~1                x<0
    Int.sign(x)=0                 x=0
    Int.sign(x)=1                 x>0
  1. Don't Rewrap Functions: When passing a function around as an argument to another function, don't rewrap the function if it already does what you want it to.  Here's an example:
    List.map (fn x => Math.sqrt x) [1.0, 4.0, 9.0, 16.0]
    
    List.map Math.sqrt [1.0, 4.0, 9.0, 16.0]
    The latter is better. Another case for rewrapping a function is often associated with infix binary operators. To prevent rewrapping the binary operator, use the op keyword. Consider this example:
    foldl (fn (x,y) => x + y) 0
    
    foldl (op +) 0
    The latter is considered better style.
  1. Don't Needlessly Nest let Expressions: It is possible to have more than one declaration in the first block of a let...in...end expression.  Consider the following:
    let
      val x = 42
    in
      let
        val y = 101
      in
        x + y
      end
    end
    
    let
      val x = 42
      val y = 101
    in
      x+y
    end
    The latter is better style.
  1. Avoid Computing Values Twice: When computing values twice you're wasting CPU time and making your program ugly.  The best way to avoid computing things twice is to create a let expression and bind the computed value to a variable name.  This has the added benefit of letting you document the purpose of the value with a variable name -- which means less commenting.
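
The let-binding approach to avoiding repeated computation might look like the following sketch.  Here fib is only a hypothetical stand-in for an expensive computation, and the other two names are likewise made up for the example.

```sml
(* Naive Fibonacci, standing in for an expensive computation. *)
fun fib (n:int):int = if n < 2 then n else fib (n-1) + fib (n-2)

(* Bad: fib n is evaluated twice. *)
fun fibSquaredBad (n:int):int = fib n * fib n

(* Good: fib n is evaluated once and the result is named. *)
fun fibSquaredGood (n:int):int =
  let
    val f = fib n
  in
    f * f
  end
```

Both versions return the same result, but the second does half the work.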