M, W, F 10:10-11:00
in Upson 142

CS 1133: Short Course in Python

Spring 2019

Notes on Python Programming Style

A goal of CS 1133 is for you to learn to write programs that are not only correct but also understandable. These guidelines should help you toward that goal. We ask that you follow these guidelines when writing programs in this course. They will give you a good basis for developing a style of your own as you become a more experienced programmer.

This handout includes guidelines for Python constructs that will be covered later in the semester. Skim these sections now and read them more carefully later when the topics are discussed.


Table of Contents


Reasons for a Disciplined Programming Style

Your computer program should be readable, because if it is not readable, the chances of it being correct are slim. Moreover, if your program is unreadable, it will be difficult and time-consuming to find and correct the errors in it. Therefore,

Using a disciplined style of programming will save you time whenever you have to read your program or debug it.

Your program should be readable by others, not just you. Consider a program you will write in this course. A grader who has trouble understanding it is not likely to give you good a grade. The more understandable, the simpler your program appears to a grader, the better the grade they will be willing to give you.

Outside this course, making programs readable by others becomes even more important. Most programs live a long time and require "maintenance"; changes to adapt to new and different requirements, upgrades in other software, new hardware, etc. Further, the author of the program is quite likely not going to be around when the maintenance is required; someone else must read the program and understand it enough to update it successfully.

Even the programs you write for yourself should be readable; if not, four weeks after finishing it you will not remember it enough to make changes simply.

Thus, simply for your own sake and for the sake of others, it makes sense to develop programming habits that lend themselves to writing readable, understandable, and correct, programs.

Part of these habits concern simple, syntactical measures like indenting program parts properly and using a few conventions for names of variables, methods, etc. The more important part concerns recording enough information in comments for the reader to understand how a program is designed and why.

A computer program is the result many design decisions. These decisions — why this variable was introduced, what that function does, etc. — are often not reflected in the final Python code, which consists of low-level, detailed declarations and statements. However, the higher-level design must be understood if a programmer is to modify the program successfully. Trying to understand decisions that are not recorded in comments in the code is tedious, error-prone, and aggravating --but all too common.

So, it will be to your advantage to instill in yourself some disciplined programming habits, right from the beginning, such as the following:

  • Write a precise definition (in a comment) of a variable when you assign to it for the first time.

  • Update the definition of the variable (the comment) when you decide to change its meaning.

  • Write a precise specification for a function (which says what it does) at the same time you write its heading.

  • Change the function specification as you make changes to the body that violate the specification.

All of these suggestions have the same theme: write your comments before you start to code. Writing all the comments after the program is finished is bad for three reasons:

  1. You did not have the comments to guide your testing and debugging, slowing you down.
  2. You may not remember the meaning of all the variables and methods, making it harder to write comments.
  3. Since you did not write the comments beforehand, you probably will not bother to write them afterwards either.

You will find that writing good comments as you write a program will help you clarify your ideas and write better, correct code sooner. If you can write down clearly what your program is doing you are more likely to have a good understanding of the problem and your program is more likely to be correct. Time spent on careful thinking and writing is more than repaid in time saved during testing and debugging.

Return to Table of Contents


Indentation and Layout

Many people do not think that Python requires a style guide because — unlike other languages — Python programs have to be properly formatted and indented in order to run. However, even so, there are many problems that proper formatting will keep from happening.


Tabs versus Spaces

In general, Python will allow you to indent with either tabs or spaces. However, beginning programmers should always indent with spaces. This is what your editor, Atom Editor, does. In fact, this is standard practice in most code editors. And whatever you do, you should absolutely never mix tabs and spaces. Doing so will cause your program to crash.

You should always use 4 spaces for each indentation level.


Line Length

You should strive for a maximum of 79 characters per line.

This convention is a historical artifact of the fact that older computers could only show 80 characters per line on the command prompt. This convention has persisted as people have become accustomed to this length, and it is now considered a "natural length" for a single line of code.

While most code editors will wrap the text for you if you exceed this character limit, text wrapping occurs in inconvenient places and can be difficult to read. To make the code easier to read, it would be better for you to break up the lines yourself.

In order to break up a line of code you need to put expressions in parentheses. You can then break up the code by putting line breaks inside of the parentheses. The following is acceptable Python code:

if (width == 0 and height == 0 and
    color == 'red' and emphasis == 'strong' or
    highlight > 100):

When using parentheses to break up code across multiple lines, it is standard practice to indent the lines so that they start just after the position of the initial parenthesis. This is shown in the example above.


Blank Lines

Because Python is whitespace significant, blanks lines should be used sparingly and to give the most emphasis to code divisions. The following two important guidelines should be observered:

  • Functions (and class definitions) should be separated by two blank lines.
  • Methods (within a class definition) should be separated by a single blank line.

In addition, you may use blank lines to separate logical sections of code within a function or method, but this should be done sparingly.


Import Statements

Imports should usually be on separate lines. The following

import os
import sys
is preferable to
import sys, os

However, it is okay to use commas when importing multiple items from a single module. The following is acceptable:

from subprocess import Popen, PIPE

The from keyword should be limited to imports that are used heavily within the current module, and which are guaranteed to cause no collisions with the active namespace.

Return to Table of Contents


Naming Conventions

Good naming conventions can help you and any reader of a program understand the program easily; bad naming conventions or no naming conventions can lead to confusion and misunderstanding. Some people advocate using very long names that define what the named entity (a method, field, local variable, etc.) represents. However, in general, this is not feasible. A good definition for an entity may require three lines of explanation or more ---how can you have such a long name? On the other hand, extremely short names --one or two characters long-- don't help at all. Thus, some in-between measures have to be adopted. There are contexts where a short name is best and contexts where a longer name is best.

Remember, a name can rarely be used to give a complete, precise definition of the entity it names, and a complete definition should always be given where the entity is defined.

Unlike other programming languages, the naming conventions for Python are not quite finalized. In any Python module, you might see the following mix of conventions:

  • Lowercase: example
  • Lowercase with underscores: example_in_python
  • Uppercase: EXAMPLE
  • Uppercase with underscores: EXAMPLE_IN_PYTHON
  • CamelCase (each word in phrase is capitalized): ExampleInPython
  • Lower CamelCase (CamelCase with first letter lower case): exampleInPython

In order to maintain maximum clarity, we obey the following rules (which are standardized in Java, but accepted in Python too).

  • Non-object oriented features (functions and global variables) use underscores.
  • Object oriented features (classes and methods) use CamelCase.
  • Class names are upper (Camel) case.
  • All other names are lower (Camel or underscore) case.

Function names

Example: draw_oval, show_window, length, interest

A function that is a command to do something should be given by a verb phrase that gives some indication of what the method does (But this name should never be used as an excuse to not comment the method). This primarily includes procedures, though it could be true for fruitful functions as well.

On the other hand, a method that simply returns a value (e.g. a fruitful function with no side effects) should be a noun phrase that describes the value. Obviously, determining which case applies to a particular function can be a matter of taste.

Functions which are not methods in a class should use either lowercase or lowercase with underscores (when the function name is a compound word).


Parameter Names

Examples: x, y

The precise meaning of a parameter, and any restrictions on it, should be given in the comment of the heading of the method. Therefore, particularly if the method is fairly short, it is wise to give parameters short names, and avoid compound words. The names should be lower case. If you absolutely must use a compound word for a parameter, use underscores for functions and CamelCase (with first letter lower case) for methods.

For a more detailed example, consider the two method headings given below. The first is preferable because it is shorter and easier to understand. Moreover, the body of the method of the first method will also be shorter and far easier to understand and manipulate.

    drawOval(x, y, w, h):
        """
        Draws an ellipse that fits within the rectangle given.
      
        The upper left corner of the rectange is at position (x,y), its width is w, 
        and its height is h. The object uses the current color to draw the ellipse.
        """
    draw_oval(x_coordinate, y_coordinate, width, height):
        """
        Draws an ellipse that fits within the rectangle given.
      
        The upper left corner of the rectange is at position (x_coordinate,y_coordinate), 
        its width is width, and its height is height. The object uses the current color 
        to draw the ellipse.
        """

A parameter used as a flag should be named for what the flag represents, like no_more_pizza, rather than simply flag. In addition, you should avoid generic names like "count" and "value". Instead, describe the items being counted or the value stored in the parameter.

The first parameter for an instance method should always be self. The first parameter for a class method should always be cls.


Local Variable Names

Examples: size, x_coordinate, no_lines

A local variable of a method contains information that helps describe the state of the method during its execution. A local-variable should be a noun phrase that describes the information that it contains. However, the local variable still needs a more precise comment that describes it.

Local variables should start with a lower case letter. You may either use underscores or CamelCase, but you must be consistent throughout the module.

If the body of a function is short, or the places in which a local variable is used is fairly short, then a short, one-or-two letter name can be used for the local variable (see also the conventions for parameter names). A name like the_loop_counter or first_number instead of kk or x causes clutter. If the local variable is used only in a short context, and if it is suitably defined with a comment at its place of first assignment, then use the short name.

A variable used as a flag should be named for what the flag represents, like no_more_pizza, rather than simply flag. In addition, you should avoid generic names like "count" and "value". Instead, describe the items being counted or the value stored in the variable.


Global Variable Names

Examples: PI, Y, E, WINDOW_SIZE

Global variables should be reserved for constants (e.g. variables that do not change after their initial assignment). Standard practice for constants is to write them in upper case with underscores as needed.

Unlike other languages, Python does not enforce constants. It is possible (but a bad idea) to change a constant variable. This is an instance where the use of a naming convention is incredibly important. The user of uppercase indicates that it is a bad idea to change this variable.


Class names

Example: FilterInputStream, LivingMammals

Since a class represents a set of possible objects, each of which is an instance of the class, a class name should generally be a noun phrase that identifies the possible objects. Class names are written in proper CamelCase, with the first letter capitalized.


Method names

Example: drawOval, fillOval, length, toString, _privateMethod

The guidelines for method names are similar to those for function names except that we use CamelCase (with first letter lowercase) rather than underscores. Methods that are private (e.g. should only be used as a helper method) should begin with an underscore.

The first parameter in an instance method should always be self. The first parameter in a class method should always be cls.


Attribute Names

Examples: size, xCoordinate, noLines, _privateField

An attribute name, or instance variable as they are sometimes called, should be CamelCase with with the first letter lowercase. Atributes that are private (e.g. are not accessed outside of the instance methods) should begin with an underscore.

Attributes contain information that determine the state of their object. Therefore, an attribute name should be a noun phrase that describes the information the field contains. However, the field still needs a more precise comment that describes it.

An attribute should be CamelCase with with the first letter lowercase. Attributes that are private (e.g. are not accessed outside of the instance methods) should begin with an underscore.


Property Names

Examples: size, xCoordinate, noLines

A property looks like an attribute, but it is really a collection of accessor methods (getters and setters) for manipulating an attribute. A property typically corresponds to another private attribute. Hence, the property name should be the same as the attribute name, but without the leading underscore.

Return to Table of Contents


Comments

Python has two types of comments: single line comments (which start with a # sign) and docstrings (which are enclosed in triple quotes). The following is a general rule regarding commenting:

Specifications are docstrings; all other comments are single line comments.

Specification Style

All specification comments, be they for a function, module, or class, follow the same format. They are a docstring enclosed in triple-quotes on either side. They should start with a simple description that can fit on exactly one line. This should be followed by a blank line, and then a more detailed description of the specification.

For example, here is a detailed specification for the method drawOval:

    draw_oval(x, y, w, h):
        """
        Draws an ellipse that fits within the rectangle given.
      
        The upper left corner of the rectange is at position (x,y), its width is w, 
        and its height is h. The object uses the current color to draw the ellipse.
        """
Note that we do not go into detail about the parameters until after the blank line. The goal is to make a first line that is as descriptive, but short, as possible.

Sometimes the first line of the specification is enough information. In that case, there is no need for the blank line, as shown in the example below:

    def trim(s):
        """
        Returns: copy of s with no leading or trailing whitespace
        """


Module Comments

Module comments are extremely important for this course, particularly as they are typically the first thing that we read when we grade your assignments. Module comments should be the very first thing in any module (e.g. a .py file). As they are a specification, they should use docstrings. In particular, this is the documentation that will display when you use the help() command on this module.

Module comments should obey the basic rules for specifications. That is, they should have a short single line, followed by a blank line and then a more detailed explantion. However, the difference is what to do at the end of the specification. The last two lines of a module comment are special. The second to last line should be the name of the author (please include your netid). The last line should be the date that the file was last modified.

Putting all of this together, here is an example of a module comment in full:

"""
Module to demonstrate basics of Python style

This module does not do anything interesting when it is loaded or run as an 
application.  The primary purpose of this module is to demonstrate good programming 
style in Python.

Authors: Walker White (wmw2), Lillian Lee (ljl2)
Version: August 24, 2012
"""

Function Specifications

Every function header should be followed by a comment giving its specification. The specification comment should be a docstring, following the basic rules for specifications. It should be indented, just like the rest of the function body.

This specification, together with the function header — which gives the function name and number parameters — should provide all of the information needed to use the function. It should describe any restrictions on the parameters and what the function does, not how it does it. Someone should never have to look at the body of a method to understand how to use it.

One of the most important aspects of the specification comment is to describe the parameters of the function. Each parameter should be listed separately, with its own precondition. Here is an example:

    def find_common (t, c):
        """
        Prints the most frequently occurring temperature in t[0:c]. 
        
        If there is more than one possibility for the most frequently occuring 
        temperature, this function prints the least such temperature.
        
        Parameter t: a sequence of temperature values in degrees centigrade.  
        Precondition: t is a sequence of floats
        
        Parameter c: How far to search in the sequence t.
        Precondition: c is an int, 0 <= c < len(t)
        """

Unfortunately, it is more typical to find someone write a comment like the following (if any comment is provided at all).

    def find_common (t, c):  
        """
        Finds the most frequent temperature
        """

This specification fails to say what part of sequence t is to be included in finding the most frequent temperature. It also fails to say what it should do with the value it finds (Print it? Return it?). The only way for the user to find out is to look at the body of the method (if it is available), and that should not be necessary.

With that said, long paragraphs are just as hard to read as code. When you write comments they should be as short and to the point as possible. For a fruitful function (i.e. one that returns a value other than None), it is often easiest to simply describe the value returned, using the word "Return":

    def dist(x1, y1, x2, y2):
        """
        Returns: distance between points (x1,y1) and (x2,y2)
        """

Another good rule of thumb comes from The Elements of Style, a famous little book on writing style by Cornell Professors W. Strunk, Jr., and E.B. White. Comments should always use active voice whenever possible to keep the information short and direct. For example, do not write the specification "This function searches list x for a value y and ..." or "Function isin searches list x for a value y ...". Such specifications are too wordy and are not commands but descriptions. Instead, they should say the following:

    def isin(y, values):
        """
        Returns: True if y is in values
        
        Parameter y: A value to search for
        Precondition: None
        
        Parameter values: The list to search within
        Precondition: values is a list
        """

As a final word, note that all of our functions specifications end with a section labeled "PRECONDITIONS". While this information is often mentioned in the descriptive paragraph that preceeds it, this is a handy quick reference to remind the programmer of the preconditions required by that particular function.


Statement-Comments

Just as the sentences of an essay are grouped in paragraphs, so the sequence of statements of the body of a method should be grouped into logical units. Often, the clarity of the program is enhanced by preceding a logical unit by a comment that explains what it does. This comment serves as the specification for the logical unit; it should say precisely what the logical unit does.

The comment for such a logical unit is called a "statement-comment". This comment should be written as a command to do something. Here is an example:

    # Ensure x >= y by swapping x and y if needed.
    if (x < y):
        tmp = x
        x = y
        y = tmp

The comment should explain what the group of statements does, not how it does it. Thus, it serves the same purpose as the specification of a method: it allows one to skip the reading of the statements of the logical unit and just read the comment. With suitable statement-comments in the body of a method, one can read the method at several "levels of abstraction", which helps one scan a program quickly to find a section of current interest, much like on scans section and subsection headings in an article or book. But this purpose is served only if statement-comment are precise.

Statement comments should be complete. For example, the following comment is inadequate:

    # Test for valid input

What happens if the input is valid? What if it isn't; is an error message written or is some flag set? Without this information, one must read the statements for which this statement-comment is a specification, and the whole purpose of the statement comment is lost.

Placement of Statement-Comments

Here is a more extended example of a statement comments.

    # Eliminate whitespace from the beginning and end of t
    while (len(t) != 0 and t[0].isspace()):
        t= t[1:]

    # If t is empty, print an error message and return
    if (len(t) == 0):
        ...
        return False;

    if (contains_capitals(t)):
        ...

    # Store the French translation of t in t_french
    ...

At the highest level, this program fragment consists of two statements: (0) Truthify the definition "t" and (1) Store the French translation of "t" in "t_french". The statement "Truthify the definition of t" is implemented in three steps, two of which are themselves statement-comments. Thus, this program fragment has three levels of abstraction.

The comments identifying a statement-comment generally go at the start of the code fragment. Therefore, it is easy to tell where a statement-comment begins; however, it is just as important to tell where it ends. For the simplest statement-comments — those that do not have any statement-comments nested inside of them — they clearly end when we see the comment for the next statement-comment. For example, "Eliminate whitespace from the beginning and end of t" is complete by the time we reach "If t is empty, print an error message and return"

However, this is problematic for more complex statement-comments. The statement-comment "Truthify the definition of t" has statement-comments inside of it. It is not complete by the time we see "Eliminate whitespace from the beginning and end of t". Therefore we have to add an additional comment marking the end of this statement-comment, which we have done in the example above.

In our example, note the reliance on the definition of "t" in the statement-comment "Truthify the definition of t" This allows the statement-comment to be short but precise. But it means that we must have a suitable definition for "t" at its first assignment.


Attribution Comments

In order to avoid plagiarism, you should always credit a co-author or source. Co-authors should be credited in the module comments. If someone is not a co-author (e.g. they did not write code), but they influenced the code through other means, you should acknowledge this in the module specification. For example:

"""
Module to demonstrate basics of Python style

This module does not do anything interesting when it is loaded or run as an 
application.  The primary purpose of this module is to demonstrate good programming 
style in Python.

This module was written after a discussion with Steve Marschner (srm2), though he did 
not contribute any code.

Authors: Walker White (wmw2), Lillian Lee (ljl2)
Version: August 24, 2012
"""

For specific algorithms or code fragments, the acknowledgement should be in the should be in the function specification. For example:

    def collides(shape1, shape2):
        """
        Returns: True if shape1 and shape2 intersect
        
        This function is an implementation of the Gilbert-Johnson-Keerthi distance 
        algorithm, as described in the article
        
        "A fast procedure for computing the distance between complex objects in t
        hree-dimensional space", IEEE Journal of Robotics and Automation, Vol 4. 
        Issue 2, 1988.
        
        Parameter shape1: The first shape to test
        Precondition: shape1 is an object of class Shape
        
        Parameter shape2: The second shape to test
        Precondition: shape2 is an object of class Shape
        """

Return to Table of Contents


Variable Definitions

Every significant variable and data structure needs a precise and complete definition, which provides whatever information is needed to understand the variable The most useful information is often the invariant properties of the data: facts that are always true except, perhaps,momentarily, when several related variables are being updated.

Variable definitions should accompany the very first assignment of a variable (e.g. when the variable is first created). They should be single line comments. They should either be on the line just before the variable assignment, or on the same line as the assignment.

Important Hint: Write the definition of a set of variables when you first conceive of using them, and type the definitions as comments into the program when you type in the initial assignments. If you later decide to change the meanings of the variables, change the definitions before you change the statement that uses the variables.

Here is an example of a definition for two variables "i" and "current_item". Note that they are defined in a single comment because they are related.

    # 0 <= i <= current_item < number_items and i is the smallest value
    # such that item i's price is at most item current_item's price.

The more precise a definition, the better. Comments like "flag for loop" or "index into b" are useless; they only say how the variable is used, but not what it means.

Related variables should be declared and described together. For example, the definition of a table should describe not only the sequence that holds the data but also the integer variable that contains the number of items currently in the table. In the example below, for utmost clarity. spaces are used to line up the comments.

    max_temp = 150          # max number of temperature readings
    n_temp = 0              # readings are in temp[0:nTemp], 0 <= nTemp < maxTemp

Class Specifications

Class specifications are docstrings that immediately precede the class header, shown as follows:

    class RGB(object):    
        """
        Instances represent a RGB color value.
        """

The specification docstring is indented to the same level as attributes and methods in the class. It also follows the basic rules for specifications.

Method specifications behave just like function specifications. However, instance attributes are documented as part of the class specification. Each attribute is listed with a description and its invariant in brackets []. For example, here is an example of a class shown in lecture:

    class Worker(object):
        """
        Instances represent a worker in a certain organization
        
        INSTANCE ATTRIBUTES:
           lname: Last name.  [str]
           ssn:   Social security no. [int in 0..999999999]
           boss:  The worker's boss. [Worker, or None if no boss]
        """

Properties

Even though they act like attributes, properties do not go in the class specification. Instead, the have their own separate specfications, written as a docstring. The docstring should follow the basic rules for specifications. However, the specification should be written like a variable definition, and not a method specification. That is, it should state the invariants of the property, but nothing about what the setter and/or getter do.

The specification should accompany the definition of the getter, which occurs before the setter. For example, here is an expanded version of the RGB class with the red property.

    class RGB(object):    
        """
        Instances represent a RGB color value.
        """

        @property
        def red(self):
            """
            The red channel
            
            This value is an int between 0 and 255, inclusive.
            """
            return self._red
       
        @red.setter
        def red(self, v):
            assert (v >= 0 and v <= 255), '%s is outside [0,255]' % repr(v)
            assert (type(v) == int), '%s is not an int' % repr(v)
            self._red = v

Note that the setter and deleter do not require specifications. Their behavior is fully specified by the invariant given in the specification for the getter. There are a few exceptions to this rule — when we might want to comment a setter or deleter separately — but that is rare and will be discussed in class.

Return to Table of Contents


Acknowledgments

The ideas in this missive originated in the structured-programming movement of the 1970's. They are every bit as applicable today. Many examples and ideas were taken from old Cornell CS100 and and CS211 handouts, originating in the work of Richard Conway and David Gries (see their 1973 text on An Introduction to Programming, using PL/C). Hal Perkins also had a hand in writing an earlier version of this document.

The current version of this document was written by Walker White as part of the transition of CS1110 to Python. In addition to the earlier version by David Gries, he incorporated many of the official style conventions outlined in PEP 257. Some of the examples here come from that article.

Return to Table of Contents


Course Material Authors: D. Gries, L. Lee, S. Marschner, & W. White (over the years)