CSc 352: C Coding Guidelines

The coding guidelines we will follow are based on GNU coding styles, with some modifications based on personal taste. Your programs will be expected to follow these guidelines unless explicitly directed otherwise.

0. Program Structure

The design of a procedural program is quite different from that of an object-oriented one. A very important aspect of program design is that the code should be written so that others can easily read and understand it. To develop good habits in this regard, we'll adopt the following (rather rigid) set of rules:

The main() routine in a program should consist simply of a series of calls to other functions, each of which carries out some high-level logical aspect of the computation, possibly with a little bit of control flow around them. The point is that someone reading the program should be able to understand the overall logic of the program by looking at main().
Each function in the program should have a comment before it stating what it does.
In general, each function should do one thing. If you find that a function is doing several different things, or that you're having trouble succinctly summarizing what a function does, you may want to reconsider the structure of the program. (This doesn't mean that it's never OK to have one function do more than one thing, but if you have a function doing many things, you should treat that as a warning flag, and you should think carefully about its design and be able to defend it.)

1. Robustness

Unless explicitly specified, avoid arbitrary limits on the length or number of data values, e.g., the length of a file name or an input line, the number of nodes in a tree or graph, etc. Instead, allocate the memory you need dynamically with malloc().
Do not pepper your code with hard-wired constants such as the length of a buffer, the size of a hash table, or the name of a scratch file. Use the values of #define'd macros instead. Thus, do not write

for (i = 0; i < 1024; i++) ...

Instead, write something like

#define HASH_TBL_SZ 1024
...
for (i = 0; i < HASH_TBL_SZ; i++) ...
Check every system call for an error return. If an error is detected, give an intelligible error message indicating the nature of the problem.
Check every call to malloc() to see if it returned zero. If malloc fails in a noninteractive program, make that a fatal error. In an interactive program (one that reads commands from the user), it is better to abort the command and return to the command reader loop.
Don't assume that the address of an int object is also the address of its least-significant byte. This is false on big-endian machines. Thus, don't make the following mistake:

int c;
...
while ((c = getchar()) != EOF)
    write(file_descriptor, &c, 1);
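
One way to avoid this mistake is to copy the character into a byte-sized object and pass the address of that object to write(). The sketch below is one possible version; the function name copy_stream and its parameters are assumptions made for this example, not part of the original guideline.

```c
#include <stdio.h>
#include <unistd.h>

/* copy_stream copies every byte read from the stream in to the file
   descriptor fd.  It returns 0 on success and -1 if write() fails,
   after reporting the failure on stderr.  */
int copy_stream(FILE *in, int fd)
{
    int c;                /* an int, so that EOF differs from every byte value */
    unsigned char byte;   /* exactly one byte, whatever the byte order */

    while ((c = getc(in)) != EOF) {
        byte = (unsigned char) c;
        if (write(fd, &byte, 1) != 1) {
            perror("write");
            return -1;
        }
    }
    return 0;
}
```

The variable c stays an int so that EOF can still be told apart from a valid character; only the one-byte copy is handed to write().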

Avoid casting pointers to integers if you can. Such casts greatly reduce portability, and in most programs they are easy to avoid.
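
The rules about arbitrary limits, named constants, and checking malloc() can be seen together in a routine that reads an input line of any length. This is only a sketch: the names read_line and INITIAL_LINE_CAPACITY are hypothetical, and the code assumes a noninteractive program, so an allocation failure is treated as fatal.

```c
#include <stdio.h>
#include <stdlib.h>

#define INITIAL_LINE_CAPACITY 64   /* a starting size only; the buffer grows */

/* read_line reads one line of any length from the stream in.  It
   returns a malloc'd, null-terminated string without the trailing
   newline; the caller must free it.  It returns NULL if end of file
   is reached before any character is read.  If memory runs out, it
   prints a message to stderr and exits.  */
char *read_line(FILE *in)
{
    size_t capacity = INITIAL_LINE_CAPACITY;
    size_t length = 0;
    char *line;
    char *bigger;
    int c;

    line = malloc(capacity);
    if (line == NULL) {
        fprintf(stderr, "read_line: malloc failed: out of memory\n");
        exit(EXIT_FAILURE);
    }

    while ((c = getc(in)) != EOF && c != '\n') {
        /* Keep room for this character and the terminating null. */
        if (length + 1 >= capacity) {
            capacity *= 2;
            bigger = realloc(line, capacity);
            if (bigger == NULL) {
                free(line);
                fprintf(stderr, "read_line: realloc failed: out of memory\n");
                exit(EXIT_FAILURE);
            }
            line = bigger;
        }
        line[length++] = (char) c;
    }

    if (c == EOF && length == 0) {
        free(line);
        return NULL;
    }
    line[length] = '\0';
    return line;
}
```

An interactive program would more likely return an error indication instead of calling exit(), so that the command reader loop could abandon the current command and carry on.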

2. Error Messages

Error messages should always be printed to stderr. They should, at the very minimum, indicate the nature of the problem. For example, if a system call has failed, indicate which system call and the nature of the failure (via perror).
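
For instance, a wrapper around the open() system call might report a failure as follows. This is a minimal sketch; the function name open_input and the wording of the second message are assumptions made for this example.

```c
#include <fcntl.h>
#include <stdio.h>

/* open_input opens the file named path for reading.  It returns a
   file descriptor on success.  On failure it reports the problem on
   stderr and returns -1.  */
int open_input(const char *path)
{
    int fd;

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        /* perror names the failing call and appends the system's
           description of errno, giving something like
           "open: No such file or directory".  It is called first,
           before any other library call can change errno.  */
        perror("open");
        fprintf(stderr, "cannot read input file '%s'\n", path);
    }
    return fd;
}
```

Both lines go to stderr: the first says which system call failed and why, and the second says what the program was trying to do at the time.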

3. Comments

Every program should start with a comment saying briefly what it is for. Example: "fmt - filter for simple filling of text".
Put a comment before each function saying what the function does, what sorts of arguments it gets, and what the possible values of arguments mean and are used for. It is not necessary to duplicate in words the meaning of the C argument declarations, if a C type is being used in its customary fashion. If there is anything nonstandard about its use (such as an argument of type char * which is really the address of the second character of a string, not the first), or any possible values that would not work the way one would expect (such as, that strings containing newlines are not guaranteed to work), be sure to say so.
Explain the significance of the return value, if there is one.
Write in complete sentences, in text that is (as far as possible) grammatically correct, and capitalize the first word (for example, avoid stuff like this). If a lower-case identifier comes at the beginning of a sentence, don't capitalize it! Changing the spelling makes it a different identifier. If you don't like starting a sentence with a lower case letter, write the sentence differently (e.g., "The identifier lower-case is ...").
Avoid references to classroom discussions or meetings without adequate mention of specific technical aspects of the discussion. Comments are intended for other people who may read your code, perhaps long after it was written: it may not be immediately obvious to them what was discussed in class. Instead, state explicitly what it is you're referring to. (It's OK to add references to classroom discussions or meetings as long as such references supplement, rather than replace, the technical information you're providing.)
Thus, do not write something like

The data structure used in this program is as discussed in class.

Much better would be:

The data structure used in this program is a hash table, where each hash bucket is organized as a balanced binary tree (as discussed in class).
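
Put together, the commenting rules in this section might produce something like the following. The file name and the function count_char are hypothetical, invented only to give the comments something to describe.

```c
/* countchar.c - count occurrences of a character in a string. */

#include <stddef.h>

/* count_char returns the number of times the character c occurs in
   the string s.  The string is not modified.  A null pointer is
   accepted for s and is treated as an empty string.  The terminating
   null character is never counted, so searching for '\0' always
   gives 0.  The return value is 0 if c does not occur in s.  */
size_t count_char(const char *s, char c)
{
    size_t count = 0;

    if (s == NULL)
        return 0;
    for (; *s != '\0'; s++) {
        if (*s == c)
            count++;
    }
    return count;
}
```

The comment does not restate that s is a string or that c is a character, since the declarations already say so. It spends its words on the cases a reader could not guess: the null pointer, the terminating null character, and the meaning of the return value. The first sentence begins with the lower-case identifier count_char, spelled exactly as it is in the code.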

4. Clean Use of C Constructs

Explicitly declare the types of all objects. For example, you should explicitly declare all arguments to functions, and you should declare functions to return int rather than omitting the int and relying on the default.
Declarations of external functions and functions to appear later in the source file should all go in one place near the beginning of the file (somewhere before the first function definition in the file), or else should go in a header file. Don't put extern declarations inside functions.
Header files should not contain function definitions: such definitions should be in *.c files.
Try to avoid assignments inside if-conditions. For example, don't write this:

if ((foo = (char *) malloc (sizeof *foo)) == 0)
    fatal ("virtual memory exhausted");

Instead, write this:

foo = (char *) malloc (sizeof *foo);
if (foo == 0)
    fatal ("virtual memory exhausted");
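
The rules on explicit types and on where declarations belong can be sketched with a small file whose helper functions are defined after their first use. All of the names here (days_in_year, days_in_month, is_leap_year, MONTHS_PER_YEAR, FEBRUARY) are hypothetical and chosen only for illustration.

```c
#define MONTHS_PER_YEAR 12
#define FEBRUARY 2

/* Declarations of functions defined later in this file, gathered in
   one place before the first function definition.  Every argument
   and every return type is declared explicitly.  */
static int is_leap_year(int year);
static int days_in_month(int month, int year);

/* days_in_year returns the number of days in the given year of the
   Gregorian calendar.  */
int days_in_year(int year)
{
    int month;
    int total = 0;

    for (month = 1; month <= MONTHS_PER_YEAR; month++)
        total += days_in_month(month, year);
    return total;
}

/* days_in_month returns the number of days in the given month of the
   given year.  Months are numbered from 1 (January) to 12.  */
static int days_in_month(int month, int year)
{
    static const int lengths[MONTHS_PER_YEAR] = {
        31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
    };
    int days;

    /* The assignment is a statement of its own, not part of the
       if-condition that follows.  */
    days = lengths[month - 1];
    if (month == FEBRUARY && is_leap_year(year))
        days++;
    return days;
}

/* is_leap_year returns 1 if year is a leap year and 0 otherwise.  */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}
```

If other source files needed days_in_year, its declaration would go in a header file, while its definition stayed here in the .c file.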

5. Naming Files, Variables and Functions

The name of a file should give some indication of the purpose of the code in that file. If you're having trouble coming up with a descriptive name for a file because the code in it has several different purposes, this may be an indication that, from an organizational perspective, you should split the code in the file into multiple files.
The names of global variables and functions in a program serve as comments of a sort. So don't choose terse names; instead, look for names that give useful information about the meaning of the variable or function. As in GNU programs, names should be in English, like the comments.
Local variable names can be shorter, because they are used only within one context, where (presumably) comments explain their purpose.

6. Information Hiding

When you declare a struct or union, you should also define macros to access its fields. Avoid accessing the fields directly in the main body of the program: instead, use these macros for accessing the fields. Thus, avoid writing code like this:

struct tn {
    int ntype;
    union {
        int nval;
        char *name;
        struct tn *child[2];
    } flds;
} *tptr;
...
if (tptr->ntype == EXPR_PLUS) {
    emit_code(tptr->flds.child[0]);
    emit_code(tptr->flds.child[1]);
}

Instead, write

struct tn {
    int ntype;
    union {
        int nval;
        char *name;
        struct tn *child[2];
    } flds;
} *tptr;
#define Type(x) ((x)->ntype)
#define Value(x) (((x)->flds).nval)
#define Name(x) (((x)->flds).name)
#define Child(x,i) (((x)->flds).child[i])
...
if (Type(tptr) == EXPR_PLUS) {
    emit_code(Child(tptr,0));
    emit_code(Child(tptr,1));
}

The latter style is easier to understand (e.g., it's easier to see that Child(tptr,0) refers to the 0th child of the node that tptr points to than it is with tptr->flds.child[0]). Changes in the data structure, e.g., to improve performance, are also easier to accommodate and less prone to bugs, because only the macro definitions have to change.
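
To show the accessor macros in a complete setting, the sketch below builds a small expression tree and evaluates it, touching the fields only through the macros. The struct and the macros are those given above; everything else (the EXPR_CONST node type, the numeric values of the type codes, and the functions new_node, make_const, make_plus, eval, and free_tree) is an assumption made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Node types; the numeric values are arbitrary choices for this sketch. */
#define EXPR_CONST 0
#define EXPR_PLUS  1

struct tn {
    int ntype;
    union {
        int nval;
        char *name;
        struct tn *child[2];
    } flds;
};

#define Type(x)    ((x)->ntype)
#define Value(x)   (((x)->flds).nval)
#define Name(x)    (((x)->flds).name)
#define Child(x,i) (((x)->flds).child[i])

/* new_node returns a newly allocated node of the given type.  If
   memory runs out, it prints a message to stderr and exits.  */
static struct tn *new_node(int type)
{
    struct tn *node;

    node = malloc(sizeof *node);
    if (node == NULL) {
        fprintf(stderr, "new_node: malloc failed: out of memory\n");
        exit(EXIT_FAILURE);
    }
    Type(node) = type;
    return node;
}

/* make_const returns a new leaf node holding the given value.  */
struct tn *make_const(int value)
{
    struct tn *node = new_node(EXPR_CONST);

    Value(node) = value;
    return node;
}

/* make_plus returns a new node representing left + right.  */
struct tn *make_plus(struct tn *left, struct tn *right)
{
    struct tn *node = new_node(EXPR_PLUS);

    Child(node, 0) = left;
    Child(node, 1) = right;
    return node;
}

/* eval returns the integer value of the expression rooted at node.  */
int eval(const struct tn *node)
{
    if (Type(node) == EXPR_PLUS)
        return eval(Child(node, 0)) + eval(Child(node, 1));
    return Value(node);
}

/* free_tree frees node and every node below it.  */
void free_tree(struct tn *node)
{
    if (Type(node) == EXPR_PLUS) {
        free_tree(Child(node, 0));
        free_tree(Child(node, 1));
    }
    free(node);
}
```

With these definitions, eval(make_plus(make_const(2), make_const(3))) yields 5. If the union were later replaced by a different layout, only the four macro definitions would need to change; the functions that use them would stay as they are.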

Acknowledgements

The material in this document is based on GNU coding styles.