Assignment 4: Worth 15% of the Homework grade
Due: Monday July 13th at 11:59

Python is a very useful tool for system administrators manipulating files: admins typically troll filesystems looking for obese files, malformed directories, permission problems, troublesome scripts, etc. You may never do any system administration, but it's good to know how to write simple tools to help you keep your filesystem in shape.

The object of this assignment is to become more familiar with Python for handling files and manipulating directory structures. The os, os.path and shutil Python modules are your friends. Chapter 9 of the "Core Python Programming" book has some discussion of the os and os.path modules. Chapter 10 of the "Python In A Nutshell" book has some more details about the os, os.path and shutil modules.

To turn in program 4 (recdiff) from lectura, type

    % turnin 380assignment4 recdiff

1. Write a program called "recdiff" that is a Python script. Notice the lack of a ".py" extension: that is on purpose. Make sure your file is executable.

The main purpose of recdiff is to 'recursively' compare two directories: that is, compare the elements of each directory (possibly recursively) and return a status. The most likely uses of a program like this would be to (a) write a testing script for a class, or (b) compare code against a backup version to see what you changed: a poor man's version control comparator. For example: if you take some code into the field and change it to make it work, you want to be able to bring it back home and compare it against your original baseline.

Invocation is simple from the command line. If you forget to give it two directories, it will show you the usage:

    % recdiff
    usage: recdiff dir1 dir2
    Not enough arguments

The program will return 0 (look in the STATUS variable) if and only if the structure of both directories is exactly the same, and all files in all directories compare the same.

    % recdiff /home/rts/a1 /home/rts/a2     # dirs a1 and a2 exactly the same
    % echo $STATUS
    0

The purpose of the tool is to tell you what files have changed, what files have stayed the same, what files have been deleted, and what files have been added between directories. For example:

    % recdiff d1 d2
    a.txt is in d1 but not d2
    . ok: d1/a1/some.txt d2/a1/some.txt
    DIFFERENT! d1/a2/else.txt d2/a2/else.txt
    ADDED.txt is in d2/a3 but not d1/a3
    % echo $STATUS     # dirs are NOT the same!!!
    1

In the previous example, we can see that file 'a.txt' was deleted when going from d1 to d2, 'some.txt' hasn't changed, 'else.txt' HAS changed, and ADDED.txt got added to d2/a3.

The philosophy of the tool is to compare as much as possible: don't just "give up" when there is a new file or a deleted file. The tool wants to tell you about every set of files it can match.

Note that "recdiff" does not actually show the diffs; it only shows that two files are different or the same. You can then manually do a diff to see what has changed: the purpose of this tool is really to give you an overview of all the changing files.

All the testcases are online in /local/cs380/EXAMPLES/assignment4 on lectura. The 'testing.py' script describes what each test is testing, so you can see what the test is looking for. The testing script is your best friend: it shows you exactly what we will be testing and expecting (although we reserve the right to change filenames or add a few more tests).

NOTE: Two things are important: (1) When going through the files in a directory, iterate through them in alphabetical order. Otherwise, your output may not match the test cases. (2) The first argument to recdiff controls the iteration: in other words, you always iterate through the list of files from arg1 first, THEN iterate through the leftovers in arg2 (for missing/added files).
That's why running 'recdiff f1 f2' will sometimes give slightly different results from 'recdiff f2 f1': both should ALWAYS catch all the changes/additions/deletions, but the order may differ.

Of course, there is no arbitrary limit on how deep the directories you examine may be (this suggests some kind of recursive solution).
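
One possible shape for the recursion and the iteration order described above is sketched here. It is only a sketch under stated assumptions, not the expected solution: the function names compare_dirs and main are made up for illustration, filecmp.cmp (from the standard filecmp module) is just one way to compare file contents, and the exact wording and spacing of the output lines, the handling of a file that is matched against a directory, and the exit status after the usage message are guesses. The 'testing.py' script is the authority on all of those.

```python
#!/usr/bin/env python3
# The interpreter line above is an assumption; use whatever lectura provides.
import filecmp
import os
import sys


def compare_dirs(dir1, dir2):
    """Hypothetical helper: report on dir1 vs dir2, return True if they match."""
    names1 = sorted(os.listdir(dir1))   # alphabetical order, as the NOTE requires
    names2 = sorted(os.listdir(dir2))
    same = True

    # The first directory controls the iteration.
    for name in names1:
        path1 = os.path.join(dir1, name)
        path2 = os.path.join(dir2, name)
        if name not in names2:
            print("%s is in %s but not %s" % (name, dir1, dir2))
            same = False
        elif os.path.isdir(path1) and os.path.isdir(path2):
            # Always recurse, even if a difference was already found:
            # the tool should report on everything it can match.
            if not compare_dirs(path1, path2):
                same = False
        elif os.path.isfile(path1) and os.path.isfile(path2):
            # shallow=False forces a comparison of the file contents.
            if filecmp.cmp(path1, path2, shallow=False):
                print("ok: %s %s" % (path1, path2))
            else:
                print("DIFFERENT! %s %s" % (path1, path2))
                same = False
        else:
            # Assumption: a file matched against a directory counts as different.
            print("DIFFERENT! %s %s" % (path1, path2))
            same = False

    # THEN the leftovers: names that appear only in the second directory.
    for name in names2:
        if name not in names1:
            print("%s is in %s but not %s" % (name, dir2, dir1))
            same = False

    return same


def main(argv):
    """Hypothetical entry point: return the exit status for the given argv."""
    if len(argv) < 3:
        print("usage: recdiff dir1 dir2")
        print("Not enough arguments")
        return 1   # assumption: the handout does not give this status
    return 0 if compare_dirs(argv[1], argv[2]) else 1
```

To use this as a script, it would end with a call such as sys.exit(main(sys.argv)), so that the status shows up in the shell after the run. Because compare_dirs calls itself on each matching pair of subdirectories, the sketch places no fixed limit of its own on directory depth, and because it never returns early, one missing file does not hide the results for the rest of the tree.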