Unix Lab

CS3410 Spring 2016


In-Class Lab, Due 4/11


Goal for Today

Today you will increase your familiarity with Unix by performing a series of tasks. Apart from writing the bash script and submitting to CMS, everything should be completed on the command line. (You could write the bash script on the command line too, but it would be really tedious.) Along the way you will get more comfortable with the command line, learn some very useful commands, and write a simple bash script.

What to submit

  1. submission2.txt - A file created according to the instructions in this lab.
  2. script.sh - A script that automatically generates that file for you.

What to do?

Short Version:
  1. Open the course VM and bring up a terminal.
  2. Download a compressed directory from the internet and open it up.
  3. Learn a little bit about the contents of the directory.
  4. Complete a few tasks, creating the file submission.txt along the way.
  5. Make a bash script that automatically creates the same file, but name it submission2.txt
  6. Make sure that the two files are identical.
  7. Submit!

Long Version:

(1) Open the course VM and bring up a terminal.

Start the course VM as usual. Go to the menu at the bottom left, select Accessories, and pick a Terminal (Byobu Terminal or LXTerminal). If you don't already have a directory in which you put your 3410 materials, go ahead and create a 3410 directory:
~% mkdir 3410
(Note: throughout this assignment we will do our best to show you exactly what you might see on your VM terminal. It may not match exactly.)
You can see that the directory was created using ls:
~% ls
3410  Desktop  Downloads  Templates
~%
Now step into the directory:
~% cd 3410
~/3410$
To step out of the directory, you type:
~/3410$ cd ..
~$
(If you did that, you'll have to step back into the 3410 directory again, which you can do either by repeating cd 3410 or by typing cd -, which brings you back to whatever your last directory was. This is a nice trick to know about in case you are deep into a series of subfolders and you accidentally type cd, which takes you all the way back to your home directory. Just type cd - and all is forgiven.) For today, you may want to make a subdirectory called lab_unix. Go ahead and create this directory inside your 3410 directory and step inside.
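
The navigation commands above can be tried safely in a throwaway directory. This is only a sketch: the scratch directory made with mktemp is an assumption of the example, not part of the lab, and is there so that nothing in your real home directory changes.

```shell
# Work in a throwaway directory so your real files are untouched.
scratch=$(mktemp -d)
cd "$scratch"

mkdir 3410            # create the course directory
cd 3410               # step into it
mkdir lab_unix        # create the lab subdirectory
cd lab_unix           # step into that as well
here=$(pwd)           # remember the full path of where we are

cd ..                 # step out one level
cd - > /dev/null      # jump back to the previous directory (cd - prints it; we discard that)
pwd                   # prints the lab_unix path again
```

Running pwd at any point tells you where you currently are, which is handy when you lose track.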

(2) Download a compressed, archived directory from the internet and open it up.

A very simple way of transferring a directory of files from one place to another is to archive it (bundle the directory into a single file) and compress it (encode it so that it takes up fewer bytes). We have created a compressed archive for you and placed it on the course website at this address:
http://www.cs.cornell.edu/Courses/cs3410/2016sp/labs/poems.tar.gz
We created this file by running the following command from inside the directory that contains the poems directory:
// you don't need to type this. this is just for future reference
~/3410/lab_unix% tar -zcvf poems.tar.gz poems
The flags mean:
-z: Compress archive using gzip program
-c: Create archive
-v: Verbose, i.e., display progress while creating archive
-f: Archive filename

One way you can grab this file is to go to a web browser and download it. There are also command line utilities that can perform non-interactive downloads like this for you. One such tool--which happens to be installed on the course VM--is called wget.

~/3410/lab_unix% wget http://www.cs.cornell.edu/Courses/cs3410/2016sp/labs/poems.tar.gz
Note that if you type the address incorrectly, you will get a 404 error, just as you would in a browser.

Now it's time for you to open the archive. To do this, you will use the same tar command that we used, but with a new flag:

-x: Extract
So you will type:
~/3410/lab_unix% tar -zxvf poems.tar.gz
Because you included the -v flag, you should see the following text appear:
x poems/
x poems/angelou
x poems/browning
x poems/eliot
x poems/hughes
x poems/naidu
x poems/nakayasu
x poems/neruda
x poems/plath
...letting you know that the extraction was taking place. (If the poems had been longer, you would have known exactly which file was currently being extracted because the system would have stayed on that line for some number of seconds.)
Now if you do an ls you will see that the directory poems now lives inside your lab_unix directory:
~/3410/lab_unix$ ls
poems
~/3410/lab_unix$ 
Congratulations! You have successfully completed this step.
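
If you would like to see the whole create-and-extract round trip in one place, here is a small sketch. The notes directory and its files are made-up stand-ins (not the lab's poems), and the -t flag, which lists an archive's contents without extracting, is an addition not covered above.

```shell
# Work in a throwaway directory so your real files are untouched.
scratch=$(mktemp -d)
cd "$scratch"

# Build a tiny stand-in directory (hypothetical names).
mkdir notes
echo "first note"  > notes/one
echo "second note" > notes/two

tar -zcf notes.tar.gz notes     # create a gzip-compressed archive (no -v, so it stays quiet)
tar -ztf notes.tar.gz           # -t lists what is inside the archive without extracting it

rm -r notes                     # remove the original so we can tell that extraction worked
tar -zxf notes.tar.gz           # extract; the notes directory reappears
cat notes/one                   # prints: first note
```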

(3) Learn a little bit about the contents of the directory.

If you are already familiar with ls, pattern matching, TAB-completion, less, cat, man, wc, pipe, head, tail, and chmod, feel free to skip this step.

Want to see what's inside the poems directory? Give ls an argument:

~/3410/lab_unix$ ls poems
angelou browning eliot hughes naidu  nakayasu neruda plath
~/3410/lab_unix$
Now step inside the poems directory. You can determine the length of a file (or files) by using the word count command, wc:
~/3410/lab_unix/poems$ wc *
Most Unix commands allow you to use pattern matching for arguments. The * is a wildcard token meaning everything. So the command you just typed applied wc to all files in the directory. If you only cared about files beginning in n you could type:
~/3410/lab_unix/poems$ wc n*
If you only cared about files ending in u you could type:
~/3410/lab_unix/poems$ wc *u
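
A handy way to preview what a pattern will match before handing it to a real command is to give it to echo, which simply prints its arguments. The sketch below uses empty stand-in files in a scratch directory (an assumption of the example) that borrow a few of the poem names.

```shell
# Work in a throwaway directory with empty stand-in files.
scratch=$(mktemp -d)
cd "$scratch"
touch naidu nakayasu neruda plath

echo n*     # prints: naidu nakayasu neruda
echo *u     # prints: naidu nakayasu
echo p*     # prints: plath
```

The shell expands the pattern before the command ever runs, so wc, cat, and echo all receive the same list of filenames.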
By default the wc command tells you how many lines, words, and bytes are in a particular file. For example, the file nakayasu has 8 lines, 26 words, and 160 bytes. Let's take a look at the file by typing:
~/3410/lab_unix/poems$ cat nakayasu
cat is a utility that concatenates and prints files. Since we only gave it one short file to work on, the result was fine. Suppose instead we type:
~/3410/lab_unix/poems$ cat *
The result is a bit overwhelming. If you want a more reasonable way to read these pages of text, you can pipe the result to a utility called less, which allows you to read text one screen at a time:
~/3410/lab_unix/poems$ cat * | less
To see the next screen, press the SPACE-BAR. To quit, type q. You can also less a file directly:
~/3410/lab_unix/poems$ less eliot
Suppose you only want to see the file angelou but you don't want to type it out. You can use the TAB button which will perform what is called TAB-completion. Type:
~/3410/lab_unix/poems$ less a
and instead of hitting RETURN, hit the TAB button. The entire word is completed for you. Now you can hit RETURN.

Suppose you only care about the file nakayasu but you don't want to type it out. Try to use TAB-completion again. Type:

~/3410/lab_unix/poems$ less n
and hit the TAB button. Because multiple files begin with n, TAB-completion cannot finish the word; instead it shows you all files that match (you may need to hit TAB a second time to see the list):
naidu nakayasu neruda
You'll have to be more specific if you want TAB-completion to complete the word for you. Go ahead and type:
~/3410/lab_unix/poems$ less nak
and hit TAB. This time, the entire word nakayasu has been completed for you. (ta-da!)

Suppose you really only care about how many lines each file has and don't like that wc keeps giving you word and byte counts. You can use a flag to wc that says "give me the line count only". To learn which flag this is, you'll have to use the man command, which gives you the "user manual" for any command available on the command line. Simply type:

~/3410/lab_unix/poems$ man wc
Note that the manual is usually larger than one screen, so the man command automatically pipes the manual into less for you. (Remember: SPACE for next page, q to quit. Want to know how to go back a page? Type man less at the terminal and find out!)

If you want to see the beginning of a file, you can use the command head:

~/3410/lab_unix/poems$ head plath
which will show you the first 10 lines of the file. A similar command, tail, will show you the last 10 lines of a file. There are flags which let you adjust how many lines these commands show you. Type man head to learn about them.

Finally, repeat the ls command with the flag -l which provides the directory listing in Long Format:

~/3410/lab_unix/poems$ ls -l
Among other things (which you can read about if you man ls), you can see on the far left the file permissions for each file. A good explanation of what you're looking at is on Wikipedia. The extremely short explanation is "These files are readable and writeable by you only." If for some reason you wanted to make naidu executable by you, you would type:
~/3410/lab_unix/poems$ chmod +x naidu
Now if you ls -l naidu you can see that it is technically executable. (If you actually try to execute it by typing "./naidu" on the command line, you'll at best incur a sea of errors. It doesn't really make sense to try to execute an ASCII encoded poem...)
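
To watch the execute permission change without touching the poems, you can try something like the following sketch. The file name note and the scratch directory are made up for the example; the [ -x file ] test asks the shell whether a file is executable.

```shell
# Work in a throwaway directory so your real files are untouched.
scratch=$(mktemp -d)
cd "$scratch"
echo "just some text" > note     # hypothetical file, not one of the poems

chmod -x note                    # make sure we start without execute permission
if [ -x note ]; then echo "executable"; else echo "not executable"; fi   # prints: not executable

chmod +x note                    # add execute permission
if [ -x note ]; then echo "executable"; else echo "not executable"; fi   # prints: executable

ls -l note                       # the permission string on the far left now contains x
```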

(4) Complete a few tasks, creating the file submission.txt along the way.

First, step out of the poems directory and create a file submission.txt. You can create a file without invoking a text editor by using the command touch:
~/3410/lab_unix/poems$ cd ..
~/3410/lab_unix$ touch submission.txt
Remember that you are no longer in the poems directory, so you'll have to prepend all arguments with poems/, for example:
~/3410/lab_unix$ cat poems/*
As you may have noticed, the first line of every file in poems is a title and the second line is a byline. Use a single head command to capture just the first 2 lines of each poem. First try this out and see if what gets printed to the screen is what you expect. Then redirect your head command to the submission.txt file:
~/3410/lab_unix$ head XXX > submission.txt
Where XXX is your correct invocation of head.

Next, use a single command to determine the number of lines in all files that have an a somewhere in the filename. Once again, test out your command on the command line, and when you are sure it's correct, redirect the results into your submission.txt file. This time, you'll use a double greater-than symbol (>>), which will append this text to the bottom of the existing submission.txt. (If you used > again, it would completely overwrite the file contents with this new information.)
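
The difference between > and >> is easy to see on a scratch file. This is a minimal sketch; log.txt is a made-up name and has nothing to do with your submission file.

```shell
# Work in a throwaway directory so your real files are untouched.
scratch=$(mktemp -d)
cd "$scratch"

echo "first"  >  log.txt   # > creates log.txt (or overwrites it if it already exists)
echo "second" >> log.txt   # >> appends to the end of the file
cat log.txt                # prints two lines: first, then second

echo "third" > log.txt     # > again: the earlier two lines are gone
cat log.txt                # prints: third
```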

Now, append the entire contents of hughes to the bottom of your submission.txt file.

Finally, use a single command line to determine the number of lines and characters in both submission.txt and neruda (in your script, this will be submission2.txt and neruda). Pipe that output back to wc on its default settings and then append the single-line result to the bottom of submission.txt. You should be able to do this in a single command line. Random Pro Tip: at any time when you are working on the command line, you can press the up-arrow key (bottom right of the keyboard) and it will show you the previous command you typed on the command line.

(5) Make a bash script that automatically creates the same file, but name it submission2.txt

There is a shell program running that executes the commands you type on the command line. On the course VM, this shell is called bash, the GNU Bourne-Again SHell. There are other shells out there: tcsh ("tee-see shell"), zsh (the Z shell), etc. If you have a series of commands that you want to execute on the command line, it can be faster to put them in a single executable file and run the entire sequence, especially if you might need to run it multiple times. The final part of this assignment will have you create a bash script that performs all the commands you just typed out in one fell swoop.

Well-named bash scripts end in ".sh" in order to let others know that this is a bash script. Touch a file called script.sh and change its file permissions to make it executable.

At this point feel free to open up the text editor of your choice.

The first line of your script specifies the complete path to the program that should be used to run the script. Next, you would usually include a quick comment explaining what the script does. Then the script would begin.

#! /bin/bash

# this is a script that completes the cs 3410 lab_unix tasks
#
echo Hello World
If you are ever running on a machine other than the course VM and you don't know where bash lives, you can type

~$ which bash
/bin/bash 
And you will be given the path to bash.

The script as you have been given it is a bash version of Hello World. Remove the echo line (man echo if you are curious!) and create a script that completes the set of tasks from the previous step, replacing all references to submission.txt with submission2.txt. Remember you can run your script on the command line by typing:

~/3410/lab_unix$ ./script.sh
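
For reference, here is the whole create, make-executable, and run cycle on a throwaway script. This is a sketch under a few assumptions: hello.sh is a made-up name, bash lives at /bin/bash as on the course VM, and the here-document (the <<'EOF' ... EOF part) is just a way to write the file without opening an editor.

```shell
# Work in a throwaway directory so your real files are untouched.
scratch=$(mktemp -d)
cd "$scratch"

# Write a small script; everything up to the closing EOF goes into hello.sh.
cat > hello.sh <<'EOF'
#!/bin/bash
# hypothetical example script: prints a greeting
echo "Hello from a script"
EOF

chmod +x hello.sh     # make it executable
./hello.sh            # prints: Hello from a script
```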

(6) Make sure that the two files are identical.

Now you should have two files that you believe are identical: submission.txt and submission2.txt. Use the diff command to see if they really are identical.

~/3410/lab_unix$ diff submission.txt submission2.txt
If diff prints nothing, the files are identical. If the files are not identical, diff will show you which lines differ and where. This report can be a little hard to read. It can help to add the -w flag (diff -w submission.txt submission2.txt), which ignores whitespace, or the -y flag piped to less (diff -y submission.txt submission2.txt | less), which shows the two files side by side, with differences marked by <, >, or | in the middle.
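
Besides printing differences, diff also reports its verdict through its exit status (success when the files are identical), which is useful inside a script. The sketch below uses three made-up scratch files and throws away diff's printed output so that only the verdict shows.

```shell
# Work in a throwaway directory so your real files are untouched.
scratch=$(mktemp -d)
cd "$scratch"
printf 'one\ntwo\n'   > a.txt
printf 'one\ntwo\n'   > b.txt    # same contents as a.txt
printf 'one\nthree\n' > c.txt    # second line differs

# "if" runs the command and branches on whether it succeeded.
if diff a.txt b.txt > /dev/null; then echo "a and b are identical"; else echo "a and b differ"; fi
if diff a.txt c.txt > /dev/null; then echo "a and c are identical"; else echo "a and c differ"; fi
```

The first line printed is "a and b are identical" and the second is "a and c differ".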

(7) Submit!

Please go upload submission2.txt and script.sh to CMS. (Note: the latter should produce the former when sitting next to the appropriate poems directory.)

Rookie (SUPER LAME) Submission: Open up a browser on your VM, go to CMS, and submit.

Rockstar (AWESOME) Submission: Open up a terminal on your host machine. (host = the machine that is running the VM, guest = the Virtual Machine). Perform a secure copy of the two files from your VM to your host machine.

Backup. If you're not familiar with the cp command it performs a copy:

~$ cp your_file copy_of_your_file
This creates a new file called copy_of_your_file which is a copy of your_file. You can recursively copy ( = copy a directory) using the -r flag:

~$ cp -r your_directory copy_of_your_directory
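
Here is a small sketch showing that cp -r really produces an independent copy of a directory and everything beneath it. The drafts and backup names are made up for the example, and mkdir -p (which creates nested directories in one go) is an addition not covered above.

```shell
# Work in a throwaway directory so your real files are untouched.
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p drafts/old              # -p creates drafts and drafts/old together
echo "draft" > drafts/old/poem

cp -r drafts backup              # copy the directory and everything inside it
cat backup/old/poem              # prints: draft

echo "edited" > drafts/old/poem  # change the original...
cat backup/old/poem              # ...the copy still prints: draft
```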
Secure copy, scp, performs the same functionality, but the files are allowed to be located remotely. When you type vagrant up one of the many things you are told is that the default SSH address is 127.0.0.1:2222. This gives you the address of the guest (127.0.0.1) and the port you should use (2222). From your host machine, you can now type:

~$ scp -P 2222 vagrant@127.0.0.1:/home/vagrant/3410/lab_unix/script.sh .
Just as you have a username and password associated with your local machine, there is a username and password associated with your virtual machine. The default username is "vagrant", which is why we prepend the address with "vagrant@". You will be asked for a password when you hit RETURN. The default password for the vagrant username is "vagrant". Now script.sh will be copied from your VM to your local machine (the host). Not sure of the exact path to script.sh? Type pwd in the directory where script.sh lives and you will be given the full path. Repeat for submission2.txt. Now if you like you can open up a browser and submit your script like a rockstar. For the record, you can also send files back to the VM:

~$ scp -P 2222 file_to_send vagrant@127.0.0.1:/home/vagrant/
And remember, you can perform a recursive copy if you want to move an entire directory to or from your VM. It is also possible to perform the scp from within the VM, but it can be a bit more involved. Feel free to Google it or ask a TA if they can show you.

Postscript: For the adventurous

Take a look at this script, which prompts for a directory name if there is no poems directory, and then uses that name in variable form throughout the script.

#! /bin/bash

# fancier script than the one before

if [ ! -d "poems" ];
then
    echo "What directory should I be using? "
    read -r dirname        # -r keeps any backslashes in the typed name as-is
    ls "$dirname"/* # quoted so names with spaces still work; etc. etc.
fi
Indeed, a script can be far more powerful than simply a series of command lines! If you ever find yourself performing the same task again and again, replace yourself with a small shell script, seriously.
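
As one more taste of what variables buy you, here is a sketch of a loop that visits every file in a directory whose name is held in a variable. The verses directory and its files are made up, and the variable is set directly rather than read from the keyboard so that the example runs without prompting.

```shell
# Work in a throwaway directory so your real files are untouched.
scratch=$(mktemp -d)
cd "$scratch"
mkdir verses
echo "a" > verses/one
echo "b" > verses/two

dirname="verses"                  # in the script above, this value would come from read
count=0
for f in "$dirname"/*; do         # the pattern expands to every file in the directory
    echo "found: $f"
    count=$((count + 1))          # shell arithmetic: add one to the counter
done
echo "$count files in $dirname"   # prints: 2 files in verses
```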

More Info

For more info on UNIX, see man, and the UNIX appendix to All of Programming by Andrew Hilton and Anne Bracy.