Older assignments

CS 2043 Assignment 4: The Fellowship of the Script

Due Date: Friday February 22 at 11:15 AM

Assignment Captains Ashneel Das, Chris Roman

In this assignment, we will be writing more advanced scripts. If you would like a Lord of the Rings themed version of the assignment, please go here.

Task 1

Note: You will be appending to two files, MARBLES.txt and FAILED.log, in this task.

Create a script named collect_marbles.sh in the TASK_1 directory that performs the following:

  1. The script will be called with arguments as follows:
    • The “length” of a line is defined as the number of words on the line.
    • The first argument is the minimum length of a line.
    • The second argument is the maximum length of a line.
    • All remaining arguments are the names of the files that are to be searched. There is no limit to the number of additional arguments.
  2. If your script receives fewer than 3 arguments, it should print the exact message to STDERR:

     Ye shall not search for marbles if ye cannot find the key.
    

    and produce an exit code of 64, without doing any further processing. In other words, check for this at the very beginning, before doing anything else. To redirect to STDERR, you can type >&2 echo "error". This redirects the output of echo (in this case, “error”) to file descriptor 2. In the world of UNIX, file descriptor 2 represents STDERR, whereas file descriptor 1 represents STDOUT.

  3. You may assume that arguments one and two, if supplied, are positive integers where:
    • The minimum length is greater than or equal to 1.
    • The maximum length is greater than or equal to the minimum length (never less than).
  4. You may not assume that the third and higher arguments are valid. That is, you must first verify that each of these arguments is a file you can read before trying to process it.
    • If an argument given is not a real file, append the name that was given to FAILED.log in the current directory.
  5. For every valid file path specified, loop over every line in the file:
    • Count the number of words on that line. A word is defined as anything separated by a space.
    • If the number of words is between the minimum and maximum specified, append that line to the end of MARBLES.txt in the current directory.
    • Otherwise, just move on to the next line.
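The pieces described above (writing to STDERR, checking that a file is readable, and counting words per line) can be sketched roughly as follows. This is a minimal sketch, not the reference solution: the helper names warn, count_words, in_range, and collect are hypothetical, the bounds are assumed to be inclusive, and wc -w treats any run of whitespace as a separator, which is slightly looser than "a space".

```shell
# Print a message on STDERR by redirecting echo to file descriptor 2.
warn() {
    >&2 echo "$1"
}

# Count the words on a single line. wc -w splits on any whitespace
# (an assumption; the task says "space"), and tr strips the padding
# that some wc implementations add around the number.
count_words() {
    printf '%s\n' "$1" | wc -w | tr -d '[:space:]'
}

# Succeed only when min <= count <= max (inclusive bounds assumed).
in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Append the matching lines of one file to an output file, or log the
# name when it is not a readable regular file.
# Arguments: min max source-file output-file log-file
collect() {
    min=$1 max=$2 src=$3 out=$4 log=$5
    if [ ! -f "$src" ] || [ ! -r "$src" ]; then
        echo "$src" >> "$log"
        return 0
    fi
    # "|| [ -n "$line" ]" keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if in_range "$(count_words "$line")" "$min" "$max"; then
            printf '%s\n' "$line" >> "$out"
        fi
    done < "$src"
}
```

In a full script, the argument-count check would come first, for example `[ "$#" -lt 3 ] && { warn "..."; exit 64; }`, followed by a loop that calls collect once per remaining argument with MARBLES.txt and FAILED.log as the last two parameters.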

Warning: for full credit, TASK_1/collect_marbles.sh must be executable.

Sample Inputs and Output

Refer to TASK_1/TASK_1_EXAMPLE.md.

Task 2

You are to create a script merry_math.sh in the TASK_2 directory that can analyze specific columns of a csv file. A csv file is a “Comma Separated Value” sheet often used for storing data. Ordinarily, the first row of a csv file is a “header” that describes what each column is. To simplify your work, the files in this task will have no header. There will also be no quotes or other strange characters you need to worry about. Just assume the file is plain text separated by commas.


Your script must support the following integer arithmetic operations:


Your script will receive input arguments in the following form:

To be extra clear, you should check errors in the following order to provide the correct error code priorities:

if too few arguments, then
    exit with an error code of 64
else if the provided filename is not a file, then
    exit with an error code of 66
else if the operation code provided is not supported, then
    exit with an error code of 69

So if four arguments were provided, but both the filename and operation code were invalid, then the exit code should be 66, not 69.
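One way to express this priority order in shell is a chain of checks that stops at the first failure. The sketch below is only illustrative and rests on several assumptions that do not come from the assignment text: the function name check_args is hypothetical, the file name is taken to be argument 1 and the operation code argument 2, at least four arguments are taken to be required, and add and sub are placeholders for the real operation codes.

```shell
# Return the exit code called for by the checks above, or 0 if all pass.
# ASSUMPTIONS (hypothetical, for illustration only):
#   $1 is the file name, $2 is the operation code,
#   fewer than 4 arguments counts as "too few",
#   "add" and "sub" stand in for the supported operation codes.
check_args() {
    if [ "$#" -lt 4 ]; then
        return 64
    elif [ ! -f "$1" ]; then
        return 66
    fi
    case "$2" in
        add|sub) return 0 ;;
        *)       return 69 ;;
    esac
}
```

Because the file test comes before the operation test, a call with a bad file name and a bad operation code yields 66, matching the priority described above. A script could use it as `check_args "$@" || exit $?`.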


Your script is to strip the columns requested and perform the mathematical operation requested.
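As a rough illustration of stripping a column and doing integer arithmetic on it, the sketch below uses cut to extract one comma-separated field and shell arithmetic expansion to add the values up. Summation is only an example operation chosen for this sketch, and sum_column is a hypothetical name.

```shell
# Sum the integers found in one column of a header-less csv file.
# Arguments: csv-file column-number (columns are numbered from 1).
# sum_column and the choice of summation are illustrative assumptions.
sum_column() {
    total=0
    # cut prints one value per line; the unquoted expansion splits them.
    for value in $(cut -d, -f"$2" "$1"); do
        total=$((total + value))
    done
    echo "$total"
}
```

For a file containing the rows 1,2,3 and 4,5,6, calling `sum_column file.csv 2` adds 2 and 5.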

Warning: for full credit, TASK_2/merry_math.sh must be executable.

Sample Inputs and Output

Refer to TASK_2/TASK_2_EXAMPLE.md.

Task 3

Forbidden Behavior:

Under no circumstances are you allowed to use the -i flag for sed. If you use it, you will receive 0 points. It is not portable, and you should have no reason to use it.
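A portable alternative to editing in place is to let sed read the original file and redirect its output to a different file. The sketch below assumes a hypothetical helper named convert and reuses a john/John substitution purely as a placeholder for the real editing commands.

```shell
# Write the transformed text to a new file; the input is left untouched.
# Arguments: input-file output-file
# convert and the substitution are placeholders for illustration.
convert() {
    sed 's/john/John/g' "$1" > "$2"
}
```

The output file must not be the same as the input file, since the shell truncates the redirect target before sed starts reading.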


Required Behavior:

You will likely want to use extended regular expressions for this task. The + operator (which means find one or more of the expression it modifies), for example, is part of the extended regular expressions. For instance, the regex [a-z]+ simply denotes “one or more lowercase letters”. Strings that are in this set would be things like a, asdf, fdsa, etc.

GNU sed uses the -r flag to indicate that extended regular expressions are permitted, whereas the BSD (and therefore OSX) sed uses the -E (capital, not lower case!) flag to enable these. However, as it turns out, even though the -E flag is not present in the GNU sed man page, it is supported and is exactly the same as -r.
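As a small illustration of the -E flag with the + operator, the sketch below replaces the first run of lowercase letters on each line. The function name mark_word and the replacement text WORD are made up for this example.

```shell
# Replace the first run of one or more lowercase letters with WORD.
# -E enables extended regular expressions, so + needs no backslash.
mark_word() {
    sed -E 's/[a-z]+/WORD/'
}
```

For example, `printf 'asdf 123\n' | mark_word` prints `WORD 123`.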


Task 3.1: Generate the Paragraphs (<p> tags) for HTML

You should assume that any line that has text but does not start with a # sign will turn into a paragraph. In Markdown, you can create headers (title sections) by starting the line with a # and writing the title afterward. For example, the following Markdown contains three valid headers, each followed by a regular paragraph:

# This is an H1 header

This is a regular paragraph.

##          This is an H2 header

This is another regular paragraph.

###    This is an H3 header.

This is yet another regular paragraph.

Note that there is an arbitrary amount of whitespace (that is not a newline) between the # signs and the header text. We will account for this in Task 3.2. A friendly reminder of some things that might be useful:

In class, we discussed how to make sed perform a replacement only on lines that start with a certain regex. This command only performs the substitution on lines beginning with “The”:

sed '/^The/s/john/John/g' file

What we haven’t discussed is how to check lines that don’t begin with “The”, which is done as follows:

sed '/^The/!s/john/John/g' file
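To see the difference between the two forms side by side, the sketch below wraps each command in a function that reads from standard input. The function names are hypothetical.

```shell
# Capitalize "john" only on lines that begin with "The".
fix_the_lines() {
    sed '/^The/s/john/John/g'
}

# Capitalize "john" only on lines that do NOT begin with "The".
# The ! after the address inverts the match.
fix_other_lines() {
    sed '/^The/!s/john/John/g'
}
```

Given the two input lines `The john` and `A john`, fix_the_lines changes only the first line and fix_other_lines changes only the second.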

For this task, you need to surround the regular paragraphs with the HTML paragraph tag. In HTML, you start a tag with <tag>, and end it with </tag> (where the ending has a /). The paragraph tag is p. So for example, you would take the line

   This is a regular paragraph.

and turn it into

   <p>This is a regular paragraph.</p>

For simplicity, you can assume that there will not be any preceding whitespace before the text you are to turn into a paragraph. So you will not have to account for something like

       This is a weird paragraph with preceding whitespace.
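One possible shape for this step combines the negated address from the reminder above with a capture group. This is a sketch under the stated simplifications, not the required solution, and wrap_paragraphs is a hypothetical name.

```shell
# Wrap every non-empty line that does not start with # in <p>...</p>.
# (.+) requires at least one character, so empty lines are left alone.
# Lines holding only whitespace would still be wrapped; that case is
# assumed not to occur here.
wrap_paragraphs() {
    sed -E '/^#/!s/^(.+)$/<p>\1<\/p>/'
}
```

Header lines pass through unchanged, so a later step can still recognize them by their leading # characters.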

Go ahead and get started!

Task 3.2: Generate the Headers for HTML

At this point we now have all of our paragraphs accounted for, and can start turning the various levels of # into the appropriate header. You will need to account for only the cases in which you have 1, 2, or 3 #s, corresponding to <h1>, <h2>, and <h3> respectively.

Working with the example we started above, our text now looks something like this:

# This is an H1 header

<p>This is a regular paragraph.</p>

##          This is an H2 header

<p>This is another regular paragraph.</p>

###    This is an H3 header.

<p>This is yet another regular paragraph.</p>

You need to produce

<h1>This is an H1 header</h1>

<p>This is a regular paragraph.</p>

<h2>This is an H2 header</h2>

<p>This is another regular paragraph.</p>

<h3>This is an H3 header.</h3>

<p>This is yet another regular paragraph.</p>

Warning:

  1. Note that I have removed the whitespace between the # and the text. You must do the same.

  2. The whitespace in MARBLE_SHOPPE.md has a mixture of spaces and tabs in some cases. You must catch these.
    • Use the POSIX sets.
  3. Note: Lines with no whitespace between the “#” and text should still be converted to headers.

    ##Header 2 with no whitespace must be parsed correctly
    

should be

   <h2>Header 2 with no whitespace must be parsed correctly</h2>
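A sketch of one way to handle the three header levels is shown below, assuming a hypothetical function named make_headers. It uses the POSIX class [[:space:]] so that spaces and tabs are both consumed, and `*` rather than `+` so that headers with no whitespace after the # still match.

```shell
# Convert lines starting with 1, 2, or 3 # characters into h1/h2/h3.
# The longest prefix is tried first; once a line has been rewritten it
# starts with "<", so the shorter patterns no longer match it.
make_headers() {
    sed -E \
        -e 's/^###[[:space:]]*(.*)$/<h3>\1<\/h3>/' \
        -e 's/^##[[:space:]]*(.*)$/<h2>\1<\/h2>/' \
        -e 's/^#[[:space:]]*(.*)$/<h1>\1<\/h1>/'
}
```

Lines with four or more # characters are outside the scope of this sketch, since only three levels need to be handled.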

Using the Script

The gandalfify.sh script expects exactly one argument: the name of the Markdown document file to be parsed / turned into an HTML document. Simply specify the name of the file you want to parse. For example, assuming you were in the a4/TASK_3 directory:

$ ./gandalfify.sh MARBLE_SHOPPE.md
Converted MARBLE_SHOPPE.md into index.html!

From there, if you would like to “visually debug”, you may use SCP to transfer the HTML file to your local computer and open index.html in your browser.

Sample Input and Output

Refer to TASK_3_EXAMPLE.md.

Challenge Questions (optional - we will not grade this, but you might find it fun!)

  1. Produce proper indentation automatically using sed.

  2. There is an un-ordered list in MARBLE_SHOPPE.md, but each element will get parsed into a separate paragraph. Turn them into HTML unordered lists.

    Unordered lists in Markdown come in a variety of forms:

    + one format uses plus signs
    + to indicate new bullets
    + in the list
    
    - one format uses hyphens
    - to indicate new bullets
    - in the list
    
    * one format uses asterisks
    * to indicate new bullets
    * in the list
    

    They are all equivalent, and you can assume that I will not mix them (+ and - for example, will not be used in the same list).

    The HTML unordered list is as follows:

    <ul>
        <li>This is an item.</li>
        <li>This is another item.</li>
        <li>This is the last item.</li>
    </ul>
    

    So you will not only need to surround each bullet from Markdown in an li tag, but you will have to be able to find the beginning and end of the list and put the ul tag. You should assume that there will not be any newlines between bullet points. For example:

    - This list is acceptable
    - Because it is
    - Formatted correctly
    - Yay!
    

    whereas

    - This list will end up as two separate lists
    - Because there is an empty line between them
    
    - And your regex will see this as the end of the list
    - ...most likely.
    

CS 2043 Assignment 3: vonne-git

Due Date: Wednesday February 13 11:15 am

Assignment Captain: Irene Yoon

In this assignment, we will be learning the fundamentals of Git. If you would like a Kurt Vonnegut-themed version of the assignment, please go here.

The Task sections describe the parts of the assignment that must be completed for submission. The Git Parable sections will help you gain intuition about how git works. No part of the Git Parable sections requires submission.

If you would like to reset (get a fresh copy of) any part of the assignment, please run ./reset.sh <part_name>. For instance, if you would like to reset part1_dethemed/, run ./reset.sh part1_dethemed/, and you’ll get a fresh copy of part1_dethemed under part1_dethemed_fresh/. You MUST delete your old copy of partX/ and rename partX_fresh/ to partX/ for submission.

Part 1: It’s Tralfamadorian Time! ⏰

Task: Time is just Snapshots 📸

Find part1_dethemed/ under vonne-git/. You’ll find the following file structure: (Don’t worry about part1/. This is the themed version of part1, and you will get equal credit for submitting either version.)

.git/
billy.txt

If you open billy.txt, you will see that its contents say I am dead!

The history of the contents of billy.txt is tracked with a local git repository (.git/). The history consists of discrete snapshots, which have been tracked by commits.

Here’s your task: find the snapshot where billy.txt contains the text I am alive!, using git. You can accomplish this with the right set of git commands.

(You can go back to any snapshot where you see the text I am alive!.)

Hint: a couple of useful git commands that may help you. Feel free to use man generously.

git log
git checkout
git reset
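As an illustration of how these commands can work together, the sketch below walks the commits that touched a file and prints the first one whose version of the file contains a given text. The helper name find_snapshot is hypothetical, and the sketch assumes it is run from the top level of the repository.

```shell
# Print the hash of the newest commit in which FILE contains TEXT.
# Arguments: file text
# find_snapshot is a hypothetical helper, not part of the assignment.
find_snapshot() {
    # git log lists commits newest first; %H prints the full hash.
    for commit in $(git log --format=%H -- "$1"); do
        # "commit:path" names the file as it was in that commit.
        if git show "$commit:$1" | grep -qF "$2"; then
            echo "$commit"
            return 0
        fi
    done
    return 1
}
```

The printed hash can then be handed to git checkout or git reset, for example `git reset --hard "$(find_snapshot billy.txt 'I am alive!')"`. Note that reset --hard discards uncommitted changes, so check the man page before using it on work you care about.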

WARNING: you will receive no credit for plain file editing. You must use git!

Note:

For this part of the assignment, you do not need to make any extra snapshots.

If you run git status under part1/, you need to see this output on your terminal once you’ve completed this part.

On branch master
nothing to commit, working tree clean

Once you can see this and find that the most recent commit is where “Billy is alive”, you have successfully completed this part of the assignment.

Git Parable: Snapshots

Version control systems (VCS) such as git take snapshots of your work over time. Think of them as save points in a video game. You can take snapshots of your codebase at any time and resurrect that code on demand.

To make it easy to remember what changes you made in each snapshot (usually called a commit), git requires you to tag your snapshot with a commit message.

Once you’re done with Part 1, hand it in by running handin vonne-git-1 part1_dethemed/.

Part 2: A Tralfamadorian Stage 🎤

Git Parable: Staging

Snapshots can be represented as commits. When you’re working on a project, you want to have explicit control over what is captured in each snapshot.

A staging directory is a set of changes (i.e. modified files) that you wish to capture in the next snapshot. In order to add files to your current staging directory, you can use git add (refer to its man page for more details).

Task: Staging

This exercise will have you stage and commit in a particular order, so you can have a better sense of control over your git workflow.

Go to part2/ to explore our upcoming stage.

You should see the following files:

.git/
juggle.txt
jumprope.txt
jazz-hands.txt

You hear that this is the right order for staging your snapshots: jazz hands, jumprope, then juggle.

To survive this stage, provide snapshots (commits) that contain each of jazz hands, jumprope, and juggle in the correct order.

When you’re done, you’ll have a commit chain consisting of commit nodes that point to previous commits. Your commit chain should look “linear”, and each commit should add a single file to the snapshot:

   juggle

     |
     v

  jumprope

     |
     v

  jazz hands

Time goes up; juggle is the latest.

Do not provide a single snapshot with all three performances staged at once.

(For instance, running

git add .
git commit -m "I've committed everything"

will be incorrect, as it provides exactly a single snapshot with all three performances staged at the same time.)

Hint: here are some useful commands.

git status
git diff
git add

For credit, each commit message must contain information about what is being committed (i.e. if you’re juggling, your commit message should contain the word “juggle”; jumproping - “jump”; jazz handing - “jazz”).
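The general pattern of staging and committing one file at a time might look like the sketch below. The helper name commit_one_by_one is hypothetical, and the commit message format is just one option that happens to include the file's base name.

```shell
# Stage and commit each named file separately, in the order given.
# ${file%.*} strips the extension, so juggle.txt yields "Add juggle".
commit_one_by_one() {
    for file in "$@"; do
        git add "$file"
        git commit -q -m "Add ${file%.*}"
    done
}
```

Calling it with the files in the required order, oldest first, produces a linear chain with one commit per file. Running git status between steps is a good way to confirm what is staged.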

Once you’re done with Part 2, hand it in by running handin vonne-git-2 part2/.

Git Parable: Branching

Recall our “linear” commit chain. Time goes up; juggle is the latest.

   juggle

     |
     v

  jumprope

     |
     v

 jazz hands

Imagine this is going to be the stable release of your software (which does jazz hands, jumps rope, then juggles). You release the software at the point of juggle.

A month passes by and you decide to add more snapshots. Time goes up; unicycling is the latest.

   unicycling

     |
     V

   juggle (release)

     |
     V

  jumprope

     |
     V
  jazz hands

Unfortunately, bug reports start coming in and people start having problems with your juggling. You decide to fix the bug in a snapshot called fixed-juggle and re-release your code. However, you don’t want to release your unicycling snapshot.

Here, we find a couple of problems that cannot be solved by our linear workflow. First, we don’t know where our fixed-juggle snapshot will go. Second, we need a way for both unicycling and fixed-juggle to have a reference to the original juggle snapshot, so that the snapshot history is not lost.

As a solution, git introduces a tree structure to its workflow, where we can have branches to keep track of any non-linear workflow. Time goes up; both unicycling and fixed-juggle are the latest.

  unicycling    fixed-juggle (release)
     |           /
     V        /
   juggle

     |
     V

  jumprope

     |
     V

 jazz hands
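In git terms, the tree above can be produced by starting a new branch at the older juggle commit and committing the fix there. The sketch below assumes a hypothetical helper named branch_from; the branch name and commit are supplied by the caller.

```shell
# Create a new branch that starts at an existing commit and switch to it.
# Arguments: new-branch-name starting-commit
branch_from() {
    git checkout -q -b "$1" "$2"
}
```

After something like `branch_from fixed-juggle <hash-of-juggle>`, the next commit has juggle as its parent, while unicycling stays untouched on the original branch.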


Part 3: Tralfamadorian Trivia Night 🌙

Feel free to use Office Hours for any questions about this section. Help will also be provided through Piazza, but it is easier to build intuition for git when it is explained in person!

Git Parable: Collaboration

One of the benefits of using git is the ability to collaborate. So far, we’ve been using your local git repository to keep track of your workflow.

By having a remote shared repository, you can share your snapshots with others.

In this next exercise, you will have a chance to create a GitHub repository and solve some Trivia with a partner.

Task 1: Another Earthling

You must find a partner (enrolled in CS 2043) for the Trivia. You may use Piazza’s partner finding functionality.

Task 2: Solve Your Trivia

Solve the questions under the part3 directory on your own. Correctness of your solutions does not matter for the Trivia (but is good for the soul).

Task 3: Merge time!

Create a new GitHub repository through the CornellCIS GitHub.

Place the created git repository under your part3 directory (git clone <git-repository-url> may help).

In order to complete the Trivia, you must succeed in the following things:

Note: In order to successfully follow these steps, exactly one person should initialize the GitHub repository and push a commit to it, then the next person should clone the repository and push a commit.

WARNING: do not use git rebase for the purposes of this assignment.

Git Parable: Collaboration and Merging

At the end of this assignment, you should have a complete snapshot (your partner’s and yours) of your Trivia (check git log).

Through git, you can merge two separate branches of development into a single snapshot. Once you complete a merge, and all partners fetch all the snapshots from the master branch (git pull), both of your histories (git log) should match exactly.
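A local version of this idea, without any remote involved, is sketched below: the current branch absorbs the commits of another branch, and git records a merge commit when the histories have diverged. The helper name merge_branch is hypothetical.

```shell
# Merge the named branch into the branch that is currently checked out.
# --no-edit accepts the default merge message instead of opening an editor.
merge_branch() {
    git merge -q --no-edit "$1"
}
```

With a shared remote, git pull performs a fetch followed by this same kind of merge. If both sides changed the same lines, git stops and asks for the conflict to be resolved by hand before the merge can be committed.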

Once you’re done with Part 3, follow these instructions for submission

  1. git clone your repo onto wash, into your ~/Desktop. This must be a fresh clone.
  2. Run handin vonne-git-3 <git-repo-directory>. For instance, if cloning your GitHub repository created a directory Trivia/, run handin vonne-git-3 Trivia/.

Congratulations! You have successfully finished this assignment.