CS 113

HOMEWORK 4

Due: Friday, September 22, 2000  (this is the official date, but unofficially it is going to be due one week later on Friday, September 29, 2000)

WARNING: The input and output for all the programs should be of the correct format and you are supposed to use the structures that I give for each program (e.g. you must make a tree structure for the 1st problem).


1. Binary Search Tree

A binary search tree is used to search for data strings of lowercase alphabetic characters (a-z). Each tree node, v, holds a data string, v(s). All strings stored in the left subtree of v are lexicographically smaller than v(s), whereas all strings in the right subtree of v are lexicographically larger than v(s). Also, no string is present in the tree more than once. The binary search tree supports insertions and deletions.

Assume that we have a set of strings s1, s2, ..., sn (n<=100), where each string comprises of 10 characters at most. The problem is to build a binary search tree incrementally, inserting one string at a time. Evidently, the tree is not balanced in the general case.

Input
Your program should read the input from the standard input, which has the following format.

Output
Your program should produce the output in the standard output, which should have the following format.
Input Output
7
door
ace
ace
external
extension
cat
flight
2 8
ROOT
L
EXISTS
R
RL
LR
RR

Sample Execution

7                \  input
door              |
ace               |
ace               |
external          |
extension         |
cat               |
flight           /
2 8              \  output
ROOT              |
L                 |
EXISTS            |
R                 |
RL                |
LR                |
RR               /

Sample program (you may wish to use this and fill in the gaps):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* node of the tree - it includes pointers to left/right children and a small string */
  typedef struct tree {
  char info[15];
  struct tree *left, *right;
} tree;

/* inserts the string pointed to by indat in the tree
   and updates the pointer *nodepp to point to the node in the tree where it was inserted
   returns -1 if the string exists and the depth of the node in the tree otherwise
   this function works recursively */
int insert(tree **nodepp, char *indat)
{
  /* put your code here */
}

/* prints the path of the node (from the root) where the string pointed by indat exists
the pointer nodep points to the node we have traversed to so far
and depth is the current depth
this function works recursively */
void findwhere(char *indat, tree *nodep, int depth)
{
  /* put your code here */
}

void main()
{
  tree *head;
  char data[1000][15], buf[15];
  int N, k, m, MaxP, IntP, datastat[1000];
  
  head = NULL;
  MaxP = IntP = 0;
  fgets(buf, 15, stdin);
  sscanf(buf, "%d", &N);
  for (m=0; m<N; m++)
  {
    fgets(buf, 15, stdin);
    strcpy(data[m], buf);
    k = insert(&head, buf);
    if (k>=0) {
      IntP += k;
      if (k > MaxP) MaxP = k;
      datastat[m] = 1;
    }
    else datastat[m] = 0;
  } 
  
  printf("%d %d\n", MaxP, IntP);
  for(m=0; m<N; m++)
  if (datastat[m]) findwhere(data[m], head, 0);
  else printf("EXISTS\n");
}

Note that the arrays: data and datastat contain the words and whether the word was already existed in the tree at the time of insertion (0=existed & 1=new).


2. Maze

This exercise is going to test your ability to write a recursive process as well as to open and close text files for input and output.

The maze is given in the form of a text file, where obstacles are represented by '*' or '#'. Some coordinates in the maze are also numbered with letters (a-z, A-Z); the goal is to navigate the maze for each pair of the same letter (one capital the other lower) from the capital letter to the lower case letter and to output the minimum number of moves that are required for a robot to move from the one to the other. The robot is capable of making either only horizontal and vertical moves (4-connectivity of the maze), or also diagonal moves (8-connectivity of the maze). The program tests for the existence of all pairs of English letters (A-a through Z-z). If only one letter exists (not a pair) then this letter is ignored. For each pair found the algorithm is run to determine the minimum path from one to the other taking into account the movements allowed (connectivity of maze) and outputs the result or "No path from A to a found" (if the desired route is from A to a).

Input: (file=maze)

A                  
        c          
                   
        * *   * *  
      * Z   *   z *
        * *   * *  
                   
        w     W    
                   
                  a

Output 1: 4-connectivity (file=output4)

Execute with parameters  "maze output4 4"

Output
a:   18
w:    3
No path from Z to z found

Output 2: 8-connectivity (file=output8)

Execute with parameters  "maze output8 8"

Output
a:   10
w:    3
z:    4

Sample Execution (it should give you a chance to see the format of the text files for the input and output)

vetsikas@snoball: [9] ~/113/4/q2 > q42sol maze output4 4
vetsikas@snoball: [10] ~/113/4/q2 > q42sol maze output8 8
vetsikas@snoball: [11] ~/113/4/q2 > more maze
10 10
A
    c

    ** **
   *Z * z*
    ** **

    w  W

          a
vetsikas@snoball: [12] ~/113/4/q2 > more output4
a:   18
w:    3
No path from Z to z found

vetsikas@snoball: [13] ~/113/4/q2 > more output8
a:   10
w:    3
z:    4

Another execution:
vetsikas@snoball: [5] ~/113/4/q2 > more maze2
8 8
a
* * * *
*** * **
*   *  *
** *** *
     ***
**  ****
       A
vetsikas@snoball: [6] ~/113/4/q2 > q42sol maze2 out2-4 4
vetsikas@snoball: [7] ~/113/4/q2 > more out2-4
a: 16
vetsikas@snoball: [8] ~/113/4/q2 > q42sol maze2 out2-8 8
vetsikas@snoball: [9] ~/113/4/q2 > more out2-8
a: 12
vetsikas@snoball: [10] ~/113/4/q2 > more maze3
7 7
*C*   *
* * * *
* * * *
* * * *
* * * *
* * * *
*   *c*
vetsikas@snoball: [11] ~/113/4/q2 > q42sol maze3 out3-4 4
vetsikas@snoball: [12] ~/113/4/q2 > q42sol maze3 out3-8 8
vetsikas@snoball: [13] ~/113/4/q2 > more out3-4
c: 22
vetsikas@snoball: [14] ~/113/4/q2 > more out3-8
c: 18

Sample program (will want to use this and just fill in the gaps - which are very few):

#include <stdio.h>
#include <stdlib.h>
#define POS(m,x,y) ((m)[(y)*_Lx+(x)]) /* m:pointer & x,y:coordinates */

/* given the name of the input file it reads the file and
puts the maze into an array and returns a pointer to it */
char *readInputFile(char *_infile, int *_Lx, int *_Ly)
{
  FILE *fin;
  char *matrix, *temp;
  int i;

  /* put one line of code for opening the file _infile : pointer stored at fin */
  if (!fin)
    return NULL;
  fscanf(fin, "%d%d\n", _Lx, _Ly);
  temp = matrix = (char *)malloc(((*_Lx)*(*_Ly)+2)*sizeof(char));
  if (!matrix)
  {
    fclose(fin);
    return NULL;
  }
  for (i=*_Ly; i; i--, temp+=*_Lx)
  fgets(temp, *_Lx+2, fin);
  fclose(fin);
  return matrix;
}

/* recursively fills the min distance from the source */
void fillMatrix(int *_vm, int _Lx, int _Ly,
int _x, int _y, int _v, int _N)
{
  /* put your code here */
}

/* returns the shortest route from the source to the goal */
int findRoute(int *_vm, int _Lx, int _Ly,
int _Sx, int _Sy, int _Gx, int _Gy, int _N)
{
  /* put your code here */
}

/* finds the fastest route from capital(_c) to _c and returns its length
or it returns 0 if such a route does not exist or
returns -1 if no such goal exists */
int processData(char *_cm, int *_vm, int _Lx, int _Ly, int _N, char _c)
{
  int Sx, Sy, Gx, Gy, x, y, *i;
  char *c;
  
  Sx=Sy=Gx=Gy=-1;
  for (y=0, i=_vm, c=_cm; y<_Ly; y++)
    for (x=0; x<_Lx; x++, i++, c++)
    {
      *i=_Lx*_Ly;
      if ((*c=='#') || (*c=='*'))
      *i=-1;
      else if (*c==_c)
      {
        Gx=x;
        Gy=y;
      }
      else if (*c==_c-32)
      {
        Sx=x;
        Sy=y;
      }
    }
  if ((Sx==-1) || (Sy==-1) || (Gx==-1) || (Gy==-1))
    return -1;
  return findRoute(_vm, _Lx, _Ly, Sx, Sy, Gx, Gy, _N);
}

/* uses the matrix to find paths from spots A-Z to spots a-z and
outputs the results; it returns 0 if all are OK, -1 otherwise */
int processNoutput(char *_outfile, char *_cm, int _Lx, int _Ly, int _N)
{
  FILE *fout;
  char c;
  int *matrix;
  int i;

  /* put one line of code for opening the file _outfile : pointer stored at fout */
  if (!fout)
    return -1;
  matrix = (int *)malloc(_Lx*_Ly*sizeof(int));
  if (!matrix)
  {
    fclose(fout);
    return -1;
  }
  for (c='a'; c<='z'; c++)
    if ((i=processData(_cm, matrix, _Lx, _Ly, _N, c))>0)
      if (i==_Lx*_Ly)
        fprintf(fout, "No path from %c to %c found\n", c-32, c);
      else
        fprintf(fout, "%c:%5d\n", c, i);
  fclose(fout);
  free(matrix);
  free(_cm);
  return 0;
}

int main(int argc, char **argv)
{
  char *matrix;
  int Lx, Ly, N;

  if (argc!=4)
  {
    fprintf(stderr, "Syntax: %s <input file> <output file> n(n=4 or 8)\n", argv[0]);
    return -1;
  }
  if ((argv[3][0]!='4') && (argv[3][0]!='8'))
  {
    fprintf(stderr, "Syntax: %s <input file> <output file> n(n=4 or 8)\n", argv[0]);
    return -1;
  }
  N = (argv[3][0]=='8')?8:4;
  if ((matrix = readInputFile(argv[1], &Lx, &Ly))==NULL)
  {
    fprintf(stderr, "Error while opening file %s\n", argv[1]);
    return -2;
  }
  if (processNoutput(argv[2], matrix, Lx, Ly, N)<0)
  {
    fprintf(stderr, "Error while outputting to file %s\n", argv[2]);
    return -3;
  }
  return 0;
}

Some things to note in function processData:

  1. (Sx, Sy) and (Gx, Gy) are the coordinates of the start and the destination (goal) respectively.
  2. The integer matrix _vm is initialized with -1 for the coordinates which represent obstacles and with _Lx*_Ly (which is bigger than any possible distance on the matrix) for all the rest. If this value is returned by the function findRoute then no possible route from the start to the goal exists.

I will give some more details about the algorithms and the problems (especially for the second problem which has a very simple recursive solution) in class...