\documentclass[11pt,onecolumn]{article}
\usepackage{amssymb, amsmath, amsthm,graphicx, paralist,algpseudocode,algorithm,cancel,url,color}
\usepackage{sectsty}
\usepackage{fancyvrb}
\usepackage{mathrsfs}
\usepackage{multirow}
\usepackage{hhline}
\usepackage{booktabs}
\usepackage[table]{xcolor}
\usepackage{tikz}
% \usepackage[framed,numbered,autolinebreaks,useliterate]{mcode}
\usepackage{listings}
\usepackage{enumitem}
\newcommand{\bvec}[1]{\mathbf{#1}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Rn}{\R^{n\times n}}
\newcommand{\Rmn}{\R^{m\times n}}
\newcommand{\Cn}{\C^{n\times n}}
\newcommand{\Cmn}{\C^{m\times n}}
\newcommand{\cO}{\mathcal{O}}
\DeclareMathOperator{\Tr}{Tr}
\DeclareMathOperator{\trace}{trace}
\DeclareMathOperator{\diag}{diag}
\sectionfont{\Large\sc}
\subsectionfont{\sc}
\usepackage[margin=1in]{geometry}
\begin{document}
\noindent
\textsc{\Large CS 6210: Homework 1}\\
Instructor: Anil Damle\\
Due: February 8, 2022\\
\subsection*{Policies}
You may discuss the homework problems freely with other students, but please
refrain from looking at their code or writeups (or sharing your own).
Ultimately, you must implement your own code and write up your own solution to
be turned in. Your solution, including plots and requested output from your code,
should be typeset and submitted via Gradescope as a PDF file. Additionally, please
submit any code written for the assignment. This can be done
by either including it in your solution as an appendix, or uploading it as a zip
file to the separate Gradescope assignment.\\
\hrule
\vspace{1cm}
\noindent
\subsection*{Question 1:}
This question is intended to serve as a bit of a theoretical warm-up for the course; matrix factorizations and norms will play a key role in much of our analysis and understanding of the algorithms we discuss.
\begin{enumerate}[label=(\alph*)]
\item For any $A\in\Rn$, prove that there exist an orthogonal matrix $U\in\Rn$ and a symmetric positive semi-definite matrix $H\in\Rn$ such that $$A=UH.$$ You may use the existence of the SVD in your proof.
\item Prove that for any induced matrix norm $\|\cdot\|$ and any $A\in\Rn$, $$\rho(A)\leq \|A\|,$$ where $\rho(A)$ is the spectral radius of $A$ (i.e., the maximum of the magnitudes of the eigenvalues of $A$).
\item Given a matrix $A\in\Rmn$ of rank $k\leq \min\{m,n\}$, prove that $$\|A\|_F\leq \sqrt{k}\|A\|_2.$$
\item \textbf{(An ungraded, slightly more challenging question)} Since real matrices can have complex eigenvalues, if we want to discuss the Schur form of a non-symmetric real matrix we necessarily have to consider complex numbers. However, it is useful to consider what we can accomplish with only real numbers.
For the purposes of this problem you may only use the fact that for any matrix $A\in\Rn$ there exists at least one scalar $\lambda\in\C$ and an associated vector $v\in\C^{n}$ such that $Av=\lambda v$; you may not assume the existence of any other matrix factorizations. Prove that for any matrix $A\in\Rn$ there exists an orthogonal matrix $Q\in\Rn$ such that
\[
Q^TAQ = \begin{bmatrix}R_{11} & R_{12} & \cdots & R_{1m}\\ 0 & R_{22} & \cdots & R_{2m}\\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & R_{mm}\end{bmatrix}
\]
where each $R_{ij}$ is real and each diagonal block $R_{ii}$ is either $1\times 1$ or $2\times 2$ with a pair of complex conjugate eigenvalues. Such a decomposition is known as the real Schur decomposition.
\end{enumerate}
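Before attempting the proofs, parts (a) and (b) can be probed numerically; the sketch below (assuming Python/NumPy, though the course lets you pick any language) assembles the factors in (a) directly from an SVD and spot-checks the inequality in (b) for two induced norms. This is a sanity check and an illustration, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))

# Part (a): build candidate factors A = UH from the SVD A = W * diag(s) * V^T.
W, s, Vt = np.linalg.svd(A)
U = W @ Vt                   # orthogonal factor
H = Vt.T @ np.diag(s) @ Vt   # symmetric positive semi-definite factor

assert np.allclose(A, U @ H)               # A = UH
assert np.allclose(U.T @ U, np.eye(n))     # U is orthogonal
assert np.allclose(H, H.T)                 # H is symmetric
assert np.linalg.eigvalsh(H).min() >= -1e-12  # H is PSD (up to roundoff)

# Part (b): spectral radius is bounded by any induced norm.
rho = np.abs(np.linalg.eigvals(A)).max()
assert rho <= np.linalg.norm(A, 2) + 1e-12  # induced 2-norm
assert rho <= np.linalg.norm(A, 1) + 1e-12  # induced 1-norm (max column sum)
```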
\subsection*{Question 2:}
This question is meant to be a bit of a warm-up on something we will do repeatedly in this course: implementing, validating, and testing the algorithms we discuss. Since we have not really discussed any algorithms yet, we will start by simply exploring the variety of performance we can observe even with something as simple as matrix-matrix multiplication. Consider/implement the following four algorithms for computing $C = AB$:
\begin{enumerate}[label=(\alph*)]
\item $C(i,j) = \sum_k A(i,k)B(k,j)$; for this part you may only use built-in scalar multiplication.
\item $C(i,j) = A(i,:)B(:,j)$; you may now leverage your chosen language's routines for computing inner products.
\item $C = \sum_k A(:,k)B(k,:)$; you may now leverage your chosen language's routines for computing outer products and adding matrices.
\item $C(:,i) = AB(:,i)$; you may now leverage your chosen language's routines for computing matrix-vector products.
\item As a point of comparison we will also use the ``built in'' routine for computing matrix-matrix multiplication (e.g., simply writing \texttt{C = A*B} in Matlab). This is our way of accessing the routine for matrix-matrix multiplication from BLAS (\url{http://www.netlib.org/blas/}).
\end{enumerate}
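If you happen to work in Python, the four variants and the built-in baseline might be laid out as follows; this is one possible sketch, assuming NumPy, and any language with comparable primitives works equally well.

```python
import numpy as np

def matmul_scalar(A, B):
    # (a) triple loop: scalar multiply-adds only.
    m, p, n = A.shape[0], A.shape[1], B.shape[1]
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for k in range(p):
                C[i, j] += A[i, k] * B[k, j]
    return C

def matmul_inner(A, B):
    # (b) each entry as an inner product of a row of A and a column of B.
    m, n = A.shape[0], B.shape[1]
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            C[i, j] = A[i, :] @ B[:, j]
    return C

def matmul_outer(A, B):
    # (c) sum of rank-one outer products over the inner dimension.
    C = np.zeros((A.shape[0], B.shape[1]))
    for k in range(A.shape[1]):
        C += np.outer(A[:, k], B[k, :])
    return C

def matmul_matvec(A, B):
    # (d) one matrix-vector product per column of B.
    C = np.zeros((A.shape[0], B.shape[1]))
    for j in range(B.shape[1]):
        C[:, j] = A @ B[:, j]
    return C

# Validate all four against the built-in (BLAS-backed) product, item (e).
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 30))
B = rng.standard_normal((30, 30))
ref = A @ B
for f in (matmul_scalar, matmul_inner, matmul_outer, matmul_matvec):
    assert np.allclose(f(A, B), ref)
```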
For all of the above algorithms, clearly illustrate that your implementation is $\cO\left(n^3\right)$ (as part of this question, argue why your choice of plot clearly and unambiguously illustrates the correct scaling; think about the axes). Compare and contrast their performance, and argue about why you believe you might be seeing such differences.
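One way to make the scaling unambiguous, assuming you record wall-clock times $t(n)$ for a range of sizes: on log-log axes a power law $t(n) = c\,n^p$ appears as a straight line of slope $p$, so a least-squares fit of $\log t$ against $\log n$ recovers the exponent directly. A minimal sketch of the idea (the cubic "timings" here are synthetic, purely to illustrate the fit):

```python
import numpy as np

def fit_slope(ns, ts):
    # Fit log(t) = p*log(n) + log(c); the leading coefficient is the exponent p.
    return np.polyfit(np.log(ns), np.log(ts), 1)[0]

# Synthetic cubic timings t(n) = c * n^3; the fitted slope recovers 3 exactly.
ns = np.array([100, 200, 400, 800])
ts = 2.5e-9 * ns.astype(float) ** 3
assert abs(fit_slope(ns, ts) - 3.0) < 1e-10
```

With real measurements the fitted slope will only be approximately 3, but plotting the measured points on log-log axes next to a reference line of slope 3 makes the claim easy to assess visually.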
\subsection*{Question 3:}
Given two matrices $A,B\in\Rn$, let $\text{fl}(AB)$ be the result of computing $AB$ in floating point arithmetic using an algorithm that computes the entries of $\text{fl}(AB)$ either as inner products or as the sum of entries of appropriate outer products. Writing $$\text{fl}(AB) = AB+E,$$ determine an element-wise bound on the entries of $E$ in terms of the machine precision $\mu$, $n$, and the entries of $A$ and $B.$ For this question you may only use the floating point properties of the four basic arithmetic operations. For simplicity you may assume that the entries of $A$ and $B$ are exactly representable in floating point.
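It can help to see numerically what an element-wise bound of this flavor looks like before deriving it. Bounds of this type are typically proportional to $n\mu$ times the corresponding entry of $|A|\,|B|$ (to first order in $\mu$); the sketch below, assuming NumPy, computes the product in single precision against a double precision reference and checks consistency with such a bound. This is a numerical check, not the derivation the question asks for.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
# float32 inputs: the entries of A and B are then exactly representable
# in the working (single) precision, matching the problem's assumption.
A = rng.standard_normal((n, n)).astype(np.float32)
B = rng.standard_normal((n, n)).astype(np.float32)

exact = A.astype(np.float64) @ B.astype(np.float64)  # double precision reference
E = (A @ B).astype(np.float64) - exact               # single precision rounding error

u = 2.0 ** -24  # unit roundoff for IEEE single precision
# Candidate element-wise bound: n * u * (|A| |B|)_{ij}, with slack for O(u^2) terms.
bound = n * u * (np.abs(A).astype(np.float64) @ np.abs(B).astype(np.float64))
assert (np.abs(E) <= 1.1 * bound).all()
```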
\subsection*{Question 4:}
For each of the following problems, state whether you think it is plausible/reasonable to try to construct stable or backward stable algorithms. Think carefully about what the input/output space of such an algorithm is and, based on your characterization of the problem, justify your choice.
\begin{enumerate}[label=(\alph*)]
\item Computing the inner product $x^Ty$ given vectors $x,y\in\R^n.$
\item Computing the outer product $xy^T$ given vectors $x,y\in\R^n.$
\end{enumerate}
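A numerical hint for part (a), assuming NumPy: because the output is a single scalar, the computed value can be written as the exact inner product of slightly perturbed inputs (below, the entire perturbation is folded into a rescaling of $x$; positive entries are used so no cancellation muddies the picture). For part (b), note by contrast that the output has $n^2$ entries while the inputs have only $2n$ degrees of freedom. This sketch is meant only to guide your reasoning, not to answer the question.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
# Positive entries avoid cancellation in the sum.
x = np.abs(rng.standard_normal(n)).astype(np.float32)
y = np.abs(rng.standard_normal(n)).astype(np.float32)

s_exact = float(x.astype(np.float64) @ y.astype(np.float64))
s_fl = float(x @ y)  # single precision inner product

# s_fl is the *exact* inner product of (x * (s_fl / s_exact), y):
# a backward perturbation of x whose relative size is tiny.
rel_backward_err = abs(s_fl / s_exact - 1.0)
u = 2.0 ** -24  # single precision unit roundoff
assert rel_backward_err <= n * u
```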
\subsection*{Question 5:}
For any square non-singular matrix $A$, prove that $1/\kappa_2(A)$ is the relative distance from $A$ to the nearest singular matrix in the two-norm. In other words, prove that
\[
\frac{1}{\kappa_2(A)} = \min_{\text{rank}(X) < n} \frac{\|A-X\|_2}{\|A\|_2}.
\]
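A numerical companion to this statement, assuming NumPy: zeroing out the smallest singular value of $A$ yields a singular matrix whose relative two-norm distance from $A$ is exactly $1/\kappa_2(A)$. This only exhibits a singular matrix attaining the claimed distance; your proof must also show that no closer singular matrix exists.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
A = rng.standard_normal((n, n))

U, s, Vt = np.linalg.svd(A)
# Candidate nearest singular matrix: drop the smallest singular value.
X = U[:, :-1] @ np.diag(s[:-1]) @ Vt[:-1, :]

kappa2 = s[0] / s[-1]  # two-norm condition number sigma_max / sigma_min
assert np.linalg.matrix_rank(X) < n  # X is singular
# Relative distance ||A - X||_2 / ||A||_2 equals 1 / kappa_2(A).
assert np.isclose(np.linalg.norm(A - X, 2) / np.linalg.norm(A, 2),
                  1.0 / kappa2)
```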