%%Appendix A

\section{The effect of correlated errors
on $\mathbf{\Delta\chi^2}$}

\label{sec:AppDelChi}

The global fitting function $\chi^{2}_{\rm global}$
defined in (\ref{eq:Chi2global}) resembles the standard
statistical variable $\chi^{2}$, so it is tempting
to try to apply theorems of Gaussian statistics to
analyze the significance of the fit between theory
and experiment.
However, the familiar theorems do not apply, because
of correlations between measurement errors.
The purpose of this Appendix is to explore this issue,
since the effect of correlated errors is a potential
source of confusion.

We describe the simplest case:
measurement of a single observable.
The arguments can be extended to cases where
multiple quantities are measured, such as the
determination of parton distribution functions.

Consider an observable $m$ that is measured $N$ times.
We shall refer to $N$ measurements of $m$ as one ``experiment''.
Let the true value of $m$ be $m_{0}$.
The measurements are $m_{1}, m_{2}, m_{3},\dots, m_{N}$.
The deviations from the true value are
$\alpha_{1}, \alpha_{2}, \alpha_{3},\dots, \alpha_{N}$,
where $\alpha_{i}=m_{i}-m_{0}$.
In general the measurement errors are correlated, so in the Gaussian
approximation the probability distribution of the
fluctuations is
\begin{equation}\label{eq:PD}
dP={\cal N}\exp\left\{-\frac{1}{2}\sum_{i,j=1}^{N}
\alpha_{i}C_{ij}\alpha_{j}\right\}d^{N}\alpha.
\end{equation}
Here $C_{ij}$ is a real symmetric matrix, and
${\cal N}=\sqrt{{\rm Det}\,C}/(2\pi)^{N/2}$ ensures
the normalization condition $\int dP=1$.

We will need the variance matrix
$\langle{\alpha_{i}\alpha_{j}}\rangle$, where
the notation $\langle{Q}\rangle$ means the average of
$Q$ in the probability distribution (\ref{eq:PD}).
For this Gaussian distribution,
\begin{equation}
\langle{\alpha_{i}\alpha_{j}}\rangle=\left(C^{-1}\right)_{ij}.
\end{equation}
The mean square fluctuation $E_{i}$ of the
$i^{\rm th}$ measurement $m_i$ is
\begin{equation}
E_{i} \equiv \langle\alpha_{i}^{2}\rangle
=\left(C^{-1}\right)_{ii}.
\end{equation}
To find the best estimate of the value of $m$ from these $N$ measurements,
{\em ignoring the correlations in the measurement errors}, we define
a chi-squared function $\chi_{u}^{2}(m)$ by
\begin{equation}\label{eq:defchi2}
\chi_{u}^{2}(m)=\sum_{i=1}^{N}\frac{\left(m_{i}-m\right)^{2}}{E_{i}}.
\end{equation}
The value of $m$ that minimizes $\chi_{u}^{2}(m)$, call it $\overline{m}$,
is then the best estimate of $m_{0}$ based on this information.
The function $\chi_{u}^{2}(m)$ is analogous to the fitting
function $\chi^{2}_{\rm global}$ in the CTEQ program, in the
sense that it does not include information about the
correlations between errors.
The minimum of $\chi_{u}^{2}(m)$ occurs at a weighted average of the
measurements,
\begin{equation}
\overline{m}=\frac{\sum_{i=1}^{N}m_{i}/E_{i}}{\sum_{i=1}^{N}1/E_{i}}.
\end{equation}
If all the $E_{i}$'s are equal then $\overline{m}$ is just the average
of the measurements.

Now, what are the fluctuations of the mean $\overline{m}$?
That is, if the ``experiment'' consisting of $N$ measurements
could be replicated many times, what would be the distribution
of $\overline{m}$'s obtained in those many trials?
It turns out that $\overline{m}$ has a Gaussian distribution
\begin{equation}
\frac{dP}{d\overline{m}}=\frac{1}{\sqrt{2\pi\Sigma^{2}}}
\exp\left[
-(\overline{m}-m_{0})^{2}/(2\Sigma^{2})\right].
\end{equation}
The standard deviation $\Sigma$ of $\overline{m}$ is
the RMS fluctuation; that is,
\begin{equation}
\Sigma^{2}
=\int\left(\overline{m}-m_{0}\right)^{2}\,dP
=\frac{1}{D^{2}}\sum_{i,j=1}^{N}\frac{\left(C^{-1}\right)_{ij}}
{E_{i}E_{j}}
\end{equation}
where
\begin{equation}
D=\sum_{i}\frac{1}{E_{i}}.
\end{equation}

The question we wish to answer is this:
{\em How much does $\chi_{u}^{2}(m)$ increase, when $m$ moves
away from the minimum (at $\overline{m}$) by the amount $\pm \Sigma$
that corresponds to one standard deviation of the mean?}
The answer to this question is
\begin{equation}
\Delta{\chi_{u}^{2}}=\Sigma^{2}D.
\end{equation}
This result follows easily from the definition (\ref{eq:defchi2}),
because
\begin{equation}
\chi_{u}^{2}(\overline{m}+\Sigma) - \chi_{u}^{2}(\overline{m}) =
-2\Sigma\sum_{i}\frac{m_{i}-\overline{m}}{E_{i}}
+\Sigma^{2}\sum_{i}\frac{1}{E_{i}},
\end{equation}
and the term linear in $\Sigma$ is $0$ by the definition
of $\overline{m}$.
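The identity $\Delta\chi_{u}^{2}=\Sigma^{2}D$ is purely algebraic, so it can be checked with arbitrary numbers; the data and displacement below are illustrative, not from any experiment.

```python
import numpy as np

rng = np.random.default_rng(1)
m = rng.normal(size=6)               # illustrative "measurements" m_i
E = rng.uniform(0.5, 2.0, size=6)    # illustrative variances E_i

def chi2_u(x):
    # chi-squared ignoring correlations, as in the text
    return np.sum((m - x) ** 2 / E)

D = np.sum(1.0 / E)
mbar = np.sum(m / E) / D             # weighted mean = minimum of chi2_u
Sigma = 0.7                          # any displacement works in this identity
delta = chi2_u(mbar + Sigma) - chi2_u(mbar)
print(delta, Sigma**2 * D)           # equal: the linear term vanishes at mbar
```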
So far the discussion has been quite general.
We will now examine some illustrative special cases.

{\bf Example 1:} Suppose the measurement errors
are uncorrelated; that is,
\begin{equation}
C_{ij}=\delta_{ij}/E_{i}.
\end{equation}
Then the standard deviation of the mean $\overline{m}$ is
$\Sigma=1/\sqrt{D}$.
Thus for the uncorrelated case, the increase of $\chi_{u}^{2}$
corresponding to one standard deviation of the mean
is $\Delta{\chi_{u}^{2}}=1$.  This is the ``normal'' statistical
result: The $1\sigma$ range corresponds to an increase
of $\chi^{2}$ by 1.

An even more special case is when
the errors are uncorrelated and constant:
$E_{i}=\sigma^{2}$ independent of $i$,
where $\sigma$ is the standard deviation of
single measurements.  The correlation matrix
is $C_{ij}=\delta_{ij}/\sigma^{2}$.
In this case $D$ is $N/\sigma^{2}$, and the standard
deviation of the mean is $\Sigma=\sigma/\sqrt{N}$.

The criterion $\Delta{\chi}^{2}=1$ for one standard deviation
of a measured quantity is a standard result, often used
in the analysis of precision data.
But if $\chi^{2}$ is defined ignoring the correlations between
measurement errors, then the criterion $\Delta{\chi}^{2}=1$
is only valid for uncorrelated errors.
We will next consider two examples with correlated errors,
to show that $\Delta{\chi_{u}^{2}}$ is not $1$ for such cases.

{\bf Example 2:} Suppose measurements 1 and 2 are correlated,
3 and 4 are correlated, 5 and 6 are correlated, {\it etc.}
Then the correlation matrix is
\begin{equation}
C_{ij}=\left\{
\begin{array}{ll}
1/\sigma^{2} & {\rm for~} i=j,
\\
c/\sigma^{2} & {\rm for~} (i,j)=(1,2),(2,1),(3,4),(4,3),\dots,
\\
0 & {\rm otherwise,}
\end{array} \right.
\end{equation}
where $-1 < c < 1$ since the determinant of $C$ must be positive.
The inverse matrix $C^{-1}$ can be constructed using the
fact that $C$ is block diagonal, consisting of $N/2$
$2\times 2$ blocks.
Then it can be shown that
\begin{equation}
\Sigma=\frac{\sigma}{\sqrt{N}\sqrt{1+c}}
{\rm ~~and~~}
\Delta{\chi_{u}^{2}}=1-c.
\end{equation}
The increase of $\chi_{u}^{2}$ for one standard deviation
of the mean ranges from $0$ to $2$, depending on $c$.
The criterion $\Delta{\chi}^{2}=1$ does not apply
to this example with correlated errors.
A standard increase of $\chi_{u}^{2}$ may be smaller or larger than $1$.
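These results for Example 2 are easy to verify numerically; the values of $N$, $\sigma$, and $c$ below are illustrative.

```python
import numpy as np

N, sigma, c = 8, 1.0, 0.4            # illustrative values
# Block-diagonal C: measurements (1,2), (3,4), ... pairwise correlated
block = np.array([[1.0, c], [c, 1.0]]) / sigma**2
C = np.kron(np.eye(N // 2), block)
Cinv = np.linalg.inv(C)
E = np.diag(Cinv)                    # E_i = sigma^2 / (1 - c^2) for every i
D = np.sum(1.0 / E)
Sigma2 = np.sum(Cinv / np.outer(E, E)) / D**2

print(Sigma2, sigma**2 / (N * (1 + c)))   # Sigma^2 = sigma^2 / (N(1+c))
print(Sigma2 * D, 1 - c)                  # Delta chi2_u = 1 - c
```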

{\bf Example 3:} For an even more striking example,
suppose the $N$ measurements that constitute a single
``experiment'' are, for $i=1, 2, 3,\dots, N$,
\begin{equation}
m_{i}=m_{0}+y_{i}+\beta
\end{equation}
where the $y_{i}$ are randomly distributed with standard
deviation $\sigma$,
and the measurements are systematically off by the amount $\beta$.
Suppose that $\beta$ has a Gaussian distribution
with standard deviation $s$ for replications of the
``experiment''.
In this example,
\begin{eqnarray}
C_{ij} &=& \frac{1}{\sigma^2}
\left(\delta_{ij} - \frac{s^2}{N s^2 + \sigma^2} \right),
\\
(C^{-1})_{ij} &=& \sigma^2 \, \delta_{ij} + s^2.
\end{eqnarray}
The variance of the individual measurements ($m_{i}$) is
\begin{equation}
\langle{m_{i}^{2}}\rangle-\langle{m_{i}}\rangle^{2}
=\sigma^{2} + s^{2}.
\end{equation}
Therefore our uncorrelated chi-squared variable
$\chi_{u}^{2}(m)$, defined ignoring the correlations, is
\begin{equation}
\chi_{u}^{2}(m)=\sum_{i=1}^{N}\frac{\left(m-m_{i}\right)^{2}}
{\sigma^{2} + s^{2}}.
\end{equation}
The minimum of $\chi_{u}^{2}(m)$ occurs at $\overline{m}$, which is just the
average of the individual measurements.
The variance of $\overline{m}$, averaged over many replications
of the ``experiment'', is
\begin{equation}
\Sigma^{2}=\langle{\overline{m}^{2}}\rangle -\langle{\overline{m}}\rangle^{2}
          =s^{2} + \frac{\sigma^{2}}{N}.
\end{equation}
The increase of $\chi_{u}^{2}$ as $m$ moves from $\overline{m}$
to $\overline{m}\pm\Sigma$, {\it i.e.}, by one standard deviation
of the mean, is
\begin{equation}
\Delta{\chi_{u}^{2}} \equiv
\chi_{u}^{2}(\overline{m}+\Sigma)-\chi_{u}^{2}(\overline{m})
 = \frac{\sigma^{2} + N s^{2}}{\sigma^{2} + s^{2}}.
\end{equation}
In the limit $s/\sigma \ll 1$, the error correlations in this
model become negligible and
$\Delta\chi_{u}^{2}$ reduces to the conventional value of $1$.
But in the limit $s/\sigma \gg 1$, where the error correlations
are dominant, $\Delta\chi_{u}^{2}$ approaches $N$.

Thus for Example 3---a systematic error with
100\% correlation between measurements---the
increase of $\chi_{u}^{2}$ for a standard deviation
of $\overline{m}$ is much larger than 1.
If $s$ and $\sigma$ are comparable, then $\Delta{\chi_{u}^{2}}$
is of order $N$.
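The formulas of Example 3 can likewise be checked directly; the values of $N$, $\sigma$, and $s$ below are illustrative.

```python
import numpy as np

N, sigma, s = 10, 1.0, 2.0           # illustrative values
# Variance matrix V = C^-1 = sigma^2 I + s^2 (100% correlated offset)
V = sigma**2 * np.eye(N) + s**2 * np.ones((N, N))
C = np.linalg.inv(V)
# Closed form for C quoted in the text
C_text = (np.eye(N) - (s**2 / (N * s**2 + sigma**2)) * np.ones((N, N))) / sigma**2
print(np.allclose(C, C_text))        # True

E = sigma**2 + s**2                  # variance of an individual measurement
D = N / E
Sigma2 = s**2 + sigma**2 / N         # variance of the mean over replications
print(Sigma2 * D, (sigma**2 + N * s**2) / (sigma**2 + s**2))   # Delta chi2_u
```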

If the correlation matrix $C_{ij}$ is known accurately,
then the correlation information can be incorporated
into the definition of the $\chi^{2}$ function,
in the manner of Appendix \ref{sec:AppCorSys}.
For the full list of experiments in the global analysis
of parton distribution functions, however, the correlations of
systematic errors have not been published, so the fitting
function $\chi^{2}_{\rm global}$ has only uncorrelated
systematic errors.

We described above the measurement of a single quantity.
The determination of parton distribution functions seeks
to measure {\em many} quantities, {\it i.e.,} the
$16$ parameters $\{a\}$.
The above arguments can be extended to measurements of
multiple quantities.
If the measurement errors are uncorrelated, then the
increase of $\chi^{2}_{u}$ by 1 from the minimum defines
a hyperellipse in parameter
space---the {\em error ellipse}---corresponding
to one standard deviation of linear combinations
of the parameters.
However, if the errors are correlated then $\Delta{\chi_{u}^{2}}=1$
is not the correct criterion for a standard deviation.

The Lagrange Multiplier method finds the best fit to
the data, subject to a constrained value of some quantity $X$.
The prediction of $X$ is at the absolute minimum of $\chi^{2}_{u}$.
Again, if the errors are uncorrelated then one standard
deviation of the constrained quantity corresponds to
an increase of $\chi^{2}$ by 1 from the absolute minimum.
But if the errors are correlated then $\Delta{\chi_{u}^{2}}=1$
is not the correct criterion for one standard deviation
of $X$.

One reason for describing these familiar, even elementary,
statistical results is to avoid certain misconceptions.
Our standard PDF set $S_{0}$
is a parametrized fit to 1295 data points
with 16 fitting parameters.
The minimum value of $\chi^{2}_{\rm global}$ is approximately
1200.
Naively, it seems that an increase of $\chi^{2}_{\rm global}$
by merely 1, say from 1200 to 1201, could not possibly
represent a standard deviation of the fit.
Naively one might suppose that a standard deviation
would have $\Delta{\chi}^{2}\sim\sqrt{1295}$ rather than 1.
However, this is a misconception.
If the errors are uncorrelated
(or if the correlations are incorporated into
$\chi^{2}$) then indeed $\Delta{\chi}^{2}=1$ would
represent a standard deviation.
But this theorem is irrelevant to our problem,
because the large correlations of systematic errors
are not taken into account in $\chi^{2}_{\rm global}$.
%% Appendix B

\section{\protect$\mathbf{\chi^2}$
function including correlated systematic errors}

\label{sec:AppCorSys}

The purpose of this appendix is to derive the appropriate definition
of $\chi^{2}$ for data with correlated systematic errors.
The defining condition is that $\chi^{2}$ should obey a
chi-squared distribution.

Let $\left\{m_{i}\right\}$ be a set of measurements, where
$i=1, 2, 3, \dots, N$.
Let $t_{i}$ be the true, {\it i.e.,} theoretical value
of the $i^{\rm th}$ measured quantity.
Several kinds of measurement errors will
contribute to the difference between $m_{i}$ and $t_{i}$.
The uncorrelated error of measurement $i$
is denoted by $\sigma_{i}$.
There are also correlated errors, $K$ in number,
denoted $\beta_{1i}, \beta_{2i}, \dots, \beta_{Ki}$.
Thus the $i^{\rm th}$ measurement can be written as
\begin{equation} \label{eq:mistper}
m_{i} = t_{i}+{\rm ~errors}
 = t_{i}+\sigma_{i}r_{i}+\sum_{j=1}^{K}\beta_{ji}r_{j}^{\prime}
\end{equation}
where $r_{i}$ and $r_{j}^{\prime}$ are independently
fluctuating variables.
We assume that each of these fluctuations has a Gaussian distribution
with width $1$,
\begin{equation}
p(r)=\frac{e^{-r^{2}/2}}{\sqrt{2\pi}}.
\end{equation}
Note that $r_{j}^{\prime}$ is independent of $i$;
that is, the errors $\beta_{j1}, \beta_{j2},\dots ,\beta_{jN}$
are 100\% correlated for all $N$ data points.

The probability distribution of the measurements is
\begin{eqnarray}
dP &=& \int \prod_{i=1}^{N} p(r_{i})dr_{i}
\prod_{j=1}^{K} p(r_{j}^{\prime}) dr_{j}^{\prime}
\nonumber \\
 &\times& \prod_{i=1}^{N}\delta
\left(m_{i}-t_{i}-\sigma_{i}r_{i}
-\sum_{j=1}^{K}\beta_{ji}r_{j}^{\prime}\right)d^{N}m.
\label{eq:PDwithC}
\end{eqnarray}
Now we will evaluate the integrals over $r_{i}$ and $r^{\prime}_{j}$
in two steps.
First evaluate the $r_{i}$ integrals using the delta functions,
\begin{equation}
dP = \int \prod_{j=1}^{K} dr^{\prime}_{j} {\cal C}_{1}
e^{-\chi_{1}^{2}/2} \;d^{N}m
\end{equation}
where ${\cal C}_{1}$ is a normalization constant and
\begin{equation}
\chi_{1}^{2}=
\sum_{i=1}^{N} \left(\frac{m_{i}-t_{i}
-\sum_{j}\beta_{ji}r^{\prime}_{j}}{\sigma_{i}}\right)^{2}
+\sum_{j=1}^{K} {r^{\prime}_{j}}^{2}.
\end{equation}
Note that $\chi_{1}^{2}$ is a function of $r_{1}^{\prime},
\dots,r_{K}^{\prime}$.
These variables $\{r_{j}^{\prime}\}$ could be used as fitting
parameters to account for the systematic errors:
Minimizing $\chi_{1}^{2}$ with respect to $r_{j}^{\prime}$ would
provide the best model to correct for the systematic error of
type $j$.
Because $\chi_{1}^{2}$ is only a quadratic polynomial in
the $r_{j}^{\prime}$ variables, the minimization can be
done analytically.

To continue evaluating (\ref{eq:PDwithC}) we now do the integration
over $\{r_{j}^{\prime}\}$.
Write $\chi_{1}^{2}$ in the form
\begin{equation}
\chi_{1}^{2} = \sum_{i=1}^{N}\frac{\left(m_{i}-t_{i}\right)^{2}}
{\sigma_{i}^{2}} - \sum_{j=1}^{K} 2 B_{j} r_{j}^{\prime}
+\sum_{j,j^{\prime}=1}^{K}A_{jj^{\prime}}r_{j}^{\prime}
r_{j^{\prime}}^{\prime}
\end{equation}
where $B_{j}$ is a vector with $K$ components
\begin{equation}
B_{j}=\sum_{i=1}^{N}\beta_{ji}(m_{i}-t_{i})/\sigma_{i}^{2},
\end{equation}
and $A_{jj^{\prime}}$ is a $K\times K$ matrix
\begin{equation}
A_{jj^{\prime}}=\delta_{jj^{\prime}}
+\sum_{i=1}^{N}\beta_{ji}\beta_{j^{\prime}i}/\sigma_{i}^{2}.
\end{equation}
Then the integration over $d^{K}r^{\prime}$ is an
exercise in Gaussian integration, with the result
\begin{equation}
dP={\cal C}\exp\left[-\frac{1}{2}\chi^{2}\right] d^{N}m
\end{equation}
where ${\cal C}$ is a normalization constant and
\begin{equation}\label{eq:th-data_corr}
\chi^{2}=\sum_{i=1}^{N}\frac{(m_{i}-t_{i})^{2}}{\sigma_{i}^{2}}
-\sum_{j=1}^{K}\sum_{j^{\prime}=1}^{K}
B_{j}\left(A^{-1}\right)_{jj^{\prime}}B_{j^{\prime}}.
\end{equation}
This equation is the appropriate definition of $\chi^{2}$
for data with correlated systematic errors.
The correlated errors are defined by the coefficients
$\beta_{ji}$ in (\ref{eq:mistper}), which determine the
vector $B_{j}$ and matrix $A_{jj^{\prime}}$.
An interesting relation is that the $\chi^{2}$ quantity in
(\ref{eq:th-data_corr}) is the minimum of $\chi_{1}^{2}$
with respect to the parameters
$r_{1}^{\prime},\dots,r_{K}^{\prime}$.
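This relation is easy to confirm numerically: the quadratic $\chi_{1}^{2}$ is minimized at $r^{\prime}=A^{-1}B$, and its minimum value reproduces the $\chi^{2}$ of (\ref{eq:th-data_corr}). All inputs below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 8, 2
sigma = rng.uniform(0.5, 1.5, size=N)   # uncorrelated errors (illustrative)
beta = rng.normal(size=(K, N))          # systematic-error coefficients beta_ji
d = rng.normal(size=N)                  # plays the role of m_i - t_i

def chi1sq(r):
    # chi_1^2 as a function of the K systematic-shift parameters r'_j
    shifted = d - beta.T @ r
    return np.sum(shifted**2 / sigma**2) + np.sum(r**2)

B = beta @ (d / sigma**2)                    # vector B_j
A = np.eye(K) + (beta / sigma**2) @ beta.T   # matrix A_jj'
r_min = np.linalg.solve(A, B)                # analytic minimum of the quadratic
chi2 = np.sum(d**2 / sigma**2) - B @ np.linalg.solve(A, B)
print(chi1sq(r_min), chi2)              # equal: chi^2 = min over r' of chi_1^2
```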

Another expression for $\chi^{2}$, which may be derived
from (\ref{eq:PDwithC}) by Gaussian integration, is \cite{Alekhin}
\begin{equation}\label{eq:Alekhin}
\chi^{2}=\sum_{i=1}^{N}\sum_{i^{\prime}=1}^{N}
\left(m_{i}-t_{i}\right) \left(V^{-1}\right)_{ii^{\prime}}
\left(m_{i^{\prime}}-t_{i^{\prime}}\right)
\end{equation}
where $V_{ij}$ is the variance matrix
\begin{equation}
V_{ii^{\prime}}=\sigma_{i}^{2} \delta_{ii^{\prime}}
+\sum_{j=1}^{K} \beta_{ji} \beta_{j i^{\prime}}.
\end{equation}
It can be shown that the inverse of the variance matrix is
\begin{equation}
\left(V^{-1}\right)_{ii^{\prime}}=
\frac{\delta_{ii^{\prime}}}{\sigma_{i}^{2}}-\sum_{j,j^{\prime}=1}^{K}
\frac{\beta_{ji}\beta_{j^{\prime}i^{\prime}}}
{{\sigma_{i}}^{2}{\sigma_{i^{\prime}}^{2}}}
\left(A^{-1}\right)_{jj^{\prime}}.
\end{equation}
Therefore (\ref{eq:th-data_corr})
and (\ref{eq:Alekhin}) are equivalent.
However, there is a real computational advantage in the use of
(\ref{eq:th-data_corr}), because it requires the inversion of only the
$K\times{K}$ matrix $A$, rather than the numerical inversion of the
$N\times{N}$ variance matrix $V$.
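The equivalence of (\ref{eq:th-data_corr}) and (\ref{eq:Alekhin}) can also be checked numerically; all inputs below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 12, 3
sigma = rng.uniform(0.5, 1.5, size=N)   # uncorrelated errors (illustrative)
beta = rng.normal(size=(K, N))          # correlated systematic errors beta_ji
d = rng.normal(size=N)                  # residuals m_i - t_i

# First form: uncorrelated sum minus the B A^-1 B correction (K x K solve)
B = beta @ (d / sigma**2)
A = np.eye(K) + (beta / sigma**2) @ beta.T
chi2_corr = np.sum(d**2 / sigma**2) - B @ np.linalg.solve(A, B)

# Second form: quadratic form with the N x N variance matrix V
V = np.diag(sigma**2) + beta.T @ beta
chi2_var = d @ np.linalg.solve(V, d)
print(chi2_corr, chi2_var)              # the two definitions agree
```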

To check that (\ref{eq:th-data_corr}) makes sense
we can consider a special case.
Suppose the number $K$ of systematic errors is $N$,
and each systematic error contributes to just one
measurement.
Then the matrix of systematic errors has the form
\begin{equation}
\beta_{ji}=\delta_{ji}b_{i}.
\end{equation}
This situation is equivalent to an additional set of
{\em uncorrelated} errors $\left\{b_{i}\right\}$.
The vector $B_{j}$ is then
\begin{equation}
B_{j}=\frac{b_{j}\left(m_{j}-t_{j}\right)}{\sigma_{j}^{2}}
\end{equation}
and the matrix $A_{jj^{\prime}}$ is
\begin{equation}
A_{jj^{\prime}}=\delta_{jj^{\prime}}\left[1+\frac{b_{j}^{2}}
{\sigma_{j}^{2}}\right].
\end{equation}
Substituting these results into (\ref{eq:th-data_corr}) we find
\begin{equation}
\chi^{2}=\sum_{i}\frac{\left(m_{i}-t_{i}\right)^{2}}{\sigma_{i}^{2}
+b_{i}^{2}},
\end{equation}
which makes sense:
the uncorrelated errors just combine in quadrature.
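The diagonal special case can be confirmed with the same machinery; the values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 5
sigma = rng.uniform(0.5, 1.5, size=N)
b = rng.uniform(0.2, 0.8, size=N)       # one systematic per data point
d = rng.normal(size=N)                  # residuals m_i - t_i

beta = np.diag(b)                       # beta_ji = delta_ji b_i, so K = N
B = beta @ (d / sigma**2)
A = np.eye(N) + (beta / sigma**2) @ beta.T
chi2_general = np.sum(d**2 / sigma**2) - B @ np.linalg.solve(A, B)
chi2_quadrature = np.sum(d**2 / (sigma**2 + b**2))
print(chi2_general, chi2_quadrature)    # equal: errors combine in quadrature
```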

The statistical quantity $\chi^{2}$ has a chi-squared
distribution with $N$ degrees of freedom.
Thus this variable may be used to set confidence levels
of the theory for the given data.
But to use this variable, the measurement errors
$\sigma_{i}$ and $\beta_{ji}$, for
$i=1,2,\dots,N$ and $j=1,2,\dots,K$, must be known
from the experiment.
A chi-squared distribution with many degrees of freedom
is a very narrow distribution, sharply peaked at
$\chi^{2}\!=\!N$.
Therefore small inaccuracies in the values of the
$\sigma_{i}$'s and $\beta_{ji}$'s may translate into
a large error on the confidence levels computed from
the chi-squared distribution.

It is equation (\ref{eq:th-data_corr}) that we use
in Section \ref{sec:quantifying} to compare the constrained
fits produced by the Lagrange multiplier method to data from
the H1 and BCDMS experiments.
Correlated systematic errors are also used to calculate
$\chi^{2}$ for the CDF and D0 jet experiments.
%%Appendix 3

\section{Parton Distribution Sets}

\label{sec:AppPdfs}

We give here the PDFs described in Section 6.
$S_{0}$ is the standard set, defined by the absolute minimum
of $\chi^{2}_{\rm global}$.
$S^{\pm}_{W,{\rm Tev}}$ are fits to the global data sets
with extreme values of $\sigma_{W}$(Tevatron),
{\it i.e.,} the outermost points on Fig.\ \ref{fig:Wprod},
generated by the Lagrange multiplier method.
$S^{\pm}_{Z,{\rm Tev}}$, $S^{\pm}_{W,{\rm LHC}}$, and
$S^{\pm}_{Z,{\rm LHC}}$ are analogous for $Z$ production
and $W$ and $Z$ production at the LHC.

\bigskip

The functional form of the initial parton distributions and the
definitions of the PDF parameters at the low-energy scale
$Q_{0}=1$\,GeV are
\[
f(x,Q_{0}^{2})=A_{0}\,x^{A_{1}}(1-x)^{A_{2}}(1+A_{3}\,x^{A_{4}})
\]
for $u_{v},d_{v},g,\bar{u}+\bar{d},s(=\bar{s})$; and for the ratio
\[
\frac{\bar{d}(x,Q_{0}^{2})}
{\bar{u}(x,Q_{0}^{2})}
=A_{0}\,\,x^{A_{1}}(1-x)^{A_{2}}+(1+A_{3}\,x)\,(1-x)^{A_{4}}.
\]
The tables of coefficients follow.

\[
S_{0}: \quad
\begin{array}{|c|r|r|r|r|r|}
\hline
      & A_{0}   & A_{1}   & A_{2}   & A_{3}   & A_{4}  \\  \hline
d_{v} & 0.5959  & 0.4942  & 4.2785  & 8.4187  & 0.7867 \\  \hline
u_{v} & 0.9783  & 0.4942  & 3.3705  & 10.0012 & 0.8571 \\  \hline
    g & 3.3862  & 0.2610  & 3.4795  & -0.9653 & 1. \\  \hline
\bar{d}/\bar{u}
      & 3.051E4 & 5.4143  & 15. & 9.8535  & 4.3558 \\  \hline
\bar{u}+\bar{d} 
      & 0.5089  & 0.0877  & 7.7482  & 3.3890  & 1. \\  \hline
    s & 0.1018  & 0.0877  & 7.7482  & 3.3890  & 1. \\  \hline
\end{array}
\]
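For readers who want to evaluate these parametrizations directly, a minimal sketch follows; \texttt{pdf\_f} is an illustrative helper (not CTEQ code) that evaluates $f(x,Q_{0}^{2})$ using the gluon row of the $S_{0}$ table.

```python
def pdf_f(x, A0, A1, A2, A3, A4):
    # f(x, Q0^2) = A0 x^A1 (1-x)^A2 (1 + A3 x^A4), valid for 0 < x < 1
    return A0 * x**A1 * (1 - x)**A2 * (1 + A3 * x**A4)

# Gluon coefficients from the S_0 table above
g_S0 = dict(A0=3.3862, A1=0.2610, A2=3.4795, A3=-0.9653, A4=1.0)
print(pdf_f(0.1, **g_S0))   # gluon distribution at x = 0.1, Q = Q0 = 1 GeV
```

The $\bar{d}/\bar{u}$ rows use the separate ratio formula given above, not this functional form.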

\bigskip

\[
S_{W,\mathrm{TeV}}^\pm: \quad
\begin{array}{|c|r|r|r|r|r|}\hline
%WTev.tbl
 & A_{0} & A_{1} & A_{2} & A_{3} & A_{4} \\ \hline
d_v
 & 0.2891 & 0.5141 & 3.8555 & 10.9580 & 0.4128 \\ 
 & 0.2184 & 0.2958 & 4.6267 & 35.7229 & 1.0958 \\ \hline
%
u_v
 & 1.0142 & 0.5141 & 3.3614 & 9.2995 & 0.8053 \\ 
 & 0.2979 & 0.2958 & 3.3279 & 32.8453 & 0.9427 \\ \hline
%
g
 & 4.6245 & 0.4354 & 3.4795 & -0.9728 & 1. \\ 
 & 1.8080 & 0.0458 & 3.4795 & -0.0519 & 1. \\ \hline
%
\bar{d}/\bar{u}
 & 5.908E4 & 5.6673 & 15. & 9.8535 & 4.7458 \\ 
 & 2.041E4 & 5.1506 & 15. & 9.8535 & 4.8320 \\ \hline
%
\bar{u}+\bar{d}
 & 0.4615 & 0.0108 & 6.6145 & 0.92784 & 1. \\ 
 & 1.2515 & 0.3338 & 7.5216 & -0.0570 & 1. \\ \hline
%
s
 & 0.0923 & 0.0108 & 6.6145 & 0.9278 & 1. \\ 
 & 0.2503 & 0.3338 & 7.5216 & -0.0570 & 1. \\ \hline
%
\end{array}
\]

\bigskip

\[
S_{Z,\mathrm{TeV}}^\pm: \quad
\begin{array}{|c|r|r|r|r|r|} \hline
%ZTev.tbl
 & A_{0} & A_{1} & A_{2} & A_{3} & A_{4} \\ \hline
d_v
 & 0.6061 & 0.5502 & 4.0017 & 5.8346 & 0.5343 \\ 
 & 0.3427 & 0.3728 & 4.5166 & 19.8510 & 0.9966 \\ \hline
%
u_v
 & 1.2159 & 0.5502 & 3.3347 & 7.3386 & 0.7711 \\ 
 & 0.5247 & 0.3728 & 3.3905 & 20.1006 & 0.9556 \\ \hline
%
g
 & 4.4962 & 0.4321 & 3.4795 & -0.9023 & 1. \\ 
 & 2.3113 & 0.1032 & 3.4795 & -0.6349 & 1. \\ \hline
%
\bar{d}/\bar{u}
 & 4.321E4 & 5.4724 & 15. & 9.8535 & 4.6298 \\ 
 & 2.818E4 & 5.4540 & 15. & 9.8535 & 4.4376 \\ \hline
%
\bar{u}+\bar{d}
 & 0.4609 & 0.0103 & 6.6671 & 0.9822 & 1. \\ 
 & 0.9900 & 0.2926 & 8.3205 & 2.1648 & 1. \\ \hline
%
s
 & 0.0921 & 0.0103 & 6.6671 & 0.9823 & 1. \\ 
 & 0.1980 & 0.2926 & 8.3205 & 2.1648 & 1. \\ \hline
%
\end{array}
\]

\bigskip

\[
S_{W,\mathrm{LHC}}^\pm: \quad
\begin{array}{|c|r|r|r|r|r|} \hline
%WLHC.tbl
 & A_{0} & A_{1} & A_{2} & A_{3} & A_{4} \\ \hline
d_v
 & 0.7326 & 0.5008 & 4.6393 & 10.8532 & 1.0595 \\ 
 & 0.5671 & 0.4771 & 4.2615 &  8.8355 & 0.8130 \\ \hline
%
u_v
 & 1.0608 & 0.5008 & 3.4023 &  9.6622 & 0.8968 \\ 
 & 0.9142 & 0.4771 & 3.3761 & 10.9138 & 0.8809 \\ \hline
%
g
 & 2.2379 & 0.0733 & 3.4795 & -0.9860 & 1. \\ 
 & 2.5021 & 0.3981 & 3.4795 &  1.6229 & 1. \\ \hline
%
\bar{d}/\bar{u}
 & 2.178E4 & 5.2576 & 15. & 9.8535 & 4.4810 \\ 
 & 4.531E4 & 5.4979 & 15. & 9.8535 & 4.6585 \\ \hline
%
\bar{u}+\bar{d}
 & 1.1980 &  0.2952 & 6.9475 & -0.5442 & 1. \\ 
 & 0.2759 & -0.0918 & 8.2045 &  6.3950 & 1. \\ \hline
%
s
 & 0.2396 &  0.2952 & 6.9475 & -0.5442 & 1. \\ 
 & 0.0552 & -0.0918 & 8.2045 &  6.3950 & 1. \\ \hline
%
\end{array}
\]

\bigskip

\[
S_{Z,\mathrm{LHC}}^\pm: \quad
\begin{array}{|c|r|r|r|r|r|} \hline
%ZLHC.tbl
 & A_{0} & A_{1} & A_{2} & A_{3} & A_{4} \\ \hline
d_v
 & 0.5659 & 0.4616 & 4.5297 & 12.3685 & 0.9836 \\ 
 & 0.4585 & 0.4496 & 4.2122 & 10.3850 & 0.7760 \\ \hline
%
u_v
 & 0.8344 & 0.4616 & 3.3847 & 12.1129 & 0.8872 \\ 
 & 0.7640 & 0.4496 & 3.3566 & 12.8253 & 0.8701 \\ \hline
%
g
 & 2.3282 & 0.0918 & 3.4795 & -0.9837 & 1. \\ 
 & 2.9475 & 0.4219 & 3.4795 & 0.9447 & 1. \\ \hline
%
\bar{d}/\bar{u}
 & 2.421E4 & 5.3032 & 15. & 9.8535 & 4.5341 \\ 
 & 4.416E4 & 5.4708 & 15. & 9.8535 & 4.7925 \\ \hline
%
\bar{u}+\bar{d}
 & 1.1130 &  0.2698 & 6.8490 & -0.5330 & 1. \\ 
 & 0.2719 & -0.0899 & 8.1492 &  6.5300 & 1. \\ \hline
%
s
 & 0.2226 & 0.2698 & 6.8490 & -0.5330 & 1. \\ 
 & 0.0544 & -0.0899 & 8.1492 & 6.5300 & 1. \\ \hline
%
\end{array}
\]
With a program to solve the PDF evolution equations,
the PDFs for an arbitrary momentum scale $Q$ can be
generated.


% Standard Simple Cover Page for Papers -- wkt 1/11/95

\begin{titlepage}

\begin{tabular}{l}
\noindent{January 4, 2001}  %\DATE
\end{tabular}
\hfill
\begin{tabular}{l}
\PPrtNo
\end{tabular}

\vspace{1cm}

\begin{center}
                           % Title
\renewcommand{\thefootnote}{\fnsymbol{footnote}}
{
\LARGE \TITLE
%\footnote[2]{\THANKS}
}

\vspace{1.25cm}
                          % Authors
{\large  \AUTHORS}

\vspace{1.25cm}

                          % Institutions
\INST
\end{center}

\vfill

\ABSTRACT                 % Abstract

\vfill

\newpage
\end{titlepage}

\renewcommand{\thefootnote}{\alph{footnote}}   % Has to be outside titlepage
\setcounter{footnote}{0}
% References
\begin{thebibliography}{99}

%Intro
\bibitem{ZomerDis}
V.\ Barone, C.\ Pascaud, and F.\ Zomer,
Eur.\ Phys.\ J.\ {\bf C12}, 243 (2000);
C.~Pascaud and F.~Zomer,
%``QCD analysis from the proton structure function F2 measurement:
%Issues on fitting, statistical and systematic errors,'',
Tech.\ Note LAL-95-05.

\bibitem{Alekhin}
S.~Alekhin,
%``Extraction of parton distributions and alpha(s) from DIS data
%within the Bayesian treatment of systematic errors,''
Eur.\ Phys.\ J.\ {\bf C10}, 395 (1999);
contribution to Proceedings of
\emph{Standard Model Physics (and more) at the LHC}, 1999; and
S.~I.~Alekhin.
%``Global fit to the charged leptons DIS data: alpha(s), parton
%distributions, and high twists,''

\bibitem{GieleEtal}
 W.\ T.\ Giele and S.\ Keller, Phys.\ Rev.\ \textbf{D58},
094023 (1998);
W.\ T.\ Giele, S.\ Keller and D.\ Kosower,
Proceedings of the Fermilab Run II Workshop (2000).

\bibitem{Botje}
%``A QCD analysis of HERA and fixed target
%structure function data'',
M.\ Botje, Eur.\ Phys.\ J.\ {\bf C14}, 285 (2000).

\bibitem{Bialek}
W.\,Bialek, C.\,G.\,Callan, S.\,P.\,Strong,
Phys.\,Rev.\,Lett.\,{\bf 77}, 4693 (1996);
V.\,Periwal, Phys.\,Rev.\,{\bf D59}, 094006 (1999).

\bibitem{Ball}
R.\,D.\,Ball, ``QCD and Hadronic Interactions'',
In {\em Proceedings of the XXXIVth Rencontres de
Moriond}, 1999.

\bibitem{RunII}
R.\ Brock, D.\ Casey, J.\ Huston, J.\ Kalk,
J.\ Pumplin, D.\ Stump, W.\,K.\ Tung,
Proceedings of the Fermilab Workshop, Run II and Beyond,
(2000).

\bibitem{pdfuc0}
J.\ Pumplin, D.\,R.\ Stump, and W.\,K.\ Tung,
``Multivariate Fitting and the Error Matrix in Global Analysis of Data'',
submitted to Phys.\,Rev.\,{\bf D}.

\bibitem{CSCTEQ}
D.E.\ Soper and J.C.\ Collins,
``Issues in the Determination of Parton Distribution Functions'',
CTEQ Note 94/01.

\bibitem{Hesse}
J.\ Pumplin, D.\ Stump, R.\ Brock, D.\ Casey, J.\ Huston,
J.\ Kalk, H.\,L.\ Lai, W.\,K.\ Tung, ``Uncertainties
of predictions from parton distribution functions II:
the Hessian method'', MSU preprint .

% Experiments in CTEQ5

\bibitem{cteq5}
H.\ L.\ Lai, J.\ Huston, S.\ Kuhlmann, J.\ Morfin, F.\ Olness,
J.\ F.\ Owens, J.\ Pumplin and W.\ K.\ Tung,
Eur.\ Phys.\ J.\ {\bf C12}, 354 (2000),
and earlier references cited therein.

\bibitem{MRST}  %"Parton distributions: a new global analysis"
A.\ D.\ Martin, R.\ G.\ Roberts, W.\ J.\ Stirling, and R.\ S.\ Thorne,
Eur.\ Phys.\ J.\ \textbf{C4}, 463 (1998),
and earlier references cited therein.

\bibitem{MRST2}
%"Parton Distributions and the LHC, W and Z production"
A.\ D.\ Martin, R.\ G.\ Roberts, W.\ J.\ Stirling, and R.\ S.\ Thorne,
Eur.\ Phys.\ J.\ {\bf C14}, 133 (2000).

\bibitem{bcdms}
BCDMS Collaboration (A.C. Benvenuti, {\em et al.}),
Phys.\ Lett.\ {\bf B223}, 485 (1989);
and Phys.\ Lett.\ {\bf B237}, 592 (1990).

\bibitem{NewH1}
 H1 Collaboration (S. Aid, {\em et al.}):
``1993 data'', Nucl.\ Phys.\ {\bf B439}, 471 (1995);
``1994 data'', DESY-96-039;
and H1 Webpage.

\bibitem{NewZeus}
ZEUS Collaboration (M. Derrick, {\em et al.}),
``1993 data'', Z.~Phys.\ {\bf C65}, 379 (1995);
``1994 data'', DESY-96-076 (1996).

\bibitem{NewNmc}
 NMC Collaboration (M. Arneodo, {\em et al.}),
Phys.\ Lett.\ {\bf B364}, 107 (1995).

\bibitem{ccfr}
CCFR Collaboration (W.C. Leung, {\em et al.}),
Phys.\ Lett.\ {\bf B317}, 655 (1993);
and (P.Z. Quintas, {\em et al.}),
Phys.\ Rev.\ Lett.\ {\bf 71}, 1307 (1993).

\bibitem{E605}
E605 Collaboration (G. Moreno, {\it et al.}),
Phys.\ Rev.\ {\bf D43}, 2815 (1991).

\bibitem{NA51}
NA51 Collaboration (A. Baldit, {\em et al.}),
Phys.\ Lett.\ {\bf B332}, 244 (1994).

\bibitem{E866} E866 Collaboration (E.A. Hawker, {\em et al.}),
Phys.\ Rev.\ Lett.\ {\bf 80}, 3175 (1998).

\bibitem{Wasym}
CDF Collaboration (F.\ Abe, {\em et al.}),
Phys.\ Rev.\ Lett.\ {\bf 74}, 850 (1995).

\bibitem{D0Jets}
D0 Collaboration (B.\ Abbott, {\em et al.}),
FERMILAB-PUB-98-207-E.

% W cross section

\bibitem{CdfJets}
CDF Collaboration (F.\ Abe, {\em et al.}),
Phys.\ Rev.\ Lett.\ \textbf{77}, 439 (1996);
and F.\ Bedeschi, talk at 1999 Hadron Collider Physics Conference,
Bombay, January, 1999.

\bibitem{Kosower98}
W.~T.~Giele, S.~Keller and D.~A.~Kosower,
``Parton distributions with errors'',
In {\it La Thuile 1999, Results and perspectives in particle physics},
Proceedings of `Les Rencontres de Physique de la Valle d'Aoste'.

\bibitem{FLehner}
F.\ Lehner,
``Some Aspects of $W/Z$ boson physics at the Tevatron'',
in {\em Proceedings of the 4th Rencontres du Vietnam;
International Conference on Physics at Extreme Energies},
Hanoi, 2000;
FERMILAB-Conf-00/273-E, Oct.\,2000.

% Extras

\bibitem{PartDatGr}
D.\,E.\ Groom {\it et al.} (Particle Data Group),
Eur.\ Phys.\ J.\ {\bf C15}, 1 (2000).

\bibitem{FAbe}
F.\ Abe {\it et al.} (CDF Collaboration),
Phys.\ Rev.\ {\bf D50}, 5550 (1994).

\bibitem{MAlbrow}
M.\ Albrow, A.\ Beretvas, P.\ Giromini, L.\ Nodulman,
FERMILAB-TM-2071, 1999 (unpublished).


\end{thebibliography}
\newcommand{\DATE}
{}
%{\today}

\newcommand{\PPrtNo}
{ MSU-HEP-07102 \\
CERN-TH/2000-359}

\newcommand{\TITLE}
{\Large Uncertainties of Predictions from Parton Distribution Functions
\newline
I: the Lagrange Multiplier Method}

\newcommand{\AUTHORS}
{ D.\ Stump, J.\ Pumplin, R.\ Brock, D.\ Casey, J.\ Huston, J.\ Kalk, \\
H.L.\ Lai,$^a \strut$ W.K.\ Tung$^b \strut$
}
%
\newcommand{\INST}
{ Department of Physics and Astronomy \\
         Michigan State University \\
         East Lansing, MI 48824 \\

\vspace{2ex}

         $^a$ Ming-Hsin Institute of Technology \\
         Hsin-Chu, Taiwan

\vspace{2ex}

         $^b$ Theory Division, CERN \\
         Geneva, Switzerland
}
\newcommand{\ABSTRACT}
{\hspace*{0.1cm} We apply the Lagrange Multiplier method to study the
uncertainties of physical predictions due to the uncertainties of parton
distribution functions (PDFs), using the cross section $\sigma_{W}$ for $W$
production at a hadron collider as an archetypal example.
An effective $\chi^2$ function based on the CTEQ global QCD analysis
is used to generate a series of PDFs, each of which represents the
best fit to the global data for some specified value of $\sigma_W$.
By analyzing the likelihood of these ``alternative hypotheses'',
using available information on errors from the individual experiments,
we estimate that the fractional uncertainty of $\sigma_{W}$ due to current
experimental input to the PDF analysis is approximately $\pm{4}$\%
at the Tevatron, and $\pm{8}$--$10$\% at the LHC.
We give sets of PDFs corresponding to these up and down
variations of $\sigma_{W}$.
We also present similar results on $Z$ production
at the colliders.
Our method can be applied to any combination of physical variables in
precision QCD phenomenology, and it can be used to generate benchmarks
for testing the accuracy of approximate methods based on the
error matrix.}

%                 ------- Macros --------
\def\pt{$p_T$}

%               Usage: \begin{Simlis}[opt-label]{left-margin}
%                        \item ...
%                      \end{Symlis}
\newenvironment{Simlis}[2][$\bullet$]
{\begin{list}{#1}
 {
  \settowidth{\labelwidth}{#1}
  \setlength{\labelsep}{0.5em}
  \setlength{\leftmargin}{#2}
  \setlength{\rightmargin}{0em}
  \setlength{\itemsep}{0ex}
  \setlength{\topsep}{0ex}
 }
}
{\end{list}}
                              % page format of the logical page
\textwidth  = 6.5 in
\textheight = 9.0 in

\topmargin     = 1.0 in
\oddsidemargin = 1.0 in
                              % printer device-dependent offsets
\voffset = -1.60 in
\hoffset = -1.00 in

\setlength{\parindent}{1cm}
\setlength{\itemindent}{0 em}
\renewcommand{\baselinestretch}{1.1}

%\newcommand{\}[]{}
%All Tables for pdf uncertainty paper

% List of Data Sets
\newcommand{\tblDatSet}
{
\begin{table}

\begin{center}
\begin{tabular}{lllcl}
\hline
Experiment & Process & Label & \# Data pts & Reference \\ \hline
BCDMS & DIS $\mu p$ & BCDMSp & 168 & \cite{bcdms} \\
BCDMS & DIS $\mu d$ & BCDMSd & 156 & \cite{bcdms} \\
H1 & DIS $ep$ & H1 & 172 & \cite{NewH1} \\
ZEUS & DIS $ep$ & ZEUS & 186 & \cite{NewZeus} \\
NMC & DIS $\mu p$ & NMCp & 104 & \cite{NewNmc} \\
NMC & DIS $\mu p/\mu n$ & NMCr & 123 & \cite{NewNmc} \\
NMC & DIS $\mu p/\mu n$ & NMCrx & 13 & \cite{NewNmc} \\
CCFR & DIS $\nu p$ & CCFR2 & 87 & \cite{ccfr} \\
CCFR & DIS $\nu p$ & CCFR3 & 87 & \cite{ccfr} \\
E605 & D-Y $pp$ & E605 & 119 & \cite{E605} \\
NA51 & D-Y $pd/pp$ & NA51 & 1 & \cite{NA51} \\
E866 & D-Y $pd/pp$ & E866 & 11 & \cite{E866} \\
CDF & W$_{lep-asym.}$ & CDFw & 11 & \cite{Wasym} \\
D0 & $\bar{p}p\rightarrow jet\,X$ & D0jet & 24 & \cite{D0Jets} \\
CDF & $\bar{p}p\rightarrow jet\,X$ & CDFjet & 33 & \cite{CdfJets} \\ \hline
\end{tabular}
\end{center}

\caption{List of data sets used in the global analysis.
\label{tbl:DatSet} }
\end{table}
}

%%  Everything below this line is not in use in version pdfucLMw1 =======



%%---LIST OF EXPERIMENTS---
\newcommand{\tblExptList}
{
\begin{table}
\begin{center}
\begin{tabular}{|c|c|c|c|}
\hline Process & Experiment & Measurable & $N_{data}$ \\ \hline
DIS & BCDMS\cite{bcdms} & $F_{2\ H}^\mu, F_{2\ D}^\mu $ & 324   \\
\hline & NMC
\cite{NewNmc} & $F_{2\ H}^\mu, F_{2\ D}^\mu  $ & 240 \\
\hline & H1
\cite{NewH1}& $F_{2\ H}^e $ & 172   \\ \hline & ZEUS\cite{NewZeus} & $F_{2\
H}^e $ & 186   \\
\hline & CCFR  \cite{ccfr}& $F_{2\ Fe}^\nu, x\ F_{3\ Fe}^\nu
$ & 174  \\
\hline Drell-Yan  & E605\cite{E605} & $sd\sigma /d\sqrt{\tau }dy$ &
119  \\
\hline & E866 \cite{E866} & $\sigma(pd)/2\sigma(pp)$ & 11 \\ \hline &
NA-51\cite{NA51} & $A_{DY}$ & 1   \\ \hline W-prod. & CDF \cite{Wasym} & Lepton
asym. & 11  \\
\hline\hline Incl. Jet & CDF \cite{CdfJets} & $d\sigma /dE_t$ &
33  \\
\hline & D0\cite{D0Jets} & $d\sigma /dE_t$ & 24   \\
\hline
\end{tabular}
\end{center}
  \caption{List of processes and experiments used in the CTEQ5 global
  analysis. The total number of data points is 1295.
\label{tbl:ExptList}}

\end{table}
}

%%---CALCULATIONS OF CHISQ_CORR FOR H1 DATA (Tevatron)---
\newcommand{\tblHoneTev}
{
\begin{table}
\begin{center}
\begin{tabular}{rrrr} \\ \hline
Lagrange & $\sigma_{W}B$ & $\chi^{2}/172$ & probability\\
multiplier & in nb & & \\
 \hline
 3000 & 2.294 & 1.0847 & 0.212\\
 2000 & 2.321 & 1.0048 & 0.468\\
 1000 & 2.356 & 0.9676 & 0.605\\
 0 & 2.374 & 0.9805 & 0.558\\
 -1000 & 2.407 & 1.0416 & 0.339\\
 -2000 & 2.431 & 1.0949 & 0.187\\
 -3000 & 2.450 & 1.1463 & 0.092\\
 \hline
\end{tabular}
\end{center}
  \caption{Comparison of H1 data to the PDF fits with
constrained values of $\sigma_{W}$ for the Tevatron.
($B$ is the branching ratio for $W\rightarrow e\nu$.)
  \label{tbl:H1Tev}}

\end{table}
}

%%---CALCULATIONS OF CHISQ_CORR FOR H1 DATA (LHC)---
\newcommand{\tblHoneLHC}
{
\begin{table}
\begin{center}
\begin{tabular}{rrrr} \\ \hline
Lagrange & $\sigma_{W}B$ & $\chi^{2}/172$ & probability\\
multiplier & in nb & & \\
 \hline
 300 & 18.97 & 1.3607 & 0.001\\
 200 & 19.42 & 1.2085 & 0.032\\
 100 & 19.99 & 1.0616 & 0.276\\
   0 & 20.67  & 0.9805 & 0.558\\
 -100 & 21.20 & 0.9536 & 0.656\\
 -200 & 21.81 & 0.9615 & 0.628\\
 -300 & 21.99 & 0.9936 & 0.509\\
 \hline
\end{tabular}
\end{center}
  \caption{Comparison of H1 data to the PDF fits with
constrained values of $\sigma_{W}$ for the LHC.
($B$ is the branching ratio for $W\rightarrow e\nu$.)
  \label{tbl:H1LHC}}

\end{table}
}

%%---CALCULATIONS OF CHISQ_CORR FOR BCDMS DATA (Tevatron)---
\newcommand{\tblBCDMSTev}
{
\begin{table}
\begin{center}
\begin{tabular}{rrrr} \\ \hline
Lagrange & $\sigma_{W}B$ & $\chi^{2}/168$ & probability\\
multiplier & in nb & & \\
\hline
 3000 & 2.294 & 1.2061 & 0.035\\
 2000 & 2.321 & 1.1672 & 0.068\\
 1000 & 2.356 & 1.1118 & 0.153\\
    0 & 2.374 & 1.1095 & 0.157\\
 -1000 & 2.407 & 1.1213 & 0.135\\
 -2000 & 2.431 & 1.1592 & 0.077\\
 -3000 & 2.450 & 1.2131 & 0.031\\
 \hline
\end{tabular}
\end{center}
  \caption{Comparison of BCDMS data to the PDF fits with
constrained values of $\sigma_{W}$ for the Tevatron.
($B$ is the branching ratio for $W\rightarrow e\nu$.)
  \label{tbl:BCDMS}}

\end{table}
}

%%---CALCULATIONS OF CHISQ_CORR FOR BCDMS DATA (LHC)---
\newcommand{\tblBCDMSLHC}
{
\begin{table}
\begin{center}
\begin{tabular}{rrrr} \\ \hline
Lagrange & $\sigma_{W}B$ & $\chi^{2}/168$ & probability\\
multiplier & in nb & & \\
\hline
 300 & 18.97 & 1.0659 & 0.265\\
 200 & 19.42 & 1.0620 & 0.276\\
 100 & 19.99 & 1.0622 & 0.276\\
   0 & 20.67 & 1.1095 & 0.157\\
-100 & 21.20 & 1.1354 & 0.110\\
-200 & 21.81 & 1.1943 & 0.043\\
-300 & 21.99 & 1.1946 & 0.043\\
 \hline
\end{tabular}
\end{center}
  \caption{Comparison of BCDMS data to the PDF fits with
constrained values of $\sigma_{W}$ for the LHC.
($B$ is the branching ratio for $W\rightarrow e\nu$.)
  \label{tbl:BCDMSLHC}}

\end{table}
}

%Table of chisq's for each experiment
\documentclass[12pt]{article}
\usepackage{epsf,amssymb}

%                       table and figure definitions
\input{pdfuc2.tbl}
\input{pdfuc2.fig}
%                       preamble; front-matter and macro definitions
\input{pdfuc2.pre}

%                       document
\begin{document}

%                       cover page
\input{front.tex}
%                       body
\tableofcontents
\newpage
\input{s21.Intro.tex}             % Introduction
                  %\clearpage
\input{s22.Global.tex}            % Lagrange Multiplier Method
                  %\clearpage
\input{s23.Lagrange.tex}          % \sigma_W parabola
                  %\clearpage
\input{s24.DeltaChi2.tex}         % quantifying error of \sigma_w
                  %\clearpage
\input{s25.MoreExamples.tex}      % Other examples, LHC-w; Z
                  %\clearpage
\input{s26.UpDownPdf.tex}         % Give the Up/Down Pdf sets to be distributed
                  %\clearpage
\input{s27.Conclude.tex}

                  %\clearpage
\appendix

\input{a21.DeltaChi2.tex}           % Our calculation of \chi^2 with correlations
                  %\newpage
%\clearpage
\input{a22.Chi2withcorr.tex}
%\clearpage
%
\input{a23.pdfs.tex}
                  %\newpage
\newpage
                              % citations
\input{pdfuc2.cit}
                              % back matters (tables & figures, if not set in-line)
%\input{pdfuc2.bck}

\end{document}
%%Section 2

\section{The Global QCD Analysis}

\label{sec:Global}

We adopt the same experimental and theoretical input as the CTEQ5
analysis \cite{cteq5}: 15 data sets from 11 experiments on neutral-current and
charged-current deep inelastic scattering (DIS), lepton-pair production
(Drell-Yan), lepton asymmetry in $W$-production, and high $p_{T}$ inclusive
jet production processes are used.
({\it Cf.} Table \ref{tbl:DatSet} in Sec.~\ref{sec:quantifying}.)
The total number of data points is $N=1295$.
We denote the experimental data values by
$\{D\}=\{D_{I};\;I=1,\dots,N\}$.
The theory input is next-to-leading-order (NLO) PQCD, and the
theory value for data point $I$ will be denoted by $T_{I}$.
The theory depends on a set of parameters
$\{a\} \equiv \{a_{i};\;i=1,\dots,d\}$.
These parameters characterize the nonperturbative
QCD input to the analysis; they determine the initial PDFs
$\{f(x,Q_{0};\{a\})\}$ defined at a low energy scale $Q_{0}$,
below the energy scale of the data, which we choose to be
$Q_{0}=1$\,GeV.
When we need to emphasize that the theoretical values depend
on the PDF parameters we write $T_{I}(a)$ to indicate
the dependence on $\{a\}$.

The parametrization of $\{f(x,Q_{0})\}$ is somewhat arbitrary,
motivated by physics, numerical considerations, and economy.
Another parametrization might be employed,
and differences among the possible parametrizations are in principle a
source of theoretical uncertainty in their own right.
For most of this study we focus on a single parametrization,
but we comment on the effect of changing the parametrization
at the end of Sec.~\ref{sec:quantifying}.
The number $d$ of the parameters $\{a\}$ is chosen
to be commensurate with current experimental constraints.
For this study we use $d=16$.
The detailed forms adopted for the initial functions
$\{f(x,Q_{0};\{a\})\}$ are not of particular concern
in this study, since we shall be emphasizing
results obtained by ranging over the full
parameter space.\footnote{%
In other words, for this paper, the PDF parameters $\{a\}$
play mostly the role of ``internal variables''.
In contrast, they occupy the center stage in
the companion paper \cite{Hesse}.}
The explicit formulas are given in Appendix \ref{sec:AppPdfs}
(where relevant PDFs from the results of our study are presented).
The $T_{I}(\{a\})$ are calculated as convolution integrals of the
relevant NLO QCD matrix elements and the universal parton distributions
$\{f(x,Q;\{a\})\}$ for all $Q$. The latter are obtained from the initial
functions $\{f(x,Q_{0};\{a\})\}$ by NLO QCD evolution.

The \emph{global analysis} consists of a systematic way to determine
the best values for the $\{a\},$ and the associated uncertainties,
by fitting $\{T(a)\}$ to $\{D\}$.
Because of the wide range of experimental and theoretical sources
of uncertainty mentioned in the Introduction, there are a variety
of strategies to deal with the complex issues
involved\ \cite{ZomerDis,Alekhin,GieleEtal,Botje,RunII}.
In the next two sections, the primary
tool we employ is conventional $\chi ^{2}$ analysis.
The important task is to define an effective $\chi^{2}$ function,
called $\chi_{\mathrm{global}}^{2}(a) $, that conveniently combines
the theoretical and global experimental inputs,
as well as relevant physics considerations based on prior knowledge,
to give an overall measure of the goodness-of-fit for a given set
of PDF parameters.

Experience in global analysis of PDFs during the past two decades has
demonstrated that the PDFs obtained by the minimization of such a suitably
chosen $\chi _{\mathrm{global}}^{2}$ provide very useful up-to-date hadron
structure functions which, although not unique, are representative of good fits
between theory and experiments.
Now we must quantify the uncertainties of the PDFs and their
predictions; {\it i.e.,} we must expand the scope of the work
from merely identifying typical solutions to systematically
mapping the PDF parameter space in the neighborhood around
the minimum of $\chi^2$.

The simplest possible choice for the $\chi ^{2}$ function would be
\begin{equation}
\chi^{2}(a)=\sum_{I=1}^{N}
\frac{\left[D_{I}-T_{I}(a)\right]^{2}}{\sigma
_{I}^{2}}  \label{eq:Ch2generic}
\end{equation}
where $\sigma _{I}$ is the error associated with data point $I$.
Through $T_{I}(a)$, $\chi^{2}(a)$ is a function of
the theory parameters $\{a\}$.
Minimization of $\chi^{2}(a)$ would identify
parameter values for which the theory fits the data.
However, the simple form (\ref{eq:Ch2generic}) is appropriate only
for the ideal case of a uniform data set with uncorrelated errors.
For data used in the global analysis, most experiments combine various
systematic errors into one effective error for each data point,
along with the statistical error.
Then, in addition, the fully correlated normalization error of
the experiment is usually specified separately.
For this reason,
it is natural to adopt the following definition for the effective $\chi^2$
(as done in previous CTEQ analyses):
\begin{eqnarray}
\chi _{\mathrm{global}}^{2}(a) &=&
\sum_{n} w_{n} \chi _{n}^{2}(a)\qquad (n\;%
\mbox{labels the different experiments})
\label{eq:Chi2global}
\\
\chi _{n}^{2}(a) &=&\left(\frac{1-{\cal N}_{n}}{\sigma _{n}^{N}}\right)^{2}
+\sum_{I}\left( \frac{{\cal N}_{n}D_{nI}-T_{nI}(a)}{\sigma _{nI}^{D}}
\right)^{2}
\label{eq:Chi2n}
\end{eqnarray}
For the $n^{\mathrm{th}}$ experiment, $D_{nI}$, $\sigma _{nI}^{D}$, and $%
T_{nI}(a)$ denote the data value, measurement uncertainty (statistical and
systematic combined), and theoretical
value (dependent on $\{a\}$) for the $I^{\mathrm{th}}$ data point; $\sigma
_{n}^{N}$ is the experimental normalization uncertainty;
${\cal N}_{n}$ is an overall normalization factor (with default
value $1$) for the data of experiment $n$.
The factor $w_{n}$ is a possible weighting factor (with default value $1$)
which may be necessary to take into account prior knowledge based on
physics considerations or other information.
The {\it a priori} choices represented by the $w_{n}$ values are
present, explicitly or implicitly, in any data analysis.
For instance, data inclusion or omission (choices which vary for
different global analysis efforts) represent extreme cases,
assigning either $100\%$ or $0\%$ weight to each available
experimental data set.
Similarly, choices of various elements of the
analysis procedure itself represent subjective input.
Subjectivity of this kind also enters into the analysis
of systematic errors in experiments.
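For concreteness, the two-level structure of
Eqs.~(\ref{eq:Chi2global}) and (\ref{eq:Chi2n}) can be sketched as
follows. This is a minimal illustration only; the function names are
ours, and the closed-form optimal normalization is simply the minimum
of the quadratic in ${\cal N}_{n}$.

```python
import numpy as np

def chi2_n(D, T, sig_D, sig_N, norm=None):
    """Per-experiment effective chi^2: a quadratic normalization
    penalty plus point-by-point residuals.  If `norm` is None, the
    optimal normalization factor N_n is found in closed form by
    minimizing the quadratic in N_n."""
    if norm is None:
        num = 1.0 / sig_N**2 + np.sum(D * T / sig_D**2)
        den = 1.0 / sig_N**2 + np.sum(D**2 / sig_D**2)
        norm = num / den
    return ((1.0 - norm) / sig_N)**2 + np.sum(((norm * D - T) / sig_D)**2)

def chi2_global(experiments, weights=None):
    """Weighted sum over experiments; each entry of `experiments`
    is a (D, T, sig_D, sig_N) tuple of numpy arrays / floats."""
    if weights is None:
        weights = [1.0] * len(experiments)
    return sum(w * chi2_n(*e) for w, e in zip(weights, experiments))
```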

The function $\chi_{\mathrm{global}}^{2}(a)$ allows the inclusion of all
experimental constraints in a uniform manner while allowing flexibility for
incorporating other relevant physics input.
We will make use of this function to explore the neighborhood
of the best fit, and to generate sample PDFs pertinent
to the uncertainty of the prediction of a specific physical
variable of interest.
However, the numerical value of this effective $\chi^{2}$
function should not be given an \textit{a priori} statistical
interpretation, because correlations between measurement errors,
and correlated theoretical errors, are not included in its definition.
In particular, the likelihood of a candidate PDF set $\{a\}$ cannot
be determined by the value of the increase
$\Delta\chi_{\mathrm{global}}^{2}(a)$
above the minimum.\footnote{%
The often quoted theorem of Gaussian error analysis, that an increase of
$\chi^{2}$ by 1 unit in a constrained fit to data corresponds to 1 standard
deviation of the constrained variable, is true only in the absence of
correlations.
When existing correlations are left out, the relevant size of
$\Delta\chi^2$ can be much larger than $1$.
Appendix \ref{sec:AppDelChi} discusses this point in some detail.}
Instead, the evaluation of likelihoods and estimation of global
uncertainty will be carried out in a separate step in
Sec.\,\ref{sec:quantifying}, after sets of
optimal sample PDFs for the physical variable of interest
have been obtained.
%%Section 3

\section{The Lagrange Multiplier Method}
\label{sec:Lagrange}

The Lagrange Multiplier method is an extension of the $\chi^{2}$
minimization procedure that relates
the range of variation of a physical observable $X$ dependent upon
the PDFs, to the variation of the function $\chi_{\mathrm{global}}^{2}(a)$
that is used to judge the goodness of fit
of the PDFs to the experimental data and PQCD.

\subsection{The Method}%
\label{sec:Method}

The method was introduced in \cite{RunII,pdfuc0}.
The starting point is to perform a
global analysis as described in Sec.~\ref{sec:Global},
by minimizing the function $\chi_{\mathrm{global}}^{2}(a)$
defined by Eq.~(\ref{eq:Chi2global}), thus
generating a set of PDFs that represents the best estimate
consistent with current experiment and theory.
We call this set the ``standard set''\footnote{%
This standard set is very similar to the published
CTEQ5M1 set \cite{cteq5}.}, denoted $S_{0}$.
The parameter values that characterize this set will be
denoted by $\{a^{(0)}\}\equiv \{a^{(0)}_{i};\,i=1,\dots,d\}$;
and the absolute minimum of $\chi_{\mathrm{global}}^{2}$
will be denoted by $\chi_{0}^{2}$.
Now, let $X$ be a
particular physical quantity of interest.
It depends on the PDFs, $X=X(a)$, and
the best estimate (or prediction)
of $X$ is $X_{0}=X(a^{(0)})$.
We will assess the {\em uncertainty} of this predicted value
by a two-step analysis.
First, we use the Lagrange Multiplier method to determine how
the minimum of $\chi _{\mathrm{global}}^{2}(a)$ increases,
\textit{i.e.,} how the quality of the fit to the
global data set decreases, as $X$ deviates from the best estimate $X_{0}$.
Second, in Sec.~\ref{sec:quantifying}, we analyze the appropriate
tolerance of $\chi_{\mathrm{global}}^{2}$.

As explained in \cite{RunII,pdfuc0}, the first step is taken
by introducing a Lagrange multiplier variable $\lambda $,
and minimizing the function
\begin{equation}
\Psi (\lambda ,a)=\chi _{\mathrm{global}}^{2}(a)+\lambda X(a)
\label{eq:LMF}
\end{equation}
with respect to the original $d$ parameters $\{a\}$ for fixed
values of $\lambda$.
In practice we minimize $\Psi(\lambda,a)$ for many values
of the Lagrange multiplier:
$\lambda_{1},\lambda_{2},\dots, \lambda_{M}$.
For each specific value $\lambda_{\alpha}$, the minimum of
$\Psi(\lambda_{\alpha},a)$
yields a set of parameters $\{a_{\mathrm{min}}(\lambda_{\alpha})\}$,
for which we evaluate the observable $X$ and the related
$\chi^{2}_{\rm global}$.
We use the shorthand
$(X_{\alpha},\chi^{2}_{{\rm global},\alpha})$ for this pair.
$\chi^{2}_{{\rm global},\alpha}$ represents the lowest achievable
$\chi^{2}_{{\rm global}}$, for the global data, for which $X$ has
the value $X_{\alpha}$, taking into account all possible PDFs
in the {\em full $d$-dimensional parameter space} of points $\{a\}$.
In other words, the result $\{a_{\mathrm{min}}(\lambda_{\alpha})\}$
is a {\em constrained fit}---with $X$ constrained to be $X_{\alpha}$.
We can equivalently say that $X_{\alpha}$ is an extremum of $X$
if $\chi^{2}_{\rm global}$ is constrained to be
$\chi^{2}_{{\rm global},\alpha}$.
We denote the resulting set of PDFs by $S_{\alpha}$.

We repeat the calculation for many values of $\lambda$,
following the chain
\[
\lambda_{\alpha}  \longrightarrow  \mathrm{min}
\left[\Psi(\lambda_{\alpha},a)\right]
\longrightarrow a_{\rm min}(\lambda_{\alpha})
\longrightarrow X_{\alpha} {\rm ~and~}
\chi^{2}_{{\rm global},\alpha}
\]
for $\alpha=1, 2, 3,\dots,M$.
The result is a parametric relationship between $X$
and $\chi^{2}_{\rm global}$, through $\lambda$.
We call this function $\chi^{2}_{\rm global}(X)$; so $\chi^{2}_{\rm
global}(X_{\alpha})=\chi^{2}_{{\rm global},\alpha}$ is the minimum of
$\chi^{2}_{\rm global}(a)$ when $X$ is constrained to be $X_{\alpha}$.
The absolute minimum of $\chi^{2}_{\rm global}$, which we denote
$\chi^{2}_{0}$, is the minimum of $\Psi(\lambda=0,a)$, occurring
at $\{a\}=\{a^{(0)}\}$.
Thus the procedure generates a set of optimized sample PDFs along the
curve of maximum variation of the physical variable $X$ in the
$d$-dimensional PDF parameter space (with $d=16$ in our case).
These PDF sets $\{S_{\alpha}\}$ are exactly what is needed to assess
the range of variation of $X$ allowed by the data.
In other words, the Lagrange Multiplier method provides
optimal PDFs tailored to the physics problem at hand,
in contrast to an alternative method \cite{GieleEtal}
that generates a large sample of PDFs by the Monte Carlo method.
The underlying ideas of these two complementary approaches
are illustrated in the plot on the left side of
Fig.~\ref{fig:pedagogy}.
\figPedagogy
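The $\lambda$-scan described above can be made concrete with a toy
example, in which a quadratic stand-in replaces
$\chi^{2}_{\mathrm{global}}(a)$ and a linear one replaces $X(a)$; none
of the numbers below come from the actual global fit.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
d = 16                                   # dimension of the PDF parameter space
a0 = rng.normal(size=d)                  # toy best-fit parameters a^(0)
H = np.diag(rng.uniform(1.0, 5.0, d))    # toy (diagonal, positive) Hessian

def chi2_global(a):
    """Toy stand-in for chi^2_global(a): quadratic around a0."""
    da = a - a0
    return da @ H @ da

c = rng.normal(size=d)
def X(a):
    """Toy observable, linear in the parameters."""
    return float(c @ a)

# The chain lambda -> min[Psi(lambda, a)] -> a_min(lambda)
#                  -> (X_alpha, chi2_global_alpha):
pairs = []
for lam in np.linspace(-2.0, 2.0, 9):
    res = minimize(lambda a, lam=lam: chi2_global(a) + lam * X(a), a0)
    pairs.append((X(res.x), chi2_global(res.x)))
```

For this quadratic/linear toy the constrained minima trace an exact
parabola in $X$, the behavior seen for $\sigma_{W}$ in the next
subsection.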

$\chi_{\mathrm{global}}^{2}(X)$ is the lowest achievable
value of $\chi_{\mathrm{global}}^{2}(a)$ for the value $X$ of the
observable, where $\chi_{\mathrm{global}}^{2}(a)$ represents our
measure of the goodness-of-fit to the global data.
Therefore the allowed range of $X$,
say from $X_{0}-\Delta{X}$ to $X_{0}+\Delta{X}$,
corresponding to a chosen tolerance of the goodness of fit
$\Delta\chi_{\mathrm{global}}^{2}
=\chi_{\mathrm{global}}^{2}-\chi_{0}^{2}$,
can be determined by examining
a graph of $\chi _{\mathrm{global}}^{2}$ versus $X$,
as illustrated in the plot on the right side of Fig.\ \ref{fig:pedagogy}.
This method for calculating $\Delta{X}$ may be more robust and reliable
than the traditional error propagation because it does not approximate
$X(a)$ and $\chi_{\mathrm{global}}^{2}(a)$ by linear and
quadratic dependence on $\{a\}$, respectively,
around the minimum.

Although the parameters $\{a\}$ do not appear explicitly in this
analysis, the results do depend, in principle, on the choice of
parameter space (including the dimension, $d$) in which the
minimization takes place.
In practice, if the degrees of freedom represented
by the parametrization are chosen to match the constraining power of
the global data sets used, which must be true for a sensible global
analysis, the results are quite stable with respect to changes
in the parametrization choices.
The sensitivity to these choices is tested,
as part of the continuing effort to improve the global analysis.

The discussion so far has left open this question:
What is the appropriate tolerance $\Delta\chi_{\mathrm{global}}^{2}$
to define the ``error'' of the prediction $X_{0}$?
This question will be addressed in Sec.~\ref{sec:quantifying}.

Our method can obviously be generalized to study the uncertainties of a
collection of physical observables $(X_{1},\,X_{2},\dots,X_{s})$ by
introducing a separate Lagrange multiplier for each observable.
Although the principle stays the same, the amount of computational
work increases dramatically with each additional observable.

\subsection{A case study: the ${W}$ cross section}

\label{sec:wXsec}

In this subsection we examine the cross section $\sigma_{W}$
for inclusive $W^{\pm}$ production at the Tevatron
($p\overline{p}$ collisions at $\sqrt{s}=1.8$\,TeV)
to illustrate the method and to lay the groundwork for the
quantitative study of uncertainties to be given in Sec.~\ref{sec:quantifying}.
Other examples will be described in Sec.~\ref{sec:MoreExamples}.
Preliminary results of this section have been reported previously
\cite{RunII,pdfuc0}.

Until recently the only method for assessing the uncertainty of
$\sigma_{W}$ due to PDFs has been to compare the calculated values
obtained from a number of different PDFs, as illustrated in
Fig.~\ref{fig:WprodPDFcomparison}, in which the plots are taken
from existing literature.\footnote{%
These plots show the product of $\sigma_{W}$ times a leptonic
branching ratio, which is what is measured experimentally.
The branching ratio $B$ has some experimental error.
For studying the uncertainties of $\sigma_{W}$, we will
focus on $\sigma_{W}$ itself in the rest of the paper.}
The PDFs used in these comparisons are either the
``best fits'' from different global analysis groups \cite{cteq5, MRST}
(hence are not pertinent to uncertainty studies) or are chosen by some
simple intuitive criteria \cite{MRST2}.
The meaning and reliability of the resulting range of $%
\sigma _{W}$ are not at all clear. Furthermore, these results do not provide
any leads on how the uncertainties can be improved in the future. The Lagrange
Multiplier technique provides a systematic method
to address and remedy both of these problems.%
\figWprodPDFcomparison

Let the physical quantity $X$ of the last subsection be the cross section
$\sigma_{W}$ for $W^{\pm}$ production at the Tevatron. Applying the Lagrange
method, we obtain the constrained minimum of $\chi _{\mathrm{global}}^{2}$ as a
function of $\sigma _{W}$, shown as solid points in Fig.~\ref{fig:Wprod}. The
best estimate value, {\it i.e.,} the prediction for the standard set $S_0$,
is $\sigma_{W0}=21.75$\,nb.%
\figCsqvsWTev%
The curve is a polynomial fit to the points
to provide a smooth representation of the continuous
function $\chi^{2}_{\rm global}(X)$.
We see that all the sample PDF sets obtained by this method lie
on a smooth quasi-parabolic curve with the best-fit value at the
minimum.
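Schematically, the smoothing amounts to a low-order polynomial fit;
in the sketch below both columns are invented placeholder numbers,
not the actual fit results.

```python
import numpy as np

# Placeholder (X, chi2_global) pairs standing in for the constrained fits:
x = np.array([2.294, 2.321, 2.356, 2.374, 2.407, 2.431, 2.450])
chi2 = np.array([1288.0, 1260.0, 1244.0, 1241.0, 1250.0, 1270.0, 1296.0])

# Quadratic fit giving a smooth representation of chi2_global(X):
coef = np.polyfit(x, chi2, 2)
x_min = -coef[1] / (2.0 * coef[0])   # location of the fitted minimum
```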

As discussed earlier (in Fig.\ \ref{fig:pedagogy}) points on the curve
represent our sample of optimal PDFs relevant to the determination
of the uncertainty of $\sigma_{W}$.
To quantify this uncertainty, we need to reach beyond the effective
$\chi^{2}_{\rm global}$ function, and establish the confidence levels for
these ``alternative hypotheses'' with respect to the experimental data
sets used in the global analysis.

%%Section 4

\section{Quantifying the Uncertainty}

\label{sec:quantifying}

Consider a series of sample PDF sets along the curve
$\chi _{\mathrm{global}}^{2}(X)$ of Fig.~\ref{fig:Wprod}
denoted by $\{S_{\alpha};\alpha=0,1,\dots,M\}$
where $S_{0}$ is the standard set.
These represent ``alternative hypotheses'' for the true PDFs,
and we wish to evaluate the likelihoods associated with
these alternatives.
To do so, we go back to the individual experiments and,
in each case, perform as detailed a statistical analysis
as is permitted with available information from that experiment.
After we have obtained meaningful estimates of the ``errors'' of
these candidate PDFs with respect to the individual experiments,
we shall try to combine this information into a global
uncertainty measure in the form of $\Delta X$
and $\Delta \chi_{\rm global}^{2}$.

\tblDatSet
The experimental data sets included in our global analysis are listed in
Table \ref{tbl:DatSet}.  For some of these experiments,
information on correlated systematic errors is available (albeit usually in
unpublished form).
For these, statistical inference should be drawn from a more accurate
$\chi_{n}^{2}$ function
than the simple formula Eq.~(\ref{eq:Chi2n}) used for the global fit.
In particular, if $\sigma _{nI}$ is the uncorrelated error
and $\{\beta _{kI};\,k=1,2,\dots,K\}$ are the coefficients of $K$ distinct
correlated errors associated with the data point $I$, then an
appropriate formula for the $\chi_{n}^{2}$ function is
\begin{equation}
\chi_{n}^{2}=\sum_{I}\frac{(D_{nI}-T_{nI})^{2}}{\sigma_{nI}^{2}}%
-\sum_{k=1}^{K}\sum_{k^{\prime }=1}^{K}B_{k}\left( A^{-1}\right)
_{kk^{\prime }}B_{k^{\prime }}  \label{eq:chi_corr}
\end{equation}
where $B_{k}$ is a vector, and $A_{kk^{\prime }}$ a matrix, in $K$
dimensions:
\begin{equation}
B_{k}=\sum_{I}\beta _{kI}(D_{nI}-T_{nI})/\sigma_{nI}^{2}
\;\;;\;\;A_{kk^{\prime }}=\delta _{kk^{\prime }}
+\sum_{I}\beta_{kI}\beta _{k^{\prime }I}/\sigma _{nI}^{2}.
\label{eq:BandA}
\end{equation}
(The sum over $I$ here includes only the data from experiment $n$.)
Traditionally, $\chi^{2}_{n}$ is written in other ways,
{\it e.g.,} in terms of the inverse of the ($N\times N$)
variance matrix.
For experiments with many data points, the inversion of such
large matrices may lead to numerical instabilities, in addition
to being time-consuming.
Our formula (\ref{eq:chi_corr}) has a significant advantage in that
all the systematic errors are first combined (``analytically'') in
the definitions of $B_{k}$ and $A_{kk'}$.
Equation (\ref{eq:chi_corr}) requires only the inverse of the
much smaller ($K\times K$) matrix $A_{kk'}$.
($K$ is the number of distinct systematic errors.)
The derivation of these formulas is given in
Appendix \ref{sec:AppCorSys}.
Equation~(\ref{eq:chi_corr}) reduces to the minimum of
$\chi_{n}^{2}$ in Eq.~(\ref{eq:Chi2n}) with respect to
${\cal N}_{n}$ if the only correlated error is the overall
normalization error for the entire data set;
in that case $\beta_{I}=-\sigma^{N}_{n}D_{nI}$.
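As a numerical cross-check, Eqs.~(\ref{eq:chi_corr}) and
(\ref{eq:BandA}) can be compared directly against the traditional
covariance-matrix form, $\chi^{2}_{n}=r^{T}V^{-1}r$ with
$V_{IJ}=\sigma_{I}^{2}\delta_{IJ}+\sum_{k}\beta_{kI}\beta_{kJ}$;
all inputs below are randomly generated toy numbers.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 30, 4                          # data points, systematic-error sources
D = rng.normal(10.0, 1.0, N)          # toy data D_I
T = rng.normal(10.0, 1.0, N)          # toy theory values T_I
sig = rng.uniform(0.5, 1.5, N)        # uncorrelated errors sigma_I
beta = rng.normal(0.0, 0.3, (K, N))   # correlated-error coefficients beta_kI

# The systematic errors are combined analytically into B_k and A_kk':
r = D - T
B = beta @ (r / sig**2)                       # B_k
A = np.eye(K) + (beta / sig**2) @ beta.T      # A_kk'
chi2 = np.sum(r**2 / sig**2) - B @ np.linalg.solve(A, B)

# Traditional N x N covariance-matrix form, for comparison:
V = np.diag(sig**2) + beta.T @ beta
chi2_cov = r @ np.linalg.solve(V, r)
```

The agreement of the two forms is just the Sherman--Morrison--Woodbury
identity; the first form needs only a $K\times K$ solve.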

By using Eq.~(\ref{eq:chi_corr}), or
Eq.~(\ref{eq:Chi2n}) for cases where the correlations of
systematic errors are unavailable,
we obtain the best estimate on the range of
uncertainty permitted by available information
on each individual experiment.
We should note that the experimental data sets are continuously evolving.
Some data sets in Table \ref{tbl:DatSet} will soon be updated (ZEUS, H1) or replaced
(CCFR).\footnote{%
{\it Cf.}\ Talks presented by these collaborations at DIS2000 \emph{Workshop on
Deep Inelastic Scattering and Related Topics}, Liverpool, England, April
2000.}
In addition, most information on correlated systematic errors
is either unpublished or preliminary.
The results presented in the following analysis should therefore
be considered more as a demonstration of principle---as the first
application of our proposed method---rather than the final word
on the PDF uncertainty of the $W$ cross section.

\subsection{Uncertainty with respect to individual experiments}

As an example, we begin by comparing the $\{S_{\alpha}\}$
series for $\sigma _{W}$
at the Tevatron to the H1 data set \cite{NewH1}.
Results on correlated systematic errors are available
for this data set,\footnote{%
These systematic errors are unpublished results,
but are made available to the public on the H1 Web page.
For convenience, we have approximated each of the pair
of 4 non-symmetrical errors by a single symmetric error.
The size of the resulting error on $\sigma _{W}$ inferred
from this evaluation is not affected by that approximation.}
and are incorporated in the calculation using Eq.~(\ref{eq:chi_corr}).
The number of data points in this set is $N_{H1}=172$.
The calculated values of $\chi _{H1}^{2}/N_{H1}$ are plotted
against $\sigma_{W}$ in Fig.~\ref{fig:H1Tev}. The curve is a smooth
interpolation of the points. The value of $\chi _{H1}^{2}/N_{H1}$ for the
standard set $S_{0}$ (indicated by a short arrow on the plot) is $0.975$;
and it is $0.970$ at the minimum of the curve.
These values are quite normal for data with
accurately determined measurement errors.
We can therefore apply standard statistics to calculate the $90$\%
confidence level on $\chi ^{2}/N$ for $N=172$. The result is shown as the
dashed horizontal line in Fig.~\ref{fig:H1Tev}.\figHoneTev
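The confidence level can be reproduced with standard statistical
libraries; for example, in \texttt{scipy} (the numerical value, about
$1.14$, is our own evaluation, not a number quoted from the fit):

```python
from scipy.stats import chi2

N = 172                         # number of H1 data points
# 90th percentile of the chi-squared distribution with N degrees of
# freedom, expressed per data point:
cl90 = chi2.ppf(0.90, N) / N    # roughly 1.14 for N = 172
```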

We have similarly calculated $\chi_{n}^{2}/N_{n}$ including information
on the correlations of systematic errors for the BCDMSp data set.
The results are similar to the H1 results, except that the values
of $\chi^{2}_{n}/N_{n}$ are all larger than $1.12$, which is large
for $N=168$ data points.
This is a familiar problem in data analysis, and it is
encountered in several other data sets in this global analysis
({\it cf.}\ below).
The $\chi_{n}^{2}/N_{n}$ calculation including correlations of the errors
is also done for the D0 and CDF jet cross sections.\footnote{%
The measurement errors of the jet cross sections
are dominated by systematic errors, so the error correlation
matrices are used for $\chi^{2}_{n}$ of these experiments
even in $\chi^{2}_{\mathrm{global}}$.}
For those experiments that have only provided (effective) uncorrelated errors,
we must rely on Eq.~(\ref{eq:Chi2n}) for our error calculation, since that
represents the best information available.

In order to obtain usable likelihood estimates from all the data sets, one must
address the problem mentioned in the previous paragraph:
Even in a ``best fit'', the values of\ $\chi ^{2}$ per data
point, $\chi_{n}^{2}/N_{n},$ for individual experiments vary considerably
among the established experiments (labeled by $n$).
Specifically, $\chi_{n}^{2}/N_{n}$ ranges from $1.5$--$1.7$
(for ZEUS and CDFjet) on the high end to $0.5$--$0.7$
(for some Drell-Yan experiments) on the low end in all good fits.
Considering the fact that some of these data sets contain close to $200$
points, the range of variation is huge from the viewpoint
of normal statistics:
experiments with $\chi_{n}^{2}/N_{n}$ deviating from $1.0$
by a few times $\sqrt{2/N_{n}}$ in either direction would have
to be ruled out as extremely unlikely \cite{Kosower98}.
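For example, one ``sigma'' of $\chi^{2}_{n}/N_{n}$ is
$\sqrt{2/N_{n}}\approx 0.11$ for $N_{n}=172$, so a value of $1.5$
lies more than four standard deviations high. A quick check (our
own, using \texttt{scipy}):

```python
import math
from scipy.stats import chi2

N = 172
sigma = math.sqrt(2.0 / N)      # one "sigma" of chi^2/N for N d.o.f.
# Survival probability of chi^2/N = 1.5 under normal statistics:
p = chi2.sf(1.5 * N, N)
```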

The reasons for $\chi _{n}^{2}/N_{n}$ to deviate from $1.0$ in real
experiments are complex, and vary among experiments.
They are, almost by definition, not understood, since
otherwise the errors would have been corrected and
the resulting $\chi^2$ would become consistent with
the expected statistical value.
Under these circumstances, a commonly adopted pragmatic
approach is to focus on the relative $\chi^{2}$ values with
respect to some appropriate reference $\chi^{2}$.\footnote{%
The alternative is to take the \emph{absolute} values of
$\chi_{n}^{2}$ seriously, and hence only work with alternative
hypotheses and experiments that are both self-consistent
({\it i.e.,} have $|\chi_{n}^{2}/N_{n}-1|\lesssim \sqrt{2/N_{n}}$)
and mutually compatible in the strict statistical sense
({\it i.e.,} have overlapping likelihood functions).
Since few of the precision DIS experiments are compatible in this sense,
one must then abandon global analysis, and work instead with several
distinct (and mutually exclusive) analyses based on different
experiments.}
Accordingly, in the context of performing {\em global} QCD analysis,
we adopt the following procedure.
For each experiment (labeled by $n$):

\begin{Simlis}[]{0.5em}
\item (i) Let $\chi^{2}_{n,0}$ denote the value of $\chi^{2}_{n}$
for the standard set $S_{0}$.
We assume $S_{0}$ is a viable reference set.
Because $\chi^{2}_{n,0}$ may be far from a likely value
for random errors, we {\em rescale} the values of
$\chi^{2}_{n,\alpha}$ (for $\alpha=0, 1, 2,\dots, M$)
by a factor $C_{n0}$, calling the result
$\overline{\chi}^{2}_{n,\alpha}$:
\begin{equation}\label{eq:rescale}
\chi^{2}_{n,\alpha} \longrightarrow
\overline{\chi}^{2}_{n,\alpha} \equiv
C_{n0}\,\chi^{2}_{n,\alpha}.
\end{equation}
The constant $C_{n0}$ is chosen such that, for the standard set,
$\overline{\chi}^{2}_{n,0}$ assumes the most probable value for
a chi-squared variable:
$\overline{\chi}^{2}_{n,0} \! = \! \xi_{50} \equiv$ the 50$^{th}$ percentile
of the chi-squared distribution $P(\chi^{2},N_{n})$
with $N_{n}$ degrees of freedom, defined by
\begin{equation}
\int_{0}^{\xi_{50}} P(\chi^{2},N_{n}) d\chi^{2} = 0.50.
\end{equation}
(If $N_{n}$ is large then $\xi_{50} \! \approx \! N_{n}$.)
The rescaling constant $C_{n0}$ is thus $\xi_{50}/\chi^{2}_{n,0}$.
For random errors the probability that $\chi^{2} \! < \! \xi_{50}$
(or $> \! \xi_{50}$) is 50\%.
For those experiments whose $\chi^{2}_{n,0}$ deviates significantly
from $\xi_{50}$,
this rescaling procedure is meant to
provide a simple (but crude) way to correct for the unknown correlations
or unusual fluctuations.

\item (ii) We then examine the values of
$\overline{\chi}^{2}_{n,\alpha}$ for the alternative sets
$S_{\alpha}$ with $\alpha=1, 2, \dots, M$,
using $\overline{\chi}^{2}_{n,\alpha}-\overline{\chi}^{2}_{n,0}$
to compute the statistical likelihood of the
alternative hypothesis $S_{\alpha}$ with respect to the
data set $n$, based on the chi-squared distribution
with $N_{n}$ data points.
\end{Simlis}
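The rescaling in step (i) can be sketched numerically. The following is a minimal illustration (not the code used in this analysis): it computes the chi-squared CDF from the regularized lower incomplete gamma function, inverts it by bisection to obtain the median $\xi_{50}$, and forms $C_{n0}=\xi_{50}/\chi^{2}_{n,0}$; the values $N_{n}=172$ and $\chi^{2}_{n,0}=240$ are hypothetical inputs chosen for the example.

```python
import math

def chi2_cdf(x, k):
    """CDF of a chi-squared variable with k degrees of freedom,
    i.e. the regularized lower incomplete gamma function P(k/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, z = 0.5 * k, 0.5 * x
    log_pref = -z + a * math.log(z) - math.lgamma(a)
    if z < a + 1.0:
        # power-series expansion of P(a, z)
        term = total = 1.0 / a
        n = a
        while abs(term) > 1e-14 * total:
            n += 1.0
            term *= z / n
            total += term
        return total * math.exp(log_pref)
    # Lentz-style continued fraction for the upper tail Q(a, z)
    tiny = 1e-300
    b = z + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return 1.0 - h * math.exp(log_pref)

def chi2_percentile(p, k):
    """Invert chi2_cdf by bisection: returns xi_p with CDF(xi_p) = p."""
    lo, hi = 0.0, 10.0 * k + 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if chi2_cdf(mid, k) < p else (lo, mid)
    return 0.5 * (lo + hi)

# Hypothetical experiment: N_n = 172 points, chi^2_{n,0} = 240 in the fit S_0.
N_n, chi2_n0 = 172, 240.0
xi50 = chi2_percentile(0.50, N_n)   # median; close to N_n for large N_n
C_n0 = xi50 / chi2_n0               # rescaling constant
chi2_bar = C_n0 * chi2_n0           # rescaled value; equals xi50 by construction
```

By construction the rescaled $\overline{\chi}^{2}_{n,0}$ sits exactly at the median of the distribution, so the subsequent likelihood comparisons in step (ii) depend only on the shifts $\overline{\chi}^{2}_{n,\alpha}-\overline{\chi}^{2}_{n,0}$.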
This procedure does not affect the results presented earlier for the H1
experiment, since $\chi^{2}_{n,0}/N_{n}$ is already very close to $1$
for that experiment.

Before presenting the results of the likelihood calculation, it is
interesting to examine, in Fig.~\ref{fig:allTev},
the differences
$\Delta \chi_{n,\alpha}^{2}=\chi_{n,\alpha}^{2}-\chi_{n,0}^{2}$
(before rescaling) versus $\sigma_{W}$ for the 15 data sets.
(N.B.\ The vertical scales of the various plots are not the same,
due to the large variations in the value of $\Delta \chi^{2}_{n,\alpha}$
for different experiments.)
The ordering of the experiments in Fig.\ \ref{fig:allTev}
is the same as in Table \ref{tbl:DatSet}, with experiments ordered by
process (DIS, DY, $W$ and jet production).
It is clear from these graphs that the DIS experiments place
the strongest constraints on $\sigma _{W}$, because they have
the largest $\Delta \chi_{n}^{2}$ for the same $\Delta \sigma _{W}$.
This is to be expected since quark-antiquark annihilation makes the
dominant contribution to $\sigma_{W}$.
We also observe that most experiments place some constraint on
$\sigma_{W}$ on both sides, but a few bound it on one side only.
Globally, as shown in Fig.~\ref{fig:Wprod}, the combined constraints
give rise to a classic parabolic behavior for
$\chi^{2}_{\rm global}(\sigma _{W})$.
\figallTev

To estimate the statistical significance of the
individual $\chi_{n}^{2}$ increases,
we assume that the rescaled variable
$\overline{\chi}^{2}_{n}$ obeys a chi-squared distribution
$P(\chi^{2},N_{n})$ for $N_{n}$ data points.
Thereby, we estimate the
value of ${\overline{\chi}}_{n}^{2}$ that corresponds to
the 90\% confidence level (CL) uncertainty for $\sigma _{W}$
(with respect to experiment $n$) from the formula
$\overline{\chi}_{n}^{2}=\xi_{90}$, where $\xi_{90}$ is
the 90$^{th}$ percentile defined by
\begin{equation}\label{eq:xi90}
\int_{0}^{\xi_{90}} P(\chi^{2},N_{n}) d\chi^{2} = 0.90.
\end{equation}
For example,
Fig.~\ref{fig:chsqdist} shows the chi-squared distribution
$P(\chi^{2},N_{n})$ for $N_{n}=172$, the number of data
points in the H1 data set.
The 50$^{th}$ and 90$^{th}$ percentiles are indicated.
We choose a conservative 90\% CL because
there are other theoretical and phenomenological uncertainties not taken
into account by this analysis.
\figchsqdist

To summarize our procedure, an alternative PDF set
$S_{\alpha}$ lies within the 90\% CL for experiment $n$ if it has
$\overline{\chi}^{2}_{n,\alpha}<\xi_{90}$; that is, if
\begin{equation}
\frac{\chi^{2}_{n,\alpha}}{\chi^{2}_{n,0}}
< \frac{\xi_{90}}{\xi_{50}}.
\end{equation}
We judge the likelihood of $S_{\alpha}$ from the {\em ratio}
of $\chi^{2}_{n,\alpha}$ to the reference value $\chi^{2}_{n,0}$,
rather than from the absolute magnitude.
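As a rough numerical cross-check of this ratio criterion, the percentiles $\xi_{50}$ and $\xi_{90}$ can be approximated by the standard Wilson--Hilferty formula, $\xi_{p}\approx N\,(1-2/(9N)+z_{p}\sqrt{2/(9N)})^{3}$, where $z_{p}$ is the corresponding standard-normal percentile. The sketch below evaluates the threshold ratio $\xi_{90}/\xi_{50}$ for $N_{n}=172$ (the H1 data-set size quoted above); it is an illustrative approximation, not the exact percentiles.

```python
import math

def chi2_percentile_wh(z_p, k):
    """Wilson-Hilferty approximation to a chi-squared percentile:
    xi_p ~ k * (1 - 2/(9k) + z_p * sqrt(2/(9k)))**3,
    where z_p is the standard-normal percentile."""
    c = 2.0 / (9.0 * k)
    return k * (1.0 - c + z_p * math.sqrt(c)) ** 3

N_n = 172                      # number of points in the H1 data set
z50, z90 = 0.0, 1.2816         # standard-normal 50th and 90th percentiles
xi50 = chi2_percentile_wh(z50, N_n)   # ~ N_n - 2/3 for large N_n
xi90 = chi2_percentile_wh(z90, N_n)
ratio = xi90 / xi50            # threshold for chi^2_{n,alpha} / chi^2_{n,0}
```

For $N_{n}=172$ the ratio comes out near $1.14$, i.e.\ a set $S_{\alpha}$ is allowed at 90\% CL by this experiment if its $\chi^{2}_{n,\alpha}$ exceeds the reference $\chi^{2}_{n,0}$ by less than roughly 14\%.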
The horizontal lines in Fig.~\ref{fig:allTev} correspond to the
values of $\Delta\chi_{n}^{2}$ obtained in this way.
Finally, from the intercepts of the line with the interpolating
curve in each plot in Fig.~\ref{fig:allTev}, we obtain an
estimated uncertainty range of $\sigma _{W}$ from each individual
experiment.
The results are presented collectively in Fig.~\ref{fig:rangesTev},
where, for each experiment, the point ($\bullet $) is the value of
$\sigma _{W}$ for which $\chi _{n}^{2}$ is minimum, and the error
bar extends across the 90\% CL based on that data set.
%
\figrangesTev%

The uncertainty ranges shown in Fig.\ \ref{fig:rangesTev}
with respect to individual experiments represent the
most definitive results of our study, in the sense that the input and the
assumptions can be stated clearly and the analysis is quantitative within
the stated framework.
It is natural to proceed further and estimate a global measure of
$\Delta\sigma_{W}$ and the corresponding $\Delta\chi^{2}_{\rm global}$.
This last step is, however, less well-defined and
requires some subjective judgement.

\subsection{The Global Uncertainty}
\label{subsec:globaluc}

It should be emphasized that the ranges shown by the error bars in
Fig.~\ref{fig:rangesTev} are not errors determined independently by each experiment;
rather they represent the ranges allowed by the experiments for alternative
{\em global} fits $\{S_{\alpha}\}$. For this reason, and others related to the
rescaling of $\chi ^{2}$ mentioned earlier as well as approximations inherent
in many of the original published
errors,\footnote{%
For instance, the single uncorrelated systematic error associated with each
data point, which is the only systematic error given for most experimental data
sets, is clearly only an ``effective uncorrelated error'' which qualitatively
represents the effects of the many sources of systematic error, some of which
are really correlated.} it is not obvious how to {\em combine} these errors.
We refer to the ranges in Fig.\ \ref{fig:rangesTev} by the generic term
{\em local} ({\it i.e.,} single-experiment) {\em uncertainties}.
On a qualitative level, Fig.~\ref{fig:rangesTev} exhibits the same features
seen earlier in Fig.~\ref{fig:allTev}: (i) the quark-dominated DIS
experiments give the smallest error bars;
and (ii) a few experiments only set bounds on one side,
while the rest limit the range in both directions.
In addition, Fig.~\ref{fig:rangesTev} gives us an overall view which
clearly shows that $\sigma_{W}$ is well constrained in the global
analysis, and the experimental bounds are consistent with each other.

The important question is how to provide a sensible measure of the overall
uncertainty in view of the complexity of the problem already described. The
situation here is not unlike the problem of assigning an overall systematic
error to an experimental measurement. Figure \ref{fig:rangesTev} shows a set of
90\% CL ranges for $\sigma_{W}$ from different sources, but these ranges are
highly correlated, because the alternative hypotheses being tested come from
global fits. The final uncertainty
must be a reasonable intersection of these ranges.

We will state an algorithm for obtaining the final uncertainty
measure of $\sigma_{W}$ based on Fig.\ \ref{fig:rangesTev}.
The same algorithm can be applied in the future for predictions
of other observables.
It has two parts:
(1) Determine the central value using all the experiments;
that is the solid line in Fig.\ \ref{fig:rangesTev}.
(2) Then take the {\em intersection} of the error
ranges as the combined uncertainty.
But in calculating the intersection, experiments below the
central value are used only for setting the lower bound, and experiments
above the central value are used only for setting the upper bound.
With this algorithm, experiments that permit a large range
of $\sigma_{W}$,
{\it i.e.,} that depend on aspects of the PDFs that are not
sensitive to the value of $\sigma_{W}$,
will not affect the final uncertainty measure (as they should not).
According to this algorithm,
the result for the uncertainty of $\sigma_{W}$ is
$20.9\,{\rm nb} \! < \! \sigma_{W} < 22.6\,{\rm nb}$.
These bounds are approximately $\pm{4}$\% deviations
from the prediction (21.75\,nb) and so we quote
a $\pm{4}$\% uncertainty in $\sigma_{W}$ due
to PDFs.
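The two-part combination algorithm can be written as a short routine. The sketch below encodes one reading of the intersection rule, in which each lower limit below the central value competes for the combined lower bound and each upper limit above it for the combined upper bound; the ranges used are hypothetical illustrative numbers, not the fitted values of Fig.~\ref{fig:rangesTev}, and `None` marks a side on which an experiment sets no bound.

```python
def combine_ranges(central, ranges):
    """Intersection rule: among lower limits below the central value,
    take the largest as the combined lower bound; among upper limits
    above the central value, take the smallest as the combined upper
    bound.  None means no bound on that side."""
    lower = max(lo for lo, hi in ranges if lo is not None and lo < central)
    upper = min(hi for lo, hi in ranges if hi is not None and hi > central)
    return lower, upper

# Hypothetical 90% CL ranges (nb) from four "experiments":
ranges = [(20.5, 22.9), (20.9, 23.5), (None, 22.6), (19.8, None)]
central = 21.75
bounds = combine_ranges(central, ranges)   # -> (20.9, 22.6) for these inputs
```

An experiment with a very wide (or one-sided) range never supplies the extremal limit in either maximum or minimum, so, as intended, it does not tighten the final uncertainty measure.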

Now we may determine the increase in $\chi^{2}_{\rm global}$
that corresponds to our estimated uncertainty
$\Delta\sigma_{W}$ in the $\sigma_{W}$ prediction.
Referring to Fig.\ \ref{fig:Wprod},
a deviation of $\sigma_{W}$ by $\pm{4}$\% from the minimum
corresponds to an increase
$\Delta\chi^{2}_{\rm global} \! \approx \! 180$.
That is, $\Delta\chi^{2}_{\rm global}$ in
Fig.\ \ref{fig:pedagogy} is 180.
In other words, this analysis finds that, along the direction of
maximum variation of $\sigma_{W}$, a PDF set with
$\Delta\chi^{2}_{\rm global} \gtrsim 180$ violates some experimental
constraints.

\subsection{Comments}

We should point out that the above uncertainty estimate,
$\Delta\sigma_{W}/\sigma_{W} \sim 4$\%,
represents only a lower bound on the true uncertainty,
since many other sources of error have not yet been
included in the analysis: theoretical ones such as
QCD higher order and resummation effects, power-law corrections,
and nuclear corrections.
These need to be taken into consideration in a
full investigation of the uncertainties, but
that goes beyond the scope of this paper.%
\footnote{%
Because there are these additional sources of uncertainty,
we have used 90\% CL's, rather than 68\% CL's,
to calculate the error.} %
We shall add only two remarks which are more directly related
to our analysis.

The first concerns a technical detail.
In the results reported so far, we have fixed the normalization
factors \{${\cal N}_{n}$\} in the definition of $\chi_{\rm global}^{2}$
(Eq.~(\ref{eq:Chi2global})) at their values determined in the
standard fit $S_{0}$.
If we let these factors float when we perform the
Lagrange Multiplier analysis, $\Delta \sigma _{W}$ will increase
noticeably compared to Fig.\ \ref{fig:Wprod} for the same
$\Delta \chi _{\rm global}^{2}$.
However, upon closer examination, this behavior can be easily
understood and it does not imply a real increase in the
uncertainty of $\sigma_{W}$.
The key observation is that the additional increase (or decrease)
in $\sigma_{W}$ is entirely due to a \emph{uniform} increase
(or decrease) of \{${\cal N}_{n}$\} for all the DIS experiments.
There is a simple reason for this: The values of the $q$ and $\bar{q}$
distributions in the relevant $x$ range (which determine the value of
$\sigma_{W}$) are approximately proportional to
\{${\cal N}_{n}$\}$_{DIS}$.
Although every experiment does have a normalization uncertainty,
it is certainly unlikely that the normalization factors of all the
\emph{independent} DIS experiments would shift in the \emph{same}
direction by the \emph{same} amount.
Hence we avoid this artificial effect by fixing \{${\cal N}_{n}$\}
at their ``best values'' for our study.
Allowing the factors \{${\cal N}_{n}$\} to vary {\em randomly}
(within the published experimental normalization uncertainties)
would not change our estimated value of $\Delta\sigma_{W}$
significantly.

The second remark concerns the choice of parametrization.
We have mentioned that even the robust Lagrange Multiplier method
depends in principle on the choice of the parton parameter space,
{\it i.e.,} on the choice of the functional forms used for the
nonperturbative PDFs at the low momentum scale $Q_{0}$.
To check how our answers depend on the choice of parametrization
in practice, we have done many similar calculations, using different
numbers of free parameters within the same functional form
({\it cf.}\ Appendix \ref{sec:AppPdfs}),
and using different functional forms for the factor
multiplying $x^{a}\left( 1-x\right)^{b}$.
We have not seen any dependence of the uncertainty estimates
on these changes.
Although more radical ways of parametrizing the nonperturbative
PDFs might affect the result more, we know of no such
parametrization that still provides an equally good fit to the
full data set.

