**Statistics Technical Reports**

**Term(s):**1996**Results:**32

**Title:**On the choice of m for the m out of n bootstrap in hypothesis testing**Author(s):**Bickel, Peter J.; Ren, Jian-Jian; **Date issued:**May 1996

http://nma.berkeley.edu/ark:/28722/bk000472j5v (PDF) **Keyword note:**Bickel__Peter_John Ren__Jian-Jian**Report ID:**476**Relevance:**100

**Title:**[Title unavailable]**Author(s):**Yu, Bin; **Date issued:**November 1996**Keyword note:**Yu__Bin**Report ID:**475**Relevance:**100

**Title:**Smoothing Spline Models for the Analysis of Nested and Crossed Samples of Curves**Author(s):**Brumback, Babette A.; Rice, John A.; **Date issued:**Nov 1996

http://nma.berkeley.edu/ark:/28722/bk0000n368j (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n3693 (PostScript) **Abstract:**We introduce a class of models for an additive decomposition of groups of curves stratified by crossed and nested factors,
generalizing smoothing splines to such samples by associating them with a corresponding mixed effects model. The models are
also useful for imputation of missing data and exploratory analysis of variance. We prove that the best linear unbiased predictors
(BLUP) from the extended mixed effects model correspond to solutions of a generalized penalized regression where smoothing
parameters are directly related to variance components, and we show that these solutions are natural cubic splines. The model
parameters are estimated using a highly efficient implementation of the EM algorithm for restricted maximum likelihood (REML)
estimation based on a preliminary eigenvector decomposition. Variability of computed estimates can be assessed with asymptotic
techniques or with a novel hierarchical bootstrap resampling scheme for nested mixed effects models. Our methods are applied
to menstrual cycle data from studies of reproductive function that measure daily urinary progesterone; the sample of progesterone
curves is stratified by cycles nested within subjects nested within conceptive and non-conceptive groups.**Keyword note:**Brumback__Babette_Anne Rice__John_Andrew**Report ID:**474**Relevance:**100

**Title:**The $L_2$ Rate of Convergence for Event History Regression with Time-Dependent Covariates**Author(s):**Huang, Jianhua; Stone, Charles J.; **Date issued:**September 1996**Keyword note:**Huang__Jianhua Stone__Charles**Report ID:**473**Relevance:**100

**Title:**Accurate estimation of travel times from single-loop detectors**Author(s):**Petty, Karl; Bickel, Peter; Jiang, Jiming; Ostland, Michael; Rice, John; Ritov, Ya'cov; Schoenberg, Frederic; **Date issued:**Aug 1996

http://nma.berkeley.edu/ark:/28722/bk0000n345v (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n346d (PostScript) **Abstract:**As advanced traveler information systems become increasingly prevalent, the importance of accurately estimating link travel
times grows. Unfortunately, the predominant source of highway traffic information is single-trap loop detectors, which
do not directly measure vehicle speed. The conventional method of estimating speed, and hence travel time, from the single-trap
data is to make a common vehicle length assumption and to use a resulting identity relating density, flow, and speed. Hall
and Persaud (1989) and Pushkar, Hall, and Acha-Daza (1994) show that these speed estimates are flawed. In this paper we present
a methodology to estimate link travel times directly from the single-trap loop detector flow and occupancy data without heavy
reliance on the flawed speed calculations. Our methods arise naturally from an intuitive stochastic model of traffic flow.
We demonstrate by example on data collected on I-880 that when the loop detector data has a fine resolution (about one second),
the single-trap estimates of travel time can accurately track the true travel time through many degrees of congestion. Probe
vehicle data and double-trap travel time estimates corroborate the accuracy of our methods in our examples.**Keyword note:**Petty__Karl_F Bickel__Peter_John Jiang__Jiming Ostland__Michael_Anthony Rice__John_Andrew Ritov__Yaacov Schoenberg__Frederic_R**Report ID:**472**Relevance:**100

**Title:**The Feynman-Kac formula and decomposition of Brownian paths**Author(s):**Jeanblanc, M.; Pitman, J.; Yor, M.; **Date issued:**Sep 1996

http://nma.berkeley.edu/ark:/28722/bk0000n363s (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n364b (PostScript) **Abstract:**This paper describes connections between the Feynman-Kac formula, related Sturm-Liouville equations, and various decompositions
of Brownian paths into independent components.**Pub info:**Comput. Appl. Math. 16, 27-52, 1997**Keyword note:**Jeanblanc__Monique Pitman__Jim Yor__Marc**Report ID:**471**Relevance:**100

**Title:**On the lengths of excursions of some Markov processes**Author(s):**Pitman, Jim; Yor, Marc; **Date issued:**Aug 1996

http://nma.berkeley.edu/ark:/28722/bk0000n3604 (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n361p (PostScript) **Abstract:**Results are obtained regarding the distribution of the ranked lengths of component intervals in the complement of the random
set of times when a recurrent Markov process returns to its starting point. Various martingales are described in terms of
the Lévy measure of the Poisson point process of interval lengths on the local time scale. The martingales derived from
the zero set of a one-dimensional diffusion are related to martingales studied by Azéma and Rainer. Formulae are obtained
which show how the distribution of interval lengths is affected when the underlying process is subjected to a Girsanov transformation.
In particular, results for the zero set of an Ornstein-Uhlenbeck process or a Cox-Ingersoll-Ross process are derived from
results for a Brownian motion or recurrent Bessel process, when the zero set is the range of a stable subordinator.**Pub info:**Séminaire de Probabilités XXXI, 272-286, Lecture Notes in Math. 1655, Springer, 1997**Keyword note:**Pitman__Jim Yor__Marc**Report ID:**470**Relevance:**100

**Title:**On the relative lengths of excursions derived from a stable subordinator**Author(s):**Pitman, Jim; Yor, Marc; **Date issued:**Aug 1996

http://nma.berkeley.edu/ark:/28722/bk0000n357g (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n3581 (PostScript) **Abstract:**Results are obtained concerning the distribution of ranked relative lengths of excursions of a recurrent Markov process from
a point in its state space whose inverse local time process is a stable subordinator. It is shown that for a large class of
random times $T$ the distribution of relative excursion lengths prior to $T$ is the same as if $T$ were a fixed time. It
follows that the generalized arc-sine laws of Lamperti extend to such random times $T$. For some other random times $T$, absolute
continuity relations are obtained which relate the law of the relative lengths at time $T$ to the law at a fixed time.**Pub info:**Séminaire de Probabilités XXXI, 287-305, Lecture Notes in Math. 1655, Springer, 1997**Keyword note:**Pitman__Jim Yor__Marc**Report ID:**469**Relevance:**100

**Title:**Some extensions of Knight's identity for Brownian motion**Author(s):**Pitman, Jim; Yor, Marc; **Date issued:**August 1996**Keyword note:**Pitman__Jim Yor__Marc**Report ID:**468**Relevance:**100

**Title:**Laplace Transforms Related to Excursions of a One-dimensional Diffusion**Author(s):**Pitman, Jim; Yor, Marc; **Date issued:**Aug 1996

http://nma.berkeley.edu/ark:/28722/bk0000n354t (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n355c (PostScript) **Abstract:**Various known expressions in terms of hyperbolic functions for the Laplace transforms of random times related to one-dimensional
Brownian motion are derived in a unified way by excursion theory and extended to one-dimensional diffusions.**Pub info:**Bernoulli 5, 249-255, 1999**Keyword note:**Pitman__Jim Yor__Marc**Report ID:**467**Relevance:**100

**Title:**More on Recurrence and Waiting Times**Author(s):**Wyner, Abraham J.; **Date issued:**August 1996**Keyword note:**Wyner__Abraham_J**Report ID:**466**Relevance:**100

**Title:**Construction of Markovian Coalescents**Author(s):**Evans, Steven N.; Pitman, Jim; **Date issued:**Jul 1996

http://nma.berkeley.edu/ark:/28722/bk0000n341n (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n3426 (PostScript) **Abstract:**Partition-valued and measure-valued coalescent Markov processes are constructed whose state describes the decomposition of
a finite total mass $m$ into a finite or countably infinite number of masses with sum $m$, and whose evolution is determined
by the following intuitive prescription: each pair of masses of magnitudes $x$ and $y$ runs the risk of a binary collision
to form a single mass of magnitude $x+y$ at rate $\kappa(x,y)$, for some nonnegative, symmetric collision rate kernel $\kappa(x,y)$.
Such processes with finitely many masses have been used to model polymerization, coagulation, condensation, and the evolution
of galactic clusters by gravitational attraction. With a suitable metric on the state space, and under appropriate restrictions
on $\kappa$ and the initial distribution of mass, it is shown that such processes can be constructed as Feller or Feller-like
processes. A number of further results are obtained for the *additive coalescent* with collision kernel $\kappa(x,y) = x + y$.
This process, which arises from the evolution of tree components in a random graph process, has asymptotic properties
related to the stable subordinator of index $1/2$.**Pub info:**Ann. Inst. Henri Poincaré 34, 339-383, 1998**Keyword note:**Evans__Steven_N Pitman__Jim**Report ID:**465**Relevance:**100

**Title:**Confidence intervals with more power to determine the sign: two ends constrain the means**Author(s):**Benjamini, Y.; Hochberg, Y.; Stark, P. B.; **Date issued:**Jul 1996**Abstract:**We present two families of two-sided non-equivariant confidence intervals for the mean $\theta$ of a continuous, unimodal,
symmetric random variable that, compared with the conventional, symmetric, equivariant confidence interval, are shorter when
the observation is small, and restrict the sign of $\theta$ for smaller observations. One of the families, a modification
of Pratt's (1961) construction of intervals with minimal expected length when $\theta=0$, is longer than the conventional
symmetric interval when $|X|$ is large, and has longer expected length when $|\theta|$ is large. The other family gives the
conventional symmetric interval when $|X|$ is large, with a change to the smaller endpoint when $|X|$ is small. Its expected
length is less than that of the conventional symmetric interval when $|\theta|$ is small, larger for an intermediate range
of $|\theta|$, then approaches that of the conventional interval for large $|\theta|$. This slight modification of the conventional
two-sided interval has most of the power advantage of a one-sided interval, but short length.**Keyword note:**Benjamini__Yoav Hochberg__Y Stark__Philip_B**Report ID:**464**Relevance:**100

**Title:**Empirical Modeling of Extreme Events from Return-Volume Time Series in Stock Market**Author(s):**Bühlmann, Peter; **Date issued:**Jun 1996

http://nma.berkeley.edu/ark:/28722/bk0000n1s3m (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n1s45 (PostScript) **Abstract:**We propose the discretization of real-valued financial time series into a few ordinal values and use non-linear likelihood modeling
for sparse Markov chains within the framework of generalized linear models for categorical time series. We analyze
daily return and volume data and estimate the probability structure of the process of extreme lower, extreme upper and the
complementary usual events. Knowing the whole probability law of such ordinal-valued vector processes of extreme events of
return and volume allows us to quantify non-linear associations. In particular, we find a (new kind of) asymmetry in the return-volume
relationship which is a partial answer to a research issue given by Karpoff (1987). We also propose a simple prediction
algorithm which is based on an empirically selected model.**Keyword note:**Buhlmann__Peter**Report ID:**463**Relevance:**100

**Title:**Closure of linear processes**Author(s):**Bickel, Peter J.; Bühlmann, Peter; **Date issued:**Sep 1996

http://nma.berkeley.edu/ark:/28722/bk0000n1s7t (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n1s8c (PostScript) **Abstract:**We consider the sets of moving-average and autoregressive processes and study their closures under the Mallows metric and
the total variation convergence on finite dimensional distributions. These closures are unexpectedly large, containing non-ergodic
processes which are Poisson sums of i.i.d. copies from a stationary process. The presence of these non-ergodic Poisson
sum processes has immediate implications. In particular, identifiability of the hypothesis of linearity of a process is in
question. A discussion of some of these issues for the set of moving-average processes has already been given without
proof in Bickel and Bühlmann (1996). We establish here the precise mathematical arguments and present some additional
extensions: results about the closure of autoregressive processes and natural sub-sets of moving-average and autoregressive
processes which are closed.**Keyword note:**Bickel__Peter_John Buhlmann__Peter**Report ID:**462**Relevance:**100

**Title:**On Average Derivative Quantile Regression**Author(s):**Chaudhuri, Probal; Doksum, Kjell; Samarov, Alexander; **Date issued:**Apr 1996

http://nma.berkeley.edu/ark:/28722/bk0000n267h (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n2682 (PostScript) **Abstract:**Keywords: Average derivative estimate, transformation model, projection pursuit model, index model, survival analysis, heteroscedasticity,
reduction of dimensionality, quantile specific regression coefficients. For fixed $\alpha \in (0,1)$, the quantile regression
function gives the $\alpha$th quantile $\theta_{\alpha}(\mathbf{x})$ in the conditional distribution of a response variable
$Y$ given the value $\mathbf{X} = \mathbf{x}$ of a vector of covariates. It can be used to measure the effect of covariates not
only in the center of a population, but also in the upper and lower tails. A functional that summarizes key features of the
quantile specific relationship between $\mathbf{X}$ and $Y$ is the vector $\boldsymbol{\beta}_{\alpha}$ of weighted expected
values of the vector of partial derivatives of the quantile function $\theta_{\alpha}(\mathbf{x})$. In a nonparametric setting,
$\boldsymbol{\beta}_{\alpha}$ can be regarded as a vector of quantile specific nonparametric regression coefficients.
In survival analysis models (e.g. Cox's proportional hazard model, proportional odds rate model, accelerated failure time
model) and in monotone transformation models used in regression analysis, $\boldsymbol{\beta}_{\alpha}$ gives the direction
of the parameter vector in the parametric part of the model. $\boldsymbol{\beta}_{\alpha}$ can also be used to estimate
the direction of the parameter vector in semiparametric single index models popular in econometrics. We show that, under
suitable regularity conditions, the estimate of $\boldsymbol{\beta}_{\alpha}$ obtained by using the locally polynomial
quantile estimate of Chaudhuri (1991, *Annals of Statistics*) is $n^{1/2}$-consistent and asymptotically normal with asymptotic
variance equal to the variance of the influence function of the functional $\boldsymbol{\beta}_{\alpha}$. We discuss
how the estimate of $\boldsymbol{\beta}_{\alpha}$ can be used for model diagnostics and in the construction of a link
function estimate in general single index models.**Keyword note:**Chaudhuri__Probal Doksum__Kjell_Andreas Samarov__Alexander**Report ID:**461**Relevance:**100

**Title:**[Bias, Variance, and] Arcing Classifiers**Author(s):**Breiman, Leo; **Date issued:**Feb 1996**Date modified:**revised July, 1996

http://nma.berkeley.edu/ark:/28722/bk0000n2616 (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n262r (PostScript) **Abstract:**Recent work has shown that combining multiple versions of unstable classifiers such as trees or neural nets results in reduced
test set error. One of the more effective is bagging (Breiman [1996a]). Here, modified training sets are formed by resampling
from the original training set, classifiers are constructed using these training sets, and then combined by voting. Freund and
Schapire [1995,1996] propose an algorithm the basis of which is to adaptively resample and combine (hence the acronym--arcing)
so that the weights in the resampling are increased for those cases most often misclassified and the combining is done by
weighted voting. Arcing is more successful than bagging in test set error reduction. We explore two arcing algorithms, compare
them to each other and to bagging, and try to understand how arcing works. We introduce the definitions of bias and variance
for a classifier as components of the test set error. Unstable classifiers can have low bias on a large range of data sets.
Their problem is high variance. Combining multiple versions either through bagging or arcing reduces variance significantly.**Keyword note:**Breiman__Leo**Report ID:**460**Relevance:**100
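
The bagging procedure summarized in this abstract (resample the training set, fit one classifier per resample, combine by voting) can be illustrated with a minimal sketch. It is not taken from the report: the one-dimensional threshold "stump" base learner, the function names `fit_stump` and `bagging`, and the default of 25 resamples are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter

def fit_stump(points):
    """Fit a 1-D threshold rule (hypothetical base learner, not from the report).

    points is a list of (x, label) pairs. The rule predicts `left` when
    x <= t and `right` otherwise, choosing (t, left, right) to minimize
    training errors on the given points.
    """
    thresholds = sorted({x for x, _ in points})
    labels = sorted({y for _, y in points})
    best = None
    for t in thresholds:
        for left in labels:
            for right in labels:
                errors = sum((left if x <= t else right) != y for x, y in points)
                if best is None or errors < best[0]:
                    best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def bagging(points, n_models=25, seed=0):
    """Train n_models stumps on bootstrap resamples and combine by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap resample: draw len(points) cases with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict

# Toy data: class 0 for x in 0..9, class 1 for x in 10..19.
data = [(x, 0) for x in range(10)] + [(x, 1) for x in range(10, 20)]
predict = bagging(data)
```

Arcing, as described in the abstract, differs in that the resampling weights are increased for frequently misclassified cases and the votes are weighted; the sketch above covers only the unweighted bagging case.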

**Title:**Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data**Author(s):**Hoover, Don; Rice, John; Wu, Colin; Yang, Li-Ping; **Date issued:**Apr 1996

http://nma.berkeley.edu/ark:/28722/bk0000n245c (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n246x (PostScript) **Abstract:**This paper considers estimation of nonparametric components in a time-varying coefficient model with repeated measurements
of responses and covariates. The responses are modeled as depending linearly on the covariates, with coefficients that are
functions of time. The measurements are assumed to be independent for different subjects but can be correlated in an unspecified
way at different time points within each subject. Three nonparametric estimates, namely kernel, smoothing spline and locally
weighted polynomial, of the time-varying coefficients are derived for such repeatedly measured data. A cross-validation criterion
is proposed for the selection of the corresponding smoothing parameters. Asymptotic properties, such as consistency, rates
of convergence and asymptotic mean squared errors, are established for the kernel estimates. An example of predicting the
growth of children born to HIV infected mothers based on gender, HIV status and maternal vitamin A levels shows that this
model and the corresponding nonparametric estimates are useful in epidemiological studies.**Keyword note:**Hoover__Don Rice__John_Andrew Wu__Chien-Fu Yang__Li-Ping**Report ID:**459**Relevance:**100

**Title:**Functional ANOVA Models for Generalized Regression**Author(s):**Huang, Jianhua; **Date issued:**Apr 1996

http://nma.berkeley.edu/ark:/28722/bk0000n255w (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n256f (PostScript) **Abstract:**Functional ANOVA models are considered in the context of generalized regression, which includes logistic regression, probit
regression and Poisson regression as special cases. The multivariate predictor function is modeled as a specified sum of a
constant term, main effects and interaction terms. Maximum likelihood estimates are used, where the maximizations are taken
over suitably chosen approximating spaces. We allow general linear spaces and their tensor products as building blocks for
the approximating spaces. It is shown that the $L_2$ rates of convergence of the maximum likelihood estimates and their ANOVA
components are determined by the approximation power and dimension of the approximating spaces. When the approximating spaces
are appropriately chosen, the optimal rates of convergence can be achieved.**Keyword note:**Huang__Jianhua**Report ID:**458**Relevance:**100

**Title:**Coalescent random forests.**Author(s):**Pitman, Jim; **Date issued:**Sep 1996

http://nma.berkeley.edu/ark:/28722/bk0000n366f (PDF) **Abstract:**Various enumerations of labeled trees and forests, due to Cayley, Moon, and other authors, are consequences of the following
*coalescent algorithm* for construction of a sequence of random forests $(R_n, R_{n-1}, \cdots, R_1)$ such that $R_k$
has uniform distribution over the set of all forests of $k$ rooted trees labeled by $\mathbb{N}_n := \{1, \cdots , n\}$. Let $R_n$
be the trivial forest with $n$ root vertices and no edges. For $n \ge k \ge 2$, given that $R_n, \cdots, R_k$ have been defined
so that $R_k$ is a rooted forest of $k$ trees, define $R_{k-1}$ by addition to $R_k$ of a single directed edge picked uniformly
at random from the set of $n(k-1)$ directed edges which when added to $R_k$ yield a rooted forest of $k-1$ trees labeled by
$\mathbb{N}_n$. Variations of this coalescent algorithm are described, and related to the literature of physical processes of clustering
and polymerization.**Keyword note:**Pitman__Jim**Report ID:**457**Relevance:**100
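
The coalescent algorithm described in this abstract can be sketched in a few lines. The abstract does not state the edge orientation, so the sketch assumes each added edge runs from an arbitrary vertex $v$ to the root $r$ of a different tree; this gives $n$ choices for $v$ and $k-1$ for $r$, matching the stated count of $n(k-1)$ edges. The function name `coalescent_forests` and the parent-map representation are illustrative choices, not taken from the report.

```python
import random

def coalescent_forests(n, seed=None):
    """Return [R_n, R_{n-1}, ..., R_1] as parent maps on vertices 1..n.

    In each map, parent[v] is None when v is a root. Assumed edge
    convention: an added edge (v, r) makes v the parent of r, where r is
    the root of a tree that does not contain v.
    """
    rng = random.Random(seed)
    parent = {v: None for v in range(1, n + 1)}  # R_n: n roots, no edges
    roots = list(range(1, n + 1))

    def root_of(v):
        # Follow parent pointers up to the root of v's tree.
        while parent[v] is not None:
            v = parent[v]
        return v

    forests = [dict(parent)]
    for k in range(n, 1, -1):
        # Choosing v uniformly from n vertices, then r uniformly from the
        # k - 1 roots of other trees, is uniform over the n(k-1) pairs.
        v = rng.randint(1, n)
        own_root = root_of(v)
        candidates = [r for r in roots if r != own_root]
        r = rng.choice(candidates)
        parent[r] = v          # attach r's whole tree below v
        roots.remove(r)
        forests.append(dict(parent))
    return forests

# Example: build the sequence for n = 6 and inspect the final tree R_1.
forests = coalescent_forests(6, seed=1)
final_tree = forests[-1]
```

Each step removes exactly one root, so the $i$-th map in the returned list has $n - i$ trees (counting from $i = 0$).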