**Statistics Technical Reports**

**Term(s):**1999**Results:**28

**Title:**The fifth cell**Author(s):**Wachter, K. W.; Freedman, D. A.; **Date issued:**Dec 1999

http://nma.berkeley.edu/ark:/28722/bk0000n284w (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n285f (PostScript) **Abstract:**One form of error that can affect census adjustments is correlation bias, reflecting people who are doubly missing-- from
the census and from the adjusted counts as well. This paper presents a method for estimating the total national number of
doubly-missing people and their distribution by race and sex. Application to the 1990 U.S. census adjustment leads to an
estimate of 3 million doubly-missing people. Correlation bias is likely to be a serious problem for census adjustment in
2000. The methods of this paper are well suited for measuring its magnitude.**Keyword note:**Wachter__Kenneth Freedman__David**Report ID:**570**Relevance:**100

**Title:**Probability laws related to the Jacobi theta and Riemann zeta functions, and Brownian excursions**Author(s):**Biane, Philippe; Pitman, Jim; Yor, Marc; **Date issued:**Oct 1999

http://nma.berkeley.edu/ark:/28722/bk0000n2817 (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n282s (PostScript) **Abstract:**This paper reviews known results which connect Riemann's integral representations of his zeta function, involving Jacobi's
theta function and its derivatives, to some particular probability laws governing sums of independent exponential variables.
These laws are related to one-dimensional Brownian motion and to higher dimensional Bessel processes. We present some characterizations
of these probability laws, and some approximations of Riemann's zeta function which are related to these laws.**Keyword note:**Biane__Philippe Pitman__Jim Yor__Marc**Report ID:**569**Relevance:**100

**Title:**A necessary and sufficient condition for the $\Lambda$-coalescent to come down from infinity.**Author(s):**Schweinsberg, Jason; **Date issued:**Sep 1999**Abstract:**Let $\Pi_{\infty}$ be the standard $\Lambda$-coalescent of Pitman, which is defined so that $\Pi_{\infty}(0)$ is the partition
of the positive integers into singletons, and, if $\Pi_n$ denotes the restriction of $\Pi_{\infty}$ to $\{ 1, \ldots, n \}$,
then whenever $\Pi_n(t)$ has $b$ blocks, each $k$-tuple of blocks is merging to form a single block at the rate $\lambda_{b,k}$,
where \begin{displaymath} \lambda_{b,k} = \int_0^1 x^{k-2} (1-x)^{b-k} \: \Lambda(dx) \end{displaymath} for some finite measure
$\Lambda$. We give a necessary and sufficient condition for the $\Lambda$-coalescent to ``come down from infinity'', which
means that the partition $\Pi_{\infty}(t)$ almost surely consists of only finitely many blocks for all $t > 0$. We then show
how this result applies to some particular families of $\Lambda$-coalescents.**Pub info:**ECP Vol 5 (2000) Paper 1**Keyword note:**Schweinsberg__Jason**Report ID:**568**Relevance:**100
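
The merger rates $\lambda_{b,k}$ in the abstract above can be evaluated in closed form for some choices of $\Lambda$. The following is a minimal sketch, not taken from the report, under the assumption that $\Lambda$ is a Beta($a$, $c$) probability measure on $[0,1]$, in which case the integral reduces to a ratio of Beta functions. The function names `log_beta`, `merge_rate`, and `total_merge_rate` and the parameters `a` and `c` are hypothetical and chosen only for illustration.

```python
from math import comb, exp, lgamma


def log_beta(p, q):
    """Logarithm of the Beta function B(p, q), valid for p, q > 0."""
    return lgamma(p) + lgamma(q) - lgamma(p + q)


def merge_rate(b, k, a=1.0, c=1.0):
    """Rate lambda_{b,k} at which a given k-tuple of b blocks merges.

    Assumption (not from the report): Lambda is the Beta(a, c) probability
    measure, so that
        lambda_{b,k} = B(k - 2 + a, b - k + c) / B(a, c).
    The default a = c = 1 is the uniform measure on [0, 1].
    """
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return exp(log_beta(k - 2 + a, b - k + c) - log_beta(a, c))


def total_merge_rate(b, a=1.0, c=1.0):
    """Total rate of all mergers when there are b blocks."""
    return sum(comb(b, k) * merge_rate(b, k, a, c) for k in range(2, b + 1))
```

For any finite measure $\Lambda$ the rates satisfy the consistency relation $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$, which follows from writing $1 = (1-x) + x$ inside the integral, so a call such as `merge_rate(4, 2)` should agree with `merge_rate(5, 2) + merge_rate(5, 3)` up to rounding.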

**Title:**Random Forests--Random Features**Author(s):**Breiman, Leo; **Date issued:**Sep 1999

http://nma.berkeley.edu/ark:/28722/bk0000n271q (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n2728 (PostScript) **Abstract:**Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently
and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit
as the number of trees in the forest becomes large. The error of a forest of tree classifiers depends on the strength of
the individual trees in the forest and the correlation between them. Using a random selection of features to split each node
yields error rates that compare favorably to Adaboost, but are more robust with respect to noise. Internal estimates monitor
error, strength, and correlation and these are used to show the response to increasing the number of features used in the
splitting. These ideas are also applicable to regression.**Keyword note:**Breiman__Leo**Report ID:**567**Relevance:**100

**Title:**Two coalescents derived from the ranges of stable subordinators**Author(s):**Bertoin, Jean; Pitman, Jim; **Date issued:**Sep 1999**Abstract:**Let $M_\alpha$ be the closure of the range of a stable subordinator of index $\alpha\in ]0,1[$. There are two natural constructions
of the $M_{\alpha}$'s simultaneously for all $\alpha\in ]0,1[$, so that $M_{\alpha}\subseteq M_{\beta}$ for $0< \alpha <
\beta < 1$: one based on the intersection of independent regenerative sets and one based on Bochner's subordination. We compare
the corresponding two coalescent processes defined by the lengths of complementary intervals of $[0,1]\backslash M_{1-\rho}$
for $0 < \rho < 1$. In particular, we identify the coalescent based on the subordination scheme with the coalescent recently
introduced by Bolthausen and Sznitman.**Pub info:**Electronic Journal of Probability, Vol. 5 (2000) Paper no. 7, pages 1-17**Keyword note:**Bertoin__Jean Pitman__Jim**Report ID:**566**Relevance:**100

**Title:**Right inverses of L\'evy processes and stationary stopped local times**Author(s):**Evans, Steven N.; **Date issued:**Aug 1999

http://nma.berkeley.edu/ark:/28722/bk0000n4259 (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n426v (PostScript) **Abstract:**If $X$ is a L\'evy process on the line, then there exists a non--decreasing, c\`adl\`ag process $H$ such that $X(H(x)) = x$
for all $x \ge 0$ if and only if $X$ is recurrent and has a non--trivial Gaussian component. The minimal such $H$ is a subordinator
$K$. The law of $K$ is identified and shown to be the same as that of a multiple of the inverse local time at $0$ of $X$.
When $X$ is Brownian motion, $K$ is just the usual ladder times process and this result extends the classical result of L\'evy
that the maximum process has the same law as the local time at $0$. Write $G_t$ for the last point in the range of $K$ prior to
$t$. In a parallel with classical fluctuation theory, the process $Z := (X_t - X_{G_t})_{t \ge 0}$ is Markov with local time
at $0$ given by $(X_{G_t})_{t \ge 0}$. The transition kernel and excursion measure of $Z$ are identified. A similar programme
is carried out for L\'evy processes on the circle. This leads to the construction of a stopping time such that the stopped
local times constitute a stationary process indexed by the circle.**Keyword note:**Evans__Steven_N**Report ID:**565**Relevance:**100

**Title:**Where Did The Brownian Particle Go?**Author(s):**Pemantle, Robin; Peres, Yuval; Pitman, Jim; Yor, Marc; **Date issued:**Aug 1999**Abstract:**Consider the radial projection onto the unit sphere of the path of a $d$-dimensional Brownian motion $W$, started at the center
of the sphere and run for unit time. Given the occupation measure $\mu$ of this projected path, what can be said about the
terminal point $W(1)$, or about the range of the original path? In any dimension, for each Borel set $A \subseteq S^{d-1}$,
the conditional probability that the projection of $W(1)$ is in $A$ given $\mu(A)$ is just $\mu (A)$. Nevertheless, in
dimension $d \ge 3$, both the range and the terminal point of $W$ can be recovered with probability 1 from $\mu$. In particular,
for $d \ge 3$ the conditional law of the projection of $W(1)$ given $\mu$ is not $\mu$. In dimension 2 we conjecture that
the projection of $W(1)$ cannot be recovered almost surely from $\mu$, and show that the conditional law of the projection
of $W(1)$ given $\mu$ is not $\mu$.**Pub info:**Electronic Journal of Probability, Vol. 6 (2001) Paper no. 10, pages 1-22**Keyword note:**Pemantle__Robin Peres__Yuval Pitman__Jim Yor__Marc**Report ID:**564**Relevance:**100

**Title:**On the distribution of ranked heights of excursions of a Brownian bridge**Author(s):**Pitman, Jim; Yor, Marc; **Date issued:**Aug 1999

http://nma.berkeley.edu/ark:/28722/bk0000n418f (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n4190 (PostScript) **Abstract:**The distribution of the sequence of ranked maximum and minimum values attained during excursions of a standard Brownian bridge
is described. The height of the $j$th highest maximum $M_j$ over a positive excursion of the bridge has the same distribution
as $M_1/j$, where the distribution of $M_1$ is given by L\'evy's formula $P( M_1 > x ) = e^{-2x^2}$. The probability density
of the height of the $j$th highest maximum of excursions of the reflecting Brownian bridge is given by a modification of the
known $\theta$-function series for the density of the maximum absolute value of the bridge. These results are obtained from
a more general description of the distribution of ranked values of a homogeneous functional of excursions of the standardized
bridge of a self-similar recurrent Markov process.**Pub info:**Annals of Probability vol. 29, pages 362-384 (2001)**Keyword note:**Pitman__Jim Yor__Marc**Report ID:**563**Relevance:**100
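
As a short worked consequence of the two facts stated in the abstract above (a sketch, not quoted from the report): combining the identity in distribution $M_j \stackrel{d}{=} M_1/j$ with L\'evy's formula gives the tail of each ranked height, for $x \ge 0$ and $j = 1, 2, \ldots$,

$$
P( M_j > x ) = P( M_1 > jx ) = e^{-2 j^2 x^2} .
$$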

**Title:**Nonparametric estimation of a periodic function**Author(s):**Hall, Peter; Reimann, James; Rice, John; **Date issued:**Jul 1999

http://nma.berkeley.edu/ark:/28722/bk0000n415s (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n416b (PostScript) **Abstract:**Motivated by applications to light curves of periodic variable stars, we study nonparametric methods for estimating both the
period and the amplitude function from noisy observations of a periodic function made at irregularly spaced times. It is shown
that nonparametric estimators of period converge at parametric rates and attain a semiparametric lower bound which is the
same if the shape of the periodic function is unknown as if it were known. Also, first-order properties of nonparametric estimators
of the amplitude function are identical to those that would obtain if the period were known. Numerical simulations and applications
to real data show the method to work well in practice.**Keyword note:**Hall__Peter Reimann__James_Dennis Rice__John_Andrew**Report ID:**562**Relevance:**100

**Title:**Stochastic billiards on general tables**Author(s):**Evans, Steven N.; **Date issued:**Jul 1999

http://nma.berkeley.edu/ark:/28722/bk0000n4124 (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n413p (PostScript) **Abstract:**We consider stochastic analogues of classical billiard systems. A particle moves at unit speed with constant direction in
the interior of a bounded, $d$--dimensional region with continuously differentiable boundary. The boundary need not be connected;
that is, the ``table'' may have interior ``obstacles''. When the particle strikes the boundary, a new direction is chosen
uniformly at random from the directions that point back into the interior of the region and the motion continues. Such chains
are closely related to those that appear in shake--and--bake simulation algorithms. For the discrete time Markov chain that
records the locations of successive hits on the boundary, we show that, uniformly in the starting point, there is exponentially
fast total variation convergence to an invariant distribution. By analysing an associated non--linear, first--order PDE, we
investigate which regions are such that this chain is reversible with respect to surface measure on the boundary. We also
establish a result on uniform total variation Ces\`aro convergence to equilibrium for the continuous time Markov process that
tracks the position and direction of the particle. A key ingredient in our proof is a result on the geometry of $C^1$ regions
that can be described loosely as follows: associated with any bounded $C^1$ region is an integer $N$ such that it is always
possible to pass a message between any two locations in the region using a relay of exactly $N$ locations with the property
that every location in the relay is directly visible from its predecessor. Moreover, the locations of the intermediaries can
be chosen from a fixed, finite subset of positions on the boundary of the region. We also consider corresponding results
for polygonal regions in the plane.**Keyword note:**Evans__Steven_N**Report ID:**561**Relevance:**100

**Title:**A polytope related to empirical distributions, plane trees, parking functions, and the associahedron**Author(s):**Pitman, Jim; Stanley, Richard; **Date issued:**Jun 1999

http://nma.berkeley.edu/ark:/28722/bk0000n3b3v (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n3b4d (PostScript) **Abstract:**The volume of the n-dimensional polytope of all (y_1, ... , y_n) with y_i > 0 and y_1 + ... + y_i < x_1 + ... + x_i for all
i for arbitrary (x_1, ... , x_n) with x_i > 0 for all i defines a polynomial in variables x_i which admits a number of interpretations,
in terms of empirical distributions, plane partitions, and parking functions. We interpret the terms of this polynomial as
the volumes of chambers in two different polytopal subdivisions. The first of these subdivisions generalizes to a class of
polytopes called sections of order cones. In the second subdivision, the chambers are indexed in a natural way by rooted binary
trees with n+1 vertices, and the configuration of these chambers provides a representation of another polytope with many applications,
the associahedron.**Pub info:**Discrete and Computational Geometry 27: 603-634 (2002)**Keyword note:**Pitman__Jim Stanley__Richard**Report ID:**560**Relevance:**100

**Title:**A new methodology for evaluating incident detection algorithms**Author(s):**Ostland, M.; Petty, K. F.; Bickel, P. J.; Kwon, J.; Rice, J. A.; **Date issued:**Jun 1999**Keyword note:**Ostland__Michael_Anthony Petty__Karl_F Bickel__Peter_John Kwon__Jaimyoung Rice__John_Andrew**Report ID:**559**Relevance:**100

**Title:**Some properties of the arc sine law related to its invariance under a family of rational maps**Author(s):**Pitman, Jim; Yor, Marc; **Date issued:**Apr 1999

http://nma.berkeley.edu/ark:/28722/bk0000n397j (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n3983 (PostScript) **Abstract:**This paper shows how the invariance of the arc sine distribution on $(0,1)$ under a family of rational maps is related on
the one hand to various integral identities with probabilistic interpretations involving random variables derived from Brownian
motion with arc sine, Gaussian, Cauchy and other distributions, and on the other hand to results in the analytic theory of
iterated rational maps.**Keyword note:**Pitman__Jim Yor__Marc**Report ID:**558**Relevance:**100

**Title:**A score test for the linkage analysis of qualitative and quantitative traits based on identity by descent data on sib-pairs**Author(s):**Dudoit, Sandrine; Speed, Terence P.; **Date issued:**Apr 1999**Abstract:**We propose a general likelihood-based approach to the linkage analysis of qualitative and quantitative traits using identity
by descent (IBD) data from sib-pairs. We consider the likelihood of IBD data conditional on phenotypes (discrete or continuous)
and test the null hypothesis of no linkage between a marker locus and a gene influencing the trait using a score test in the
recombination fraction $\theta$ between the two loci. This method unifies the linkage analysis of qualitative and quantitative
traits into a single inferential framework, yielding a simple and intuitive test statistic. The score statistic readily extends
to accommodate incomplete IBD data at the test locus, by using the hidden Markov model implemented in the programs MAPMAKER/SIBS
and GENEHUNTER to obtain the multipoint inheritance distribution for each sib-pair (Kruglyak and Lander (1995) and Kruglyak
et al. (1996)). The linkage score test is derived under general genetic models, which may include multiple unlinked genes.
Population genetic assumptions, such as random mating or linkage equilibrium between the trait loci, are not required. This
score test is thus particularly promising for the analysis of complex human traits. Conditioning on phenotypes avoids unrealistic
random sampling assumptions and allows sib-pairs from differing ascertainment mechanisms to be incorporated into a single
likelihood analysis. It allows in particular the selection of sib-pairs based on their trait values and the analysis of only
those pairs having the most informative phenotypes. A further advantage of the score test is that it is based on the full
likelihood, i.e. the likelihood based on all phenotype data rather than just differences of sib-pair phenotypes. Considering
only phenotype differences, as in Haseman and Elston (1972) and Kruglyak and Lander (1995), may result in important losses
in power. Simulation studies indicate that the linkage score test generally matches or outperforms the Haseman-Elston test,
the largest gains in power being for selected samples of sib-pairs with extreme phenotypes.**Keyword note:**Dudoit__Sandrine Speed__Terry_P**Report ID:**556**Relevance:**100

**Title:**Snakes and spiders: Brownian motion on R-trees**Author(s):**Evans, Steven N.; **Date issued:**Apr 1999

http://nma.berkeley.edu/ark:/28722/bk0000n384c (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n385x (PostScript) **Abstract:**We consider diffusion processes on a class of $\mathbb{R}$--trees. The processes are defined in a manner similar to that of Le Gall's
Brownian snake. Each point in the tree has a real--valued ``height'' or ``generation'', and the height of the diffusion
process evolves as a Brownian motion. When the height process decreases the diffusion retreats back along a lineage, whereas
when the height process increases the diffusion chooses among branching lineages according to relative weights given by a
possibly infinite measure on the family of lineages. The class of $\mathbb{R}$--trees we consider can have branch points with countably
infinite branching and lineages along which the branch points have points of accumulation. We give a rigorous construction
of the diffusion process, identify its Dirichlet form, and obtain a necessary and sufficient condition for it to be transient.
We show that the tail $\sigma$--field of the diffusion is always trivial and draw the usual conclusion that bounded space--time
harmonic functions are constant. In the transient case, we identify the Martin compactification and obtain the corresponding
integral representations of excessive and harmonic functions. Using Ray--Knight methods, we show that the only entrance laws
for the diffusion are the trivial ones that arise from starting the process inside the state--space. Finally, we use the Dirichlet
form stochastic calculus to obtain a semimartingale description of the diffusion that involves local time additive functionals
associated with each branch point of the tree.**Keyword note:**Evans__Steven_N**Report ID:**555**Relevance:**100

**Title:**Algebraic evaluations of some Euler integrals, duplication formulae for Appell's hypergeometric function $F_1$, and Brownian
variations**Author(s):**Ismail, Mourad E. H.; Pitman, Jim; **Date issued:**Apr 1999

http://nma.berkeley.edu/ark:/28722/bk0000n3917 (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n392s (PostScript) **Abstract:**Explicit evaluations of the symmetric Euler integral $\int_0^1 u^{\alpha} (1-u)^{\alpha} f(u) du$ are obtained for some particular
functions $f$.
These evaluations are related to duplication formulae for Appell's hypergeometric function $F_1$ which give reductions of
$F_1 ( \alpha, \beta, \beta, 2 \alpha, y, z)$ in terms of more elementary functions for arbitrary $\beta$ with $z = y/(y-1)$
and for $\beta = \alpha + \frac{1}{2}$ with arbitrary $y,z$. These duplication formulae generalize the evaluations of some symmetric
Euler integrals implied by the following result: if a standard Brownian bridge is sampled at time $0$, time $1$, and at $n$
independent random times with uniform distribution on $[0,1]$, then the broken line approximation to the bridge obtained from
these $n+2$ values has a total variation whose mean square is $n(n+1)/(2n+1)$. Key words and phrases: Brownian bridge, Gauss's
hypergeometric function, Lauricella's multiple hypergeometric series, uniform order statistics, Appell functions.**Keyword note:**Ismail__Mourad_E_H Pitman__Jim**Report ID:**554**Relevance:**100
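
The mean-square formula in the abstract above can be checked by hand in the simplest case $n = 1$. This is a sketch added for illustration, not taken from the report; it uses only the standard fact that a standard Brownian bridge $B$ has $E[B(u)^2] = u(1-u)$. With a single uniform sampling time $U$ independent of $B$, the broken line runs from $0$ to $B(U)$ and back to $0$, so its total variation is $2|B(U)|$ and

$$
E\left[ (2|B(U)|)^2 \right] = 4 \int_0^1 u(1-u) \, du = \frac{2}{3} = \frac{n(n+1)}{2n+1} \bigg|_{n=1} .
$$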

**Title:**Constructions of a Brownian path with a given minimum**Author(s):**Bertoin, Jean; Pitman, Jim; de Chavez, Juan Ruiz; **Date issued:**Apr 1999**Abstract:**We construct a Brownian path conditioned on its minimum value over a fixed time interval by simple transformations of a Brownian
bridge.**Pub info:**Electronic Communications in Probability, Vol. 4 (1999) Paper no. 5, pages 31-37**Keyword note:**Bertoin__Jean Pitman__Jim Chavez__Juan_Ruiz_de**Report ID:**553**Relevance:**100

**Title:**Inverse Problems as Statistics**Author(s):**Stark, P. B.; **Date issued:**Apr 1999**Abstract:**What
mathematicians, scientists, engineers, and statisticians mean by ``inverse problem'' differs. For a statistician, an inverse
problem is an inference or estimation problem. The data are finite in number and contain errors, as they do in classical
estimation or inference problems, and the unknown typically is infinite-dimensional, as it is in nonparametric regression.
The additional complication in an inverse problem is that the data are only indirectly related to the unknown. Standard statistical
concepts, questions, and considerations such as bias, variance, mean-squared error, identifiability, consistency, efficiency,
and various forms of optimality apply to inverse problems. This article discusses inverse problems as statistical estimation
and inference problems, and points to the literature for a variety of techniques and results.**Pub info:**in Surveys on Solution Methods for Inverse Problems, Colton, D., H.W. Engl, A.K. Louis, J.R. McLaughlin and W. Rundell, eds., Springer-Verlag, New York, pp. 253-275 (invited)**Keyword note:**Stark__Philip_B**Report ID:**552**Relevance:**100

**Title:**The 1990 and 2000 Census Adjustment Plans**Author(s):**Stark, Philip B.; **Date issued:**Mar 1999**Date modified:**revised 16 May 2000

http://nma.berkeley.edu/ark:/28722/bk0000n408x (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n409g (PostScript) **Abstract:**A revised plan for the 2000 Decennial Census was announced in a 24 February 1999 Bureau of the Census publication and a press
statement by K. Prewitt, Director of the Bureau of the Census. Census 2000 will include counts and ``adjusted'' counts. The
adjustments involve complicated procedures and calculations on data from a sample of blocks, extrapolated throughout the country
to demographic groups called ``post-strata.'' The 2000 adjustment plan is called Accuracy and Coverage Evaluation (ACE). ACE
is quite similar to the 1990 adjustment plan, called the Post-Enumeration Survey (PES). The 1990 PES fails some plausibility
checks and probably would have reduced the accuracy of counts and state shares. ACE and PES differ in sample size, data capture,
timing, record matching, post-stratification, methods to compensate for missing data, the treatment of movers, and details
of the data analysis. ACE improves on PES in a number of ways, including using a larger sample, using a simpler model to assign
``match probabilities'' to records with insufficient data, and incorporating mail-back return rates into some post-strata.
Nonetheless, ACE shares the most serious problems of PES. The ``Be Counted'' program, census response submission over the
internet, computer unduplication of records, the treatment of movers, a new definition of ``correct address,'' more limited
search for matching records, the use of optical character recognition (OCR) to capture data, the data collection schedule,
and the assignment of ``residence probabilities'' to some sample records, are likely to make ACE less accurate than the 1990
PES.**Keyword note:**Stark__Philip_B**Report ID:**550**Relevance:**100

**Title:**Ecological inference and the ecological fallacy**Author(s):**Freedman, David A.; **Date issued:**Mar 1999

http://nma.berkeley.edu/ark:/28722/bk0000n4058 (PDF)

http://nma.berkeley.edu/ark:/28722/bk0000n406t (PostScript) **Abstract:**This paper reviews several methods for making ecological inferences, that is, inferring the behavior of individuals from aggregate
data. Also considered is the ecological fallacy, which is the idea that relationships observed for groups necessarily hold
for individuals.**Keyword note:**Freedman__David**Report ID:**549**Relevance:**100