BRIGITTE VALLÉE'S RESEARCH, PREPRINTS, AND REPRINTS
This webpage describes my research in a detailed manner until 2007.
For a description of my recent results since then, and a list of
publications, see here.
Ages ago, I started my career in the Department of Mathematics at the
University of Caen. My interests then gradually shifted from
"pure" number theory to
computational number theory and related problems in cryptography,
most notably integer factorization and lattice reduction.
In this area, I obtained average-case bounds for the
celebrated LLL algorithms and once held the world record for the
best proven bound on the complexity of integer factorization.
Currently the core of my work
deals with analysis
I am especially interested in applications of dynamical systems theory
to some of the major algorithms and data
structures studied by theoretical computer scientists.
I feel for instance responsible for the introduction of transfer operators
in this area. My most recent contributions deal with the
average-case analysis of lattice reduction in two or more dimensions,
the exact analysis of a rich class of continued fraction
algorithms, the application of transfer operators to analytic
information theory, and the related analysis of digital trees ("tries")
under nonuniform input models.
research logo: the discs associated with continued fraction expansions
of real numbers.
Festoons appearing in the analysis of lattice reduction.
- Lattice Reduction in Two Dimensions:
Analyses under Realistic Probabilistic Models.
By Brigitte Vallée and Antonio Vera. Version of March 2007;
submitted to AofA07 (International Conference on Analysis of Algorithms).
The Gaussian algorithm for lattice reduction in dimension 2 is
precisely analysed under a class of realistic probabilistic models, which are of
interest when applying the Gauss algorithm "inside"
the LLL algorithm. The
proofs deal with the underlying dynamical systems and transfer operators. All
the main parameters are studied: execution parameters which describe the
behaviour of the algorithm itself as well as output parameters, which describe
the geometry of reduced bases.
- Hausdorff Dimension
of Real Numbers with Bounded Digit Averages.
By Eda Cesaratto and Brigitte Vallée. Version of August 2006,
In Acta Arithmetica.
This paper considers numeration schemes,
defined in terms of dynamical systems and studies the
set of reals which obey some constraints
on their digits. Classically, the studied constraints involve each digit in an independent way.
Here, more conditions are studied, which only provide (additive) constraints on each digit prefix.
The main example of interest deals with reals whose all the digit prefix averages in their continued
fraction expansion are bounded by M.
We first provide a characterization of the Hausdorff dimension ,
in terms of the dominant eigenvalue
of the weighted transfer operator relative to the dynamical system.
We then come back to our main example;
with the previous characterization at hand and use of the Mellin Transform,
we exhibit the behaviour of |s_M-1| when the bound $M$ becomes large.
Pattern matching statistics on correlated sources.
by Jérémie Bourdon and Brigitte Vallée.
In LATIN 2006, pages 224--237,
Lecture Notes in Computer Science, 3887, Springer.
In pattern matching algorithms, two characteristic parameters play an important rôle :
the number of occurrences of a given pattern, and the number of positions where a pattern occurrence ends.
These two parameters may differ in a significant way.
Here, we consider a general framework where the text is produced by a probabilistic source, which can be built by a dynamical system. Such ``dynamical sources'' encompass the classical sources
--memoryless sources, and Markov chains--, and may
possess a high degree of correlations.
We are mainly interested in two situations : the pattern is a general word of a regular expression, and
we study the number of occurrence positions -- the pattern is a finite set of strings, and we study the number of occurrences.
In both cases, we determine the mean and the
variance of the parameter, and prove that its distribution is asymptotically Gaussian. In this way, we extend methods and results which have been already obtained for classical sources to this general ``dynamical''
framework. Our methods use various techniques: formal languages, and generating functions, as in previous works. However, in this correlated model, it is not possible to use a direct transfer
into generating functions, and we mainly deal with generating operators which generate... generating functions.
- Sharp estimates for the main parameters
of the Euclid Algorithm. By Loïck Lhote and Brigitte
Vallée. In LATIN 2006, pages 689--702,
Lecture Notes in Computer Science, 3887, Springer.
Laws for the Main Parameters of the Euclid
Algorithms. By Loïck Lhote and Brigitte
Vallée. This is the (longer) journal version
of the previous item, submitted to a special issue.
We provide sharp estimates for the
probabilistic behaviour of the main parameters of the Euclid
algorithm, and we study in particular the distribution of the
bit-complexity which involves two main parameters : digit-costs and
length of continuants. We perform a "dynamical analysis" which heavily
uses the dynamical system underlying the Euclidean algorithm. Baladi
and Vall'ee  have recently designed a general framework for
"distributional dynamical analysis", where they have exhibited
asymptotic gaussian laws for a large class of digit-costs. However,
this family contains neither the bit-complexity cost nor the length of
continuants. We first show here that an asymptotic gaussian law also
holds for the length of continuants at a fraction of the
execution. There exist two gcd algorithms, the standard one which only
computes the gcd, and the extended one which also computes the Bezout
pair, and is widely used for computing modular inverses. The extended
algorithm is more regular than the standard one, and this explains
that our results are more precise for the extended algorithm. We prove
that the bit-complexity of the extended Euclid algorithm
asymptotically follows a gaussian law, and we exhibit the speed of
convergence towards the normal law. We describe also conjectures
[quite plausible], under which we can obtain an asymptotic gaussian
law for the plain bit-complexity, or a sharper estimate of the speed
of convergence towards the gaussian law.
On the non-randomness of modular arithmetic progressions: a solution
to a problem by V. I. Arnold. By Eda Cesaratto,
Alain Plagne, and Brigitte Vallée.
In Discrete Mathematics and Theoretical Computer Science,
Proceedings of the Mathematics and Computer
Science IV Conference, Nancy, September 2006.
23 pages. [PS]
We solve a problem by V. I. Arnold dealing with
"how random" modular arithmetic progressions can be. After making
precise how Arnold proposes to measure the randomness of a modular
sequence, we show that this measure of randomness takes a simplified
form in the case of arithmetic progressions. This simplified
expression is then estimated using the methodology of dynamical
analysis, which operates with tools coming from dynamical systems
theory. In conclusion, this study shows that modular arithmetic
progressions are far from behaving like purely random sequences,
according to Arnold's definition.
Dynamics. By Brigitte Vallée.
In Discrete and Continuous Dynamical
Systems, vol. 15:1, 2006, pages 281--352.
Here's the big chart of what is going on as regards dynamical
analysis of Euclidean algorithms. This big survey paper
combines methods from dynamical system theory and analytic
combinatorics and serves as lecture notes for a short course I gave in
2004 at CIRM,
Luminy. Six types of algorithms (including those based on least
significant digits, most significant digits, odd and even remainders,
subtractions, and so on) are cast into a unifying framework.
Statistics on digits and number of iterations (as well as many other
cost measures) can be thoroughly analysed.
Statistics for Dynamical
Sources, Slides, 2006.
These are slides relative to a general analysis
of occurrences of patterns in texts produced by all sorts of
sources. Presented at the
Indo-French Workshop in Computer Science, Bangalore, February 2006 :-).
Lyapunov Tortoise and the Dyadic Hare.
By Benoît Daireaux, Véronique
Maume-Deschamps, and Brigitte Vallée.
Here is the version for the
Conference, Barcelona, June 2005, whose proceedings appear in
Mathematics and Theoretical Computer Science.
It is relative to the gcd algorithm directed by
Least Significant Bits (LSB algorithm). The analysis is non-standard:
it requires us to introduce
a dynamical system, which is both a 2-adic extension and a real extension
of the algorithm---this leads us to the framework of products of random matrices
and Lyapunov exponents. The algorithm can be viewed as a race between
a "dyadic" hare with a speed of 2 bits by step and a
"real" tortoise with a
speed equal to about 0.05 bits by step. Even if the tortoise starts before the
hare, the hare easily catches up with the tortoise [unlike in
and the algorithm terminates. Oh, by the way, who invented the LSB algorithm?
It's Damien Stehlé
and Paul Zimmermann,
who are both from INRIA Nancy.
Also available in
The 3-D landscape of the number of
"turns" of the Euclidean algorithm,
LSB version has to do with the tortoise and the hare race.
Dimension of Real Numbers with Bounded Digit Averages, by
Brigitte Vallée and Eda Cesaratto.
What is the measure
of numbers whose binary representation has in
all its prefixes more 0's than 1's?
Easy! This has null measure. Yes, but... Can we talk of Hausdorff
dimension and such? We address a
question of this sort (and even solve it)
for continued fraction expansions...
This paper considers numeration schemes, defined in terms of dynamical
systems and studies the set of reals which obey some constraints on their
In this general setting, (almost) all sets have zero Lebesgue measure, even
though the nature of the constraints and the numeration schemes
are very different. Sets of zero measure appear in many areas of
science, and Hausdorff dimension
has shown to be an appropriate tool for studying their nature.
the main example of interest deals with reals all of whose digit averages
in a continued fraction expansion are bounded by M.
This setting can be translated in terms of random walks
where each step performed depends on the present digit,
and walks under study are constrained to be always under a line of slope M.
We provide a characterization of the Hausdorff dimension s(M)
in terms of the dominant eigenvalue of
the weighted transfer operator relative to the dynamical system.
Currently submitted to a journal.
Also available in Postscript.
Keywords: Dynamical systems, Transfer operator, Hausdorff dimension,
Quasi-Powers Theorem, Large deviations,
Shifting of the mean, Mellin transform analysis.
Real Numbers with
Bounded Digit Averages. Here is the shorter version
(everything is relative!) which appeared in the Third Colloquium on
Combinatorics, Probability, Trees, and Algorithms
[or a permutation thereof], Vienna, September 2004.
- Euclidean Dynamics (slides)
These are slides of an invited lecture given at the 10th International
Workshop on Analysis of Algorithms, Berkeley, June 2004.
It surveys dynamical methos in the context of number-theoretic
algorithms. I was proud to produce my first generation of slides
in LaTeX (only a few scribbled figures are still missing).
This version has been especially compiled for all friends from the Southern
A plot of the maximum-digit-average
function for binary representations.
- Distributional Analyses
of Euclidean Algorithms and
Euclidean Algorithms are Gaussian
with Viviane Baladi.
The first (Distributional Analysis) is a front end to the second
(...Gaussian), which is the more detailed
journal submission, replete with
proofs. Take your favorite Euclidean algorithm, one that computes GCD's,
like the standard, odd-quotient, or neareast-integer brand.
Choose your favorite "cost" measure like the number of iterations,
the number of occurrences of your birthdate, the number of times a
quotient (digit) that is a prime occurs, and so on. Then,
over fractions p/q with denominator less than a fixed bound,
the distribution of this cost functional
is asymptotically Gaussian.
This vastly generalizes an earlier result of Doug Hensley
who proved the theorem in the
case of the standard algorithm and the number of iterations.
Also we obtain optimal error terms. That was tricky, and Viviane and
I had to prove the existence of a pole-free strip for certain
pseudo-Dirichlet series by suitably extending subtle estimates due to
Anyhow, I was happy that the dynamical systems and transfer operator
approach now gives us access to distributions and to estimates that are as tight as can be for a wide variety of parameters.
Distributional analysis... has been presented at the
meeting in New Orleans, in January 2004. The document has 15 pages
nicely laid out on two columns.
...Gaussian has been submitted to a number theory(?) journal.
The version has also been
as cs.DS/0307062 as a 45 page document. (Make sure you
get Version #3, revised April 2004.) Makes for perfect bedtime reading!
- Exponential decay of correlations
for surface semi-flows without finite Markov partitions,
with Viviane Baladi. That's probably the most amazing
title of any paper I have ever been author/coauthor of :-).
Let me just say here that it presents another approach to estimates
of Dolgopyat's type which applies to systems with an infinite
number of branches.
The need for this arose during our quest for
Dated November 2003.
To appear in Proceedings of the American Mathematical Society
- Dynamical Analysis of the Parameterized
with Benoit Daireaux.
Practitioners of GCD calculations in computer algebra
and cryptography are certainly aware of Lehmer's optimization
of Euclid's algorithm. Actual computers manipulate words instead of
single digits, which is a form of parallelism that can be taken advantage of:
simply compute a chunk of the continued fraction expansion of
p/q by applying the continued fraction algorithm to
the leading words of p and q,
proceed with care, and finally adjust.
We analyse a version of this idea and obtain very precise estimates,
based on--guess what!---dynamical systems.
Lectures on Number Theory, Elliptic Curves and Cryptology
Kapil Hari Paranjape: These lecture notes succinctly describe
some of the algorithms at stake here.
dynamiques et algorithmique.
This started with a mini-course taught at the ALÉA'2002 meeting by Viviane Baladi and myself.
Frédéric Chazal, Véronique Maume, and I wrote down
which have been subsequently carefully scrutinized by Frédéric Chyzak [thanks, Fred!].
Aagh! It's in French. If you read that exotic language,
you'll find a summary in 30 pages of the core theory (hopefully)
written in an accessible style. You'll strat by
an introduction to the dynamics of
maps on the interval, density transformers and Markov sources. Branches
may or maynot be complete. Each case carries with it an interesting
class of transfer operators acting on a suitable functional space.
the method of inducing may be used to cope with intermittency phenomena.
There's usually a dominant spectral eigenvalue and analytic
perturbation theory applies. Bingo!
Everything then becomes almost as pleasant as with the usual theory of
finite Markov chains. Applications to analysis of algorithms are
then sketched. These concern statistical properties of random texts
and patterns under quite general source models, dictionaries
and digital trees, as well as a wide class of Euclidean and
continued fraction algorithms. The text concludes with some open problems
(as of 2002) on which progress has been made recently---see also above.
Appears in Algorithms Seminar 2001-2002,
F. Chyzak Ed.,
INRIA Res. Report 5003, November 2003, pp. 121--150.
The German 10DM banknote with the
effigy of Gauss, the discoverer of the (in)famous Bell curve.
Dynamical Analysis of a-Euclidean Algorithms
There are perhaps half a dozen ways you may divide 73 by 17: the usual one
where quotients are taken by default (i.e., 4),
or you may take quotients by excess (5), or to the nearest integer (here 4),
or you may impose the remainder to be odd or you may want it to be even.
(There are good algorithmic reasons to do so--think
of the binary gcd algorithm.)
It's surprising that two algorithms that closely resemble one another
may exhibit rather different behaviours, e.g., going from linear to
quadratic time complexity.
We now have a fairly good picture of what goes on,
as you may guess by reading these pages. Here, we consider
what has been nicknamed the "Japanese" g.c.d. algorithm,
which is defined by the constraint that the remainder should always
be taken in an interval [a-1,a]. The corresponding problems
become trickier as the image below suggests: there the algorithm is seen as
moving a window of the class of transformations associated with the
usual continued fraction map. This leads to somewhat hairy
dynamical systems with incomplete branches, and even a phase transition
that it might be interesting to study further...
Joint work with Jérémie Bourdon and Benoit Daireaux.
Journal of Algorithms, Vol. 44, No. 1, Jul 2002, pp. 246-285.
Erratum to: "Dynamical sources in information theory :
fundamental intervals and word prefixes"
Errare humanum, perseverare diabolicum.
So, the (long) paper cited in the title of
this erratum erred, as it omitted a certain technical condition
related to distortion in expanding maps of the interval.
(Don't panic: the main results of the old paper stand, it's just
that their precise domain of validity needs to be amended a bit.)
Thanks to good friends in Dijon, Veronique and Frédéric,
this could be fixed. I have put this erratum here, not only to get the record
straight, but also because aficionados may
appreciate the general discussion of suitable
functional spaces that accompanies it.
Joint work with Frédéric Chazal and Veronique Maume.
To appear in Algorithmica, 2003.
Hidden Word Statistics.
You find your name "hidden"
in a medieval treatise on alchemy, say nicely aligned along the
diagonal of a page. Does this "mean" something?
Well, the answer is (maybe) contained in this paper
and its junior companion, Hidden Patterns Statistics,
which is a shorter conference version
described below. Here, you'll find
that the law of a hidden word in a random text is almost
certainly Gaussian, with explicit expressions for the mean, the variance,
and a few goodies related to the de Bruijn graph, which finds itself
highjacked for the purpose of the present research.
Joint work with Philippe Flajolet and Wojtek Szpankowski. Submitted
to Journal of the A.C.M. (fall 2002).
Generalized Pattern-Matching Statistics.
Under most current practices, pattern-matching is analysed by
enumerative combinatorics of words or by finite-state Markov chains.
Here, we depart from this framework and show how,
statistically, occurrences of patterns can be predicted
with great accuracy in a diversity of models. What
I find pleasing about this paper is the simplicity of the conceptual
framework, entirely based on density transformers.
Also, it applies to a wide range of source models
(including memoryless aka Bernoulli and Markov, but also many more)
and to a general notion of patterns. There's a catch, though. Only
the mean and variance are characterized at this
level of generality. I hope to report soon
on some of
the corresponding distributional analyses (no wonder, the distributions
have to be asymptotically normal, but that's not so easy to prove!).
Joint work with Jérémie Bourdon. Appears in
Mathematics and Computer Science II: Algorithms,
Trees, Combinatorics and Probabilities,
Proceedings of Versailles Colloquium, Birhäuser Verlag,
2002, pp. 249-265.
Some Euclidean algorithms are fast,
others are slow. The question really is Why?.
Hidden Pattern Statistics.
A pattern is "hidden"
in a text if it appears somewhere in the text, but the letters of the pattern do
not necessarily occur contiguously. For instance Brigitte is "hidden" inside the following
sentence of Hamlet: "Bring with thee
airs from heaven or blasts from hell,..." How significant is this? [guess!]
Well, here we characterise the distribution of the number of occurrences of a fixed pattern in a random text,
and show that it is asymptoticaly normal. Thus, you can set your own thresholds
in order to decide what is meaningful and what is not. The distances between letters may be
constrained or not, and still the law has to be gaussian! The problem was initially motivated by intrusion
detection in computer security, but is general enough that it should be of [some?] relevance to
computational biology. The analysis is, I think, an amusing mix of combinatorial, analytic, and probabilistic methods.
Joint work with Philippe Flajolet, Yves Guivarc'h,
and Wojtek Szpankowski.
In Proceedings of ICALP'01, Crete, July 2001.
Lecture Notes in Computer Science, Vol. 2076, pp. 152--165.
Hidden words are everywhere to be found. But what does finding a particular hidden word really mean?
Average-bit Complexity of Euclidean Algorithms.
So far, most of my research had been dedicated to somewhat abstract measures of
computational complexity, with complexity being measured in the number of
arithmetic operations irrespective of the operands' sizes.
In this (and the following paper),
what is addressed is the Boolean cost, where every bit operation
Well, it turns out that techniques
based on functional analysis and on classical methods
from the analysis of algorithms (e.g., the enumerative uses
of regular expressions) combine nicely and open access to
the average bit complexity of Euclidean algorithms.
It appears for instance that
a standard gcd computation performed on
binary numbers is hardly more costly than
a naive multiplication. True, this is essentially known to
it is the first time, I believe, that such complete
analyses at bit-complexity level are proposed.
(Joint paper with Ali Akhavi. Appears in the Proceedings of ICALP'2000,
Geneva, July 2000. Lecture Notes in Computer Science, volume
1853, pp. 374-87.)
Digits and Continuants in Euclidean Algorithms.
What this paper does is to analyse various characteristics
of continued fraction expansion of rational numbers.
Analysis of algorithms is still lurking in the background
but is not the primary focus. Here, I give new results on
the lengths of various encodings of
rational numbers by continued fraction representations.
Some of these based on Elias or Shannon-Fano codes
even appear to be very close to the information-theoretic optimum,
with less than a 3% loss.
(Appears in the special issue of
Journal de Theorie des Nombres de Bordeaux
dedicated to Jacques Martinet on the occasion of his retirement, JTNB 12 (2000) pp 531-570)
A Unifying Framework for a Class of Euclidean Algorithms (short).
Dynamical Analysis of a Class of Euclidean Algorithms (long).
In the short version, you'll find as promised
a unifying framework for the analysis of eight (yes, eight!)
Euclidean algorithms--all this condensed in twelve(!) pages.
There's even more in the long version...
The production chain starts with certain generating functions of costs
(of the mixed Dirichlet and ordinary type),
relates them to Ruelle operators and their spectral properties,
then concludes about averages and moments of costs
by means of a suitable Tauberian theorem.
The algorithms, some old and some new, appear to
split into two categories---the slow ones and the fast ones---and this is
eventually related to simple contraction properties of fixed point
of certain maps of the interval.
(Paper presented at the conference LATIN'2000.
Published by Springer Verlag in Lecture Notes in Computer Science,
vol. 1776, Gonnet, Panario and Viola Editors. pp. 343-354.
Longer version submitted to Theoretical Computer Science, June 2000.)
Continued Fractions, Comparison Algorithms, and
Fine Structure Constants.
How to compare two fractions or equivalently
determine the sign of a 2x2 determinant?
Although this may sound like a silly question, there are good
and bad algorithms when measured either in terms of numerical
accuracy or in terms of computational cost.
Here is an étude in the
mathematics involved in the basic HAKMEM algorithms
already discussed in these pages (see year 1998)
and in its generalization to sorting.
In this largely expository paper, we take the reader
through a light tour of dynamical systems (symbolic dynamics),
number theory (continued fractions), special functions (multiple zeta
values), functional analysis (transfer operators), numerical analysis
(series acceleration), and complex analysis (the Riemann hypothesis).
These domains all eventually contribute
to a detailed characterization of the complexity of comparison and
sorting algorithms, either on average or in probability.
(Joint work by Philippe Flajolet and Brigitte Vallée.
Proceedings of the Canadian Mathematical Society, volume
titled ``Constructive, Experimental, and Nonlinear Analysis''
in the honour of Jonathan Borwein; Vol 27 (2000) pp 53-82.)
On the Stack-Size of General Tries.
This paper is relative to digital trees (aka tries),
as discussed below in relation to earlier papers.
Say you're looking for some information in a tree.
It is well known that the storage cost of a recursive traversal
coincides with the height of the tree.
However, the recursive traversal is suboptimal in terms of storage since it
keeps track of subtrees to which it will never need to come back.
A well-known transformation of the traversal algorithm into
an optimized nonrecursive form yields a stack-based algorithm.
It is precisely the goal of this paper to analyse the corresponding
stack-size on average and in distribution
in the case of digital trees. As you might guess,
the analysis can be carried out under rather general assumptions on the
source that produces digits. In particular, for
a binary alphabet and an unbiased source, the reduction in storage
consumption is by a factor of log(3)/log(2), that is, about 1.58496.
Joint paper with Jérémie Bourdon
and Markus Nebel. In Les cahiers du GREYC,2000, n°4
Fundamental squares are fundamental when comparing two fractions, but also occasionally in computational geometry!
Dynamical sources in information theory :
fundamental intervals and word prefixes.
Information theory deals with the way sources
produce sequences of symbols (digits, letters)
and with the associated metric phenomena.
Traditional applications, following Shannon,
are in the area of
error correction and data compression.
This paper shows how a number of characteristics of sources can be analysed
precisely in a unified framework based on expanding maps of the interval
and their companion transfer operators. A special role is played
by the thinning process that successively refines a partition
of the interval by appending a new digit. The approach presented has some
benefits. It unifies the memoryless models (where letters are drawn
independently according to a common probability distribution),
the Markovian models (that allow
bounded correlations between letters), and a wide class of sources with
unboundedly correlated digits, like the digits of
continued fraction expansions. This allows for a general treatment of
a wide class of sources and it eventually leads, as shown in the next paper
listed, to a simple abstract analysis of digital trees.
In Algorithmica 29(1/2), 2001, pp. 262-306.
Dynamical Sources in Information Theory: A General Analysis of
What is a digital tree, also known in computer science jargon as a
Well, assume you want to store a dictionary of the
Hittite language. You can group together all the words that start with
an A and put them in a separate fascicle. No need to repeat the
initial A, of course. Then you do the same with B,
C, &c. If you proceed recursively (on the second letter, and so on),
you transform your set of words
composing the dictionary into a tree: this is precisely a digital tree.
Such trees have been known for a long time (starting
with Fredkin in 1959) to be an efficient data
structure. The present paper shows the role
of fundamental intervals and word prefixes (as explored
in the previous paper) in providing in easy access
to probabilistic properties of randomly generated tries.
In particular, fundamental constants, like the entropy and the
probability of [digit] coincidence, themselves related to
spectral properties of transfer operators
are main characters of the story. A consequence of the present
study is an abstract analysis of tries that encompasses
most existing results on height, size, and path length
and even gives a bit more!
Joint paper with Julien Clément
and Philippe Flajolet;
In Algorithmica 29(1/2), 2001, pp. 307-369.
Apéry's famous continued fraction for zeta(3). What is a typical contnued fraction like?
And what does it have to say about algorithms?
Average-case analyses of three algorithms
for computing the Jacobi symbol.
The Jacobi symbol
is a very important tool in algebra, since it is related
to quadratic characteristics in modular arithmetics.
Interest in its efficient computation
is now reawakened with its utilization in primality tests
or more generally in cryptography.
All the efficient algorithms
that compute the Jacobi symbol have a structure similar
to that of the GCD's algorithm,
but they involve a modified division-remainder process and
are associated with continued fractions of a special type.
Three major algorithms (the "Odd", the "Even",
and the "Ordinary" algorithm) are considered
and are completely analysed in the average case.
It appears that their average-case behaviour is very much linked to
the analytic properties of the class of linear fractional
transformations involved. Two of the algorithms,
(the Odd Algorithm and the Ordinary Algorithm) have, as anticipated,
an average-case complexity that is O(logN).
while the Even Algorithm
has average cost O(log(N)^2),
which is similar to the GCD subtractive algorithm.
This last surprising property is linked to the fact that the
corresponding transformation involves a weakly repulsive point.
Transfer operators and the method of "inducing" play a role in this
analysis, and, naturally, precise constants accompany
(Joint work with Charly Lemée, October 1998.)
Dynamique des fractions continues à contraintes
périodiques [in French].
The golden section has all its continued fraction quotients equal to 1,
while the pattern for e-2 is
1,1,2,1,1,4,1,1,6, etc. What do numbers that satisfy similar periodic
constraints look like? Here, I deal with arbitrary sets of periodic
constraints. These sets are thus analogous to the classical Cantor sets,
where some digit in a base representation is excluded.
This paper determines the Hausdorff dimension of constrained reals and
the asymptotic density of the corresponding rational
or quadratic irrational numbers. Other applications
regarding these Cantor-like sets include
the analysis of the mean height of continued fraction representations
(the cost of Euclid's algorithm) as well as the estimation
of the mean value of
Lévy's constant for quadratic irrationals.
(In Journal of Number Theory 72(1998), no. 2, 183--235.)
Continued Fraction Algorithms, Functional Operators, and
Structure Constants,by P. Flajolet and
B. Vallée. A large amount of research has been devoted to the
analysis of metric and probabilistic properties of continued
fractions since Gauss first guessed the stationary regime
of the continued fraction transformation. This range of
problems has been one of the original sources of ergodic theory,
with names like Paul Lévy, Kuzmin, or Khinchin.
New perspectives then opened when Dieter Mayer, starting in the mid 1970's,
discovered that the nascent theory of transfer operators initiated by Ruelle
could be successfully applied to this range of problems.
In this paper we survey in elementary terms the analysis by transfer
operators of a number of basic algorithms, including:
the continued fraction and the Euclidean GCD algorithm, the HAKMEM
algorithm for comparison of real numbers and 2-dimensional lattice reduction,
as well as the digital tree
("trie") built on continued fraction representations.
It is amusing to note that the behaviour of the continued fraction
tree involves tiny fluctuations whose nature is related to the
nontrivial zeros of the classical zeta function and
to the Riemann hypothesis.
(The full paper appears in
Theoretical Computer Science, March 1998, vol 194
(1-2), pp. 1-34. The version offered here is
a preliminary one that was the text of an invited lecture
at the 7th conference
"Fibonacci Numbers and Applications",
Graz, July 1996.)
HAKMEM Memorandum, from a team at MIT in 1972;
see especially the part that deals with
home page that contains a version his paper
"The Number of Steps in the Euclidean algorithm", where he proves
this number to be asymptotically Gaussian. The home page of my coauthor,
Dynamics of the Binary Euclidean Algorithm:
Functional Analysis and Operators.
Computing greatest common divisors (GCD's)
lies at the heart of many problems of computational number theory and
computer algebra. Perhaps the most efficient algorithm
and the one used most often in the case of integers
is the binary GCD algorithms.
This algorithms is based on subtraction and elimination of powers of two.
division is thus needed and the algorithm goes well with binary
computers: its cost on binary hardware appears to
be barely higher than that of a single division.
Richard Brent outlined in 1976 the principles of a heuristic analysis
that appeared to match the facts quite well. However, the
problem of finding an exact and rigorous average-case analysis
has remained hitherto unsolved. In this paper I
come up with a complete analysis that is based the introduction of
suitable functional spaces and operators.
The developments are nontrivial owing to the presence
of both singularities and periodicities.
I prove in particular
that the expected number of steps of the binary GCD algorithm
is of the form A.logN where the interesting constant
A is discussed in the the latest edition of
The Art of Computer Programming, Volume 2, by Don Knuth.
(The final paper appears in Algorithmica, 22
(1998), no. 4, 660--685.)
The Analysis of Hybrid Trie
by Julien Clément, Philippe Flajolet,
and Brigitte Vallée.
In 1997, Bentley and Sedgewick revisited the classical
digital tree ("trie") structure that has so many
applications in the processing of
textual data. One of the problems with the basic form of
tries is that, for large alphabets,
the branching degree tends to become high, so that
search time degrades appreciably while the storage cost due to pointers
becomes prohibitive. A simple way to remedy these
problems consists in defining a hybrid tree structure
where alphabetic searches at each node of the trie are guided by a binary
search tree. Bentley and Sedgewick demonstrated
the practicality of this approach by means of performance measurements on
actual data. The hybrid character of these trees
(called "ternary search trees" in the new edition of Sedgewick's
Algorithms) poses a challenge to analysts.
In this paper, we provide a complete analysis of hybrid tries
as well as a theoretical model that makes it possible to
compare them with alternative implementations
(like list-tries or array-tries). The theoretical predictions match
convincingly measurements on textual data (the novel
Moby Dick, by Herman Melville!). They justify the fact that
hybrid tries lead to an improvement in time complexity by a
typical factor of 3.
(In Proceedings of the Ninth ACM_SIAM Symposium on
Discrete Algorithms (SODA'98), San Francisco, January 1998,
pp. 531-539. The form available here is
INRIA RR3295, November 1997, 9 pages.)
summary by Julien Clément in the Algorithms
Seminar at INRIA, 1997-98. The
original papers of Bentley
and Sedgewick entitled Ternary Search Trees
and Fast Algorithms for Sorting and Searching Strings
(Dr Doob's Journal and SODA'97).
The home pages of my coauthors,
Julien Clément and
"Moby Dick" serves to compare theoretical
estimates to simulations in the case of trie algorithms.
One of Devroye's random tree simulations.
Opérateurs de Ruelle généralisés
et analyse en moyenne des algorithmes d'Euclide et de Gauss [in French].
The main theme of this paper is:
What happens when the Euclidean algorithm or the Gaussian
algorithm for lattice reduction are subjected to nonuniform data?
Here, I treat the case of a general real-analytic density.
The standard framework of Ruelle operators no longer applies.
I then generalize Ruelle operators
in such a way that they transform analytic functions
of two complex variables. This new generalization gives rise to operators
that are better suited to fine-grain analysis of iterates of
the continued fraction transformation, while their complete spectrum
can be related to that of the classical Ruelle-Mayer operator.
As a consequence, very precise characterizations of the complexity of
both algorithms under nonuniform input models are derived.
(Preliminary version of Acta Informatica, LXXXI.2 (1997),
Dynamical Systems and the Analysis of General Tries
Digital trees or tries are a fundamental data structure
that implements dictionary built on words or the digital structure
of data. They support efficiently insertions, deletions, and all sorts
of queries, including partial match. (For these reasons they have
been used in the computerization of the Oxford English Dictionary.)
With the exception of Devroye's work who considered a general
most analyses have focussed on fairly simple randomness models,
like Bernoulli or Markov. I introduce here a new framework that
is capable of dealing with (unboundedly) correlated digits in
Under this framework, the main characteristics of digital trees,
namely, height, size, or path length, are precisely analysed.
(This paper was presented at the RALCOM'97 meeting in Santorini
Island (!), Greece, October 1997.)
An average-case analysis of the Gaussian
algorithm for lattice reduction,
by H. Daudé, P. Flajolet, and
A two-dimensional lattice has infinitely many bases.
However only one of these bases has a "nice" aspect
and is formed with nearly orthogonal and reasonably small
vectors --the reduced basis. Lattice reduction aims to find
such a basis. The problem intervenes in many questions of
number theory (e.g., the reduction of quadratic forms)
and Gauss proposed an essentially optimal algorithm
based on a modified continued fraction algorithm.
The worst case of this reduction algorithm is logarithmic
in the size of the input vectors. Here , we prove that the average case
number of steps is asymptotically constant and that the probability
of executing a large number of reduction steps decays exponentially.
Our treatment characterizes the behaviour of the algorithm in terms of
a conditional invariant measure that is related to the transfer operator of
classical continued fractions. In a way, analytically, the Gaussian
algorithm is like a version of the GCD algorithms that would "leak" (!).
(Appears in Combinatorics, Probability
and Computing 6 (4), 1997, pp. 397-433;
the version available here is a preliminary one: INRIA, RR
2798, February 1996.)
Three transformations of the interval iterate to
give rise to binary representations, standard continued fractions, and
centred continued fractions.
Back to my Home Page.