Génération aléatoire uniforme de mots de langages rationnels

Bibliographic Details
Published in Theoretical Computer Science, Vol. 159, no. 1, pp. 43-63
Main Author Denise, Alain
Format Journal Article
Language French
Published Elsevier B.V 28.05.1996
ISSN 0304-3975
EISSN 1879-2294
DOI 10.1016/0304-3975(95)00200-6

Abstract We give two algorithms for the uniform random generation of words, which apply to particular classes of rational languages. Their efficiency is measured in terms of logarithmic complexity, as a function of the length n of the generated words. The first algorithm is dedicated to languages whose generating series have a unique pole, possibly multiple; its time complexity is of order n log n, and the memory space it uses is of order log n. The second algorithm is reserved for languages whose generating series have the following property: there exists a unique pole of smallest modulus, and this pole is simple. After a preprocessing step that takes time polynomial in n, any word is drawn at random in average linear time and space.
The problem of generating words of a given language uniformly at random has been the subject of extensive study in the last few years. An important part of that work is devoted to the generation of words of context-free languages (see, e.g., [6, 8, 9, 12]). For a given integer n > 0, the words of length n of any unambiguous context-free language can be generated uniformly at random by using algorithms derived from the general method which was introduced by Wilf [14, 15] and systematized by Flajolet et al. [7]. Clearly, this can be applied to the set of rational languages, which constitute an important special case of context-free languages. Most authors use the uniform measure of complexity (see [1]) to compute the complexity of generation algorithms. This measure is based on the following hypotheses: any simple arithmetic operation (addition, multiplication) has time cost O(1), and any number takes a constant amount of memory space. Thus, we know that words of any rational language can be generated by an algorithm which, with respect to the uniform measure of complexity, runs in linear time (in terms of the length of the words) and constant space [9].
This measure is realistic only if there is a reasonable bound on the numbers involved in the operations. However, the classical random generation algorithms involve operations on numbers which grow exponentially in terms of the length of the words to be generated. Moreover, the programs which make use of these algorithms are generally used to generate very large words, for example for the purpose of studying the asymptotic behavior of some parameters. Therefore, the uniform measure does not reflect the real behavior of such programs. It turns out that the logarithmic measure of complexity is much more realistic: one assumes that the space taken by a number k is O(log k), and that any simple arithmetic operation can be done in time O(log k). It is with respect to this measure that we will evaluate the performance of algorithms in this paper.
Our goal is to design efficient algorithms (in terms of logarithmic complexity) to generate uniformly at random words from certain classes of rational languages. We consider rational languages defined by their minimal finite deterministic automata. When computing complexity, neither the size of the automaton nor the cardinality of the alphabet is taken into account.
In Section 2 we present some background on rational languages and their generating series. We briefly describe the classical method for generating words of such languages and we study its logarithmic complexity. We show that it is at best quadratic for most languages. This is due mainly to computations on numbers which grow exponentially with the length of the words to be generated. In order to improve the efficiency of the algorithms significantly, we must avoid handling large numbers, or at least substantially decrease the frequency of computations on such numbers. An alternative, briefly discussed in [7] and [12], is to compute with floating point numbers instead of integers. In this case, the logarithmic complexity is linear in time. However, using floating point numbers inevitably leads to approximations which prevent the exact uniformity of the generation.
In Sections 3 and 4 we show that, in some cases, we can avoid computations on large numbers entirely or almost entirely, while keeping the exact uniformity of the generation. We determine two classes of rational languages for which this is the case. Section 3 concerns languages whose associated generating series have a unique singularity. We present a simple version of the classical algorithm, which totally avoids handling large numbers. The logarithmic complexity of the method is O(n log n) in time and O(log n) in memory space. Section 4 focuses on languages whose associated generating series have the following property: there exists a unique singularity of minimum modulus, and this singularity is simple. For such languages we give a probabilistic version of the classical algorithm which generates words randomly while avoiding most computations on large numbers. This method needs a preprocessing stage, which can be done in polynomial time and linear space in terms of the length n of the words. Following preprocessing, any word of length n can be generated in average linear time and space.
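The classical method mentioned in the abstract (count the words, then choose each letter with probability proportional to the number of ways of completing the word) can be sketched as follows. This is a minimal illustration in Python, not the paper's own algorithms: the names `count_table` and `random_word`, the encoding of the automaton as a list of dictionaries, and the example automaton (words over {a, b} that never contain bb) are all assumptions made for the example. The sketch also stores the whole table of counts, so it does not attempt the constant-space bound cited from [9].

```python
import random

def count_table(delta, finals, n_states, n):
    """Return c with c[k][q] = number of words of length k that lead
    from state q to a final state of the deterministic automaton."""
    c = [[0] * n_states for _ in range(n + 1)]
    for q in finals:
        c[0][q] = 1                      # the empty word is accepted from q
    for k in range(1, n + 1):
        for q in range(n_states):
            # a word of length k is one letter followed by a word of length k-1
            c[k][q] = sum(c[k - 1][r] for r in delta[q].values())
    return c

def random_word(delta, finals, n_states, start, n, rng=random):
    """Draw one word of length n uniformly among those accepted from `start`."""
    c = count_table(delta, finals, n_states, n)
    if c[n][start] == 0:
        raise ValueError("the language has no word of this length")
    word, q = [], start
    for k in range(n, 0, -1):
        x = rng.randrange(c[k][q])       # exact uniform integer in [0, c[k][q])
        for letter, r in sorted(delta[q].items()):
            if x < c[k - 1][r]:          # letter chosen with prob. c[k-1][r] / c[k][q]
                word.append(letter)
                q = r
                break
            x -= c[k - 1][r]
    return "".join(word)

# Hypothetical example automaton: words over {a, b} with no two consecutive b's.
# State 0: last letter was not b; state 1: last letter was b. Both are final.
DELTA = [{"a": 0, "b": 1}, {"a": 0}]
FINALS = {0, 1}

if __name__ == "__main__":
    print(random_word(DELTA, FINALS, 2, 0, 20))
    # Bit length of the count for n = 1000, to see how large the numbers get.
    print(count_table(DELTA, FINALS, 2, 1000)[1000][0].bit_length())
```

For this example the counts are Fibonacci numbers, so their bit length grows linearly with n. Every step of the random walk then manipulates a number of that size, which is the kind of cost the logarithmic measure accounts for and the uniform measure ignores.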
Author Denise, Alain
Email denise@labri.u-bordeaux.fr
Organization LaBRI, Université Bordeaux I, URA CNRS 1304, 351 cours de la Liberation, F-33405 Talence, France
Copyright 1996
Discipline Mathematics
Computer Science
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License http://www.elsevier.com/open-access/userlicense/1.0
OpenAccessLink https://www.sciencedirect.com/science/article/pii/0304397595002006
PageCount 21
References
[1] Aho, Hopcroft, Ullman. The Design and Analysis of Computer Algorithms. 1974.
[2] Autebert. Langages algébriques. 1987.
[3] Barcucci, Pinzani, Sprugnoli. Génération aléatoire des animaux dirigés. 1991.
[4] Berstel, Reutenauer. Les séries rationnelles et leurs langages. 1984.
[5] Devroye. Non-uniform Random Variate Generation. 1986.
[6] Flajolet, Goldwurm, Steyaert. Random generation and context-free languages. 1990.
[7] Flajolet, Zimmermann, Van Cutsem. A calculus for the random generation of labelled combinatorial structures. Theoret. Comput. Sci. 132 (1994) 1-35.
[8] Goldwurm. Random generation of words in an algebraic language in linear binary space. Inform. Process. Lett. 54 (1995) 229-233.
[9] Hickey, Cohen. Uniform random generation of strings in a context-free language. SIAM J. Comput. 12 (1983) 645-655.
[10] Hochstättler, Loebl, Moll. Generating convex polyominoes at random. Actes du 5 (1993) 267-278.
[11] Knuth. The Art of Computer Programming, Vol. 2: Seminumerical Algorithms. 1969.
[12] Mairson. Generating words in a context free language uniformly at random. Inform. Process. Lett. 49 (1994) 95-99.
[13] Mignotte. Séries rationnelles en une variable. In: Séries formelles en variables non commutatives et applications, actes de la cinquième Ecole de Printemps d'informatique théorique. 1977.
[14] Nijenhuis, Wilf. Combinatorial Algorithms. 1979.
[15] Wilf. A unified setting for sequencing, ranking, and selection algorithms for combinatorial objects. Adv. Math. 24 (1977) 281-291.
[16] Zimmermann. Gaïa: a package for the random generation of combinatorial structures. Maple Tech. 1 (1994) 38-46.