Génération aléatoire uniforme de mots de langages rationnels
Published in | Theoretical computer science Vol. 159; no. 1; pp. 43 - 63 |
---|---|
Main Author | Denise, Alain |
Format | Journal Article |
Language | French |
Published | Elsevier B.V, 28.05.1996 |
ISSN | 0304-3975 (print), 1879-2294 (electronic) |
DOI | 10.1016/0304-3975(95)00200-6 |
Abstract | We give two algorithms for the uniform random generation of words, which apply to particular classes of rational languages. Their efficiency is measured in terms of logarithmic complexity, as a function of the length n of the generated words. The first algorithm is dedicated to languages whose generating series have a unique pole, possibly multiple; its time complexity is of order n log n, and the memory space used is of order log n. The second algorithm is reserved for languages whose generating series have the following property: there exists a unique pole of smallest modulus, and this pole is simple. After a preprocessing step in time polynomial in n, any word can be drawn at random in linear average time and space.
The problem of generating uniformly at random words of a given language has been the subject of extensive study in the last few years. An important part of that work is devoted to the generation of words of context-free languages (see, e.g., [6, 8, 9, 12]). For a given integer n > 0, the words of length n of any unambiguous context-free language can be generated uniformly at random by using algorithms derived from the general method which was introduced by Wilf [14, 15] and systematized by Flajolet et al. [7]. Clearly, this can be applied to the set of rational languages, which constitute an important special case of context-free languages.
Most authors use the uniform measure of complexity (see [1]) to compute the complexity of generation algorithms. This measure is based on the following hypotheses: any simple arithmetic operation (addition, multiplication) has time cost O(1), and any number takes a constant amount of memory space. Thus, we know that words of any rational language can be generated by an algorithm which, with respect to the uniform measure of complexity, runs in linear time (in terms of the length of the words) and constant space [9]. This measure is realistic only if there is a reasonable bound on the numbers involved in the operations. However, the classical random generation algorithms involve operations on numbers which grow exponentially in terms of the length of the words to be generated. Moreover, the programs which make use of these algorithms are generally used to generate very large words, for example for the purpose of studying the asymptotic behavior of some parameters. Therefore, the uniform measure does not reflect the real behavior of such programs. It turns out that the logarithmic measure of complexity is much more realistic: one assumes that the space taken by a number k is O(log k), and that any simple arithmetic operation can be done in time O(log k). It is with respect to this measure that we will evaluate the performance of algorithms in this paper.
Our goal is to design efficient algorithms (in terms of logarithmic complexity) to generate uniformly at random words from certain classes of rational languages. We consider rational languages defined by their minimal finite deterministic automata. When computing complexity, neither the size of the automaton nor the cardinality of the alphabet is taken into account.
In Section 2 we present some background on rational languages and their generating series. We briefly describe the classical method for generating words of such languages and we study its logarithmic complexity. We show that it is at best quadratic for most languages. This is due mainly to computations on numbers which grow exponentially with the length of the words to be generated. In order to improve the efficiency of the algorithms significantly, we must avoid handling large numbers, or at least substantially decrease the frequency of computations on such numbers. An alternative, briefly discussed in [7] and [12], is to compute with floating point numbers instead of integers. In this case, the logarithmic complexity is linear in time. However, using floating point numbers inevitably leads to approximations which prevent the exact uniformity of the generation.
In Sections 3 and 4 we show that, in some cases, we can avoid computations on large numbers entirely or almost entirely, while keeping the exact uniformity of the generation. We determine two classes of rational languages for which this is the case.
Section 3 concerns languages whose associated generating series have a unique singularity. We present a simple version of the classical algorithm, which totally avoids handling of large numbers. The logarithmic complexity of the method is O(n log n) in time and O(log n) in memory space.
Section 4 focuses on languages whose associated generating series have the following property: there exists a unique singularity of minimum modulus, and this singularity is simple. For such languages we give a probabilistic version of the classical algorithm which generates words randomly while avoiding most computations on large numbers. This method needs a preprocessing stage, which can be done in polynomial time and linear space in terms of the length n of the words. Following preprocessing, any word of length n can be generated in average linear time and space. |
---|---|
Author | Denise, Alain |
Author details | Alain Denise (denise@labri.u-bordeaux.fr), LaBRI, Université Bordeaux I, URA CNRS 1304, 351 cours de la Liberation, F-33405 Talence, France |
ContentType | Journal Article |
Copyright | 1996 |
Discipline | Mathematics Computer Science |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | French |
License | http://www.elsevier.com/open-access/userlicense/1.0 |
OpenAccessLink | https://www.sciencedirect.com/science/article/pii/0304397595002006 |
PageCount | 21 |
PublicationDate | 1996-05-28 |
PublicationTitle | Theoretical computer science |
Publisher | Elsevier B.V |
References | [1] Aho, Hopcroft, Ullman, The Design and Analysis of Computer Algorithms, 1974.
[2] Autebert, Langages algébriques, 1987.
[3] Barcucci, Pinzani, Sprugnoli, Génération aléatoire des animaux dirigés, 1991.
[4] Berstel, Reutenauer, Les séries rationnelles et leurs langages, 1984.
[5] Devroye, Non-uniform Random Variate Generation, 1986.
[6] Flajolet, Goldwurm, Steyaert, Random generation and context-free languages, 1990.
[7] Flajolet, Zimmermann, Van Cutsem, A calculus for the random generation of labelled combinatorial structures, Theoret. Comput. Sci. 132 (1994) 1-35.
[8] Goldwurm, Random generation of words in an algebraic language in linear binary space, Inform. Process. Lett. 54 (1995) 229-233.
[9] Hickey, Cohen, Uniform random generation of strings in a context-free language, SIAM J. Comput. 12 (1983) 645-655.
[10] Hochstättler, Loebl, Moll, Generating convex polyominoes at random, in: Actes du 5, 1993, pp. 267-278.
[11] Knuth, The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 1969.
[12] Mairson, Generating words in a context free language uniformly at random, Inform. Process. Lett. 49 (1994) 95-99.
[13] Mignotte, Séries rationnelles en une variable, in: Séries formelles en variables non commutatives et applications, actes de la cinquième Ecole de Printemps d'informatique théorique, 1977.
[14] Nijenhuis, Wilf, Combinatorial Algorithms, 1979.
[15] Wilf, A unified setting for sequencing, ranking, and selection algorithms for combinatorial objects, Adv. Math. 24 (1977) 281-291.
[16] Zimmermann, Gaïa: a package for the random generation of combinatorial structures, Maple Tech. 1 (1994) 38-46. |
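
The classical method mentioned in the abstract (count first, then draw the word letter by letter) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not either of the paper's two algorithms: it assumes the language is given by a deterministic automaton encoded as a Python dictionary, and the names `count_table`, `random_word`, `DELTA`, and `FINALS` are hypothetical. It relies on Python's arbitrary-precision integers, so the generation is exactly uniform, but the counts grow exponentially with n, which is the cost that the logarithmic measure makes visible.

```python
import random

# Hypothetical example automaton (an assumption, not taken from the paper):
# words over {a, b} with no two consecutive b's.
# State 0: the last letter was not b; state 1: the last letter was b.
DELTA = {0: {"a": 0, "b": 1}, 1: {"a": 0}}
FINALS = {0, 1}


def count_table(delta, finals, n):
    """Return counts, where counts[k][q] is the number of words of
    length k that lead from state q to a final state."""
    states = list(delta)
    counts = [{q: (1 if q in finals else 0) for q in states}]
    for _ in range(n):
        prev = counts[-1]
        # A word of length k from q is a letter followed by a word of
        # length k - 1 from the target state of that letter.
        counts.append({q: sum(prev[t] for t in delta[q].values())
                       for q in states})
    return counts


def random_word(delta, initial, counts, n, rng=random):
    """Draw a word of length n uniformly at random among accepted words."""
    if counts[n][initial] == 0:
        raise ValueError("no accepted word of this length")
    word, q = [], initial
    for k in range(n, 0, -1):
        # Uniform integer in [0, counts[k][q]); it selects a letter with
        # probability counts[k-1][target] / counts[k][q].
        r = rng.randrange(counts[k][q])
        for letter, target in sorted(delta[q].items()):
            if r < counts[k - 1][target]:
                word.append(letter)
                q = target
                break
            r -= counts[k - 1][target]
    return "".join(word)


counts = count_table(DELTA, FINALS, 10)
sample = random_word(DELTA, 0, counts, 10)
```

With this example automaton, `counts[n][0]` follows the Fibonacci numbers, so its bit length grows linearly with n. Each of the n draws then manipulates numbers of O(n) bits, which is consistent with the at-best-quadratic logarithmic complexity that the text attributes to the classical method.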