An empirical study of domain knowledge and its benefits to substructure discovery

Bibliographic Details
Published in IEEE Transactions on Knowledge and Data Engineering, Vol. 9, no. 4, pp. 575-586
Main Authors Djoko, S., Cook, D.J., Holder, L.B.
Format Journal Article
Language English
Published New York, NY IEEE 01.07.1997
IEEE Computer Society
ISSN 1041-4347
DOI 10.1109/69.617051

Abstract Discovering repetitive, interesting, and functional substructures in a structural database improves the ability to interpret and compress the data. However, scientists working with a database in their area of expertise often search for predetermined types of structures or for structures exhibiting characteristics specific to the domain. The paper presents a method for guiding the discovery process with domain-specific knowledge. The SUBDUE discovery system is used to evaluate the benefits of using domain knowledge to guide the discovery process. Domain knowledge is incorporated into SUBDUE following a single general methodology. Results show that domain-specific knowledge improves the search for substructures that are useful to the domain and leads to greater compression of the data. To illustrate these benefits, examples and experiments from the computer programming, computer-aided design circuit, and artificially generated domains are presented.
Author Djoko, S. (Dept. of Comput. Sci. & Eng., Texas Univ., Arlington, TX, USA)
Cook, D.J.
Holder, L.B.
CODEN ITKEEH
ContentType Journal Article
Copyright 1997 INIST-CNRS
DatabaseName CrossRef
Pascal-Francis
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Mechanical & Transportation Engineering Abstracts
Engineering Research Database
Discipline Engineering
Computer Science
Applied Sciences
IsPeerReviewed true
IsScholarly true
Issue 4
Keywords System architecture
Circuit design
Domain knowledge
Data compression
Expert system
Knowledge representation
Knowledge base
Data mining
Algorithm
Computational complexity
Graph
System performance
Coding
Database
Minimum description length principle
Inference rule
Computer aided design
Software engineering
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
PageCount 12
PublicationTitleAbbrev TKDE
SubjectTerms Acceleration
Applied sciences
Artificial intelligence
Circuits
Computer science; control theory; systems
Data analysis
Data compression
Design automation
Design methodology
Exact sciences and technology
Information retrieval. Graph
Information systems. Data bases
Learning and adaptive systems
Memory organisation. Data processing
Process design
Programming
Software
Theoretical computing
Transaction databases
URI https://ieeexplore.ieee.org/document/617051
https://www.proquest.com/docview/28449239
https://www.proquest.com/docview/29411560