Segmentation of Multivariate Mixed Data via Lossy Data Coding and Compression
In this paper, based on ideas from lossy data coding and compression, we present a simple but effective technique for segmenting multivariate mixed data that are drawn from a mixture of Gaussian distributions, which are allowed to be almost degenerate. The goal is to find the optimal segmentation th...

| Published in | IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, no. 9, pp. 1546-1562 |
|---|---|
| Main Authors | Yi Ma, H. Derksen, Wei Hong, J. Wright |
| Format | Journal Article | 
| Language | English | 
| Published | Los Alamitos, CA; IEEE; 01.09.2007; IEEE Computer Society; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects | |
| ISSN | 0162-8828 1939-3539  | 
| DOI | 10.1109/TPAMI.2007.1085 | 
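
The abstract describes segmenting data by minimizing the overall coding length subject to a given distortion, with the allowable distortion as the single parameter. This record does not reproduce the paper's formulas or algorithm, so the following is only a minimal sketch under stated assumptions: the coding-length expression in `coding_length` is a commonly quoted form for Gaussian data and is assumed here, the greedy pairwise merging in `segment` is one plausible way to search for a shorter code, and all function names are hypothetical.

```python
import math


def log2_det_spd(a):
    """Return log2(det(a)) for a symmetric positive-definite matrix via Cholesky."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                total += 2.0 * math.log2(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return total


def coding_length(points, eps):
    """Approximate bits needed to code `points` up to distortion eps**2.

    Assumed form (not taken from this record):
    (m + n)/2 * log2 det(I + n/(eps^2 m) * scatter) + n/2 * log2(1 + |mean|^2/eps^2)
    """
    m, n = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / m for d in range(n)]
    scale = n / (eps ** 2 * m)
    mat = [[(1.0 if r == c else 0.0)
            + scale * sum((p[r] - mean[r]) * (p[c] - mean[c]) for p in points)
            for c in range(n)] for r in range(n)]
    mean_bits = 0.5 * n * math.log2(1.0 + sum(v * v for v in mean) / eps ** 2)
    return 0.5 * (m + n) * log2_det_spd(mat) + mean_bits


def total_length(groups, eps):
    """Coding length of every group plus the bits that label group membership."""
    total_points = sum(len(g) for g in groups)
    return sum(coding_length(g, eps) - len(g) * math.log2(len(g) / total_points)
               for g in groups)


def segment(points, eps):
    """Greedy bottom-up merging: merge the pair that shortens the total code most."""
    groups = [[p] for p in points]
    while len(groups) > 1:
        current = total_length(groups, eps)
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                merged = [g for k, g in enumerate(groups) if k not in (i, j)]
                merged.append(groups[i] + groups[j])
                length = total_length(merged, eps)
                if length < current and (best is None or length < best[0]):
                    best = (length, merged)
        if best is None:  # no merge shortens the code, so stop
            break
        groups = best[1]
    return groups
```

In this sketch the number of groups is never specified: it falls out of where merging stops, which depends only on `eps`. Calling `segment(points, eps)` with a list of equal-length numeric tuples returns a list of groups; trying several values of `eps` on the same data is a simple way to see how the group count responds to the distortion level.
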
| Abstract | In this paper, based on ideas from lossy data coding and compression, we present a simple but effective technique for segmenting multivariate mixed data that are drawn from a mixture of Gaussian distributions, which are allowed to be almost degenerate. The goal is to find the optimal segmentation that minimizes the overall coding length of the segmented data, subject to a given distortion. By analyzing the coding length/rate of mixed data, we formally establish some strong connections of data segmentation to many fundamental concepts in lossy data compression and rate-distortion theory. We show that a deterministic segmentation is approximately the (asymptotically) optimal solution for compressing mixed data. We propose a very simple and effective algorithm that depends on a single parameter, the allowable distortion. At any given distortion, the algorithm automatically determines the corresponding number and dimension of the groups and does not involve any parameter estimation. Simulation results reveal intriguing phase-transition-like behaviors of the number of segments when changing the level of distortion or the amount of outliers. Finally, we demonstrate how this technique can be readily applied to segment real imagery and bioinformatic data. | 
    
|---|---|
    
| Author | Yi Ma; Derksen, H.; Wei Hong; Wright, J. |
    
| Author_xml | – sequence: 1 surname: Yi Ma fullname: Yi Ma organization: Univ. of Illinois at Urbana-Champaign, Urbana – sequence: 2 givenname: H. surname: Derksen fullname: Derksen, H. – sequence: 3 surname: Wei Hong fullname: Wei Hong – sequence: 4 givenname: J. surname: Wright fullname: Wright, J.  | 
    
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18972985$$DView record in Pascal Francis https://www.ncbi.nlm.nih.gov/pubmed/17627043$$D View this record in MEDLINE/PubMed  | 
    
    
| CODEN | ITPIDJ | 
    
    
| Cites_doi | 10.1198/016214501753168398 10.1109/ICCV.2005.112 10.1109/TIT.2002.804056 10.1109/18.720544 10.1109/5.726788 10.1109/18.340468 10.1080/01621459.1963.10500845 10.1017/CBO9780511810817 10.1007/3-540-47979-1_53 10.1002/9780470191613 10.1016/0005-1098(78)90005-5 10.1017/CBO9780511807213 10.1017/CBO9780511804441 10.1109/TIT.1982.1056489 10.1007/978-0-387-21606-5 10.1109/34.990138 10.1109/18.720554 10.1023/A:1011174803800 10.1145/331499.331504 10.1071/BT9660127 10.1162/089976699300016728 10.2307/2984875 10.1162/089976600300015088 10.1109/CVPR.2003.1211332 10.1109/TPAMI.2005.244 10.1002/047174882x  | 
    
| ContentType | Journal Article | 
    
| Copyright | 2007 INIST-CNRS Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2007  | 
    
| Copyright_xml | – notice: 2007 INIST-CNRS – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2007  | 
    
    
| DOI | 10.1109/TPAMI.2007.1085 | 
    
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Pascal-Francis Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts  Academic Computer and Information Systems Abstracts Professional ANTE: Abstracts in New Technology & Engineering Engineering Research Database MEDLINE - Academic  | 
    
| DatabaseTitle | CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional Engineering Research Database ANTE: Abstracts in New Technology & Engineering MEDLINE - Academic  | 
    
| DatabaseTitleList | Technology Research Database Technology Research Database MEDLINE Technology Research Database MEDLINE - Academic Technology Research Database  | 
    
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: EIF name: MEDLINE url: https://proxy.k.utb.cz/login?url=https://www.webofscience.com/wos/medline/basic-search sourceTypes: Index Database – sequence: 3 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher  | 
    
| DeliveryMethod | fulltext_linktorsrc | 
    
| Discipline | Engineering Computer Science Applied Sciences  | 
    
| EISSN | 1939-3539 | 
    
| EndPage | 1562 | 
    
| ExternalDocumentID | 2333975221 17627043 18972985 10_1109_TPAMI_2007_1085 4288157  | 
    
| Genre | orig-research Research Support, U.S. Gov't, Non-P.H.S Journal Article  | 
    
    
    
| IEDL.DBID | RIE | 
    
| ISSN | 0162-8828 | 
    
| IngestDate | Sun Sep 28 01:34:49 EDT 2025 Sat Sep 27 18:26:23 EDT 2025 Sun Sep 28 10:54:16 EDT 2025 Sun Sep 28 07:34:02 EDT 2025 Sun Sep 07 03:46:26 EDT 2025 Mon Jul 21 06:03:50 EDT 2025 Mon Jul 21 09:16:03 EDT 2025 Wed Oct 01 06:44:19 EDT 2025 Thu Apr 24 23:02:07 EDT 2025 Wed Aug 27 02:47:51 EDT 2025  | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 9 | 
    
| Keywords | Cluster analysis; Parameter estimation; Multivariate mixed data; Image processing; Data compression; Degenerate system; DNA chip; data clustering; Modeling; Phase transitions; Outlier; lossy coding; Classification; System identification; Pattern analysis; Deterministic approach; Bioinformatics; microarray data clustering; Mixed distribution; lossy compression; Rate distortion theory; data segmentation; Image segmentation; rate distortion; Gaussian process; Optimal solution; Asymptotic approximation; Artificial intelligence |
    
| Language | English | 
    
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html CC BY 4.0  | 
    
| LinkModel | DirectLink | 
    
    
| Notes | ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Feature-1 content type line 14 content type line 23 ObjectType-Article-1 ObjectType-Feature-2  | 
    
| PMID | 17627043 | 
    
| PQID | 864117629 | 
    
| PQPubID | 23500 | 
    
| PageCount | 17 | 
    
| ParticipantIDs | crossref_citationtrail_10_1109_TPAMI_2007_1085 proquest_miscellaneous_34452207 proquest_miscellaneous_880692162 pubmed_primary_17627043 proquest_miscellaneous_70707061 crossref_primary_10_1109_TPAMI_2007_1085 proquest_miscellaneous_903642824 proquest_journals_864117629 pascalfrancis_primary_18972985 ieee_primary_4288157  | 
    
| ProviderPackageCode | CITATION AAYXX  | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2007-09-01 | 
    
| PublicationDateYYYYMMDD | 2007-09-01 | 
    
| PublicationDate_xml | – month: 09 year: 2007 text: 2007-09-01 day: 01  | 
    
| PublicationDecade | 2000 | 
    
| PublicationPlace | Los Alamitos, CA | 
    
| PublicationPlace_xml | – name: Los Alamitos, CA – name: United States – name: New York  | 
    
| PublicationTitle | IEEE transactions on pattern analysis and machine intelligence | 
    
| PublicationTitleAbbrev | TPAMI | 
    
| PublicationTitleAlternate | IEEE Trans Pattern Anal Mach Intell | 
    
| PublicationYear | 2007 | 
    
| Publisher | IEEE IEEE Computer Society The Institute of Electrical and Electronics Engineers, Inc. (IEEE)  | 
    
| Publisher_xml | – name: IEEE – name: IEEE Computer Society – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)  | 
    
    
| SSID | ssj0014503 | 
    
| Score | 2.4640734 | 
    
    
| SourceID | proquest pubmed pascalfrancis crossref ieee  | 
    
| SourceType | Aggregation Database Index Database Enrichment Source Publisher  | 
    
| StartPage | 1546 | 
    
| SubjectTerms | Algorithms Applied sciences Artificial Intelligence Asymptotic properties Bioinformatics Coding Compressing Computer science; control theory; systems Computer Simulation Data Clustering Data compression Data Compression - methods Data Interpretation, Statistical Data Segmentation Distortion Exact sciences and technology Gaussian distribution Image coding Image segmentation Lossy Coding Lossy Compression Maximum likelihood estimation Microarray Data Clustering Models, Statistical Multivariate Analysis Multivariate Mixed Data Normal Distribution Optimization Parameter estimation Pattern Recognition, Automated - methods Pattern recognition. Digital image processing. Computational geometry Phase distortion Rate Distortion Segmentation Segments Signal processing algorithms Studies  | 
    
| Title | Segmentation of Multivariate Mixed Data via Lossy Data Coding and Compression | 
    
| URI | https://ieeexplore.ieee.org/document/4288157 https://www.ncbi.nlm.nih.gov/pubmed/17627043 https://www.proquest.com/docview/864117629 https://www.proquest.com/docview/34452207 https://www.proquest.com/docview/70707061 https://www.proquest.com/docview/880692162 https://www.proquest.com/docview/903642824  | 
    
| Volume | 29 | 
    
| hasFullText | 1 | 
    
| inHoldings | 1 | 
    
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 1939-3539 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0014503 issn: 0162-8828 databaseCode: RIE dateStart: 19790101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE  | 
    