Segmentation of Multivariate Mixed Data via Lossy Data Coding and Compression

Bibliographic Details
Published in IEEE transactions on pattern analysis and machine intelligence Vol. 29; no. 9; pp. 1546-1562
Main Authors Yi Ma; Derksen, H.; Wei Hong; Wright, J.
Format Journal Article
Language English
Published Los Alamitos, CA IEEE 01.09.2007
IEEE Computer Society
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 0162-8828
EISSN 1939-3539
DOI 10.1109/TPAMI.2007.1085

Abstract In this paper, based on ideas from lossy data coding and compression, we present a simple but effective technique for segmenting multivariate mixed data that are drawn from a mixture of Gaussian distributions, which are allowed to be almost degenerate. The goal is to find the optimal segmentation that minimizes the overall coding length of the segmented data, subject to a given distortion. By analyzing the coding length/rate of mixed data, we formally establish some strong connections of data segmentation to many fundamental concepts in lossy data compression and rate-distortion theory. We show that a deterministic segmentation is approximately the (asymptotically) optimal solution for compressing mixed data. We propose a very simple and effective algorithm that depends on a single parameter, the allowable distortion. At any given distortion, the algorithm automatically determines the corresponding number and dimension of the groups and does not involve any parameter estimation. Simulation results reveal intriguing phase-transition-like behaviors of the number of segments when changing the level of distortion or the amount of outliers. Finally, we demonstrate how this technique can be readily applied to segment real imagery and bioinformatic data.
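
The abstract describes the algorithm only in words, so the following is a minimal sketch of how such a coding-length-driven segmentation might look, not the authors' implementation. It assumes the zero-mean coding length commonly stated for this method, L(W) = (m + n)/2 · log2 det(I + n/(ε²m) W Wᵀ) for an n × m data matrix W, plus a per-point membership cost of −log2(group size / m); neither formula appears in this record. The function names `coding_length`, `group_cost`, and `segment` are hypothetical, and the greedy pairwise merge is one plausible way to search for a short code.

```python
import numpy as np


def coding_length(W, eps):
    """Bits to code the columns of W (n x m) up to distortion eps**2.

    Assumed zero-mean form: (m + n) / 2 * log2 det(I + n / (eps^2 m) W W^T).
    """
    n, m = W.shape
    gram = np.eye(n) + (n / (eps ** 2 * m)) * (W @ W.T)
    _, logdet = np.linalg.slogdet(gram)  # natural log of the determinant
    return 0.5 * (m + n) * logdet / np.log(2.0)


def group_cost(W, idx, eps):
    """Coding length of one group plus the bits that label its members."""
    m = W.shape[1]
    return coding_length(W[:, idx], eps) - len(idx) * np.log2(len(idx) / m)


def segment(W, eps):
    """Greedy pairwise merging; returns one integer label per column of W."""
    m = W.shape[1]
    groups = [[j] for j in range(m)]  # start with every point on its own
    costs = [group_cost(W, g, eps) for g in groups]
    while len(groups) > 1:
        best_gain, best = 0.0, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                merged = group_cost(W, groups[a] + groups[b], eps)
                gain = costs[a] + costs[b] - merged  # bits saved by merging
                if gain > best_gain:
                    best_gain, best = gain, (a, b, merged)
        if best is None:  # no merge shortens the code: stop
            break
        a, b, merged = best
        groups[a] = groups[a] + groups[b]
        costs[a] = merged
        del groups[b], costs[b]
    labels = np.empty(m, dtype=int)
    for k, g in enumerate(groups):
        labels[g] = k
    return labels
```

In this sketch the distortion `eps` is the only tuning parameter, matching the abstract's description: the number of groups is not supplied but falls out of where merging stops. A very large `eps` makes every merge worthwhile and collapses the data into a single group, while a smaller `eps` tends to leave more groups.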
Author Affiliation Yi Ma: Univ. of Illinois at Urbana-Champaign, Urbana
CODEN ITPIDJ
Copyright 2007 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2007
Discipline Engineering
Computer Science
Applied Sciences
Genre orig-research
Research Support, U.S. Gov't, Non-P.H.S
Journal Article
Peer Reviewed Yes
Scholarly Yes
Keywords Cluster analysis
Parameter estimation
Multivariate mixed data
Image processing
Data compression
Degenerate system
DNA chip
data clustering
Modeling
Phase transitions
Outlier
lossy coding
Classification
System identification
Pattern analysis
Deterministic approach
Bioinformatics
microarray data clustering
Mixed distribution
lossy compression
Rate distortion theory
data segmentation
Image segmentation
rate distortion
Gaussian process
Optimal solution
Asymptotic approximation
Artificial intelligence
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
PMID 17627043
SubjectTerms Algorithms
Applied sciences
Artificial Intelligence
Asymptotic properties
Bioinformatics
Coding
Compressing
Computer science; control theory; systems
Computer Simulation
Data Clustering
Data compression
Data Compression - methods
Data Interpretation, Statistical
Data Segmentation
Distortion
Exact sciences and technology
Gaussian distribution
Image coding
Image segmentation
Lossy Coding
Lossy Compression
Maximum likelihood estimation
Microarray Data Clustering
Models, Statistical
Multivariate Analysis
Multivariate Mixed Data
Normal Distribution
Optimization
Parameter estimation
Pattern Recognition, Automated - methods
Pattern recognition. Digital image processing. Computational geometry
Phase distortion
Rate Distortion
Segmentation
Segments
Signal processing algorithms
Studies