Task allocation and reallocation for fault tolerance in multicomputer systems

Published in IEEE Transactions on Aerospace and Electronic Systems, Vol. 30, no. 4, pp. 1094-1104
Main Authors Chen, C.-I.H., Cherkassky, V.
Format Journal Article
Language English
Published IEEE 01.10.1994
ISSN 0018-9251
DOI 10.1109/7.328753

Abstract The goal of task allocation in a set of interconnected processors (computers) is to maximize the efficient use of resources and thus reduce job turnaround time. Proposed is a simple yet effective method for allocating tasks in multicomputer systems that minimizes the interprocessor communication cost subject to resource limitations defined by the system and the designer. The limitations can be viewed as results of load balancing, since the execution time of each task, the number of available processors, the processor speed, and the memory capacity are known to the system or designer. As the number of processors increases, the probability that a failure exists somewhere in the system at any time also increases. Very few established task allocation models have considered the reliability property. In multicomputer systems, we define system reliability as the probability that the system can run the tasks successfully. After the (nonredundant) task scheduling strategy is defined, tasks are then reallocated to processors statically and redundantly. This is a form of time redundancy: if some processors fail during execution, all tasks can still be completed on the remaining processors (but in a longer time). Because tasks are preallocated statically, this method is simpler and thus more practical than well-known dynamic reconfiguration and rollback recovery techniques in multicomputer systems. We demonstrate the effectiveness of task allocation and reallocation for hardware fault tolerance by applying the methods to different examples and to a practical communications network multiprocessor system.
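The two steps the abstract describes (a nonredundant allocation that minimizes interprocessor communication cost under resource limits, followed by a static, redundant reallocation for each possible processor failure) can be illustrated with a small sketch. This is not the authors' algorithm, which the record does not reproduce. It assumes a single resource limit (a per-processor load cap), an exhaustive search that is only practical for a handful of tasks, and a greedy "largest task to least-loaded survivor" backup rule that ignores communication cost. The names `allocate`, `backup_plan`, `exec_time`, `comm`, and `load_cap` are hypothetical.

```python
from itertools import product


def allocate(exec_time, comm, n_procs, load_cap):
    """Exhaustive search (assumption: small task sets only).

    Returns (placement, cost) with the lowest interprocessor communication
    cost among placements whose per-processor load stays within load_cap,
    or (None, None) if no placement is feasible.
    """
    tasks = list(exec_time)
    best, best_cost = None, None
    for assignment in product(range(n_procs), repeat=len(tasks)):
        placement = dict(zip(tasks, assignment))
        load = [0.0] * n_procs
        for task, proc in placement.items():
            load[proc] += exec_time[task]
        if max(load) > load_cap:
            continue  # violates the resource limitation
        # Communication is paid only between tasks on different processors.
        cost = sum(c for (a, b), c in comm.items() if placement[a] != placement[b])
        if best_cost is None or cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


def backup_plan(placement, exec_time, n_procs):
    """Static reallocation table for single-processor failures (n_procs >= 2).

    For each processor that might fail, its tasks are preassigned to the
    least-loaded surviving processor, largest task first (a hypothetical
    greedy rule). The load cap is deliberately not enforced here: after a
    failure the job still completes, only in a longer time.
    """
    plan = {}
    for failed in range(n_procs):
        survivors = [p for p in range(n_procs) if p != failed]
        load = {p: 0.0 for p in survivors}
        new_placement = {}
        for task, proc in placement.items():
            if proc != failed:
                new_placement[task] = proc
                load[proc] += exec_time[task]
        displaced = [t for t, p in placement.items() if p == failed]
        for task in sorted(displaced, key=lambda t: -exec_time[t]):
            target = min(survivors, key=lambda p: load[p])
            new_placement[task] = target
            load[target] += exec_time[task]
        plan[failed] = new_placement
    return plan


# Made-up example data: four tasks, three processors, load cap of 4 time units.
exec_time = {"A": 2, "B": 2, "C": 1, "D": 1}
comm = {("A", "B"): 5, ("B", "C"): 1, ("C", "D"): 4}
placement, cost = allocate(exec_time, comm, n_procs=3, load_cap=4)
plan = backup_plan(placement, exec_time, n_procs=3)
```

Because the backup table is computed before execution starts, recovering from a failure reduces to a lookup in `plan`, which reflects the abstract's argument that static preallocation is simpler than dynamic reconfiguration or rollback recovery.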
Author Cherkassky, V.
Chen, C.-I.H.
Author_xml – sequence: 1
  givenname: C.-I.H.
  surname: Chen
  fullname: Chen, C.-I.H.
  organization: Dept. of Electr. Eng., Wright State Univ., Dayton, OH, USA
– sequence: 2
  givenname: V.
  surname: Cherkassky
  fullname: Cherkassky, V.
CODEN IEARAX
CitedBy_id crossref_primary_10_1109_TAES_2014_130690
crossref_primary_10_1109_TPDS_2010_34
crossref_primary_10_1109_24_994922
Cites_doi 10.1109/TC.1985.6312211
10.1137/0603056
10.1109/MC.1986.1663180
10.1109/TC.1987.1676966
10.1109/TR.1982.5221436
10.1109/TC.1979.1675348
10.1109/TC.1976.1674656
10.1109/TC.1984.1676479
10.1109/FTCS.1991.146684
10.1109/TSE.1987.233201
10.1109/MC.1984.1659213
10.1109/TC.1985.1676563
10.1109/TC.1980.1675654
10.1016/0026-2714(84)90221-X
10.1109/TC.1986.1676799
10.1002/j.1538-7305.1970.tb01770.x
10.1109/TC.1984.1676403
ContentType Journal Article
DOI 10.1109/7.328753
DatabaseName CrossRef
Technology Research Database
Aerospace Database
Advanced Technologies Database with Aerospace
Electronics & Communications Abstracts
Mechanical & Transportation Engineering Abstracts
Engineering Research Database
Discipline Engineering
EndPage 1104
ExternalDocumentID 10_1109_7_328753
328753
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
PublicationCentury 1900
PublicationDate 1994-10-01
PublicationDateYYYYMMDD 1994-10-01
PublicationDate_xml – month: 10
  year: 1994
  text: 1994-10-01
  day: 01
PublicationDecade 1990
PublicationTitle IEEE transactions on aerospace and electronic systems
PublicationTitleAbbrev T-AES
PublicationYear 1994
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 1094
SubjectTerms Communication networks
Costs
Fault tolerance
Fault tolerant systems
Hardware
Load management
Processor scheduling
Redundancy
Reliability
Resource management
Title Task allocation and reallocation for fault tolerance in multicomputer systems
URI https://ieeexplore.ieee.org/document/328753
https://www.proquest.com/docview/27292716
https://www.proquest.com/docview/28684562
https://www.proquest.com/docview/29086083
Volume 30