Task allocation and reallocation for fault tolerance in multicomputer systems
The goal of task allocation in a set of interconnected processors (computers) is to maximize the efficient use of resources and thus reduce the job turnaround time. Proposed is a simple yet effective method to allocate the tasks in multicomputer systems for minimizing the interprocessor communicatio...
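The two ideas in this summary, allocating tasks to minimize interprocessor communication cost under resource limits and then statically placing redundant copies so the tasks survive a processor failure, can be illustrated with a small sketch. This is not the algorithm from the paper, which this record does not describe. It is an exhaustive search over a toy instance. Every name in it (`allocate`, `backup_allocation`, `survives`, the `traffic` dictionary, the "next processor" backup rule) is a hypothetical choice made for illustration only.

```python
from itertools import product


def communication_cost(assignment, traffic):
    """Sum the traffic between every pair of tasks placed on different processors."""
    return sum(
        volume
        for (a, b), volume in traffic.items()
        if assignment[a] != assignment[b]
    )


def is_feasible(assignment, exec_time, capacity):
    """Check that no processor exceeds its (assumed) execution-time limit."""
    load = [0] * len(capacity)
    for task, proc in enumerate(assignment):
        load[proc] += exec_time[task]
    return all(used <= limit for used, limit in zip(load, capacity))


def allocate(exec_time, traffic, capacity):
    """Exhaustively search all assignments; only sensible for tiny instances."""
    best, best_cost = None, float("inf")
    for assignment in product(range(len(capacity)), repeat=len(exec_time)):
        if not is_feasible(assignment, exec_time, capacity):
            continue
        cost = communication_cost(assignment, traffic)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


def backup_allocation(primary, num_procs):
    """Assumed static redundancy rule: a task's backup copy sits on the next processor."""
    return tuple((proc + 1) % num_procs for proc in primary)


def survives(primary, backup, failed):
    """True if every task still has a copy on some working processor."""
    return all(p not in failed or b not in failed for p, b in zip(primary, backup))


# Toy instance: four tasks, two processors, each limited to 4 time units.
exec_time = [2, 2, 2, 2]
capacity = [4, 4]
traffic = {(0, 1): 5, (2, 3): 5, (1, 2): 1}  # (task, task) -> communication volume

primary, cost = allocate(exec_time, traffic, capacity)
backup = backup_allocation(primary, len(capacity))
print(primary, cost, backup)
```

In this toy instance the heavily communicating task pairs end up on the same processor, and a single processor failure leaves one copy of every task on the surviving processor, which then runs all of them at the price of a longer completion time.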
| Published in | IEEE transactions on aerospace and electronic systems, Vol. 30, no. 4, pp. 1094-1104 |
|---|---|
| Main Authors | Chen, C.-I.H.; Cherkassky, V. |
| Format | Journal Article |
| Language | English |
| Published | IEEE, 01.10.1994 |
| ISSN | 0018-9251 |
| DOI | 10.1109/7.328753 |
| Abstract | The goal of task allocation in a set of interconnected processors (computers) is to maximize the efficient use of resources and thus reduce the job turnaround time. Proposed is a simple yet effective method to allocate the tasks in multicomputer systems for minimizing the interprocessor communication cost subject to resource limitations defined by the system and designer. The limitations can be viewed as results from the load balancing since the execution time of each task, the number of available processors, processor speed, and memory capacity are known to the system or designer. As the number of processors increases, the probability of a failure existing somewhere in the systems at any time also increases. Very few established task allocation models have considered the reliability property. In multicomputer systems, we define system reliability as the probability that the system can run the tasks successfully. After the (nonredundant) task scheduling strategy is defined, tasks are then reallocated to processors statically and redundantly. This is a form of time redundancy, in which if some processors fail during the execution, all tasks can be completed on the remaining processors (but at a longer time). Due to static preallocation of tasks this method is simpler and thus more practical than well-known dynamic reconfiguration and rollback recovery techniques in multicomputer systems. We demonstrate the effectiveness of the task allocation and reallocation for hardware fault tolerance by illustrations of applying the methods to different examples and practical communications network multiprocessor system. |
    
    
| Author | Cherkassky, V.; Chen, C.-I.H. |
    
| Author_xml | 1. Chen, C.-I.H. (Dept. of Electr. Eng., Wright State Univ., Dayton, OH, USA); 2. Cherkassky, V. |
    
| CODEN | IEARAX | 
    
| CitedBy_id | crossref_primary_10_1109_TAES_2014_130690 crossref_primary_10_1109_TPDS_2010_34 crossref_primary_10_1109_24_994922  | 
    
| ContentType | Journal Article | 
    
| DatabaseName | CrossRef; Technology Research Database; Aerospace Database; Advanced Technologies Database with Aerospace; Electronics & Communications Abstracts; Mechanical & Transportation Engineering Abstracts; Engineering Research Database |
    
| Discipline | Engineering | 
    
| EndPage | 1104 | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 4 | 
    
| Language | English | 
    
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html | 
    
| PageCount | 11 | 
    
| PublicationCentury | 1900 | 
    
| PublicationDate | 1994-10-01 | 
    
| PublicationDecade | 1990 | 
    
| PublicationTitle | IEEE transactions on aerospace and electronic systems | 
    
| PublicationTitleAbbrev | T-AES | 
    
| PublicationYear | 1994 | 
    
| Publisher | IEEE | 
    
    
| StartPage | 1094 | 
    
| SubjectTerms | Communication networks; Costs; Fault tolerance; Fault tolerant systems; Hardware; Load management; Processor scheduling; Redundancy; Reliability; Resource management |
    
| Title | Task allocation and reallocation for fault tolerance in multicomputer systems | 
    
| URI | https://ieeexplore.ieee.org/document/328753 https://www.proquest.com/docview/27292716 https://www.proquest.com/docview/28684562 https://www.proquest.com/docview/29086083  | 
    
| Volume | 30 | 
    