Search Results (5)
  • Open Access

    ARTICLE

    High Throughput Scheduling Algorithms for Input Queued Packet Switches

    R. Chithra Devi1,*, D. Jemi Florinabel2, Narayanan Prasanth3

    CMC-Computers, Materials & Continua, Vol.70, No.1, pp. 1527-1540, 2022, DOI:10.32604/cmc.2022.019343

    Abstract The high-performance computing paradigm needs high-speed switching fabrics to meet the heavy traffic generated by its applications. These switching fabrics are efficiently driven by the deployed scheduling algorithms. In this paper, we propose two scheduling algorithms for input queued switches whose operations are based on ranking procedures. First, we propose a Simple 2-Bit (S2B) scheme, which uses a binary ranking procedure and queue size for scheduling the packets. Here, the Virtual Output Queue (VOQ) set with the maximum number of empty queues receives a higher rank than other VOQs. Through simulation, we show that S2B has better throughput performance than Highest Ranking First…

  • Open Access

    ARTICLE

    AAP4All: An Adaptive Auto Parallelization of Serial Code for HPC Systems

    M. Usman Ashraf1,*, Fathy Alburaei Eassa2, Leon J. Osterweil3, Aiiad Ahmad Albeshri2, Abdullah Algarni2, Iqra Ilyas4

    Intelligent Automation & Soft Computing, Vol.30, No.2, pp. 615-639, 2021, DOI:10.32604/iasc.2021.019044

    Abstract High Performance Computing (HPC) technologies aim to increase system performance across many disciplines. The primary challenge in HPC systems is how to achieve massive performance with minimum power consumption. However, modern HPC systems are configured by adding powerful and energy-efficient multi-core/many-core parallel computing devices such as GPUs, MICs, and FPGAs. Due to the increasing complexity of single-chip many-core/multi-core systems, only a well-balanced and optimized parallel programming technique can provide a substantial increase in performance under power consumption limitations. Conventionally, researchers face various barriers while parallelizing their serial code because they don’t…

  • Open Access

    ARTICLE

    A Dynamically Reconfigurable Accelerator Design Using a Sparse-Winograd Decomposition Algorithm for CNNs

    Yunping Zhao, Jianzhuang Lu*, Xiaowen Chen

    CMC-Computers, Materials & Continua, Vol.66, No.1, pp. 517-535, 2021, DOI:10.32604/cmc.2020.012380

    Abstract Convolutional Neural Networks (CNNs) are widely used in many fields. Due to their high throughput and high level of computing characteristics, however, an increasing number of researchers are focusing on how to improve the computational efficiency, hardware utilization, or flexibility of CNN hardware accelerators. Accordingly, this paper proposes a dynamically reconfigurable accelerator architecture that implements a Sparse-Winograd F(2 × 2, 3 × 3)-based high-parallelism hardware architecture. This approach not only eliminates the pre-calculation complexity associated with the Winograd algorithm, thereby reducing the difficulty of hardware implementation, but also greatly improves the flexibility of the hardware; as a result, the accelerator can realize the…

  • Open Access

    ARTICLE

    An Implementation of the Longman's Integration Method on Graphics Hardware

    E. Mesquita1, J. Labaki1 and L.O.S. Ferreira1

    CMES-Computer Modeling in Engineering & Sciences, Vol.51, No.2, pp. 143-168, 2009, DOI:10.3970/cmes.2009.051.143

    Abstract There is a growing trend towards solving problems of computational mechanics by parallelization strategies. The traditional approach is to implement the parallelization procedures on CPUs based on the MPI or OpenMP paradigms. Recent efforts have been made to implement computational tasks on general-purpose programmable graphics hardware (GPGPU). The GPU is especially well suited to address problems that can be formulated in the form of data-parallel computations with high arithmetic intensity. This work addresses the implementation of Longman's integration method on graphics hardware. A serial implementation of Longman's method was rewritten under the SIMD (Single Instruction Multiple Data) parallel programming paradigm. The…

  • Open Access

    ARTICLE

    HPC: Its application in Climate Modelling

    Ravi S. Nanjundiah1

    CMES-Computer Modeling in Engineering & Sciences, Vol.27, No.1&2, pp. 1-24, 2008, DOI:10.3970/cmes.2008.027.001

    Abstract In this paper, the application of high performance computing to climate modelling, with specific reference to global General Circulation Models (GCMs), is discussed. Methods of parallelization of global atmospheric models based on their numerical schemes are presented. It is seen that there is an interesting co-evolution of computer architecture and the type of numerical schemes used in general circulation models. A detailed survey of the Indian HPC scenario for meteorological computing is presented. Innovative and pioneering aspects of Indian efforts are highlighted.
