High Performance Computing

Single Flexible Architecture – Multiple Applications

Efficient reconfigurable computing and High Performance Computing (HPC) technology will be critical to industrial competitiveness and essential for stimulating innovation, productivity and growth across a wide range of industrial sectors. Cyceera’s vision is to open new horizons in these sectors by leveraging the key features of its novel manycore technology, outlined below.

Cyceera’s patented manycore architecture provides a wealth of novel features that combine processor programmability and FPGA-style flexibility with the performance characteristics of an ASIC device. Our revolutionary manycore architecture and parallel programming tools let developers harness the processing power of an array of shared resources on a single device. In turn, this facilitates the creation of the next generation of standard products.

Cyceera’s heterogeneous manycore technology is realised as a family of Intellectual Property (IP) Cores that can be interconnected via a powerful Network-on-Chip (NoC), using a simple glueless interface, to form a scalable, heterogeneous, parallel processing array. A key feature of our technology is a novel self-synchronising multithreading mechanism that allows different threads to utilise the same resources. As with data path partitioning, program control can also be segmented and distributed across the manycore array. The manycore architecture model incorporates both fine-grained (logic level) and coarse-grained Function Blocks (FBs) that can be interconnected in any combination via the switch-fabric-based NoC. This provides a hybrid dataflow and control-flow architecture, allowing algorithms derived from a directed dataflow graph and from control flow constructs to be mapped directly onto the FB resources within the array. The FB processing resources can be shared by any thread at run time, which leads to better utilisation of device resources.
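
The idea of threads sharing Function Blocks can be illustrated with a small software model. The sketch below is only a toy simulation written for this page's explanation, not Cyceera's hardware, NoC protocol or programming tools. The names `Packet`, `FunctionBlock`, `run` and `demo` are hypothetical, and the one-operation-per-FB-per-cycle rule is an assumption made to keep the model simple. Each thread is represented as a packet that carries its own route through the array, so no central controller decides where it goes next.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Tuple


@dataclass
class Packet:
    """Hypothetical thread packet: a value plus the FB names still to visit."""
    thread_id: str
    value: int
    route: Deque[str]  # assumed non-empty when the packet is injected


@dataclass
class FunctionBlock:
    """Hypothetical coarse-grained FB: one operation, one shared input queue."""
    name: str
    op: Callable[[int], int]
    queue: Deque[Packet] = field(default_factory=deque)
    busy_cycles: int = 0


def run(fbs: Dict[str, FunctionBlock],
        packets: List[Packet]) -> Tuple[Dict[str, int], int]:
    """Step the array until every packet has finished its route."""
    results: Dict[str, int] = {}
    for p in packets:
        # Inject each packet at the first FB named in its route.
        fbs[p.route.popleft()].queue.append(p)

    cycles = 0
    while any(fb.queue for fb in fbs.values()):
        cycles += 1
        # Assumption: every FB with pending work fires once per cycle,
        # serving whichever thread is at the head of its queue.
        fired = [(fb, fb.queue.popleft()) for fb in fbs.values() if fb.queue]
        for fb, p in fired:
            p.value = fb.op(p.value)
            fb.busy_cycles += 1
            if p.route:
                # The packet routes itself to its next FB.
                fbs[p.route.popleft()].queue.append(p)
            else:
                results[p.thread_id] = p.value
    return results, cycles


def demo() -> Tuple[Dict[str, int], int, Dict[str, int]]:
    """Two threads with different dataflow graphs share the same three FBs."""
    fbs = {
        "add3": FunctionBlock("add3", lambda x: x + 3),
        "dbl": FunctionBlock("dbl", lambda x: x * 2),
        "neg": FunctionBlock("neg", lambda x: -x),
    }
    packets = [
        Packet("A", 1, deque(["add3", "dbl"])),         # (1 + 3) * 2
        Packet("B", 5, deque(["dbl", "add3", "neg"])),  # -((5 * 2) + 3)
    ]
    results, cycles = run(fbs, packets)
    busy = {name: fb.busy_cycles for name, fb in fbs.items()}
    return results, cycles, busy


if __name__ == "__main__":
    print(demo())
```

In this model, `demo()` finishes both threads in three cycles, with the `add3` and `dbl` blocks each serving both threads. A real device would also need to handle arbitration, flow control and multi-input operations, all of which this sketch leaves out.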

Novel Features

  • Self-Synchronising Thread Mechanism, Transparent To Programmer
  • Segmented & Shared Instruction Control
  • Hybrid Dataflow & Control Architecture
  • Function Block Resources Can Be Shared By Different Threads
  • Autonomous Routing Of Thread Packets – Dynamic Task Scheduling
  • Interrupt Handling

Benefits

  • Distributed, Segmented Control Logic
    • Reduced Area & Power
    • Single Cycle Control
    • Greater Throughput
  • Efficient Mapping Of High-Level Code To Hardware Resources
    • Reduced Program Code and Memory
    • Simpler Thread Synchronisation
    • Simpler Debugging
  • Configurable, Scalable, User Definable
    • Greater Product Differentiation
  • All Levels Of Parallelism
    • Instruction, Logic, Data, Task, Thread, Storage & IO

Traits of Conventional Multicore Devices

  • Replicated Core – Brute Force Approach
  • Large Complex Control Unit Per Core
  • Shared Bus Bottlenecks
  • Coarse & Inefficient Code Partitioning
  • Sequential Processing
    • Very High Clock Frequency
    • Lacks Parallelism
    • Many Instructions
    • Longer Processing Times
  • Weak Thread Synchronisation
  • Poor Resource Utilisation
  • Larger Silicon Real Estate