Programming Model

Manycore Parallel Programming

Cyceera’s innovative technology will give system developers, both software and hardware, the means to analyse and implement highly parallel problems easily. Similar to Partitioned Global Address Space (PGAS), our programming model gives programmers a shared address space that simplifies programming while improving performance through locality awareness. The programming model is supported directly by the hardware, but designers do not need to be fully familiar with the internal operations of the hardware to program the device. The novel self-synchronising threads, the autonomous token passing and the mapping of code to hardware are all transparent to the user. This reduces the coding burden on the programmer, minimises errors and improves productivity. The programming model also exposes different levels of hardware detail, so an implementation can be written at the level of abstraction that suits the programmer’s skill set and the application’s requirements.

The programming model incorporates a dataflow-style approach. A problem space is recursively partitioned into multiple threads to exploit the fine-grained parallelism provided by Cyceera’s architecture. The result is analogous to a dynamic directed graph in which the nodes represent Function Blocks. The threads run on the array of shared-resource Function Blocks, and independent threads can access the resources of the same Function Block. Operations or tasks within a thread can be coded as codelets that run on processor-type Function Blocks (e.g. VLIW), supplied as configuration data that configures Function Blocks to implement the tasks in hardware, or both. Because the Function Blocks are addressable, results are routed automatically, in the form of tokens, to other Function Blocks as defined in the directed graph, thereby implementing the desired algorithm.
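The directed-graph view can be illustrated with a small software model. The sketch below is not Cyceera's API; the names FunctionBlock, accept and run, the integer addresses and the (address, port, value) token layout are all assumptions made for illustration. It models the expression (a + b) * (c - d) as three addressable blocks and routes result tokens to the destinations listed in the graph.

```python
from collections import deque
import operator


class FunctionBlock:
    """Hypothetical stand-in for an addressable Function Block."""

    def __init__(self, address, op, arity, dests):
        self.address = address
        self.op = op          # task the block performs
        self.arity = arity    # number of input ports
        self.dests = dests    # (address, port) pairs that receive the result token
        self.inputs = {}      # port -> value currently waiting on that port

    def accept(self, port, value):
        """Store a token; return a result once every input port holds one."""
        self.inputs[port] = value
        if len(self.inputs) < self.arity:
            return None
        args = [self.inputs[p] for p in range(self.arity)]
        self.inputs = {}      # consume the tokens
        return self.op(*args)


def run(blocks, initial_tokens):
    """Route (address, port, value) tokens until none remain; return graph outputs."""
    queue = deque(initial_tokens)
    outputs = []
    while queue:
        address, port, value = queue.popleft()
        block = blocks[address]
        result = block.accept(port, value)
        if result is None:
            continue          # block is still waiting for another token
        if not block.dests:
            outputs.append(result)   # no destination: treat as a graph output
        for dest_address, dest_port in block.dests:
            queue.append((dest_address, dest_port, result))
    return outputs


# Directed graph for (a + b) * (c - d); addresses 0..2 are arbitrary.
blocks = {
    0: FunctionBlock(0, operator.add, 2, [(2, 0)]),
    1: FunctionBlock(1, operator.sub, 2, [(2, 1)]),
    2: FunctionBlock(2, operator.mul, 2, []),
}
tokens = [(0, 0, 3), (1, 0, 10), (0, 1, 4), (1, 1, 6)]  # a=3, c=10, b=4, d=6
result = run(blocks, tokens)
print(result)  # [28]
```

In this model the algorithm lives entirely in the graph: changing the destination lists changes what is computed, without touching the routing loop.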

Because the Function Blocks are self-synchronising, they operate asynchronously, driven by the availability of the data (tokens) passed between them. The “hardware” Function Blocks can be regarded as highly parallel local hardware accelerators and are particularly useful for data streaming applications.
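Data-driven operation on a stream can be sketched in the same spirit. The class below is a hypothetical software analogy, not a description of the hardware: it assumes a two-input block with one FIFO per port that fires whenever both FIFOs hold a token, with no external scheduler or clocked handshake.

```python
from collections import deque


class StreamingBlock:
    """Hypothetical two-input block that fires whenever both FIFOs hold a token."""

    def __init__(self, op):
        self.op = op
        self.fifos = (deque(), deque())  # one FIFO per input port
        self.out = []                    # result tokens, in firing order

    def push(self, port, value):
        self.fifos[port].append(value)
        # Self-synchronise: fire once for every matched pair of tokens.
        while self.fifos[0] and self.fifos[1]:
            a = self.fifos[0].popleft()
            b = self.fifos[1].popleft()
            self.out.append(self.op(a, b))


multiplier = StreamingBlock(lambda sample, coeff: sample * coeff)

# Tokens arrive unevenly: three samples on port 0 before any coefficient on port 1.
arrivals = [(0, 1), (0, 2), (0, 3), (1, 10), (1, 20), (0, 4), (1, 30), (1, 40)]
for port, value in arrivals:
    multiplier.push(port, value)

streamed = multiplier.out
print(streamed)  # [10, 40, 90, 160]
```

The results come out in stream order even though the two inputs arrive at different rates, which is the property that makes this style convenient for streaming workloads.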