Projects



Current:
Back to the Future (BTF), Intel (2022-2025)

Previous:
ALPS, Huawei (2022-2023)
Uniserver, Horizon 2020 (2016-2019)
HARPA, FP7-STREP (2013-2016)
Eurocloud, FP7-STREP (2010-2013)
Performance-Effective Reliable Operation
Methods for Detecting the Duplication of Content in Caches and its Application to Memory Hierarchy Optimizations
Process Migration on Chip Multiprocessors to Reduce Power Density
Trace Caching
Isomorphism in Program Execution
Interval Based Policies for Power-Efficiency
High Performance Activity Migration for Thermally Constrained Single Chip Multi-Cores



BTF (2022-2025)
The Back to the Future (BTF) project aspires to revisit staple microarchitectural concepts and exploit opportunities that have so far been ignored or under-studied. Notably, general-purpose processors have evolved significantly since concepts such as branch prediction and memory dependence prediction were first introduced. We argue that as the instruction window of processors expands, more opportunities to extract information from the “future” appear and should be leveraged to improve prediction mechanisms.

ALPS (2022-2023)
ALPS (Agile Power Management for Future Energy-Efficient Servers in Data Centers Running Latency-Critical Applications) is a one-year collaborative project between Huawei and the University of Cyprus aimed at improving the energy efficiency of CPUs used in servers running latency-critical applications in data centers. More energy-efficient servers can help reduce the energy costs of operating and cooling them, as well as their environmental impact. Moreover, energy-efficient operation creates thermal headroom that can be leveraged to boost frequency, e.g., through turbo, and improve server performance. This helps increase the load that a server can serve within a required Quality of Service (QoS), which in turn can help reduce the number of servers and the capital expenses. The University of Cyprus's role is to 1) develop and evaluate new core and package C-states that aim to improve energy efficiency with minimal impact on the performance of user-facing applications, 2) explore and evaluate the impact of different hardware and software knob settings on the latency and power of modern servers running user-facing applications, and 3) explore the opportunity presented by heterogeneous memory (e.g., Optane) to increase the benefits of the new core C-state (proposed in 1).
Faculty Collaborators: Haris Volos (CS).

Uniserver (2016-2019)
Uniserver (http://www.uniserver2020.eu/) is a 3-year project funded by the Horizon 2020 Research and Innovation program (Feb 2016 - Feb 2019). The principal aim of UniServer is to facilitate the evolution of the Internet from an infrastructure where data is gathered in centralized data centres, widely known as the Cloud, to an infrastructure where data is handled in a distributed and localized manner close to the data sources, essentially enabling Edge Computing. UniServer will realize this bold goal by greatly improving the energy efficiency, performance, dependability and security of current state-of-the-art micro-servers, while reinforcing the supported system software. University of Cyprus is leading the characterization of the noise margins at the hardware layer, the development of the UniServer Firmware, and the development of an end-to-end TCO tool for exploring the benefits and trade-offs of UniServer for cloud-only and mixed fog-cloud Internet application deployments.
Faculty Collaborators: Pedro Trancoso (CS).

HARPA (2013-2016)
HARPA (http://www.harpa-project.eu/) is an FP7-STREP project whose goal is to enable next-generation embedded and high-performance heterogeneous many-cores to cost-effectively confront variations by providing Dependable-Performance: correct functionality and timing guarantees throughout the expected lifetime of a platform under thermal, power, and energy constraints.
Faculty Collaborators: Chrysostomos Nicopoulos (ECE).

Eurocloud (2010-2013)
Eurocloud (http://www.eurocloudserver.com/) is a 3-year FP7-STREP project (Jan 2010 - Jun 2013) that aims to deliver a next-generation energy-conscious 3D Server-on-Chip for Green Cloud Services. The project is led by ARM and the consortium includes Nokia, IMEC, EPFL and UCY. University of Cyprus is leading the Reliability and Fault Tolerance work-package and is contributing to the other work-packages: Physical Prototype, Workload Characterization, 3D Architecture Design and Power Management, On-chip Hierarchies and Interconnect.
Faculty Collaborators: Chrysostomos Nicopoulos (ECE).

Performance-Effective Reliable Operation
Faults in non-architectural structures, such as predictors and replacement arrays, do not affect correctness. Because of this, reliability research has mainly focused on protecting architectural structures against faults. This work shows that non-architectural structures are also important to protect because faults in these structures can significantly degrade performance. The goal of this project is to develop efficient, low-cost mechanisms that enable processors to operate in the presence of non-architectural faults without significant performance degradation. Currently we are pursuing two directions of work. One is to develop a repair mechanism based on the concept of address remapping that mitigates the effects of faults in predictor arrays. The second is the development of an analytical approach that allows us to accurately estimate the performance effects of faults without the need for extensive simulations.
Another direction of this work is concerned with low voltage operation. Low voltage operation is used as a means to battle the increasing power constraints imposed by smaller technology nodes. The problem with low voltage operation is that it causes transistors to become unreliable. In this work, we are evaluating the use of block-disabling together with performance-enhancing mechanisms, such as victim caching and prefetching, to enable reliable and performance-effective operation when operating below Vcc-min.

Methods for Detecting the Duplication of Content in Caches and its Application to Memory Hierarchy Optimizations
This work has shown that when there is a miss for a block in a cache, the required block of data may already reside in the cache but under a different tag. We refer to this phenomenon as Cache-Content-Duplication. This project aims to (1) characterize the phenomenon of cache-content-duplication, (2) investigate its potential for improving the performance of various types of instruction caches, (3) develop efficient methods for detecting cache-content-duplication, and (4) propose several memory hierarchy enhancements that exploit cache-content-duplication.
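The phenomenon can be illustrated with a minimal sketch. It assumes a toy cache modelled as a mapping from tag to block contents; the names `duplicate_groups` and `duplicated_on_miss` are hypothetical, and this is not the detection mechanism developed in the project, only a way to make the definition concrete.

```python
from collections import defaultdict

def duplicate_groups(cache):
    """Group tags whose blocks hold identical contents.

    `cache` is a toy model (an assumption of this sketch): a dict that
    maps a tag to the bytes stored in the corresponding block.
    """
    by_content = defaultdict(list)
    for tag, block in cache.items():
        by_content[block].append(tag)
    # Only contents stored under more than one tag count as duplication.
    return [sorted(tags) for tags in by_content.values() if len(tags) > 1]

def duplicated_on_miss(cache, missing_tag, block):
    """Return the tags that already hold the contents of a missing block."""
    return sorted(t for t, b in cache.items() if b == block and t != missing_tag)

# Example: the block fetched for tag 0x40 is already cached under 0x10 and 0x30.
toy_cache = {0x10: b"\x01\x02", 0x20: b"\x03\x04", 0x30: b"\x01\x02"}
groups = duplicate_groups(toy_cache)
hits = duplicated_on_miss(toy_cache, 0x40, b"\x01\x02")
```

In this toy model a miss on tag 0x40 could in principle be served from the block stored under 0x10 or 0x30, which is the opportunity the project description refers to.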

Process Migration on Chip Multiprocessors to Reduce Power Density
One of the goals of this project is to develop microarchitectural solutions for the power density problem using single chip multicore architectures. A two-way chip-multicore is already a reality and with increasing on-chip capacity many more cores will soon be available on a single chip. Power density can be reduced on a multicore by distributing activity through periodic execution migration from one core to another. However, a multicore dedicated to solving the power-density problem may have functional requirements beyond those of a conventional multicore. For example, it may need mechanisms that facilitate frequent and performance-efficient switches between cores. Frequent switching, however, comes at a cost because each migration may incur significant overhead, both in time and energy. For instance, every migration may require the transfer of some state, such as registers, from one core to the other. Furthermore, it may be better to warm up the cache and predictor state in the new core rather than start cold. The above suggests that a trade-off exists between switching granularity, the state that is transferred, the length of the warm-up phase and performance-energy efficiency. This work will explore the trade-offs present in a multicore microarchitecture aimed at alleviating the power density problem. Another objective of this work is to develop an analytical model that will allow a computer architect to understand and reason about temperature. The model can be used as a guideline during the design of microarchitectural mechanisms to decide how to better distribute power to reduce temperature and power density.

Trace Caching
Trace caching is an emerging method for providing high bandwidth instruction supply to current and future superscalar processors. A trace cache can enable a processor to fetch long sequences of instructions across basic blocks in parallel and therefore overcome the sequential resolution of control transfer instructions. The central idea of this project is that a trace cache can facilitate other optimizations as important as high instruction bandwidth. In particular, we propose that trace construction be enhanced with optimization functionalities that can increase performance and reduce power. Performance increases and power decreases may be attained if selected traces are optimized into traces with more parallelism and/or fewer instructions. Some of the important dimensions of the design space explored in this work are: front-end microarchitecture, criteria for long trace selection, value-based optimizations, performance and power optimizations, and power estimation.

Isomorphism in Program Execution
This activity attempts to investigate the underlying causes of redundancy in program execution and develop methods for exploiting it. The adopted approach is holistic in that it considers the effects from the entire dynamic dependence graph of a program's execution. This work identifies a fundamental program property that relates program structure and the observed redundancy in programs: Instruction-Isomorphism. Two instruction instances are said to be isomorphic if their backward dynamic data dependence slices are identical. By definition, isomorphic instructions produce exactly the same output. Our objectives are to characterize isomorphism in programs, determine transformations that facilitate isomorphism, and investigate methods for exploiting it. Currently we are considering the application of isomorphism to branch prediction and confidence estimation.
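The definition of isomorphism can be made concrete with a small sketch. It assumes a hypothetical trace format in which each dynamic instruction is a pair `(opcode, operands)`, and each operand is either `("imm", value)` or `("dep", index)` pointing at an earlier producer in the trace. The function names are illustrative, and the backward slice is summarized as a nested tuple, which is a simplification rather than the project's representation.

```python
from collections import defaultdict

def slice_signatures(trace):
    """Build one signature per dynamic instruction.

    The signature of an instruction is its opcode plus the signatures of
    its producers, so two instructions get equal signatures exactly when
    their backward data dependence slices match in this toy model.
    Nested tuples can grow quickly; this is only suitable for short traces.
    """
    sigs = []
    for opcode, operands in trace:
        parts = []
        for kind, value in operands:
            if kind == "dep":
                parts.append(sigs[value])  # producer appears earlier in the trace
            else:
                parts.append(("imm", value))
        sigs.append((opcode, tuple(parts)))
    return sigs

def isomorphic_groups(trace):
    """Return lists of trace indices whose backward slices are identical."""
    groups = defaultdict(list)
    for index, sig in enumerate(slice_signatures(trace)):
        groups[sig].append(index)
    return [g for g in groups.values() if len(g) > 1]

# Hypothetical trace: two "li 4" instances feed two "add ..., 1" instances.
trace = [
    ("li", [("imm", 4)]),                 # 0
    ("li", [("imm", 4)]),                 # 1
    ("add", [("dep", 0), ("imm", 1)]),    # 2
    ("add", [("dep", 1), ("imm", 1)]),    # 3
    ("add", [("dep", 0), ("imm", 2)]),    # 4: different immediate, not isomorphic
]
groups = isomorphic_groups(trace)
```

Instructions 2 and 3 have different producers (0 and 1), yet their slices are identical, so they necessarily compute the same value; instruction 4 differs in one leaf and falls outside both groups.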

Interval Based Policies for Power-Efficiency
This work has demonstrated that if the execution of a program is divided into distinct intervals, it is possible for one processor or configuration to provide the best power efficiency over every interval, and yet have worse overall power efficiency over the entire execution than other configurations. This unintuitive behavior is a result of a seemingly intuitive use of power efficiency metrics, and can result in suboptimal design and execution decisions. This behavior may occur when using the energy-delay product and energy-delay^2 product metrics but not with the energy metric. We are currently pursuing two directions of work. One aims to develop off-line computationally efficient algorithms or heuristics that can determine or approximate the optimal power efficiency. Under certain conditions (regarding the statistical properties of the terms of the interval behavior) it may be possible to establish the optimal power efficiency. The other direction is to develop on-the-fly heuristics that improve power efficiency decisions.
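The unintuitive behavior can be reproduced with a small worked example. The energy and delay numbers below are made up for illustration, and the names `config_a`, `config_b`, `edp` and `whole_run_edp` are hypothetical; the sketch only assumes that energy and delay add up across intervals and that the energy-delay product of the whole run is total energy times total delay.

```python
def edp(energy, delay):
    """Energy-delay product of one interval or of a whole run."""
    return energy * delay

def whole_run_edp(intervals):
    """EDP over the entire execution: total energy times total delay."""
    total_energy = sum(e for e, _ in intervals)
    total_delay = sum(d for _, d in intervals)
    return edp(total_energy, total_delay)

# (energy, delay) per interval; illustrative values only.
config_a = [(10.0, 1.0), (1.0, 10.0)]   # per-interval EDP: 10, 10
config_b = [(3.0, 4.0), (3.0, 4.0)]     # per-interval EDP: 12, 12

# A has the lower (better) EDP in every single interval ...
a_wins_every_interval = all(
    edp(*a) < edp(*b) for a, b in zip(config_a, config_b)
)
# ... but not over the whole run: 11 * 11 = 121 versus 6 * 8 = 48.
a_wins_whole_run = whole_run_edp(config_a) < whole_run_edp(config_b)
```

The whole-run product contains cross terms (energy of one interval times delay of another) that per-interval comparisons never see: 10*10 + 1*1 = 101 for A against 3*4 + 3*4 = 24 for B. Plain energy is additive and has no such cross terms, which is consistent with the observation that the energy metric does not exhibit this behavior.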