Parallel Distributed Infrastructure for Minimization of Energy

Ten minutes with... Oscar Palomar, Barcelona Supercomputing Center

Thu, 2015-04-16

Oscar Palomar is a senior researcher in the Computer Architecture for Parallel Paradigms group at Barcelona Supercomputing Center (BSC). His research interests relate to vector and low-power computer architectures. In the ParaDIME project, he works closely with fellow BSC researchers Santhosh Rethinagiri and Ruben Titos, while the principal investigators are BSC’s Adrián Cristal and Osman Ünsal.


1. What are your research interests? What do you most enjoy researching?
My research is mainly in two related areas of computer architecture: vector and low-power architectures. Vector architectures have been around for a long time, but we are looking at them from a new perspective and for new types of application, such as databases. In the scalar architectures used in conventional processors, each instruction defines a single operation, so to add two arrays of numbers you have to add each pair of values sequentially, with the add instruction in a loop. Vector architectures let you add the two whole arrays with a single instruction, which is more efficient for several reasons; for example, the instruction only has to be read, and the processor prepared for the operation, once.
When vector architectures were dominant in supercomputing, the most important constraint affecting their design was not power; computers were built to run as fast as possible, as cheaply as possible. Today, technology trends have made power a key issue. Vector architectures draw slightly more power, but operations complete much faster, so they are the more energy-efficient design when vector operations are common in the workload. We’re now seeing a return to vectors, with some designs, such as the Intel Xeon Phi, approaching vector architectures in the instructions they offer, although to my mind these could be made more efficient by using a vector implementation as well as vector instructions.
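The scalar/vector contrast in this answer can be illustrated with a small model. This is only a sketch in Python, not real vector hardware or code from the project: `VECTOR_LENGTH`, `scalar_add` and `vector_add` are hypothetical names, and the "instruction" counts are a simplification that ignores loop overhead, memory access and everything else a real processor does.

```python
from typing import List, Tuple

# Hypothetical number of elements one vector instruction can process.
VECTOR_LENGTH = 8


def scalar_add(a: List[int], b: List[int]) -> Tuple[List[int], int]:
    """Add two arrays pair by pair: one modelled add instruction per pair."""
    result = []
    instructions = 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions


def vector_add(a: List[int], b: List[int],
               vector_length: int = VECTOR_LENGTH) -> Tuple[List[int], int]:
    """Add two arrays chunk by chunk: one modelled vector add per chunk."""
    result = []
    instructions = 0
    for start in range(0, len(a), vector_length):
        chunk_a = a[start:start + vector_length]
        chunk_b = b[start:start + vector_length]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1
    return result, instructions


a = list(range(64))
b = list(range(64, 128))
scalar_result, scalar_count = scalar_add(a, b)
vector_result, vector_count = vector_add(a, b)
print(scalar_count, vector_count)  # 64 scalar adds versus 8 vector adds
```

Both functions produce the same sums; only the number of modelled instructions differs. If the assumed vector length is at least as long as the arrays, the whole addition takes a single vector instruction, which is the case described in the answer.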
2. What areas are you concentrating on within the ParaDIME project?
Within the ParaDIME project, we have published one paper on vectors, but I’ve mostly been working closely with BSC researchers Santhosh Rethinagiri and Ruben Titos on heterogeneous and multicore architectures. You can find out more about Santhosh’s work in the interview with him on this website. Ruben has been researching how to make inter-core communication more efficient. In a multicore processor (a chip with several processing units), there are two main approaches to communicating and exchanging data. The first is shared memory, where all the cores in the processor access a single memory address space. In the second, each core has its own private memory address space and uses message passing to send data to, or receive data from, another core.
One of the assumptions of the ParaDIME project is that message passing is more efficient than shared memory; however, most chip manufacturers implement shared-memory architectures and this situation looks likely to continue for several reasons. An important one is that most applications use shared memory. Ruben is therefore looking at techniques to improve the efficiency of message passing on shared-memory architectures and has proposed a way of avoiding redundant copies of data. This has two benefits: it improves performance and it reduces energy consumption, as moving data consumes a lot of energy.
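The two communication styles, and the general idea of sending a reference rather than a copy when the messages travel over shared memory, can be sketched with Python threads. This is an illustration only and not the technique proposed in the project: the function names and `WORKERS` are hypothetical, and Python threads stand in for cores.

```python
import queue
import threading

WORKERS = 4  # hypothetical number of cores


def split(data, parts=WORKERS):
    """Return (start, stop) index ranges covering data in roughly equal parts."""
    step = max(1, (len(data) + parts - 1) // parts)
    return [(i, min(i + step, len(data))) for i in range(0, len(data), step)]


def run_workers(target, arguments):
    """Start one thread per argument and wait for all of them to finish."""
    threads = [threading.Thread(target=target, args=(arg,)) for arg in arguments]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


def shared_memory_sum(data):
    """All workers read the same buffer and update one shared, locked total."""
    total = [0]
    lock = threading.Lock()

    def worker(bounds):
        start, stop = bounds
        partial = sum(data[i] for i in range(start, stop))
        with lock:
            total[0] += partial

    run_workers(worker, split(data))
    return total[0]


def message_passing_sum(data, zero_copy):
    """Each worker only sees what arrives in its inbox and replies by message."""
    results = queue.Queue()

    def worker(inbox):
        payload = inbox.get()          # receive the data as a message
        results.put(sum(payload))      # send the partial sum back as a message

    inboxes = []
    for start, stop in split(data):
        inbox = queue.Queue()
        view = memoryview(data)[start:stop]   # a reference into the buffer
        # Either send the reference itself, or a fresh copy of the bytes.
        inbox.put(view if zero_copy else bytes(view))
        inboxes.append(inbox)

    run_workers(worker, inboxes)
    return sum(results.get() for _ in inboxes)


data = bytearray(range(200))
print(shared_memory_sum(data),
      message_passing_sum(data, zero_copy=False),
      message_passing_sum(data, zero_copy=True))
```

All three calls compute the same total. The only difference between the two message-passing calls is whether each message carries a copy of the bytes or a `memoryview` that refers to the original buffer, which is one simple way to avoid moving the data twice.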
3. How did you come to be a computer-science researcher? Have you always enjoyed computer science?
I suppose I first got interested in computer science when my parents bought a Spectrum computer when I was a kid, which I started to program and have fun with. At school I always liked science, particularly physics and maths, although I only remember having one programming class and I don’t think I got much out of it. I also remember a philosophy teacher who told us that if we didn’t understand his logic class, we should forget about studying computer science. We didn’t really understand his class – although I think that had more to do with his teaching than anything else – but I think this might have actually motivated me more. 
I went on to study computer science at Barcelona Tech (Universitat Politècnica de Catalunya) and realised that the topic which interested me most was computer architectures. While I was doing my final project on computer architectures, a professor suggested that I do a PhD, which I went on to do at the same university.
Things might have changed since I was at university, but one thing I felt was missing from the course at that time was a focus on power. I think it’s also really important for computer scientists to learn about different areas: programmers need some architecture awareness, for example, and vice versa.
4. Why is it important for computers to be more energy efficient? What are the major technical challenges which need to be overcome to achieve this?
Obviously the less energy computers use, the lower the energy costs, especially over the long term. Every time you switch on your computer you’re using energy, so a more energy-efficient machine used over the next three years gives you three years’ worth of energy savings.
For mobile devices, energy is the most important constraint, due to the need to reduce the number of times you have to charge the battery. Batteries are crucial, in fact – we need batteries which last longer and are faster to charge, and/or have a charging system in the background – but these are out of the hands of computer scientists; all we can do is improve the energy efficiency of devices. 
Heterogeneous architectures are definitely the way to go to achieve energy savings: now we need to work out what they will look like and how to make them usable for programmers, so that they don’t need to have in-depth knowledge of hardware to program them. At the moment, for example, we don’t have enough support for vectors in compilers (the programs that transform source code written in a programming language into instructions that the computer can execute). This means that we have to write low-level code to use the vector instructions directly. Using other accelerators such as GPGPUs or FPGAs is also non-trivial and demanding for the programmer.
5. What have you learned from working with other researchers on European projects? Do you think it’s a productive experience, despite cultural and linguistic differences?
Working with researchers from other institutions has helped me get perspective on where our area of research lies in the hardware/software development chain: for the researchers at Neuchâtel, for example, BSC works at the low-level end, whereas for IMEC we are more high level. Also, as we’re experimenting with new ideas which don’t currently exist on any chips, at BSC we’re using simulations to try them out, whereas at Neuchâtel and Technische Universität Dresden they are using real hardware. This means that our timescales and research methods are significantly different (simulation is thousands of times slower than real hardware), and that we have to work carefully together to ensure that the results are meaningful.
6. How is ParaDIME different from other projects focusing on energy-efficient computing, such as Mont-Blanc? What would a sequel to the ParaDIME project look like?
Like Mont-Blanc, ParaDIME is looking at small, ARM-based, energy-efficient cores, although we are looking at more heterogeneous architectures. However, Mont-Blanc aims to build a supercomputer prototype, whereas ParaDIME’s research is focused more on data centres.
ParaDIME uses Scala and Akka, which are not intended for supercomputing. They follow the actor model; that is, an inherently concurrent model which assumes that everything is an actor which can make local decisions, create more actors, send messages to other actors, and determine how to respond to the next message received. The next steps which ParaDIME could take would be to integrate different elements by getting an actor model to use efficient support for message passing and use heterogeneous accelerators to improve the implementation of the model.
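The four things an actor can do, as listed in this answer, can be shown in a toy example. This is a minimal sketch in Python and deliberately not the Akka API: `ActorSystem`, `Squarer` and `Collector` are hypothetical names, and the runtime delivers messages one at a time on a single thread so that the result is deterministic, whereas a real actor runtime runs actors concurrently.

```python
from collections import deque


class ActorSystem:
    """Toy single-threaded runtime: queues messages and delivers them in order."""

    def __init__(self):
        self._pending = deque()

    def spawn(self, actor_class, *args):
        """Create a new actor that belongs to this system."""
        return actor_class(self, *args)

    def send(self, actor, message):
        """Queue a message; the actor handles it later, never immediately."""
        self._pending.append((actor, message))

    def run(self):
        """Deliver messages until none are left."""
        while self._pending:
            actor, message = self._pending.popleft()
            actor.receive(message)


class Squarer:
    """Worker actor: squares one number and replies to the sender."""

    def __init__(self, system):
        self.system = system

    def receive(self, message):
        reply_to, value = message
        self.system.send(reply_to, ("result", value * value))


class Collector:
    """Coordinator actor: spawns workers and adds up their replies."""

    def __init__(self, system, expected):
        self.system = system
        self.expected = expected
        self.total = 0
        self.receive = self.collecting  # current behaviour

    def collecting(self, message):
        kind, payload = message
        if kind == "start":
            for value in payload:
                worker = self.system.spawn(Squarer)       # create more actors
                self.system.send(worker, (self, value))   # send messages
        elif kind == "result":
            self.total += payload                         # local decision on private state
            self.expected -= 1
            if self.expected == 0:
                self.receive = self.done                  # change how the next message is handled

    def done(self, message):
        """Ignore any message that arrives after the work is complete."""


system = ActorSystem()
collector = system.spawn(Collector, 3)
system.send(collector, ("start", [1, 2, 3]))
system.run()
print(collector.total)  # 1 + 4 + 9 = 14
```

No actor touches another actor's state directly; everything goes through `send`, which is why this style maps naturally onto message passing between cores.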
7. What do you think that BSC, and perhaps even Catalonia more generally, bring to this area of research?
There is a tradition of researching architectures, and specifically vector architectures, at BSC and Barcelona Tech. BSC Director Mateo Valero has published highly influential papers on vector architectures, so I think that when people are interested in vector architectures, they consider BSC a reference. As for Catalonia, I think there’s a tradition of critical thinking (critical perhaps being the operative word) which is useful when it comes to research.
8. What are your predictions regarding the future of information and technology systems, especially regarding energy consumption and innovative architectures?
I think it’s dangerous to start making predictions, but I think we can safely say there will be a lot more connected devices and many of these will be working more autonomously. For that to work, and for smart cities to be really feasible, we will need large-scale data centres to process the data. We will also need very energy-efficient, small devices to process as much data as possible locally for things such as smart traffic distribution. This means decisions made locally but with global processing; for both of these areas, energy efficiency is of key importance.