Multi-Objective Optimizations for a Superscalar Architecture with Selective Value Prediction
Abstract
This work extends an earlier manual design space exploration (DSE) of our Selective Load Value Prediction based superscalar architecture to the unified L2 cache. We then perform an automatic design space exploration with a specially developed software tool, varying several architectural parameters. Our goal is to find optimal configurations in terms of CPI (cycles per instruction) and energy consumption. With the 19 architectural parameters we propose to vary, the design space exceeds 2.5 million billion (2.5 × 10^15) configurations, so only heuristic search is feasible. We therefore propose several methods of automatic design space exploration based on our FADSE tool, which allow us to evaluate only 2,500 configurations of this huge design space. The experimental results show that, under the proposed multi-objective approach, our automatic DSE provides significantly better configurations than our previous manual DSE.
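To make the two-objective goal concrete, the sketch below filters a set of evaluated configurations down to the non-dominated (Pareto-optimal) ones when both CPI and energy are minimized. It is only an illustration of the general idea: the abstract does not describe how FADSE works internally, the function names `dominates` and `pareto_front` are hypothetical, and the sample (CPI, energy) values are invented rather than taken from the paper.

```python
from typing import List, Tuple

# A configuration's result as (cpi, energy); both objectives are minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical measurements, invented for illustration only.
samples = [(1.20, 50.0), (1.00, 70.0), (1.30, 55.0), (0.90, 90.0), (1.00, 80.0)]
front = pareto_front(samples)

# Fraction of the design space visited, using the figures quoted in the abstract:
# 2,500 evaluated configurations out of roughly 2.5e15.
explored_fraction = 2500 / 2.5e15

print(front)
print(explored_fraction)
```

A heuristic explorer only ever sees a tiny sample of the space, so in practice a filter of this kind would be applied to the evaluated sample, and the resulting front is an approximation of the true one.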
Similar resources
Exploiting selective instruction reuse and value prediction in a superscalar architecture
In our previously published research we discovered some very difficult to predict branches, called unbiased branches. Since the overall performance of modern processors is seriously affected by misprediction recovery, especially these difficult branches represent a source of important performance penalties. Our statistics show that about 28% of branches are dependent on critical Load instructio...
Performance Limits Due to Inter-Cluster Data Forwarding in Wire-Limited ILP Microprocessors
The growing speed gap between transistors and wire interconnects is forcing the development of distributed, or clustered, architectures. These designs partition the chip into small regions with fast intra-cluster communication. Longer latency is required to communicate between clusters. The hardware and/or software is responsible for scheduling instructions to clusters such that critical path c...
Exergy, economy and pressure drop analyses for optimal design of recuperator used in microturbine
The optimal design of a plate-fin recuperator of a 200-kW microturbine was studied in this paper. The exergy efficiency, pressure drop and total cost were selected as the three important objective functions of the recuperator. Genetic Algorithm (GA) and Non-dominated Sorting Genetic Algorithm (NSGA-II) were respectively employed for single-objective and multi-objective optimizations. By opt...