Chip multiprocessing has proven to be a promising approach in microprocessor design. With ever-increasing concerns about energy consumption, performance-energy trade-offs are often necessary, especially in the design of real-time embedded systems. This paper presents a performance and energy study of an in-house FPGA-based mixed-mode chip multiprocessor in which the SIMD (Single-Instruction, Multiple-Data), MIMD (Multiple-Instruction, Multiple-Data) and M-SIMD (Multiple-SIMD) computing modes can coexist in one system. We propose performance-energy trade-off techniques based on the observation that SIMD and MIMD tasks involve substantially different granularities of computation and communication, which result in different time and energy behaviors; these differences provide opportunities to meet various performance-energy objectives. Generalized matrix-matrix multiplication (MMM) is used as an example to illustrate our approach. Experimental results on a Xilinx Virtex II XC2V6000-5 FPGA demonstrate the effectiveness of the proposed approach.