New Horizons

Welcome to my blog

My name is Sven Andersson and I
work as a consultant in embedded
system design, implemented in ASIC
and FPGA.
In my spare time I write this blog
and I hope it will inspire others to
learn more about this fantastic field.
I live in Stockholm, Sweden, and have
my own company


You are welcome to contact me
and ask questions or make comments
about my blog.



Saturday, February 22, 2014
Zynq design from scratch. Part 15.

Benchmarking the ARM Cortex-A9 processor

In computing, a benchmark is the act of running a computer program, a set of programs, or other operations, in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it. The term 'benchmark' is also commonly used for the elaborately designed benchmarking programs themselves.

Benchmarking is usually associated with assessing performance characteristics of computer hardware, for example, the floating point operation performance of a CPU, but there are circumstances when the technique is also applicable to software. Software benchmarks are, for example, run against compilers or database management systems.

CPU core benchmarking

Although it doesn’t reflect how you would use a processor in a real application, sometimes it’s important to isolate the CPU’s core from the other elements of the processor and focus on one key element. For example, you might want to ignore memory and I/O effects and focus primarily on the pipeline operation. This is CoreMark’s domain. CoreMark can test a processor’s basic pipeline structure, as well as basic read/write operations, integer operations, and control operations.


CoreMark is a benchmark that aims to measure the performance of central processing units (CPU) used in embedded systems. It was developed in 2009 by Shay Gal-On at EEMBC and is intended to become an industry standard, replacing the antiquated Dhrystone benchmark. The code is written in C and contains implementations of the following algorithms: list processing (find and sort), matrix manipulation (common matrix operations), state machine (determine if an input stream contains valid numbers), and CRC.
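To give a feel for the kind of small integer kernel listed above, here is a minimal sketch of a bitwise CRC-16 in C. It is not CoreMark's own source code: the function name crc16_update, the reflected polynomial 0xA001 and the zero seed are assumptions chosen for illustration (together they form the common CRC-16/ARC variant).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 with reflected polynomial 0xA001 (CRC-16/ARC).
 * crc16_update is a hypothetical helper, not taken from CoreMark. */
uint16_t crc16_update(uint16_t crc, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                  /* fold the next byte into the low bits */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (uint16_t)((crc >> 1) ^ 0xA001u);
            else
                crc = (uint16_t)(crc >> 1);
        }
    }
    return crc;
}
```

Calling crc16_update(0, buf, len) over a buffer returns a 16-bit checksum. For the ASCII string "123456789" this variant yields the standard check value 0xBB3D. A checksum of this kind is cheap to compute and makes it easy to confirm that a benchmark run produced the expected data.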

Downloading CoreMark

The test suite can be downloaded from

Here is the result after unpacking. We will create a new application in SDK called CoreMark and copy the marked C files to the src directory.


Trying to compile the CoreMark program without modifications gives the following error:

undefined reference to `clock_gettime'

We are running this application "bare metal" (without an OS). This means we don't have access to a real-time clock (RTC) and we cannot use the library routines in time.h. It looks like we have to write our own "clock_gettime" routine.

Bare-metal application development

Xilinx software design tools facilitate the development of embedded software applications for many runtime environments. Xilinx embedded design tools create a set of hardware platform data files that include:

• An XML-based hardware description file describing processors, peripherals, memory maps, and additional system data
• A bitstream file containing optional Programmable Logic (PL) programming data
• A block RAM Memory Map (BMM) file
• PS configuration data used by the Zynq-7000 AP SoC First Stage Bootloader (FSBL).

The bare-metal Board Support Package (BSP) is a collection of libraries and drivers that form the lowest layer of your application. The runtime environment is a simple, semi-hosted and single-threaded environment that provides basic features, including boot code, cache functions, exception handling, basic file I/O, C library support for memory allocation and other calls, processor hardware access macros, timer functions, and other functions required to support bare-metal applications. Using the hardware platform data and bare-metal BSP, you can develop, debug, and deploy bare-metal applications using SDK.

Board support package

The BSP <standalone_bsp_0> we generated in our first software project stores all the information about our board setup and all the software we need to start writing a bare-metal program. The libsrc directory contains low-level drivers and example code to be used when writing software to access the hardware in the processing system. We will take a closer look at the scutimer_v1_02_a directory.

Writing our own clock_gettime

We will use one of the timers available in the ARM processor to count clock cycles and measure time intervals. Let's take a look at the timer setup. Here is a picture taken from chapter 8 in the Zynq-7000 Technical Reference Manual.


Each Cortex-A9 processor has its own private 32-bit timer and 32-bit watchdog timer. Both processors share a global 64-bit timer. These timers are always clocked at half the CPU frequency; with the CPU running at 667 MHz that is 333.5 MHz. On the system level, there is a 24-bit watchdog timer and two 16-bit triple timer/counters. The system watchdog timer is clocked at 1/4 or 1/6 of the CPU frequency, or can be clocked by an external signal from an MIO pin or from the PL. The two triple timer/counters are always clocked at 1/4 or 1/6 of the CPU frequency, and are used to count the widths of signal pulses from an MIO pin or from the PL. Read more about the timers in the Cortex-A9 MPCore Technical Reference Manual, chapter 4.
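The clock arithmetic for the private timer can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper names timer_tick_hz and timer_wrap_seconds are hypothetical, the timer is taken to run at half the CPU frequency divided by (prescaler + 1), and the counter is assumed to be loaded with 0xFFFFFFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Tick rate of a Cortex-A9 private timer, assuming it is clocked at half
 * the CPU frequency and then divided by (prescaler + 1). */
uint32_t timer_tick_hz(uint32_t cpu_hz, uint32_t prescaler)
{
    assert(prescaler <= 255u);      /* the prescaler field is 8 bits wide */
    return (cpu_hz / 2u) / (prescaler + 1u);
}

/* Seconds until a 32-bit down counter loaded with 0xFFFFFFFF reaches zero
 * and reloads, i.e. the longest interval measurable without wrap handling. */
double timer_wrap_seconds(uint32_t cpu_hz, uint32_t prescaler)
{
    return 4294967296.0 / (double)timer_tick_hz(cpu_hz, prescaler);
}
```

With a 667 MHz CPU and a prescaler of 0, timer_tick_hz returns 333500000 and the counter wraps after a little under 13 seconds. A benchmark run that lasts longer than that needs a larger prescaler or explicit wrap counting.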

Program example

Here is an example program that uses the ARM CPU private timer to measure the time it takes to run the CoreMark benchmark program. The function is called from core_portme.c to read the timer counter register before the benchmark starts and again when it has finished.

#include "xscutimer.h"   // XScuTimer driver from the standalone BSP

// Timer, TIMER_RES_DIVIDER and TIMER_LOAD_VALUE are assumed to be defined
// elsewhere, for example in core_portme.h.

ee_u32 GetTimerValue(ee_u32 TimerIntrId, ee_u16 Mode)
{
    int                 Status;
    XScuTimer_Config    *ConfigPtr;
    volatile ee_u32     CntValue  = 0;
    XScuTimer           *TimerInstancePtr = &Timer;

    if (Mode == 0) {

      // Initialize the Private Timer so that it is ready to use

      ConfigPtr = XScuTimer_LookupConfig(TimerIntrId);

      Status = XScuTimer_CfgInitialize(TimerInstancePtr, ConfigPtr,
                                       ConfigPtr->BaseAddr);

      if (Status != XST_SUCCESS) {
          return XST_FAILURE; }

      // Load the timer prescaler register.

      XScuTimer_SetPrescaler(TimerInstancePtr, TIMER_RES_DIVIDER);

      // Load the timer counter register.

      XScuTimer_LoadTimer(TimerInstancePtr, TIMER_LOAD_VALUE);

      // Start the timer counter and read start value

      XScuTimer_Start(TimerInstancePtr);
      CntValue = XScuTimer_GetCounterValue(TimerInstancePtr);

    }
    else {

       // Read stop value and stop the timer counter

       CntValue = XScuTimer_GetCounterValue(TimerInstancePtr);
       XScuTimer_Stop(TimerInstancePtr);

    }

    return CntValue;
}
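The private timer counts down, so the start reading is larger than the stop reading. A minimal sketch of how the two values returned by GetTimerValue could be turned into a time interval is shown below. The helper names elapsed_ticks and ticks_to_seconds are hypothetical and are not part of the downloadable core_portme.c.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed ticks for a down-counting 32-bit timer. Unsigned subtraction
 * also gives the right answer across a single wrap, provided the timer
 * auto-reloads from 0xFFFFFFFF. */
uint32_t elapsed_ticks(uint32_t start, uint32_t stop)
{
    return start - stop;
}

/* Convert a tick count to seconds for a timer running at tick_hz. */
double ticks_to_seconds(uint32_t ticks, uint32_t tick_hz)
{
    assert(tick_hz != 0u);
    return (double)ticks / (double)tick_hz;
}
```

For example, a start value of 1000 and a stop value of 400 give 600 elapsed ticks. Dividing by the timer tick rate (333.5 MHz with no prescaling on a 667 MHz CPU) gives the run time in seconds.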


Compiling the modified code

Here is all the source code that will be compiled. Here are the modified files core_portme.h and core_portme.c ready to be downloaded.

Compilation setup

Right-click the CoreMark project and select C/C++ Build Settings. We will define the following symbols

and select the highest optimization level (-O3).

Compilation print out

Running CoreMark

Here is a printout from the CoreMark program.

CoreMark benchmark result

With 1998 iterations/sec and the CPU running at 667 MHz, we get a CoreMark score of 1998/667 ≈ 3.0 CoreMark/MHz. All you compiler experts out there, please let me know about other ways to improve this result.
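The arithmetic behind the score can be written out as a short C sketch. The function names iterations_per_sec and coremark_per_mhz are hypothetical helpers for illustration only; the CoreMark program prints the iterations/sec figure itself.

```c
#include <assert.h>
#include <stdint.h>

/* Iterations per second from an iteration count and a measured run time. */
double iterations_per_sec(uint32_t iterations, double seconds)
{
    assert(seconds > 0.0);
    return (double)iterations / seconds;
}

/* CoreMark/MHz: iterations per second divided by the CPU clock in MHz. */
double coremark_per_mhz(double iter_per_sec, double cpu_mhz)
{
    assert(cpu_mhz > 0.0);
    return iter_per_sec / cpu_mhz;
}
```

Calling coremark_per_mhz(1998.0, 667.0) gives roughly 2.996, which rounds to the 3.0 CoreMark/MHz quoted above. Normalizing by clock frequency makes results comparable between processors running at different speeds.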

More benchmarking

Z-7020 based ZC702 evaluation platform


Posted at 11:41 by

November 24, 2014   09:35 AM PST
the result is aimed at dual cores?
October 16, 2014   10:02 AM PDT
Thanks alot...

This helped me alot...
