

# Signal Processors Lecture Notes, Pt 3

Institute for Technical Informatics, DI Dr. Eugen Brenner Signal Processors, Pt 3 03.04.2018




# Chapter Overview

- 3. Development of DSP Systems
  - DSP Software Development Process
  - Design Challenges
  - Modelling of DSP Systems
  - Algorithm and Architecture
  - Software Architecture DSP/BIOS
  - Conclusion



#### **DSP Software Development Process**




#### **Development of DSP Systems**






#### Algorithm Development

- Define inputs & outputs
- Algorithm description
  - **Difference equations**
  - Transfer function
- High-level DSP tools
  - MATLAB, Simulink, C/C++
- Synthesized inputs
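
To make the "difference equation" and "synthesized inputs" items concrete, here is a minimal sketch of a second-order filter driven by a unit impulse. The coefficient names b0..b2 and a1, a2 follow the constants mentioned later in these notes; the sign convention (feedback terms subtracted), the function name `biquad`, and the coefficient values are assumptions chosen for illustration.

```python
def biquad(x, b, a):
    """Direct-form I second-order filter.

    Assumed difference equation:
      y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    """
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0   # delay elements, initially zero
    y = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn       # shift the input delay line
        y2, y1 = y1, yn       # shift the output delay line
        y.append(yn)
    return y


# Synthesized input: a unit impulse yields the impulse response.
impulse = [1.0, 0.0, 0.0, 0.0]
h = biquad(impulse, b=(0.5, 0.25, 0.125), a=(-0.5, 0.0))
```

Storing `h` and comparing it with the output of the later fixed-point implementation is one way to use such a high-level model as a reference.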



# Algorithm Development


- Benefits of high-level tools
  - Availability of libraries, toolboxes saves development time
  - Easy to debug and modify high-level language programs
  - Input/output can be stored and analyzed
  - Floating-point data formats ease development
  - Bit-true simulations for fixed-point DSP implementation
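
A bit-true simulation reproduces the integer arithmetic of the target on the host. The sketch below assumes a 16-bit Q15 format with round-to-nearest and saturation; the format, the rounding mode, and the helper names `to_q15` and `q15_mul` are illustrative assumptions, not taken from a specific DSP.

```python
def to_q15(value):
    """Quantize a float in [-1, 1) to a 16-bit Q15 integer with saturation."""
    q = int(round(value * 32768))
    return max(-32768, min(32767, q))


def q15_mul(a, b):
    """Bit-true Q15 multiply: wide product, rounded, shifted, saturated."""
    product = a * b            # would occupy a 32-bit register on the target
    product += 1 << 14         # add half an LSB to round to nearest
    result = product >> 15     # rescale from Q30 back to Q15
    return max(-32768, min(32767, result))


a = to_q15(0.5)       # 16384
b = to_q15(0.25)      # 8192
c = q15_mul(a, b)     # 4096, which represents 0.125 in Q15
```

The saturation step matters for the corner case (-1) * (-1), whose exact result +1 is not representable in Q15.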



# Selection of DSP Processors

- DSPs are available in a wide range
- Understanding of application requirements
  - Choose processor that meets project's requirements
  - Find most cost-effective solution
- Selection criteria
  - Arithmetic format
  - Performance (MIPS, MOPS, MFLOPS, MHz)
  - Price (MIPS per dollar)
  - Power consumption (MIPS per mW)
  - On-chip peripherals



# Software Development

- Measures of good DSP software
  - Reliability
  - Maintainability
  - Extensibility
  - Efficiency
- Interdependence between hardware and software
  - DSP designer needs to understand both sides



### "Standard" DSP-Development

Starting point: Software in C/C++





## DSP Development Tools

- Compiler / Assembler / Linker
  - For C/C++, Ada, and other HLLs
  - Performance: HLL vs. hand-optimized assembler
- Software Libraries
  - For dedicated application areas (e.g., audio/video processing)
  - Optimized for certain platforms
- Operating System
  - Manages system resources
  - Often real-time capable



# DSP Development Platforms

- (Instruction set) simulators
  - Simulation of program execution on host computer
  - Differ in accuracy, simulation speed, and completeness
- DSP development boards
  - DSP with peripherals as "development boards" (with connection to the developer's workstation)
  - Allow evaluation, but are not the real target hardware
- In-circuit emulators
  - Development support on the target hardware
  - Processor cycles, breakpoints, profiling



## Instruction Set Simulators

- Purpose
  - Executable code for desired platform
  - Execution times, memory requirements, (SW) pipelining, RTDX
- Hardware support
  - Pin connect, port connect, latency, cache, peripherals, DMA
  - Data path: GPR, FU, cross-paths, control registers
- Performance
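
The core idea of an instruction set simulator, executing target code on the host while accounting for execution time, can be illustrated with a toy model. The three-instruction register machine, its mnemonics, and the cycle costs below are entirely hypothetical; a real simulator models the actual instruction set, pipeline, and memory system.

```python
def simulate(program, cycle_cost):
    """Execute a tiny hypothetical register machine and count cycles."""
    regs = {}
    cycles = 0

    def value(operand):
        # Strings name registers, everything else is an immediate value.
        return regs[operand] if isinstance(operand, str) else operand

    for op, dst, src1, src2 in program:
        if op == "MOV":
            regs[dst] = value(src1)
        elif op == "ADD":
            regs[dst] = value(src1) + value(src2)
        elif op == "MPY":
            regs[dst] = value(src1) * value(src2)
        else:
            raise ValueError(f"unknown instruction {op}")
        cycles += cycle_cost[op]
    return regs, cycles


program = [
    ("MOV", "r0", 3, None),
    ("MOV", "r1", 4, None),
    ("MPY", "r2", "r0", "r1"),
    ("ADD", "r3", "r2", 5),
]
regs, cycles = simulate(program, {"MOV": 1, "ADD": 1, "MPY": 2})
```

Even this simple model shows the trade-off named above: adding detail (pipeline stalls, cache behaviour) improves accuracy but slows the simulation down.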



### **Development Boards**

- DSP Development Boards
  - DSK / EVM
  - IDE
  - Quick Start
  - Reference Design
  - DSP
  - JTAG (via PCI/LAN/USB)
  - SW Runtime Kernel, Libraries











### **Design Challenges**




## Challenges

- Functionality
  - Often new "standards" (codecs, compression, etc.)
- Heterogeneity and complexity
  - Supported platforms
    - Special (re-configurable) hardware, new processors
  - Specifications (language)
    - HW: VHDL, Verilog
    - SW: C, C++, MATLAB, Java
  - Complexity of target system
    - Moore's law
    - Multiprocessor systems, networks



#### Example: Embedded DSP System






# Design Challenges

- Functionality
  - Conformance to the specification?
  - Satisfaction of time constraints? (real-time)
- Target system
  - Implementation in HW or SW?
- Application-dependent parameters
  - Power consumption
  - Required space
  - Costs
  - ...



# Design Techniques

- "Specifying and synthesizing"
  - Specify requirements / functionality at different granularity (executable behaviour description)
  - Automatic synthesis of implementation (compiler, etc.)
  - Standard for software development (?)
- "Specifying, exploring and refining"
  - Specification => explore implementation alternatives
  - Estimate essential properties (execution time, memory, ...)
  - Iterative refinement of specification and implementation



# **Design Space Exploration**

[Figure: flowchart. Debug phase: build and run the application; debug compile or runtime errors. Tuning phase: use tuning tools to optimize the application; analyze the tuning result; "save desired optimized" (label truncated); done.]




# Execution Time vs. Memory Requirements





## Parameter Tuning

Function overrides (from the tuning tool screenshot):

| Function         | Profile Collection |
|------------------|--------------------|
| unpack_sfs       | Smallest           |
| back_bf          | None               |
| back_bf0         | None               |
| bs_fill          | MaximumSpeed       |
| compare          | Fastest            |
| cvt_to_wave      | Fastest            |
| cvt_to_wave_init | None               |
| cvt_to_wave_test | None               |
| dummy            | None               |

Profile results:

| Profile Collection | Cycle | Size | Options       |
|--------------------|-------|------|---------------|
| MaximumSpeed       | 4570  | 305  | -o3 -oi0      |
| Speed              | 5410  | 233  | -o3 -oi0 -ms1 |
| MinimumSize        | 5520  | 209  | -o3 -oi0 -ms3 |
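
The cycle and size figures in the table above illustrate the execution time vs. memory trade-off. As a small sketch of how such tuning results might be used, the following picks the fastest profile that still fits a code-size budget. The budget values and the helper name `fastest_within` are assumptions; the three profiles are the ones listed in the table.

```python
# (profile, cycles, code size, compiler options) as listed in the table
profiles = [
    ("MaximumSpeed", 4570, 305, "-o3 -oi0"),
    ("Speed",        5410, 233, "-o3 -oi0 -ms1"),
    ("MinimumSize",  5520, 209, "-o3 -oi0 -ms3"),
]


def fastest_within(profiles, max_size):
    """Return the profile with the fewest cycles whose size fits the budget."""
    fitting = [p for p in profiles if p[2] <= max_size]
    return min(fitting, key=lambda p: p[1]) if fitting else None


choice = fastest_within(profiles, max_size=250)   # hypothetical budget
```

With a budget of 250, `MaximumSpeed` is excluded by its size, so the `Speed` profile is selected.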



### Modelling of DSP Systems




# Modelling of DSP Systems

- Model
  - Formal description of a (sub-)system
  - Abstraction: description of specific characteristics, omitting unnecessary details
- Level of abstraction
  - System
  - Module / architecture
  - Block / logic
- Perspectives
  - Behaviour
  - Structure



- Common description of DSP algorithms
  - Represented as a directed graph
    - Nodes: processing units
    - Edges: dataflow (input/output) between nodes
  - A node "processes" (fires) when all its input data is available: firing rules
  - Static / dynamic dataflow models, depending on the firing rules
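
The firing rule can be illustrated with a minimal static dataflow sketch: edges are FIFO queues of tokens, and a node fires only when every input edge holds a token. The `Node` class, the two-node graph (a multiply followed by an increment), and the token values are assumptions for illustration.

```python
from collections import deque


class Node:
    """A processing unit that fires when every input edge holds a token."""

    def __init__(self, name, func, inputs, outputs):
        self.name, self.func = name, func
        self.inputs, self.outputs = inputs, outputs

    def can_fire(self):
        return all(len(q) > 0 for q in self.inputs)

    def fire(self):
        # Consume one token per input edge, produce one per output edge.
        args = [q.popleft() for q in self.inputs]
        result = self.func(*args)
        for q in self.outputs:
            q.append(result)


# Edges are FIFO queues carrying data tokens between nodes.
x, c, prod, y = deque(), deque(), deque(), deque()
nodes = [
    Node("add1", lambda v: v + 1, [prod], [y]),
    Node("mul", lambda a, b: a * b, [x, c], [prod]),
]

x.extend([1, 2, 3])       # source tokens
c.extend([10, 10, 10])

fired = True
while fired:              # run until no node can fire any more
    fired = False
    for n in nodes:
        if n.can_fire():
            n.fire()
            fired = True
```

Because each node consumes and produces a fixed number of tokens per firing, this is a static model; a dynamic model would let the firing rule depend on the token values.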









### Algorithm and Architecture





### Structure of a DSP Algorithm





# Algorithm and Architecture

- What is the optimal hardware architecture for a given algorithm?
  - Inherent parallelism of the algorithm
- What is the minimum computation time of an algorithm for a given architecture?
  - Transform algorithm for optimal resource usage



## Example: Digital Filter

- How many adders and multipliers are required?
- What is the minimum execution time?





## Latency vs. Throughput

- Latency
  - Time it takes to generate an output value from the corresponding input value
- Throughput
  - Output values per second; reciprocal of the time between two outputs
  - Increase throughput by using pipelining, for example
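
The difference between the two measures can be shown with a small calculation. The sketch assumes an idealized chain of stages that hand data over as soon as they finish and ignores pipeline register overhead; the stage times (in microseconds) and the function name `pipeline_metrics` are hypothetical.

```python
def pipeline_metrics(stage_times, pipelined):
    """Latency and throughput for a chain of processing stages.

    Without pipelining, a new input starts only after the previous output.
    With pipelining, a new input starts as soon as the slowest stage is free.
    """
    latency = sum(stage_times)
    interval = max(stage_times) if pipelined else latency
    return latency, 1 / interval


stages = [2, 4, 2]   # hypothetical stage times in microseconds
lat_seq, thr_seq = pipeline_metrics(stages, pipelined=False)
lat_pipe, thr_pipe = pipeline_metrics(stages, pipelined=True)
```

In this model pipelining leaves the latency unchanged at 8 µs but doubles the throughput from 0.125 to 0.25 outputs per microsecond, limited by the slowest stage.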





#### Signal Flow Graph




# Signal Flow Graph (SFG) for Digital Filter





# Transform SFG into Precedence Form

- Precedence form can be easily mapped to code for a general-purpose signal processor
- Simplified view of an algorithm



#### Deriving Precedence Form of SFG

1. Collapse unnecessary nodes
2. Assign node variables: input, output, delay elements; basic operations and the results
3. Remove all edges with delay elements; j = 1
4. Choose all initial nodes in the SFG and add to set N<sub>j</sub>
5. Remove edges with basic operations that are executable (i.e. for which all inputs are initial nodes)
6. Remove nodes that no longer have outgoing branches
7. j = j + 1
8. Repeat from step 4 until there are no initial nodes left
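
The procedure above can be sketched as a levelling of the delay-free graph. The code below is a reformulation rather than a literal edge-removal implementation: a node joins the next set once all of its inputs are in earlier sets. The function name, the dictionary representation, and the example graph (a second-order filter with a particular grouping of the additions) are assumptions.

```python
def precedence_levels(preds):
    """Group SFG nodes into sets N_1, N_2, ... in precedence order.

    preds maps each node to the nodes it depends on, with all edges
    through delay elements already removed.
    """
    remaining = dict(preds)
    done = set()
    levels = []
    while remaining:
        # Initial nodes: every input was already placed in an earlier set.
        initial = sorted(n for n, p in remaining.items() if set(p) <= done)
        if not initial:
            raise ValueError("delay-free loop: SFG is not computable")
        levels.append(initial)
        done.update(initial)
        for n in initial:
            del remaining[n]
    return levels


# Inputs and delay-element outputs have no predecessors.
sfg = {
    "x": [], "x1": [], "x2": [], "y1": [], "y2": [],
    "m0": ["x"], "m1": ["x1"], "m2": ["x2"], "m3": ["y1"], "m4": ["y2"],
    "s1": ["m0", "m1"], "s2": ["m2", "m3"],
    "s3": ["s1", "s2"],
    "y": ["s3", "m4"],
}
levels = precedence_levels(sfg)
```

All operations within one set are independent of each other, so the sets expose the parallelism that an instruction schedule can exploit; the number of sets bounds the length of the critical path.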







### Instruction Schedule: Scalar Processor






### Instruction Schedule: Shortest Execution Time






### Instruction Schedule: Optimal Schedule






### Instruction Schedule: Suboptimal Schedule






#### Instruction Schedule: TI C6x DSP




### Compact Notation of Precedence SFG





# Further Optimization: Block Data Processing

Example: Process 2 input values together





# Schedule for 2 Input Samples







## Things we've not considered

- Get input values
- Write output values
- How to get constant values (b<sub>0...2</sub>, a<sub>1,2</sub>)
  - Additional load instructions
  - Additional registers
- Special instructions available on DSP, e.g., MULT2, DOTP2
  - Virtually doubles the number of available units
- Limitations of the underlying DSP hardware
  - e.g., cross-paths, supported operations of functional units







### Example: Dot Product – Block Processing




### Example: Dot Product – Block Processing
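
As a rough illustration of block processing for the dot product, the sketch below handles two element pairs per loop iteration with independent accumulators. The function names are hypothetical, and the code assumes equal-length inputs with an even number of elements.

```python
def dot_product(a, b):
    """Reference version: one multiply-accumulate per iteration."""
    acc = 0
    for i in range(len(a)):
        acc += a[i] * b[i]
    return acc


def dot_product_block2(a, b):
    """Process two element pairs per iteration.

    Assumes len(a) == len(b) and an even length.
    """
    acc0 = acc1 = 0
    for i in range(0, len(a), 2):
        acc0 += a[i] * b[i]            # first multiply-accumulate
        acc1 += a[i + 1] * b[i + 1]    # independent of the first one
    return acc0 + acc1


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
result = dot_product_block2(a, b)
```

Because the two accumulations do not depend on each other, hardware with two multiplier/adder pairs could execute them in the same cycle, which halves the loop count.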





### Example: Dot Product Towards a Software Pipeline





### Software Architecture – DSP/BIOS




### No Software Architecture

- "Prototype implementation", "programming spikes"
  - Quickly becomes complex
  - Limited maintainability and extensibility
  - "Hardware-oriented" programming (performance)



## No Software Architecture





## No Software Architecture

```
main() {
    /* Multi-threaded example */
    ...
}

Thread_Event_0() {          /* Event 0 processing thread */
    ...
    while (1) {
        wait for Event_0 signal
        ProcessEvent_0
    }
}

Thread_Event_1() {          /* Event 1 processing thread */
    ...
    while (1) {
        wait for Event_1 signal
        ProcessEvent_1
    }
}
```

```
Event_0_ISR                 /* interrupt service routine for event 0 */
    ...
    signal Event_0
    ...

Event_1_ISR                 /* interrupt service routine for event 1 */
    ...
    signal Event_1
    ...
```
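
The pseudocode above can be tried out on a host with a small runnable sketch. It uses Python threads and queues as stand-ins for the DSP threads and signals; the `None` shutdown message, the `processed` counters, and the simulated interrupt sequence are additions for illustration and are not part of the original example.

```python
import queue
import threading

mailboxes = [queue.Queue(), queue.Queue()]   # one signal channel per event
processed = [0, 0]


def thread_event(i):
    """Event i processing thread: block until signalled, then process."""
    while True:
        msg = mailboxes[i].get()     # wait for Event_i signal
        if msg is None:              # shutdown request (not in the original)
            break
        processed[i] += 1            # stands in for ProcessEvent_i


def event_isr(i):
    """Stand-in for Event_i_ISR: do minimal work, then signal the thread."""
    mailboxes[i].put("signal")


threads = [threading.Thread(target=thread_event, args=(i,)) for i in range(2)]
for t in threads:
    t.start()

for i in (0, 1, 0):                  # simulated interrupt sequence
    event_isr(i)

for mb in mailboxes:                 # ask both threads to finish
    mb.put(None)
for t in threads:
    t.join()
```

The structure mirrors the pseudocode: the interrupt routines only signal, and all processing happens in the threads that wait for those signals.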



## SW Architectures for DSPs

- SW development "without" SW architecture
  - Only for small DSP systems with limited functionality
- Typical SW requirements:
  - Easy (uniform) access to hardware
  - Multithreading (with priorities)
  - Real-time abilities
  - Integration/porting of third-party code and programs
  - Simple resource management (e.g. memory)



## SW for Embedded Systems

- Additional requirements (compared to general-purpose SW)
  - Significant hardware differences, configurations, fixed-point/floating-point, memory hierarchy, ...
  - Cost efficient (memory, processing power, ...)
  - Time-to-market, re-use of SW modules
  - Reliability, QoS
  - Real-time constraints
- Methods, tools, etc. for efficient SW development
  - Operating system (for DSP, embedded system)
  - Standard for algorithms
  - Framework / SW architecture



### SW for Embedded Systems





# Operating System (Core)

- Important Features
  - Hardware abstraction (interface)
  - Thread/task management
  - Thread/task communication
  - Thread/task synchronization
  - Analysis/monitoring
- (Commercial) operating systems for DSPs
  - VxWorks (POSIX-compatible, networking)
  - DSP/BIOS
  - WinCE
  - ...



# DSP/BIOS (TI)

- OS kernel integrated in CCS IDE
  - Threads with different priorities










### Example: Motor Control

A single DSP carries out several functions:

- Motor control
- Keyboard entries
- Display
- Data transfer





# Periodic function manager

### **Prioritizing Periodic Thread Functions**
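
The idea of a periodic function manager can be sketched on a host as follows. This is not the DSP/BIOS PRD API: the `PeriodicManager` class, its methods, the tick periods, and the choice to run shorter-period functions first are all assumptions for illustration, using the functions of the motor control example.

```python
class PeriodicManager:
    """Calls registered functions every `period` ticks (hypothetical API)."""

    def __init__(self):
        self.entries = []        # (period, name, function)
        self.tick_count = 0

    def register(self, period, name, func):
        self.entries.append((period, name, func))
        # Assumed priority rule: shorter period runs first.
        self.entries.sort(key=lambda e: e[0])

    def tick(self):
        """Called once per timer tick."""
        self.tick_count += 1
        for period, name, func in self.entries:
            if self.tick_count % period == 0:
                func()


log = []
prd = PeriodicManager()
prd.register(4, "display", lambda: log.append("display"))
prd.register(1, "motor", lambda: log.append("motor"))
prd.register(2, "keyboard", lambda: log.append("keyboard"))

for _ in range(4):
    prd.tick()
```

In this sketch the motor control function runs on every tick, the keyboard scan on every second tick, and the display update on every fourth tick.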





#### Passing Data






# DSP/BIOS Configuration





## **DSP/BIOS** Configuration

[Screenshot of the DSP/BIOS configuration tool. Estimated data size: 8744; estimated minimum stack size (MAUs): 528.]

Module tree shown in the screenshot:

- System
  - Global Settings
  - MEM - Memory Section Manager (IRAM, SDRAM)
  - BUF - Buffer Manager
  - POOL - Allocator Manager
  - SYS - System Settings
  - HOOK - Module Hook Manager
- Instrumentation
  - LOG - Event Log Manager
  - STS - Statistics Object Manager
- Scheduling
  - CLK - Clock Manager
  - PRD - Periodic Function Manager
  - HWI - Hardware Interrupt Service Routine Manager
  - SWI - Software Interrupt Manager
  - TSK - Task Manager
- Synchronization
  - SEM - Semaphore Manager
  - MBX - Mailbox Manager
  - QUE - Atomic Queue Manager

Properties of task0 listed in the screenshot:

- comment
- Task function
- Task function argument 0 through 7
- Automatically allocate stack
- Manually allocated stack
- Stack size (MAUs)
- Stack Memory Segment
- Priority
- Environment pointer
- Don't shut down system while this task is (label truncated)
- Allocate Task Name on Target




# Example: Audio Processing





## Example: Audio Processing





#### **Example: Audio Processing**






#### Example: Audio Processing






### Frameworks

- SW architecture for DSP systems
  - "Standard" algorithms can be easily integrated
  - Based on system software
    - Chip support library
    - Board support library
- Framework is adaptable
  - Functionality
  - Number of channels
  - Dynamic/static configuration
  - Memory restrictions
  - ...




## Reference Framework (TI)

| Feature | Module | Description | Static creation | Dynamic creation | C support | ASM routines |
|---------|--------|-------------|-----------------|------------------|-----------|--------------|
| **Real Time Analysis and Data Capture** | | | | | | |
| Event Logging | LOG | Message Log Manager | x | | x | x |
| Statistics Accumulation | STS | Statistics Accumulator Manager | x | | x | x |
| Trace Control | TRC | Trace Manager | x | | x | x |
| File Streaming | HST | Host I/O Manager | x | | x | x |
| Real Time Data Exchange | RTDX | Target to Host Communication Manager | x | | x | x |
| **Hardware Abstraction** | | | | | | |
| On-Chip Timer | CLK | System Clock Manager | x | | x | x |
| Hardware Interrupts | HWI | Hardware Interrupt Manager | x | | | x |
| Static Memory Management | MEM \* | Memory Segment Manager | x | | | |
| Dynamic Memory Management | MEM \*\* | Memory Segment Manager | | x | x | |
| **Device-Independent I/O** | | | | | | |
| Data Pipes | PIP | Data Pipe Manager | x | | x | x |
| Data Streams | SIO | Stream I/O Manager | x | x | x | |




## Reference Framework (TI)

| Feature | Module | Description | Static creation | Dynamic creation | C support | ASM routines |
|---------|--------|-------------|-----------------|------------------|-----------|--------------|
| **Execution Thread Management** | | | | | | |
| Software Interrupts | SWI | Software Interrupt Manager | x | x | x | x |
| Periodic Functions | PRD | Periodic Function Manager | x | | x | x |
| Tasks | TSK | Multitasking Manager | x | x | x | |
| Idle Loop | IDL | Idle Function and Processing Loop Manager | x | | x | x |
| **Inter-Thread Communication and Synchronization** | | | | | | |
| Semaphores | SEM | Semaphore Manager | x | x | x | |
| Resource Locks | LCK | Resource Lock Manager | x | x | x | |
| Mailboxes | MBX | Mailbox Manager | x | x | x | |
| Queues | QUE | Queue Manager | x | x | x | |
| **Other Services** | | | | | | |
| Atomic Functions (optimized and non-preemptive) | | Atomic Functions written in Assembly Language | N/A | N/A | x | |



### Conclusion






### **Additional References**

- Philip D. Lapsley: **DSP Processor Fundamentals: Architectures and Features**. Berkeley Design Technology Inc., 1994.
- Lars Wanhammar: **DSP Integrated Circuits**. Academic Press Series in Engineering, 1st ed., 1999.
- "How to Get Started With the DSP/BIOS Kernel" (SPRA782)
- "TMS320 DSP/BIOS User's Guide" (SPRU423)
- "Code Composer Studio Development Tools" (SPRU509)
- "DSP/BIOS Technical Overview" (SPRA780, SPRA646)