

# **Key Design Features**

- Synthesizable, technology independent IP Core for FPGA, ASIC or SoC
- Supplied as human readable VHDL (or Verilog) source code
- Output supports full flow control permitting output pixels to be stalled (or even whole frames if necessary)
- Supports any video resolution<sup>1</sup>
- Support for RGB or YCbCr pixel formats
- Includes frame skip and frame repeat functionality to compensate for different input and output frame rates
- Generic 128-bit external memory interface with configurable burst size
- Linear memory bursts minimise page-breaks in synchronous memory architectures
- Ideal for interfacing to all types of memory such as SRAM, SDRAM, DDR, DDR2, DDR3, DDR4 etc.
- Supports 300 MHz+ operation on basic FPGA devices<sup>2</sup>

# Applications

- Buffering video frames in external memory
- Real-time digital video applications
- Video genlock applications
- Adapting to different pixel-clock rates and frame rates
- Essential component in video processing pipelines

# **Generic Parameters**

| Generic name         | Description                                                        | Type    | Valid range  |
|----------------------|--------------------------------------------------------------------|---------|--------------|
| bits_per_pixel (bbp) | Input video bits<br>per pixel                                      | integer | 16, 24 or 32 |
| mem_start_addr       | Start address in<br>memory of frame<br>buffer<br>(128-bit aligned) | integer | ≥ 0          |
| mem_burst_size       | Size of memory<br>read / write burst<br>(in 128-bit words)         | integer | ≥2           |
| mem_frame_repeat     | Enable / disable<br>frame repeat<br>mode                           | boolean | True/False   |



Figure 1: Video Frame Buffer architecture

# **Pin-out Description**

#### SYSTEM SIGNALS


| Pin name     | I/O | Description                                  | Active state |
|--------------|-----|----------------------------------------------|--------------|
| clk          | in  | Synchronous system<br>clock                  | rising edge  |
| reset        | in  | Asynchronous system<br>reset                 | low          |
| fb_proc      | out | Frame processed strobe                       | high         |
| fb_skip      | out | Frame skip strobe                            | high-pulse   |
| fb_repeat    | out | Frame repeat strobe<br>(when repeat enabled) | high-pulse   |
| fb_err_ovfl1 | out | Input FIFO overflow<br>error                 | high         |
| fb_err_ovfl2 | out | Output FIFO overflow error                   | high         |
| fb_err_uflow | out | Output pixel underflow flag                  | high         |

<sup>1</sup> External memory permitting

<sup>2</sup> Xilinx® 7-series used as a benchmark




#### Video Frame Buffer IP Core Rev. 2.0

#### INPUT VIDEO INTERFACE

| Pin name                        | I/O | Description                                                      | Active state |
|---------------------------------|-----|------------------------------------------------------------------|--------------|
| pixin<br>[bits_per_pixel - 1:0] | in  | Input pixel                                                      | data         |
| pixin_sof                       | in  | Start of frame flag<br>(coincident with first<br>pixel in frame) | high         |
| pixin_val                       | in  | Input pixel valid                                                | high         |

#### PROGRAMMABLE INPUT VIDEO PARAMETERS

| Pin name                        | I/O | Description                                                      | Active state |
|---------------------------------|-----|------------------------------------------------------------------|--------------|
| pixels_per_line (ppl)<br>[15:0] | in  | Number of pixels in<br>each line of input<br>video               | data         |
| lines_per_frame (lpf)<br>[15:0] | in  | Number of lines in<br>each frame of input<br>video               | data         |
| words_per_frame<br>[31:0]       | in  | Size of one frame in<br>128-bit words<br>(ppl * lpf * bbp) / 128 | data         |

#### OUTPUT VIDEO INTERFACE

| Pin name                         | I/O | Description                                                      | Active state |
|----------------------------------|-----|------------------------------------------------------------------|--------------|
| pixout<br>[bits_per_pixel - 1:0] | out | Output pixel                                                     | data         |
| pixout_vsync                     | out | Vertical sync flag<br>(coincident with first<br>pixel in frame)  | high         |
| pixout_hsync                     | out | Horizontal sync flag<br>(coincident with first<br>pixel in line) | high         |
| pixout_val                       | out | Output pixel valid                                               | high         |
| pixout_rdy                       | in  | Ready to accept<br>output pixel<br>(handshake signal)            | high         |

#### GENERIC 128-BIT MEMORY INTERFACE

| Pin name          | <i>I/O</i> | Description                                             | Active state        |
|-------------------|------------|---------------------------------------------------------|---------------------|
| mem_rw            | out        | Memory read / write<br>flag                             | 0: write<br>1: read |
| mem_wdata [127:0] | out        | Memory write data                                       | data                |
| mem_addr [31:0]   | out        | Memory read / write<br>address                          | data                |
| mem_addr_val      | out        | Memory request valid                                    | high                |
| mem_addr_rdy      | in         | Ready to accept<br>memory request<br>(handshake signal) | high                |
| mem_rdata [127:0] | in         | Memory read data                                        | data                |
| mem_rdata_val     | in         | Memory read data valid                                  | high                |

# **General Description**

The VID\_FRAME\_BUFFER (VFB) IP Core is a high-speed multi-format video frame buffer that samples an input video stream and buffers it in an external memory. The VFB is capable of very high-speed operation - achieving over 300 MHz on standard FPGA platforms.

The VFB will automatically adapt to different input and output frame rates. If the input frame rate is too high, then the VFB will drop or 'skip' an input frame. Likewise, if the output frame rate is higher than the input frame rate, then frames will be repeated<sup>3</sup>. The result is a system that seamlessly adapts to the different frame rates at the input and output of the VFB.

The memory port is a generic 128-bit read/write interface that may be connected to a wide variety of memory types and memory controllers. Memory read/write requests are sent as a sequential linear burst that is optimized for transfers over synchronous memory.

By using a series of VFB IP Cores in parallel, multiple video sources may be synchronized together. Figure 1 shows the architecture of the Video Frame Buffer in more detail.

#### Input video interface

The VFB supports any input pixel format as long as the pixels are aligned to a 16, 24 or 32-bit word boundary. Input pixels are sampled on the rising-edge of the system clock when *pixin\_val* is high. The signal *pixin\_sof* is an active high flag that is coincident with the first pixel of the input frame.

Note that the input video interface is free running and non-stallable. If the input frame rate is too high for the available memory bandwidth, then input frames will be dropped.

#### Output video interface

Pixels flow out of the VFB in accordance with the valid-ready pipeline protocol. This protocol is used by all Zipcores video IP, and allows for simple connectivity between modules.

Output pixels and syncs are transferred out of the VFB on the rising edge of the system clock when *pixout\_val* and *pixout\_rdy* are both high. The output may be stalled by holding *pixout\_rdy* low, which allows pixels (or even whole frames) to be held back. To identify the boundaries between frames and lines, the sync signals *pixout\_vsync* and *pixout\_hsync* are provided. The vsync signal is asserted with the first output pixel of a frame and the hsync signal is asserted with the first output pixel of a line.
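The transfer rule above can be illustrated with a small cycle-based model. This is only a sketch written in Python rather than HDL; the function name `transfers` and the list-per-cycle representation are illustrative assumptions and are not part of the core.

```python
def transfers(pixout, pixout_val, pixout_rdy):
    """Return the pixels accepted by the sink, in order.

    Each argument is a list with one entry per clock cycle. A pixel is
    transferred on a cycle only when both pixout_val and pixout_rdy are
    high; any other cycle is a stall or a bubble.
    """
    accepted = []
    for pix, val, rdy in zip(pixout, pixout_val, pixout_rdy):
        if val and rdy:
            accepted.append(pix)
    return accepted


# Six clock cycles: the sink stalls on cycle 2 (the source holds 0x12),
# and the source has no valid pixel on cycle 4.
pix = [0x10, 0x11, 0x12, 0x12, 0x00, 0x13]
val = [1, 1, 1, 1, 0, 1]
rdy = [1, 1, 0, 1, 1, 1]
result = transfers(pix, val, rdy)
print([hex(p) for p in result])
```

In this model the stalled pixel is simply held by the source until the sink is ready, so no pixel is lost or duplicated.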

#### Generic memory interface

The memory interface is a generic single-ported 128-bit read/write type that may be connected to a wide variety of memories and memory controllers.

Each memory request is sent using the valid-ready protocol. A request is transferred on a rising clock edge when *mem\_addr\_val* and *mem\_addr\_rdy* are both high. For a memory write, the *mem\_rw* flag is driven low. For a memory read, the *mem\_rw* flag is driven high. The *mem\_addr* signal is common to both read and write requests.

<sup>3</sup> Assuming frame-repeat mode is enabled




Requests are sent as a sequential linear burst with the number of words in each burst being controlled by the generic parameter *mem\_burst\_size*.

The burst size controls the number of sequential read or write requests. Setting a larger burst size will increase the number of sequential accesses to memory and potentially lower the number of page-breaks. Conversely, making the burst size too large may starve the next read or write request of memory bandwidth. For this reason, care should be taken when selecting this parameter.

The parameter *words\_per\_frame* defines the size of one complete frame of input video in 128-bit words. The parameter *mem\_frame\_repeat* determines whether video frames should be repeated if the output frame rate is higher than the input frame rate. Finally, the parameter *mem\_start\_addr* defines where the frame buffer should start in physical memory. The memory must be large enough to hold 4 complete frames of input video. This is shown in Figure 2 as a system memory map.



Figure 2: System memory map (128-bit word aligned)

#### System flags and diagnostic signals

The *fb\_skip* flag is an active high strobe that pulses high every time an input frame is dropped. This signal shows activity when the input frame rate is higher than the output frame rate. Conversely, the *fb\_repeat* flag pulses high every time an output frame is repeated. This signal will be active when the output frame rate is higher than the input frame rate. The signal *fb\_proc* is pulsed high every time an input frame is processed. A combination of all three flags may be used to provide real-time information about the input and output video stream. Figure 3 shows the relationship between the output frames and frame repeat/skip flags.

| Sequence                                |          |          |          |          |          |          |          |
|-----------------------------------------|----------|----------|----------|----------|----------|----------|----------|
| Input frame sequence                    | Frame #1 | Frame #2 | Frame #3 | Frame #4 | Frame #5 | Frame #6 | Frame #7 |
| Output frame sequence - repeated frames | Frame #1 | Frame #2 | Frame #2 | Frame #3 | Frame #4 | Frame #4 | Frame #5 |
| Output frame sequence - skipped frames  |          | Frame #2 |          | Frame #5 | Frame #7 |          |          |

Figure 3: Output frame sequences with repeated and skipped frames



To maintain a steady video output display, the designer should aim for a well-balanced system in which the incidence of frame skip and frame repeat is kept to a minimum. The optimum system is one where the input and output frame rates are the same or closely matched.

The most important diagnostic flags to take note of are the signals *fb\_err\_ovfl1*, *fb\_err\_ovfl2* and *fb\_err\_uflow*. The signal *fb\_err\_ovfl1* indicates that the input FIFOs have overflowed. An input FIFO overflow condition occurs when the input pixel rate is too high. The signal *fb\_err\_ovfl2* indicates that the output read FIFOs have overflowed<sup>4</sup>. Finally, the *fb\_err\_uflow* flag is asserted high if there is a dropout of valid output pixels. This is not necessarily an error, but it could indicate a system with insufficient memory read bandwidth.

The only way to recover from an error condition is to assert a system reset. On reset, the VFB will resynchronize to the next input frame and operation will continue as normal.

#### Practical system considerations

(a) Internally, the VFB is 128-bit word aligned. This means that a single video frame must occupy a whole number of 128-bit words. In particular, the following calculation must result in a whole number:

 $words\_per\_frame = \frac{pixels\_per\_line \times lines\_per\_frame \times bits\_per\_pixel}{128}$ 

(b) As the memory interface divides each frame into discrete bursts of 128-bit words, the size of a single video frame must be divisible by the memory burst size. Likewise, the following calculation must result in a whole number:

 $bursts\_per\_frame = \frac{words\_per\_frame}{mem\_burst\_size}$ 

<sup>4</sup> See cases (c) and (f) - Practical system considerations



For common video resolutions, *words\_per\_frame* and *bursts\_per\_frame* generally come out as whole numbers. However, for more obscure user-defined video modes, the input video resolution or burst size may need to be adjusted to give integer values.

(c) There comes a point when the input pixel data rate becomes too high for the VFB to tolerate and the input pixel FIFOs overflow. When this happens, even dropping individual input frames will not help, as the instantaneous pixel rate exceeds the maximum bandwidth available. Assuming an 'ideal', non-stalling memory interface where the bandwidth is shared equally between reads and writes, the minimum system clock frequency required for a given input pixel clock frequency is given by:

 $sys\_clk_{min} = 2 \times pixel\_clk \times \frac{bits\_per\_pixel}{128}$ 

As an example, consider a 65 MHz input pixel clock at 24 bits/pixel. The minimum system clock frequency needed to avoid internal overflow would be  $65 \times 2 \times (24/128) = 24.375$  MHz. In practice, however, a higher system clock frequency is often required to compensate for inefficiencies in the memory interface, such as page-breaks and auto-refresh.

(d) In order to minimize the performance bottleneck at the memory interface, the external memory should be clocked at the system clock frequency or better.



(e) The external memory should be large enough to accommodate up to 4 frames of video. The size in 128-bit words is given by:

 $mem\_size = 4 \times words\_per\_frame = \frac{4 \times pixels\_per\_line \times lines\_per\_frame \times bits\_per\_pixel}{128}$ 

For example, consider an XGA (1024x768) input source at 16 bits/pixel. In this case, a minimum memory size of 1024 x 768 x 16 x 4 / 128 = 384k x 128-bit would be required. A 1M x 128-bit memory or greater would be a good choice in this instance.

(f) The internal FIFOs have enough buffering to accommodate 7 'in-flight' read memory bursts for a maximum burst size of 64. For this reason, the memory read latency must not exceed 448 system clock cycles. If a very high memory read latency is expected, then please contact Zipcores and the amount of internal buffering can be adjusted accordingly.
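The checks in cases (a), (b), (c) and (e) can be collected into a short sizing helper. The following is a sketch only: the function name `frame_sizing` and the dictionary keys are illustrative assumptions, and the factor of 2 in the clock calculation is inferred from the worked 65 MHz example together with the stated equal sharing of bandwidth between reads and writes.

```python
def frame_sizing(pixels_per_line, lines_per_frame, bits_per_pixel,
                 mem_burst_size, pixel_clk_mhz):
    """Check frame geometry and derive sizing figures for the VFB.

    Raises ValueError if the frame is not a whole number of 128-bit
    words (case a) or a whole number of memory bursts (case b).
    """
    frame_bits = pixels_per_line * lines_per_frame * bits_per_pixel
    if frame_bits % 128 != 0:
        raise ValueError("frame is not a whole number of 128-bit words")
    words_per_frame = frame_bits // 128

    if words_per_frame % mem_burst_size != 0:
        raise ValueError("frame is not a whole number of memory bursts")

    return {
        "words_per_frame": words_per_frame,
        "bursts_per_frame": words_per_frame // mem_burst_size,
        # Case (e): room for 4 complete frames, in 128-bit words.
        "min_mem_words": 4 * words_per_frame,
        # Case (c): ideal memory, bandwidth split equally between
        # reads and writes (assumed factor of 2).
        "min_sys_clk_mhz": 2 * pixel_clk_mhz * bits_per_pixel / 128,
    }


xga16 = frame_sizing(1024, 768, 16, mem_burst_size=64, pixel_clk_mhz=65.0)
xga24 = frame_sizing(1024, 768, 24, mem_burst_size=64, pixel_clk_mhz=65.0)
print(xga16["min_mem_words"])      # 393216 words, i.e. 384k x 128-bit
print(xga24["min_sys_clk_mhz"])    # 24.375
```

The clock figure is a lower bound for an ideal memory. A real controller with page-breaks and refresh cycles will need additional headroom.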

# **Functional Timing**

#### Input video interface

Figure 4 shows the signalling at the input to the VFB. The input pixel and the sof flag are sampled on the rising edge of *clk* when *pixin\_val* is high. When *pixin\_val* is de-asserted then the input pixel is ignored.



Figure 4: Input video interface timing

#### Output video interface

Output pixels and syncs are transferred out of the VFB on the rising edge of *clk* when *pixout\_val* and *pixout\_rdy* are both high. If *pixout\_rdy* is held low, then the output is stalled and the frame buffer will buffer input pixels (or whole frames) until *pixout\_rdy* is asserted high again. Figure 5 shows the output video timing at the start of a new output frame. Both *pixout\_vsync* and *pixout\_hsync* are asserted high with the first pixel of a new frame.



Figure 5: Output video interface timing – start of new output frame

Figure 6 demonstrates the timing at the start of a new line. A new line begins with *pixout\_hsync* coincident with the first pixel. The signal *pixout\_vsync* is held low.






Figure 6: Output video interface timing - start of new output line

#### Generic 128-bit memory interface

Figure 7 shows a series of write bursts to memory. In this particular example, the parameter *mem\_burst\_size* has been set to 4<sup>5</sup>. Each memory burst is a block write of 4 words. The addresses are guaranteed to be sequential within a burst. Between bursts, the *mem\_addr\_val* signal is de-asserted for one cycle.

At any point during the write transfer, the handshake signal *mem\_addr\_rdy* may be driven low. While it is low, the memory request is stalled until *mem\_addr\_rdy* is asserted high again.



Figure 7: Memory write burst timing (burst size of 4)
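The burst behaviour described above can be modelled cycle by cycle. The sketch below makes two assumptions that the text does not state: each request advances *mem\_addr* by one 128-bit word, and the address continues sequentially from one burst to the next. The function name `write_burst_requests` is hypothetical.

```python
def write_burst_requests(start_addr, burst_size, num_bursts, rdy):
    """Return the addresses accepted by the memory, in order.

    rdy holds the value of mem_addr_rdy for each clock cycle. Within a
    burst mem_addr_val is high and a request moves only when rdy is
    also high. After each burst mem_addr_val is low for one cycle.
    """
    accepted = []
    addr = start_addr            # assumed word addressing
    sent_in_burst = 0
    gap = False                  # True during the idle cycle between bursts
    total = burst_size * num_bursts

    for ready in rdy:
        if len(accepted) == total:
            break
        if gap:
            gap = False          # mem_addr_val is low on this cycle
            continue
        if ready:                # valid and ready: request is transferred
            accepted.append(addr)
            addr += 1
            sent_in_burst += 1
            if sent_in_burst == burst_size:
                sent_in_burst = 0
                gap = True
        # if not ready, the same request is held on the next cycle
    return accepted


# Two bursts of 4 with a stall on cycle 2 and the idle gap on cycle 5.
rdy = [1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
accepted = write_burst_requests(0x100, burst_size=4, num_bursts=2, rdy=rdy)
print([hex(a) for a in accepted])
```

A stall delays a request without changing its address, so the accepted sequence stays linear regardless of how often *mem\_addr\_rdy* goes low.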

The timing is very similar for a read burst. Figure 8 shows a single read burst and corresponding read data returned from memory.



Figure 8: Memory read burst timing (burst size of 4)

# Source File Description

All source files are provided as text files coded in VHDL. The following table gives a brief description of each file.

| Source file                | Description                         |
|----------------------------|-------------------------------------|
| video_in.txt               | Text-based source video file        |
| video_src_reader.vhd       | Reads text-based source video file  |
| mem_model_pack.vhd         | Memory model functions              |
| ram_model.vhd              | Single port memory model            |
| mem_model_1Mx128bit.vhd    | Large 1Mx128 memory model           |
| pipeline_reg.vhd           | Pipeline register element           |
| vid_in_reg.vhd             | Video input register                |
| vid_out_reg.vhd            | Video output register               |
| vid_sync_fifo.vhd          | Synchronous pixel FIFO              |
| vid_sync_fifo_reg.vhd      | Sync FIFO internal register         |
| ram_dp_w_r.vhd             | Dual port RAM component             |
| vid_align_frame.vhd        | Aligns pixels to the start of frame |
| vid_pack128.vhd            | Pixel packer                        |
| pack_16_to_32.vhd          | 16-bit to 32-bit packer             |
| pack_24_to_32.vhd          | 24-bit to 32-bit packer             |
| pack_32_to_32.vhd          | 32-bit to 32-bit packer             |
| pack_32_to_128.vhd         | 32-bit to 128-bit packer            |
| vid_frame_fifo.vhd         | Main frame-FIFO controller          |
| vid_mem_write.vhd          | Memory write burst controller       |
| vid_mem_read.vhd           | Memory read burst controller        |
| vid_mem_arb.vhd            | Memory R/W arbiter                  |
| vid_unpack128.vhd          | Pixel unpacker                      |
| unpack_32_to_16.vhd        | 32-bit to 16-bit unpacker           |
| unpack_32_to_24.vhd        | 32-bit to 24-bit unpacker           |
| unpack_32_to_32.vhd        | 32-bit to 32-bit unpacker           |
| unpack_128_to_32.vhd       | 128-bit to 32-bit unpacker          |
| vid_sync_regen.vhd         | Video sync generator                |
| vid_uflow_check.vhd        | Pixel underflow checker             |
| vid_frame_buffer.vhd       | Top-level component                 |
| vid_frame_buffer_bench.vhd | Top-level test bench                |

<sup>5</sup> A larger burst size is advised for synchronous memory types to reduce page-breaks. A burst size of 4 is shown for example only.




# **Functional Testing**

An example VHDL testbench is provided for use in a suitable VHDL simulator. The compilation order of the source code is the same order as described in the source file description above.

The VHDL testbench instantiates the VID\_FRAME\_BUFFER component and the user may modify the generic parameters in order to set up the desired test conditions.

The source video for the simulation is generated by the video source reader component (*video\_src\_reader.vhd*). This component reads a text-based file which contains the RGB pixel data. The text file is called *video\_in.txt* and should be placed in the top-level simulation directory.

The file *video\_in.txt* follows a simple format which defines the state of signals: *pixin\_val*, *pixin\_sof*, and *pixin* on a clock-by-clock basis. An example file for a 24-bit/pixel input source might be the following:

| pixin_val | pixin_sof | pixin  | Comment                 |
|-----------|-----------|--------|-------------------------|
| 1         | 1         | 000000 | # pixel 0, frame 0      |
| 1         | 0         | 111111 | # pixel 1, frame 0      |
| 0         | 0         | 000000 | # don't care!           |
| 1         | 0         | 222222 | # pixel 2, frame 0      |
| 1         | 0         | 333333 | # pixel 3, frame 0      |
| ...       |           |        |                         |
| 1         | 1         | 000000 | # pixel 0, frame 1      |
| 1         | 0         | 111111 | # pixel 1, frame 1 etc. |

In this example, the first line of the *video\_in.txt* file asserts the input signals pixin\_val = 1, pixin\_sof = 1, and pixin = 0x000000, the second line asserts the input signals pixin\_val = 1, pixin\_sof = 0, and pixin = 0x111111 etc.
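As an illustration of the file format, the sketch below parses lines of this form into per-clock tuples. It assumes whitespace-separated fields, a hexadecimal pixel value and `#` as the start of a trailing comment, which matches the example above. The function name `parse_video_in` is hypothetical and is not part of the supplied test bench.

```python
def parse_video_in(lines):
    """Yield (pixin_val, pixin_sof, pixin) tuples, one per clock cycle."""
    for line in lines:
        text = line.split("#", 1)[0].strip()   # drop the trailing comment
        if not text:
            continue                            # skip blank or comment-only lines
        val, sof, pix = text.split()
        yield int(val), int(sof), int(pix, 16)


sample = [
    "1 1 000000 # pixel 0, frame 0",
    "1 0 111111 # pixel 1, frame 0",
    "0 0 000000 # don't care!",
    "1 0 222222 # pixel 2, frame 0",
]
cycles = list(parse_video_in(sample))
# Only cycles with pixin_val = 1 carry a pixel into the frame buffer.
valid_pixels = [pix for val, sof, pix in cycles if val]
print([f"{p:06x}" for p in valid_pixels])
```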

The simulation must be run for at least 30 ms during which time an output text file called *video\_out.txt* will be generated. This file contains a sequential list of output pixels in a similar format. Each line defines the state of the signals: *pixout\_val, pixout\_vsync, pixout\_hsync* and *pixout*. An example output file might be:

| pixout_val | pixout_vsync | pixout_hsync | pixout | Comment                         |
|------------|--------------|--------------|--------|---------------------------------|
| 1          | 1            | 1            | 000000 | # pixel 0, frame 0, line 0      |
| 1          | 0            | 0            | 111111 | # pixel 1, frame 0, line 0      |
| 1          | 0            | 0            | 222222 | # pixel 2, frame 0, line 0      |
| 1          | 0            | 0            | 333333 | # pixel 3, frame 0, line 0      |
| 1          | 0            | 0            | 444444 | # pixel 4, frame 0, line 0      |
| 1          | 0            | 0            | 555555 | # pixel 5, frame 0, line 0      |
| 1          | 0            | 0            | 666666 | # pixel 6, frame 0, line 0      |
| 1          | 0            | 0            | 777777 | # pixel 7, frame 0, line 0      |
| 1          | 0            | 1            | 000000 | # pixel 0, frame 0, line 1      |
| 1          | 0            | 0            | 111111 | # pixel 1, frame 0, line 1      |
| ...        |              |              |        |                                 |
| 1          | 1            | 1            | 000000 | # pixel 0, frame 1, line 0      |
| 1          | 0            | 0            | 000000 | # pixel 1, frame 1, line 0 etc. |
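One way to sanity-check an output file of this kind is to group the valid pixels into frames and lines using the sync flags. The sketch below is an assumption about how such a check might look and is not part of the supplied test bench. It takes already-parsed records and uses a shortened sequence in place of a full frame.

```python
def line_lengths(records):
    """Group valid output pixels into frames and lines.

    records: iterable of (pixout_val, pixout_vsync, pixout_hsync, pixout).
    Returns a list of frames, each a list of pixel counts per line.
    """
    frames = []
    for val, vsync, hsync, _pix in records:
        if not val:
            continue                 # no valid pixel on this cycle
        if vsync:
            frames.append([])        # first pixel of a new frame
        if not frames:
            continue                 # ignore pixels before the first vsync
        if hsync:
            frames[-1].append(0)     # first pixel of a new line
        if frames[-1]:
            frames[-1][-1] += 1
    return frames


# Line 0 with 8 pixels, 2 pixels of line 1, then the start of frame 1.
records = [(1, 1, 1, 0x000000)]
records += [(1, 0, 0, 0x111111 * n) for n in range(1, 8)]
records += [(1, 0, 1, 0x000000), (1, 0, 0, 0x111111)]
records += [(1, 1, 1, 0x000000), (1, 0, 0, 0x000000)]
frames = line_lengths(records)
print(frames)
```

For a complete file, every inner list should have *lines\_per\_frame* entries and every entry should equal *pixels\_per\_line*.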

In the example test provided, a series of 8 frames of QVGA (320x240) as 24-bit RGB video are buffered in the VFB. Each video frame is numbered 1 to 4 in sequence to ensure that the frame output order is correct. The results of the simulation are shown in Figure 9.



Figure 9: VFB simulation output - 8 frames in sequence



# Synthesis and Implementation

The files required for synthesis and the design hierarchy are shown below:

- vid\_frame\_buffer.vhd
  - vid\_in\_reg.vhd
  - vid\_align\_frame.vhd
  - vid\_pack128.vhd
    - pack\_16\_to\_32.vhd
    - pack\_24\_to\_32.vhd
    - pack\_32\_to\_32.vhd
    - pack\_32\_to\_128.vhd
  - vid\_sync\_fifo.vhd
    - ram\_dp\_w\_r.vhd
    - vid\_sync\_fifo\_reg.vhd
  - vid\_frame\_fifo.vhd
    - vid\_mem\_write.vhd
    - vid\_mem\_read.vhd
    - vid\_mem\_arb.vhd
  - pipeline\_reg.vhd
  - vid\_sync\_fifo.vhd
    - ram\_dp\_w\_r.vhd
    - vid\_sync\_fifo\_reg.vhd
  - vid\_unpack128.vhd
    - unpack\_32\_to\_16.vhd
    - unpack\_32\_to\_24.vhd
    - unpack\_32\_to\_32.vhd
    - unpack\_128\_to\_32.vhd
  - vid\_sync\_regen.vhd
  - vid\_out\_reg.vhd
    - pipeline\_reg.vhd
  - vid\_uflow\_check.vhd

The VHDL core is designed to be technology independent. However, as a benchmark, synthesis results have been provided for the Xilinx® 7-series FPGAs. Synthesis results for other FPGAs and technologies can be provided on request.

No special synthesis constraints are required. However, setting frame repeat mode to false will generally result in a slightly faster design. Trial synthesis results are shown with the generic parameters set to: *bits\_per\_pixel* = 24, *mem\_start\_addr* = 0, *mem\_burst\_size* = 64, *mem\_frame\_repeat* = false.

Resource usage is specified after place and route of the design.

#### XILINX® 7-SERIES FPGAS

| Resource type        | Artix-7 | Kintex-7 | Virtex-7 |
|----------------------|---------|----------|----------|
| Slice Register       | 1709    | 1709     | 1709     |
| Slice LUTs           | 960     | 957      | 957      |
| Block RAM            | 4       | 4        | 4        |
| DSP48                | 0       | 0        | 0        |
| Occupied Slices      | 671     | 683      | 683      |
| Clock freq. (approx) | 300 MHz | 350 MHz  | 400 MHz  |

# **Revision History**

| Revision | Change description                                                                                                                                                       | Date       |
|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
| 1.0      | Initial revision                                                                                                                                                         | 02/02/2010 |
| 1.1      | Added practical design considerations section                                                                                                                            | 04/03/2010 |
| 1.2      | Moved to 128-bit version                                                                                                                                                 | 25/02/2011 |
| 1.3      | Parameters:<br>pixels_per_line, lines_per_frame and<br>words_per_frame are now programmable                                                                              | 08/04/2011 |
| 2.0      | Major new release and code clean-up.<br>Frame buffer now runs off one system clock.<br>Support for odd-sized burst lengths. Added<br>new underflow flag for system debug | 07/06/2017 |
