

# Key Design Features

- Synthesizable, technology independent VHDL IP Core
- Versatile 24-bit RGB/YCbCr video scaler capable of scaling up or down by any factor. Different pixel formats supported on request
- 24-bit accumulator with 24-bit scale-pitch in [24 12] format
- Supports all video resolutions between 16x16 and 2<sup>16</sup> x 2<sup>16</sup> pixels
- Fully pipelined architecture with simple flow control
- Features a 2x2 polyphase filter in the x and y dimensions.
   Each filter has 16 unique phases or interpolation points
- Fully programmable filter coefficients to suit the desired application
- Example bilinear coefficients shipped with the design
- Output rate is 1 x 24-bit pixel per clock for scaling factors > 1
- Generates one scaled output frame for every input frame
- No frame buffer required
- Supports 250MHz+ operation on basic FPGA devices<sup>1</sup>

# **Applications**

- High quality 24-bit RGB/YCbCr video scaling
- Conversion of popular video formats to any other resolution such as VGA to XGA, SVGA to HD1080 etc.
- Digital TV set-top boxes and home media solutions
- Conversion to non-standard video resolutions e.g. for use in portable devices and flat-panel displays
- Dynamic scaling of video in a window on a frame-by-frame basis
- Picture in Picture (PiP) applications

## Generic Parameters

| Generic name    | Description                   | Type    | Valid range                               |
|-----------------|-------------------------------|---------|-------------------------------------------|
| line_width      | Width of linestores in pixels | integer | 2<sup>4</sup> < pixels < 2<sup>16</sup>   |
| log2_line_width | Log2 of linestore width       | integer | log2(line_width)                          |
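
The two generics must agree with each other, so it can help to derive them together before synthesis. The following is a minimal Python sketch and not part of the IP core: the helper name `linestore_generics` is hypothetical, and it assumes that `line_width` is a power of two so that `log2_line_width` is an exact integer (as in the 1024 / 10 pairing used for the trial synthesis results).

```python
def linestore_generics(line_width: int) -> dict:
    """Return a consistent generic pair for a given linestore width.

    Hypothetical helper for illustration only.
    """
    # Range taken from the Generic Parameters table: 2^4 < line_width < 2^16.
    if not (2**4 < line_width < 2**16):
        raise ValueError("line_width must lie strictly between 16 and 65536")
    # Assumption: line_width is a power of two, so log2 is an exact integer.
    if line_width & (line_width - 1):
        raise ValueError("line_width is assumed to be a power of two")
    return {
        "line_width": line_width,
        "log2_line_width": line_width.bit_length() - 1,
    }


print(linestore_generics(1024))
```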

# **Block Diagram**



Figure 1: Video scaler architecture

# **Pin-out Description**

| Pin name             | I/O | Description                                                                                  | Active state |
|----------------------|-----|----------------------------------------------------------------------------------------------|--------------|
| clk                  | in  | Synchronous clock                                                                            | rising edge  |
| reset                | in  | Asynchronous reset                                                                           | low          |
| scale_pitch_x [23:0] | in  | 1 / (x scale factor). Specified as an unsigned number in [24 12] format                      | data         |
| scale_pitch_y [23:0] | in  | 1 / (y scale factor). Specified as an unsigned number in [24 12] format                      | data         |
| input_ppl [15:0]     | in  | Number of pixels per line in the source input frame (Specified as an unsigned 16-bit number) | data         |
| input_lpf [15:0]     | in  | Number of lines per frame in the source input frame (Specified as an unsigned 16-bit number) | data         |
| output_ppl [15:0]    | in  | Number of pixels per line in the scaled output frame (Specified as an unsigned 16-bit number) | data         |
| output_lpf [15:0]    | in  | Number of lines per frame in the scaled output frame (Specified as an unsigned 16-bit number) | data         |

<sup>1</sup> Xilinx® Virtex6 used as a benchmark




| Pin name      | I/O | Description                                                            | Active state |
|---------------|-----|------------------------------------------------------------------------|--------------|
| pixin [23:0]  | in  | 24-bit pixel in                                                        | data         |
| pixin_vsync   | in  | Vertical sync in (Coincident with first pixel of input frame)          | high         |
| pixin_hsync   | in  | Horizontal sync in (Coincident with first pixel of input line)         | high         |
| pixin_val     | in  | Input pixel valid                                                      | high         |
| pixin_rdy     | out | Ready to accept input pixel (Handshake signal)                         | high         |
| pixout [23:0] | out | 24-bit pixel out                                                       | data         |
| pixout_vsync  | out | Vertical sync out (Coincident with first pixel of output frame)        | high         |
| pixout_hsync  | out | Horizontal sync out (Coincident with first pixel of output line)       | high         |
| pixout_val    | out | Output pixel valid                                                     | high         |
| pixout_rdy    | in  | Ready to accept output pixel (Handshake signal)                        | high         |

## **General Description**

XY2\_SCALER is a very high quality video scaler capable of generating interpolated output images from 16x16 up to  $2^{16} \times 2^{16}$  pixels in resolution. The architecture permits seamless scaling (either up or down) depending on the chosen scale factor. Internally, the scaler uses a 24-bit accumulator and a bank of polyphase FIR filters with 16 phases or interpolation points. All filter coefficients are programmable, allowing the user to define a wide range of filter characteristics.

Pixels flow in and out of the scaling engine in accordance with the valid-ready pipeline protocol. Pixels are transferred into the scaler on a rising clock-edge when <code>pixin\_val</code> is high and <code>pixin\_rdy</code> is high. Likewise, pixels are transferred out of the scaler on a rising clock-edge when <code>pixout\_val</code> is high and <code>pixout\_rdy</code> is high. As such, the pipeline protocol allows both input and output interfaces to be stalled independently.
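
The transfer rule can be illustrated with a small cycle-by-cycle model. This is a Python sketch for illustration only, not a description of the core's implementation; the function name `transfers` and the sample values are hypothetical. Each list entry represents one rising clock edge, and a pixel is accepted only when valid and ready are both high.

```python
def transfers(valid, ready, data):
    """Return the data items accepted on cycles where valid and ready are both high."""
    return [d for v, r, d in zip(valid, ready, data) if v and r]


# One entry per rising clock edge (illustrative values only).
pixin_val = [1, 1, 1, 0, 1]
pixin_rdy = [1, 0, 1, 1, 1]
# The source holds 0x334455 while the interface is stalled in cycle 1.
pixin = [0x001122, 0x334455, 0x334455, 0x000000, 0x667788]

accepted = transfers(pixin_val, pixin_rdy, pixin)
print([f"{p:06X}" for p in accepted])
```

In cycle 1 the pixel is valid but not accepted because ready is low, so the same pixel is presented again in cycle 2. In cycle 3 ready is high but valid is low, so nothing is transferred.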

The scaler is partitioned into a horizontal scaling section in series with a vertical scaling section as shown by Figure 1.

#### Scale pitch, pixels per line and lines per frame

The output resolution of the scaled output image is controlled by the input parameters <code>scale\_pitch\_x</code>, <code>scale\_pitch\_y</code>, <code>input\_ppl</code>, <code>input\_lpf</code>, <code>output\_ppl</code> and <code>output\_lpf</code>. The scale pitch may be calculated using the following formula:

$$\text{pitch} = \left(\frac{\text{Input resolution}}{\text{Output resolution}}\right) \times 2^{12}$$

As an example, consider the scaling of VGA format video (640x480) to XGA format video (1024x768). In this case the scale pitch in the x and y dimensions would be 0.625. As the value must be specified as a 12.12-bit fixed-point number, the scale pitch is multiplied by $2^{12}$, giving the parameter value 2560.
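
The same calculation can be written as a short Python helper. This is a sketch rather than vendor code: the function name `scale_pitch` is hypothetical, and rounding to the nearest integer is an assumption (it reproduces tabulated values such as 3277 for 640 to 800 pixels).

```python
def scale_pitch(input_res: int, output_res: int, frac_bits: int = 12) -> int:
    """Scale pitch = (input / output) * 2^frac_bits, rounded to the nearest integer."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (input_res * (1 << frac_bits) + output_res // 2) // output_res


# VGA (640x480) to XGA (1024x768): pitch in x, then in y.
print(scale_pitch(640, 1024), scale_pitch(480, 768))
```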

In addition, the user must specify the exact resolution of the source input frame and the scaled output frame using the parameters <code>input\_ppl</code>, <code>input\_lpf</code>, <code>output\_ppl</code> and <code>output\_lpf</code>. The following tables list the parameter values required for the conversion of some example video formats.

#### SCALE UP

| Video<br>IN       | Video<br>OUT        | Pitch<br>X | Pitch<br>Y | I/P<br>PPL | I/P<br>LPF | O/P<br>PPL | O/P<br>LPF |
|-------------------|---------------------|------------|------------|------------|------------|------------|------------|
| VGA<br>640x480    | SVGA<br>800x600     | 3277       | 3277       | 640        | 480        | 800        | 600        |
| SVGA<br>800x600   | XGA<br>1024x768     | 3200       | 3200       | 800        | 600        | 1024       | 768        |
| XGA<br>1024x768   | HD1080<br>1920x1080 | 2185       | 2913       | 1024       | 768        | 1920       | 1080       |
| SXGA<br>1280x1024 | 2K<br>2048x1080     | 2560       | 3884       | 1280       | 1024       | 2048       | 1080       |

#### SCALE DOWN

| Video<br>IN         | Video<br>OUT      | Pitch<br>X | Pitch<br>Y | I/P<br>PPL | I/P<br>LPF | O/P<br>PPL | O/P<br>LPF |
|---------------------|-------------------|------------|------------|------------|------------|------------|------------|
| SVGA<br>800x600     | VGA<br>640x480    | 5120       | 5120       | 800        | 600        | 640        | 480        |
| XGA<br>1024x768     | SVGA<br>800x600   | 5243       | 5243       | 1024       | 768        | 800        | 600        |
| HD1080<br>1920x1080 | XGA<br>1024x768   | 7680       | 5760       | 1920       | 1080       | 1024       | 768        |
| 2K<br>2048x1080     | SXGA<br>1280x1024 | 6554       | 4320       | 2048       | 1080       | 1280       | 1024       |
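
A complete parameter set for a given conversion can be assembled and range-checked in one step. The Python sketch below is illustrative only: `ScaleParams` and `make_params` are hypothetical names, the bit-width limits are taken from the pin-out table (24-bit pitch, 16-bit pixel and line counts), and round-to-nearest for the pitch is an assumption.

```python
from dataclasses import dataclass

FRAC_BITS = 12  # fractional bits of the [24 12] scale-pitch format


@dataclass(frozen=True)
class ScaleParams:
    scale_pitch_x: int
    scale_pitch_y: int
    input_ppl: int
    input_lpf: int
    output_ppl: int
    output_lpf: int


def _pitch(input_res: int, output_res: int) -> int:
    # (input / output) * 2^12, rounded to nearest (rounding mode is an assumption).
    return (input_res * (1 << FRAC_BITS) + output_res // 2) // output_res


def make_params(in_ppl: int, in_lpf: int, out_ppl: int, out_lpf: int) -> ScaleParams:
    """Build a parameter set and check that each value fits its port width."""
    sizes = {"input_ppl": in_ppl, "input_lpf": in_lpf,
             "output_ppl": out_ppl, "output_lpf": out_lpf}
    for name, value in sizes.items():
        if not 0 < value < 2**16:  # unsigned 16-bit ports
            raise ValueError(f"{name} does not fit in 16 bits: {value}")
    params = ScaleParams(_pitch(in_ppl, out_ppl), _pitch(in_lpf, out_lpf),
                         in_ppl, in_lpf, out_ppl, out_lpf)
    for name in ("scale_pitch_x", "scale_pitch_y"):
        if not 0 < getattr(params, name) < 2**24:  # unsigned 24-bit ports
            raise ValueError(f"{name} does not fit in 24 bits")
    return params


# HD1080 (1920x1080) scaled down to XGA (1024x768).
print(make_params(1920, 1080, 1024, 768))
```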

#### Flow control

Pixels flow in and out of the scaling engine in accordance with the valid-ready pipeline protocol<sup>2</sup>. The scaling operation occurs on a line-by-line basis with the signal pixin\_hsync specifying the start of a new line and pixin\_vsync specifying the start of a new frame. All pixels into the scaler (including pixin\_vsync and pixin\_hsync) must be qualified by the pixin\_val signal asserted high, otherwise changes to the input signals will be ignored. Note that the first pixel of a new frame is accompanied by a valid vsync and hsync. The first pixel in a new line is accompanied by hsync only.

On receipt of the first vsync, the scaling operation begins and output pixels are generated in accordance with the chosen scale parameters. Generally, for scale-down (decimation) operations, the input interface will not stall. Conversely, for scale-up (interpolation) the number of output pixels will be greater than the number of input pixels. This will result in the occasional stalling of the input due to the change in ratio.
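
The difference between the two cases can be seen by stepping a position accumulator by the scale pitch once per output pixel. The Python sketch below is a generic model under stated assumptions and does not describe the core's internal logic: the function name `input_advance_per_output` is hypothetical, the integer part of the 12.12 position is assumed to select the source pixel, and the 24-bit width of the hardware accumulator is not modelled.

```python
def input_advance_per_output(pitch: int, n_out: int) -> list:
    """For each output pixel, count how many source pixels the position advances by."""
    acc, prev, steps = 0, 0, []
    for _ in range(n_out):
        idx = acc >> 12  # integer part of the 12.12 position (assumed source index)
        steps.append(idx - prev)
        prev = idx
        acc += pitch  # advance by one scale pitch per output pixel
    return steps


print(input_advance_per_output(2560, 8))  # scale up, pitch 0.625
print(input_advance_per_output(5120, 8))  # scale down, pitch 1.25
```

With a pitch below 4096 (scale-up), some output pixels need no new source pixel, which corresponds to the input being held off. With a pitch above 4096 (scale-down), every output pixel after the first consumes at least one new source pixel.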

<sup>2</sup> See Zipcores application note: app\_note\_zc001.pdf for more examples of how to use the valid-ready pipeline protocol



#### Loading of scale parameters

The scale parameters are fully programmable and allow the input video to be scaled differently on a frame-by-frame basis. With careful design, the architecture also permits different video sources to be multiplexed into the same scaler with different scaling parameters.

Parameters are updated continuously on every rising clock edge and must remain stable during the scaling operation. When programming new scale parameters (e.g. due to a change of video mode) it is necessary to assert the system reset signal for at least one clock cycle to avoid any possible corruption in the output video. This is often convenient to do during the vertical blanking period of an input video frame when there are no active pixels. After reset the scaler will lock to the next clean input frame before the scaling operation continues.

#### Scaling algorithm

The scaler uses a 2-tap polyphase filter in the x-dimension and a 2-tap polyphase filter in the y-dimension. By default, the x and y filters use bilinear interpolation (Figure 2). Alternatively, the user may derive the filter coefficients from a different function to suit the application<sup>3</sup>.



Figure 2: Bilinear function – x and y filter tap positioning
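
As an illustration of a 2-tap, 16-phase bilinear filter, the Python sketch below builds one coefficient pair per phase and applies it to a single 8-bit channel. It is a sketch under assumptions and not the shipped coefficient set: the names `bilinear_coeffs`, `phase_of` and `interpolate` are hypothetical, the weights are expressed in units of 1/16 purely for readability (the fixed-point format used in the filter packages is not stated here), and the phase is assumed to be the top 4 bits of the 12-bit fraction.

```python
PHASES = 16  # interpolation points per filter


def bilinear_coeffs(phases: int = PHASES) -> list:
    """One (left, right) weight pair per phase; each pair sums to `phases`."""
    return [(phases - p, p) for p in range(phases)]


COEFFS = bilinear_coeffs()


def phase_of(position: int) -> int:
    """Assumed mapping: top 4 bits of the 12-bit fractional position."""
    return (position & 0xFFF) >> 8


def interpolate(left: int, right: int, phase: int) -> int:
    """Apply the 2-tap filter to one 8-bit channel, rounding to nearest."""
    c_left, c_right = COEFFS[phase]
    return (c_left * left + c_right * right + PHASES // 2) // PHASES


print(COEFFS[:3])
print(phase_of(2560), interpolate(0, 255, 8))
```

Swapping in a different coefficient function only requires replacing `bilinear_coeffs`, as long as each pair still sums to the same total so that the filter has unity gain.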

## **Functional Timing**

Figure 3 shows the signalling at the input to the scaler at the start of a new frame. The first line of a new frame begins with <code>pixin\_vsync</code> and <code>pixin\_hsync</code> asserted high together with the first pixel. Note that the signals <code>pixin, pixin\_vsync</code> and <code>pixin\_hsync</code> are only valid if <code>pixin\_val</code> is also asserted high. In addition, the diagram shows what happens when <code>pixin\_rdy</code> is de-asserted. In this case, the pipeline is stalled and the upstream interface must hold-off before further pixels are processed.



Figure 3: First line of a new frame

Figure 4 shows the signalling at the output of the scaler. The output uses exactly the same protocol as the input. Each new output line begins with <code>pixout\_hsync</code> and <code>pixout\_val</code> asserted high. In this particular example, it shows <code>pixout\_val</code> de-asserted for 1 clock-cycle, in which case, the output pixel should be ignored. Remember that transfers at a valid-ready interface are only permitted when valid and ready are both simultaneously high.



Figure 4: Scaler output showing invalid pixel

<sup>3</sup> See Zipcores application note: app\_note\_zc003.pdf for examples of how to generate coefficient sets



## Source File Description

All source files are provided as text files coded in VHDL. The following table gives a brief description of each file.

| Source file             | Description                              |
|-------------------------|------------------------------------------|
| video_in.txt            | Text-based source video file             |
| video_file_reader.vhd   | Reads text-based source video file       |
| pipeline_reg.vhd        | Pipelined register element               |
| pipeline_shovel.vhd     | Pipelined 'shovel' register              |
| ram_dp_w_r.vhd          | Dual port RAM component                  |
| fifo_sync.vhd           | Synchronous FIFO                         |
| x2_buffer.vhd           | Pixel input buffer/shift register        |
| x2_filter_pack.vhd      | Package containing x-filter coefficients |
| x2_filter_polyphase.vhd | Horizontal scaler output pixel filter    |
| x2_scaler.vhd           | Horizontal scaler component              |
| y2_buffer.vhd           | Line buffer                              |
| y2_filter_pack.vhd      | Package containing y-filter coefficients |
| y2_filter_polyphase.vhd | Vertical scaler output pixel filter      |
| y2_scaler.vhd           | Vertical scaler component                |
| xy2_reg.vhd             | Video scaler input registers             |
| xy2_scaler.vhd          | Video scaler top-level component         |
| xy2_scaler_bench.vhd    | Top-level test bench                     |

## **Functional Testing**

An example VHDL testbench is provided for use in a suitable VHDL simulator. The compilation order of the source code is as follows:

1. video\_file\_reader.vhd
2. pipeline\_reg.vhd
3. pipeline\_shovel.vhd
4. ram\_dp\_w\_r.vhd
5. fifo\_sync.vhd
6. x2\_buffer.vhd
7. x2\_filter\_pack.vhd
8. x2\_filter\_polyphase.vhd
9. x2\_scaler.vhd
10. y2\_buffer.vhd
11. y2\_filter\_pack.vhd
12. y2\_filter\_polyphase.vhd
13. y2\_scaler.vhd
14. xy2\_reg.vhd
15. xy2\_scaler.vhd
16. xy2\_scaler\_bench.vhd

The VHDL testbench instantiates the XY2\_SCALER component and the user may modify the generic parameters in order to generate the desired scaled output image.

The source video for the simulation is generated by the video file-reader component. This component reads a text-based file which contains the RGB pixel data. The text file is called <code>video\_in.txt</code> and should be placed in the top-level simulation directory.

The file *video\_in.txt* follows a simple format which defines the state of signals: *pixin\_val*, *pixin\_vsync*, *pixin\_hsync* and *pixin* on a clock-by-clock basis. An example file might be the following:

1 1 1 00 11 22 # pixel 0 line 0 (start of frame)

1 0 0 33 44 55 # pixel 1

0 0 0 00 00 00 # don't care!

1 0 0 66 77 88 # pixel 2

1 0 1 00 11 22 # pixel 0 line 1 etc..

In this example, the first line of the *video\_in.txt* file asserts the input signals pixin\_val = 1, pixin\_vsync = 1, pixin\_hsync = 1 and pixin = 0x001122.
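
When preparing or checking stimulus files outside the simulator, a line in this format can be decoded with a few lines of Python. This is a sketch based only on the example above: the function name `parse_video_line` is hypothetical, and it assumes that `#` starts a trailing comment and that the three hex bytes are concatenated in order to form the 24-bit pixel.

```python
def parse_video_line(line: str) -> tuple:
    """Decode one stimulus line into (pixin_val, pixin_vsync, pixin_hsync, pixin)."""
    fields = line.split("#", 1)[0].split()  # drop the trailing comment, if any
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    val, vsync, hsync = (int(f) for f in fields[:3])
    pixel = int("".join(fields[3:]), 16)  # three hex bytes -> one 24-bit value
    return val, vsync, hsync, pixel


val, vsync, hsync, pixel = parse_video_line(
    "1 1 1 00 11 22 # pixel 0 line 0 (start of frame)"
)
print(val, vsync, hsync, f"0x{pixel:06X}")
```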

The simulation must be run for at least 10 ms during which time an output text file called *video\_out.txt* will be generated. This file contains a sequential list of 24-bit output pixels in the same format as *video\_in.txt*. The example provided scales a 768x576 source test pattern by a factor of 0.833 in the x and y dimensions to give a VGA output image of 640x480 pixels. Figure 5 shows the resulting image from the test.



Figure 5: Output frame from the hardware simulation example (Scale-down of 768x576 to 640x480)

### **Performance**

The Bilinear Video Scaling Engine was tested with a large number of scale factors to verify correct operation and to observe the quality of the output video. The true definition and quality are difficult to show within the limitations of this document; however, example images can be provided on request.

The video scaler was also verified using a Xilinx® Spartan6 SP605 development board as a platform. The photo in Figure 6 demonstrates the scale down of a PAL source image to a small custom video window of 500x400 pixels on an SXGA (1280x1024) background.





Figure 6: Scaler demo lab setup (Generation of a small 500x400 video window on SXGA background)

# Synthesis

The files required for synthesis and the design hierarchy are shown below:

- xy2\_scaler.vhd
  - xy2\_reg.vhd
    - pipeline\_reg.vhd
  - x2\_scaler.vhd
    - pipeline\_shovel.vhd
    - x2\_buffer.vhd
    - x2\_filter\_polyphase.vhd
      - pipeline\_reg.vhd
  - y2\_scaler.vhd
    - pipeline\_shovel.vhd
    - y2\_buffer.vhd
      - ram\_dp\_w\_r.vhd
    - fifo\_sync.vhd
      - pipeline\_reg.vhd
    - y2\_filter\_polyphase.vhd
      - pipeline\_reg.vhd

The VHDL core is designed to be technology independent. However, as a benchmark, synthesis results have been provided for the Xilinx® Virtex6 and Spartan6 FPGA devices. Synthesis results for other FPGAs and technologies can be provided on request.

Fixing the scale parameters at the scaler input will result in the most optimized scaler design. In addition, the speed of the design may be improved by tying the signal <code>pixout\_rdy</code> high. This may be possible if the designer knows that the pipeline downstream of the scaler will always be able to accept output pixels. Careful attention must be paid to the width of the line stores as this will affect the amount of RAM resource used in the design.

Trial synthesis results are shown with the generic parameters set to: <code>line\_width</code> = 1024 and <code>log2\_line\_width</code> = 10. Resource usage is specified after Place and Route.

#### VIRTEX 6

| Resource type            | Quantity used |
|--------------------------|---------------|
| Slice register           | 690           |
| Slice LUT                | 738           |
| Block RAM                | 3             |
| DSP48                    | 12            |
| Occupied Slices          | 301           |
| Clock frequency (approx) | 320 MHz       |

#### SPARTAN 6

| Resource type            | Quantity used |
|--------------------------|---------------|
| Slice register           | 690           |
| Slice LUT                | 742           |
| Block RAM                | 3             |
| DSP48                    | 12            |
| Occupied Slices          | 288           |
| Clock frequency (approx) | 170 MHz       |

## **Revision History**

| Revision | Change description                                                                                                      | Date       |
|----------|-------------------------------------------------------------------------------------------------------------------------|------------|
| 1.0      | Initial revision                                                                                                        | 05/02/2009 |
| 1.1      | Added extra items to key features                                                                                       | 12/06/2009 |
| 1.2      | Updated synthesis results                                                                                               | 15/12/2009 |
| 1.4      | Added scaling formula. Updated source file descriptions to include shovels                                              | 18/02/2010 |
| 1.5      | Updated synthesis results in line with minor source code changes                                                        | 27/01/2012 |
| 2.0      | Major revision. Simplified loading of scale parameters. Modified architecture to support one frame out for one frame in | 10/05/2013 |
| 2.1      | Moved to 16-bit scale parameters to support resolutions up to 2<sup>16</sup> x 2<sup>16</sup>                           | 11/08/2014 |
