Video systems, now deeply embedded in every aspect of consumer applications, are increasingly common in the automotive, robotics, and industrial sectors. The growth in non-consumer applications is largely due to the advent of the HDMI standard and to faster, more efficient DSPs and FPGAs.
This article outlines the requirements for stereoscopic vision (3D video) using analog or HDMI cameras. It describes an FPGA-based system that combines two video streams into a single 3D video stream transmitted through an HDMI 1.4 transmitter, and a DSP-based system that saves DMA bandwidth compared with the usual approach of receiving data from two cameras separately. It also describes how to implement the side-by-side format used by 3D cameras and by systems that require 3D video.
1. Overview
Stereoscopic vision requires the use of two cameras about 5.5 cm apart, which is the typical distance between human eyes, as shown in Figure 1.
The high-level functional block diagram shown in Figure 2 uses two synchronized cameras running the same video standard, two video decoders, and an FPGA. To guarantee exactly the same frame rate, the cameras must be line-locked to a common timing reference. Without synchronization, it would be impossible to combine the outputs without buffering complete video frames in external memory.
Figure 3 shows two line-locked video streams being combined into a stereo image.
Figure 4 shows that asynchronous video streams cannot be merged without buffering entire video frames in external memory.
The outputs of the two synchronized cameras are then digitized, either by video decoders (such as the ADV7181D, ADV7182, or ADV7186) for analog cameras, or by HDMI receivers (such as the ADV7610 or ADV7611) for digital cameras.
Both the video decoder and the HDMI receiver use an internal phase-locked loop (PLL) to generate the clock and pixel data on their output buses. This means that when digitizing analog video or receiving an HDMI stream, a separate clock domain is generated for each camera. In addition, there may be alignment errors between the two video streams. These timing differences and alignment errors must be compensated in a back-end device such as an FPGA, which first brings the data into a common clock domain and then combines the two video images into a single stereoscopic video frame. The synchronized video stream is then sent out through a 3D-capable HDMI 1.4 transmitter (such as the ADV7511 or ADV7513), or passed to a DSP (such as the ADSP-BF609 Blackfin® processor) for further processing.
2. Clock architecture
A video decoder has two completely different clock sources, depending on whether it is locked. When the video PLL is locked to the incoming sync signal (horizontal sync for a video decoder, TMDS clock for HDMI), the result is a clock locked to the input video source. When the video is out of lock, or when the PLL is in forced free-run mode, the PLL does not lock to the incoming sync signal, and the output clock is locked to the crystal clock instead. In addition, no clock may be output at all after reset, because the LLC clock driver is placed in high-impedance mode after reset.
Therefore, if a system has two or more video paths starting from video decoders or HDMI receivers, there will be two different clock domains with different frequencies and different phases, even if the same crystal clock is supplied to both devices, because each device generates its own clock from its own PLL.
3. Synchronized system with locked video decoders
Stereoscopic video typically uses two video sources, with each video decoder locked to its input video signal and generating its own clock from the incoming horizontal sync or TMDS clock.
When the two cameras are synchronized, that is, line-locked to the same reference timing, their lines are always aligned. Since the two independent video decoders receive the same horizontal sync timing, their pixel clocks have the same frequency. This makes it possible to bring both data paths into the same clock domain, as shown in Figure 5.
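A minimal behavioral Verilog sketch of this clock-domain transfer follows: each path is written into a small dual-clock FIFO with its own LLC pixel clock, and both FIFOs are read with decoder A's clock, so the back end sees a single clock domain. This is a simulation-level model, not a synthesis-grade CDC design (a real implementation would use gray-coded pointer synchronizers or a vendor FIFO primitive), and all module and signal names (cdc_fifo, llc_a, de_a, and so on) are illustrative assumptions, not names from the original design.

```verilog
// Behavioral dual-clock FIFO; the pointers cross clock domains without
// synchronizers here, which is acceptable for simulation only.
module cdc_fifo #(parameter DW = 8, AW = 4) (
  input  wire          wr_clk, wr_en,
  input  wire [DW-1:0] wr_data,
  input  wire          rd_clk, rd_en,
  output reg  [DW-1:0] rd_data,
  output wire          empty,
  input  wire          rst
);
  reg [DW-1:0] mem [0:(1<<AW)-1];
  reg [AW:0]   wr_ptr, rd_ptr;            // extra MSB distinguishes full/empty

  assign empty = (wr_ptr == rd_ptr);

  always @(posedge wr_clk or posedge rst)
    if (rst) wr_ptr <= 0;
    else if (wr_en) begin
      mem[wr_ptr[AW-1:0]] <= wr_data;
      wr_ptr <= wr_ptr + 1'b1;
    end

  always @(posedge rd_clk or posedge rst)
    if (rst) rd_ptr <= 0;
    else if (rd_en && !empty) begin
      rd_data <= mem[rd_ptr[AW-1:0]];
      rd_ptr  <= rd_ptr + 1'b1;
    end
endmodule

// Both camera paths brought into the llc_a domain.
module stereo_cdc (
  input  wire       llc_a, llc_b, rst,       // pixel clocks from decoders A/B
  input  wire [7:0] pix_a, pix_b,            // digitized pixel data
  input  wire       de_a, de_b,              // data-enable qualifiers
  input  wire       rd_en,                   // common back-end read strobe
  output wire [7:0] pix_a_sync, pix_b_sync,  // both now in the llc_a domain
  output wire       empty_a, empty_b
);
  cdc_fifo fifo_a (.wr_clk(llc_a), .wr_en(de_a), .wr_data(pix_a),
                   .rd_clk(llc_a), .rd_en(rd_en), .rd_data(pix_a_sync),
                   .empty(empty_a), .rst(rst));
  cdc_fifo fifo_b (.wr_clk(llc_b), .wr_en(de_b), .wr_data(pix_b),
                   .rd_clk(llc_a), .rd_en(rd_en), .rd_data(pix_b_sync),
                   .empty(empty_b), .rst(rst));
endmodule
```

Because both decoders are locked to line-locked cameras, the write and read clocks have the same average frequency, so a very shallow FIFO is enough to absorb the phase difference.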
4. Asynchronous video system
Unfortunately, one of the video decoders can lose lock due to poor signal quality from its video source, as shown in Figure 6.
Alternatively, a camera can lose synchronization because its video link is disconnected, as shown in Figure 7.
Either case results in different frequencies on the two data paths, which in turn results in unequal amounts of data reaching the back end.
A loss of video lock can be detected by using an interrupt (SD_UNLOCK for the SD video decoder, CP_UNLOCK for the component video decoder, or the TMDSPLL_LCK register in the HDMI receiver) that is asserted after a certain delay.
The video decoder integrates a mechanism that smooths out unstable horizontal sync, so detecting a loss of lock can take two or three video lines. This delay can be reduced by monitoring for loss of lock in the FPGA instead.
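As an illustration of such FPGA-side detection, the hedged sketch below counts pixel clocks between incoming HSYNC edges and flags a loss of lock as soon as one line period overruns. The module and signal names, and the LINE_LEN and MARGIN values, are assumptions chosen for the example, not values from the original design.

```verilog
// Lock-loss watchdog: flags lost lock when HSYNC fails to arrive within
// one expected line period (plus a jitter margin).
module hsync_watchdog #(
  parameter LINE_LEN = 858,    // total pixel clocks per line (480p example)
  parameter MARGIN   = 16      // tolerated jitter in pixel clocks
)(
  input  wire clk,             // LLC pixel clock of this video path
  input  wire rst,
  input  wire hs,              // horizontal sync from the decoder
  output reg  lock_lost        // high when HSYNC is overdue
);
  reg [15:0] cnt;
  reg        hs_d;
  wire       hs_rise = hs & ~hs_d;

  always @(posedge clk or posedge rst) begin
    if (rst) begin
      cnt <= 0; hs_d <= 0; lock_lost <= 0;
    end else begin
      hs_d <= hs;
      if (hs_rise) begin
        cnt       <= 0;        // line arrived on time: restart the count
        lock_lost <= 0;
      end else if (cnt > LINE_LEN + MARGIN) begin
        lock_lost <= 1;        // no HSYNC within one line period
      end else begin
        cnt <= cnt + 1'b1;
      end
    end
  end
endmodule
```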
4.1 Tri-state clock mode
When designing FPGA clocking resources, it is important to know that, by default, many video decoders and HDMI products set their clock and data lines to tri-state mode after reset. Therefore, the LLC pixel clock is not suitable for synchronous resets.
4.2 Data alignment errors in two video streams
To simplify the system and reduce the memory required to merge the two images, the data arriving at the FPGA should be synchronized so that the Nth pixel of the Mth line from the first camera is received at the same time as the Nth pixel of the Mth line from the second camera.
At the FPGA input this can be difficult to achieve, because the two video paths may have different latencies: a line-locked camera may output lines with a misalignment, different connection lengths may add further misalignment, and a video decoder may introduce a variable startup delay. Because of these delays, even a system with line-locked cameras will be misaligned by some number of pixels.
4.3 Alignment error of line-locked cameras
Even line-locked cameras can output video lines with alignment errors. Figure 8 shows the VSYNC signals from the CVBS outputs of the two cameras. One camera (the sync master) provides the line-lock signal to the second camera (the sync slave). An alignment error of 380 ns is clearly visible.
Figure 9 shows the data transmitted by the video decoders at the outputs of these cameras; a shift of 11 pixels is visible.
4.4 Different connection lengths
All electrical connections introduce propagation delays, so make sure both video paths have the same trace and cable lengths.
4.5 Video decoder/HDMI receiver delays
All video decoders introduce delays that can vary depending on the features enabled. In addition, some video devices contain elements that add a random startup delay, such as deep-color FIFOs. In a typical stereo system with video decoders, the random startup delay is about 5 pixel clocks; a system with an HDMI transmitter and receiver (as shown in Figure 10) may see a random startup delay of around 40 pixel clocks.
4.6 Alignment error compensation
Figure 11 shows a system in which a video decoder digitizes the analog signal from each camera. The data and clock of each video path are independent. Both video paths feed FIFOs, which buffer the incoming data to compensate for alignment errors. On the output side, the FIFOs are read using a common clock taken from one of the decoders. In a locked system, with the cameras line-locked and the video decoders locked, the two data paths have exactly the same clock frequency, so the FIFOs will neither overflow nor underflow.
By enabling or disabling the FIFO outputs, the control block can manage the FIFO levels so as to minimize the pixel alignment error. With proper compensation, the output of this FPGA block is two data paths aligned on the first pixel. This data is then fed to the FPGA back end to generate the 3D format.
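A minimal sketch of that control logic, assuming each path's FIFO exposes an empty flag and both share one read enable (all names are illustrative): reads proceed only while both FIFOs hold data, so the earlier stream simply accumulates in its FIFO until the later one catches up.

```verilog
// Common read-enable gating for the two path FIFOs: popping both FIFOs only
// when both contain data keeps the outputs pixel-aligned, with the skew
// absorbed by the FIFO fill levels.
module align_ctrl (
  input  wire clk,        // common read clock (e.g. decoder A's LLC)
  input  wire rst,
  input  wire empty_a,    // FIFO A has no data
  input  wire empty_b,    // FIFO B has no data
  output reg  rd_en       // common read enable for both FIFOs
);
  always @(posedge clk or posedge rst)
    if (rst) rd_en <= 1'b0;
    else     rd_en <= ~empty_a & ~empty_b;
endmodule
```

In practice the FIFOs would be flushed at a known point in each stream (such as VSYNC) so that the first pixels popped from each side correspond.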
4.7 Alignment error measurement
Alignment errors between the two digital data streams can be measured at the video FIFO outputs using a single clock counter that is reset on the VSYNC (VS) pulse of one of the input signals. The alignment error between the two video streams (VS_A_IN and VS_B_IN) shown in Figure 12 is 4 pixels. The counter measures the alignment error using the method shown in Listing 1: counting starts on the rising edge of VS1 and stops on the rising edge of VS2. If the total pixel length of a frame is known, the negative skew (VS2 preceding VS1) can be calculated by subtracting the count from the frame length; the negative value should be used whenever the measured skew exceeds half the frame length in pixels. The result is used to re-align the data stored in the FIFOs.
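Listing 1 itself is not reproduced in this copy, so the following Verilog is a hedged reconstruction built only from the description above; FRAME_LEN and all module and signal names are assumptions.

```verilog
// Skew meter: one counter, started on the rising edge of VS_A_IN and stopped
// on the rising edge of VS_B_IN, measured in common pixel clocks.
module vs_skew_meter #(
  parameter FRAME_LEN = 858*525          // total pixels per frame (480p example)
)(
  input  wire        clk,                // common pixel clock
  input  wire        rst,
  input  wire        vs_a_in, vs_b_in,   // VSYNC of streams A and B
  output reg  [31:0] skew,               // measured skew in pixel clocks
  output reg         skew_valid
);
  reg        vs_a_d, vs_b_d, counting;
  reg [31:0] cnt;
  wire       a_rise = vs_a_in & ~vs_a_d;
  wire       b_rise = vs_b_in & ~vs_b_d;

  always @(posedge clk or posedge rst) begin
    if (rst) begin
      vs_a_d <= 0; vs_b_d <= 0; counting <= 0;
      cnt <= 0; skew <= 0; skew_valid <= 0;
    end else begin
      vs_a_d <= vs_a_in;
      vs_b_d <= vs_b_in;
      if (a_rise) begin                  // start counting on VS_A rising edge
        cnt <= 0; counting <= 1; skew_valid <= 0;
      end else if (counting & b_rise) begin
        counting   <= 0;
        skew_valid <= 1;
        // If the count exceeds half a frame, B actually leads A; report the
        // negative skew (cnt - FRAME_LEN, in two's complement).
        skew <= (cnt > FRAME_LEN/2) ? (cnt - FRAME_LEN) : cnt;
      end else if (counting) begin
        cnt <= cnt + 1'b1;
      end
    end
  end
endmodule
```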
4.8 Generating 3D video from two aligned video streams
Once the pixels, lines, and frames are all truly synchronized, the FPGA can convert the video data into a 3D video stream, as shown in Figure 13.
Input data is read into memory using the common clock. The timing analyzer examines the input sync signals and extracts the video timing, including the horizontal front- and back-porch lengths, the vertical front- and back-porch lengths, the horizontal and vertical sync lengths, the horizontal active line length, the number of vertical active lines, and the sync signal polarities.
This information, together with the current horizontal and vertical pixel position, is passed to the sync timing regenerator, which generates modified timing to support the desired 3D video structure. The newly generated timing should be delayed to ensure that the FIFOs contain the required amount of data.
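A minimal sketch of such a regenerator follows. In the real design the timing values come from the analyzer at run time; here they are reduced to compile-time parameters for brevity, and the module name, signal names, and the example 480p values are all assumptions.

```verilog
// Sync timing regenerator: free-running H/V counters rebuild HSYNC, VSYNC,
// and DE from the extracted timing parameters.
module sync_regen #(
  parameter H_ACTIVE = 720, H_FP = 16, H_SYNC = 62, H_BP = 60,  // 480p example
  parameter V_ACTIVE = 480, V_FP = 9,  V_SYNC = 6,  V_BP = 30
)(
  input  wire        clk, rst,
  output reg         hs, vs, de,
  output reg  [11:0] hpos, vpos    // current pixel position for the back end
);
  localparam H_TOTAL = H_ACTIVE + H_FP + H_SYNC + H_BP;
  localparam V_TOTAL = V_ACTIVE + V_FP + V_SYNC + V_BP;

  always @(posedge clk or posedge rst) begin
    if (rst) begin
      hpos <= 0; vpos <= 0;
    end else if (hpos == H_TOTAL-1) begin
      hpos <= 0;
      vpos <= (vpos == V_TOTAL-1) ? 12'd0 : vpos + 1'b1;
    end else begin
      hpos <= hpos + 1'b1;
    end
  end

  // Regenerated syncs (active-high here; the real polarity comes from the
  // timing analyzer).
  always @(posedge clk or posedge rst) begin
    if (rst) begin
      hs <= 0; vs <= 0; de <= 0;
    end else begin
      hs <= (hpos >= H_ACTIVE+H_FP) && (hpos < H_ACTIVE+H_FP+H_SYNC);
      vs <= (vpos >= V_ACTIVE+V_FP) && (vpos < V_ACTIVE+V_FP+V_SYNC);
      de <= (hpos < H_ACTIVE) && (vpos < V_ACTIVE);
    end
  end
endmodule
```

For the side-by-side format of the next section, H_ACTIVE (and hence H_TOTAL) would be doubled and the module clocked at twice the pixel rate.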
4.9 Side-by-side 3D video
The least memory-intensive architecture is the side-by-side format, which requires only two line buffers (FIFOs) to store lines from the two video sources. The side-by-side frame is twice the width of the original input mode, so a doubled clock must be used to drive the regenerated sync timing with its doubled horizontal length. This doubled clock, which also clocks the back end, empties the first and second FIFOs at double rate so that the images appear side by side, as shown in Figure 14. The resulting side-by-side image is shown in Figure 15.
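The readout stage might then look like the following sketch, assuming the regenerator above is run from the doubled clock with its horizontal parameters doubled, and assuming line FIFOs with read enables; again, all names are illustrative.

```verilog
// Side-by-side multiplexer: during the active part of each regenerated
// (double-width) line, the first half is filled from line FIFO A and the
// second half from line FIFO B, emptying both buffers at double rate.
module sbs_mux #(
  parameter H_ACTIVE = 720                 // active pixels of ONE source image
)(
  input  wire        clk2x, rst,
  input  wire        de,                   // regenerated DE (2x-wide line)
  input  wire [11:0] hpos,                 // regenerated horizontal position
  input  wire [7:0]  fifo_a_q, fifo_b_q,   // line-FIFO read data
  output wire        rd_a, rd_b,           // line-FIFO read enables
  output wire [7:0]  pix_out               // side-by-side pixel stream
);
  wire left_half = (hpos < H_ACTIVE);      // first half: camera A; then camera B
  assign rd_a    = de &  left_half;
  assign rd_b    = de & ~left_half;
  assign pix_out = left_half ? fifo_a_q : fifo_b_q;
endmodule
```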
Learning source: ADI official website
Journal download: download.csdn.net/download/m0…