
USTC-TD: A Test Dataset and Benchmark for Image and Video Coding in 2020s

University of Science and Technology of China (USTC),
MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition,
Intelligent Visual Data Coding Laboratory (iVC Lab)

Figure 1. Examples of USTC Video Test Dataset.


Abstract

Image/video coding has been a remarkable research area for both academia and industry for many years. Test datasets, especially high-quality image/video datasets, are desirable for the justified evaluation of coding-related research, practical applications, and standardization activities. We put forward a test dataset, namely USTC-TD, which has been successfully adopted in the practical end-to-end image/video coding challenges of the IEEE International Conference on Visual Communications and Image Processing (VCIP) in 2022 and 2023. USTC-TD contains 40 images at 4K spatial resolution and 10 video sequences at 1080p spatial resolution, featuring various content owing to diverse environmental factors (e.g., scene type, texture, motion, view) and designed imaging factors (e.g., illumination, lens, shadow). We quantitatively evaluate USTC-TD on different image/video features (spatial, temporal, color, lightness) and compare it with previous image/video test datasets, which verifies the wider coverage and greater diversity of the proposed dataset. We also evaluate both classic standardized and recent learned image/video coding schemes on USTC-TD with PSNR and MS-SSIM, and provide an extensive benchmark for the evaluated schemes. Based on the characteristics and specific design of the proposed test dataset, we analyze the benchmark performance and shed light on future research and development of image/video coding. All the data are released online: https://esakak.github.io/USTC-TD.

Image Test Dataset

Our proposed dataset aims to cover various scenarios by collecting and simulating data from real-world coding and transmission scenes, which brings the evaluation of image coding schemes closer to actual applications.

Considering the various content elements, we combine different environmental conditions (scene type, texture, view, etc.) and capture conditions (resolution, illumination, lens, shadow, etc.) in the collection process.

Figure 2. Illustration of the image dataset in USTC-TD 2022.


Compared to USTC-TD 2022, USTC-TD 2023 considers more extreme elements in real-world scenes.

Figure 3. Illustration of the image dataset in USTC-TD 2023.


Video Test Dataset

Based on the characteristics of previous video datasets, our proposed dataset aims to cover more typical characteristics of video content. Compared to image data, temporal-domain properties are unique to video, especially the diverse motion types that arise under varied environmental conditions in natural videos. Video frames usually contain multiple moving objects of arbitrary shapes and various motion types, leading to complex motion fields that challenge video coding schemes. Therefore, we simulate video data with various temporal correlation types, including different kinds of object motion and lens motion.

Figure 4. Illustration of the video dataset in USTC-TD 2022.


Dataset Details

Construction

Based on the characteristics of previous image/video datasets, our proposed dataset aims to cover various scenarios by collecting and simulating data from real-world coding and transmission scenes, which brings the evaluation of image/video coding schemes closer to actual applications.

TABLE 1. THE CONFIGURATION OF USTC-TD 2022 IMAGE DATASET.

TABLE 2. THE CONFIGURATION OF USTC-TD 2023 IMAGE DATASET.

TABLE 3. THE CONFIGURATION OF USTC-TD 2023 VIDEO DATASET.

Analysis

To comprehensively verify the wide coverage of our proposed dataset over various content elements and to quantitatively analyze the superiority of USTC-TD, we evaluate USTC-TD on different image/video features and compare it with previous common image/video test datasets (image datasets: Kodak, CLIC, Tecnick; video datasets: HEVC CTC, VVC CTC, MCL-JCV, UVG). For the analysis of image/video features, we select spatial information (SI), colorfulness (CF), lightness information (LI), and temporal information (TI) to characterize each dataset along the dimensions of space, color, lightness, and temporal correlation, which are commonly used to evaluate the quality of datasets.
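The exact feature implementations used for this analysis are not reproduced on this page; below is a minimal sketch of how these descriptors are commonly computed, assuming the usual ITU-T P.910 definitions for SI/TI (Sobel-filtered luma and frame differencing) and the Hasler-Suesstrunk metric for CF. Treat it as an illustrative approximation, not the paper's exact code.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def _filter2(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    # valid-mode 2-D correlation via sliding windows (avoids a SciPy dependency)
    windows = sliding_window_view(img.astype(np.float64), kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def spatial_information(gray: np.ndarray) -> float:
    """SI: std-dev of the Sobel gradient magnitude of a luma frame (ITU-T P.910 style)."""
    gx = _filter2(gray, SOBEL_X)
    gy = _filter2(gray, SOBEL_Y)
    return float(np.std(np.hypot(gx, gy)))

def temporal_information(frames) -> float:
    """TI: max over time of the std-dev of consecutive luma-frame differences."""
    return float(max(
        np.std(b.astype(np.float64) - a.astype(np.float64))
        for a, b in zip(frames, frames[1:])
    ))

def colorfulness(rgb: np.ndarray) -> float:
    """CF: Hasler-Suesstrunk colorfulness of an RGB image (H x W x 3)."""
    r, g, b = (rgb[..., c].astype(np.float64) for c in range(3))
    rg, yb = r - g, 0.5 * (r + g) - b
    return float(np.hypot(np.std(rg), np.std(yb))
                 + 0.3 * np.hypot(np.mean(rg), np.mean(yb)))
```

LI is typically derived from luma-channel statistics in a similar per-image fashion. A flat frame yields SI = 0 and a gray image yields CF = 0, which is a quick sanity check for any implementation.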


Figure 5. Visualization of the spatial information (SI) and colorfulness (CF) features on different image test datasets. The scatter diagram plots SI versus CF, and the corresponding convex hulls indicate the coverage of the different datasets. The histogram shows the number of images under different SI scores.


Figure 6. Visualization of the lightness information (LI) and CF features on different image test datasets. The scatter diagram plots LI versus CF, and the corresponding convex hulls indicate the coverage of the different datasets. The histogram shows the number of images under different LI scores.



Figure 7. Visualization of the temporal information (TI) and SI features on different video test datasets. The scatter diagram plots TI versus SI, and the corresponding convex hulls indicate the coverage of the different datasets. The histogram shows the number of videos under different TI scores.



Dataset Evaluation

In this section, we establish baselines by evaluating both classic standardized codecs and recent state-of-the-art learned image/video compression algorithms with different metrics (PSNR, MS-SSIM, etc.), and comprehensively benchmark their performance on our proposed datasets.
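The rate-distortion points behind the curves below pair a quality score with a rate in bits per pixel. As a hedged illustration (not the benchmark's exact evaluation code), PSNR and bpp reduce to a few lines; MS-SSIM would typically come from an existing library implementation:

```python
import numpy as np

def psnr(ref: np.ndarray, rec: np.ndarray, max_val: float = 255.0) -> float:
    """PSNR in dB between a reference frame and its reconstruction."""
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(max_val ** 2 / mse))

def bits_per_pixel(stream_bytes: int, width: int, height: int,
                   num_frames: int = 1) -> float:
    """Rate in bpp: total bitstream size divided by the number of coded pixels."""
    return 8.0 * stream_bytes / (width * height * num_frames)
```

For example, a 1,000-byte bitstream for a single 100x100 image corresponds to 0.8 bpp; averaging such (bpp, PSNR) pairs over a dataset gives one point of an RD curve.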


Rate-distortion Curves

Figure 8. Overall rate-distortion (RD) curves of advanced image compression schemes under the PSNR and MS-SSIM metrics. The results are evaluated on the USTC-TD 2022 and 2023 image datasets.

Figure 9. Overall rate-distortion (RD) curves of advanced video compression schemes under the PSNR and MS-SSIM metrics. The results are evaluated on the USTC-TD 2023 video dataset.

BD-RATE for PSNR/MS-SSIM Metrics

TABLE 4. BD-RATE (%) COMPARISON FOR PSNR. THE ANCHOR IS VTM.
TABLE 5. BD-RATE (%) COMPARISON FOR MS-SSIM. THE ANCHOR IS VTM.
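The BD-rate numbers summarize, for each codec against the VTM anchor, the average bitrate change over the overlapping quality range. The exact implementation used for these tables is not shown here; a minimal sketch of the standard Bjontegaard metric (cubic fit of log-rate versus quality), offered under that assumption:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test) -> float:
    """Bjontegaard delta-rate in percent: average bitrate change of the test
    codec relative to the anchor over their shared quality range, computed
    from cubic fits of log-rate versus quality (PSNR or MS-SSIM in dB)."""
    fit_a = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)
    fit_t = np.polyfit(psnr_test, np.log(rate_test), 3)
    # integrate each fit over the quality interval common to both curves
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polyval(np.polyint(fit_a), hi) - np.polyval(np.polyint(fit_a), lo)
    int_t = np.polyval(np.polyint(fit_t), hi) - np.polyval(np.polyint(fit_t), lo)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return float((np.exp(avg_log_diff) - 1.0) * 100.0)
```

A codec needing twice the anchor's rate at every quality level yields a BD-rate of +100%; identical curves yield 0%, and negative values indicate bitrate savings over the anchor.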

BibTeX

@misc{USTC-TD,
    author       = {Zhuoyuan Li* and Junqi Liao* and Xihua Sheng and Haotian Zhang and Yuqi Li and Chuanbo Tang and Yifan Bian and Xinmin Feng and Yao Li and Changsheng Gao and Li Li and Dong Liu},
    title        = {USTC-TD: USTC Test Dataset for Image and Video Coding in 2020s},
    howpublished = {arXiv preprint},
    year         = {2024},
}
            

Contributors

Supervisors

Dong Liu

Li Li

Changsheng Gao



Students

Zhuoyuan Li
Ph.D.

Junqi Liao
Ph.D.

Chuanbo Tang
Ph.D.

Haotian Zhang
Ph.D.

Yuqi Li
M.S.

Yifan Bian
Ph.D.

Xihua Sheng
Ph.D.

Yao Li
Ph.D.

Xinmin Feng
M.S.

Acknowledgement

We appreciate the support of several organizations, and we also thank the supervisors and the USTC volunteers who contributed to this dataset.


Actors: Cunhui Dong, Ziyi Zhuang, Feihong Mei, Qiaoxi Chen, Bojun Liu.


Testers: Jialin Li, Xiongzhuang Liang.


Organizations: IEEE International Conference on Visual Communications and Image Processing (VCIP) 2022 and 2023.


Thanks to IEEE DataPort, we have submitted the data to this open dataset website for convenient access by researchers in the IEEE community.

Copyright

The released images and sequences are captured and processed by the University of Science and Technology of China (USTC). All intellectual property rights remain with USTC. Users who employ our datasets in their work should cite the dataset paper or this website.


The following uses are allowed for the contributed dataset:
     1. Data (images and videos) may be published in research papers, technical reports, and development events.
     2. Data (images and videos) may be utilized in standardization activities (e.g., ITU, MPEG, AVS, VQEG).


The following uses are NOT allowed for the contributed dataset:
     1. Do not publish snapshots in product brochures.
     2. Do not use the videos for marketing purposes.
     3. Redistribution is not permitted.
     4. Do not use the data in television shows, commercials, or movies.

Contact

If you have any questions or advice on these datasets, please contact us:
Zhuoyuan Li: email-zhuoyuanli@mail.ustc.edu.cn, wechat-ustc_lizhuoyuan
Junqi Liao: email-liaojq@mail.ustc.edu.cn, wechat-liaojq98


If you have any questions or advice on this website, please contact Yifan Bian.