Point clouds have emerged as a promising media format to represent realistic 3D objects or scenes in applications such as virtual reality, teleportation, etc. How to accurately quantify the subjective point cloud quality for application-driven optimization, however, is still a challenging and open problem. In this paper, we attempt to tackle this problem in a systematic manner. First, we produce a fairly large point cloud dataset in which ten popular point clouds are augmented with seven types of impairments (e.g., compression, photometry/color noise, geometry noise, scaling) at six distortion levels each, and organize a formal subjective assessment with tens of subjects to collect mean opinion scores (MOS) for all 420 processed point cloud samples (PPCS). We then develop an objective metric that can accurately estimate the subjective quality. Towards this goal, we project the 3D point cloud onto the six perpendicular image planes of a cube to obtain a color texture image and a corresponding depth image per plane, and aggregate image-based global features (e.g., Jensen-Shannon (JS) divergence) and local features (e.g., edge, depth, pixel-wise similarity, complexity) across all projected planes into a final objective index. Model parameters are fixed constants obtained by regression on a small, independent, previously published dataset. The proposed metric demonstrates state-of-the-art performance in predicting subjective point cloud quality compared with the weighted peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). The dataset and model are made publicly accessible at http://vision.nju.edu.cn for all interested audiences.
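The 3D-to-2D projection step mentioned above can be sketched in a few lines of numpy. This is a minimal illustration only, assuming axis-aligned orthographic projection onto the six faces of the normalized bounding cube with nearest-point occlusion; the paper's actual rendering resolution and occlusion handling may differ, and `project_to_planes` is a hypothetical helper name.

```python
import numpy as np

def project_to_planes(points, colors, res=256):
    """Project a colored point cloud onto the six faces of its bounding
    cube, returning a (color image, depth image) pair per face."""
    # Normalize coordinates to the unit cube [0, 1]^3.
    lo, hi = points.min(axis=0), points.max(axis=0)
    p = (points - lo) / (hi - lo + 1e-12)
    faces = []
    for axis in range(3):                 # project along x, y, z
        u, v = [a for a in range(3) if a != axis]
        for direction in (0, 1):          # near face and far face
            depth = p[:, axis] if direction == 0 else 1.0 - p[:, axis]
            iu = np.minimum((p[:, u] * res).astype(int), res - 1)
            iv = np.minimum((p[:, v] * res).astype(int), res - 1)
            cmap = np.zeros((res, res, 3))         # color texture image
            dmap = np.full((res, res), np.inf)     # depth image
            order = np.argsort(-depth)    # draw far points first so that
            cmap[iu[order], iv[order]] = colors[order]  # the nearest wins
            dmap[iu[order], iv[order]] = depth[order]
            faces.append((cmap, dmap))
    return faces

rng = np.random.default_rng(0)
cloud = rng.random((1000, 3))             # toy point cloud
rgb = rng.random((1000, 3))               # per-point colors
views = project_to_planes(cloud, rgb, res=64)   # six (color, depth) pairs
```

The global and local features described in the paper are then computed on these six color/depth image pairs rather than on the raw 3D points.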
Q. Yang, H. Chen, Z. Ma, Y. Xu, R. Tang and J. Sun, "Predicting the Perceptual Quality of Point Cloud: A 3D-to-2D Projection-Based Exploration," in IEEE Transactions on Multimedia, doi: 10.1109/TMM.2020.3033117.
This database was made by Qi Yang (email@example.com) and Rongjun Tang (firstname.lastname@example.org) from Shanghai Jiao Tong University. We welcome everyone to test it and to propose modifications. If you use our database in your paper, please cite our paper: "Predicting the Perceptual Quality of Point Cloud: A 3D-to-2D Projection-Based Exploration," IEEE Transactions on Multimedia.
There are 10 reference samples: 'Ricardo' (used in the training session; no subjective score), 'redandblack', 'loot', 'soldier', 'longdress', 'Hhi', 'shiva', 'shatue', 'ULB_Unicorn' and 'Romanoillamp' (shown in Fig. 1).
Fig. 1 Database Samples
The details of each sample are shown in Table 1.
Table 1 Point cloud sample illustration
Each sample was processed with seven types of distortion, each at six levels:
OT: Octree-based compression
CN: Color noise
DS: Downscaling
D+C: Downscaling and color noise
D+G: Downscaling and geometry Gaussian noise
GGN: Geometry Gaussian noise
C+G: Color noise and geometry Gaussian noise
OT: Compression noise is exemplified using the octree pruning method provided in the well-known Point Cloud Library (PCL) (http://pointclouds.org/downloads/). Octree pruning removes leaf nodes to coarsen the tree resolution for compression purposes. Here, we experimented with different compression levels by removing 13%, 27%, 43%, 58%, 70% and 85% of the points. It is difficult to guarantee the exact point removal percentage, so we allow a ±3% deviation.
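PCL's octree implementation is not reproduced here, but the effect of leaf pruning can be sketched in numpy: quantizing points to a coarse voxel grid and keeping one point per occupied voxel merges everything that would fall into the same pruned leaf. The `leaf_size` value below is hypothetical; in practice it would be tuned until the removal percentage lands near the target level.

```python
import numpy as np

def octree_style_prune(points, leaf_size):
    """Keep one point per occupied voxel of edge length `leaf_size`,
    mimicking the merging effect of pruning octree leaf nodes."""
    keys = np.floor(points / leaf_size).astype(np.int64)
    # np.unique on the voxel keys yields one representative per voxel.
    _, idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(idx)]

rng = np.random.default_rng(0)
cloud = rng.random((10000, 3))                # toy cloud in a unit cube
pruned = octree_style_prune(cloud, leaf_size=0.05)
removed = 1.0 - len(pruned) / len(cloud)      # fraction of points removed
```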
CN: Color noise, or photometric noise, is applied to the photometric attributes (RGB values) of the points. We inject noise into 10%, 30%, 40%, 50%, 60%, and 70% of randomly selected points, where the noise levels are in turn randomly drawn within ±10, ±30, ±40, ±50, ±60, and ±70 for the corresponding points (e.g., 10% random points with ±10 noise, 30% random points with ±30 noise, and so forth). Noise is applied equally to the R, G and B attributes. Clipping is used if the noisy intensity p = p + n falls out of the range [0, 255]: if p < 0, p = 0; and if p > 255, p = 255.
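A minimal numpy sketch of this noise model, assuming the noise is drawn uniformly per point and per channel (the exact distribution within the ±amplitude range is not specified above):

```python
import numpy as np

def add_color_noise(colors, fraction, amplitude, rng):
    """Add uniform noise in [-amplitude, +amplitude] to the RGB values of
    a randomly selected fraction of points, clipping to [0, 255]."""
    out = colors.astype(np.int16)          # widen so noise cannot wrap
    n = len(out)
    picked = rng.choice(n, size=int(fraction * n), replace=False)
    noise = rng.integers(-amplitude, amplitude + 1, size=(len(picked), 3))
    out[picked] = np.clip(out[picked] + noise, 0, 255)
    return out.astype(np.uint8)

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(1000, 3), dtype=np.uint8)   # toy colors
noisy = add_color_noise(rgb, fraction=0.10, amplitude=10, rng=rng)
```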
DS: We randomly downsample the point clouds by removing 15%, 30%, 45%, 60%, 75% and 90% of the points from the original point clouds. We directly use the downsampling function pcdownsample() offered by Matlab.
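For readers without Matlab, the same random removal can be sketched in numpy (note that Matlab's pcdownsample with the 'random' method is parameterized by the fraction to keep, whereas the levels above are stated as fractions removed):

```python
import numpy as np

def random_downsample(points, remove_fraction, rng):
    """Randomly drop `remove_fraction` of the points."""
    n = len(points)
    keep = rng.choice(n, size=n - int(remove_fraction * n), replace=False)
    return points[np.sort(keep)]

rng = np.random.default_rng(0)
cloud = rng.random((2000, 3))                      # toy point cloud
ds = random_downsample(cloud, remove_fraction=0.15, rng=rng)
```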
DS+CN or D+C: We combine the aforementioned DS and CN: the downsampling is applied first and the color noise is then added (e.g., 15% DS with 10% random points at ±10 noise, 30% DS with 30% random points at ±30 noise, and so forth).
DS+GGN or D+G: DS and GGN are superimposed. The DS process is applied first, followed by the GGN (e.g., 15% DS with 0.05% GGN, 30% DS with 0.1% GGN, and so forth).
GGN: We apply a Gaussian-distributed geometric shift to each point randomly. In this study, all points are augmented with a random geometric shift within 0.05%, 0.1%, 0.2%, 0.5%, 0.7% and 1.2% of the bounding box.
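A numpy sketch of this geometric perturbation; scaling the Gaussian standard deviation by the bounding-box diagonal is an assumption (the text above only says the shift is a percentage of the bounding box):

```python
import numpy as np

def add_geometry_noise(points, level, rng):
    """Shift every point by zero-mean Gaussian noise whose standard
    deviation is `level` (e.g., 0.0005 for 0.05%) of the bounding-box
    diagonal. The diagonal normalization is an assumption."""
    diag = np.linalg.norm(points.max(axis=0) - points.min(axis=0))
    return points + rng.normal(scale=level * diag, size=points.shape)

rng = np.random.default_rng(0)
cloud = rng.random((5000, 3))                    # toy point cloud
noisy = add_geometry_noise(cloud, level=0.0005, rng=rng)
```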
CN+GGN or C+G: Both GGN and CN are superimposed. The GGN is applied first, followed by the CN (e.g., 0.05% GGN with 10% random points at ±10 noise, 0.1% GGN with 30% random points at ±30 noise, and so forth).
We adopted the single-stimulus method for subjective rating, and all steps comply with ITU-R Recommendation BT.500. We used CloudCompare as the point cloud rendering software for collecting the MOS. The computer used in the subjective experiments is equipped with a Dell SE2216H monitor (21.5 inches, 1920×1080 resolution), an Intel i5-6300HQ CPU, 8 GB of RAM and a 1 TB hard drive. Raw scores are given in the range [1, 10], associated with five quality scales (1-2: bad, 3-4: poor, 5-6: fair, 7-8: good and 9-10: excellent).
For more details about the raw MOS processing, please check our paper.