Viewport Image Quality Database Considering Peripheral Vision Impact

Introduction:

Conventional images and videos are usually rendered with uniform quality within the central vision area of the human visual system (HVS). Recent virtual reality (VR) devices with head-mounted displays (HMDs) extend the field of view (FoV) significantly to cover both the central and peripheral vision areas for an immersive experience (illustrated in Fig. 1). Perceived image quality is unequal across these areas because photoreceptors are distributed non-uniformly on the retina, as shown in Fig. 2.


Fig.1		Fig.2

Inspired by this visual characteristic, we study the perceptual quality of the rendered image with respect to the eccentric angle θ across the different vision areas within the current FoV (or viewport) of the HMD while the user holds attention on a fixed area. Image quality is commonly controlled by the quantization stepsize q and the spatial resolution s, either separately or jointly. We therefore measure the visual quality thresholds on q and s at different θ through extensive human subject tests on a mainstream VR HMD, leading to analytical models that express the q-threshold and s-threshold explicitly as functions of θ. These models, with validated accuracy, can be used to set different quality weights in different regions (as shown in Fig. 3), so as to significantly reduce the transmitted data size without subjective quality loss.

Fig.3


Citation:

This work is under review. We will update the citation as soon as possible.

Description:

In general, we show pairs of image samples and degrade the quality of one sample step by step during the test session to obtain the visual quality threshold, following the double-stimulus and just-noticeable-distortion (JND) criteria. Each pair is displayed sequentially for about 5 seconds, with a 3-second pause to record the subjective JND opinion. In each pair, one image has uniform quality at the highest level (e.g., q = q_min and s = s_max) and serves as the anchor, while the other uses the UEQ settings in the central and peripheral areas. Various quality scales are compared against the anchor to identify the visual thresholds at which the HVS perceives a quality difference. In this work, we divide the rendered 110° HMD FoV into three main regions: the CVA (Central Vision Area) with one-side eccentricity θ from 0° to 9°, the NPA (Near Peripheral Area) with θ ∈ (9°, 30°], and the FPA (Far Peripheral Area) covering the rest from 30° to 55°. More details about the test protocol can be found in our paper. (We will upload the code for this implementation later. The final UI is similar to Fig. 4.)
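As a rough illustration of the region split and the step-wise search described above, the following Python sketch maps an eccentricity to its region and walks through a list of progressively degraded quality levels until a difference is reported. It is not the authors' test software: the function names, the symmetric treatment of negative angles, and the `is_noticeable` callback (a stand-in for one subject's response) are all assumptions made for this example.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")

# Region boundaries in degrees of one-side eccentricity, as stated in the text.
CVA_MAX_DEG = 9.0    # Central Vision Area: 0 to 9 degrees
NPA_MAX_DEG = 30.0   # Near Peripheral Area: (9, 30] degrees
FPA_MAX_DEG = 55.0   # Far Peripheral Area: up to half of the 110-degree FoV


def region_for_eccentricity(theta_deg: float) -> str:
    """Return 'CVA', 'NPA' or 'FPA' for an eccentricity given in degrees.

    Assumption: the regions are symmetric around the fixation point,
    so only the magnitude of the angle matters.
    """
    theta = abs(theta_deg)
    if theta <= CVA_MAX_DEG:
        return "CVA"
    if theta <= NPA_MAX_DEG:
        return "NPA"
    if theta <= FPA_MAX_DEG:
        return "FPA"
    raise ValueError(f"eccentricity {theta_deg} is outside the 110-degree FoV")


def last_unnoticed_level(
    levels: Sequence[T],
    is_noticeable: Callable[[T], bool],
) -> Optional[T]:
    """Step through levels ordered from best to worst quality.

    Returns the last level the subject could not tell apart from the
    anchor, or None if even the first degraded level was noticed.
    `is_noticeable` is a hypothetical callback standing in for one
    subject's JND response to a displayed pair.
    """
    threshold = None
    for level in levels:
        if is_noticeable(level):
            break
        threshold = level
    return threshold
```

For example, `region_for_eccentricity(20)` falls in the NPA, and a simulated subject who first notices degradation at the third level of a four-level list yields the second level as the threshold.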

Fig.4

Eight immersive images (Attic, Temple, Ship, Train, Beach, Sculpture, Football, and Desert) from the SUN360 database, uniformly downsampled to a spatial resolution of 4096×2160, are chosen as our test materials, as shown in Fig. 5 [download link: origin test scenes]. These test samples cover a wide range of content characteristics and represent typical scenarios of immersive applications. In addition, each test image contains a meaningful salient area within the user's FoV when rendered in the HMD.

Fig.5

In our experiment, every image is prepared in multiple versions to cover a sufficient range of quality scales, by compressing it with different combinations of s and/or q using the x264 encoder (the H.264 encoder in the ffmpeg tool). As shown in Table I, we performed three independent tests to study the separate and joint visual quality thresholds of q and s. The subjects' visual quality thresholds for the CVA, NPA, and FPA can be downloaded [download link: #1 results; #2 results; #3 results]. According to the subjects' opinions, an image with non-uniform quality above these thresholds keeps the same perceptual quality as the initial anchor image.
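One possible way to prepare such versions is sketched below in Python: it only builds ffmpeg command lines for a grid of resolutions and quantization settings and does not run them. The exact encoder settings used in the study are not given here, so this is an assumption-laden sketch: it controls quantization through libx264's constant-QP option (`-qp`) as a stand-in for the stepsize q, and the file names, resolutions, and QP values are hypothetical.

```python
from itertools import product
from typing import Iterable, List, Tuple


def build_encode_command(
    src: str, dst: str, width: int, height: int, qp: int
) -> List[str]:
    """Build an ffmpeg command that rescales one image and encodes it
    as a single H.264 frame at a constant QP.

    Assumption: constant QP is used here as a proxy for the
    quantization stepsize q; the study may have set q differently.
    """
    if not 0 <= qp <= 51:
        raise ValueError("QP for 8-bit H.264 must be in the range 0..51")
    return [
        "ffmpeg", "-y",                     # overwrite the output if it exists
        "-i", src,                          # source image
        "-vf", f"scale={width}:{height}",   # spatial resolution s
        "-frames:v", "1",                   # encode exactly one frame
        "-c:v", "libx264",                  # x264 encoder inside ffmpeg
        "-qp", str(qp),                     # constant quantization parameter
        dst,
    ]


def build_grid(
    src: str,
    stem: str,
    resolutions: Iterable[Tuple[int, int]],
    qps: Iterable[int],
) -> List[List[str]]:
    """Return one command per (resolution, QP) combination."""
    commands = []
    for (width, height), qp in product(list(resolutions), list(qps)):
        dst = f"{stem}_{width}x{height}_qp{qp}.mp4"  # hypothetical naming scheme
        commands.append(build_encode_command(src, dst, width, height, qp))
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg with libx264 is installed. Keeping the command construction separate from execution makes the grid easy to inspect before any file is written.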

Table.1