Modeling the Screen Content Image Quality via Multiscale Edge Attention Similarity

Made by Shanghai Jiao Tong University

Written by  |  10/03/2020 - 18/09


Screen content images (SCIs) have become prevalent with the explosive growth of screen-oriented applications. This has led to extensive studies on SCI quality assessment and modeling for application optimization. In this paper, we propose a full-reference multiscale edge attention (MSEA) similarity index to efficiently measure the perceptual quality of a screen image. The model jointly considers the perceptual impacts of fixation attention, edge structure and edge contrast, to accurately capture the masking phenomena (e.g., frequency selectivity, luminance, contrast) of the human visual system (HVS) when viewing a screen image. Specifically, we decompose the images using Gaussian and Laplacian pyramids, which are then used to derive the edge structure and edge contrast feature maps. Together with the fixation attention map, generated from the weighted luminance difference between the reference and distorted SCIs, we obtain an MSEA similarity map from which the final index score is computed. We have evaluated this model using a publicly accessible screen image database. Simulation results show that the MSEA similarity index correlates very well with the collected subjective mean opinion scores (MOS). Among existing quality metrics, it ranks first for both the Pearson linear correlation coefficient (PLCC) and the root mean squared error (RMSE), and second for the Spearman rank-order correlation coefficient (SROCC).
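The Gaussian and Laplacian pyramid decomposition mentioned above can be sketched as follows. This is a minimal illustration, not the implementation used for MSEA: the 5-tap binomial kernel, the reflective border handling, the nearest-neighbour upsampling and the function names (`blur`, `gaussian_pyramid`, `upsample`, `laplacian_pyramid`) are all assumptions made for the example.

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian (assumed here).
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(img):
    """Separable low-pass filter with reflective border padding."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    # Filter along the horizontal axis, then along the vertical axis.
    rows = sum(KERNEL[k] * padded[:, k:k + w] for k in range(5))
    return sum(KERNEL[k] * rows[k:k + h, :] for k in range(5))


def gaussian_pyramid(img, levels):
    """Repeatedly blur and keep every second pixel in each direction."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid


def upsample(img, shape):
    """Nearest-neighbour upsampling to `shape`, followed by a blur."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])


def laplacian_pyramid(gauss):
    """Band-pass levels: each Gaussian level minus the upsampled next level."""
    bands = [g - upsample(g_next, g.shape)
             for g, g_next in zip(gauss[:-1], gauss[1:])]
    bands.append(gauss[-1])  # the coarsest level keeps the low-pass residual
    return bands
```

In this sketch the Laplacian levels carry the band-pass (edge-like) detail at each scale, while the Gaussian levels carry progressively smoother versions of the image. Summing each band with the upsampled coarser reconstruction, starting from the last level, recovers the original image.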



  title={Modeling the Screen Content Image Quality via Multiscale Edge Attention Similarity},

  author={Yang, Qi and Ma, Zhan and Xu, Yiling and Yang, Le and Zhang, Wenjun and Sun, Jun},

  journal={IEEE Trans. Broadcasting},




Model Description

Our IQA model is illustrated in Fig. 1. It consists of a series of computational operations (multiscale feature extraction, similarity measurement and pooling) that mimic how the HVS assesses an image. The functional form of the proposed model is given below:

Fig. 1 Illustrative diagram of the proposed MSEA model for full-reference SCI quality assessment

WFA(x, y), SES(x, y) and SEC(x, y) denote, respectively, the weighted fixation attention (FA) map, which accounts for the visual impact of focusing attention on salient edge regions (e.g., saccades and fixation), the edge structure (ES) similarity, and the edge contrast (EC) similarity, each at pixel position (x, y) of an image. Each feature component can be derived from the multiscale image representations. This model belongs to the full-reference IQA category.
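As a rough illustration of how an attention map can weight two per-pixel similarity maps before pooling them into a single score, consider the sketch below. It is not the MSEA definition: the product of the two similarity maps, the weighted-mean pooling, the similarity form (2ab + c) / (a^2 + b^2 + c), the constant `c` and the function names are all assumptions borrowed from common practice in similarity-based IQA indices.

```python
import numpy as np


def similarity_map(a, b, c=1e-3):
    """Pointwise similarity (2ab + c) / (a^2 + b^2 + c).

    Equals 1 where the two feature maps agree; `c` is a small
    stabilising constant (value chosen arbitrarily for this sketch).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (2.0 * a * b + c) / (a ** 2 + b ** 2 + c)


def pooled_score(w_fa, s_es, s_ec):
    """Attention-weighted mean of the product of two similarity maps.

    Assumed pooling form, for illustration only:
        sum(W_FA * S_ES * S_EC) / sum(W_FA)
    """
    w_fa = np.asarray(w_fa, dtype=float)
    combined = np.asarray(s_es, dtype=float) * np.asarray(s_ec, dtype=float)
    return float(np.sum(w_fa * combined) / np.sum(w_fa))
```

With this form, pixels that receive a larger attention weight contribute more to the final score, and a distorted image whose feature maps match the reference everywhere scores 1.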


Model Performance

We test the proposed MSEA index on the SIQAD database; the results are shown in Table 1. Red, blue and cyan highlight the top three results for each measurement criterion.

Table 1 Performance comparison of various IQA models on the SIQAD database

In most cases, IQA models designed for SCIs outperform those designed for natural images. The reason is that state-of-the-art SCI assessment models account for the fact that the HVS is more sensitive to edge distortion in SCIs, which mix text and graphics. Edge distortion thus turns out to be a major contributing factor in assessing SCI quality. On average, the proposed MSEA similarity index ranks first for the PLCC and RMSE criteria, and second for SROCC, behind the MDOGS index.
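For reference, the three criteria named above (PLCC, SROCC and RMSE) can be computed between objective scores and MOS values as in the sketch below. It uses only the standard library, and the function names are chosen for this example. In IQA evaluations, PLCC and RMSE are commonly computed after fitting a nonlinear (e.g., logistic) mapping from objective scores to MOS; that step is omitted here, so this is a simplified illustration rather than the evaluation protocol behind Table 1.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(x, y):
    """Root mean squared error between predictions and targets."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))
```

Higher PLCC and SROCC indicate better agreement with subjective scores, while a lower RMSE indicates smaller prediction error.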