Our paper, "CodedVision: Towards Joint Image Understanding and Compression via End-to-End Learning", was selected as a Best Paper Finalist. The work explores performing vision tasks and compression jointly at the front end within an end-to-end learning framework. The results show a 7.8% BD-Rate improvement over HEVC Intra profile image coding, with classification accuracy comparable to the conventional method.
Abstract:
We present a CodedVision framework to achieve image content understanding and compression jointly, leveraging the recent advances in deep neural networks. We have introduced an eight-layer deep residual network to extract image features for compression and understanding. For compression, a scalar quantizer and an entropy coder are utilized to remove redundancy. Rate-distortion optimization is integrated to improve the coding efficiency where rate is estimated via a piecewise linear approximation. A noticeable 7.8% BD-Rate (Bjontegaard delta rate) gain is presented against the state-of-the-art HEVC intra based image compression. For content understanding, we patch another residual network-based classifier to perform the classification, with reasonable accuracy at the current stage.
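For readers unfamiliar with the BD-Rate figure quoted above, the following is a minimal sketch of the commonly used Bjontegaard calculation: fit log-bitrate as a cubic polynomial of PSNR for each codec, integrate both fits over the shared PSNR range, and convert the average gap into a percentage. It is not the authors' evaluation script. The function name `bd_rate` and the sample rate/PSNR points are hypothetical, chosen only for illustration. Under this sign convention a bitrate saving shows up as a negative value.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (%) of the test codec vs. the anchor at equal PSNR.

    Illustrative sketch of the usual cubic-fit Bjontegaard method; negative
    values mean the test codec needs fewer bits than the anchor.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Fit log-rate as a cubic polynomial of PSNR for each codec.
    poly_a = np.polyfit(pa, log_ra, 3)
    poly_t = np.polyfit(pt, log_rt, 3)

    # Integrate both fits over the PSNR interval covered by both curves.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Average log-rate gap, converted back to a percentage.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical rate (bits per pixel) and PSNR (dB) points, not from the paper.
    anchor_rate = [0.25, 0.50, 1.00, 2.00]
    psnr = [30.0, 33.0, 36.0, 39.0]
    # A test codec that uses 10% fewer bits at every quality level.
    test_rate = [0.9 * r for r in anchor_rate]
    # By construction the result is close to -10 (a 10% bitrate saving).
    print(bd_rate(anchor_rate, psnr, test_rate, psnr))
```

Four rate-distortion points per codec is the usual minimum, since a cubic fit needs at least four samples.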
