De Montfort University

Professor Raouf Hamzaoui

Job: Professor in Media Technology

Faculty: Computing, Engineering and Media

School/department: School of Engineering and Sustainable Development

Research group(s): Institute of Engineering Sciences

Address: De Montfort University, The Gateway, Leicester, LE1 9BH

T: +44 (0)116 207 8096

E: rhamzaoui@dmu.ac.uk


Personal profile

Raouf Hamzaoui received the MSc degree in mathematics from the University of Montreal, Canada, in 1993, the Dr.rer.nat. degree from the University of Freiburg, Germany, in 1997, and the Habilitation degree in computer science from the University of Konstanz, Germany, in 2004. He was an Assistant Professor with the Department of Computer Science of the University of Leipzig, Germany, and with the Department of Computer and Information Science of the University of Konstanz. In September 2006, he joined De Montfort University, where he is a Professor in Media Technology and Head of the Signal Processing and Communications Systems Group in the Institute of Engineering Sciences. Raouf Hamzaoui is an IEEE Senior Member. He was a member of the Editorial Board of the IEEE Transactions on Multimedia and IEEE Transactions on Circuits and Systems for Video Technology. He has published more than 120 research papers in books, journals, and conferences. His research has been funded by the EU, DFG, Royal Society, Chinese Academy of Sciences, China Ministry of Science and Technology, and industry, and has received best paper awards (ICME 2002, PV’07, CONTENT 2010, MESM’2012, UIC-2019, CCF Transactions on Pervasive Computing and Interaction 2020).

Research group affiliations

Institute of Engineering Sciences (IES)

Signal Processing and Communications Systems (SPCS)

 

Publications and outputs


  • Title: Enhancing Octree-based Context Models for Point Cloud Geometry Compression with Attention-based Child Node Number Prediction
    Authors: Sun, Chang; Yuan, Hui; Mao, Xiaolong; Lu, Xin; Hamzaoui, Raouf
    Abstract: In point cloud geometry compression, most octree-based context models use the cross-entropy between the one-hot encoding of node occupancy and the probability distribution predicted by the context model as the loss. This approach converts the problem of predicting the number (a regression problem) and the position (a classification problem) of occupied child nodes into a 255-dimensional classification problem. As a result, it fails to accurately measure the difference between the one-hot encoding and the predicted probability distribution. We first analyze why the cross-entropy loss function fails to accurately measure the difference between the one-hot encoding and the predicted probability distribution. Then, we propose an attention-based child node number prediction (ACNP) module to enhance the context models. The proposed module can predict the number of occupied child nodes and map it into an 8-dimensional vector to assist the context model in predicting the probability distribution of the occupancy of the current node for efficient entropy coding. Experimental results demonstrate that the proposed module enhances the coding efficiency of octree-based context models.
    Note: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.

  • Title: Colored Point Cloud Quality Assessment Using Complementary Features in 3D and 2D Spaces
    Authors: Cui, Mao; Zhang, Yun; Fan, Chunling; Hamzaoui, Raouf; Li, Qinglan
    Abstract: Point Cloud Quality Assessment (PCQA) plays an essential role in optimizing point cloud acquisition, encoding, transmission, and rendering for human-centric visual media applications. In this paper, we propose an objective PCQA model using Complementary Features from 3D and 2D spaces, called CF-PCQA, to measure the visual quality of colored point clouds. First, we develop four effective features in 3D space to represent the perceptual properties of colored point clouds, which include curvature, kurtosis, luminance distance and hue features of points in 3D space. Second, we project the 3D point cloud onto 2D planes using patch projection and extract a structural similarity feature of the projected 2D images in the spatial domain, as well as a sub-band similarity feature in the wavelet domain. Finally, we propose a feature selection and a learning model to fuse high dimensional features and predict the visual quality of the colored point clouds. Extensive experimental results show that the Pearson Linear Correlation Coefficients (PLCCs) of the proposed CF-PCQA were 0.9117, 0.9005, 0.9340 and 0.9826 on the SIAT-PCQD, SJTU-PCQA, WPC2.0 and ICIP2020 datasets, respectively. Moreover, statistical significance tests demonstrate that the CF-PCQA significantly outperforms the state-of-the-art PCQA benchmark schemes on the four datasets.
    Note: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.

  • Title: Dependence-Based Coarse-to-Fine Approach for Reducing Distortion Accumulation in G-PCC Attribute Compression
    Authors: Guo, Tian; Yuan, Hui; Hamzaoui, Raouf; Wang, Xiaohui; Wang, Lu
    Abstract: Geometry-based point cloud compression (G-PCC) is a state-of-the-art point cloud compression standard. While G-PCC achieves excellent performance, its reliance on the predicting transform leads to a significant dependence problem, which can easily result in distortion accumulation. This not only increases bitrate consumption but also degrades reconstruction quality. To address these challenges, we propose a dependence-based coarse-to-fine approach for distortion accumulation in G-PCC attribute compression. Our method consists of three modules: level-based adaptive quantization, point-based adaptive quantization, and Wiener filter-based refinement level quality enhancement. The level-based adaptive quantization module addresses the interlevel-of-detail (LOD) dependence problem, while the point-based adaptive quantization module tackles the interpoint dependence problem. On the other hand, the Wiener filter-based refinement level quality enhancement module enhances the reconstruction quality of each point based on the dependence order among LODs. Extensive experimental results demonstrate the effectiveness of the proposed method. Notably, when the proposed method was implemented in the latest G-PCC test model (TMC13v23.0), a Bjøntegaard delta rate of −4.9%, −12.7%, and −14.0% was achieved for the Luma, Chroma Cb, and Chroma Cr components, respectively.
    Note: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.

  • Title: Crowdsourced Estimation of Collective Just Noticeable Difference for Compressed Video with the Flicker Test and QUEST+
    Authors: Jenadeleh, Mohsen; Hamzaoui, Raouf; Reips, Ulf-Dietrich; Saupe, Dietmar
    Abstract: The concept of videowise just noticeable difference (JND) was recently proposed for determining the lowest bitrate at which a source video can be compressed without perceptible quality loss with a given probability. This bitrate is usually obtained from estimates of the satisfied user ratio (SUR) at different encoding quality parameters. The SUR is the probability that the distortion corresponding to the quality parameter is not noticeable. Commonly, the SUR is computed experimentally by estimating the subjective JND threshold of each subject using a binary search, fitting a distribution model to the collected data, and creating the complementary cumulative distribution function of the distribution. The subjective tests consist of paired comparisons between the source video and compressed versions. However, as shown in this paper, this approach typically overestimates or underestimates the SUR. To address this shortcoming, we directly estimate the SUR function by considering the entire population as a collective observer. In our method, the subject for each paired comparison is randomly chosen, and a state-of-the-art Bayesian adaptive psychometric method (QUEST+) is used to select the compressed video in the paired comparison. Our simulations show that this collective method yields more accurate SUR results using fewer comparisons than traditional methods. We also perform a subjective experiment to assess the JND and SUR for compressed video. In the paired comparisons, we apply a flicker test that compares a video interleaving the source video and its compressed version with the source video. Analysis of the subjective data reveals that the flicker test provides, on average, greater sensitivity and precision in the assessment of the JND threshold than does the usual test, which compares compressed versions with the source video. Using crowdsourcing and the proposed approach, we build a JND dataset for 45 source video sequences that are encoded with both advanced video coding (AVC) and versatile video coding (VVC) at all available quantization parameters. Our dataset and the source code have been made publicly available at http://database.mmsp-kn.de/flickervidset-database.html.
    Note: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.

  • Title: Evaluating the Impact of Point Cloud Downsampling on the Robustness of LiDAR-based Object Detection
    Authors: Golarits, Marcell; Rosza, Zoltan; Hamzaoui, Raouf; Allidina, Tanvir; Lu, Xin; Sziranyi, Tamas
    Abstract: LiDAR-based 3D object detection relies on the relatively rich information captured by LiDAR point clouds. However, computational efficiency often requires the downsampling of these point clouds. This paper studies the impact of downsampling strategies on the robustness of a state-of-the-art object detector, namely PointPillars. We compare the performance of the approach under random sampling and farthest point sampling, evaluating the model’s accuracy in detecting objects across various downsampling ratios. The experiments were conducted on the popular KITTI dataset.

  • Title: Survey of IP-based air-to-ground data link communication technologies
    Authors: Özmen, Sergun; Hamzaoui, Raouf; Chen, Feng
    Abstract: The main purpose of an air traffic management system is to provide air traffic services to an aircraft moving within the controlled airspace. Very high frequency (VHF) radio in continental regions, as well as high frequency (HF) radio and satellite communications in remote areas, are used today as the primary way of delivering air traffic services. The technical limitations and constraints associated with the current technology, such as line-of-sight requirement, vulnerability to interference, and limited coverage, cause degraded voice quality and discontinuity in service. At the same time, voice-based communication may affect flight safety due to poor language skills, call sign confusion, and failure to use standard phraseology. For this reason, text-based communication over a VHF data link (VDL) has been proposed as an alternative. However, it is predicted that VDL will be insufficient to support increasing air traffic and intensive data exchanges due to its lack of mobility support and limited resources to ensure service continuity. This paper surveys next-generation data link technologies based on the current state of the "industry standard" for aeronautical communication. These include Aeronautical Mobile Airport Communication System (AeroMACS), L-band Digital Aeronautical Communications System (LDACS), and Airborne New Advanced Satellite Techniques & Technologies in a System Integrated Approach (ANASTASIA). The paper also surveys IP-based text communication solutions over these next-generation data links. We analyze the efficiency of the proposed solutions with regard to service continuity and aeronautical application requirements. We conclude the survey by identifying open problems and future trends.
    Note: Open access article.

  • Title: PCAC-GAN: A Sparse-Tensor-Based Generative Adversarial Network for 3D Point Cloud Attribute Compression
    Authors: Mao, Xiaolong; Yuan, Hui; Lu, Xin; Hamzaoui, Raouf; Gao, Wei
    Abstract: Learning-based methods have proven successful in compressing the geometry information of point clouds. For attribute compression, however, they still lag behind nonlearning-based methods such as the MPEG G-PCC standard. To bridge this gap, we propose a novel deep learning-based point cloud attribute compression method that uses a generative adversarial network (GAN) with sparse convolution layers. Our method also includes a module that adaptively selects the resolution of the voxels used to voxelize the input point cloud. Sparse vectors are used to represent the voxelized point cloud, and sparse convolutions process the sparse tensors, ensuring computational efficiency. To the best of our knowledge, this is the first application of GANs to compress point cloud attributes. Our experimental results show that our method outperforms existing learning-based techniques and rivals the latest G-PCC test model (TMC13v23) in terms of visual quality.
    Note: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.

  • Title: 3DAttGAN: A 3D attention-based generative adversarial network for joint space-time video super-resolution
    Authors: Fu, Congrui; Yuan, Hui; Shen, Liquan; Hamzaoui, Raouf; Zhang, Hao
    Abstract: Joint space-time video super-resolution aims to increase both the spatial resolution and the frame rate of a video sequence. As a result, details become more apparent, leading to a better and more realistic viewing experience. This is particularly valuable for applications such as video streaming, video surveillance (object recognition and tracking), and digital entertainment. Over the last few years, several joint space-time video super-resolution methods have been proposed. While those built on deep learning have shown great potential, their performance still falls short. One major reason is that they heavily rely on two-dimensional (2D) convolutional networks, which restricts their capacity to effectively exploit spatio-temporal information. To address this limitation, we propose a novel generative adversarial network for joint space-time video super-resolution. The novelty of our network is twofold. First, we propose a three-dimensional (3D) attention mechanism instead of traditional two-dimensional attention mechanisms. Our generator uses 3D convolutions associated with the proposed 3D attention mechanism to process temporal and spatial information simultaneously and focus on the most important channel and spatial features. Second, we design two discriminator strategies to enhance the performance of the generator. The discriminative network uses a two-branch structure to handle the intra-frame texture details and inter-frame motion occlusions in parallel, making the generated results more accurate. Experimental results on the Vid4, Vimeo-90K, and REDS datasets demonstrate the effectiveness of the proposed method. The source code is publicly available at https://github.com/FCongRui/3DAttGan.git.
    Note: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.

  • Title: PU-Mask: 3D Point Cloud Upsampling via an Implicit Virtual Mask
    Authors: Liu, Hao; Yuan, Hui; Hamzaoui, Raouf; Liu, Qi; Li, Shuai
    Abstract: We present PU-Mask, a virtual mask-based network for 3D point cloud upsampling. Unlike existing upsampling methods, which treat point cloud upsampling as an “unconstrained generative” problem, we propose to address it from the perspective of “local filling”, i.e., we assume that the sparse input point cloud (i.e., the unmasked point set) is obtained by locally masking the original dense point cloud with virtual masks. Therefore, given the unmasked point set and virtual masks, our goal is to fill the point set hidden by the virtual masks. Specifically, because the masks do not actually exist, we first locate and form each virtual mask by a virtual mask generation module. Then, we propose a mask-guided transformer-style asymmetric autoencoder (MTAA) to restore the upsampled features. Moreover, we introduce a second-order unfolding attention mechanism to enhance the interaction between the feature channels of MTAA. Next, we generate a coarse upsampled point cloud using a pooling technique that is specific to the virtual masks. Finally, we design a learnable pseudo Laplacian operator to calibrate the coarse upsampled point cloud and generate a refined upsampled point cloud. Extensive experiments demonstrate that PU-Mask is superior to the state-of-the-art methods. Our code will be made available at: https://github.com/liuhaoyun/PU-Mask
    Note: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.

  • Title: Enhancing Context Models for Point Cloud Geometry Compression with Context Feature Residuals and Multi-Loss
    Authors: Sun, Chang; Yuan, Hui; Li, Shuai; Lu, Xin; Hamzaoui, Raouf
    Abstract: In point cloud geometry compression, context models usually use the one-hot encoding of node occupancy as the label, and the cross-entropy between the one-hot encoding and the probability distribution predicted by the context model as the loss function. However, this approach has two main weaknesses. First, the differences between contexts of different nodes are not significant, making it difficult for the context model to accurately predict the probability distribution of node occupancy. Second, as the one-hot encoding is not the actual probability distribution of node occupancy, the cross-entropy loss function is inaccurate. To address these problems, we propose a general structure that can enhance existing context models. We introduce the context feature residuals into the context model to amplify the differences between contexts. We also add a multi-layer perceptron branch that uses the mean squared error between its output and node occupancy as a loss function to provide accurate gradients in backpropagation. We validate our method by showing that it can improve the performance of an octree-based model (OctAttention) and a voxel-based model (VoxelDNN) on the object point cloud datasets MPEG 8i and MVUB, as well as the LiDAR point cloud dataset SemanticKITTI.
    Note: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.


Key research outputs

  • H. Liu, H. Yuan, J. Hou, R. Hamzaoui, W. Gao, PUFA-GAN: A Frequency-Aware Generative Adversarial Network for 3D Point Cloud Upsampling, IEEE Transactions on Image Processing, vol. 31, pp. 7389-7402, 2022, doi: 10.1109/TIP.2022.3222918.

  • Q. Liu, H. Yuan, J. Hou, R. Hamzaoui, H. Su, Model-based joint bit allocation between geometry and color for video-based 3D point cloud compression, IEEE Transactions on Multimedia, vol. 23, pp. 3278-3291, 2021, doi: 10.1109/TMM.2020.3023294.

  • Ahmad, S., Hamzaoui, R., Al-Akaidi, M., Adaptive unicast video streaming with rateless codes and feedback, IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, pp. 275-285, Feb. 2010.
  • Röder, M., Cardinal, J., Hamzaoui, R., Efficient rate-distortion optimized media streaming for tree-structured packet dependencies, IEEE Transactions on Multimedia, vol. 9, pp. 1259-1272, Oct. 2007.  
  • Röder, M., Hamzaoui, R., Fast tree-trellis list Viterbi decoding, IEEE Transactions on Communications, vol. 54, pp. 453-461, March 2006.
  • Röder, M., Cardinal, J., Hamzaoui, R., Branch and bound algorithms for rate-distortion optimized media streaming, IEEE Transactions on Multimedia, vol. 8, pp. 170-178, Feb. 2006.
  • Stankovic, V., Hamzaoui, R., Xiong, Z., Real-time error protection of embedded codes for packet erasure and fading channels, IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, pp. 1064-1072, Aug. 2004.
  • Stankovic, V., Hamzaoui, R., Saupe, D., Fast algorithm for rate-based optimal error protection of embedded codes, IEEE Transactions on Communications, vol. 51, pp. 1788-1795, Nov. 2003.
  • Hamzaoui, R., Saupe, D., Combining fractal image compression and vector quantization, IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 197-208, 2000.
  • Hamzaoui, R., Fast iterative methods for fractal image compression, Journal of Mathematical Imaging and Vision, vol. 11, no. 2, pp. 147-159, 1999.

Research interests/expertise

  • Image and Video Compression
  • Multimedia Communication
  • Error Control Systems
  • Image and Signal Processing
  • Machine Learning
  • Pattern Recognition
  • Algorithms

Areas of teaching

Signal Processing

Image Processing

Data Communication

Media Technology

Qualifications

Master’s in Mathematics (Faculty of Sciences of Tunis), 1986

MSc in Mathematics (University of Montreal), 1993

Dr.rer.nat. (University of Freiburg), 1997

Habilitation in Computer Science (University of Konstanz), 2004

Courses taught

Digital Signal Processing

Mobile Communication 

Communication Networks

Signal Processing

Multimedia Communication

Digital Image Processing

Mobile Wireless Communication

Research Methods

Pattern Recognition

Error Correcting Codes

Honours and awards

Outstanding Associate Editor Award, IEEE Transactions on Multimedia, 2020

Certificate of Merit for outstanding editorial board service, IEEE Transactions on Multimedia, 2018

Best Associate Editor award, IEEE Transactions on Circuits and Systems for Video Technology, 2014

Best Associate Editor award, IEEE Transactions on Circuits and Systems for Video Technology, 2012

Membership of professional associations and societies

IEEE Senior Member

IEEE Signal Processing Society

IEEE Multimedia Communications Technical Committee 

British Standards Institution (BSI) IST/37 committee

Current research students

Sergun Özmen, part-time PhD student since July 2019

 

 

Professional esteem indicators

Guest Editor, Electronics Letters, 2024.

Guest Editor, IEEE Open Journal of Circuits and Systems, Special Section on IEEE ICME 2020.

Guest Editor, IEEE Transactions on Multimedia, Special Issue on Hybrid Human-Artificial Intelligence for Multimedia Computing.

Editorial Board Member, Frontiers in Signal Processing (2021-)

Editorial Board Member, IEEE Transactions on Multimedia (2017-2021)

Editorial Board Member, IEEE Transactions on Circuits and Systems for Video Technology (2010-2016)

Co-organiser, Special Session on 3D Point Cloud Acquisition, Processing and Communication (3DPC-APC), 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP), December 13–16, 2022, Suzhou, China.

Co-organiser, 1st International Workshop on Advances in Point Cloud Compression, Processing and Analysis, at ACM Multimedia 2022, Lisbon, Oct. 2022.

Area Chair, IEEE International Conference on Image Processing (ICIP) 2024, Abu Dhabi, Oct. 2024.

Area Chair, IEEE International Conference on Multimedia and Expo (ICME) 2024, Niagara Falls, July 2024.

Area Chair, IEEE International Conference on Image Processing (ICIP) 2023, Kuala Lumpur, Oct. 2023.

Area Chair, IEEE International Conference on Multimedia and Expo (ICME) 2023, Brisbane, July 2023.

Area Chair, IEEE International Conference on Image Processing (ICIP) 2022, Bordeaux, October 2022.

Area Chair for Multimedia Communications, Networking and Mobility, IEEE International Conference on Multimedia and Expo (ICME) 2022, Taipei, July 2022.

Area Chair, IEEE ICIP 2021, Anchorage, September 2021

Area Chair for Multimedia Communications, Networking and Mobility, IEEE ICME 2021, Shenzhen, July 2021

Workshops Co-Chair, IEEE ICME 2020, London, July 2020.

Technical Program Committee Co-Chair, IEEE MMSP 2017, London-Luton, Oct. 2017.