In structural health monitoring (SHM) systems, data loss inevitably occurs and reduces the applicability of SHM techniques such as condition assessment and damage identification. The current mainstream data-driven method, the generative adversarial network (GAN), suffers from convergence difficulty, which limits the accuracy and efficiency of response reconstruction. In this study, a conditional diffusion model with data fusion (DF-CDM) is proposed for structural dynamic response reconstruction. The original unsupervised diffusion model is improved by introducing a conditional input and modifying the deep denoising neural network to achieve supervised learning. In addition, data fusion is developed to further utilize frequency-domain information and improve the reconstruction quality of the time-domain signals. The proposed model is validated on a three-span continuous bridge. Results show that the diffusion model with data fusion achieves the highest accuracy on the test set, with R², RMSE, and MAE equaling 0.821, 0.0053, and 0.0042, respectively. Compared with the state-of-the-art GAN model, the diffusion model, which has no adversarial modules, has a more stable training process and better reconstruction performance in both the time and frequency domains. Modal identification and robustness analysis further verify the effectiveness of the proposed model. The proposed diffusion model with data fusion achieves high-quality structural response reconstruction, supporting the applicability and reliability of subsequent response-based structural analysis.
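The three accuracy measures quoted above can be illustrated with a minimal sketch. It assumes the standard textbook definitions of R², RMSE, and MAE; the abstract does not state the exact formulas or any normalization used in the study. The function name `reconstruction_metrics` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def reconstruction_metrics(measured, reconstructed):
    """Return (R2, RMSE, MAE) for a reconstructed response vs. the measured one.

    Assumption: standard definitions are used; the study may normalize
    or aggregate differently.
    """
    if len(measured) != len(reconstructed) or not measured:
        raise ValueError("signals must be non-empty and of equal length")

    n = len(measured)
    residuals = [m - r for m, r in zip(measured, reconstructed)]
    mean_measured = sum(measured) / n

    ss_res = sum(e * e for e in residuals)                    # residual sum of squares
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined for a constant measured signal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in residuals) / n
    return r2, rmse, mae


# Hypothetical toy signals, chosen only so the arithmetic is easy to follow.
measured = [0.0, 1.0, 2.0, 3.0]
reconstructed = [0.0, 1.0, 2.0, 2.0]
r2, rmse, mae = reconstruction_metrics(measured, reconstructed)
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

RMSE and MAE are expressed in the units of the response signal, so they are only comparable across signals of the same scale, whereas R² is dimensionless.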