Publications

(2025). TB-Bench: Training and Testing Multi-Modal AI for Understanding Spatio-Temporal Traffic Behaviors from Dashcam Images/Videos. CVPR 2025 Workshop on Autonomous Driving.
(2025). Multimodal artificial intelligence approaches using large language models for expert-level landslide image analysis. Computer-Aided Civil and Infrastructure Engineering (CACIE), 2025.
(2025). Critical Scenario Prediction Planning and Reasoning. In submission to IEEE Transactions on Intelligent Vehicles (TIV), 2025.
(2024). KTVIC: A Vietnamese Image Captioning Dataset on the Life Domain. arXiv preprint, 2024.
(2024). Exploring the Potential of Multi-Modal AI for Driving Hazard Prediction. IEEE Transactions on Intelligent Vehicles (TIV), 2024.
(2023). Visual Abductive Reasoning Meets Driving Hazard Prediction: Problem Formulation and Dataset. arXiv preprint, 2023.
(2023). Leveraging Video Coding Knowledge for Deep Video Enhancement. arXiv preprint, 2023.
(2022). GRIT: Faster and Better Image Captioning Transformer Using Dual Visual Features. European Conference on Computer Vision (ECCV) 2022.
(2021). Look Wide and Interpret Twice: Improving Performance on Interactive Instruction-following Tasks. International Joint Conference on Artificial Intelligence (IJCAI) 2021.
(2020). Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs. European Conference on Computer Vision (ECCV) 2020.