Professional Summary

Wassim Swaileh, PhD

AI Engineer -- Robotics, Autonomous Systems & Multimodal AI

About Me

AI Engineer with 8+ years of post-PhD experience building production AI models and training pipelines for multimodal and autonomous systems. Expertise in VLMs, vision-language-action models, 3D scene reconstruction, and large-scale ML infrastructure.

I currently work at Huawei Suomi Finland Research Center as a Senior Researcher in Multimodal AI, where I:

  • Build and ship production-scale video generation models using VLMs and diffusion architectures
  • Design end-to-end training pipelines with containerized environments on the ModelArts MLOps platform
  • Co-developed ReWind (CVPR 2025), a large language model for long video understanding
  • Spearhead OCR capabilities, deploying multimodal perception models across multiple languages

My research spans:

  • Video Understanding & Generation
  • Vision-Language Models (LLaVA, CLIP, QwenVL)
  • Vision-Language-Action (VLA) Models
  • Generative Diffusion Models
  • 4D LiDAR Generation & Autonomous Systems
  • 3D Scene Reconstruction
  • Kinematic Time Series Analysis
  • Document AI & OCR

Core Expertise

Video AI

Video understanding, generation, and long video analysis with memory models

Multimodal AI

VLMs, VLA models, and multimodal perception systems

Autonomous Systems

4D LiDAR generation, sensor fusion, and 3D reconstruction

Career Highlights

2023 - Present
Senior Researcher - Multimodal AI
Huawei Suomi Finland Research Center, Helsinki

Building production-scale video generation models and co-developing ReWind (CVPR 2025) for long video understanding.

2021 - 2023
R&D Engineer - Robotics & Sensor AI
IRISA Lab UMR 6074, Rennes, France

Deep learning pipelines for kinematic time series analysis and trajectory reconstruction using TCNs.

2019 - 2021
Lecturer & Researcher - Deep Learning & 3D AI
ETIS Lab UMR 8051, Cergy Paris University

UNet-based architectures for visual perception and 3D modeling of ancient maps.

2018 - 2019
R&D Engineer - Document AI
LITIS Lab EA 4108, Rouen, France

Production DNN pipelines for named entity extraction from historical documents (EURHISFIRM project).

Key Achievements

CVPR 2025 Publication
Tier-1 Conference

Co-developed ReWind, a large language model for long video understanding with instructed learnable memory.

Patent Submitted
AI-Journalist

Apparatus to Generate Media Contents Conditioned to User Preferences Settings.

Get In Touch

Interested in collaborating or learning more about my work?

Contact Me