Mohamed Afham

I'm a graduate student in the Visual Inference Lab at the Technical University of Darmstadt, working with Prof. Stefan Roth under ELLIS. My broader research interest lies at the intersection of Computer Vision and Machine Learning. I was fortunate to intern at FAIR at Meta AI during my graduate studies. My graduate studies are generously supported by the ELIZA Scholarship from the German Academic Exchange Service (DAAD).

Previously, I spent a wonderful year at Meta AI as an AI Resident working on long-form video representation learning. I completed my bachelor's degree at the University of Moratuwa, Sri Lanka, where my thesis was on Learning Representations for 3D Point Cloud Processing, advised by Dr. Ranga Rodrigo. I did a research internship with Prof. Salman Khan at MBZUAI, UAE, during my undergraduate studies.

I'm broadly interested in Computer Vision and Machine Learning, with a focus on the subdomains of Self-Supervised Learning, 3D Vision, and Learning with Limited Labels (few-shot, zero-shot).

Email / CV / Google Scholar / Twitter / LinkedIn / GitHub

News

[Oct 2024]   Joined FAIR at Meta as a Research Scientist Intern.
[Sep 2024]   One paper accepted at ACCV 2024.
[Apr 2024]   One paper accepted at MIDL 2024.
[Sep 2023]   Admitted as a graduate student at the Technical University of Darmstadt in Germany.
[Jul 2023]   One paper accepted at ICCV 2023 Workshops.
[Oct 2022]   Two papers accepted at ECCV 2022 Workshops.
[Jul 2022]   Joined Meta AI in New York City as an AI Resident.
[Mar 2022]   Serving as a reviewer for ECCV 2022, IROS 2022, and the IET Computer Vision journal.
[Mar 2022]   One paper accepted at CVPR 2022.
[Jan 2022]   One paper accepted at ICASSP 2022.
[Nov 2021]   Serving as a reviewer for CVPR 2022.
[Oct 2021]   One paper accepted at BMVC 2021.
[Oct 2021]   Towards Accurate Cross-Domain In-Bed Human Pose Estimation: preprint available on arXiv.
[Sep 2021]   Our team NFP Undercover finished as second runners-up at the IEEE VIP Cup.
[Jun 2021]   Joined VeracityAI as an Associate Machine Learning Engineer.
[Apr 2021]   Rich Semantics Improve Few-Shot Learning: preprint available on arXiv.
[Nov 2020]   Our team Wanderers emerged as IEEE SMC winners of the BR41N.io hackathon.
[Oct 2020]   Joined MBZUAI as a Research Assistant.

Research

I'm fascinated by the computer vision community's progress toward models that see and understand the world as humans do. In particular, I'm intrigued by the results of models learned with self-supervision or in label-constrained settings.

UnCLe SAM: Unleashing SAM’s Potential for Continual Prostate MRI Segmentation
Amin Ranem, Mohamed Afham, Moritz Fuchs, Anirban Mukhopadhyay

MIDL 2024
Paper / Code

Description: Introduced UnCLe SAM, a novel approach leveraging the pre-trained Segment Anything Model (SAM) for continual medical image segmentation, particularly in dynamic environments with sparse data.

Outcome: Demonstrated state-of-the-art performance in continual prostate MRI segmentation tasks, outperforming existing methods like Lifelong nnU-Net and addressing challenges in model rigidity and plasticity.

Feature Generator for Few-Shot Learning
Heethanjan Kanagalingam, Thenukan Pathmanathan, Navaneethan Ketheeswaran, Mokeeshan Vathanakumar, Mohamed Afham, Ranga Rodrigo

ACCV 2024
Paper / Code

Description: Introduced a novel feature generator that synthesizes visual features from class-level textual descriptions to enhance embeddings in few-shot learning tasks. The generator is trained using a combination of classifier loss, discriminator loss, and distance loss to ensure accurate same-class feature generation.

Outcome: Achieved significant improvements over baseline methods, with a 10% increase in accuracy in the 1-shot setting and around 5% in the 5-shot setting on benchmarks like miniImageNet and tieredImageNet. Demonstrated the effectiveness of integrating semantic information into feature generation for few-shot learning.

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding
Mohamed Afham, Isuru Dissanayake, Dinithi Dissanayake, Amaya Dharmasiri, Kanchana Thilakarathna, Ranga Rodrigo

CVPR 2022
Paper / Code / Project Page

Description: Introduced a joint learning objective that combines intra-modal correspondence within the point cloud modality and cross-modal correspondence between the point cloud and 2D image modalities, leveraging contrastive learning.

Outcome: Achieved state-of-the-art performance on downstream tasks such as 3D object classification, few-shot object classification, and 3D object part segmentation, outperforming previous unsupervised learning methods.

Revisiting Kernel Temporal Segmentation as an Adaptive Tokenizer for Long-form Video Understanding
Mohamed Afham, Satya Narayan Shukla, Omid Poursaeed, Pengchuan Zhang, Ashish Shah, Sernam Lim

ICCV 2023, Workshop on Resource Efficient Deep Learning for Computer Vision
Paper

Description: Proposed a task-agnostic, unsupervised, and scalable approach based on Kernel Temporal Segmentation (KTS) for adaptive sampling and tokenization of long videos.

Outcome: Achieved competitive performance on several benchmarks for long video modeling, specifically in tasks such as video classification and temporal action localization.

Visual-Semantic Contrastive Alignment for Few-Shot Image Classification
Mohamed Afham, Ranga Rodrigo

ECCV 2022, Workshop on Computer Vision in the Wild
Paper

Description: Proposed an auxiliary multimodal contrastive learning objective between visual and semantic class prototypes to enhance the visual class-discriminative capability of several few-shot baselines.

Outcome: Outperformed standard meta-learning baselines in few-shot learning by simply plugging in the proposed multimodal contrastive learning objective.

Towards Accurate Cross-Domain In-Bed Human Pose Estimation
Mohamed Afham*, Udith Haputhanthri*, Jathurshan Pradeepkumar*, Mithunjha Anandakumar, Ashwin De Silva, Chamira Edussooriya
(* denotes equal contribution)

ICASSP 2022
Paper / Code

Description: Proposed a novel learning strategy with two-fold data augmentation and self-supervised knowledge distillation to reduce the domain discrepancy between the labeled source domain and the unlabeled target domain.

Outcome: Improved performance on the SLP dataset over two standard pose estimation baselines.

Rich Semantics Improve Few-Shot Learning
Mohamed Afham, Salman Khan, Muhammad Haris Khan, Muzammal Naseer, Fahad Shahbaz Khan

BMVC 2021
Paper / Code / Presentation

Description: Proposed a multi-modal architecture for few-shot learning that leverages class-level descriptions to learn better representations.

Outcome: Improved state-of-the-art performance on CUB, VGG-Flowers, and ShapeWorld, and achieved competitive performance on miniImageNet.

Experience

Meta AI, Montreal, Canada
Research Scientist Intern
Oct 2024 - Mar 2025

Meta AI, New York, USA
AI Resident
Jul 2022 - Jul 2023

VeracityAI, Colombo, Sri Lanka
Associate Machine Learning Engineer
Jun 2021 - Feb 2022

Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Research Assistant
Oct 2020 - Apr 2021
Advisor: Salman Khan

Education


Technical University of Darmstadt, Germany
Master's + PhD in Computer Science
Oct 2023 - Present

University of Moratuwa, Sri Lanka
Bachelor of Science (Engineering), specializing in Electronics and Telecommunication
Aug 2017 - Jul 2022
