Author

Johny Xiongz

Publication Date

Fall 2025

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

Amith Kamath Belman

Second Advisor

William Andreopoulos

Third Advisor

Shantanu Deshpande

Keywords

Machine Learning, Convolutional Layers, Multimodal, Transformer-based encoders

Abstract

This paper presents a Dual-Stream Transformer-based architecture for multimodal user verification, leveraging both keyboard and mouse dynamics to capture complementary behavioral patterns. Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and other sequential models have shown success in modeling sequential relationships; however, they focus mainly on short-term patterns and can struggle to capture long-range dependencies. The proposed architecture employs two parallel Transformer-based encoders, each dedicated to one behavioral modality. Both streams integrate temporal convolutional layers for local feature extraction and self-attention mechanisms for modeling global temporal dependencies, allowing the system to learn both subtle and high-level behavioral representations. We introduce two implementations of the Dual-Stream architecture. The first uses late fusion, allowing the model to learn from each modality independently, while the second introduces early fusion through a dot-product fusion mechanism that lets the two streams learn from each other. Experimental results show that the late-fusion architecture effectively distinguishes genuine users from impostors, while the early-fusion variant remains under-optimized. This research highlights the potential of Transformer-based multimodal fusion as a solution for continuous and unobtrusive user authentication.
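To make the described architecture concrete, the following is a minimal PyTorch sketch of one possible reading of the dual-stream design: each stream applies a temporal convolution for local features followed by Transformer self-attention for global dependencies, and the two pooled embeddings are combined either by late fusion (concatenation) or by an element-wise product standing in for the dot-product early fusion. All layer sizes, the pooling choice, and the classifier head are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): one plausible dual-stream
# Transformer for keyboard + mouse dynamics. Dimensions are assumptions.
import torch
import torch.nn as nn

class StreamEncoder(nn.Module):
    """One behavioral stream: temporal convolution for local features,
    then Transformer self-attention for global temporal dependencies."""
    def __init__(self, feat_dim, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        # 1-D convolution over time captures short-range (local) patterns.
        self.conv = nn.Conv1d(feat_dim, d_model, kernel_size=5, padding=2)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x):  # x: (batch, time, feat_dim)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, time, d_model)
        h = self.encoder(h)                               # global dependencies
        return h.mean(dim=1)                              # pooled stream embedding

class DualStreamVerifier(nn.Module):
    """Keyboard and mouse streams encoded in parallel, then fused."""
    def __init__(self, kb_dim, ms_dim, d_model=64, fusion="late"):
        super().__init__()
        self.kb = StreamEncoder(kb_dim, d_model)
        self.ms = StreamEncoder(ms_dim, d_model)
        self.fusion = fusion
        in_dim = 2 * d_model if fusion == "late" else d_model
        self.head = nn.Linear(in_dim, 1)  # genuine-vs-impostor score

    def forward(self, kb_seq, ms_seq):
        zk, zm = self.kb(kb_seq), self.ms(ms_seq)
        if self.fusion == "late":
            # Late fusion: streams learned independently, joined at the end.
            z = torch.cat([zk, zm], dim=-1)
        else:
            # Early-fusion stand-in: element-wise product of the two
            # embeddings (the paper's dot-product mechanism is not
            # specified here, so this is only an assumption).
            z = zk * zm
        return torch.sigmoid(self.head(z))

# Toy usage: 50 time steps of 5 keyboard features and 8 mouse features.
model = DualStreamVerifier(kb_dim=5, ms_dim=8)
score = model(torch.randn(2, 50, 5), torch.randn(2, 50, 8))
print(score.shape)  # torch.Size([2, 1])
```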

Available for download on Saturday, December 19, 2026
