Faculty Research, Scholarly, and Creative Activity

Rosetta-XAI: An Automated Evaluation and Explainability Framework for Code Translation Models

Publication Date

4-1-2026

Document Type

Article

Publication Title

Software Impacts

Volume

DOI

10.1016/j.simpa.2026.100811

Abstract

This paper presents Rosetta-XAI, a software framework for evaluating and explaining Large Language Model (LLM) behavior in cross-language code conversion tasks. The system implements a four-stage automated pipeline: (1) code generation by LLMs accessed through the Ollama inference API, (2) regex-based extraction of code blocks from markdown responses, (3) language-specific syntax and compilation validation with temporary artifact management, and (4) execution with timeout protections and CSV-based checkpoint recovery. The framework supports evaluation of 15 specialized code LLMs (1.3B–34B parameters), including DeepSeek Coder, Code Llama, CodeGemma, and Granite Code, across 17 Rosetta Code programming tasks, generating 42 directed conversion pairs (each pairing translated in both directions) among seven languages (C, C++, Go, Java, JavaScript, Python, Rust). Beyond traditional pass@1 accuracy metrics, the system incorporates explainability analysis through Shapley Value Sampling and Feature Ablation techniques implemented via Captum and PyTorch, enabling researchers to quantify token-level feature importance during translation. All pipeline components include XAI-enhanced variants that support follow-up question analysis for interpretability studies. Built in Python, with pandas for metrics aggregation and subprocess management for multi-language execution, the modular architecture separates extraction, validation, and execution concerns. Results are organized into structured directories tracking accepted code, compilation failures, syntax errors, and execution outputs, with metrics exported to CSV files for reproducible research and comparative model analysis.
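
As a rough illustration of the extraction, execution, and scoring stages described in the abstract, the following minimal sketch may help. It is not the framework's actual code: every name here (`extract_code_blocks`, `run_python_snippet`, `pass_at_1`, `CONVERSION_PAIRS`) is hypothetical, only Python targets are executed, and the compilation, checkpointing, and explainability steps are left out. It also assumes that each run is scored as a simple pass or fail.

```python
import re
import subprocess
import sys
from itertools import permutations

# The seven languages named in the abstract.
LANGUAGES = ["C", "C++", "Go", "Java", "JavaScript", "Python", "Rust"]

# Every ordered (source, target) pair: 7 * 6 = 42 conversion directions.
CONVERSION_PAIRS = list(permutations(LANGUAGES, 2))

# Build the fence marker programmatically to keep this example easy to embed.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"[^\n`]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(markdown: str) -> list[str]:
    """Return the bodies of all fenced code blocks in a markdown response."""
    return [body.strip() for body in FENCE_RE.findall(markdown)]


def run_python_snippet(code: str, timeout_s: float = 5.0) -> tuple[bool, str]:
    """Run a Python snippet in a child process, guarding against hangs.

    Returns (succeeded, stdout); a timeout is reported as (False, "timeout").
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    return proc.returncode == 0, proc.stdout


def pass_at_1(outcomes: list[bool]) -> float:
    """Fraction of single-attempt translations that passed (assumed scoring)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0
```

A caller would pass a model's markdown reply to `extract_code_blocks`, run the first block through `run_python_snippet`, and collect the boolean outcomes per task before calling `pass_at_1`. Other target languages would need their own compile and run commands, which this sketch does not attempt.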

Funding Number

23-RSG-07-077

Keywords

Code translation, Explainable artificial intelligence, Feature ablation, Large language models, Model interpretability, Shapley values

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Department

Applied Data Science

Recommended Citation

Vishnu S. Pendyala and Neha Bais Thakur. "Rosetta-XAI: An Automated Evaluation and Explainability Framework for Code Translation Models" Software Impacts (2026). https://doi.org/10.1016/j.simpa.2026.100811
