Research

My research interests span the following areas.

  • Software engineering (SE), programming languages (PL), and data science.
  • Data science, SE and PL for data-intensive systems, and data-driven SE.
  • Modular reasoning about properties of software.
  • Advanced mechanisms for modularity and separation of concerns, and modular reasoning about such mechanisms.

Selected Books and Publications

  • One of my long-term projects has been to develop a new pedagogy and a textbook for teaching programming languages and functional programming to students who begin Computer Science programs with an imperative language such as Java. This textbook appeared as: Hridesh Rajan, An Experiential Introduction to Principles of Programming Languages, MIT Press, Cambridge, MA, 304 pages, May 2022.

  • Hridesh Rajan and Gary T. Leavens, “Ptolemy: A Language with Quantified, Typed Events,” ECOOP ’08: 22nd European Conference on Object-Oriented Programming, July, 2008. This work addressed the debate on whether crosscutting concerns can be separated while preserving modular reasoning.

  • Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen, “Boa: A Language and Infrastructure for Analyzing Ultra-Large-Scale Software Repositories,” 35th International Conference on Software Engineering, May, 2013. Boa was the first cyberinfrastructure for big data-driven discovery in software engineering.

  • Md Johirul Islam, Giang Nguyen, Rangeet Pan, and Hridesh Rajan, “A Comprehensive Study on Deep Learning Bug Characteristics,” ESEC/FSE’19: The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, August, 2019. This work provided the first rigorous taxonomy of defects and repairs in neural-network code.

  • Mohammad Wardat, Wei Le, and Hridesh Rajan, “DeepLocalize: Fault Localization for Deep Neural Networks,” ICSE’21: The 43rd International Conference on Software Engineering, May, 2021. DeepLocalize was the first approach for bug localization in deep learning models.

  • Rangeet Pan and Hridesh Rajan, “On Decomposing a Deep Neural Network into Modules,” ESEC/FSE’20: The 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, November, 2020. This was the first work on decomposition and modularity of deep neural networks, and it started the sub-field.

More…

Current Research Agenda

I study the modularity of AI-enabled systems, with a particular focus on deep learning. Training, updating, and repurposing large models often consume substantial financial and environmental resources. My approach is to decompose deep neural networks into well-defined modules so that specific components can be isolated, reused, or replaced without retraining the entire model. The goal is sustainable, cost-aware AI development where improvement and adaptation are local, not global.
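
To convey the idea concretely, here is a toy numpy sketch on a two-layer classifier: each module keeps the shared hidden layer plus the output weights of one class, and a subset of modules can be recomposed into a smaller model. This is only an illustration with stand-in weights; the decomposition in our papers goes further, also pruning hidden-layer edges irrelevant to each class.

    import numpy as np

    # A tiny "trained" classifier: one hidden layer, 3 output classes.
    # (Random weights stand in for a trained network.)
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # input -> hidden
    W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)   # hidden -> classes

    def relu(x):
        return np.maximum(x, 0.0)

    def module_for_class(c):
        """Slice out a one-vs-rest module for class c: the shared hidden
        layer plus only the output weights that feed class c."""
        w_c, b_c = W2[:, c], b2[c]
        def module(x):
            return relu(x @ W1 + b1) @ w_c + b_c    # class-c logit
        return module

    # Recompose selected modules into a smaller model, e.g. classes 0 and 2,
    # without any retraining.
    modules = [module_for_class(c) for c in (0, 2)]
    x = rng.normal(size=(1, 4))
    logits = np.stack([m(x) for m in modules], axis=-1)
    print(logits.argmax(axis=-1))   # prediction among the kept classes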

My broader agenda is to enhance programmer productivity and improve the reliability of the systems we build. I design programming abstractions that reduce error-prone tasks, strengthen modular structure, and support modular reasoning. With these abstractions in place, compilers and frameworks can implement complex concerns automatically, lowering defect rates and accelerating development. By improving modularity and the structure of reasoning, we can increase the scalability of analysis and verification for both human review and automated tooling.

We are recruiting undergraduate students, graduate students, postdoctoral fellows, and in some cases research scientists for the following projects.

  • Boa Infrastructure: Starting with our ICSE 2013 paper, entitled Boa: A Language and Infrastructure for Analyzing Ultra-Large-Scale Software Repositories, which introduced Boa, the first cyberinfrastructure for big data-driven discovery in software engineering, we have led the democratization of big science in this area. We are looking into expanding the infrastructure, connecting Boa more explicitly with LLMs, automating the generation of Boa-like infrastructure for other domains, and reducing the cost of running Boa queries so that they can be executed on smaller clusters. (A sketch of Boa’s query model appears after this list.)

  • Separate and Independent Testing of Data and AI Models: Our ICSE 2025 paper, entitled Mock Deep Testing: Toward Separate Development of Data and Models for Deep Learning, introduced our methodology of mock deep testing for unit testing deep learning applications. We are looking into extending this work to more kinds of deep learning models and more kinds of data. (A sketch of the underlying mocking idea appears after this list.)

  • Design by Contract for AI Systems: Our ICSE 2024 paper, Inferring Data Preconditions from Deep Learning Models for Trustworthy Prediction in Deployment, presented an approach for inferring data preconditions. More work is needed to extend this approach to other model architectures and other properties. (A sketch of precondition checking appears after this list.)

  • Beyond the three projects above, we are always happy to have a conversation about follow-up work building on any of our papers. For a complete list of our papers, please visit my lab’s webpages.
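
For readers unfamiliar with Boa’s programming model: a query declares an output aggregator and emits values to it once per project, and the runtime parallelizes execution over the full dataset. The plain-Python sketch below is an analogue of the classic projects-per-language query, not Boa syntax; the record fields and sample data are made up for illustration.

    from collections import Counter

    # Toy stand-in for Boa's per-project metadata records
    # (field names illustrative; Boa's actual schema differs).
    projects = [
        {"name": "a", "languages": ["Java", "XML"]},
        {"name": "b", "languages": ["Python"]},
        {"name": "c", "languages": ["Java"]},
    ]

    # In Boa, the per-project loop is implicit and the sum aggregator is
    # declared as an output; the runtime distributes it across a cluster.
    counts = Counter()
    for p in projects:
        for lang in p["languages"]:
            counts[lang] += 1

    print(counts)   # per-language project counts, e.g. Java -> 2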
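
As a generic illustration of the principle behind mock deep testing (not the paper’s technique itself), the sketch below unit-tests a data-pipeline stage against a mocked model that stands in for any trained network; the function names are hypothetical.

    import numpy as np
    from unittest.mock import MagicMock

    def preprocess(batch):
        """Data-pipeline stage under test: scale pixel values to [0, 1]."""
        return np.asarray(batch, dtype=np.float32) / 255.0

    def predict(batch, model):
        """Inference stage: depends on a (possibly expensive) trained model."""
        return model.predict(preprocess(batch))

    def test_predict_uses_preprocessed_batch():
        # Mock the model so the data path is tested without training
        # or loading any real network.
        model = MagicMock()
        model.predict.return_value = np.zeros((1, 10))

        out = predict([[0, 255]], model)

        (passed_batch,), _ = model.predict.call_args
        assert passed_batch.max() <= 1.0   # preprocessing was applied
        assert out.shape == (1, 10)        # pipeline contract holds

    test_predict_uses_preprocessed_batch()
    print("ok")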
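
To give the flavor of such contracts, here is a deliberately simplified sketch in which the precondition is a per-feature training-range interval; the class name is hypothetical, and the approach in the paper derives its conditions from the trained model itself rather than from raw feature ranges.

    import numpy as np

    class PreconditionChecker:
        """Guard a deployed model with a data precondition: here, that
        every feature of an input lies within its training range."""

        def fit(self, X_train):
            self.lo = X_train.min(axis=0)
            self.hi = X_train.max(axis=0)
            return self

        def check(self, x):
            # True when the model is being used inside its tested envelope.
            return bool(np.all(x >= self.lo) and np.all(x <= self.hi))

    rng = np.random.default_rng(1)
    X_train = rng.normal(size=(100, 3))
    guard = PreconditionChecker().fit(X_train)

    in_range = X_train[0]
    out_of_range = in_range + 100.0
    print(guard.check(in_range))      # True: trust the prediction
    print(guard.check(out_of_range))  # False: flag the input instead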

Contact Me

You can contact me using either of the e-mail addresses below. When writing, please substitute firstname with hridesh.