Recent Research

Current Focus

Post-Training SLMs
- Supervised fine-tuning (SFT) and alignment (RLHF/DPO/GRPO) of small language models (SLMs) for specialized enterprise use cases.
- Evaluating agent reliability: consistency, robustness, predictability, and safety.
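
As a concrete reference point for the preference-based alignment methods listed above, the following is a minimal sketch of the standard DPO loss for a single preference pair. It assumes the summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model are already available; the function names and the `beta=0.1` default are illustrative, not taken from any specific pipeline.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative default, an assumption
) -> float:
    """DPO loss for one (chosen, rejected) pair of responses."""
    # How much more likely each response is under the policy
    # than under the frozen reference model (in log space).
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # The loss falls as the policy favors the chosen response
    # more strongly than the reference model does.
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))
```

When the policy matches the reference model exactly, both margins are zero and the loss equals log 2; shifting probability toward the chosen response lowers it.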

Reasoning Systems
Developing and testing experimental reasoning frameworks that make agents more reliable and predictable in complex enterprise environments, and integrating them with internal SLM-driven question-answering systems.

Inference Optimization
Evaluating how well chain-of-thought (CoT) methods and self-correction loops improve logical consistency in production-scale agents.

JAX-based Architectures
- Implementing experimental GNN and Graph Transformer layers in JAX to study scalable alignment techniques and to measure performance gains from JAX's functional programming model relative to existing PyTorch implementations.
- Transitioning experimental research pipelines to JAX to leverage XLA (Accelerated Linear Algebra) for improved hardware utilization.
- Exploring JAX for gradient-based optimization in large-scale Graph Transformer research.
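
To illustrate the functional style these items refer to, here is a minimal sketch of a mean-aggregation message-passing layer in JAX, compiled through XLA with `jax.jit` and differentiated with `jax.grad`. The layer design, parameter names, and toy graph are assumptions made for illustration; they do not represent the experimental layers themselves.

```python
import jax
import jax.numpy as jnp


def init_params(key, in_dim, out_dim):
    """Create weights for the self and neighbor transforms."""
    k_self, k_nbr = jax.random.split(key)
    scale = 1.0 / jnp.sqrt(in_dim)
    return {
        "w_self": jax.random.normal(k_self, (in_dim, out_dim)) * scale,
        "w_nbr": jax.random.normal(k_nbr, (in_dim, out_dim)) * scale,
    }


@jax.jit  # traced once per input shape and compiled by XLA
def gnn_layer(params, x, adj):
    """One message-passing step with mean aggregation over neighbors."""
    # Clamp the degree so isolated nodes do not divide by zero.
    deg = jnp.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    nbr_mean = (adj @ x) / deg
    return jax.nn.relu(x @ params["w_self"] + nbr_mean @ params["w_nbr"])


def loss_fn(params, x, adj, target):
    """Mean squared error between layer output and a target."""
    return jnp.mean((gnn_layer(params, x, adj) - target) ** 2)


# Gradients with respect to the parameter dict (a pytree).
grad_fn = jax.jit(jax.grad(loss_fn))

# Toy path graph with 3 nodes and 4 features per node.
params = init_params(jax.random.PRNGKey(0), in_dim=4, out_dim=2)
x = jnp.ones((3, 4))
adj = jnp.array([[0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0]])
out = gnn_layer(params, x, adj)
grads = grad_fn(params, x, adj, jnp.zeros((3, 2)))
```

Because parameters are passed explicitly rather than stored on a module object, the same pure function can be jitted, differentiated, or vectorized without modification.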

Interpretability
Using information theory and statistical evaluation to interpret model behavior in uncertain environments.
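
Two basic information-theoretic quantities that such analyses commonly build on are the Shannon entropy of a predictive distribution (how uncertain the model is) and the KL divergence between two distributions (how far one model's predictions drift from another's). The sketch below is a generic illustration; the function names are assumptions and are not tied to any particular evaluation suite.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    # Terms with p == 0 contribute nothing by convention.
    return -sum(p * math.log2(p) for p in probs if p > 0)


def kl_divergence(p, q):
    """KL(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

A uniform distribution over two outcomes has an entropy of exactly 1 bit, a one-hot distribution has an entropy of 0, and the KL divergence of a distribution from itself is 0.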