Inferno is a compiler that translates standard torch.nn.Module objects into highly optimized, low-level CUDA kernels. By applying graph-level optimizations and a Just-in-Time (JIT) compilation backend, it strips away framework abstractions and Python overhead to significantly improve performance.
Inferyx is a no-BS simulation of a real-world AI inference system, built from the ground up to mirror the insane complexity and pressure of production-scale ML deployment. From batching and retry queues to observability and dynamic worker pools, Inferyx doesn’t play around.
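As a rough illustration of how batching and a retry queue can fit together, here is a minimal Python sketch. It is not Inferyx code: `run_batches`, `flaky_handler`, and the parameters `batch_size` and `max_retries` are hypothetical names chosen for this example, and the sketch assumes that a failed batch is retried as a whole.

```python
from collections import deque

def run_batches(requests, handler, batch_size=4, max_retries=2):
    """Process requests in batches; items from a failed batch are re-queued.

    Hypothetical helper: `handler` takes a list of requests and returns a
    list of results, or raises to signal that the whole batch failed.
    """
    queue = deque((req, 0) for req in requests)  # (request, attempts so far)
    done, dead = [], []
    while queue:
        size = min(batch_size, len(queue))
        batch = [queue.popleft() for _ in range(size)]
        try:
            done.extend(handler([req for req, _ in batch]))
        except Exception:
            # Retry each item until it runs out of attempts, then give up on it.
            for req, attempts in batch:
                if attempts < max_retries:
                    queue.append((req, attempts + 1))
                else:
                    dead.append(req)
    return done, dead

# Demo handler that fails the first time it sees request "b".
seen = set()

def flaky_handler(batch):
    for r in batch:
        if r == "b" and r not in seen:
            seen.add(r)
            raise RuntimeError("transient failure")
    return [r.upper() for r in batch]

done, dead = run_batches(["a", "b", "c", "d", "e"], flaky_handler, batch_size=2)
```

Retrying the whole batch keeps the sketch short. A production system would more likely track failures per request and add backoff between attempts.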
A hand-written, register-tiled quantization engine built from scratch in C++, CUDA, and Python. It outperforms industry-standard libraries such as bitsandbytes on consumer hardware through static calibration and hybrid precision.
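To make the idea of static calibration concrete, the sketch below shows symmetric per-tensor int8 quantization in plain Python, where the scale is fixed once from calibration data instead of being recomputed per input. It is an assumption-laden illustration, not the engine's implementation: the function names and the sample values are hypothetical, and hybrid precision and the CUDA kernels are not covered.

```python
# Illustrative sketch only: symmetric int8 quantization with a scale fixed
# ahead of time from calibration data. All names here are hypothetical.

def calibrate_scale(calibration_batches, num_bits=8):
    """Derive one fixed scale from the largest magnitude seen in calibration."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(x) for batch in calibration_batches for x in batch)
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Map floats to integers, clamping anything outside the calibrated range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def dequantize(qvalues, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in qvalues]

calibration_batches = [[0.5, -1.27, 0.3], [1.0, -0.2, 0.9]]
scale = calibrate_scale(calibration_batches)   # fixed before inference
q = quantize([0.64, -1.27, 2.0], scale)        # 2.0 lies outside the range
restored = dequantize(q, scale)
```

Because the scale is fixed, values outside the calibrated range (here 2.0) saturate at the int8 limit. That is the usual trade-off of static calibration against dynamic, per-input scaling.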
This project is a financial fraud detection system built with Graph Neural Networks (GNNs). It combines graph-structured data with spectral subgraph sampling and task-specific weight sharing to improve precision and overall performance. The solution is containerized and deployed as an API.
An AI-powered legal research agent designed to change the way legal professionals access and analyze information. Using Retrieval-Augmented Generation (RAG) techniques, it helps users navigate complex legal documents efficiently. It includes a custom retriever fine-tuned on a domain-specific dataset, with query routing built on LlamaIndex.
A fine-tuned DeBERTa model for real-time sentiment analysis of financial news and of trends over the past month, intended to help investors and other stakeholders. A Spring Boot backend provides secure APIs, caching, and data persistence. Deployed for production use, it gives financial decision-makers quick insights.
Leveraging deep learning techniques, this project transforms black-and-white SAR images into vibrant colorized versions. Using a custom model trained on the Lab color space, it predicts color distributions for each pixel, producing realistic and visually appealing results. Explore the process, challenges, and outcomes of bringing historical photos and monochromatic images to life with advanced neural networks.