Projects
Needle: A PyTorch-like training framework
Built a deep learning framework in C++ and Python supporting tensor operations, including convolution, broadcasting, and transpose, for training neural networks. Accelerated training on NVIDIA GPUs with custom CUDA kernels for reduction, matrix multiplication, and other performance-critical operations. Designed computation-graph execution with backpropagation via topological sort, implementing efficient forward and backward passes for each operator. Extended the framework with gradient checkpointing and FP16 mixed-precision training, significantly reducing memory usage.
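The backpropagation-via-topological-sort design above can be sketched as a minimal scalar autograd; the `Value` class and helper names are illustrative stand-ins, not Needle's actual API:

```python
# Minimal sketch of reverse-mode autodiff driven by a topological sort.
# Names (Value, add, mul, backward) are hypothetical, not the project's API.

class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.parents = parents      # nodes this value was computed from
        self.backward_fn = None     # propagates this node's grad to parents
        self.grad = 0.0

def add(a, b):
    out = Value(a.data + b.data, (a, b))
    def bw(g):
        a.grad += g
        b.grad += g
    out.backward_fn = bw
    return out

def mul(a, b):
    out = Value(a.data * b.data, (a, b))
    def bw(g):
        a.grad += g * b.data
        b.grad += g * a.data
    out.backward_fn = bw
    return out

def backward(root):
    # Topologically order the graph, then apply the chain rule in reverse.
    order, seen = [], set()
    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for p in v.parents:
                visit(p)
            order.append(v)
    visit(root)
    root.grad = 1.0
    for v in reversed(order):
        if v.backward_fn is not None:
            v.backward_fn(v.grad)
```

For example, `x = Value(3.0); y = mul(x, x); backward(y)` leaves `x.grad == 6.0`, matching d(x²)/dx at x = 3.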
Synthetic Data Generation for Off-Policy Preference Optimization
Generated a synthetic preference dataset using a model pool for fine-tuning vision language models to improve abductive reasoning capabilities. Used CLIP for scoring model generations to determine preference ordering, then fine-tuned PaliGemma-3B on the preference data using Direct Preference Optimization. [Dataset] [Code]
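The Direct Preference Optimization objective used for fine-tuning can be written out for a single preference pair; the log-probabilities and `beta` below are placeholders, not values from the project:

```python
import math

# Hedged sketch of the per-pair DPO loss:
#   L = -log sigmoid( beta * [ (log pi(y_w) - log pi_ref(y_w))
#                            - (log pi(y_l) - log pi_ref(y_l)) ] )
# All arguments are illustrative scalars, not outputs of the actual models.
def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Implicit reward margin: policy log-ratio minus reference log-ratio.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

When the policy matches the reference model the margin is zero and the loss is log 2; raising the chosen response's likelihood relative to the reference drives the loss below that.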
MinBERT
Implemented the BERT architecture from scratch, including positional embeddings and multi-head attention mechanisms, and trained it for sentiment classification tasks. Built a custom AdamW optimizer to enable efficient training and fine-tuning of the language model using PyTorch and Hugging Face libraries. [Code]
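The custom AdamW update can be sketched for a single scalar parameter; hyperparameter names follow the AdamW paper and are not necessarily those used in the project's code:

```python
import math

# Minimal single-parameter AdamW step with decoupled weight decay (a sketch,
# not the project's implementation).
def adamw_step(p, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
               eps=1e-8, wd=0.01):
    m = b1 * m + (1 - b1) * grad          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias correction for step t
    v_hat = v / (1 - b2 ** t)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)  # Adam update
    p = p - lr * wd * p                   # decoupled weight decay
    return p, m, v
```

The key AdamW detail is the last line: weight decay is applied directly to the parameter rather than folded into the gradient, which is what distinguishes it from Adam with L2 regularization.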
Hybrid Machine Translation
Built a machine translation model that combines the strengths of statistical and neural MT. The neural sequence-to-sequence architecture is augmented with a phrase table constructed via phrase-based SMT; unknown tokens in the generated output are replaced with translations of the aligned source phrases. [Github]
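The unknown-token replacement step can be sketched as a simple post-edit; the alignment dictionary and phrase table here are toy stand-ins for the SMT-derived ones:

```python
# Illustrative sketch of the hybrid MT post-edit: replace <unk> tokens from
# the neural decoder with phrase-table translations of the aligned source
# words. Alignment maps output positions to source positions (toy example).
def replace_unknowns(output_tokens, source_tokens, alignment, phrase_table):
    result = []
    for i, tok in enumerate(output_tokens):
        if tok == "<unk>" and i in alignment:
            src = source_tokens[alignment[i]]
            # Fall back to copying the source word if the table has no entry.
            result.append(phrase_table.get(src, src))
        else:
            result.append(tok)
    return result
```

For instance, with output `["the", "<unk>", "is", "red"]` aligned so position 1 points at source word `"Haus"`, a table entry `{"Haus": "house"}` yields `["the", "house", "is", "red"]`.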