Tarun Reddi

Building big functions [neural nets] from scratch, or orchestrating big and small ones [agents]. Obsessed with making them more human-like… constantly experimenting, writing what’s learned, and recharging through nature, cycling, and hiking.

Work & Education

AI Engineer

TheoAI

Apr 2026 - Present

Working on legal AI agents, evals, and source-grounded document intelligence for litigation defense.

AI Engineer

Makora (Formerly Mako)

Aug 2025 - Mar 2026

Worked on fine-tuning LLMs (SFT/RFT) and agentic workflows for kernel generation across CUDA/HIP/Triton.

Research & Teaching Assistant

University at Buffalo, SUNY

Aug 2024 - Aug 2025

Led CV/DL research (2 papers published, Best Research Award) and mentored 200+ students on Big Data/Spark.

Founder

Scoleaf

Dec 2024 - Aug 2025

Built AI tutor (1K+ users); implemented agentic orchestration, async streaming, and multimodal stacks.

Contributor

9thGen AI

Feb 2025 - May 2025

Contributed to the development of commercial AI voice agents.

Master of Science in Computer Science (3.8 / 4.0)

University at Buffalo, SUNY

Aug 2023 - Jan 2025

Research Assistant

Vellore Institute of Technology

2022 - 2023

Published a journal article on email spoofing and vulnerabilities, analyzed malware exploits, and created 11 mitigation plans.

B.Tech in Computer Science and Engineering (8.7 / 10)

Vellore Institute of Technology

2019 - 2023

Skills

Python · Java · SQL · git · C++

Open Source Models

TD-HallOumi-3B · Text Generation

Trained version of Llama-3.2-3B for sentence-level hallucination detection, outperforming DeepSeek R1.

Qwen2.5-Coder-KernelBook · Text Generation

Fine-tuned on 18k PyTorch-Triton pairs via LoRA, achieving 98.3% accuracy for Triton kernel generation.

SmolVLM-LaTeX · Image + Text → Text

Fine-tuned 256M VLM that converts handwritten equations to LaTeX.

CLIP-ViT-IJEPA-VLMs-0.5B · Image + Text → Text

Three frozen vision encoders (CLIP, ViT, I-JEPA) stitched into Qwen-0.5B via trainable projectors + LoRA.

CLIP-ViT-IJEPA-VLMs-1.5B · Image + Text → Text

1.5B scale-up of the encoder-stitching experiment — bigger LLM, clearer embedding signal.

Blog

These posts are published on my Substack. (My earlier articles are on Medium at @teendifferent.)

The Revival of Predictive Coding

Introducing EqPropMomentum: A physics-grounded optimizer for biologically plausible AI.

Adaptive Attention at Inference Time: Does It Actually Work?

A hypernetwork that rewires GPT's value heads on every forward pass. And the answer is... not straightforward.

Zero RL - From Words to Worlds

Automating the least glamorous, most expensive part of reinforcement learning: building the world.

Stitching Vision into LLMs: A Comparative Analysis of Embedding Spaces

Building a VLM from scratch using model stitching and LoRA — comparing CLIP's language alignment vs. I-JEPA's world modeling.

apply_chat_template() Is the Safety Switch

How a Single Function Call Gates Safety Alignment in Gemma, Qwen, and Other Open-Source LLMs

AI 2025 Retrospective: The Static Graph Ceiling

My thoughts on the frontier labs' gatekeeping, the rise of the app layer, and why we need open algorithms, not just open weights.

Sample-Tuned Rank-Augmented Weights

An experiment in making neural networks rewrite themselves for every single input. Inspired by the human brain.

Your Features Aren’t What You Think They Are

Evaluating Local Feature Attribution and Decision Fidelity in Deep Vision Models via Perturbation-Based Explanations

Ground Zero

New phase, new experiments.

Publications

Adaptive Driver Assistance: Context-based Approach to Pedestrian Safety

Submitted for review | Preprint - TechRxiv

Mapping Crime Dynamics: Integrating Textual, Spatial, and Temporal Perspectives

IEEE UEMCON 2024

A comprehensive examination of email spoofing: Issues and prospects for email security

Computers & Security Journal (Elsevier) 2023

A Traffic Control System

Patent 2023

Projects

0RL

Code

ReAct agent that compiles natural language into full Gymnasium/Genesis RL environments, with 8-stage validation, an auto-fix loop, and a live 3D viewer.

Zero Me

Code

Always-on desktop voice agent with a state-reactive blob UI. Speech-to-speech via Pipecat + Gemini Live over WebRTC.

SF Quest

Code

Gamified city exploration — RAG-powered itineraries, real-time voice agent, and NeMo + CuVS retrieval for cultural discovery.

System Cursor

Code

Experimental system-wide AI autocomplete that uses Gemini Flash with visual context for smarter, app-agnostic suggestions.

DEPTHS - Depth and Proximity Tracking for Human Support

Read

Real-time spatial awareness tool for the visually impaired, delivering depth and object tracking in under 4 ms, 15–25x faster than leading models.

F.E.A.S.T - Food & Ingredient AI Suggestion Technology

Code

Real-time ingredient detection and recipe generation with nutritional information. Cuts costs, saves time, and inspires home cooking.

PEFT

Code Read

Guide to efficient fine-tuning with adapters on multi-class image datasets. Ideal for researchers seeking high-impact, low-resource model optimization.

Face Recognition Using Meta-Learning

Code Read

Efficient facial recognition using Prototypical & Siamese Networks. High-accuracy recognition with limited data for secure verification systems.

RxRovers: Roaming for Rapid Relief

Code

Optimizes medical supply delivery in hospitals using RL. Ensures timely delivery of medical supplies, enhancing patient care.

DINO Annotator

Code Read

Auto-annotate custom datasets for object detection using Grounding DINO. Speeds up box labeling with zero-shot detection for rare or unseen objects.

IntelliRAG

Code Read

A personalized AI assistant with RAG (Retrieval-Augmented Generation). Answers questions based on personal documents, uses vector stores for fast and accurate retrieval.

NEXUS

Code

A solution hub for news classification, image analysis, and anomaly detection.

Deep RL

Code

A hub for Deep Reinforcement Learning models for dynamic problem-solving. Ideal for autonomous control, gaming, and robotics.

Green AI

Code Read

Carbon footprint analysis and reduction tools for tech and AI model training. Helps to make eco-friendly choices, optimize energy, and reduce emissions.

Crimson Eye: Data-Driven Crime Analysis

Code

Enhances predictive policing by analyzing crime data. Optimizes law enforcement resource allocation.

Tiny LLM

Code Read

Tested lightweight LLMs for local personal assistance with minimal resources.

AI at Play

Code

RL in Squirrel Maze & Stock Trader for strategic learning and decision-making. Perfect for testing RL in real-world-inspired scenarios.

Music Vision

Code

Genre classification with ANN, CNN, and Transfer Learning on Mel spectrograms. Ideal for exploring deep learning in complex audio classification.

Privileged Identity Management - Intune

Code

Implements data loss prevention and privileged identity management. Ensures compliance and security.

CrowdSec Deployment Guide: Enhancing Cybersecurity

Code

Deployment steps for CrowdSec, an SOAR-based intrusion prevention system. Enhances collective security.

Gear Shift DB

Code

Custom DBMS for Formula 1 data management, powered by PostgreSQL. Enhances efficiency and decision-making.

Pintos

Code

Programmed essential OS components like threading, synchronization, scheduling, and system calls.

SnapShift

Code

Efficient Python tool for bulk image scraping from Bing. Perfect for ML, web dev, and research projects needing large image datasets.

Connect

Feel free to contact me at iamtarunreddi@gmail.com

Medium (Old Articles)