STATUS: AVAILABLE FOR NEW OPPORTUNITIES

> init --user

Kuntal Pal_

AI Engineer

I build production-grade LLM systems, agentic infrastructure, and real-time AI that measurably improve automation and user engagement.

Riverside, CA

GitHub ↗ / LinkedIn ↗ / Blog ↗ / Email ↗

Projects

> ls ./projects/ — recent shipped work

Finley portfolio analysis — ARIMA forecast, Markowitz allocation, risk metrics and BUY/HOLD signals

> project_01

Intelligent Financial Advisor — Finley

Production-grade multi-agent financial advisor powered by LangGraph, Claude, ARIMA forecasting, Markowitz optimisation, and Isolation Forest anomaly detection. Deployed on Hugging Face Spaces with a FastAPI backend and a custom chat UI.

LangGraph
Claude API
ARIMA
Markowitz
FastAPI
Python
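
As a rough illustration of the Markowitz step named above, the sketch below computes closed-form minimum-variance weights with NumPy. The covariance matrix and the `min_variance_weights` helper are hypothetical and not taken from Finley's code. This simplified form allows short positions and ignores expected returns.

```python
import numpy as np


def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Global minimum-variance portfolio: w = C^-1 1 / (1^T C^-1 1)."""
    ones = np.ones(cov.shape[0])
    # Solve C x = 1 rather than inverting C explicitly.
    x = np.linalg.solve(cov, ones)
    # Normalise so the weights sum to 1.
    return x / x.sum()


# Hypothetical annualised covariance matrix for three assets.
cov = np.array([
    [0.040, 0.006, 0.000],
    [0.006, 0.090, 0.010],
    [0.000, 0.010, 0.160],
])

weights = min_variance_weights(cov)
portfolio_variance = float(weights @ cov @ weights)
print(weights.round(3), round(portfolio_variance, 4))
```

A production allocator would typically add constraints such as long-only weights or a target return, which calls for a numerical optimiser instead of this closed form.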

Hallucination suppression vs sigma threshold chart

> project_02

σ-RAG: Significance-Threshold Retrieval

Open-source RAG library that borrows significance testing from particle physics to suppress hallucinations. It estimates the background noise distribution in embedding space and retrieves only the chunks that clear a configurable σ threshold.

Python
NumPy
sentence-transformers
Statistics
RAG
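
The idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the library's API: `sigma_filter` is a hypothetical helper, the vectors are random stand-ins for real sentence embeddings, and the background is assumed to be the query's cosine similarity to a sample of unrelated vectors.

```python
import numpy as np


def unit(x: np.ndarray) -> np.ndarray:
    """L2-normalise vectors along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def sigma_filter(query_vec, chunk_vecs, background_vecs, sigma=3.0):
    """Keep chunks whose similarity to the query sits at least `sigma`
    standard deviations above the background mean (hypothetical helper)."""
    q = unit(query_vec)
    # Background: cosine similarity of the query to unrelated vectors.
    background = unit(background_vecs) @ q
    mu, sd = background.mean(), background.std(ddof=1)
    # Significance (z-score) of each candidate chunk against that background.
    z = (unit(chunk_vecs) @ q - mu) / sd
    keep = np.flatnonzero(z >= sigma)
    # Return surviving indices, most significant first, plus all z-scores.
    return keep[np.argsort(-z[keep])], z


rng = np.random.default_rng(0)
dim = 64
query = rng.normal(size=dim)
background = rng.normal(size=(500, dim))   # stand-in for unrelated text
chunks = rng.normal(size=(20, dim))        # mostly irrelevant candidates
chunks[3] = query + 0.3 * rng.normal(size=dim)  # one planted relevant chunk

kept, z = sigma_filter(query, chunks, background, sigma=5.0)
print(kept, z[kept].round(1))
```

Unlike top-k retrieval, a threshold like this can return nothing at all when no chunk stands out from the background, which gives the generator a signal to abstain.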

> project_03

Agentic Forecaster for sktime

Drop-in sktime forecaster that uses a ReAct agent loop to automatically select and configure time series pipelines from natural language descriptions. Supports Claude, GPT-4o, and Gemini backends.

Python
sktime
Claude API
FastMCP
ReAct

Blog

> cat ./posts/* — notes from building real AI systems

Transformers · May 11, 2026 · 10 min read

Modeling Transformer from Scratch

Every line of code, every reason behind it. Attention is a routing mechanism — by the end of an attention pass, a token's vector is a weighted mixture pulled toward whatever it attended to.
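
The "weighted mixture" view can be checked with a small NumPy sketch of single-head scaled dot-product attention. It is a simplification, not code from the post: the learned query, key, and value projections are omitted, so the token vectors `x` play all three roles.

```python
import numpy as np


def attention(q: np.ndarray, k: np.ndarray, v: np.ndarray):
    """Single-head scaled dot-product attention; returns output and weights."""
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # 4 tokens, 8 dimensions each

out, w = attention(x, x, x)
# Each output row is a convex combination of the value rows.
print(w.round(2))
```

Because every row of `w` is non-negative and sums to 1, each output vector lies inside the span of the value vectors it attended to, weighted by how strongly it attended.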

Attention · May 6, 2026 · 8 min read

Why Attention Is All You Need

The heart of every modern LLM. Before attention, RNNs processed sequences one token at a time — a rolling summary that forgot the distant past. Attention fixes that.

Embeddings · May 4, 2026 · 8 min read

Embeddings: From Numbers to Meaning

Token IDs are just integers — meaningless to a neural network. Embeddings map them to dense vectors where relationships are encoded in direction and distance.
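
A toy lookup table makes the point concrete. The vocabulary and the 3-dimensional vectors below are hand-picked for illustration; real embeddings are learned during training and have hundreds of dimensions.

```python
import numpy as np

# Hypothetical 4-token vocabulary: token -> integer ID.
vocab = {"king": 0, "queen": 1, "apple": 2, "pear": 3}

# Hand-picked embedding table, one row per token ID.
embedding_table = np.array([
    [0.9, 0.8, 0.1],  # king
    [0.8, 0.9, 0.1],  # queen
    [0.1, 0.2, 0.9],  # apple
    [0.2, 0.1, 0.9],  # pear
])


def embed(token: str) -> np.ndarray:
    """Embedding lookup is just row indexing by token ID."""
    return embedding_table[vocab[token]]


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: compares direction, ignoring vector length."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


royal = cosine(embed("king"), embed("queen"))
mixed = cosine(embed("king"), embed("apple"))
print(round(royal, 3), round(mixed, 3))
```

The IDs 0 and 1 are no closer in meaning than 0 and 2, but the vectors they index are: related tokens point in similar directions.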

> more posts shipping soon — say hi via Get in Touch below.

Deep Dives

> open ./deep-dives/ — ML & AI fundamentals, explained properly

> topic_01

Transformers & Attention

From the original paper to multi-head attention, positional encoding, and why the architecture took over everything.

Self-Attention
Multi-Head
Positional Encoding
Softmax
Scaled Dot-Product

> topic_02

RAG & Retrieval Systems

How retrieval-augmented generation works, what breaks in production, and how to design a retrieval pipeline that holds up.

Dense Retrieval
pgvector
Chunking
Re-ranking
Hybrid Search

> topic_03

LLM Evaluation

A practical guide to evaluating LLM outputs — metrics, frameworks, LLM-as-judge, and building eval suites that catch real regressions.

ROUGE
BERTScore
LLM-as-Judge
Ragas
Human Eval

> topic_04

Classical Machine Learning

Decision Trees, Random Forests, XGBoost, SVMs, logistic regression, regularization, bias-variance, cross-validation, and when classical beats deep learning.

XGBoost
Random Forest
SVM
L1/L2
Bias-Variance

> topic_05

Probability & Statistics

The foundations that everything in ML is built on — distributions, Bayes, hypothesis testing, and how they connect to real modelling decisions.

Bayes Theorem
MLE
Distributions
A/B Testing
CLT

> topic_06

ML Embeddings & Representations

Word2Vec, GloVe, Sentence Transformers, contrastive learning, embedding geometry, similarity metrics, PCA, and UMAP — with worked exercises.

Word2Vec
GloVe
SBERT
InfoNCE
SimCLR
PCA
UMAP
Contrastive Learning

Experience

> git log --all --oneline

> Jul 2024 — May 2026
AI Engineer @ Beehive AI, Inc.
Engineered an agentic RAG system (LangChain + Postgres pgvector) for natural-language → SQL analytics. Built an LLM-powered QA framework reaching 80% automation, a virtual persona simulation framework, and architected a multi-agent commerce system on Vertex AI + LangGraph.

> Jan 2024 — Jul 2024
AI Research Scientist @ Pocket FM
Shipped an LSTM-based sequential prediction model for long-term content performance. Led a comment ranking system using semantic embeddings + LLM summaries that lifted CTR by 4.5% and comment volume by 4.0%.

> Dec 2023 — Feb 2024
AI Fellow @ PI School, Rome
Automated reliability assessment of medical research papers using LLM-based algorithms. Improved GPT-4 evaluation accuracy to 75% via prompt optimization and pushed Mixtral to 69% with QLoRA fine-tuning.

> Jun 2023 — Sep 2023
Research Scientist Intern @ Deepgram, Inc.
Trained a self-supervised audio embedding model on 35K+ hours of audio (86% downstream accuracy). Engineered a multimodal embedding space aligning audio and text, beating text-only baselines by 5 points.

About

> cat ./about.md

I’m Kuntal — an AI engineer working on LLM systems, retrieval pipelines, and multi-agent infrastructure. I like turning messy, ambiguous problems into clean, production-shaped products.

Until May 2026, I was at Beehive AI, where I built agentic RAG systems, LLM-powered evaluation frameworks, and virtual persona simulators that help product teams make sharper decisions.

Before that, I spent seven years on theoretical and computational particle physics for my Ph.D. at UC Riverside, alongside research stints at Deepgram (multimodal audio embeddings), PI School in Rome (LLM evaluation for medical research), and Pocket FM (sequential prediction + comment ranking).

> Education

University of California, Riverside
Ph.D., Physics
Sep 2017 — Jun 2024

Indian Institute of Science Education and Research, Kolkata
BS-MS Dual Degree, Physics
Aug 2012 — May 2017

Skills

> ls -la ./stack/ — tools I reach for most often

> Languages

Python
SQL
Go

> ML & AI

PyTorch
TensorFlow
Scikit-learn
LLMs
RAG
Vector DBs
LangChain
LangGraph

> Deployment & MLOps

Docker
Kubernetes
GCP
Vertex AI

> Specializations

Multi-Agent Systems
Multimodal Learning
LLM Fine-tuning
Predictive Modeling

Get in Touch

> open inbox — reply latency: low

Let’s build_

My inbox is always open — whether you have a project in mind, a research collaboration, or just want to say hi, I’ll do my best to reply.

> kuntal.beehiveai@gmail.com · +1 (951) 202-8635
