
Portfolio · LinkedIn · X · Email

Open Source  ·  Recently Shipped  ·  Packages  ·  Projects  ·  Impact  ·  Experience  ·  Stats


  8+ years building production systems at Fortune 100 scale
  Former SDE at Amazon Web Services  •  Currently at Southwest Airlines
  Deep expertise in ML systems, distributed architectures, and full-stack engineering

Now: shipped the @mukundakatta/agent* reliability stack (fit → guard → snap → vet → cast), 6 matching MCP servers in the official MCP Registry, 3 new GitHub Actions on the Marketplace, and doubled the PyPI footprint to 52 packages (full Python ports of the npm catalog). Plus 40+ open PRs across MCP SDKs, FastMCP, claude-code-action, and Anthropic's agent SDK.


Portfolio at a Glance

| Public repos | Originals | Active projects | Forks | Archived |
|---|---|---|---|---|
| 572 | 160 | 122 | 412 | 324 |

Every repo is indexed in claude-workspace — wired for Multica, Claude Code, Codex, OpenClaw, and Cursor to reason across the portfolio.


Latest Drop · The Agent Reliability Stack

🌐 Live at mukundakatta.github.io/agent-stack — single landing page for the whole 117-package ecosystem (npm + PyPI + MCP Registry + GitHub Marketplace).

🤗 Try it live on the HuggingFace Space · jailbreak fixtures on the HF Dataset.

Five small, focused npm packages that fix the boring problems every long-running agent eventually hits. Pure ESM JavaScript, zero runtime deps, TypeScript types in the box. Designed to compose into a pipeline: fit → guard → snap → vet → cast.

Fit it. · `agentfit`

Token-aware message truncation with three strategies (drop-oldest, drop-middle, priority). Pluggable tokenizers. Per-model estimators.
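As a sketch of what the simplest of those strategies does (drop-oldest, with a naive length-based estimator standing in for a real tokenizer; the names here are illustrative, not agentfit's actual API):

```typescript
// Illustrative drop-oldest truncation. The 4-chars-per-token estimator is a
// naive stand-in for the pluggable tokenizers and per-model estimators
// described above.
type Msg = { role: "system" | "user" | "assistant"; content: string };

const estimateTokens = (m: Msg): number => Math.ceil(m.content.length / 4);

function fitDropOldest(messages: Msg[], budget: number): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  // Walk newest-to-oldest; keep messages while they still fit the budget,
  // so the oldest turns are the first to go.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i]);
    if (used + cost > budget) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```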

Sandbox it. · `agentguard`

Network-egress firewall: a declarative allowlist of domains agent tools can fetch. Throws on violation, with a clear error.

Test it. · `agentsnap`

Snapshot tests for tool-call traces. Catch silent regressions in LLM tool use the way you catch UI regressions today.

Vet it. · `agentvet`

Validate tool args before execution. Wrap any tool function; on bad args, throw a typed error with an LLM-friendly retry hint.

Validate it. · `agentcast`

Structured-output enforcer. Validate the model's response, retry with the validation error as feedback, return typed data or throw after N attempts. BYO LLM and validator.

npm i @mukundakatta/agentfit @mukundakatta/agentguard @mukundakatta/agentsnap @mukundakatta/agentvet @mukundakatta/agentcast
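The last step, cast, carries the control flow, so it is worth spelling out. A minimal self-contained sketch of that validate-retry-return loop, assuming nothing beyond a BYO `callLLM` and `validate` function (the `enforce` name and shapes are illustrative, not the package's real API):

```typescript
// Sketch of the "validate, retry with the validation error as feedback,
// return typed data or throw after N attempts" pattern described above.
type Validated<T> = { ok: true; data: T } | { ok: false; error: string };

async function enforce<T>(
  prompt: string,
  callLLM: (p: string) => Promise<string>,   // BYO LLM client
  validate: (raw: string) => Validated<T>,   // BYO validator
  maxAttempts = 3,
): Promise<T> {
  let feedback = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callLLM(prompt + feedback);
    const result = validate(raw);
    if (result.ok) return result.data; // typed data out
    // Feed the validation error back so the model can self-correct.
    feedback = `\n\nYour previous reply failed validation: ${result.error}\nReply again, fixing only that.`;
  }
  throw new Error(`no valid response after ${maxAttempts} attempts`);
}
```

The design point is that the loop owns retries and feedback while the model client and the validator stay pluggable.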

Each one also ships as an MCP server so Claude Desktop, Cursor, Cline, Windsurf, and Zed can call them directly mid-conversation:

npx -y @mukundakatta/agentfit-mcp     # fit a chat history into a budget
npx -y @mukundakatta/agentguard-mcp   # check URLs against an egress policy
npx -y @mukundakatta/agentsnap-mcp    # diff tool-call traces
npx -y @mukundakatta/agentvet-mcp     # validate tool args + generate retry hints
npx -y @mukundakatta/agentcast-mcp    # extract / validate JSON from LLM text
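For Claude Desktop specifically, registering one of these uses the standard `mcpServers` shape in `claude_desktop_config.json`; the `"agentfit"` key is an arbitrary label, and the same pattern works for the other four servers:

```json
{
  "mcpServers": {
    "agentfit": {
      "command": "npx",
      "args": ["-y", "@mukundakatta/agentfit-mcp"]
    }
  }
}
```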

These five are sibling libraries that share one design philosophy: small, focused, zero-dep, BYO-LLM. Each solves a single concrete reliability problem, so you can pick the ones you need without dragging in a framework. The previous drop, streamparse (streaming JSON parser; npm + Homebrew + MCP Registry), is still in active use.


Open Source Focus

I contribute practical fixes to AI SDKs, MCP tooling, eval frameworks, agent infrastructure, structured outputs, and developer experience.

My lane is finding the sharp edges that slow builders down: unclear contracts, brittle tool calls, docs that almost answer the question, eval gaps where regressions hide, and AI tooling that needs better failure signals. I like small, reviewable patches with clear intent, and compact packages that turn repeated manual checks into reusable workflows.

Recent contribution areas (merged upstream):

  • Microsoft — security and architecture docs for internal AI-engineering toolchains (hve-core, physical-ai-toolchain)
  • Pydantic — pydantic-ai integration with the Vercel AI SDK
  • Hugging Face ecosystem — safetensors Python bindings, sentence-transformers trainer migration docs
  • Meilisearch — heed multi-target docs.rs infrastructure
  • Vercel — next.js documentation
  • Apache Software Foundation — doc / comment fixes across iceberg, pulsar, skywalking, ozone, iotdb

I keep a public log of selected OSS work in oss-contributions.

Distribution pattern. Each flagship ships as a complete unit, not a single npm package:

library  →  Python port  →  CLI binary  →  GitHub Action  →  Homebrew formula  →  MCP server
   npm           PyPI                           Marketplace        brew tap            npm

So the same problem (mcpcheck, skillint, streamparse) is solvable from any environment a developer or AI assistant happens to be in: a TypeScript app, a Python script, a CI workflow, a terminal, or directly inside Claude / Cursor / Cline / Windsurf / Zed.
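Concretely, that means the same linter can be installed from whichever registry matches your environment; for example, using the package names listed in the tables below:

```sh
npm i -g @mukundakatta/mcpcheck   # TypeScript original, via npm
pip install mcpcheck-py           # Python port, via PyPI
```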

Recent OSS Highlights

Recently Shipped

Last refreshed 2026-04-27 from npm, PyPI, and the GitHub API.

Latest releases

  • 2026-04-27 · PyPI footprint doubled to 52 packages. Added 26 more Python ports today (mk-agentkit meta + 5 agent infra: agent-loop-breaker-py, agent-regression-lens-py, agent-trajectory-replay-py, tool-call-contracts-py, tool-permission-gate-py · 5 evals/cost/routing: eval-dataset-smith-py, llm-trace-sampler-py, model-fallback-planner-py, model-router-policy-py, ai-supply-chain-manifest-py · 3 tools/safety: tool-result-taint-py, jailbreak-corpus-mini-py, consent-redaction-log-py · 3 RAG: rag-staleness-auditor-py, retrieval-acl-filter-py, context-drift-detector-py · 5 context/prompt: context-forge-py, context-window-packer-py, prompt-token-trim-py, prompt-version-diff-py, llm-response-schema-lite-py · 4 niche: kavach-py, mcpcheck-py, skillint-py, designlint-py)
  • 2026-04-27 · 18 new Python ports on PyPI: partial-json-stream, agentfit-py, agentguard-firewall, agentsnap-py, agentvet-py, agentcast-py, pii-sentry-py, prompt-injection-shield-py, llm-output-sanitizer-py, rag-quality-kit, vector-poison-score, embedding-dedupe, llm-cost-guard-py, semantic-cache-key, eval-flake-detector, citation-integrity-check, hallucination-risk-meter, system-prompt-leak-scan
  • 2026-04-27 · @mukundakatta/agentkit v0.1.0 · npm · meta-package re-exporting all 5 agent-stack libraries
  • 2026-04-27 · 5 of the 5 agent-stack libraries bumped to v0.1.1 with new npx-runnable CLI binaries
  • 2026-04-27 · 3 new GitHub Marketplace Actions: agentvet-action, agentsnap-action, mcp-stack-validate-action
  • 2026-04-27 · 5 new entries in the official MCP Registry: io.github.MukundaKatta/{agentfit, agentguard, agentsnap, agentvet, agentcast}
  • 2026-04-26 · @mukundakatta/agentfit-mcp v0.1.0 · npm · MCP server for agentfit
  • 2026-04-26 · @mukundakatta/agentguard-mcp v0.1.0 · npm · MCP server for agentguard
  • 2026-04-26 · @mukundakatta/agentsnap-mcp v0.1.0 · npm · MCP server for agentsnap
  • 2026-04-26 · @mukundakatta/agentvet-mcp v0.1.0 · npm · MCP server for agentvet
  • 2026-04-26 · @mukundakatta/agentcast-mcp v0.1.0 · npm · MCP server for agentcast
  • 2026-04-26 · @mukundakatta/agentcast v0.1.0 · npm · structured-output enforcer for any LLM
  • 2026-04-26 · @mukundakatta/agentfit v0.1.0 · npm · token-aware message truncation
  • 2026-04-26 · @mukundakatta/agentvet v0.1.0 · npm · tool-arg validator with retry hints
  • 2026-04-25 · @mukundakatta/agentguard v0.1.0 · npm · network-egress firewall for agent tools
  • 2026-04-25 · @mukundakatta/agentsnap v0.1.0 · npm · snapshot tests for tool-call traces
  • 2026-04-25 · @mukundakatta/streamparse v1.0.1 · npm · streaming JSON parser with CLI + Homebrew formula
  • 2026-04-25 · @mukundakatta/streamparse-mcp v1.0.1 · npm + MCP Registry (io.github.MukundaKatta/streamparse)

Recently merged PRs

Open PRs (recent batch) — substantive fixes shipped 2026-04-26 across MCP, Anthropic, FastMCP, Apache, Google Cloud, HuggingFace, OpenTelemetry.


Published Packages

npm (scope @mukundakatta):

Flagship packages:

| Package | Why it matters | Install |
|---|---|---|
| `@mukundakatta/streamparse` · partial JSON for LLM streams | Streaming JSON parser that yields partial valid trees as tokens arrive. Render LLM tool calls mid-stream, recover dropped responses, parse messy `` ```json `` blocks. Zero deps, 64 tests. Also published as an MCP server in the official MCP Registry. | `npm i @mukundakatta/streamparse` |
| `@mukundakatta/streamparse-mcp` · MCP: parse partial JSON | MCP server that lets Claude / Cursor / Cline / Windsurf / Zed parse partial, truncated, or messy JSON on demand. Three tools: `parse_partial_json`, `extract_json_from_text`, `validate_json`. | `npx -y @mukundakatta/streamparse-mcp` |
| `@mukundakatta/mcpcheck` · MCP config quality gate | Lint MCP config files for Claude Desktop, Cursor, Cline, Windsurf, and Zed. CLI, GitHub Action, and SARIF for code scanning. | `npm i -g @mukundakatta/mcpcheck` |
| `@mukundakatta/designlint` · frontend quality checks | HTML/CSS accessibility and design linter for contrast, touch targets, headings, form labels, and leaked secrets. | `npm i -g @mukundakatta/designlint` |
| `@mukundakatta/skillint` · AI skill validation | Lint Claude Code SKILL.md files for frontmatter, required fields, descriptions, and hardcoded secrets. | `npm i -g @mukundakatta/skillint` |
| `@mukundakatta/ai-eval-forge` · eval harness | Zero-dependency eval harness for comparing model, prompt, and agent behavior. CLI plus programmatic API; also on PyPI. | `npm i @mukundakatta/ai-eval-forge` |
| `@mukundakatta/codex-skill-kit` · Codex skill tooling | Scaffold and validate Codex skills from the command line. Published for npm and PyPI workflows. | `npm i -g @mukundakatta/codex-skill-kit` |
| `@mukundakatta/kavach` · AI-app threat signals | Small, inspectable threat-scoring library for AI-app security monitoring: signals to weighted score to tier and playbook. | `npm i @mukundakatta/kavach` |
More npm packages (43) — grouped by area

MCP servers (6) — callable directly from Claude Desktop, Cursor, Cline, Windsurf, Zed via stdio:

| Package | What it does |
|---|---|
| `@mukundakatta/streamparse-mcp` | Parse partial / truncated / messy JSON for LLM tool calls. Listed in the official MCP Registry. |
| `@mukundakatta/agentfit-mcp` | Token-aware message truncation: count tokens, fit a chat history into a budget. |
| `@mukundakatta/agentguard-mcp` | Check URLs against a network-egress allowlist before any tool fetch. |
| `@mukundakatta/agentsnap-mcp` | Diff and validate tool-call trace snapshots. |
| `@mukundakatta/agentvet-mcp` | Validate tool-call args against a shape spec; produce LLM-friendly retry hints. |
| `@mukundakatta/agentcast-mcp` | Extract JSON from messy LLM text and validate it against a shape. |

Structured outputs & parsing (1)

| Package | What it does |
|---|---|
| `@mukundakatta/streamparse` | Streaming JSON parser that yields partial valid trees as tokens arrive. |

Agent infrastructure (11)

| Package | What it does |
|---|---|
| `@mukundakatta/agentfit` | Token-aware message truncation; fit chat history into a context budget. |
| `@mukundakatta/agentguard` | Network-egress firewall for agent tools: declarative domain allowlist. |
| `@mukundakatta/agentsnap` | Snapshot tests for tool-call traces, like Jest snapshots for LLM tool use. |
| `@mukundakatta/agentvet` | Validate tool args before execution, with LLM-friendly retry hints. |
| `@mukundakatta/agentcast` | Structured-output enforcer: validate, retry with feedback, BYO-LLM/validator. |
| `@mukundakatta/agent-loop-breaker` | Detect repeated agent steps and stop runaway loops. |
| `@mukundakatta/agent-regression-lens` | Detect regressions between baseline and current AI agent runs. |
| `@mukundakatta/agent-trajectory-replay` | Replay and diff AI agent event trajectories for debugging regressions. |
| `@mukundakatta/tool-call-contracts` | Validate LLM tool-call payloads with small JSON-like contracts. |
| `@mukundakatta/tool-permission-gate` | Policy-check agent tool calls before execution. |
| `@mukundakatta/tool-result-taint` | Track untrusted tool output before it enters prompts or actions. |

RAG & retrieval (6)

| Package | What it does |
|---|---|
| `@mukundakatta/rag-quality-kit` | Heuristic quality metrics for RAG retrieval and grounded answers. |
| `@mukundakatta/rag-staleness-auditor` | Find stale RAG chunks by age, version, and freshness requirements. |
| `@mukundakatta/retrieval-acl-filter` | Enforce document ACLs after retrieval and before prompting. |
| `@mukundakatta/vector-poison-score` | Score retrieved documents for vector/RAG poisoning signals. |
| `@mukundakatta/embedding-dedupe` | Deduplicate near-identical embedding records by cosine similarity. |
| `@mukundakatta/context-drift-detector` | Detect topic drift between user intent, retrieved context, and AI answers. |

Prompt & output safety (5)

| Package | What it does |
|---|---|
| `@mukundakatta/pii-sentry` | Detect and redact PII and secret-like values before AI processing. |
| `@mukundakatta/prompt-injection-shield` | Prompt-injection risk scanner for untrusted AI context. |
| `@mukundakatta/llm-output-sanitizer` | Sanitize LLM outputs before rendering, SQL, shell, or markdown sinks. |
| `@mukundakatta/system-prompt-leak-scan` | Detect system prompt leakage in model outputs. |
| `@mukundakatta/jailbreak-corpus-mini` | Small local jailbreak + prompt-injection fixture set for tests. |

Context & prompt engineering (4)

| Package | What it does |
|---|---|
| `@mukundakatta/context-forge` | Context engineering toolkit for ranking, packing, and risk-scanning RAG context. |
| `@mukundakatta/context-window-packer` | Pack context chunks into a budget by relevance and priority. |
| `@mukundakatta/prompt-token-trim` | Trim prompt messages to fit a token budget while preserving priority. |
| `@mukundakatta/prompt-version-diff` | Diff prompt templates and flag risky instruction changes. |

Evals & tracing (3)

| Package | What it does |
|---|---|
| `@mukundakatta/eval-dataset-smith` | Generate balanced eval cases from bugs, docs, examples, and policies. |
| `@mukundakatta/eval-flake-detector` | Detect flaky LLM eval cases across repeated runs. |
| `@mukundakatta/llm-trace-sampler` | Sample LLM traces by risk, errors, latency, and deterministic ids. |

Cost, routing & caching (4)

| Package | What it does |
|---|---|
| `@mukundakatta/llm-cost-guard` | Estimate AI request cost and enforce per-request or session budgets. |
| `@mukundakatta/model-fallback-planner` | Plan model fallback chains from capability, cost, and health data. |
| `@mukundakatta/model-router-policy` | Policy-based model routing by capability, cost, latency, and privacy. |
| `@mukundakatta/semantic-cache-key` | Stable semantic cache keys for AI prompts, tools, models, and retrieval context. |

Supply chain, citations, consent (5)

| Package | What it does |
|---|---|
| `@mukundakatta/ai-supply-chain-manifest` | Build and validate lightweight AI model / data / tool manifests. |
| `@mukundakatta/citation-integrity-check` | Verify answer citations refer to supplied source ids. |
| `@mukundakatta/consent-redaction-log` | Record consent-aware redactions for privacy review trails. |
| `@mukundakatta/hallucination-risk-meter` | Estimate hallucination risk from answer, context, citations, and uncertainty language. |
| `@mukundakatta/llm-response-schema-lite` | Tiny schema validator for structured LLM responses. |

Install any of them with `npm i @mukundakatta/<package>`.

PyPI:

| Package | Purpose | Install |
|---|---|---|
| `claude-skill-check` | Lint Claude Code SKILL.md files for YAML frontmatter, required fields, description quality, and secret patterns. | `pip install claude-skill-check` |
| `mcp-config-check` | Validate MCP configs across Claude Desktop, Cursor, Cline, Windsurf, and Zed; catches auth, transport, duplicate, and placeholder issues. | `pip install mcp-config-check` |
| `claude-hooks-check` | Audit Claude Code hooks for malformed matchers, dangerous commands, invalid events, and hardcoded secrets. | `pip install claude-hooks-check` |
| `claude-commands-check` | Validate Claude Code slash-command files for naming, frontmatter, model values, allowed-tools shape, and secret leakage. | `pip install claude-commands-check` |
| `llm-usage-report` | Parse raw LLM API response logs and generate token and cost reports by provider, model, day, project, or user. | `pip install llm-usage-report` |
| `codex-skill-kit` | Scaffold and validate Codex skills from Python environments; mirrors the npm CLI workflow. | `pip install codex-skill-kit` |
| `ai-eval-forge` | Zero-dependency LLM and agent eval harness with exact, regex, token-F1, JSON, and citation-coverage checks. | `pip install ai-eval-forge` |
| `agent-run-diff` | Compare baseline and current agent runs across success, errors, tools, output drift, steps, latency, and cost. | `pip install agent-run-diff` |
More PyPI packages (44) — Python ports of the @mukundakatta JS libraries

Streaming + agent reliability stack (6)

| Package | What it does |
|---|---|
| `partial-json-stream` | Streaming JSON parser that yields partial valid trees as tokens arrive. |
| `agentfit-py` | Token-aware message truncation; fit a chat history into a context budget. |
| `agentguard-firewall` | Network-egress firewall for agent tools. |
| `agentsnap-py` | Snapshot tests for tool-call traces. |
| `agentvet-py` | Validate tool args before execution; LLM-friendly retry hints. |
| `agentcast-py` | Structured-output enforcer; validate, retry with feedback. |

Prompt + output safety (3)

| Package | What it does |
|---|---|
| `pii-sentry-py` | Detect and redact PII and secret-like values before AI processing. |
| `prompt-injection-shield-py` | Prompt-injection risk scanner for untrusted AI context. |
| `llm-output-sanitizer-py` | Sanitize LLM outputs before HTML / SQL / shell / markdown sinks. |

RAG + retrieval (3)

| Package | What it does |
|---|---|
| `rag-quality-kit` | Heuristic quality metrics for RAG retrieval and grounded answers. |
| `vector-poison-score` | Score retrieved documents for vector / RAG poisoning signals. |
| `embedding-dedupe` | Deduplicate near-identical embedding records by cosine similarity. |

Cost, caching, evals (3)

| Package | What it does |
|---|---|
| `llm-cost-guard-py` | Estimate AI request cost and enforce per-request or session budgets. |
| `semantic-cache-key` | Stable semantic cache keys for AI prompts, tools, models, retrieval. |
| `eval-flake-detector` | Detect flaky LLM eval cases across repeated runs. |

Verification + grounding (3)

| Package | What it does |
|---|---|
| `citation-integrity-check` | Verify answer citations refer to supplied source ids. |
| `hallucination-risk-meter` | Estimate hallucination risk from answer + context + citations. |
| `system-prompt-leak-scan` | Detect system-prompt leakage in model outputs. |

Agent infrastructure + meta (6)

| Package | What it does |
|---|---|
| `mk-agentkit` | Meta-package re-exporting all 5 agent-stack ports under one import. |
| `agent-loop-breaker-py` | Detect repeated agent steps and stop runaway loops. |
| `agent-regression-lens-py` | Detect regressions between baseline and current agent runs. |
| `agent-trajectory-replay-py` | Replay and diff agent event trajectories. |
| `tool-call-contracts-py` | Validate LLM tool-call payloads with small JSON-like contracts. |
| `tool-permission-gate-py` | Policy-check agent tool calls before execution. |

Tools / safety / privacy (4)

| Package | What it does |
|---|---|
| `tool-result-taint-py` | Track untrusted tool output before it enters prompts. |
| `jailbreak-corpus-mini-py` | Local jailbreak + prompt-injection fixture set for tests. |
| `consent-redaction-log-py` | Record consent-aware redactions for privacy review trails. |
| `kavach-py` | Threat-scoring library for AI-app security monitoring. |

RAG (3)

| Package | What it does |
|---|---|
| `rag-staleness-auditor-py` | Find stale RAG chunks by age, version, and freshness requirements. |
| `retrieval-acl-filter-py` | Enforce document ACLs after retrieval and before prompting. |
| `context-drift-detector-py` | Detect topic drift between intent, context, and answer. |

Context engineering (5)

| Package | What it does |
|---|---|
| `context-forge-py` | Context engineering toolkit: ranking, packing, risk-scanning. |
| `context-window-packer-py` | Pack context chunks into a budget by relevance and priority. |
| `prompt-token-trim-py` | Trim prompt messages to fit a token budget while preserving priority. |
| `prompt-version-diff-py` | Diff prompt templates and flag risky instruction changes. |
| `llm-response-schema-lite-py` | Tiny schema validator for structured LLM responses. |

Evals + cost + routing (5)

| Package | What it does |
|---|---|
| `eval-dataset-smith-py` | Generate balanced eval cases from bugs, docs, examples, policies. |
| `llm-trace-sampler-py` | Sample LLM traces by risk, errors, latency, and deterministic ids. |
| `llm-cost-guard-py` | Estimate AI request cost and enforce per-request or session budgets. |
| `model-fallback-planner-py` | Plan model fallback chains from capability, cost, and health data. |
| `model-router-policy-py` | Policy-based model routing by capability, cost, latency, privacy. |

Niche linters (4)

| Package | What it does |
|---|---|
| `mcpcheck-py` | Lint MCP config files for Claude Desktop, Cursor, Cline, Windsurf, Zed. |
| `skillint-py` | Lint Claude Code SKILL.md files. |
| `designlint-py` | HTML/CSS accessibility and design linter. |
| `ai-supply-chain-manifest-py` | Build and validate lightweight AI model / data / tool manifests. |

GitHub Marketplace (7 Actions):

Composite GitHub Actions, discoverable on the GitHub Marketplace:

Linters:

Agent-stack CI gates:

Homebrew tap — mukundakatta/tools:

brew tap mukundakatta/tools
brew install claude-skill-check mcp-config-check claude-hooks-check claude-commands-check

Each ships a CLI, a programmatic API, and (for the linters) a composite GitHub Action you can drop into any workflow in 3 lines.
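As a rough sketch of that wiring (the `uses:` path and `@v1` tag below are assumptions, not verified coordinates; check each Action's Marketplace page for its real name and inputs):

```yaml
# Hypothetical workflow wiring for the mcp-stack-validate-action CI gate.
# The uses: path and @v1 tag are assumptions.
name: mcp-stack-validate
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: MukundaKatta/mcp-stack-validate-action@v1
```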

🤗 HuggingFace — mukunda1729 · 14 Spaces · 13 Datasets:

🚀 Live Gradio playgrounds (6):

| Space | What you can try |
|---|---|
| `agent-stack-demo` | All 5 libs (fit, guard, snap, vet, cast) in one app. |
| `token-counter` | Count tokens for any text across Claude / GPT / Llama tokenizers. |
| `json-extractor` | Pull clean JSON out of messy LLM output (fenced, inline, unfenced). |
| `pii-redactor` | Find emails, phones, secrets, and IDs — mask, hash, or highlight. |
| `prompt-injection-detector` | Heuristic scanner for the most common injection families. |
| `mcp-config-validator` | Sanity-check Claude Desktop / Cursor / Cline / Windsurf / Zed configs. |

📖 Static reference & explainer pages (8):

| Space | What it covers |
|---|---|
| `agent-stack-tour` | Guided tour of all 5 libraries with install commands and live links. |
| `why-this-stack` | The thinking behind the stack — what's broken, why these 5 libs. |
| `install-cheatsheet` | All install commands across pip, npm, and MCP. |
| `mcp-quickstart` | Add the 5 MCP servers to Claude Desktop / Cursor / Cline / Windsurf / Zed. |
| `fit-strategies-explained` | Visual explainer: drop-oldest vs drop-middle vs priority. |
| `trace-format-reference` | Field-by-field reference for the agentsnap trace JSON schema. |
| `prompt-injection-taxonomy` | 10-category taxonomy with examples + the cheap defense for each. |
| `dataset-cards-index` | One-page index of all 13 datasets below. |

📊 Datasets (13) — all MIT, all datasets.load_dataset("mukunda1729/<name>") ready:

| Dataset | Rows | Purpose |
|---|---|---|
| `jailbreak-corpus-mini` | 15 | Curated jailbreak fixtures across 8 categories. |
| `prompt-injection-patterns-extended` | 30 | Prompt-injection patterns across 10 categories. |
| `pii-detection-fixtures` | 25 | PII / secret strings labeled with span offsets. |
| `tool-arg-validation-cases` | 20 | (Tool, schema, args) tuples — valid + invalid. |
| `mcp-tool-test-fixtures` | 22 | MCP tool-call args across 8 categories. |
| `llm-output-extraction-cases` | 20 | Messy LLM outputs with expected JSON. |
| `hallucination-risk-cases` | 20 | Prompt → response pairs rated for hallucination risk. |
| `rag-quality-benchmarks-mini` | 15 | RAG eval queries with ground-truth answers. |
| `agent-trace-samples` | 10 | agentsnap-format tool-call traces (good + regressed pairs). |
| `agent-budget-violations` | 15 | Agent runs with budget caps + actual usage + root cause. |
| `token-counting-edge-cases` | 20 | Strings with token counts across 3 tokenizer families. |
| `model-pricing-table` | 20 | LLM pricing — input/output cost per 1k tokens, context window. |
| `mcp-config-examples` | 15 | MCP client configs across Claude Desktop, Cursor, Cline, Windsurf, Zed. |

Featured Projects

Karna — AI Agent Platform

Self-hosted AI assistant with 7 messaging channels (Telegram, Slack, Discord, WhatsApp, SMS, iMessage, Web), extensible plugin SDK, semantic memory, and voice. TypeScript monorepo with Next.js dashboard and React Native mobile app.

Stack · TypeScript • Node.js • Next.js • Supabase • WebSocket • pgvector

Chetana — AI Consciousness Research Platform

Research-driven platform exploring machine consciousness through 14 indicators grounded in 6 scientific theories. Built to turn abstract AI-consciousness questions into structured experiments, scoring, and analysis.

Stack · AI Research • Evaluation • Experimentation • Python

AgentRAG — Modular RAG Pipeline

Provider-agnostic RAG framework with pluggable vector stores, chunking strategies, and retrieval methods. Designed for agentic workflows with clean API boundaries.

Stack · RAG • Vector Search • Embeddings • TypeScript

Astra Agent — AI Agent Runtime

Standalone AI agent runtime with tool execution, context management, and multi-model routing. Foundation for building autonomous AI assistants with structured tool use.

Stack · TypeScript • LLM Orchestration • Tool Use • Agents

More Projects
| Project | Description |
|---|---|
| Sadhak | AI-powered job search command center — automated evaluation, resume tailoring, application tracking |
| Chetana | AI consciousness research platform — 14 indicators from 6 scientific theories |
| Prithvi | Container security scanner — vulnerability detection, compliance checks, Docker audits |
| Amogha Cafe | Full-stack Firebase restaurant platform — real-time ordering, QR dine-in. Live |
| RNHT | Temple community platform — events, donations, priest scheduling |
| Patchly | AI code review bot — flags bugs, suggests fixes, explains why, like a senior engineer |
| Evalharness | Prompt, agent, and RAG test harness — red teaming, regression testing, CI/CD for AI |
| AgentMem | Pluggable memory management for AI agents |
| LLM Bench CLI | CLI for benchmarking local LLMs — speed, throughput, quality |
| TokenWise | Token usage optimization across providers |

Impact at a Glance

Production AI / ML Impact

| Area | Metric | Context |
|---|---|---|
| Cost efficiency | 78% | infrastructure cost reduction (SageMaker → Bedrock migration) |
| Latency | 600x | retrieval latency improvement (ML prediction system) |
| RAG scale | 30K+ | knowledge base entries (9-stage agentic RAG pipeline) |
| Quality | 370+ | unit tests & evaluations (production ML systems) |
Open Source Footprint

| Area | Metric | Context |
|---|---|---|
| Upstream | 97 | merged PRs in external public repos |
| Packages | 144 | 52 npm (incl. 6 MCP servers, agentkit) + 52 PyPI + 6 in the official MCP Registry + 7 GitHub Marketplace Actions + 14 HF Spaces + 13 HF Datasets |
| Original work | 160 | original public repos maintained on GitHub |
| Ecosystems | 6+ | major org ecosystems: OpenAI, Anthropic, Google, Microsoft, Stanford, Princeton |

What I Build

| Area | Focus |
|---|---|
| ML Systems | Fault prediction, embedding pipelines, model evaluation, cost-optimized inference |
| Agentic AI | RAG pipelines, LangGraph workflows, query routing, hallucination detection |
| Cloud Infrastructure | AWS (Bedrock, SageMaker, ECS, OpenSearch), GCP, Azure, Kubernetes, Terraform |
| Full-Stack | React/TypeScript + Java/Python backend APIs, CI/CD, zero-downtime deployments |

Experience

| Role | Company | Era | Primary arena |
|---|---|---|---|
| AI/ML Engineer | Southwest Airlines | Aug 2025 — Present | production ML, agentic RAG, Bedrock migration |
| AI/ML Engineer | GPS IT Solutions | Jun 2024 — Aug 2025 | RAG platforms, model-risk governance, vector search |
| Software Development Engineer | Amazon Web Services | Aug 2022 — May 2024 | enterprise cloud systems, React/Java/Python, CI/CD |
| Data Engineer | GPS IT Solutions | Jan 2022 — Aug 2022 | data pipelines, AWS Glue, PySpark, analytics workflows |
| Software Engineer | American Express | Feb 2017 — Dec 2020 | Python backend services, REST APIs, enterprise platforms |
Highlights

Southwest Airlines — AI/ML Engineer

  • Architected ML fault prediction system for aircraft maintenance — 5 prediction types, 10K+ records, sub-second retrieval
  • Led SageMaker → Bedrock migration: 78% cost reduction ($1,740→$371/mo), 600x latency improvement
  • Designed 9-stage agentic RAG pipeline (LangGraph, Bedrock Nova Pro/Micro, FAISS + BM25) over 30K+ KB entries

GPS IT Solutions — AI/ML Engineer

  • Built GPT-4 + RAG content generation platform with compliance validation, reducing production time by 40%
  • Designed AI model risk governance framework with 23 automated evaluation tests achieving regulatory compliance
  • Architected FastAPI microservices with FAISS/Pinecone vector search on Kubernetes

Amazon Web Services (AWS) — Software Development Engineer

  • Built and shipped features for AWS Application Manager (Systems Manager) serving enterprise customers globally
  • Owned full-stack delivery: React/TypeScript frontend + Java/Python backend APIs with operational excellence
  • Designed CI/CD and IaC patterns enabling zero-downtime deployments at enterprise scale

GPS IT Solutions — Data Engineer

  • Led end-to-end migration of data pipelines from on-prem to AWS (Glue, PySpark)

American Express — Software Engineer

  • Developed Python backend services and RESTful APIs for enterprise platforms handling high-volume transactions at scale

Follow For

If you follow my work here, you’ll mostly see:

  • open-source contributions to AI SDKs and agent tooling
  • MCP, eval, and developer-experience improvements
  • practical full-stack and infrastructure-heavy AI projects
  • systems thinking around memory, retrieval, orchestration, and production reliability

Education

University of Central Missouri — M.S. in Big Data Analytics and Information Technology (Jan 2021 — May 2022)

SRM University — B.Tech in Mechanical Engineering (2012 — 2016)


Certifications

Anthropic

MCP Advanced · Claude with Bedrock · Claude with Vertex AI · Intro to MCP · Claude Code · Building Claude API · Agent Skills · Subagents · AI Fluency · Claude 101

AWS

AWS GenAI Apps · AWS AI Solutions · AWS AI Fundamentals · Amazon Q

Cloud & Infrastructure

Terraform · GCP Vertex AI Agent

Stanford / Wharton

ML · Stats · Business Analytics · Customer Analytics · People Analytics

Microsoft

GenAI for Devs · GitHub Copilot · Copilot PM

LinkedIn Learning

Deep Learning · TensorFlow · NLP · Python · Apache Spark


Tech Stack

TypeScript · Python · JavaScript · Java · Go · React · Next.js · Node.js · Claude · OpenAI · AWS · GCP · Snowflake · Supabase · PostgreSQL · Docker · Kubernetes · Terraform · FastAPI · Redis · Apache Airflow · Apache Spark · LangChain


GitHub Stats


Live Signals

Profile Views · GitHub Followers · GitHub Stars

ai-eval-forge npm · agent-regression-lens npm · codex-skill-kit PyPI · OSS log activity


Open to opportunities — Senior AI/ML Engineer • GenAI Platform Engineer • Software Engineer

mukunda-ai.vercel.app • Las Vegas, NV
