Software Engineer, AI Platform

KRIS INFOTECH PTE. LTD.

Job Description

  • You will own how these platforms are built and run: backend services, the developer-facing portal, integrations with internal systems, and the infrastructure underneath.
  • The data science team remains closely involved as domain experts and primary internal customers.
  • You are the engineer who makes sure what they need actually ships and holds together in production.
  • This is a software engineering role first.
  • You should understand LLM and agent concepts — how tool calling works, why agent loops behave non-deterministically, how context management affects system design.
  • Multi-agent workflows — moving from single-agent execution to coordinated multi-agent systems that can handle complex, long-horizon tasks across business functions.
  • Core system integration — deeper connections with the company's operational and commercial systems, so AI agents can act on real data in real time rather than working in isolation.
  • Real-time decision support — expanding the platform's role from task automation toward live recommendations and decision assistance in high-stakes operational contexts.

What you'll work on:

  • Agent runtime & execution — session state, agent lifecycle management (pause, resume, plan mode), spec versioning, execution safety.
  • Auth, security & governance — OAuth integrations, credential rotation, PII controls, prompt injection mitigation, runtime guardrails.
  • Platform infrastructure — cost and usage tracking, distributed tracing (OpenTelemetry), scaling across business units.
  • Protocol & ecosystem integrations — MCP, A2A, WebSocket-based tooling, multi-cloud file handling.
  • Observability & developer tooling — metrics dashboards, debugging infrastructure, the developer portal that internal teams use to build and test agents.

Requirements

Must-have:

  • Strong backend engineering experience in Node.js / TypeScript. Track record of owning services end to end in production — not just shipping features to a spec.
  • Comfortable across the stack. Able to pick up frontend work in React when the ticket calls for it, even if backend is your primary strength.
  • Solid API design skills. Familiar with authentication and authorisation patterns (OAuth and similar) and experienced in debugging production systems.
  • Familiarity with observability practices — logging, tracing, metrics. OpenTelemetry or equivalent experience preferred.
  • Conceptual fluency with LLM agent systems: tool calling, context windows, streaming, non-determinism. Enough to make sound architectural decisions in this domain. No ML or model training background required.
  • Comfortable with ambiguity. This team is small. You will often decide how something should be built, not just execute a spec.

Nice-to-have:

  • Hands-on experience with agent frameworks or protocols (LangChain, Google ADK, MCP, A2A, or similar).
  • Cloud infrastructure experience (AWS).
  • Background in platform or developer-tooling engineering — building for other engineers as your primary customer.
  • Interest in AI security: prompt injection, guardrails, and safe agent execution are real parts of this backlog.