AI Systems Engineer (Agent / LLM)
Reporting Line: Directly to the CEO of Sophon (Location: Guangzhou / Shanghai)
About Sophon
Sophon is an AI-native enterprise operating system. It integrates all corporate data sources (email, instant messaging, business systems), processing every inbound and outbound message and building a persistent memory and knowledge base from each interaction. Think of it as a version of you with unlimited time and perfect information: drafting replies, managing decisions, and coordinating teams.
Unlike traditional tools (Feishu, WeCom, Slack), which were designed for human-to-human communication with AI added later, Sophon was built for AI from the very first line of code. It not only optimizes human management loops but also enables collaboration between humans and AI agents, and even agent-to-agent workflows: scenarios traditional tools never envisioned.

Analogy: you could buy a phone running a legacy OS with a chatbot bolted on, or you could buy a phone where AI is woven into the operating system itself. Sophon is the latter.
About the Team
- CEO: Stanford Physics (BS) + Computer Science (MS). Founded his first company at 18, achieving ¥100M+ ARR in year one. Deeply involved in technical direction and coding.
- CTO: Stanford Computer Science graduate, former Tesla Autopilot engineer, led engineering teams of dozens.
The team is flat and fast-paced. Both the CEO and CTO code directly, and decision chains are short. You will work closely with them.
Tech Stack
Python and TypeScript.
Role Definition
This is not a research role. No pre‑training, fine‑tuning, or paper publishing.
This is about engineering Agent systems: performance and reliability depend on what the Agent “sees” and how it is structured. Your responsibility is to design and implement the Agent’s context harness—its input/output structures, retrieval strategies, and inference integration.
Key Responsibilities:
- Inference deployment & scheduling: build stable, low‑latency, cost‑controlled services around open‑source inference frameworks.
- Context engineering: decide what enters the Agent’s context window, how it is structured, when to inject or replace—moving beyond prompt hacks to system‑level design.
- Retrieval & memory integration: design boundaries between probabilistic retrieval (vectors) and deterministic injection (rules) and orchestrate their cooperation.
- Agent interface protocols: structured communication between agents, tools, and humans.
You’ll pair‑program with CEO/CTO. Architecture judgment comes from continuous coding practice—this is a hands‑on role.
Requirements
- Bachelor's degree from a top 985 university or a leading overseas institution.
- Actively coding.
- Real Agent/LLM application portfolio (not notebooks or simple API wrappers). You must be able to demo your work and explain its architecture trade-offs, failures, and fixes.
- Strong Python (async, type system, memory model, packaging). Able to use in production. TypeScript: read/write comfortably.
- Cross‑layer debugging ability: model behavior, retrieval quality, agent orchestration.
- Clear, evidence‑based judgments on mainstream choices: Tool Use vs Function Calling, structured output reliability, RAG vs fine‑tuning vs prompt engineering, multi‑agent failure modes.
- Self-driven: no technical guidance will be provided; you are the guide.
- Production deployment of open‑source inference frameworks (vLLM, SGLang, TensorRT‑LLM).
- Deep knowledge of Agent frameworks (LangGraph, DSPy), contributions or source code familiarity.
- Open‑source project leadership.
- Multilingual NLP or cross‑cultural Agent experience.
- English working proficiency.
Interview Process
Resume → Video call. Live demo of your work. We’ll pose a real Sophon architecture problem (e.g., scaling file‑system‑based agent coordination, designing prompt‑degradation detection). No algorithm quizzes, no written tests.
This Role Isn't Right for You If
- You are primarily interested in model training or academic publication.
- Your experience is limited to API calls or basic prompt assembly.
- You believe that "more context" or "larger models" automatically yields better results.
- You cannot clearly articulate the rationale behind your system design decisions.
- Your experience is confined to notebooks, without delivering production-ready systems.
- You require fully detailed specifications before starting work.