Bonfy.AI is building the trust layer for generative AI. Our Adaptive Content Security platform detects and mitigates subtle risks embedded in large language model (LLM) outputs before they reach users. From hallucinations to hidden data leaks, we enable enterprises to deploy GenAI confidently, without compromising truth, privacy, or reputation.

We are model-agnostic, outcome-driven, and unapologetically rigorous. Our customers include leading Fortune 500 teams working in high-stakes sectors where trust is not optional.

Why This Role Matters

We're looking for a Machine Learning Operations (MLOps) Engineer who will play a pivotal role in building and optimizing the systems that power trust-centric GenAI applications. This is a unique opportunity to help scale ML capabilities that deliver safe, performant, and production-grade NLP pipelines—without the need for model training or heavy data pipelines.

What You’ll Do

Improve the performance of existing models, both LLMs and classical NLP models, through advanced prompt engineering, model identification, and sophisticated pre- and post-processing techniques.
Research and design new ML analysis and generation tasks from scratch: define requirements in collaboration with business stakeholders, identify and select models, develop and curate evaluation sets, and iterate to refine results and speed.
Write high-quality, production-grade code within our modular ML framework and ensure smooth integration into our broader system.
Collaborate with Site Reliability Engineers (SREs) to deploy models efficiently and evaluate hardware needs, including GPU profiling and optimization.
Build and refine scalable inference systems using techniques like prefix caching, request batching, and parallelization to maximize inference throughput in production.

What We’re Looking For

Proven experience deploying and optimizing ML/NLP systems in production environments.
Strong skills in Python and deep familiarity with LLMs, embeddings, and traditional NLP architectures.
Demonstrated success in applying prompt engineering and pre/post-processing to improve model outputs.
Practical experience building high-performance inference infrastructure.
A collaborative mindset and ability to work cross-functionally with engineers, researchers, and operations.
Comfort navigating ambiguity and iterating rapidly in a fast-paced, early-stage startup.

Bonus Points For

Background in content safety, compliance, or trust-sensitive AI use cases.
Experience building evaluation sets and metrics aligned to real-world application constraints.
Familiarity with vector databases and scalable LLM stacks (e.g., Bedrock, LangChain).
Exposure to GPU-aware system design and optimization techniques.

Why Join Us

Early-stage impact: your infrastructure will shape the way our platform scales and safeguards trust.
You’ll work directly with sharp engineers, AI scientists, and product leaders in a high-autonomy environment.
We value clarity, urgency, and intellectual honesty. Your voice will be heard, and your contributions will be real.
Competitive compensation: salary, equity, health/vision/dental, and a flexible hybrid work model.

Apply If...

You believe that MLOps isn’t just about automation; it’s about ensuring AI systems remain accountable, safe, and scalable.
You understand that true trust in AI requires operational rigor, not just clever code.
You want to help build the foundation for secure, responsible AI at a moment when it matters most.

Bonfy.AI — Truth. Security. Intelligence.
