Job Details

Senior Software Engineer

2025-11-12 Droisys all cities, CA

Description:

Title: Sr. Gen AI MLOps Software Developer
Location: San Jose, CA
Schedule: On-site/In-office
Terms: 6 Months – Potential Extension

Strong skills in Python and data analysis libraries (Pandas, NumPy, SQL).
Demonstrable experience or strong projects in LLM/RAG development.
Strong proficiency in agentic LLM libraries and technologies such as LangChain, LangGraph, AutoGen, and CrewAI.
Familiarity with reinforcement learning (RL), techniques for fine-tuning LLMs (e.g., LoRA), and other emerging ML methodologies.

Play a key role in the design, development, and deployment of large-scale, high-performance, enterprise-ready agent frameworks and tools.
Collaborate with the engineering team to understand the specific needs and challenges of chip design, and ensure our agent platform is well suited to them.
Develop and optimize retrieval and generation algorithms for enterprise data (text, code, and images) to build advanced AI applications.
Design, implement, test, and continuously optimize end-to-end RAG pipelines, including data parsing, ingestion, prompt engineering, and chunking strategies.
Collect and organize training and fine-tuning data, and help build domain-specific large language models.
Optimize infrastructure for performance, scalability, and reliability, and ensure secure and efficient management of data.
Stay ahead by engaging with the latest advancements in machine learning and AI to create state-of-the-art solutions.

BS or MS Degree in Electrical Engineering, Computer Science/Engineering, or a related discipline (or equivalent experience).
5+ years of proven industry experience.
Skilled at rapidly taking products from concept to launch and scaling them through performance tuning and optimization of complex, globally distributed systems.
Experience optimizing inference and infrastructure for low-latency, cost-effective operation (vLLM/TGI/Triton, batching, caching, quantization) on GPUs and other accelerators, and supporting on-prem/VPC deployments with enterprise security controls.
A proactive approach to problem-solving and a willingness to acquire new skills and knowledge as needed to achieve results.
