About the Job

Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.

Position: Grafana Evaluation Task Designer
Type: Contract
Compensation: $90–$150/hour
Commitment: 10–15 hours/week

Role Responsibilities

Design realistic, multi-step Grafana workflows, including dashboards, alerting rules, and data source configuration.
Perform each workflow on a hosted Grafana instance to produce a reference trajectory.
Write clear, specific task prompts with measurable outcomes for programmatic verification.
Implement programmatic graders to check task completion accuracy.
Review AI agent attempts, identify failures, and tag root causes.
Calibrate task difficulty to ensure challenges are solvable, iterating on prompts based on model performance.

Qualifications

Must-Have
2+ years of daily, professional Grafana experience.
Deep familiarity with PromQL, dashboard templating, alerting pipelines, and data source configuration.
Ability to articulate workflows for programmatic verification.
Comfort writing basic grading scripts in Python.

Preferred
Experience with Grafana API automation.
Kubernetes/infrastructure monitoring background.
Familiarity with AI evaluation or benchmarking.

Application Process (Takes 20–30 mins to complete)

Upload resume
AI interview based on your resume
Submit form

Resources & Support

For details about the interview process and platform information, please check:
For any help or support, reach out to: ...@mercor.com

PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.