Human Interview for LLM Evaluation
2 weeks ago

Job description
Similar jobs
We're looking for an LLM Evaluation Benchmarking Experimentation Engineer to rigorously test our proprietary LLM API and build the infrastructure for systematic model improvement. · Execute established benchmarks with methodological rigor. · ...
3 weeks ago
I'm seeking a technical mentor to help deepen my understanding of LLM evaluation and benchmarking · ...
1 month ago
We are currently on behalf of a leading global AI & language technology company looking for an English LLM Data Annotator/Evaluator. · Commitment: 20+ hours/week · Location: Remote – long-term UK-based · Understand what users are actually asking for in a prompt. · Compare evalua ...
3 weeks ago
This role focuses on assessing AI-generated code and strengthening model reliability across production-grade engineering workflows. · ...
1 week ago
We are currently on behalf of a leading global AI & language technology company confidentially looking for an English LLM Data Annotator/Evaluator. · Role Responsibilities: Understand what users are actually asking for in a prompt. · Bullet points from qualifications: Native or flue ...
4 weeks ago
We are currently on behalf of a leading global AI & language technology company looking for an English LLM Data Annotator/Evaluator. This is a part-time/full-time contract role that requires 20+ hours/week. · Understand what users are actually asking for in a prompt. · Compare, e ...
4 weeks ago
We are currently on behalf of a leading global AI & language technology company looking for an English LLM Data Annotator/Evaluator with a commitment of at least 20 hours/week. The role involves understanding user prompts and classifying model responses according to clear gui ...
3 weeks ago
We are currently looking for an English LLM Data Annotator/Evaluator to work on behalf of a leading global AI & language technology company. · Understand what users are actually asking for in a prompt. · Compare, evaluate, and label AI model responses according to clear guidelines. ...
3 weeks ago
We are currently on behalf of a leading global AI & language technology company looking for an English LLM Data Annotator/Evaluator. · We need someone who understands what users are actually asking for in a prompt. Compare, evaluate, and label AI model responses according to clea ...
1 month ago
We are currently looking for an English LLM Data Annotator/Evaluator to work on behalf of a leading global AI & language technology company. · Understand what users are actually asking for in a prompt. · Compare, evaluate and label AI model responses according to clear guidelines ...
4 weeks ago
We are currently on behalf of a leading global AI & language technology company confidentially looking for an English LLM Data Annotator/Evaluator. · Understand what users are actually asking for in a prompt. · Compare, evaluate, and label AI model responses according to clear guid ...
3 weeks ago
We are currently on behalf of a leading global AI & language technology company looking for an English LLM Data Annotator/Evaluator. · ...
4 weeks ago
Position: Software Engineer (Trajectory) · Type: Hourly contract · Compensation: $70 to $130 per hour · Location: Remote · Commitment: 10 to 40 hours per week · Role Responsibilities: Review model-generated code trajectories on real-world software engineering tasks. · ...
5 days ago
Audit, train, and improve Large Language Models (LLMs) specialized in finance. · Evaluating LLM outputs for accuracy. · ...
3 weeks ago
LLM Evaluation Specialist for AI Chat Workflows
We're building an AI-first knowledge management platform with chat-based agents that edit documents, manage plans, and search a knowledge base (RAG). · Evaluate output quality of AI workflows · Diagnose failure modes (hallucinations, grounding issues, tool-call failures) · ...
1 week ago
Creative Writer with Statistical Expertise Needed for LLM Evaluation
We are seeking a talented creative writer who possesses a strong statistical background. · Academic Writing · ...
1 month ago
LLM + Retrieval Engineer … Build a Source-Grounded Outreach Suggestion System + Evaluation Loop
We're building an internal system that helps B2B teams write non-generic outreach by using structured information pulled from public sources (company websites, competitor sites, LinkedIn posts, YouTube video transcripts etc.). · The system should generate actionable outreach sugg ...
1 month ago
Full-Time (40 hrs/week) Finance RLHF / LLM Evaluation Assistant (Rubrics, Golden Answers, QA)
I'm hiring a full-time assistant to help me execute and scale finance-focused RLHF / LLM evaluation contracts. · ...
3 weeks ago
LLM Engineer | 110K | London (visa sponsorship available)
The company is looking for an LLM Engineer to take ownership of language-model systems end to end - from training and fine-tuning through to evaluation and deployment. · ...
1 month ago
We're looking for a part-time Research Advisor or Consultant to help guide our methodology and provide high-level input as we scale our evaluation framework and data programs. · The ideal advisor has experience with LLM evaluation. ...
2 weeks ago