AI tool usage during evaluation work

The entire purpose of human evaluation is to provide genuine human judgment. Using AI tools to do the evaluation for you defeats the purpose and produces training data that actively degrades model quality. Our policy on AI/LLM usage is strict and non-negotiable.

What is allowed

  • Grammar and spelling correction — You may use tools like Grammarly or a spell-checker to clean up typos and grammatical errors in your written justifications.
  • Minor phrasing refinement — Small adjustments to tone or wording are acceptable, as long as the ideas, reasoning, and structure are entirely your own.

What is strictly prohibited

  • Using AI to evaluate or assess response quality. You may not ask ChatGPT, Claude, Gemini, or any other AI system to tell you which response is better.
  • Using AI to generate justifications or explanations. Your rationale for why one response is better must come from your own reasoning.
  • Using AI to predict code behaviour or runtime results. If a task involves evaluating code, you must assess it yourself.
  • Having AI compose or substantially rewrite your submissions. AI may not write your responses, even partially. Grammar cleanup and minor phrasing refinement are the only exceptions.

Enforcement

Submissions are monitored for signals of AI-generated content. If your work is flagged as AI-assisted beyond the allowed scope:

  • Your submission will be rejected.
  • You will receive a warning or, in serious cases, your contract will be terminated immediately.
  • Repeated violations result in a permanent ban from the platform.

The rule of thumb: AI tools may fix how you write, but never what you write. The judgment, analysis, and reasoning must be yours.

Frequently asked questions

Can I use a spell-checker or grammar tool like Grammarly?

Yes. Spell-checkers and grammar tools that correct typos and grammatical errors are allowed. What's prohibited is using AI to generate, evaluate, or substantially rewrite the content of your submissions.

Can I use a thesaurus or dictionary?

Yes. Looking up word definitions or synonyms in a dictionary or thesaurus is fine. These are reference tools, not AI-powered content generators.

What about translation tools? Can I use Google Translate?

Using translation tools to understand a prompt written in a language other than your native one is a grey area. If the task requires evaluation in a specific language, you should be proficient enough in that language to work without translation. If you need clarification on a term, ask in the project Discord channel.

How does Sovrano AI detect AI-generated content?

We use a combination of automated detection tools, statistical analysis, and human review. The specific methods are not disclosed to prevent circumvention. Detection is ongoing and may flag work retroactively.

What happens on a first offense?

First offenses are reviewed on a case-by-case basis. Minor or ambiguous cases may result in a warning and the submission being rejected. Clear, deliberate use of AI to generate evaluation content typically results in immediate contract termination.

What if I suspect another evaluator is using AI?

Report your concern to your project lead or email hello@sovrano.ai. You can report anonymously. Provide any relevant details (patterns you've noticed, specific submissions, etc.). All reports are investigated.

Does this policy apply to assessments and interviews too?

Yes. AI and LLM tools are strictly prohibited during assessments unless the assessment explicitly states otherwise. The same rules apply to all work done on the Sovrano AI platform.

Can I use AI tools for personal research unrelated to my evaluation tasks?

Your personal use of AI tools outside of Sovrano AI work is your business. However, you may never input any project data, prompts, model outputs, or confidential information into any external AI tool — this would be both a policy violation and an NDA breach.
