Requesting guidance to add PT-BR pairs. #6

Hello maintainers!

I am interested in contributing to JudgeBench by adding a new set of response pairs in the `data/` folder, using PT-BR (Brazilian Portuguese) benchmarks.

While reading the paper and exploring the repository, I noticed that a central part of the methodology involves generating multiple responses (**'k'** responses) for each question and selecting a pair consisting of one correct and one subtly incorrect response. However, I could not find in the JudgeBench codebase the procedure or script responsible for generating and selecting these pairs.
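
To make clear what I am looking for, here is a minimal sketch of how I currently understand the procedure. It is only my assumption, not code from the JudgeBench repository: `select_pair`, `generate`, and `is_correct` are hypothetical names, and the sketch picks any incorrect response rather than a specifically "subtly" incorrect one, since I do not know how that filtering is done.

```python
import random
from typing import Callable, Optional


def select_pair(
    question: str,
    reference_answer: str,
    generate: Callable[[str], str],          # hypothetical: samples one model response
    is_correct: Callable[[str, str], bool],  # hypothetical: checks a response against the reference
    k: int = 5,
    seed: int = 0,
) -> Optional[dict]:
    """Sample k responses and return one correct/incorrect pair, or None."""
    rng = random.Random(seed)

    # Step 1: generate k candidate responses for the question.
    responses = [generate(question) for _ in range(k)]

    # Step 2: split the candidates by correctness against the reference answer.
    correct = [r for r in responses if is_correct(r, reference_answer)]
    incorrect = [r for r in responses if not is_correct(r, reference_answer)]

    # Step 3: a pair can only be formed if both groups are non-empty.
    if not correct or not incorrect:
        return None

    return {
        "question": question,
        "correct": rng.choice(correct),
        "incorrect": rng.choice(incorrect),
    }
```

If the actual pipeline differs from this (for example in how correctness is verified or how questions with all-correct or all-incorrect responses are handled), that is exactly the detail I would like to replicate for the PT-BR data.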

My questions are:
- Could you point out where the logic/tool for generating the 'k' responses and selecting the correct/subtly incorrect pair for each question is implemented or described?
- Is there a preprocessing script or notebook, or are there any methodological recommendations for this process, that could be shared?

My intention is to follow the project's methodological standards so that the new PT-BR benchmarks are compatible and valuable to the community.

Thank you in advance for your attention, and congratulations on your work!

