Start by establishing a handful of test cases: the core use cases and failure cases that your prompt must handle. As you explore modifications to the prompt, use promptfoo eval to rate ...
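To make the idea of a small test set concrete, here is a minimal sketch in plain Python. It does not use promptfoo's API or configuration format. Every name in it (`PromptCase`, `stub_model`, `evaluate`, and the two prompt templates) is hypothetical, and `stub_model` is a deterministic stand-in for a real LLM call. It shows one core case and one failure case being run against two prompt variants, so that each variant gets a comparable pass rate.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PromptCase:
    """One test case: variables for the prompt template plus a pass/fail check."""
    name: str
    variables: dict[str, str]
    check: Callable[[str], bool]


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call, kept deterministic for this sketch."""
    text = prompt.split("Input:", 1)[-1].strip()
    if not text:
        # Only "handles" empty input when the prompt tells it how to.
        return "ERROR: empty input" if "empty" in prompt.lower() else ""
    return text.upper() if "upper" in prompt.lower() else text


def evaluate(template: str, cases: list[PromptCase],
             model: Callable[[str], str] = stub_model) -> float:
    """Return the fraction of cases whose model output passes its check."""
    passed = sum(
        case.check(model(template.format(**case.variables))) for case in cases
    )
    return passed / len(cases)


# One core use case and one failure case (both invented for illustration).
CASES = [
    PromptCase("core: plain word", {"text": "hello"}, lambda out: out == "HELLO"),
    PromptCase("failure: empty input", {"text": ""}, lambda out: out.startswith("ERROR")),
]

PROMPT_A = "Convert to upper case.\nInput: {text}"
PROMPT_B = "Convert to upper case. If the input is empty, reply with ERROR.\nInput: {text}"

if __name__ == "__main__":
    for label, template in [("A", PROMPT_A), ("B", PROMPT_B)]:
        print(f"prompt {label}: pass rate {evaluate(template, CASES):.0%}")
```

Keeping the cases fixed while the prompt changes is what makes the scores comparable: here the second variant passes the failure case that the first one misses.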