Define and run the eval
1. Prepare the input data
Get your data into a table, such as a pandas.DataFrame. See: More on data requirements. You can also load data from Evidently Platform, such as tracing data you captured or synthetic datasets.
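As a minimal sketch of this step, the snippet below builds a small table with pandas. The `question` and `answer` column names and the rows themselves are made up for illustration.

```python
import pandas as pd

# Hypothetical Q&A rows; the column names are illustrative, not required by Evidently.
rows = [
    ["What is the capital of France?", "Paris is the capital of France."],
    ["How many days are in a week?", "There are seven days in a week."],
    ["What color is the sky?", "On a clear day, the sky looks blue."],
]
df = pd.DataFrame(rows, columns=["question", "answer"])

print(df.shape)
```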
2. Create a Dataset object
Create a Dataset object with DataDefinition(), which specifies column roles and types. You can also rely on default type detection. See: How to set Data Definition.
3. (Optional) Add descriptors
For LLM and text evals, define the row-level descriptors to compute. You can use a variety of methods, from deterministic checks to LLM judges. Optionally, add row-level tests to get explicit pass/fail outcomes against set conditions. See: How to use Descriptors.
4. Configure Report
For dataset-level evals (classification, data drift) or to summarize descriptors, create a Report with the chosen metrics or presets. See: How to configure Reports.
5. (Optional) Add Test conditions
Add dataset-level pass/fail conditions, for example to check that every text in the dataset is shorter than 100 symbols. See: How to configure Tests.
6. (Optional) Add Tags and Timestamps
Add tags or metadata to identify specific evaluation runs or datasets, or override the default timestamp. See: How to add metadata.
7. Run the Report
To execute the eval, run the Report on the Dataset (or on two Datasets).
8. Explore the results