Overview

Define and run the eval

To log the evaluation results to the Evidently Platform, first connect to Evidently Cloud or your local workspace and create a Project. This step is optional: you can also run evals locally in Python.

Prepare the input data

Prepare your data as a table, such as a pandas.DataFrame. More on data requirements. You can also load data from the Evidently Platform, such as tracing data you captured or synthetic datasets.
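As a minimal sketch of such a table, the snippet below builds a small pandas.DataFrame with `Question` and `Answer` columns, the same column names used in the descriptor example later on this page. The rows are made-up sample data, not taken from any real dataset.

```python
import pandas as pd

# Hypothetical sample rows; replace with your own inputs and outputs.
rows = [
    {"Question": "What is the capital of France?", "Answer": "Paris."},
    {"Question": "How many days are in a leap year?", "Answer": "366."},
]

# One row per evaluated example, one column per field.
source_df = pd.DataFrame(rows)
print(source_df.shape)
```

Any other way of producing a DataFrame (reading a CSV file, querying a database) should work equally well, as long as each row is one example to evaluate.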

Create a Dataset object

Create a Dataset object with a DataDefinition() that specifies column roles and types. You can also rely on default type detection by passing an empty DataDefinition(), as in the snippet below. How to set Data Definition.

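from evidently import Dataset, DataDefinition

# Wrap the DataFrame; an empty DataDefinition() uses default type detection.
eval_data = Dataset.from_pandas(
    source_df,
    data_definition=DataDefinition()
)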

(Optional) Add descriptors

For LLM and text evals, define the row-level descriptors to compute. You can choose from a variety of methods, from deterministic checks to LLM judges. Optionally, add row-level tests to get explicit pass/fail outcomes against the conditions you set. How to use Descriptors.

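from evidently.descriptors import TextLength, Sentiment

# Each descriptor adds one new column (named by alias) with a per-row score.
eval_data.add_descriptors(descriptors=[
    TextLength("Question", alias="Length"),
    Sentiment("Answer", alias="Sentiment")
])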
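To make the idea of a row-level descriptor concrete, here is a plain-Python sketch that does not use Evidently at all. It computes a text-length score for every row and attaches a pass/fail outcome to each row. The function name `text_length`, the sample rows, and the 100-character limit are illustrative assumptions; Evidently's own descriptors are configured as shown above.

```python
# Conceptual sketch only: mimics what a row-level descriptor produces.
rows = [
    {"Question": "What is the capital of France?"},
    {"Question": "Why? " * 30},
]

def text_length(text: str) -> int:
    """Deterministic descriptor: number of characters in the text."""
    return len(text)

for row in rows:
    row["Length"] = text_length(row["Question"])
    # Row-level test: an explicit pass/fail outcome for this row.
    row["Length_ok"] = row["Length"] < 100

print([(r["Length"], r["Length_ok"]) for r in rows])
```

The point of the sketch is the shape of the result: every input row gains one new value per descriptor, plus an optional boolean per row-level test.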

Configure Report

For dataset-level evals (classification, data drift) or to summarize descriptors, create a Report with chosen metrics or presets. How to configure Reports.

from evidently import Report
from evidently.presets import DataSummaryPreset

report = Report([
    DataSummaryPreset()
])

(Optional) Add Test conditions

Add dataset-level pass/fail conditions, for example to check that every text in the dataset is shorter than 100 characters. How to configure Tests.

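from evidently.metrics import MaxValue
from evidently.tests import lt

# The Test passes if the maximum of the "Length" column is less than 100.
report = Report([
    DataSummaryPreset(),
    MaxValue(column="Length", tests=[lt(100)]),
])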
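The condition above can be read as: take the maximum of the `Length` column and check that it is below 100. The following plain-Python sketch spells out that logic without Evidently; the sample values and the helper name `max_value_test` are hypothetical.

```python
# Conceptual sketch only: one pass/fail outcome for the whole dataset.
lengths = [30, 42, 97]  # hypothetical values of the "Length" column

def max_value_test(values: list[int], threshold: int) -> bool:
    """Pass if the largest value is strictly less than the threshold."""
    return max(values) < threshold

passed = max_value_test(lengths, threshold=100)
print("PASS" if passed else "FAIL")
```

Unlike a row-level test, this yields a single outcome: one row at or above the threshold is enough to fail the whole dataset.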

(Optional) Add Tags and Timestamps

Add tags or metadata to identify specific evaluation runs or datasets, or override the default timestamp. How to add metadata.

Run the Report

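To execute the eval, run the Report on the Dataset (or on two Datasets, for example to compare current data against a reference).

# Pass None as the second argument when there is no reference dataset.
my_eval = report.run(eval_data, None)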

Explore the results

To upload the results to the Evidently Platform, add the run to your Project. How to upload results.

ws.add_run(project.id, my_eval, include_data=True)

To view the results locally, render the object in a notebook cell or export it. All output formats.

my_eval
# my_eval.json()

Quickstarts

See the quickstarts for end-to-end examples:

LLM quickstart

Evaluate the quality of text outputs.

ML quickstart

Test tabular data quality and data drift.
