Querying Feedback
LangSmith makes it easy to fetch feedback associated with your runs. The Run
object itself has aggregate feedback_stats
on its body, which may
satisfy your needs. If you want the additional feedback metadata (or full list of feedback objects), you can use the SDK or API to list feedback objects based on run IDs, feedback keys, and feedback source types.
Using the list_feedback
method in the SDK or /feedback
endpoint in the API, you can fetch feedback to analyze. Most simple requests can be satisfied using the following arguments:
run_ids
: Fetch feedback for specific runs by providing their IDs.feedback_keys
: Filter feedback by specific keys, such as 'correctness' or 'quality'.feedback_source_types
: Filter feedback by the source type, such as 'model' for model-generated feedback or 'api' for feedback submitted via the API.
All the examples below assume you have created a LangSmith client and configured it with your API key to connect to the LangSmith server.
- Python SDK
- TypeScript SDK
from langsmith import Client
client = Client()
import { Client, Feedback } from "langsmith";
const client = new Client();
List all feedback for a specific run
- Python SDK
- TypeScript SDK
run_feedback = client.list_feedback(run_ids=["<run_id>"])
const runFeedback: Feedback[] = [];
for await (const feedback of client.listFeedback({
runIds: ["<run_id>"],
})) {
runFeedback.push(feedback);
}
List all feedback with a specific key
- Python SDK
- TypeScript SDK
correctness_feedback = client.list_feedback(feedback_key=["correctness"])
const correctnessFeedback: Feedback[] = [];
for await (const feedback of client.listFeedback({
feedbackKeys: ["correctness"],
})) {
correctnessFeedback.push(feedback);
}
List all model-generated feedback
- Python SDK
- TypeScript SDK
model_feedback = client.list_feedback(feedback_source_type=["model"])
const modelFeedback: Feedback[] = [];
for await (const feedback of client.listFeedback({
feedbackSourceTypes: ["model"],
})) {
modelFeedback.push(feedback);
}
Use Cases
Here are a few common use cases for querying feedback:
Compare model-generated and human feedback for a set of runs
After querying for a set of runs, you can compare the model-generated and human feedback for those specific runs:
- Python SDK
- TypeScript SDK
# Query for runs
runs = client.list_runs(project_name="<your_project>", filter='gt(start_time, "2023-07-15T00:00:00Z")')
# Extract run IDs
run_ids = [run.id for run in runs]
# Fetch model-generated feedback for the runs
model_feedback = client.list_feedback(run_ids=run_ids, feedback_source_type=["model"])
# Fetch human feedback for the runs
human_feedback = client.list_feedback(run_ids=run_ids, feedback_source_type=["api"])
// Query for runs
const runs: Run[] = [];
for await (const run of client.listRuns({
projectName: "<your_project>",
filter: 'gt(start_time, "2023-07-15T00:00:00Z")',
})) {
runs.push(run);
}
// Extract run IDs
const runIds = runs.map(run => run.id);
// Fetch model-generated feedback for the runs
const modelFeedback: Feedback[] = [];
for await (const feedback of client.listFeedback({
runIds,
feedbackSourceTypes: ["model"],
})) {
modelFeedback.push(feedback);
}
// Fetch human feedback for the runs
const humanFeedback: Feedback[] = [];
for await (const feedback of client.listFeedback({
runIds,
feedbackSourceTypes: ["api"],
})) {
humanFeedback.push(feedback);
}
Analyze feedback for a specific key and set of runs
If you're interested in analyzing feedback for a specific key, such as 'correctness' or 'quality', for a set of runs, you can query for the runs and then filter the feedback by key:
- Python SDK
- TypeScript SDK
# Query for runs
runs = client.list_runs(project_name="<your_project>", filter='gt(start_time, "2023-07-15T00:00:00Z")')
# Extract run IDs
run_ids = [run.id for run in runs]
# Fetch correctness feedback for the runs
correctness_feedback = client.list_feedback(run_ids=run_ids, feedback_key=["correctness"])
# Analyze the correctness scores
scores = [feedback.score for feedback in correctness_feedback]
// Query for runs
const runs: Run[] = [];
for await (const run of client.listRuns({
projectName: "<your_project>",
filter: 'gt(start_time, "2023-07-15T00:00:00Z")',
})) {
runs.push(run);
}
// Extract run IDs
const runIds = runs.map(run => run.id);
// Fetch correctness feedback for the runs
const correctnessFeedback: Feedback[] = [];
for await (const feedback of client.listFeedback({
runIds,
feedbackKeys: ["correctness"],
})) {
correctnessFeedback.push(feedback);
}
// Analyze the correctness scores
const scores = correctnessFeedback.map(feedback => feedback.score);