import os
from improve_diarization_with_llm import claude_corrector
os.environ['ANTHROPIC_API_KEY'] = 'your-api-key'  # Replace with your actual API key
input_file = 'path/to/your/input/transcript.txt'  # Replace with your actual input file path
output_file = 'path/to/your/output/improved_transcript.txt'  # Replace with your desired output file path
corrector = claude_corrector.ClaudeDiarizationCorrector(input_file, output_file)
corrector.process_conversation()  # This assumes a valid ANTHROPIC_API_KEY environment variable and input path
improve-diarization-with-llm
This tool takes a long transcript (more than 10 hours) of diarized content and improves the diarization by prompting an LLM to find and fix obviously incorrect speaker attributions. Credit for the idea goes to this paper: https://arxiv.org/html/2401.03506v4
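To give a rough feel for the approach, here is a minimal sketch of how a long diarized transcript might be split into overlapping windows and wrapped in a correction prompt. It is not the package's actual implementation: the function names `chunk_transcript` and `build_prompt`, the `Speaker N: text` line format, the character budget, and the prompt wording are all assumptions made for illustration, and the step that sends the prompt to an LLM is left out.

```python
from typing import Iterator, List


def chunk_transcript(
    lines: List[str], max_chars: int = 8000, overlap_lines: int = 2
) -> Iterator[List[str]]:
    """Yield windows of transcript lines of roughly max_chars characters.

    Hypothetical helper: the last `overlap_lines` lines of each window are
    repeated at the start of the next one, so the model sees some context
    across window boundaries.
    """
    chunk: List[str] = []
    size = 0
    for line in lines:
        # Close the current window once adding this line would exceed the budget.
        if chunk and size + len(line) > max_chars:
            yield chunk
            chunk = chunk[-overlap_lines:] if overlap_lines > 0 else []
            size = sum(len(kept) for kept in chunk)
        chunk.append(line)
        size += len(line)
    if chunk:
        yield chunk


def build_prompt(chunk: List[str]) -> str:
    """Wrap one window in an instruction asking for attribution fixes only."""
    # The wording below is illustrative, not the prompt used by the package.
    instruction = (
        "Below is part of a diarized transcript. Correct only speaker labels "
        "that are obviously wrong, keep the wording unchanged, and return the "
        "transcript in the same format."
    )
    return instruction + "\n\n" + "\n".join(chunk)


if __name__ == "__main__":
    sample = [
        "Speaker 1: Hi, how are you?",
        "Speaker 2: Good, thanks.",
        "Speaker 1: Great.",
        "Speaker 2: What's the agenda?",
    ]
    for window in chunk_transcript(sample, max_chars=60, overlap_lines=1):
        print(build_prompt(window))
        print("-" * 20)
```

Overlapping the windows is one simple way to keep a speaker turn that straddles a boundary visible in both prompts; the corrected windows would still need to be merged afterwards, which this sketch does not cover.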
Install
pip install improve_diarization_with_llm