Format conversation transcription

Convert a file containing diarized conversation into finetuning compatible format

source

convert_file

 convert_file (input_file_path, assistant_speaker, output_file_path=None,
               system_context=None)

*Convert a file containing speaker dialogues to JSON format.

Args: input_file_path (str): Path to the input text file. assistant_speaker (str): The speaker number to designate as the assistant. output_file_path (str, optional): Path to save the output JSON file. If not provided, returns the JSON string.

Returns: str or None: If output_file_path is not provided, returns the JSON string. Otherwise, saves to file and returns None.*


source

process_speakers

 process_speakers (input_text, assistant_speaker, system_context=None)

*Process the input text and convert it to the desired JSON format.

Args: input_text (str): The input text containing speaker dialogues. assistant_speaker (str): The speaker number to designate as the assistant.

Returns: dict: A dictionary containing the processed messages.*