Transform YouTube Videos into Conversational Avatars with the Sieve API toolkit.

aksharasoman/video2avatar-dialogue

This repository contains the script pipeline.py to transform YouTube videos into engaging, conversational avatars using Sieve APIs. The goal is to automate the process of repurposing video content into interactive dialogues between two talking avatars, ideal for storytelling, educational purposes, or creating dynamic presentations.

Overview

The pipeline.py script achieves the following:

  1. Download YouTube Video: Downloads the video as an MP4 file using Sieve's youtube_to_mp4 function.
  2. Summarize Content: Converts the video content into a conversational-style summary between two speakers.
  3. Text-to-Speech Conversion: Uses Sieve's TTS API to convert the summarized dialogue into speech.
  4. Talking Avatar Generation: Creates two distinct avatars to narrate the conversation using Sieve's portrait-avatar API.
  5. Merge Video Clips: Combines individual video segments into a final video using ffmpeg.

Key Benefits

  • Repurpose Content: Convert lengthy videos into bite-sized, conversational narratives.
  • Interactive Presentations: Make content more engaging with avatars.
  • Time Efficiency: Summarization saves time while retaining the core message.
  • Creative Possibilities: Perfect for storytelling, education, or marketing.

Files

  • pipeline.py: the script implementing the 'video2dialogue' tool.
  • parallelized_version/pipeline.py: a more efficient, parallelized version of the script.
  • examples/: contains a sample output video and the generated summary text.
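The README does not describe how parallelized_version/pipeline.py gets its speed-up. As an illustration only, one common way to parallelize independent per-turn work in Python is a thread pool. In the sketch below, `render_in_parallel` and `render_one` are hypothetical names, and `render_one` stands in for whatever function produces the clip for a single turn.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

Turn = TypeVar("Turn")
Clip = TypeVar("Clip")


def render_in_parallel(
    turns: Sequence[Turn],
    render_one: Callable[[Turn], Clip],
    max_workers: int = 4,
) -> List[Clip]:
    """Render every turn concurrently and return the clips in turn order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even when later
        # turns finish first, so the conversation order is preserved.
        return list(pool.map(render_one, turns))
```

Threads suit this kind of workload because each turn mostly waits on a remote job rather than using the local CPU.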

Installation

Requirements

  • Python 3.7+
  • Sieve Python Client
  • ffmpeg
  1. Clone this repository:

    git clone https://github.com/aksharasoman/video2avatar-dialogue.git
    cd video2avatar-dialogue
  2. Install dependencies:

    pip install sievedata
  3. Authenticate with Sieve:

    sieve login
  4. Run the script. Ensure pipeline.py is in your project folder, then execute the pipeline with:

    python pipeline.py
  5. Output

    The final video featuring talking avatars will be saved in your project directory. Logs and job statuses can be monitored on the Sieve dashboard.
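Before the first run, it can help to confirm that the requirements listed above are actually available. The following is a small optional check, not part of pipeline.py; the function name `missing_requirements` is hypothetical, and it assumes the sievedata package is imported under the module name `sieve`, as in the snippets in this README.

```python
import importlib.util
import shutil
from typing import List, Sequence


def missing_requirements(
    executables: Sequence[str] = ("ffmpeg",),
    modules: Sequence[str] = ("sieve",),
) -> List[str]:
    """Return the names of required executables and modules that were not found."""
    missing = []
    for executable in executables:
        # shutil.which returns None when the program is not on PATH.
        if shutil.which(executable) is None:
            missing.append(executable)
    for module in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(module)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("All requirements found." if not problems else f"Missing: {problems}")
```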

Explanation of pipeline.py

The script runs the following steps; the key calls for each are shown below:

  1. Download YouTube Video:

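    import sieve  # installed by the sievedata package
    # url holds the address of the YouTube video to convert
    youtube_to_mp4 = sieve.function.get("sieve/youtube_to_mp4")
    output_video = youtube_to_mp4.run(url, resolution="highest-available", include_audio=True)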
  2. Summarize as Conversation:

    visual_summarizer = sieve.function.get("sieve/visual-qa")
    summary_as_conversation = visual_summarizer.run(output_video, prompt="Summarize into a dialogue between 2 people.", fps=1)

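    The prompt matters here: it has to ask explicitly for a dialogue between two people, because the later steps process the summary one conversational turn at a time.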
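The per-turn processing in the next step needs the summary split into speaker turns. The README does not show the exact text that visual-qa returns, so the following is only a minimal sketch that assumes lines of the form `Speaker 1: ...`; the pattern and the helper name `parse_turns` are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed line format: "<speaker label>: <utterance>", e.g. "Speaker 1: Hello".
TURN_PATTERN = re.compile(r"^\s*([^:\n]{1,40}):\s*(.+)$")


def parse_turns(summary: str) -> List[Tuple[str, str]]:
    """Split a dialogue summary into (speaker, text) turns."""
    turns: List[Tuple[str, str]] = []
    for line in summary.splitlines():
        match = TURN_PATTERN.match(line)
        if match:
            turns.append((match.group(1).strip(), match.group(2).strip()))
        elif line.strip() and turns:
            # A non-empty line without a speaker label is treated as a
            # continuation of the previous turn. A continuation line that
            # itself contains an early colon would be misread as a new turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line.strip()}")
    return turns
```

If the model returns a different layout (for example Markdown-style bold speaker names), the pattern would need adjusting.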

  3. Text-to-Speech and Avatar Generation:

    tts = sieve.function.get("sieve/tts")
    portrait_avatar = sieve.function.get("sieve/portrait-avatar")

    We run these Sieve functions once for each turn of the conversation to generate the corresponding avatar video. Each speaker gets a different voice as input to the tts function and a different avatar image as input to the portrait-avatar function.
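The per-turn loop described above could be organized roughly as follows. This is a sketch, not the code in pipeline.py: `assign_assets` and `render_turns` are hypothetical helpers, and `synthesize` and `animate` are placeholder callables that would wrap `tts.run(...)` and `portrait_avatar.run(...)`. The argument lists of those Sieve functions are not shown in this README, so they are deliberately left out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Turn = Tuple[str, str]  # (speaker, text)


def assign_assets(
    turns: Sequence[Turn],
    voices: Sequence[str],
    avatar_images: Sequence[str],
) -> Dict[str, Tuple[str, str]]:
    """Give each speaker, in order of first appearance, a voice and an avatar image."""
    assets: Dict[str, Tuple[str, str]] = {}
    for speaker, _ in turns:
        if speaker not in assets:
            index = len(assets)
            if index >= min(len(voices), len(avatar_images)):
                raise ValueError(f"No voice/avatar left for speaker {speaker!r}")
            assets[speaker] = (voices[index], avatar_images[index])
    return assets


def render_turns(
    turns: Sequence[Turn],
    assets: Dict[str, Tuple[str, str]],
    synthesize: Callable[[str, str], object],
    animate: Callable[[str, object], object],
) -> List[object]:
    """Produce one avatar clip per turn, in conversation order."""
    clips = []
    for speaker, text in turns:
        voice, avatar_image = assets[speaker]
        audio = synthesize(text, voice)            # placeholder for the tts call
        clips.append(animate(avatar_image, audio)) # placeholder for the portrait-avatar call
    return clips
```

Passing the Sieve calls in as callables keeps the loop easy to try out with stand-in functions before spending API credits.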

  4. Merge Video Clips:

    ffmpeg -f concat -safe 0 -i file_list.txt -c copy output.mp4
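The concat command reads the clip order from file_list.txt, which contains one `file '<path>'` line per clip. The sketch below shows one way to write that file and invoke the same command from Python; the helper names `write_concat_list`, `concat_command`, and `merge_clips` are hypothetical and may not match what pipeline.py does.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence, Union

PathLike = Union[str, Path]


def write_concat_list(clip_paths: Sequence[PathLike], list_path: PathLike = "file_list.txt") -> Path:
    """Write the list file read by ffmpeg's concat demuxer, one clip per line."""
    lines = []
    for clip in clip_paths:
        # Inside single quotes, a literal quote is written as '\''.
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    path = Path(list_path)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def concat_command(list_path: PathLike, output_path: PathLike = "output.mp4") -> List[str]:
    """Build the same ffmpeg invocation as shown in the README."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", str(list_path), "-c", "copy", str(output_path)]


def merge_clips(clip_paths: Sequence[PathLike], output_path: PathLike = "output.mp4") -> None:
    """Write the list file, then run ffmpeg; raises CalledProcessError on failure."""
    list_path = write_concat_list(clip_paths)
    subprocess.run(concat_command(list_path, output_path), check=True)
```

Because `-c copy` joins the streams without re-encoding, the clips need to share the same codec and encoding parameters. Relative paths in the list file are resolved relative to the list file's location.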

Tutorial

For a detailed explanation, follow the tutorial here.

Demo

For a complete working example, see the demo here.

Acknowledgments

Special thanks to Sieve for their powerful APIs that made this project possible.
