Development of a System of Five Neural Networks for TV Broadcast Analysis

AI-Powered TV Broadcast Analysis

The Goal

Our client is a developer of digital tools for marketers. This time, our task was to create a service for TV broadcast analysis. The point of the analysis is to tailor advertising to TV viewers based on the content they watch.

Timeline: 6 months

Year: 2024

Technologies: YOLOv8, Rev AI, ChatGPT, Tesseract

What We Decided to Analyze and Which Neural Networks We Used

For video content analysis, we integrated five neural networks into the project (a rough sketch of how they fit together follows the list):

  • The first YOLOv8 model detects logos.
  • The second YOLOv8 model identifies common objects (umbrellas, balls, people, sneakers, dogs, etc.).
  • Rev AI transcribes speech and sends the text to ChatGPT.
  • ChatGPT extracts mentions of brands, cities, and celebrities from the transcript and determines sentiment (positive/negative attitude of the speaker toward an entity).
  • Tesseract recognizes static text within the video frame, such as subtitles, captions, and text-based logos.
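A minimal sketch of how these stages can be chained for a single frame and its transcript is shown below, assuming the Ultralytics, pytesseract, and OpenAI Python packages. The weight paths, model choice, prompt wording, and helper names are illustrative assumptions rather than the production setup; the transcript itself is taken to arrive from Rev AI upstream.

```python
# Illustrative pipeline sketch; weight paths, model choice, and prompt are assumptions.
import pytesseract
from openai import OpenAI
from ultralytics import YOLO

logo_model = YOLO("weights/logos_yolov8.pt")   # hypothetical custom logo weights
object_model = YOLO("yolov8n.pt")              # COCO-pretrained general-object model
llm = OpenAI()                                 # reads OPENAI_API_KEY from the environment

def analyze_frame(frame):
    """Run both detectors and OCR on a single decoded video frame (numpy image)."""
    logos = logo_model(frame)[0]
    objects = object_model(frame)[0]
    on_screen_text = pytesseract.image_to_string(frame)
    return {
        "logos": [logo_model.names[int(b.cls)] for b in logos.boxes],
        "objects": [object_model.names[int(b.cls)] for b in objects.boxes],
        "text": on_screen_text.strip(),
    }

def analyze_transcript(transcript: str) -> str:
    """Ask ChatGPT to extract brands, cities, and celebrities, plus sentiment."""
    response = llm.chat.completions.create(
        model="gpt-4o-mini",  # model choice is an assumption
        messages=[
            {
                "role": "system",
                "content": "Extract brands, cities, and celebrities from the transcript "
                           "and label the speaker's attitude toward each as positive or "
                           "negative. Answer as JSON.",
            },
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```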

How the Project Works: A Look at Microservices

TV broadcast analysis needs to be fast, making performance a major challenge. To ensure smooth operation, we implemented a microservice architecture.

The product consists of multiple services, each handling a specific task. This approach makes the project highly scalable: the owner can add new services, or scale existing ones, to improve performance and throughput as needed.
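The case does not publish the service code, but one of these services could look roughly like the FastAPI sketch below, which wraps the logo detector behind an HTTP endpoint. The route name, payload, and response format are assumptions.

```python
# Hypothetical logo-detection microservice; route and response format are assumptions.
import cv2
import numpy as np
from fastapi import FastAPI, UploadFile
from ultralytics import YOLO

app = FastAPI(title="logo-detector")
model = YOLO("weights/logos_yolov8.pt")  # assumed path to the custom logo weights

@app.post("/detect/logos")
async def detect_logos(frame: UploadFile):
    """Accept an encoded image (e.g. a JPEG video frame) and return detected logos."""
    data = np.frombuffer(await frame.read(), dtype=np.uint8)
    image = cv2.imdecode(data, cv2.IMREAD_COLOR)
    result = model(image)[0]
    return {
        "logos": [
            {"label": model.names[int(box.cls)], "confidence": float(box.conf)}
            for box in result.boxes
        ]
    }
```

The other services (object detection, OCR, transcription, and LLM extraction) can follow the same pattern, with a gateway splitting the video into frames and audio and fanning the pieces out to the individual workers.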

Where We Got the Datasets for Logo Detection

To train YOLOv8 to recognize standard objects like people, cars, animals, and clothing, we used the COCO dataset without making any modifications.

Finding data for logo detection was more challenging. We ultimately chose the open-source OpenLogo dataset, which covers 352 logos and contains 27,000 labeled images.
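For illustration, here is roughly what the two training setups look like with the Ultralytics Python API. The dataset config name openlogo.yaml, the model size, epoch count, and image size are placeholders, not the values used on the project.

```python
from ultralytics import YOLO

# General objects: the standard YOLOv8 weights are already trained on COCO,
# so they can be used as-is for people, cars, animals, clothing, and so on.
object_model = YOLO("yolov8m.pt")

# Logos: fine-tune a second YOLOv8 model on the OpenLogo-based dataset.
# "openlogo.yaml" is a hypothetical dataset config listing the image folders
# and the logo class names in the usual Ultralytics format.
logo_model = YOLO("yolov8m.pt")
logo_model.train(data="openlogo.yaml", epochs=100, imgsz=640)
```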

Augmentation and Balancing

After discussions with the client, we decided to add 40 more logos to the dataset. For each new logo, we gathered at least 50 images and augmented them threefold.
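As an example of what threefold augmentation can look like in code, here is a small sketch using Albumentations; the particular transforms are assumptions, not necessarily the ones used on the project.

```python
# Illustrative threefold augmentation; the chosen transforms are assumptions.
import albumentations as A

transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.5),
        A.Rotate(limit=15, p=0.5),
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

def augment_threefold(image, bboxes, class_labels):
    """Produce three augmented copies of one labeled logo image (YOLO-format boxes)."""
    copies = []
    for _ in range(3):
        out = transform(image=image, bboxes=bboxes, class_labels=class_labels)
        copies.append((out["image"], out["bboxes"], out["class_labels"]))
    return copies
```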

We also had to adjust the original OpenLogo dataset itself: the number of images per logo was uneven, with some logos having only 20 images while others had 150.

To fix this, we balanced the dataset. Before training, we counted how many labeled images each logo had; during training, the model was penalized more heavily for missing the under-represented logos.
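The heavier penalty is essentially inverse-frequency class weighting. Ultralytics does not expose per-class loss weights as a simple training flag, so the sketch below illustrates the principle in plain PyTorch rather than reproducing the project's training code.

```python
# Inverse-frequency class weights: rare logos contribute more to the classification loss.
from collections import Counter
import torch
import torch.nn.functional as F

def class_weights(labels: list[int], num_classes: int) -> torch.Tensor:
    """Weight each class by total / (num_classes * count), so rare logos weigh more."""
    counts = Counter(labels)
    total = len(labels)
    return torch.tensor(
        [total / (num_classes * counts.get(c, 1)) for c in range(num_classes)],
        dtype=torch.float32,
    )

def weighted_cls_loss(logits, targets, weights):
    """Classification loss term with the per-class weights applied."""
    return F.cross_entropy(logits, targets, weight=weights)
```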

Project Status and Next Steps

Currently, our service analyzes video files uploaded by users. It takes 5 minutes to process a 10-minute video. In the near future, we plan to integrate WebSocket support and expand the gateway to enable live TV broadcast analysis.
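One possible shape of the planned live-analysis endpoint, sketched with FastAPI WebSockets: the client streams encoded frames over a socket and receives detections per frame. The route, message format, and single-model scope are assumptions made for brevity.

```python
# Hypothetical live-analysis WebSocket endpoint; route and message format are assumptions.
import cv2
import numpy as np
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from ultralytics import YOLO

app = FastAPI()
logo_model = YOLO("weights/logos_yolov8.pt")  # assumed custom logo weights

@app.websocket("/ws/live")
async def live_analysis(websocket: WebSocket):
    """Receive JPEG-encoded frames from a live stream and push back detected logos."""
    await websocket.accept()
    try:
        while True:
            payload = await websocket.receive_bytes()
            frame = cv2.imdecode(np.frombuffer(payload, np.uint8), cv2.IMREAD_COLOR)
            result = logo_model(frame)[0]
            await websocket.send_json(
                {"logos": [logo_model.names[int(b.cls)] for b in result.boxes]}
            )
    except WebSocketDisconnect:
        pass  # client ended the stream
```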

Next, we will integrate the project's backend as an API into the client’s digital products. This will allow European marketers to use our service to launch ads at the most relevant moments.

For example, if the main character in an action movie is wearing Nike sneakers, why not reinforce the impact with a Nike commercial right after the film?

Project Team

  • Daniil Semenov, Project manager
  • Yuri Umnov, ML engineer
  • Danila Skablov, Backend developer
  • Ivan Petrov, Backend developer
