8 months ago

Video Captioning

Video Processing

Computer Vision

Minkuk Kim¹, Hyeon Bae Kim¹, Jinyoung Moon², Jinwoo Choi¹, Seong Tae Kim¹

Abstract

With the growing demand for solutions to real-world video challenges, interest in dense video captioning (DVC) has been on the rise. DVC involves the automatic captioning and localization of untrimmed videos. Several studies highlight the challenges of DVC and introduce improved methods utilizing prior knowledge, such as pre-training and external memory. In this research, we propose a model that leverages the prior knowledge of human-oriented hierarchical compact memory inspired by human memory hierarchy and cognition. To mimic human-like memory recall, we construct a hierarchical memory and a hierarchical memory reading module. We build an efficient hierarchical compact memory by employing clustering of memory events and summarization using large language models. Comparative experiments demonstrate that this hierarchical memory recall process improves the performance of DVC by achieving state-of-the-art performance on YouCook2 and ViTT datasets.
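The abstract does not give implementation details, so the following is only a minimal sketch of what a two-level memory with coarse-to-fine reading could look like, not the authors' method. Everything in it is an assumption made for illustration: the function names (`build_hierarchical_memory`, `read_memory`), the toy 2-D embeddings, the cluster assignments being supplied by some earlier clustering step, and the joined-caption string that stands in for an LLM-generated summary.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def build_hierarchical_memory(events, assignments):
    """Group event-level entries into clusters (hypothetical structure).

    events: list of (caption, embedding) pairs.
    assignments: one cluster id per event, assumed to come from a
    separate clustering step that is not shown here.
    """
    clusters = {}
    for (caption, embedding), cluster_id in zip(events, assignments):
        clusters.setdefault(cluster_id, []).append((caption, embedding))

    memory = []
    for cluster_id, members in sorted(clusters.items()):
        memory.append({
            "id": cluster_id,
            "centroid": mean_vector([emb for _, emb in members]),
            # Stand-in for a summary; the paper uses a large language model.
            "summary": "; ".join(caption for caption, _ in members),
            "events": members,
        })
    return memory


def read_memory(memory, query, top_events=2):
    """Coarse-to-fine read: pick the closest cluster, then rank its events."""
    best = max(memory, key=lambda c: cosine(query, c["centroid"]))
    ranked = sorted(best["events"],
                    key=lambda ev: cosine(query, ev[1]),
                    reverse=True)
    return best["id"], [caption for caption, _ in ranked[:top_events]]
```

With four toy events split into two clusters, a query vector is first matched against the cluster centroids and only the events of the winning cluster are then ranked, which is the sense in which the read is hierarchical in this sketch.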

