Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos

Authors

  • Hossein Hosseini
  • Baicen Xiao
  • Radha Poovendran
Abstract

Despite rapid progress in image classification techniques, video annotation remains a challenging task. Automated video annotation would be a breakthrough technology, enabling users to search within videos. Recently, Google introduced the Cloud Video Intelligence API for video analysis. According to the website, the system can be used to “separate signal from noise, by retrieving relevant information at the video, shot or per frame” level. A demonstration website has also been launched, which allows anyone to select a video for annotation. The API then detects the video labels (objects within the video) as well as shot labels (descriptions of the video events over time). In this paper, we examine the usability of Google’s Cloud Video Intelligence API in adversarial environments. In particular, we investigate whether an adversary can subtly manipulate a video in such a way that the API returns only the adversary-desired labels. To this end, we select an image that is different from the video content and insert it, periodically and at a very low rate, into the video. We found that if we insert one image every two seconds, the API is deceived into annotating the video as if it contained only the inserted image. Note that the modification to the video is hardly noticeable: for a typical frame rate of 25 frames per second, for instance, we insert only one image per 50 video frames. We also found that, by inserting one image per second, all the shot labels returned by the API are related to the inserted image. We perform the experiments on the sample videos provided by the API demonstration website and show that our attack is successful with different videos and images.
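
The insertion schedule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `insertion_indices` and `insert_image` are hypothetical, frames are treated as opaque objects, and the sketch assumes that the adversarial image replaces an existing frame (the abstract does not say whether frames are replaced or added).

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def insertion_indices(num_frames: int, fps: float, period_s: float) -> List[int]:
    """Return the frame indices that receive the image, one per `period_s` seconds.

    Assumption: the image goes into the last frame of each period.
    """
    step = round(fps * period_s)  # e.g. 25 fps * 2 s = 50 frames per period
    if step < 1:
        raise ValueError("fps * period_s must be at least one frame")
    return list(range(step - 1, num_frames, step))


def insert_image(frames: Sequence[Frame], image: Frame,
                 fps: float, period_s: float) -> List[Frame]:
    """Return a copy of `frames` with `image` placed at the scheduled indices.

    Assumption: the image replaces the frame at each index, so the video
    length is unchanged.
    """
    out = list(frames)
    for i in insertion_indices(len(out), fps, period_s):
        out[i] = image
    return out


if __name__ == "__main__":
    # A 4-second clip at 25 fps, with integers standing in for frames.
    clip = list(range(100))
    modified = insert_image(clip, "IMG", fps=25, period_s=2.0)
    changed = sum(1 for f in modified if f == "IMG")
    print(f"{changed} of {len(modified)} frames carry the inserted image")
```

With `period_s=2.0` at 25 frames per second, the schedule touches one frame in every 50, which matches the rate quoted in the abstract; `period_s=1.0` gives the one-image-per-second setting.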

Similar resources

Attacking Automatic Video Analysis Algorithms: A Case Study of Google Cloud Video Intelligence API

Due to the growth of video data on the Internet, automatic video analysis has gained a lot of attention from academia as well as companies such as Facebook, Twitter and Google. In this paper, we examine the robustness of video analysis algorithms in adversarial settings. Specifically, we propose targeted attacks on two fundamental classes of video analysis algorithms, namely video classification an...

Deceiving Google's Perspective API Built for Detecting Toxic Comments

Social media platforms provide an environment where people can freely engage in discussions. Unfortunately, they also enable several problems, such as online harassment. Recently, Google and Jigsaw started a project called Perspective, which uses machine learning to automatically detect toxic language. A demonstration website has also been launched, which allows anyone to type a phrase in the i...

Videos as Global Networks in the Practice of Migration (An Iranian Case Study)

Network society is an ever-changing, robust system that adds new nodes as long as they can communicate. Videos, as a source of information and communication, are among the most strategic nodes in this architecture. The present study investigates the effects of videos on facilitating the process of migration for Iranian students. To this end, our case studies partici...

Live coding youtube: organizing streaming media for an audiovisual performance

Music listening has changed greatly with the emergence of music streaming services, such as Spotify and YouTube. In this paper, we discuss an artistic practice that organizes streaming videos to perform a real-time improvisation via live coding. A live coder uses any available video from YouTube, a video streaming service, as source material to perform an improvised audiovisual piece. The chall...

Fast Intra Mode Decision for Depth Map coding in 3D-HEVC Standard

Three-dimensional high efficiency video coding (3D-HEVC) is the extended version of the latest video compression standard, namely high efficiency video coding (HEVC), and is used to compress 3D videos. 3D videos include texture video and depth maps. Since the statistical characteristics of depth maps are different from those of texture videos, new tools have been added to the HEVC standard fo...

Publication date: 2017