A team of MIT researchers has created an AI algorithm that can look at a still image and generate a short video predicting how the scene may unfold.
The team of scientists behind the project is part of the Massachusetts Institute of Technology's (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL).
The study was led by first author and MIT CSAIL Ph.D. student Carl Vondrick and was made with the support of various programs.
These include the UMBC START program, a Google Ph.D. fellowship, and the National Science Foundation.
According to Vondrick, the AI algorithm can help machines recognize and anticipate human activities without the need for extensive annotations.
When the AI algorithm is presented with a photo or another still image, it can analyze the scene and predict its potential future by generating a brief video.
The team of researchers stated that the algorithm's future uses could be quite diverse. Such a technology could come to be used in anything from improving security tactics to making self-driving vehicles safer.
A paper explaining the algorithm was co-written with an MIT professor and a former CSAIL postdoc.
It will be presented at the Neural Information Processing Systems (NIPS) conference, which takes place next week in Barcelona.
The model, developed together with Antonio Torralba, the aforementioned MIT professor, is able to generate completely new videos.
Because the AI algorithm processes the entire scene from a single view, it can generate up to 32 new frames from one still image.
To build the algorithm, the team trained a deep-learning model on large amounts of unlabeled video, teaching it to generate images that unfold into videos.
The model generates separate foreground and background elements, which are then composited into the scene. By learning which objects move and which remain static, the algorithm can animate only the elements that change.
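The foreground/background split described above can be sketched in a few lines of NumPy. The shapes, names, and random stand-ins below are illustrative assumptions, not the team's actual code; the idea is simply that a per-pixel mask blends a moving foreground stream with a single static background frame.

```python
import numpy as np

# Illustrative sketch of two-stream compositing: a per-pixel, per-frame mask
# decides where the moving foreground shows through and where the single
# static background frame remains visible. Sizes are arbitrary assumptions.
T, H, W, C = 32, 64, 64, 3              # frames, height, width, channels

rng = np.random.default_rng(0)
foreground = rng.random((T, H, W, C))   # stand-in for the moving foreground
mask = rng.random((T, H, W, 1))         # values in [0, 1]: 1 = foreground
background = rng.random((H, W, C))      # one static frame

# Broadcast the static background across all T frames and blend per pixel.
video = mask * foreground + (1.0 - mask) * background

assert video.shape == (T, H, W, C)
```

Because the background term has no time axis, NumPy broadcasting repeats the same static frame across all 32 frames, which is exactly the "only moving elements change" behavior the article describes.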
To achieve this, the researchers used "adversarial learning," a deep-learning technique that trains two neural networks to compete with one another.
One network generates a video, while the other tries to discriminate the real videos from the generated ones.
Through repetition of this process, the generator learns to fool the discriminator.
While the generator can produce videos resembling anything from beaches to train stations, complete with plausible motion, it still needs various improvements.
The videos it currently generates are quite short, only about one and a half seconds long; the team stated its hope that future videos will be longer.
The model's settings will also have to be adjusted, as generated objects and humans tend to appear larger than life.
Following these improvements, the AI algorithm could come to be used in many areas. Bringing animation to still photos in Harry Potter fashion is not its only use, though it may be one of its most entertaining.
Image Source: Wikimedia