An Innovative Video Searching Approach using Video Indexing

Searching for videos on the World Wide Web has grown rapidly, driven by the explosive growth of video on social media channels and networks in recent years. At present, video search engines rely on the title, description, and thumbnail of a video to identify the right one. In this paper, a novel video searching methodology is proposed using video indexing. Video indexing is a technique for preparing an index, based on the content of a video, that gives easy access to the frames of interest. Each video is stored along with the index created by the video indexing technique. The video searching methodology checks the content of the index attached to each video to confirm that the video matches the search keyword, and determines its relevance from the number of times the search keyword occurs in the video index. Video captions are generated by a deep learning network model that combines global-local (glocal) attention and context cascading mechanisms, using the VIST (Visual Story Telling) dataset. The video index generator uses the Wormhole algorithm, which ensures a worst-case time of O(log L) for searching a key of length L. The video searching methodology extracts the video clip containing the frames of interest from the original large source video. Hence, a searcher can retrieve and download a video clip instead of downloading the entire video from the video storage. This reduces the bandwidth requirement and the time taken to download videos.
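The relevance step described in the abstract, ranking videos by how often the search keyword occurs in each video's index, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `tokenize` and `rank_videos`, the plain-text representation of an index, and the tokenization rule are all assumptions made for illustration, and a simple dictionary stands in for the Wormhole-based index.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_videos(indexes, keyword):
    """Rank videos by how often the keyword's words occur in their index.

    `indexes` maps a video id to its index text (hypothetical layout).
    Videos whose index never mentions the keyword are left out.
    """
    terms = tokenize(keyword)
    results = []
    for video_id, index_text in indexes.items():
        counts = Counter(tokenize(index_text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            results.append((video_id, score))
    # Highest count first; ties are broken by video id for a stable order.
    results.sort(key=lambda pair: (-pair[1], pair[0]))
    return results


if __name__ == "__main__":
    demo = {
        "video_a": "A dog runs on the beach. The dog barks at the waves.",
        "video_b": "A cat sleeps on the sofa.",
        "video_c": "A child plays with a dog.",
    }
    print(rank_videos(demo, "dog"))
```

In this sketch a video that never mentions the keyword is excluded entirely, which mirrors the matching check in the abstract; the word count then orders the remaining videos.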

Published In : IJCSN Journal Volume 8, Issue 2

Date of Publication : April 2019

Pages : 144-147

Figures :01

Tables : --

Jaimon Jacob : received a B.Tech in Computer Science and Engineering from the University of Calicut in 2003, an M.Tech in Digital Image Processing from Anna University, Chennai, in 2010, an MBA in Information Technology from Sikkim Manipal University in 2012, and an M.Tech in Computer and Information Science from Cochin University of Science and Technology in 2014. Currently working as Assistant Professor in Computer Science and Engineering, Department of Computer Science, Govt. Model Engineering College, Thrikkakara, Ernakulam, Kerala. Has published four international conference papers and two national conference research papers. Research interest: video processing. Associated with the professional bodies ISTE, IETE, and IE.

Prof. (Dr.) Sudeep Ilayidom : holds B.Tech, M.Tech, and PhD degrees. Currently working as Professor, Division of Computer Engineering, School of Engineering, Cochin University of Science and Technology, Ernakulam, Kerala. Has published a textbook, "Data Mining and Warehousing", with Cengage, and fifty-five research papers in the area of data mining. A well-known musician in the Malayalam film industry. Research interests: data mining, big data, and related areas.

Prof. (Dr.) V. P. Devassia : received a B.Sc. in Engineering from MA College of Engineering, Kothamangalam, in 1983, an M.Tech in Industrial Electronics from Cochin University of Science and Technology, and a Ph.D in Signal Processing from Cochin University of Science and Technology in 2001. Worked as Graduate Engineer (T) at Hindustan Paper Corporation Ltd, Design Engineer at HMT Limited, and Principal of Govt. Model Engineering College, Ernakulam. Research interest: signal processing. Associated with the professional bodies ISTE, IETE, and IE.

Video Indexing, Video Searching, Visual Story Telling, Wormhole, glocal, VIST

In this paper, an efficient video searching methodology using video indexing is proposed that draws on video, audio, and textual information. An RNN-based speech recognition model is used for audio-to-text conversion, and OCR is used to extract text from preprocessed frames. The video captions used to prepare the video index from the video content are generated with the VIST (Visual Story Telling) dataset, producing a multi-stage cued story. The deep learning network model forms a coherent, continuous visual story from a consecutive set of images. Global attention for overall summarization and local attention for image-specific encoding, together with context cascading mechanisms, caption the input video efficiently. Effective implementation of this methodology in a video search engine could substantially reduce data traffic by minimizing the size of the video transported. Also, from the user's point of view, only the intended part of the video needs to be accessed.
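The idea of accessing only the intended part of a video can be sketched as a lookup that turns matching index entries into clip time ranges. The sketch below is a hedged illustration only: the paper does not specify how index entries are laid out, so the `IndexEntry` type with start and end times in seconds, the `find_clip_ranges` function, and the `gap` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """One hypothetical index record: a time span and the text describing it."""
    start: float  # seconds from the beginning of the source video
    end: float    # seconds from the beginning of the source video
    text: str     # caption, OCR, or speech text for this span (assumed)


def find_clip_ranges(entries, keyword, gap=2.0):
    """Return merged (start, end) ranges whose text mentions the keyword.

    Matching spans separated by at most `gap` seconds are merged so that
    one continuous clip is produced instead of many short fragments.
    """
    key = keyword.lower()
    hits = sorted((e.start, e.end) for e in entries if key in e.text.lower())
    merged = []
    for start, end in hits:
        if merged and start - merged[-1][1] <= gap:
            # Close enough to the previous hit: extend that range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    index = [
        IndexEntry(0, 5, "A man walks into the park."),
        IndexEntry(5, 10, "A dog runs towards him."),
        IndexEntry(10, 15, "The dog jumps for a ball."),
        IndexEntry(60, 65, "The dog sleeps under a tree."),
    ]
    print(find_clip_ranges(index, "dog"))
```

Each returned range could then be handed to a video cutting tool so that only those seconds are transferred, rather than the full source video.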

[1] Cisco, "Visual Networking Index: Forecast and Trends, 2017-2022", CISCO, February 27, 2019, https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white-paper-c11-741490.pdf.
[2] Zheng Cao and Ming Zhu, "An Efficient Video Similarity Search Algorithm", IEEE Transactions on Consumer Electronics, Vol. 56, No. 2, May 2010.
[3] Qiu Chen, Koji Kotani, Feifei Lee and Tadahiro Ohmi, "A fast search algorithm for large video database using HOG based features", David C. Wyld et al. (Eds): ITCS, JSE, SIP, ARIA, NLP-2016, pp. 35-41, 2016.
[4] H. Aradhye, G. Toderici, and J. Yagnik, "Video2text: Learning to annotate video content", ICDM Workshop on Internet Multimedia Mining, Google, Inc, USA, 2009.
[5] N. Krishnamoorthy, G. Malkarnenkar, R. J. Mooney, K. Saenko, and S. Guadarrama, "Generating Natural-language video descriptions using text-mined knowledge", in Proceedings of the Workshop on Vision and Natural Language Processing, pp. 10-19, July 2013.
[6] J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell, "Long-term recurrent convolutional networks for visual recognition and description", arXiv:1411.4389v4 [cs.CV], May 2016.
[7] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, "Show and tell: A Neural Image Caption Generator", arXiv:1411.4555v2 [cs.CV], April 2015.
[8] S. Venugopalan, H. Xu, J. Donahue, M. Rohrbach, R. Mooney, and K. Saenko, "Translating videos to natural language using deep recurrent neural networks", arXiv:1412.4729v3 [cs.CV], April 2015.
[9] J. Liu, Q. Yu, O. Javed, S. Ali, Amir Tamrakar, A. Divakaran, H. Cheng, H. Sawhney, "Video event recognition using concept attributes", IEEE Workshop on Applications of Computer Vision (WACV), March 2013.
[10] Masoud Mazloom, Amirhossein Habibian and Cees G. M. Snoek, ISLA, "Querying for Video Events by Semantic Signatures from Few Examples", Proceedings of the 21st ACM International Conference on Multimedia, pp. 609-612, October 2013.
[11] Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko, "Sequence to Sequence - Video to Text", Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), pp. 4534-4542, December 2015.
[12] N. Gayathri, K. Mahesh, "A Systematic study on Video Indexing", International Journal of Pure and Applied Mathematics, Volume 118, No. 8, 2018, pp. 425-428.
[13] M. Ravinder, T. Venugopal, Sultanpur, Medak, "Content-Based Video Indexing and Retrieval using Key frames Texture, Edge and Motion Features", International Journal of Current Engineering and Technology, Vol. 6, No. 2, April 2016.
[14] Natsuda Laokulrat, Sang Phan, Noriki Nishida, Raphael Shu, Yo Ehara, Naoaki Okazaki, Yusuke Miyao and Hideki Nakayama, "Generating Video Description using Sequence-to-sequence Model with Temporal Attention", Proceedings of International Conference on Computational Linguistics: Technical Papers, pp. 44-52, Osaka, Japan, December 2016.
[15] Anubhav Kumar, Raj Kumar Goel, "An Efficient Algorithm for Text Localization and Extraction in Complex Video Text Images", IEEE International Conference on Information Management in the Knowledge Economy, 2013.
[16] T. K. Huang, F. Ferraro, "Visual storytelling", Annual Conference of the North American Chapter of the Association for Computational Linguistics, arXiv:1604.03968v1 [cs.CL], 2016.
[17] T. Kim, Min-Oh Heo, Seonil Son, Kyoung-Wha Park, Byoung-Tak Zhang, "GLAC-Net: GLobal Attention Cascading Networks for Multi-Image Cued Story Generation", arXiv:1805.10973v3, Feb 2019.
[18] Xingbo Wu, Fan Ni, Song Jiang, "Wormhole: A Fast Ordered Index for In-memory Data Management", arXiv:1805.02200v2 [cs.DB], May 2018.