
Yoshitaka Ushiku

Talk Slides

Keynotes

Frontiers of Vision and Language: Bridging Images and Texts by Deep Learning

Education

2009
BS of Engineering (The University of Tokyo)
2011
MA of Information Science and Technology (The University of Tokyo)
2014
Ph.D. (The University of Tokyo)

Profession

Apr. 2013 - Mar. 2014
Research Fellow, Japan Society for the Promotion of Science
June 2013 - Aug. 2013
Intern, Microsoft Research Redmond
Apr. 2014 - Mar. 2016
Research Scientist, NTT Communication Science Laboratories
Apr. 2016 - Sep. 2018
Lecturer (full-time), Department of Mechano-Informatics, Graduate School of Information Science and Technology, the University of Tokyo
June 2016 -
Visiting Researcher, National Institute of Advanced Industrial Science and Technology (AIST)
Sep. 2016 - Sep. 2018
Collaborative Researcher, National Institute for Japanese Language and Linguistics (NINJAL)
Apr. 2018 - Sep. 2018
Technical Advisor, OMRON SINIC X Corporation (OSX)
Oct. 2018 -
Principal Investigator, OMRON SINIC X Corporation (OSX)
Jan. 2019 -
Director, Chief Research Officer, Ridge-i Co., Ltd.
Apr. 2020 -
Lecturer (part-time), Tsuda University

Activity

Society

June 2018
International Conference on Multimedia Retrieval (ICMR 2018) Publication Co-chair
October 2019
International Conference on Computer Vision (ICCV 2019) Workshop on Multi-Discipline Approach for Learning Concepts--Zero-Shot, One-Shot, Few-Shot and Beyond-- Organizer
November 2020
Asian Conference on Computer Vision (ACCV 2020) Area Chair

Reviewer

Conference
AAAI 2020, ACMMM 2013 2016 2018 2019, ACPR 2017, BMVC 2020, CVPR 2019 2020 2021, ECCV 2020, ICCV 2019 2021, ICLR 2020 2021, ICML 2021, IJCAI 2018 2019, NeurIPS 2020 2021, PCM 2018
Journal
Advanced Robotics, Computer Speech and Language, IEEE Access, International Journal of Computer Vision, Neural Networks, Pattern Recognition Letters, Robotics and Automation Letters, Transactions on Systems, Man and Cybernetics: Systems, Transactions on Affective Computing, Transactions on Audio, Speech and Language Processing, Transactions on Multimedia, Transactions on Multimedia Computing, Communications, and Applications, Transactions on Computer Vision and Applications, The Visual Computer

Biography

Yoshitaka Ushiku is a Principal Investigator at OMRON SINIC X and Chief Research Officer at Ridge-i. He received his B.E., M.A., and Ph.D. degrees from the University of Tokyo in 2009, 2011, and 2014, respectively. In 2014, he joined NTT CS Labs, Japan, where he was involved in research on image recognition. From 2016 to 2018, he was a lecturer at the University of Tokyo, Japan. He has been a Principal Investigator at OMRON SINIC X since 2018 and Chief Research Officer at Ridge-i since 2019. His research interests lie in cross-media understanding through machine learning, mainly for computer vision and natural language processing. He received the ACM Multimedia Grand Challenge Special Prize in 2011, an ACM Multimedia Open Source Software Competition Honorable Mention in 2017, and NVIDIA Pioneering Research Awards in 2017 and 2018.

Contact

Papers

Journal (refereed)

  1. Taichi Nishimura, Atsushi Hashimoto, Yoshitaka Ushiku, Hirotaka Kameko, Yoko Yamakata, and Shinsuke Mori. Structure-Aware Procedural Text Generation From an Image Sequence. IEEE Access, Vol.9, pp.2125-2141, 2021.
  2. Hiroaki Minoura, Ryo Yonetani, Mai Nishimura, and Yoshitaka Ushiku. Crowd Density Forecasting by Modeling Patch-Based Dynamics. IEEE Robotics and Automation Letters, Vol.6, No.2, pp.287-294, 2021.
  3. Yusuke Mori, Hiroaki Yamane, Yoshitaka Ushiku, and Tatsuya Harada. How narratives move your mind: A corpus of shared-character stories for connecting emotional flow and interestingness. Information Processing & Management, Vol.56, No.5, pp.1865-1879, 2019.

International Conference (refereed)

  1. Qing Yu, Atsushi Hashimoto, and Yoshitaka Ushiku. Divergence Optimization for Noisy Universal Domain Adaptation. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
  2. Ukyo Honda, Yoshitaka Ushiku, Atsushi Hashimoto, Taro Watanabe, and Yuji Matsumoto. Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning. The Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021.
  3. Taichi Nishimura, Suzushi Tomori, Hayato Hashimoto, Atsushi Hashimoto, Yoko Yamakata, Jun Harashima, Yoshitaka Ushiku, and Shinsuke Mori. Visual Grounding Annotation of Recipe Flow Graph. Language Resources and Evaluation Conference (LREC), 2020.
  4. Takuhiro Kaneko, Yoshitaka Ushiku, and Tatsuya Harada. Class-distinct and class-mutual image generation with GANs. British Machine Vision Conference (BMVC), 2019.
  5. Mikihiro Tanaka, Takayuki Itamochi, Kenichi Narioka, Ikuro Sato, Yoshitaka Ushiku, and Tatsuya Harada. Generating Easy-to-Understand Referring Expressions for Target Identifications. The IEEE International Conference on Computer Vision (ICCV), 2019.
  6. Takuhiro Kaneko, Yoshitaka Ushiku, and Tatsuya Harada. Label-noise robust generative adversarial networks. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  7. Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, and Kate Saenko. Strong-weak distribution alignment for adaptive object detection. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  8. Yang Li, Yoshitaka Ushiku, and Tatsuya Harada. Pose Graph Optimization for Unsupervised Monocular Visual Odometry. International Conference on Robotics and Automation (ICRA), 2019.
  9. Akane Iseki, Yusuke Mukuta, Yoshitaka Ushiku, and Tatsuya Harada. Estimating the causal effect from partially observed time series. The AAAI Conference on Artificial Intelligence (AAAI), 2019.
  10. Kohei Uehara, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada. Visual Question Generation for Class Acquisition of Unknown Objects. The 15th European Conference on Computer Vision (ECCV), 2018.
  11. Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada. Open Set Domain Adaptation by Backpropagation. The 15th European Conference on Computer Vision (ECCV), 2018.
  12. Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada. Customized Image Narrative Generation via Interactive Visual Question Generation and Answering. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (spotlight presentation)
  13. Atsushi Kanehira, Luc Van Gool, Yoshitaka Ushiku, Tatsuya Harada. Viewpoint-aware Video Summarization. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (spotlight presentation)
  14. Hiroharu Kato, Yoshitaka Ushiku, Tatsuya Harada. Neural 3D Mesh Renderer. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (spotlight presentation)
  15. Kuniaki Saito, Kohei Watanabe, Yoshitaka Ushiku, Tatsuya Harada. Maximum Classifier Discrepancy for Unsupervised Domain Adaptation. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (oral presentation)
  16. Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada. Between-class Learning for Image Classification. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  17. Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, Kate Saenko. Adversarial Dropout Regularization. The 6th International Conference on Learning Representations (ICLR), 2018.
  18. Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada. Learning from Between-class Examples for Deep Sound Recognition. The 6th International Conference on Learning Representations (ICLR), 2018.
  19. Katsunori Ohnishi, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada. Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture. AAAI Conference on Artificial Intelligence (AAAI), 2018. (oral presentation)
  20. Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Alternating Circulant Random Features for Semigroup Kernels. AAAI Conference on Artificial Intelligence (AAAI), 2018.
  21. Masatoshi Hidaka, Yuichiro Kikura, Yoshitaka Ushiku, Tatsuya Harada. WebDNN: Fastest DNN Execution Framework on Web Browser. ACM International Conference on Multimedia (ACMMM), Open Source Software Competition, pp.1213-1216, 2017.
  22. Masataka Yamaguchi, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada. Spatio-temporal Person Retrieval via Natural Language Queries. IEEE International Conference on Computer Vision (ICCV), 2017.
  23. Qishen Ha, Kohei Watanabe, Takumi Karasawa, Yoshitaka Ushiku, Tatsuya Harada. MFNet: Towards Real-Time Semantic Segmentation for Autonomous Vehicles with Multi-Spectral Scenes. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
  24. Kuniaki Saito, Yoshitaka Ushiku, and Tatsuya Harada. Asymmetric Tri-training for Unsupervised Domain Adaptation. International Conference on Machine Learning (ICML), pp.2988-2997, 2017.
  25. Kuniaki Saito, Andrew Shin, Yoshitaka Ushiku, and Tatsuya Harada. DualNet: Domain-Invariant Network for Visual Question Answering. IEEE International Conference on Multimedia and Expo (ICME), pp.829-834, 2017. (oral presentation)
  26. Andrew Shin, Yoshitaka Ushiku, and Tatsuya Harada. Image Captioning with Sentiment Terms via Weakly-Supervised Sentiment Dataset. British Machine Vision Conference (BMVC), pp.53.1-53.12, 2016.
  27. Yoshitaka Ushiku, Masataka Yamaguchi, Yusuke Mukuta, and Tatsuya Harada. Common subspace for model and similarity: Phrase learning for caption generation from images. IEEE International Conference on Computer Vision (ICCV), pp.2668-2676, 2015. (acceptance rate: 30.9%)
  28. Yoshitaka Ushiku, Masatoshi Hidaka, and Tatsuya Harada. Three guidelines of online learning for large-scale visual recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3574-3581, 2014. (acceptance rate: 29.9%)
  29. Asako Kanezaki, Shogo Inaba, Yoshitaka Ushiku, Yukihiko Yamashita, Hiroaki Muraoka, Yasuo Kuniyoshi, and Tatsuya Harada. Hard negative classes for multiple object detection. IEEE International Conference on Robotics and Automation (ICRA), pp.3066-3073, 2014.
  30. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Efficient Image Annotation for Automatic Sentence Generation. ACM International Conference on Multimedia (ACMMM), pp.549-558, 2012. (full paper, acceptance rate: 20.2%)
  31. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Understanding Images with Natural Sentences. ACM International Conference on Multimedia (ACMMM), Multimedia Grand Challenge, pp.679-682, 2011. (Special Prize on the Best Application of a Theoretical Framework) [pdf]
  32. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Automatic Sentence Generation from Images. ACM International Conference on Multimedia (ACMMM), pp.1533-1536, 2011. (short, acceptance rate: usually 30%) [pdf]
  33. Tatsuya Harada, Yoshitaka Ushiku, Yuya Yamashita, and Yasuo Kuniyoshi. Discriminative Spatial Pyramid. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1617-1624, 2011. (acceptance rate: 26.4%) [pdf]
  34. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Improvement of Image Similarity Measures for Image Browsing and Retrieval Via Latent Space Learning between Images and Long Texts. IEEE International Conference on Image Processing (ICIP), pp.2365-2368, 2010. [pdf]

International Conference (unrefereed or workshop)

  1. Kuniaki Saito, Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Deep Modality Invariant Adversarial Network for Shared Representation Learning. The 16th International Conference on Computer Vision Workshop on Transferring and Adapting Source Knowledge in Computer Vision (ICCV, Workshop), 2017.
  2. Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Spatial-Temporal Weighted Pyramid using Spatial Orthogonal Pooling. The 16th International Conference on Computer Vision Workshop on Compact and Efficient Feature Representation and Learning in Computer Vision (ICCV, Workshop), 2017.
  3. Takumi Karasawa, Kohei Watanabe, Qishen Ha, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada. Multispectral Object Detection for Autonomous Vehicles. The 25th Annual ACM International Conference on Multimedia (ACMMM, Workshop), 2017.
  4. Yoshitaka Ushiku, Hiroshi Muraoka, Sho Inaba, Teppei Fujisawa, Koki Yasumoto, Naoyuki Gunji, Takayuki Higuchi, Yuko Hara, Tatsuya Harada, and Yasuo Kuniyoshi. ISI at ImageCLEF 2012: Scalable System for Image Annotation. The 3rd Conference and Labs of the Evaluation Forum (CLEF 2012), pp.1-12, 2012.

Technical Report

  1. Shohei Yamamoto, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, and Tatsuya Harada. Conditional Video Generation Using Action-Appearance Captions. arXiv, 1812.01261, 2018.
  2. Andrew Shin, Yoshitaka Ushiku, and Tatsuya Harada. The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA). arXiv, 1609.06657, 2016.

Domestic Journal (refereed, In Japanese)

Go to the Japanese page for domestic papers.

Domestic Conference (refereed, In Japanese)

Go to the Japanese page for domestic papers.

Domestic Conference (unrefereed, In Japanese)

Go to the Japanese page for domestic papers.

Books

  1. Yoshitaka Ushiku. Long Short-Term Memory. In: Ikeuchi K. (eds) Computer Vision. Springer, 2020.

Go to the Japanese page for domestic books.

Invited Talks

  1. Yoshitaka Ushiku. Multimodal Understanding: Vision and Language, and its Beyond. International Workshop on Frontiers of Computer Vision, Daegu, Korea, 2021/02/22.
  2. Yoshitaka Ushiku. Deep Learning for Natural Language Processing and Computer Vision. Tutorial at the Asian Conference on Machine Learning, Nagoya, Japan, 2019/11/17.
  3. Yoshitaka Ushiku. Frontiers of Vision and Language: Bridging Images and Texts by Deep Learning. Workshop on Machine Learning at the International Conference on Document Analysis and Recognition, Kyoto, Japan, 2017/11/11.
  4. Yoshitaka Ushiku. Recognize, Describe, and Generate: Introduction of Recent Work at MIL. GPU Technology Conference, San Jose, CA, 2017/05/11.
  5. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Efficient Image Annotation for Automatic Sentence Generation. Greater Tokyo Area Multimedia/Vision Workshop, Tokyo, Japan, 2012/08/30.

Go to the Japanese page for domestic talks.

Awards and Competitions

  1. 2018. NVIDIA Pioneering Research Awards for Neural 3D Mesh Renderer.
  2. 2017. NVIDIA Pioneering Research Awards for Asymmetric Tri-training for Unsupervised Domain Adaptation.
  3. 2017. Honorable Mention. ACM Multimedia Open Source Software Competition.
  4. 2016. First place in the abstract image task. Visual Question Answering Challenge 2016.
  5. 2012. First place in the fine-grained classification task, second place in the classification task. Large Scale Visual Recognition Challenge 2012 (ILSVRC2012).
  6. 2011. Special Prize on the Best Application of a Theoretical Framework. ACM Multimedia Grand Challenge.
  7. 2011. Third place in the classification task, second place in the detection task. Large Scale Visual Recognition Challenge 2011 (ILSVRC2011).
  8. 2010. Third place. Large Scale Visual Recognition Challenge 2010 (ILSVRC2010).

Go to the Japanese page for domestic awards.