Yoshitaka Ushiku

Talk Slides

Keynotes

Frontiers of Vision and Language: Bridging Images and Texts by Deep Learning

Education

2009
BS of Engineering (The University of Tokyo)
2011
MA of Information Science and Technology (The University of Tokyo)
2014
Ph.D. (The University of Tokyo)

Profession

Apr. 2013 - Mar. 2014
Research Fellow, Japan Society for the Promotion of Science
June 2013 - Aug. 2013
Intern, Microsoft Research Redmond
Apr. 2014 - Mar. 2016
Research Scientist, NTT Communication Science Laboratories
Apr. 2016 - Sep. 2018
Associate Professor, Department of Mechano-Informatics, Graduate School of Information Science and Technology, the University of Tokyo
June 2016 -
Visiting Researcher, National Institute of Advanced Industrial Science and Technology (AIST)
Sep. 2016 - Sep. 2018
Collaborative Researcher, National Institute for Japanese Language and Linguistics (NINJAL)
Apr. 2018 - Sep. 2018
Technical Advisor, OMRON SINIC X Corporation (OSX)
Oct. 2018 -
Principal Investigator, OMRON SINIC X Corporation (OSX)
Jan. 2019 - Oct. 2020
Chief Research Officer, Ridge-i Co., Ltd.
Apr. 2020 - Mar. 2023
Lecturer (part-time), Tsuda University
Nov. 2020 - Oct. 2021
External Director, Chief Research Officer, Ridge-i Co., Ltd.
July 2021 -
Lecturer (part-time), Tohoku University
Nov. 2021 -
Chief Research Officer, Ridge-i Co., Ltd.
Jan. 2022 -
Chief Executive Officer, Nine Bulls, LLC (In Japanese only)
Oct. 2023 -
Project Manager, National Institute of Advanced Industrial Science and Technology (AIST) Kakusei Project

Activity

Society

June 2018
International Conference on Multimedia Retrieval (ICMR) Publication Co-chairs
October 2019
International Conference on Computer Vision (ICCV) Workshop on Multi-Discipline Approach for Learning Concepts--Zero-Shot, One-Shot, Few-Shot and Beyond-- Organizer
November 2020
Asian Conference on Computer Vision (ACCV) Area Chair
December 2022
Asian Conference on Computer Vision (ACCV) Industrial Chair
August 2023
International Joint Conferences on Artificial Intelligence (IJCAI) Area Chair
December 2023
Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track Area Chair

Reviewer

Conference
AAAI 2020, ACMMM 2013 2016 2018 2019, ACPR 2017, BMVC 2020, CVPR 2019 2020 2021 2022 2023, ECCV 2020 (Outstanding Reviewer) 2022, ICCV 2019 2021, ICLR 2020 2021 2022 (Highlighted Reviewer) 2023, ICML 2021 2022, IJCAI 2018 2019, NeurIPS 2020 2021 2022, PCM 2018
Journal
Advanced Robotics, Computer Speech and Language, IEEE Access, International Journal of Computer Vision, Neural Networks, Pattern Recognition Letters, Robotics and Automation Letters, The Visual Computer, Transactions on Affective Computing, Transactions on Audio, Speech, and Language Processing, Transactions on Computer Vision and Applications, Transactions on Intelligent Systems and Technology, Transactions on Multimedia, Transactions on Multimedia Computing, Communications, and Applications, Transactions on Pattern Analysis and Machine Intelligence, Transactions on Systems, Man and Cybernetics: Systems.

Biography

Yoshitaka Ushiku is a Principal Investigator at OMRON SINIC X and Chief Research Officer at Ridge-i. He received his B.E., M.A., and Ph.D. degrees from the University of Tokyo in 2009, 2011, and 2014, respectively. In 2014, he joined NTT CS Labs, Japan, where he was involved in research on image recognition. From 2016 to 2018, he was an Associate Professor at the University of Tokyo, Japan. He has held his current positions at OMRON SINIC X and Ridge-i since 2018 and 2019, respectively. Since 2022, he has also been the managing partner of Nine Bulls, LLC. His research interests lie in cross-media understanding through machine learning, mainly for computer vision and natural language processing. He received the ACM Multimedia Grand Challenge Special Prize in 2011, an ACM Multimedia Open Source Software Competition Honorable Mention in 2017, and NVIDIA Pioneering Research Awards in 2017 and 2018.

Contact

Papers

Journal (refereed)

  1. Naoya Chiba, Yuta Suzuki, Tatsunori Taniai, Ryo Igarashi, Yoshitaka Ushiku, Kotaro Saito, and Kanta Ono. Neural structure fields with application to crystal structure autoencoders. Communications Materials, Vol.4, No.1, p.106, 2023.
  2. Kazuhiro Ogata, Reo Gakumi, Atsushi Hashimoto, Yoshitaka Ushiku, and Shigeo Yoshida. The influence of Bouba- and Kiki-like shape on perceived taste of chocolate pieces. Frontiers in Psychology, Vol.14, 2023.
  3. Taichi Nishimura, Atsushi Hashimoto, Yoshitaka Ushiku, Hirotaka Kameko, and Shinsuke Mori. State-aware video procedural captioning. Multimedia Tools and Applications, Vol.82, pp.37273-37301, 2023.
  4. Yutaka Maruyama, Ryo Igarashi, Yoshitaka Ushiku, and Ayori Mitsutake. Analysis of Protein Folding Simulation with Moving Root Mean Square Deviation. Journal of Chemical Information and Modeling, Vol.63, No.5, pp.1529-1541, 2023.
  5. Yuta Suzuki, Tatsunori Taniai, Kotaro Saito, Yoshitaka Ushiku, and Kanta Ono. Self-supervised learning of materials concepts from crystal structures via deep neural networks. Machine Learning: Science and Technology, ISSN 2632-2153, 2022.
  6. Mutsuki Nakahara, Mai Nishimura, Yoshitaka Ushiku, Takayuki Nishio, Kazuki Maruta, Yu Nakayama, and Daisuke Hisano. Edge Computing-Assisted DNN Image Recognition System With Progressive Image Retransmission. IEEE Access, Vol.10, pp.91253-91262, 2022.
  7. Takehiko Ohkawa, Takuma Yagi, Atsushi Hashimoto, Yoshitaka Ushiku, and Yoichi Sato. Foreground-Aware Stylization and Consensus Pseudo-Labeling for Domain Adaptation of First-Person Hand Segmentation. IEEE Access, Vol.9, pp.94644-94655, 2021.
  8. Taichi Nishimura, Atsushi Hashimoto, Yoshitaka Ushiku, Hirotaka Kameko, Yoko Yamakata, and Shinsuke Mori. Structure-Aware Procedural Text Generation From an Image Sequence. IEEE Access, Vol.9, pp.2125-2141, 2021.
  9. Hiroaki Minoura, Ryo Yonetani, Mai Nishimura, and Yoshitaka Ushiku. Crowd Density Forecasting by Modeling Patch-Based Dynamics. IEEE Robotics and Automation Letters, Vol.6, No.2, pp.287-294, 2021.
  10. Yusuke Mori, Hiroaki Yamane, Yoshitaka Ushiku, and Tatsuya Harada. How narratives move your mind: A corpus of shared-character stories for connecting emotional flow and interestingness. Information Processing & Management, Vol.56, No.5, pp.1865-1879, 2019.

International Conference (refereed)

  1. Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki, Naoya Chiba, Kotaro Saito, Yoshitaka Ushiku, and Kanta Ono. Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding. The 12th International Conference on Learning Representations (ICLR), 2024.
  2. Yusaku Nakajima, Masashi Hamaya, Kazutoshi Tanaka, Takafumi Hawai, Felix von Drigalski, Yasuo Takeichi, Yoshitaka Ushiku, and Kanta Ono. Robotic Powder Grinding with Audio-Visual Feedback for Laboratory Automation in Materials Science. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023.
  3. Yusaku Nakajima, Masashi Hamaya, Yuta Suzuki, Takafumi Hawai, Felix von Drigalski, Kazutoshi Tanaka, Yoshitaka Ushiku, and Kanta Ono. Robotic Powder Grinding with a Soft Jig for Laboratory Automation in Material Science. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022.
  4. Keisuke Shirai, Atsushi Hashimoto, Taichi Nishimura, Hirotaka Kameko, Shuhei Kurita, Yoshitaka Ushiku, and Shinsuke Mori. Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows. International Conference on Computational Linguistics (COLING), 2022.
  5. Taichi Nishimura, Atsushi Hashimoto, Yoshitaka Ushiku, Hirotaka Kameko, and Shinsuke Mori. State-aware Video Procedural Captioning. ACM International Conference on Multimedia (ACMMM), 2021.
  6. Mutsuki Nakahara, Daisuke Hisano, Mai Nishimura, Yoshitaka Ushiku, Kazuki Maruta, and Yu Nakayama. Retransmission Edge Computing System Conducting Adaptive Image Compression Based on Image Recognition Accuracy. IEEE Vehicular Technology Conference (VTC-Fall), 2021.
  7. Qing Yu, Atsushi Hashimoto, and Yoshitaka Ushiku. Divergence Optimization for Noisy Universal Domain Adaptation. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
  8. Ukyo Honda, Yoshitaka Ushiku, Atsushi Hashimoto, Taro Watanabe, and Yuji Matsumoto. Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning. The Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021.
  9. Taichi Nishimura, Suzushi Tomori, Hayato Hashimoto, Atsushi Hashimoto, Yoko Yamakata, Jun Harashima, Yoshitaka Ushiku, and Shinsuke Mori. Visual Grounding Annotation of Recipe Flow Graph. Language Resources and Evaluation Conference (LREC), 2020.
  10. Takuhiro Kaneko, Yoshitaka Ushiku, and Tatsuya Harada. Class-distinct and class-mutual image generation with GANs. British Machine Vision Conference (BMVC), 2019.
  11. Mikihiro Tanaka, Takayuki Itamochi, Kenichi Narioka, Ikuro Sato, Yoshitaka Ushiku, and Tatsuya Harada. Generating Easy-to-Understand Referring Expressions for Target Identifications. The IEEE International Conference on Computer Vision (ICCV), 2019.
  12. Takuhiro Kaneko, Yoshitaka Ushiku, and Tatsuya Harada. Label-noise robust generative adversarial networks. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  13. Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, and Kate Saenko. Strong-weak distribution alignment for adaptive object detection. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  14. Yang Li, Yoshitaka Ushiku, and Tatsuya Harada. Pose Graph Optimization for Unsupervised Monocular Visual Odometry. International Conference on Robotics and Automation (ICRA), 2019.
  15. Akane Iseki, Yusuke Mukuta, Yoshitaka Ushiku, and Tatsuya Harada. Estimating the causal effect from partially observed time series. The AAAI Conference on Artificial Intelligence (AAAI), 2019.
  16. Kohei Uehara, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada. Visual Question Generation for Class Acquisition of Unknown Objects. The 15th European Conference on Computer Vision (ECCV), 2018.
  17. Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada. Open Set Domain Adaptation by Backpropagation. The 15th European Conference on Computer Vision (ECCV), 2018.
  18. Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada. Customized Image Narrative Generation via Interactive Visual Question Generation and Answering. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (spotlight presentation)
  19. Atsushi Kanehira, Luc Van Gool, Yoshitaka Ushiku, Tatsuya Harada. Viewpoint-aware Video Summarization. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (spotlight presentation)
  20. Hiroharu Kato, Yoshitaka Ushiku, Tatsuya Harada. Neural 3D Mesh Renderer. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (spotlight presentation)
  21. Kuniaki Saito, Kohei Watanabe, Yoshitaka Ushiku, Tatsuya Harada. Maximum Classifier Discrepancy for Unsupervised Domain Adaptation. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (oral presentation)
  22. Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada. Between-class Learning for Image Classification. The 31st IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  23. Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, Kate Saenko. Adversarial Dropout Regularization. The 6th International Conference on Learning Representations (ICLR), 2018.
  24. Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada. Learning from Between-class Examples for Deep Sound Recognition. The 6th International Conference on Learning Representations (ICLR), 2018.
  25. Katsunori Ohnishi, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada. Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture. AAAI Conference on Artificial Intelligence (AAAI), 2018. (oral presentation)
  26. Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Alternating Circulant Random Features for Semigroup Kernels. AAAI Conference on Artificial Intelligence (AAAI), 2018.
  27. Masatoshi Hidaka, Yuichiro Kikura, Yoshitaka Ushiku, Tatsuya Harada. WebDNN: Fastest DNN Execution Framework on Web Browser. ACM International Conference on Multimedia (ACMMM), Open Source Software Competition, pp.1213-1216, 2017.
  28. Masataka Yamaguchi, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada. Spatio-temporal Person Retrieval via Natural Language Queries. IEEE International Conference on Computer Vision (ICCV), 2017.
  29. Qishen Ha, Kohei Watanabe, Takumi Karasawa, Yoshitaka Ushiku, Tatsuya Harada. MFNet: Towards Real-Time Semantic Segmentation for Autonomous Vehicles with Multi-Spectral Scenes. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
  30. Kuniaki Saito, Yoshitaka Ushiku, and Tatsuya Harada. Asymmetric Tri-training for Unsupervised Domain Adaptation. International Conference on Machine Learning (ICML), pp.2988-2997, 2017.
  31. Kuniaki Saito, Andrew Shin, Yoshitaka Ushiku, and Tatsuya Harada. DualNet: Domain-Invariant Network for Visual Question Answering. IEEE International Conference on Multimedia and Expo (ICME), pp.829-834, 2017. (oral presentation)
  32. Andrew Shin, Yoshitaka Ushiku, and Tatsuya Harada. Image Captioning with Sentiment Terms via Weakly-Supervised Sentiment Dataset. British Machine Vision Conference (BMVC), pp.53.1-53.12, 2016.
  33. Yoshitaka Ushiku, Masataka Yamaguchi, Yusuke Mukuta, and Tatsuya Harada. Common subspace for model and similarity: Phrase learning for caption generation from images. IEEE International Conference on Computer Vision (ICCV), pp.2668-2676, 2015. (acceptance rate: 30.9%)
  34. Yoshitaka Ushiku, Masatoshi Hidaka, and Tatsuya Harada. Three guidelines of online learning for large-scale visual recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3574-3581, 2014. (acceptance rate: 29.9%)
  35. Asako Kanezaki, Shogo Inaba, Yoshitaka Ushiku, Yukihiko Yamashita, Hiroaki Muraoka, Yasuo Kuniyoshi, and Tatsuya Harada. Hard negative classes for multiple object detection. IEEE International Conference on Robotics and Automation (ICRA), pp.3066-3073, 2014.
  36. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Efficient Image Annotation for Automatic Sentence Generation. ACM International Conference on Multimedia (ACMMM), pp.549-558, 2012. (full paper, acceptance rate: 20.2%)
  37. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Understanding Images with Natural Sentences. ACM International Conference on Multimedia (ACMMM), Multimedia Grand Challenge, pp.679-682, 2011. (Special Prize on the Best Application of a Theoretical Framework) [pdf]
  38. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Automatic Sentence Generation from Images. ACM International Conference on Multimedia (ACMMM), pp.1533-1536, 2011. (short, acceptance rate: usually 30%) [pdf]
  39. Tatsuya Harada, Yoshitaka Ushiku, Yuya Yamashita, and Yasuo Kuniyoshi. Discriminative Spatial Pyramid. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1617-1624, 2011. (acceptance rate: 26.4%) [pdf]
  40. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Improvement of Image Similarity Measures for Image Browsing and Retrieval Via Latent Space Learning between Images and Long Texts. IEEE International Conference on Image Processing (ICIP), pp.2365-2368, 2010. [pdf]

International Conference (unrefereed, demo or workshop)

  1. Rintaro Yanagi, Atsushi Hashimoto, Naoya Chiba, and Yoshitaka Ushiku. Reference-based dense pose estimation via Partial 3D Point Cloud Matching. ACM International Conference on Multimedia (ACMMM) Demo Paper Track, 2023.
  2. Kuniaki Saito, Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Deep Modality Invariant Adversarial Network for Shared Representation Learning. The 16th International Conference on Computer Vision Workshop on Transferring and Adapting Source Knowledge in Computer Vision (ICCV, Workshop), 2017.
  3. Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada. Spatial-Temporal Weighted Pyramid using Spatial Orthogonal Pooling. The 16th International Conference on Computer Vision Workshop on Compact and Efficient Feature Representation and Learning in Computer Vision (ICCV, Workshop), 2017.
  4. Takumi Karasawa, Kohei Watanabe, Qishen Ha, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada. Multispectral Object Detection for Autonomous Vehicles. The 25th Annual ACM International Conference on Multimedia (ACMMM), 2017. (workshop)
  5. Yoshitaka Ushiku, Hiroshi Muraoka, Sho Inaba, Teppei Fujisawa, Koki Yasumoto, Naoyuki Gunji, Takayuki Higuchi, Yuko Hara, Tatsuya Harada, and Yasuo Kuniyoshi. ISI at ImageCLEF 2012: Scalable System for Image Annotation. The 3rd Conference and Labs of the Evaluation Forum (CLEF 2012), pp.1-12, 2012.

Technical Report

  1. Shoji Yamamoto, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, and Tatsuya Harada. Conditional Video Generation Using Action-Appearance Captions. arXiv, 1812.01261, 2018.
  2. Andrew Shin, Yoshitaka Ushiku, and Tatsuya Harada. The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA). arXiv, 1609.06657, 2016.

Domestic Journal (refereed, In Japanese)

Go to the Japanese page for domestic papers.

Domestic Conference (refereed, In Japanese)

Go to the Japanese page for domestic papers.

Domestic Conference (unrefereed, In Japanese)

Go to the Japanese page for domestic papers.

Books

  1. Yoshitaka Ushiku. Long Short-Term Memory. In: Ikeuchi K. (eds) Computer Vision. Springer, 2020.

Go to the Japanese page for domestic books.

Invited Talks

  1. Yoshitaka Ushiku. Towards a symbiosis between AI and humans: The State-of-the-art. Directorate General of Intellectual Property, online, Indonesia, 2022/01/13.
  2. Yoshitaka Ushiku. Challenges of Integrating Vision and Language. International Display Workshops, online, Japan, 2021/12/01.
  3. Yoshitaka Ushiku. Towards a symbiosis between AI and humans: The State-of-the-art. Intellectual Property Office of Vietnam, online, Vietnam, 2021/11/17.
  4. Yoshitaka Ushiku. Multimodal Understanding: Vision and Language, and its Beyond. International Workshop on Frontiers of Computer Vision, Daegu, Korea, 2021/02/22.
  5. Yoshitaka Ushiku. Deep Learning for Natural Language Processing and Computer Vision. Tutorial on Asian Conference on Machine Learning, Nagoya, Japan, 2019/11/17.
  6. Yoshitaka Ushiku. Frontiers of Vision and Language: Bridging Images and Texts by Deep Learning. Workshop of Machine Learning under International Conference on Document Analysis and Recognition, Kyoto, Japan, 2017/11/11.
  7. Yoshitaka Ushiku. Recognize, Describe, and Generate: Introduction of Recent Work at MIL. GPU Technology Conference, San Jose, CA, 2017/05/11.
  8. Yoshitaka Ushiku, Tatsuya Harada, and Yasuo Kuniyoshi. Efficient Image Annotation for Automatic Sentence Generation. Greater Tokyo Area Multimedia/Vision Workshop, Tokyo, Japan, 2012/08/30.

Go to the Japanese page for domestic talks.

Awards and Competitions

  1. 2018. NVIDIA Pioneering Research Awards for Neural 3D Mesh Renderer.
  2. 2017. NVIDIA Pioneering Research Awards for Asymmetric Tri-training for Unsupervised Domain Adaptation.
  3. 2017. Honorable Mention. ACM Multimedia Open Source Software Competition.
  4. 2016. First place in the abstract image task. Visual Question Answering Challenge 2016.
  5. 2012. First place in the fine-grained classification task, second place in the classification task. Large Scale Visual Recognition Challenge 2012 (ILSVRC2012).
  6. 2011. Special Prize on the Best Application of a Theoretical Framework. ACM Multimedia Grand Challenge.
  7. 2011. Third place in the classification task, second place in the detection task. Large Scale Visual Recognition Challenge 2011 (ILSVRC2011).
  8. 2010. Third place. Large Scale Visual Recognition Challenge 2010 (ILSVRC2010).

Go to the Japanese page for domestic awards.