year Ph.D. student at the University of Hong Kong, with a deep focus on and passion for NLP and foundation-model-based agents. I'm fortunate to be advised by Prof. Tao Yu and to be part of the XLANG Lab and HKU NLP Lab. Previously, I received my bachelor's degree in Computer Science and Technology from Tsinghua University, where I focused on Human-Computer Interaction (HCI) and was advised by Prof. Chun Yu and Prof. Yuanchun Shi.
My long-term research goal is to build autonomous GUI agents that can 1) understand open-ended natural language (or even multimodal) instructions; 2) observe GUI environments in the wild (OS, mobile, and more) with grounded knowledge; and 3) iteratively generate executable steps (e.g., atomic actions, system shortcuts, code) to complete the task.
The ultimate goal of my research is to build foundational large agent models that can be seamlessly integrated into daily use, just like ChatGPT, rather than prompt-based wrappers around existing LLMs. To realize this vision, my research spans a wide range of topics, including data collection, model training/fine-tuning, and evaluation, each of which presents both significant challenges and exciting opportunities.
AutoTask: Executing Arbitrary Voice Commands by Exploring and Learning from Mobile GUI
Bowen Wang*, Lihang Pan*, Chun Yu, Yuxuan Chen, Xiangyu Zhang, Yuanchun Shi
This paper presents AutoTask, a voice command interface (VCI) capable of automating any task in any mobile application without configuration or modification by developers or end users. AutoTask executes voice commands on mobile devices (i.e., Android smartphones) by exploring and learning from UIs.
Interaction Proxy Manager: Semantic Model Generation and Run-time Support for Reconstructing Ubiquitous User Interfaces of Mobile Services
Tian Huang, Chun Yu, Weinan Shi, Bowen Wang, David Yang, Yihao Zhu, Zhaoheng Li, Yuanchun Shi
This paper introduces the Interaction Proxy, which enables mobile apps to adapt to new devices without a complete rebuild. Key contributions include the UIAD model and the IPManager tool, which streamline integration across device pairings such as mobile-smartwatch and mobile-vehicle.