About Me

Hi! I'm Haoran Wu (George) (邬浩冉), a second-year PhD student in the Department of Computer Science and Technology at the University of Cambridge, supervised by Prof. Robert Mullins. My PhD is funded by the Scaling Compute project via ARIA.

My current research interests cover NPU design, ML systems, and ML inference acceleration. Previously I worked on RISC-V verification and design-space exploration for customisable processors.


Education

University of Cambridge

PhD in Computer Architecture

Supervised by Prof. Robert Mullins and Prof. Timothy Jones.

2024 - Present

Imperial College London

MEng in Electronic and Information Engineering

Graduated with First Class Honours.

Dean's List (Top 10% in the department, 2020/21, 2021/22 & 2023/24). Runner-up, Second-Year Group Project.

Master's thesis supervised by Prof. Wayne Luk and Dr. Ce Guo.

2020 - 2024

Experience

Institute of Computing Technology, Chinese Academy of Sciences

Research Intern

Advisor: Dr. Kan Shi

Built an FPGA-based hardware fuzzer that automatically generates stimuli for RISC-V processors, and developed a verification system running fully on a Xilinx Zynq SoC.

Apr 2023 - Sep 2024

Imperial College London — UROP

Undergraduate Research Opportunities Programme

Advisor: Dr. James Davis

Developed an FPGA-based object detection system using Binary Neural Networks; trained and converted a YOLO model into a Residual Binary Network with TensorFlow.

Jun 2022 - Sep 2022

Selected Publications

ISCA
Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference
H Wu, C Xiao, J Nie, X Guo, B Lou, JTH Wong, Z Mo, C Zhang, P Forys, et al.
The 53rd International Symposium on Computer Architecture (ISCA'26), 2026
arXiv
MemExplorer: Navigating the Heterogeneous Memory Design Space for Agentic Inference NPUs
H Wu, Z Cao, Y Lai, B Lou, J Nie, C Xiao, T Adeniran, P Forys, K Johar, et al.
arXiv preprint arXiv:2604.16007, 2026
ICML
KernelCraft: Benchmarking for Agentic Close-to-Metal Kernel Generation on Emerging Hardware
J Nie, H Wu, Y Lai, Z Cao, C Zhang, B Lou, E Wang, J Cheng, TM Jones, et al. (equal contribution: J Nie, H Wu)
The Forty-Third International Conference on Machine Learning (ICML'26), 2026
HPCA
TurboFuzz: FPGA Accelerated Hardware Fuzzing for Processor Agile Verification
Y Zhong, H Wu, X Li, S Wang, D Boland, Y Bao, K Shi (equal contribution: Y Zhong, H Wu)
IEEE International Symposium on High-Performance Computer Architecture (HPCA'26), 2026
FPL
ASPO: Constraint-Aware Bayesian Optimization for FPGA-based Soft Processors
H Wu, C Guo, W Luk, R Mullins
35th International Conference on Field-Programmable Logic and Applications (FPL'25), 2025
arXiv
TriAxialKV: Toward Extreme Low-Precision KV-Cache Quantization for Agentic Inference Tasks
H Shen, H Wu, Y Zhao, R Mullins
arXiv preprint arXiv:2605.17170, 2026
arXiv
Beyond GEMM-Centric NPUs: Enabling Efficient Diffusion LLM Sampling
B Lou, H Wu, Y Lai, J Nie, C Xiao, X Guo, R Antonova, R Mullins, A Zhao
arXiv preprint arXiv:2601.20706, 2026
arXiv
Rethinking Compute Substrates for 3D-Stacked Near-Memory LLM Decoding: Microarchitecture-Scheduling Co-Design
C Ai, Y Zhang, H Wu, Y Pan, L Zhao, W Ou
arXiv preprint arXiv:2604.04253, 2026
HEART
Resource-Constraint Bayesian Optimization for Soft Processors on FPGAs
C Guo, H Wu, W Luk
14th International Symposium on Highly Efficient Accelerators and Reconfigurable Computing (HEART'24), 2024