Senior/Principal Researcher: Next-generation NPU and Agentic CPU Micro-architecture
Responsibilities:
• The research investigates the next generation of Neural Processing Units (NPUs), with particular focus on core micro-architecture. This spans the frontend, including branch prediction, BTB, and instruction prefetchers; register files; issue and wake-up logic; scalar functional units; vector functional units; and tensor units. It also covers the backend, including TLBs, L1 caches, scratchpads, broader cache hierarchies, and memory systems that feed the compute units.
• In parallel, the work explores a new generation of CPUs tailored to the agentic AI era. Agent-based systems now spend time not only on the GPU/NPU but also on the CPU, which handles tool calling, agent-logic scheduling, context management, and similar tasks. The CPU must therefore become resilient to bursty compute and capable of rapid context switching, without the resource thrashing and limited on-chip contexts that constrain today’s designs, most of which support only 2-way SMT.
• Investigate and prototype new architectural features, including but not limited to:
    • NPU Core Micro-architecture: Explore frontend mechanisms, including branch prediction, BTB, and instruction prefetching; register file organization; issue and wake-up logic; and the design of scalar, vector, and tensor functional units to maximize throughput and utilization for AI workloads.
    • NPU Backend and Memory System: Investigate backend structures including TLBs, L1 caches, and scratchpads, along with cache hierarchies and memory systems that sustain high bandwidth and low latency to the compute units.
    • Agentic CPU Architecture: Design CPUs tailored to agentic AI, where the processor handles tool calling, agent-logic scheduling, and context management alongside accelerator-bound work.
    • Resilience to Bursty Compute and Context Switching: Develop architectural support for rapid context switching and high thread-level concurrency that goes beyond conventional 2-way SMT, mitigating the on-chip resource thrashing and limited context capacity of current designs.
• Produce and present research papers at top-tier conferences and journals, such as ASPLOS, ISCA, MICRO, and HPCA.
• Establish and maintain collaborations with leading academic institutions and faculty.
• Mentor and support junior researchers and interns in their professional development.
Requirements:
• PhD in Computer Science, Electrical Engineering, or a related field.
• Strong background in computer architecture and micro-architecture.
• Creativity and the ability to think beyond conventional approaches to develop innovative technologies.
• Research experience and/or strong knowledge in at least one of the following areas:
    • Computer Architecture and Micro-architecture: Modern CPU, GPU, or AI accelerator architecture and micro-architecture.
    • Computer Architecture Simulators: Experience with tools such as gem5, ChampSim, Sniper, ZSim, or QFlex.
    • Vector and Matrix Extensions: ARM SVE/SME or Intel SSE/AVX/AMX.
    • GPU Programming Model: Understanding of CUDA kernels and PTX/SASS instructions.
    • Micro-architecture Characterization: Experience with profiling, characterization, and bottleneck analysis of CPU/GPU applications using Performance Monitoring Units.
• Proven track record of publishing research papers in top-tier conferences or journals.
• Excellent analytical, problem-solving, and system-level thinking skills.
• Strong development and prototyping skills.
• Strong interpersonal skills, with a collaborative spirit and the ability to work independently.
Apply Now
By applying to this role, you acknowledge that we may collect, store, and process your personal data on our systems.
For more information, please refer to our Privacy Notice.