Senior Researcher: AI Computing Systems

Recruitment Consultant: Simon Troupe
Posted: 10 days ago

Senior Researcher – AI Computing Systems (LLM Inference & RAG Optimization)

Position Overview

A research-driven technology organization is seeking a Senior Researcher in AI Computing Systems to advance the efficiency of large language model (LLM) inference and retrieval-augmented generation (RAG) pipelines.

This role operates at the intersection of systems research and low-level performance engineering, focusing on optimizing attention mechanisms, KV-cache strategies, and end-to-end inference stacks. The position involves translating cutting-edge research into high-performance, production-ready implementations.

Key Responsibilities

LLM Inference Optimization

  • Design and implement techniques to reduce inference latency and improve throughput, including:
    • KV-cache precomputation
    • Cache reuse and blending strategies
    • Efficient batching and scheduling
  • Optimize time-to-first-token (TTFT) and overall system efficiency.

KV-Cache Systems & Memory Optimization

  • Develop and integrate KV-cache reuse and blending pipelines into inference systems.
  • Design caching policies including:
    • Paging and eviction strategies
    • Memory layout optimization
    • Trade-offs between accuracy and performance
  • Ensure correctness and stability under high-throughput workloads.

Attention Mechanism Optimization

  • Implement and optimize sparse and selective attention techniques.
  • Develop efficient masking strategies and block-level computation methods.
  • Work closely with attention kernels to maximize hardware utilization.

Low-Level Performance Engineering

  • Profile and optimize model execution using modern attention backends and kernel frameworks.
  • Work with:
    • PyTorch internals
    • High-performance attention kernels (e.g., FlashAttention-style implementations)
  • Identify and resolve performance bottlenecks across compute and memory subsystems.

Research Translation & Innovation

  • Stay current with advances in LLM inference, caching systems, and RAG architectures.
  • Translate research ideas into robust, scalable implementations.
  • Contribute to internal innovation and potentially to external publications or open-source projects.

Required Qualifications

  • PhD in Computer Science, Electrical Engineering, or a related field.
  • Strong software engineering skills in Python, with deep experience in PyTorch.
  • Solid understanding of transformer inference, including:
    • Prefill vs decode stages
    • KV-cache structure and memory layout
    • Masking and batching strategies
    • Latency vs throughput trade-offs
  • Experience with benchmarking and profiling large-scale LLM workloads.
  • Ability to diagnose and resolve performance bottlenecks.
  • Strong communication skills and ability to collaborate across research and engineering teams.

Preferred Qualifications

  • Experience working with modern LLM inference frameworks (e.g., vLLM or similar systems).
  • Familiarity with attention kernel development and optimization:
    • CUDA, Triton, or custom kernel implementations
  • Experience building or optimizing RAG pipelines, including:
    • Retrieval and indexing
    • Chunking and reranking
    • Interaction between retrieval and inference latency
  • Contributions to open-source projects or publications in AI systems or ML infrastructure.
  • Systems-level expertise, including:
    • Linux environments
    • Memory hierarchy and storage systems
    • Performance engineering close to hardware

Personal Attributes

  • Strong systems-thinking mindset with attention to performance and scalability.
  • Ability to bridge research concepts and production engineering.
  • Detail-oriented with a focus on measurable performance improvements.
  • Collaborative approach in multidisciplinary environments.
  • Curiosity and drive to explore emerging AI infrastructure techniques.

Industry: AI & Machine Learning
Contract Type: Permanent
Location: Switzerland
City: Zurich
Work Model: On-Site

    Other relevant jobs

    Posted 1 day ago

    Founding Engineer (Full Stack)

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 3 days ago

    Product Engineer (Python)

    Type of contract
    Permanent
    Location
    Germany
    Type
    remote
    Posted 4 days ago

    Senior LLM Agent Researcher – Contract Role

    Type of contract
    Contract
    Location
    Ireland
    Type
    remote
    Posted 5 days ago

    Simulation Engineer

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 5 days ago

    Simulation Platform Engineer

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 5 days ago

    Member of Technical Staff

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 5 days ago

    Frontend CFD Visualization Engineer

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 9 days ago

    System Administrator

    Type of contract
    Permanent
    Location
    Italy
    Type
    On-Site
    Posted 9 days ago

    Software Engineer

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 10 days ago

    3D Machine Learning Engineer

    Type of contract
    Permanent
    Location
    Italy
    Type
    On-Site
    Posted 10 days ago

    Senior Researcher: AI Computing Systems

    Type of contract
    Permanent
    Location
    Switzerland
    Type
    On-Site
    Posted 11 days ago

    Software Engineer (Frontend)

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 19 days ago

    Sr MLOps Engineer

    Type of contract
    Permanent
    Location
    Spain
    Type
    hybrid
    Posted 19 days ago

    Head of Global Marketing & Communications

    Type of contract
    Permanent
    Location
    Spain
    Type
    hybrid
    Posted 26 days ago

    Founding Frontend Software Engineer

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 27 days ago

    Model Based Developer – Senior Expert

    Type of contract
    Permanent
    Location
    Italy
    Type
    On-Site
    Posted 29 days ago

    Programmatic Bidding Data Scientist – Contractor

    Type of contract
    Contract
    Location
    Ireland
    Type
    On-Site
    Posted 30 days ago

    Systems Engineer (ML/C++/C)

    Type of contract
    Permanent
    Location
    Ireland
    Type
    On-Site
    Posted 1 month ago

    Senior Researcher – LLM System Architecture

    Type of contract
    Permanent
    Location
    Switzerland
    Type
    On-Site
    Posted 1 month ago

    Fullstack Software Engineer

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 1 month ago

    Senior DevSecOps Engineer

    Type of contract
    Permanent
    Location
    Spain
    Type
    hybrid
    Posted 1 month ago

    Physics Simulation Team Lead

    Type of contract
    Permanent
    Location
    United States
    Type
    hybrid
    Posted 1 month ago

    Research Scientist / Founding Member – Agentic AI

    Type of contract
    Permanent
    Location
    France
    Type
    On-Site
    Posted 1 month ago

    Software Engineer (C++ Systems)

    Type of contract
    Permanent
    Location
    United States
    Type
    On-Site
    Posted 1 month ago

    M/L Compiler Engineer

    Type of contract
    Permanent
    Location
    United Kingdom
    Type
    On-Site
    Posted 2 months ago

    US – Enterprise Account Executive (AI / LLM / Infrastructure)

    Type of contract
    Permanent
    Location
    United States
    Type
    hybrid
    Posted 2 months ago

    Legal Counsel

    Type of contract
    Permanent
    Location
    United Kingdom
    Type
    On-Site
    Posted 2 months ago

    Embedded Software Senior Engineer – SoC Firmware

    Type of contract
    Permanent
    Location
    Ireland
    Type
    On-Site
    Posted 2 months ago

    Senior Deep Learning Researcher – Generative Vision

    Type of contract
    Permanent
    Location
    Netherlands
    Type
    On-Site
    Posted 2 months ago

    Engineering Director (Product)

    Type of contract
    Permanent
    Location
    Spain
    Type
    hybrid
    Posted 2 months ago

    DataOps & MLOps Engineer

    Type of contract
    Permanent
    Location
    Italy
    Type
    On-Site
    Posted 2 months ago

    Senior Deep Learning Researcher – Model Efficiency

    Type of contract
    Permanent
    Location
    Netherlands
    Type
    On-Site
    Posted 2 months ago

    Infrastructure & DevOps Engineer

    Type of contract
    Permanent
    Location
    Italy
    Type
    On-Site
    Posted 3 months ago

    Technical Leader: AI Systems Architecture

    Type of contract
    Permanent
    Location
    Switzerland
    Type
    On-Site
    Posted 3 months ago

    Deep Learning & Computer Vision Engineer

    Type of contract
    Permanent
    Location
    France
    Type
    On-Site
    Posted 3 months ago

    Fullstack Web Developer

    Type of contract
    Permanent
    Location
    France
    Type
    On-Site
    Posted 3 months ago

    C++ CUDA Engineer

    Type of contract
    Permanent
    Location
    France
    Type
    On-Site
    Posted 3 months ago

    Senior Platform Engineer – Customer Facing

    Type of contract
    Permanent
    Location
    Germany
    Type
    On-Site
    Posted 4 months ago

    Neural Rendering & Graphics Engineer

    Type of contract
    Permanent
    Location
    Italy
    Type
    On-Site
    Posted 4 months ago

    3D Computer Vision Engineer

    Type of contract
    Permanent
    Location
    Italy
    Type
    On-Site
    Posted 5 months ago

    AI Strategy Consultant (Contractor)

    Type of contract
    Contract
    Location
    United States
    Type
    On-Site
    Posted 7 months ago

    LLM Engineer

    Type of contract
    Permanent
    Location
    Spain
    Type
    hybrid
    Posted 9 months ago

    Principal AI Researcher

    Type of contract
    Permanent
    Location
    Ireland
    Type
    On-Site