## Full Description
Job Title: Edge / Embedded AI Engineer (On-device inference) — 1 Position
### About the Job
We are looking for an experienced Edge / Embedded AI Engineer to design, optimize and deploy ML models that run reliably and efficiently **on-device**. You will bridge research-quality models and production firmware by implementing model compression, hardware-specific acceleration, and robust device integration. The role suits an engineer who knows both ML model internals (quantization, pruning, distillation) and embedded systems (RTOS, cross-compilation, low-power operation), and who enjoys shipping real-world on-device AI features.
### Responsibilities
- Convert, optimize and deploy ML models for on-device inference (vision, audio, sensor fusion, or NLP) using frameworks such as TensorFlow Lite, ONNX Runtime, PyTorch Mobile, TensorRT, OpenVINO, Vitis AI, or the Edge TPU toolchain.
- Implement quantization (post-training & QAT), pruning, knowledge distillation and other compression techniques to meet tight memory/latency/power budgets.
- Integrate models into embedded firmware and edge platforms (MCUs, Cortex-A, Arm NPU, Coral Edge TPU, NVIDIA Jetson, Qualcomm/MediaTek NPUs) and implement efficient inference pipelines.
- Work with RTOS or lightweight OS stacks (FreeRTOS, Zephyr, Yocto, Embedded Linux) and toolchains for cross-compilation, linking and debugging.
- Build and maintain performance/accuracy testing, CI/CD for models and firmware, automated regression tests and reproducible deployment pipelines.
- Profile and optimize inference (latency, throughput, memory, power) using hardware profilers, trace logs and telemetry; propose hardware/software trade-offs.
- Implement secure model provisioning, encrypted model storage, and OTA model update strategies suitable for edge devices.
- Collaborate with product, firmware, hardware and cloud teams to define requirements, system architecture and end-to-end data flows.
- Document model choices, deployment recipes, performance results and runbooks; participate in code reviews and knowledge sharing.
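To illustrate the kind of work behind the quantization responsibilities above, here is a minimal NumPy sketch of symmetric per-tensor INT8 weight quantization. It is framework-agnostic and illustrative only (no TensorFlow Lite or ONNX Runtime API involved); the function names are our own:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization: w is approximated by scale * q."""
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0  # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in weight tensor
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
err = float(np.max(np.abs(w - w_hat)))  # bounded by scale / 2 for symmetric rounding
print(q.dtype, scale, err)
```

Production toolchains add per-channel scales, zero points for asymmetric schemes, and calibration data (or quantization-aware training) to control the accuracy loss; this sketch only shows the core rounding/clipping trade-off.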
### Qualifications
- BS/MS (or equivalent) in Computer Science, Electrical/Computer Engineering, Robotics, or related field.
- 3+ years professional experience deploying ML to edge/embedded platforms or equivalent product experience.
- Strong hands-on skills in Python for ML workflows and C/C++ for embedded integration.
- Proven experience with at least two of: TensorFlow Lite, PyTorch Mobile, ONNX Runtime, TensorRT, OpenVINO, Edge TPU/Coral toolchains, Vitis AI.
- Knowledge of model optimization techniques: quantization (INT8/FP16), pruning, fused ops, operator kernels, and accuracy/performance trade-offs.
- Experience with embedded platforms: Arm Cortex-M/A, NVIDIA Jetson, Coral, Qualcomm/MediaTek NPUs, or similar.
- Familiarity with build systems (CMake, cross-toolchains), debugging via JTAG/SWD, and profiling tools.
- Understanding of systems constraints: memory map, caches, DMA, real-time scheduling, power management and thermal considerations.
- Strong problem-solving, communication and documentation skills.
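The latency-profiling skills listed above can be sketched as a small framework-agnostic micro-benchmark. This is an illustrative pattern only; `benchmark` and the lambda workload are our own stand-ins, and on a device the callable would wrap the real inference entry point (e.g. a TFLite interpreter invocation):

```python
import statistics
import time

def benchmark(infer, warmup: int = 10, iters: int = 100) -> dict:
    """Time repeated calls to an inference callable and report latency in ms."""
    for _ in range(warmup):            # warm caches/allocators before timing
        infer()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - t0) * 1e3)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        "p99_ms": samples[int(0.99 * (len(samples) - 1))],
        "mean_ms": statistics.fmean(samples),
    }

# Stand-in CPU workload in place of a real model invocation.
stats = benchmark(lambda: sum(i * i for i in range(10_000)))
print(stats)
```

Reporting tail latency (p99) alongside the median matters on edge devices, where thermal throttling and scheduler jitter can make worst-case behavior diverge sharply from the average.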
### Preferred
- Experience with TinyML / CMSIS-NN, edge computer vision stacks (OpenCV, GStreamer), or audio/speech on-device inference.
- Experience with hardware accelerators and writing/optimizing custom operator kernels.
- Familiarity with secure model lifecycle, encrypted provisioning and OTA strategies.
- Open-source contributions, published work or a portfolio of deployed edge ML projects.
### What We Offer
- Work on impactful on-device AI features in a product-driven environment.
- Access to edge hardware lab (Jetsons, Coral, NPUs, dev kits) and cloud resources for training/CI.
- Collaborative cross-disciplinary team and opportunities for ownership and technical leadership.
- Competitive compensation, flexible working arrangements and support for conferences/training.
### How to Apply
Please email the following to **** with subject line **"Edge / Embedded AI Engineer (On-device inference)"**:
- Cover letter (1 page) describing a deployed edge/embedded ML project you led or contributed to (challenges, trade-offs, results).
- Links to repo(s), demos, technical notes or short videos (GitHub, GitLab, Colab, etc.).
- Two professional references (name, role, contact).
Shortlisted candidates will be invited to a technical interview and may be asked to complete a short hands-on or take-home task (model optimization or integration exercise).
We welcome applicants from diverse backgrounds and encourage engineers who bridge ML research and embedded productization to apply.