
Face and Object Tracking AI Smart Robot

EFY Tested

Precision in human-robot interaction depends on the ability to recognise and track human faces along with detailed facial landmarks. This capability underpins emotional analysis and the development of machine learning models grounded in real human behaviour. By analysing facial features, robots can interpret responses and communicate more naturally.

Face tracking further enables real-time interaction and coordinated movement. However, accurately detecting and tracking multiple facial landmarks for expression and emotion analysis remains a challenge. Advanced robotic systems must monitor over 100 facial points in real time and respond accordingly—for example, by synchronising head movements with the user and maintaining natural eye contact.

This system addresses the challenge using the MediaPipe algorithm for real-time face landmark detection and tracking. Detected facial points can be recorded for analysis and for training machine learning models. The code runs across multiple platforms, including PCs, Nvidia Jetson, and Raspberry Pi, supporting face tracking, landmark analysis, and precise head and eye movement based on detected facial data.
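To illustrate how detected landmark coordinates can drive head movement, the sketch below maps a normalised landmark position (MediaPipe reports x and y in the 0–1 range) to pan and tilt servo angles that steer the head toward the face. The function name, servo ranges, and centre values here are illustrative assumptions, not part of the published code; in the real system the coordinates would come from MediaPipe's face landmark output.

```python
# Sketch: convert a normalised landmark position (x, y in 0..1, as
# MediaPipe reports it) into pan/tilt servo angles for head tracking.
# Servo ranges below are assumed values for illustration only.

def landmark_to_servo(x, y, pan_range=(30.0, 150.0), tilt_range=(60.0, 120.0)):
    """Map normalised image coordinates to (pan_deg, tilt_deg).

    x, y: landmark position, 0.0 (left/top) to 1.0 (right/bottom).
    The result is clamped to the given servo ranges.
    """
    pan_min, pan_max = pan_range
    tilt_min, tilt_max = tilt_range
    # Image x grows to the right; map it linearly onto the pan range.
    pan = pan_min + x * (pan_max - pan_min)
    # Image y grows downward; invert so the head tilts up for a high face.
    tilt = tilt_min + (1.0 - y) * (tilt_max - tilt_min)
    clamp = lambda v, lo, hi: max(lo, min(hi, v))
    return clamp(pan, pan_min, pan_max), clamp(tilt, tilt_min, tilt_max)

# A face centred in the frame maps to the servo mid-positions:
print(landmark_to_servo(0.5, 0.5))  # → (90.0, 90.0)
```

Running this mapping on every frame, with the servos updated toward the returned angles, is what keeps the face centred in the camera's view.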


An earlier version of this system focused on face recognition but lacked continuous face tracking and object-tracking capabilities. It can be accessed here for reference: https://www.electronicsforu.com/electronics-projects/smart-robot-face-recognition

MediaPipe can also detect and track common objects such as phones, bottles, and cups in real time. The robot can move its eyes and head to follow these objects while interacting with both people and its surroundings.
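When a detector returns several candidate objects per frame, the robot needs a rule for which one to follow. The sketch below picks the highest-confidence detection among trackable labels and returns its bounding-box centre as the point to aim the eyes at. The detection format (label, score, normalised box) mirrors what object-detection models typically return, but the dictionaries, labels, and threshold here are mocked assumptions, not MediaPipe's actual output structure.

```python
# Sketch: choose the best object to track from a frame's detections.
# Detection dicts below are mocked; a real detector (e.g. MediaPipe's
# object-detection task) would supply labels, scores, and boxes.

TRACKABLE = {"cell phone", "bottle", "cup"}

def pick_target(detections, min_score=0.5):
    """Return the centre (cx, cy) of the highest-scoring trackable
    object, or None if nothing qualifies."""
    best = None
    for det in detections:
        if det["label"] in TRACKABLE and det["score"] >= min_score:
            if best is None or det["score"] > best["score"]:
                best = det
    if best is None:
        return None
    x, y, w, h = best["box"]        # normalised (x, y, width, height)
    return (x + w / 2, y + h / 2)   # centre point to aim the eyes at

detections = [
    {"label": "cup",    "score": 0.82, "box": (0.1, 0.2, 0.2, 0.3)},
    {"label": "bottle", "score": 0.47, "box": (0.6, 0.1, 0.1, 0.4)},
    {"label": "person", "score": 0.95, "box": (0.3, 0.0, 0.4, 1.0)},
]
print(pick_target(detections))  # centre of the cup's bounding box
```

The returned centre point can then be fed to the same angle-mapping logic used for faces, so one tracking loop serves both people and objects.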


This capability enables the development of task-oriented robots that can perform activities such as picking, sorting, delivering, or handing over objects. By combining facial landmark tracking with object detection, the robot becomes more aware and responsive. It can also collect data on human-object interactions to train machine learning models for grasping, manipulation, and autonomous decision-making.
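The data collection described above can be as simple as appending timestamped observations to a CSV file for later model training. The sketch below does this with only the Python standard library; the field names and record layout are assumptions chosen for illustration, not the article's actual logging format.

```python
import csv
import io

# Sketch: log timestamped face/object observations to CSV for later
# machine learning work. Field names are illustrative assumptions.

FIELDS = ["timestamp", "kind", "label", "x", "y"]

def log_observations(stream, observations):
    """Write observation dicts to a CSV stream, header row first."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for obs in observations:
        writer.writerow(obs)

# In the robot this would append to a file; StringIO keeps the demo
# self-contained.
buf = io.StringIO()
log_observations(buf, [
    {"timestamp": 1700000000.0, "kind": "face",   "label": "nose_tip", "x": 0.51, "y": 0.43},
    {"timestamp": 1700000000.1, "kind": "object", "label": "cup",      "x": 0.20, "y": 0.35},
])
print(buf.getvalue())
```

A file accumulated this way, one row per detection per frame, is a ready-made training set for the grasping and interaction models mentioned above.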

The lightweight MediaPipe framework allows the system to run efficiently on edge platforms such as Raspberry Pi, Nvidia Jetson, and standard PCs. Fig. 1 shows the author’s prototype robot tracking a face in real time while detecting facial landmark points. Components required to build the system are listed in the Bill of Materials table.

Fig. 1: Author’s prototype robot tracking a face and detecting facial landmark points in real time
Bill of Materials

| Name | Description | Quantity |
|---|---|---|
| 3D-printed parts of the robot | 3D-printed parts of the InMoov robot face | 1 |
| Servo motor | For eye and head movement: 2 micro geared, 2 metal geared MG995 or similar | 4 |
| Raspberry Pi/Nvidia Jetson/Nvidia Orin | SBC for AI image processing and control | 1 |
| 2A, 5V DC power supply | For robot and servo motor power | 1 |
| Raspberry Pi camera | 5MP or 10MP CSI camera | 1 |

EFY note: In addition to the components listed above, screws, a screwdriver, and similar hardware tools may be required to assemble the robot.

Robot design and assembly: Robot face

In this design, a Raspberry Pi 4 with 2GB RAM serves as the main controller. Nvidia Jetson and Orin boards can be used instead and are recommended for improved performance. The robot's face is based on the open source InMoov robot design. Most components are 3D-printed, along with custom mounts for the camera and Raspberry Pi. Additional single-board controllers are installed inside the head to drive the servos and internal mechanisms.

First, assemble the 3D-printed parts of the robot’s head. Then install the servos to enable eye and head movement. Refer to the assembly instructions here: https://inmoov.fr/eye-mechanism

Refer to Fig. 2 for guidance. After this step, proceed with assembling the eyes.

Fig. 2: Robot head assembly and servo motor installation
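Hobby servos such as the MG995 installed above are driven by a 50Hz PWM signal whose pulse width sets the shaft angle. The sketch below converts a target angle into a pulse width and duty cycle; the 0.5ms to 2.5ms pulse limits are typical hobby-servo values and should be checked against the servo's datasheet. On the Raspberry Pi, the resulting duty cycle would be handed to a PWM library such as RPi.GPIO or gpiozero.

```python
# Sketch: convert a servo angle to a 50 Hz PWM pulse width and duty
# cycle. The 0.5-2.5 ms pulse limits are typical hobby-servo values
# (verify against the MG995 datasheet before use).

PWM_FREQ_HZ = 50.0                  # standard hobby-servo frame rate
PERIOD_MS = 1000.0 / PWM_FREQ_HZ    # 20 ms frame

def angle_to_pwm(angle_deg, min_pulse_ms=0.5, max_pulse_ms=2.5, max_angle=180.0):
    """Return (pulse_ms, duty_percent) for a target angle in degrees.

    The angle is clamped to [0, max_angle] so the servo is never
    commanded past its mechanical range.
    """
    angle = max(0.0, min(max_angle, angle_deg))
    pulse = min_pulse_ms + (angle / max_angle) * (max_pulse_ms - min_pulse_ms)
    duty = 100.0 * pulse / PERIOD_MS
    return pulse, duty

pulse, duty = angle_to_pwm(90)   # mid position
print(pulse, duty)               # → 1.5 ms pulse, 7.5% duty cycle
```

The duty-cycle figure is what a call like RPi.GPIO's `ChangeDutyCycle()` expects, which is why the sketch returns a percentage rather than a raw pulse width alone.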

Adding camera

Place the camera inside the 3D-printed eye housing so that the robot can capture visual input. Fix the camera on one side and assemble the remaining eye components. Finally, mount the assembled eyes onto the servo-driven mechanism in the robot’s head. Refer to Fig. 3.

Fig. 3: Camera mounted inside the robot’s eye

Adding computing

Next, install the robot’s computing unit, which performs processing tasks, runs AI-based image processing, and supports machine learning models. Fig. 4 shows this setup.


Ashwini Sinha
A tech journalist at EFY with hands-on expertise in electronics DIY, he has an extraordinary passion for AI, IoT, and electronics. He holds two design records and is a two-time winner of the US-China Makers Award.