Objective
The goal of this lab is to extend students' understanding of image-based machine learning from classification to regression. Students will implement a real-time image regression pipeline that detects facial landmarks (the nose, left eye, and right eye) using a webcam and a modified ResNet-18 model deployed on the Jetson Orin Nano.
Learning Outcomes
After completing this lab, students will be able to:
- Understand the conceptual and practical differences between classification and regression in computer vision.
- Collect and annotate real-time data for coordinate-based tasks using IPython widgets (ipywidgets).
- Modify and train deep learning models for predicting numerical coordinates (X, Y) of facial keypoints.
- Visualize regression outputs using a real-time camera feed.
- Refine model performance iteratively by collecting additional data under varying conditions.
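As a rough illustration of the coordinate-based annotation and visualization outcomes above, the sketch below converts a clicked pixel position into a normalized regression target and maps a predicted target back to pixels for drawing on the camera frame. The function names normalize_click and to_pixel, the [-1, 1] target range, and the 224 x 224 image size are assumptions made for this example; the lab notebook may use a different convention.

```python
def normalize_click(x_px, y_px, width, height):
    """Map a pixel position to the range [-1, 1] on both axes (assumed convention)."""
    x = 2.0 * (x_px / width - 0.5)
    y = 2.0 * (y_px / height - 0.5)
    return x, y


def to_pixel(x, y, width, height):
    """Invert normalize_click so a predicted point can be drawn on the frame."""
    x_px = int(round((x / 2.0 + 0.5) * width))
    y_px = int(round((y / 2.0 + 0.5) * height))
    return x_px, y_px


# Hypothetical click on the nose at pixel (60, 150) in a 224 x 224 frame.
target = normalize_click(60, 150, 224, 224)
overlay_point = to_pixel(target[0], target[1], 224, 224)
```

Normalizing the targets keeps the regression outputs in a small, image-size-independent range, which is one common design choice for this kind of keypoint task.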
Lab Tasks
- Jetson Setup: Start the Jetson Orin Nano in headless mode, or connect it directly to a display with a DisplayPort (DP) cable. Launch the Docker container with the command ./docker_dli_run.sh.
- Notebook Initialization: Open the notebook regression_interactive.ipynb located in the regression directory via JupyterLab.
- Data Collection: Use the live feed and select a target feature (e.g., nose, left eye, right eye). Click on the corresponding point in the image to annotate and store data.
- Model Definition: Use a pretrained ResNet-18 model in which the final classification layer is replaced with a layer that has two output nodes, one for the X coordinate and one for the Y coordinate.
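- Training: Set training hyperparameters (e.g., epochs = 10) and begin the training loop. Monitor the training loss across epochs. Because this is a regression task, progress is measured by the coordinate error (for example, mean squared error) rather than by classification accuracy.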
- Live Testing: Evaluate model predictions by checking whether the output point overlays correctly on the selected facial feature during the live webcam feed.
- Iterative Improvement: Introduce new data by changing lighting, backgrounds, and subjects. Retrain the model and assess performance improvements.
Technologies Used
- Jetson Orin Nano (Ubuntu-based platform)
- Docker with NVIDIA runtime (nvcr.io/nvidia/dli/dli-nano-ai)
- JupyterLab with Python notebooks
- ResNet-18 (PyTorch-based) modified for regression
- Webcam input and real-time regression output visualization
Expected Deliverables
- Annotated screenshots of training, testing, and regression outputs.
- A short report comparing classification and regression tasks.
- Summary of challenges faced and techniques used to improve model accuracy.