ECE Undergraduate Laboratory
ECE 381 - Applied Machine Learning Lab

Lab 3: Advanced Object Detection and Vision Transformers
Part I: YOLOv11n-Based Object Detection, Segmentation, and Pose Estimation

Objective

This lab introduces students to YOLOv11n, the nano variant of YOLO11 (You Only Look Once, version 11), a family of high-performance real-time vision models. The primary focus is on deploying the model inside a Docker container on the NVIDIA Jetson Orin Nano platform to perform object detection, instance segmentation, image classification, pose/keypoint estimation, and oriented bounding box (OBB) detection.


Learning Outcomes


Lab Tasks

  1. YOLO Docker Setup: Pull and run the Ultralytics YOLO Docker image with the NVIDIA container runtime. Mount the local workspace for data persistence and connect the webcam device (example commands follow this list).
  2. Workspace Configuration: Create a local directory yolo_workspace and mount it into the container for storing processed output and Python scripts.
  3. Model Conversion: Use ModelConversion.py to convert the pretrained YOLOv11n models to TensorRT engine format for efficient inference.
  4. Execution of Tasks:
    • ObjectDetect.py for object detection
    • Segmentation.py for instance segmentation
    • Classification.py for image classification
    • Pose.py for keypoint estimation and oriented bounding box detection
  5. Output Handling: Processed frames are saved to appropriately named directories such as processed_frames and pose_frames.
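
A typical setup sequence for Tasks 1-2 is sketched below. The image tag, webcam node (/dev/video0), and mount point are assumptions; check Ultralytics' Docker Hub page and your JetPack version for the correct tag.

    # Create the local workspace that persists outside the container
    mkdir -p ~/yolo_workspace

    # Pull and run the image with GPU access, the webcam, and the workspace mounted
    sudo docker pull ultralytics/ultralytics:latest-jetson-jetpack6
    sudo docker run -it --runtime nvidia --ipc=host \
        --device /dev/video0:/dev/video0 \
        -v ~/yolo_workspace:/workspace \
        ultralytics/ultralytics:latest-jetson-jetpack6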

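For Task 3, ModelConversion.py is supplied with the lab; its effect can be reproduced with the Ultralytics CLI, which exports a .pt checkpoint to a TensorRT .engine file. A minimal sketch (Ultralytics publishes these weights under the yolo11n*.pt naming; half=True enables FP16):

    # Export each pretrained model to a TensorRT engine
    yolo export model=yolo11n.pt format=engine half=True
    yolo export model=yolo11n-seg.pt format=engine half=True
    yolo export model=yolo11n-cls.pt format=engine half=True
    yolo export model=yolo11n-pose.pt format=engine half=True
    yolo export model=yolo11n-obb.pt format=engine half=True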

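For Task 4, the lab scripts are run from inside the container; the /workspace paths below assume the mount from Task 1. A quick way to sanity-check an exported engine without the scripts is the Ultralytics predict CLI:

    # Run the lab scripts against the live webcam
    python3 /workspace/ObjectDetect.py
    python3 /workspace/Segmentation.py
    python3 /workspace/Classification.py
    python3 /workspace/Pose.py

    # One-off check: webcam inference with the TensorRT engine, frames saved to disk
    yolo predict model=yolo11n.engine source=0 save=True
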
Part II: Vision Transformers using NanoOWL


Objective

This lab enables students to explore the power of Vision Transformers (ViT) for hierarchical multi-object detection using NanoOWL, a lightweight OWL-ViT implementation optimized with TensorRT for the Jetson platform. Students will run inference through a browser interface and write prompt-based queries to locate complex scene elements.

Learning Outcomes


Lab Tasks

  1. SSD Setup: Format and mount an external SSD at /mnt/nvme/my_storage for storing model files and data (example commands follow this list).
  2. NanoOWL Installation: Clone the jetson-containers repository and install the necessary dependencies, then enter the NanoOWL container using jetson-containers run.
  3. Module Execution: Navigate to the examples/tree_demo directory and run tree_demo.py with a live camera feed (see the sketch after this list).
  4. Prompt-Based Tasks:
    • Identify facial parts with prompts such as: [a face [an eye, a nose]]
    • Identify a person's clothing: [a person [a t-shirt, trousers]]
    • Detect workstation components: [a desk [a monitor, a keyboard, a mouse]]
  5. Visualization: View the model's predictions in the browser-based UI via the local URL the demo prints.
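
A sketch of Tasks 1-2 follows; the NVMe device node is an assumption, so confirm it with lsblk before formatting:

    # Format and mount the external SSD (destructive: double-check the device node!)
    sudo mkfs.ext4 /dev/nvme0n1
    sudo mkdir -p /mnt/nvme/my_storage
    sudo mount /dev/nvme0n1 /mnt/nvme/my_storage

    # Install jetson-containers and start the NanoOWL container
    git clone https://github.com/dusty-nv/jetson-containers
    bash jetson-containers/install.sh
    jetson-containers run $(autotag nanoowl)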

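For Tasks 3-5, tree_demo.py takes the path to the TensorRT image encoder engine; the engine filename below follows the NanoOWL README and may differ in your build. Once the demo is running, open the URL it prints in a browser (the README uses http://<jetson-ip>:7860) and type the bracketed prompts from Task 4 into the text box.

    # Inside the NanoOWL container
    cd examples/tree_demo
    python3 tree_demo.py ../../data/owl_image_encoder_patch32.engine
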
Expected Deliverables