
Building Modern Interactive GUIs for Computer Vision with Python and CustomTkinter

10 days ago

Interactive GUI Applications for Computer Vision in Python

For a computer vision engineer, interactive visualizations are invaluable in day-to-day image processing work. They let you see the effect of each parameter immediately, make informed decisions, and iteratively improve your pipelines. In this article, we explore how to create interactive GUI applications for computer vision using OpenCV and CustomTkinter.

Prerequisites

To follow along, set up your environment with the necessary packages:

`uv add numpy opencv-python pillow customtkinter`

Goal

The goal is to build an application that reads the webcam feed, lets the user select one of several filters, and displays the processed image in real time. A simple UI would include the filter options and the live webcam stream.

Basic OpenCV GUI

Displaying the Webcam Feed

Let's start with a simple loop that fetches and displays frames from the webcam using OpenCV:

```python
import cv2

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    cv2.imshow("Video Feed", frame)

    # Poll the keyboard for 1 ms; press 'q' to quit
    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

Adding Keyboard Input

To make the application interactive, we can use keyboard input to switch between filters. The key must be read with `cv2.waitKey` before it is compared, so the key handling sits at the end of the loop:

```python
import cv2

cap = cv2.VideoCapture(0)
filter_type = "normal"

while True:
    ret, frame = cap.read()
    if not ret:
        break

    if filter_type == "grayscale":
        # Convert to grayscale, then back to 3 channels so the display code stays uniform
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)
        cv2.putText(frame, "Grayscale", (frame.shape[1] // 2 - 50, frame.shape[0] - 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)

    cv2.imshow("Video Feed", frame)

    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        break
    elif key == ord('1'):
        filter_type = "normal"
    elif key == ord('2'):
        filter_type = "grayscale"

cap.release()
cv2.destroyAllWindows()
```

Adding Trackbars

For a more user-friendly interface, we can use OpenCV's trackbar to select filters. The trackbar is attached to a named window, so the frames must be shown in that same window:

```python
import cv2

filter_types = ["normal", "grayscale"]
win_name = "Webcam Stream"

cap = cv2.VideoCapture(0)
cv2.namedWindow(win_name)
# The callback is unused because the position is polled inside the loop
cv2.createTrackbar("Filter", win_name, 0, len(filter_types) - 1, lambda _: None)


def apply_filters(frame, filter_type):
    if filter_type == "grayscale":
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)
        cv2.putText(frame, "Grayscale", (frame.shape[1] // 2 - 50, frame.shape[0] - 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
    return frame


while True:
    ret, frame = cap.read()
    if not ret:
        break

    filter_id = cv2.getTrackbarPos("Filter", win_name)
    filter_type = filter_types[filter_id]
    frame = apply_filters(frame, filter_type)

    # Show the frame in the window that owns the trackbar
    cv2.imshow(win_name, frame)

    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

Modern GUI with CustomTkinter

Setting Up the Basic UI

CustomTkinter provides a more modern and customizable design than OpenCV's built-in GUI elements. We'll create an application with two frames: one for filter options and another for image display.
```python
import customtkinter


class App(customtkinter.CTk):
    def __init__(self) -> None:
        super().__init__()
        self.title("Webcam Stream")
        self.geometry("800x600")
        self.filter_var = customtkinter.IntVar(value=0)

        # Frame for filters
        self.filters_frame = customtkinter.CTkFrame(self)
        self.filters_frame.pack(side="left", fill="both", expand=False, padx=10, pady=10)

        # Frame for image display
        self.image_frame = customtkinter.CTkFrame(self)
        self.image_frame.pack(side="right", fill="both", expand=True, padx=10, pady=10)

        # CTkImageDisplay is the custom label defined in the image display section
        self.image_display = CTkImageDisplay(self.image_frame)
        self.image_display.pack(fill="both", expand=True, padx=10, pady=10)


app = App()
app.mainloop()
```

Creating Filter Radio Buttons

We'll populate the filter frame with one radio button per filter type:

```python
class App(customtkinter.CTk):
    def __init__(self) -> None:
        super().__init__()
        self.title("Webcam Stream")
        self.geometry("800x600")
        self.filter_var = customtkinter.IntVar(value=0)
        filter_types = ["normal", "grayscale"]

        self.filters_frame = customtkinter.CTkFrame(self)
        self.filters_frame.pack(side="left", fill="both", expand=False, padx=10, pady=10)

        # All buttons share filter_var, so selecting one deselects the others
        for filter_id, filter_type in enumerate(filter_types):
            rb_filter = customtkinter.CTkRadioButton(
                self.filters_frame,
                text=filter_type.capitalize(),
                variable=self.filter_var,
                value=filter_id,
            )
            rb_filter.pack(padx=10, pady=10)
            if filter_id == 0:
                rb_filter.select()

        self.image_frame = customtkinter.CTkFrame(self)
        self.image_frame.pack(side="right", fill="both", expand=True, padx=10, pady=10)

        self.image_display = CTkImageDisplay(self.image_frame)
        self.image_display.pack(fill="both", expand=True, padx=10, pady=10)
```

Implementing the Image Display Component

Since CustomTkinter has no built-in component for OpenCV images, we create a custom CTkImageDisplay class. It converts each BGR frame to RGB, wraps it in a PIL image, and shows it on a label:

```python
from typing import Any

import cv2
import customtkinter
import numpy as np
from PIL import Image


class CTkImageDisplay(customtkinter.CTkLabel):
    def __init__(self, master: Any) -> None:
        # Placeholder text shown until the first frame arrives
        self._textvariable = customtkinter.StringVar(master, "Loading...")
        super().__init__(master, textvariable=self._textvariable, image=None)

    def set_frame(self, frame: np.ndarray) -> None:
        target_width, target_height = frame.shape[1], frame.shape[0]

        # OpenCV stores pixels as BGR, PIL expects RGB
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frame_pil = Image.fromarray(frame_rgb)
        ctk_image = customtkinter.CTkImage(
            light_image=frame_pil,
            dark_image=frame_pil,
            size=(target_width, target_height),
        )
        self.configure(image=ctk_image, text="")
        self._textvariable.set("")
```

Running the Webcam Loop

To avoid blocking the main GUI thread, we run the webcam loop in a separate thread using Python's threading module. The worker thread hands frames to the GUI through a single-slot queue, and the main thread polls that queue with `after`:

```python
import queue
import threading

import cv2
import customtkinter
import numpy as np


class App(customtkinter.CTk):
    def __init__(self) -> None:
        super().__init__()
        self.title("Webcam Stream")
        self.geometry("800x600")

        # Stored on the instance so the webcam thread can read it
        self.filter_types = ["normal", "grayscale", "blur", "threshold", "canny", "sobel", "laplacian"]
        self.filter_var = customtkinter.IntVar(value=0)
        # Plain int copy of filter_var; Tk variables should only be touched from the GUI thread
        self.filter_id = 0

        self.filters_frame = customtkinter.CTkFrame(self)
        self.filters_frame.pack(side="left", fill="both", expand=False, padx=10, pady=10)

        for filter_id, filter_type in enumerate(self.filter_types):
            rb_filter = customtkinter.CTkRadioButton(
                self.filters_frame,
                text=filter_type.capitalize(),
                variable=self.filter_var,
                value=filter_id,
                command=self._on_filter_changed,
            )
            rb_filter.pack(padx=10, pady=10)
            if filter_id == 0:
                rb_filter.select()

        self.image_frame = customtkinter.CTkFrame(self)
        self.image_frame.pack(side="right", fill="both", expand=True, padx=10, pady=10)

        # CTkImageDisplay is the class defined in the previous section
        self.image_display = CTkImageDisplay(self.image_frame)
        self.image_display.pack(fill="both", expand=True, padx=10, pady=10)

        # Single-slot queue: the worker waits until the GUI has taken the previous frame
        self.queue = queue.Queue(maxsize=1)
        self.webcam_thread = threading.Thread(target=self.run_webcam_loop, daemon=True)
        self.webcam_thread.start()

        self.frame_loop_dt_ms = 16  # ~60 FPS
        self.after(self.frame_loop_dt_ms, self._update_frame)

    def _on_filter_changed(self) -> None:
        # Runs on the GUI thread whenever a radio button is clicked
        self.filter_id = self.filter_var.get()

    def _update_frame(self) -> None:
        try:
            frame = self.queue.get_nowait()
            self.image_display.set_frame(frame)
        except queue.Empty:
            pass
        self.after(self.frame_loop_dt_ms, self._update_frame)

    def run_webcam_loop(self) -> None:
        self.cap = cv2.VideoCapture(0)
        if not self.cap.isOpened():
            return

        while True:
            ret, frame = self.cap.read()
            if not ret:
                break

            filter_type = self.filter_types[self.filter_id]

            if filter_type == "grayscale":
                frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)
                cv2.putText(frame, "Grayscale", (frame.shape[1] // 2 - 50, frame.shape[0] - 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
            elif filter_type == "blur":
                frame = cv2.GaussianBlur(frame, ksize=(15, 15), sigmaX=0)
            elif filter_type == "threshold":
                gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                _, frame = cv2.threshold(gray, thresh=127, maxval=255, type=cv2.THRESH_BINARY)
                frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)
            elif filter_type == "canny":
                frame = cv2.Canny(frame, threshold1=100, threshold2=200)
                frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)
            elif filter_type == "sobel":
                # Float output keeps negative gradients; rescale to 0..255 before display
                frame = cv2.Sobel(frame, ddepth=cv2.CV_64F, dx=1, dy=0, ksize=5)
                cv2.normalize(frame, frame, 0, 255, cv2.NORM_MINMAX)
                frame = frame.astype(np.uint8)
                cv2.putText(frame, "Sobel", (frame.shape[1] // 2 - 50, frame.shape[0] - 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
            elif filter_type == "laplacian":
                frame = cv2.Laplacian(frame, ddepth=cv2.CV_64F)
                cv2.normalize(frame, frame, 0, 255, cv2.NORM_MINMAX)
                frame = frame.astype(np.uint8)
                cv2.putText(frame, "Laplacian", (frame.shape[1] // 2 - 50, frame.shape[0] - 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
            elif filter_type == "normal":
                pass

            self.queue.put(frame)

        self.cap.release()


if __name__ == "__main__":
    app = App()
    app.mainloop()
```

Conclusion

By combining a UI framework like CustomTkinter with OpenCV, we can build modern, interactive applications for real-time computer vision tasks. Running the capture and processing in a worker thread and passing frames through a synchronized queue keeps the user interface responsive, even when the image processing is expensive.
This approach is particularly useful for engineers and developers who need to rapidly prototype and test different image processing pipelines.

Industry Evaluation and Company Profiles

Industry insiders praised the modular and clean structure of the application, highlighting its potential for use in educational and prototyping environments. CustomTkinter, developed by Tom Schimansky, offers a modern look and a degree of customization that plain Tkinter lacks. The single-slot queue is a simple way to hand frames from the capture thread to the GUI thread without sharing mutable state, which makes this a solid example for beginners and intermediate developers alike.

The project also keeps GUI logic separate from processing logic. CustomTkinter aims to make Tkinter applications look and feel more modern, and this application achieves that. The repository for the project, available on GitHub, includes a more modular and well-structured version of the demo, which serves as an excellent starting point for those looking to build more complex computer vision GUIs.
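The single-slot queue hand-off described above can be tried without a camera or a window. The following is a minimal sketch, not the article's code: `offer_latest`, `poll`, and `fake_capture_loop` are hypothetical names, and integers stand in for frames. It also shows a variant of the hand-off: where the article's loop uses a blocking `put` and waits for the GUI, this version drops the stale frame so the producer never stalls. It assumes a single producer thread.

```python
import queue
import threading
from typing import Optional


def offer_latest(frames: queue.Queue, frame: object) -> None:
    """Store `frame` in a single-slot queue, replacing any unread frame.

    Assumes exactly one producer thread calls this function.
    """
    try:
        frames.put_nowait(frame)
    except queue.Full:
        try:
            frames.get_nowait()  # discard the stale frame
        except queue.Empty:
            pass  # the consumer took it in the meantime
        frames.put_nowait(frame)  # the slot is free: only this thread puts


def poll(frames: queue.Queue) -> Optional[object]:
    """Non-blocking read, mirroring what a GUI timer callback would do."""
    try:
        return frames.get_nowait()
    except queue.Empty:
        return None


def fake_capture_loop(frames: queue.Queue, n_frames: int) -> None:
    """Stand-in for a webcam loop: frame ids replace real images."""
    for frame_id in range(n_frames):
        offer_latest(frames, frame_id)


demo_frames = queue.Queue(maxsize=1)
worker = threading.Thread(target=fake_capture_loop, args=(demo_frames, 100), daemon=True)
worker.start()
worker.join()

print(poll(demo_frames))  # 99, the newest frame id
print(poll(demo_frames))  # None, the slot is empty again
```

Whether to block or to drop is a design choice. Blocking, as in the article's loop, throttles the worker to the display rate. Dropping keeps capture running at camera speed and always shows the newest frame, at the cost of processing frames that are never displayed.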
