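Lecture 4. MediaPipe Environment Setup and Basics

What is MediaPipe?

MediaPipe is an open-source framework developed by Google: a cross-platform solution for real-time media processing. It is mainly used to build computer vision and machine learning pipelines.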

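Key Features

• Real-time processing: handles video and audio streams in real time.

• Prebuilt solutions: ships ready-made solutions such as hand tracking, face detection, pose estimation, and object detection.

• Cross-platform: runs on Android, iOS, the web, and desktop.

• High performance: optimized to run efficiently even on mobile devices.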

Main Use Cases

• Hand gesture recognition

• Face filters and AR effects

• Pose estimation and fitness applications

• Object detection and tracking

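Python Sample Code

1. Installation

First, install MediaPipe together with OpenCV, which the examples use for webcam capture and display: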

pip install mediapipe opencv-python
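Before running the examples, it can help to confirm that both packages are visible to the interpreter you are using. The following is a minimal sketch using only the standard library; the helper name find_missing is hypothetical and not part of MediaPipe or OpenCV.

```python
from importlib import util


def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if util.find_spec(name) is None]


if __name__ == "__main__":
    # "cv2" is the import name of the opencv-python distribution.
    missing = find_missing(["mediapipe", "cv2"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("mediapipe and cv2 are both importable.")
```

If a package is reported missing even after installation, pip most likely installed into a different Python environment than the one running the script.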

2. Hand Tracking Example

This example tracks hands in real time using a webcam:

import cv2
import mediapipe as mp

# Initialize the MediaPipe hand tracking module
mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

# Create the hand tracking object
hands = mp_hands.Hands(
    static_image_mode=False,
    max_num_hands=2,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5
)

# Open the webcam
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, image = cap.read()
    if not success:
        print("Failed to read a frame from the webcam.")
        break

    # Convert BGR to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Mark the frame read-only while MediaPipe processes it
    image.flags.writeable = False

    # Run hand tracking
    results = hands.process(image)

    # Convert the image back to BGR for drawing and display
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

    # Draw the landmarks if any hands were detected
    if results.multi_hand_landmarks:
        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(
                image,
                hand_landmarks,
                mp_hands.HAND_CONNECTIONS
            )

    # Show the result
    cv2.imshow('MediaPipe Hands', image)

    # Press 'q' to quit
    if cv2.waitKey(5) & 0xFF == ord('q'):
        break

# Release resources
hands.close()
cap.release()
cv2.destroyAllWindows()
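Each detected hand is returned as 21 landmarks, so the drawing step above can be followed by simple gesture logic. The sketch below counts raised fingers with a deliberately naive rule: a finger counts as raised when its tip is higher in the image than its PIP joint. It assumes an upright hand facing the camera and ignores the thumb; the function name count_raised_fingers is hypothetical, while the landmark indices follow the MediaPipe Hands model.

```python
# Landmark indices from the MediaPipe Hands model (21 points per hand).
FINGER_TIPS = (8, 12, 16, 20)   # index, middle, ring, pinky fingertips
FINGER_PIPS = (6, 10, 14, 18)   # the PIP joint of each of those fingers


def count_raised_fingers(points):
    """Count non-thumb fingers whose tip lies above its PIP joint.

    `points` is a sequence of 21 (x, y) pairs in normalized image
    coordinates, where y grows downward.
    """
    if len(points) != 21:
        raise ValueError("expected 21 hand landmarks")
    return sum(
        1
        for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)
        if points[tip][1] < points[pip][1]
    )
```

Inside the loop, the input could be built as `points = [(lm.x, lm.y) for lm in hand_landmarks.landmark]`. A rotated or sideways hand breaks this rule, so treat it as a starting point rather than a robust classifier.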

3. Face Mesh Example

This example detects a face mesh and draws its contours:

import cv2
import mediapipe as mp

# Initialize the MediaPipe face mesh module
mp_face_mesh = mp.solutions.face_mesh
mp_drawing = mp.solutions.drawing_utils

# Create the face mesh object
face_mesh = mp_face_mesh.FaceMesh(
    max_num_faces=1,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5
)

# Open the webcam
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, image = cap.read()
    if not success:
        break

    # Convert BGR to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Mark the frame read-only while MediaPipe processes it
    image.flags.writeable = False

    # Detect the face mesh
    results = face_mesh.process(image)

    # Convert the image back to BGR for drawing and display
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

    # Draw the contours if a face was detected
    if results.multi_face_landmarks:
        for face_landmarks in results.multi_face_landmarks:
            mp_drawing.draw_landmarks(
                image=image,
                landmark_list=face_landmarks,
                connections=mp_face_mesh.FACEMESH_CONTOURS,
                # None skips the individual landmark dots; only the lines are drawn
                landmark_drawing_spec=None,
                connection_drawing_spec=mp_drawing.DrawingSpec(
                    color=(0, 255, 0), thickness=1
                )
            )

    # Show the result
    cv2.imshow('MediaPipe Face Mesh', image)

    # Press 'q' to quit
    if cv2.waitKey(5) & 0xFF == ord('q'):
        break

# Release resources
face_mesh.close()
cap.release()
cv2.destroyAllWindows()
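Face filters and AR effects usually need to know where the face sits in the frame. One simple way to get that from the mesh is to take the bounding box of all its landmarks. The following is a sketch under that assumption; the function name landmarks_bbox is hypothetical, and the input is a plain list of normalized (x, y) pairs rather than a MediaPipe object.

```python
def landmarks_bbox(points, width, height):
    """Return (left, top, right, bottom) in pixels for normalized (x, y) points."""
    if not points:
        raise ValueError("points must not be empty")
    # Landmarks can fall slightly outside the frame, so clamp to [0, 1].
    xs = [min(max(x, 0.0), 1.0) for x, _ in points]
    ys = [min(max(y, 0.0), 1.0) for _, y in points]
    left = min(int(min(xs) * width), width - 1)
    right = min(int(max(xs) * width), width - 1)
    top = min(int(min(ys) * height), height - 1)
    bottom = min(int(max(ys) * height), height - 1)
    return left, top, right, bottom
```

In the loop, the points could come from `[(lm.x, lm.y) for lm in face_landmarks.landmark]` with `height, width = image.shape[:2]`, and the box could then be drawn with `cv2.rectangle(image, (left, top), (right, bottom), (0, 255, 0), 1)`.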

4. Pose Estimation Example

This example estimates a full-body pose:

import cv2
import mediapipe as mp

# Initialize the MediaPipe pose module
mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils

# Create the pose object
pose = mp_pose.Pose(
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5
)

# Open the webcam
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, image = cap.read()
    if not success:
        break

    # Convert BGR to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Mark the frame read-only while MediaPipe processes it
    image.flags.writeable = False

    # Estimate the pose
    results = pose.process(image)

    # Convert the image back to BGR for drawing and display
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

    # Draw the pose landmarks (Pose returns at most one person)
    if results.pose_landmarks:
        mp_drawing.draw_landmarks(
            image,
            results.pose_landmarks,
            mp_pose.POSE_CONNECTIONS,
            landmark_drawing_spec=mp_drawing.DrawingSpec(
                color=(0, 0, 255), thickness=2, circle_radius=2
            ),
            connection_drawing_spec=mp_drawing.DrawingSpec(
                color=(0, 255, 0), thickness=2
            )
        )

    # Show the result
    cv2.imshow('MediaPipe Pose', image)

    # Press 'q' to quit
    if cv2.waitKey(5) & 0xFF == ord('q'):
        break

# Release resources
pose.close()
cap.release()
cv2.destroyAllWindows()
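Fitness applications typically turn pose landmarks into joint angles, for example to count repetitions of an exercise. The sketch below computes the angle at a middle joint from three 2D points. The function name joint_angle is hypothetical; the indices mentioned afterwards (11, 13, 15 for the left shoulder, elbow, and wrist) follow the MediaPipe Pose landmark model.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at vertex b, in degrees (0 to 180), for 2D points a, b, c."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0])
        - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle)
    # Fold reflex angles back into the 0 to 180 range.
    return 360.0 - angle if angle > 180.0 else angle
```

For the left elbow, the three points would be landmarks 11, 13, and 15 from `results.pose_landmarks.landmark`. Because landmark x and y are normalized by image width and height separately, it is safer to scale them to pixel coordinates before computing the angle; otherwise a non-square frame distorts the result.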

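Code Walkthrough

• Initialization: initialize the MediaPipe solution module (hands, face mesh, or pose).

• Webcam capture: read frames from the webcam using OpenCV.

• Color conversion: OpenCV uses BGR channel order, but MediaPipe expects RGB, so each frame must be converted before processing and converted back before display.

• Processing: pass each frame to MediaPipe to detect landmarks.

• Visualization: draw the detected landmarks and their connections on the frame.

• Exit: pressing the 'q' key ends the program.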
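One detail the steps above leave implicit: the landmarks returned by the processing step carry x and y values normalized to the range 0 to 1 relative to the image width and height, so any custom drawing or measurement first has to map them to pixels. A minimal sketch of that mapping follows; the function name to_pixel is hypothetical, and the clamping is a design choice made here so that slightly out-of-frame landmarks still land on a valid pixel.

```python
import math


def to_pixel(x_norm, y_norm, width, height):
    """Map normalized coordinates to an integer pixel inside the image."""
    px = min(max(math.floor(x_norm * width), 0), width - 1)
    py = min(max(math.floor(y_norm * height), 0), height - 1)
    return px, py
```

With a frame from OpenCV, the size comes from `height, width = image.shape[:2]`, and the result can be passed directly to drawing calls such as `cv2.circle(image, to_pixel(lm.x, lm.y, width, height), 4, (255, 0, 0), -1)`.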