Desktop App for macOS / Windows

SightCue

A desktop app that uses AI to analyze camera footage in real time and notifies external applications of events in the video via OSC. It connects to Pure Data, TouchDesigner, Max, Ableton, and similar tools for interactive installations and performances.

macOS Windows GitHub

AI Engines

Switch among three image-recognition AI engines to suit your use case

BLIP Caption

Python + PyTorch MPS

Captions the camera feed as text in real time, computes the semantic similarity between each caption and the situations you have registered, and fires a trigger when the similarity exceeds a threshold.

BLIP + Sentence-Transformers
ROI support (each region evaluated independently)
Text similarity triggers
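
The threshold check described above can be sketched as follows. This is a minimal illustration, not SightCue's actual implementation: the function names, the toy 3-dimensional vectors, and the 0.7 threshold are all assumptions. In practice the vectors would be sentence embeddings of the caption and of each registered trigger text.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)

def fired_triggers(caption_vec, trigger_vecs, threshold):
    """Return the names of triggers whose similarity exceeds the threshold."""
    return [
        name
        for name, vec in trigger_vecs.items()
        if cosine_similarity(caption_vec, vec) > threshold
    ]

# Toy vectors standing in for real sentence embeddings (hypothetical values).
triggers = {
    "person waving": [0.9, 0.1, 0.0],
    "empty room": [0.0, 0.2, 0.9],
}
caption = [0.8, 0.2, 0.1]
result = fired_triggers(caption, triggers, threshold=0.7)
```

With these toy values only "person waving" is close enough to the caption to fire. Raising the threshold makes triggers stricter; lowering it makes them fire on looser matches.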

MediaPipe Tracking

Browser (WASM)

Real-time detection of hands (21 landmarks per hand, 8 gestures) and faces (32 landmarks).

Hand tracking + gesture recognition
Face landmark detection
Smoothing control
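
The page does not say which smoothing method SightCue applies to landmarks. One common choice is an exponential moving average, sketched below under that assumption. The class name and the alpha parameter are hypothetical.

```python
class LandmarkSmoother:
    """Exponential moving average over a list of (x, y) landmark points."""

    def __init__(self, alpha):
        # alpha = 1.0 means no smoothing; smaller values smooth more but lag more.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state = None

    def update(self, landmarks):
        """Blend the new landmarks into the running state and return it."""
        if self.state is None or len(self.state) != len(landmarks):
            # First frame, or the landmark count changed: start over.
            self.state = [tuple(p) for p in landmarks]
        else:
            a = self.alpha
            self.state = [
                tuple(a * new + (1 - a) * old for new, old in zip(p, q))
                for p, q in zip(landmarks, self.state)
            ]
        return self.state

smoother = LandmarkSmoother(alpha=0.5)
first = smoother.update([(0.0, 0.0)])
second = smoother.update([(1.0, 2.0)])
```

A smoothing control in the UI would then map naturally onto alpha: a lower value gives steadier output to the OSC receiver at the cost of responsiveness.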

Teachable Machine

Browser (TensorFlow.js)

Runs inference on each ROI using custom models trained with Google Teachable Machine.

Load models from a URL or ZIP
Independent inference per ROI
Send class name and confidence via OSC

Shared Features

Features available in all modes

ROI (Region of Interest)

Draw multiple ROIs on the camera feed. In BLIP and TM modes, each region is processed independently.
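
Per-region processing amounts to cropping each ROI out of the frame before handing it to the model. The sketch below assumes ROIs are stored as normalized rectangles (0 to 1) and uses a nested list as a stand-in for an image; the ROI class and its fields are hypothetical, not SightCue's data model.

```python
from dataclasses import dataclass

@dataclass
class ROI:
    name: str
    x: float  # left edge, normalized 0..1
    y: float  # top edge, normalized 0..1
    w: float  # width, normalized 0..1
    h: float  # height, normalized 0..1

    def to_pixels(self, frame_w, frame_h):
        """Convert to (left, top, right, bottom) pixel bounds, clamped to the frame."""
        left = max(0, min(frame_w, round(self.x * frame_w)))
        top = max(0, min(frame_h, round(self.y * frame_h)))
        right = max(left, min(frame_w, round((self.x + self.w) * frame_w)))
        bottom = max(top, min(frame_h, round((self.y + self.h) * frame_h)))
        return left, top, right, bottom

def crop(frame, roi):
    """Crop a frame given as a list of rows, each row a list of pixels."""
    frame_h = len(frame)
    frame_w = len(frame[0]) if frame_h else 0
    left, top, right, bottom = roi.to_pixels(frame_w, frame_h)
    return [row[left:right] for row in frame[top:bottom]]

# A 4x4 dummy frame whose pixel values encode their position.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
top_right = crop(frame, ROI("top-right", x=0.5, y=0.0, w=0.5, h=0.5))
```

Storing ROIs in normalized coordinates keeps them valid if the camera resolution changes. Each cropped region can then be captioned or classified on its own.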

OSC Output

Every mode notifies external apps via OSC. Works with Pure Data, TouchDesigner, Max, Ableton, and others.
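
For reference, an OSC message is a small binary packet, usually sent over UDP. The sketch below encodes one by hand following the OSC 1.0 layout (address, type tag string, then arguments, each padded to 4 bytes). The address /sightcue/trigger and the argument layout are assumptions for illustration; the page does not document SightCue's actual addresses or port.

```python
import socket
import struct

def _osc_string(s):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = s.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)

def encode_osc(address, *args):
    """Build an OSC 1.0 message from str, int, and float arguments."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("bool arguments are not handled in this sketch")
        if isinstance(arg, str):
            tags += "s"
            payload += _osc_string(arg)
        elif isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)  # 32-bit big-endian integer
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)  # 32-bit big-endian float
        else:
            raise TypeError(f"unsupported OSC argument type: {type(arg).__name__}")
    return _osc_string(address) + _osc_string(tags) + payload

def send_osc(host, port, address, *args):
    """Send a single OSC message over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_osc(address, *args), (host, port))

# Hypothetical message: a trigger name and its similarity score.
message = encode_osc("/sightcue/trigger", "person waving", 0.5)
```

Calling send_osc("127.0.0.1", 9000, "/sightcue/trigger", "person waving", 0.5) would deliver the same packet to a receiver listening on that port (for example a netreceive or udpreceive object on the patch side); the port number here is only an example.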

Real-time Monitor

The dashboard shows the OSC monitor, log, and similarity bars in real time.

Apple Silicon Optimized

BLIP mode runs on PyTorch MPS; MediaPipe and TM modes use WebGPU/WASM for fast inference.

Use Cases

Example applications

Interactive Art

Control sound and lighting in response to scene changes in the video

Research Prototype

Build pipelines for cross-modal perception experiments

Live Performance

Control Pd or Ableton with hand and face tracking

Space Monitoring

Detect specific objects or situations with custom models

Getting Started

Option 1: Download the pre-built app

Just download the .dmg from Releases and launch it.

Option 2: Run from source

git clone https://github.com/634nakajima/sightcue.git
cd sightcue
npm install
cd python && pip install -r requirements.txt && cd ..
npm start

* A Python environment is required only for BLIP mode. MediaPipe and TM modes work without Python.

Electron Python PyTorch MPS BLIP Sentence-Transformers MediaPipe TensorFlow.js OSC Socket.IO