Desktop App for macOS / Windows

SoundCue

A desktop app that uses AI to analyze microphone input in real time and sends the extracted audio features to external applications over OSC. Connect it to Pure Data, TouchDesigner, Max, Ableton, and more to build sound-driven interactive installations and performances.

macOS / Windows / GitHub

⚠️ If macOS reports that the app "cannot be opened", go to System Settings → Privacy & Security and click "Open Anyway".

Analysis Modes

Switch between five audio analysis modes using tabs

YAMNet

TensorFlow.js

521-category environmental sound classification. Detects a wide range of sounds, such as dogs barking, musical instruments, and car horns.

Google YAMNet (AudioSet)
Top-N class display
Confidence bars
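
The Top-N class display can be pictured as a sort over per-class scores. The following is a minimal Python sketch, assuming the classifier yields one score per class. The `top_n` helper, the labels, and the scores are made up for illustration and are not taken from SoundCue's source.

```python
from typing import List, Tuple


def top_n(scores: List[float], labels: List[str], n: int = 3) -> List[Tuple[str, float]]:
    """Return the n highest-scoring (label, score) pairs, best first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Hypothetical scores for four classes (the real model scores 521 of them).
labels = ["Speech", "Dog", "Guitar", "Car horn"]
scores = [0.05, 0.81, 0.10, 0.02]
top = top_n(scores, labels, n=2)
print(top)
```

Each pair in the result could then feed one confidence bar, and the first pair would be a natural candidate for the class name and confidence sent over OSC.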

CLAP

Zero-Shot AI

Classify sounds against free-form text labels. Zero-shot audio classification powered by CLAP (Contrastive Language-Audio Pretraining), so no training is needed.

Custom text labels
Presets (environment, instruments, etc.)
No training needed
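
Zero-shot classification with a contrastive model is commonly described as comparing one audio embedding against one text embedding per label. The sketch below illustrates that idea with toy three-dimensional vectors. In a real setup the embeddings would come from the CLAP audio and text encoders, and the `scale` factor here is an assumed stand-in for the model's learned temperature. None of these names come from SoundCue itself.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot(audio_emb: List[float],
              label_embs: Dict[str, List[float]],
              scale: float = 10.0) -> Dict[str, float]:
    """Softmax over scaled cosine similarities, one probability per label."""
    sims = {label: scale * cosine(audio_emb, emb) for label, emb in label_embs.items()}
    peak = max(sims.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy embeddings standing in for encoder outputs (hypothetical values).
label_embs = {
    "rain": [1.0, 0.0, 0.0],
    "piano": [0.0, 1.0, 0.0],
    "applause": [0.0, 0.0, 1.0],
}
audio_emb = [0.9, 0.1, 0.2]
probs = zero_shot(audio_emb, label_embs)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

Because the labels are only text, changing the label set means recomputing the text embeddings. No audio model has to be retrained.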

Teachable Machine

Custom Model

Load custom audio classification models trained with Google Teachable Machine and run inference with them.

Load models from URL or ZIP
Custom class classification

Music Info

Web Audio + Meyda.js

Extracts 16+ audio features in real time, including pitch, RMS, MFCC, and chroma. A checkbox per feature controls whether it is sent over OSC.

Comprehensive features via Meyda.js
MFCC (13 coeff.) / Chroma (12 pitch classes)
Waveform & spectrum visualization
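
To make two of these features concrete, the sketch below computes RMS and zero crossing rate for a single frame using their textbook definitions. It is an illustration only: Meyda.js may scale or normalize its outputs differently (for example, reporting a crossing count rather than a rate), and the frame size and test tone are assumptions.

```python
import math
from typing import Sequence


def rms(frame: Sequence[float]) -> float:
    """Root mean square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


# A full-scale 1 kHz sine sampled at 8 kHz, with a phase offset so that
# no sample lands exactly on zero (assumed test signal).
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 1000 * n / sample_rate + math.pi / 8)
    for n in range(512)
]
level = rms(frame)
zcr = zero_crossing_rate(frame)
print(level, zcr)
```

A full-scale sine has an RMS of about 0.707, which fits the 0-1 range given for the RMS level. A noisier signal would push the zero crossing rate up.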

Speech

Whisper (Local)

Local speech recognition powered by Whisper. Works offline, supports multiple languages, and lets you choose the model size.

tiny / small / medium selectable
Japanese, English, and more

Features

Features available across all modes

OSC Output

Every analysis mode sends its results to external applications over OSC. Works with Pure Data, TouchDesigner, Max, Ableton, and others.

OSC Monitor

A real-time dashboard showing the OSC addresses currently being sent and their values.

Audio Input

Supports multiple audio input devices. The level meter is always visible.

Local Processing

YAMNet, CLAP, Music Info, and Whisper all run locally. No internet connection is required, except for the initial model download.

Use Cases

Example applications

Sound Art

Classify ambient sounds to control visuals and lighting

Research

Extract audio features and send them to external software

Live Performance

Control Pd/Ableton in real time with pitch and RMS

Subtitling

Transcribe speech with Whisper and send the text via OSC
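
As an illustration of the Live Performance case, a receiving patch or script might turn a pitch in Hz and an RMS level into a MIDI note and velocity. The sketch below assumes 12-tone equal temperament with A4 = 440 Hz and a linear RMS-to-velocity mapping. Both are illustrative choices, and the function names are hypothetical rather than part of SoundCue.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(freq_hz: float) -> int:
    """Nearest MIDI note number, with A4 = 440 Hz mapped to note 69."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def midi_to_name(note: int) -> str:
    """Note name with octave, using the convention where MIDI 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"


def rms_to_velocity(rms: float) -> int:
    """Linearly map an RMS level in 0-1 to a MIDI velocity in 0-127."""
    clamped = min(max(rms, 0.0), 1.0)
    return round(clamped * 127)


note = hz_to_midi(440.0)
print(note, midi_to_name(note), rms_to_velocity(1.0))
```

A perceptual (logarithmic) velocity curve may feel more natural in practice. The linear map is only the simplest starting point.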

OSC Addresses

OSC addresses sent by each mode

Address	Type	Description
/yamnet/class	string	Classification class name
/yamnet/confidence	float	Confidence (0-1)
/clap/class	string	Zero-shot classification class name
/clap/confidence	float	Confidence (0-1)
/music/pitch	float	Pitch (Hz)
/music/note	string	Note name (e.g. A4)
/music/rms	float	RMS level (0-1)
/music/centroid	float	Spectral centroid (Hz)
/music/loudness	float	Perceptual loudness
/music/zcr	float	Zero crossing rate
/music/flatness	float	Spectral flatness (0-1)
/music/energy	float	Energy
/music/mfcc/0-12	float	13 MFCC coefficients
/music/chroma/0-11	float	12 chroma pitch classes
/speech/text	string	Recognized text
/tm/class	string	Classification class name
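
On the receiving side, these messages can be decoded with a few lines of code. The sketch below parses a single OSC 1.0 message using only the Python standard library. It assumes plain messages rather than bundles, handles only the `f`, `i`, and `s` type tags, and decodes strings as UTF-8. The `listen` helper and its host and port defaults are hypothetical, so use whatever destination is configured in the app.

```python
import socket
import struct
from typing import Any, List, Tuple


def _read_string(data: bytes, offset: int) -> Tuple[str, int]:
    """Read a null-terminated OSC string; return it and the next aligned offset."""
    end = data.index(b"\x00", offset)
    text = data[offset:end].decode("utf-8")
    # OSC strings are null-padded so that their length is a multiple of 4.
    next_offset = (end + 4) & ~3
    return text, next_offset


def parse_message(data: bytes) -> Tuple[str, List[Any]]:
    """Decode one OSC message into (address, arguments)."""
    address, offset = _read_string(data, 0)
    type_tags, offset = _read_string(data, offset)
    if not type_tags.startswith(","):
        raise ValueError("missing OSC type tag string")
    args: List[Any] = []
    for tag in type_tags[1:]:
        if tag == "f":  # 32-bit big-endian float
            (value,) = struct.unpack_from(">f", data, offset)
            offset += 4
        elif tag == "i":  # 32-bit big-endian integer
            (value,) = struct.unpack_from(">i", data, offset)
            offset += 4
        elif tag == "s":  # padded string
            value, offset = _read_string(data, offset)
        else:
            raise ValueError(f"unsupported type tag: {tag}")
        args.append(value)
    return address, args


def listen(host: str = "127.0.0.1", port: int = 9000) -> None:
    """Print every OSC message arriving over UDP (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            packet, _sender = sock.recvfrom(65535)
            print(parse_message(packet))


# A hand-built packet shaped like a /music/rms message carrying one float.
packet = b"/music/rms\x00\x00" + b",f\x00\x00" + struct.pack(">f", 0.25)
print(parse_message(packet))
```

For anything beyond a quick experiment, an established OSC library or the OSC objects built into Pure Data, Max, and TouchDesigner would be the more robust choice.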

Getting Started

Option 1: Download pre-built app

Download the app from Releases and launch it.

Option 2: Run from source

git clone https://github.com/634nakajima/SoundCue.git
cd SoundCue
npm install
npm run dev

Electron, React, TypeScript, TensorFlow.js, CLAP, Whisper, Meyda.js, Web Audio API, Tailwind CSS, OSC