Commit 98b1f33

feat: add music cover command and upgrade to music-2.6

- Add `mmx music cover` command for generating cover versions from reference audio (base64 or URL)
- Upgrade music generate model from music-2.5 to music-2.6-free / music-2.6
- Add native `is_instrumental` and `lyrics_optimizer` API fields (replaces the `无歌词` ("no lyrics") workaround)
- Auto-detect key type (sk-cp-* = Token Plan, sk-api-* = PAYG) to select the correct model variant
- Add `models.ts` shared utility for model selection logic
- Update MusicRequest type with new fields: is_instrumental, lyrics_optimizer, audio_url, audio_base64, seed, channel
- Fix NodeJS.ErrnoException type in main.ts stdout EPIPE handler
- Update README, README_CN, and skill/SKILL.md

Made-with: Cursor
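The key-type detection described in the commit message can be sketched as below. This is only an illustration: `models.ts` is not shown in this diff, so `detectKeyType`, `selectModel`, and `musicGenerateModels` are hypothetical names, and the plan-to-model mapping in the table is an assumption (the commit message gives the key prefixes but does not say which variant goes with which plan).

```typescript
// Hypothetical sketch; none of these names are taken from the commit's models.ts.
type KeyType = 'token-plan' | 'payg' | 'unknown';

// Classify an API key by its prefix, as listed in the commit message:
// sk-cp-* = Token Plan, sk-api-* = PAYG.
function detectKeyType(apiKey: string | undefined): KeyType {
  if (!apiKey) return 'unknown';
  if (apiKey.startsWith('sk-cp-')) return 'token-plan';
  if (apiKey.startsWith('sk-api-')) return 'payg';
  return 'unknown';
}

// One model name per key type. The table is supplied by the caller because
// the commit message does not state which variant belongs to which plan.
type ModelTable = Record<KeyType, string>;

function selectModel(apiKey: string | undefined, table: ModelTable): string {
  return table[detectKeyType(apiKey)];
}

// ASSUMPTION: this mapping is illustrative only and may not match models.ts.
const musicGenerateModels: ModelTable = {
  'token-plan': 'music-2.6',
  payg: 'music-2.6-free',
  unknown: 'music-2.6-free',
};
```

Keeping the prefix check separate from the table means a second table (for example one for the cover model) can reuse the same detection without duplicating the prefix logic.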
1 parent e6de2aa commit 98b1f33

10 files changed

Lines changed: 266 additions & 67 deletions

README.md

Lines changed: 9 additions & 3 deletions
@@ -21,7 +21,7 @@
 - **Image** — Text-to-image with aspect ratio and batch controls
 - **Video** — Async video generation with progress tracking
 - **Speech** — TTS with 30+ voices, speed control, streaming playback
-- **Music** — Text-to-music with optional lyrics
+- **Music** — Text-to-music with lyrics, instrumental mode, auto lyrics, and cover generation from reference audio
 - **Vision** — Image understanding and description
 - **Search** — Web search powered by MiniMax
 - **Dual Region** — Seamless Global (`api.minimax.io`) and CN (`api.minimaxi.com`) support
@@ -99,9 +99,15 @@ mmx speech voices
 ### `mmx music`
 
 ```bash
-mmx music generate --prompt "Upbeat pop" --lyrics "[verse] La da dee, sunny day"
-mmx music generate --prompt "Jazz" --lyrics "La la la" --out song.mp3
+# Generate with lyrics
+mmx music generate --prompt "Upbeat pop" --lyrics "[verse] La da dee, sunny day" --out song.mp3
+# Auto-generate lyrics from prompt
+mmx music generate --prompt "Indie folk, melancholic, rainy night" --lyrics-optimizer --out song.mp3
+# Instrumental (no vocals)
 mmx music generate --prompt "Cinematic orchestral" --instrumental --out bgm.mp3
+# Cover — generate a cover version from a reference audio file
+mmx music cover --prompt "Jazz, piano, warm female vocal" --audio-file original.mp3 --out cover.mp3
+mmx music cover --prompt "Indie folk" --audio https://example.com/song.mp3 --out cover.mp3
 ```
 
 ### `mmx vision`

README_CN.md

Lines changed: 9 additions & 3 deletions
@@ -21,7 +21,7 @@
 - **Image generation** — Text-to-image with aspect ratio and batch controls
 - **Video generation** — Async generation with progress tracking
 - **Speech synthesis** — 30+ voices, speed control, streaming playback
-- **Music generation** — Text-to-music with custom lyrics
+- **Music generation** — Text-to-music with custom lyrics, instrumental mode, automatic lyrics generation, and cover generation from reference audio
 - **Image understanding** — Image description and recognition
 - **Web search** — MiniMax search engine
 - **Dual region** — Automatic switching between Global (`api.minimax.io`) and CN (`api.minimaxi.com`)
@@ -99,9 +99,15 @@ mmx speech voices
 ### `mmx music`
 
 ```bash
-mmx music generate --prompt "Cheerful pop" --lyrics "[verse] La la la, the sun is shining"
-mmx music generate --prompt "Jazz style" --lyrics "La la la" --out song.mp3
+# Generate with lyrics
+mmx music generate --prompt "Cheerful pop" --lyrics "[verse] La la la, the sun is shining" --out song.mp3
+# Auto-generate lyrics
+mmx music generate --prompt "Melancholic indie folk, rainy night" --lyrics-optimizer --out song.mp3
+# Instrumental (no vocals)
 mmx music generate --prompt "Epic orchestral" --instrumental --out bgm.mp3
+# Cover — generate a cover version from reference audio
+mmx music cover --prompt "Jazz piano, languid female vocal" --audio-file original.mp3 --out cover.mp3
+mmx music cover --prompt "Folk guitar" --audio https://example.com/song.mp3 --out cover.mp3
 ```
 
 ### `mmx vision`

skill/SKILL.md

Lines changed: 52 additions & 6 deletions
@@ -192,7 +192,9 @@ echo "Breaking news." | mmx speech synthesize --text-file - --out news.mp3
 
 ### music generate
 
-Generate music. Model: `music-2.5`. Responds well to rich, structured descriptions.
+Generate music. Responds well to rich, structured descriptions.
+
+**Model:** `music-2.6-free` — unlimited for API key users, RPM = 3.
 
 ```bash
 mmx music generate --prompt <text> [--lyrics <text>] [flags]
@@ -201,8 +203,10 @@ mmx music generate --prompt <text> [--lyrics <text>] [flags]
 | Flag | Type | Description |
 |---|---|---|
 | `--prompt <text>` | string | Music style description (can be detailed) |
-| `--lyrics <text>` | string | Song lyrics with structure tags. Use `"\u65e0\u6b4c\u8bcd"` for instrumental. Cannot be used with `--instrumental` |
+| `--lyrics <text>` | string | Song lyrics with structure tags. Required unless `--instrumental` or `--lyrics-optimizer` is used. |
 | `--lyrics-file <path>` | string | Read lyrics from file. Use `-` for stdin |
+| `--lyrics-optimizer` | boolean | Auto-generate lyrics from prompt. Cannot be used with `--lyrics` or `--instrumental`. |
+| `--instrumental` | boolean | Generate instrumental music (no vocals). Cannot be used with `--lyrics`. |
 | `--vocals <text>` | string | Vocal style, e.g. `"warm male baritone"`, `"bright female soprano"`, `"duet with harmonies"` |
 | `--genre <text>` | string | Music genre, e.g. folk, pop, jazz |
 | `--mood <text>` | string | Mood or emotion, e.g. warm, melancholic, uplifting |
@@ -215,7 +219,6 @@ mmx music generate --prompt <text> [--lyrics <text>] [flags]
 | `--structure <text>` | string | Song structure, e.g. `"verse-chorus-verse-bridge-chorus"` |
 | `--references <text>` | string | Reference tracks or artists, e.g. `"similar to Ed Sheeran"` |
 | `--extra <text>` | string | Additional fine-grained requirements |
-| `--instrumental` | boolean | Generate instrumental music (no vocals). Cannot be used with `--lyrics` or `--lyrics-file` |
 | `--aigc-watermark` | boolean | Embed AI-generated content watermark |
 | `--format <fmt>` | string | Audio format (default: `mp3`) |
 | `--sample-rate <hz>` | number | Sample rate (default: 44100) |
@@ -226,19 +229,62 @@ mmx music generate --prompt <text> [--lyrics <text>] [flags]
 At least one of `--prompt` or `--lyrics` is required.
 
 ```bash
-# Simple usage
+# With lyrics
 mmx music generate --prompt "Upbeat pop" --lyrics "La la la..." --out song.mp3 --quiet
 
+# Auto-generate lyrics from prompt
+mmx music generate --prompt "Upbeat pop about summer" --lyrics-optimizer --out summer.mp3 --quiet
+
+# Instrumental
+mmx music generate --prompt "Cinematic orchestral, building tension" --instrumental --out bgm.mp3 --quiet
+
 # Detailed prompt with vocal characteristics
 mmx music generate --prompt "Warm morning folk" \
   --vocals "male and female duet, harmonies in chorus" \
   --instruments "acoustic guitar, piano" \
   --bpm 95 \
   --lyrics-file song.txt \
   --out duet.mp3
+```
+
+---
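The mutual-exclusion rules stated in the `music generate` flag table (`--lyrics`, `--lyrics-optimizer`, `--instrumental`) can be sketched as a single resolver. This is a hedged illustration, not the code in the commit: `LyricFlags`, `LyricMode`, and `resolveLyricMode` are hypothetical names, and mapping each mode to the `lyrics`, `lyrics_optimizer`, or `is_instrumental` request field is an assumption based on the field names listed in the commit message.

```typescript
// Hypothetical sketch of the flag rules in the table; not taken from generate.ts.
interface LyricFlags {
  lyrics?: string;          // from --lyrics or --lyrics-file
  instrumental?: boolean;   // --instrumental
  lyricsOptimizer?: boolean; // --lyrics-optimizer
}

// Exactly one of these shapes ends up in the request body (assumed field names
// follow the MusicRequest fields named in the commit message).
type LyricMode =
  | { lyrics: string }
  | { is_instrumental: true }
  | { lyrics_optimizer: true };

function resolveLyricMode(flags: LyricFlags): LyricMode {
  const hasLyrics = flags.lyrics !== undefined && flags.lyrics.trim() !== '';
  const instrumental = flags.instrumental === true;
  const optimizer = flags.lyricsOptimizer === true;

  if (optimizer && (hasLyrics || instrumental)) {
    throw new Error('--lyrics-optimizer cannot be used with --lyrics or --instrumental.');
  }
  if (instrumental && hasLyrics) {
    throw new Error('--instrumental cannot be used with --lyrics.');
  }
  if (optimizer) return { lyrics_optimizer: true };
  if (instrumental) return { is_instrumental: true };
  if (hasLyrics) return { lyrics: flags.lyrics as string };
  throw new Error('--lyrics is required unless --instrumental or --lyrics-optimizer is used.');
}
```

Returning a union type rather than three optional fields makes it hard for a caller to send two modes at once, which is the failure the table rules are meant to prevent.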
+
+### music cover
+
+Generate a cover version of a song based on reference audio.
+
+**Model:** `music-cover-free` — unlimited for API key users, RPM = 3.
+
+```bash
+mmx music cover --prompt <text> (--audio <url> | --audio-file <path>) [flags]
+```
+
+| Flag | Type | Description |
+|---|---|---|
+| `--prompt <text>` | string, **required** | Target cover style, e.g. `"Indie folk, acoustic guitar, warm male vocal"` |
+| `--audio <url>` | string | URL of reference audio (mp3, wav, flac, etc. — 6s to 6min, max 50MB) |
+| `--audio-file <path>` | string | Local reference audio file (auto base64-encoded) |
+| `--lyrics <text>` | string | Cover lyrics. If omitted, extracted from reference audio via ASR. |
+| `--lyrics-file <path>` | string | Read lyrics from file. Use `-` for stdin |
+| `--seed <number>` | number | Random seed 0–1000000 for reproducible results |
+| `--format <fmt>` | string | Audio format: `mp3`, `wav`, `pcm` (default: `mp3`) |
+| `--sample-rate <hz>` | number | Sample rate (default: 44100) |
+| `--bitrate <bps>` | number | Bitrate (default: 256000) |
+| `--channel <n>` | number | Channels: `1` (mono) or `2` (stereo, default) |
+| `--out <path>` | string | Save audio to file |
+| `--stream` | boolean | Stream raw audio to stdout |
+
+```bash
+# Cover from URL
+mmx music cover --prompt "Indie folk, acoustic guitar, warm male vocal" \
+  --audio https://filecdn.minimax.chat/public/d20eda57-2e36-45bf-9e12-82d9f2e69a86.mp3 --out cover.mp3 --quiet
+
+# Cover from local file with custom lyrics
+mmx music cover --prompt "Jazz, piano, slow" \
+  --audio-file original.mp3 --lyrics-file lyrics.txt --out jazz_cover.mp3 --quiet
 
-# Instrumental (use --instrumental flag)
-mmx music generate --prompt "Cinematic orchestral, building tension" --instrumental --out bgm.mp3
+# Reproducible result with seed
+mmx music cover --prompt "Pop, upbeat" --audio https://filecdn.minimax.chat/public/d20eda57-2e36-45bf-9e12-82d9f2e69a86.mp3 --seed 42 --out cover.mp3
 ```
 
 ---
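The documented value ranges for the `music cover` flags could be checked client-side before a request is sent. The sketch below is hypothetical: `CoverAudioOptions` and `validateCoverOptions` are illustrative names, the allowed sample rates and bitrates are copied from the option descriptions in `src/commands/music/cover.ts`, and treating `--seed` as an integer is an assumption (the docs only give the range 0–1000000). As far as this diff shows, the command itself forwards these values without such checks.

```typescript
// Hypothetical client-side validation; not part of the commit.
interface CoverAudioOptions {
  seed?: number;
  format?: string;
  sampleRate?: number;
  bitrate?: number;
  channel?: number;
}

// Allowed values as listed in the flag descriptions of this commit.
const FORMATS = ['mp3', 'wav', 'pcm'];
const SAMPLE_RATES = [16000, 24000, 32000, 44100];
const BITRATES = [32000, 64000, 128000, 256000];

// Returns a list of human-readable problems; an empty list means the options pass.
function validateCoverOptions(o: CoverAudioOptions): string[] {
  const problems: string[] = [];
  // ASSUMPTION: the seed must be an integer; the docs only state the range.
  if (o.seed !== undefined && (!Number.isInteger(o.seed) || o.seed < 0 || o.seed > 1000000)) {
    problems.push('--seed must be an integer from 0 to 1000000');
  }
  if (o.format !== undefined && !FORMATS.includes(o.format)) {
    problems.push('--format must be one of: ' + FORMATS.join(', '));
  }
  if (o.sampleRate !== undefined && !SAMPLE_RATES.includes(o.sampleRate)) {
    problems.push('--sample-rate must be one of: ' + SAMPLE_RATES.join(', '));
  }
  if (o.bitrate !== undefined && !BITRATES.includes(o.bitrate)) {
    problems.push('--bitrate must be one of: ' + BITRATES.join(', '));
  }
  if (o.channel !== undefined && o.channel !== 1 && o.channel !== 2) {
    problems.push('--channel must be 1 (mono) or 2 (stereo)');
  }
  return problems;
}
```

Collecting every problem instead of throwing on the first one lets a CLI report all invalid flags in a single usage error.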

src/commands/music/cover.ts

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
+import { readFileSync } from 'fs';
+import { defineCommand } from '../../command';
+import { CLIError } from '../../errors/base';
+import { ExitCode } from '../../errors/codes';
+import { request, requestJson } from '../../client/http';
+import { musicEndpoint } from '../../client/endpoints';
+import { formatOutput, detectOutputFormat } from '../../output/formatter';
+import { saveAudioOutput } from '../../output/audio';
+import type { Config } from '../../config/schema';
+import type { GlobalFlags } from '../../types/flags';
+import type { MusicRequest, MusicResponse } from '../../types/api';
+import { musicCoverModel } from './models';
+
+export default defineCommand({
+  name: 'music cover',
+  description: 'Generate a cover version of a song based on reference audio (music-cover-free)',
+  usage: 'mmx music cover --prompt <text> (--audio <url> | --audio-file <path>) [--lyrics <text>] [--out <path>] [flags]',
+  options: [
+    { flag: '--prompt <text>', description: 'Target cover style, e.g. "Indie folk, acoustic guitar, warm male vocal"' },
+    { flag: '--audio <url>', description: 'URL of the reference audio (mp3, wav, flac, etc. — 6s to 6min, max 50MB)' },
+    { flag: '--audio-file <path>', description: 'Local reference audio file (auto base64-encoded)' },
+    { flag: '--lyrics <text>', description: 'Cover lyrics. If omitted, extracted from reference audio via ASR.' },
+    { flag: '--lyrics-file <path>', description: 'Read lyrics from file (use - for stdin)' },
+    { flag: '--seed <number>', description: 'Random seed 0–1000000 for reproducible results', type: 'number' },
+    { flag: '--format <fmt>', description: 'Audio format: mp3, wav, pcm (default: mp3)' },
+    { flag: '--sample-rate <hz>', description: 'Sample rate: 16000, 24000, 32000, 44100 (default: 44100)', type: 'number' },
+    { flag: '--bitrate <bps>', description: 'Bitrate: 32000, 64000, 128000, 256000 (default: 256000)', type: 'number' },
+    { flag: '--channel <n>', description: 'Channels: 1 (mono) or 2 (stereo, default)', type: 'number' },
+    { flag: '--stream', description: 'Stream raw audio to stdout' },
+    { flag: '--out <path>', description: 'Save audio to file' },
+  ],
+  examples: [
+    'mmx music cover --prompt "Indie folk, acoustic guitar, warm male vocal" --audio https://example.com/song.mp3 --out cover.mp3',
+    'mmx music cover --prompt "Jazz, piano, slow" --audio-file original.mp3 --lyrics-file lyrics.txt --out jazz_cover.mp3',
+    'mmx music cover --prompt "Pop, upbeat" --audio https://example.com/ref.mp3 --seed 42 --out reproducible.mp3',
+  ],
+  async run(config: Config, flags: GlobalFlags) {
+    const prompt = flags.prompt as string | undefined;
+    const audioUrl = flags.audio as string | undefined;
+    const audioFile = flags.audioFile as string | undefined;
+
+    if (!prompt) {
+      throw new CLIError('--prompt is required.', ExitCode.USAGE, 'mmx music cover --prompt <text> --audio <url>');
+    }
+
+    if (!audioUrl && !audioFile) {
+      throw new CLIError(
+        'One of --audio <url> or --audio-file <path> is required.',
+        ExitCode.USAGE,
+        'mmx music cover --prompt <text> --audio <url>',
+      );
+    }
+
+    if (audioUrl && audioFile) {
+      throw new CLIError('Use either --audio or --audio-file, not both.', ExitCode.USAGE);
+    }
+
+    let lyrics = flags.lyrics as string | undefined;
+    if (flags.lyricsFile) {
+      const { readTextFromPathOrStdin } = await import('../../utils/fs');
+      lyrics = readTextFromPathOrStdin(flags.lyricsFile as string);
+    }
+
+    const ts = new Date().toISOString().slice(0, 19).replace(/[T:]/g, '-');
+    const ext = (flags.format as string) || 'mp3';
+    const outPath = (flags.out as string | undefined) ?? `cover_${ts}.${ext}`;
+    const format = detectOutputFormat(config.output);
+
+    const model = musicCoverModel(config);
+    const body: MusicRequest = {
+      model,
+      prompt,
+      lyrics,
+      seed: flags.seed as number | undefined,
+      audio_setting: {
+        format: ext,
+        sample_rate: (flags.sampleRate as number) ?? 44100,
+        bitrate: (flags.bitrate as number) ?? 256000,
+        channel: (flags.channel as number) ?? undefined,
+      },
+      output_format: 'hex',
+      stream: flags.stream === true,
+    };
+
+    if (audioUrl) {
+      body.audio_url = audioUrl;
+    } else {
+      body.audio_base64 = readFileSync(audioFile!).toString('base64');
+    }
+
+    if (config.dryRun) {
+      console.log(formatOutput({ request: body }, format));
+      return;
+    }
+
+    const url = musicEndpoint(config.baseUrl);
+
+    if (flags.stream) {
+      const res = await request(config, { url, method: 'POST', body, stream: true });
+      const reader = res.body?.getReader();
+      if (!reader) throw new CLIError('No response body', ExitCode.GENERAL);
+      while (true) {
+        const { done, value } = await reader.read();
+        if (done) break;
+        process.stdout.write(value);
+      }
+      reader.releaseLock();
+      return;
+    }
+
+    const response = await requestJson<MusicResponse>(config, {
+      url,
+      method: 'POST',
+      body,
+    });
+
+    if (!config.quiet) process.stderr.write(`[Model: ${model}]\n`);
+    saveAudioOutput(response, outPath, format, config.quiet);
+  },
+});
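When `--out` is omitted, the `run` handler above derives a default file name from the current time. The sketch below isolates that expression so its result is easier to see; `defaultCoverPath` is a hypothetical helper name, not a function in the commit, and it takes the clock as a parameter purely so the behaviour can be checked deterministically.

```typescript
// Hypothetical helper wrapping the timestamp expression used in cover.ts.
function defaultCoverPath(now: Date, format?: string): string {
  // toISOString() is always UTC, e.g. "2024-01-02T03:04:05.678Z".
  // slice(0, 19) drops the milliseconds and the trailing "Z";
  // the replace turns "T" and ":" into "-" so the name is filesystem-friendly.
  const ts = now.toISOString().slice(0, 19).replace(/[T:]/g, '-');
  // Same fallback as the command: an empty or missing format means mp3.
  const ext = format || 'mp3';
  return `cover_${ts}.${ext}`;
}

// Example: a fixed instant yields a fixed name.
const examplePath = defaultCoverPath(new Date('2024-01-02T03:04:05.678Z'));
// examplePath === 'cover_2024-01-02-03-04-05.mp3'
```

Because the timestamp is in UTC and has one-second resolution, the name may not match the local wall clock, and two runs within the same second would resolve to the same default path.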
