Issue when fine-tuning on a custom dataset

Thanks for taking the time.

I tried fine-tuning on my own dataset, following the default instructions. However, when I ran inference with the new model, I got this error:

```
2024-09-23 22:29:19.347 | INFO     | __main__:main:662 - Loading model ...
Traceback (most recent call last):
    model: Union[NaiveTransformer, DualARTransformer] = BaseTransformer.from_pretrained(
  File "/home/colafly/Repo/fish-speech/fish_speech/models/text2semantic/llama.py", line 325, in from_pretrained
    config = BaseModelArgs.from_pretrained(str(path))
  File "/home/colafly/Repo/fish-speech/fish_speech/models/text2semantic/llama.py", line 77, in from_pretrained
    data = json.load(f)
  File "/home/colafly/.pyenv/versions/3.10.14/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/home/colafly/.pyenv/versions/3.10.14/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte
```

Any idea what the issue might be?

I did not modify any of the steps, except for reducing the batch size to 4 so that training fits on my GPU. Any help is much appreciated. This is the inference command I ran:
```
python tools/llama/generate.py \
    --text "深呼吸" \
    --checkpoint-path "checkpoints/fish-speech-1.4-yth-lora/model.pth" \
    --num-samples 2 \
    --compile
```
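
For what it's worth, the traceback shows `json.load` failing while the file contents are being decoded as UTF-8, which is what happens when the file being read is binary rather than a JSON text file. Below is a minimal sketch of that failure mode. It uses hypothetical temporary files (a fake binary blob named `model.pth` and a made-up `config.json` with an invented `dim` key) rather than the real checkpoint, and I have not confirmed that this is how `--checkpoint-path` is handled internally, so treat it as an assumption rather than a diagnosis.

```python
import json
import tempfile
from pathlib import Path


def try_load_json(path: Path) -> str:
    """Try to parse `path` as UTF-8 JSON and report what happened."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            json.load(f)
        return "ok"
    except UnicodeDecodeError:
        # Raised while reading, before any JSON parsing takes place.
        return "not utf-8 text"
    except json.JSONDecodeError:
        return "utf-8 text but not valid JSON"


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins, not the real fine-tuned checkpoint:
    # a binary blob containing 0x80 bytes, and a small JSON config.
    binary_file = Path(tmp) / "model.pth"
    binary_file.write_bytes(b"PK\x03\x04" + b"\x80\x02" * 40)

    config_file = Path(tmp) / "config.json"
    config_file.write_text('{"dim": 1024}', encoding="utf-8")

    binary_result = try_load_json(binary_file)
    config_result = try_load_json(config_file)
    print(binary_result)
    print(config_result)
```

If the same helper reports that the real path is not UTF-8 text, that would suggest the loader is being pointed at a binary file where it expects a JSON config.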


