BPE tokenization for OpenAI models, compressing text data https://github.com/openai/tiktoken
BPE tokenization for OpenAI models
pip install tiktoken
import tiktoken
text = """
hello world
"""
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
# Encode the text into a list of integer token IDs
tokens = encoding.encode(text)
print(tokens)
# Decode each token ID back to the bytes it represents
print([encoding.decode_single_token_bytes(token) for token in tokens])
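To give a feel for why BPE compresses text, here is a minimal toy sketch of byte pair merging in pure Python. It is not tiktoken's implementation: tiktoken applies a fixed, pre-trained merge table, whereas this sketch learns its merges from the input string itself. The function names (`most_frequent_pair`, `merge_pair`, `toy_bpe_encode`, `toy_bpe_decode`) are hypothetical and exist only for this illustration.

```python
from collections import Counter


def most_frequent_pair(ids):
    """Return the most common adjacent pair of ids, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def toy_bpe_encode(text, num_merges):
    """Start from raw UTF-8 bytes and apply up to `num_merges` pair merges."""
    ids = list(text.encode("utf-8"))
    merges = {}  # new id -> the pair of ids it replaced
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:
            break
        new_id = 256 + step  # ids 0..255 are reserved for single bytes
        ids = merge_pair(ids, pair, new_id)
        merges[new_id] = pair
    return ids, merges


def toy_bpe_decode(ids, merges):
    """Expand merged ids back into bytes, then decode as UTF-8."""
    def expand(i):
        if i < 256:
            return bytes([i])
        left, right = merges[i]
        return expand(left) + expand(right)

    return b"".join(expand(i) for i in ids).decode("utf-8")


sample = "hello hello hello world"
ids, merges = toy_bpe_encode(sample, num_merges=5)
print(len(sample.encode("utf-8")), "bytes ->", len(ids), "tokens")
print(toy_bpe_decode(ids, merges) == sample)
```

Each merge replaces a frequent pair with one new id, so repeated substrings collapse into fewer tokens while the merge table keeps the encoding reversible.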