BPE tokenizer for OpenAI models; compresses text data https://github.com/openai/tiktoken

天问 7685f9bb93 Update 'README.md' 1 year ago

tiktoken

BPE tokenizer for OpenAI models

Usage

pip install tiktoken

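import tiktoken

text = """
hello world
"""
# Look up the encoding used by the given model
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

# Encode: text -> list of integer token IDs
tokens = encoding.encode(text)
print(tokens)

# Decode: map each token ID back to the raw bytes it stands for
print([encoding.decode_single_token_bytes(token) for token in tokens])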
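
For intuition about what the token IDs above represent, here is a minimal toy sketch of byte pair encoding (BPE). It is not tiktoken's implementation: the helper names `train_bpe`, `merge`, and `decode` are hypothetical, and it skips the pre-tokenization and pretrained vocabularies that a real tokenizer uses. It only shows the core idea of repeatedly replacing the most frequent adjacent pair with a new ID, which is why the token sequence is shorter than the raw bytes.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(data: bytes, num_merges: int):
    """Learn up to `num_merges` merges from `data`; return (merges, token_ids)."""
    ids = list(data)  # start from raw bytes, i.e. IDs 0..255
    merges = {}       # (left_id, right_id) -> new_id, in creation order
    next_id = 256
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no adjacent pair repeats, so stop merging
        merges[pair] = next_id
        ids = merge(ids, pair, next_id)
        next_id += 1
    return merges, ids


def decode(ids, merges) -> bytes:
    """Expand token IDs back into the original bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    # Dicts keep insertion order, so both halves of a merge already exist.
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids)


data = "hello world hello world hello world".encode("utf-8")
merges, ids = train_bpe(data, num_merges=10)
print(len(data), "bytes ->", len(ids), "tokens")
print(decode(ids, merges).decode("utf-8"))
```

The round trip through `decode` returns the original text, so the shorter ID sequence is a lossless encoding of the input.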