
tiktoken

BPE tokenizer for OpenAI models.

Usage

pip install tiktoken

import tiktoken

text = """
hello world
"""
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

# Encode the text into a list of integer token IDs
tokens = encoding.encode(text)
print(tokens)

# Decode each token ID back into the raw bytes it stands for
print([encoding.decode_single_token_bytes(token) for token in tokens])
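
Because decode_single_token_bytes returns bytes rather than str, a single token is not always valid UTF-8 on its own (a multi-byte character can be split across tokens). The sketch below shows one way to turn the per-token bytes back into text and compare that with encoding.decode. The helper names bytes_to_text and roundtrip are hypothetical and not part of tiktoken; the tiktoken import is done lazily inside roundtrip so the helper can be used without it.

```python
def bytes_to_text(pieces):
    """Join per-token byte strings, then decode the whole sequence as UTF-8.

    Decoding after joining matters: a lone piece may hold only part of a
    multi-byte character. Invalid sequences become U+FFFD instead of raising.
    """
    return b"".join(pieces).decode("utf-8", errors="replace")


def roundtrip(text, model="gpt-3.5-turbo"):
    """Encode text, then rebuild it two ways (hypothetical helper).

    Returns (tokens, text_from_bytes, text_from_decode).
    """
    import tiktoken  # imported here so bytes_to_text works without tiktoken

    encoding = tiktoken.encoding_for_model(model)
    tokens = encoding.encode(text)
    # Per-token bytes, as in the usage example above
    pieces = [encoding.decode_single_token_bytes(token) for token in tokens]
    # encoding.decode maps the full token list straight back to a string
    return tokens, bytes_to_text(pieces), encoding.decode(tokens)
```

Calling roundtrip("hello world") should give back the token IDs plus two strings that both match the input. len(tokens) is the token count for the text.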