pandasai integrates OpenAI and Hugging Face models and lets you ask questions directly against a pandas DataFrame, with no extra data processing.
pip install pandasai
import pandas as pd
from pandasai import PandasAI
# Sample DataFrame
df = pd.DataFrame({
"country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
"gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
"happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})
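# Instantiate an LLM
from pandasai.llm.openai import OpenAI
llm = OpenAI()
# Or pass the OpenAI key explicitly:
# llm = OpenAI(api_token="YOUR_OPENAI_API_KEY")
# Or use a Hugging Face model instead (needs: from pandasai.llm.starcoder import Starcoder):
# llm = Starcoder(api_token="YOUR_HF_API_KEY")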
pandas_ai = PandasAI(llm)
pandas_ai.run(df, prompt='Which are the 5 happiest countries?')
pandas-ai can use models from either OpenAI or Hugging Face. The walkthrough here follows the OpenAI path, which requires an OpenAI api_token to be set.
1. Build the prompt:
There is a dataframe in pandas (python).
The name of the dataframe is `df`.
This is the result of `print(df.head({rows_to_display}))`:
{df_head}.
Return the python code (do not import anything) and make sure to prefix the python code with {START_CODE_TAG} exactly and suffix the code with {END_CODE_TAG} exactly
to get the answer to the following question :
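The user's question is appended to the end of the template above.
2. Call the OpenAI API. The prompt above asks the model to generate Python code, wrapped in the start and end tags.
3. Call run_code(), which runs exec(code_to_run) and returns the result.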
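The three steps above can be sketched end to end as follows. This is a minimal illustration, not pandasai's actual implementation: the tag values, the helper names `build_prompt` and `extract_code`, and the way `run_code` captures printed output are all assumptions made for this example. To keep it runnable without pandas or an API key, a list of dicts stands in for the DataFrame and a canned string stands in for the model's reply.

```python
import contextlib
import io
import re

# Placeholder tag values, chosen for this sketch only.
START_CODE_TAG = "<startCode>"
END_CODE_TAG = "<endCode>"

PROMPT_TEMPLATE = (
    "There is a dataframe in pandas (python).\n"
    "The name of the dataframe is `df`.\n"
    "This is the result of `print(df.head({rows_to_display}))`:\n"
    "{df_head}.\n"
    "Return the python code (do not import anything) and make sure to prefix "
    "the python code with {START_CODE_TAG} exactly and suffix the code with "
    "{END_CODE_TAG} exactly\n"
    "to get the answer to the following question :\n"
)


def build_prompt(df_head: str, question: str, rows_to_display: int = 5) -> str:
    """Step 1: fill in the template and append the user's question."""
    return PROMPT_TEMPLATE.format(
        rows_to_display=rows_to_display,
        df_head=df_head,
        START_CODE_TAG=START_CODE_TAG,
        END_CODE_TAG=END_CODE_TAG,
    ) + question


def extract_code(response: str) -> str:
    """Step 2 (post-processing): pull the code out from between the two tags."""
    pattern = re.escape(START_CODE_TAG) + r"(.*?)" + re.escape(END_CODE_TAG)
    match = re.search(pattern, response, re.DOTALL)
    if match is None:
        raise ValueError("response does not contain tagged code")
    return match.group(1).strip()


def run_code(code_to_run: str, df) -> str:
    """Step 3: exec the code with `df` in scope and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code_to_run, {"df": df})
    return buffer.getvalue().strip()


# Stand-in data and a canned reply (hypothetical, for illustration only).
rows = [
    {"country": "Canada", "happiness_index": 7.23},
    {"country": "Japan", "happiness_index": 5.87},
]
prompt = build_prompt(df_head=str(rows), question="Which country is happiest?")
fake_response = (
    "Sure.\n" + START_CODE_TAG + "\n"
    "print(max(df, key=lambda r: r['happiness_index'])['country'])\n"
    + END_CODE_TAG
)
answer = run_code(extract_code(fake_response), rows)
print(answer)
```

Because exec runs whatever the model returns, this pattern only makes sense in an environment you trust.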
Notes:
To see visualized results, run the code in a notebook.
Dependencies: pandas, openai, requests, dotenv. The remaining ones are the basic plotting packages used in notebooks.