|
@@ -1,2 +1,58 @@
|
|
|
# pandas-ai
|
|
|
|
|
|
+pandasai integrates models from OpenAI and Hugging Face, so you can ask questions of a pandas DataFrame directly, with no extra data processing.
|
|
|
+
|
|
|
+## Usage
|
|
|
+
|
|
|
+```
|
|
|
+# Install the package first (run this in a shell, not in Python): pip install pandasai
|
|
|
+
|
|
|
+
|
|
|
+import pandas as pd
|
|
|
+from pandasai import PandasAI
|
|
|
+
|
|
|
+# Sample DataFrame
|
|
|
+df = pd.DataFrame({
|
|
|
+ "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
|
|
|
+ "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
|
|
|
+ "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
|
|
|
+})
|
|
|
+
|
|
|
+# Instantiate an LLM
|
|
|
+from pandasai.llm.openai import OpenAI
|
|
|
+llm = OpenAI()
|
|
|
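+# Or pass the OpenAI API token explicitly:
|
|
|
+#llm = OpenAI(api_token="YOUR_OPENAI_API_KEY")
|
|
|
+# Or use a Hugging Face model instead:
+#from pandasai.llm.starcoder import Starcoder
+#llm = Starcoder(api_token="YOUR_HF_API_KEY")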
|
|
|
+
|
|
|
+pandas_ai = PandasAI(llm)
|
|
|
+pandas_ai.run(df, prompt='Which are the 5 happiest countries?')
|
|
|
+
|
|
|
+```
|
|
|
+
|
|
|
+## Source code analysis
|
|
|
+
|
|
|
+pandas-ai can use models from OpenAI and Hugging Face. This analysis follows the OpenAI path, which requires setting an OpenAI api_token.
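
One way to supply the token without hard-coding it is to read it from an environment variable. The following is a minimal sketch, assuming the variable is named `OPENAI_API_KEY`; the helper `load_api_token` is hypothetical and not part of pandasai.

```python
import os


def load_api_token(env_var="OPENAI_API_KEY", environ=None):
    """Return the API token stored in env_var, or raise a clear error.

    `environ` defaults to the process environment; passing a dict
    makes the helper easy to exercise without touching real settings.
    """
    environ = os.environ if environ is None else environ
    token = environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token
```

The returned value can then be passed along as `OpenAI(api_token=load_api_token())`.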
|
|
|
+
|
|
|
+1. Build the prompt:
|
|
|
+
|
|
|
+```
|
|
|
+There is a dataframe in pandas (python).
|
|
|
+The name of the dataframe is `df`.
|
|
|
+This is the result of `print(df.head({rows_to_display}))`:
|
|
|
+{df_head}.
|
|
|
+
|
|
|
+Return the python code (do not import anything) and make sure to prefix the python code with {START_CODE_TAG} exactly and suffix the code with {END_CODE_TAG} exactly
|
|
|
+to get the answer to the following question :
|
|
|
+```
|
|
|
+The user's question is appended to the end of the template above.
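
The prompt assembly described above can be sketched roughly as follows. The template text is copied from the block above; the tag values and the helper `build_prompt` are assumptions made for illustration, and the library's actual values and function names may differ.

```python
# Template text copied from the prompt shown above.
PROMPT_TEMPLATE = """There is a dataframe in pandas (python).
The name of the dataframe is `df`.
This is the result of `print(df.head({rows_to_display}))`:
{df_head}.

Return the python code (do not import anything) and make sure to prefix the python code with {START_CODE_TAG} exactly and suffix the code with {END_CODE_TAG} exactly
to get the answer to the following question :
"""

# Assumed tag values, chosen for illustration only.
START_CODE_TAG = "<startCode>"
END_CODE_TAG = "<endCode>"


def build_prompt(df_head, question, rows_to_display=5):
    """Fill in the template, then append the user's question (hypothetical helper)."""
    filled = PROMPT_TEMPLATE.format(
        rows_to_display=rows_to_display,
        df_head=df_head,
        START_CODE_TAG=START_CODE_TAG,
        END_CODE_TAG=END_CODE_TAG,
    )
    return filled + question


# df_head is passed as plain text; with pandas it would be str(df.head(5)).
prompt = build_prompt(
    "  country  gdp\n0  France    1",
    "Which country has the highest gdp?",
)
```

Passing the head of the DataFrame as text keeps the sketch free of dependencies; the model only ever sees this text preview, not the full data.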
|
|
|
+
|
|
|
+2. Call the OpenAI API. The prompt above asks the model to generate Python code wrapped in the start and end tags.
|
|
|
+
|
|
|
+3. Call the run_code() method, which runs exec(code_to_run) and returns the result.
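
Steps 2 and 3 can be illustrated with a small sketch: pull the code out from between the tags, then run it with `exec`. This is not the library's implementation. The tag values, the helper `extract_code`, and the convention of reading a variable named `result` are all assumptions, and a canned string stands in for the model's reply so that no API call is made.

```python
import re

# Assumed tag values, chosen for illustration only.
START_CODE_TAG = "<startCode>"
END_CODE_TAG = "<endCode>"


def extract_code(response):
    """Return the text between the start and end tags of a model response."""
    pattern = re.escape(START_CODE_TAG) + r"(.*?)" + re.escape(END_CODE_TAG)
    match = re.search(pattern, response, re.DOTALL)
    if match is None:
        raise ValueError("no tagged code found in the response")
    return match.group(1).strip()


def run_code(code_to_run, df):
    """Execute the code with `df` in scope and return the variable `result`.

    Reading a variable named `result` is a convention of this sketch only.
    """
    namespace = {"df": df}
    exec(code_to_run, namespace)  # executes whatever code it is given
    return namespace.get("result")


# A stand-in for the model's reply; a plain list plays the role of the DataFrame.
fake_response = (
    "Here you go: <startCode>\n"
    "result = sorted(df, reverse=True)[:2]\n"
    "<endCode>"
)
answer = run_code(extract_code(fake_response), df=[3, 1, 2])
```

Because `exec` runs whatever the model returns, generated code should be treated as untrusted input and run only in an environment where that is acceptable.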
|
|
|
+
|
|
|
+**Note:**
|
|
|
+
|
|
|
+To see visualized results such as plots, run the code in a notebook.
|
|
|
+
|
|
|
+Dependencies: pandas, openai, requests, dotenv. The remaining dependencies are the basic plotting packages used in a notebook.
|