Source code analysis

liuyuqi-dellpc 2 years ago
parent
commit
9012224e0d
1 changed files with 56 additions and 0 deletions
README.md

@@ -1,2 +1,58 @@
 # pandas-ai
 
+pandasai integrates OpenAI and Hugging Face models so that you can ask questions of a pandas DataFrame directly, with no extra data preprocessing.
+
+## Usage
+
+```
+# Install first (shell command, not Python): pip install pandasai
+
+import pandas as pd
+from pandasai import PandasAI
+
+# Sample DataFrame
+df = pd.DataFrame({
+    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
+    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
+    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
+})
+
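+# Instantiate an LLM
+from pandasai.llm.openai import OpenAI
+# With no argument, the API key is expected to come from the environment (OPENAI_API_KEY)
+llm = OpenAI()
+# Or pass the key explicitly:
+#llm = OpenAI(api_token="YOUR_OPENAI_API_KEY")
+# Or use a Hugging Face model instead:
+#from pandasai.llm.starcoder import Starcoder
+#llm = Starcoder(api_token="YOUR_HF_API_KEY")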
+
+pandas_ai = PandasAI(llm)
+pandas_ai.run(df, prompt='Which are the 5 happiest countries?')
+
+```
+
+## Source code analysis
+
+pandas-ai can use models from either OpenAI or Hugging Face. This analysis follows the OpenAI path, which requires an OpenAI api_token.
+
+1. Build the prompt:
+
+```
+There is a dataframe in pandas (python).
+The name of the dataframe is `df`.
+This is the result of `print(df.head({rows_to_display}))`:
+{df_head}.
+
+Return the python code (do not import anything) and make sure to prefix the python code with {START_CODE_TAG} exactly and suffix the code with {END_CODE_TAG} exactly 
+to get the answer to the following question :
+```
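+The user's question is appended to the end of this template.
+
+2. Call the OpenAI API. The prompt above instructs the model to generate Python code wrapped in the start and end tags.
+
+3. Call the run_code() method, which runs exec(code_to_run) and returns the result.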
+
+**Note:**
+
+To see visualized results (plots), run the code in a notebook.
+
+Dependencies: pandas, openai, requests, dotenv, plus the basic plotting packages used in notebooks.
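
The three steps analyzed in the README (build the prompt, have the model return tagged code, then `exec` it) can be sketched roughly as below. This is a simplified illustration, not the pandasai implementation. The tag values, the helper names `build_prompt`, `extract_code` and `fake_llm`, and the convention that the generated code stores its answer in a variable called `result` are all assumptions made for the example. A canned response stands in for the OpenAI call, and a plain list of tuples stands in for the DataFrame so the sketch runs without pandas.

```python
import re

# Hypothetical tag values; the README only shows the placeholders.
START_CODE_TAG = "<startCode>"
END_CODE_TAG = "<endCode>"

# Prompt template as quoted in the README.
PROMPT_TEMPLATE = (
    "There is a dataframe in pandas (python).\n"
    "The name of the dataframe is `df`.\n"
    "This is the result of `print(df.head({rows_to_display}))`:\n"
    "{df_head}.\n\n"
    "Return the python code (do not import anything) and make sure to prefix "
    "the python code with {START_CODE_TAG} exactly and suffix the code with "
    "{END_CODE_TAG} exactly\n"
    "to get the answer to the following question :\n"
)


def build_prompt(df_head, question, rows_to_display=5):
    """Step 1: fill in the template and append the user's question."""
    return PROMPT_TEMPLATE.format(
        rows_to_display=rows_to_display,
        df_head=df_head,
        START_CODE_TAG=START_CODE_TAG,
        END_CODE_TAG=END_CODE_TAG,
    ) + question


def fake_llm(prompt):
    """Step 2 stand-in: a canned reply in place of a real OpenAI call."""
    code = "result = sorted(df, key=lambda row: row[1], reverse=True)[:2]"
    return f"Here you go:\n{START_CODE_TAG}\n{code}\n{END_CODE_TAG}"


def extract_code(response):
    """Pull out the code between the start and end tags."""
    pattern = re.escape(START_CODE_TAG) + r"(.*?)" + re.escape(END_CODE_TAG)
    match = re.search(pattern, response, re.DOTALL)
    if match is None:
        raise ValueError("no tagged code found in the model response")
    return match.group(1).strip()


def run_code(code_to_run, df):
    """Step 3: exec the generated code with `df` in scope, return `result`."""
    namespace = {"df": df}
    exec(code_to_run, namespace)
    return namespace.get("result")


# Stand-in for a DataFrame: (country, happiness_index) rows.
df = [("Canada", 7.23), ("Japan", 5.87), ("Australia", 7.22)]
prompt = build_prompt(str(df), "Which are the 2 happiest countries?")
answer = run_code(extract_code(fake_llm(prompt)), df)
print(answer)
```

Because `exec` runs whatever the model returns, generated code should be treated as untrusted input; this sketch does no sandboxing.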