
Update 'README.md'

天问 committed 1 year ago (commit 6a4eed7139)
1 changed file with 21 additions and 59 deletions

README.md (+21 −59)

@@ -1,58 +1,21 @@
 # LocalAI
-Developed in Go, built on the open-source llama.cpp local-model backend; it can be deployed with Docker and includes a web UI.
-
-**LocalAI** is a drop-in replacement REST API compatible with OpenAI API specifications for local inferencing. It allows you to run models locally or on-prem with consumer-grade hardware, supporting multiple model families compatible with the `ggml` format. For a list of the supported model families, see [the model compatibility table below](https://github.com/go-skynet/LocalAI#model-compatibility-table).

-- OpenAI drop-in alternative REST API
-- Supports multiple models
-- Once loaded the first time, it keeps models loaded in memory for faster inference
-- Support for prompt templates
-- Doesn't shell-out, but uses C++ bindings for faster inference and better performance.
+https://github.com/mudler/LocalAI

-LocalAI is a community-driven project focused on making AI accessible to anyone. Any contribution, feedback, and PR is welcome! It was initially created by [mudler](https://github.com/mudler/) at the [SpectroCloud OSS Office](https://github.com/spectrocloud).

-LocalAI uses C++ bindings for optimizing speed. It is based on [llama.cpp](https://github.com/ggerganov/llama.cpp), [gpt4all](https://github.com/nomic-ai/gpt4all), [rwkv.cpp](https://github.com/saharNooby/rwkv.cpp), [ggml](https://github.com/ggerganov/ggml), [whisper.cpp](https://github.com/ggerganov/whisper.cpp) for audio transcription, and [bert.cpp](https://github.com/skeskinen/bert.cpp) for embeddings.
+Developed in Go, built on the open-source llama.cpp local-model backend; it can be deployed with Docker and includes a web UI.

 See [examples on how to integrate LocalAI](https://github.com/go-skynet/LocalAI/tree/master/examples/).

-### How does it work?
-
-<details>
-  
-![LocalAI](https://github.com/go-skynet/LocalAI/assets/2420543/38de3a9b-3866-48cd-9234-662f9571064a)
-
-</details>
-
-## News
-
-- 14-05-2023: __v1.11.1__ released! `rwkv` backend patch release
-- 13-05-2023: __v1.11.0__ released! 🔥 Updated `llama.cpp` bindings: This update includes a breaking change in the model files ( https://github.com/ggerganov/llama.cpp/pull/1405 ) - old models should still work with the `gpt4all-llama` backend.
-- 12-05-2023: __v1.10.0__ released! 🔥🔥 Updated `gpt4all` bindings. Added support for GPTNeox (experimental), RedPajama (experimental), Starcoder (experimental), Replit (experimental), MosaicML MPT. Also now `embeddings` endpoint supports tokens arrays. See the [langchain-chroma](https://github.com/go-skynet/LocalAI/tree/master/examples/langchain-chroma) example! Note - this update does NOT include https://github.com/ggerganov/llama.cpp/pull/1405 which makes models incompatible.
-- 11-05-2023: __v1.9.0__ released! 🔥 Important whisper updates ( https://github.com/go-skynet/LocalAI/pull/233 https://github.com/go-skynet/LocalAI/pull/229 ) and extended gpt4all model families support ( https://github.com/go-skynet/LocalAI/pull/232 ). Redpajama/dolly experimental ( https://github.com/go-skynet/LocalAI/pull/214 )
-- 10-05-2023: __v1.8.0__ released! 🔥 Added support for fast and accurate embeddings with `bert.cpp` ( https://github.com/go-skynet/LocalAI/pull/222 )
-- 09-05-2023: Added experimental support for transcriptions endpoint ( https://github.com/go-skynet/LocalAI/pull/211 )
-- 08-05-2023: Support for embeddings with models using the `llama.cpp` backend ( https://github.com/go-skynet/LocalAI/pull/207 )
-- 02-05-2023: Support for `rwkv.cpp` models ( https://github.com/go-skynet/LocalAI/pull/158 ) and for `/edits` endpoint
-- 01-05-2023: Support for SSE stream of tokens in `llama.cpp` backends ( https://github.com/go-skynet/LocalAI/pull/152 )

-Twitter: [@LocalAI_API](https://twitter.com/LocalAI_API) and [@mudler_it](https://twitter.com/mudler_it)

-### Blogs and articles
-
-- [Question Answering on Documents locally with LangChain, LocalAI, Chroma, and GPT4All](https://mudler.pm/posts/localai-question-answering/) by Ettore Di Giacinto
-- [Tutorial to use k8sgpt with LocalAI](https://medium.com/@tyler_97636/k8sgpt-localai-unlock-kubernetes-superpowers-for-free-584790de9b65) - an excellent use case for LocalAI, using AI to analyse Kubernetes clusters, by Tyler Gillson
-
-## Contribute and help
-
-To help the project you can:
-
-- Upvote the [Reddit post](https://www.reddit.com/r/selfhosted/comments/12w4p2f/localai_openai_compatible_api_to_run_llm_models/) about LocalAI.
+```
+docker run -ti --rm --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
+```

-- [Hacker news post](https://news.ycombinator.com/item?id=35726934) - help us out by voting if you like this project.
+Visit: http://127.0.0.1:8080/swagger
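+
+Once the container is up, you can sanity-check the OpenAI-compatible API with curl. A minimal sketch: the `gpt-4` model name below is an assumption (the AIO images preload models under common aliases), so query `/v1/models` first to see what is actually loaded.
+
+```bash
+# List the models the server has loaded.
+curl http://127.0.0.1:8080/v1/models
+
+# Chat completion against the OpenAI-compatible endpoint
+# (replace "gpt-4" with a name returned by /v1/models).
+curl http://127.0.0.1:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
+```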

-- If you have technological skills and want to contribute to development, have a look at the open issues. If you are new, you can have a look at the [good-first-issue](https://github.com/go-skynet/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) and [help-wanted](https://github.com/go-skynet/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) labels.

-- If you don't have technological skills, you can still help by improving the documentation, adding examples, or sharing your user stories with our community; any help and contribution is welcome!

 ## Model compatibility

@@ -99,7 +62,7 @@ Depending on the model you are attempting to run might need more RAM or CPU reso
 <details>

 | Backend         | Compatible models     | Completion/Chat endpoint | Audio transcription | Embeddings support                | Token stream support | Github                                     | Bindings                                  |
-|-----------------|-----------------------|--------------------------|---------------------|-----------------------------------|----------------------|--------------------------------------------|-------------------------------------------|
+| --------------- | --------------------- | ------------------------ | ------------------- | --------------------------------- | -------------------- | ------------------------------------------ | ----------------------------------------- |
 | llama           | Vicuna, Alpaca, LLaMa | yes                      | no                  | yes (doesn't seem to be accurate) | yes                  | https://github.com/ggerganov/llama.cpp     | https://github.com/go-skynet/go-llama.cpp |
 | gpt4all-llama   | Vicuna, Alpaca, LLaMa | yes                      | no                  | no                                | yes                  | https://github.com/nomic-ai/gpt4all        | https://github.com/go-skynet/gpt4all      |
 | gpt4all-mpt     | MPT                   | yes                      | no                  | no                                | yes                  | https://github.com/nomic-ai/gpt4all        | https://github.com/go-skynet/gpt4all      |
@@ -108,8 +71,8 @@ Depending on the model you are attempting to run might need more RAM or CPU reso
 | dolly           | Dolly                 | yes                      | no                  | no                                | no                   | https://github.com/ggerganov/ggml          | https://github.com/go-skynet/go-gpt2.cpp  |
 | redpajama       | RedPajama             | yes                      | no                  | no                                | no                   | https://github.com/ggerganov/ggml          | https://github.com/go-skynet/go-gpt2.cpp  |
 | stableLM        | StableLM GPT/NeoX     | yes                      | no                  | no                                | no                   | https://github.com/ggerganov/ggml          | https://github.com/go-skynet/go-gpt2.cpp  |
-| replit       | Replit             | yes                      | no                  | no                                | no                   | https://github.com/ggerganov/ggml          | https://github.com/go-skynet/go-gpt2.cpp  |
-| gptneox       | GPT NeoX             | yes                      | no                  | no                                | no                   | https://github.com/ggerganov/ggml          | https://github.com/go-skynet/go-gpt2.cpp  |
+| replit          | Replit                | yes                      | no                  | no                                | no                   | https://github.com/ggerganov/ggml          | https://github.com/go-skynet/go-gpt2.cpp  |
+| gptneox         | GPT NeoX              | yes                      | no                  | no                                | no                   | https://github.com/ggerganov/ggml          | https://github.com/go-skynet/go-gpt2.cpp  |
 | starcoder       | Starcoder             | yes                      | no                  | no                                | no                   | https://github.com/ggerganov/ggml          | https://github.com/go-skynet/go-gpt2.cpp  |
 | bloomz          | Bloom                 | yes                      | no                  | no                                | no                   | https://github.com/NouamaneTazi/bloomz.cpp | https://github.com/go-skynet/bloomz.cpp   |
 | rwkv            | RWKV                  | yes                      | no                  | no                                | yes                  | https://github.com/saharNooby/rwkv.cpp     | https://github.com/donomii/go-rwkv.cpp    |
@@ -335,14 +298,14 @@ Usage:
 local-ai --models-path <model_path> [--address <address>] [--threads <num_threads>]
 ```

-| Parameter    | Environment Variable | Default Value | Description                            |
-| ------------ | -------------------- | ------------- | -------------------------------------- |
-| models-path        | MODELS_PATH           |               | The path where you have models (ending with `.bin`).      |
-| threads      | THREADS              | Number of Physical cores     | The number of threads to use for text generation. |
-| address      | ADDRESS              | :8080         | The address and port to listen on. |
-| context-size | CONTEXT_SIZE         | 512           | Default token context size. |
-| debug | DEBUG         | false           | Enable debug mode. |
-| config-file | CONFIG_FILE         | empty           | Path to a LocalAI config file. |
+| Parameter    | Environment Variable | Default Value            | Description                                          |
+| ------------ | -------------------- | ------------------------ | ---------------------------------------------------- |
+| models-path  | MODELS_PATH          |                          | The path where you have models (ending with `.bin`). |
+| threads      | THREADS              | Number of Physical cores | The number of threads to use for text generation.    |
+| address      | ADDRESS              | :8080                    | The address and port to listen on.                   |
+| context-size | CONTEXT_SIZE         | 512                      | Default token context size.                          |
+| debug        | DEBUG                | false                    | Enable debug mode.                                   |
+| config-file  | CONFIG_FILE          | empty                    | Path to a LocalAI config file.                       |
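+
+For illustration, this is how those flags and their environment-variable equivalents combine in practice; a sketch with placeholder paths and values:
+
+```bash
+# Configure via CLI flags...
+local-ai --models-path ./models --address ":8080" --context-size 512 --threads 4
+
+# ...or via the matching environment variables.
+MODELS_PATH=./models ADDRESS=":8080" CONTEXT_SIZE=512 THREADS=4 DEBUG=false local-ai
+```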

 </details>

@@ -449,7 +412,7 @@ LocalAI can be installed inside Kubernetes with helm.
    ```bash
    helm repo add go-skynet https://go-skynet.github.io/helm-charts/
    ```
-1. Create a values files with your settings:
+2. Create a values file with your settings:
 ```bash
 cat <<EOF > values.yaml
 deployment:
@@ -606,7 +569,7 @@ curl http://localhost:8080/v1/audio/transcriptions -H "Content-Type: multipart/f
 ```
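
 For reference, the full transcription request (truncated in the hunk header above) typically looks like the following; a sketch assuming a local `audio.wav` and a whisper-backed model named `whisper-1` (both placeholders):

 ```bash
 curl http://localhost:8080/v1/audio/transcriptions \
   -H "Content-Type: multipart/form-data" \
   -F file="@audio.wav" \
   -F model="whisper-1"
 ```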

 </details>
-  
+
 ## Frequently asked questions

 Here are answers to some of the most common questions.
@@ -638,8 +601,7 @@ Yes! If the client uses OpenAI and supports setting a different base URL to send

 </details>
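
 As an illustration of that answer, many OpenAI client libraries only need their base URL pointed at LocalAI. A sketch, assuming a client that honors the widely used `OPENAI_API_BASE` environment-variable convention (your client's setting may be named differently):

 ```bash
 export OPENAI_API_BASE=http://localhost:8080/v1
 export OPENAI_API_KEY=sk-placeholder   # LocalAI does not validate the key, but many clients require one to be set
 ```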

-
-### Can this leverage GPUs? 
+### Can this leverage GPUs?

 <details>

@@ -647,7 +609,7 @@ Not currently, as ggml doesn't support GPUs yet: https://github.com/ggerganov/ll

 </details>

-### Where is the webUI? 
+### Where is the webUI?

 <details>
 localai-webui and chatbot-ui are available in the examples section and can be set up as per the instructions. However, as LocalAI is an API, you can already plug it into existing projects that provide UI interfaces to OpenAI's APIs. There are several on GitHub already, and they should be compatible with LocalAI out of the box (as it mimics the OpenAI API).