Getting Started with LangChain: A Beginner's Guide to Building LLM-Powered Applications

Two stochastic parrots sitting on a chain of large language models: LangChain

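Since the release of ChatGPT, large language models (LLMs) have become very popular. Although you probably don't have the money and compute to train an LLM from scratch at home, you can still use pre-trained LLMs to build something cool, such as:

Personal assistants that can interact with the outside world based on your data
Chatbots customized for your purpose
Analysis or summarization of your documents or code

With their unusual APIs and prompt engineering, LLMs are changing the way we build AI-powered products. That is why new developer tools are appearing everywhere under the term "LLMOps".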

One of these new tools is LangChain.

What is LangChain?

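LangChain is a framework that makes it easier to build LLM-powered applications by providing:

A generic interface to a variety of foundation models (see Models)
A framework to help you manage your prompts (see Prompts)
A central interface to long-term memory (see Memory), external data (see Indexes), other LLMs (see Chains), and other agents for tasks an LLM cannot handle on its own, such as calculations or search (see Agents)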

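It is an open-source project (GitHub repository) created by Harrison Chase. At the time of writing, it has 41.7k stars.

LangChain supports multiple model interfaces, such as OpenAI, HuggingFace, and AzureOpenAI, and provides a Fake LLM for testing. It also supports caching (in-memory, SQLite, Redis, and SQL), usage tracking, streaming (a typewriter-like effect), and prompt management. Because LangChain has so many features, it can be hard to grasp what it does at first. That is why this article covers LangChain's must-know concepts and its (currently) six key modules, to give you a better understanding of its capabilities.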

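Must-know concepts

Let's first look at some of LangChain's key concepts:

Loader

A loader loads data from a specified source. LangChain provides a wide range of loaders, for example: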

GoogleDriveLoader (Google Drive)
UnstructuredHTMLLoader (arbitrary web pages)
DirectoryLoader (folders)
AzureBlobStorageContainerLoader (Azure storage)
CSVLoader (CSV files)
EverNoteLoader (Evernote)
PyPDFLoader (PDF files)
YoutubeLoader (YouTube videos)

Document

After a loader has read a data source, the data is converted into Document objects so that later steps can work with it.

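Text Splitters

A text splitter breaks text into smaller pieces. Both sending text to the OpenAI API and using its embedding feature are subject to limits on input length (counted in tokens), so the Documents read by a loader need to be split into chunks first.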
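To make the idea concrete, here is a minimal sketch of what a text splitter does, written in plain Python rather than with LangChain's own splitter classes. The function name split_text and the parameters chunk_size and chunk_overlap are illustrative assumptions. The sketch cuts on raw character positions, whereas real splitters usually try to break on separators such as paragraphs or sentences.

```python
from typing import List


def split_text(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> List[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

The overlap means neighbouring chunks share a little context, which helps when a sentence is cut in the middle.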

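Vectorstores

Once Documents have been converted into vectors, you can run relevance searches using vector operations. In LangChain, storing the data in a vector database together with an embedding model performs this conversion for you. LangChain offers a variety of vector databases to choose from.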

Chain

A Chain runs multiple tasks in sequence.

Agent

Put simply, an Agent dynamically selects and calls Chains or existing tools on our behalf.

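Embedding

Embeddings are used to measure how related pieces of text are, and they are a key technique for building a knowledge base on top of the OpenAI API. Compared with fine-tuning, embeddings require no training, allow new content to be added in real time, and cost less. For a comparison of embeddings and fine-tuning, watch this video: https://www.youtube.com/watch?v=9qq6HTr7Ocw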
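Relatedness between two embeddings is commonly measured with cosine similarity. The sketch below shows the calculation in plain Python. The three-dimensional vectors are made-up toy values, not real model output (real embeddings have hundreds or thousands of dimensions), and the function name is an assumption for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equally long vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values chosen by hand)
parrot = [0.9, 0.1, 0.0]
bird = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(parrot, bird) > cosine_similarity(parrot, invoice))  # True
```

A value close to 1 means the vectors point in nearly the same direction, which is read as "the texts are related".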

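Prerequisites

To follow along with this tutorial, you need to install the langchain Python package and have all relevant API keys ready.

Before installing the langchain package, make sure your Python version is ≥ 3.8.1 and < 4.0.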
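If you want to verify the interpreter version from inside a script, a small check like the following can help. This is only a sketch: the function name python_version_supported is an assumption, and the bounds simply mirror the range stated above.

```python
import sys
from typing import Optional, Tuple


def python_version_supported(version: Optional[Tuple[int, int, int]] = None) -> bool:
    """Return True if the version lies in the range >= 3.8.1 and < 4.0."""
    major_minor_micro = tuple((version or sys.version_info)[:3])
    return (3, 8, 1) <= major_minor_micro < (4, 0, 0)


if not python_version_supported():
    print("This Python version is outside the range stated in the tutorial.")
```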

To install the langchain Python package, you can pip install it.

pip install langchain

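In this tutorial, we use version 0.0.147. The GitHub repository is very active, so make sure you have a current version.

Once you are all set up, import the langchain Python package.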

import langchain

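Building an application with an LLM requires API keys for the services you want to use, and some APIs have associated costs.

LLM provider (required): You first need an API key for the LLM provider you want to use. We are currently experiencing the "Linux moment of AI", in which developers must choose between proprietary and open-source foundation models, mainly based on a trade-off between performance and cost.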

LLM Providers: Proprietary and open-source foundation models

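Proprietary models are closed-source foundation models owned by companies with large expert teams and big AI budgets. They are usually larger than open-source models and therefore perform better, but their APIs are also expensive. Examples of proprietary model providers are OpenAI, co:here, AI21 Labs, and Anthropic.

Most available LangChain tutorials use OpenAI, but note that the OpenAI API is not free (although it is inexpensive for experimentation). To obtain an OpenAI API key, you need an OpenAI account and then "Create new secret key" under API keys.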

import os
os.environ["OPENAI_API_KEY"] = ... # insert your API_TOKEN here

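Open-source models are usually smaller and less capable than proprietary models, but they are more cost-effective. Examples of open-source models are: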

BLOOM by BigScience
LLaMA by Meta AI
Flan-T5 by Google
GPT-J by Eleuther AI

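Many open-source models are organized and hosted on Hugging Face, which serves as a community hub. To obtain a Hugging Face API key, you need a Hugging Face account and then create a "New token" under Access Tokens.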

import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = ... # insert your API_TOKEN here
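Typing a key directly into source code makes it easy to leak by accident. One possible alternative, sketched below, is to export the key in your shell beforehand and only read it in Python, failing early with a clear message if it is missing. The helper name require_api_key and the variable DEMO_API_KEY are hypothetical and exist only for this illustration.

```python
import os


def require_api_key(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demonstration with a placeholder variable (not a real key)
os.environ["DEMO_API_KEY"] = "not-a-real-key"
print(require_api_key("DEMO_API_KEY"))  # not-a-real-key
```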

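You can use open-source LLMs on Hugging Face for free, but you will be limited to smaller, less capable LLMs.

Vector database (optional): If you want to use a specific vector database, such as Pinecone, Weaviate, or Milvus, you need to register with them to obtain an API key and check their pricing. In this tutorial, we use Faiss, which does not require registration.

Tools (optional): Depending on the tools you want the LLM to interact with, such as OpenWeatherMap or SerpAPI, you may need to register with them to obtain an API key and check their pricing. In this tutorial, we only use tools that do not require an API key.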

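What can you do with LangChain?

The langchain package provides a generic interface to many foundation models, enables prompt management, and acts as a central interface to other components such as prompt templates, other LLMs, external data, and other tools via agents.

At the time of writing, LangChain (version 0.0.147) covers six modules: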

Models: Choosing from different LLMs and embedding models
Prompts: Managing LLM inputs
Chains: Combining LLMs with other components
Indexes: Accessing external data
Memory: Remembering previous conversations
Agents: Accessing other tools

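The code examples in the following sections are copied and modified from the LangChain documentation.

Models: Choosing from different LLMs and embedding models

Currently, many different LLMs are emerging. LangChain offers integrations with a wide range of models and a streamlined interface to all of them.

LangChain distinguishes three types of models, which differ in their inputs and outputs:

LLMs take a string as input (the prompt) and output a string (the completion).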

# Proprietary LLM from e.g. OpenAI
# pip install openai
from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003")

# Alternatively, open-source LLM hosted on Hugging Face
# pip install huggingface_hub
from langchain import HuggingFaceHub
llm = HuggingFaceHub(repo_id = "google/flan-t5-xl")

# The LLM takes a prompt as an input and outputs a completion
prompt = "Alice has a parrot. What animal is Alice's pet?"
completion = llm(prompt)

LLMs take a prompt as input and return a completion as output

Chat models are similar to LLMs. They take a list of chat messages as input and return a chat message.
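The difference in input and output shape can be sketched without calling any real model. In the snippet below, Message and fake_chat_model are hypothetical stand-ins invented for this illustration. LangChain ships its own message and chat model classes, which this sketch does not use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str      # e.g. "system", "human" or "ai"
    content: str


def fake_chat_model(messages: List[Message]) -> Message:
    """Stand-in for a chat model: takes a list of messages, returns one message."""
    last_human = next((m for m in reversed(messages) if m.role == "human"), None)
    if last_human is None:
        raise ValueError("expected at least one human message")
    return Message(role="ai", content=f"You said: {last_human.content}")


history = [
    Message(role="system", content="You are a helpful assistant."),
    Message(role="human", content="Alice has a parrot."),
]
reply = fake_chat_model(history)
print(reply.content)  # You said: Alice has a parrot.
```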

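Text embedding models take text as input and return a list of floats (an embedding), which is a numerical representation of the input text. Embeddings help extract information from a text. That information can then be used later on, for example, to compute similarities between texts (e.g., movie summaries).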

# Proprietary text embedding model from e.g. OpenAI
# pip install tiktoken
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

# Alternatively, open-source text embedding model hosted on Hugging Face
# pip install sentence_transformers
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name = "sentence-transformers/all-MiniLM-L6-v2")

# The embeddings model takes a text as an input and outputs a list of floats
text = "Alice has a parrot. What animal is Alice's pet?"
text_embedding = embeddings.embed_query(text)

Text embedding models take a text as an input and output its numerical representation as a list of floats

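Prompts: Inputs to LLMs

LLM APIs are fascinating but quirky. Although writing a prompt for an LLM in natural language should feel intuitive, it takes quite a bit of tweaking before you get the desired output from an LLM. This process is called prompt engineering.

Once you have a good prompt, you might want to reuse it as a template for other purposes. For this, LangChain provides PromptTemplates, which help you construct prompts from multiple components.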

from langchain import PromptTemplate

template = "What is a good name for a company that makes {product}?"

prompt = PromptTemplate(
    input_variables=["product"],
    template=template,
)

prompt.format(product="colorful socks")
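Conceptually, a prompt template is a string with named placeholders plus a check that the declared variables match. The class below is a simplified sketch of that idea in plain Python. SimplePromptTemplate is a hypothetical name and is not how LangChain implements PromptTemplate internally.

```python
from string import Formatter
from typing import List


class SimplePromptTemplate:
    """A tiny illustration of a prompt template with validated variables."""

    def __init__(self, template: str, input_variables: List[str]):
        # Collect the placeholder names that actually occur in the template
        found = {name for _, name, _, _ in Formatter().parse(template) if name}
        if found != set(input_variables):
            raise ValueError(
                f"template uses {sorted(found)}, declared {sorted(input_variables)}"
            )
        self.template = template
        self.input_variables = input_variables

    def format(self, **kwargs: str) -> str:
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"missing variables: {sorted(missing)}")
        return self.template.format(**kwargs)


simple_prompt = SimplePromptTemplate(
    template="What is a good name for a company that makes {product}?",
    input_variables=["product"],
)
print(simple_prompt.format(product="colorful socks"))
# What is a good name for a company that makes colorful socks?
```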

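The prompt above can be viewed as a zero-shot problem setting, in which you hope the LLM was trained on enough relevant data to give a satisfactory answer.

Another trick to improve the LLM's output is to add a few examples to the prompt, turning it into a few-shot problem setting.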

from langchain import PromptTemplate, FewShotPromptTemplate

examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

example_template = """
Word: {word}
Antonym: {antonym}\n
"""

example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template=example_template,
)

few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n",
)

few_shot_prompt.format(input="big")

The code above generates a prompt template and composes the following prompt from the provided examples and the input:

Give the antonym of every input

Word: happy
Antonym: sad



Word: tall
Antonym: short


Word: big
Antonym:

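Chains: Combining LLMs with other components

Chaining in LangChain simply describes the process of combining LLMs with other components to create an application. Some examples are:

Combining LLMs with prompt templates
Combining multiple LLMs sequentially by taking the first LLM's output as the input for the second LLM
Combining LLMs with external data, e.g., for question answering
Combining LLMs with long-term memory, e.g., for chat history

In the previous section, we created a prompt template. When we want to use it with our LLM, we can use an LLMChain as follows: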

from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain only specifying the input variable.
chain.run("colorful socks")

If we want to use the output of this first LLM as the input for a second LLM, we can use a SimpleSequentialChain:

from langchain.chains import LLMChain, SimpleSequentialChain

# Define the first chain as in the previous code example
# ...

# Create a second chain with a prompt template and an LLM
second_prompt = PromptTemplate(
    input_variables=["company_name"],
    template="Write a catchphrase for the following company: {company_name}",
)

chain_two = LLMChain(llm=llm, prompt=second_prompt)

# Combine the first and the second chain 
overall_chain = SimpleSequentialChain(chains=[chain, chain_two], verbose=True)

# Run the chain specifying only the input variable for the first chain.
catchphrase = overall_chain.run("colorful socks")
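Stripped of the LLM calls, a simple sequential chain is function composition: each step receives the previous step's output. The sketch below illustrates this with two hypothetical stand-in functions (name_company and write_catchphrase) in place of real LLM calls, so it runs without an API key.

```python
from typing import Callable, List

Step = Callable[[str], str]


def run_sequential(steps: List[Step], first_input: str) -> str:
    """Feed the output of each step into the next one and return the final output."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Stand-ins for LLM calls (deterministic, for illustration only)
def name_company(product: str) -> str:
    return f"{product.title()} Co."


def write_catchphrase(company_name: str) -> str:
    return f"{company_name}: made with care."


print(run_sequential([name_company, write_catchphrase], "colorful socks"))
# Colorful Socks Co.: made with care.
```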

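Indexes: Accessing external data

One limitation of LLMs is their lack of contextual information (e.g., access to specific documents or emails). You can address this by giving the LLM access to specific external data.

To do so, you first need to load the external data with a document loader. LangChain provides a variety of loaders for different types of documents, from PDFs and emails to websites and YouTube videos.

Let's load some external data from a YouTube video. If you want to load a large text document and split it with a text splitter, you can refer to the official documentation.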

# pip install youtube-transcript-api
# pip install pytube

from langchain.document_loaders import YoutubeLoader

loader = YoutubeLoader.from_youtube_url("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
    
documents = loader.load()

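Now that you have your external data ready as documents, you can index it in a vector database (a VectorStore) using a text embedding model (see Models). Popular vector databases include Pinecone, Weaviate, and Milvus. In this article, we use Faiss because it does not require an API key.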

# pip install faiss-cpu
from langchain.vectorstores import FAISS

# create the vector store to use as the index
db = FAISS.from_documents(documents, embeddings)

Your document (in this case, a video) is now stored as embeddings in a vector store.
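What a vector store does at query time can be pictured as a nearest-neighbour search: embed the query, then return the stored texts whose vectors are closest. The sketch below does this by brute force with Euclidean distance in plain Python. The function name top_k, the two-dimensional vectors, and the example texts are all made up for illustration, and libraries such as Faiss use far more efficient index structures.

```python
import math
from typing import List, Sequence, Tuple


def top_k(query: Sequence[float],
          index: List[Tuple[str, Sequence[float]]],
          k: int = 2) -> List[str]:
    """Return the k stored texts whose vectors are closest to the query vector."""
    ranked = sorted(index, key=lambda item: math.dist(query, item[1]))
    return [text for text, _ in ranked[:k]]


# Toy index of (text, vector) pairs with hand-picked values
toy_index = [
    ("parrot care basics", [0.9, 0.1]),
    ("feeding your parrot", [0.8, 0.3]),
    ("quarterly tax report", [0.1, 0.9]),
]

print(top_k([1.0, 0.0], toy_index, k=2))
# ['parrot care basics', 'feeding your parrot']
```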

Now you can do a variety of things with this external data. Let's use it for a question-answering task with an information retriever:

from langchain.chains import RetrievalQA

retriever = db.as_retriever()

qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=retriever, 
    return_source_documents=True)

query = "What am I never going to do?"
result = qa({"query": query})

print(result['result'])
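The "stuff" chain type places all retrieved documents into a single prompt together with the question. The sketch below shows that idea in plain Python. The function name build_stuff_prompt and the exact prompt wording are assumptions for illustration, and LangChain's own prompt text differs.

```python
from typing import List


def build_stuff_prompt(question: str, documents: List[str]) -> str:
    """Concatenate all retrieved documents and the question into one prompt."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


stuffed = build_stuff_prompt(
    "What animal is Alice's pet?",
    ["Alice has a parrot.", "Bob has two cats."],
)
print(stuffed)
```

Because everything goes into one prompt, this approach only works while the combined text fits within the model's input limit.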

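Memory: Remembering previous conversations

For applications like chatbots, it is essential that they can remember previous conversations. By default, however, LLMs do not have any long-term memory unless you pass in the chat history.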

Chat with and without conversational memory of LLM

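LangChain solves this problem by providing several different options for handling chat history:

Keep all conversations
Keep the latest k conversations
Summarize the conversation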
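The second option, keeping only the latest k exchanges, can be sketched with a bounded queue. WindowMemory below is a hypothetical class written for this illustration and is not LangChain's memory implementation. The summarizing option would need an LLM call and is not shown.

```python
from collections import deque
from typing import Deque, Tuple


class WindowMemory:
    """Keep only the most recent k (human, ai) exchanges."""

    def __init__(self, k: int):
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def add_turn(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))  # the oldest turn is dropped automatically

    def as_prompt_history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = WindowMemory(k=2)
memory.add_turn("Alice has a parrot.", "Noted.")
memory.add_turn("Bob has two cats.", "Noted.")
memory.add_turn("Carol has a dog.", "Noted.")
print(memory.as_prompt_history())  # the turn about Alice is no longer included
```

The returned history string would be prepended to the next prompt, which is how the model "remembers" earlier turns.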

In the following example, we use a ConversationChain to give the application conversational memory.

from langchain import ConversationChain

conversation = ConversationChain(llm=llm, verbose=True)

conversation.predict(input="Alice has a parrot.")

conversation.predict(input="Bob has two cats.")

conversation.predict(input="How many pets do Alice and Bob have?")

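This results in the conversation on the right in the figure above. Without a ConversationChain to keep conversational memory, the conversation would look like the one on the left.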

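Agents: Accessing other tools

Despite being quite powerful, LLMs have some limitations: they lack contextual information (e.g., access to specific knowledge not contained in the training data), they can become outdated quickly (e.g., GPT-4 was trained on data before September 2021), and they are bad at math.

Because LLMs can hallucinate on tasks they cannot accomplish on their own, we need to give them access to supplementary tools such as search (e.g., Google Search), calculators (e.g., Python REPL or Wolfram Alpha), and lookups (e.g., Wikipedia).

In addition, we need agents that decide which tools to use to accomplish a task based on the LLM's output.

Note that some LLMs, such as google/flan-t5-xl, are not suitable for the following example because they do not follow the conversational-react-description template. For me, this was the point at which I set up a paid account with OpenAI and switched to the OpenAI API.

Below is an example in which the agent first uses Wikipedia to look up Barack Obama's date of birth and then uses a calculator to compute his age in 2022.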

# pip install wikipedia
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType

tools = load_tools(["wikipedia", "llm-math"], llm=llm)
agent = initialize_agent(tools, 
                         llm, 
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, 
                         verbose=True)


agent.run("When was Barack Obama born? How old was he in 2022?")
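Under the hood, an agent of this kind reads the LLM's text output, extracts which tool to call and with what input, runs the tool, and feeds the result back to the LLM. The sketch below illustrates only the parse-and-dispatch step. It assumes the LLM writes lines of the form "Action: ..." and "Action Input: ...", and the tool name Calculator, the function names, and the sample output are all hypothetical.

```python
import re
from typing import Callable, Dict

# Assumed output format: an "Action:" line followed by an "Action Input:" line
ACTION_PATTERN = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)")


def dispatch(llm_output: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Parse the requested tool and its input from the LLM output, then call the tool."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        raise ValueError("no action found in LLM output")
    tool_name, tool_input = match.group(1), match.group(2).strip()
    if tool_name not in tools:
        raise KeyError(f"unknown tool: {tool_name}")
    return tools[tool_name](tool_input)


def calculator(expression: str) -> str:
    """Toy calculator that only understands 'a - b' with integers."""
    left, right = expression.split("-")
    return str(int(left) - int(right))


sample_output = "Thought: I need to subtract.\nAction: Calculator\nAction Input: 2022 - 1961"
print(dispatch(sample_output, {"Calculator": calculator}))  # 61
```

A full agent repeats this step in a loop, appending each tool result to the prompt until the LLM produces a final answer.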

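Summary

Just a few months ago, all of us (or at least most of us) were impressed by ChatGPT's capabilities. Now, new developer tools like LangChain let us build similarly impressive prototypes on our laptops within a few hours. These are truly exciting times!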