Deprecated: import from "@langchain/community/document_loaders/web/github" instead. This entrypoint will be removed in 0.3.0.

A document loader for loading files from a GitHub repository. The class extends BaseDocumentLoader and implements the GithubRepoLoaderParams interface.

Properties

accessToken?: string
apiUrl: string

The API endpoint URL of the GitHub instance. Set this when you are not targeting github.com, e.g. a GitHub Enterprise instance.

baseUrl: string

The base URL of the GitHub instance. Set this when you are not targeting github.com, e.g. a GitHub Enterprise instance.

branch: string
caller: AsyncCaller
ignore?: Ignore
ignoreFiles: (string | RegExp)[]
ignorePaths?: string[]
maxConcurrency?: number

The maximum number of concurrent calls that can be made. Defaults to 2.

maxRetries?: number

The maximum number of retries that can be made for a single call, with an exponential backoff between each attempt. Defaults to 2.

processSubmodules: boolean

Set to true to process submodules recursively. This only takes effect when recursive is true.

recursive: boolean
verbose?: boolean
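
As a rough illustration of how the properties above fit together, the sketch below models them as a plain options object. It is a self-contained model, not the library's implementation: `LoaderOptions`, `withDefaults`, and `effectiveProcessSubmodules` are hypothetical names, only the two defaults stated above (2 for maxConcurrency and 2 for maxRetries) are filled in, and the Enterprise host names are placeholders.

```typescript
// Hypothetical stand-in for a subset of the properties listed above.
interface LoaderOptions {
  accessToken?: string;
  apiUrl?: string;
  baseUrl?: string;
  branch?: string;
  recursive?: boolean;
  processSubmodules?: boolean;
  ignoreFiles?: (string | RegExp)[];
  maxConcurrency?: number;
  maxRetries?: number;
}

// Fills in only the two defaults documented above (both default to 2).
function withDefaults(
  options: LoaderOptions,
): LoaderOptions & { maxConcurrency: number; maxRetries: number } {
  return {
    ...options,
    maxConcurrency: options.maxConcurrency ?? 2,
    maxRetries: options.maxRetries ?? 2,
  };
}

// processSubmodules is documented as effective only when recursive is true.
function effectiveProcessSubmodules(options: LoaderOptions): boolean {
  return options.recursive === true && options.processSubmodules === true;
}

// Example: options for a GitHub Enterprise instance (placeholder URLs).
const enterprise = withDefaults({
  baseUrl: "https://github.example.com",
  apiUrl: "https://github.example.com/api/v3",
  branch: "main",
  recursive: true,
  processSubmodules: true,
  maxConcurrency: 5,
});
```

Here `enterprise.maxConcurrency` keeps the explicit value 5, while `enterprise.maxRetries` falls back to the documented default of 2.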

Methods

  • Fetches the files from the GitHub repository and creates a Document instance for each file. Errors are handled according to the unknown handling option.

    Returns Promise<Document<Record<string, any>>[]>

    A promise that resolves to an array of Document instances.

  • Parameters

    • Optional splitter: BaseDocumentTransformer<DocumentInterface<Record<string, any>>[], DocumentInterface<Record<string, any>>[]>

    Returns Promise<Document<Record<string, any>>[]>

    A Promise that resolves with an array of Document instances, each split according to the provided TextSplitter.

    Loads the documents and splits them using a specified text splitter. Deprecated: use this.load() and splitter.splitDocuments() individually.

  • Asynchronously streams documents from the entire GitHub repository. Suitable when large repositories must be processed in a memory-efficient way.

    Returns AsyncGenerator<Document<Record<string, any>>, void, undefined>

    Yields a Document object for each file or submodule content found in the repository.

  • Determines whether a file or directory should be ignored based on its path and type.

    Parameters

    • path: string

      The path of the file or directory.

    • fileType: string

      The type of the file or directory.

    Returns boolean

    A boolean indicating whether the file or directory should be ignored.
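
The ignore check described above can be pictured with a small, self-contained sketch. This is not the library's method: `matchesIgnoreFiles` is a hypothetical helper, and it assumes that a string pattern in ignoreFiles matches only an identical path, that a RegExp pattern is tested against the path, and that directories are never excluded by ignoreFiles so their contents can still be visited. The ignore and ignorePaths properties are not modelled.

```typescript
// Hypothetical helper illustrating one way to apply an ignoreFiles list.
// Assumptions (not confirmed by this page): strings match by exact path
// equality, RegExp patterns use test(), and directories are never ignored.
function matchesIgnoreFiles(
  path: string,
  fileType: string,
  ignoreFiles: (string | RegExp)[],
): boolean {
  if (fileType === "dir") return false;
  return ignoreFiles.some((pattern) =>
    typeof pattern === "string" ? pattern === path : pattern.test(path),
  );
}

// Example: filter a small, made-up listing of repository entries.
const entries = [
  { path: "README.md", type: "file" },
  { path: "src/index.ts", type: "file" },
  { path: "src/index.test.ts", type: "file" },
  { path: "dist", type: "dir" },
];
const ignoreFiles: (string | RegExp)[] = ["README.md", /\.test\.ts$/];

const kept = entries
  .filter((entry) => !matchesIgnoreFiles(entry.path, entry.type, ignoreFiles))
  .map((entry) => entry.path);
// kept is ["src/index.ts", "dist"]
```

Under these assumptions a RegExp is the way to exclude a whole family of files, since a string pattern only excludes the single path it names.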