Class WebPDFLoader

Deprecated

Import from "@langchain/community/document_loaders/web/pdf" instead. This entrypoint will be removed in 0.3.0.

A document loader for loading data from PDFs.

Example

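import { WebPDFLoader } from "@langchain/community/document_loaders/web/pdf";

// new Blob() is a placeholder; supply a Blob containing real PDF bytes.
const loader = new WebPDFLoader(new Blob());
const docs = await loader.load();
console.log({ docs });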
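
The example passes an empty Blob as a placeholder. As a minimal sketch of producing a Blob that carries actual bytes, the helper below wraps an ArrayBuffer (for instance, the result of `await response.arrayBuffer()`) and tags it with the PDF MIME type. The name `pdfBlobFromBuffer` is hypothetical and not part of the library; the sketch relies only on the standard `Blob` constructor, assumed to be available globally (browsers, Node.js 18+).

```typescript
// Hypothetical helper (not part of LangChain): wrap raw bytes in a Blob
// so they can be handed to a constructor that expects a Blob.
function pdfBlobFromBuffer(buffer: ArrayBuffer): Blob {
  return new Blob([buffer], { type: "application/pdf" });
}

// Stand-in bytes for illustration only; a real PDF would come from
// a file input, fetch(), or a similar source.
const placeholderBytes = new ArrayBuffer(8);
const pdfBlob = pdfBlobFromBuffer(placeholderBytes);

console.log(pdfBlob.size, pdfBlob.type);
```

A Blob built this way would then be passed as the first constructor argument, for example `new WebPDFLoader(pdfBlob)`.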

Hierarchy

BaseDocumentLoader
- WebPDFLoader

Index

Constructors

constructor

Methods

load
loadAndSplit

Constructors

constructor

new WebPDFLoader(blob, __namedParameters?): WebPDFLoader
Parameters
- blob: Blob
- __namedParameters (optional options object, defaults to {})
  - parsedItemSeparator: undefined | string
  - pdfjs: undefined | (() => Promise<{
        getDocument: ((src:
            | string
            | URL
            | ArrayBuffer
            | TypedArray
            | DocumentInitParameters) => PDFDocumentLoadingTask);
        version: string;
    }>)
  - splitPages: undefined | boolean
Returns WebPDFLoader
Overrides BaseDocumentLoader.constructor
- Defined in langchain/src/document_loaders/web/pdf.ts:31
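
To make the `splitPages` and `parsedItemSeparator` options more concrete, here is a self-contained model of how they might shape the output. It is not the loader's implementation: the function `assemblePages`, the empty-string default for `parsedItemSeparator`, and the blank-line joiner used when `splitPages` is false are all assumptions made for illustration. Only the `splitPages` default of `true` is taken from this page.

```typescript
// Illustrative model only; names and joiners are assumptions, not library code.
interface AssembleOptions {
  splitPages?: boolean; // true: one output string per page
  parsedItemSeparator?: string; // inserted between the text items of a page
}

function assemblePages(
  pages: string[][],
  { splitPages = true, parsedItemSeparator = "" }: AssembleOptions = {}
): string[] {
  // Join the text items of each page with the separator.
  const perPage = pages.map((items) => items.join(parsedItemSeparator));
  if (splitPages) {
    return perPage;
  }
  // Assumed behaviour: merge every page into a single entry.
  return perPage.length === 0 ? [] : [perPage.join("\n\n")];
}

const pages = [["Hello", "world"], ["Second", "page"]];
const perPageDefault = assemblePages(pages);
const perPageSpaced = assemblePages(pages, { parsedItemSeparator: " " });
const merged = assemblePages(pages, { splitPages: false, parsedItemSeparator: " " });
console.log(perPageDefault, perPageSpaced, merged);
```

Under these assumptions, a separator of `" "` keeps adjacent text items from running together, and `splitPages: false` yields a single entry for the whole file.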

Properties

`Protected` blob

blob: Blob

`Protected` parsedItemSeparator

parsedItemSeparator: string

`Protected` splitPages

splitPages: boolean = true

Methods

load

load(): Promise<Document<Record<string, any>>[]>
Loads the contents of the PDF as documents.

Returns Promise<Document<Record<string, any>>[]>
An array of Documents representing the retrieved data.
Overrides BaseDocumentLoader.load
- Defined in langchain/src/document_loaders/web/pdf.ts:50

loadAndSplit

loadAndSplit(splitter?): Promise<Document<Record<string, any>>[]>
Loads the documents and splits them using a specified text splitter.

Parameters
- Optional splitter: BaseDocumentTransformer<DocumentInterface<Record<string, any>>[], DocumentInterface<Record<string, any>>[]>
Returns Promise<Document<Record<string, any>>[]>
A Promise that resolves with an array of Document instances, each split according to the provided TextSplitter.

Deprecated
Use this.load() and splitter.splitDocuments() individually.
Inherited from BaseDocumentLoader.loadAndSplit
- Defined in langchain-core/dist/document_loaders/base.d.ts:27
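
The deprecation note suggests calling the two steps separately. A minimal sketch of that pattern follows, using small stand-in types so it runs without any package installed: `Doc`, `LoaderLike`, `SplitterLike`, `loadThenSplit`, and both stub objects are hypothetical. With the real classes, the loader would be a WebPDFLoader instance and the splitter any object exposing `splitDocuments()`.

```typescript
// Stand-in shapes for illustration; the real types come from LangChain.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}
interface LoaderLike {
  load(): Promise<Doc[]>;
}
interface SplitterLike {
  splitDocuments(docs: Doc[]): Promise<Doc[]>;
}

// Replacement pattern for loadAndSplit(): load first, then split.
async function loadThenSplit(loader: LoaderLike, splitter: SplitterLike): Promise<Doc[]> {
  const docs = await loader.load();
  return splitter.splitDocuments(docs);
}

// Stub loader returning one fixed document.
const stubLoader: LoaderLike = {
  load: async () => [{ pageContent: "first part. second part", metadata: {} }],
};

// Stub splitter that cuts on ". " (a real text splitter is more careful).
const stubSplitter: SplitterLike = {
  splitDocuments: async (docs) =>
    docs.flatMap((doc) =>
      doc.pageContent.split(". ").map((chunk) => ({ pageContent: chunk, metadata: doc.metadata }))
    ),
};

const chunksPromise = loadThenSplit(stubLoader, stubSplitter);
chunksPromise.then((chunks) => console.log(chunks.map((c) => c.pageContent)));
```

Keeping the two calls separate also leaves room to inspect or filter the loaded documents before they are split.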
