Convert documents from Google Drive into vector embeddings using OpenAI, LangChain, and PGVector, fully automated through n8n.
This workflow monitors a Google Drive folder for new files, supports multiple file types (PDF, TXT, JSON), and processes them into vector embeddings using OpenAI’s text-embedding-3-small model. These embeddings are stored in a Postgres database using the PGVector extension, making them query-ready for semantic search or RAG-based AI agents.
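To illustrate what "query-ready for semantic search" means, here is a minimal sketch of nearest-neighbor ranking by cosine distance, the quantity pgvector's `<=>` operator computes. It is not part of the workflow: the toy 3-dimensional vectors stand in for real embeddings, and the row layout, the `top_k` helper, and the `my_vectors` table named in the comment are hypothetical.

```python
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query, rows, k=2):
    """Return the texts of the k rows closest to the query vector."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row["embedding"]))
    return [row["text"] for row in ranked[:k]]

# Hypothetical stored rows; real embeddings have far more dimensions.
rows = [
    {"text": "invoice for March", "embedding": [0.9, 0.1, 0.0]},
    {"text": "team offsite agenda", "embedding": [0.0, 0.2, 0.9]},
    {"text": "payment reminder", "embedding": [0.8, 0.3, 0.1]},
]
query = [1.0, 0.2, 0.0]  # stands in for the embedded user question

# Roughly what a query like this would do inside Postgres (table name assumed):
#   SELECT text FROM my_vectors ORDER BY embedding <=> $1 LIMIT 2;
print(top_k(query, rows))  # ['invoice for March', 'payment reminder']
```

In practice the query vector would come from the same embedding model used at ingestion time, so that stored and query vectors share one vector space.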
After successful processing, files are moved to a separate “vectorized” folder to avoid duplication.
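The per-file flow described above can be sketched in plain Python. This is only an illustration of the routing idea, not the n8n implementation: the extractor labels and helper names are hypothetical, the `name` and `mimeType` keys are modeled on Google Drive file metadata, and skipping unsupported files is an assumption rather than confirmed workflow behavior.

```python
# MIME types named in this workflow's description, mapped to hypothetical labels.
EXTRACTORS = {
    "application/pdf": "pdf",
    "text/plain": "text",
    "application/json": "json",
}

def route(mime_type):
    """Pick an extraction method by MIME type, or None for unsupported types."""
    return EXTRACTORS.get(mime_type)

def process_folder(files):
    """Split files into those to vectorize (and then move) and those left in place."""
    vectorized, skipped = [], []
    for f in files:
        method = route(f["mimeType"])
        if method is None:
            # Assumption: unsupported files stay in the source folder.
            skipped.append(f["name"])
        else:
            # The real workflow would extract, embed, store, then move the file.
            vectorized.append((f["name"], method))
    return vectorized, skipped

files = [
    {"name": "report.pdf", "mimeType": "application/pdf"},
    {"name": "notes.txt", "mimeType": "text/plain"},
    {"name": "photo.png", "mimeType": "image/png"},
]
vectorized, skipped = process_folder(files)
print(vectorized)  # [('report.pdf', 'pdf'), ('notes.txt', 'text')]
print(skipped)     # ['photo.png']
```

Moving a file only after it has been stored is what keeps the next scheduled run from embedding the same document twice.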
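Credentials needed:
- Google Drive (Search Folder, Download File, and Move File nodes)
- OpenAI (Embeddings OpenAI node)
- Postgres with the PGVector extension (Postgres PGVector Store node)

Setup:
- Source folder: set it in the Search Folder node; this is where incoming files are placed.
- Destination folder: set it in the Move File node; files will be moved here after vectorization.
- Database: configure the connection in the Postgres PGVector Store node.
- Embedding model: open the Embeddings OpenAI node and select text-embedding-3-small.
- Schedule: use the Schedule Trigger node to run daily or configure your own schedule.
- Manual runs: use the When clicking ‘Test workflow’ trigger for on-demand ingestion.

Want to support more file types or enhance the pipeline?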
- Extend Extract from File with other formats like DOCX, Markdown, or HTML.
- The Switch node routes files to the correct extraction method based on MIME type (application/pdf, text/plain, application/json).
- Adjust the Search Folder or Switch node logic to skip specific files or folders.

This workflow is available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. You are free to use, adapt, and share this workflow for non-commercial purposes under the terms of this license.
Full license details: https://creativecommons.org/licenses/by-nc-sa/4.0/