
Software Development Engineer

Spectraforce
Lake, IL
April 27th, 2026
Role: Software Development Engineer
Location: Lake County, IL (Hybrid, 3 days/week onsite)
Duration: 5+ months (possibility of extension)

Job Description:
We are looking for a Software Development Engineer to build and scale an AI-powered document parsing platform that extracts structured data from complex PDFs (pharmaceutical batch records, certificates, regulatory documents) using OCR, LLMs, and RAG. You will work across the full stack: backend AI pipelines, frontend chat interface, and cloud infrastructure.

Roles & Responsibilities:
• Design and develop production-grade RAG (Retrieval-Augmented Generation) pipelines for domain-specific document querying with hybrid search, reranking, and multi-agent answer synthesis
• Build and optimize document processing pipelines using AWS Textract for OCR extraction from tables, handwritten content, and structured forms
• Integrate and orchestrate multiple LLMs (Claude, Gemini) for intent classification, data extraction, validation, and conversational AI
• Develop and maintain the FastAPI backend: REST APIs, streaming endpoints (SSE), authentication, and background task processing
• Build responsive frontend features using Next.js, React, and TypeScript: chat interface, PDF viewer with highlights, real-time progress tracking
• Manage cloud infrastructure on AWS: EC2 deployment, S3 storage, RDS (PostgreSQL), and IAM configuration
• Work with vector databases (Weaviate) and graph databases (Neo4j) for semantic search and structural document querying
• Implement chunking strategies, embedding generation, cross-encoder reranking, and semantic caching for accurate document retrieval
• Deploy and monitor AI models and services in production, including model fallback chains, retry mechanisms, and error handling
• Write clean, maintainable code with proper logging, error handling, and documentation

Required Skills:
• Python (FastAPI, async programming, pandas)
• TypeScript / React (Next.js)
• RAG systems: vector search, embeddings, chunking, reranking (production-grade)
• LLM integration: prompt engineering, structured output, multi-model orchestration
• AWS: EC2, S3, Textract, RDS
• PostgreSQL
• REST API design with streaming (SSE)
• Git, basic CI/CD, Linux server management

Good to Have:
• Weaviate, Neo4j, or similar vector/graph databases
• Gemini Vision or GPT-4V for document image analysis
• LangChain / LangGraph
• Docker, nginx
• Pharmaceutical/regulated document experience

Experience:
• 3–6 years