Sr Python Developer || Plano, TX (Hybrid)
Interview: Phone + Video

Job Description

Technical Stack: Python 3.13+, Crawl4AI (Playwright-based async deep crawling), OpenAI GPT-4o (structured data extraction), Kreuzberg (PDF/DOCX/XLSX extraction), Pydantic v2, PostgreSQL, AWS S3, Prefect 3 (workflow orchestration), Docker, pytest + snapshot testing, Ruff, structlog.

What You'll Work On

- **New scrapers**: Build modules that discover and crawl grant listings from state government websites, handling SPAs, paginated results, and document downloads
- **AI extraction**: Design LLM prompts for structured grant field extraction and document classification; work with cost controls like content fingerprinting and relevance filtering
- **Document processing**: Extract content from NOFO attachments (PDFs, DOCX, XLSX) and integrate with the existing multi-document assembly pipeline
- **Change detection & sync**: Implement fingerprint-based change detection and status derivation; sync only changed grants downstream
- **Testing**: Write unit tests, record HAR fixtures for deterministic crawl replay, and maintain QC regression tests against golden snapshots

Required Skills:

- Strong Python (async/await, type hints, Pydantic)
- Web scraping experience, ideally with Playwright or similar browser automation
- Familiarity with LLM APIs and prompt engineering for data extraction
- Comfort with immutable data models and clean architecture patterns
- Experience with PostgreSQL and Docker
- Bonus: Prefect or workflow orchestration tools
- Bonus: document extraction (PDF/DOCX parsing)

Yogesh Sharma | Lead Tech Recruiter
An E-Verified Company