Engineering · Remote

Data Engineer

Plexi is a data platform for asset managers, hedge funds, and fintechs. We are heavy AI users and we are looking for a data engineer hungry to grow into both modern data engineering and the new agentic way of building software.


About Plexi

Plexi is two connected products. Plexi Fact is our data lakehouse platform - we pull data from dozens of vendors, clean it, model it, and serve it back through dashboards, APIs, Excel, and AI agents. Enterprise clients, real traction, growing fast. Plexi Flexor is the agent platform we use to build everything else - 50+ specialized agents that pair with us on schema review, query tuning, anomaly diagnosis, lineage, code review, and more. We don't just use those agents. We design them, refine them, and ship the patterns we find.

Why You Want to Be Part of This Story

If you've been pushing what Claude Code or Cursor can do beyond autocomplete - if you have opinions about prompts, agent design, and where this is all going - you'll be at home here. You'll work directly with the founders. You'll have a roster of agents reviewing your DAGs, explaining query plans, diffing schemas, and diagnosing pipeline failures, and you'll be writing the skills and prompts that make them better. The work moves fast and the bar is high; you'll learn quickly.

You'll be trusted to find what needs doing, dive in, build the context, make a plan, and ship - pulling others in when their expertise is genuinely needed. We're a small team and there's more work than people. That's the deal, and it's also the opportunity.

What You'll Actually Do

Build and operate Apache Airflow DAGs that ingest messy data from real financial vendors
Design and optimize ETL/ELT pipelines in Python - Pandas, NumPy, SQLAlchemy
Model data in SQL Server: an operational schema and a data warehouse, hundreds of tables across the two
Write and tune T-SQL: stored procedures, views, complex queries, indexing decisions
Pair with AI agents on data ops: direct them, review their work, push back when they're wrong
Write Flexor skills and hooks for data ops, schema management, and pipeline diagnostics
Sit in on client calls, map their data into our model, and ship the integration that same week
Contribute to architecture decisions - we don't reserve those for senior people

Tech Stack

Orchestration

Apache Airflow (multiple DAGs in production)

Languages

Python 3
SQL (T-SQL primary, PostgreSQL secondary)

Data Libraries

Pandas
NumPy
SQLAlchemy
pyodbc

Databases

SQL Server (with Flyway migrations)
PostgreSQL

Cloud

Azure
Azure DevOps CI/CD

AI / Agentic

Claude Code
Anthropic API
MCP (Model Context Protocol)
Ollama
Flexor (our internal agent platform)

Vendors Integrated

Bloomberg DL
CME
CSI
Eze
IBKR
SEC EDGAR
Snowflake

What Matters

You're a self-starter who drives a problem end-to-end - dive in, build understanding, make a plan, execute, and pull others in when their expertise is needed. You don't wait to be told what to do, and you don't get stuck waiting for permission.
You actually work with data - degree, bootcamp, previous job, or side projects that grew teeth. We care what you can do, not how you got there.
You already use AI in your work and you push it harder than your friends do
Strong Python data manipulation - Pandas and NumPy in your sleep, or close to it
Real SQL fluency, or the hunger to get there fast - SELECTs aren't enough; you need to think in joins, indexes, and query plans
You've shipped or operated at least one pipeline that ran on a schedule and mattered (school project counts if it actually ran)
You're comfortable reading vendor docs and integrating messy external APIs
You can sit in front of a client and translate "how is your data shaped?" into a real schema

Nice to Have (We'll Teach You the Rest)

Hands-on Airflow (or Prefect, Dagster, Luigi - any orchestrator)
An MCP server, Claude skill, custom agent, or hook you've built or extended
Financial / market / reference data background
Stored procs and Flyway-style versioned migrations
Snowflake or other cloud data warehouse
Data modeling: dimensional, normalized, lakehouse
dbt, Kafka, Event Hubs, Azure Data Factory, Synapse

What We Offer

Direct mentorship from founders and senior engineers who have been doing this a long time

Greenfield design alongside operational ownership - you build it, you run it

Client exposure early - your work lands in front of paying users

A modern data stack, run pragmatically - no rebuilding for the sake of it

Remote-friendly, flexible schedule

Competitive compensation calibrated to experience

We respond to every application within 3 business days.

Apply for This Role

Questions first? Email careers@plexl.ai
