Job Details

Job Overview

Own the ML systems for a voice-AI product, driving accurate matchmaking and continuous improvement. Collaborate on data pipelines, model training, evaluation, and deployment, with an emphasis on efficiency and low latency. Key responsibilities: design multi-stage retrieval and re-ranking for personalization; manage data pipelines and ensure reproducibility; train and fine-tune LLMs; run offline and online evaluations; set latency and cost targets for inference services. Required skills: 3+ years in applied ML focused on ranking, recommendations, or search in production.

Responsibilities

Known is building a voice-AI product that powers curated introductions, agentic scheduling, and post-date feedback. You will own the ML systems that make matches feel accurate and improve every week — from data and features to training, evaluation, and low-latency inference — working closely with platform and product.

Our stack: Python, PyTorch, Hugging Face, OpenAI/Anthropic APIs, embeddings and vector search (pgvector/Pinecone/FAISS), Postgres + a warehouse for analytics, Airflow/Prefect/dbt for pipelines, online experimentation/A/B testing, observability for models and services on AWS (S3, ECS/Kubernetes, Lambda), CI/CD with GitHub Actions.
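
For context on the retrieval side of this stack, here is a minimal sketch of a pgvector nearest-neighbor query; the profiles table, column names, DSN, and 768-dimension embeddings are all hypothetical placeholders, not our schema:

    import numpy as np
    import psycopg2
    from pgvector.psycopg2 import register_vector  # official pgvector adapter

    # Hypothetical schema: profiles(id bigint, embedding vector(768))
    conn = psycopg2.connect("dbname=known")  # placeholder DSN
    register_vector(conn)  # lets psycopg2 pass numpy arrays as pgvector values

    query_embedding = np.random.rand(768).astype(np.float32)  # stand-in for a real encoder output

    with conn, conn.cursor() as cur:
        # pgvector's <-> operator orders rows by L2 distance to the query vector
        cur.execute(
            "SELECT id, embedding <-> %s AS distance "
            "FROM profiles ORDER BY distance LIMIT 20",
            (query_embedding,),
        )
        candidates = cur.fetchall()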

Key responsibilities

  • Design and ship multi-stage retrieval + re-ranking for compatibility scoring, search, and personalization (see the sketch after this list).
  • Build and maintain data/feature pipelines for training, evaluation, and reporting; ensure reproducibility and data quality.
  • Train, fine-tune, or prompt LLM/encoder models; manage model versioning, rollout, and rollback.
  • Run offline evaluation (e.g., AUC, NDCG, MAP) and online experiments to measure real user impact.
  • Stand up inference services with tight p95 latency and cost targets; add caching, batching, and fallback strategies.
  • Implement safety/guardrails and monitoring for drift, bias, and failure modes; define model SLOs and alerts.
  • Collaborate with infra/platform to productionize models and with product/design to turn signals from voice/text into better matches.
  • Document decisions, write lightweight runbooks, and share dashboards that track match quality and model health.
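
As a rough illustration of the multi-stage pattern in the first bullet, a sketch in NumPy; score_fn is a stand-in for a learned re-ranker (e.g., a cross-encoder), an assumption for illustration rather than our implementation:

    import numpy as np

    def retrieve(query_vec, profile_vecs, k=200):
        # Stage 1: cheap, high-recall candidate generation via cosine similarity
        # (assumes query and profile embeddings are L2-normalized, and k < number of profiles).
        sims = profile_vecs @ query_vec
        return np.argpartition(-sims, k)[:k]

    def rerank(query, candidate_ids, score_fn, n=10):
        # Stage 2: expensive, high-precision scoring over the small candidate set;
        # score_fn would wrap a cross-encoder or learning-to-rank model.
        ranked = sorted(candidate_ids, key=lambda c: score_fn(query, c), reverse=True)
        return ranked[:n]

The point of the split is that the precise model only ever sees a few hundred candidates, which is what keeps p95 latency and cost bounded.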

Qualifications

We are hiring a founding-caliber AI/ML engineer who brings real infrastructure and platform depth: someone who has owned production cloud environments and data platforms in high-growth settings. You will help set the golden paths for services, data, and model delivery, and you are comfortable working on-site in San Francisco five days a week.

  • 4 to 10+ years in infrastructure, platform, or data engineering with real ownership of uptime, performance, and security.
  • Expert with AWS and Infrastructure-as-Code (Terraform, Pulumi, or CloudFormation).
  • Strong proficiency in Python or TypeScript, plus tooling/scripting (Bash/YAML).
  • Containers and orchestration experience (Docker, Kubernetes or ECS) and CI/CD pipelines you designed and ran.
  • Proven ability to design and operate data pipelines and distributed systems for both batch and low-latency use cases.
  • PostgreSQL at scale, ideally with pgvector/embeddings exposure for ML-adjacent workloads.
  • Strong observability practices: metrics, tracing, alerting, incident management, and SLOs.
  • Excellent collaboration with AI/ML and product teams; clear communication of tradeoffs and risk.
  • Work authorization in the U.S. and willingness to be on-site five days a week in San Francisco.

Nice to have

  • Experience supporting model training and inference pipelines, feature stores, or evaluation loops.
  • Prior work with streaming voice, low-latency systems, or recommendation/retrieval stacks.

Examples of prior experience we value

  • Early infra/platform owner at a seed–Series B startup, scaling AWS with Terraform and CI/CD
  • Built real-time and batch data pipelines that powered matching, voice, or recommendations
  • Ran Postgres at scale (schema design, indexing, pooling), with pgvector or embeddings in prod
  • Set up observability and on-call (metrics, tracing, alerting) that improved SLOs
  • Partnered with ML to deploy and monitor model inference with clear latency and cost targets (a toy caching sketch follows this list)
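
On the latency point in the last bullet, caching repeated score lookups is one of the simplest levers; a toy sketch, where the model call is a simulated stand-in:

    import time
    from functools import lru_cache

    def expensive_model_score(pair_key: str) -> float:
        # Stand-in for a slow re-ranker or LLM call behind an internal API.
        time.sleep(0.02)  # simulate ~20 ms of model latency
        return (hash(pair_key) % 100) / 100.0

    @lru_cache(maxsize=10_000)
    def cached_score(pair_key: str) -> float:
        # Repeated (user, candidate) pairs skip the slow path entirely,
        # trading a bounded amount of memory for p95 latency and cost.
        return expensive_model_score(pair_key)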

Ideal Candidate

You are a founding-caliber AI/ML Engineer who ships ranking and recommendation systems in production. You move quickly while keeping reliability high, partner closely with platform and product, and turn rich voice and text signals into better matches week over week. On-site in San Francisco.

What great looks like

  • 3 to 8+ years building production matching, ranking, recommendations, or search in consumer products
  • Strong Python with PyTorch or TensorFlow and Hugging Face tooling
  • Hands-on with embeddings, LLMs, and vector search (pgvector, FAISS, Pinecone, or Weaviate)
  • Solid data foundations: feature engineering, labeling/feedback loops, reproducible training, and clear evaluation
  • Comfortable owning inference services with tight p95 latency and cost targets plus good runbooks and alerts
  • Able to design offline metrics (AUC, NDCG, MAP) and run A/B tests that tie to real user outcomes (see the NDCG sketch after this list)
  • Collaborative with platform/backend to productionize models safely and quickly
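
As a concrete reference for the offline-metrics bullet above, a minimal NDCG@k in the common linear-gain formulation (an illustration, not a prescribed implementation):

    import numpy as np

    def dcg(relevances):
        # Discounted cumulative gain: rewards relevant items ranked near the top.
        rel = np.asarray(relevances, dtype=float)
        discounts = np.log2(np.arange(2, len(rel) + 2))
        return float(np.sum(rel / discounts))

    def ndcg(ranked_relevances, k=10):
        # NDCG@k: DCG of the model's ranking divided by the best achievable DCG.
        ideal = sorted(ranked_relevances, reverse=True)
        best = dcg(ideal[:k])
        return dcg(ranked_relevances[:k]) / best if best > 0 else 0.0

    # Graded relevance labels in the order the model ranked the matches:
    print(ndcg([3, 2, 0, 1, 2], k=5))  # ~0.96; the ideal order would be [3, 2, 2, 1, 0]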

Examples of strong backgrounds

  • RecSys or Search engineer at a social, e-commerce, or dating app who improved ranking quality at scale
  • Applied ML engineer who built multi-stage retrieval + re-ranking and proved lift online
  • Conversational AI or voice intake work that improved downstream personalization or matching
  • Hybrid applied ML + MLOps experience setting evaluation standards and model SLOs

Must-Have Requirements

  • Must be authorized to work in the U.S. without the need for visa sponsorship, now or in the future.
  • Able to work onsite in San Francisco, CA five days per week.
  • 3+ years in applied ML focused on ranking, recommendations, or search in production.
  • Strong Python; experience with PyTorch or TensorFlow (Hugging Face a plus).
  • Hands-on with embeddings and vector search (pgvector, FAISS, Pinecone, or Weaviate).
  • Proven experience taking models from notebook to production: packaging, APIs, CI/CD, canary/rollback, monitoring.
  • Data pipelines for training and evaluation (e.g., Airflow, Prefect, Dagster, or dbt) and sound data-quality checks (a toy check follows this list).
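
By data-quality checks we mean lightweight, loud assertions that run inside the pipeline; a toy example with illustrative column names:

    import pandas as pd

    def check_training_batch(df: pd.DataFrame) -> None:
        # Fail the pipeline run loudly rather than train on bad data.
        assert not df.empty, "empty training batch"
        assert df["user_id"].notna().all(), "null user_id rows"
        dupes = df.duplicated(subset=["user_id", "event_ts"]).sum()
        assert dupes == 0, f"{dupes} duplicate events"
        # Embeddings should be unit-length if retrieval assumes cosine similarity.
        norms = df["embedding"].apply(lambda v: sum(x * x for x in v) ** 0.5)
        assert ((norms - 1.0).abs() < 1e-3).all(), "unnormalized embeddings"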

Screening Questions

1. (Optional Video) This step is completely optional. If you’d like, record a short 2–3 minute video introducing yourself and your experience, or share a recording of your interview with the recruiter if that’s easier. Upload it to Loom or Google Drive and share the link. This just helps us get to know you better; there’s no pressure if you’d prefer to skip it.
2. (Optional Portfolio / GitHub) If available, please share a link to your GitHub, portfolio, or any recent projects you’ve worked on. This is entirely optional but helps provide more context about your work.
3. What excites you about building matching systems at Known, and why are you considering a move now?
4. (Matching/Retrieval) Describe a ranking system you built. Include the retrieval method, re-ranking approach (e.g., LLM or learning-to-rank), features used, offline metrics you tracked, and one online metric you moved.
5. (Production & Reliability) Tell us about one model you owned in production. Include throughput and latency targets, how you monitored it (drift, bias, alerting), and one incident you diagnosed and resolved, with the before/after impact.
