AI Cloud Solution Architect & Engineer
Overview
AI Cloud Solution Architect & Engineer at Neurons Lab: a unique hybrid role combining strategic solution design with hands-on engineering execution. You'll bridge the gap between client requirements and technical implementation, designing AI/ML architectures and then building them yourself using modern cloud infrastructure practices.
About The Project
Our Focus
We specialize in serving Banking, Financial Services, and Insurance (BFSI) enterprise customers with stringent compliance, security, and regulatory requirements. You'll work on mission-critical AI/ML systems where security architecture, data governance, and regulatory compliance are paramount.
This role is perfect for technical professionals who love both the "what" and the "how": architecting elegant solutions AND rolling up their sleeves to code, deploy, and optimize them. You'll work across multiple AI consulting engagements, from Generative AI workshops to enterprise ML platform development, all while maintaining the highest standards of security and compliance required by financial institutions.
Duration: Part-time long-term engagement with project-based allocations
Reporting: Direct report to Head of Cloud
Objective
Deliver end-to-end AI cloud solutions by combining architectural excellence with hands-on engineering capabilities:
- Architecture & Design: Gather requirements, design cloud architectures, calculate ROI, and create technical proposals for AI/ML solutions
- Engineering Excellence: Build production-grade infrastructure using IaC, develop APIs and prototypes, implement CI/CD pipelines, and manage AI workload operations
- Client Success: Transform business requirements into working solutions that are secure, scalable, cost-effective, and aligned with AWS best practices
- Knowledge Transfer: Create reusable artifacts, comprehensive documentation, and architectural patterns that accelerate future project delivery
KPIs
Architecture & Pre-Sales:
- Design and document 3+ solution architectures per month with comprehensive diagrams and specifications
- Achieve 80%+ client acceptance rate on proposed architectures and estimates
- Deliver ROI calculations and cost models within 2 business days of request
Engineering Delivery:
- Deploy infrastructure through IaC (AWS CDK/Terraform) with zero manual configuration
- Create at least 3 reusable IaC components or architectural patterns per quarter
- Implement CI/CD pipelines for all projects with automated testing and deployment
- Maintain 95%+ uptime for production AI/ML inference endpoints
- Document architecture and implementation details weekly for knowledge sharing
Quality & Best Practices:
- Ensure all solutions pass AWS Well-Architected Review standards
- Deliver comprehensive documentation within 1 week of architecture completion
- Create simplified UIs/demos for PoC validation and client presentations
Areas of Responsibility
Solution Architecture (40%)
Requirements & Design:
- Elicit and document business and technical requirements from clients
- Design end-to-end cloud architectures for AI/ML solutions (training, inference, data pipelines)
- Create architecture diagrams, technical specifications, and implementation roadmaps
- Evaluate technology options and recommend optimal AWS services for specific use cases
Business Analysis:
- Calculate ROI, TCO, and cost-benefit analysis for proposed solutions
- Estimate project scope, timelines, team composition, and resource requirements
- Participate in presales activities: technical presentations, demos, and proposal support
- Collaborate with sales team on SOW creation and customer workshops
Strategic Planning:
- Design for scalability, security, compliance, and cost optimization from day one
- Create reusable architectural patterns and reference architectures
- Stay current with AWS AI/ML services and emerging cloud technologies
Cloud Engineering & AI Infrastructure (60%)
Infrastructure as Code Development:
- Build and maintain cloud infrastructure using AWS CDK (primary) and Terraform
- Develop reusable IaC components and modules for common patterns
- Implement infrastructure for AI/ML workloads: GPU clusters, model serving, data lakes
- Manage compute resources: EC2, ECS, EKS, Lambda, SageMaker compute instances
Application Development:
- Develop Python applications: FastAPI backends, data processing scripts, automation tools
- Create prototype interfaces using Streamlit, React, or similar frameworks
- Build and integrate RESTful APIs for AI model serving and data access
- Implement authentication, authorization, and API security best practices
AI/ML Operations (MLOps):
- Deploy and manage AI/ML model serving infrastructure (SageMaker endpoints, containerized models)
- Build ML pipelines: data ingestion, preprocessing, training automation, model deployment
- Implement model versioning, experiment tracking, and A/B testing frameworks
- Manage GPU resource allocation, training job scheduling, and compute optimization
- Monitor model performance, inference latency, and system health metrics
DevOps & Automation:
- Design and implement CI/CD pipelines using GitHub Actions, GitLab CI, or AWS CodePipeline
- Automate deployment processes with infrastructure testing and validation
- Implement monitoring, logging, and alerting using CloudWatch, Prometheus, Grafana
- Manage containerization with Docker and orchestration with Kubernetes/ECS
Data Engineering:
- Build data pipelines for AI training and inference using AWS Glue, Step Functions, Lambda
- Design and implement data lakes using S3, Lake Formation, and data cataloging
- Implement automated and scheduled data synchronization processes
- Optimize data storage and retrieval for ML workloads
Security & Compliance:
- Implement cloud security best practices: IAM, VPC design, encryption, secrets management
- Build enterprise security and compliance strategies for AI/ML workloads
- Ensure solutions meet regulatory requirements (PCI-DSS, GDPR, SOC2, MAS TRM, etc., where applicable)
- Conduct security reviews and implement remediation strategies
Cost & Performance Optimization:
- Optimize cloud spend for compute-intensive AI workloads
- Implement spot instance strategies, auto-scaling, and resource scheduling
- Monitor and optimize GPU utilization, inference latency, and throughput
- Perform cost analysis and implement cost-saving measures
Operations & Support:
- Implement disaster recovery procedures for AI models and training data
- Manage backup strategies and business continuity planning
- Troubleshoot and resolve production issues in AI infrastructure
- Provide technical guidance to project teams during implementation
Skills
Cloud Architecture & Design:
- Strong solution architecture skills with ability to translate business requirements into technical designs
- Experience with AWS Well-Architected reviews and remediation
- Deep understanding of AWS services, particularly compute, storage, networking, and AI/ML services
- Experience designing scalable, highly available, and fault-tolerant systems
- Ability to create clear architecture diagrams and technical documentation
- Cost modeling and ROI calculation capabilities
Technical Leadership:
- Comfortable leading technical discussions with clients and stakeholders
- Ability to guide engineers and share knowledge effectively
- Strong problem-solving and analytical thinking skills
- Experience with architectural decision-making and trade-off analysis
Programming & Development:
- Advanced Python programming: object-oriented design, async programming, testing
- API development with FastAPI, Flask, or similar frameworks
- Frontend development basics: React, etc. (for prototypes and demos with AI code generation tools)
- Shell scripting for automation and deployment
- Git version control and collaborative development workflows
Infrastructure as Code:
- AWS CDK (required); CloudFormation experience is valuable
- Terraform (highly preferred) for multi-cloud or hybrid scenarios
- Understanding of IaC best practices: modularity, reusability, testing
- Experience with infrastructure testing and validation frameworks
AI/ML Infrastructure:
- Hands-on experience with AWS SageMaker: training jobs, endpoints, pipelines, notebooks
- Understanding of ML lifecycle: data preparation, training, deployment, monitoring
- Experience with GPU management and optimization for training/inference
- Knowledge of containerization for ML models (Docker, container registries)
- Familiarity with ML frameworks: PyTorch, TensorFlow, LangChain, LlamaIndex, etc.
DevOps & Automation:
- CI/CD pipeline design and implementation (GitHub Actions, GitLab CI, AWS CodePipeline)
- Container orchestration: Docker, Kubernetes, Amazon ECS
- Configuration management and deployment automation
- Monitoring and observability: CloudWatch, Prometheus, Grafana, ELK stack
Communication & Collaboration:
- Excellent written and verbal communication (advanced English)
- Ability to explain complex technical concepts to non-technical stakeholders
- Comfortable with client-facing presentations and technical demos
- Strong documentation skills with attention to detail
- Collaborative mindset with ability to work across functional teams
Problem-Solving:
- Advanced task breakdown and estimation abilities
- Debugging and troubleshooting complex distributed systems
- Performance optimization and tuning
- Incident response and root cause analysis
Knowledge
AWS Cloud Platform (Required):
- AWS Certified Solutions Architect Associate (minimum requirement)
- AWS Certified Solutions Architect Professional or AWS Certified Machine Learning - Specialty (highly preferred)
- Deep knowledge of core AWS services:
  - Compute: EC2, Lambda, ECS, EKS, SageMaker
  - Storage: S3, EFS, EBS, FSx
  - Networking: VPC, Route53, CloudFront, API Gateway, Load Balancers
  - AI/ML: SageMaker, Bedrock, Rekognition, Textract, Comprehend, Lex, Polly
  - Data: RDS, DynamoDB, Redshift, Glue, Athena, Kinesis
  - Security: IAM, KMS, Secrets Manager, Security Hub, GuardDuty
  - DevOps: GitHub Actions, CodePipeline, CodeBuild, CodeDeploy, CloudFormation, CDK, Terraform
AI/ML Technologies:
- Understanding of machine learning concepts and model training/deployment lifecycle
- Familiarity with Generative AI technologies: LLMs, RAG, vector databases, prompt engineering
- Knowledge of ML frameworks and libraries: PyTorch, TensorFlow, scikit-learn, pandas, numpy
- Experience with MLOps practices and tools
- Understanding of model serving patterns: real-time vs batch inference
Software Development:
- Modern software development practices: testing, code review, documentation
- API design principles: RESTful, GraphQL, event-driven architectures
- Database design and optimization: SQL and NoSQL
- Authentication and authorization: OAuth, JWT, IAM
DevOps & Infrastructure:
- Linux/UNIX system administration
- Networking fundamentals: TCP/IP, DNS, HTTP/HTTPS, load balancing
- Security best practices for cloud environments
- Disaster recovery and business continuity planning
Industry Knowledge:
- Understanding of cloud consulting delivery models
- Familiarity with agile/scrum methodologies
- Awareness of compliance frameworks: GDPR, HIPAA, SOC2, ISO27001
- Knowledge of FinTech or other regulated industries (a plus)
Additional Knowledge (Preferred):
- Azure or GCP certifications and experience
- Multi-cloud architecture patterns
- Serverless architecture patterns
- Data engineering and data lake design
- Cost optimization strategies and FinOps practices
Experience
Cloud Engineering & Architecture:
- 5+ years in cloud engineering, DevOps, or solution architecture roles
- 3+ years hands-on experience with AWS services and architecture
- Proven track record of designing and implementing cloud solutions from scratch
- Experience with both greenfield projects and cloud migration initiatives
AI/ML Infrastructure:
- 2+ years working with AI/ML workloads on cloud platforms
- Hands-on experience deploying and managing ML models in production
- Experience with GPU-based compute for training or inference
- Understanding of AI/ML infrastructure challenges and optimization techniques
Infrastructure as Code:
- 3+ years building infrastructure using IaC tools (AWS CDK, Terraform, CloudFormation)
- Experience creating reusable IaC modules and components
- Track record of infrastructure automation and standardization
Software Development:
- 4+ years programming experience in Python (required)
- Experience building APIs with FastAPI, Flask, or similar frameworks
- History of creating prototypes, MVPs, or PoC applications
- Comfortable with full-stack development for demos and prototypes
DevOps & Automation:
- 3+ years implementing CI/CD pipelines and deployment automation
- Experience with containerization (Docker) and orchestration (Kubernetes/ECS)
- Linux/UNIX system administration experience
- Monitoring and observability implementation
Client-Facing Work:
- Experience gathering requirements and translating them into technical solutions
- History of presenting technical architectures to clients and stakeholders
- Participation in presales activities, demos, or technical workshops
- Ability to work directly with customers to solve complex problems
Industry Experience (Preferred):
- Consulting or professional services background
- Experience in regulated industries (FinTech, Insurance, Banks)
- Work with enterprise clients on large-scale implementations
- Startup or fast-paced environment experience
Seniority level
- Mid-Senior level
Employment type
- Full-time
Job function
- Engineering and Information Technology
Industries
- IT Services and IT Consulting
Job details
Company: Neurons Lab
Location: Lisbon, Portugal
Published: 16 October 2025