Alexandru Paicu

Machine Learning Engineer | Data Scientist

Iași, Romania

Profile

Machine Learning Engineer with 5+ years of experience in artificial intelligence and data science. Specialized in predictive modeling, computer vision, generative AI, RAG systems, Agentic AI, and AWS-based MLOps. Proven ability to build production-ready ML systems, optimize pipelines, and align technical solutions with business goals through cross-functional collaboration.

Skills

Programming

  • Python
  • JavaScript
  • HTML/CSS
  • C++

Machine Learning & AI

  • TensorFlow
  • PyTorch
  • Scikit-Learn
  • XGBoost
  • Keras
  • Neural Networks
  • Deep Learning
  • Computer Vision
  • NLP

Generative AI & RAG

  • LangChain
  • LangSmith
  • RAG Systems
  • Agentic AI
  • Prompt Engineering
  • QLoRA
  • PiSSA
  • Large Language Models
  • Cohere
  • Pinecone
  • GraphDB
  • OpenAI
  • Hugging Face

Cloud & MLOps

  • AWS SageMaker
  • AWS Bedrock
  • Lambda
  • S3
  • ECS
  • EC2
  • Fargate
  • CloudFormation
  • MLflow
  • Model Deployment
  • CI/CD
  • Docker
  • GitHub Actions

Web Development

  • FastAPI
  • Flask
  • REST APIs
  • Streamlit
  • Gradio

Web & 3D Visualization

  • Three.js
  • React-Three/Drei

Data Science

  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Feature Engineering
  • Statistical Analysis
  • SQL
  • NoSQL
  • Data Pipelines
  • ETL Processes

Projects

LEANN RAG – Ultra-Lightweight Vector Search Engine

LEANN, HNSW, Qwen3-0.6B, all-MiniLM-L6-v2, Gradio, CPU-Optimized

RAG system that achieves a 97% storage reduction through on-demand embedding computation. Implements LEANN (Lightweight Embedding & Neural Network) with an HNSW graph backend that stores only the navigable small-world graph structure instead of dense vectors. Features real-time PDF/text ingestion with adaptive chunking (256-token blocks), a Qwen3-0.6B LLM for response generation, and all-MiniLM-L6-v2 embeddings. Optimized for CPU deployment on the Hugging Face free tier (2 vCPU, 16 GB RAM) with 2-5 s search latency. The graph-traversal architecture recomputes embeddings during search, eliminating index bloat while maintaining semantic accuracy. Supports GPU acceleration (10x faster on T4/A10G) with zero code changes.

Sleep Disorder Risk Model – End-to-End Classification

XGBoost, Optuna, SHAP, Streamlit, NHANES Data

Binary classification model predicting sleep disorder risk using NHANES survey data (2005-2016 cycles). Engineered features from lifestyle and health variables including BMI, exercise, diet, sleep duration, and cardiovascular metrics. Implemented XGBoost as baseline model with Optuna hyperparameter tuning. Integrated SHAP explainability for global and local feature importance analysis and counterfactual recommendations. Deployed with Streamlit UI for real-time risk prediction and personalized health recommendations.

Super Creator Agent – AI Coding Assistant with Self-Healing

LangGraph, Qwen2.5-Coder, RAG 2.0, FlashRank, ReWOO, RAPTOR, Ollama

Autonomous code generation system with self-healing capabilities using dual-model architecture (3B/7B Qwen2.5-Coder). Implements LangGraph state machine orchestrating ReWOO planning framework (Planner-Worker-Solver). Features advanced RAG 2.0 pipeline with RAPTOR hierarchical indexing, HyDE hypothetical document generation, MultiQuery expansion, and FlashRank cross-encoder reranking. Includes iterative self-correction loop: Generate → Execute → Reflect → Fix (max 3 iterations). Supports document upload for context-aware code generation. Built with async Gradio streaming interface and ChromaDB vector store with BGE-base-en-v1.5 embeddings.

Housing Regression UI – End-to-End

XGBoost, Optuna, MLflow, FastAPI, Gradio, Docker, AWS

End-to-end housing-price regression pipeline built with production ML engineering best practices: time-aware splits, robust preprocessing and feature engineering, XGBoost with Optuna hyperparameter tuning, MLflow experiment tracking, and containerized deployment. Includes an interactive Gradio UI and a REST API, plus CI/CD and AWS-ready task definitions.

Healthcare Analytics – Heart Disease Prediction

Python, Pandas, NumPy, Matplotlib, Seaborn, Jupyter, Feature Engineering, Statistical Analysis

Developed classification models using Random Forest, SVM, and Neural Networks on UCI heart disease dataset. Achieved 87% accuracy through feature engineering and hyperparameter optimization. Implemented cross-validation and ROC analysis for model validation.

Automotive Pricing Intelligence – Car Sales Prediction

Python, Scikit-Learn, Ensemble Methods

Created regression models predicting vehicle prices using ensemble methods. Processed 15K+ vehicle records with feature engineering on categorical and numerical data. Reduced prediction error by 23% through advanced feature selection techniques.

Industrial Equipment Valuation – Bulldozer Price Forecasting

Python, Time-Series Regression

Built time-series regression models for heavy equipment auction price prediction. Achieved RMSE reduction of 15% compared to baseline linear models. Implemented model retraining pipeline for continuous learning.

Computer Vision – Dog Breed Classification System

TensorFlow, CNN, Transfer Learning, ResNet

Developed CNN using TensorFlow and Transfer Learning with pre-trained ResNet models. Achieved 85% classification accuracy across 120 dog breeds. Created web interface for real-time image classification.

LightDrift – Astronomical Data Processing

Python, Gaia DR3, SDSS, Data Visualization, Statistical Models

Developed Python scripts for processing Gaia DR3 and SDSS astronomical datasets. Created data visualization tools for stellar motion analysis and cosmic event pattern recognition. Implemented statistical models for anomaly detection in large-scale astronomical data.

Code Generator AI

AWS Bedrock, FastAPI, S3, Lambda

Engineered a serverless AI code generation tool using Anthropic Claude models via AWS Bedrock, integrated with FastAPI for real-time API access.

Image Generation AI

AWS Bedrock, FastAPI, S3

Developed an AI-powered image generation tool using foundation models via AWS Bedrock, enabling rapid creation of high-quality visuals.

Summarization AI

AWS Bedrock, FastAPI, S3, Lambda

Built a serverless text summarization tool using Anthropic Claude models via AWS Bedrock, integrated with FastAPI for API-driven access.

Key Achievements

  • Delivered 8+ production ML systems end-to-end
  • Cut model training time by 35% through pipeline optimization
  • Built reliable AWS ML infrastructure with high availability and sub-second inference
  • Maintained >95% accuracy across deployed models in multiple domains

Experience & Education

2020 – Present

Freelance Machine Learning Engineer

Self-Employed

  • Built and deployed production ML models for predictive analytics using TensorFlow, PyTorch, and Scikit-Learn
  • Designed and implemented RAG systems of varying complexity, combining retrieval techniques with LLMs for enhanced context-aware responses
  • Developed Agentic AI solutions using LangChain and LangSmith, integrated with vector and graph databases (Pinecone, GraphDB) and AI platforms (Cohere, OpenAI, Hugging Face)
  • Optimized model hyperparameters with Optuna's Bayesian optimization, improving performance by 35%
  • Implemented MLflow for experiment tracking, model versioning, and deployment pipeline management
  • Developed FastAPI microservices with custom UIs for model inference and real-time predictions
  • Containerized applications with Docker and automated CI/CD pipelines using GitHub Actions
  • Deployed scalable cloud solutions on AWS (SageMaker, Lambda, S3) with sub-second inference times
  • Engineered various AI tools powered by transformer architectures and open-source models from Hugging Face, Kaggle, and Keras

2021 – 2025

Data Systems Specialist

Conduent | Enterprise Data Management & Automation, Iași, Romania

  • Automated data processes with Python, reducing manual effort by 30%
  • Built GDPR-compliant ETL pipelines for sensitive authentication data
  • Created reporting dashboards for KPI tracking across departments
  • Partnered with cross-functional IT and business teams to align reporting with operational needs

2013 – Present

Freelance Audio Engineer

Self-Employed, International

  • Developed audio plugins using JUCE modules and C++
  • Performed internationally with hardware synthesizers
  • Created custom audio projects and effects

Psychology

Alexandru Ioan Cuza University, Iași

AWS Bedrock: Build & Scale Generative AI

Amazon Web Services

AI Engineering Bootcamp

Zero To Mastery

Machine Learning & Data Science Bootcamp

Zero To Mastery

Prompt Engineering for Developers

Zero To Mastery

Complete Python Developer

Zero To Mastery

Hobbies & Interests

🔭
Astronomy
🧠
Neuroaesthetics
🎨
3D Visualization
🔬
Scientific Computing
🧪
Health Technology