NLP | Embeddings | Text processing | Streamlit
In this data science project, I deploy an NLP app that recommends papers based on the similarity between the papers’ abstracts and the user’s interests using text processing, vector embeddings, Pinecone and Streamlit.
Supervised learning | Logistic regression | Streamlit | NBA
In this data science project, I deploy an ML app that compares the influence of a pair of NBA teammates’ stats on their team’s winning chances using logistic regression, SHAP values, Hopsworks and Streamlit.
GeoPandas | TopoJSON | Power BI
This data science and visualization project creates TopoJSON maps of Colombian departments and towns from corresponding GeoJSON and Esri files, improving the visualization of specific territories.
Econometrics | Statistics | Python | Inventory management
This research project studies how people make ordering decisions ahead of a selling season when they can expedite additional units if the initial order falls short of demand.
Unsupervised learning | Clustering | SQL
This data science project analyzes accident risks for motorcyclists in Bogotá and prioritizes highway corridors for road safety operations using unsupervised learning.
Supervised learning | Logistic regression | XGBoost
This data science project performs an exploratory data analysis (EDA) of data on default payments by credit card clients and trains several supervised ML models to predict default. The dataset is available on Kaggle.
Git | GitHub
This is a personal endeavor where I train myself in basic Git tools and collect useful Git commands.