Data scientist at Booz Allen Hamilton and recent astrophysics graduate with a strong mathematics foundation. Specializes in developing proprietary Machine Learning (ML) and Artificial Intelligence (AI) algorithms. Highly skilled in research and experienced in award-winning research projects across multiple fields of study, including astrophysics, bioinformatics, economics, geology, and volcanology. Interested in applying machine learning techniques to intellectually challenging tasks, learning new technologies and tools, and communicating results through compelling data visualizations. Achievements include being a three-time NASA research fellow and an award-winning public speaker.
Alongside my academic and industry skills, I love to cook, try amazing food, and drink whiskey!
This website is a collection of miscellaneous notes, thoughts, and tutorials.
Experience
Data Scientist, Associate | Booz Allen Hamilton (July 2019 \(\to\) Present)
- Lead AI/ML R&D developer with a focus on delivering cutting-edge technologies to a variety of clients across the federal government. Successfully led and developed over a dozen technical solutions, leveraging AI technologies such as Natural Language Processing (NLP), Computer Vision (CV), and Reinforcement Learning (RL).
- Served as lead data analyst and consultant on the CDC's largest COVID-19 data source, which amounts to over one petabyte of patient medical records. Supported research efforts by immunologists, epidemiologists, virologists, and other researchers working to understand COVID-19 and its impacts during the response. Applied expertise in statistical modeling, distributed big data computing, and visualization to analyze this data and present findings. Led numerous technical presentations to high-ranking government officials. Relevant technologies include: Databricks (Apache Spark) and AWS SageMaker.
- Utilized MLOps best practices to deploy ML models at scale across on-prem, Azure, and AWS cloud environments. Leveraged orchestration services, such as Apache Airflow and Kubeflow, with end-to-end frameworks such as TensorFlow Extended, Ray, and PyTorch, thereby standardizing the process for model development, serving, and monitoring.
- Served as Agile scrum master and developer on geographically distributed teams of over 20 individuals, delivering technical solutions that consistently exceeded client expectations. Familiar with Agile processes such as backlog grooming, sprint planning, and user story creation.
- Sample projects include: (a) detecting temporal differences in surveillance imagery using Siamese CV networks, (b) Deep RL for military Course of Action (COA) analysis, (c) Deep RL for cyber security detection and reaction, (d) ensemble learning for sequence geolocation tagging on the edge, and (e) establishing MLOps reference architectures using TensorFlow Extended.
Machine Learning Operations and Data Engineering Consultant | LifeDNA (Feb. 2022 \(\to\) June 2022)
- Performed an internal audit of the company's cloud infrastructure and ML methods and provided recommendations.
- Crafted reference architecture and implementation to support (1) rapid training, deployment, and monitoring of over 100 ML models in a production environment and (2) scalable and modular data pipelines. Instantiated a Databricks sandbox environment and provided training on the Databricks platform, Apache Spark, and big data best practices. Relevant technologies include: Ray ML, Apache Airflow, Docker, and various AWS services.
Machine Learning Researcher | University of Hawaiʻi at Mānoa (Jan. 2018 \(\to\) June 2019)
- Trained and deployed a soft-margin Support Vector Machine (SVM) using Scikit-Learn to perform binary classification on bioinformatics data.
- Developed ETL algorithms to convert patient records into a consolidated data warehouse format.
Undergraduate Research Intern | Princeton University (May 2018 \(\to\) August 2018)
- Employed probabilistic machine learning methods to analyze the stellar inclination distribution in the Pleiades.
- Created and executed PostgreSQL scripts for database merging and querying.
- Ported and modernized legacy Python 2.7 scripts to Python 3.x.
- Generated weekly data analysis reports using Dash (by Plotly) and deployed them on webpages using Django.
- Part of the National Astronomy Consortium (NAC) led by the National Radio Astronomy Observatory (NRAO).
Akamai Research Intern | Gemini Observatory (June 2016 \(\to\) August 2016)
- Used Python to develop various analysis tools for the VISTA Variables in the Via Lactea (VVV) survey.
- Ported and modernized legacy IDL scripts to Python 3.x.
- Performed SQL queries to extract and merge data from various astronomical data sources, including Gaia and 2MASS, based on target coordinates and CCD resolution.
- In association with the Akamai Workforce Initiative.
Education
- B.S. Astrophysics, University of Hawaiʻi at Mānoa (Spring 2019)
- High School Diploma, Kamehameha Schools Kapālama (Spring 2015)
Academic Honors & Awards
- 2x UROP Research Grant, Undergraduate Research Opportunities Program (Spring 2017, Spring 2019)
- NAC Research Fellow, Princeton University, National Astronomy Consortium (Summer 2018)
- 3x NASA Research Fellow, NASA Hawaii Space Grant Consortium (Fall 2016, Spring 2017, Fall 2017)
- 2nd Place, Undergraduate Research, AISES National Conference (Fall 2016)
- Feynman Award, American Association of Physics Teachers (AAPT) (Spring 2015)
Scholarships
Makalapua Naʻauao | Kamehameha Schools Kapālama | 2016 \(\to\) 2019
Ka Hikina O Ka La (KHOKL) | University of Hawaiʻi Maui College | 2015 \(\to\) 2019
STEM Scholarship | Office of Native Hawaiian Affairs | Fall 2017
Excellence in Research Scholarship | McInerny Foundation | Fall 2015