
All About Me!




Hello, my name is Sukanya (or Suki for short)! I’m currently a PhD student at Harvard University in Engineering Sciences with a concentration in Computer Science and a secondary in Data Science. I graduated from the University of California San Diego, where I majored in Bioengineering with a double minor in Data Science and Cognitive Science. I am interested in applying computational methods and machine learning to the medical sciences. As an avid programmer and engineer, I’m very passionate about using CS and Data Science (DS) to help solve some of the world's most interesting problems.

I’m interested in exploring and learning about all sorts of technology and software. Some of the areas I most enjoy learning about are app development, finance, geospatial analysis, computer vision, renewable energy, and healthcare. I find it incredible how CS and DS can be used in so many intersecting ways, and how many opportunities exist in these fields. If you have an interesting collaboration opportunity or would like to reach out, feel free to contact me through the "Contact Me" message form!

To share a bit more about me, separate from school, work, and tech: I really enjoy listening to music on YouTube or Spotify, making music (I have been playing the flute for ~10 years now, since middle school!), watching K-dramas or anime (my favorites at the moment are Blue Lock and One Punch Man), and learning new languages (I am currently learning Korean through my university!).

Skills

Below are some of the skills that I am familiar with, and I'm always looking to learn more.

Python

5 years +

Proficient with NumPy, Pandas, Keras, TensorFlow, PyTorch, Flask, Django


SQL & R

1 year

Proficient in working with relational databases, writing queries, and performing data analysis.


Java

3 years +

Proficient with object-oriented programming (OOP) principles.


Git & GitHub

4 years +

Proficient in using Git and GitHub for version control when managing projects.


Cloud & Containerization

1 year

Proficient with using Cloud Computing and Containerization Technology: AWS, Kubernetes, Poseidon.


HTML & CSS

1 year

Adequate at using HTML and CSS for web development.


Select Work Experiences

Undergraduate Student Researcher @ Systems Biology and Systems Medicine Lab

UCSD     La Jolla, CA     April 2024 - Sep. 2024

  • Developing an image-based methodology to stratify the heterogeneity and classify the disease state of tumors in triple-negative breast cancer (TNBC) using fluorescent microscopy images obtained from GeoMx experiments.
Subramaniam Lab Picture

Machine Learning (Pharmacometrics) Intern @ Bristol Myers Squibb

Bristol Myers Squibb     San Diego, CA     July 2023 - Sep. 2023

  • Employed machine learning techniques to identify key features for predicting diabetes using two distinct datasets. Investigated multiple scikit-learn classifiers, explainable AI techniques, and neural networks.
  • Achieved up to 86% model accuracy on the smaller Pima Indians Diabetes dataset and 75% accuracy on the larger Diabetes Readmission dataset.
  • Implemented Generative Adversarial Networks (GANs) to augment the datasets with new synthetic patient data, extending the project’s scope beyond feature selection.

Product Analytics Intern @ One Medical

One Medical (Amazon)     San Francisco, CA     May 2023 - Aug. 2023

  • Worked with big data to develop an efficient data model on Snowflake, aggregating patient data to enhance performance and analytical capabilities.
  • Designed and published Tableau dashboards, visually representing 8 crucial success metrics sourced from Snowflake and Mixpanel. Utilized a pre-aggregated data model to ensure superior performance.

Open Source Contributor @ Google Summer of Code

Ontario Institute for Cancer Research     Remote     May 2023 - Sep. 2023

  • Spearheaded implementation of a CI/CD pipeline using Argo CD, Argo Workflows, AWS, GitHub, and Jenkins.
  • Successfully automated continuous integration deployments for 2 repositories using Git Hooks, with plans to expand to over 20 by the next release.
  • Dockerized critical repositories in the release pipeline to reduce manual intervention, improve curator workflows, and enhance developer productivity.

Undergraduate Student Researcher @ Robotic and Haptic Devices Lab

UCSD     La Jolla, CA     May 2023 - Aug. 2024

  • Part of a bioengineering senior design team developing a proof-of-concept demonstration of a novel vine biomedical robot that can be steered by local actuation of a responsive material.
  • Evaluating and testing the attachment of heating actuators (LCEs) to different vine materials, and characterizing the performance of the LCEs when activated using hydronic or pneumatic heating.
Morimoto Lab Logo

Data Science Engineering Intern @ Medtronic

Medtronic     Northridge, CA     June 2022 - Aug. 2022

  • Optimized the speed of the Digital Twin (DT) model, which fits a set of parameter values that minimizes the error between patient sensor glucose values and the DT model's fitted sensor glucose values.
  • Used Python and developed on Poseidon clusters.
  • Achieved a 7.5x reduction in compute time with stable fitting (1.4% deviation in MARD) and stable parameter estimation (4% parameter variation at a 10-minute step, compared with parameters fitted at the finest discretization of 1-minute steps).
  • Estimated 5x cost reduction in cloud resources (AWS) at scale.
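
For readers unfamiliar with the metric: assuming MARD here stands for mean absolute relative difference, the accuracy measure commonly used for glucose sensors, it can be sketched roughly as below. The function name and the sample values are hypothetical and are not taken from the Digital Twin code.

```python
def mard(fitted, reference):
    """Mean absolute relative difference, as a percentage.

    `fitted` and `reference` are equal-length sequences of glucose
    values. Both names are illustrative placeholders.
    """
    if len(fitted) != len(reference) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    total = sum(abs(f - r) / abs(r) for f, r in zip(fitted, reference))
    return 100.0 * total / len(reference)


# Made-up sample values, for illustration only.
print(round(mard([110.0, 90.0], [100.0, 100.0]), 2))
```

A lower value means the fitted sensor glucose values track the measured ones more closely.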

Undergraduate Student Researcher @ Duarte Lab

UCSD     La Jolla, CA     June 2021 - Feb. 2024

  • Performing anomaly detection analysis at the Duarte Lab (a lab that studies particle physics using machine learning), using graph-based ML models for LHC data analysis to discover exotic new physics.
  • Developing particle graph autoencoders, unsupervised deep learning models for anomaly detection, using Python on Kubernetes clusters.
  • IRIS-HEP fellow for summer 2021.

Undergraduate Student Researcher @ Zorrilla Lab

Scripps Research     La Jolla, CA     Sep. 2021 - June 2022

  • Training in bioinformatics techniques, including GWAS and whole exome sequence analysis using Python and R.
  • Troubleshooting and applying code to perform psychiatric genetic association analysis on the UK Biobank database, both to analyze a priori genes and to conduct a gene variant discovery analysis.
  • Worked on the development of a random forest predictive model to determine whether there is a relationship between a patient's profile and whether they would have an alcohol-related rehospitalization.

Project Work

Research, Group, and Personal projects.




Software Projects

Fake Amazon Reviews (FARS)

FARS is a data science/ML project, carried out by a 4-person team that I led, that uses the US Amazon Customer Reviews dataset (2014-2015 archives) to predict whether a given review is verified or unverified.

Developed two ML models for this analysis: a KNN (K-Nearest Neighbors) classifier and a bigram-based Random Forest classifier. Optimized the KNN classifier through feature selection and data cleaning, and achieved around 70% test accuracy for both models.

Interaction Network (IN)

This project under the Duarte Lab investigates a type of graph-based autoencoder and randomized neural network architecture, the interaction network autoencoder and variational autoencoder (i.e., CNN, DNN). The objective is to evaluate it against other kinds of autoencoder and variational autoencoder structures, to see which structures can best be optimized to fit on an FPGA (to meet L1 trigger requirements) while still performing well at anomaly detection.

National Park Size & Diversity Analysis

This project is a simple linear regression analysis, based on open-source Kaggle datasets, that looks into whether there is a relationship between a national park's size and the diversity within that park.
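
As a rough illustration of the technique (not the project's actual code), a simple linear regression can be fitted by ordinary least squares in a few lines. The function name, variable names, and numbers below are hypothetical; the numbers are made up and do not come from the Kaggle datasets.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative values: park size vs. a diversity measure.
park_size = [10.0, 20.0, 30.0, 40.0]
diversity = [120.0, 180.0, 260.0, 310.0]
slope, intercept = fit_line(park_size, diversity)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

A positive slope would suggest that larger parks tend to have higher diversity, although a fitted line alone says nothing about how strong or significant the relationship is.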

Power Outages Analysis


Effects of Food Accessibility on Type II Diabetes

As part of a 5-person team, I worked on a project that explored the prevalence of Type II diabetes among Californian adults in 2017 in relation to their access to fresh foods, their race, and their gender.

The ordinal data was processed through one-hot encoding into seven columns: whether the respondent has diabetes, the type of diabetes, their access to fresh foods, their access to affordable fresh foods, their race, whether they are prediabetic, and whether they are female. We performed univariate analysis to determine the distribution of the data in each column, and multivariate analysis with scatterplots to determine the relationships between them. We then used a decision tree to further analyze how diabetes and prediabetes relate to the other columns.

Our results showed that a high volume of adults have access to affordable fresh foods. However, the prevalence of Type II diabetes was higher among those with greater access to affordable fresh foods.
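
The one-hot encoding step mentioned above can be sketched as follows. This is a minimal illustration rather than the team's actual preprocessing code; the function name, the column name, and the category values are hypothetical.

```python
def one_hot(values):
    """Map a list of category labels to 0/1 indicator columns.

    Returns a dict with one key per distinct category; each value is
    a list that holds 1 where the input equals that category, else 0.
    """
    categories = sorted(set(values))
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical survey answers, for illustration only.
access_to_fresh_food = ["always", "sometimes", "never", "always"]
encoded = one_hot(access_to_fresh_food)
for column, indicators in encoded.items():
    print(column, indicators)
```

Each answer ends up with exactly one 1 across the indicator columns, which is the form a decision tree or similar model can consume directly.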

Song Visualizer (Spotify)

This is a project I recently started that takes in a song title and the corresponding artist name and generates a rating for the overall mood/vibe of the song, based on various musical indicators and the sentiment of the song lyrics.

My goal is to qualitatively describe the song using an "emoji"/emotion visual so that users can filter for the kinds of songs they want to listen to by how they're feeling. I plan to showcase the analysis through an online web dashboard, with a deployment goal of Nov 2022.

The inspiration for this idea came when I had difficulty finding songs of a particular mood in a playlist (Liked Songs) that is not organized by style or genre.

News Articles Classifier

This project combined web scraping using BeautifulSoup with predictive machine learning using a Hugging Face zero-shot-classification model. It scrapes an article, determines which topic the article is most closely associated with, and labels the article with one of 14 topic categories: "politics", "international news", "celebrity", "sports", "health", "nutrition", "fitness", "beauty", "business", "economy", "finance", "technology", "science", "lifestyle".

Personal Portfolio

This personal portfolio is one of my first projects in which I got to learn and work with HTML, CSS, and JavaScript to create a nice deliverable.

Mech-E / Design Projects

Construct-A-Thon: Demolition Robot

Amazon Redesign - Data Driven UX and Product Design Case Study

VEX Robotics

VEX Robotics is a robotics competition in which teams in the high school division build a robot for that year's game and compete.

  • 2017 - Qualified for World Championships
  • 2018, 2019, 2020 - Qualified for State Championships

Publications

Contact Me


Email: sukanyakrishna@g.harvard.edu ; sukanya.krishna@gmail.com


If you have any questions for me, opportunities to share, or would like to chat, please let me know by email or by filling out the message form below!