Andrew Schell


Resume | LinkedIn | GitHub

I am pursuing an MITx MicroMasters degree in Statistics and Data Science from the Massachusetts Institute of Technology, online via Coursera.

I'm a Full Stack Product Data Analyst. My research and study of over 400 GitHub repos centers on data cleaning, NLP chatbot agents, artificial intelligence, and algorithms that identify patterns of business or social value, all within user-friendly software engineering.

Portfolio


Natural Language Processing

Twitter Sentiment Analysis: Let’s Make Friends

0 of 7

The Let’s Make (Twitter) Friends in Seattle hackathon was hosted by the Institute for Systems Biology in Seattle to encourage Python users to write useful code integrating API calls and AWS hosting. AWARD: First Prize, Most Entertaining and Useful Social Media Engagement.

We trained a Markov agent on episodes of Silicon Valley and tweets from startups, mashed up with @marvelavengers, to automatically generate computationally efficient replacement text. The result turns otherwise mundane, noisy traffic reports into engaging, educational social media posts for about $12 a month in AWS charges.
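To give a feel for the Markov approach, here is a minimal sketch of a word-level Markov chain text generator. It is not the hackathon code: the function names `build_chain` and `generate`, the `order` parameter, and the tiny traffic corpus are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words observed right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=12, seed=None):
    """Random-walk the chain to produce up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is repeatable
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # dead end: no word ever followed this state
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Placeholder corpus (hypothetical), not the original training data.
corpus = (
    "traffic is slow on the bridge today "
    "traffic is moving on the highway today "
    "the bridge is open and the highway is clear"
)
chain = build_chain(corpus, order=1)
print(generate(chain, length=10, seed=7))
```

Every adjacent word pair in the output was seen somewhere in the corpus, which is why mixing two sources (say, show dialogue and traffic reports) yields text that blends both voices. A higher `order` gives more coherent but less surprising output.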

https://s3-us-west-2.amazonaws.com/justindevelopspublic/aws-puppy-20160910.pdf tinyurl.com/j8d53vs @Justin_Devs

RankUp: Let’s Find Jobs

0 of 7

The hackathon was hosted by a life science company in Seattle to encourage API education.

KimBot

MATH341: Statistical t-test implementation

1 of 7

t-test gif here

My complete implementation of the assignments and projects in MATH341: Applied Statistical Methods I at Bellevue College (Winter 2021) (GitHub)
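As an illustration of what a t-test implementation involves, here is a minimal sketch of Welch's two-sample t-test using only the standard library. It is not the course code: the choice of Welch's variant, the function name `welch_t_test`, and the sample values are assumptions. It returns the statistic and degrees of freedom only; a p-value would additionally need the t-distribution CDF (for example from SciPy).

```python
import math
from statistics import mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample (sample variance / n).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, df


# Made-up measurements for illustration only.
control = [5.1, 4.9, 5.0, 5.2, 4.8]
treated = [5.6, 5.4, 5.7, 5.5, 5.8]
t_stat, df = welch_t_test(control, treated)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

For these made-up samples both groups have a sample variance of 0.025, so the statistic works out to t = -6 with 8 degrees of freedom.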

MATH340: Automated K-Means Computation

2 of 7

KMEANS GIF here

Multi-Environment bug in Python 3.7

3 of 7

Fixed a bug in Python 3.7 for multi-environment computation.

bug list or other GIF here

Python Constants Contributor

In October 2022, Python 3.7 was being deprecated. I had terabytes of data to migrate and needed a tool that ran on Python 3.7 inside a multi-environment setup that I was also migrating, but the code wasn’t working. After searching, I found the repo for constants, which thankfully was written in Python. Once I dug into the code base, I found that the tox.ini file had only been updated through Python 3.3. I submitted a PR and was able to process 11 terabytes of data. While Python 3.7 is officially deprecated, I can still use it on select tools. Future updates are in the queue to bring this tool up through Python 3.10.
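A check like the one that surfaced this problem can be sketched in a few lines: read the `envlist` from a tox.ini and report which interpreters are not covered. The file contents below are hypothetical (the real tox.ini may differ), as are the helper names `tox_envs` and `missing_envs`. The sketch handles plain comma or newline separated lists only, not tox's brace expansion such as `py{37,38}`.

```python
import configparser

# Hypothetical tox.ini contents; the real file in the repo may differ.
TOX_INI = """
[tox]
envlist = py27, py33

[testenv]
commands = python -m unittest discover
"""


def tox_envs(ini_text):
    """Return the environments named in the [tox] envlist setting."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    raw = parser.get("tox", "envlist", fallback="")
    # envlist entries may be separated by commas or newlines.
    return [env.strip() for env in raw.replace("\n", ",").split(",") if env.strip()]


def missing_envs(ini_text, wanted):
    """List the wanted environments that the envlist does not cover."""
    present = set(tox_envs(ini_text))
    return [env for env in wanted if env not in present]


print(missing_envs(TOX_INI, ["py33", "py37", "py310"]))  # ['py37', 'py310']
```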

Morningstar API integration with SQL database bug

4 of 7

MORNINGSTAR LOGO here

Mstables Contributor

Having worked in finance, I knew I needed to study the math and statistics behind it before I tried my hand at trading. What I chose instead was to build my data engineering skills by contributing to and improving well-written but deprecated code. I found such a project a few years ago while webscraping Morningstar.com and the NASDAQ API and saving the results to databases after transformation, and it was a natural starting point. Expanding upon the great code that Caiobran wrote, we automated the SQL database setup and are now working to expand the financial source data to include any existing flat file or database, along with the suite of data sources from pandas-datareader:

  1. Tiingo
  2. IEX
  3. Alpha Vantage
  4. Econdb
  5. Enigma
  6. Quandl
  7. St.Louis FED (FRED)
  8. Kenneth French’s data library
  9. World Bank
  10. OECD
  11. Eurostat
  12. Thrift Savings Plan
  13. Nasdaq
  14. Stooq
  15. MOEX
  16. Naver Finance
  17. Yahoo Finance
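The automated SQL database setup mentioned above could look roughly like the following sketch, which uses the standard-library `sqlite3` module. The `prices` table, its columns, and the helper names `setup_database` and `load_rows` are hypothetical and not taken from mstables; the rows are made up.

```python
import sqlite3

# Hypothetical schema; the real mstables layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS prices (
    ticker TEXT NOT NULL,
    trade_date TEXT NOT NULL,
    close REAL NOT NULL,
    source TEXT NOT NULL,
    PRIMARY KEY (ticker, trade_date, source)
);
"""


def setup_database(path=":memory:"):
    """Open the database and create the tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def load_rows(conn, rows):
    """Insert (ticker, trade_date, close, source) rows, replacing duplicates."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO prices VALUES (?, ?, ?, ?)", rows
        )


conn = setup_database()
# Made-up rows standing in for data transformed from any supported source.
load_rows(conn, [
    ("ABC", "2021-01-04", 10.5, "flatfile"),
    ("ABC", "2021-01-05", 10.7, "flatfile"),
    ("ABC", "2021-01-05", 10.8, "flatfile"),  # same key: replaces the row above
])
print(conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0])  # 2
```

Keeping the `source` column in the primary key lets the same ticker and date arrive from several providers without one overwriting another, while reloading a file from the same source stays idempotent.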

MATH340: Automated K-Means Computation

5 of 7

When you are starting a project and need to explore the data without introducing bias, k-means is a frequent first stop; time series analysis or PCA might also work. This is my open-sourced implementation, which provides automated reporting once the data is found, along with a version that can find it automatically for the user.
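For readers new to k-means, here is a minimal standard-library sketch of Lloyd's algorithm. It is not the open-sourced implementation referenced above: the farthest-point seeding is a swapped-in choice that keeps the example deterministic, and the function names and sample points are assumptions. It requires Python 3.8+ for `math.dist`.

```python
import math


def init_centroids(points, k):
    """Farthest-point seeding: start at the first point, then repeatedly
    add the point farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=100):
    """Cluster points (tuples of floats) with Lloyd's algorithm."""
    centroids = init_centroids(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep a centroid with no members
        if updated == centroids:  # converged: nothing moved
            break
        centroids = updated
    return centroids, labels


# Made-up 2-D points forming two well-separated groups.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (8.0, 8.0), (8.2, 7.8), (7.8, 8.2)]
centroids, labels = kmeans(data, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Plain k-means can settle into a poor local optimum when the starting centroids land in the same group, which is one reason seeding strategy matters for an automated report.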

FUTURE

DA460: Implementation of BERT GloVe Transformer for Resume Ingestion

6 of 7

DA460: Testing of BERT-ERNIE GloVe Transformer

6 of 7

What started as an expansion of my early work in chatbots evolved to need a more efficient solution, which I found in a hybrid implementation of the Hugging Face BERT model. The expansion to an NLP chatbot for a class project soon outgrew both the project's scope and my limited GPU access. BERT stands for Bidirectional Encoder Representations from Transformers. ERNIE is my prototype Emotional Recognition Natural Language Intelligence Engine.

BERT-ERNIE gif

As part of my machine learning research, I aim to build a model that can process and understand emotional intent or causality from conversational logs with people.

References:

Semantic Scholar - Smart NLP Architecture